Using Machine Language Subroutines For High Speed Vector Mathematics
A number of years ago I worked on a project that required extensive manipulation of digital waveforms, which were stored in double-precision floating-point vectors (or arrays, if you prefer). We quickly discovered that even compiled programs did not give us the speed we required, because of the loops needed to step through the entire vector. Lacking direct vector-oriented operators such as those available in APL, we decided to write our own machine language subroutines, to which we could pass a vector and have the whole operation carried out quickly.

