Both forward propagation and backpropagation come down to operations on matrices, the most common being matrix multiplication. To perform matrix multiplication in reasonable time, you will need to optimise your algorithms.
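
To make the cost concrete, here is a minimal sketch of an unoptimised matrix product in C++. The function name `naive_matmul` and the row-major storage in a flat `std::vector<float>` are assumptions made for illustration, not something taken from a particular library. For an m x k matrix times a k x n matrix it performs m * k * n multiply-adds, which is the work a tuned routine has to speed up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only.
// Computes C (m x n) = A (m x k) * B (k x n), all stored row-major.
std::vector<float> naive_matmul(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t m, std::size_t k, std::size_t n) {
    // Check that the buffers match the stated dimensions.
    assert(a.size() == m * k);
    assert(b.size() == k * n);

    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            // Hoist A[i][p] so the inner loop walks B and C contiguously.
            const float a_ip = a[i * k + p];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    return c;
}
```

In a forward pass, `a` could hold a layer's weights and `b` a batch of input columns; the three nested loops are what dominates the running time as the layers grow.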

On macOS there is a simple way to do this: Apple's Accelerate framework. It is an umbrella framework for vector-optimized operations and contains:

- vecLib.framework – Contains vector-optimized interfaces for performing math, big-number, and DSP calculations, among others.
- vImage.framework – Contains vector-optimized interfaces for manipulating image data.

The cblas_sgemm function (single-precision general matrix multiply, computing C = alpha * A * B + beta * C) can help you reach very high performance.
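
A call to cblas_sgemm might look like the sketch below. The wrapper name `sgemm_matmul` and the `USE_ACCELERATE` macro are assumptions introduced for this example: the macro is something you would define yourself when building on macOS and linking with `-framework Accelerate`. Without it, the function falls back to a plain loop so the same file still compiles elsewhere. The matrices are assumed to be row-major, so the leading dimensions are simply the row lengths.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef USE_ACCELERATE               // hypothetical macro, defined by you on macOS
#include <Accelerate/Accelerate.h>  // declares cblas_sgemm
#endif

// Hypothetical wrapper for illustration only.
// Computes C (m x n) = A (m x k) * B (k x n), all stored row-major.
std::vector<float> sgemm_matmul(const std::vector<float>& a,
                                const std::vector<float>& b,
                                int m, int k, int n) {
    assert(a.size() == static_cast<std::size_t>(m) * k);
    assert(b.size() == static_cast<std::size_t>(k) * n);

    std::vector<float> c(static_cast<std::size_t>(m) * n, 0.0f);

#ifdef USE_ACCELERATE
    // C = alpha * A * B + beta * C with alpha = 1 and beta = 0.
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k,
                1.0f, a.data(), k,   // lda = k: row length of A
                b.data(), n,         // ldb = n: row length of B
                0.0f, c.data(), n);  // ldc = n: row length of C
#else
    // Portable fallback with the same result, but no vector optimisation.
    for (int i = 0; i < m; ++i) {
        for (int p = 0; p < k; ++p) {
            const float a_ip = a[static_cast<std::size_t>(i) * k + p];
            for (int j = 0; j < n; ++j) {
                c[static_cast<std::size_t>(i) * n + j] +=
                    a_ip * b[static_cast<std::size_t>(p) * n + j];
            }
        }
    }
#endif
    return c;
}
```

Keeping the BLAS call behind a small wrapper like this is one possible design: the rest of the code only sees `sgemm_matmul`, so the backend can be swapped without touching the callers.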

Under the hood, the linear algebra part of vecLib is Apple's port of two libraries: BLAS and LAPACK.

**cblas.h** and **vblas.h** are the interfaces to Apple’s implementations of BLAS. You can find reference documentation in BLAS. Additional documentation on the BLAS standard, including reference implementations, can be found on the web starting from the BLAS FAQ page at these URLs: http://www.netlib.org/blas/faq.html and http://www.netlib.org/blas/blast-forum/blast-forum.html.

**clapack.h** is the interface to Apple’s implementation of LAPACK. Documentation of the LAPACK interfaces, including reference implementations, can be found on the web starting from the LAPACK FAQ page at this URL: http://netlib.org/lapack/faq.html

Because BLAS and LAPACK are standard interfaces, code written against them combines well with C++ libraries on both Linux and macOS.
