Hi all, 

We are in the process of adding support for lower precision and wanted to give 
everyone a heads-up. By lower precision, I mean storing matrices in a float 
array (or half-precision array) and performing operations using float kernels. 
Initial experiments suggest that we can get up to 2x performance improvements 
for Deep Learning algorithms. Since a float occupies 4 bytes versus 8 bytes for 
a double, this also reduces memory requirements by 2x. 

Please provide any concerns or suggestions. 

The high-level plan is as follows:
1. Support lower precision on GPU. Please see 
https://github.com/apache/systemml/pull/688
2. Support lower precision with native BLAS.
3. Support lower precision on CP/Spark. This includes writing float matrices in 
binary format and updating the memory estimation in hops.
4. Extend Python APIs to support lower precision.

The first two steps require converting double arrays to float/half-precision 
arrays.
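For those curious what that conversion involves: it is essentially a narrowing 
cast per element. Here is a minimal Java sketch (not SystemML code; the class 
and method names are illustrative) showing the double-to-float step and the 
precision loss it implies:

```java
// Illustrative sketch only, not part of the SystemML codebase.
public class PrecisionConvert {

    // Narrowing conversion: each double is cast to float, keeping only
    // about 7 significant decimal digits (float has a 24-bit mantissa
    // versus 53 bits for double).
    public static float[] toFloat(double[] in) {
        float[] out = new float[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (float) in[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] d = {1.0, 1.0 / 3.0, 2.5e7};
        float[] f = toFloat(d);
        // The float value of 1/3 differs from the double value past
        // the 7th significant digit.
        System.out.println(d[1] + " -> " + f[1]);
    }
}
```

Half precision would additionally require packing the 32-bit float into a 
16-bit representation, since Java has no primitive half type.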

Thanks 

Niketan.
