Baunsgaard commented on issue #872: [SYSTEMDS-273] Refactor Compressed Package
URL: https://github.com/apache/systemml/pull/872#issuecomment-615276665
 
 
   Major refactor of compressed Matrix Block to "simplify" responsibilities.
   Many changes in compression planning especially in memory estimation.
   
   Feedback appreciated :+1:
   
   The remaining errors are mainly in the sparse estimation of compression sizes. They include:
   
   - The estimated number of distinct values is off when the input is sparse, because the wrong sample-based estimators are used.
   - The bitmaps encoded for extracting column facts do not record whether a 0 is present in the column.
   - A bug (which I intend to fix today) in unary operators when compressing with a specific compression scheme.
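
To illustrate the second point, here is a minimal sketch of how the presence of a 0 could be inferred when it is not stored explicitly. It assumes the bitmap keeps one row-offset list per distinct nonzero value; the class and method names are hypothetical and not taken from the PR.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only, not a class from the PR. */
final class ZeroPresenceSketch {

    /**
     * Assumption: offsetsPerValue.get(i) holds the row indices at which the
     * i-th distinct nonzero value occurs, and every row holds one value.
     * If the lists cover fewer rows than the column has, the rest are zeros.
     */
    static boolean containsZero(List<int[]> offsetsPerValue, int numRows) {
        long coveredRows = 0;
        for (int[] offsets : offsetsPerValue) {
            coveredRows += offsets.length;
        }
        return coveredRows < numRows;
    }

    public static void main(String[] args) {
        // Two nonzero values covering rows {0, 2} and {1}.
        List<int[]> offsets = Arrays.asList(new int[] {0, 2}, new int[] {1});
        System.out.println(containsZero(offsets, 4)); // row 3 is uncovered
        System.out.println(containsZero(offsets, 3)); // all rows covered
    }
}
```

A check like this needs the number of rows as an extra input, which is the information the bitmap alone does not carry.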
   
   Hopefully, if we merge, the bugs mentioned above can be fixed within a reasonable time.
   
   Below is an extract of the different changes:
   
   - Separated sub-parts of compression into different packages.
   - Worst-case memory footprint calculations for arrays.
   - Moved the compressed size estimation calculation to the specific ColGroups.
   
   - Extensive testing of size estimation of ColGroups and compression
     - JOL memory estimate tests for compression blocks
     - Using the worst-case JOL estimate, i.e. an uncompressed 64-bit JVM
     - Ideal input generator for testing ColGroup compression.
   
   - Factory pattern added for selected constructors
     - ColGroups
       - Renamed ColGroupCompressor to ColGroupFactory
     - CompressedMatrixBlock
   
   - Enabled parallel execution of the ColGrouping
   
   - Added a settings file for compression to enable selection of specific
     compression types.
   
   - Added an abstract compressed block for overriding the default MatrixBlock
   
   - Added test libraries to pom.xml
     - Memory estimator framework JOL from OpenJDK to measure object sizes
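
As a rough illustration of the worst-case array footprint calculation mentioned above, the following sketch estimates array sizes by hand. It assumes a 64-bit HotSpot JVM with compressed references disabled, where an array header is taken to be 24 bytes (8-byte mark word, 8-byte class pointer, 4-byte length, plus padding) and objects are aligned to 8 bytes. The class and method names are hypothetical and are not the ones used in the PR.

```java
/** Hypothetical helper for illustration only, not a class from the PR. */
final class ArrayMemorySketch {

    // Assumed header size for arrays on an uncompressed 64-bit JVM.
    static final long ARRAY_HEADER_BYTES = 24;
    // Assumed object alignment.
    static final long ALIGNMENT_BYTES = 8;

    /** Rounds a byte count up to the next multiple of the alignment. */
    static long align(long bytes) {
        return (bytes + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    /** Estimated size of an array with the given length and element width. */
    static long arrayBytes(long length, long elementBytes) {
        return align(ARRAY_HEADER_BYTES + length * elementBytes);
    }

    static long doubleArrayBytes(long length) {
        return arrayBytes(length, Double.BYTES);
    }

    static long intArrayBytes(long length) {
        return arrayBytes(length, Integer.BYTES);
    }

    static long charArrayBytes(long length) {
        return arrayBytes(length, Character.BYTES);
    }

    public static void main(String[] args) {
        System.out.println(doubleArrayBytes(10)); // 24 + 80 = 104
        System.out.println(intArrayBytes(3));     // 24 + 12 = 36, aligned to 40
        System.out.println(charArrayBytes(5));    // 24 + 10 = 34, aligned to 40
    }
}
```

Estimates of this kind can then be compared against the sizes reported by JOL in tests, so that the hand-written calculation stays an upper bound.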
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
