BACtaki opened a new pull request, #1740:
URL: https://github.com/apache/systemds/pull/1740

   This patch converts the existing unique() function from a script to a 
built-in. The script-based implementation relies on sorting, which is 
computationally expensive, especially for large multi-block inputs. The new 
implementation is instead based on a new data sketch for unique(). This first 
patch creates the framework for the new unique sketch and implements the CP 
RowCol case; the remaining cases (CP Row/Col and Spark RowCol/Row/Col) will 
be implemented in subsequent patches.
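
   To illustrate why a set-based pass can be cheaper than sorting for the 
RowCol case (distinct values across the whole matrix), here is a minimal 
sketch. It is not the code in this patch: the class `UniqueSketchExample` and 
the method `uniqueRowCol` are hypothetical names, and a plain hash set over a 
dense `double[][]` stands in for the actual sketch data structure.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class UniqueSketchExample {

    // Hypothetical helper (not the SystemDS API): collects the distinct
    // values of a dense matrix in a single pass, without sorting.
    // Values are returned in order of first occurrence.
    // Note: boxed Double equality treats 0.0 and -0.0 as different values.
    public static double[] uniqueRowCol(double[][] matrix) {
        Set<Double> seen = new LinkedHashSet<>();
        for (double[] row : matrix) {
            for (double v : row) {
                seen.add(v); // expected O(1) per cell
            }
        }
        double[] out = new double[seen.size()];
        int i = 0;
        for (double v : seen) {
            out[i++] = v;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 2}, {3, 1, 4}};
        System.out.println(Arrays.toString(uniqueRowCol(m)));
    }
}
```

   Under these assumptions the pass is expected linear in the number of 
cells, whereas a sort-based approach is O(n log n); the result is not sorted, 
so callers that need ordered output would still have to sort the (usually 
much smaller) set of distinct values.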
   
   Tests:
       [X] Unit tests
       [ ] Integration tests
       [ ] N/A

