Hi,

I wanted to start a discussion on bringing our built-in functions to parity
with popular OSS libraries. I see attaining parity as a three-step
process:

*Step 1*
As far as I can tell from the existing built-in functions, SystemDS aims to
offer users a hybrid set of APIs for scientific computing and ML (data
engineering included). Therefore, the most obvious OSS libraries
for comparison would be numpy, sklearn (scipy), and pandas. Apache
DataSketches would be another relevant system for specialized use cases
(sketches).

*Step 2*
Once we have established a set of libraries, I would propose that we create
a capability matrix with sections for each library, like so:

Section 1: numpy
  f_1
  f_2
  [..]
  f_n

Section 2: sklearn
  [..]

The columns could be a checklist like this: f_i -> (DML, Python, CP, SP,
RowCol, Row, Col, Federated, documentationPublished)
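
To make the matrix a bit more concrete, here is a rough Python sketch of how it could be tracked. This is only an illustration under my own assumptions: the helper names (new_row, build_matrix, missing) are hypothetical, the function names are the same f_i placeholders as in the outline above, and the one checked cell is a made-up value, not actual status.

```python
# Columns taken from the checklist above.
COLUMNS = ["DML", "Python", "CP", "SP", "RowCol", "Row", "Col",
           "Federated", "documentationPublished"]


def new_row():
    """Return an unchecked checklist row for one function."""
    return {col: False for col in COLUMNS}


def build_matrix(sections):
    """Map each library section to one unchecked row per function name."""
    return {lib: {fn: new_row() for fn in fns} for lib, fns in sections.items()}


def missing(matrix):
    """List (library, function, column) triples that are still unchecked."""
    return [(lib, fn, col)
            for lib, rows in matrix.items()
            for fn, row in rows.items()
            for col in COLUMNS
            if not row[col]]


# Placeholder function names, as in the outline above; not real entries.
matrix = build_matrix({"numpy": ["f_1", "f_2"], "sklearn": ["f_1"]})
matrix["numpy"]["f_1"]["DML"] = True  # illustrative value only

gaps = missing(matrix)
print(len(gaps), "open items")
```

Each entry returned by missing() would map naturally to one candidate JIRA task, though in practice we would probably want to group them per function.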

*Step 3*
Create JIRA tasks, assign them, and start coding.


Thoughts?


Thanks,
Badrul
