Hi,

I wanted to start a discussion on bringing our built-in functions to parity with popular OSS libraries. I am thinking of attaining parity as a 3-step process:

*Step 1*
As far as I can tell from the existing built-in functions, SystemDS aims to offer a hybrid set of APIs for scientific computing and ML (data engineering included) to users. Therefore, the most obvious OSS libraries for comparison would be numpy, sklearn (scipy), and pandas. Apache DataSketches would be another relevant system for specialized use cases (sketches).

*Step 2*
Once we have established a set of libraries, I would propose that we create a capability matrix with sections for each library, like so:

Section 1: numpy
f_1
f_2
[..]
f_n

Section 2: sklearn
[..]

The columns could be a checklist like this:
f_i -> (DML, Python, CP, SP, RowCol, Row, Col, Federated, documentationPublished)

*Step 3*
Create JIRA tasks, assign them, and start coding.

Thoughts?

Thanks,
Badrul
