This is an automated email from the ASF dual-hosted git repository.
mboehm7 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/systemds.git
The following commit(s) were added to refs/heads/main by this push:
new 6b37c859c5 [MINOR] Fix outdated readme SystemDS overview
6b37c859c5 is described below
commit 6b37c859c57726068218592daee425743ceef0c5
Author: Matthias Boehm <[email protected]>
AuthorDate: Sun Jan 26 11:56:29 2025 +0100
[MINOR] Fix outdated readme SystemDS overview
---
README.md | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/README.md b/README.md
index 5ba4e37b33..0f94878616 100644
--- a/README.md
+++ b/README.md
@@ -19,15 +19,12 @@ limitations under the License.
# Apache SystemDS
-**Overview:** SystemDS is an open source ML system for the end-to-end data science lifecycle from data integration, cleaning,
-and feature engineering, over efficient, local and distributed ML model training, to deployment and serving. To this
-end, we aim to provide a stack of declarative languages with R-like syntax for (1) the different tasks of the data-science
-lifecycle, and (2) users with different expertise. These high-level scripts are compiled into hybrid execution plans of
-local, in-memory CPU and GPU operations, as well as distributed operations on Apache Spark. In contrast to existing
-systems - that either provide homogeneous tensors or 2D Datasets - and in order to serve the entire data science lifecycle,
-the underlying data model are DataTensors, i.e., tensors (multi-dimensional arrays) whose first dimension may have a
-heterogeneous and nested schema.
-
+**Overview:** Apache SystemDS is an open-source machine learning (ML) system for the end-to-end
+data science lifecycle from data preparation and cleaning, over efficient ML model training,
+to debugging and serving. ML algorithms or pipelines are specified in a high-level language
+with R-like syntax or related Python and Java APIs (with many builtin primitives), and the
+system automatically generates hybrid runtime plans of local, in-memory operations and distributed
+operations on Apache Spark. Additional backends exist for GPUs and federated learning.
Resource | Links
---------|------