This is an automated email from the ASF dual-hosted git repository.

baunsgaard pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/systemds.git

commit 5a5eece7aa5fe1361adfe614149c8b3ec582b693
Author: Sebastian Baunsgaard <[email protected]>
AuthorDate: Fri Aug 30 14:05:51 2024 +0200

    [SYSTEMDS-3737] Python API docs oneline

    This commit fix a bug in the Python API where for some reason an
    update decided to make some of the code blocks one liners in the
    sphinx docs of our python API.
---
 .../source/getting_started/simple_examples.rst     | 16 +++++++++++----
 .../python/docs/source/guide/algorithms_basics.rst | 14 ++++++++++---
 src/main/python/docs/source/guide/federated.rst    | 12 ++++++++---
 .../docs/source/guide/python_end_to_end_tut.rst    | 23 ++++++++++++++++++++++
 4 files changed, 55 insertions(+), 10 deletions(-)

diff --git a/src/main/python/docs/source/getting_started/simple_examples.rst b/src/main/python/docs/source/getting_started/simple_examples.rst
index dd20c89fd0..75d4c4ccee 100644
--- a/src/main/python/docs/source/getting_started/simple_examples.rst
+++ b/src/main/python/docs/source/getting_started/simple_examples.rst
@@ -30,8 +30,10 @@ Matrix Operations
 Making use of SystemDS, let us multiply an Matrix with an scalar:
 
 .. include:: ../code/getting_started/simpleExamples/multiply.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
 As output we get
 
@@ -48,8 +50,10 @@ Let us do a quick element-wise matrix multiplication of numpy arrays with System
 Remember to first start up a new terminal:
 
 .. include:: ../code/getting_started/simpleExamples/multiplyMatrix.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
 More complex operations
 -----------------------
@@ -58,8 +62,10 @@ SystemDS provides algorithm level functions as built-in functions to simplify de
 One example of this is l2SVM, a high level functions for Data-Scientists. Let's take a look at l2svm:
 
 .. include:: ../code/getting_started/simpleExamples/l2svm.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
 The output should be similar to
 
@@ -81,8 +87,10 @@ instead of using numpy arrays that have to be transfered into systemDS.
 The above script transformed goes like this:
 
 .. include:: ../code/getting_started/simpleExamples/l2svm_internal.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
 When reading in datasets for processing it is highly recommended that you read from inside systemds using
 sds.read("file"), since this avoid the transferring of numpy arrays.
diff --git a/src/main/python/docs/source/guide/algorithms_basics.rst b/src/main/python/docs/source/guide/algorithms_basics.rst
index 7206605222..825c0d066f 100644
--- a/src/main/python/docs/source/guide/algorithms_basics.rst
+++ b/src/main/python/docs/source/guide/algorithms_basics.rst
@@ -42,6 +42,8 @@ To setup this simply use
   :code: python
   :start-line: 22
   :end-line: 30
+  :encoding: utf-8
+  :literal:
 
 Here the DataManager contains the code for downloading and setting up numpy arrays containing the data.
 
@@ -86,9 +88,11 @@ Step 3: Training
 To start with, we setup a SystemDS context and setup the data:
 
 .. include:: ../code/guide/algorithms/FullScript.py
+  :code: python
   :start-line: 31
   :end-line: 35
-  :code: python
+  :encoding: utf-8
+  :literal:
 
 to reduce the training time and verify everything works, it is usually good to reduce the amount of data,
 to train on a smaller sample to start with
@@ -169,9 +173,11 @@ from our sample of 1k to the full training dataset of 60k, in this example the m
 to again reduce training time
 
 .. include:: ../code/guide/algorithms/FullScript.py
+  :code: python
   :start-line: 31
   :end-line: 43
-  :code: python
+  :encoding: utf-8
+  :literal:
 
 With this change the accuracy achieved changes from the previous value to 92%. But this is a basic implementation
 that can be replaced by a variety of algorithms and techniques.
@@ -185,6 +191,8 @@ One noteworthy change is the + 1 is done on the matrix ready for SystemDS,
 this makes SystemDS responsible for adding the 1 to each value.
 
 .. include:: ../code/guide/algorithms/FullScript.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
diff --git a/src/main/python/docs/source/guide/federated.rst b/src/main/python/docs/source/guide/federated.rst
index 4afafa070d..cdd7a698d0 100644
--- a/src/main/python/docs/source/guide/federated.rst
+++ b/src/main/python/docs/source/guide/federated.rst
@@ -54,15 +54,19 @@ This should be located next to the ``test.csv`` file called ``test.csv.mtd``.
 To make both the data and metadata simply execute the following
 
 .. include:: ../code/guide/federated/federatedTutorial_part1.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
 After creating our data the federated worker becomes able to execute federated instructions.
 The aggregated sum using federated instructions in python SystemDS is done as follows
 
 .. include:: ../code/guide/federated/federatedTutorial_part2.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
 Multiple Federated Environments
 -------------------------------
@@ -82,8 +86,10 @@ Start with 3 different terminals, and run one federated environment in each.
 Once all three workers are up and running we can leverage all three in the following example
 
 .. include:: ../code/guide/federated/federatedTutorial_part3.py
-  :start-line: 20
   :code: python
+  :start-line: 20
+  :encoding: utf-8
+  :literal:
 
 The print should look like
 
diff --git a/src/main/python/docs/source/guide/python_end_to_end_tut.rst b/src/main/python/docs/source/guide/python_end_to_end_tut.rst
index 961b47d61b..6237450382 100644
--- a/src/main/python/docs/source/guide/python_end_to_end_tut.rst
+++ b/src/main/python/docs/source/guide/python_end_to_end_tut.rst
@@ -56,6 +56,8 @@ a fraction of the training and test set into account to speed up the execution.
   :code: python
   :start-line: 20
   :end-line: 51
+  :encoding: utf-8
+  :literal:
 
 Here the DataManager contains the code for downloading and setting up either Pandas DataFrames or internal SystemDS Frames,
 for the best performance and no data transfer from pandas to SystemDS it is recommended to read directly from disk into SystemDS.
@@ -70,6 +72,8 @@ training data. Afterward, we can make predictions on the test data and assess th
   :code: python
   :start-line: 53
   :end-line: 54
+  :encoding: utf-8
+  :literal:
 
 Note that nothing has been calculated yet. In SystemDS the calculation is executed once compute() is called.
 E.g. betas_res = betas.compute().
@@ -80,6 +84,8 @@ We can now use the trained model to make predictions on the test data.
   :code: python
   :start-line: 56
   :end-line: 57
+  :encoding: utf-8
+  :literal:
 
 The multiLogRegPredict function has three return values:
 - m, a matrix with the mean probability of correctly classifying each label. We do not use it further in this example.
@@ -98,6 +104,8 @@ for the predictions and the confusion matrix averages of each true class.
   :code: python
   :start-line: 59
   :end-line: 60
+  :encoding: utf-8
+  :literal:
 
 Full Script
 ~~~~~~~~~~~
@@ -108,6 +116,8 @@ In the full script, some steps are combined to reduce the overall script.
   :code: python
   :start-line: 20
   :end-line: 65
+  :encoding: utf-8
+  :literal:
 
 Level 2
 -------
@@ -125,6 +135,8 @@ but instead of preparing the test data, we only prepare the training data.
   :code: python
   :start-line: 20
   :end-line: 47
+  :encoding: utf-8
+  :literal:
 
 Step 2: Load the algorithm
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -139,6 +151,8 @@ The file can be found here:
   :code: python
   :start-line: 48
   :end-line: 51
+  :encoding: utf-8
+  :literal:
 
 
 Step 3: Training the neural network
@@ -154,6 +168,8 @@ The seed argument ensures that running the code again yields the same results.
   :code: python
   :start-line: 52
   :end-line: 58
+  :encoding: utf-8
+  :literal:
 
 
 Step 4: Saving the model
@@ -169,6 +185,8 @@ is saved.
   :code: python
   :start-line: 59
   :end-line: 65
+  :encoding: utf-8
+  :literal:
 
 Step 5: Predict on Unseen data
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -180,6 +198,8 @@ unseen data:
   :code: python
   :start-line: 66
   :end-line: 77
+  :encoding: utf-8
+  :literal:
 
 
 Full Script NN
@@ -192,3 +212,6 @@ The complete script now can be seen here:
   :code: python
   :start-line: 20
   :end-line: 80
+  :encoding: utf-8
+  :literal:
+
