[ https://issues.apache.org/jira/browse/SPARK-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-16421:
------------------------------
    Assignee: Bryan Cutler
    Priority: Major  (was: Trivial)

> Improve output from ML examples
> -------------------------------
>
>                 Key: SPARK-16421
>                 URL: https://issues.apache.org/jira/browse/SPARK-16421
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Examples, ML
>            Reporter: Bryan Cutler
>            Assignee: Bryan Cutler
>             Fix For: 2.1.0
>
>
> In many ML examples, the output is useless.  Sometimes {{show()}} is called
> and any pertinent results are hidden by truncation.  For example, here is the
> output of max_abs_scaler
> {noformat}
> $ bin/spark-submit examples/src/main/python/ml/max_abs_scaler_example.py
> +-----+--------------------+--------------------+
> |label|            features|      scaledFeatures|
> +-----+--------------------+--------------------+
> |  0.0|(692,[127,128,129...|(692,[127,128,129...|
> |  1.0|(692,[158,159,160...|(692,[158,159,160...|
> |  1.0|(692,[124,125,126...|(692,[124,125,126...|
> {noformat}
> Other times a few bare values are printed when {{show()}} would be more
> appropriate.  Here is the output from binarizer_example
> {noformat}
> $ bin/spark-submit examples/src/main/python/ml/binarizer_example.py
> 0.0
>
> 1.0
> 0.0
> {noformat}
> It would be much more useful to just {{show()}} the transformed DataFrame
> {noformat}
> +-----+-------+-----------------+
> |label|feature|binarized_feature|
> +-----+-------+-----------------+
> |    0|    0.1|              0.0|
> |    1|    0.8|              1.0|
> |    2|    0.2|              0.0|
> +-----+-------+-----------------+
> {noformat}
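
To illustrate why the max_abs_scaler output above hides the interesting part, here is a minimal pure-Python sketch of the default cell truncation that {{show()}} applies. It is not Spark code. The helper name truncate_cell is hypothetical, and the full vector string is made up for illustration; only its first 17 characters match the output quoted in the issue. The assumed rule is that a cell longer than 20 characters keeps its first 17 characters followed by "...".

```python
# Pure-Python sketch of the default cell truncation applied by DataFrame.show().
# Assumption: a cell longer than `width` keeps its first (width - 3) characters
# followed by "..."; shorter cells are left unchanged.


def truncate_cell(text: str, width: int = 20) -> str:
    """Shorten a cell string the way show() is assumed to by default."""
    if width <= 0 or len(text) <= width:
        return text
    if width < 4:
        # Too narrow to fit an ellipsis, so just cut the string.
        return text[:width]
    return text[: width - 3] + "..."


# Hypothetical sparse-vector string: the indices and values past the first
# three indices are invented for illustration, not taken from the real data.
full_cell = "(692,[127,128,129,130],[0.2,0.6,0.99,0.5])"
short_cell = truncate_cell(full_cell)

print(short_cell)  # prints (692,[127,128,129...
```

Every scaled value sits past the cut, which is why the features and scaledFeatures columns look identical in the output quoted in the issue. In PySpark, calling show(truncate=False) on the DataFrame prints the full cell contents instead.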