[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636133#comment-16636133 ]

Apache Spark commented on SPARK-25321:
--------------------------------------

User 'mengxr' has created a pull request for this issue:
https://github.com/apache/spark/pull/22618

> ML, Graph 2.4 QA: API: New Scala APIs, docs
> -------------------------------------------
>
>                 Key: SPARK-25321
>                 URL: https://issues.apache.org/jira/browse/SPARK-25321
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Documentation, GraphX, ML, MLlib
>    Affects Versions: 2.4.0
>            Reporter: Weichen Xu
>            Assignee: Weichen Xu
>            Priority: Blocker
>             Fix For: 2.4.0
>
>
> Audit new public Scala APIs added to MLlib & GraphX. Take note of:
> * Protected/public classes or methods. If access can be more private, then it should be.
> * Also look for non-sealed traits.
> * Documentation: Missing? Bad links or formatting?
> *Make sure to check the object doc!*
> As you find issues, please create JIRAs and link them to this issue.
> For *user guide issues* link the new JIRAs to the relevant user guide QA issue

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610379#comment-16610379 ]

Weichen Xu commented on SPARK-25321:
------------------------------------

[~josephkb] There are 2 changes that break compatibility and need review:

[SPARK-10413] ML models should support prediction on single instances:
This PR breaks source and binary compatibility if users have defined subclasses that override the protected `predict` method. If we make `predict` public, then all subclasses must make their `predict` methods public too. What do you think? Is it a big issue?

[SPARK-14681] Provide label/impurity stats for spark.ml decision tree nodes:
This will break binary compatibility, but it looks like it keeps source compatibility.

> ML, Graph 2.4 QA: API: New Scala APIs, docs
> -------------------------------------------
>
>                 Key: SPARK-25321
>                 URL: https://issues.apache.org/jira/browse/SPARK-25321
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Documentation, GraphX, ML, MLlib
>    Affects Versions: 2.4.0
>            Reporter: Weichen Xu
>            Assignee: Yanbo Liang
>            Priority: Blocker
>
> Audit new public Scala APIs added to MLlib & GraphX. Take note of:
> * Protected/public classes or methods. If access can be more private, then it should be.
> * Also look for non-sealed traits.
> * Documentation: Missing? Bad links or formatting?
> *Make sure to check the object doc!*
> As you find issues, please create JIRAs and link them to this issue.
> For *user guide issues* link the new JIRAs to the relevant user guide QA issue
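
The `predict` visibility concern raised in the comment above can be illustrated with a minimal sketch. The classes here (`PredictorV1`, `PredictorV2`, `UserModelV1`, `UserModelV2`) are hypothetical stand-ins, not Spark's actual classes; they only assume that the base method goes from `protected` to public, as the comment describes.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark classes.

// Shape before the change: subclasses implement a protected method.
abstract class PredictorV1 {
  protected def predict(features: Seq[Double]): Double
  def transformOne(features: Seq[Double]): Double = predict(features)
}

// Shape after the change: the same method is public.
abstract class PredictorV2 {
  def predict(features: Seq[Double]): Double
}

// User code written against the old shape: the override is protected.
class UserModelV1 extends PredictorV1 {
  override protected def predict(features: Seq[Double]): Double = features.sum
}

// Against the new shape the override has to be public as well.
// Keeping `override protected def predict` here would not compile,
// because an override may not narrow the access of the member it overrides.
class UserModelV2 extends PredictorV2 {
  override def predict(features: Seq[Double]): Double = features.sum
}

object PredictVisibilityDemo {
  def main(args: Array[String]): Unit = {
    println(new UserModelV1().transformOne(Seq(1.0, 2.0)))
    println(new UserModelV2().predict(Seq(1.0, 2.0)))
  }
}
```

In this sketch, a user subclass like `UserModelV1` would need a source edit (dropping `protected`) to build against the new base class, which is the source incompatibility being asked about.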
[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612436#comment-16612436 ]

Joseph K. Bradley commented on SPARK-25321:
-------------------------------------------

You're right; these are breaking changes. If we're sticking with the rules, then we should revert these in branch-2.4, but we could keep them in master if the next release is 3.0. Is it easy to revert these PRs, or have they collected conflicts by now?
[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618118#comment-16618118 ]

Joseph K. Bradley commented on SPARK-25321:
-------------------------------------------

[~WeichenXu123] Have you been able to look into reverting those changes, or discussed reverting them with [~mengxr]? Thanks!
[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618495#comment-16618495 ]

Xiangrui Meng commented on SPARK-25321:
---------------------------------------

[~WeichenXu123] Could you check whether MLeap is compatible with the tree Node breaking changes? This line is relevant:
https://github.com/combust/mleap/blob/master/mleap-runtime/src/main/scala/ml/combust/mleap/bundle/ops/classification/DecisionTreeClassifierOp.scala

If it is hard for MLeap to upgrade, we should revert the change.

cc: [~hollinwilkins]
[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621862#comment-16621862 ]

Weichen Xu commented on SPARK-25321:
------------------------------------

[~mengxr] MLeap is NOT compatible with the tree Node breaking changes. Compilation failed:

```
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/ops/classification/DecisionTreeClassifierOp.scala:34: type mismatch;
[error]  found   : org.apache.spark.ml.tree.Node
[error]  required: org.apache.spark.ml.tree.ClassificationNode
[error]     rootNode = rootNode,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/ops/clustering/LDAModelOp.scala:59: reassignment to val
[error]     oldLocalModel = oldLocalModel,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/ops/regression/DecisionTreeRegressionOp.scala:33: type mismatch;
[error]  found   : org.apache.spark.ml.tree.Node
[error]  required: org.apache.spark.ml.tree.RegressionNode
[error]     rootNode = rootNode,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:50: trait LeafNode is abstract; cannot be instantiated
[error]     new tree.LeafNode(prediction = node.values.indexOf(node.values.max),
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:50: not found: value prediction
[error]     new tree.LeafNode(prediction = node.values.indexOf(node.values.max),
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:51: not found: value impurity
[error]     impurity = 0.0,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:52: not found: value impurityStats
[error]     impurityStats = calc)
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:75: trait InternalNode is abstract; cannot be instantiated
[error]     new tree.InternalNode(split = split,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:75: reassignment to val
[error]     new tree.InternalNode(split = split,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:76: not found: value leftChild
[error]     leftChild = left,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:77: not found: value rightChild
[error]     rightChild = right,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:78: not found: value prediction
[error]     prediction = 0.0,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:79: not found: value gain
[error]     gain = 0.0,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:80: not found: value impurity
[error]     impurity = 0.0,
[error]     ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:81: not found: value impurityStats
[error]     impurityStats = null)
[error]     ^
[error] 15 errors found
```
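
The two recurring error kinds in the log above ("trait ... is abstract; cannot be instantiated" and "type mismatch") can be reproduced in miniature. This is only a sketch under assumptions: the traits and classes below are hypothetical stand-ins that mimic a hierarchy in which `LeafNode` is an abstract trait and a model builder requires a `ClassificationNode`; they are not Spark's actual definitions.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark's definitions.
sealed trait Node { def prediction: Double }
sealed trait ClassificationNode extends Node
sealed trait LeafNode extends Node

// A concrete node that is both a leaf and a classification node.
final class ClassificationLeafNode(val prediction: Double)
  extends ClassificationNode with LeafNode

object NodeHierarchyDemo {
  // Accepts only the narrower type, mirroring the "required: ClassificationNode" errors.
  def buildModel(rootNode: ClassificationNode): Double = rootNode.prediction

  def main(args: Array[String]): Unit = {
    // val bad = new LeafNode(prediction = 1.0)
    //   does not compile: a trait cannot be instantiated with constructor arguments.

    val leaf = new ClassificationLeafNode(1.0)
    val asNode: Node = leaf

    // buildModel(asNode)
    //   does not compile: a value typed as Node is not accepted where
    //   ClassificationNode is required.

    println(buildModel(leaf))
  }
}
```

Under these assumptions, downstream code that constructs nodes directly or passes values typed as plain `Node` needs source changes, which matches the shape of the failures reported above.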
[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622942#comment-16622942 ]

Apache Spark commented on SPARK-25321:
--------------------------------------

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/22492
[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs
[ https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623003#comment-16623003 ]

Apache Spark commented on SPARK-25321:
--------------------------------------

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/22510