[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-10-02 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636133#comment-16636133
 ] 

Apache Spark commented on SPARK-25321:
--

User 'mengxr' has created a pull request for this issue:
https://github.com/apache/spark/pull/22618

> ML, Graph 2.4 QA: API: New Scala APIs, docs
> ---
>
> Key: SPARK-25321
> URL: https://issues.apache.org/jira/browse/SPARK-25321
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, GraphX, ML, MLlib
>Affects Versions: 2.4.0
>Reporter: Weichen Xu
>Assignee: Weichen Xu
>Priority: Blocker
> Fix For: 2.4.0
>
>
> Audit new public Scala APIs added to MLlib & GraphX. Take note of:
>  * Protected/public classes or methods. If access can be more private, then 
> it should be.
>  * Also look for non-sealed traits.
>  * Documentation: Missing? Bad links or formatting?
> *Make sure to check the object doc!*
> As you find issues, please create JIRAs and link them to this issue. 
> For *user guide issues* link the new JIRAs to the relevant user guide QA issue



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-11 Thread Weichen Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610379#comment-16610379
 ] 

Weichen Xu commented on SPARK-25321:


[~josephkb]
There are 2 compatibility-breaking changes that we need to review:
[SPARK-10413] ML models should support prediction on single instances: This PR 
breaks source and binary compatibility if users have defined subclasses that 
override the "protected predict" method. If we make `predict` public, then all 
subclasses must make their "predict" methods public as well. What do you think? 
Is it a big issue?

[SPARK-14681] Provide label/impurity stats for spark.ml decision tree nodes: 
This breaks binary compatibility, but it appears to keep source 
compatibility.
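
To illustrate the source-compatibility concern around `predict`, here is a minimal sketch. The names `BaseModel` and `SumModel` are hypothetical stand-ins, not Spark classes; the sketch only assumes a base class whose `predict` has been widened from protected to public.

```scala
object PredictAccessSketch {
  // Hypothetical stand-in for a library base class AFTER the change:
  // `predict` is public rather than protected.
  abstract class BaseModel {
    def predict(features: Array[Double]): Double

    // Mimics batch scoring that delegates to the single-instance method.
    def transform(rows: Seq[Array[Double]]): Seq[Double] = rows.map(predict)
  }

  // A user-defined subclass written against the public signature.
  class SumModel extends BaseModel {
    // Code written against the older protected signature would declare
    //   override protected def predict(features: Array[Double]): Double
    // here. scalac rejects that once the base method is public, because an
    // override may not have weaker access than the member it overrides.
    override def predict(features: Array[Double]): Double = features.sum
  }
}
```

In this sketch the only fix available to the subclass author is to drop the `protected` modifier, which is why every existing user subclass would need a source change.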


> ML, Graph 2.4 QA: API: New Scala APIs, docs
> ---
>
> Key: SPARK-25321
> URL: https://issues.apache.org/jira/browse/SPARK-25321
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, GraphX, ML, MLlib
>Affects Versions: 2.4.0
>Reporter: Weichen Xu
>Assignee: Yanbo Liang
>Priority: Blocker
>
> Audit new public Scala APIs added to MLlib & GraphX. Take note of:
>  * Protected/public classes or methods. If access can be more private, then 
> it should be.
>  * Also look for non-sealed traits.
>  * Documentation: Missing? Bad links or formatting?
> *Make sure to check the object doc!*
> As you find issues, please create JIRAs and link them to this issue. 
> For *user guide issues* link the new JIRAs to the relevant user guide QA issue






[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-12 Thread Joseph K. Bradley (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612436#comment-16612436
 ] 

Joseph K. Bradley commented on SPARK-25321:
---

You're right; these are breaking changes.  If we're sticking with the rules, 
then we should revert these in branch-2.4, but we could keep them in master if 
the next release is 3.0.  Is it easy to revert these PRs, or have they 
collected conflicts by now?




[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-17 Thread Joseph K. Bradley (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618118#comment-16618118
 ] 

Joseph K. Bradley commented on SPARK-25321:
---

[~WeichenXu123] Have you been able to look into reverting those changes, or to 
discuss reverting them with [~mengxr]?  Thanks!




[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-17 Thread Xiangrui Meng (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618495#comment-16618495
 ] 

Xiangrui Meng commented on SPARK-25321:
---

[~WeichenXu123] Could you check whether MLeap is compatible with the tree Node 
breaking changes? This line is relevant: 
https://github.com/combust/mleap/blob/master/mleap-runtime/src/main/scala/ml/combust/mleap/bundle/ops/classification/DecisionTreeClassifierOp.scala

If it is hard for MLeap to upgrade, we should revert the change.

cc: [~hollinwilkins]




[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-20 Thread Weichen Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621862#comment-16621862
 ] 

Weichen Xu commented on SPARK-25321:


[~mengxr] MLeap is NOT compatible with the tree Node breaking changes. 
Compilation failed:
```
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/ops/classification/DecisionTreeClassifierOp.scala:34: type mismatch;
[error]  found   : org.apache.spark.ml.tree.Node
[error]  required: org.apache.spark.ml.tree.ClassificationNode
[error] rootNode = rootNode,
[error]^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/ops/clustering/LDAModelOp.scala:59: reassignment to val
[error]   oldLocalModel = oldLocalModel,
[error] ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/ops/regression/DecisionTreeRegressionOp.scala:33: type mismatch;
[error]  found   : org.apache.spark.ml.tree.Node
[error]  required: org.apache.spark.ml.tree.RegressionNode
[error] rootNode = rootNode,
[error]^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:50: trait LeafNode is abstract; cannot be instantiated
[error] new tree.LeafNode(prediction = node.values.indexOf(node.values.max),
[error] ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:50: not found: value prediction
[error] new tree.LeafNode(prediction = node.values.indexOf(node.values.max),
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:51: not found: value impurity
[error]   impurity = 0.0,
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:52: not found: value impurityStats
[error]   impurityStats = calc)
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:75: trait InternalNode is abstract; cannot be instantiated
[error] new tree.InternalNode(split = split,
[error] ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:75: reassignment to val
[error] new tree.InternalNode(split = split,
[error] ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:76: not found: value leftChild
[error]   leftChild = left,
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:77: not found: value rightChild
[error]   rightChild = right,
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:78: not found: value prediction
[error]   prediction = 0.0,
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:79: not found: value gain
[error]   gain = 0.0,
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:80: not found: value impurity
[error]   impurity = 0.0,
[error]   ^
[error] /Users/weichenxu/work/projects/mySpark/mleap/mleap-spark/src/main/scala/org/apache/spark/ml/bundle/tree/decision/SparkNodeWrapper.scala:81: not found: value impurityStats
[error]   impurityStats = null)
[error]   ^
[error] 15 errors found
```
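
The two kinds of error in the log ("type mismatch" and "is abstract; cannot be instantiated") can be reproduced in miniature. The sketch below uses a simplified, hypothetical hierarchy; the trait names mirror those in the log, but the concrete classes, `ClassifierModel`, and `toModel` are assumptions for illustration and not Spark's or MLeap's actual definitions.

```scala
object TreeNodeSketch {
  // Simplified, hypothetical stand-ins for the node hierarchy.
  sealed trait Node { def prediction: Double }
  sealed trait ClassificationNode extends Node
  sealed trait RegressionNode extends Node
  // Now a trait, so `new LeafNode(...)` in downstream code no longer compiles
  // ("trait LeafNode is abstract; cannot be instantiated").
  sealed trait LeafNode extends Node

  final class ClassificationLeafNode(val prediction: Double)
      extends ClassificationNode with LeafNode
  final class RegressionLeafNode(val prediction: Double)
      extends RegressionNode with LeafNode

  // A model whose root must be the specialized node type.
  final class ClassifierModel(val rootNode: ClassificationNode)

  // Downstream code holding a plain `Node` has to narrow it explicitly.
  // Passing `node` straight to the constructor is the "type mismatch" case.
  def toModel(node: Node): Option[ClassifierModel] = node match {
    case c: ClassificationNode => Some(new ClassifierModel(c))
    case _                     => None
  }
}
```

Under these assumptions, downstream code would need both to switch to the concrete node classes and to narrow `Node` values before building a model, which is consistent with the errors reported above.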


[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-20 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622942#comment-16622942
 ] 

Apache Spark commented on SPARK-25321:
--

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/22492




[jira] [Commented] (SPARK-25321) ML, Graph 2.4 QA: API: New Scala APIs, docs

2018-09-20 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623003#comment-16623003
 ] 

Apache Spark commented on SPARK-25321:
--

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/22510



