[jira] [Updated] (SPARK-12326) Move GBT implementation from spark.mllib to spark.ml

2016-08-08 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-12326:
--
Priority: Minor  (was: Major)

> Move GBT implementation from spark.mllib to spark.ml
> 
>
> Key: SPARK-12326
> URL: https://issues.apache.org/jira/browse/SPARK-12326
> Project: Spark
> Issue Type: Improvement
> Components: ML, MLlib
> Reporter: Seth Hendrickson
> Assignee: Seth Hendrickson
> Priority: Minor
>
> Several improvements to gradient boosted trees (e.g. a rawPrediction column,
> feature importance) are not possible without first moving the GBT
> implementation to spark.ml. This Jira covers moving the current GBT
> implementation to spark.ml, in roughly the following steps:
> 1. Copy the implementation to spark.ml and change the spark.ml classes to use
> that implementation. Current tests will ensure that the implementations learn
> exactly the same models.
> 2. Move the decision tree helper classes over to spark.ml (e.g. Impurity,
> InformationGainStats, ImpurityStats, DTStatsAggregator). Since all tree
> implementations will eventually reside in spark.ml, the helper classes
> should as well.
> 3. Remove the spark.mllib implementation, and make the spark.mllib APIs 
> wrappers around the spark.ml implementation. The spark.ml tests will again 
> ensure that we do not change any behavior.
> 4. Move the unit tests to spark.ml, and change the spark.mllib unit tests to 
> verify model equivalence.
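
Steps 3 and 4 amount to keeping the old entry point as a thin wrapper and
testing that both entry points learn identical models. A minimal,
self-contained Python sketch of that pattern follows. It does not use Spark:
every name in it (Stump, NewGBTRegressor, LegacyGradientBoostedTrees,
models_equal) is hypothetical, and the toy boosting loop (squared loss,
depth-1 trees on a single feature) only stands in for the real implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Stump:
    """A depth-1 regression tree on a single numeric feature (hypothetical)."""
    threshold: float
    left_value: float
    right_value: float

    def predict(self, x: float) -> float:
        return self.left_value if x <= self.threshold else self.right_value


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def _fit_stump(xs: Sequence[float], residuals: Sequence[float]) -> Stump:
    """Pick the threshold that minimises the squared error of the residuals."""
    best: Optional[Tuple[float, Stump]] = None
    # The largest x is skipped: splitting there leaves the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = _mean(left), _mean(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, Stump(t, lv, rv))
    if best is None:  # all x values equal: fall back to a constant prediction
        m = _mean(residuals)
        return Stump(xs[0], m, m)
    return best[1]


class NewGBTRegressor:
    """Stands in for the relocated implementation (the single source of truth)."""

    def __init__(self, num_iterations: int = 10, learning_rate: float = 0.5):
        self.num_iterations = num_iterations
        self.learning_rate = learning_rate

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> List[Stump]:
        preds = [0.0] * len(xs)
        trees: List[Stump] = []
        for _ in range(self.num_iterations):
            # For squared loss, the negative gradient is the plain residual.
            residuals = [y - p for y, p in zip(ys, preds)]
            tree = _fit_stump(xs, residuals)
            trees.append(tree)
            preds = [p + self.learning_rate * tree.predict(x)
                     for x, p in zip(xs, preds)]
        return trees


class LegacyGradientBoostedTrees:
    """Stands in for the old public API, kept only as a thin wrapper."""

    @staticmethod
    def train(data: Sequence[Tuple[float, float]],
              num_iterations: int = 10,
              learning_rate: float = 0.5) -> List[Stump]:
        # Convert the legacy input shape, then delegate; no training logic here.
        xs = [x for x, _ in data]
        ys = [y for _, y in data]
        return NewGBTRegressor(num_iterations, learning_rate).fit(xs, ys)


def models_equal(a: Sequence[Stump], b: Sequence[Stump]) -> bool:
    """Equivalence check a legacy-side test could use: same trees, same order."""
    return list(a) == list(b)
```

Because the wrapper holds no training logic of its own, an equivalence test
of this kind guards mainly against mistakes in argument conversion and in
default parameters.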



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-12326) Move GBT implementation from spark.mllib to spark.ml

2016-03-03 Thread Nick Pentreath (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Pentreath updated SPARK-12326:
---
Assignee: Seth Hendrickson

> Move GBT implementation from spark.mllib to spark.ml
> 
>
> Key: SPARK-12326
> URL: https://issues.apache.org/jira/browse/SPARK-12326
> Project: Spark
> Issue Type: Improvement
> Components: ML, MLlib
> Reporter: Seth Hendrickson
> Assignee: Seth Hendrickson






[jira] [Updated] (SPARK-12326) Move GBT implementation from spark.mllib to spark.ml

2015-12-16 Thread Seth Hendrickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Seth Hendrickson updated SPARK-12326:
-
Description: 
Several improvements to gradient boosted trees (e.g. a rawPrediction column,
feature importance) are not possible without first moving the GBT
implementation to spark.ml. This Jira covers moving the current GBT
implementation to spark.ml, in roughly the following steps:

1. Copy the implementation to spark.ml and change the spark.ml classes to use
that implementation. Current tests will ensure that the implementations learn
exactly the same models.
2. Move the decision tree helper classes over to spark.ml (e.g. Impurity,
InformationGainStats, ImpurityStats, DTStatsAggregator). Since all tree
implementations will eventually reside in spark.ml, the helper classes
should as well.
3. Remove the spark.mllib implementation, and make the spark.mllib APIs
wrappers around the spark.ml implementation. The spark.ml tests will again
ensure that we do not change any behavior.
4. Move the unit tests to spark.ml, and change the spark.mllib unit tests to
verify model equivalence.

  was:
Several improvements to gradient boosted trees (e.g. a rawPrediction column,
feature importance) are not possible without first moving the GBT
implementation to spark.ml. This Jira covers moving the current GBT
implementation to spark.ml, in roughly the following steps:

1. Copy the implementation to spark.ml and change the spark.ml classes to use
that implementation. Current tests will ensure that the implementations learn
exactly the same models.
2. Move the decision tree helper classes over to spark.ml (e.g. Impurity,
InformationGainStats, ImpurityStats, DTStatsAggregator). Since all tree
implementations will eventually reside in spark.ml, the helper classes
should as well.
3. Remove the spark.mllib implementation, and make the spark.mllib APIs
wrappers around the spark.ml implementation. The spark.ml tests will again
ensure that we do not change any behavior.
4. Move the unit tests to spark.ml, and change the spark.mllib unit tests to
verify model equivalence.

Steps 2, 3, and 4 should be in separate Jiras. 


> Move GBT implementation from spark.mllib to spark.ml
> 
>
> Key: SPARK-12326
> URL: https://issues.apache.org/jira/browse/SPARK-12326
> Project: Spark
> Issue Type: Improvement
> Components: ML, MLlib
> Reporter: Seth Hendrickson


