[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-05-28 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-13448:
--
Priority: Blocker  (was: Major)

> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
> Priority: Blocker
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
> spark.mllib
> * SPARK-14768: Remove expectedType arg for PySpark Param
> * SPARK-14931: Mismatched default Param values between pipelines in Spark and 
> PySpark
> * SPARK-13600: QuantileDiscretizer now uses approxQuantile from DataFrame
> stats (it previously used custom sampling logic). Buckets will differ for the
> same input data and params.
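
As an illustration of the SPARK-13429 item above, here is a minimal sketch of how a tighter convergence tolerance changes the point at which an iterative optimizer stops. It is plain gradient descent on a toy quadratic with a relative-improvement stopping rule, not the L-BFGS code behind LogisticRegressionWithLBFGS; the helper name `minimize`, the learning rate, and the objective are all assumptions made for the example.

```python
def minimize(tol, lr=0.1, max_iter=10_000):
    """Gradient descent on the toy objective f(x) = (x - 3)^2.

    Stops when the decrease in f between two iterations is at most
    tol * max(|f_prev|, 1). Returns (x, iterations_used).
    Hypothetical helper for illustration; not Spark's optimizer.
    """
    x = 0.0
    f_prev = (x - 3.0) ** 2
    for i in range(1, max_iter + 1):
        x -= lr * 2.0 * (x - 3.0)  # gradient of (x - 3)^2 is 2 * (x - 3)
        f_cur = (x - 3.0) ** 2
        if abs(f_prev - f_cur) <= tol * max(abs(f_prev), 1.0):
            return x, i
        f_prev = f_cur
    return x, max_iter


if __name__ == "__main__":
    for tol in (1e-4, 1e-6):  # old and new tolerance values from the list
        x, iters = minimize(tol)
        print(f"tol={tol:g}: {iters} iterations, error={abs(x - 3.0):.2e}")
```

In this toy setting the tighter tolerance runs more iterations and stops closer to the minimum, which suggests why models trained with otherwise identical settings can differ slightly after the change.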



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-05-03 Thread Nick Pentreath (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Pentreath updated SPARK-13448:
---
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param
* SPARK-14931: Mismatched default Param values between pipelines in Spark and 
PySpark
* SPARK-13600: QuantileDiscretizer now uses approxQuantile from DataFrame stats
(it previously used custom sampling logic). Buckets will differ for the same
input data and params.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param
* SPARK-14931: Mismatched default Param values between pipelines in Spark and 
PySpark
* SPARK-13600: Use approxQuantile from DataFrame stats in QuantileDiscretizer


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
> spark.mllib
> * SPARK-14768: Remove expectedType arg for PySpark Param
> * SPARK-14931: Mismatched default Param values between pipelines in Spark and 
> PySpark
> * SPARK-13600: QuantileDiscretizer now uses approxQuantile from DataFrame
> stats (it previously used custom sampling logic). Buckets will differ for the
> same input data and params.
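
To make the SPARK-13600 note concrete, the sketch below shows why bucket boundaries depend on how the quantiles are computed. It compares split points taken from the full data with split points taken from a random sample. Both functions are stand-ins written for this example: neither reproduces approxQuantile nor the earlier sampling logic in QuantileDiscretizer, and all names, sample sizes, and seeds are assumptions.

```python
import bisect
import random
import statistics


def splits_from_all_data(values, num_buckets):
    """Split points taken from quantiles of the full data set."""
    return statistics.quantiles(values, n=num_buckets, method="inclusive")


def splits_from_sample(values, num_buckets, sample_size, seed):
    """Split points taken from quantiles of a random sample.

    Assumed stand-in for sampling-based logic; not the code Spark used.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return statistics.quantiles(sample, n=num_buckets, method="inclusive")


def bucket_of(x, splits):
    """Index of the bucket that x falls into, given sorted split points."""
    return bisect.bisect_right(splits, x)


if __name__ == "__main__":
    data = [float(v) for v in range(1000)]
    full = splits_from_all_data(data, 4)
    sampled = splits_from_sample(data, 4, sample_size=50, seed=0)
    print("splits from all data:", full)
    print("splits from a sample:", sampled)
    moved = sum(bucket_of(v, full) != bucket_of(v, sampled) for v in data)
    print("values whose bucket changed:", moved)
```

Any value that lies between an old and a new split point lands in a different bucket, so downstream features built on the bucket index can change even though the input data and params are the same.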






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-05-03 Thread Nick Pentreath (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Pentreath updated SPARK-13448:
---
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param
* SPARK-14931: Mismatched default Param values between pipelines in Spark and 
PySpark
* SPARK-13600: Use approxQuantile from DataFrame stats in QuantileDiscretizer

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param
* SPARK-14931: Mismatched default Param values between pipelines in Spark and 
PySpark


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
> spark.mllib
> * SPARK-14768: Remove expectedType arg for PySpark Param
> * SPARK-14931: Mismatched default Param values between pipelines in Spark and 
> PySpark
> * SPARK-13600: Use approxQuantile from DataFrame stats in QuantileDiscretizer






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-04-30 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param
* SPARK-14931: Mismatched default Param values between pipelines in Spark and 
PySpark

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
> spark.mllib
> * SPARK-14768: Remove expectedType arg for PySpark Param
> * SPARK-14931: Mismatched default Param values between pipelines in Spark and 
> PySpark
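
For the SPARK-7780 item, the difference between regularizing and not regularizing the intercept can be sketched with a small objective function. This is a pure-Python illustration under stated assumptions: the penalty form `0.5 * reg_param * ||w||^2`, the function names, and the toy data are all made up for the example and are not taken from the Spark code.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_loss(weights, intercept, data):
    """Mean logistic loss for labels in {0, 1}."""
    total = 0.0
    for features, label in data:
        z = sum(w * x for w, x in zip(weights, features)) + intercept
        total += softplus(z) - label * z
    return total / len(data)


def objective(weights, intercept, data, reg_param, regularize_intercept):
    """Logistic loss plus an L2 penalty (assumed form, for illustration)."""
    penalty = sum(w * w for w in weights)
    if regularize_intercept:
        penalty += intercept * intercept  # intercept is penalized as well
    return log_loss(weights, intercept, data) + 0.5 * reg_param * penalty


if __name__ == "__main__":
    toy = [([1.0], 1), ([2.0], 1), ([-1.0], 0)]
    for flag in (True, False):
        value = objective([0.5], 2.0, toy, reg_param=0.1,
                          regularize_intercept=flag)
        print(f"regularize_intercept={flag}: objective={value:.6f}")
```

The two objectives differ only by the intercept's share of the penalty, so with a nonzero regularization parameter their minimizers generally differ, and with `reg_param=0` they coincide.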






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-04-21 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param
** (*pending further discussion*)


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
> spark.mllib
> * SPARK-14768: Remove expectedType arg for PySpark Param






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-04-20 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib
* SPARK-14768: Remove expectedType arg for PySpark Param
** (*pending further discussion*)

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
> spark.mllib
> * SPARK-14768: Remove expectedType arg for PySpark Param
> ** (*pending further discussion*)






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-04-19 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
spark.mllib

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and 
> spark.mllib
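
The SPARK-10574 item matters because a hashing term-frequency transform maps each term to a vector index through the hash function, so changing the hash changes the indices. The sketch below uses two standard-library hashes (CRC32 and a truncated SHA-256) purely as stand-ins; neither is MurmurHash3 or the hash HashingTF used before, and the function names and feature count are assumptions for the example.

```python
import hashlib
import zlib
from collections import Counter


def index_crc32(term, num_features):
    """Feature index from CRC32 (stand-in for one hash function)."""
    return zlib.crc32(term.encode("utf-8")) % num_features


def index_sha256(term, num_features):
    """Feature index from the first 4 bytes of SHA-256 (another stand-in)."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little") % num_features


def hashing_tf(terms, num_features, index_fn):
    """Sparse term-frequency vector as {feature_index: count}."""
    counts = Counter(index_fn(t, num_features) for t in terms)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    doc = ["spark", "mllib", "spark"]
    print("hash A:", hashing_tf(doc, 1 << 18, index_crc32))
    print("hash B:", hashing_tf(doc, 1 << 18, index_sha256))
```

Because the same document generally produces different sparse vectors under the two hashes, feature vectors or models built with one hash function are not interchangeable with those built with another.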






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-04-19 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can
remember to add them to the migration guide / release notes.

* SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary
classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
because it calls the ML LogisticRegression implementation. Also, if users train
without regularization, training with or without feature scaling will return
the same solution at the same convergence rate (because both run the same code
path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we
> can remember to add them to the migration guide / release notes.
> * SPARK-13429: Change convergenceTol in LogisticRegressionWithLBFGS from 1e-4
> to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary
> classification model with an L1/L2 Updater via LogisticRegressionWithLBFGS,
> because it calls the ML LogisticRegression implementation. Also, if users
> train without regularization, training with or without feature scaling will
> return the same solution at the same convergence rate (because both run the
> same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-04-08 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide / release notes.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
not handle them correctly.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide / release notes.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide / release notes.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: Intercept will not be regularized if users train binary 
> classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, 
> because it calls ML LogisticRegression implementation. Meanwhile if users set 
> without regularization, training with or without feature scaling will return 
> the same solution by the same convergence rate (because they run the same code 
> route); this behavior is different from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries.  Previously, it did 
> not handle them correctly.






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-04-07 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide / release notes.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
default, if checkpointing is being used.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide / release notes.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide / release notes.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: Intercept will not be regularized if users train binary 
> classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, 
> because it calls ML LogisticRegression implementation. Meanwhile if users set 
> without regularization, training with or without feature scaling will return 
> the same solution by the same convergence rate (because they run the same code 
> route); this behavior is different from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by 
> default, if checkpointing is being used.






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-03-15 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide / release notes.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide / release notes.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: Intercept will not be regularized if users train binary 
> classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, 
> because it calls ML LogisticRegression implementation. Meanwhile if users set 
> without regularization, training with or without feature scaling will return 
> the same solution by the same convergence rate (because they run the same code 
> route); this behavior is different from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-03-15 Thread Joseph K. Bradley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
results

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: Intercept will not be regularized if users train binary 
> classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, 
> because it calls ML LogisticRegression implementation. Meanwhile if users set 
> without regularization, training with or without feature scaling will return 
> the same solution by the same convergence rate (because they run the same code 
> route); this behavior is different from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering which will likely change 
> results






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: Intercept will not be regularized if users train binary 
classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because 
it calls ML LogisticRegression implementation. Meanwhile if users set without 
regularization, training with or without feature scaling will return the same 
solution by the same convergence rate (because they run the same code route); 
this behavior is different from the old API.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized if 
users train binary classification model with L1/L2 Updater by 
LogisticRegressionWithLBFGS, because it calls ML LogisticRegression 
implementation. Meanwhile if users set without regularization, training with or 
without feature scaling will return the same solution by the same convergence 
rate (because they run the same code route); this behavior is different from the 
old API.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: Intercept will not be regularized if users train binary 
> classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, 
> because it calls ML LogisticRegression implementation. Meanwhile if users set 
> without regularization, training with or without feature scaling will return 
> the same solution by the same convergence rate (because they run the same code 
> route); this behavior is different from the old API.






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized if 
users train binary classification model with L1/L2 Updater by 
LogisticRegressionWithLBFGS, because it calls ML LogisticRegression 
implementation. Meanwhile if users set without regularization, training with or 
without feature scaling will return the same solution by the same convergence 
rate (because they run the same code route); this behavior is different from the 
old API.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if users train binary classification model with L1/L2 Updater by 
LogisticRegressionWithLBFGS, it calls ML LogisticRegression implementation. 
When without regularization, training with or without feature scaling will 
return the same solution by the same convergence rate (because they run the same 
code route); this behavior is different from the old API.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized 
> if users train binary classification model with L1/L2 Updater by 
> LogisticRegressionWithLBFGS, because it calls ML LogisticRegression 
> implementation. Meanwhile if users set without regularization, training with 
> or without feature scaling will return the same solution by the same 
> convergence rate (because they run the same code route); this behavior is 
> different from the old API.






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if users train binary classification model with L1/L2 Updater by 
LogisticRegressionWithLBFGS, it calls ML LogisticRegression implementation. 
When without regularization, training with or without feature scaling will 
return the same solution by the same convergence rate (because they run the same 
code route); this behavior is different from the old API.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if users train binary classification model with L1/L2 Updater, it 
calls ML LogisticRegression implementation.
When without regularization, training with or without feature scaling will 
return the same solution by the same convergence rate (because they run the same 
code route).


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
> Meanwhile if users train binary classification model with L1/L2 Updater by 
> LogisticRegressionWithLBFGS, it calls ML LogisticRegression implementation. 
> When without regularization, training with or without feature scaling will 
> return the same solution by the same convergence rate (because they run the 
> same code route); this behavior is different from the old API.






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if users train binary classification model with L1/L2 Updater, it 
calls ML LogisticRegression implementation.
When without regularization, training with or without feature scaling will 
return the same solution by the same convergence rate (because they run the same 
code route).

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if without regularization, training with or without feature scaling 
will return the same solution by the same convergence rate (because they run the 
same code route).


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
> Meanwhile if users train binary classification model with L1/L2 Updater, it 
> calls ML LogisticRegression implementation.
> When without regularization, training with or without feature scaling will 
> return the same solution by the same convergence rate (because they run the 
> same code route).






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if without regularization, training with or without feature scaling 
will return the same solution by the same convergence rate (because they run the 
same code route).

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if without regularization, training with standardization and without 
standardization will return the same solution by the same convergence 
rate (because they run the same code route).


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
>  Issue Type: Documentation
>  Components: ML, MLlib
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
> Meanwhile if without regularization, training with or without feature scaling 
> will return the same solution by the same convergence rate (because they run 
> the same code route).






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if without regularization, training with standardization and without 
standardization will return the same solution by the same convergence 
rate (because they run the same code route).

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if without penalty, training with standardization and without 
standardization will return the same solution by the same convergence 
rate (because they run the same code route).


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
>  Issue Type: Documentation
>  Components: ML, MLlib
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
> Meanwhile if without regularization, training with standardization and 
> without standardization will return the same solution by the same convergence 
> rate (because they run the same code route).






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
Meanwhile if without penalty, training with standardization and without 
standardization will return the same solution by the same convergence 
rate (because they run the same code route).

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept should not be regularized. 
If without penalty, training with standardization and without standardization 
will return the same solution by the same convergence rate (because they run the 
same code route).


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
>  Issue Type: Documentation
>  Components: ML, MLlib
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. 
> Meanwhile if without penalty, training with standardization and without 
> standardization will return the same solution by the same convergence 
> rate (because they run the same code route).






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegressionWithLBFGS intercept should not be regularized. 
If without penalty, training with standardization and without standardization 
will return the same solution by the same convergence rate (because they run the 
same code route).

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegression intercept should not be regularized.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
>  Issue Type: Documentation
>  Components: ML, MLlib
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept should not be 
> regularized. If without penalty, training with standardization and without 
> standardization will return the same solution by the same convergence 
> rate (because they run the same code route).






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-28 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated SPARK-13448:

Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.
* SPARK-7780: LogisticRegression intercept should not be regularized.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0 so that we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0 so that we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.
> * SPARK-7780: LogisticRegression intercept should not be regularized.






[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-02-22 Thread Xiangrui Meng (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-13448:
--
Description: 
This JIRA keeps a list of MLlib behavior changes in Spark 2.0 so that we can 
remember to add them to the migration guide.

* SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
to 1e-6.

  was:This JIRA keeps a list of MLlib behavior changes in Spark 2.0 so that we can 
remember to add them to the migration guide.


> Document MLlib behavior changes in Spark 2.0
> 
>
> Key: SPARK-13448
> URL: https://issues.apache.org/jira/browse/SPARK-13448
> Project: Spark
> Issue Type: Documentation
> Components: ML, MLlib
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0 so that we can 
> remember to add them to the migration guide.
> * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 
> to 1e-6.


