[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-13448:
------------------------------
    Priority: Blocker  (was: Major)

> Document MLlib behavior changes in Spark 2.0
> --------------------------------------------
>
>                 Key: SPARK-13448
>                 URL: https://issues.apache.org/jira/browse/SPARK-13448
>             Project: Spark
>          Issue Type: Documentation
>          Components: ML, MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>            Priority: Blocker
>
> This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
> * SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
> * SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
> * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
> * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
> * SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
> * SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
> * SPARK-14768: Remove expectedType arg for PySpark Param.
> * SPARK-14931: Mismatched default Param values between pipelines in Spark and PySpark.
> * SPARK-13600: QuantileDiscretizer now uses approxQuantile from DataFrame stats (previously used custom sampling logic). Buckets will differ for the same input data and params.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
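
To illustrate why the SPARK-13429 entry (convergenceTol 1e-4 to 1e-6) counts as a behavior change, here is a minimal sketch in plain Python. It is not Spark code and does not use L-BFGS: the toy objective, the gradient-descent update, the `minimize` helper, and the relative-improvement stopping rule are all assumptions made for illustration. It only shows that a tighter tolerance tends to run more iterations and land closer to the optimum, so fitted coefficients can shift slightly.

```python
def minimize(tol, lr=0.1, max_iter=10_000):
    """Toy gradient descent on f(x) = (x - 3)^2 + 1 (hypothetical objective).

    Stops when the relative improvement of the objective drops below `tol`.
    Returns the final x and the number of iterations used.
    """
    def f(x):
        return (x - 3.0) ** 2 + 1.0

    x = 0.0
    prev = f(x)
    for it in range(1, max_iter + 1):
        x -= lr * 2.0 * (x - 3.0)  # gradient of f is 2 * (x - 3)
        cur = f(x)
        if abs(prev - cur) / max(abs(prev), 1e-12) < tol:
            return x, it
        prev = cur
    return x, max_iter


# Old default tolerance vs. new default tolerance from the list above.
x_loose, iters_loose = minimize(1e-4)
x_tight, iters_tight = minimize(1e-6)
print(iters_loose, abs(x_loose - 3.0))
print(iters_tight, abs(x_tight - 3.0))
```

With the tighter tolerance the loop runs longer and the result sits nearer to the minimizer x = 3, which is the kind of small numerical difference users may notice after upgrading.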
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Pentreath updated SPARK-13448:
-----------------------------------
    Description:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
* SPARK-14931: Mismatched default Param values between pipelines in Spark and PySpark.
* SPARK-13600: QuantileDiscretizer now uses approxQuantile from DataFrame stats (previously used custom sampling logic). Buckets will differ for the same input data and params.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
* SPARK-14931: Mismatched default Param values between pipelines in Spark and PySpark.
* SPARK-13600: Use approxQuantile from DataFrame stats in QuantileDiscretizer.
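
The SPARK-13600 note says buckets will differ for the same input and params. A rough way to see why is sketched below in plain Python. This is neither the old custom sampling logic nor the algorithm behind approxQuantile; the data, the sample size of 200, and the `splits` helper are assumptions for illustration. The point is only that two different ways of estimating quantiles give slightly different split points, so rows near a boundary can move to a neighboring bucket.

```python
import random
import statistics

random.seed(0)
# Hypothetical input column: 10,000 draws from a standard normal.
data = [random.gauss(0.0, 1.0) for _ in range(10_000)]


def splits(values, num_buckets):
    """Return the num_buckets - 1 cut points that divide `values` into buckets."""
    return statistics.quantiles(values, n=num_buckets)


# Split points from the full data vs. from a small random sample.
exact = splits(data, 4)
sampled = splits(random.sample(data, 200), 4)

for e, s in zip(exact, sampled):
    print(f"full-data split {e:+.4f}   sample-based split {s:+.4f}")
```

Any pipeline that persisted bucket boundaries or downstream models trained on bucket indices may therefore need to be refit after upgrading.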
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Pentreath updated SPARK-13448:
-----------------------------------
    Description:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
* SPARK-14931: Mismatched default Param values between pipelines in Spark and PySpark.
* SPARK-13600: Use approxQuantile from DataFrame stats in QuantileDiscretizer.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
* SPARK-14931: Mismatched default Param values between pipelines in Spark and PySpark.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-13448:
--------------------------------------
    Description:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
* SPARK-14931: Mismatched default Param values between pipelines in Spark and PySpark.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-13448:
--------------------------------------
    Description:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
** (*pending further discussion*)
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-13448:
--------------------------------------
    Description:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
* SPARK-14768: Remove expectedType arg for PySpark Param.
** (*pending further discussion*)

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-13448:
--------------------------------------
    Description:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default in both spark.ml and spark.mllib.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default.
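
The SPARK-10574 entry matters because the hashing trick maps each term to a vector index through the hash function, so changing the function changes where every term lands. The sketch below is plain Python, not Spark's HashingTF, and it does not implement MurmurHash3: `crc32_hash` and `sha256_hash` are stand-in hash functions chosen only because they ship with the standard library, and `hashing_tf` is a hypothetical helper.

```python
import hashlib
import zlib


def crc32_hash(term):
    """Stand-in hash function #1 (not what Spark uses)."""
    return zlib.crc32(term.encode("utf-8"))


def sha256_hash(term):
    """Stand-in hash function #2 (not what Spark uses)."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def hashing_tf(tokens, num_features, hash_fn):
    """Term-frequency vector via the hashing trick: index = hash(term) mod size."""
    vec = [0] * num_features
    for token in tokens:
        vec[hash_fn(token) % num_features] += 1
    return vec


tokens = ["spark", "mllib", "spark", "hashing"]
vec_a = hashing_tf(tokens, 1 << 10, crc32_hash)
vec_b = hashing_tf(tokens, 1 << 10, sha256_hash)

# Same tokens, same vector size, but the non-zero positions depend on the hash.
print([i for i, c in enumerate(vec_a) if c])
print([i for i, c in enumerate(vec_b) if c])
```

Feature vectors produced before the upgrade are therefore generally not index-compatible with vectors produced after it, and models trained on the old vectors would need retraining on re-hashed features.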
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-13448:
--------------------------------------
    Description:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
* SPARK-10574: HashingTF uses MurmurHash3 by default.

  was:
This JIRA keeps a list of MLlib behavior changes in Spark 2.0, so that we can remember to add them to the migration guide / release notes.
* SPARK-13429: Changed convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6.
* SPARK-7780: The intercept will not be regularized if users train a binary classification model with the L1/L2 Updater via LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because both run the same code path). This behavior differs from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results.
* SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used.
* SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13448: -- Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide / release notes. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train binary classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls ML LogisticRegresson implementation. Meanwhile if users set without regularization, training with or without feature scaling will return the same solution by the same convergence rate(because they run the same code route), this behavior is different from the old API. * SPARK-12363: Bug fix for PowerIterationClustering which will likely change results * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used. * SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did not handle them correctly. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide / release notes. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train binary classification model with L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls ML LogisticRegresson implementation. Meanwhile if users set without regularization, training with or without feature scaling will return the same solution by the same convergence rate(because they run the same code route), this behavior is different from the old API. * SPARK-12363: Bug fix for PowerIterationClustering which will likely change results * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used. 
> Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide / release notes. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: Intercept will not be regularized if users train a binary > classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, > because it calls the ML LogisticRegression implementation. Meanwhile, if users > train without regularization, training with or without feature scaling will > return the same solution at the same convergence rate (because they run the > same code path); this behavior is different from the old API. > * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change > results. > * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by > default, if checkpointing is being used. > * SPARK-12153: Word2Vec now respects sentence boundaries. Previously, it did > not handle them correctly.
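As a rough illustration of the SPARK-12153 item above (this is not Spark's Word2Vec code), the sketch below builds skip-gram style (center, context) pairs in plain Python. The function names, the toy corpus, and the window size are all hypothetical. The "ignoring boundaries" variant shows only one way boundaries can be lost; the list above does not say how the earlier implementation mishandled them.

```python
def context_pairs(sentences, window):
    """Return (center, context) pairs, never pairing words from different sentences."""
    pairs = []
    for sentence in sentences:
        for i, center in enumerate(sentence):
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, sentence[j]))
    return pairs


def context_pairs_ignoring_boundaries(sentences, window):
    """Apply the same windowing to one flattened word stream (boundaries ignored)."""
    flat = [word for sentence in sentences for word in sentence]
    return context_pairs([flat], window)


# Hypothetical toy corpus: two unrelated sentences.
corpus = [["spark", "trains", "models"], ["hashing", "maps", "terms"]]
bounded = context_pairs(corpus, window=1)
flattened = context_pairs_ignoring_boundaries(corpus, window=1)

# Only the flattened variant pairs the last word of one sentence
# with the first word of the next.
print(("models", "hashing") in bounded)
print(("models", "hashing") in flattened)
```

Pairs that cross a sentence boundary feed unrelated words to the model as context, so a change in boundary handling can change the learned vectors even for identical input and parameters.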
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide / release notes. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results. * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by default, if checkpointing is being used. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide / release notes. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API.
* SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide / release notes. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: Intercept will not be regularized if users train a binary > classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, > because it calls the ML LogisticRegression implementation. Meanwhile, if users > train without regularization, training with or without feature scaling will > return the same solution at the same convergence rate (because they run the > same code path); this behavior is different from the old API. > * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change > results. > * SPARK-13048: LDA using the EM optimizer will keep the last checkpoint by > default, if checkpointing is being used.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide / release notes. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0.
So we can > remember to add them to the migration guide / release notes. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: Intercept will not be regularized if users train a binary > classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, > because it calls the ML LogisticRegression implementation. Meanwhile, if users > train without regularization, training with or without feature scaling will > return the same solution at the same convergence rate (because they run the > same code path); this behavior is different from the old API. > * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change > results.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change results. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6.
> * SPARK-7780: Intercept will not be regularized if users train a binary > classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, > because it calls the ML LogisticRegression implementation. Meanwhile, if users > train without regularization, training with or without feature scaling will > return the same solution at the same convergence rate (because they run the > same code path); this behavior is different from the old API. > * SPARK-12363: Bug fix for PowerIterationClustering, which will likely change > results.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: Intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6.
> * SPARK-7780: Intercept will not be regularized if users train a binary > classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, > because it calls the ML LogisticRegression implementation. Meanwhile, if users > train without regularization, training with or without feature scaling will > return the same solution at the same convergence rate (because they run the > same code path); this behavior is different from the old API.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression implementation. Meanwhile, if users train without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, it calls the ML LogisticRegression implementation. Without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6.
> * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized > if users train a binary classification model with an L1/L2 Updater by > LogisticRegressionWithLBFGS, because it calls the ML LogisticRegression > implementation. Meanwhile, if users train without regularization, training with > or without feature scaling will return the same solution at the same > convergence rate (because they run the same code path); this behavior is > different from the old API.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, if users train a binary classification model with an L1/L2 Updater by LogisticRegressionWithLBFGS, it calls the ML LogisticRegression implementation. Without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path); this behavior is different from the old API. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, if users train a binary classification model with an L1/L2 Updater, it calls the ML LogisticRegression implementation. Without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path). > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized.
> Meanwhile, if users train a binary classification model with an L1/L2 Updater by > LogisticRegressionWithLBFGS, it calls the ML LogisticRegression implementation. > Without regularization, training with or without feature scaling will > return the same solution at the same convergence rate (because they run the > same code path); this behavior is different from the old API.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, if users train a binary classification model with an L1/L2 Updater, it calls the ML LogisticRegression implementation. Without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path). was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path). > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. > Meanwhile, if users train a binary classification model with an L1/L2 Updater, it > calls the ML LogisticRegression implementation.
> Without regularization, training with or without feature scaling will > return the same solution at the same convergence rate (because they run the > same code path).
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, without regularization, training with or without feature scaling will return the same solution at the same convergence rate (because they run the same code path). was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, without regularization, training with standardization and without standardization will return the same solution at the same convergence rate (because they run the same code path). > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. > Meanwhile, without regularization, training with or without feature scaling > will return the same solution at the same convergence rate (because they run > the same code path).
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, without regularization, training with standardization and without standardization will return the same solution at the same convergence rate (because they run the same code path). was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, without penalty, training with standardization and without standardization will return the same solution at the same convergence rate (because they run the same code path). > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. > Meanwhile, without regularization, training with standardization and > without standardization will return the same solution at the same convergence > rate (because they run the same code path).
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. Meanwhile, without penalty, training with standardization and without standardization will return the same solution at the same convergence rate (because they run the same code path). was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept should not be regularized. Without penalty, training with standardization and without standardization will return the same solution at the same convergence rate (because they run the same code path). > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: LogisticRegressionWithLBFGS intercept will not be regularized. > Meanwhile, without penalty, training with standardization and without > standardization will return the same solution at the same convergence > rate (because they run the same code path).
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegressionWithLBFGS intercept should not be regularized. Without penalty, training with standardization and without standardization will return the same solution at the same convergence rate (because they run the same code path). was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegression intercept should not be regularized. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: LogisticRegressionWithLBFGS intercept should not be > regularized. Without penalty, training with standardization and without > standardization will return the same solution at the same convergence > rate (because they run the same code path).
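To make the SPARK-7780 item more concrete, here is a minimal sketch of what leaving the intercept out of an L2 penalty means. It uses plain gradient descent on a hypothetical ten-point dataset; it is not the Spark API and not L-BFGS, and `fit`, `mean_probability`, the learning rate, and the penalty strength are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit(xs, ys, reg, regularize_intercept, lr=0.5, steps=5000):
    """Minimize mean log-loss + (reg / 2) * w**2, optionally also penalizing b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errs = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errs, xs)) / n + reg * w
        grad_b = sum(errs) / n + (reg * b if regularize_intercept else 0.0)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mean_probability(xs, w, b):
    return sum(sigmoid(w * x + b) for x in xs) / len(xs)


# Hypothetical data: one feature in {-1, +1}; 8 of the 10 labels are positive.
xs = [-1.0] * 5 + [1.0] * 5
ys = [1, 1, 1, 0, 0] + [1, 1, 1, 1, 1]
base_rate = sum(ys) / len(ys)

w_free, b_free = fit(xs, ys, reg=0.1, regularize_intercept=False)
w_pen, b_pen = fit(xs, ys, reg=0.1, regularize_intercept=True)

print(b_free, mean_probability(xs, w_free, b_free), base_rate)
print(b_pen, mean_probability(xs, w_pen, b_pen), base_rate)
```

With the intercept left unpenalized, the fitted model's average predicted probability matches the label base rate on the training data. Penalizing the intercept pulls it toward zero and breaks that match, which is one reason fitted values can differ between an implementation that regularizes the intercept and one that does not.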
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. * SPARK-7780: LogisticRegression intercept should not be regularized. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6. > * SPARK-7780: LogisticRegression intercept should not be regularized.
[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-13448: Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 to 1e-6. was: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can remember to add them to the migration guide. > Document MLlib behavior changes in Spark 2.0 > > > Key: SPARK-13448 > URL: https://issues.apache.org/jira/browse/SPARK-13448 > Project: Spark > Issue Type: Documentation > Components: ML, MLlib > Reporter: Xiangrui Meng > Assignee: Xiangrui Meng > > This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can > remember to add them to the migration guide. > * SPARK-13429: change convergenceTol in LogisticRegressionWithLBFGS from 1e-4 > to 1e-6.
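The effect of tightening a convergence tolerance from 1e-4 to 1e-6 (SPARK-13429) can be sketched outside Spark. The example below assumes a relative-improvement stopping rule on the loss and uses plain gradient descent rather than L-BFGS. The stopping rule, the `fit` helper, the toy data, and the learning rate are illustrative assumptions, not a description of how LogisticRegressionWithLBFGS applies convergenceTol internally.

```python
import math


def log_loss(w, b, xs, ys):
    """Mean logistic loss for labels in {0, 1}."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = w * x + b
        total += math.log1p(math.exp(z)) - y * z
    return total / len(xs)


def fit(xs, ys, tol, lr=0.5, max_iter=100000):
    """Gradient descent that stops once the relative loss improvement drops below tol."""
    w, b = 0.0, 0.0
    n = len(xs)
    prev = log_loss(w, b, xs, ys)
    for iteration in range(1, max_iter + 1):
        errs = [1.0 / (1.0 + math.exp(-(w * x + b))) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errs, xs)) / n
        b -= lr * sum(errs) / n
        cur = log_loss(w, b, xs, ys)
        if abs(prev - cur) / max(abs(prev), 1e-12) < tol:
            return w, b, iteration, cur
        prev = cur
    return w, b, max_iter, prev


# Hypothetical, non-separable toy data so that a finite optimum exists.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0, -1.0, 1.0]
ys = [0, 0, 0, 1, 1, 1, 1, 0]

w_loose, b_loose, iters_loose, loss_loose = fit(xs, ys, tol=1e-4)
w_tight, b_tight, iters_tight, loss_tight = fit(xs, ys, tol=1e-6)

print(iters_loose, loss_loose)
print(iters_tight, loss_tight)
```

Both runs follow the same trajectory, and the tighter tolerance simply stops later on it. Code that relied on the old value would therefore tend to see longer training and slightly different coefficients after upgrading, unless it sets the tolerance explicitly.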