[jira] [Commented] (SPARK-1215) Clustering: Index out of bounds error

2014-07-14 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061095#comment-14061095 ] Joseph K. Bradley commented on SPARK-1215: -- Submitted fix as PR 1407:

[jira] [Created] (SPARK-2497) @DeveloperApi tag does not suppress MIMA warnings

2014-07-15 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-2497: Summary: @DeveloperApi tag does not suppress MIMA warnings Key: SPARK-2497 URL: https://issues.apache.org/jira/browse/SPARK-2497 Project: Spark

[jira] [Updated] (SPARK-2497) @DeveloperApi tag does not suppress MIMA warnings

2014-07-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-2497: - Issue Type: Sub-task (was: Bug) Parent: SPARK-2487 @DeveloperApi tag does not

[jira] [Commented] (SPARK-1215) Clustering: Index out of bounds error

2014-07-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063152#comment-14063152 ] Joseph K. Bradley commented on SPARK-1215: -- Just to let you know, I'll give the

[jira] [Issue Comment Deleted] (SPARK-1215) Clustering: Index out of bounds error

2014-07-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-1215: - Comment: was deleted (was: Just to let you know, I'll give the go-ahead for this

[jira] [Created] (SPARK-2692) Decision Tree API update

2014-07-25 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-2692: Summary: Decision Tree API update Key: SPARK-2692 URL: https://issues.apache.org/jira/browse/SPARK-2692 Project: Spark Issue Type: Improvement

[jira] [Comment Edited] (SPARK-2692) Decision Tree API update

2014-07-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074769#comment-14074769 ] Joseph K. Bradley edited comment on SPARK-2692 at 7/25/14 7:00 PM:

[jira] [Commented] (SPARK-2197) Spark invoke DecisionTree by Java

2014-07-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078675#comment-14078675 ] Joseph K. Bradley commented on SPARK-2197: -- This error is at least partly caused

[jira] [Commented] (SPARK-2737) ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs

2014-07-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078705#comment-14078705 ] Joseph K. Bradley commented on SPARK-2737: -- Relating to [SPARK-2197 Spark invoke

[jira] [Created] (SPARK-2756) Decision Tree bugs

2014-07-30 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-2756: Summary: Decision Tree bugs Key: SPARK-2756 URL: https://issues.apache.org/jira/browse/SPARK-2756 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2756) Decision Tree bugs

2014-07-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080136#comment-14080136 ] Joseph K. Bradley commented on SPARK-2756: -- Submitted

[jira] [Issue Comment Deleted] (SPARK-2756) Decision Tree bugs

2014-07-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-2756: - Comment: was deleted (was: Submitted [https://github.com/apache/spark/pull/1673] with

[jira] [Updated] (SPARK-2756) Decision Tree bugs

2014-07-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-2756: - Description: 3 bugs: Bug 1: Indexing is inconsistent for aggregate calculations for

[jira] [Created] (SPARK-2796) DecisionTree bug with ordered categorical features

2014-08-01 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-2796: Summary: DecisionTree bug with ordered categorical features Key: SPARK-2796 URL: https://issues.apache.org/jira/browse/SPARK-2796 Project: Spark

[jira] [Updated] (SPARK-2909) Indexing for vectors in pyspark

2014-08-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-2909: - Component/s: PySpark Indexing for vectors in pyspark ---

[jira] [Updated] (SPARK-2909) Indexing for vectors in pyspark

2014-08-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-2909: - Summary: Indexing for vectors in pyspark (was: Indexing for vectors in MLlib)

[jira] [Created] (SPARK-3041) DecisionTree: isSampleValid indexing incorrect

2014-08-14 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3041: Summary: DecisionTree: isSampleValid indexing incorrect Key: SPARK-3041 URL: https://issues.apache.org/jira/browse/SPARK-3041 Project: Spark Issue

[jira] [Created] (SPARK-3043) DecisionTree aggregation is inefficient

2014-08-14 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3043: Summary: DecisionTree aggregation is inefficient Key: SPARK-3043 URL: https://issues.apache.org/jira/browse/SPARK-3043 Project: Spark Issue Type:

[jira] [Created] (SPARK-3155) Add support for pruning to DecisionTree

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3155: Summary: Add support for pruning to DecisionTree Key: SPARK-3155 URL: https://issues.apache.org/jira/browse/SPARK-3155 Project: Spark Issue Type:

[jira] [Updated] (SPARK-3155) Support DecisionTree pruning

2014-08-20 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3155: - Summary: Support DecisionTree pruning (was: Add support for pruning to DecisionTree)

[jira] [Created] (SPARK-3157) Avoid duplicated stats in DecisionTree extractLeftRightNodeAggregates

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3157: Summary: Avoid duplicated stats in DecisionTree extractLeftRightNodeAggregates Key: SPARK-3157 URL: https://issues.apache.org/jira/browse/SPARK-3157 Project:

[jira] [Created] (SPARK-3156) DecisionTree: Order categorical features adaptively

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3156: Summary: DecisionTree: Order categorical features adaptively Key: SPARK-3156 URL: https://issues.apache.org/jira/browse/SPARK-3156 Project: Spark

[jira] [Created] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3158: Summary: Avoid 1 extra aggregation for DecisionTree training Key: SPARK-3158 URL: https://issues.apache.org/jira/browse/SPARK-3158 Project: Spark

[jira] [Created] (SPARK-3159) Check for reducible DecisionTree

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3159: Summary: Check for reducible DecisionTree Key: SPARK-3159 URL: https://issues.apache.org/jira/browse/SPARK-3159 Project: Spark Issue Type:

[jira] [Created] (SPARK-3161) Cache example-node map for DecisionTree training

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3161: Summary: Cache example-node map for DecisionTree training Key: SPARK-3161 URL: https://issues.apache.org/jira/browse/SPARK-3161 Project: Spark Issue

[jira] [Created] (SPARK-3160) Simplify DecisionTree data structure for training

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3160: Summary: Simplify DecisionTree data structure for training Key: SPARK-3160 URL: https://issues.apache.org/jira/browse/SPARK-3160 Project: Spark

[jira] [Created] (SPARK-3163) Separate continuous and categorical features in DecisionTree

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3163: Summary: Separate continuous and categorical features in DecisionTree Key: SPARK-3163 URL: https://issues.apache.org/jira/browse/SPARK-3163 Project: Spark

[jira] [Created] (SPARK-3162) Train DecisionTree locally when possible

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3162: Summary: Train DecisionTree locally when possible Key: SPARK-3162 URL: https://issues.apache.org/jira/browse/SPARK-3162 Project: Spark Issue Type:

[jira] [Created] (SPARK-3164) Store DecisionTree Split.categories as Set

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3164: Summary: Store DecisionTree Split.categories as Set Key: SPARK-3164 URL: https://issues.apache.org/jira/browse/SPARK-3164 Project: Spark Issue Type:

[jira] [Created] (SPARK-3165) DecisionTree does not use sparsity in data

2014-08-20 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3165: Summary: DecisionTree does not use sparsity in data Key: SPARK-3165 URL: https://issues.apache.org/jira/browse/SPARK-3165 Project: Spark Issue Type:

[jira] [Created] (SPARK-3207) Choose splits for continuous features in DecisionTree more adaptively

2014-08-25 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3207: Summary: Choose splits for continuous features in DecisionTree more adaptively Key: SPARK-3207 URL: https://issues.apache.org/jira/browse/SPARK-3207 Project:

[jira] [Created] (SPARK-3213) spark_ec2.py cannot find slave instances

2014-08-25 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3213: Summary: spark_ec2.py cannot find slave instances Key: SPARK-3213 URL: https://issues.apache.org/jira/browse/SPARK-3213 Project: Spark Issue Type:

[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances

2014-08-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109697#comment-14109697 ] Joseph K. Bradley commented on SPARK-3213: -- [~vidaha] Please take a look.

[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances

2014-08-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109700#comment-14109700 ] Joseph K. Bradley commented on SPARK-3213: -- The security group name I was using

[jira] [Updated] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This

2014-08-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3213: - Summary: spark_ec2.py cannot find slave instances launched with Launch More Like This

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2014-08-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110241#comment-14110241 ] Joseph K. Bradley commented on SPARK-3155: -- Hi Qiping, thanks very much for the

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2014-08-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111685#comment-14111685 ] Joseph K. Bradley commented on SPARK-3155: -- With respect to

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2014-08-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111813#comment-14111813 ] Joseph K. Bradley commented on SPARK-3155: -- Qiping, I think it's up to you; both

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2014-08-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112735#comment-14112735 ] Joseph K. Bradley commented on SPARK-3155: -- That sounds good---thank you!

[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This

2014-08-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112748#comment-14112748 ] Joseph K. Bradley commented on SPARK-3213: -- Testing now... spark_ec2.py cannot

[jira] [Commented] (SPARK-3155) Support DecisionTree pruning

2014-08-27 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113283#comment-14113283 ] Joseph K. Bradley commented on SPARK-3155: -- It would be best if so, but that code

[jira] [Commented] (SPARK-3272) Calculate prediction for nodes separately from calculating information gain for splits in decision tree

2014-08-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114166#comment-14114166 ] Joseph K. Bradley commented on SPARK-3272: -- With respect to [SPARK-2207], I think

[jira] [Commented] (SPARK-3272) Calculate prediction for nodes separately from calculating information gain for splits in decision tree

2014-08-28 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114742#comment-14114742 ] Joseph K. Bradley commented on SPARK-3272: -- Hi Qiping, you are right; I missed

[jira] [Commented] (SPARK-3272) Calculate prediction for nodes separately from calculating information gain for splits in decision tree

2014-08-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115757#comment-14115757 ] Joseph K. Bradley commented on SPARK-3272: -- Hi Qiping, No worries; we are on

[jira] [Commented] (SPARK-3366) Compute best splits distributively in decision tree

2014-09-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120218#comment-14120218 ] Joseph K. Bradley commented on SPARK-3366: -- It is not really a bottleneck for

[jira] [Created] (SPARK-3380) DecisionTree: overflow and precision in aggregation

2014-09-03 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3380: Summary: DecisionTree: overflow and precision in aggregation Key: SPARK-3380 URL: https://issues.apache.org/jira/browse/SPARK-3380 Project: Spark

[jira] [Created] (SPARK-3381) DecisionTree: eliminate bins for unordered features

2014-09-03 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3381: Summary: DecisionTree: eliminate bins for unordered features Key: SPARK-3381 URL: https://issues.apache.org/jira/browse/SPARK-3381 Project: Spark

[jira] [Created] (SPARK-3382) GradientDescent convergence tolerance

2014-09-03 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3382: Summary: GradientDescent convergence tolerance Key: SPARK-3382 URL: https://issues.apache.org/jira/browse/SPARK-3382 Project: Spark Issue Type:

[jira] [Created] (SPARK-3383) DecisionTree aggregate size could be smaller

2014-09-03 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3383: Summary: DecisionTree aggregate size could be smaller Key: SPARK-3383 URL: https://issues.apache.org/jira/browse/SPARK-3383 Project: Spark Issue

[jira] [Updated] (SPARK-3160) Simplify DecisionTree data structure for training

2014-09-05 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3160: - Description: Improvement: code clarity Currently, we maintain a tree structure, a flat

[jira] [Commented] (SPARK-3272) Calculate prediction for nodes separately from calculating information gain for splits in decision tree

2014-09-08 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125969#comment-14125969 ] Joseph K. Bradley commented on SPARK-3272: -- Hi Qiping, Thanks for your patience;

[jira] [Updated] (SPARK-3160) Simplify DecisionTree data structure for training

2014-09-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3160: - Description: Improvement: code clarity Currently, we maintain a tree structure, a flat

[jira] [Updated] (SPARK-3158) Avoid 1 extra aggregation for DecisionTree training

2014-09-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3158: - Description: Improvement: computation Currently, the implementation does one unnecessary

[jira] [Created] (SPARK-3494) DecisionTree overflow error in calculating maxMemoryUsage

2014-09-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3494: Summary: DecisionTree overflow error in calculating maxMemoryUsage Key: SPARK-3494 URL: https://issues.apache.org/jira/browse/SPARK-3494 Project: Spark

[jira] [Created] (SPARK-3516) DecisionTree Python support for params maxInstancesPerNode, maxInfoGain

2014-09-12 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3516: Summary: DecisionTree Python support for params maxInstancesPerNode, maxInfoGain Key: SPARK-3516 URL: https://issues.apache.org/jira/browse/SPARK-3516

[jira] [Updated] (SPARK-3161) Cache example-node map for DecisionTree training

2014-09-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3161: - Description: Improvement: worker computation When training each level of a DecisionTree,

[jira] [Updated] (SPARK-3161) Cache example-node map for DecisionTree training

2014-09-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3161: - Description: Improvement: worker computation When training each level of a DecisionTree,

[jira] [Updated] (SPARK-3573) Dataset

2014-09-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3573: - Description: This JIRA is for discussion of ML dataset, essentially a SchemaRDD with

[jira] [Updated] (SPARK-3573) Dataset

2014-09-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3573: - Description: This JIRA is for discussion of ML dataset, essentially a SchemaRDD with

[jira] [Updated] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-1547: - Description: This task requires adding the gradient boosting algorithm to Spark MLlib.

[jira] [Updated] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-1547: - Description: This task requires adding the gradient boosting algorithm to Spark MLlib.

[jira] [Commented] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149918#comment-14149918 ] Joseph K. Bradley commented on SPARK-1547: -- [~hector.yee] I strongly agree about

[jira] [Created] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-26 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3702: Summary: Standardize MLlib classes for learners, models Key: SPARK-3702 URL: https://issues.apache.org/jira/browse/SPARK-3702 Project: Spark Issue

[jira] [Updated] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3702: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-1856 Standardize MLlib

[jira] [Commented] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149974#comment-14149974 ] Joseph K. Bradley commented on SPARK-3702: -- The design doc is only partly

[jira] [Commented] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149980#comment-14149980 ] Joseph K. Bradley commented on SPARK-3702: -- Both JIRAs discuss class hierarchy.

[jira] [Commented] (SPARK-3507) Create RegressionLearner trait and make some currect code implement it

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149988#comment-14149988 ] Joseph K. Bradley commented on SPARK-3507: -- [~epahomov] Hi, I strongly agree

[jira] [Commented] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149991#comment-14149991 ] Joseph K. Bradley commented on SPARK-3702: -- [SPARK-3251] discusses a subset of

[jira] [Commented] (SPARK-3251) Clarify learning interfaces

2014-09-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1415#comment-1415 ] Joseph K. Bradley commented on SPARK-3251: -- I just linked JIRA related to this;

[jira] [Created] (SPARK-3703) Ensemble learning methods

2014-09-26 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3703: Summary: Ensemble learning methods Key: SPARK-3703 URL: https://issues.apache.org/jira/browse/SPARK-3703 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2014-09-29 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3717: Summary: DecisionTree, RandomForest: Partition by feature Key: SPARK-3717 URL: https://issues.apache.org/jira/browse/SPARK-3717 Project: Spark Issue

[jira] [Updated] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2014-09-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3717: - Description: h1. Summary Currently, data are partitioned by row/instance for

[jira] [Created] (SPARK-3726) RandomForest: Support for bootstrap options

2014-09-29 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3726: Summary: RandomForest: Support for bootstrap options Key: SPARK-3726 URL: https://issues.apache.org/jira/browse/SPARK-3726 Project: Spark Issue

[jira] [Created] (SPARK-3727) DecisionTree, RandomForest: More prediction functionality

2014-09-29 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3727: Summary: DecisionTree, RandomForest: More prediction functionality Key: SPARK-3727 URL: https://issues.apache.org/jira/browse/SPARK-3727 Project: Spark

[jira] [Commented] (SPARK-1547) Add gradient boosting algorithm to MLlib

2014-09-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152101#comment-14152101 ] Joseph K. Bradley commented on SPARK-1547: -- This will be great to have! The WIP

[jira] [Updated] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2014-09-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3717: - Description: h1. Summary Currently, data are partitioned by row/instance for

[jira] [Updated] (SPARK-3165) DecisionTree does not use sparsity in data

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3165: - Priority: Minor (was: Major) DecisionTree does not use sparsity in data

[jira] [Updated] (SPARK-3380) DecisionTree: overflow and precision in aggregation

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3380: - Priority: Minor (was: Major) DecisionTree: overflow and precision in aggregation

[jira] [Commented] (SPARK-2692) Decision Tree API update

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153647#comment-14153647 ] Joseph K. Bradley commented on SPARK-2692: -- Closing this since it is part of the

[jira] [Closed] (SPARK-2692) Decision Tree API update

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-2692. Resolution: Duplicate This is superseded by the new API JIRA. Decision Tree API update

[jira] [Commented] (SPARK-3702) Standardize MLlib classes for learners, models

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153763#comment-14153763 ] Joseph K. Bradley commented on SPARK-3702: -- Thanks for taking a close look! *

[jira] [Created] (SPARK-3751) DecisionTreeRunner functionality improvement

2014-09-30 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3751: Summary: DecisionTreeRunner functionality improvement Key: SPARK-3751 URL: https://issues.apache.org/jira/browse/SPARK-3751 Project: Spark Issue

[jira] [Updated] (SPARK-3751) DecisionTreeRunner functionality improvement

2014-09-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3751: - Description: DecisionTreeRunner functionality additions: * Allow user to pass in a test

[jira] [Updated] (SPARK-3841) Pretty-print Params case classes for tests

2014-10-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3841: - Description: Provide a parent class for the Params case classes used in many MLlib

[jira] [Created] (SPARK-3841) Pretty-print Params case classes for tests

2014-10-07 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3841: Summary: Pretty-print Params case classes for tests Key: SPARK-3841 URL: https://issues.apache.org/jira/browse/SPARK-3841 Project: Spark Issue Type:

[jira] [Created] (SPARK-3903) Create general data loading method for LabeledPoints

2014-10-10 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3903: Summary: Create general data loading method for LabeledPoints Key: SPARK-3903 URL: https://issues.apache.org/jira/browse/SPARK-3903 Project: Spark

[jira] [Updated] (SPARK-3903) Create general data loading method for LabeledPoints

2014-10-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3903: - Description: Proposal: Provide a more general data loading function for LabeledPoints. *

[jira] [Commented] (SPARK-3251) Clarify learning interfaces

2014-10-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169659#comment-14169659 ] Joseph K. Bradley commented on SPARK-3251: -- I agree it's hard to say. Based on

[jira] [Created] (SPARK-3934) RandomForest bug in sanity check in DTStatsAggregator

2014-10-13 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-3934: Summary: RandomForest bug in sanity check in DTStatsAggregator Key: SPARK-3934 URL: https://issues.apache.org/jira/browse/SPARK-3934 Project: Spark

[jira] [Commented] (SPARK-3161) Cache example-node map for DecisionTree training

2014-10-13 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170443#comment-14170443 ] Joseph K. Bradley commented on SPARK-3161: -- Your summary for option (1) sounds

[jira] [Updated] (SPARK-3164) Store DecisionTree Split.categories as Set

2014-10-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3164: - Description: Improvement: computation For categorical features with many categories, it

[jira] [Commented] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2014-10-21 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178798#comment-14178798 ] Joseph K. Bradley commented on SPARK-3717: -- Hi Sumanth, it would be great to get

[jira] [Commented] (SPARK-4022) Replace colt dependency (LGPL) with commons-math

2014-10-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181777#comment-14181777 ] Joseph K. Bradley commented on SPARK-4022: -- Hi Sean, Thanks for picking this up!

[jira] [Comment Edited] (SPARK-4022) Replace colt dependency (LGPL) with commons-math

2014-10-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181777#comment-14181777 ] Joseph K. Bradley edited comment on SPARK-4022 at 10/23/14 7:00 PM:

[jira] [Updated] (SPARK-4081) Categorical feature indexing

2014-10-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4081: - Description: DecisionTree and RandomForest require that categorical features and labels

[jira] [Commented] (SPARK-3573) Dataset

2014-10-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188980#comment-14188980 ] Joseph K. Bradley commented on SPARK-3573: -- [~sparks] Trying to simplify things,

[jira] [Created] (SPARK-4197) Gradient Boosting API cleanups

2014-11-02 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-4197: Summary: Gradient Boosting API cleanups Key: SPARK-4197 URL: https://issues.apache.org/jira/browse/SPARK-4197 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-4210) Add Extra-Trees algorithm to MLlib

2014-11-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195552#comment-14195552 ] Joseph K. Bradley commented on SPARK-4210: -- [~0asa] For the API, do you plan for

[jira] [Commented] (SPARK-4210) Add Extra-Trees algorithm to MLlib

2014-11-05 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198939#comment-14198939 ] Joseph K. Bradley commented on SPARK-4210: -- The API and internal implementation

[jira] [Commented] (SPARK-4210) Add Extra-Trees algorithm to MLlib

2014-11-05 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198990#comment-14198990 ] Joseph K. Bradley commented on SPARK-4210: -- One more thought: If this is your
