[jira] [Comment Edited] (SPARK-3383) DecisionTree aggregate size could be smaller

2017-11-06 Thread
[ https://issues.apache.org/jira/browse/SPARK-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240284#comment-16240284 ] Yan Facai (颜发才) edited comment on SPARK-3383 at 11/6/17 1:2

[jira] [Commented] (SPARK-3383) DecisionTree aggregate size could be smaller

2017-11-06 Thread
[ https://issues.apache.org/jira/browse/SPARK-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240284#comment-16240284 ] Yan Facai (颜发才) commented on SPARK-3383: [~WeichenXu123] Good work! I'

[jira] [Commented] (SPARK-3165) DecisionTree does not use sparsity in data

2017-09-26 Thread
[ https://issues.apache.org/jira/browse/SPARK-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181925#comment-16181925 ] Yan Facai (颜发才) commented on SPARK-3165: The PR proposed by me has been cl

[jira] [Commented] (SPARK-21748) Migrate the implementation of HashingTF from MLlib to ML

2017-08-18 Thread
[ https://issues.apache.org/jira/browse/SPARK-21748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133947#comment-16133947 ] Yan Facai (颜发才) commented on SPARK-21748: - There seems to be something w

[jira] [Comment Edited] (SPARK-21748) Migrate the implementation of HashingTF from MLlib to ML

2017-08-18 Thread
[ https://issues.apache.org/jira/browse/SPARK-21748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133947#comment-16133947 ] Yan Facai (颜发才) edited comment on SPARK-21748 at 8/19/17 4:4

[jira] [Comment Edited] (SPARK-21748) Migrate the implementation of HashingTF from MLlib to ML

2017-08-16 Thread
[ https://issues.apache.org/jira/browse/SPARK-21748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128725#comment-16128725 ] Yan Facai (颜发才) edited comment on SPARK-21748 at 8/16/17 12:3

[jira] [Commented] (SPARK-21748) Migrate the implementation of HashingTF from MLlib to ML

2017-08-16 Thread
[ https://issues.apache.org/jira/browse/SPARK-21748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128725#comment-16128725 ] Yan Facai (颜发才) commented on SPARK-21748: - [~yanboliang] Thanks, yanbo

[jira] [Created] (SPARK-21748) Migrate the implementation of HashingTF from MLlib to ML

2017-08-16 Thread
Yan Facai (颜发才) created SPARK-21748: --- Summary: Migrate the implementation of HashingTF from MLlib to ML Key: SPARK-21748 URL: https://issues.apache.org/jira/browse/SPARK-21748 Project: Spark

[jira] [Commented] (SPARK-21690) one-pass imputer

2017-08-10 Thread
[ https://issues.apache.org/jira/browse/SPARK-21690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122922#comment-16122922 ] Yan Facai (颜发才) commented on SPARK-21690: - Cool! Just go ahead. > o

[jira] [Comment Edited] (SPARK-21690) one-pass imputer

2017-08-10 Thread
[ https://issues.apache.org/jira/browse/SPARK-21690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122904#comment-16122904 ] Yan Facai (颜发才) edited comment on SPARK-21690 at 8/11/17 6:0

[jira] [Commented] (SPARK-21690) one-pass imputer

2017-08-10 Thread
[ https://issues.apache.org/jira/browse/SPARK-21690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122904#comment-16122904 ] Yan Facai (颜发才) commented on SPARK-21690: - We can use `df.summary("me

[jira] [Commented] (SPARK-21341) Spark 2.1.1: I want to be able to serialize wordVectors on Word2VecModel

2017-07-09 Thread
[ https://issues.apache.org/jira/browse/SPARK-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079507#comment-16079507 ] Yan Facai (颜发才) commented on SPARK-21341: - Yes, [~sowen] is right. Why no

[jira] [Comment Edited] (SPARK-21341) Spark 2.1.1: I want to be able to serialize wordVectors on Word2VecModel

2017-07-07 Thread
[ https://issues.apache.org/jira/browse/SPARK-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078987#comment-16078987 ] Yan Facai (颜发才) edited comment on SPARK-21341 at 7/8/17 6:2

[jira] [Commented] (SPARK-21341) Spark 2.1.1: I want to be able to serialize wordVectors on Word2VecModel

2017-07-07 Thread
[ https://issues.apache.org/jira/browse/SPARK-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078987#comment-16078987 ] Yan Facai (颜发才) commented on SPARK-21341: - Hi, [~zsellami]. I guess that s

[jira] [Commented] (SPARK-21331) java.lang.NullPointerException for certain methods in classes of MLlib

2017-07-06 Thread
[ https://issues.apache.org/jira/browse/SPARK-21331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077620#comment-16077620 ] Yan Facai (颜发才) commented on SPARK-21331: - [~anirband] How about using this

[jira] [Comment Edited] (SPARK-21331) java.lang.NullPointerException for certain methods in classes of MLlib

2017-07-06 Thread
[ https://issues.apache.org/jira/browse/SPARK-21331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077605#comment-16077605 ] Yan Facai (颜发才) edited comment on SPARK-21331 at 7/7/17 5:2

[jira] [Commented] (SPARK-21331) java.lang.NullPointerException for certain methods in classes of MLlib

2017-07-06 Thread
[ https://issues.apache.org/jira/browse/SPARK-21331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077605#comment-16077605 ] Yan Facai (颜发才) commented on SPARK-21331: - Hi, I run the code in descriptio

[jira] [Commented] (SPARK-21306) OneVsRest Conceals Columns That May Be Relevant To Underlying Classifier

2017-07-05 Thread
[ https://issues.apache.org/jira/browse/SPARK-21306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075974#comment-16075974 ] Yan Facai (颜发才) commented on SPARK-21306: - [~cathalgarvey] By the way, s

[jira] [Comment Edited] (SPARK-21306) OneVsRest Conceals Columns That May Be Relevant To Underlying Classifier

2017-07-05 Thread
[ https://issues.apache.org/jira/browse/SPARK-21306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075970#comment-16075970 ] Yan Facai (颜发才) edited comment on SPARK-21306 at 7/6/17 5:4

[jira] [Commented] (SPARK-21306) OneVsRest Conceals Columns That May Be Relevant To Underlying Classifier

2017-07-05 Thread
[ https://issues.apache.org/jira/browse/SPARK-21306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075970#comment-16075970 ] Yan Facai (颜发才) commented on SPARK-21306: - I agree with [~n...@svana.org]

[jira] [Commented] (SPARK-21285) VectorAssembler should report the column name when data type used is not supported

2017-07-03 Thread
[ https://issues.apache.org/jira/browse/SPARK-21285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073125#comment-16073125 ] Yan Facai (颜发才) commented on SPARK-21285: - It seems easy, and I can wor

[jira] [Commented] (SPARK-21066) LibSVM load just one input file

2017-06-22 Thread
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060381#comment-16060381 ] Yan Facai (颜发才) commented on SPARK-21066: - Downgrade to Trivial s

[jira] [Updated] (SPARK-21066) LibSVM load just one input file

2017-06-22 Thread
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Facai (颜发才) updated SPARK-21066: Priority: Trivial (was: Major) > LibSVM load just one input f

[jira] [Comment Edited] (SPARK-21066) LibSVM load just one input file

2017-06-20 Thread
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16055336#comment-16055336 ] Yan Facai (颜发才) edited comment on SPARK-21066 at 6/20/17 8:2

[jira] [Commented] (SPARK-21066) LibSVM load just one input file

2017-06-20 Thread
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16055336#comment-16055336 ] Yan Facai (颜发才) commented on SPARK-21066: - [~sowen] I believe that the API

[jira] [Commented] (SPARK-21066) LibSVM load just one input file

2017-06-20 Thread
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16055328#comment-16055328 ] Yan Facai (颜发才) commented on SPARK-21066: - Hi, [~darion] . If `numFeature

[jira] [Comment Edited] (SPARK-21066) LibSVM load just one input file

2017-06-20 Thread
[ https://issues.apache.org/jira/browse/SPARK-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16055328#comment-16055328 ] Yan Facai (颜发才) edited comment on SPARK-21066 at 6/20/17 8:1

[jira] [Commented] (SPARK-20787) PySpark can't handle datetimes before 1900

2017-05-29 Thread
[ https://issues.apache.org/jira/browse/SPARK-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16028253#comment-16028253 ] Yan Facai (颜发才) commented on SPARK-20787: - Just go ahead, [~RBerenguel]!

[jira] [Comment Edited] (SPARK-19581) running NaiveBayes model with 0 features can crash the executor with D rorreGEMV

2017-05-26 Thread
[ https://issues.apache.org/jira/browse/SPARK-19581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015345#comment-16015345 ] Yan Facai (颜发才) edited comment on SPARK-19581 at 5/26/17 9:0

[jira] [Commented] (SPARK-20498) RandomForestRegressionModel should expose getMaxDepth in PySpark

2017-05-26 Thread
[ https://issues.apache.org/jira/browse/SPARK-20498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025949#comment-16025949 ] Yan Facai (颜发才) commented on SPARK-20498: - [~iamshrek] Hi, Xin Ren. As the

[jira] [Comment Edited] (SPARK-20787) PySpark can't handle datetimes before 1900

2017-05-25 Thread
[ https://issues.apache.org/jira/browse/SPARK-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025853#comment-16025853 ] Yan Facai (颜发才) edited comment on SPARK-20787 at 5/26/17 6:0

[jira] [Commented] (SPARK-20787) PySpark can't handle datetimes before 1900

2017-05-25 Thread
[ https://issues.apache.org/jira/browse/SPARK-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025853#comment-16025853 ] Yan Facai (颜发才) commented on SPARK-20787: - It seems that the exception is ra

[jira] [Comment Edited] (SPARK-20768) PySpark FPGrowth does not expose numPartitions (expert) param

2017-05-18 Thread
[ https://issues.apache.org/jira/browse/SPARK-20768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015448#comment-16015448 ] Yan Facai (颜发才) edited comment on SPARK-20768 at 5/18/17 8:5

[jira] [Commented] (SPARK-20768) PySpark FPGrowth does not expose numPartitions (expert) param

2017-05-18 Thread
[ https://issues.apache.org/jira/browse/SPARK-20768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015448#comment-16015448 ] Yan Facai (颜发才) commented on SPARK-20768: - It seems easy, I can work o

[jira] [Commented] (SPARK-20768) PySpark FPGrowth does not expose numPartitions (expert) param

2017-05-18 Thread
[ https://issues.apache.org/jira/browse/SPARK-20768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015437#comment-16015437 ] Yan Facai (颜发才) commented on SPARK-20768: - Hi, I'm a newbie. `numParti

[jira] [Commented] (SPARK-19581) running NaiveBayes model with 0 features can crash the executor with D rorreGEMV

2017-05-18 Thread
[ https://issues.apache.org/jira/browse/SPARK-19581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015345#comment-16015345 ] Yan Facai (颜发才) commented on SPARK-19581: - [~barrybecker4] Hi, Becker. I c

[jira] [Commented] (SPARK-19581) running NaiveBayes model with 0 features can crash the executor with D rorreGEMV

2017-05-08 Thread
[ https://issues.apache.org/jira/browse/SPARK-19581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001987#comment-16001987 ] Yan Facai (颜发才) commented on SPARK-19581: - [~barrybecker4] Could you gi

[jira] [Commented] (SPARK-20526) Load doesn't work in PCAModel

2017-05-01 Thread
[ https://issues.apache.org/jira/browse/SPARK-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992367#comment-15992367 ] Yan Facai (颜发才) commented on SPARK-20526: - Can you give a sample code? >

[jira] [Comment Edited] (SPARK-16957) Use weighted midpoints for split values.

2017-04-30 Thread
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990195#comment-15990195 ] Yan Facai (颜发才) edited comment on SPARK-16957 at 4/30/17 11:2

[jira] [Commented] (SPARK-16957) Use weighted midpoints for split values.

2017-04-30 Thread
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990195#comment-15990195 ] Yan Facai (颜发才) commented on SPARK-16957: - To match the other libraries, we

[jira] [Comment Edited] (SPARK-20199) GradientBoostedTreesModel doesn't have Column Sampling Rate Paramenter

2017-04-25 Thread
[ https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984161#comment-15984161 ] Yan Facai (颜发才) edited comment on SPARK-20199 at 4/26/17 6:1

[jira] [Commented] (SPARK-20199) GradientBoostedTreesModel doesn't have Column Sampling Rate Paramenter

2017-04-25 Thread
[ https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984161#comment-15984161 ] Yan Facai (颜发才) commented on SPARK-20199: - The work is easy, however Pu

[jira] [Commented] (SPARK-16957) Use weighted midpoints for split values.

2017-04-22 Thread
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980268#comment-15980268 ] Yan Facai (颜发才) commented on SPARK-16957: - [~vlad.feinberg] Hi, I found that

[jira] [Updated] (SPARK-16957) Use weighted midpoints for split values.

2017-04-22 Thread
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Facai (颜发才) updated SPARK-16957: Description: We should be using weighted split points rather than the actual continuous

[jira] [Commented] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-19 Thread
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976098#comment-15976098 ] Yan Facai (颜发才) commented on SPARK-20081: - By the way, for StringInd

[jira] [Commented] (SPARK-20199) GradientBoostedTreesModel doesn't have Column Sampling Rate Paramenter

2017-04-14 Thread
[ https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969622#comment-15969622 ] Yan Facai (颜发才) commented on SPARK-20199: - ping [~jkbreuer] [~sethah] [~me

[jira] [Comment Edited] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-13 Thread
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967347#comment-15967347 ] Yan Facai (颜发才) edited comment on SPARK-20081 at 4/13/17 9:4

[jira] [Commented] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-13 Thread
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967347#comment-15967347 ] Yan Facai (颜发才) commented on SPARK-20081: - Yes, you should use `builder.put

[jira] [Commented] (SPARK-19141) VectorAssembler metadata causing memory issues

2017-04-13 Thread
[ https://issues.apache.org/jira/browse/SPARK-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967246#comment-15967246 ] Yan Facai (颜发才) commented on SPARK-19141: - `VectorAssembler` will cr

[jira] [Comment Edited] (SPARK-19141) VectorAssembler metadata causing memory issues

2017-04-13 Thread
[ https://issues.apache.org/jira/browse/SPARK-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967246#comment-15967246 ] Yan Facai (颜发才) edited comment on SPARK-19141 at 4/13/17 7:4

[jira] [Commented] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-12 Thread
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967204#comment-15967204 ] Yan Facai (颜发才) commented on SPARK-20081: - How about adding a `setNumClass

[jira] [Comment Edited] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-12 Thread
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967196#comment-15967196 ] Yan Facai (颜发才) edited comment on SPARK-20081 at 4/13/17 6:4

[jira] [Updated] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-12 Thread
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Facai (颜发才) updated SPARK-20081: Component/s: ML > RandomForestClassifier doesn't seem to support more than 10

[jira] [Commented] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-12 Thread
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967196#comment-15967196 ] Yan Facai (颜发才) commented on SPARK-20081: - [~creinig] Chris

[jira] [Commented] (SPARK-20199) GradientBoostedTreesModel doesn't have Column Sampling Rate Paramenter

2017-04-12 Thread
[ https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15966926#comment-15966926 ] Yan Facai (颜发才) commented on SPARK-20199: - It's not hard, and I can w

[jira] [Commented] (SPARK-20199) GradientBoostedTreesModel doesn't have Column Sampling Rate Paramenter

2017-04-11 Thread
[ https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965296#comment-15965296 ] Yan Facai (颜发才) commented on SPARK-20199: - Yes, as [~pralabhkumar]

[jira] [Comment Edited] (SPARK-3383) DecisionTree aggregate size could be smaller

2017-04-11 Thread
[ https://issues.apache.org/jira/browse/SPARK-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963741#comment-15963741 ] Yan Facai (颜发才) edited comment on SPARK-3383 at 4/12/17 2:0

[jira] [Commented] (SPARK-3383) DecisionTree aggregate size could be smaller

2017-04-11 Thread
[ https://issues.apache.org/jira/browse/SPARK-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965263#comment-15965263 ] Yan Facai (颜发才) commented on SPARK-3383: How about the idea? 1. We use `bin

[jira] [Comment Edited] (SPARK-10788) Decision Tree duplicates bins for unordered categorical features

2017-04-11 Thread
[ https://issues.apache.org/jira/browse/SPARK-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965244#comment-15965244 ] Yan Facai (颜发才) edited comment on SPARK-10788 at 4/12/17 1:3

[jira] [Commented] (SPARK-10788) Decision Tree duplicates bins for unordered categorical features

2017-04-11 Thread
[ https://issues.apache.org/jira/browse/SPARK-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965244#comment-15965244 ] Yan Facai (颜发才) commented on SPARK-10788: - [~josephkb] As categories A, B a

[jira] [Commented] (SPARK-3383) DecisionTree aggregate size could be smaller

2017-04-10 Thread
[ https://issues.apache.org/jira/browse/SPARK-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963741#comment-15963741 ] Yan Facai (颜发才) commented on SPARK-3383: I think the task contains two sub

[jira] [Commented] (SPARK-16957) Use weighted midpoints for split values.

2017-04-06 Thread
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960177#comment-15960177 ] Yan Facai (颜发才) commented on SPARK-16957: - I think that it is helpful for s

[jira] [Commented] (SPARK-3159) Check for reducible DecisionTree

2017-03-30 Thread
[ https://issues.apache.org/jira/browse/SPARK-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948754#comment-15948754 ] Yan Facai (颜发才) commented on SPARK-3159: [~josephkb] Hi, is the jira still ne

[jira] [Comment Edited] (SPARK-20043) CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" on ML random forest and decision. Only "gini" and "entropy" (in lower case) are accept

2017-03-23 Thread
[ https://issues.apache.org/jira/browse/SPARK-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939571#comment-15939571 ] Yan Facai (颜发才) edited comment on SPARK-20043 at 3/24/17 2:1

[jira] [Commented] (SPARK-20043) CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" on ML random forest and decision. Only "gini" and "entropy" (in lower case) are accepted

2017-03-23 Thread
[ https://issues.apache.org/jira/browse/SPARK-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939571#comment-15939571 ] Yan Facai (颜发才) commented on SPARK-20043: - The bug can be reproduce

[jira] [Issue Comment Deleted] (SPARK-20043) CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" on ML random forest and decision. Only "gini" and "entropy" (in lower case) are

2017-03-23 Thread
[ https://issues.apache.org/jira/browse/SPARK-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Facai (颜发才) updated SPARK-20043: Comment: was deleted (was: [~zsellami] could you give an example of your code? I try to

[jira] [Commented] (SPARK-20043) CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" on ML random forest and decision. Only "gini" and "entropy" (in lower case) are accepted

2017-03-23 Thread
[ https://issues.apache.org/jira/browse/SPARK-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939560#comment-15939560 ] Yan Facai (颜发才) commented on SPARK-20043: - [~zsellami] could you give an exa

[jira] [Commented] (SPARK-20043) CrossValidatorModel loader does not recognize impurity "Gini" and "Entropy" on ML random forest and decision. Only "gini" and "entropy" (in lower case) are accepted

2017-03-23 Thread
[ https://issues.apache.org/jira/browse/SPARK-20043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939533#comment-15939533 ] Yan Facai (颜发才) commented on SPARK-20043: - Perhaps it's better t

[jira] [Commented] (SPARK-3728) RandomForest: Learn models too large to store in memory

2017-03-22 Thread
[ https://issues.apache.org/jira/browse/SPARK-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936208#comment-15936208 ] Yan Facai (颜发才) commented on SPARK-3728: RandomForest already uses a stack to