[MLlib] Performance issues when building GBM models

2015-02-08 Thread Christopher Thom
at DecisionTreeMetadata.scala:111, took 5.495166 s. Any thoughts or advice, or even suggestions on where to dig for more info, would be welcome. thanks chris Christopher Thom QUANTIUM Level 25, 8 Chifley, 8-12 Chifley Square Sydney NSW 2000 T: +61 2 8222 3577 F: +61 2 9292 6444 W

RE: Does DecisionTree model in MLlib deal with missing values?

2015-01-11 Thread Christopher Thom

[MLlib] Scoring GBTs with a variable number of trees

2015-01-07 Thread Christopher Thom
the MSE is at its minimum, i.e. in a plot of MSE vs. number of trees, the error will decrease (as the model improves), hit a minimum (the optimal point), and then increase (as the model starts to overfit the data). cheers chris
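The MSE-vs-number-of-trees curve described in this message can be sketched without refitting the model for each tree count. The following is a minimal illustration in plain Python, not MLlib code: it assumes the boosted ensemble predicts by summing weighted per-tree predictions, and the names `mse_by_num_trees`, `best_num_trees`, and the callable `Tree` type are hypothetical stand-ins for whatever the fitted model exposes.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for one fitted regression tree:
# a function from a feature vector to a numeric prediction.
Tree = Callable[[Sequence[float]], float]


def mse_by_num_trees(
    trees: Sequence[Tree],
    weights: Sequence[float],
    features: Sequence[Sequence[float]],
    labels: Sequence[float],
) -> List[float]:
    """Return the MSE of the ensemble truncated to its first k trees, for k = 1..len(trees).

    Assumes the ensemble prediction is the weighted sum of the tree predictions,
    so each extra tree only adds one term to a running total per row.
    """
    if not labels or len(features) != len(labels):
        raise ValueError("features and labels must be non-empty and the same length")
    running = [0.0] * len(labels)  # prediction of the first k trees for each row
    curve = []
    for tree, weight in zip(trees, weights):
        for i, row in enumerate(features):
            running[i] += weight * tree(row)
        squared_error = sum((p - y) ** 2 for p, y in zip(running, labels))
        curve.append(squared_error / len(labels))
    return curve


def best_num_trees(curve: Sequence[float]) -> Tuple[int, float]:
    """Return (number of trees, MSE) at the first minimum of the curve."""
    best_index = min(range(len(curve)), key=lambda k: curve[k])
    return best_index + 1, curve[best_index]
```

Evaluating the curve on a held-out validation set, rather than on the training data, is what makes the minimum meaningful as an overfitting signal; each tree is evaluated once per row, so the whole curve costs about the same as scoring the full ensemble once.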

RE: python API for gradient boosting?

2015-01-05 Thread Christopher Thom
January 2015 8:43 AM To: Christopher Thom Cc: user@spark.apache.org Subject: Re: python API for gradient boosting? I created a JIRA for it: https://issues.apache.org/jira/browse/SPARK-5094. Hopefully someone will work on it and make it available in the 1.3 release. -Xiangrui On Sun, Jan 4, 2015

python API for gradient boosting?

2015-01-05 Thread Christopher Thom
compelling. As an alternative, if it'll be a while before this API is implemented, does anyone have suggestions for Scala replacements for the above Python libraries? cheers chris

python API for gradient boosting?

2015-01-04 Thread Christopher Thom
Hi, I wonder if anyone knows when a Python API will be added for Gradient Boosted Trees? I see that Java and Scala APIs were added for the 1.2 release, and would love to be able to build GBMs in PySpark too. cheers chris