Re: train many decision trees with a single spark job

2015-01-12 Thread Josh Buffum
:53 AM, Josh Buffum jbuf...@gmail.com wrote: I've got a data set of activity by user. For each user, I'd like to train a decision tree model. I currently have the feature creation step implemented in Spark and would naturally like to use mllib's decision tree model. However, it looks like

Re: train many decision trees with a single spark job

2015-01-12 Thread Josh Buffum
are using RDDs inside RDDs. But I am also not sure you should do what it looks like you are trying to do. On Jan 13, 2015 12:32 AM, Josh Buffum jbuf...@gmail.com wrote: Sean, Thanks for the response. Is there some subtle difference between one model partitioned by N users or N models per each 1 user

train many decision trees with a single spark job

2015-01-10 Thread Josh Buffum
I've got a data set of activity by user. For each user, I'd like to train a decision tree model. I currently have the feature creation step implemented in Spark and would naturally like to use mllib's decision tree model. However, it looks like the decision tree model expects the whole RDD and
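
One way to picture the "one model per user" idea from this message is to group the rows by user and fit a small local model on each group. The following is only a minimal sketch in plain Python, not MLlib's API: `fit_stump`, `predict`, `train_per_user`, and the sample records are all hypothetical, and a one-split stump stands in for a real decision tree learner. In Spark, the same per-group function could plausibly be applied with `groupByKey` followed by `mapValues`, assuming each user's rows fit in memory on a single executor.

```python
from collections import defaultdict


def fit_stump(rows):
    """Fit a one-split stump on (feature, label) pairs with 0/1 labels.

    Hypothetical stand-in for a real tree learner; assumes rows is non-empty.
    """
    best = None
    for threshold in sorted({x for x, _ in rows}):
        for left_label in (0, 1):
            # Predict left_label when x <= threshold, otherwise the other label.
            errors = sum(
                (left_label if x <= threshold else 1 - left_label) != y
                for x, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    _, threshold, left_label = best
    return threshold, left_label


def predict(model, x):
    """Apply a stump returned by fit_stump to a single feature value."""
    threshold, left_label = model
    return left_label if x <= threshold else 1 - left_label


def train_per_user(records):
    """Group (user_id, feature, label) records by user and fit one stump each."""
    by_user = defaultdict(list)
    for user_id, x, y in records:
        by_user[user_id].append((x, y))
    return {user_id: fit_stump(rows) for user_id, rows in by_user.items()}


# Made-up activity data: two users with opposite label patterns.
records = [
    ("alice", 1.0, 0), ("alice", 2.0, 0), ("alice", 3.0, 1), ("alice", 4.0, 1),
    ("bob", 1.0, 1), ("bob", 2.0, 1), ("bob", 3.0, 0), ("bob", 4.0, 0),
]
models = train_per_user(records)
```

Each user ends up with an independent model keyed by user id, so no model ever sees another user's rows. The trade-off is that training happens locally per group rather than being distributed across the cluster for any single model.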