Re: Feedback: Feature request

2015-08-28 Thread Manish Amde
> though. E.g > > "lhs":0,"op":"<=","rhs":-35.0 > On Aug 28, 2015 12:03 AM, "Manish Amde" > wrote: > >> Hi James, >> >> It's a good idea. A JSON format is more convenient for visualization >>

Re: Feedback: Feature request

2015-08-27 Thread Manish Amde
Hi James, It's a good idea. A JSON format is more convenient for visualization, though a little inconvenient to read. How about a toJson() method? It might make the mllib api inconsistent across models though. You should probably create a JIRA for this. CC: dev list -Manish > On Aug 26, 2015,

Re: DecisionTree Algorithm used in Spark MLLib

2015-01-01 Thread Manish Amde
Hi Anoop, The Spark decision tree implementation supports regression and multiclass classification, continuous and categorical features, and pruning; it does not support missing features at present. You can probably think of it as distributed CART, though personally I always find the acronyms confusi

Re: Print Node info. of Decision Tree

2014-12-08 Thread Manish Amde
Hi Jake, The "toString" method should print the full model in versions 1.1.x. The current master branch has a "toDebugString" method for DecisionTreeModel, which should print out all the node classes; the "toString" method has been updated to print only the summary, so there is a slight change i

Re: Status of MLLib exporting models to PMML

2014-11-17 Thread Manish Amde
le also. > I like the comprehensiveness of PMML but as you mention the complexity of > management for large models is a concern. > Cheers > > On Fri, Nov 14, 2014 at 1:35 AM, Manish Amde > wrote: > >> @Aris, we are closely following the PMML work that is going on and as

Re: Status of MLLib exporting models to PMML

2014-11-13 Thread Manish Amde
@Aris, we are closely following the PMML work that is going on and as Xiangrui mentioned, it might be easier to migrate models such as logistic regression and then migrate trees. Some of the models get fairly large (as pointed out by Sung Chung) with deep trees as building blocks and we might have

Re: Anybody built the branch for Adaptive Boosting, extension to MLlib by Manish Amde?

2014-09-18 Thread Manish Amde
st in boosting algos. We are eager to add them to MLlib ASAP. On Thu, Sep 18, 2014 at 7:27 PM, Aris wrote: > Thank you Spark community you make life much more lovely - suffering in > silence is not fun! > I am trying to build the Spark Git branch from Manish Amde, available here: > h

Re: Gradient Boosted Machines

2014-08-05 Thread Manish Amde
Hi Daniel, Thanks a lot for your interest. Gradient boosting and AdaBoost algorithms are under active development and should be a part of release 1.2. -Manish On Mon, Jul 14, 2014 at 11:24 AM, Daniel Bendavid < daniel.benda...@creditkarma.com> wrote: > Hi, > > My company is strongly consider

Re: MLLib : Decision Tree with minimum points per node

2014-06-19 Thread Manish Amde
Hi Justin, I have created a JIRA ticket to keep track of your request. Thanks. https://issues.apache.org/jira/browse/SPARK-2207 -Manish On Thu, Jun 19, 2014 at 2:35 PM, Manish Amde wrote: > Hi Justin, > > I am glad to know that trees are working well for you. > > The tre

Re: MLLib : Decision Tree with minimum points per node

2014-06-19 Thread Manish Amde
Hi Justin, I am glad to know that trees are working well for you. The trees will support minimum samples per node in a future release. Thanks for the feedback. -Manish On Fri, Jun 13, 2014 at 8:55 PM, Justin Yip wrote: > Hello, > > I have been playing around with mllib's decision tree librar

Re: MLLib : Decision Tree not getting built for 5 or more levels(maxDepth=5) and the one built for 3 levels is performing poorly

2014-06-15 Thread Manish Amde
ue that you mentioned. > > Thanks and Regards, > Suraj Sheth > > > > On Sat, Jun 14, 2014 at 12:05 AM, Manish Amde wrote: > >> Hi Suraj, >> >> I can't answer 1) without knowing the data. However, the results for 2) >> are surprising indeed. We ha

Re: MLLib : Decision Tree not getting built for 5 or more levels(maxDepth=5) and the one built for 3 levels is performing poorly

2014-06-13 Thread Manish Amde
Hi Suraj, I can't answer 1) without knowing the data. However, the results for 2) are surprising indeed. We have tested with a billion samples for regression tasks, so I am perplexed by the behavior. Could you try the latest Spark master to see whether this problem goes away? It has code that li

Re: Random Forest on Spark

2014-04-18 Thread Manish Amde
Sorry for arriving late to the party! Evan has clearly explained the current implementation, our future plans and key differences with the PLANET paper. I don't think I can add more to his comments. :-) I apologize for not creating the corresponding JIRA tickets for the tree improvements (multicla