Re: SparkML algos limitations question.

2016-03-21 Thread Joseph Bradley
The indexing I mentioned is more restrictive than that: each index corresponds to a unique position in a binary tree. (I.e., the first index of row 0 is 1, the first of row 1 is 2, the first of row 2 is 4, etc., IIRC) You're correct that this restriction could be removed; with some careful
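To make the indexing described above concrete, here is a minimal sketch of that numbering scheme. It assumes the root has index 1 and that rows are counted from 0, as in the message. The function names are hypothetical and are not taken from Spark's code.

```python
from typing import Tuple


def first_index_of_row(row: int) -> int:
    """First node index on a 0-based row, assuming the root has index 1."""
    if row < 0:
        raise ValueError("row must be non-negative")
    return 1 << row  # row 0 -> 1, row 1 -> 2, row 2 -> 4


def row_of_index(index: int) -> int:
    """0-based row that a 1-based node index falls on."""
    if index < 1:
        raise ValueError("node indices start at 1")
    return index.bit_length() - 1


def child_indices(index: int) -> Tuple[int, int]:
    """Indices of the left and right children under this numbering."""
    return 2 * index, 2 * index + 1


if __name__ == "__main__":
    print([first_index_of_row(r) for r in range(3)])  # [1, 2, 4]
    print(child_indices(1))                           # (2, 3)
```

Under this scheme an index encodes a fixed position in the tree, whether or not the tree actually contains a node there.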

Re: SparkML algos limitations question.

2016-03-21 Thread Eugene Morozov
Hi Joseph, I thought I understood why decision trees have a limit of 30 levels, but now I'm not so sure. I thought it was because the decision tree is stored in an array whose length is of type int, which cannot exceed 2^31-1. But here are my new discoveries. I've trained two

RE: SparkML algos limitations question.

2016-01-04 Thread Ulanov, Alexander
regards, Alexander

From: Yanbo Liang [mailto:yblia...@gmail.com]
Sent: Sunday, December 27, 2015 2:23 AM
To: Joseph Bradley
Cc: Eugene Morozov; user; d...@spark.apache.org
Subject: Re: SparkML algos limitations question.

Hi Eugene, AFAIK, the current implementation of MultilayerPerceptronClassifier

Re: SparkML algos limitations question.

2016-01-04 Thread Yanbo Liang
> y, December 27, 2015 2:23 AM
> *To:* Joseph Bradley
> *Cc:* Eugene Morozov; user; d...@spark.apache.org
> *Subject:* Re: SparkML algos limitations question.
>
> Hi Eugene,
>
> AFAIK, the current implementation of MultilayerPerceptronClassifier hav

Re: SparkML algos limitations question.

2015-12-27 Thread Yanbo Liang
Hi Eugene, AFAIK, the current implementation of MultilayerPerceptronClassifier has some scalability problems if the model is very large (such as >10M), although I think that limit already covers many use cases. Yanbo 2015-12-16 6:00 GMT+08:00 Joseph Bradley : > Hi

Re: SparkML algos limitations question.

2015-12-15 Thread Joseph Bradley
Hi Eugene, The maxDepth parameter exists because the implementation uses Integer node IDs which correspond to positions in the binary tree. This simplified the implementation. I'd like to eventually modify it to avoid depending on tree node IDs, but that is not yet on the roadmap. There is not
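As a rough illustration of why position-based Integer node IDs cap the depth, the sketch below works out the deepest full binary tree whose largest ID still fits in a signed 32-bit integer. It assumes the root has ID 1 at depth 0 and that children of node i get IDs 2i and 2i+1. This is plain arithmetic under those assumptions, not Spark's implementation, and the names are hypothetical.

```python
INT_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def max_node_id(depth: int) -> int:
    """Largest node ID in a full binary tree of the given depth (root: depth 0, ID 1)."""
    return (1 << (depth + 1)) - 1


def deepest_tree_that_fits(limit: int = INT_MAX) -> int:
    """Greatest depth at which every node ID is still <= limit."""
    depth = 0
    while max_node_id(depth + 1) <= limit:
        depth += 1
    return depth


if __name__ == "__main__":
    print(deepest_tree_that_fits())    # 30
    print(max_node_id(30) == INT_MAX)  # True
```

Under these assumptions the bound comes from the ID range rather than from the number of nodes actually present, so even a sparse, unbalanced tree would hit it.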