JIRA created: https://issues.apache.org/jira/browse/SPARK-7781
Joseph, I agree, I'm debating removing this feature altogether, but I'm
putting the model through its paces.
Thanks.
-Don
On Wed, May 20, 2015 at 7:52 PM, Joseph Bradley wrote:
One more comment: That's a lot of categories for a feature. If it makes
sense for your data, it will run faster if you can group the categories or
split the 1895 categories into a few features which have fewer categories.
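As a rough illustration of the second option, one categorical index can be split into two features that each have far fewer categories. This is only a sketch: the helper name `split_category` and the square-root base are assumptions made for illustration, not anything from MLlib, and a grouping that reflects the meaning of the data will usually be a better choice than this purely arithmetic split.

```python
import math


def split_category(index, num_categories):
    """Split one category index into (high, low) digits in a shared base.

    Hypothetical helper: the base is the smallest integer whose square
    covers num_categories, so each new feature has at most `base` values.
    """
    if not 0 <= index < num_categories:
        raise ValueError("index out of range")
    base = math.isqrt(num_categories - 1) + 1
    high, low = divmod(index, base)
    return high, low, base


# For 1895 categories the base is 44, since 44 * 44 = 1936 >= 1895.
high, low, base = split_category(1894, 1895)
print(high, low, base)
```

Each of the two new features would then be declared with `base` categories in `categoricalFeaturesInfo`, in place of the single entry with 1895.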
On Wed, May 20, 2015 at 3:17 PM, Burak Yavuz wrote:
Could you please open a JIRA for it? The maxBins input is missing from the
Python API.
Would it be possible for you to use the current master? In the current master,
you should be able to use trees with the Pipeline API and DataFrames.
Best,
Burak
On Wed, May 20, 2015 at 2:44 PM, Don Drake wrote:
I'm running Spark v1.3.1 and when I run the following against my dataset:
model = GradientBoostedTrees.trainRegressor(trainingData,
    categoricalFeaturesInfo=catFeatures, maxDepth=6, numIterations=3)
The job will fail with the following message:
Traceback (most recent call last):
File "/Users/dr