Re: Size exceeds Integer.MAX_VALUE issue with RandomForest

2017-09-18 Thread Pulluru Ranjith
Hi,

Here are the commands that were used.
-
> spark.default.parallelism=1000
> sparkR.session()
Java ref type org.apache.spark.sql.SparkSession id 1
> sql("use test")
SparkDataFrame[]
> mydata <- sql("select c1, p1, rt1, c2, p2, rt2, avt, avn from test_temp2
                 where vdr = 'TEST31X'")
>
> nrow(mydata)
[1] 544140
> lat_model <- spark.randomForest(mydata, avt ~ c1 + p1 + rt1 + c2 + p2 + rt2,
                                  maxDepth = 30)
[Stage 10:==>  (7 + 1) / 8]
17/09/18 10:50:30 WARN TaskSetManager: Lost task 0.0 in stage 10.0 (TID 66,
node1.test, executor 1): java.lang.IllegalArgumentException: Size exceeds
Integer.MAX_VALUE
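
For reference, a minimal sketch of the repartitioning fix suggested in the
reply below, assuming the same session as above; the partition count of 200
is an illustrative guess, not a tuned value:

# Rebuild the SparkDataFrame, then split it into more, smaller partitions so
# that no single cached block exceeds the 2 GB block-size limit.
mydata <- sql("select c1, p1, rt1, c2, p2, rt2, avt, avn from test_temp2
               where vdr = 'TEST31X'")
mydata <- repartition(mydata, numPartitions = 200)
lat_model <- spark.randomForest(mydata, avt ~ c1 + p1 + rt1 + c2 + p2 + rt2,
                                maxDepth = 30)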


Re: Size exceeds Integer.MAX_VALUE issue with RandomForest

2017-09-16 Thread Akhil Das
What parameters did you pass to the classifier, and how large is your
training data? You are hitting that issue because one of your blocks is over
2 GB; repartitioning the data will help.
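
For context, that 2 GB figure is Integer.MAX_VALUE bytes: roughly speaking,
Spark keeps a block in a single Java byte array or ByteBuffer, and Java
arrays are indexed by 32-bit ints, so one block cannot exceed that size.
In R terms:

.Machine$integer.max  # 2147483647 = 2^31 - 1 bytes, just under 2 GiB per block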


-- 
Cheers!


Size exceeds Integer.MAX_VALUE issue with RandomForest

2017-09-15 Thread rpulluru
Hi,

I am using the SparkR spark.randomForest function and running into a
java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE error.
It looks like I am hitting
https://issues.apache.org/jira/browse/SPARK-1476. I set
spark.default.parallelism=1000 but am still facing the same issue.
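
A note for anyone hitting the same wall: spark.default.parallelism sets the
default partition count for RDD operations, so on its own it may not change
how a SparkDataFrame is partitioned; an explicit repartition() on the
SparkDataFrame may still be needed. A sketch of passing both settings at
session start, with illustrative rather than tuned values:

library(SparkR)
sparkR.session(sparkConfig = list(
  spark.default.parallelism    = "1000",  # default partitions for RDD operations
  spark.sql.shuffle.partitions = "1000"   # partitions for SparkDataFrame shuffles
))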

Thanks


