Re: graphframe out of memory

2017-09-08 Thread Imran Rajjad
No, I did not. I thought Spark would take care of that itself, since I had
put in the arguments.
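
Spark cannot grow the heap from inside the application: with
setMaster("local[10]") the driver and the executors all run in the JVM that
launched the program, so spark.driver.memory and spark.executor.memory set in
SparkConf after startup have no effect on a heap whose size was fixed at
launch. A minimal check of what the process actually received, using only the
standard JDK Runtime API (class name is a placeholder):

// HeapCheck.java - sketch: print the maximum heap this JVM was granted.
// If this shows the small JVM default instead of ~8g, the SparkConf
// memory settings never took effect.
public class HeapCheck {
    public static void main(String[] args) {
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("JVM max heap: %.2f GiB%n",
                maxHeapBytes / (1024.0 * 1024 * 1024));
    }
}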

On Thu, Sep 7, 2017 at 9:26 PM, Lukas Bradley wrote:

> Did you also increase the size of the heap of the Java app that is
> starting Spark?
>
> https://alvinalexander.com/blog/post/java/java-xmx-xms-memory-heap-size-control


-- 
I.R


Re: graphframe out of memory

2017-09-07 Thread Lukas Bradley
Did you also increase the size of the heap of the Java app that is starting
Spark?

https://alvinalexander.com/blog/post/java/java-xmx-xms-memory-heap-size-control
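
In embedded/local mode the heap that matters is the one the launching JVM was
started with, so the -Xmx flag belongs on the java command (or the IDE run
configuration) that starts the application rather than in SparkConf. For
example, with a placeholder jar and main class:

    java -Xmx8g -cp myapp.jar com.example.GraphJob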

On Thu, Sep 7, 2017 at 12:16 PM, Imran Rajjad wrote:

> I am getting an out-of-memory error while running a connectedComponents job
> on a graph with around 12,000 vertices and 134,600 edges.


graphframe out of memory

2017-09-07 Thread Imran Rajjad
I am getting an out-of-memory error while running a connectedComponents job on
a graph with around 12,000 vertices and 134,600 edges. I am running Spark in
embedded mode in a standalone Java application and have tried to increase the
memory, but it seems that it is not taking any effect:

sparkConf = new SparkConf().setAppName("SOME APP NAME").setMaster("local[10]")
    .set("spark.executor.memory", "5g")
    .set("spark.driver.memory", "8g")
    .set("spark.driver.maxResultSize", "1g")
    .set("spark.sql.warehouse.dir", "file:///d:/spark/tmp")
    .set("hadoop.home.dir", "file:///D:/spark-2.1.0-bin-hadoop2.7/bin");

spark = SparkSession.builder().config(sparkConf).getOrCreate();
spark.sparkContext().setLogLevel("ERROR");
spark.sparkContext().setCheckpointDir("D:/spark/tmp");
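
The failing call is the standard GraphFrames connected-components entry point.
A minimal sketch of the presumed invocation (vertex and edge loading omitted;
"vertices" and "edges" are placeholders, and per the GraphFrames docs the
vertex DataFrame needs an "id" column and the edge DataFrame needs "src" and
"dst" columns):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.graphframes.GraphFrame;

// Sketch: vertices/edges are Datasets built elsewhere in the application.
GraphFrame g = GraphFrame.apply(vertices, edges);
// connectedComponents().run() collects intermediate results on the driver
// (the skewedJoin/collect frames in the trace below), which is where the
// driver heap is exhausted.
Dataset<Row> components = g.connectedComponents().run();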

The stack trace:
java.lang.OutOfMemoryError: Java heap space
 at java.util.Arrays.copyOf(Arrays.java:3332)
 at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
 at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
 at java.lang.StringBuilder.append(StringBuilder.java:136)
 at scala.StringContext.standardInterpolator(StringContext.scala:126)
 at scala.StringContext.s(StringContext.scala:95)
 at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:230)
 at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:54)
 at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
 at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2385)
 at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2390)
 at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2390)
 at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2801)
 at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2390)
 at org.apache.spark.sql.Dataset.collect(Dataset.scala:2366)
 at org.graphframes.lib.ConnectedComponents$.skewedJoin(ConnectedComponents.scala:239)
 at org.graphframes.lib.ConnectedComponents$.org$graphframes$lib$ConnectedComponents$$run(ConnectedComponents.scala:308)
 at org.graphframes.lib.ConnectedComponents.run(ConnectedComponents.scala:139)

GraphFrames version is 0.5.0 and Spark version is 2.1.1.

regards,
Imran

-- 
I.R