Hi, here are two tips for you:
1. Increase the parallelism level.
2. Increase the driver memory.
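A rough sketch of what the two tips could look like for the configuration quoted below. In local mode the tasks run inside the driver JVM, so as far as I know setting spark.executor.memory in SparkConf does not enlarge the heap; the driver heap has to be sized when the JVM is launched. The partition count of 64, the 4g value, the object name and the input path are illustrative assumptions, not values from the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical wrapper object; only the master and app name come from the original post.
object ApproxStrMatchTuning {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("ApproxStrMatch")
      // Tip 1: raise the default number of partitions used by shuffles,
      // so each task handles a smaller slice of the data.
      // 64 is an arbitrary illustrative value.
      .set("spark.default.parallelism", "64")

    val sc = new SparkContext(conf)

    // Tip 1 (continued): ask for at least 64 partitions when reading the input.
    // The path is a placeholder for the real TSV location on HDFS.
    val lines = sc.textFile("hdfs:///path/to/input.tsv", 64)
    println(s"input partitions = ${lines.partitions.length}")

    // Tip 2: the driver heap cannot be grown from inside the running JVM.
    // Assuming the job is launched with spark-submit, pass it at launch time, e.g.
    //   spark-submit --driver-memory 4g --class ApproxStrMatchTuning app.jar
    // (4g and app.jar are illustrative).

    sc.stop()
  }
}
```

An existing RDD can also be split further with rdd.repartition(n) before the expensive stage, at the cost of a shuffle.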
On Fri, Aug 1, 2014 at 12:58 AM, Sameer Tilak <[email protected]> wrote:

> Hi everyone,
> I have the following configuration. I am currently running my app in local
> mode.
>
> val conf = new SparkConf().setMaster("local[2]").setAppName("ApproxStrMatch")
>   .set("spark.executor.memory", "3g").set("spark.storage.memoryFraction", "0.1")
>
> I am getting the following error. I tried setting spark.executor.memory
> and the memory fraction, but my UI does not show the increase and I
> still get these errors. I am loading a TSV file from HDFS (around 5 GB).
> Does this mean I should update these settings and add more memory, or is it
> something else? The Spark master has 24 GB of physical memory and the workers
> have 16 GB, but we are running other services (CDH 5.1) on these nodes as well.
>
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
> 14/07/31 09:48:17 ERROR Executor: Exception in task ID 5
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Arrays.java:2271)
>     at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 14/07/31 09:48:17 ERROR ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-3,5,main]
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Arrays.java:2271)
>     at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 14/07/31 09:48:17 WARN TaskSetManager: Lost TID 5 (task 1.0:0)
> 14/07/31 09:48:17 WARN TaskSetManager: Loss was due to java.lang.OutOfMemoryError
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Arrays.java:2271)
>     at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 14/07/31 09:48:17 ERROR TaskSetManager: Task 1.0:0 failed 1 times; aborting job
> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Cancelling stage 1
> 14/07/31 09:48:17 INFO DAGScheduler: Failed to run collect at ComputeScores.scala:76
> 14/07/31 09:48:17 INFO Executor: Executor is trying to kill task 6
> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Stage 1 was cancelled
