[jira] [Resolved] (SPARK-22458) OutOfDirectMemoryError with Spark 2.2
[ https://issues.apache.org/jira/browse/SPARK-22458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-22458.
-------------------------------
    Resolution: Not A Problem

Changes in implementation between minor releases can change memory usage. Your overhead is pretty low. This is not a bug and should not be reopened.

> OutOfDirectMemoryError with Spark 2.2
> -------------------------------------
>
>                 Key: SPARK-22458
>                 URL: https://issues.apache.org/jira/browse/SPARK-22458
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, SQL, YARN
>    Affects Versions: 2.2.0
>            Reporter: Kaushal Prajapati
>            Priority: Blocker
>
> For the last 6 months we have been using Spark 2.1 to run multiple Spark jobs, each running 15 hours over 50+ TB of source data, successfully with the configuration below:
> {quote}spark.master                                      yarn
> spark.driver.cores                                10
> spark.driver.maxResultSize                        5g
> spark.driver.memory                               20g
> spark.executor.cores                              5
> spark.executor.extraJavaOptions                   *-XX:+UseG1GC -Dio.netty.maxDirectMemory=1024* -XX:MaxGCPauseMillis=6 *-XX:MaxDirectMemorySize=2048m* -Dlog4j.configuration=file:///conf/log4j.properties -Dhdp.version=2.5.3.0-37
> spark.driver.extraJavaOptions                     *-Dio.netty.maxDirectMemory=2048 -XX:MaxDirectMemorySize=2048m* -Dlog4j.configuration=file:///conf/log4j.properties -Dhdp.version=2.5.3.0-37
> spark.executor.instances                          30
> spark.executor.memory                             30g
> *spark.kryoserializer.buffer.max                  512m*
> spark.network.timeout                             12000s
> spark.serializer                                  org.apache.spark.serializer.KryoSerializer
> spark.shuffle.io.preferDirectBufs                 false
> spark.sql.catalogImplementation                   hive
> spark.sql.shuffle.partitions                      5000
> spark.yarn.driver.memoryOverhead                  1536
> spark.yarn.executor.memoryOverhead                4096
> spark.core.connection.ack.wait.timeout            600s
> spark.scheduler.maxRegisteredResourcesWaitingTime 15s
> spark.sql.hive.filesourcePartitionFileCacheSize   524288000
> spark.dynamicAllocation.executorIdleTimeout       3s
> spark.dynamicAllocation.enabled                   true
> spark.hadoop.yarn.timeline-service.enabled        false
> spark.shuffle.service.enabled                     true
> spark.yarn.am.extraJavaOptions                    *-Dhdp.version=2.5.3.0-37 -Dio.netty.maxDirectMemory=1024 -XX:MaxDirectMemorySize=1024m*{quote}
> Recently we tried to upgrade from Spark 2.1 to Spark 2.2 to pick up some fixes in the latest version, but we started hitting a DirectBuffer OutOfMemoryError and "exceeding memory limits" failures for executor memoryOverhead. We have tried tweaking multiple properties to fix this, but the issue persists.
> Relevant information is shared below. Please let me know if any other details are required.
>
> Snapshot of the DirectMemory error stack trace:
> {code:java}
> 10:48:26.417 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 5.0 in stage 5.3 (TID 25022, dedwdprshc070.de.xxx.com, executor 615): FetchFailed(BlockManagerId(465, dedwdprshc061.de.xxx.com, 7337, None), shuffleId=7, mapId=141, reduceId=3372, message=
> org.apache.spark.shuffle.FetchFailedException: failed to allocate 65536 byte(s) of direct memory (used: 1073699840, max: 1073741824)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:442)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:418)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:59)
> 	at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
> 	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
> 	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown Source)
> 	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
> 	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> 	at org.apache.spark.sql.execution.WholeStageCodegenExec
> {code}
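For context on the numbers in the trace: the "failed to allocate ... (used, max)" message appears to come from Netty's direct-memory accounting, and the arithmetic shows its cap was exhausted at exactly 1 GiB. Below is a simple breakdown of the figures quoted in the trace, plus a restatement of the direct-memory flags from the configuration above (no new data, just the reported values grouped per JVM):

{code}
failed to allocate 65536 byte(s) of direct memory (used: 1073699840, max: 1073741824)

  used       = 1,073,699,840 bytes
  requested  =        65,536 bytes
  max        = 1,073,741,824 bytes   (exactly 1024 MiB = 1 GiB)
  used + requested = 1,073,765,376 bytes > max  ->  OutOfDirectMemoryError

Direct-memory flags as configured, per JVM:
  driver   : -Dio.netty.maxDirectMemory=2048   -XX:MaxDirectMemorySize=2048m
  executor : -Dio.netty.maxDirectMemory=1024   -XX:MaxDirectMemorySize=2048m
  YARN AM  : -Dio.netty.maxDirectMemory=1024   -XX:MaxDirectMemorySize=1024m
{code}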
[jira] [Resolved] (SPARK-22458) OutOfDirectMemoryError with Spark 2.2
[ https://issues.apache.org/jira/browse/SPARK-22458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-22458.
-------------------------------
    Resolution: Not A Problem

Please read http://spark.apache.org/contributing.html

This just means you're out of YARN container memory. You need to increase the overhead size. This is not a bug.
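To make the resolution advice concrete: on YARN, the container request is the executor heap plus spark.yarn.executor.memoryOverhead, and off-heap direct buffers (such as Netty's) must fit inside the overhead portion. A minimal sketch of the suggested change, with an illustrative value rather than a tested recommendation:

{code}
# spark-defaults.conf -- Spark 2.x on YARN; overhead is in MiB
# container size ~ spark.executor.memory + spark.yarn.executor.memoryOverhead
#   as filed: 30g heap + 4096m overhead ~ 34g per container
# Raising the overhead leaves more headroom for the direct buffers used by
# the shuffle fetch path; keep any explicit -XX:MaxDirectMemorySize and
# -Dio.netty.maxDirectMemory caps within this budget.
spark.yarn.executor.memoryOverhead    6144
{code}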