[ https://issues.apache.org/jira/browse/SPARK-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Shen updated SPARK-5421:
-----------------------------
    Description: 
ExternalAppendOnlyMap is only used for Spark jobs whose aggregator isDefined, but Spark SQL's ShuffledRDD doesn't define an aggregator, so Spark SQL never spills at shuffle, and it's very easy to hit an OOM there. I think Spark SQL also needs to spill at shuffle.
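
To make the claim concrete, here is a minimal, self-contained Scala sketch of the 1.2 shuffle-read decision as I understand it. ShuffleReadSketch, Dep and read are illustrative stand-ins, not Spark's real API; in Spark itself the path runs through ShuffleDependency, Aggregator.combineValuesByKey and ExternalAppendOnlyMap.

// Minimal sketch of the shuffle-read decision. Types are illustrative
// stand-ins, not Spark's real classes.
object ShuffleReadSketch {
  // Stand-in for ShuffleDependency; Spark SQL's ShuffledRDD leaves this None.
  final case class Dep[K, V](aggregator: Option[(V, V) => V])

  def read[K, V](dep: Dep[K, V], fetched: Iterator[(K, V)]): Iterator[(K, V)] =
    dep.aggregator match {
      case Some(merge) =>
        // Aggregating path: in real Spark, Aggregator.combineValuesByKey feeds
        // an ExternalAppendOnlyMap, which spills sorted runs to disk under
        // memory pressure. A plain in-memory map stands in for it here.
        val combined = scala.collection.mutable.HashMap.empty[K, V]
        fetched.foreach { case (k, v) =>
          combined.update(k, combined.get(k).map(merge(_, v)).getOrElse(v))
        }
        combined.iterator
      case None =>
        // Non-aggregating path (the Spark SQL case): the fetched records are
        // returned directly and must all fit on the executor heap -- nothing
        // can spill, which matches the Full GC / OOM pattern in the logs below.
        fetched
    }

  def main(args: Array[String]): Unit = {
    val rows = List(("a", 1), ("b", 2), ("a", 3))
    // With an aggregator the values are merged (and, in Spark, spillable):
    println(read(Dep[String, Int](Some(_ + _)), rows.iterator).toList)
    // Without one (the Spark SQL case) they pass through untouched:
    println(read(Dep[String, Int](None), rows.iterator).toList)
  }
}

The fix this report asks for amounts to giving the None branch a spillable structure as well, instead of holding every fetched record on the executor heap.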
Here is the log from one of the executors. First, stderr:
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Don't have map outputs for shuffle 1, fetching them
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Doing the fetch; tracker actor = Actor[akka.tcp://sparkDriver@10.196.128.140:40952/user/MapOutputTracker#1435377484]
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Got the output locations
15/01/27 07:02:19 INFO storage.ShuffleBlockFetcherIterator: Getting 143 non-empty blocks out of 143 blocks
15/01/27 07:02:19 INFO storage.ShuffleBlockFetcherIterator: Started 4 remote fetches in 72 ms
15/01/27 07:47:29 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM

And stdout (the GC log). The heap is pinned at its ~3.8 GB ceiling (3961344K) and each Full GC reclaims almost nothing, until the -XX:OnOutOfMemoryError hook runs kill, which is the SIGTERM seen in stderr above:
2015-01-27T07:44:43.487+0800: [Full GC 3961343K->3959868K(3961344K), 29.8959290 secs]
2015-01-27T07:45:13.460+0800: [Full GC 3961343K->3959992K(3961344K), 27.9218150 secs]
2015-01-27T07:45:41.407+0800: [GC 3960347K(3961344K), 3.0457450 secs]
2015-01-27T07:45:52.950+0800: [Full GC 3961343K->3960113K(3961344K), 29.3894670 secs]
2015-01-27T07:46:22.393+0800: [Full GC 3961118K->3960240K(3961344K), 28.9879600 secs]
2015-01-27T07:46:51.393+0800: [Full GC 3960240K->3960213K(3961344K), 34.1530900 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill %p"
#   Executing /bin/sh -c "kill 9050"...
2015-01-27T07:47:25.921+0800: [GC 3960214K(3961344K), 3.3959300 secs]
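
For reference, this is roughly the kind of ShuffledRDD that Spark SQL's Exchange operator builds (simplified; the real Exchange also sets a serializer). ShuffledRDD and HashPartitioner are Spark's real API, while the data and app name are made up. No aggregator is ever set on it, so the read side takes the non-spilling branch sketched above. Runnable against Spark 1.2:

import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.ShuffledRDD

object NoAggregatorShuffle {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("no-aggregator-shuffle").setMaster("local[2]"))
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    // Simplified version of what Exchange does: a ShuffledRDD with no call to
    // setAggregator, so nothing on the read side can spill to disk.
    val shuffled = new ShuffledRDD[String, Int, Int](pairs, new HashPartitioner(2))
    println(shuffled.collect().toSeq)
    sc.stop()
  }
}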


  was:
ExternalAppendOnlyMap is only used for Spark jobs whose aggregator isDefined, but Spark SQL's ShuffledRDD doesn't define an aggregator, so Spark SQL never spills at shuffle, and it's very easy to hit an OOM there.
Here is the log from one of the executors. First, stderr:
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Don't have map outputs for shuffle 1, fetching them
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Doing the fetch; tracker actor = Actor[akka.tcp://sparkDriver@10.196.128.140:40952/user/MapOutputTracker#1435377484]
15/01/27 07:02:19 INFO spark.MapOutputTrackerWorker: Got the output locations
15/01/27 07:02:19 INFO storage.ShuffleBlockFetcherIterator: Getting 143 non-empty blocks out of 143 blocks
15/01/27 07:02:19 INFO storage.ShuffleBlockFetcherIterator: Started 4 remote fetches in 72 ms
15/01/27 07:47:29 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM

And stdout (the GC log):
2015-01-27T07:44:43.487+0800: [Full GC 3961343K->3959868K(3961344K), 29.8959290 secs]
2015-01-27T07:45:13.460+0800: [Full GC 3961343K->3959992K(3961344K), 27.9218150 secs]
2015-01-27T07:45:41.407+0800: [GC 3960347K(3961344K), 3.0457450 secs]
2015-01-27T07:45:52.950+0800: [Full GC 3961343K->3960113K(3961344K), 29.3894670 secs]
2015-01-27T07:46:22.393+0800: [Full GC 3961118K->3960240K(3961344K), 28.9879600 secs]
2015-01-27T07:46:51.393+0800: [Full GC 3960240K->3960213K(3961344K), 34.1530900 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill %p"
#   Executing /bin/sh -c "kill 9050"...
2015-01-27T07:47:25.921+0800: [GC 3960214K(3961344K), 3.3959300 secs]



> SparkSql throws OOM at shuffle
> ------------------------------
>
>                 Key: SPARK-5421
>                 URL: https://issues.apache.org/jira/browse/SPARK-5421
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>            Reporter: Hong Shen
>


