[jira] [Updated] (SPARK-31496) Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError
[ https://issues.apache.org/jira/browse/SPARK-31496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomas Shestakov updated SPARK-31496:

Description:
Local Spark with one core (local[1]) causes an OOM while trying to save a Dataset to a local Parquet file.

{code:java}
SparkSession sparkSession = SparkSession.builder()
        .appName("Loader impl test")
        .master("local[1]")
        .config("spark.ui.enabled", false)
        .config("spark.sql.datetime.java8API.enabled", true)
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .config("spark.kryoserializer.buffer.max", "1g")
        .config("spark.executor.memory", "4g")
        .config("spark.driver.memory", "8g")
        .getOrCreate();
{code}

{noformat}
[20-Apr-2020 11:42:27.877] INFO [boundedElastic-2 o.a.s.s.e.datasources.parquet.ParquetFileFormat:57] q: - Using default output committer for Parquet: org.apache.parquet.hadoop.ParquetOutputCommitter
[20-Apr-2020 11:42:27.877] INFO [boundedElastic-2 o.a.s.s.e.datasources.parquet.ParquetFileFormat:57] q: - Using default output committer for Parquet: org.apache.parquet.hadoop.ParquetOutputCommitter
[20-Apr-2020 11:42:27.967] INFO [boundedElastic-2 o.a.h.mapreduce.lib.output.FileOutputCommitter:108] q: - File Output Committer Algorithm version is 1
[20-Apr-2020 11:42:27.969] INFO [boundedElastic-2 o.a.s.s.e.d.SQLHadoopMapReduceCommitProtocol:57] q: - Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
[20-Apr-2020 11:42:27.970] INFO [boundedElastic-2 o.a.h.mapreduce.lib.output.FileOutputCommitter:108] q: - File Output Committer Algorithm version is 1
[20-Apr-2020 11:42:27.973] INFO [boundedElastic-2 o.a.s.s.e.d.SQLHadoopMapReduceCommitProtocol:57] q: - Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
[20-Apr-2020 11:42:34.371] INFO [boundedElastic-2 org.apache.spark.SparkContext:57] q: - Starting job: save at LoaderImpl.java:305
[20-Apr-2020 11:42:34.389] INFO [dag-scheduler-event-loop org.apache.spark.scheduler.DAGScheduler:57] q: - Got job 0 (save at LoaderImpl.java:305) with 1 output partitions
[20-Apr-2020 11:42:34.390] INFO [dag-scheduler-event-loop org.apache.spark.scheduler.DAGScheduler:57] q: - Final stage: ResultStage 0 (save at LoaderImpl.java:305)
[20-Apr-2020 11:42:34.390] INFO [dag-scheduler-event-loop org.apache.spark.scheduler.DAGScheduler:57] q: - Parents of final stage: List()
[20-Apr-2020 11:42:34.392] INFO [dag-scheduler-event-loop org.apache.spark.scheduler.DAGScheduler:57] q: - Missing parents: List()
[20-Apr-2020 11:42:34.398] INFO [dag-scheduler-event-loop org.apache.spark.scheduler.DAGScheduler:57] q: - Submitting ResultStage 0 (MapPartitionsRDD[6] at save at LoaderImpl.java:305), which has no missing parents
[20-Apr-2020 11:42:34.634] INFO [dag-scheduler-event-loop org.apache.spark.storage.memory.MemoryStore:57] q: - Block broadcast_0 stored as values in memory (estimated size 166.1 KiB, free 18.4 GiB)
[20-Apr-2020 11:42:34.945] INFO [dag-scheduler-event-loop org.apache.spark.storage.memory.MemoryStore:57] q: - Block broadcast_0_piece0 stored as bytes in memory (estimated size 58.0 KiB, free 18.4 GiB)
[20-Apr-2020 11:42:34.949] INFO [dispatcher-BlockManagerMaster org.apache.spark.storage.BlockManagerInfo:57] q: - Added broadcast_0_piece0 in memory on DESKTOP-A1:58276 (size: 58.0 KiB, free: 18.4 GiB)
[20-Apr-2020 11:42:34.953] INFO [dag-scheduler-event-loop org.apache.spark.SparkContext:57] q: - Created broadcast 0 from broadcast at DAGScheduler.scala:1206
[20-Apr-2020 11:42:34.980] INFO [dag-scheduler-event-loop org.apache.spark.scheduler.DAGScheduler:57] q: - Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[6] at save at LoaderImpl.java:305) (first 15 tasks are for partitions Vector(0))
[20-Apr-2020 11:42:34.981] INFO [dag-scheduler-event-loop org.apache.spark.scheduler.TaskSchedulerImpl:57] q: - Adding task set 0.0 with 1 tasks
Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError
	at java.base/java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:125)
	at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:119)
	at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
	at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
	at org.apache.spark.util.ByteBufferOutputStream.write(ByteBufferOutputStream.scala:41)
	at java.base/java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1859)
	at java.base/java.io.ObjectOutputStream.write(ObjectOutputStream.java:712)
	at org.apache.spark.util.Utils$$anon$2.write(Utils.scala:153)
	at com.esotericsoftware.kryo.io.Output.flush(Output.java:185)
	at com.esotericsoftware.kryo.io.Output.close(Output.java:196)
	at org.apache.spark.serializer.KryoSerializationStream.close(KryoSerializer.scala:273)
	at org.apache.spark.util.Utils$.serializeViaNestedStream(Utils.scala:158)
	at org.apache.spark.rdd.ParallelCollectionPartition.$anonfun$writeObject$1(ParallelCollectionRDD.scala:65)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1343)
	at org.apache.spark.rdd.ParallelCollectionPartition.writeObject(ParallelCollectionRDD.scala:51)
	at ...
{noformat}
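A note on the trace (my reading, not something stated in the report): it bottoms out in ByteArrayOutputStream.hugeCapacity, which throws java.lang.OutOfMemoryError when the requested buffer capacity overflows the maximum size of a single Java array (about 2 GiB). The failure happens while ParallelCollectionPartition.writeObject serializes a driver-local collection into one in-memory buffer, so it is the ~2 GiB single-buffer cap being hit, not heap exhaustion, and raising spark.driver.memory or spark.executor.memory would not help. A minimal, Spark-free sketch of the mechanism (class and variable names here are illustrative, not from the reporter's code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class SerializedPartitionSize {
    // Measure how many bytes Java serialization needs for an object by
    // streaming it into an in-memory ByteArrayOutputStream -- the same
    // kind of buffer the stack trace shows Spark writing the partition into.
    static int serializedSize(Serializable o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.size();
    }

    public static void main(String[] args) throws IOException {
        ArrayList<long[]> rows = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            rows.add(new long[1_024]); // ~8 KiB of payload per "row"
        }
        System.out.println("serialized bytes: " + serializedSize(rows));
        // The buffer backing ByteArrayOutputStream is a single byte[], so
        // once the collection's serialized form needs more than roughly
        // Integer.MAX_VALUE bytes (~2 GiB), hugeCapacity throws
        // java.lang.OutOfMemoryError regardless of the -Xmx setting.
    }
}
```

The serialized size scales linearly with the local collection, which is why a large enough in-driver Dataset trips the cap even on a machine with plenty of free memory.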
[jira] [Created] (SPARK-31496) Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError
Tomas Shestakov created SPARK-31496:
---

Summary: Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError
Key: SPARK-31496
URL: https://issues.apache.org/jira/browse/SPARK-31496
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.0.0
Environment: Windows 10 (1909)
JDK 11.0.6
spark-3.0.0-preview2-bin-hadoop3.2
local[1]
Reporter: Tomas Shestakov

Local Spark with one core (local[1]) causes an OOM while trying to save a Dataset to a local Parquet file (full log and stack trace as quoted in the update above).
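One more detail worth noting about the reported configuration (an observation about Spark's behavior, not something confirmed in this report): in local mode, `.config("spark.driver.memory", "8g")` set from application code has no effect, because the driver JVM is already running by the time the builder executes. Driver memory has to be supplied when the JVM is launched, for example (the jar name below is hypothetical):

```shell
# Driver memory must be fixed before the JVM starts, either via spark-submit:
spark-submit --driver-memory 8g --master "local[1]" loader-impl-test.jar

# or, when launching the application with plain `java`, via the heap flag:
java -Xmx8g -jar loader-impl-test.jar
```

Even so, this would only change the available heap; it would not lift the ~2 GiB single-buffer limit visible in the ByteArrayOutputStream.hugeCapacity frame of the stack trace.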