Caused by: org.apache.spark.SparkException: Task not serializable

That's the answer :)

What are you trying to save? Is it empty or None / null?
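
The quoted trace keeps going below that line, so it may help to list every "Caused by" entry and look at the last one (here it ends in java.nio.BufferUnderflowException inside the Parquet Binary class). Below is a minimal sketch in plain Python. The helper names cause_chain and root_cause are made up for illustration, and it assumes you have the trace as text, for example from str(e) on the caught Py4JJavaError.

```python
import re

# Matches lines such as "Caused by: java.nio.BufferUnderflowException" or
# "> Caused by: org.apache.spark.SparkException: Task not serializable".
# Leading whitespace and email quote markers (">") are tolerated.
CAUSE_RE = re.compile(r"^[>\s]*Caused by:\s*([\w.$]+)(?::\s*(.*))?\s*$")


def cause_chain(trace_text):
    """Return (exception_class, message) pairs in the order they appear."""
    chain = []
    for line in trace_text.splitlines():
        match = CAUSE_RE.match(line)
        if match:
            message = (match.group(2) or "").strip()
            chain.append((match.group(1), message))
    return chain


def root_cause(trace_text):
    """Return the innermost (last) cause, or None if there is none."""
    chain = cause_chain(trace_text)
    return chain[-1] if chain else None


if __name__ == "__main__":
    # Hypothetical, shortened trace text used only to exercise the helpers.
    sample = "\n".join([
        "org.apache.spark.SparkException: Job aborted.",
        "Caused by: org.apache.spark.SparkException: Task not serializable",
        "      at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)",
        "Caused by: java.nio.BufferUnderflowException",
    ])
    for exc_class, message in cause_chain(sample):
        print(exc_class, "|", message)
    print("root cause:", root_cause(sample))
```

The innermost cause is usually the most specific hint about what could not be serialized, so it is worth reading before the top-level message.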


On Wed, Jan 10, 2018 at 4:58 PM, Liana Napalkova <
liana.napalk...@eurecat.org> wrote:

> Hello,
>
>
> Has anybody faced the following problem in PySpark? (Python 2.7.12):
>
>     df.show() # works fine and shows the first 5 rows of DataFrame
>
>     df.write.parquet(outputPath + '/data.parquet', mode="overwrite")  # throws the error
>
> The last line throws the following error:
>
> py4j.protocol.Py4JJavaError: An error occurred while calling o794.parquet.
> : org.apache.spark.SparkException: Job aborted.
>       at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215)
>       at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
>       at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
>       at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
>       at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
>
> Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
>       at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>       at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:123)
>       at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:248)
>       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:127)
>       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:127)
>       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
>
> Caused by: org.apache.spark.SparkException: Task not serializable
>       at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
>       at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>       at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>       at org.apache.spark.SparkContext.clean(SparkContext.scala:2287)
>       at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:794)
>       at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:793)
>       at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>       at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>
> Caused by: java.lang.IllegalArgumentException
>         at java.nio.Buffer.position(Buffer.java:244)
>         at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:153)
>         at java.nio.ByteBuffer.get(ByteBuffer.java:715)
>
> Caused by: java.nio.BufferUnderflowException
>       at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
>       at java.nio.ByteBuffer.get(ByteBuffer.java:715)
>       at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes(Binary.java:405)
>       at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytesUnsafe(Binary.java:414)
>       at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.writeObject(Binary.java:484)
>       at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>
> Thanks.
>
> L.
>
