[jira] [Assigned] (SPARK-25992) Accumulators giving KeyError in pyspark

Apache Spark (JIRA) Wed, 16 Jan 2019 00:45:49 -0800


     [ 
https://issues.apache.org/jira/browse/SPARK-25992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Apache Spark reassigned SPARK-25992:
------------------------------------

    Assignee:     (was: Apache Spark)

> Accumulators giving KeyError in pyspark
> ---------------------------------------
>
>                 Key: SPARK-25992
>                 URL: https://issues.apache.org/jira/browse/SPARK-25992
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.3.1
>            Reporter: Abdeali Kothari
>            Priority: Major
>
> I am using accumulators and when I run my code, I sometimes get some warn 
> messages. When I checked, there was nothing accumulated - not sure if I lost 
> info from the accumulator or it worked and I can ignore this error ?
> The message:
> {noformat}
> Exception happened during processing of request from
> ('127.0.0.1', 62099)
> Traceback (most recent call last):
> File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 317, in 
> _handle_request_noblock
>     self.process_request(request, client_address)
> File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 348, in 
> process_request
>     self.finish_request(request, client_address)
> File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 361, in 
> finish_request
>     self.RequestHandlerClass(request, client_address, self)
> File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 696, in 
> __init__
>     self.handle()
> File "/usr/local/hadoop/spark2.3.1/python/pyspark/accumulators.py", line 238, 
> in handle
>     _accumulatorRegistry[aid] += update
> KeyError: 0
> ----------------------------------------
> 2018-11-09 19:09:08 ERROR DAGScheduler:91 - Failed to update accumulators for 
> task 0
> org.apache.spark.SparkException: EOF reached before Python server acknowledged
>       at 
> org.apache.spark.api.python.PythonAccumulatorV2.merge(PythonRDD.scala:634)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1131)
>       at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1123)
>       at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>       at 
> org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1123)
>       at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1206)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
>       at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Assigned] (SPARK-25992) Accumulators giving KeyError in pyspark

Reply via email to