[ https://issues.apache.org/jira/browse/SPARK-25992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-25992:
------------------------------------

    Assignee:     (was: Apache Spark)

> Accumulators giving KeyError in pyspark
> ---------------------------------------
>
>                 Key: SPARK-25992
>                 URL: https://issues.apache.org/jira/browse/SPARK-25992
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.3.1
>            Reporter: Abdeali Kothari
>            Priority: Major
>
> I am using accumulators, and when I run my code I sometimes get warning
> messages. When I checked, nothing had been accumulated. I am not sure whether
> I lost information from the accumulator, or whether it worked and I can
> ignore this error.
>
> The message:
> {noformat}
> Exception happened during processing of request from ('127.0.0.1', 62099)
> Traceback (most recent call last):
>   File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 317, in _handle_request_noblock
>     self.process_request(request, client_address)
>   File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 348, in process_request
>     self.finish_request(request, client_address)
>   File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 361, in finish_request
>     self.RequestHandlerClass(request, client_address, self)
>   File "/Users/abdealijk/anaconda3/lib/python3.6/socketserver.py", line 696, in __init__
>     self.handle()
>   File "/usr/local/hadoop/spark2.3.1/python/pyspark/accumulators.py", line 238, in handle
>     _accumulatorRegistry[aid] += update
> KeyError: 0
> ----------------------------------------
> 2018-11-09 19:09:08 ERROR DAGScheduler:91 - Failed to update accumulators for task 0
> org.apache.spark.SparkException: EOF reached before Python server acknowledged
>     at org.apache.spark.api.python.PythonAccumulatorV2.merge(PythonRDD.scala:634)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1131)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$updateAccumulators$1.apply(DAGScheduler.scala:1123)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at org.apache.spark.scheduler.DAGScheduler.updateAccumulators(DAGScheduler.scala:1123)
>     at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1206)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {noformat}
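
For readers trying to interpret the Python half of the trace: the failing statement is an in-place add on a mapping keyed by accumulator id, and the id 0 was not in that mapping when the update arrived. The following is only a minimal pure-Python sketch of that failure mode. The names registry, register, apply_update and apply_update_safe are hypothetical stand-ins, not PySpark's code, and the sketch does not establish why the id was missing in the reported run.

```python
# Hypothetical stand-in for a driver-side registry keyed by accumulator id.
# This is NOT PySpark's implementation; it only mirrors the shape of the
# failing statement in the trace (`registry[aid] += update`).
registry = {}


def register(aid, initial):
    """Record an accumulator's starting value under its id."""
    registry[aid] = initial


def apply_update(aid, update):
    """Merge an update in place; raises KeyError if aid was never registered."""
    registry[aid] += update


def apply_update_safe(aid, update):
    """Merge an update if aid is known; return False instead of raising."""
    if aid not in registry:
        return False
    registry[aid] += update
    return True


register(1, 0)
apply_update(1, 5)          # known id: the value is merged

try:
    apply_update(0, 3)      # unknown id: same error class as in the trace
except KeyError as exc:
    print("KeyError:", exc)
```

In this sketch an update for an unregistered id is never merged, so nothing is accumulated for it. Whether the same holds for the job in the report depends on how the accumulator was created and registered there, which the trace alone does not show.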