Hi,

I have run the ETL job several times, and it always fails after about 40
minutes. If I relaunch Jupyter and rerun the job, it runs without error at
first, then fails again after some time. Has anyone else encountered this
before?

Here's the error message:

----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 40292)

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.5/socketserver.py", line 313, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/home/ubuntu/anaconda3/lib/python3.5/socketserver.py", line 341, in process_request
    self.finish_request(request, client_address)
  File "/home/ubuntu/anaconda3/lib/python3.5/socketserver.py", line 354, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/ubuntu/anaconda3/lib/python3.5/socketserver.py", line 681, in __init__
    self.handle()
  File "/home/ubuntu/spark-2.1.1-bin-hadoop2.7/python/pyspark/accumulators.py", line 235, in handle
    num_updates = read_int(self.rfile)
  File "/home/ubuntu/spark-2.1.1-bin-hadoop2.7/python/pyspark/serializers.py", line 577, in read_int
    raise EOFError
EOFError
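
As far as I can tell, the last frame means the accumulator server on the
driver tried to read a 4-byte integer from its local socket and got nothing
back, which usually indicates that the other end closed the connection. The
snippet below is only my sketch of that behaviour, not the actual pyspark
source: read_int_sketch and try_read are hypothetical names, and the "!i"
(big-endian 32-bit) framing is my assumption about the wire format.

```python
import io
import struct


def read_int_sketch(stream):
    # Hypothetical stand-in for the read_int frame in the traceback:
    # read 4 bytes and decode them as a big-endian signed int (assumed format).
    data = stream.read(4)
    if not data:
        # An empty read means the peer closed the stream before sending anything.
        raise EOFError
    return struct.unpack("!i", data)[0]


def try_read(payload):
    # Wrap the payload in an in-memory stream so no real socket is needed.
    try:
        return read_int_sketch(io.BytesIO(payload))
    except EOFError:
        return "EOFError"


print(try_read(struct.pack("!i", 3)))  # a well-formed update count
print(try_read(b""))                   # a closed/empty stream
```

If that reading is right, the EOFError itself is a symptom of the connection
going away rather than the root cause of the job failure.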
