It looks like the executor (JVM) stops immediately. Hard to say why; do
you have Java installed, and a compatible version? I agree it could be a
py4j version problem, per that SO link.
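Since a worker that "stops immediately" often points at a missing or incompatible JVM, a quick sanity check is to confirm that `java` is even visible from the Python environment the workers use. This is an illustrative sketch, not part of the thread's code; Spark 2.4.x is documented to run on Java 8, so the banner printed here is what you would compare against.

```python
# Minimal sketch: check that a JVM is visible from the Python side.
# A worker that dies at startup is often a missing or incompatible Java.
import shutil
import subprocess

java_path = shutil.which("java")
print("java on PATH:", java_path is not None)

if java_path:
    # `java -version` writes its version banner to stderr, not stdout.
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True
    )
    print(result.stderr.splitlines()[0])
```

Run this with the same Python interpreter and environment variables that the Spark workers see, since `PATH` and `JAVA_HOME` can differ between your shell and the executor environment.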
On Sat, May 8, 2021, 1:35 PM rajat kumar wrote:
Hi Sean/Mich,
Thanks for the response.
That was the full log; sending it again for reference. I am just running
foreach(lambda), which runs pure Python code.
Exception in read_logs : Py4JJavaError Traceback (most recent call last):
File "/opt/spark/python/lib/python3.6/site-packages/filename.py",
By YARN mode I meant dealing with issues raised cluster-wide.
From personal experience, I find it easier to trace these sorts of errors
when running the code in local mode: the problem could be related to the
set-up, and it is easier to track where things go wrong locally.
I don't see any reason to think this is related to YARN.
You haven't shown the actual error @rajat so not sure there is anything to
say.
On Fri, May 7, 2021 at 3:08 PM Mich Talebzadeh
wrote:
I have a suspicion that this may be caused by your cluster, as it appears
that you are running this in YARN mode, like below:
spark-submit --master yarn --deploy-mode client xyx.py
What happens if you try running it in local mode?
spark-submit --master local[2] xyx.py
Is this run in a managed
Thanks Mich and Sean for the response. Yes, Sean is right; this is a batch
job.
I have only 10 records in the dataframe, yet it still gives this
exception.
Following are the full logs.
File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line
584, in foreach
foreach definitely works :)
This is not a streaming question.
The error says that the JVM worker died for some reason. You'd have to look
at its logs to see why.
On Fri, May 7, 2021 at 11:03 AM Mich Talebzadeh
wrote:
Hi,
I am not convinced foreach works even in 3.1.1
Try doing the same with foreachBatch
foreachBatch(sendToSink). \
trigger(processingTime='2 seconds'). \
and see if it works
HTH
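For context on Mich's suggestion: `foreachBatch` belongs to Structured Streaming and hands each micro-batch to a Python function as `(micro_batch_df, batch_id)`. The stand-in below sketches only the shape of that sink function, with a plain list of dicts in place of a DataFrame so it can run without a Spark installation; `send_to_sink` and the records are illustrative names, not from the thread.

```python
# Sketch of the sink-function shape used with foreachBatch in Structured
# Streaming. foreachBatch calls the function with (micro_batch_df, batch_id);
# here the DataFrame is stood in by a plain list of dicts so the logic can
# be exercised without Spark. All names are illustrative.

processed_batches = {}

def send_to_sink(micro_batch, batch_id):
    # In real code micro_batch would be a pyspark DataFrame and this would
    # write it out (e.g. micro_batch.write.format(...).save(...)).
    processed_batches[batch_id] = [
        {**record, "processed": True} for record in micro_batch
    ]

# Simulate two micro-batches arriving from a stream.
send_to_sink([{"id": 1}, {"id": 2}], batch_id=0)
send_to_sink([{"id": 3}], batch_id=1)

print(len(processed_batches))  # 2
```

Note, though, that as Sean points out below, the original question is a plain batch `DataFrame.foreach`, not a streaming job, so `foreachBatch` would not apply there.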
Hi Team,
I am using Spark 2.4.4 with Python
While using the line below:
dataframe.foreach(lambda record : process_logs(record))
My use case: process_logs will download the file from cloud storage
using Python code and then save the processed data.
I am getting the following error
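Since `process_logs` is pure Python, a useful first debugging step is to run it on a single record outside Spark, which separates Python-side failures from executor/JVM problems. The `process_logs` below is a hypothetical stand-in for the real one (which downloads from cloud storage and saves output); a plain dict stands in for a Row.

```python
# Debugging sketch: exercise the per-record function outside Spark first.
# process_logs here is a hypothetical stand-in: the real one downloads a
# file from cloud storage, processes it, and saves the result.

def process_logs(record):
    # Stand-in for: download record["path"], process it, save the output.
    # Here we just derive an output name to keep the sketch runnable.
    return record["path"] + ".processed"

# A record as the lambda would receive it; a plain dict is enough to test
# the Python logic without a Spark session.
sample = {"path": "logs/2021-05-07.txt"}
print(process_logs(sample))  # logs/2021-05-07.txt.processed

# Once this works standalone, run it under Spark:
#   dataframe.foreach(lambda record: process_logs(record))
# Any remaining failure then points at the executor environment (Java
# version, py4j, cluster set-up) rather than the Python logic.
```

This matches the advice elsewhere in the thread: reproduce in local mode first, then look at the worker logs if the JVM still dies.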