Adding Logs.

When it launches the multiple applications, the following logs appear on the
terminal. It also retries the task every time:

20/10/13 12:04:30 WARN TaskSetManager: Lost task XX in stage XX (TID XX, executor 5): java.net.SocketException: Broken pipe (Write failed)
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
        at org.apache.spark.api.python.PythonRDD$.org$apache$spark$api$python$PythonRDD$$write$1(PythonRDD.scala:212)
        at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:224)
        at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:224)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:224)
        at org.apache.spark.api.python.PythonRunner$$anon$2.writeIteratorToStream(PythonRunner.scala:561)
        at org.apache.spark.api.python.BasePythonRunner$WriterThread$$anonfun$run$1.apply(PythonRunner.scala:346)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
        at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:195)
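For context on the retries: Spark re-submits a failed task up to
spark.task.maxFailures times (4 by default) before failing the stage, which is
why the warning above keeps repeating. A minimal sketch of lowering that limit
while debugging, assuming the session is built in PySpark (the app name here
is hypothetical):

    from pyspark.sql import SparkSession

    # Fail fast while debugging: allow only 1 attempt per task instead of
    # the default 4, so the first Broken pipe fails the stage immediately.
    spark = (
        SparkSession.builder
        .appName("debug-broken-pipe")           # hypothetical app name
        .config("spark.task.maxFailures", "1")  # default is 4
        .getOrCreate()
    )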

Kind Regards,
Sachit Murarka


On Tue, Oct 13, 2020 at 4:02 PM Sachit Murarka <connectsac...@gmail.com>
wrote:

> Hi Users,
>
> When an action (I am using count and write) gets executed in my Spark job,
> it launches many more application instances (around 25 more apps).
>
> In my Spark code, I run the transformations through DataFrames, convert the
> DataFrame to an RDD, apply zipWithIndex, convert it back to a DataFrame, and
> then apply two actions (count and write); see the sketch after this quoted
> message.
>
> Please note: this was working fine until last week; the issue started
> yesterday.
>
> Could you please tell me what the reason for this behavior might be?
>
> Kind Regards,
> Sachit Murarka
>
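For reference, here is a minimal sketch of the pipeline described in the
quoted message (DataFrame -> RDD -> zipWithIndex -> DataFrame -> count and
write); the input data, column names, and output path are hypothetical
stand-ins, not the actual job:

    from pyspark.sql import Row, SparkSession

    spark = SparkSession.builder.appName("zipwithindex-pipeline").getOrCreate()

    # Hypothetical input; in the real job this is produced by the upstream
    # DataFrame transformations.
    df = spark.createDataFrame([Row(value="a"), Row(value="b"), Row(value="c")])

    # DataFrame -> RDD, attach a row index with zipWithIndex, back to DataFrame.
    indexed = (
        df.rdd
        .zipWithIndex()  # yields (Row, index) pairs
        .map(lambda pair: Row(index=pair[1], **pair[0].asDict()))
        .toDF()
    )

    # The two actions from the description: count, then write.
    print(indexed.count())
    indexed.write.mode("overwrite").parquet("/tmp/indexed_output")  # hypothetical path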
