[ https://issues.apache.org/jira/browse/SPARK-28575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kushal Mahajan reopened SPARK-28575:
------------------------------------

> Spark job time increasing when upgrading Spark from 2.1.1 to 2.3.1
> ------------------------------------------------------------------
>
>                 Key: SPARK-28575
>                 URL: https://issues.apache.org/jira/browse/SPARK-28575
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>   Affects Versions: 2.3.1
>            Reporter: Kushal Mahajan
>            Priority: Major
>         Attachments: spark_2.1_screenshot.PNG, spark_2.3_screenshot.PNG
>
>
> I was running a Spark job on a standalone cluster with Spark 2.1.1. After the
> cluster was upgraded from 2.1.1 to Spark 2.3.1, the job became considerably
> slower (roughly 3-4 times). On investigation, I found a considerable time lag
> (ranging from 30 sec to 2 min) between the start times of successive Spark
> actions, excluding the time taken by the actions themselves. The lag can be
> seen from the start time of each job on the Spark UI page. It was not present
> in Spark 2.1.1. Can anybody tell what the issue is here?
> PS: I am reading multiple text files from S3 using wholeTextFiles, creating
> multiple DataFrames from those text files, and writing them out to S3 in CSV
> format.
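The lag described in the report (time between one action finishing and the next one starting, excluding the actions' own run time) can be computed from the submission time and duration that the Spark UI shows for each job. The following is only a minimal sketch of that arithmetic: the function name `idle_gaps` and all timestamps are hypothetical illustration values, not data taken from the attached screenshots.

```python
from datetime import datetime, timedelta


def idle_gaps(jobs):
    """Return the idle time between consecutive jobs.

    `jobs` is a list of (submitted, duration) tuples sorted by submission
    time, i.e. the "Submitted" and "Duration" columns of the Spark UI.
    """
    gaps = []
    for (prev_start, prev_duration), (next_start, _) in zip(jobs, jobs[1:]):
        prev_end = prev_start + prev_duration
        # Clamp at zero so overlapping jobs do not yield negative gaps.
        gaps.append(max(next_start - prev_end, timedelta(0)))
    return gaps


# Hypothetical values for illustration only (not from the report).
t0 = datetime(2000, 1, 1, 10, 0, 0)
jobs = [
    (t0, timedelta(seconds=20)),
    (t0 + timedelta(seconds=65), timedelta(seconds=15)),
    (t0 + timedelta(seconds=170), timedelta(seconds=25)),
]

print([gap.total_seconds() for gap in idle_gaps(jobs)])  # [45.0, 90.0]
```

Comparing these gaps between the 2.1.1 and 2.3.1 runs would separate time spent outside any job (for example on the driver) from time spent inside the jobs themselves.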