[ https://issues.apache.org/jira/browse/SPARK-28575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897826#comment-16897826 ]
Hyukjin Kwon commented on SPARK-28575:
--------------------------------------

Please show the reproducer (self-contained if possible) and fix the title / description properly to describe the issue. Otherwise, no one knows what the issue is.

> Spark job time increasing when upgrading Spark from 2.1.1 to 2.3.1
> ------------------------------------------------------------------
>
>                 Key: SPARK-28575
>                 URL: https://issues.apache.org/jira/browse/SPARK-28575
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.1
>            Reporter: Kushal Mahajan
>            Priority: Major
>
> I am running a Spark job on a standalone cluster that was upgraded from
> Spark 2.1.1 to Spark 2.3.1. After the upgrade, the job became considerably
> slower (~3-4 times). Upon investigation, I found a considerable time lag
> (ranging from 30 sec to 2 min) between the start times of different Spark
> actions, excluding the time taken by the actions themselves. This can be
> seen from the start time of each job in the Spark UI page. The lag was not
> there in Spark 2.1.1. Can anybody tell what the issue is here?
> PS: I am reading multiple text files from S3 using wholeTextFiles, creating
> multiple dataframes for those text files, and writing them out to S3 in CSV
> format.