Hi Spark users,

I'm trying to upgrade to Spark 1.2 and am seeing some very slow queries; I'm wondering if someone can point me in the right direction for debugging.

The Spark UI shows a job with a duration of 15 s (see attached screenshot), which would be great, but client-side measurement shows the query takes just over 4 min:

15/04/15 16:34:59 INFO ParseDriver: Parsing command: ***my query here***
15/04/15 16:38:28 INFO ParquetTypesConverter: Falling back to schema conversion from Parquet types
15/04/15 16:38:29 INFO DAGScheduler: Got job 6 (collect at SparkPlan.scala:84) with 401 output partitions (allowLocal=false)

So there is a gap of almost 4 min between the parse and the next line I can identify as relating to this job. Can someone shed some light on what happens between the parse and the DAG scheduler? The Spark UI also shows the submitted time as 16:38, which leads me to believe it counts time from when the scheduler gets the job... but what happens before then?
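In case it helps anyone reproduce what I'm measuring, here is roughly how I'd bracket the phases client-side. This is only a sketch against the Spark 1.2 API (it assumes an existing `sqlContext` and uses the `queryExecution` field on the returned SchemaRDD to force analysis/planning separately from execution); the `timed` helper is just something I made up for illustration:

    // Hypothetical helper: time an arbitrary block and print the elapsed seconds.
    def timed[T](label: String)(body: => T): T = {
      val start = System.nanoTime()
      val result = body            // evaluate the block
      println(f"$label: ${(System.nanoTime() - start) / 1e9}%.1f s")
      result
    }

    // Phase 1: parsing only -- sql() is lazy and returns a SchemaRDD.
    val rdd = timed("parse")(sqlContext.sql("***my query here***"))

    // Phase 2: force analysis, optimization, and physical planning
    // without running a job, by touching the executed plan.
    timed("analyze/plan")(rdd.queryExecution.executedPlan)

    // Phase 3: actual job submission and execution on the cluster.
    val rows = timed("execute")(rdd.collect())

If the "analyze/plan" step accounts for most of the 4 min, that would point at driver-side planning work (e.g. the Parquet schema conversion the log mentions) rather than anything the DAG scheduler does.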
--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org