[ 
https://issues.apache.org/jira/browse/SPARK-14240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sayak Ghosh updated SPARK-14240:
--------------------------------
    Description: 
I am relatively new to Spark and wrote a simple script using Python and Spark 
SQL. The script runs fine at the start of execution, but it gradually slows 
down, and by the end of the last phase the whole application hangs.
Here is my code snippet:
hivectx.registerDataFrameAsTable(aggregatedDataV1, "aggregatedDataV1")

# Adjacent string literals are concatenated, so the query stays one string.
q1 = ("SELECT *, (Total_Sale/Sale_Weeks) as Average_Sale_Per_SaleWeek, "
      "(Total_Weeks/Sale_Weeks) as Velocity FROM aggregatedDataV1")

aggregatedData = hivectx.sql(q1)

aggregatedData.show(100)

========== Terminal Hanging with the following =========
16/03/29 09:05:50 INFO TaskSetManager: Finished task 96.0 in stage 416.0 (TID 19992) in 41924 ms on 10.9.0.7 (104/200)
16/03/29 09:05:50 INFO TaskSetManager: Finished task 108.0 in stage 416.0 (TID 20004) in 24608 ms on 10.9.0.10 (105/200)
16/03/29 09:05:50 INFO TaskSetManager: Finished task 105.0 in stage 416.0 (TID 20001) in 24610 ms on 10.9.0.10 (106/200)
16/03/29 09:05:55 INFO TaskSetManager: Starting task 116.0 in stage 416.0 (TID 20012, 10.9.0.10, partition 116,NODE_LOCAL, 2240 bytes)
16/03/29 09:06:31 INFO TaskSetManager: Finished task 99.0 in stage 416.0 (TID 19995) in 78435 ms on 10.9.0.7 (110/200)
16/03/29 09:06:40 INFO TaskSetManager: Starting task 119.0 in stage 416.0 (TID 20015, 10.9.0.10, partition 119,NODE_LOCAL, 2240 bytes)
16/03/29 09:07:12 INFO TaskSetManager: Starting task 122.0 in stage 416.0 (TID 20018, 10.9.0.7, partition 122,NODE_LOCAL, 2240 bytes)
16/03/29 09:07:16 INFO TaskSetManager: Starting task 123.0 in stage 416.0 (TID 20019, 10.9.0.7, partition 123,NODE_LOCAL, 2240 bytes)
16/03/29 09:07:28 INFO TaskSetManager: Finished task 111.0 in stage 416.0 (TID 20007) in 110198 ms on 10.9.0.7 (114/200)
16/03/29 09:07:52 INFO TaskSetManager: Starting task 124.0 in stage 416.0 (TID 20020, 10.9.0.10, partition 124,NODE_LOCAL, 2240 bytes)
16/03/29 09:08:08 INFO TaskSetManager: Finished task 110.0 in stage 416.0 (TID 20006) in 150023 ms on 10.9.0.7 (115/200)
16/03/29 09:08:12 INFO TaskSetManager: Finished task 113.0 in stage 416.0 (TID 20009) in 154120 ms on 10.9.0.7 (116/200)
16/03/29 09:08:16 INFO TaskSetManager: Finished task 116.0 in stage 416.0 (TID 20012) in 145691 ms on 10.9.0.10 (117/200)

There is no sign of error.
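
The "Finished task" entries in the log above show per-task time growing from roughly 25 s to over 150 s within stage 416.0. To make that trend easier to see, the durations can be pulled out of the driver log with a small script. This is only a sketch: the regular expression is written against the TaskSetManager line format shown above, the function name finished_tasks is hypothetical, and the sample text is two entries copied from the log.

```python
import re

# Pattern for TaskSetManager "Finished task" entries as they appear in the
# log above (assumption: other Spark versions may format these differently).
FINISHED = re.compile(
    r"Finished task (?P<task>[\d.]+) in stage (?P<stage>[\d.]+) "
    r"\(TID (?P<tid>\d+)\) in (?P<ms>\d+) ms on (?P<host>\S+) "
    r"\((?P<done>\d+)/(?P<total>\d+)\)"
)


def finished_tasks(log_text):
    """Return one dict per 'Finished task' entry, in log order."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    rows = []
    for m in FINISHED.finditer(flat):
        rows.append({
            "task": m.group("task"),
            "tid": int(m.group("tid")),
            "host": m.group("host"),
            "seconds": int(m.group("ms")) / 1000.0,
            "done": int(m.group("done")),
            "total": int(m.group("total")),
        })
    return rows


# Two entries copied from the log above, including the mail line wrapping.
sample = """
16/03/29 09:05:50 INFO TaskSetManager: Finished task 108.0 in stage 416.0 (TID
20004) in 24608 ms on 10.9.0.10 (105/200)
16/03/29 09:08:12 INFO TaskSetManager: Finished task 113.0 in stage 416.0 (TID
20009) in 154120 ms on 10.9.0.7 (116/200)
"""

for row in finished_tasks(sample):
    print("%(done)d/%(total)d task %(task)s on %(host)s: %(seconds).1f s" % row)
```

Running the same function over the full driver log would show whether the slowdown is limited to one host or affects every executor.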

> PySpark Standalone Application hangs without any Error message
> --------------------------------------------------------------
>
>                 Key: SPARK-14240
>                 URL: https://issues.apache.org/jira/browse/SPARK-14240
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, PySpark
>    Affects Versions: 1.6.0
>            Reporter: Sayak Ghosh
>


