[jira] [Updated] (SPARK-36414) Disable timeout for BroadcastQueryStageExec in AQE

2021-08-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-36414:
--
Parent: SPARK-33828
Issue Type: Sub-task  (was: Improvement)

> Disable timeout for BroadcastQueryStageExec in AQE
> --
>
> Key: SPARK-36414
> URL: https://issues.apache.org/jira/browse/SPARK-36414
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.3, 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: image-2021-08-04-18-53-44-879.png
>
>
> This reverts SPARK-31475, because AQE mode always runs more concurrent jobs,
> especially when multiple queries run at the same time. Currently, the
> broadcast timeout does not measure only the execution of the
> BroadcastQueryStageExec; it also includes the time spent waiting to be
> scheduled. If all resources are occupied materializing other stages, the
> broadcast times out without ever getting a chance to run.
>  
> !image-2021-08-04-18-53-44-879.png!
>  
> The default value is 300s, and it is hard to tune the timeout for AQE mode.
> Real-world cases usually need an extremely large value. In the example
> above, the timeout was set to 1800s, and it evidently needs to be roughly 3x
> larger.
>  
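
The scheduling effect described in the issue can be illustrated outside Spark.
The following is a minimal sketch, not Spark code: it assumes a single-worker
thread pool as a stand-in for a saturated cluster, and the function names
(long_stage, broadcast_stage, run_with_submission_timeout,
run_with_execution_timeout) are hypothetical. It shows that a timeout counted
from submission can fire while the task is still queued, even though the task
itself is fast. The second function is only a contrast case; the change
proposed in this issue disables the timeout rather than re-basing it.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def long_stage():
    # Stand-in for another stage that occupies all available resources.
    time.sleep(0.5)
    return "stage done"


def broadcast_stage():
    # Stand-in for a broadcast job that is quick once it actually runs.
    time.sleep(0.05)
    return "broadcast done"


def run_with_submission_timeout(timeout_s):
    # The clock starts at submission, so queue wait counts against the timeout.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(long_stage)
        fut = pool.submit(broadcast_stage)
        try:
            return fut.result(timeout=timeout_s)
        except TimeoutError:
            return "timed out while queued"


def run_with_execution_timeout(timeout_s):
    # The clock starts only once the task has begun running.
    started = threading.Event()

    def tracked():
        started.set()
        return broadcast_stage()

    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(long_stage)
        fut = pool.submit(tracked)
        started.wait()  # queue wait is not charged to the timeout
        try:
            return fut.result(timeout=timeout_s)
        except TimeoutError:
            return "timed out while running"


if __name__ == "__main__":
    print(run_with_submission_timeout(0.2))
    print(run_with_execution_timeout(0.2))
```

With a 0.2s limit, the first call gives up while the broadcast task is still
waiting behind the 0.5s stage, while the second lets the same task finish
because only its run time is measured.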



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-36414) Disable timeout for BroadcastQueryStageExec in AQE

2021-08-04 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-36414:
-
Description: 
This reverts SPARK-31475, because AQE mode always runs more concurrent jobs,
especially when multiple queries run at the same time. Currently, the
broadcast timeout does not measure only the execution of the
BroadcastQueryStageExec; it also includes the time spent waiting to be
scheduled. If all resources are occupied materializing other stages, the
broadcast times out without ever getting a chance to run.

 

!image-2021-08-04-18-53-44-879.png!

 

The default value is 300s, and it is hard to tune the timeout for AQE mode.
Real-world cases usually need an extremely large value. In the example
above, the timeout was set to 1800s, and it evidently needs to be roughly 3x
larger.

 

  was:
This reverts SPARK-31475, because AQE mode always runs more concurrent jobs,
especially when multiple queries run at the same time. Currently, the
broadcast timeout does not measure only the execution of the
BroadcastQueryStageExec; it also includes the time spent waiting to be
scheduled. If all resources are occupied materializing other stages, the
broadcast times out without ever getting a chance to run.

 

!image-2021-08-04-18-48-15-385.png!

 

The default value is 300s, and it is hard to tune the timeout for AQE mode.
Real-world cases usually need an extremely large value. In the example
above, the timeout was set to 1800s, and it evidently needs to be roughly 3x
larger.

 


> Disable timeout for BroadcastQueryStageExec in AQE
> --
>
> Key: SPARK-36414
> URL: https://issues.apache.org/jira/browse/SPARK-36414
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.3, 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
> Attachments: image-2021-08-04-18-53-44-879.png
>
>
> This reverts SPARK-31475, because AQE mode always runs more concurrent jobs,
> especially when multiple queries run at the same time. Currently, the
> broadcast timeout does not measure only the execution of the
> BroadcastQueryStageExec; it also includes the time spent waiting to be
> scheduled. If all resources are occupied materializing other stages, the
> broadcast times out without ever getting a chance to run.
>  
> !image-2021-08-04-18-53-44-879.png!
>  
> The default value is 300s, and it is hard to tune the timeout for AQE mode.
> Real-world cases usually need an extremely large value. In the example
> above, the timeout was set to 1800s, and it evidently needs to be roughly 3x
> larger.
>  






[jira] [Updated] (SPARK-36414) Disable timeout for BroadcastQueryStageExec in AQE

2021-08-04 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-36414:
-
Attachment: image-2021-08-04-18-53-44-879.png

> Disable timeout for BroadcastQueryStageExec in AQE
> --
>
> Key: SPARK-36414
> URL: https://issues.apache.org/jira/browse/SPARK-36414
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.3, 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
> Attachments: image-2021-08-04-18-53-44-879.png
>
>
> This reverts SPARK-31475, because AQE mode always runs more concurrent jobs,
> especially when multiple queries run at the same time. Currently, the
> broadcast timeout does not measure only the execution of the
> BroadcastQueryStageExec; it also includes the time spent waiting to be
> scheduled. If all resources are occupied materializing other stages, the
> broadcast times out without ever getting a chance to run.
>  
> !image-2021-08-04-18-48-15-385.png!
>  
> The default value is 300s, and it is hard to tune the timeout for AQE mode.
> Real-world cases usually need an extremely large value. In the example
> above, the timeout was set to 1800s, and it evidently needs to be roughly 3x
> larger.
>  


