[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-30 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: explainMode-cost.zip

> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
> Issue Type: Bug
> Components: Optimizer, SQL
> Affects Versions: 3.2.2
> Reporter: Felipe
> Priority: Major
> Attachments:
>   1.1.BeforeAnalyzeTable-joinreorder-disabled.txt
>   1.2.BeforeAnalyzeTable-joinreorder-enabled.txt
>   2.1.AfterAnalyzeTable WITHOUT ForAllColumns-joinreorder-disabled.txt
>   2.2.AfterAnalyzeTable WITHOUT ForAllColumns-joinreorder-enabled.txt
>   3.1.AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt
>   3.2.AfterAnalyzeTableForAllColumns-joinreorder-enabled.txt
>   explainMode-cost.zip
>
>
> I'm using TPC-DS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS clause) some queries became extremely slow. For example, 
> query24 ([https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql]) 
> takes 10 to 15 minutes before running ANALYZE TABLE.
> After running ANALYZE TABLE, I waited 24 hours before cancelling the execution.
> If I disable either spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled, the query becomes fast again.
> It seems that join reordering does not work well when table-level stats are 
> available but column-level stats are not.
> Row counts:
> store_sales - 2879966589
> store_returns - 288009578
> store - 1002
> item - 30
> customer - 1200
> customer_address - 600
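
For readers who want to reproduce the setup described above, the following is a minimal sketch rather than the reporter's actual script. It builds the ANALYZE TABLE statements (with or without FOR ALL COLUMNS) and the two session settings named in the report. The helper names analyze_statements and workaround_settings are hypothetical; the table names and configuration keys are taken from the report.

```python
# Sketch only: helper names are hypothetical. The table names, the ANALYZE TABLE
# clauses, and the two configuration keys come from the report above.

TPCDS_TABLES = [
    "store_sales",
    "store_returns",
    "store",
    "item",
    "customer",
    "customer_address",
]


def analyze_statements(tables, all_columns=False):
    """Build one ANALYZE TABLE statement per table.

    With all_columns=False only table-level statistics are collected,
    which is the situation the report says triggers the slowdown.
    """
    suffix = " FOR ALL COLUMNS" if all_columns else ""
    return [f"ANALYZE TABLE {t} COMPUTE STATISTICS{suffix}" for t in tables]


def workaround_settings(disable_cbo=False):
    """Return the session setting that, per the report, restores normal run time.

    By default only join reordering is switched off; disable_cbo=True
    switches off cost-based optimization entirely instead.
    """
    if disable_cbo:
        return {"spark.sql.cbo.enabled": "false"}
    return {"spark.sql.cbo.joinReorder.enabled": "false"}


if __name__ == "__main__":
    for stmt in analyze_statements(TPCDS_TABLES):
        print(stmt)
    for key, value in workaround_settings().items():
        print(f"SET {key}={value}")
```

Assuming an active PySpark session named spark, each statement could be passed to spark.sql(...) and each setting applied with spark.conf.set(key, value). Whether this reproduces the slowdown depends on the data scale and cluster, which the report does not fully specify.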



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-30 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: (was: explainMode-cost.zip)







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-30 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: explainMode-cost.zip







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-30 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: (was: explainMode-cost.zip)







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-29 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: explainMode-cost.zip







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: 2.2.AfterAnalyzeTable WITHOUT 
ForAllColumns-joinreorder-enabled.txt
3.1.AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt
3.2.AfterAnalyzeTableForAllColumns-joinreorder-enabled.txt
1.1.BeforeAnalyzeTable-joinreorder-disabled.txt
1.2.BeforeAnalyzeTable-joinreorder-enabled.txt
2.1.AfterAnalyzeTable WITHOUT 
ForAllColumns-joinreorder-disabled.txt

> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer, SQL
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
> Attachments: 1.1.BeforeAnalyzeTable-joinreorder-disabled.txt, 
> 1.2.BeforeAnalyzeTable-joinreorder-enabled.txt, 2.1.AfterAnalyzeTable WITHOUT 
> ForAllColumns-joinreorder-disabled.txt, 2.2.AfterAnalyzeTable WITHOUT 
> ForAllColumns-joinreorder-enabled.txt, 
> 3.1.AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt, 
> 3.2.AfterAnalyzeTableForAllColumns-joinreorder-enabled.txt
>
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> [https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql] takes 
> between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.
> Rows Count:
> store_sales - 2879966589
> store_returns - 288009578
> store - 1002
> item - 30
> customer - 1200
> customer_address - 600






[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: (was: BeforeAnalyzeTable.txt)

> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer, SQL
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> [https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql] takes 
> between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.
> Rows Count:
> store_sales - 2879966589
> store_returns - 288009578
> store - 1002
> item - 30
> customer - 1200
> customer_address - 600






[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: (was: AfterAnalyzeTable WITHOUT ForAllColumns.txt)







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: (was: 
AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt)







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: (was: AfterAnalyzeTableForAllColumns.txt)







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-39971:
-
Component/s: SQL

> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer, SQL
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
> Attachments: AfterAnalyzeTable WITHOUT ForAllColumns.txt, 
> AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt, 
> AfterAnalyzeTableForAllColumns.txt, BeforeAnalyzeTable.txt
>
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> [https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql] takes 
> between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.
> Rows Count:
> store_sales - 2879966589
> store_returns - 288009578
> store - 1002
> item - 30
> customer - 1200
> customer_address - 600






[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: (was: 
AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt)

> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
> Attachments: AfterAnalyzeTable WITHOUT ForAllColumns.txt, 
> AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt, 
> AfterAnalyzeTableForAllColumns.txt, BeforeAnalyzeTable.txt
>
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> [https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql] takes 
> between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.
> Rows Count:
> store_sales - 2879966589
> store_returns - 288009578
> store - 1002
> item - 30
> customer - 1200
> customer_address - 600






[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt







[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Description: 
I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the 
FOR ALL COLUMNS) some queries became really slow. For example query24 - 
[https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql] takes 
between 10~15min before running the ANALYZE TABLE.

After running ANALYZE TABLE I waited 24h before cancelling the execution.

If I disable spark.sql.cbo.joinReorder.enabled or 
spark.sql.cbo.enabled it becomes fast again.
It seems something in join reordering is not working well when we have table 
stats, but not column stats.

Rows Count:
store_sales - 2879966589
store_returns - 288009578
store - 1002
item - 30
customer - 1200
customer_address - 600

  was:
I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the 
FOR ALL COLUMNS) some queries became really slow. For example query24 - 
https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql takes 
between 10~15min before running the ANALYZE TABLE.

After running ANALYZE TABLE I waited 24h before cancelling the execution.

If I disable spark.sql.cbo.joinReorder.enabled or 
spark.sql.cbo.enabled it becomes fast again.
It seems something in join reordering is not working well when we have table 
stats, but not column stats.


> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
> Attachments: AfterAnalyzeTable WITHOUT ForAllColumns.txt, 
> AfterAnalyzeTableForAllColumns.txt, BeforeAnalyzeTable.txt
>
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> [https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql] takes 
> between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.
> Rows Count:
> store_sales - 2879966589
> store_returns - 288009578
> store - 1002
> item - 30
> customer - 1200
> customer_address - 600






[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Attachment: AfterAnalyzeTable WITHOUT ForAllColumns.txt
AfterAnalyzeTableForAllColumns.txt
BeforeAnalyzeTable.txt

> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
> Attachments: AfterAnalyzeTable WITHOUT ForAllColumns.txt, 
> AfterAnalyzeTableForAllColumns.txt, BeforeAnalyzeTable.txt
>
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql takes 
> between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.






[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Description: 
I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the 
FOR ALL COLUMNS) some queries became really slow. For example query24 - 
https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql takes 
between 10~15min before running the ANALYZE TABLE.

After running ANALYZE TABLE I waited 24h before cancelling the execution.

If I disable spark.sql.cbo.joinReorder.enabled or 
spark.sql.cbo.enabled it becomes fast again.
It seems something in join reordering is not working well when we have table 
stats, but not column stats.

  was:
I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the 
FOR ALL COLUMNS) some queries became really slow. For example query24 - 
[https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql|https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql,],takes
 between 10~15min before running the ANALYZE TABLE.

After running ANALYZE TABLE I waited 24h before cancelling the execution.

If I disable spark.sql.cbo.joinReorder.enabled or 
spark.sql.cbo.enabled it becomes fast again.
It seems something in join reordering is not working well when we have table 
stats, but not column stats.


> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql takes 
> between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.






[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felipe updated SPARK-39971:
---
Description: 
I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the 
FOR ALL COLUMNS) some queries became really slow. For example query24 - 
[https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql|https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql,],takes
 between 10~15min before running the ANALYZE TABLE.

After running ANALYZE TABLE I waited 24h before cancelling the execution.

If I disable spark.sql.cbo.joinReorder.enabled or 
spark.sql.cbo.enabled it becomes fast again.
It seems something in join reordering is not working well when we have table 
stats, but not column stats.

  was:
I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the 
FOR ALL COLUMNS) some queries became really slow. For example query24 - 
[https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql,] takes 
between 10~15min before running the ANALYZE TABLE.

After running ANALYZE TABLE I waited 24h before cancelling the execution.

If I disable spark.sql.cbo.joinReorder.enabled or 
spark.sql.cbo.enabled it becomes fast again.
It seems something in join reordering is not working well when we have table 
stats, but not column stats.


> ANALYZE TABLE makes some queries run forever
> 
>
> Key: SPARK-39971
> URL: https://issues.apache.org/jira/browse/SPARK-39971
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 3.2.2
>Reporter: Felipe
>Priority: Major
>
> I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without 
> the FOR ALL COLUMNS) some queries became really slow. For example query24 - 
> [https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql|https://raw.githubusercontent.com/Agirish/tpcds/master/query24.sql,],takes
>  between 10~15min before running the ANALYZE TABLE.
> After running ANALYZE TABLE I waited 24h before cancelling the execution.
> If I disable spark.sql.cbo.joinReorder.enabled or 
> spark.sql.cbo.enabled it becomes fast again.
> It seems something in join reordering is not working well when we have table 
> stats, but not column stats.


