[GitHub] [spark] cloud-fan commented on pull request #32210: [SPARK-32634][SQL] Introduce sort-based fallback for shuffled hash join (non-code-gen path)

2021-04-26 Thread GitBox

cloud-fan commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-826239123 After more thinking, I'm wondering if this is the right direction to go. Apparently falling back to SMJ wastes the partially-built hash map. If one partition is a bit

[GitHub] [spark] cloud-fan commented on pull request #32210: [SPARK-32634][SQL] Introduce sort-based fallback for shuffled hash join (non-code-gen path)

2021-04-24 Thread GitBox

cloud-fan commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-826239123 After more thinking, I'm wondering if this is the right direction to go. Apparently falling back to SMJ wastes the partially-built hash map. If one partition is a bit

[GitHub] [spark] cloud-fan commented on pull request #32210: [SPARK-32634][SQL] Introduce sort-based fallback for shuffled hash join (non-code-gen path)

2021-04-21 Thread GitBox

cloud-fan commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-824536267 retest this please

[GitHub] [spark] cloud-fan commented on pull request #32210: [SPARK-32634][SQL] Introduce sort-based fallback for shuffled hash join (non-code-gen path)

2021-04-21 Thread GitBox

cloud-fan commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-823919561 > We enabled shuffled hash join by default with this feature. In our environment, roughly 25% of sort merge join queries are now running with shuffled hash join after

[GitHub] [spark] cloud-fan commented on pull request #32210: [SPARK-32634][SQL] Introduce sort-based fallback for shuffled hash join (non-code-gen path)

2021-04-20 Thread GitBox

cloud-fan commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-823357554 I'm a bit worried about this solution: 1. sorting the stream-side at runtime may lead to a slow query plan because the sort is not whole-stage-codegen-ed. 2. unlike SMJ,
