[jira] [Created] (SPARK-46037) When Left Join build Left, ShuffledHashJoinExec may result in incorrect results

2023-11-21 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-46037: Summary: When Left Join build Left, ShuffledHashJoinExec may result in incorrect results Key: SPARK-46037 URL: https://issues.apache.org/jira/browse/SPARK-46037

[jira] [Updated] (SPARK-43911) Use toSet to deduplicate the iterator data to prevent the creation of large Array

2023-06-02 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mcdull_zhang updated SPARK-43911: - Summary: Use toSet to deduplicate the iterator data to prevent the creation of large Array

[jira] [Created] (SPARK-43911) Directly use Set to consume iterator data to deduplicate, thereby reducing memory usage

2023-06-01 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-43911: Summary: Directly use Set to consume iterator data to deduplicate, thereby reducing memory usage Key: SPARK-43911 URL: https://issues.apache.org/jira/browse/SPARK-43911

[jira] [Created] (SPARK-41361) Invalid call toAttribute on unresolved object exception caused by WidenSetOperationTypes

2022-12-02 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-41361: Summary: Invalid call toAttribute on unresolved object exception caused by WidenSetOperationTypes Key: SPARK-41361 URL: https://issues.apache.org/jira/browse/SPARK-41361

[jira] [Created] (SPARK-41191) Cache Table is not working while nested caches exist

2022-11-17 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-41191: Summary: Cache Table is not working while nested caches exist Key: SPARK-41191 URL: https://issues.apache.org/jira/browse/SPARK-41191 Project: Spark Issue

[jira] [Created] (SPARK-40076) Support number-only column names in ORC data sources when orc impl is hive

2022-08-14 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-40076: Summary: Support number-only column names in ORC data sources when orc impl is hive Key: SPARK-40076 URL: https://issues.apache.org/jira/browse/SPARK-40076 Project:

[jira] [Created] (SPARK-39126) After eliminating join to one side, that side should take advantage of LocalShuffleRead optimization

2022-05-08 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-39126: Summary: After eliminating join to one side, that side should take advantage of LocalShuffleRead optimization Key: SPARK-39126 URL:

[jira] [Created] (SPARK-38867) Avoid OOM when bufferedPlan has a lot of duplicate keys in SortMergeJoin codegen

2022-04-11 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-38867: Summary: Avoid OOM when bufferedPlan has a lot of duplicate keys in SortMergeJoin codegen Key: SPARK-38867 URL: https://issues.apache.org/jira/browse/SPARK-38867

[jira] [Updated] (SPARK-38570) Incorrect DynamicPartitionPruning caused by Literal

2022-03-16 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mcdull_zhang updated SPARK-38570: - Description: The return value of Literal.references is an empty AttributeSet, so Literal is

[jira] [Created] (SPARK-38570) Incorrect DynamicPartitionPruning caused by Literal

2022-03-16 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-38570: Summary: Incorrect DynamicPartitionPruning caused by Literal Key: SPARK-38570 URL: https://issues.apache.org/jira/browse/SPARK-38570 Project: Spark Issue

[jira] [Updated] (SPARK-38542) UnsafeHashedRelation should serialize numKeys out

2022-03-13 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mcdull_zhang updated SPARK-38542: - Description: At present, UnsafeHashedRelation does not write out numKeys during serialization,

[jira] [Created] (SPARK-38542) UnsafeHashedRelation should serialize numKeys out

2022-03-13 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-38542: Summary: UnsafeHashedRelation should serialize numKeys out Key: SPARK-38542 URL: https://issues.apache.org/jira/browse/SPARK-38542 Project: Spark Issue

[jira] [Updated] (SPARK-37652) Support optimize skewed join through union

2021-12-15 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mcdull_zhang updated SPARK-37652: - Description: `OptimizeSkewedJoin` rule will take effect only when the plan has two

[jira] [Created] (SPARK-37652) Support optimize skewed join through union

2021-12-15 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-37652: Summary: Support optimize skewed join through union Key: SPARK-37652 URL: https://issues.apache.org/jira/browse/SPARK-37652 Project: Spark Issue Type:

[jira] [Updated] (SPARK-37301) ConcurrentModificationException caused by CollectionAccumulator serialization in the heartbeat thread

2021-11-12 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mcdull_zhang updated SPARK-37301: - Description: In our production environment, you can use the following code to reproduce the

[jira] [Created] (SPARK-37301) ConcurrentModificationException caused by CollectionAccumulator serialization in the heartbeat thread

2021-11-12 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-37301: Summary: ConcurrentModificationException caused by CollectionAccumulator serialization in the heartbeat thread Key: SPARK-37301 URL:

[jira] [Commented] (SPARK-36663) When the existing field name is a number, an error will be reported when reading the orc file

2021-09-03 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17409492#comment-17409492 ] mcdull_zhang commented on SPARK-36663: -- cc [~hyukjin.kwon] [~cloud_fan] > When the existing

[jira] [Updated] (SPARK-36663) When the existing field name is a number, an error will be reported when reading the orc file

2021-09-03 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mcdull_zhang updated SPARK-36663: - Description: You can use the following methods to reproduce the problem: {quote}val path =

[jira] [Updated] (SPARK-36663) When the existing field name is a number, an error will be reported when reading the orc file

2021-09-03 Thread mcdull_zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mcdull_zhang updated SPARK-36663: - Attachment: image-2021-09-03-20-56-28-846.png > When the existing field name is a number, an

[jira] [Created] (SPARK-36663) When the existing field name is a number, an error will be reported when reading the orc file

2021-09-03 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-36663: Summary: When the existing field name is a number, an error will be reported when reading the orc file Key: SPARK-36663 URL: https://issues.apache.org/jira/browse/SPARK-36663

[jira] [Created] (SPARK-36612) Support left outer join build left or right outer join build right in shuffled hash join

2021-08-30 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-36612: Summary: Support left outer join build left or right outer join build right in shuffled hash join Key: SPARK-36612 URL: https://issues.apache.org/jira/browse/SPARK-36612

[jira] [Created] (SPARK-36082) when the right side is small enough to use SingleColumn Null Aware Anti Join

2021-07-11 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-36082: Summary: when the right side is small enough to use SingleColumn Null Aware Anti Join Key: SPARK-36082 URL: https://issues.apache.org/jira/browse/SPARK-36082

[jira] [Created] (SPARK-31459) When using the insert overwrite directory syntax, if the target path is an existing file, the final run result is incorrect

2020-04-16 Thread mcdull_zhang (Jira)
mcdull_zhang created SPARK-31459: Summary: When using the insert overwrite directory syntax, if the target path is an existing file, the final run result is incorrect Key: SPARK-31459 URL: