[jira] [Resolved] (SPARK-36679) Remove lz4 hadoop wrapper classes after Hadoop 3.3.2

2022-03-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-36679. -- Fix Version/s: 3.3.0 Resolution: Duplicate > Remove lz4 hadoop wrapper classes after Hadoop

[jira] [Resolved] (SPARK-38179) Improve WritableColumnVector to better support null struct

2022-03-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-38179. -- Resolution: Won't Fix > Improve WritableColumnVector to better support null struct >

[jira] [Assigned] (SPARK-38237) Introduce a new config to require all cluster keys on Aggregate

2022-02-25 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-38237: Assignee: Cheng Su > Introduce a new config to require all cluster keys on Aggregate >

[jira] [Resolved] (SPARK-38237) Introduce a new config to require all cluster keys on Aggregate

2022-02-25 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-38237. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35574

[jira] [Created] (SPARK-38179) Improve WritableColumnVector to better support null struct

2022-02-10 Thread Chao Sun (Jira)
Chao Sun created SPARK-38179: Summary: Improve WritableColumnVector to better support null struct Key: SPARK-38179 URL: https://issues.apache.org/jira/browse/SPARK-38179 Project: Spark Issue

[jira] [Commented] (SPARK-38077) Spark 3.2.1 breaks binary compatibility with Spark 3.2.0

2022-01-31 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17484894#comment-17484894 ] Chao Sun commented on SPARK-38077: -- BTW [~thesamet] it seems Spark only guarantees API compatibility,

[jira] [Commented] (SPARK-38077) Spark 3.2.1 breaks binary compatibility with Spark 3.2.0

2022-01-31 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17484873#comment-17484873 ] Chao Sun commented on SPARK-38077: -- Sorry for breaking the binary compatibility. I wasn't aware that

[jira] [Commented] (SPARK-37994) Unable to build spark3.2 with -Dhadoop.version=3.1.4

2022-01-27 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483399#comment-17483399 ] Chao Sun commented on SPARK-37994: -- Glad it helped [~tanvu]! {quote} We can omit the

[jira] [Commented] (SPARK-37994) Unable to build spark3.2 with -Dhadoop.version=3.1.4

2022-01-26 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482632#comment-17482632 ] Chao Sun commented on SPARK-37994: -- [~tanvu] Hmm in that case maybe you can try: {code}

[jira] [Commented] (SPARK-37994) Unable to build spark3.2 with -Dhadoop.version=3.1.4

2022-01-24 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481327#comment-17481327 ] Chao Sun commented on SPARK-37994: -- I considered adding a new Maven profile for Hadoop versions <= 2.x

[jira] [Commented] (SPARK-37994) Unable to build spark3.2 with -Dhadoop.version=3.1.4

2022-01-24 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481326#comment-17481326 ] Chao Sun commented on SPARK-37994: -- Yes, thanks [~xkrogen] for pinging me. [~tanvu]: can you try this

[jira] [Updated] (SPARK-37957) Deterministic flag is not handled for V2 functions

2022-01-19 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-37957: - Fix Version/s: 3.2.1 > Deterministic flag is not handled for V2 functions >

[jira] [Resolved] (SPARK-37957) Deterministic flag is not handled for V2 functions

2022-01-19 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37957. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35243

[jira] [Assigned] (SPARK-37928) Add Parquet Data Page V2 bench scenario to DataSourceReadBenchmark

2022-01-19 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37928: Assignee: Yang Jie > Add Parquet Data Page V2 bench scenario to DataSourceReadBenchmark >

[jira] [Resolved] (SPARK-37928) Add Parquet Data Page V2 bench scenario to DataSourceReadBenchmark

2022-01-19 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37928. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35226

[jira] [Created] (SPARK-37957) Deterministic flag is not handled for V2 functions

2022-01-18 Thread Chao Sun (Jira)
Chao Sun created SPARK-37957: Summary: Deterministic flag is not handled for V2 functions Key: SPARK-37957 URL: https://issues.apache.org/jira/browse/SPARK-37957 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-37864) Support Parquet v2 data page RLE encoding (for Boolean Values) for the vectorized path

2022-01-13 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37864. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35163

[jira] [Assigned] (SPARK-37864) Support Parquet v2 data page RLE encoding (for Boolean Values) for the vectorized path

2022-01-13 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37864: Assignee: Yang Jie > Support Parquet v2 data page RLE encoding (for Boolean Values) for the >

[jira] [Assigned] (SPARK-36879) Support Parquet v2 data page encodings for the vectorized path

2022-01-05 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-36879: Assignee: Parth Chandra > Support Parquet v2 data page encodings for the vectorized path >

[jira] [Resolved] (SPARK-36879) Support Parquet v2 data page encodings for the vectorized path

2022-01-05 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-36879. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34471

[jira] [Updated] (SPARK-37633) Unwrap cast should skip if downcast failed with ansi enabled

2021-12-15 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-37633: - Affects Version/s: (was: 3.0.3) > Unwrap cast should skip if downcast failed with ansi enabled >

[jira] [Assigned] (SPARK-37633) Unwrap cast should skip if downcast failed with ansi enabled

2021-12-15 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37633: Assignee: Manu Zhang > Unwrap cast should skip if downcast failed with ansi enabled >

[jira] [Resolved] (SPARK-37633) Unwrap cast should skip if downcast failed with ansi enabled

2021-12-15 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37633. -- Fix Version/s: 3.3.0 3.2.1 Resolution: Fixed Issue resolved by pull request

[jira] [Updated] (SPARK-37217) The number of dynamic partitions should early check when writing to external tables

2021-12-14 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-37217: - Fix Version/s: 3.2.1 > The number of dynamic partitions should early check when writing to external >

[jira] [Updated] (SPARK-37481) Disappearance of skipped stages mislead the bug hunting

2021-12-13 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-37481: - Fix Version/s: 3.2.1 (was: 3.2.0) > Disappearance of skipped stages mislead the

[jira] [Assigned] (SPARK-37217) The number of dynamic partitions should early check when writing to external tables

2021-12-13 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37217: Assignee: dzcxzl > The number of dynamic partitions should early check when writing to external

[jira] [Resolved] (SPARK-37217) The number of dynamic partitions should early check when writing to external tables

2021-12-13 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37217. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34493

[jira] [Resolved] (SPARK-37573) IsolatedClient fallbackVersion should be build in version, not always 2.7.4

2021-12-09 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37573. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34830

[jira] [Assigned] (SPARK-37573) IsolatedClient fallbackVersion should be build in version, not always 2.7.4

2021-12-09 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37573: Assignee: angerszhu > IsolatedClient fallbackVersion should be build in version, not always

[jira] [Created] (SPARK-37600) Upgrade to Hadoop 3.3.2

2021-12-09 Thread Chao Sun (Jira)
Chao Sun created SPARK-37600: Summary: Upgrade to Hadoop 3.3.2 Key: SPARK-37600 URL: https://issues.apache.org/jira/browse/SPARK-37600 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-37561) Avoid loading all functions when obtaining hive's DelegationToken

2021-12-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37561: Assignee: dzcxzl > Avoid loading all functions when obtaining hive's DelegationToken >

[jira] [Resolved] (SPARK-37561) Avoid loading all functions when obtaining hive's DelegationToken

2021-12-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37561. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34822

[jira] [Resolved] (SPARK-37205) Support mapreduce.job.send-token-conf when starting containers in YARN

2021-12-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37205. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34635

[jira] [Assigned] (SPARK-37205) Support mapreduce.job.send-token-conf when starting containers in YARN

2021-12-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37205: Assignee: Chao Sun > Support mapreduce.job.send-token-conf when starting containers in YARN >

[jira] [Assigned] (SPARK-37445) Update hadoop-profile

2021-12-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37445: Assignee: angerszhu > Update hadoop-profile > - > > Key:

[jira] [Resolved] (SPARK-37445) Update hadoop-profile

2021-12-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37445. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34715

[jira] [Updated] (SPARK-36529) Decouple CPU with IO work in vectorized Parquet reader

2021-12-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36529: - Attachment: (was: image.png) > Decouple CPU with IO work in vectorized Parquet reader >

[jira] [Updated] (SPARK-36529) Decouple CPU with IO work in vectorized Parquet reader

2021-12-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36529: - Attachment: image.png > Decouple CPU with IO work in vectorized Parquet reader >

[jira] [Resolved] (SPARK-35867) Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-35867. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34611

[jira] [Assigned] (SPARK-35867) Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-29 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-35867: Assignee: Kazuyuki Tanimura > Enable vectorized read for

[jira] [Created] (SPARK-37378) Convert V2 Transform expressions into catalyst expressions and load their associated functions from V2 FunctionCatalog

2021-11-18 Thread Chao Sun (Jira)
Chao Sun created SPARK-37378: Summary: Convert V2 Transform expressions into catalyst expressions and load their associated functions from V2 FunctionCatalog Key: SPARK-37378 URL:

[jira] [Created] (SPARK-37377) Refactor V2 Partitioning interface and remove deprecated usage of Distribution

2021-11-18 Thread Chao Sun (Jira)
Chao Sun created SPARK-37377: Summary: Refactor V2 Partitioning interface and remove deprecated usage of Distribution Key: SPARK-37377 URL: https://issues.apache.org/jira/browse/SPARK-37377 Project:

[jira] [Created] (SPARK-37376) Introduce a new DataSource V2 interface HasPartitionKey

2021-11-18 Thread Chao Sun (Jira)
Chao Sun created SPARK-37376: Summary: Introduce a new DataSource V2 interface HasPartitionKey Key: SPARK-37376 URL: https://issues.apache.org/jira/browse/SPARK-37376 Project: Spark Issue Type:

[jira] [Updated] (SPARK-37166) SPIP: Storage Partitioned Join

2021-11-18 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-37166: - Parent: SPARK-37375 Issue Type: Sub-task (was: New Feature) > SPIP: Storage Partitioned Join >

[jira] [Created] (SPARK-37375) Umbrella: Storage Partitioned Join

2021-11-18 Thread Chao Sun (Jira)
Chao Sun created SPARK-37375: Summary: Umbrella: Storage Partitioned Join Key: SPARK-37375 URL: https://issues.apache.org/jira/browse/SPARK-37375 Project: Spark Issue Type: New Feature

[jira] [Resolved] (SPARK-37166) SPIP: Storage Partitioned Join

2021-11-18 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37166. -- Fix Version/s: 3.3.0 Assignee: Chao Sun Resolution: Fixed > SPIP: Storage Partitioned

[jira] [Updated] (SPARK-37342) Upgrade Apache Arrow to 6.0.0

2021-11-15 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-37342: - Component/s: Build (was: Spark Core) > Upgrade Apache Arrow to 6.0.0 >

[jira] [Created] (SPARK-37342) Upgrade Apache Arrow to 6.0.0

2021-11-15 Thread Chao Sun (Jira)
Chao Sun created SPARK-37342: Summary: Upgrade Apache Arrow to 6.0.0 Key: SPARK-37342 URL: https://issues.apache.org/jira/browse/SPARK-37342 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-37239) Avoid unnecessary `setReplication` in Yarn mode

2021-11-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37239. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34520

[jira] [Assigned] (SPARK-37239) Avoid unnecessary `setReplication` in Yarn mode

2021-11-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-37239: Assignee: Yang Jie > Avoid unnecessary `setReplication` in Yarn mode >

[jira] [Updated] (SPARK-35437) Use expressions to filter Hive partitions at client side

2021-11-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-35437: - Priority: Major (was: Minor) > Use expressions to filter Hive partitions at client side >

[jira] [Resolved] (SPARK-35437) Use expressions to filter Hive partitions at client side

2021-11-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-35437. -- Resolution: Fixed Issue resolved by pull request 34431 [https://github.com/apache/spark/pull/34431]

[jira] [Assigned] (SPARK-35437) Use expressions to filter Hive partitions at client side

2021-11-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-35437: Assignee: dzcxzl > Use expressions to filter Hive partitions at client side >

[jira] [Commented] (SPARK-36998) Handle concurrent eviction of same application in SHS

2021-11-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17440066#comment-17440066 ] Chao Sun commented on SPARK-36998: -- Fixed > Handle concurrent eviction of same application in SHS >

[jira] [Assigned] (SPARK-36998) Handle concurrent eviction of same application in SHS

2021-11-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned SPARK-36998: Assignee: Thejdeep Gudivada (was: Thejdeep) > Handle concurrent eviction of same application in

[jira] [Commented] (SPARK-37220) Do not split input file for Parquet reader with aggregate push down

2021-11-07 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17440042#comment-17440042 ] Chao Sun commented on SPARK-37220: -- Thanks [~hyukjin.kwon]! > Do not split input file for Parquet

[jira] [Resolved] (SPARK-37220) Do not split input file for Parquet reader with aggregate push down

2021-11-06 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37220. -- Fix Version/s: 3.3.0 Resolution: Fixed > Do not split input file for Parquet reader with

[jira] [Commented] (SPARK-37218) Parameterize `spark.sql.shuffle.partitions` in TPCDSQueryBenchmark

2021-11-05 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17439554#comment-17439554 ] Chao Sun commented on SPARK-37218: -- [~dongjoon] please assign this to yourself - somehow I can't do it.

[jira] [Resolved] (SPARK-37218) Parameterize `spark.sql.shuffle.partitions` in TPCDSQueryBenchmark

2021-11-05 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved SPARK-37218. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34496

[jira] [Updated] (SPARK-37205) Support mapreduce.job.send-token-conf when starting containers in YARN

2021-11-03 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-37205: - Description: {{mapreduce.job.send-token-conf}} is a useful feature in Hadoop (see

[jira] [Created] (SPARK-37205) Support mapreduce.job.send-token-conf when starting containers in YARN

2021-11-03 Thread Chao Sun (Jira)
Chao Sun created SPARK-37205: Summary: Support mapreduce.job.send-token-conf when starting containers in YARN Key: SPARK-37205 URL: https://issues.apache.org/jira/browse/SPARK-37205 Project: Spark

[jira] [Commented] (SPARK-37166) SPIP: Storage Partitioned Join

2021-11-01 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436963#comment-17436963 ] Chao Sun commented on SPARK-37166: -- [~xkrogen] sure just linked. > SPIP: Storage Partitioned Join >

[jira] [Created] (SPARK-37166) SPIP: Storage Partitioned Join

2021-10-29 Thread Chao Sun (Jira)
Chao Sun created SPARK-37166: Summary: SPIP: Storage Partitioned Join Key: SPARK-37166 URL: https://issues.apache.org/jira/browse/SPARK-37166 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-37113) Upgrade Parquet to 1.12.2

2021-10-25 Thread Chao Sun (Jira)
Chao Sun created SPARK-37113: Summary: Upgrade Parquet to 1.12.2 Key: SPARK-37113 URL: https://issues.apache.org/jira/browse/SPARK-37113 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-35703) Relax constraint for Spark bucket join and remove HashClusteredDistribution

2021-10-22 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-35703: - Summary: Relax constraint for Spark bucket join and remove HashClusteredDistribution (was: Remove

[jira] [Commented] (SPARK-37069) HiveClientImpl throws NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.getWithoutRegisterFns

2021-10-21 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432624#comment-17432624 ] Chao Sun commented on SPARK-37069: -- Thanks for the ping [~zhouyifan279]! yes this is a bug, and let me

[jira] [Commented] (SPARK-35640) Refactor Parquet vectorized reader to remove duplicated code paths

2021-10-13 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17428522#comment-17428522 ] Chao Sun commented on SPARK-35640: -- [~catalinii] this change seems unrelated since it's only in Spark

[jira] [Commented] (SPARK-36936) spark-hadoop-cloud broken on release and only published via 3rd party repositories

2021-10-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426255#comment-17426255 ] Chao Sun commented on SPARK-36936: -- [~colin.williams] Spark 3.2.0 is not released yet - it will be

[jira] [Commented] (SPARK-36936) spark-hadoop-cloud broken on release and only published via 3rd party repositories

2021-10-06 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425162#comment-17425162 ] Chao Sun commented on SPARK-36936: -- [~colin.williams] which version of {{spark-hadoop-cloud}} you were

[jira] [Updated] (SPARK-36891) Refactor SpecificParquetRecordReaderBase and add more coverage on vectorized Parquet decoding

2021-10-05 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36891: - Parent: SPARK-35743 Issue Type: Sub-task (was: Test) > Refactor

[jira] [Created] (SPARK-36935) Enhance ParquetSchemaConverter to capture Parquet repetition & definition level

2021-10-05 Thread Chao Sun (Jira)
Chao Sun created SPARK-36935: Summary: Enhance ParquetSchemaConverter to capture Parquet repetition & definition level Key: SPARK-36935 URL: https://issues.apache.org/jira/browse/SPARK-36935 Project:

[jira] [Created] (SPARK-36891) Add new test suite to cover Parquet decoding

2021-09-29 Thread Chao Sun (Jira)
Chao Sun created SPARK-36891: Summary: Add new test suite to cover Parquet decoding Key: SPARK-36891 URL: https://issues.apache.org/jira/browse/SPARK-36891 Project: Spark Issue Type: Test

[jira] [Created] (SPARK-36879) Support Parquet v2 data page encodings for the vectorized path

2021-09-28 Thread Chao Sun (Jira)
Chao Sun created SPARK-36879: Summary: Support Parquet v2 data page encodings for the vectorized path Key: SPARK-36879 URL: https://issues.apache.org/jira/browse/SPARK-36879 Project: Spark

[jira] [Updated] (SPARK-36873) Add provided Guava dependency for network-yarn module

2021-09-28 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36873: - Issue Type: Bug (was: Improvement) > Add provided Guava dependency for network-yarn module >

[jira] [Updated] (SPARK-36873) Add provided Guava dependency for network-yarn module

2021-09-28 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36873: - Description: In Spark 3.1 and earlier the network-yarn module implicitly relies on guava from

[jira] [Updated] (SPARK-36873) Add provided Guava dependency for network-yarn module

2021-09-28 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36873: - Description: In Spark 3.1 and earlier the network-yarn module implicitly relies on guava from

[jira] [Created] (SPARK-36873) Add provided Guava dependency for network-yarn module

2021-09-28 Thread Chao Sun (Jira)
Chao Sun created SPARK-36873: Summary: Add provided Guava dependency for network-yarn module Key: SPARK-36873 URL: https://issues.apache.org/jira/browse/SPARK-36873 Project: Spark Issue Type:

[jira] [Created] (SPARK-36863) Update dependency manifests for all released artifacts

2021-09-27 Thread Chao Sun (Jira)
Chao Sun created SPARK-36863: Summary: Update dependency manifests for all released artifacts Key: SPARK-36863 URL: https://issues.apache.org/jira/browse/SPARK-36863 Project: Spark Issue Type:

[jira] [Commented] (SPARK-36835) Spark 3.2.0 POMs are no longer "dependency reduced"

2021-09-23 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419499#comment-17419499 ] Chao Sun commented on SPARK-36835: -- Sorry for the regression [~joshrosen]. I forgot exactly why I added

[jira] [Updated] (SPARK-36828) Remove Guava from Spark binary distribution

2021-09-22 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36828: - Issue Type: Improvement (was: Bug) > Remove Guava from Spark binary distribution >

[jira] [Created] (SPARK-36828) Remove Guava from Spark binary distribution

2021-09-22 Thread Chao Sun (Jira)
Chao Sun created SPARK-36828: Summary: Remove Guava from Spark binary distribution Key: SPARK-36828 URL: https://issues.apache.org/jira/browse/SPARK-36828 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-36820) Disable LZ4 test for Hadoop 2.7 profile

2021-09-21 Thread Chao Sun (Jira)
Chao Sun created SPARK-36820: Summary: Disable LZ4 test for Hadoop 2.7 profile Key: SPARK-36820 URL: https://issues.apache.org/jira/browse/SPARK-36820 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-36820) Disable LZ4 test for Hadoop 2.7 profile

2021-09-21 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36820: - Issue Type: Test (was: Bug) > Disable LZ4 test for Hadoop 2.7 profile >

[jira] [Updated] (SPARK-36726) Upgrade Parquet to 1.12.1

2021-09-12 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36726: - Priority: Blocker (was: Major) > Upgrade Parquet to 1.12.1 > - > >

[jira] [Created] (SPARK-36726) Upgrade Parquet to 1.12.1

2021-09-12 Thread Chao Sun (Jira)
Chao Sun created SPARK-36726: Summary: Upgrade Parquet to 1.12.1 Key: SPARK-36726 URL: https://issues.apache.org/jira/browse/SPARK-36726 Project: Spark Issue Type: Bug Components: SQL

[jira] [Commented] (SPARK-35959) Add a new Maven profile "no-shaded-client" for older Hadoop 3.x versions

2021-09-09 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412897#comment-17412897 ] Chao Sun commented on SPARK-35959: -- [~hyukjin.kwon] No I don't think it qualifies as blocker anymore.

[jira] [Updated] (SPARK-35959) Add a new Maven profile "no-shaded-client" for older Hadoop 3.x versions

2021-09-09 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-35959: - Priority: Major (was: Blocker) > Add a new Maven profile "no-shaded-client" for older Hadoop 3.x

[jira] [Commented] (SPARK-36696) spark.read.parquet loads empty dataset

2021-09-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412167#comment-17412167 ] Chao Sun commented on SPARK-36696: --

[jira] [Commented] (SPARK-36696) spark.read.parquet loads empty dataset

2021-09-08 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412164#comment-17412164 ] Chao Sun commented on SPARK-36696: -- This looks like the same issue as in PARQUET-2078. The file offset

[jira] [Created] (SPARK-36695) Allow passing V2 functions to data sources via V2 filters

2021-09-08 Thread Chao Sun (Jira)
Chao Sun created SPARK-36695: Summary: Allow passing V2 functions to data sources via V2 filters Key: SPARK-36695 URL: https://issues.apache.org/jira/browse/SPARK-36695 Project: Spark Issue

[jira] [Commented] (SPARK-36676) Create shaded Hive module and upgrade to higher version of Guava

2021-09-06 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410726#comment-17410726 ] Chao Sun commented on SPARK-36676: -- Will post a PR soon > Create shaded Hive module and upgrade to

[jira] [Created] (SPARK-36676) Create shaded Hive module and upgrade to higher version of Guava

2021-09-06 Thread Chao Sun (Jira)
Chao Sun created SPARK-36676: Summary: Create shaded Hive module and upgrade to higher version of Guava Key: SPARK-36676 URL: https://issues.apache.org/jira/browse/SPARK-36676 Project: Spark

[jira] [Commented] (SPARK-34276) Check the unreleased/unresolved JIRAs/PRs of Parquet 1.11 and 1.12

2021-09-01 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17408324#comment-17408324 ] Chao Sun commented on SPARK-34276: -- I did some study on the code and it seems this will only affect

[jira] [Commented] (SPARK-34276) Check the unreleased/unresolved JIRAs/PRs of Parquet 1.11 and 1.12

2021-08-31 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407554#comment-17407554 ] Chao Sun commented on SPARK-34276: -- [~smilegator] yea seems like Spark will be affected. cc

[jira] [Updated] (SPARK-36528) Implement lazy decoding for the vectorized Parquet reader

2021-08-26 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36528: - Description: Currently Spark first decodes (e.g., RLE/bit-packed, PLAIN) into column vector and then

[jira] [Updated] (SPARK-36528) Implement lazy decoding for the vectorized Parquet reader

2021-08-16 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36528: - Description: Currently Spark first decodes (e.g., RLE/bit-packed, PLAIN) into column vector and then

[jira] [Updated] (SPARK-36527) Implement lazy materialization for the vectorized Parquet reader

2021-08-16 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-36527: - Description: At the moment the Parquet vectorized reader will eagerly decode all the columns that are

[jira] [Created] (SPARK-36529) Decouple CPU with IO work in vectorized Parquet reader

2021-08-16 Thread Chao Sun (Jira)
Chao Sun created SPARK-36529: Summary: Decouple CPU with IO work in vectorized Parquet reader Key: SPARK-36529 URL: https://issues.apache.org/jira/browse/SPARK-36529 Project: Spark Issue Type:

[jira] [Created] (SPARK-36528) Implement lazy decoding for the vectorized Parquet reader

2021-08-16 Thread Chao Sun (Jira)
Chao Sun created SPARK-36528: Summary: Implement lazy decoding for the vectorized Parquet reader Key: SPARK-36528 URL: https://issues.apache.org/jira/browse/SPARK-36528 Project: Spark Issue

[jira] [Created] (SPARK-36527) Implement lazy materialization for the vectorized Parquet reader

2021-08-16 Thread Chao Sun (Jira)
Chao Sun created SPARK-36527: Summary: Implement lazy materialization for the vectorized Parquet reader Key: SPARK-36527 URL: https://issues.apache.org/jira/browse/SPARK-36527 Project: Spark
