[jira] [Updated] (SPARK-47405) Remove `JLine 2` dependency
[ https://issues.apache.org/jira/browse/SPARK-47405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-47405:
----------------------------------
    Parent: (was: SPARK-47046)
    Issue Type: Improvement  (was: Sub-task)

> Remove `JLine 2` dependency
>
> Key: SPARK-47405
> URL: https://issues.apache.org/jira/browse/SPARK-47405
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0
> Reporter: Dongjoon Hyun
> Priority: Major
> Labels: pull-request-available

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45587) Skip UNIDOC and MIMA in build GitHub Action job
[ https://issues.apache.org/jira/browse/SPARK-45587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-45587:
----------------------------------
    Fix Version/s: 3.5.2, 3.4.3

> Skip UNIDOC and MIMA in build GitHub Action job
>
> Key: SPARK-45587
> URL: https://issues.apache.org/jira/browse/SPARK-45587
> Project: Spark
> Issue Type: Improvement
> Components: Project Infra
> Affects Versions: 4.0.0
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
[jira] [Resolved] (SPARK-47428) Upgrade Jetty to 9.4.54.v20240208
[ https://issues.apache.org/jira/browse/SPARK-47428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-47428.
-----------------------------------
    Fix Version/s: 3.5.2
    Resolution: Fixed

Issue resolved by pull request 45543
[https://github.com/apache/spark/pull/45543]

> Upgrade Jetty to 9.4.54.v20240208
>
> Key: SPARK-47428
> URL: https://issues.apache.org/jira/browse/SPARK-47428
> Project: Spark
> Issue Type: Bug
> Components: Build
> Affects Versions: 3.4.2, 3.5.1
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.2
[jira] [Resolved] (SPARK-47423) Set operations should work with collated strings
[ https://issues.apache.org/jira/browse/SPARK-47423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk resolved SPARK-47423.
------------------------------
    Fix Version/s: 4.0.0
    Resolution: Fixed

Issue resolved by pull request 45536
[https://github.com/apache/spark/pull/45536]

> Set operations should work with collated strings
>
> Key: SPARK-47423
> URL: https://issues.apache.org/jira/browse/SPARK-47423
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Aleksandar Tomic
> Assignee: Aleksandar Tomic
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Assigned] (SPARK-47423) Set operations should work with collated strings
[ https://issues.apache.org/jira/browse/SPARK-47423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk reassigned SPARK-47423:
--------------------------------
    Assignee: Aleksandar Tomic

> Set operations should work with collated strings
>
> Key: SPARK-47423
> URL: https://issues.apache.org/jira/browse/SPARK-47423
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Aleksandar Tomic
> Assignee: Aleksandar Tomic
> Priority: Major
> Labels: pull-request-available
[jira] [Created] (SPARK-47428) Upgrade Jetty to 9.4.54.v20240208
Dongjoon Hyun created SPARK-47428:
----------------------------------

Summary: Upgrade Jetty to 9.4.54.v20240208
Key: SPARK-47428
URL: https://issues.apache.org/jira/browse/SPARK-47428
Project: Spark
Issue Type: Bug
Components: Build
Affects Versions: 3.5.1, 3.4.2
Reporter: Dongjoon Hyun
[jira] [Updated] (SPARK-45540) Upgrade jetty to 9.4.53.v20231009
[ https://issues.apache.org/jira/browse/SPARK-45540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-45540:
----------------------------------
    Parent: SPARK-47046
    Issue Type: Sub-task  (was: Improvement)

> Upgrade jetty to 9.4.53.v20231009
>
> Key: SPARK-45540
> URL: https://issues.apache.org/jira/browse/SPARK-45540
> Project: Spark
> Issue Type: Sub-task
> Components: Build
> Affects Versions: 4.0.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Fixes:
> * [CVE-2023-44487|https://github.com/advisories/GHSA-qppj-fm5r-hxr3]
> * [CVE-2023-36478|https://github.com/advisories/GHSA-wgh7-54f2-x98r]
[jira] [Updated] (SPARK-47427) Support trailing commas in select list
[ https://issues.apache.org/jira/browse/SPARK-47427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47427:
-----------------------------------
    Labels: pull-request-available  (was: )

> Support trailing commas in select list
>
> Key: SPARK-47427
> URL: https://issues.apache.org/jira/browse/SPARK-47427
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Serge Rielau
> Priority: Major
> Labels: pull-request-available
>
> DuckDB has popularized allowing trailing commas in the SELECT list.
> The benefit is that it becomes easy to add, remove, or comment out
> expressions in the select list:
> {noformat}
> SELECT c1,
>        /* c2 */
> FROM T;
>
> vs
>
> SELECT c1
>        /* , c2 */
> FROM T;
> {noformat}
> Snowflake has recently adopted this usability feature as well.
[jira] [Created] (SPARK-47427) Support trailing commas in select list
Serge Rielau created SPARK-47427:
--------------------------------

Summary: Support trailing commas in select list
Key: SPARK-47427
URL: https://issues.apache.org/jira/browse/SPARK-47427
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

DuckDB has popularized allowing trailing commas in the SELECT list. The benefit is that it becomes easy to add, remove, or comment out expressions in the select list:

{noformat}
SELECT c1,
       /* c2 */
FROM T;

vs

SELECT c1
       /* , c2 */
FROM T;
{noformat}

Snowflake has recently adopted this usability feature as well.
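[Editor's note] The grammar change the issue asks for amounts to accepting an optional comma after the last select-list expression. A minimal sketch of that tolerance, as a hypothetical standalone Python helper (this is not Spark's actual parser, and `parse_select_list` is an invented name for illustration):

```python
# Hypothetical sketch of trailing-comma tolerance in a select list.
# A real SQL parser handles nested commas inside function calls; this
# naive comma-split only illustrates the "drop the empty final item" rule.

def parse_select_list(items_sql: str) -> list[str]:
    """Split a comma-separated select list, allowing one trailing comma."""
    parts = [p.strip() for p in items_sql.split(",")]
    # A trailing comma produces a single empty final part; accept and drop it.
    if parts and parts[-1] == "":
        parts.pop()
    # Any other empty part (e.g. "c1,,c2") is still a syntax error.
    if "" in parts:
        raise ValueError("empty expression in select list")
    return parts

print(parse_select_list("c1, c2,"))  # ['c1', 'c2'] -- trailing comma accepted
print(parse_select_list("c1"))       # ['c1']
```

With this rule, commenting out the final expression (`SELECT c1, /* c2 */ FROM T`) leaves a trailing comma that still parses, which is exactly the usability win described above.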
[jira] [Updated] (SPARK-47426) Upgrade Guava used by the connect module to 33.1.0-jre
[ https://issues.apache.org/jira/browse/SPARK-47426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-47426:
----------------------------------
    Parent: SPARK-47046
    Issue Type: Sub-task  (was: Improvement)

> Upgrade Guava used by the connect module to 33.1.0-jre
>
> Key: SPARK-47426
> URL: https://issues.apache.org/jira/browse/SPARK-47426
> Project: Spark
> Issue Type: Sub-task
> Components: Build
> Affects Versions: 4.0.0
> Reporter: BingKun Pan
> Priority: Minor
> Labels: pull-request-available
[jira] [Updated] (SPARK-47426) Upgrade Guava used by the connect module to 33.1.0-jre
[ https://issues.apache.org/jira/browse/SPARK-47426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47426:
-----------------------------------
    Labels: pull-request-available  (was: )

> Upgrade Guava used by the connect module to 33.1.0-jre
>
> Key: SPARK-47426
> URL: https://issues.apache.org/jira/browse/SPARK-47426
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0
> Reporter: BingKun Pan
> Priority: Minor
> Labels: pull-request-available
[jira] [Updated] (SPARK-47426) Upgrade Guava used by the connect module to 33.1.0-jre
[ https://issues.apache.org/jira/browse/SPARK-47426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BingKun Pan updated SPARK-47426:
--------------------------------
    Summary: Upgrade Guava used by the connect module to 33.1.0-jre
            (was: Upgrade Guava used by the connect module to 33.1-jre)

> Upgrade Guava used by the connect module to 33.1.0-jre
>
> Key: SPARK-47426
> URL: https://issues.apache.org/jira/browse/SPARK-47426
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 4.0.0
> Reporter: BingKun Pan
> Priority: Minor
[jira] [Updated] (SPARK-47425) spark-sql does not recognize expressions in repartition hint
[ https://issues.apache.org/jira/browse/SPARK-47425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreyas updated SPARK-47425:
----------------------------
    Description:

In Scala, it is possible to create a bucketed table without many small files:

{code:scala}
df.repartition(expr("pmod(hash(user_id), 200)"))
  .write
  .mode(SaveMode.Overwrite)
  .bucketBy(200, "user_id")
  .option("path", output_path)
  .saveAsTable("bucketed_table")
{code}

Found [this small trick|https://towardsdatascience.com/best-practices-for-bucketing-in-spark-sql-ea9f23f7dd53] to get the same number of files as buckets.

However, the equivalent does not work in spark-sql (using the repartition hint):

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+ repartition (pmod(hash(user_id),200)) */
* from df_table
{code}

{{REPARTITION Hint parameter should include columns, but 'pmod('hash('user_id), 200) found.}}

When I instead make a virtual column and use that, Spark no longer respects the repartition:

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+ repartition (bkt) */
*, pmod(hash(user_id),200) as bkt
from df_table
{code}

{code:java}
$ hdfs dfs -ls -h /user/spark/warehouse/bucket_test.db/bucketed_table | head
Found 101601 items
...
{code}

Can the behavior of the repartition hint be changed to work like the Scala/Python equivalent? Thank you.

> spark-sql does not recognize expressions in repartition hint
>
> Key: SPARK-47425
> URL: https://issues.apache.org/jira/browse/SPARK-47425
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.1
> Reporter: Shreyas
> Priority: Major
[jira] [Updated] (SPARK-47425) spark-sql does not recognize expressions in repartition hint
[ https://issues.apache.org/jira/browse/SPARK-47425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreyas updated SPARK-47425:
----------------------------
    Description:

In Scala, it is possible to create a bucketed table without many small files:

{code:scala}
df.repartition(expr("pmod(hash(user_id), 200)"))
  .write
  .mode(SaveMode.Overwrite)
  .bucketBy(200, "user_id")
  .option("path", output_path)
  .saveAsTable("bucketed_table")
{code}

Found [this small trick|https://towardsdatascience.com/best-practices-for-bucketing-in-spark-sql-ea9f23f7dd53] to get the same number of files as buckets.

However, the equivalent does not work in spark-sql (using the repartition hint):

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+ repartition (pmod(hash(user_id),200)) */
-- I have the hint setup properly, Jira is removing the space when displaying
* from df_table
{code}

{{REPARTITION Hint parameter should include columns, but 'pmod('hash('user_id), 200) found.}}

When I instead make a virtual column and use that, Spark no longer respects the repartition:

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+ repartition (bkt) */
*, pmod(hash(user_id),200) as bkt
from df_table
{code}

{code:java}
$ hdfs dfs -ls -h /user/spark/warehouse/bucket_test.db/bucketed_table | head
Found 101601 items
...
{code}

Can the behavior of the repartition hint be changed to work like the Scala/Python equivalent? Thank you.

> spark-sql does not recognize expressions in repartition hint
>
> Key: SPARK-47425
> URL: https://issues.apache.org/jira/browse/SPARK-47425
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.1
> Reporter: Shreyas
> Priority: Major
[jira] [Updated] (SPARK-47425) spark-sql does not recognize expressions in repartition hint
[ https://issues.apache.org/jira/browse/SPARK-47425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreyas updated SPARK-47425:
----------------------------
    Description:

In Scala, it is possible to create a bucketed table without many small files:

{code:scala}
df.repartition(expr("pmod(hash(user_id), 200)"))
  .write
  .mode(SaveMode.Overwrite)
  .bucketBy(200, "user_id")
  .option("path", output_path)
  .saveAsTable("bucketed_table")
{code}

Found [this small trick|https://towardsdatascience.com/best-practices-for-bucketing-in-spark-sql-ea9f23f7dd53] to get the same number of files as buckets.

However, the equivalent does not work in spark-sql (using the repartition hint):

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+ repartition (pmod(hash(user_id),200)) */
* from df_table
{code}

{{REPARTITION Hint parameter should include columns, but 'pmod('hash('user_id), 200) found.}}

When I instead make a virtual column and use that, Spark no longer respects the repartition:

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+ repartition (bkt) */
*, pmod(hash(user_id),200) as bkt
from df_table
{code}

{code:java}
$ hdfs dfs -ls -h /user/spark/warehouse/bucket_test.db/bucketed_table | head
Found 101601 items
...
{code}

Can the behavior of the repartition hint be changed to work like the Scala/Python equivalent? Thank you.

> spark-sql does not recognize expressions in repartition hint
>
> Key: SPARK-47425
> URL: https://issues.apache.org/jira/browse/SPARK-47425
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.1
> Reporter: Shreyas
> Priority: Major
[jira] [Created] (SPARK-47425) spark-sql does not recognize expressions in repartition hint
Shreyas created SPARK-47425:
----------------------------

Summary: spark-sql does not recognize expressions in repartition hint
Key: SPARK-47425
URL: https://issues.apache.org/jira/browse/SPARK-47425
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.4.1
Reporter: Shreyas

In Scala, it is possible to create a bucketed table without many small files:

{code:scala}
df.repartition(expr("pmod(hash(user_id), 200)"))
  .write
  .mode(SaveMode.Overwrite)
  .bucketBy(200, "user_id")
  .option("path", output_path)
  .saveAsTable("bucketed_table")
{code}

Found [this small trick|https://towardsdatascience.com/best-practices-for-bucketing-in-spark-sql-ea9f23f7dd53] to get the same number of files as buckets.

However, the equivalent does not work in spark-sql (using the repartition hint):

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+repartition (pmod(hash(user_id),200)) */
* from df_table
{code}

{{REPARTITION Hint parameter should include columns, but 'pmod('hash('user_id), 200) found.}}

When I instead make a virtual column and use that, Spark no longer respects the repartition:

{code:sql}
create table bucketed_table stored as parquet
clustered by (user_id) into 200 buckets
select /*+repartition (bkt) */
*, pmod(hash(user_id),200) as bkt
from df_table
{code}

{code:java}
$ hdfs dfs -ls -h /user/spark/warehouse/bucket_test.db/bucketed_table | head
Found 101601 items
...
{code}

Can the behavior of the repartition hint be changed to work like the Scala/Python equivalent? Thank you.
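[Editor's note] The reason the Scala trick yields one file per bucket is that repartitioning by `pmod(hash(user_id), 200)` routes every row of a given bucket to the same partition, so at most 200 write tasks each produce one file. A pure-Python simulation of that routing (assumption: Python's built-in `hash()` stands in for Spark's Murmur3-based `hash()`, so the bucket values differ from Spark's, but the partition-count argument is the same):

```python
# Simulate repartition-by-expression: rows sharing pmod(hash(key), N)
# land in the same partition, so at most N partitions (files) result.
from collections import defaultdict

NUM_BUCKETS = 200

def pmod(a: int, n: int) -> int:
    # Spark's pmod() always returns a non-negative result; Python's %
    # already does for a positive modulus, so this is a direct analogue.
    return a % n

partitions = defaultdict(set)
for user_id in range(10_000):
    bucket = pmod(hash(user_id), NUM_BUCKETS)
    partitions[bucket].add(user_id)

# Every row with the same bucket value shares a partition, so the
# partition (and hence file) count is capped by NUM_BUCKETS.
print(len(partitions))  # 200
```

Without the expression-based repartition (as in the failing spark-sql cases above), rows of one bucket are scattered across many tasks, and each task writes its own file per bucket, which is how the 101601-file listing arises.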
[jira] [Resolved] (SPARK-47346) Make daemon mode configurable when creating Python workers
[ https://issues.apache.org/jira/browse/SPARK-47346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takuya Ueshin resolved SPARK-47346.
-----------------------------------
    Fix Version/s: 4.0.0
    Resolution: Fixed

Issue resolved by pull request 45468
[https://github.com/apache/spark/pull/45468]

> Make daemon mode configurable when creating Python workers
>
> Key: SPARK-47346
> URL: https://issues.apache.org/jira/browse/SPARK-47346
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 4.0.0
> Reporter: Allison Wang
> Assignee: Allison Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Assigned] (SPARK-47346) Make daemon mode configurable when creating Python workers
[ https://issues.apache.org/jira/browse/SPARK-47346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takuya Ueshin reassigned SPARK-47346:
-------------------------------------
    Assignee: Allison Wang

> Make daemon mode configurable when creating Python workers
>
> Key: SPARK-47346
> URL: https://issues.apache.org/jira/browse/SPARK-47346
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 4.0.0
> Reporter: Allison Wang
> Assignee: Allison Wang
> Priority: Major
> Labels: pull-request-available
[jira] [Updated] (SPARK-47424) Add feature flag to enable spark local session time zone calendar in JDBC API calls
[ https://issues.apache.org/jira/browse/SPARK-47424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47424:
-----------------------------------
    Labels: pull-request-available  (was: )

> Add feature flag to enable spark local session time zone calendar in JDBC API calls
>
> Key: SPARK-47424
> URL: https://issues.apache.org/jira/browse/SPARK-47424
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Petar Vasiljevic
> Priority: Major
> Labels: pull-request-available
[jira] [Created] (SPARK-47424) Add feature flag to enable spark local session time zone calendar in JDBC API calls
Petar Vasiljevic created SPARK-47424:
-------------------------------------

Summary: Add feature flag to enable spark local session time zone calendar in JDBC API calls
Key: SPARK-47424
URL: https://issues.apache.org/jira/browse/SPARK-47424
Project: Spark
Issue Type: New Feature
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Petar Vasiljevic
[jira] [Assigned] (SPARK-47345) XML: Add XmlFunctionsSuite
[ https://issues.apache.org/jira/browse/SPARK-47345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk reassigned SPARK-47345:
--------------------------------
    Assignee: Yousof Hosny

> XML: Add XmlFunctionsSuite
>
> Key: SPARK-47345
> URL: https://issues.apache.org/jira/browse/SPARK-47345
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Yousof Hosny
> Assignee: Yousof Hosny
> Priority: Minor
> Labels: pull-request-available
>
> Convert JsonFunctionsSuite.scala to an XML equivalent. Note that XML doesn't
> implement all JSON functions, e.g. {{json_tuple}}, {{get_json_object}}, etc.
[jira] [Resolved] (SPARK-47345) XML: Add XmlFunctionsSuite
[ https://issues.apache.org/jira/browse/SPARK-47345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk resolved SPARK-47345.
------------------------------
    Fix Version/s: 4.0.0
    Resolution: Fixed

Issue resolved by pull request 45466
[https://github.com/apache/spark/pull/45466]

> XML: Add XmlFunctionsSuite
>
> Key: SPARK-47345
> URL: https://issues.apache.org/jira/browse/SPARK-47345
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Yousof Hosny
> Assignee: Yousof Hosny
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Convert JsonFunctionsSuite.scala to an XML equivalent. Note that XML doesn't
> implement all JSON functions, e.g. {{json_tuple}}, {{get_json_object}}, etc.
[jira] [Resolved] (SPARK-47395) Add collate and collation to non-sql APIs
[ https://issues.apache.org/jira/browse/SPARK-47395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk resolved SPARK-47395.
------------------------------
    Fix Version/s: 4.0.0
    Resolution: Fixed

Issue resolved by pull request 45517
[https://github.com/apache/spark/pull/45517]

> Add collate and collation to non-sql APIs
>
> Key: SPARK-47395
> URL: https://issues.apache.org/jira/browse/SPARK-47395
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.0
> Reporter: Stefan Kandic
> Assignee: Stefan Kandic
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
[jira] [Updated] (SPARK-47423) Set operations should work with collated strings
[ https://issues.apache.org/jira/browse/SPARK-47423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47423:
-----------------------------------
    Labels: pull-request-available  (was: )

> Set operations should work with collated strings
>
> Key: SPARK-47423
> URL: https://issues.apache.org/jira/browse/SPARK-47423
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Aleksandar Tomic
> Priority: Major
> Labels: pull-request-available
[jira] [Created] (SPARK-47423) Set operations should work with collated strings
Aleksandar Tomic created SPARK-47423: Summary: Set operations should work with collated strings Key: SPARK-47423 URL: https://issues.apache.org/jira/browse/SPARK-47423 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 4.0.0 Reporter: Aleksandar Tomic -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47419) Move `log4j2-defaults.properties` to `common/utils`
[ https://issues.apache.org/jira/browse/SPARK-47419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47419: - Assignee: BingKun Pan > Move `log4j2-defaults.properties` to `common/utils` > --- > > Key: SPARK-47419 > URL: https://issues.apache.org/jira/browse/SPARK-47419 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47419) Move `log4j2-defaults.properties` to `common/utils`
[ https://issues.apache.org/jira/browse/SPARK-47419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47419. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45532 [https://github.com/apache/spark/pull/45532] > Move `log4j2-defaults.properties` to `common/utils` > --- > > Key: SPARK-47419 > URL: https://issues.apache.org/jira/browse/SPARK-47419 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47422) Support collated strings in array operations
Nikola Mandic created SPARK-47422: - Summary: Support collated strings in array operations Key: SPARK-47422 URL: https://issues.apache.org/jira/browse/SPARK-47422 Project: Spark Issue Type: Task Components: SQL Affects Versions: 4.0.0 Reporter: Nikola Mandic Collations need to be properly supported in the following array operations, but they currently yield unexpected results: ArraysOverlap, ArrayDistinct, ArrayUnion, ArrayIntersect, ArrayExcept. Example query: {code:java} select array_contains(array('aaa' collate utf8_binary_lcase), 'AAA' collate utf8_binary_lcase){code} We would expect the result of this query to be true. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
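[Editor's note] The expected semantics in the SPARK-47422 example above can be sketched outside Spark. This is a minimal model, not Spark's actual implementation: under a lowercase collation such as utf8_binary_lcase, string equality is decided on lowercased forms, so 'AAA' should count as an element of array('aaa'); the function and parameter names below are illustrative only.

```python
def lcase_equals(a: str, b: str) -> bool:
    """Equality under a lowercase (case-insensitive) collation."""
    return a.lower() == b.lower()

def array_contains_collated(arr: list, value: str, lowercase_collation: bool) -> bool:
    """Model of array_contains: collation-aware equality when requested."""
    eq = lcase_equals if lowercase_collation else (lambda a, b: a == b)
    return any(eq(elem, value) for elem in arr)

# Matches the ticket's expectation: true under the lowercase collation,
# false under plain binary comparison.
print(array_contains_collated(['aaa'], 'AAA', lowercase_collation=True))   # True
print(array_contains_collated(['aaa'], 'AAA', lowercase_collation=False))  # False
```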
[jira] [Updated] (SPARK-47419) Move `log4j2-defaults.properties` to `common/utils`
[ https://issues.apache.org/jira/browse/SPARK-47419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BingKun Pan updated SPARK-47419: Summary: Move `log4j2-defaults.properties` to `common/utils` (was: Move `log4j2-defaults.properties` to `common\utils`) > Move `log4j2-defaults.properties` to `common/utils` > --- > > Key: SPARK-47419 > URL: https://issues.apache.org/jira/browse/SPARK-47419 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-44740) Allow configuring the session ID for a spark connect client in the remote string
[ https://issues.apache.org/jira/browse/SPARK-44740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44740: --- Labels: pull-request-available (was: ) > Allow configuring the session ID for a spark connect client in the remote > string > > > Key: SPARK-44740 > URL: https://issues.apache.org/jira/browse/SPARK-44740 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.5.0 >Reporter: Martin Grund >Assignee: Martin Grund >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47416) SoundEx, Luhncheck (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47416: - Summary: SoundEx, Luhncheck (all collations) (was: SoundEx (all collations)) > SoundEx, Luhncheck (all collations) > --- > > Key: SPARK-47416 > URL: https://issues.apache.org/jira/browse/SPARK-47416 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47420) FormatNumber, Sentences (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47420: - Summary: FormatNumber, Sentences (all collations) (was: FormatNumber (all collations)) > FormatNumber, Sentences (all collations) > > > Key: SPARK-47420 > URL: https://issues.apache.org/jira/browse/SPARK-47420 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47421) Split, SplitPart (lowercase collation)
Uroš Bojanić created SPARK-47421: Summary: Split, SplitPart (lowercase collation) Key: SPARK-47421 URL: https://issues.apache.org/jira/browse/SPARK-47421 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47420) FormatNumber (all collations)
Uroš Bojanić created SPARK-47420: Summary: FormatNumber (all collations) Key: SPARK-47420 URL: https://issues.apache.org/jira/browse/SPARK-47420 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47419) Move `log4j2-defaults.properties` to `common\utils`
[ https://issues.apache.org/jira/browse/SPARK-47419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47419: --- Labels: pull-request-available (was: ) > Move `log4j2-defaults.properties` to `common\utils` > --- > > Key: SPARK-47419 > URL: https://issues.apache.org/jira/browse/SPARK-47419 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47419) Move `log4j2-defaults.properties` to `common\utils`
BingKun Pan created SPARK-47419: --- Summary: Move `log4j2-defaults.properties` to `common\utils` Key: SPARK-47419 URL: https://issues.apache.org/jira/browse/SPARK-47419 Project: Spark Issue Type: Improvement Components: Connect Affects Versions: 4.0.0 Reporter: BingKun Pan -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47418) Decode, StringDecode, Encode, ToBinary (all collations)
Uroš Bojanić created SPARK-47418: Summary: Decode, StringDecode, Encode, ToBinary (all collations) Key: SPARK-47418 URL: https://issues.apache.org/jira/browse/SPARK-47418 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47417) Ascii, Chr, Base64, UnBase64 (all collations)
Uroš Bojanić created SPARK-47417: Summary: Ascii, Chr, Base64, UnBase64 (all collations) Key: SPARK-47417 URL: https://issues.apache.org/jira/browse/SPARK-47417 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47416) SoundEx (all collations)
Uroš Bojanić created SPARK-47416: Summary: SoundEx (all collations) Key: SPARK-47416 URL: https://issues.apache.org/jira/browse/SPARK-47416 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47415) Levenshtein (all collations)
Uroš Bojanić created SPARK-47415: Summary: Levenshtein (all collations) Key: SPARK-47415 URL: https://issues.apache.org/jira/browse/SPARK-47415 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47414) Length, BitLength, OctetLength (all collations)
Uroš Bojanić created SPARK-47414: Summary: Length, BitLength, OctetLength (all collations) Key: SPARK-47414 URL: https://issues.apache.org/jira/browse/SPARK-47414 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47411) StringInstr, SubstringIndex, StringLocate, Substring (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47411: - Summary: StringInstr, SubstringIndex, StringLocate, Substring (all collations) (was: StringInstr, SubstringIndex, StringLocate (all collations)) > StringInstr, SubstringIndex, StringLocate, Substring (all collations) > - > > Key: SPARK-47411 > URL: https://issues.apache.org/jira/browse/SPARK-47411 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47413) Substring, Right, Left (all collations)
Uroš Bojanić created SPARK-47413: Summary: Substring, Right, Left (all collations) Key: SPARK-47413 URL: https://issues.apache.org/jira/browse/SPARK-47413 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47360) Overlay, FormatString (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47360: - Summary: Overlay, FormatString (all collations) (was: Overlay (all collations)) > Overlay, FormatString (all collations) > -- > > Key: SPARK-47360 > URL: https://issues.apache.org/jira/browse/SPARK-47360 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47358) StringReplace, StringRepeat (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47358: - Summary: StringReplace, StringRepeat (all collations) (was: StringReplace (all collations)) > StringReplace, StringRepeat (all collations) > > > Key: SPARK-47358 > URL: https://issues.apache.org/jira/browse/SPARK-47358 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47357) Upper, Lower, InitCap (lowercase collation)
[ https://issues.apache.org/jira/browse/SPARK-47357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uroš Bojanić updated SPARK-47357: - Summary: Upper, Lower, InitCap (lowercase collation) (was: Upper & Lower (lowercase collation)) > Upper, Lower, InitCap (lowercase collation) > --- > > Key: SPARK-47357 > URL: https://issues.apache.org/jira/browse/SPARK-47357 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Uroš Bojanić >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47412) StringLPad, BinaryPad, StringRPad (all collations)
Uroš Bojanić created SPARK-47412: Summary: StringLPad, BinaryPad, StringRPad (all collations) Key: SPARK-47412 URL: https://issues.apache.org/jira/browse/SPARK-47412 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47411) StringInstr, SubstringIndex, StringLocate (all collations)
Uroš Bojanić created SPARK-47411: Summary: StringInstr, SubstringIndex, StringLocate (all collations) Key: SPARK-47411 URL: https://issues.apache.org/jira/browse/SPARK-47411 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47410) StringTrimLeft, StringTrimRight (all collations)
Uroš Bojanić created SPARK-47410: Summary: StringTrimLeft, StringTrimRight (all collations) Key: SPARK-47410 URL: https://issues.apache.org/jira/browse/SPARK-47410 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47409) StringTrim, StringTrimBoth (all collations)
Uroš Bojanić created SPARK-47409: Summary: StringTrim, StringTrimBoth (all collations) Key: SPARK-47409 URL: https://issues.apache.org/jira/browse/SPARK-47409 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47408) FindInSet (all collations)
Uroš Bojanić created SPARK-47408: Summary: FindInSet (all collations) Key: SPARK-47408 URL: https://issues.apache.org/jira/browse/SPARK-47408 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.0.0 Reporter: Uroš Bojanić -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47407) Support java.sql.Types.NULL
[ https://issues.apache.org/jira/browse/SPARK-47407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-47407: Assignee: Kent Yao > Support java.sql.Types.NULL > --- > > Key: SPARK-47407 > URL: https://issues.apache.org/jira/browse/SPARK-47407 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47407) Support java.sql.Types.NULL
[ https://issues.apache.org/jira/browse/SPARK-47407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-47407. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45531 [https://github.com/apache/spark/pull/45531] > Support java.sql.Types.NULL > --- > > Key: SPARK-47407 > URL: https://issues.apache.org/jira/browse/SPARK-47407 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47399) Disable generated columns on expressions with collations
[ https://issues.apache.org/jira/browse/SPARK-47399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-47399. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45520 [https://github.com/apache/spark/pull/45520] > Disable generated columns on expressions with collations > > > Key: SPARK-47399 > URL: https://issues.apache.org/jira/browse/SPARK-47399 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Stefan Kandic >Assignee: Stefan Kandic >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Changing the collation of a column, or even just changing the ICU version, > could lead to differences in the resulting expression, so it would be best > to simply disable it for now. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47399) Disable generated columns on expressions with collations
[ https://issues.apache.org/jira/browse/SPARK-47399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-47399: Assignee: Stefan Kandic > Disable generated columns on expressions with collations > > > Key: SPARK-47399 > URL: https://issues.apache.org/jira/browse/SPARK-47399 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Stefan Kandic >Assignee: Stefan Kandic >Priority: Major > Labels: pull-request-available > > Changing the collation of a column, or even just changing the ICU version, > could lead to differences in the resulting expression, so it would be best > to simply disable it for now. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
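[Editor's note] The fragility that SPARK-47399 guards against can be illustrated with a small sketch (not Spark code, and the names below are hypothetical): a generated column materializes the result of an expression, and if that expression depends on string comparison order, changing the collation changes the result, silently invalidating already-stored rows.

```python
def min_under(values, key):
    """Pick the smallest string under a given comparison order."""
    return min(values, key=key)

values = ['B', 'a']

# Binary (code-point) order vs. a lowercase collation give different answers,
# so a generated column defined as min(values) would hold stale data after a
# collation change.
binary_min = min_under(values, key=lambda s: s)         # 'B' (U+0042 < U+0061)
lcase_min = min_under(values, key=lambda s: s.lower())  # 'a' (case-insensitive)

print(binary_min)  # B
print(lcase_min)   # a
```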
[jira] [Assigned] (SPARK-47406) Handle TIMESTAMP and DATETIME in MYSQLDialect
[ https://issues.apache.org/jira/browse/SPARK-47406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao reassigned SPARK-47406: Assignee: Kent Yao > Handle TIMESTAMP and DATETIME in MYSQLDialect > -- > > Key: SPARK-47406 > URL: https://issues.apache.org/jira/browse/SPARK-47406 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-47406) Handle TIMESTAMP and DATETIME in MYSQLDialect
[ https://issues.apache.org/jira/browse/SPARK-47406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827447#comment-17827447 ] Kent Yao commented on SPARK-47406: -- resolved by https://github.com/apache/spark/pull/45530 > Handle TIMESTAMP and DATETIME in MYSQLDialect > -- > > Key: SPARK-47406 > URL: https://issues.apache.org/jira/browse/SPARK-47406 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47406) Handle TIMESTAMP and DATETIME in MYSQLDialect
[ https://issues.apache.org/jira/browse/SPARK-47406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-47406. -- Fix Version/s: 4.0.0 Resolution: Fixed > Handle TIMESTAMP and DATETIME in MYSQLDialect > -- > > Key: SPARK-47406 > URL: https://issues.apache.org/jira/browse/SPARK-47406 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47407) Support java.sql.Types.NULL
[ https://issues.apache.org/jira/browse/SPARK-47407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47407: --- Labels: pull-request-available (was: ) > Support java.sql.Types.NULL > --- > > Key: SPARK-47407 > URL: https://issues.apache.org/jira/browse/SPARK-47407 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Kent Yao >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-44893) ThreadInfo improvements for monitoring APIs
[ https://issues.apache.org/jira/browse/SPARK-44893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827435#comment-17827435 ] Kent Yao commented on SPARK-44893: -- Updated. Thank you [~dongjoon] > ThreadInfo improvements for monitoring APIs > --- > > Key: SPARK-44893 > URL: https://issues.apache.org/jira/browse/SPARK-44893 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: releasenotes > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-44893) ThreadInfo improvements for monitoring APIs
[ https://issues.apache.org/jira/browse/SPARK-44893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-44893. -- Fix Version/s: 4.0.0 Resolution: Fixed > ThreadInfo improvements for monitoring APIs > --- > > Key: SPARK-44893 > URL: https://issues.apache.org/jira/browse/SPARK-44893 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: releasenotes > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-44896) Consider adding information os_prio, cpu, elapsed, tid, nid, etc., from the jstack tool
[ https://issues.apache.org/jira/browse/SPARK-44896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-44896: - Parent: (was: SPARK-44893) Issue Type: Improvement (was: Sub-task) > Consider adding information os_prio, cpu, elapsed, tid, nid, etc., from the > jstack tool > > > Key: SPARK-44896 > URL: https://issues.apache.org/jira/browse/SPARK-44896 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 4.0.0 >Reporter: Kent Yao >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-45877) ExecutorFailureTracker support for standalone mode
[ https://issues.apache.org/jira/browse/SPARK-45877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827434#comment-17827434 ] Kent Yao commented on SPARK-45877: -- Hi [~dongjoon], LGTM. I have separated this from SPARK-45869 > ExecutorFailureTracker support for standalone mode > -- > > Key: SPARK-45877 > URL: https://issues.apache.org/jira/browse/SPARK-45877 > Project: Spark > Issue Type: New Feature > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > > ExecutorFailureTracker now works for K8s and YARN; I guess it is also an > important feature for standalone mode to have -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45877) ExecutorFailureTracker support for standalone mode
[ https://issues.apache.org/jira/browse/SPARK-45877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao updated SPARK-45877: - Parent: (was: SPARK-45869) Issue Type: New Feature (was: Sub-task) > ExecutorFailureTracker support for standalone mode > -- > > Key: SPARK-45877 > URL: https://issues.apache.org/jira/browse/SPARK-45877 > Project: Spark > Issue Type: New Feature > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > > ExecutorFailureTracker now works for K8s and YARN; I guess it is also an > important feature for standalone mode to have -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org