[jira] [Updated] (SPARK-46627) Streaming UI hover-over shows incorrect value
[ https://issues.apache.org/jira/browse/SPARK-46627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-46627:
-----------------------------------
Issue Type: Bug  (was: Task)

> Streaming UI hover-over shows incorrect value
> ---------------------------------------------
>
> Key: SPARK-46627
> URL: https://issues.apache.org/jira/browse/SPARK-46627
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming, UI, Web UI
> Affects Versions: 4.0.0
> Reporter: Wei Liu
> Assignee: Kent Yao
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Attachments: Screenshot 2024-01-08 at 1.55.57 PM.png, Screenshot 2024-01-08 at 15.06.24.png
>
> Running a simple streaming query:
>
> val df = spark.readStream.format("rate").option("rowsPerSecond", "5000").load()
> val q = df.writeStream.format("noop").start()
>
> the hover-over value in the streaming UI is incorrect (it shows "321.00 at undefined").
>
> !Screenshot 2024-01-08 at 1.55.57 PM.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46627) Streaming UI hover-over shows incorrect value
[ https://issues.apache.org/jira/browse/SPARK-46627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-46627.
------------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/44633

> Streaming UI hover-over shows incorrect value
> ---------------------------------------------
>
> Key: SPARK-46627
> URL: https://issues.apache.org/jira/browse/SPARK-46627
> Project: Spark
> Issue Type: Task
> Components: Structured Streaming, UI, Web UI
> Affects Versions: 4.0.0
> Reporter: Wei Liu
> Assignee: Kent Yao
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Attachments: Screenshot 2024-01-08 at 1.55.57 PM.png, Screenshot 2024-01-08 at 15.06.24.png
>
> Running a simple streaming query:
>
> val df = spark.readStream.format("rate").option("rowsPerSecond", "5000").load()
> val q = df.writeStream.format("noop").start()
>
> the hover-over value in the streaming UI is incorrect (it shows "321.00 at undefined").
>
> !Screenshot 2024-01-08 at 1.55.57 PM.png!
[jira] [Assigned] (SPARK-46627) Streaming UI hover-over shows incorrect value
[ https://issues.apache.org/jira/browse/SPARK-46627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta reassigned SPARK-46627:
--------------------------------------
Assignee: Kent Yao

> Streaming UI hover-over shows incorrect value
> ---------------------------------------------
>
> Key: SPARK-46627
> URL: https://issues.apache.org/jira/browse/SPARK-46627
> Project: Spark
> Issue Type: Task
> Components: Structured Streaming, UI, Web UI
> Affects Versions: 4.0.0
> Reporter: Wei Liu
> Assignee: Kent Yao
> Priority: Major
> Labels: pull-request-available
> Attachments: Screenshot 2024-01-08 at 1.55.57 PM.png, Screenshot 2024-01-08 at 15.06.24.png
>
> Running a simple streaming query:
>
> val df = spark.readStream.format("rate").option("rowsPerSecond", "5000").load()
> val q = df.writeStream.format("noop").start()
>
> the hover-over value in the streaming UI is incorrect (it shows "321.00 at undefined").
>
> !Screenshot 2024-01-08 at 1.55.57 PM.png!
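The "321.00 at undefined" symptom is characteristic of a chart tooltip formatter looking up a timestamp label that is missing from its index, so the lookup's missing value is rendered literally. Below is a minimal Python sketch of that failure mode; the function name and data are hypothetical illustrations, not the actual Spark streaming UI JavaScript:

```python
def format_tooltip(value, batch_time, time_labels):
    # time_labels maps a batch timestamp to its display label. When the
    # lookup misses, the JavaScript-style "undefined" leaks into the text,
    # producing tooltips like "321.00 at undefined".
    label = time_labels.get(batch_time, "undefined")
    return f"{value:.2f} at {label}"

labels = {1704732000: "13:20:00"}
print(format_tooltip(321.0, 1704732000, labels))  # 321.00 at 13:20:00
print(format_tooltip(321.0, 1704732060, labels))  # 321.00 at undefined
```

The fix for such a bug is typically to key the tooltip on the same timestamp series the chart was built from, so the lookup cannot miss.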
[jira] [Resolved] (SPARK-44490) Remove TaskPagedTable in StagePage
[ https://issues.apache.org/jira/browse/SPARK-44490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-44490.
------------------------------------
Fix Version/s: 4.0.0
Assignee: dzcxzl
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/42085

> Remove TaskPagedTable in StagePage
> ----------------------------------
>
> Key: SPARK-44490
> URL: https://issues.apache.org/jira/browse/SPARK-44490
> Project: Spark
> Issue Type: Improvement
> Components: Web UI
> Affects Versions: 3.4.1
> Reporter: dzcxzl
> Assignee: dzcxzl
> Priority: Minor
> Fix For: 4.0.0
>
> In [SPARK-21809|https://issues.apache.org/jira/browse/SPARK-21809], we introduced stagespage-template.html to show the running status of a stage. TaskPagedTable is no longer effective, but there are still many PRs updating the related code.
[jira] [Resolved] (SPARK-44279) Upgrade optionator to ^0.9.3
[ https://issues.apache.org/jira/browse/SPARK-44279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-44279.
------------------------------------
Target Version/s: 3.5.0
Assignee: Bjørn Jørgensen
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/41955

> Upgrade optionator to ^0.9.3
> ----------------------------
>
> Key: SPARK-44279
> URL: https://issues.apache.org/jira/browse/SPARK-44279
> Project: Spark
> Issue Type: Dependency upgrade
> Components: Build
> Affects Versions: 3.4.1, 3.5.0
> Reporter: Bjørn Jørgensen
> Assignee: Bjørn Jørgensen
> Priority: Minor
>
> [Regular Expression Denial of Service (ReDoS) - CVE-2023-26115|https://github.com/jonschlinkert/word-wrap/issues/32]
[jira] [Updated] (SPARK-44279) Upgrade optionator to ^0.9.3
[ https://issues.apache.org/jira/browse/SPARK-44279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-44279:
-----------------------------------
Priority: Minor  (was: Major)

> Upgrade optionator to ^0.9.3
> ----------------------------
>
> Key: SPARK-44279
> URL: https://issues.apache.org/jira/browse/SPARK-44279
> Project: Spark
> Issue Type: Dependency upgrade
> Components: Build
> Affects Versions: 3.4.1, 3.5.0
> Reporter: Bjørn Jørgensen
> Priority: Minor
>
> [Regular Expression Denial of Service (ReDoS) - CVE-2023-26115|https://github.com/jonschlinkert/word-wrap/issues/32]
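For readers unfamiliar with npm version ranges: for a `0.x` base version, a caret range only admits patch-level updates, so `^0.9.3` means ">= 0.9.3 and < 0.10.0". A rough Python sketch of that one rule follows; it is illustrative only and not npm's full semver implementation (which also handles prereleases, `1.x` bases, and more):

```python
def satisfies_caret_0x(version: str, base: str) -> bool:
    # npm caret rule for 0.minor.patch bases:
    # ^base means version >= base and version < 0.(minor+1).0
    v = tuple(int(p) for p in version.split("."))
    b = tuple(int(p) for p in base.split("."))
    assert b[0] == 0, "this sketch only covers 0.x bases"
    return v >= b and v[:2] == b[:2]

print(satisfies_caret_0x("0.9.4", "0.9.3"))   # True: patch bump allowed
print(satisfies_caret_0x("0.10.0", "0.9.3"))  # False: minor bump excluded
```

This is why bumping the declared range to `^0.9.3` is enough to keep npm from resolving an older, vulnerable patch release.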
[jira] [Resolved] (SPARK-41634) Upgrade minimatch to 3.1.2
[ https://issues.apache.org/jira/browse/SPARK-41634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-41634.
------------------------------------
Fix Version/s: 3.4.0
Assignee: Bjørn Jørgensen
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/39143

> Upgrade minimatch to 3.1.2
> --------------------------
>
> Key: SPARK-41634
> URL: https://issues.apache.org/jira/browse/SPARK-41634
> Project: Spark
> Issue Type: Dependency upgrade
> Components: Build
> Affects Versions: 3.4.0
> Reporter: Bjørn Jørgensen
> Assignee: Bjørn Jørgensen
> Priority: Minor
> Fix For: 3.4.0
>
> [CVE-2022-3517|https://nvd.nist.gov/vuln/detail/CVE-2022-3517]
[jira] [Resolved] (SPARK-41587) Upgrade org.scalatestplus:selenium-4-4 to org.scalatestplus:selenium-4-7
[ https://issues.apache.org/jira/browse/SPARK-41587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-41587.
------------------------------------
Fix Version/s: 3.4.0
Assignee: Yang Jie
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/39129

> Upgrade org.scalatestplus:selenium-4-4 to org.scalatestplus:selenium-4-7
> ------------------------------------------------------------------------
>
> Key: SPARK-41587
> URL: https://issues.apache.org/jira/browse/SPARK-41587
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Major
> Fix For: 3.4.0
>
> https://github.com/scalatest/scalatestplus-selenium/releases/tag/release-3.2.14.0-for-selenium-4.7
[jira] [Resolved] (SPARK-40397) Migrate selenium-java from 3.1 to 4.2 and upgrade org.scalatestplus:selenium to 3.2.13.0
[ https://issues.apache.org/jira/browse/SPARK-40397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-40397.
------------------------------------
Fix Version/s: 3.4.0
Assignee: Yang Jie
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/37868

> Migrate selenium-java from 3.1 to 4.2 and upgrade org.scalatestplus:selenium to 3.2.13.0
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-40397
> URL: https://issues.apache.org/jira/browse/SPARK-40397
> Project: Spark
> Issue Type: Improvement
> Components: Build, Tests
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Minor
> Fix For: 3.4.0
[jira] [Resolved] (SPARK-38303) Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev
[ https://issues.apache.org/jira/browse/SPARK-38303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-38303.
------------------------------------
Fix Version/s: 3.3.0, 3.2.2
Assignee: Bjørn Jørgensen
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35628

> Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev
> ----------------------------------------------
>
> Key: SPARK-38303
> URL: https://issues.apache.org/jira/browse/SPARK-38303
> Project: Spark
> Issue Type: Bug
> Components: Build
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Bjørn Jørgensen
> Assignee: Bjørn Jørgensen
> Priority: Major
> Fix For: 3.3.0, 3.2.2
>
> [CVE-2021-3807|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3807]
> [release notes on GitHub|https://github.com/chalk/ansi-regex/releases]
[jira] [Updated] (SPARK-38303) Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev
[ https://issues.apache.org/jira/browse/SPARK-38303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-38303:
-----------------------------------
Affects Version/s: 3.2.1

> Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev
> ----------------------------------------------
>
> Key: SPARK-38303
> URL: https://issues.apache.org/jira/browse/SPARK-38303
> Project: Spark
> Issue Type: Bug
> Components: Build
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Bjørn Jørgensen
> Priority: Major
>
> [CVE-2021-3807|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3807]
> [release notes on GitHub|https://github.com/chalk/ansi-regex/releases]
[jira] [Resolved] (SPARK-38278) Add SparkContext.addArchive in PySpark
[ https://issues.apache.org/jira/browse/SPARK-38278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-38278.
------------------------------------
Assignee: Hyukjin Kwon
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35603

> Add SparkContext.addArchive in PySpark
> --------------------------------------
>
> Key: SPARK-38278
> URL: https://issues.apache.org/jira/browse/SPARK-38278
> Project: Spark
> Issue Type: New Feature
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Major
>
> SPARK-33530 added the {{SparkContext.addArchive}} API. We should have one in PySpark too.
[jira] [Updated] (SPARK-38278) Add SparkContext.addArchive in PySpark
[ https://issues.apache.org/jira/browse/SPARK-38278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-38278:
-----------------------------------
Fix Version/s: 3.3.0

> Add SparkContext.addArchive in PySpark
> --------------------------------------
>
> Key: SPARK-38278
> URL: https://issues.apache.org/jira/browse/SPARK-38278
> Project: Spark
> Issue Type: New Feature
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Priority: Major
> Fix For: 3.3.0
>
> SPARK-33530 added the {{SparkContext.addArchive}} API. We should have one in PySpark too.
[jira] [Updated] (SPARK-36808) Upgrade Kafka to 2.8.1
[ https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-36808:
-----------------------------------
Fix Version/s: 3.2.2

> Upgrade Kafka to 2.8.1
> ----------------------
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major
> Fix For: 3.3.0, 3.2.2
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html
[jira] [Updated] (SPARK-36808) Upgrade Kafka to 2.8.1
[ https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-36808:
-----------------------------------
Affects Version/s: 3.2.1

> Upgrade Kafka to 2.8.1
> ----------------------
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.2.1, 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major
> Fix For: 3.3.0
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html
[jira] [Commented] (SPARK-36808) Upgrade Kafka to 2.8.1
[ https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17492452#comment-17492452 ]

Kousuke Saruta commented on SPARK-36808:
----------------------------------------
Ah, O.K. I misunderstood. I'll withdraw the PRs.

> Upgrade Kafka to 2.8.1
> ----------------------
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major
> Fix For: 3.3.0
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html
[jira] [Commented] (SPARK-36808) Upgrade Kafka to 2.8.1
[ https://issues.apache.org/jira/browse/SPARK-36808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17492408#comment-17492408 ]

Kousuke Saruta commented on SPARK-36808:
----------------------------------------
[~dongjoon] Sure, I'll do it.

> Upgrade Kafka to 2.8.1
> ----------------------
>
> Key: SPARK-36808
> URL: https://issues.apache.org/jira/browse/SPARK-36808
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major
> Fix For: 3.3.0
>
> A few hours ago, Kafka 2.8.1 was released, which includes a bunch of bug fixes.
> https://downloads.apache.org/kafka/2.8.1/RELEASE_NOTES.html
[jira] [Created] (SPARK-38149) Upgrade joda-time to 2.10.13
Kousuke Saruta created SPARK-38149:
-----------------------------------
Summary: Upgrade joda-time to 2.10.13
Key: SPARK-38149
URL: https://issues.apache.org/jira/browse/SPARK-38149
Project: Spark
Issue Type: Improvement
Components: Build
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta

joda-time 2.10.13 was released, which supports the latest TZ database, 2021e.
[jira] [Commented] (SPARK-37934) Upgrade Jetty version to 9.4.44
[ https://issues.apache.org/jira/browse/SPARK-37934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489017#comment-17489017 ]

Kousuke Saruta commented on SPARK-37934:
----------------------------------------
Issue resolved in https://github.com/apache/spark/pull/35442 for branch-3.2.

> Upgrade Jetty version to 9.4.44
> -------------------------------
>
> Key: SPARK-37934
> URL: https://issues.apache.org/jira/browse/SPARK-37934
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Sajith A
> Assignee: Sajith A
> Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
> Upgrade the Jetty version to 9.4.44.v20210927 in the current Spark master to bring in the fix for the [jetty#6973|https://github.com/eclipse/jetty.project/issues/6973] issue.
[jira] [Updated] (SPARK-37934) Upgrade Jetty version to 9.4.44
[ https://issues.apache.org/jira/browse/SPARK-37934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-37934:
-----------------------------------
Fix Version/s: 3.2.2

> Upgrade Jetty version to 9.4.44
> -------------------------------
>
> Key: SPARK-37934
> URL: https://issues.apache.org/jira/browse/SPARK-37934
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.2.0, 3.3.0
> Reporter: Sajith A
> Assignee: Sajith A
> Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
> Upgrade the Jetty version to 9.4.44.v20210927 in the current Spark master to bring in the fix for the [jetty#6973|https://github.com/eclipse/jetty.project/issues/6973] issue.
[jira] [Updated] (SPARK-38087) select doesn't validate if the column already exists
[ https://issues.apache.org/jira/browse/SPARK-38087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-38087:
-----------------------------------
Component/s: SQL  (was: Spark Core)

> select doesn't validate if the column already exists
> ----------------------------------------------------
>
> Key: SPARK-38087
> URL: https://issues.apache.org/jira/browse/SPARK-38087
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.1
> Environment: Version: v3.2.1, Master: local[*] (reproducible in any environment)
> Reporter: Deepa Vasanthkumar
> Priority: Minor
> Attachments: select vs drop.png
>
> select doesn't validate whether the alias column is already present in the DataFrame. After that, we cannot do anything with that column in the DataFrame:
>
> df4 = df2.select(df2.firstname, df2.lastname)  --> throws AnalysisException
> df4.show()
>
> However, drop will not let you drop the said column either.
>
> Scenario to reproduce:
> df2 = df1.select("*", (df1.firstname).alias("firstname"))  --> this will add the same column again
> df2.show()
> df2.drop(df2.firstname)  --> this will give AnalysisException: Reference 'firstname' is ambiguous, could be: firstname, firstname.
>
> Is this expected behavior?
>
> !select vs drop.png!
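The asymmetry reported above comes from select not checking for name collisions while later by-name resolution does: once two columns share a name, any subsequent reference to that name matches twice and fails. A plain-Python sketch of that resolution rule follows; `resolve_column` is a hypothetical helper for illustration, not Spark's actual analyzer:

```python
def resolve_column(schema, name):
    # Mimic by-name resolution over a flat schema: zero matches means the
    # column is unresolved, more than one match is ambiguous (analogous to
    # Spark's AnalysisException), exactly one match resolves cleanly.
    matches = [i for i, col in enumerate(schema) if col == name]
    if not matches:
        raise ValueError(f"cannot resolve '{name}'")
    if len(matches) > 1:
        raise ValueError(f"Reference '{name}' is ambiguous")
    return matches[0]

# select("*", alias("firstname")) produced a schema with a duplicate name:
schema = ["firstname", "lastname", "firstname"]
print(resolve_column(schema, "lastname"))   # resolves to index 1
try:
    resolve_column(schema, "firstname")
except ValueError as e:
    print(e)                                # ambiguous reference
```

The duplicate is only a problem once something tries to resolve it by name, which is why the select that created it succeeds while the later select and drop both fail.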
[jira] [Resolved] (SPARK-38021) Upgrade dropwizard metrics from 4.2.2 to 4.2.7
[ https://issues.apache.org/jira/browse/SPARK-38021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-38021.
------------------------------------
Fix Version/s: 3.3.0
Assignee: Yang Jie
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35317.

> Upgrade dropwizard metrics from 4.2.2 to 4.2.7
> ----------------------------------------------
>
> Key: SPARK-38021
> URL: https://issues.apache.org/jira/browse/SPARK-38021
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.3.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Minor
> Fix For: 3.3.0
>
> dropwizard metrics has released 5 versions after 4.2.2:
> * https://github.com/dropwizard/metrics/releases/tag/v4.2.3
> * https://github.com/dropwizard/metrics/releases/tag/v4.2.4
> * https://github.com/dropwizard/metrics/releases/tag/v4.2.5
> * https://github.com/dropwizard/metrics/releases/tag/v4.2.6
> * https://github.com/dropwizard/metrics/releases/tag/v4.2.7
>
> As of version 4.2.5, codahale metrics supports building with JDK 17 (https://github.com/dropwizard/metrics/pull/2180).
[jira] [Resolved] (SPARK-38017) Fix the API doc for window to say it supports TimestampNTZType too as timeColumn
[ https://issues.apache.org/jira/browse/SPARK-38017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-38017.
------------------------------------
Fix Version/s: 3.3.0, 3.2.2
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35313.

> Fix the API doc for window to say it supports TimestampNTZType too as timeColumn
> --------------------------------------------------------------------------------
>
> Key: SPARK-38017
> URL: https://issues.apache.org/jira/browse/SPARK-38017
> Project: Spark
> Issue Type: Bug
> Components: Documentation, SQL
> Affects Versions: 3.2.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Minor
> Fix For: 3.3.0, 3.2.2
>
> The window function supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType.
> This issue is similar to SPARK-38016, but this one affects 3.2.0 too, so I separated the tickets.
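For readers unfamiliar with the type being documented here: TimestampType represents a time zone-adjusted instant, while TimestampNTZType ("no time zone") is a wall-clock value with no zone attached. Python's aware/naive datetime distinction is a reasonable mental model for the difference; the sketch below is an analogy only, not Spark code:

```python
from datetime import datetime, timezone

# An aware datetime carries a zone, like Spark's TimestampType.
aware = datetime(2022, 1, 26, 12, 0, tzinfo=timezone.utc)
# A naive datetime is a bare wall-clock value, like TimestampNTZType.
naive = datetime(2022, 1, 26, 12, 0)

print(aware.tzinfo)  # UTC
print(naive.tzinfo)  # None
```

Since both kinds of value are valid inputs for windowing by time, the doc fix is to state that `timeColumn` accepts either type.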
[jira] [Resolved] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn
[ https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta resolved SPARK-38016.
------------------------------------
Fix Version/s: 3.3.0
Resolution: Fixed

Issue resolved in https://github.com/apache/spark/pull/35312.

> Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
> Issue Type: Bug
> Components: Documentation, SQL
> Affects Versions: 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Minor
> Fix For: 3.3.0
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType.
[jira] [Updated] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn
[ https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-38016:
-----------------------------------
Description: As of Spark 3.3.0, session_window supports not only TimestampType but also TimestampNTZType but the API docs doesn't mention TimestampNTZType.  (was: As of Spark 3.3.0, session_window supports not only TimestampType but also TimestampNTZType but the API docs mention TimestampNTZType.)

> Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
> Issue Type: Bug
> Components: Documentation, SQL
> Affects Versions: 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Minor
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType.
[jira] [Updated] (SPARK-38017) Fix the API doc for window to say it supports TimestampNTZType too as timeColumn
[ https://issues.apache.org/jira/browse/SPARK-38017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-38017:
-----------------------------------
Description:
window function supports not only TimestampType but also TimestampNTZType but the API docs doesn't mention TimestampNTZType. This issue is similar to SPARK-38016, but this issue affects 3.2.0 too, so I separate the tickets.
(was: window function supports not only TimestampType but also TimestampNTZType but the API docs mention TimestampNTZType. This issue is similar to SPARK-38016, but this issue affects 3.2.0 too, so I separate the tickets.)

> Fix the API doc for window to say it supports TimestampNTZType too as timeColumn
> --------------------------------------------------------------------------------
>
> Key: SPARK-38017
> URL: https://issues.apache.org/jira/browse/SPARK-38017
> Project: Spark
> Issue Type: Bug
> Components: Documentation, SQL
> Affects Versions: 3.2.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Minor
>
> The window function supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType.
> This issue is similar to SPARK-38016, but this one affects 3.2.0 too, so I separated the tickets.
[jira] [Created] (SPARK-38017) Fix the API doc for window to say it supports TimestampNTZType too as timeColumn
Kousuke Saruta created SPARK-38017:
-----------------------------------
Summary: Fix the API doc for window to say it supports TimestampNTZType too as timeColumn
Key: SPARK-38017
URL: https://issues.apache.org/jira/browse/SPARK-38017
Project: Spark
Issue Type: Bug
Components: Documentation, SQL
Affects Versions: 3.2.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta

The window function supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType. This issue is similar to SPARK-38016, but this one affects 3.2.0 too, so I separated the tickets.
[jira] [Updated] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn
[ https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-38016:
-----------------------------------
Summary: Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn  (was: Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn.)

> Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
> Issue Type: Bug
> Components: Documentation, SQL
> Affects Versions: 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Minor
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType.
[jira] [Updated] (SPARK-38016) Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn.
[ https://issues.apache.org/jira/browse/SPARK-38016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-38016:
-----------------------------------
Summary: Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn.  (was: Fix the API doc for window and session_window to say it supports TimestampNTZType too as timeColumn.)

> Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn.
> -----------------------------------------------------------------------------------------
>
> Key: SPARK-38016
> URL: https://issues.apache.org/jira/browse/SPARK-38016
> Project: Spark
> Issue Type: Bug
> Components: Documentation, SQL
> Affects Versions: 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Minor
>
> As of Spark 3.3.0, session_window supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType.
[jira] [Created] (SPARK-38016) Fix the API doc for window and session_window to say it supports TimestampNTZType too as timeColumn.
Kousuke Saruta created SPARK-38016:
-----------------------------------
Summary: Fix the API doc for window and session_window to say it supports TimestampNTZType too as timeColumn.
Key: SPARK-38016
URL: https://issues.apache.org/jira/browse/SPARK-38016
Project: Spark
Issue Type: Bug
Components: Documentation, SQL
Affects Versions: 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta

As of Spark 3.3.0, session_window supports not only TimestampType but also TimestampNTZType, but the API docs don't mention TimestampNTZType.
[jira] [Commented] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472487#comment-17472487 ]

Kousuke Saruta commented on SPARK-37860:
----------------------------------------
Note: if the vote on Spark 3.2.1 RC1 passes, replace the fix version 3.2.1 with 3.2.2.

> [BUG] Revert: Fix taskid in the stage page task event timeline
> --------------------------------------------------------------
>
> Key: SPARK-37860
> URL: https://issues.apache.org/jira/browse/SPARK-37860
> Project: Spark
> Issue Type: Bug
> Components: Web UI
> Affects Versions: 3.2.1
> Reporter: Jackey Lee
> Assignee: Jackey Lee
> Priority: Major
> Fix For: 3.1.3, 3.0.4, 3.2.1, 3.3.0
>
> In [#32888|https://github.com/apache/spark/pull/32888], [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to distinguish tasks within a stage, not {{taskId.attempt}}.
> Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, and we should revert it.
[jira] [Resolved] (SPARK-37860) [BUG] Revert: Fix taskid in the stage page task event timeline
[ https://issues.apache.org/jira/browse/SPARK-37860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37860. Fix Version/s: 3.1.3 3.0.4 3.2.1 3.3.0 Assignee: Jackey Lee Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/35160 > [BUG] Revert: Fix taskid in the stage page task event timeline > -- > > Key: SPARK-37860 > URL: https://issues.apache.org/jira/browse/SPARK-37860 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.2.1 >Reporter: Jackey Lee >Assignee: Jackey Lee >Priority: Major > Fix For: 3.1.3, 3.0.4, 3.2.1, 3.3.0 > > > In [#32888|https://github.com/apache/spark/pull/32888], > [@shahidki31|https://github.com/shahidki31] changed taskInfo.index to > taskInfo.taskId. However, we generally use {{index.attempt}} or {{taskId}} to > distinguish tasks within a stage, not {{{}taskId.attempt{}}}. > Thus [#32888|https://github.com/apache/spark/pull/32888] was a wrong fix, and > we should revert it. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
[ https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17472434#comment-17472434 ] Kousuke Saruta commented on SPARK-37159: All right. Thank you [~dongjoon]! > Change HiveExternalCatalogVersionsSuite to be able to test with Java 17 > --- > > Key: SPARK-37159 > URL: https://issues.apache.org/jira/browse/SPARK-37159 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > Fix For: 3.3.0 > > > SPARK-37105 seems to have fixed most of tests in `sql/hive` for Java 17 but > `HiveExternalCatalogVersionsSuite`. > {code} > [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED > *** (42 seconds, 526 milliseconds) > [info] spark-submit returned with exit code 1. > [info] Command line: > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit' > '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' > 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' > 'spark.sql.hive.metastore.version=2.3' '--conf' > 'spark.sql.hive.metastore.jars=maven' '--conf' > 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' > '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py' > [info] > [info] 2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j > profile: 
org/apache/spark/log4j-defaults.properties > [info] 2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Running Spark version 3.2.0 > [info] 2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN > NativeCodeLoader: Unable to load native-hadoop library for your platform... > using builtin-java classes where applicable > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: No custom resources configured for spark.driver. > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Submitted application: prepare testing tables > [info] 2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Default ResourceProfile created, executor resources: > Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: > memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: > 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Limiting resource is cpu > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfileManager: Added ResourceProfile id: 0 > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls to: kou > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls to: kou > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: SecurityManager: authentication disabled; ui acls disabled; > users with view 
permissions: Set(kou); groups with view permissions: Set(); > users with modify permissions: Set(kou); groups with modify permissions: > Set() > [info] 2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: > Successfully started service 'sparkDriver' on port 35867. > [info] 2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering MapOutputTracker > [info] 2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering BlockManagerMaster > [info] 2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO >
[jira] [Resolved] (SPARK-37792) Spark shell sets log level to INFO by default
[ https://issues.apache.org/jira/browse/SPARK-37792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37792. Fix Version/s: 3.3.0 Assignee: L. C. Hsieh (was: Apache Spark) Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/35080 > Spark shell sets log level to INFO by default > - > > Key: SPARK-37792 > URL: https://issues.apache.org/jira/browse/SPARK-37792 > Project: Spark > Issue Type: Bug > Components: Spark Shell >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Assignee: L. C. Hsieh >Priority: Major > Fix For: 3.3.0 > > > {code} > ./bin/spark-shell > {code} > {code} > Setting default log level to "WARN". > To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use > setLogLevel(newLevel). > 21/12/31 10:55:04 INFO SignalUtils: Registering signal handler for INT > 21/12/31 10:55:08 INFO HiveConf: Found configuration file null > 21/12/31 10:55:08 INFO SparkContext: Running Spark version 3.3.0-SNAPSHOT > ... > 21/12/31 10:55:09 INFO BlockManager: Initialized BlockManager: > BlockManagerId(driver, ..., None) > ... > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 3.3.0-SNAPSHOT > /_/ > Using Scala version 2.12.15 (Java HotSpot(TM) 64-Bit Server VM, Java > 1.8.0_291) > Type in expressions to have them evaluated. > Type :help for more information. > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37778) Upgrade SBT to 1.6.1
Kousuke Saruta created SPARK-37778: -- Summary: Upgrade SBT to 1.6.1 Key: SPARK-37778 URL: https://issues.apache.org/jira/browse/SPARK-37778 Project: Spark Issue Type: Bug Components: Build Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta SBT 1.6.1 was released, which upgrades log4j 2 to 2.17.1 to address CVE-2021-44832. https://github.com/sbt/sbt/releases/tag/v1.6.1 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37391) SIGNIFICANT bottleneck introduced by fix for SPARK-32001
[ https://issues.apache.org/jira/browse/SPARK-37391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37391. Fix Version/s: 3.3.0 Assignee: Danny Guinther Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34745 for Spark 3.3.0. > SIGNIFICANT bottleneck introduced by fix for SPARK-32001 > > > Key: SPARK-37391 > URL: https://issues.apache.org/jira/browse/SPARK-37391 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0 > Environment: N/A >Reporter: Danny Guinther >Assignee: Danny Guinther >Priority: Major > Fix For: 3.3.0 > > Attachments: so-much-blocking.jpg, spark-regression-dashes.jpg > > > The fix for https://issues.apache.org/jira/browse/SPARK-32001 ( > [https://github.com/apache/spark/pull/29024/files#diff-345beef18081272d77d91eeca2d9b5534ff6e642245352f40f4e9c9b8922b085R58] > ) does not seem to have considered the reality that some apps may rely on > being able to establish many JDBC connections simultaneously for performance > reasons. > The fix forces concurrency to 1 when establishing database connections and > that strikes me as a *significant* user-impacting change and a *significant* > bottleneck. > Can anyone propose a workaround for this? I have an app that makes > connections to thousands of databases and I can't upgrade to any version > >3.1.x because of this significant bottleneck. > > Thanks in advance for your help! -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
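The bottleneck described in the report above is easy to reproduce in miniature. The sketch below is a hypothetical Python analogue, not Spark's JDBC code: `connect` merely stands in for establishing one connection, and forcing the pool to a single worker serializes work that could run in parallel.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def connect(db):
    # stand-in for establishing one JDBC connection (~50 ms of latency)
    time.sleep(0.05)
    return db

dbs = [f"db{i}" for i in range(8)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=1) as pool:   # concurrency forced to 1
    list(pool.map(connect, dbs))
serial = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:   # connections in parallel
    list(pool.map(connect, dbs))
parallel = time.perf_counter() - start

print(f"serial={serial:.2f}s parallel={parallel:.2f}s")
```

With thousands of databases instead of eight, the gap between the two wall-clock times grows linearly, which is the regression the reporter measured.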
[jira] [Updated] (SPARK-37663) Mitigate ConcurrentModificationException thrown from tests in SparkContextSuite
[ https://issues.apache.org/jira/browse/SPARK-37663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37663: --- Summary: Mitigate ConcurrentModificationException thrown from tests in SparkContextSuite (was: Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite) > Mitigate ConcurrentModificationException thrown from tests in > SparkContextSuite > --- > > Key: SPARK-37663 > URL: https://issues.apache.org/jira/browse/SPARK-37663 > Project: Spark > Issue Type: Bug > Components: Spark Core, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > ConcurrentModificationException can be thrown from tests in SparkContextSuite > with Scala 2.13. > The cause seems to be same as SPARK-37315. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37663) SPARK-37315][ML][TEST] Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite
Kousuke Saruta created SPARK-37663: -- Summary: SPARK-37315][ML][TEST] Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite Key: SPARK-37663 URL: https://issues.apache.org/jira/browse/SPARK-37663 Project: Spark Issue Type: Bug Components: Spark Core, Tests Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta ConcurrentModificationException can be thrown from tests in SparkContextSuite with Scala 2.13. The cause seems to be same as SPARK-37315. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37663) Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite
[ https://issues.apache.org/jira/browse/SPARK-37663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37663: --- Summary: Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite (was: SPARK-37315][ML][TEST] Mitigate ConcurrentModificationException thrown from a test in SparkContextSuite) > Mitigate ConcurrentModificationException thrown from a test in > SparkContextSuite > > > Key: SPARK-37663 > URL: https://issues.apache.org/jira/browse/SPARK-37663 > Project: Spark > Issue Type: Bug > Components: Spark Core, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > ConcurrentModificationException can be thrown from tests in SparkContextSuite > with Scala 2.13. > The cause seems to be same as SPARK-37315. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37656) Upgrade SBT to 1.5.7
Kousuke Saruta created SPARK-37656: -- Summary: Upgrade SBT to 1.5.7 Key: SPARK-37656 URL: https://issues.apache.org/jira/browse/SPARK-37656 Project: Spark Issue Type: Bug Components: Build Affects Versions: 3.2.1, 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta SBT 1.5.7 was released a few hours ago, which includes a fix for CVE-2021-45046. https://github.com/sbt/sbt/releases/tag/v1.5.7 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37635) SHOW TBLPROPERTIES should print the fully qualified table name
[ https://issues.apache.org/jira/browse/SPARK-37635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37635. Fix Version/s: 3.3.0 Assignee: Wenchen Fan Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34890 > SHOW TBLPROPERTIES should print the fully qualified table name > -- > > Key: SPARK-37635 > URL: https://issues.apache.org/jira/browse/SPARK-37635 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37310) Migrate ALTER NAMESPACE ... SET PROPERTIES to use v2 command by default
[ https://issues.apache.org/jira/browse/SPARK-37310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37310. Fix Version/s: 3.3.0 Assignee: Terry Kim (was: Apache Spark) Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34891 > Migrate ALTER NAMESPACE ... SET PROPERTIES to use v2 command by default > --- > > Key: SPARK-37310 > URL: https://issues.apache.org/jira/browse/SPARK-37310 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Terry Kim >Assignee: Terry Kim >Priority: Major > Fix For: 3.3.0 > > > Migrate ALTER NAMESPACE ... SET PROPERTIES to use v2 command by default -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36038) Basic speculation metrics at stage level
[ https://issues.apache.org/jira/browse/SPARK-36038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-36038. Assignee: Thejdeep Gudivada Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34607 > Basic speculation metrics at stage level > > > Key: SPARK-36038 > URL: https://issues.apache.org/jira/browse/SPARK-36038 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.1.2 >Reporter: Venkata krishnan Sowrirajan >Assignee: Thejdeep Gudivada >Priority: Major > Fix For: 3.3.0 > > > Currently there are no speculation metrics available either at application > level or at stage level. Within our platform, we have added speculation > metrics at stage level as a summary, similar to the stage-level metrics, > tracking numTotalSpeculated, numCompleted (successful), numFailed, numKilled, > etc. This enables us to effectively understand the speculative execution feature > at an application level and helps in further tuning the speculation configs. > cc [~ron8hu] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37586) Add cipher mode option and set default cipher mode for aes_encrypt and aes_decrypt
[ https://issues.apache.org/jira/browse/SPARK-37586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37586. Fix Version/s: 3.3.0 Assignee: Max Gekk Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34837 > Add cipher mode option and set default cipher mode for aes_encrypt and > aes_decrypt > -- > > Key: SPARK-37586 > URL: https://issues.apache.org/jira/browse/SPARK-37586 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Fix For: 3.3.0 > > > https://github.com/apache/spark/pull/32801 added aes_encrypt/aes_decrypt > functions to Spark. However, they rely on the JVM's configuration regarding > which cipher mode to support; this is problematic, as it is not fixed across > versions and systems. > Let's hardcode a default cipher mode and also allow users to set a cipher > mode as an argument to the function. > In the future, we can support other modes like GCM and CBC that have already > been supported by other systems: > # Snowflake: > https://docs.snowflake.com/en/sql-reference/functions/encrypt.html > # Bigquery: > https://cloud.google.com/bigquery/docs/reference/standard-sql/aead-encryption-concepts#block_cipher_modes -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function
[ https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454960#comment-17454960 ] Kousuke Saruta commented on SPARK-37568: [~yoda-mon] OK, please go ahead. > Support 2-arguments by the convert_timezone() function > -- > > Key: SPARK-37568 > URL: https://issues.apache.org/jira/browse/SPARK-37568 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > # If sourceTs is a timestamp_ntz, take the sourceTz from the session time > zone, see the SQL config spark.sql.session.timeZone > # If sourceTs is a timestamp_ltz, convert it to a timestamp_ntz using the > targetTz -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
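The two rules quoted in the issue above can be sketched outside Spark. The function below is a hypothetical Python model of the proposed 2-argument convert_timezone() semantics — not Spark's implementation — with a naive datetime standing in for timestamp_ntz, an aware one for timestamp_ltz, and SESSION_TZ standing in for spark.sql.session.timeZone.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

SESSION_TZ = ZoneInfo("UTC")  # stand-in for spark.sql.session.timeZone

def convert_timezone(target_tz, source_ts):
    """Sketch of the proposed 2-argument semantics:
    - timestamp_ntz input: sourceTz is taken from the session time zone;
    - timestamp_ltz input: converted directly using the target time zone.
    Either way the result is a timestamp_ntz in the target zone."""
    tz = ZoneInfo(target_tz)
    if source_ts.tzinfo is None:
        # timestamp_ntz: interpret the wall-clock value in the session zone
        source_ts = source_ts.replace(tzinfo=SESSION_TZ)
    # render in the target zone and drop the zone to get a timestamp_ntz
    return source_ts.astimezone(tz).replace(tzinfo=None)

print(convert_timezone("Asia/Tokyo", datetime(2022, 1, 1, 0, 0)))
```

Midnight UTC becomes 09:00 wall-clock time in Asia/Tokyo, whether the input arrives as a naive (NTZ) or an aware (LTZ) value.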
[jira] [Commented] (SPARK-37568) Support 2-arguments by the convert_timezone() function
[ https://issues.apache.org/jira/browse/SPARK-37568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454947#comment-17454947 ] Kousuke Saruta commented on SPARK-37568: cc: [~yoda-mon] [~YActs] Do you want to work on this? > Support 2-arguments by the convert_timezone() function > -- > > Key: SPARK-37568 > URL: https://issues.apache.org/jira/browse/SPARK-37568 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > # If sourceTs is a timestamp_ntz, take the sourceTz from the session time > zone, see the SQL config spark.sql.session.timeZone > # If sourceTs is a timestamp_ltz, convert it to a timestamp_ntz using the > targetTz -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37469) Unified "fetchWaitTime" and "shuffleReadTime" metrics On UI
[ https://issues.apache.org/jira/browse/SPARK-37469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37469. Fix Version/s: 3.3.0 Assignee: Yazhi Wang Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34720 > Unified "fetchWaitTime" and "shuffleReadTime" metrics On UI > --- > > Key: SPARK-37469 > URL: https://issues.apache.org/jira/browse/SPARK-37469 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.2.0 >Reporter: Yazhi Wang >Assignee: Yazhi Wang >Priority: Minor > Fix For: 3.3.0 > > Attachments: executor-page.png, sql-page.png > > > The metric is shown in the Executor/Task page as > "Shuffle Read Block Time", while the SQL page shows it as "fetch wait time", which > is confusing. !executor-page.png! > !sql-page.png! -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37529) Support K8s integration tests for Java 17
Kousuke Saruta created SPARK-37529: -- Summary: Support K8s integration tests for Java 17 Key: SPARK-37529 URL: https://issues.apache.org/jira/browse/SPARK-37529 Project: Spark Issue Type: Sub-task Components: Kubernetes, Tests Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Now that we can build container image for Java 17, let's support K8s integration tests for Java 17. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37487) CollectMetrics is executed twice if it is followed by a sort
[ https://issues.apache.org/jira/browse/SPARK-37487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451376#comment-17451376 ] Kousuke Saruta commented on SPARK-37487: [~tanelk] Thank you for pinging me. I think a sampling job for the global sort performs the extra CollectMetrics (operations before the sort are performed twice). Please let me look into it more. > CollectMetrics is executed twice if it is followed by a sort > > > Key: SPARK-37487 > URL: https://issues.apache.org/jira/browse/SPARK-37487 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Tanel Kiis >Priority: Major > Labels: correctness > > It is best exemplified by this new UT in DataFrameCallbackSuite: > {code} > test("SPARK-37487: get observable metrics with sort by callback") { > val df = spark.range(100) > .observe( > name = "my_event", > min($"id").as("min_val"), > max($"id").as("max_val"), > // Test unresolved alias > sum($"id"), > count(when($"id" % 2 === 0, 1)).as("num_even")) > .observe( > name = "other_event", > avg($"id").cast("int").as("avg_val")) > .sort($"id".desc) > validateObservedMetrics(df) > } > {code} > The count and sum aggregate report twice the number of rows: > {code} > [info] - SPARK-37487: get observable metrics with sort by callback *** FAILED > *** (169 milliseconds) > [info] [0,99,9900,100] did not equal [0,99,4950,50] > (DataFrameCallbackSuite.scala:342) > [info] org.scalatest.exceptions.TestFailedException: > [info] at > org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472) > [info] at > org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471) > [info] at > org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231) > [info] at > org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295) > [info] at > org.apache.spark.sql.util.DataFrameCallbackSuite.checkMetrics$1(DataFrameCallbackSuite.scala:342) > [info] at > 
org.apache.spark.sql.util.DataFrameCallbackSuite.validateObservedMetrics(DataFrameCallbackSuite.scala:350) > [info] at > org.apache.spark.sql.util.DataFrameCallbackSuite.$anonfun$new$21(DataFrameCallbackSuite.scala:324) > [info] at > scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > [info] at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > [info] at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > [info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > [info] at org.scalatest.Transformer.apply(Transformer.scala:22) > [info] at org.scalatest.Transformer.apply(Transformer.scala:20) > {code} > I could not figure out how this happes. Hopefully the UT can help with > debugging -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
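The suspected mechanism — a sampling job for the global sort re-running everything upstream of the sort — can be mimicked in plain Python. In this hypothetical sketch (not Spark code), `scan()` plays the role of the observed plan fragment: one pass samples the data to choose range-partition bounds, a second pass re-runs it to actually sort, so its side-effecting metric double-counts exactly as in the failing test (9900 instead of 4950, 100 rows counted twice).

```python
import random

rows_produced = 0  # our stand-in for an observed metric like sum/count

def scan():
    """Upstream operator with an observable side effect (like CollectMetrics)."""
    global rows_produced
    for i in range(100):
        rows_produced += 1
        yield i

# pass 1: sample the data to choose range-partition bounds for the global sort
bounds = sorted(random.Random(0).sample(list(scan()), 10))

# pass 2: re-run the upstream plan to actually sort the data
data = sorted(scan(), reverse=True)

print(rows_produced)  # 200: the scan ran twice, so the metric double-counts
```

The fix direction this suggests is to make the observation idempotent across re-executions of the plan, rather than to avoid the sampling pass itself.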
[jira] [Updated] (SPARK-37468) Support ANSI intervals and TimestampNTZ for UnionEstimation
[ https://issues.apache.org/jira/browse/SPARK-37468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37468: --- Description: Currently, UnionEstimation doesn't support ANSI intervals and TimestampNTZ. But I think it can support those types because their underlying types are integer or long, which UnionEstimation can compute stats for. (was: Currently, UnionEstimation doesn't support ANSI intervals and TimestampNTZ. But I think it can support those types because their underlying types are integer or long, which it UnionEstimation can compute stats for.) > Support ANSI intervals and TimestampNTZ for UnionEstimation > --- > > Key: SPARK-37468 > URL: https://issues.apache.org/jira/browse/SPARK-37468 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > Currently, UnionEstimation doesn't support ANSI intervals and TimestampNTZ. > But I think it can support those types because their underlying types are > integer or long, which UnionEstimation can compute stats for. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37468) Support ANSI intervals and TimestampNTZ for UnionEstimation
Kousuke Saruta created SPARK-37468: -- Summary: Support ANSI intervals and TimestampNTZ for UnionEstimation Key: SPARK-37468 URL: https://issues.apache.org/jira/browse/SPARK-37468 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Currently, UnionEstimation doesn't support ANSI intervals and TimestampNTZ. But I think it can support those types because their underlying types are integer or long, which UnionEstimation can compute stats for. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37459) Upgrade commons-cli to 1.5.0
Kousuke Saruta created SPARK-37459: -- Summary: Upgrade commons-cli to 1.5.0 Key: SPARK-37459 URL: https://issues.apache.org/jira/browse/SPARK-37459 Project: Spark Issue Type: Bug Components: Build Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Currently used commons-cli is too old and contains an issue which affects the behavior of bin/spark-sql {code} bin/spark-sql -e 'SELECT "Spark"' ... Error in query: no viable alternative at input 'SELECT "'(line 1, pos 7) == SQL == SELECT "Spark ---^^^ {code} The root cause of this issue seems to be resolved in CLI-185. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37354) Make the Java version installed on the container image used by the K8s integration tests with SBT configurable
[ https://issues.apache.org/jira/browse/SPARK-37354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37354. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34628 > Make the Java version installed on the container image used by the K8s > integration tests with SBT configurable > -- > > Key: SPARK-37354 > URL: https://issues.apache.org/jira/browse/SPARK-37354 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.2.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > Fix For: 3.3.0 > > > I noticed that the default Java version installed on the container image used > by the K8s integration tests is different depending on the way the tests are run. > If the tests are launched by Maven, Java 8 is installed. > On the other hand, if the tests are launched by SBT, Java 11 is installed. > Further, we have no way to change the version. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37354) Make the Java version installed on the container image used by the K8s integration tests with SBT configurable
Kousuke Saruta created SPARK-37354: -- Summary: Make the Java version installed on the container image used by the K8s integration tests with SBT configurable Key: SPARK-37354 URL: https://issues.apache.org/jira/browse/SPARK-37354 Project: Spark Issue Type: Bug Components: Kubernetes, Tests Affects Versions: 3.2.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta I noticed that the default Java version installed on the container image used by the K8s integration tests is different depending on the way the tests are run. If the tests are launched by Maven, Java 8 is installed. On the other hand, if the tests are launched by SBT, Java 11 is installed. Further, we have no way to change the version. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37319) Support K8s image building with Java 17
[ https://issues.apache.org/jira/browse/SPARK-37319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37319. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34586 > Support K8s image building with Java 17 > --- > > Key: SPARK-37319 > URL: https://issues.apache.org/jira/browse/SPARK-37319 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37320) Delete py_container_checks.zip after the test in DepsTestsSuite finishes
[ https://issues.apache.org/jira/browse/SPARK-37320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37320: --- Description: When K8s integration tests run, py_container_checks.zip still remains in resource-managers/kubernetes/integration-tests/tests/. It is created in the test "Launcher python client dependencies using a zip file" in DepsTestsSuite. was: When K8s integration tests run, py_container_checks.zip is still remaining in resource-managers/kubernetes/integration-tests/tests/. It's is created in the test "Launcher python client dependencies using a zip file" in DepsTestsSuite. > Delete py_container_checks.zip after the test in DepsTestsSuite finishes > > > Key: SPARK-37320 > URL: https://issues.apache.org/jira/browse/SPARK-37320 > Project: Spark > Issue Type: Bug > Components: k8, Tests >Affects Versions: 3.2.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > When K8s integration tests run, py_container_checks.zip still remains in > resource-managers/kubernetes/integration-tests/tests/. > It is created in the test "Launcher python client dependencies using a zip > file" in DepsTestsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37320) Delete py_container_checks.zip after the test in DepsTestsSuite finishes
Kousuke Saruta created SPARK-37320: -- Summary: Delete py_container_checks.zip after the test in DepsTestsSuite finishes Key: SPARK-37320 URL: https://issues.apache.org/jira/browse/SPARK-37320 Project: Spark Issue Type: Bug Components: k8, Tests Affects Versions: 3.2.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta When K8s integration tests run, py_container_checks.zip still remains in resource-managers/kubernetes/integration-tests/tests/. It is created in the test "Launcher python client dependencies using a zip file" in DepsTestsSuite.
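The cleanup the title describes can be sketched as below. This is a hedged illustration, not the actual patch: the helper name `deleteLeftover` and the idea of invoking it from an afterAll()-style hook in DepsTestsSuite are assumptions.

```scala
import java.nio.file.{Files, Path, Paths}

object LeftoverCleanup {
  // Hypothetical helper: remove a file a test left behind, tolerating its
  // absence. Files.deleteIfExists returns true only if a file was removed.
  def deleteLeftover(path: Path): Boolean = Files.deleteIfExists(path)

  // The path the issue reports as left behind; in DepsTestsSuite the delete
  // would presumably run from an afterAll()-style hook.
  val leftoverZip: Path =
    Paths.get("resource-managers/kubernetes/integration-tests/tests/py_container_checks.zip")
}
```

Calling `LeftoverCleanup.deleteLeftover(LeftoverCleanup.leftoverZip)` once the suite finishes leaves the tests/ directory clean whether or not the zip test created the file.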
[jira] [Updated] (SPARK-37315) Mitigate ConcurrentModificationException thrown from a test in MLEventSuite
[ https://issues.apache.org/jira/browse/SPARK-37315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37315: --- Summary: Mitigate ConcurrentModificationException thrown from a test in MLEventSuite (was: Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite) > Mitigate ConcurrentModificationException thrown from a test in MLEventSuite > --- > > Key: SPARK-37315 > URL: https://issues.apache.org/jira/browse/SPARK-37315 > Project: Spark > Issue Type: Bug > Components: ML, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > Recently, I noticed that ConcurrentModificationException is sometimes thrown from > the following part of the test "pipeline read/write events" in MLEventSuite > when Scala 2.13 is used. > {code} > events.map(JsonProtocol.sparkEventToJson).foreach { event => > assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent]) > } > {code} > I think the root cause is that the ArrayBuffer (events) is updated asynchronously > by the following part. > {code} > private val listener: SparkListener = new SparkListener { > override def onOtherEvent(event: SparkListenerEvent): Unit = event match { > case e: MLEvent => events.append(e) > case _ => > } > } > {code}
[jira] [Updated] (SPARK-37315) Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite
[ https://issues.apache.org/jira/browse/SPARK-37315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37315: --- Description: Recently, I noticed that ConcurrentModificationException is sometimes thrown from the following part of the test "pipeline read/write events" in MLEventSuite when Scala 2.13 is used. {code} events.map(JsonProtocol.sparkEventToJson).foreach { event => assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent]) } {code} I think the root cause is that the ArrayBuffer (events) is updated asynchronously by the following part. {code} private val listener: SparkListener = new SparkListener { override def onOtherEvent(event: SparkListenerEvent): Unit = event match { case e: MLEvent => events.append(e) case _ => } } {code} was: Recently, I notice ConcurrentModificationException is thrown from the following part of the test "pipeline read/write events" in MLEventSuite when Scala 2.13 is used. {code} events.map(JsonProtocol.sparkEventToJson).foreach { event => assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent]) } {code} I think the root cause is the ArrayBuffer (events) is updated asynchronously by the following part. {code} private val listener: SparkListener = new SparkListener { override def onOtherEvent(event: SparkListenerEvent): Unit = event match { case e: MLEvent => events.append(e) case _ => } } {code} > Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite > - > > Key: SPARK-37315 > URL: https://issues.apache.org/jira/browse/SPARK-37315 > Project: Spark > Issue Type: Bug > Components: ML, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > Recently, I noticed that ConcurrentModificationException is sometimes thrown from > the following part of the test "pipeline read/write events" in MLEventSuite > when Scala 2.13 is used. 
> {code} > events.map(JsonProtocol.sparkEventToJson).foreach { event => > assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent]) > } > {code} > I think the root cause is that the ArrayBuffer (events) is updated asynchronously > by the following part. > {code} > private val listener: SparkListener = new SparkListener { > override def onOtherEvent(event: SparkListenerEvent): Unit = event match { > case e: MLEvent => events.append(e) > case _ => > } > } > {code}
[jira] [Created] (SPARK-37315) Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite
Kousuke Saruta created SPARK-37315: -- Summary: Mitigate a ConcurrentModificationException thrown from a test in MLEventSuite Key: SPARK-37315 URL: https://issues.apache.org/jira/browse/SPARK-37315 Project: Spark Issue Type: Bug Components: ML, Tests Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Recently, I noticed that ConcurrentModificationException is thrown from the following part of the test "pipeline read/write events" in MLEventSuite when Scala 2.13 is used. {code} events.map(JsonProtocol.sparkEventToJson).foreach { event => assert(JsonProtocol.sparkEventFromJson(event).isInstanceOf[MLEvent]) } {code} I think the root cause is that the ArrayBuffer (events) is updated asynchronously by the following part. {code} private val listener: SparkListener = new SparkListener { override def onOtherEvent(event: SparkListenerEvent): Unit = event match { case e: MLEvent => events.append(e) case _ => } } {code}
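One common mitigation for the race described above is to guard both the appending listener and the reading test with the buffer's own monitor and iterate over a snapshot. The wrapper below is an invented illustration of that idea, not the patch that actually landed:

```scala
import scala.collection.mutable.ArrayBuffer

// Invented wrapper: every write and the snapshot copy synchronize on the
// buffer, so a reader never iterates the collection while a listener thread
// is appending to it mid-iteration.
class SynchronizedEventLog[A] {
  private val events = ArrayBuffer.empty[A]

  def append(e: A): Unit = events.synchronized { events += e }

  // Copy under the lock; callers then iterate the immutable copy race-free.
  def snapshot(): List[A] = events.synchronized { events.toList }
}
```

The test would then iterate `snapshot()` instead of the live buffer, so the JsonProtocol round-trip assertions can run while events keep arriving from the listener bus.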
[jira] [Resolved] (SPARK-37312) Add `.java-version` to `.gitignore` and `.rat-excludes`
[ https://issues.apache.org/jira/browse/SPARK-37312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37312. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34577 > Add `.java-version` to `.gitignore` and `.rat-excludes` > --- > > Key: SPARK-37312 > URL: https://issues.apache.org/jira/browse/SPARK-37312 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Trivial > Fix For: 3.3.0 > >
[jira] [Updated] (SPARK-37314) Upgrade kubernetes-client to 5.10.1
[ https://issues.apache.org/jira/browse/SPARK-37314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37314: --- Description: kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug fixes. https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 In particular, the connection leak issue would affect Spark. https://github.com/fabric8io/kubernetes-client/issues/3561 was: kubernetes-client 5.10.0 and 5.10.1 were relased, which include some bug fixes. https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 Especially, the connection leak issue would affect Spark. https://github.com/fabric8io/kubernetes-client/issues/3561 > Upgrade kubernetes-client to 5.10.1 > --- > > Key: SPARK-37314 > URL: https://issues.apache.org/jira/browse/SPARK-37314 > Project: Spark > Issue Type: Bug > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug > fixes. > https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 > https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 > In particular, the connection leak issue would affect Spark. > https://github.com/fabric8io/kubernetes-client/issues/3561
[jira] [Updated] (SPARK-37314) Upgrade kubernetes-client to 5.10.1
[ https://issues.apache.org/jira/browse/SPARK-37314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37314: --- Description: kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug fixes. https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 In particular, the connection leak issue would affect Spark. https://github.com/fabric8io/kubernetes-client/issues/3561 was: A few days ago, kubernetes-client 5.10.0 and 5.10.1 are relased, which include some bug fixes. https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 Especially, the connection leak issue would affect Spark. https://github.com/fabric8io/kubernetes-client/issues/3561 > Upgrade kubernetes-client to 5.10.1 > --- > > Key: SPARK-37314 > URL: https://issues.apache.org/jira/browse/SPARK-37314 > Project: Spark > Issue Type: Bug > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug > fixes. > https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 > https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 > In particular, the connection leak issue would affect Spark. > https://github.com/fabric8io/kubernetes-client/issues/3561
[jira] [Updated] (SPARK-37314) Upgrade kubernetes-client to 5.10.1
[ https://issues.apache.org/jira/browse/SPARK-37314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37314: --- Description: A few days ago, kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug fixes. https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 In particular, the connection leak issue would affect Spark. https://github.com/fabric8io/kubernetes-client/issues/3561 was: A few days ago, kubernetes-client 5.10.0 and 5.10.1 are relased, which include some bug fixes. Especially, the connection leak issue would affect Spark. > Upgrade kubernetes-client to 5.10.1 > --- > > Key: SPARK-37314 > URL: https://issues.apache.org/jira/browse/SPARK-37314 > Project: Spark > Issue Type: Bug > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > A few days ago, kubernetes-client 5.10.0 and 5.10.1 were released, which > include some bug fixes. > https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0 > https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.1 > In particular, the connection leak issue would affect Spark. > https://github.com/fabric8io/kubernetes-client/issues/3561
[jira] [Created] (SPARK-37314) Upgrade kubernetes-client to 5.10.1
Kousuke Saruta created SPARK-37314: -- Summary: Upgrade kubernetes-client to 5.10.1 Key: SPARK-37314 URL: https://issues.apache.org/jira/browse/SPARK-37314 Project: Spark Issue Type: Bug Components: Build, Kubernetes Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta A few days ago, kubernetes-client 5.10.0 and 5.10.1 were released, which include some bug fixes. In particular, the connection leak issue would affect Spark.
[jira] [Updated] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh
[ https://issues.apache.org/jira/browse/SPARK-37302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37302: --- Description: dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in both the Maven and Coursier local repositories. {code:java} $ rm -rf ~/.m2/repository/* $ # For Linux $ rm -rf ~/.cache/coursier/v1/* $ # For macOS $ rm -rf ~/Library/Caches/Coursier/v1/* $ dev/change-scala-version.sh 2.13 $ dev/test-dependencies.sh $ build/sbt -Pscala-2.13 clean compile ... [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1: error: package com.google.common.primitives does not exist [error] import com.google.common.primitives.Ints; [error]^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1: error: package com.google.common.annotations does not exist [error] import com.google.common.annotations.VisibleForTesting; [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1: error: package com.google.common.base does not exist [error] import com.google.common.base.Preconditions; ... {code} {code:java} [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25: Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub. 
[error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.HttpConnectionFactory) [error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13: Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with a stub. 
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), null) [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.ConnectionFactory) [error] val connector = new ServerConnector( {code} The reason is that exec-maven-plugin used in test-dependencies.sh downloads the pom of guava and jetty-io but doesn't download the corresponding jars, and skips dependency testing if Scala 2.13 is used (if dependency testing runs, Maven downloads those jars). {code} if [[ "$SCALA_BINARY_VERSION" != "2.12" ]]; then # TODO(SPARK-36168) Support Scala 2.13 in dev/test-dependencies.sh echo "Skip dependency testing on $SCALA_BINARY_VERSION" exit 0 fi {code} {code:java} $ find ~/.m2
[jira] [Updated] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh
[ https://issues.apache.org/jira/browse/SPARK-37302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37302: --- Description: dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in both the Maven and Coursier local repositories. {code:java} $ rm -rf ~/.m2/repository/* $ # For Linux $ rm -rf ~/.cache/coursier/v1/* $ # For macOS $ rm -rf ~/Library/Caches/Coursier/v1/* $ dev/change-scala-version.sh 2.13 $ dev/test-dependencies.sh $ build/sbt -Pscala-2.13 clean compile ... [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1: error: package com.google.common.primitives does not exist [error] import com.google.common.primitives.Ints; [error]^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1: error: package com.google.common.annotations does not exist [error] import com.google.common.annotations.VisibleForTesting; [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1: error: package com.google.common.base does not exist [error] import com.google.common.base.Preconditions; ... {code} {code:java} [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25: Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub. 
[error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.HttpConnectionFactory) [error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13: Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with a stub. 
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), null) [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.ConnectionFactory) [error] val connector = new ServerConnector( {code} The reason is that exec-maven-plugin used in `test-dependencies.sh` downloads the pom of guava and jetty-io but doesn't download the corresponding jars. {code:java} $ find ~/.m2 -name "guava*" ... /home/kou/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.pom /home/kou/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.pom.sha1 ... /home/kou/.m2/repository/com/google/guava/guava-parent/14.0.1/guava-parent-14.0.1.pom
[jira] [Created] (SPARK-37302) Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh
Kousuke Saruta created SPARK-37302: -- Summary: Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh Key: SPARK-37302 URL: https://issues.apache.org/jira/browse/SPARK-37302 Project: Spark Issue Type: Bug Components: Build Affects Versions: 3.2.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta dev/run-tests.py fails if Scala 2.13 is used and guava or jetty-io is not in both the Maven and Coursier local repositories. {code} $ rm -rf ~/.m2/repository/* $ # For Linux $ rm -rf ~/.cache/coursier/v1/* $ # For macOS $ rm -rf ~/Library/Caches/Coursier/v1/* $ dev/change-scala-version.sh 2.13 $ dev/test-dependencies.sh $ build/sbt -Pscala-2.13 clean compile ... [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1: error: package com.google.common.primitives does not exist [error] import com.google.common.primitives.Ints; [error]^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1: error: package com.google.common.annotations does not exist [error] import com.google.common.annotations.VisibleForTesting; [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1: error: package com.google.common.base does not exist [error] import com.google.common.base.Preconditions; ... {code} {code} [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25: Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub. 
[error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.HttpConnectionFactory) [error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13: Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with a stub. 
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), null) [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.ConnectionFactory) [error] val connector = new ServerConnector( {code} The reason is that the exec-maven-plugin used in `test-dependencies.sh` downloads the pom files of guava and jetty-io but doesn't download the corresponding jars. {code} $ find ~/.m2 -name "guava*" ...
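The failure mode described above (pom present, jar absent in the local Maven repository) can be detected with a short script. This is a hedged sketch, not part of `test-dependencies.sh`: the function name is hypothetical, and note that parent artifacts such as guava-parent legitimately ship only a pom, so hits need manual review.

```shell
# List artifacts whose pom exists in a Maven local repository but whose
# corresponding jar is missing -- the state left behind when only the pom
# metadata was resolved.
find_missing_jars() {
  repo="$1"
  find "$repo" -name '*.pom' | while read -r pom; do
    jar="${pom%.pom}.jar"
    [ -f "$jar" ] || echo "$pom"
  done
}

# Usage (assuming the standard local repository location):
#   find_missing_jars "${HOME}/.m2/repository"
```

Explicitly resolving the jars (for example with the maven-dependency-plugin's `dependency:get` goal) for each flagged artifact is one way to repair such a repository.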
[jira] [Created] (SPARK-37284) Upgrade Jekyll to 4.2.1
Kousuke Saruta created SPARK-37284: -- Summary: Upgrade Jekyll to 4.2.1 Key: SPARK-37284 URL: https://issues.apache.org/jira/browse/SPARK-37284 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Jekyll 4.2.1, released in September, includes a fix for a regression. https://github.com/jekyll/jekyll/releases/tag/v4.2.1 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37283) Don't try to store a V1 table which contains ANSI intervals in Hive compatible format
[ https://issues.apache.org/jira/browse/SPARK-37283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37283: --- Description: If a table being created contains a column of ANSI interval types and the underlying file format has a corresponding Hive SerDe (e.g. Parquet), `HiveExternalCatalog` tries to store the table in Hive compatible format. But, as ANSI interval types in Spark and the interval types in Hive are not compatible (Hive only supports interval_year_month and interval_day_time), the following warning with a stack trace will be logged. {code} spark-sql> CREATE TABLE tbl1(a INTERVAL YEAR TO MONTH) USING Parquet; 21/11/11 14:39:29 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory. 21/11/11 14:39:29 WARN HiveExternalCatalog: Could not persist `default`.`tbl1` in a Hive compatible way. Persisting it into Hive metastore in Spark SQL specific format. org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Error: type expected at the position 0 of 'interval year to month' but 'interval year to month' is found. 
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:874) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createTable$1(HiveClientImpl.scala:553) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:303) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:234) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:233) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:283) at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:551) at org.apache.spark.sql.hive.HiveExternalCatalog.saveTableIntoHive(HiveExternalCatalog.scala:499) at org.apache.spark.sql.hive.HiveExternalCatalog.createDataSourceTable(HiveExternalCatalog.scala:397) at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$createTable$1(HiveExternalCatalog.scala:274) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102) at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245) at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:376) at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:120) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84) at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97) at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) at
[jira] [Created] (SPARK-37283) Don't try to store a V1 table which contains ANSI intervals in Hive compatible format
Kousuke Saruta created SPARK-37283: -- Summary: Don't try to store a V1 table which contains ANSI intervals in Hive compatible format Key: SPARK-37283 URL: https://issues.apache.org/jira/browse/SPARK-37283 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta If a table being created contains a column of ANSI interval types and the underlying file format has a corresponding Hive SerDe (e.g. Parquet), `HiveExternalCatalog` tries to store the table in Hive compatible format. But, as ANSI interval types in Spark and the interval types in Hive are not compatible (Hive only supports interval_year_month and interval_day_time), the following warning with a stack trace will be logged. {code} spark-sql> CREATE TABLE tbl1(a INTERVAL YEAR TO MONTH) USING Parquet; 21/11/11 14:39:29 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory. 21/11/11 14:39:29 WARN HiveExternalCatalog: Could not persist `default`.`tbl1` in a Hive compatible way. Persisting it into Hive metastore in Spark SQL specific format. org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Error: type expected at the position 0 of 'interval year to month' but 'interval year to month' is found. 
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:874) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createTable$1(HiveClientImpl.scala:553) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:303) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:234) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:233) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:283) at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:551) at org.apache.spark.sql.hive.HiveExternalCatalog.saveTableIntoHive(HiveExternalCatalog.scala:499) at org.apache.spark.sql.hive.HiveExternalCatalog.createDataSourceTable(HiveExternalCatalog.scala:397) at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$createTable$1(HiveExternalCatalog.scala:274) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102) at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245) at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:376) at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:120) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84) at 
org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:97) at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:93) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481) at
[jira] [Resolved] (SPARK-37264) [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from orc-core
[ https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37264. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34541 > [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from > orc-core > -- > > Key: SPARK-37264 > URL: https://issues.apache.org/jira/browse/SPARK-37264 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > Fix For: 3.3.0 > > > Like hadoop-common and hadoop-hdfs, this PR proposes to exclude the > hadoop-client-api transitive dependency from orc-core. > Why are the changes needed? > Since Apache Hadoop 2.7 doesn't work on Java 17, Apache ORC has a dependency > on Hadoop 3.3.1. > This causes a test-dependencies.sh failure on Java 17. As a result, > run-tests.py also fails.
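Mirroring how hadoop-common and hadoop-hdfs are handled, the exclusion described above would look roughly like the following pom fragment. This is a sketch, not the actual diff: the surrounding coordinates are assumed, and Spark's real pom.xml may declare the dependency differently.

```xml
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-core</artifactId>
  <exclusions>
    <!-- orc-shims pulls this in (compile scope) only under Java 17; Spark
         supplies its own Hadoop client artifacts, so cut it here. -->
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client-api</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```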
[jira] [Updated] (SPARK-37264) [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from orc-core
[ https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37264: --- Description: Like hadoop-common and hadoop-hdfs, this PR proposes to exclude the hadoop-client-api transitive dependency from orc-core. Why are the changes needed? Since Apache Hadoop 2.7 doesn't work on Java 17, Apache ORC has a dependency on Hadoop 3.3.1. This causes a test-dependencies.sh failure on Java 17. As a result, run-tests.py also fails. was: In the current master, `run-tests.py` fails on Java 17 because `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for Java 17. Hadoop 2.7 doesn't support Java 17 so let's > [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from > orc-core > -- > > Key: SPARK-37264 > URL: https://issues.apache.org/jira/browse/SPARK-37264 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > Like hadoop-common and hadoop-hdfs, this PR proposes to exclude the > hadoop-client-api transitive dependency from orc-core. > Why are the changes needed? > Since Apache Hadoop 2.7 doesn't work on Java 17, Apache ORC has a dependency > on Hadoop 3.3.1. > This causes a test-dependencies.sh failure on Java 17. As a result, > run-tests.py also fails.
[jira] [Updated] (SPARK-37264) Cut the transitive dependency on hadoop-client-api which orc-shims depends on only for Java 17 with hadoop-2.7
[ https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37264: --- Description: In the current master, `run-tests.py` fails on Java 17 because `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for Java 17. Hadoop 2.7 doesn't support Java 17 so let's was: In the current master, `run-tests.py` fails on Java 17 because `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for Java 17. Currently, we don't maintain the dependency manifests for Java 17 yet so let's skip it temporarily. > Cut the transitive dependency on hadoop-client-api which orc-shims depends on > only for Java 17 with hadoop-2.7 > -- > > Key: SPARK-37264 > URL: https://issues.apache.org/jira/browse/SPARK-37264 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > In the current master, `run-tests.py` fails on Java 17 because > `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile > dependency on hadoop-client-api:3.3.1 only for Java 17. > Hadoop 2.7 doesn't support Java 17 so let's
[jira] [Updated] (SPARK-37264) [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from orc-core
[ https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37264: --- Summary: [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from orc-core (was: Cut the transitive dependency on hadoop-client-api which orc-shims depends on only for Java 17 with hadoop-2.7) > [SPARK-37264][BUILD] Exclude hadoop-client-api transitive dependency from > orc-core > -- > > Key: SPARK-37264 > URL: https://issues.apache.org/jira/browse/SPARK-37264 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > In the current master, `run-tests.py` fails on Java 17 because > `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile > dependency on hadoop-client-api:3.3.1 only for Java 17. > Hadoop 2.7 doesn't support Java 17 so let's
[jira] [Updated] (SPARK-37264) Cut the transitive dependency on hadoop-client-api which orc-shims depends on only for Java 17 with hadoop-2.7
[ https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37264: --- Summary: Cut the transitive dependency on hadoop-client-api which orc-shims depends on only for Java 17 with hadoop-2.7 (was: Skip dependency testing on Java 17 temporarily) > Cut the transitive dependency on hadoop-client-api which orc-shims depends on > only for Java 17 with hadoop-2.7 > -- > > Key: SPARK-37264 > URL: https://issues.apache.org/jira/browse/SPARK-37264 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > In the current master, `run-tests.py` fails on Java 17 because > `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile > dependency on hadoop-client-api:3.3.1 only for Java 17. > Currently, we don't maintain the dependency manifests for Java 17 yet so > let's skip it temporarily.
[jira] [Updated] (SPARK-37264) Skip dependency testing on Java 17 temporarily
[ https://issues.apache.org/jira/browse/SPARK-37264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37264: --- Description: In the current master, `run-tests.py` fails on Java 17 because `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for Java 17. Currently, we don't maintain the dependency manifests for Java 17 yet so let's skip it temporarily. was: In the current master, test-dependencies.sh fails on Java 17 because orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for Java 17. Currently, we don't maintain the dependency manifests for Java 17 yet so let's skip it temporarily. > Skip dependency testing on Java 17 temporarily > -- > > Key: SPARK-37264 > URL: https://issues.apache.org/jira/browse/SPARK-37264 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > > In the current master, `run-tests.py` fails on Java 17 because > `test-dependencies.sh` fails. The cause is that orc-shims:1.7.1 has a compile > dependency on hadoop-client-api:3.3.1 only for Java 17. > Currently, we don't maintain the dependency manifests for Java 17 yet so > let's skip it temporarily.
[jira] [Created] (SPARK-37265) Support Java 17 in `dev/test-dependencies.sh`
Kousuke Saruta created SPARK-37265: -- Summary: Support Java 17 in `dev/test-dependencies.sh` Key: SPARK-37265 URL: https://issues.apache.org/jira/browse/SPARK-37265 Project: Spark Issue Type: Sub-task Components: Tests Affects Versions: 3.3.0 Reporter: Kousuke Saruta
[jira] [Created] (SPARK-37264) Skip dependency testing on Java 17 temporarily
Kousuke Saruta created SPARK-37264: -- Summary: Skip dependency testing on Java 17 temporarily Key: SPARK-37264 URL: https://issues.apache.org/jira/browse/SPARK-37264 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta In the current master, test-dependencies.sh fails on Java 17 because orc-shims:1.7.1 has a compile dependency on hadoop-client-api:3.3.1 only for Java 17. Currently, we don't maintain the dependency manifests for Java 17 yet so let's skip it temporarily.
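The temporary skip described above can be sketched as a small shell guard keyed on the JDK major version. This is a hedged sketch, not the actual `test-dependencies.sh` code: the function name is hypothetical, and it assumes the major version has already been extracted as a plain integer (8, 11, 17, ...).

```shell
# Return success (exit status 0) when dependency testing should be skipped
# for the given JDK major version: manifests are only maintained for
# pre-17 JDKs at this point.
should_skip_dependency_test() {
  [ "$1" -ge 17 ]
}

# Example of extracting the major version from `java -version` output such as
# 'openjdk version "17.0.1"' (hypothetical parsing, shown for context only):
#   major=$(java -version 2>&1 | sed -n 's/.*version "\([0-9]*\).*/\1/p')
```

The caller would then exit early, e.g. `should_skip_dependency_test "$major" && exit 0`, before the manifest comparison runs.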
[jira] (SPARK-36895) Add Create Index syntax support
[ https://issues.apache.org/jira/browse/SPARK-36895 ] Kousuke Saruta deleted comment on SPARK-36895: was (Author: sarutak): The change in https://github.com/apache/spark/pull/34148 was reverted and resolved again in https://github.com/apache/spark/pull/34523 > Add Create Index syntax support > --- > > Key: SPARK-36895 > URL: https://issues.apache.org/jira/browse/SPARK-36895 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.3.0 > >
[jira] [Commented] (SPARK-36895) Add Create Index syntax support
[ https://issues.apache.org/jira/browse/SPARK-36895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440696#comment-17440696 ] Kousuke Saruta commented on SPARK-36895: The change in https://github.com/apache/spark/pull/34148 was reverted and resolved again in https://github.com/apache/spark/pull/34523 > Add Create Index syntax support > --- > > Key: SPARK-36895 > URL: https://issues.apache.org/jira/browse/SPARK-36895 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.3.0 > >
[jira] [Resolved] (SPARK-37240) Cannot read partitioned parquet files with ANSI interval partition values
[ https://issues.apache.org/jira/browse/SPARK-37240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37240. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34517 > Cannot read partitioned parquet files with ANSI interval partition values > - > > Key: SPARK-37240 > URL: https://issues.apache.org/jira/browse/SPARK-37240 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Fix For: 3.3.0 > > > The code below demonstrates the issue: > {code:scala} > scala> sql("SELECT INTERVAL '1' YEAR AS i, 0 as > id").write.partitionBy("i").parquet("/Users/maximgekk/tmp/ansi_interval_parquet") > scala> spark.read.schema("i INTERVAL YEAR, id > INT").parquet("/Users/maximgekk/tmp/ansi_interval_parquet").show(false) > 21/11/08 10:56:36 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2) > java.lang.RuntimeException: DataType INTERVAL YEAR is not supported in column > vectorized reader. > at > org.apache.spark.sql.execution.vectorized.ColumnVectorUtils.populate(ColumnVectorUtils.java:100) > at > org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:243) > {code}
[jira] [Reopened] (SPARK-36038) Basic speculation metrics at stage level
[ https://issues.apache.org/jira/browse/SPARK-36038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta reopened SPARK-36038: Assignee: (was: Venkata krishnan Sowrirajan) The change was reverted. https://github.com/apache/spark/pull/34518 So I re-open this. > Basic speculation metrics at stage level > > > Key: SPARK-36038 > URL: https://issues.apache.org/jira/browse/SPARK-36038 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.1.2 >Reporter: Venkata krishnan Sowrirajan >Priority: Major > Fix For: 3.3.0 > > > Currently there are no speculation metrics available either at application > level or at stage level. Within our platform, we have added speculation > metrics at stage level as a summary, similar to the stage-level metrics, > tracking numTotalSpeculated, numCompleted (successful), numFailed, numKilled > etc. This enables us to effectively understand the speculative execution feature > at an application level and helps in further tuning the speculation configs. > cc [~ron8hu]
[jira] [Resolved] (SPARK-37158) Add doc about spark not supported hive built-in function
[ https://issues.apache.org/jira/browse/SPARK-37158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37158. Resolution: Won't Fix See the discussion. https://github.com/apache/spark/pull/34434#issuecomment-954545315 > Add doc about spark not supported hive built-in function > > > Key: SPARK-37158 > URL: https://issues.apache.org/jira/browse/SPARK-37158 > Project: Spark > Issue Type: Improvement > Components: docs >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > > Add doc about spark not supported hive built-in function
[jira] [Resolved] (SPARK-37238) Upgrade ORC to 1.6.12
[ https://issues.apache.org/jira/browse/SPARK-37238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37238. Fix Version/s: 3.2.1 Assignee: Dongjoon Hyun Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34512 > Upgrade ORC to 1.6.12 > - > > Key: SPARK-37238 > URL: https://issues.apache.org/jira/browse/SPARK-37238 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.2.1 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.2.1 > >
[jira] [Resolved] (SPARK-37211) More descriptions and adding an image to the failure message about enabling GitHub Actions
[ https://issues.apache.org/jira/browse/SPARK-37211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37211. Fix Version/s: 3.3.0 Assignee: Yuto Akutsu Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34487 > More descriptions and adding an image to the failure message about enabling > GitHub Actions > -- > > Key: SPARK-37211 > URL: https://issues.apache.org/jira/browse/SPARK-37211 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 3.3.0 >Reporter: Yuto Akutsu >Assignee: Yuto Akutsu >Priority: Minor > Fix For: 3.3.0 > > > I've seen and experienced that the build-and-test workflow of first-time PRs > fails because developers forget to enable GitHub Actions on > their own repositories. > I think developers will be able to notice the cause quicker by adding more > descriptions and an image to the test-failure message.
[jira] [Resolved] (SPARK-37231) Dynamic writes/reads of ANSI interval partitions
[ https://issues.apache.org/jira/browse/SPARK-37231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37231. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34506 > Dynamic writes/reads of ANSI interval partitions > > > Key: SPARK-37231 > URL: https://issues.apache.org/jira/browse/SPARK-37231 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Fix For: 3.3.0 > > > Check dynamic partition writes of ANSI intervals and fix them if needed. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35496) Upgrade Scala 2.13 to 2.13.7
[ https://issues.apache.org/jira/browse/SPARK-35496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438986#comment-17438986 ] Kousuke Saruta commented on SPARK-35496: [~dongjoon] Thank you for letting me know. That's great. > Upgrade Scala 2.13 to 2.13.7 > > > Key: SPARK-35496 > URL: https://issues.apache.org/jira/browse/SPARK-35496 > Project: Spark > Issue Type: Task > Components: Build >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Minor > > This issue aims to upgrade to Scala 2.13.7. > Scala 2.13.6 was released (https://github.com/scala/scala/releases/tag/v2.13.6). > However, we skip 2.13.6 because it introduces a breaking behavior change > that differs from both Scala 2.13.5 and Scala 3. > - https://github.com/scala/bug/issues/12403 > {code} > scala3-3.0.0:$ bin/scala > scala> Array.empty[Double].intersect(Array(0.0)) > val res0: Array[Double] = Array() > scala-2.13.6:$ bin/scala > Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 1.8.0_292). > Type in expressions for evaluation. Or try :help. > scala> Array.empty[Double].intersect(Array(0.0)) > java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [D > ... 32 elided > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35496) Upgrade Scala 2.13 to 2.13.7
[ https://issues.apache.org/jira/browse/SPARK-35496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438535#comment-17438535 ] Kousuke Saruta commented on SPARK-35496: [~LuciferYang] Scala 2.13.7 was released a few days ago. https://github.com/scala/scala/releases/tag/v2.13.7 Would you like to continue to work on this? > Upgrade Scala 2.13 to 2.13.7 > > > Key: SPARK-35496 > URL: https://issues.apache.org/jira/browse/SPARK-35496 > Project: Spark > Issue Type: Task > Components: Build >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Minor > > This issue aims to upgrade to Scala 2.13.7. > Scala 2.13.6 was released (https://github.com/scala/scala/releases/tag/v2.13.6). > However, we skip 2.13.6 because it introduces a breaking behavior change > that differs from both Scala 2.13.5 and Scala 3. > - https://github.com/scala/bug/issues/12403 > {code} > scala3-3.0.0:$ bin/scala > scala> Array.empty[Double].intersect(Array(0.0)) > val res0: Array[Double] = Array() > scala-2.13.6:$ bin/scala > Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 1.8.0_292). > Type in expressions for evaluation. Or try :help. > scala> Array.empty[Double].intersect(Array(0.0)) > java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [D > ... 32 elided > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37206) Upgrade Avro to 1.11.0
Kousuke Saruta created SPARK-37206: -- Summary: Upgrade Avro to 1.11.0 Key: SPARK-37206 URL: https://issues.apache.org/jira/browse/SPARK-37206 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Recently, Avro 1.11.0 was released, which includes a bunch of bug fixes. https://issues.apache.org/jira/issues/?jql=project%3DAVRO%20AND%20fixVersion%3D1.11.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37108) Expose make_date expression in R
[ https://issues.apache.org/jira/browse/SPARK-37108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37108. Fix Version/s: 3.3.0 Assignee: Leona Yoda Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34480 > Expose make_date expression in R > > > Key: SPARK-37108 > URL: https://issues.apache.org/jira/browse/SPARK-37108 > Project: Spark > Issue Type: Improvement > Components: R >Affects Versions: 3.3.0 >Reporter: Leona Yoda >Assignee: Leona Yoda >Priority: Minor > Fix For: 3.3.0 > > > Expose make_date API on SparkR. > > (cf. https://issues.apache.org/jira/browse/SPARK-36554) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
[ https://issues.apache.org/jira/browse/SPARK-37159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-37159. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34425 > Change HiveExternalCatalogVersionsSuite to be able to test with Java 17 > --- > > Key: SPARK-37159 > URL: https://issues.apache.org/jira/browse/SPARK-37159 > Project: Spark > Issue Type: Bug > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Minor > Fix For: 3.3.0 > > > SPARK-37105 seems to have fixed most of tests in `sql/hive` for Java 17 but > `HiveExternalCatalogVersionsSuite`. > {code} > [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED > *** (42 seconds, 526 milliseconds) > [info] spark-submit returned with exit code 1. > [info] Command line: > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit' > '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' > 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' > 'spark.sql.hive.metastore.version=2.3' '--conf' > 'spark.sql.hive.metastore.jars=maven' '--conf' > 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' > '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' > > '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py' > [info] > [info] 2021-10-28 06:07:18.486 - stderr> Using Spark's default 
log4j > profile: org/apache/spark/log4j-defaults.properties > [info] 2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Running Spark version 3.2.0 > [info] 2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN > NativeCodeLoader: Unable to load native-hadoop library for your platform... > using builtin-java classes where applicable > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: No custom resources configured for spark.driver. > [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO > ResourceUtils: == > [info] 2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO > SparkContext: Submitted application: prepare testing tables > [info] 2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Default ResourceProfile created, executor resources: > Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: > memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: > 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfile: Limiting resource is cpu > [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO > ResourceProfileManager: Added ResourceProfile id: 0 > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls to: kou > [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls to: kou > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing view acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: Changing modify acls groups to: > [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO > SecurityManager: SecurityManager: authentication disabled; ui acls disabled; > users 
with view permissions: Set(kou); groups with view permissions: Set(); > users with modify permissions: Set(kou); groups with modify permissions: > Set() > [info] 2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: > Successfully started service 'sparkDriver' on port 35867. > [info] 2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering MapOutputTracker > [info] 2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: > Registering BlockManagerMaster > [info] 2021-10-28 06:07:18.943 - stderr> 21/10/28
[jira] [Resolved] (SPARK-36554) Error message while trying to use spark sql functions directly on dataframe columns without using select expression
[ https://issues.apache.org/jira/browse/SPARK-36554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-36554. Fix Version/s: 3.3.0 Assignee: Nicolas Azrak Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34356 > Error message while trying to use spark sql functions directly on dataframe > columns without using select expression > --- > > Key: SPARK-36554 > URL: https://issues.apache.org/jira/browse/SPARK-36554 > Project: Spark > Issue Type: Bug > Components: Documentation, Examples, PySpark >Affects Versions: 3.1.1 >Reporter: Lekshmi Ramachandran >Assignee: Nicolas Azrak >Priority: Minor > Labels: documentation, features, functions, spark-sql > Fix For: 3.3.0 > > Attachments: Screen Shot .png > > Original Estimate: 24h > Remaining Estimate: 24h > > The below code generates a dataframe successfully. Here the make_date > function is used inside a select expression: > > from pyspark.sql.functions import expr > df = spark.createDataFrame([(2020, 6, 26), (1000, 2, 29), (-44, 1, 1)],['Y', > 'M', 'D']) > df.select("*",expr("make_date(Y,M,D) as lk")).show() > > The below code fails with the message "cannot import name 'make_date' from > 'pyspark.sql.functions'". Here the make_date function is called directly on > dataframe columns without a select expression: > > from pyspark.sql.functions import make_date > df = spark.createDataFrame([(2020, 6, 26), (1000, 2, 29), (-44, 1, 1)],['Y', > 'M', 'D']) > df.select(make_date(df.Y,df.M,df.D).alias("datefield")).show() > > The error message generated is misleading when it says "cannot import > make_date from pyspark.sql.functions" > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
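The `make_date(Y, M, D)` expression used in the report above builds a date column from year, month, and day columns. Its behavior can be sketched in plain Python (an illustration of the semantics, not Spark's implementation; Spark 3.x uses the proleptic Gregorian calendar and, with ANSI mode off, returns NULL for invalid date combinations, which the sketch models as None):

```python
from datetime import date

def make_date(year, month, day):
    """Plain-Python sketch of Spark SQL's make_date(year, month, day):
    returns a date, or None (Spark: NULL) when the combination is invalid."""
    try:
        # Python's datetime.date is proleptic Gregorian, like Spark 3.x.
        return date(year, month, day)
    except ValueError:
        return None

print(make_date(2020, 6, 26))  # 2020-06-26
print(make_date(1000, 2, 29))  # None: year 1000 is not a leap year in the proleptic Gregorian calendar
```

Note that Python's `date` only supports years 1-9999, so the `-44` row from the report's example is out of range for this sketch, while Spark itself accepts negative years.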
[jira] [Updated] (SPARK-37170) Pin PySpark version installed in the Binder environment for tagged commit
[ https://issues.apache.org/jira/browse/SPARK-37170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37170: --- Summary: Pin PySpark version installed in the Binder environment for tagged commit (was: Pin PySpark version for Binder) > Pin PySpark version installed in the Binder environment for tagged commit > - > > Key: SPARK-37170 > URL: https://issues.apache.org/jira/browse/SPARK-37170 > Project: Spark > Issue Type: Bug > Components: docs, PySpark >Affects Versions: 3.2.0 >Reporter: Kousuke Saruta >Assignee: Apache Spark >Priority: Major > > I noticed that PySpark 3.1.2 is installed in the live notebook > environment even though the notebook is for PySpark 3.2.0. > http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html > I guess someone accessed Binder and built the container image with v3.2.0 > before we published the pyspark package to PyPI. > https://mybinder.org/ > I think it's difficult to rebuild the image manually. > To avoid such an accident, I propose pinning the version of PySpark in > binder/postBuild > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37170) Pin PySpark version for Binder
[ https://issues.apache.org/jira/browse/SPARK-37170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37170: --- Description: I noticed that PySpark 3.1.2 is installed in the live notebook environment even though the notebook is for PySpark 3.2.0. http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html I guess someone accessed Binder and built the container image with v3.2.0 before we published the pyspark package to PyPI. https://mybinder.org/ I think it's difficult to rebuild the image manually. To avoid such an accident, I propose pinning the version of PySpark in binder/postBuild was: I noticed that PySpark 3.1.2 is installed in the live notebook environment even though the notebook is for PySpark 3.2. http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html I guess someone accessed Binder and built the container image with v3.2.0 before we published the pyspark package to PyPI. https://mybinder.org/ I think it's difficult to rebuild the image manually. To avoid such an accident, I propose pinning the version of PySpark in binder/postBuild > Pin PySpark version for Binder > -- > > Key: SPARK-37170 > URL: https://issues.apache.org/jira/browse/SPARK-37170 > Project: Spark > Issue Type: Bug > Components: docs, PySpark >Affects Versions: 3.2.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > I noticed that PySpark 3.1.2 is installed in the live notebook > environment even though the notebook is for PySpark 3.2.0. > http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html > I guess someone accessed Binder and built the container image with v3.2.0 > before we published the pyspark package to PyPI. > https://mybinder.org/ > I think it's difficult to rebuild the image manually. 
> To avoid such an accident, I propose pinning the version of PySpark in > binder/postBuild > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
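The fix proposed above amounts to pinning the package version in binder/postBuild instead of installing whatever release is latest on PyPI. A hypothetical fragment of such a file might look like this (the exact version string is an assumption; it would track the docs release the Binder image is built for):

```shell
#!/bin/bash
# binder/postBuild (hypothetical sketch): pin PySpark to the version the
# documentation was built for, so a Binder image built before the PyPI
# release cannot silently fall back to an older package.
pip install pyspark==3.2.0
```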
[jira] [Updated] (SPARK-37170) Pin PySpark version for Binder
[ https://issues.apache.org/jira/browse/SPARK-37170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-37170: --- Description: I noticed that PySpark 3.1.2 is installed in the live notebook environment even though the notebook is for PySpark 3.2. http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html I guess someone accessed Binder and built the container image with v3.2.0 before we published the pyspark package to PyPI. https://mybinder.org/ I think it's difficult to rebuild the image manually. To avoid such an accident, I propose pinning the version of PySpark in binder/postBuild was: I noticed that PySpark 3.1.2 is installed in the environment of live notebook even though the notebook is for PySpark 3.2. http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html I guess someone accessed Binder and built the container image with v3.2.0 before we published the pyspark package to PyPI. https://mybinder.org/ I think it's difficult to rebuild the image manually. To avoid such an accident, I propose pinning the version of PySpark in binder/postBuild > Pin PySpark version for Binder > -- > > Key: SPARK-37170 > URL: https://issues.apache.org/jira/browse/SPARK-37170 > Project: Spark > Issue Type: Bug > Components: docs, PySpark >Affects Versions: 3.2.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta >Priority: Major > > I noticed that PySpark 3.1.2 is installed in the live notebook > environment even though the notebook is for PySpark 3.2. > http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html > I guess someone accessed Binder and built the container image with v3.2.0 > before we published the pyspark package to PyPI. > https://mybinder.org/ > I think it's difficult to rebuild the image manually. 
> To avoid such an accident, I propose pinning the version of PySpark in > binder/postBuild > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37170) Pin PySpark version for Binder
Kousuke Saruta created SPARK-37170: -- Summary: Pin PySpark version for Binder Key: SPARK-37170 URL: https://issues.apache.org/jira/browse/SPARK-37170 Project: Spark Issue Type: Bug Components: docs, PySpark Affects Versions: 3.2.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta I noticed that PySpark 3.1.2 is installed in the environment of the live notebook even though the notebook is for PySpark 3.2. http://spark.apache.org/docs/3.2.0/api/python/getting_started/index.html I guess someone accessed Binder and built the container image with v3.2.0 before we published the pyspark package to PyPI. https://mybinder.org/ I think it's difficult to rebuild the image manually. To avoid such an accident, I propose pinning the version of PySpark in binder/postBuild -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37159) Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
Kousuke Saruta created SPARK-37159: -- Summary: Change HiveExternalCatalogVersionsSuite to be able to test with Java 17 Key: SPARK-37159 URL: https://issues.apache.org/jira/browse/SPARK-37159 Project: Spark Issue Type: Bug Components: SQL, Tests Affects Versions: 3.3.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta SPARK-37105 seems to have fixed most of tests in `sql/hive` for Java 17 but `HiveExternalCatalogVersionsSuite`. {code} [info] org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED *** (42 seconds, 526 milliseconds) [info] spark-submit returned with exit code 1. [info] Command line: '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test-spark-d86af275-0c40-4b47-9cab-defa92a5ffa7/spark-3.2.0/bin/spark-submit' '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 'spark.sql.hive.metastore.version=2.3' '--conf' 'spark.sql.hive.metastore.jars=maven' '--conf' 'spark.sql.warehouse.dir=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' '-Dderby.system.home=/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/warehouse-69d9bdbc-54ce-443b-8677-a413663ddb62' '/home/kou/work/oss/spark-java17/sql/hive/target/tmp/org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite/test15166225869206697603.py' [info] [info] 2021-10-28 06:07:18.486 - stderr> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties [info] 2021-10-28 06:07:18.49 - stderr> 21/10/28 22:07:18 INFO SparkContext: Running Spark version 3.2.0 [info] 2021-10-28 06:07:18.537 - stderr> 21/10/28 22:07:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... 
using builtin-java classes where applicable [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO ResourceUtils: == [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO ResourceUtils: No custom resources configured for spark.driver. [info] 2021-10-28 06:07:18.616 - stderr> 21/10/28 22:07:18 INFO ResourceUtils: == [info] 2021-10-28 06:07:18.617 - stderr> 21/10/28 22:07:18 INFO SparkContext: Submitted application: prepare testing tables [info] 2021-10-28 06:07:18.632 - stderr> 21/10/28 22:07:18 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0) [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO ResourceProfile: Limiting resource is cpu [info] 2021-10-28 06:07:18.641 - stderr> 21/10/28 22:07:18 INFO ResourceProfileManager: Added ResourceProfile id: 0 [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO SecurityManager: Changing view acls to: kou [info] 2021-10-28 06:07:18.679 - stderr> 21/10/28 22:07:18 INFO SecurityManager: Changing modify acls to: kou [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO SecurityManager: Changing view acls groups to: [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO SecurityManager: Changing modify acls groups to: [info] 2021-10-28 06:07:18.68 - stderr> 21/10/28 22:07:18 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(kou); groups with view permissions: Set(); users with modify permissions: Set(kou); groups with modify permissions: Set() [info] 2021-10-28 06:07:18.886 - stderr> 21/10/28 22:07:18 INFO Utils: Successfully started service 'sparkDriver' on port 35867. 
[info] 2021-10-28 06:07:18.906 - stderr> 21/10/28 22:07:18 INFO SparkEnv: Registering MapOutputTracker [info] 2021-10-28 06:07:18.93 - stderr> 21/10/28 22:07:18 INFO SparkEnv: Registering BlockManagerMaster [info] 2021-10-28 06:07:18.943 - stderr> 21/10/28 22:07:18 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information [info] 2021-10-28 06:07:18.944 - stderr> 21/10/28 22:07:18 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up [info] 2021-10-28 06:07:18.945 - stdout> Traceback (most recent call last): [info] 2021-10-28 06:07:18.946 - stdout> File