[jira] [Resolved] (SPARK-46427) Change Python Data Source's description to be pretty in explain
[ https://issues.apache.org/jira/browse/SPARK-46427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46427. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44379 [https://github.com/apache/spark/pull/44379] > Change Python Data Source's description to be pretty in explain > --- > > Key: SPARK-46427 > URL: https://issues.apache.org/jira/browse/SPARK-46427 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Now it's as below: > {code} > == Physical Plan == > *(1) Project [x#0, y#1] > +- BatchScan test[x#0, y#1] class > org.apache.spark.sql.execution.python.PythonTableProvider$$anon$1$$anon$2 > RuntimeFilters: [] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
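The plan above prints the raw Scala anonymous class name in explain output. As a purely hypothetical illustration of the kind of cleanup this issue asks for (the helper below is invented here for illustration; it is not Spark's actual fix), one could derive a short display name from such a class name like so:

```python
def pretty_source_name(cls_name: str) -> str:
    """Hypothetical sketch: turn a JVM class name like
    'org.apache.spark.sql.execution.python.PythonTableProvider$$anon$1$$anon$2'
    into a short, human-readable label for explain output."""
    # Drop the package prefix, then drop Scala anonymous-class suffixes.
    simple = cls_name.rsplit(".", 1)[-1]
    return simple.split("$$anon")[0].rstrip("$")


name = "org.apache.spark.sql.execution.python.PythonTableProvider$$anon$1$$anon$2"
print(pretty_source_name(name))  # PythonTableProvider
```

A BatchScan line rendered with such a label would read `+- BatchScan test[x#0, y#1] PythonTableProvider` instead of exposing the `$$anon` internals.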
[jira] [Assigned] (SPARK-46427) Change Python Data Source's description to be pretty in explain
[ https://issues.apache.org/jira/browse/SPARK-46427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46427: Assignee: Hyukjin Kwon > Change Python Data Source's description to be pretty in explain > --- > > Key: SPARK-46427 > URL: https://issues.apache.org/jira/browse/SPARK-46427 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > Now it's as below: > {code} > == Physical Plan == > *(1) Project [x#0, y#1] > +- BatchScan test[x#0, y#1] class > org.apache.spark.sql.execution.python.PythonTableProvider$$anon$1$$anon$2 > RuntimeFilters: [] > {code}
[jira] [Assigned] (SPARK-46403) Decode parquet binary with getBytesUnsafe method
[ https://issues.apache.org/jira/browse/SPARK-46403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaan Geng reassigned SPARK-46403: -- Assignee: Wan Kun > Decode parquet binary with getBytesUnsafe method > > > Key: SPARK-46403 > URL: https://issues.apache.org/jira/browse/SPARK-46403 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Wan Kun >Assignee: Wan Kun >Priority: Major > Labels: pull-request-available > Attachments: image-2023-12-14-16-30-39-104.png > > > Now Spark gets a Parquet binary object with the getBytes() method. > The *Binary.getBytes()* method always makes a new copy of the internal > bytes. > We can use the *Binary.getBytesUnsafe()* method to reuse the cached bytes if > getBytes() has already been called and the bytes are cached. > !image-2023-12-14-16-30-39-104.png!
[jira] [Resolved] (SPARK-46403) Decode parquet binary with getBytesUnsafe method
[ https://issues.apache.org/jira/browse/SPARK-46403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiaan Geng resolved SPARK-46403. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44351 [https://github.com/apache/spark/pull/44351] > Decode parquet binary with getBytesUnsafe method > > > Key: SPARK-46403 > URL: https://issues.apache.org/jira/browse/SPARK-46403 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Wan Kun >Assignee: Wan Kun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: image-2023-12-14-16-30-39-104.png > > > Now Spark gets a Parquet binary object with the getBytes() method. > The *Binary.getBytes()* method always makes a new copy of the internal > bytes. > We can use the *Binary.getBytesUnsafe()* method to reuse the cached bytes if > getBytes() has already been called and the bytes are cached. > !image-2023-12-14-16-30-39-104.png!
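The copy-vs-cache distinction behind SPARK-46403 can be sketched in Python. This is an illustrative analogue only: the real API is parquet-mr's Java `Binary` class with `getBytes()` and `getBytesUnsafe()`, and the class below is a hypothetical stand-in for it.

```python
class Binary:
    """Illustrative analogue of parquet-mr's Binary (hypothetical class).

    get_bytes() always materializes a fresh immutable copy (and caches it);
    get_bytes_unsafe() reuses the cached copy when one exists, avoiding a
    second allocation -- the optimization the issue description relies on.
    """

    def __init__(self, backing: bytearray):
        self._backing = backing
        self._cached = None  # populated by the first get_bytes() call

    def get_bytes(self) -> bytes:
        # Always makes (and caches) a new copy of the internal bytes.
        self._cached = bytes(self._backing)
        return self._cached

    def get_bytes_unsafe(self) -> bytes:
        # Reuse the cached copy if get_bytes() already ran; otherwise copy.
        return self._cached if self._cached is not None else self.get_bytes()


b = Binary(bytearray(b"parquet"))
first = b.get_bytes()
second = b.get_bytes_unsafe()
# The "unsafe" call returns the very same object instead of a new copy.
print(first is second)  # True
```

The "unsafe" in the real method's name signals that the caller must not mutate the returned array, since it may be shared with the decoder's internal state.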
[jira] [Updated] (SPARK-46427) Change Python Data Source's description to be pretty in explain
[ https://issues.apache.org/jira/browse/SPARK-46427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46427: --- Labels: pull-request-available (was: ) > Change Python Data Source's description to be pretty in explain > --- > > Key: SPARK-46427 > URL: https://issues.apache.org/jira/browse/SPARK-46427 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > Now it's as below: > {code} > == Physical Plan == > *(1) Project [x#0, y#1] > +- BatchScan test[x#0, y#1] class > org.apache.spark.sql.execution.python.PythonTableProvider$$anon$1$$anon$2 > RuntimeFilters: [] > {code}
[jira] [Created] (SPARK-46427) Change Python Data Source's description to be pretty in explain
Hyukjin Kwon created SPARK-46427: Summary: Change Python Data Source's description to be pretty in explain Key: SPARK-46427 URL: https://issues.apache.org/jira/browse/SPARK-46427 Project: Spark Issue Type: Improvement Components: PySpark, SQL Affects Versions: 4.0.0 Reporter: Hyukjin Kwon Now it's as below: {code} == Physical Plan == *(1) Project [x#0, y#1] +- BatchScan test[x#0, y#1] class org.apache.spark.sql.execution.python.PythonTableProvider$$anon$1$$anon$2 RuntimeFilters: [] {code}
[jira] [Resolved] (SPARK-46425) Pin the bundler version in CI
[ https://issues.apache.org/jira/browse/SPARK-46425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46425. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44376 [https://github.com/apache/spark/pull/44376] > Pin the bundler version in CI > - > > Key: SPARK-46425 > URL: https://issues.apache.org/jira/browse/SPARK-46425 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > https://github.com/apache/spark/actions/runs/7226413850/job/19691970695 > {code} > Requirement already satisfied: docutils<0.18.0 in > /usr/local/lib/python3.9/dist-packages (0.17.1) > WARNING: Running pip as the 'root' user can result in broken permissions and > conflicting behaviour with the system package manager. It is recommended to > use a virtual environment instead: https://pip.pypa.io/warnings/venv > ERROR: Error installing bundler: > The last version of bundler (>= 0) to support your Ruby & RubyGems was > 2.4.22. Try installing it with `gem install bundler -v 2.4.22` > bundler requires Ruby version >= 3.0.0. The current ruby version is > 2.7.0.0. > {code}
[jira] [Assigned] (SPARK-46425) Pin the bundler version in CI
[ https://issues.apache.org/jira/browse/SPARK-46425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46425: Assignee: Hyukjin Kwon > Pin the bundler version in CI > - > > Key: SPARK-46425 > URL: https://issues.apache.org/jira/browse/SPARK-46425 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > https://github.com/apache/spark/actions/runs/7226413850/job/19691970695 > {code} > Requirement already satisfied: docutils<0.18.0 in > /usr/local/lib/python3.9/dist-packages (0.17.1) > WARNING: Running pip as the 'root' user can result in broken permissions and > conflicting behaviour with the system package manager. It is recommended to > use a virtual environment instead: https://pip.pypa.io/warnings/venv > ERROR: Error installing bundler: > The last version of bundler (>= 0) to support your Ruby & RubyGems was > 2.4.22. Try installing it with `gem install bundler -v 2.4.22` > bundler requires Ruby version >= 3.0.0. The current ruby version is > 2.7.0.0. > {code}
[jira] [Resolved] (SPARK-46414) Use prependBaseUri to render javascript imports
[ https://issues.apache.org/jira/browse/SPARK-46414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46414. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44363 [https://github.com/apache/spark/pull/44363] > Use prependBaseUri to render javascript imports > --- > > Key: SPARK-46414 > URL: https://issues.apache.org/jira/browse/SPARK-46414 > Project: Spark > Issue Type: Sub-task > Components: UI >Affects Versions: 4.0.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > >
[jira] [Resolved] (SPARK-46422) Move `test_window` to `pyspark.pandas.tests.window.*`
[ https://issues.apache.org/jira/browse/SPARK-46422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-46422. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44371 [https://github.com/apache/spark/pull/44371] > Move `test_window` to `pyspark.pandas.tests.window.*` > - > > Key: SPARK-46422 > URL: https://issues.apache.org/jira/browse/SPARK-46422 > Project: Spark > Issue Type: Test > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > >
[jira] [Updated] (SPARK-42829) Add Identifier to the cached RDD node on the Stages page
[ https://issues.apache.org/jira/browse/SPARK-42829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-42829: --- Labels: pull-request-available (was: ) > Add Identifier to the cached RDD node on the Stages page > - > > Key: SPARK-42829 > URL: https://issues.apache.org/jira/browse/SPARK-42829 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.3.2 >Reporter: Yian Liou >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2023-03-20 at 3.55.40 PM.png > > > On the Stages page in the Web UI, there is no distinction for which cached > RDD is being executed in a particular stage. This Jira aims to add a repeat > identifier to distinguish which cached RDD is being executed.
[jira] [Deleted] (SPARK-46426) Uses sum metrics for number of output length
[ https://issues.apache.org/jira/browse/SPARK-46426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon deleted SPARK-46426: - > Uses sum metrics for number of output length > > > Key: SPARK-46426 > URL: https://issues.apache.org/jira/browse/SPARK-46426 > Project: Spark > Issue Type: Improvement >Reporter: Hyukjin Kwon >Priority: Major > > Screenshot attached. It shouldn't look like bytes.
[jira] [Updated] (SPARK-46426) Uses sum metrics for number of output length
[ https://issues.apache.org/jira/browse/SPARK-46426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46426: - Attachment: Screenshot 2023-12-15 at 3.23.09 PM.png > Uses sum metrics for number of output length > > > Key: SPARK-46426 > URL: https://issues.apache.org/jira/browse/SPARK-46426 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Attachments: Screenshot 2023-12-15 at 3.23.09 PM.png > >
[jira] [Updated] (SPARK-46426) Uses sum metrics for number of output length
[ https://issues.apache.org/jira/browse/SPARK-46426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46426: - Description: Screenshot attached. It shouldn't look like bytes. > Uses sum metrics for number of output length > > > Key: SPARK-46426 > URL: https://issues.apache.org/jira/browse/SPARK-46426 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Attachments: Screenshot 2023-12-15 at 3.23.09 PM.png > > > Screenshot attached. It shouldn't look like bytes.
[jira] [Created] (SPARK-46426) Uses sum metrics for number of output length
Hyukjin Kwon created SPARK-46426: Summary: Uses sum metrics for number of output length Key: SPARK-46426 URL: https://issues.apache.org/jira/browse/SPARK-46426 Project: Spark Issue Type: Improvement Components: PySpark, SQL Affects Versions: 4.0.0 Reporter: Hyukjin Kwon Attachments: Screenshot 2023-12-15 at 3.23.09 PM.png
[jira] [Assigned] (SPARK-46423) Refactor Python Data Source instance loading
[ https://issues.apache.org/jira/browse/SPARK-46423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46423: Assignee: Hyukjin Kwon > Refactor Python Data Source instance loading > > > Key: SPARK-46423 > URL: https://issues.apache.org/jira/browse/SPARK-46423 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > We should create the instance in lookupDataSourceV2 instead.
[jira] [Resolved] (SPARK-46423) Refactor Python Data Source instance loading
[ https://issues.apache.org/jira/browse/SPARK-46423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46423. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44374 [https://github.com/apache/spark/pull/44374] > Refactor Python Data Source instance loading > > > Key: SPARK-46423 > URL: https://issues.apache.org/jira/browse/SPARK-46423 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > We should create the instance in lookupDataSourceV2 instead.
[jira] [Updated] (SPARK-46425) Pin the bundler version in CI
[ https://issues.apache.org/jira/browse/SPARK-46425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46425: --- Labels: pull-request-available (was: ) > Pin the bundler version in CI > - > > Key: SPARK-46425 > URL: https://issues.apache.org/jira/browse/SPARK-46425 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > https://github.com/apache/spark/actions/runs/7226413850/job/19691970695 > {code} > Requirement already satisfied: docutils<0.18.0 in > /usr/local/lib/python3.9/dist-packages (0.17.1) > WARNING: Running pip as the 'root' user can result in broken permissions and > conflicting behaviour with the system package manager. It is recommended to > use a virtual environment instead: https://pip.pypa.io/warnings/venv > ERROR: Error installing bundler: > The last version of bundler (>= 0) to support your Ruby & RubyGems was > 2.4.22. Try installing it with `gem install bundler -v 2.4.22` > bundler requires Ruby version >= 3.0.0. The current ruby version is > 2.7.0.0. > {code}
[jira] [Created] (SPARK-46425) Pin the bundler version in CI
Hyukjin Kwon created SPARK-46425: Summary: Pin the bundler version in CI Key: SPARK-46425 URL: https://issues.apache.org/jira/browse/SPARK-46425 Project: Spark Issue Type: Improvement Components: Project Infra Affects Versions: 4.0.0 Reporter: Hyukjin Kwon https://github.com/apache/spark/actions/runs/7226413850/job/19691970695 {code} Requirement already satisfied: docutils<0.18.0 in /usr/local/lib/python3.9/dist-packages (0.17.1) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv ERROR: Error installing bundler: The last version of bundler (>= 0) to support your Ruby & RubyGems was 2.4.22. Try installing it with `gem install bundler -v 2.4.22` bundler requires Ruby version >= 3.0.0. The current ruby version is 2.7.0.0. {code}
[jira] [Updated] (SPARK-46424) Support PythonSQLMetrics.pythonMetrics
[ https://issues.apache.org/jira/browse/SPARK-46424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46424: --- Labels: pull-request-available (was: ) > Support PythonSQLMetrics.pythonMetrics > -- > > Key: SPARK-46424 > URL: https://issues.apache.org/jira/browse/SPARK-46424 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Minor > Labels: pull-request-available > > We should show the stats such as `pythonDataSent`, `pythonDataReceived` and > `pythonNumRowsReceived`.
[jira] [Resolved] (SPARK-45807) DataSourceV2: Improve ViewCatalog API
[ https://issues.apache.org/jira/browse/SPARK-45807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-45807. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44330 [https://github.com/apache/spark/pull/44330] > DataSourceV2: Improve ViewCatalog API > - > > Key: SPARK-45807 > URL: https://issues.apache.org/jira/browse/SPARK-45807 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.5.0 >Reporter: Eduard Tudenhoefner >Assignee: Eduard Tudenhoefner >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > The goal is to add createOrReplaceView(..) and replaceView(..) methods to the > ViewCatalog API
[jira] [Updated] (SPARK-46424) Support PythonSQLMetrics.pythonMetrics
[ https://issues.apache.org/jira/browse/SPARK-46424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46424: - Summary: Support PythonSQLMetrics.pythonMetrics (was: Support PythonSQLMetrics.pythonMetrics via custom metrics API in DSv2) > Support PythonSQLMetrics.pythonMetrics > -- > > Key: SPARK-46424 > URL: https://issues.apache.org/jira/browse/SPARK-46424 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Minor > > We should show the stats such as `pythonDataSent`, `pythonDataReceived` and > `pythonNumRowsReceived`.
[jira] [Updated] (SPARK-46424) Support PythonSQLMetrics.pythonMetrics via custom metrics API in DSv2
[ https://issues.apache.org/jira/browse/SPARK-46424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-46424: - Priority: Minor (was: Major) > Support PythonSQLMetrics.pythonMetrics via custom metrics API in DSv2 > - > > Key: SPARK-46424 > URL: https://issues.apache.org/jira/browse/SPARK-46424 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Minor > > We should show the stats such as `pythonDataSent`, `pythonDataReceived` and > `pythonNumRowsReceived`.
[jira] [Created] (SPARK-46424) Support PythonSQLMetrics.pythonMetrics via custom metrics API in DSv2
Hyukjin Kwon created SPARK-46424: Summary: Support PythonSQLMetrics.pythonMetrics via custom metrics API in DSv2 Key: SPARK-46424 URL: https://issues.apache.org/jira/browse/SPARK-46424 Project: Spark Issue Type: Sub-task Components: PySpark, SQL Affects Versions: 4.0.0 Reporter: Hyukjin Kwon We should show the stats such as `pythonDataSent`, `pythonDataReceived` and `pythonNumRowsReceived`.
[jira] [Updated] (SPARK-46423) Refactor Python Data Source instance loading
[ https://issues.apache.org/jira/browse/SPARK-46423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46423: --- Labels: pull-request-available (was: ) > Refactor Python Data Source instance loading > > > Key: SPARK-46423 > URL: https://issues.apache.org/jira/browse/SPARK-46423 > Project: Spark > Issue Type: Sub-task > Components: PySpark, SQL >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > We should create the instance in lookupDataSourceV2 instead.
[jira] [Assigned] (SPARK-46419) Reorganize `DatetimeIndexTests`: Factor out 3 slow tests
[ https://issues.apache.org/jira/browse/SPARK-46419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46419: Assignee: Ruifeng Zheng > Reorganize `DatetimeIndexTests`: Factor out 3 slow tests > > > Key: SPARK-46419 > URL: https://issues.apache.org/jira/browse/SPARK-46419 > Project: Spark > Issue Type: Test > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available >
[jira] [Resolved] (SPARK-46419) Reorganize `DatetimeIndexTests`: Factor out 3 slow tests
[ https://issues.apache.org/jira/browse/SPARK-46419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46419. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44369 [https://github.com/apache/spark/pull/44369] > Reorganize `DatetimeIndexTests`: Factor out 3 slow tests > > > Key: SPARK-46419 > URL: https://issues.apache.org/jira/browse/SPARK-46419 > Project: Spark > Issue Type: Test > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > >
[jira] [Created] (SPARK-46423) Refactor Python Data Source instance loading
Hyukjin Kwon created SPARK-46423: Summary: Refactor Python Data Source instance loading Key: SPARK-46423 URL: https://issues.apache.org/jira/browse/SPARK-46423 Project: Spark Issue Type: Sub-task Components: PySpark, SQL Affects Versions: 4.0.0 Reporter: Hyukjin Kwon We should create the instance in lookupDataSourceV2 instead.
[jira] [Assigned] (SPARK-45597) Support creating table using a Python data source in SQL
[ https://issues.apache.org/jira/browse/SPARK-45597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45597: Assignee: Hyukjin Kwon (was: Allison Wang) > Support creating table using a Python data source in SQL > > > Key: SPARK-45597 > URL: https://issues.apache.org/jira/browse/SPARK-45597 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Support creating table using a Python data source in SQL query: > For instance: > `CREATE TABLE tableName() USING OPTIONS > `
[jira] [Assigned] (SPARK-45597) Support creating table using a Python data source in SQL
[ https://issues.apache.org/jira/browse/SPARK-45597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-45597: Assignee: Allison Wang > Support creating table using a Python data source in SQL > > > Key: SPARK-45597 > URL: https://issues.apache.org/jira/browse/SPARK-45597 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > > Support creating table using a Python data source in SQL query: > For instance: > `CREATE TABLE tableName() USING OPTIONS > `
[jira] [Resolved] (SPARK-45597) Support creating table using a Python data source in SQL
[ https://issues.apache.org/jira/browse/SPARK-45597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-45597. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44305 [https://github.com/apache/spark/pull/44305] > Support creating table using a Python data source in SQL > > > Key: SPARK-45597 > URL: https://issues.apache.org/jira/browse/SPARK-45597 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Support creating table using a Python data source in SQL query: > For instance: > `CREATE TABLE tableName() USING OPTIONS > `
[jira] [Commented] (SPARK-32710) Add Hive Murmur3Hash expression
[ https://issues.apache.org/jira/browse/SPARK-32710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797261#comment-17797261 ] Eric Xiao commented on SPARK-32710: --- Hi [~chengsu], I am interested in working on enabling Hive bucketing in Spark, and I noticed there are a couple of tickets still open. May I take a stab at this ticket? It also does not seem too complicated. A couple of questions: * Where in Spark would one start to implement the `murmur3hash` algorithm? * Is the scope of this ticket just to implement the exact hashing logic found in the linked Hive code snippet? > Add Hive Murmur3Hash expression > --- > > Key: SPARK-32710 > URL: https://issues.apache.org/jira/browse/SPARK-32710 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.1.0 >Reporter: Cheng Su >Priority: Minor > > To allow Spark to write Hive 3 compatible bucketed tables, we need to use > the same hash function as Hive/Presto. Hive's murmur3hash differs in several ways from > Spark's murmur3hash (different default seed; different > logic for NULL, array, map, and struct; details in > [https://github.com/apache/hive/blob/ece58fff1b53ea451bfc524c4c15f63ee12eca00/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java#L813]). > So this introduces a Hive murmur3hash expression.
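The seed incompatibility this ticket is about can be illustrated with a minimal pure-Python MurmurHash3 (x86, 32-bit) sketch. The seed constants below are assumptions added for illustration (Spark's hash expression is commonly described as using seed 42, and Hive's Murmur3.DEFAULT_SEED as 104729); they are not taken from this thread, and this sketch omits Hive's special handling of NULL, array, map, and struct values:

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Standard MurmurHash3 x86 32-bit. Returns an unsigned 32-bit int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    n = len(data) & ~3
    for i in range(0, n, 4):  # body: 4-byte little-endian blocks
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    k = 0
    tail = data[n:]
    for i in reversed(range(len(tail))):  # tail: remaining 1-3 bytes
        k = (k << 8) | tail[i]
    if tail:
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    h ^= len(data)  # finalization (fmix) avalanche
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


# Assumed seeds (not confirmed by this thread): same algorithm, different
# seed, hence incompatible bucket assignments between the two engines.
SPARK_SEED, HIVE_SEED = 42, 104729
print(murmur3_32(b"hello", SPARK_SEED) == murmur3_32(b"hello", HIVE_SEED))
```

This is why Spark's existing murmur3hash expression cannot simply be reused for Hive-compatible bucketing: every byte sequence hashes to a different bucket under the other engine's seed and per-type rules.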
[jira] [Resolved] (SPARK-46409) Spark Connect Repl does not work with ClosureCleaner
[ https://issues.apache.org/jira/browse/SPARK-46409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46409. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44360 [https://github.com/apache/spark/pull/44360] > Spark Connect Repl does not work with ClosureCleaner > > > Key: SPARK-46409 > URL: https://issues.apache.org/jira/browse/SPARK-46409 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 4.0.0 >Reporter: Vsevolod Stepanov >Assignee: Vsevolod Stepanov >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > SPARK-45136 added ClosureCleaner support to SparkConnect client. > Unfortunately, this change breaks ConnectRepl launched by > `./connector/connect/bin/spark-connect-scala-client`. To reproduce the issue: > # Run `./connector/connect/bin/spark-connect-shell` > # Run `./connector/connect/bin/spark-connect-scala-client` > # In the REPL, execute this code: > ``` > @ def plus1(x: Int): Int = x + 1 > @ val plus1_udf = udf(plus1 _) > ``` > This will fail with the following error: > ``` > java.lang.reflect.InaccessibleObjectException: Unable to make private native > java.lang.reflect.Field[] java.lang.Class.getDeclaredFields0(boolean) > accessible: module java.base does not "opens java.lang" to unnamed module > @45099dd3 > > java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354) > > java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297) > java.lang.reflect.Method.checkCanSetAccessible(Method.java:199) > java.lang.reflect.Method.setAccessible(Method.java:193) > > org.apache.spark.util.ClosureCleaner$.getFinalModifiersFieldForJava17(ClosureCleaner.scala:577) > > org.apache.spark.util.ClosureCleaner$.setFieldAndIgnoreModifiers(ClosureCleaner.scala:560) > > org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$18(ClosureCleaner.scala:533) > >
org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$18$adapted(ClosureCleaner.scala:525) > scala.collection.ArrayOps$WithFilter.foreach(ArrayOps.scala:73) > > org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$16(ClosureCleaner.scala:525) > > org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$16$adapted(ClosureCleaner.scala:522) > scala.collection.IterableOnceOps.foreach(IterableOnce.scala:576) > scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:574) > scala.collection.AbstractIterable.foreach(Iterable.scala:933) > scala.collection.IterableOps$WithFilter.foreach(Iterable.scala:903) > > org.apache.spark.util.ClosureCleaner$.cleanupAmmoniteReplClosure(ClosureCleaner.scala:522) > org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:251) > > org.apache.spark.sql.expressions.SparkConnectClosureCleaner$.clean(UserDefinedFunction.scala:210) > > org.apache.spark.sql.expressions.ScalarUserDefinedFunction$.apply(UserDefinedFunction.scala:187) > > org.apache.spark.sql.expressions.ScalarUserDefinedFunction$.apply(UserDefinedFunction.scala:180) > org.apache.spark.sql.functions$.udf(functions.scala:7956) > ammonite.$sess.cmd1$Helper.(cmd1.sc:1) > ammonite.$sess.cmd1$.(cmd1.sc:7) > ``` > > This is because ClosureCleaner is heavily reliant on the reflection API and > is not compatible with Java 17. The rest of Spark bypasses this by adding > `--add-opens` JVM flags, see > https://issues.apache.org/jira/browse/SPARK-36796. We need to add these > options to the Spark Connect client launch script as well.
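The fix the report calls for — prepending the Java 17 `--add-opens` options to the client launch command — can be sketched as below. This is a minimal illustration, not Spark's actual script: the module list is a hypothetical subset of what Spark opens (the real list lives in the SPARK-36796 change), and the main class name is a placeholder.

```python
# Sketch: build the Java 17 "--add-opens" flags a launch script would need to
# prepend so reflection-based code like ClosureCleaner keeps working.
# The module list below is an illustrative subset, not Spark's exact list.
ADD_OPENS_MODULES = [
    "java.base/java.lang",
    "java.base/java.lang.reflect",
    "java.base/java.util",
]
jvm_flags = [f"--add-opens={m}=ALL-UNNAMED" for m in ADD_OPENS_MODULES]

# A launch script would splice these in ahead of the (placeholder) main class:
launch_cmd = ["java", *jvm_flags, "com.example.ConnectReplMain"]
print(" ".join(jvm_flags))
```

Without such flags, any `setAccessible` call on a `java.base` member fails on Java 17 with the `InaccessibleObjectException` shown in the trace above.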
[jira] [Assigned] (SPARK-46409) Spark Connect Repl does not work with ClosureCleaner
[ https://issues.apache.org/jira/browse/SPARK-46409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46409: Assignee: Vsevolod Stepanov > Spark Connect Repl does not work with ClosureCleaner > > > Key: SPARK-46409 > URL: https://issues.apache.org/jira/browse/SPARK-46409 > Project: Spark > Issue Type: Bug > Components: Connect >Affects Versions: 4.0.0 >Reporter: Vsevolod Stepanov >Assignee: Vsevolod Stepanov >Priority: Major > Labels: pull-request-available > > SPARK-45136 added ClosureCleaner support to SparkConnect client. > Unfortunately, this change breaks ConnectRepl launched by > `./connector/connect/bin/spark-connect-scala-client`. To reproduce the issue: > # Run `./connector/connect/bin/spark-connect-shell` > # Run `./connector/connect/bin/spark-connect-scala-client` > # In the REPL, execute this code: > ``` > @ def plus1(x: Int): Int = x + 1 > @ val plus1_udf = udf(plus1 _) > ``` > This will fail with the following error: > ``` > java.lang.reflect.InaccessibleObjectException: Unable to make private native > java.lang.reflect.Field[] java.lang.Class.getDeclaredFields0(boolean) > accessible: module java.base does not "opens java.lang" to unnamed module > @45099dd3 > > java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354) > > java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297) > java.lang.reflect.Method.checkCanSetAccessible(Method.java:199) > java.lang.reflect.Method.setAccessible(Method.java:193) > > org.apache.spark.util.ClosureCleaner$.getFinalModifiersFieldForJava17(ClosureCleaner.scala:577) > > org.apache.spark.util.ClosureCleaner$.setFieldAndIgnoreModifiers(ClosureCleaner.scala:560) > > org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$18(ClosureCleaner.scala:533) > > org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$18$adapted(ClosureCleaner.scala:525) > 
scala.collection.ArrayOps$WithFilter.foreach(ArrayOps.scala:73) > > org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$16(ClosureCleaner.scala:525) > > org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$16$adapted(ClosureCleaner.scala:522) > scala.collection.IterableOnceOps.foreach(IterableOnce.scala:576) > scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:574) > scala.collection.AbstractIterable.foreach(Iterable.scala:933) > scala.collection.IterableOps$WithFilter.foreach(Iterable.scala:903) > > org.apache.spark.util.ClosureCleaner$.cleanupAmmoniteReplClosure(ClosureCleaner.scala:522) > org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:251) > > org.apache.spark.sql.expressions.SparkConnectClosureCleaner$.clean(UserDefinedFunction.scala:210) > > org.apache.spark.sql.expressions.ScalarUserDefinedFunction$.apply(UserDefinedFunction.scala:187) > > org.apache.spark.sql.expressions.ScalarUserDefinedFunction$.apply(UserDefinedFunction.scala:180) > org.apache.spark.sql.functions$.udf(functions.scala:7956) > ammonite.$sess.cmd1$Helper.(cmd1.sc:1) > ammonite.$sess.cmd1$.(cmd1.sc:7) > ``` > > This is because ClosureCleaner is heavily reliant on using reflection API and > is not compatible with Java 17. The rest of Spark bypasses this by adding > `--add-opens` JVM flags, see > https://issues.apache.org/jira/browse/SPARK-36796. We need to add these > options to Spark Connect Client launch script as well -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-23890) Hive ALTER TABLE CHANGE COLUMN for struct type no longer works
[ https://issues.apache.org/jira/browse/SPARK-23890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Otto resolved SPARK-23890. - Fix Version/s: 3.0.0 Resolution: Fixed Ah! This is supported in DataSource v2 after all, except just not via CHANGE COLUMN. Instead, you can add a column to a nested field by addressing it with dotted notation: ALTER TABLE otto.test_table03 ADD COLUMN s1.s1_f2_added STRING; > Hive ALTER TABLE CHANGE COLUMN for struct type no longer works > -- > > Key: SPARK-23890 > URL: https://issues.apache.org/jira/browse/SPARK-23890 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.0, 3.0.0 >Reporter: Andrew Otto >Priority: Major > Labels: bulk-closed, pull-request-available > Fix For: 3.0.0 > > > As part of SPARK-14118, Spark SQL removed support for sending ALTER TABLE > CHANGE COLUMN commands to Hive. This restriction was loosened in > [https://github.com/apache/spark/pull/12714] to allow for those commands if > they only change the column comment. > Wikimedia has been evolving Parquet-backed Hive tables with data originally > from JSON events by adding newly found columns to the Hive table schema, via > a Spark job we call 'Refine'. We do this by recursively merging an input > DataFrame schema with a Hive table DataFrame schema, finding new fields, and > then issuing an ALTER TABLE statement to add the columns. However, because > we allow for nested data types in the incoming JSON data, we make extensive > use of struct type fields. In order to add newly detected fields in a nested > data type, we must alter the struct column and append the nested struct > field. This requires a CHANGE COLUMN that alters the column type. In reality, > the 'type' of the column is not changing, it is just a new field being > added to the struct, but to SQL, this looks like a type change. > -We were about to upgrade to Spark 2 but this new restriction in SQL DDL that > can be sent to Hive will block us. 
I believe this is fixable by adding an > exception in > [command/ddl.scala|https://github.com/apache/spark/blob/v2.3.0/sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala#L294-L325] > to allow ALTER TABLE CHANGE COLUMN with a new type, if the original type and > destination type are both struct types, and the destination type only adds > new fields.- > > In this [PR|https://github.com/apache/spark/pull/21012], I was told that the > Spark 3 datasource v2 would support this. > However, it is clear that it does not. There is an [explicit > check|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L1441] > and > [test|https://github.com/apache/spark/blob/e3f46ed57dc063566cdb9425b4d5e02c65332df1/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala#L583] > that prevent this from happening.
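The recursive schema merge the 'Refine' job describes can be sketched in plain terms. This is a hedged illustration using dicts to stand in for Spark StructTypes; the function name and schema shapes are hypothetical, not Wikimedia's actual code.

```python
# Sketch: find fields present in an incoming schema but absent from the
# table schema, recursing into nested (dict-valued) struct fields and
# reporting each new field by its dotted path.
def find_new_fields(table, incoming, prefix=""):
    new = []
    for name, typ in incoming.items():
        path = f"{prefix}{name}"
        if name not in table:
            new.append((path, typ))
        elif isinstance(typ, dict) and isinstance(table[name], dict):
            new.extend(find_new_fields(table[name], typ, path + "."))
    return new

table = {"id": "bigint", "s1": {"s1_f1": "string"}}
incoming = {"id": "bigint", "s1": {"s1_f1": "string", "s1_f2_added": "string"}}
print(find_new_fields(table, incoming))  # [('s1.s1_f2_added', 'string')]
```

Each returned path (e.g. `s1.s1_f2_added`) is exactly the dotted name that the `ALTER TABLE ... ADD COLUMN` statement above addresses.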
[jira] [Updated] (SPARK-46422) Move `test_window` to `pyspark.pandas.tests.window.*`
[ https://issues.apache.org/jira/browse/SPARK-46422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46422: --- Labels: pull-request-available (was: ) > Move `test_window` to `pyspark.pandas.tests.window.*` > - > > Key: SPARK-46422 > URL: https://issues.apache.org/jira/browse/SPARK-46422 > Project: Spark > Issue Type: Test > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available
[jira] [Created] (SPARK-46422) Move `test_window` to `pyspark.pandas.tests.window.*`
Ruifeng Zheng created SPARK-46422: - Summary: Move `test_window` to `pyspark.pandas.tests.window.*` Key: SPARK-46422 URL: https://issues.apache.org/jira/browse/SPARK-46422 Project: Spark Issue Type: Test Components: PS, Tests Affects Versions: 4.0.0 Reporter: Ruifeng Zheng
[jira] [Updated] (SPARK-46421) Broken support for explode on a Map in typed API
[ https://issues.apache.org/jira/browse/SPARK-46421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emil Ejbyfeldt updated SPARK-46421: --- Description: {code:java} scala> spark.createDataset(Seq(Tuple1(Map(1 -> 2)))).select(explode($"_1").as[(Int, Int)]) org.apache.spark.sql.AnalysisException: [UNSUPPORTED_DESERIALIZER.FIELD_NUMBER_MISMATCH] The deserializer is not supported: try to map "STRUCT" to Tuple1, but failed as the number of fields does not line up. at org.apache.spark.sql.errors.QueryCompilationErrors$.fieldNumberMismatchForDeserializerError(QueryCompilationErrors.scala:357) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$.fail(Analyzer.scala:3494) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveDeserializer$$validateTopLevelTupleFields(Analyzer.scala:3510) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52$$anonfun$applyOrElse$228.applyOrElse(Analyzer.scala:3462) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52$$anonfun$applyOrElse$228.applyOrElse(Analyzer.scala:3454) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsDownWithPruning$1(QueryPlan.scala:167) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:208) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:208) at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:219) at 
org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:229) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:304) at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:229) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsDownWithPruning(QueryPlan.scala:167) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsWithPruning(QueryPlan.scala:138) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52.applyOrElse(Analyzer.scala:3454) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52.applyOrElse(Analyzer.scala:3449) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:32) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1215) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1214) at org.apache.spark.sql.catalyst.plans.logical.MapElements.mapChildren(object.scala:223) at 
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:32) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1215) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1214) at
[jira] [Created] (SPARK-46421) Broken support for explode on a Map in typed API
Emil Ejbyfeldt created SPARK-46421: -- Summary: Broken support for explode on a Map in typed API Key: SPARK-46421 URL: https://issues.apache.org/jira/browse/SPARK-46421 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.5.0 Reporter: Emil Ejbyfeldt ``` scala> spark.createDataset(Seq(Tuple1(Map(1 -> 2)))).select(explode($"_1").as[(Int, Int)]) org.apache.spark.sql.AnalysisException: [UNSUPPORTED_DESERIALIZER.FIELD_NUMBER_MISMATCH] The deserializer is not supported: try to map "STRUCT" to Tuple1, but failed as the number of fields does not line up. at org.apache.spark.sql.errors.QueryCompilationErrors$.fieldNumberMismatchForDeserializerError(QueryCompilationErrors.scala:357) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$.fail(Analyzer.scala:3494) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveDeserializer$$validateTopLevelTupleFields(Analyzer.scala:3510) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52$$anonfun$applyOrElse$228.applyOrElse(Analyzer.scala:3462) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52$$anonfun$applyOrElse$228.applyOrElse(Analyzer.scala:3454) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsDownWithPruning$1(QueryPlan.scala:167) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:208) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:208) at 
org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:219) at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:229) at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:304) at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:229) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsDownWithPruning(QueryPlan.scala:167) at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsWithPruning(QueryPlan.scala:138) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52.applyOrElse(Analyzer.scala:3454) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer$$anonfun$apply$52.applyOrElse(Analyzer.scala:3449) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:32) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1215) at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1214) at 
org.apache.spark.sql.catalyst.plans.logical.MapElements.mapChildren(object.scala:223) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:32) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135) at
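In plain Python terms, the behavior the repro asks for is simply that each map entry becomes one key/value row which a two-field tuple can absorb. The sketch below illustrates those expected semantics only; it is not the Spark deserializer.

```python
# Each entry of the map column "_1" should become one (key, value) row;
# the .as[(Int, Int)] deserializer then expects exactly two fields per row.
rows = [{"_1": {1: 2}}, {"_1": {3: 4, 5: 6}}]
exploded = [(k, v) for row in rows for k, v in row["_1"].items()]
print(exploded)  # [(1, 2), (3, 4), (5, 6)]
```

The analyzer instead sees the exploded struct as a single column and fails to line it up with `Tuple1`, producing the `FIELD_NUMBER_MISMATCH` error above.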
[jira] [Updated] (SPARK-46420) Remove unused transport from SparkSQLCLIDriver
[ https://issues.apache.org/jira/browse/SPARK-46420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46420: --- Labels: pull-request-available (was: ) > Remove unused transport from SparkSQLCLIDriver > -- > > Key: SPARK-46420 > URL: https://issues.apache.org/jira/browse/SPARK-46420 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: Cheng Pan >Priority: Major > Labels: pull-request-available
[jira] [Created] (SPARK-46420) Remove unused transport from SparkSQLCLIDriver
Cheng Pan created SPARK-46420: - Summary: Remove unused transport from SparkSQLCLIDriver Key: SPARK-46420 URL: https://issues.apache.org/jira/browse/SPARK-46420 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 4.0.0 Reporter: Cheng Pan
[jira] [Updated] (SPARK-46419) Reorganize `DatetimeIndexTests`: Factor out 3 slow tests
[ https://issues.apache.org/jira/browse/SPARK-46419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46419: --- Labels: pull-request-available (was: ) > Reorganize `DatetimeIndexTests`: Factor out 3 slow tests > > > Key: SPARK-46419 > URL: https://issues.apache.org/jira/browse/SPARK-46419 > Project: Spark > Issue Type: Test > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Priority: Major > Labels: pull-request-available
[jira] [Created] (SPARK-46419) Reorganize `DatetimeIndexTests`: Factor out 3 slow tests
Ruifeng Zheng created SPARK-46419: - Summary: Reorganize `DatetimeIndexTests`: Factor out 3 slow tests Key: SPARK-46419 URL: https://issues.apache.org/jira/browse/SPARK-46419 Project: Spark Issue Type: Test Components: PS, Tests Affects Versions: 4.0.0 Reporter: Ruifeng Zheng
[jira] [Updated] (SPARK-40876) Spark's Vectorized ParquetReader should support type promotions
[ https://issues.apache.org/jira/browse/SPARK-40876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-40876: --- Labels: pull-request-available (was: ) > Spark's Vectorized ParquetReader should support type promotions > --- > > Key: SPARK-40876 > URL: https://issues.apache.org/jira/browse/SPARK-40876 > Project: Spark > Issue Type: Improvement > Components: Input/Output >Affects Versions: 3.3.0 >Reporter: Alexey Kudinkin >Priority: Major > Labels: pull-request-available > > Currently, when reading Parquet table using Spark's `VectorizedColumnReader`, > we hit an issue where we specify requested (projection) schema where one of > the field's type is widened from int32 to long. > Expectation is that since this is totally legitimate primitive type > promotion, we should be able to read Ints into Longs w/ no problems (for ex, > Avro is able to do that perfectly fine). > However, we're facing an issue where `ParquetVectorUpdaterFactory.getUpdater` > method fails w/ the exception listed below. > Looking at the code, It actually seems to be allowing the opposite – it > allows to "down-size" Int32s persisted in the Parquet to be read as Bytes or > Shorts for ex. I'm actually not sure what's the rationale for this behavior, > and this actually seems like a bug to me (as this will essentially be leading > to data truncation): > {code:java} > case INT32: > if (sparkType == DataTypes.IntegerType || canReadAsIntDecimal(descriptor, > sparkType)) { > return new IntegerUpdater(); > } else if (sparkType == DataTypes.LongType && isUnsignedIntTypeMatched(32)) > { > // In `ParquetToSparkSchemaConverter`, we map parquet UINT32 to our > LongType. > // For unsigned int32, it stores as plain signed int32 in Parquet when > dictionary > // fallbacks. We read them as long values. 
> return new UnsignedIntegerUpdater(); > } else if (sparkType == DataTypes.ByteType) { > return new ByteUpdater(); > } else if (sparkType == DataTypes.ShortType) { > return new ShortUpdater(); > } else if (sparkType == DataTypes.DateType) { > if ("CORRECTED".equals(datetimeRebaseMode)) { > return new IntegerUpdater(); > } else { > boolean failIfRebase = "EXCEPTION".equals(datetimeRebaseMode); > return new IntegerWithRebaseUpdater(failIfRebase); > } > } > break; {code} > Exception: > {code:java} > at > org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2454) > at > org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2403) > at > org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2402) > at > scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at > scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at > org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2402) > at > org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1160) > at > org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1160) > at scala.Option.foreach(Option.scala:407) > at > org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1160) > at > org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2642) > at > org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2584) > at > org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2573) > at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49) > at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:938) > at org.apache.spark.SparkContext.runJob(SparkContext.scala:2214) > at 
org.apache.spark.SparkContext.runJob(SparkContext.scala:2235) > at org.apache.spark.SparkContext.runJob(SparkContext.scala:2254) > at org.apache.spark.SparkContext.runJob(SparkContext.scala:2279) > at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1030) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:414) > at org.apache.spark.rdd.RDD.collect(RDD.scala:1029) > at org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:304) > at org.apache.spark.RangePartitioner.(Partitioner.scala:171) > at >
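To illustrate why the requested int32-to-long read is a safe widening while the down-sizing reads the updater currently permits can truncate, here is a small standalone sketch using Python's `struct` module to mimic fixed-width column slots. It is illustrative only and does not touch Parquet.

```python
import struct

# Widening: any int32 value round-trips through a signed 64-bit slot,
# just like reading a Parquet INT32 column into Spark's LongType.
i32_max = 2**31 - 1
widened = struct.unpack("<q", struct.pack("<q", i32_max))[0]
assert widened == i32_max  # lossless

# Down-sizing: keeping only the low byte (what a ByteUpdater-style read
# amounts to) silently truncates the value to a different number.
truncated = struct.unpack("<b", struct.pack("<i", i32_max)[:1])[0]
print(widened, truncated)  # 2147483647 -1
```

The asymmetry is the point of the report: the narrowing direction is the data-truncating one, yet it is the direction `getUpdater` accepts.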
[jira] [Resolved] (SPARK-46417) do not fail when calling getTable and throwException is false
[ https://issues.apache.org/jira/browse/SPARK-46417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-46417. -- Fix Version/s: 3.4.3 3.5.1 4.0.0 Resolution: Fixed Issue resolved by pull request 44364 [https://github.com/apache/spark/pull/44364] > do not fail when calling getTable and throwException is false > - > > Key: SPARK-46417 > URL: https://issues.apache.org/jira/browse/SPARK-46417 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.3, 3.5.1, 4.0.0
[jira] [Assigned] (SPARK-46417) do not fail when calling getTable and throwException is false
[ https://issues.apache.org/jira/browse/SPARK-46417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao reassigned SPARK-46417: Assignee: Wenchen Fan > do not fail when calling getTable and throwException is false > - > > Key: SPARK-46417 > URL: https://issues.apache.org/jira/browse/SPARK-46417 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Labels: pull-request-available
[jira] [Resolved] (SPARK-46418) Reorganize `ReshapeTests`
[ https://issues.apache.org/jira/browse/SPARK-46418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-46418. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44365 [https://github.com/apache/spark/pull/44365] > Reorganize `ReshapeTests` > - > > Key: SPARK-46418 > URL: https://issues.apache.org/jira/browse/SPARK-46418 > Project: Spark > Issue Type: Test > Components: PS, Tests >Affects Versions: 4.0.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0
[jira] [Assigned] (SPARK-46417) do not fail when calling getTable and throwException is false
[ https://issues.apache.org/jira/browse/SPARK-46417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46417: -- Assignee: (was: Apache Spark) > do not fail when calling getTable and throwException is false > - > > Key: SPARK-46417 > URL: https://issues.apache.org/jira/browse/SPARK-46417 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Wenchen Fan >Priority: Major > Labels: pull-request-available
[jira] [Assigned] (SPARK-46417) do not fail when calling getTable and throwException is false
[ https://issues.apache.org/jira/browse/SPARK-46417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-46417: -- Assignee: Apache Spark > do not fail when calling getTable and throwException is false > - > > Key: SPARK-46417 > URL: https://issues.apache.org/jira/browse/SPARK-46417 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 4.0.0 >Reporter: Wenchen Fan >Assignee: Apache Spark >Priority: Major > Labels: pull-request-available
[jira] [Assigned] (SPARK-45597) Support creating table using a Python data source in SQL
[ https://issues.apache.org/jira/browse/SPARK-45597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45597: -- Assignee: (was: Apache Spark) > Support creating table using a Python data source in SQL > > > Key: SPARK-45597 > URL: https://issues.apache.org/jira/browse/SPARK-45597 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Priority: Major > Labels: pull-request-available > > Support creating table using a Python data source in SQL query: > For instance: > `CREATE TABLE tableName() USING OPTIONS > `
[jira] [Assigned] (SPARK-45597) Support creating table using a Python data source in SQL
[ https://issues.apache.org/jira/browse/SPARK-45597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot reassigned SPARK-45597: -- Assignee: Apache Spark > Support creating table using a Python data source in SQL > > > Key: SPARK-45597 > URL: https://issues.apache.org/jira/browse/SPARK-45597 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > Labels: pull-request-available > > Support creating table using a Python data source in SQL query: > For instance: > `CREATE TABLE tableName() USING OPTIONS > `
[jira] [Assigned] (SPARK-45597) Support creating table using a Python data source in SQL
[ https://issues.apache.org/jira/browse/SPARK-45597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-45597:
--------------------------------------

    Assignee: (was: Apache Spark)

> Support creating table using a Python data source in SQL
> --------------------------------------------------------
>
>                 Key: SPARK-45597
>                 URL: https://issues.apache.org/jira/browse/SPARK-45597
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 4.0.0
>            Reporter: Allison Wang
>            Priority: Major
>              Labels: pull-request-available
>
> Support creating a table using a Python data source in a SQL query.
> For instance:
> `CREATE TABLE tableName() USING OPTIONS `
[jira] [Assigned] (SPARK-46417) do not fail when calling getTable and throwException is false
[ https://issues.apache.org/jira/browse/SPARK-46417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-46417:
--------------------------------------

    Assignee: (was: Apache Spark)

> do not fail when calling getTable and throwException is false
> -------------------------------------------------------------
>
>                 Key: SPARK-46417
>                 URL: https://issues.apache.org/jira/browse/SPARK-46417
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Wenchen Fan
>            Priority: Major
>              Labels: pull-request-available
>
[jira] [Assigned] (SPARK-46417) do not fail when calling getTable and throwException is false
[ https://issues.apache.org/jira/browse/SPARK-46417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-46417:
--------------------------------------

    Assignee: Apache Spark

> do not fail when calling getTable and throwException is false
> -------------------------------------------------------------
>
>                 Key: SPARK-46417
>                 URL: https://issues.apache.org/jira/browse/SPARK-46417
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Wenchen Fan
>            Assignee: Apache Spark
>            Priority: Major
>              Labels: pull-request-available
>