[spark] branch master updated (db47c6e -> bdeb626)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from db47c6e  [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API
     add bdeb626  [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml                           |   2 +
 docs/sql-ref-ansi-compliance.md                    |   2 +
 docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md  |  67 ++
 docs/sql-ref-syntax-aux-conf-mgmt.md               |   1 +
 .../apache/spark/sql/catalyst/parser/SqlBase.g4    |   8 ++
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  11 +-
 .../org/apache/spark/sql/internal/SQLConf.scala    |   6 +-
 .../spark/sql/execution/SparkSqlParser.scala       |  39 +-
 .../test/resources/sql-tests/inputs/timezone.sql   |  15 +++
 .../resources/sql-tests/results/timezone.sql.out   | 135 +
 .../apache/spark/sql/internal/SQLConfSuite.scala   |  28 +
 11 files changed, 308 insertions(+), 6 deletions(-)
 create mode 100644 docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/timezone.sql
 create mode 100644 sql/core/src/test/resources/sql-tests/results/timezone.sql.out
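For context, the new command is a SQL-standard alias for setting the session time zone conf (`spark.sql.session.timeZone`). A minimal sketch of the forms described in the new reference page, assuming the region, LOCAL, and interval-offset variants from the grammar this commit adds to SqlBase.g4; run in `spark-shell`:

```scala
// Sketch of SET TIME ZONE usage; the interval form is assumed from the
// SQL-standard grammar added in this commit.
spark.sql("SET TIME ZONE 'America/Los_Angeles'")           // region-based zone ID
spark.sql("SET TIME ZONE LOCAL")                           // fall back to the JVM default zone
spark.sql("SET TIME ZONE INTERVAL '08:00' HOUR TO MINUTE") // fixed offset from UTC

// The command writes through to the session conf:
spark.sql("SET spark.sql.session.timeZone").show(truncate = false)
```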
[spark] branch master updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

6be8b93 is described below

commit 6be8b935a4f7ce0dea2d7aaaf747c2e8e1a9f47a
Author: SaurabhChawla
AuthorDate: Thu Jul 16 13:11:47 2020 +

    [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

    ### What changes were proposed in this pull request?
    Spark SQL commands fail when selecting from ORC tables. Steps to reproduce:

    Example 1 — Prerequisite: `/Users/test/tpcds_scale5data/date_dim` is the location of ORC data generated by Hive.
    ```
    val table = """CREATE TABLE `date_dim` (
      `d_date_sk` INT, `d_date_id` STRING, `d_date` TIMESTAMP, `d_month_seq` INT,
      `d_week_seq` INT, `d_quarter_seq` INT, `d_year` INT, `d_dow` INT, `d_moy` INT,
      `d_dom` INT, `d_qoy` INT, `d_fy_year` INT, `d_fy_quarter_seq` INT,
      `d_fy_week_seq` INT, `d_day_name` STRING, `d_quarter_name` STRING,
      `d_holiday` STRING, `d_weekend` STRING, `d_following_holiday` STRING,
      `d_first_dom` INT, `d_last_dom` INT, `d_same_day_ly` INT, `d_same_day_lq` INT,
      `d_current_day` STRING, `d_current_week` STRING, `d_current_month` STRING,
      `d_current_quarter` STRING, `d_current_year` STRING)
    USING orc
    LOCATION '/Users/test/tpcds_scale5data/date_dim'"""

    spark.sql(table).collect

    val u = """select date_dim.d_date_id from date_dim limit 5"""
    spark.sql(u).collect
    ```

    Example 2
    ```
    val table = """CREATE TABLE `test_orc_data` (
      `_col1` INT, `_col2` STRING, `_col3` INT)
    USING orc"""

    spark.sql(table).collect
    spark.sql("insert into test_orc_data values(13, '155', 2020)").collect

    val df = """select _col2 from test_orc_data limit 5"""
    spark.sql(df).collect
    ```

    It fails with the error below:
    ```
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, 192.168.0.103, executor driver): java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initBatch(OrcColumnarBatchReader.java:156)
        at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$7(OrcFileFormat.scala:258)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:141)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:203)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
        at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:620)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
        at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:343)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:895)
        at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:895)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:372)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:336)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:133)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:445)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1489)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:448)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    ```
    The reason behind this: initBatch is not getting the schema that is needed to f
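The repro and the files touched point at the cause: ORC files written by Hive name their physical columns `_col0`, `_col1`, ..., so matching the requested columns against the file schema by name finds nothing and the batch reader indexes out of bounds. A sketch of the positional-fallback idea (a hypothetical helper; the real change lives in `OrcUtils` and the ORC readers, with a different signature):

```scala
// Hypothetical sketch: when every physical ORC field is named _colN,
// resolve requested columns by position in the table schema instead of
// by the (meaningless) physical name. Names here are illustrative only.
def requestedColumnIdsSketch(
    required: Seq[String],    // columns the query selects
    dataSchema: Seq[String],  // logical table schema
    orcFields: Seq[String]    // physical field names in the ORC file
  ): Seq[Int] = {
  val hivePositionalNames =
    orcFields.zipWithIndex.forall { case (f, i) => f == s"_col$i" }
  if (hivePositionalNames) required.map(c => dataSchema.indexOf(c)) // by position
  else required.map(c => orcFields.indexOf(c))                      // by name
}
```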
[spark] branch master updated (bdeb626 -> 6be8b93)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from bdeb626  [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE
     add 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

No new revisions were added by this update.

Summary of changes:
 .../execution/datasources/orc/OrcFileFormat.scala  | 10 +++---
 .../sql/execution/datasources/orc/OrcUtils.scala   | 41 ++
 .../v2/orc/OrcPartitionReaderFactory.scala         | 21 +--
 .../spark/sql/hive/orc/HiveOrcQuerySuite.scala     | 28 +++
 4 files changed, 78 insertions(+), 22 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new b745041  [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

b745041 is described below

commit b745041f698120be21ab889706880e976a599fdb
Author: SaurabhChawla
AuthorDate: Thu Jul 16 13:11:47 2020 +

    [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables
[spark] branch master updated (6be8b93 -> c1f160e)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables
     add c1f160e  [SPARK-30648][SQL] Support filters pushdown in JSON datasource

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/StructFilters.scala  | 166 +
 .../apache/spark/sql/catalyst/csv/CSVFilters.scala | 124 +++
 .../spark/sql/catalyst/json/JacksonParser.scala    |  27 ++--
 .../spark/sql/catalyst/json/JsonFilters.scala      | 156 +++
 .../org/apache/spark/sql/internal/SQLConf.scala    |   8 +
 ...FiltersSuite.scala => StructFiltersSuite.scala} |  38 ++---
 .../spark/sql/catalyst/csv/CSVFiltersSuite.scala   | 116 +-
 .../sql/catalyst/json/JacksonParserSuite.scala     |  57 +++
 .../sql/catalyst/json/JsonFiltersSuite.scala}      |  14 +-
 sql/core/benchmarks/CSVBenchmark-jdk11-results.txt |  64 
 sql/core/benchmarks/CSVBenchmark-results.txt       |  64 
 .../benchmarks/JsonBenchmark-jdk11-results.txt     |  94 ++--
 sql/core/benchmarks/JsonBenchmark-results.txt      |  94 ++--
 .../datasources/json/JsonFileFormat.scala          |   6 +-
 .../datasources/v2/csv/CSVScanBuilder.scala        |   4 +-
 .../v2/json/JsonPartitionReaderFactory.scala       |  11 +-
 .../execution/datasources/v2/json/JsonScan.scala   |  12 +-
 .../datasources/v2/json/JsonScanBuilder.scala      |  26 +++-
 .../sql/execution/datasources/csv/CSVSuite.scala   |  30 
 .../execution/datasources/json/JsonBenchmark.scala |  45 +-
 .../sql/execution/datasources/json/JsonSuite.scala | 151 ++-
 21 files changed, 893 insertions(+), 414 deletions(-)
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/StructFilters.scala
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonFilters.scala
 copy sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/{csv/CSVFiltersSuite.scala => StructFiltersSuite.scala} (82%)
 create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/json/JacksonParserSuite.scala
 copy sql/catalyst/src/{main/scala/org/apache/spark/sql/connector/write/LogicalWriteInfoImpl.scala => test/scala/org/apache/spark/sql/catalyst/json/JsonFiltersSuite.scala} (71%)
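This change lets predicates be evaluated inside the JSON parser itself, so rows that fail a pushed filter are skipped before Spark materializes a full row. A minimal sketch of how it would be exercised, assuming the config key mirrors the existing CSV option and using a hypothetical input path:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("json-pushdown-sketch").getOrCreate()
import spark.implicits._

// Assumed config key, by analogy with spark.sql.csv.filterPushdown.enabled.
spark.conf.set("spark.sql.json.filterPushdown.enabled", "true")

val df = spark.read
  .schema("id INT, name STRING")   // an explicit schema keeps parsing cheap
  .json("/tmp/people.json")        // hypothetical input path
  .filter($"id" > 10)              // candidate for pushdown into JacksonParser

df.explain()  // the scan node should report the pushed filters
```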
[spark] branch master updated (c1f160e -> d5c672a)
This is an automated email from the ASF dual-hosted git repository.

huaxingao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c1f160e  [SPARK-30648][SQL] Support filters pushdown in JSON datasource
     add d5c672a  [SPARK-32315][ML] Provide an explanation error message when calling require

No new revisions were added by this update.

Summary of changes:
 mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)
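The patch simply adds explanatory messages to `require` calls in MLUtils, so a failed precondition reports what broke instead of the bare "requirement failed". A sketch of the pattern; the condition and names below are illustrative, not copied from the patch:

```scala
def checkNorms(norm1: Double, norm2: Double): Unit = {
  // Before: require(norm1 >= 0.0 && norm2 >= 0.0) fails with only
  // "java.lang.IllegalArgumentException: requirement failed".
  // After: the message names the precondition and the offending values.
  require(norm1 >= 0.0 && norm2 >= 0.0,
    s"The norms of the vectors must be nonnegative, but got norm1=$norm1 and norm2=$norm2.")
}
```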
[spark] branch branch-3.0 updated: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 9b6bea5  [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

9b6bea5 is described below

commit 9b6bea5d5ab4bd3a69ca6eaf7811eb54f9562ee6
Author: Wenchen Fan
AuthorDate: Thu Jul 16 11:09:27 2020 -0700

    [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

    Partially backport https://github.com/apache/spark/pull/29026

    Closes #29125 from cloud-fan/backport.

    Authored-by: Wenchen Fan
    Signed-off-by: Dongjoon Hyun
---
 .../org/apache/spark/sql/catalyst/expressions/UnsafeRow.java |  2 +-
 .../src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala | 10 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
index 034894b..4dc5ce1 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
@@ -288,7 +288,7 @@ public final class UnsafeRow extends InternalRow implements Externalizable, Kryo
       Platform.putLong(baseObject, baseOffset + cursor, 0L);
       Platform.putLong(baseObject, baseOffset + cursor + 8, 0L);

-      if (value == null) {
+      if (value == null || !value.changePrecision(precision, value.scale())) {
         setNullAt(ordinal);
         // keep the offset for future update
         Platform.putLong(baseObject, getFieldOffset(ordinal), cursor << 32);
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
index a5f904c..9daa69c 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
@@ -178,4 +178,14 @@ class UnsafeRowSuite extends SparkFunSuite {
     // Makes sure hashCode on unsafe array won't crash
     unsafeRow.getArray(0).hashCode()
   }
+
+  test("SPARK-32018: setDecimal with overflowed value") {
+    val d1 = new Decimal().set(BigDecimal("1000")).toPrecision(38, 18)
+    val row = InternalRow.apply(d1)
+    val unsafeRow = UnsafeProjection.create(Array[DataType](DecimalType(38, 18))).apply(row)
+    assert(unsafeRow.getDecimal(0, 38, 18) === d1)
+    val d2 = (d1 * Decimal(10)).toPrecision(39, 18)
+    unsafeRow.setDecimal(0, d2, 38)
+    assert(unsafeRow.getDecimal(0, 38, 18) === null)
+  }
 }
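The fix hinges on `Decimal.changePrecision`, which returns `false` when a value cannot be represented at the target precision; `setDecimal` now nulls the slot in that case instead of writing bytes a later read would misinterpret. A quick illustration of the overflow signal (the literal is my own, not taken from the patch):

```scala
import org.apache.spark.sql.types.Decimal

// 10^20 has 21 integer digits, but DecimalType(38, 18) leaves room for
// only 38 - 18 = 20 of them, so changePrecision reports overflow.
val d = Decimal(BigDecimal("1" + "0" * 20))
assert(!d.changePrecision(38, 18))  // false == value does not fit
```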
[spark] branch master updated (d5c672a -> 383f5e9)
This is an automated email from the ASF dual-hosted git repository.

huaxingao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from d5c672a  [SPARK-32315][ML] Provide an explanation error message when calling require
 add 383f5e9  [SPARK-32310][ML][PYSPARK] ML params default value parity in classification, regression, clustering and fpm

No new revisions were added by this update.

Summary of changes:
 .../spark/ml/classification/FMClassifier.scala     | 10
 .../apache/spark/ml/classification/LinearSVC.scala | 12 ++---
 .../ml/classification/LogisticRegression.scala     | 14 ++
 .../spark/ml/classification/NaiveBayes.scala       |  4 +-
 .../spark/ml/clustering/BisectingKMeans.scala      |  7 +--
 .../spark/ml/clustering/GaussianMixture.scala      |  8 +--
 .../org/apache/spark/ml/clustering/KMeans.scala    | 11 ++--
 .../scala/org/apache/spark/ml/clustering/LDA.scala | 11 ++--
 .../ml/clustering/PowerIterationClustering.scala   |  7 +--
 .../evaluation/BinaryClassificationEvaluator.scala |  4 +-
 .../MulticlassClassificationEvaluator.scala        |  8 +--
 .../MultilabelClassificationEvaluator.scala        |  6 +--
 .../spark/ml/evaluation/RankingEvaluator.scala     |  6 +--
 .../spark/ml/evaluation/RegressionEvaluator.scala  |  4 +-
 .../scala/org/apache/spark/ml/fpm/FPGrowth.scala   |  5 +-
 .../ml/regression/AFTSurvivalRegression.scala      | 11 ++--
 .../spark/ml/regression/LinearRegression.scala     | 15 ++
 python/pyspark/ml/classification.py                | 58 ++
 python/pyspark/ml/clustering.py                    | 33
 python/pyspark/ml/fpm.py                           |  7 ++-
 python/pyspark/ml/regression.py                    | 57 +
 21 files changed, 141 insertions(+), 157 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
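[Editor's note] "Default value parity" here means the defaults declared on the Scala side and the values the PySpark wrappers set via _setDefault must agree. A minimal sketch of the Scala half, using a hypothetical MyClassifierParams trait (the real shared traits live under org.apache.spark.ml.param.shared):

    import org.apache.spark.ml.param.{IntParam, Params}

    // Sketch of how Spark ML declares a default on the Scala side; the commit
    // brings the Python wrappers' defaults in line with declarations like this.
    // MyClassifierParams is hypothetical, not a trait from the diff.
    trait MyClassifierParams extends Params {
      val maxIter: IntParam =
        new IntParam(this, "maxIter", "maximum number of iterations (>= 0)")
      setDefault(maxIter -> 100) // the value the PySpark wrapper must mirror
      final def getMaxIter: Int = $(maxIter)
    }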
[spark] branch master updated (383f5e9 -> fb51925)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 383f5e9  [SPARK-32310][ML][PYSPARK] ML params default value parity in classification, regression, clustering and fpm
 add fb51925  [SPARK-32335][K8S][TESTS] Remove Python2 test from K8s IT

No new revisions were added by this update.

Summary of changes:
 .../deploy/k8s/integrationtest/PythonTestsSuite.scala | 19 ---
 1 file changed, 19 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (fb51925 -> 9747e8f)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from fb51925  [SPARK-32335][K8S][TESTS] Remove Python2 test from K8s IT
 add 9747e8f  [SPARK-31831][SQL][TESTS][FOLLOWUP] Put mocks for HiveSessionImplSuite in hive version related subdirectories

No new revisions were added by this update.

Summary of changes:
 sql/hive-thriftserver/pom.xml                      | 12 ++
 .../hive/thriftserver/HiveSessionImplSuite.scala   | 29 +
 .../thriftserver/GetCatalogsOperationMock.scala    | 50 ++
 .../thriftserver/GetCatalogsOperationMock.scala    | 50 ++
 4 files changed, 113 insertions(+), 28 deletions(-)
 create mode 100644 sql/hive-thriftserver/v1.2/src/test/scala/org/apache/spark/sql/hive/thriftserver/GetCatalogsOperationMock.scala
 create mode 100644 sql/hive-thriftserver/v2.3/src/test/scala/org/apache/spark/sql/hive/thriftserver/GetCatalogsOperationMock.scala

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (9747e8f -> ea9e8f3)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 9747e8f  [SPARK-31831][SQL][TESTS][FOLLOWUP] Put mocks for HiveSessionImplSuite in hive version related subdirectories
 add ea9e8f3  [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

No new revisions were added by this update.

Summary of changes:
 LICENSE                                        |    2 +-
 dev/.rat-excludes                              |    4 +-
 dev/tox.ini                                    |    2 +-
 python/pyspark/cloudpickle.py                  | 1362
 python/pyspark/cloudpickle/__init__.py         |    7 +
 python/pyspark/cloudpickle/cloudpickle.py      |  830 +++
 python/pyspark/cloudpickle/cloudpickle_fast.py |  747 +
 python/pyspark/cloudpickle/compat.py           |   13 +
 python/setup.py                                |    1 +
 9 files changed, 1602 insertions(+), 1366 deletions(-)
 delete mode 100644 python/pyspark/cloudpickle.py
 create mode 100644 python/pyspark/cloudpickle/__init__.py
 create mode 100644 python/pyspark/cloudpickle/cloudpickle.py
 create mode 100644 python/pyspark/cloudpickle/cloudpickle_fast.py
 create mode 100644 python/pyspark/cloudpickle/compat.py

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (ea9e8f3 -> efa70b8)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from ea9e8f3  [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0
 add efa70b8  [SPARK-32145][SQL][FOLLOWUP] Fix type in the error log of SparkOperation

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/hive/thriftserver/SparkOperation.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (efa70b8 -> ffdbbae)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from efa70b8  [SPARK-32145][SQL][FOLLOWUP] Fix type in the error log of SparkOperation
 add ffdbbae  [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/deploy/DeployMessage.scala    |  7 +++
 .../org/apache/spark/deploy/master/Master.scala    | 37 +
 .../spark/deploy/master/ui/MasterWebUI.scala       | 48 -
 .../org/apache/spark/internal/config/UI.scala      | 12 +
 .../apache/spark/deploy/master/MasterSuite.scala   | 61 +-
 .../spark/deploy/master/ui/MasterWebUISuite.scala  | 30 +--
 6 files changed, 190 insertions(+), 5 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
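[Editor's note] The summary does not pin down the exact request shape, but the new endpoint is an HTTP handler on the master web UI. A hedged sketch of exercising it from Scala; the port, trailing slash, and "host" form field are assumptions drawn from the summary, not confirmed API:

    import java.net.{HttpURLConnection, URL}

    // Hedged sketch: POST a worker hostname to the master web UI's new
    // /workers/kill endpoint. Port 8080 is the default master UI port;
    // the "host" form field is an assumption, not taken from the diff.
    object KillWorkerDemo extends App {
      val conn = new URL("http://localhost:8080/workers/kill/")
        .openConnection().asInstanceOf[HttpURLConnection]
      conn.setRequestMethod("POST")
      conn.setDoOutput(true)
      val form = "host=worker-host-1" // hypothetical worker hostname
      conn.getOutputStream.write(form.getBytes("UTF-8"))
      // The endpoint is protected, so expect a rejection unless the
      // caller satisfies the new UI config added in internal/config/UI.scala.
      println(conn.getResponseCode)
    }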