[spark] branch master updated (db47c6e -> bdeb626)

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from db47c6e  [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API
 add bdeb626  [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml   |   2 +
 docs/sql-ref-ansi-compliance.md|   2 +
 docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md  |  67 ++
 docs/sql-ref-syntax-aux-conf-mgmt.md   |   1 +
 .../apache/spark/sql/catalyst/parser/SqlBase.g4|   8 ++
 .../spark/sql/catalyst/parser/AstBuilder.scala |  11 +-
 .../org/apache/spark/sql/internal/SQLConf.scala|   6 +-
 .../spark/sql/execution/SparkSqlParser.scala   |  39 +-
 .../test/resources/sql-tests/inputs/timezone.sql   |  15 +++
 .../resources/sql-tests/results/timezone.sql.out   | 135 +
 .../apache/spark/sql/internal/SQLConfSuite.scala   |  28 +
 11 files changed, 308 insertions(+), 6 deletions(-)
 create mode 100644 docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/timezone.sql
 create mode 100644 
sql/core/src/test/resources/sql-tests/results/timezone.sql.out
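
For reference, the forms the new command accepts can be sketched as below (an illustrative sketch inferred from the commit title and the added doc page docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md; the authoritative syntax lives in that page and in SqlBase.g4):

```scala
// Illustrative sketch of the SQL standard SET TIME ZONE forms added by
// SPARK-32272; see docs/sql-ref-syntax-aux-conf-mgmt-set-timezone.md for
// the authoritative grammar.
spark.sql("SET TIME ZONE 'America/Los_Angeles'")   // by region ID
spark.sql("SET TIME ZONE INTERVAL '-8' HOUR")      // by UTC offset
spark.sql("SET TIME ZONE LOCAL")                   // back to the JVM-local zone
```

Each form updates the session time zone configuration (spark.sql.session.timeZone) for the current session.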


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on 
selecting the orc tables
6be8b93 is described below

commit 6be8b935a4f7ce0dea2d7aaaf747c2e8e1a9f47a
Author: SaurabhChawla 
AuthorDate: Thu Jul 16 13:11:47 2020 +

[SPARK-32234][SQL] Spark sql commands are failing on selecting the orc 
tables

### What changes were proposed in this pull request?
Spark SQL commands are failing when selecting from ORC tables.
Steps to reproduce:
Example 1 -
Prerequisite - This is the location (/Users/test/tpcds_scale5data/date_dim) of ORC data generated by Hive.
```
val table = """CREATE TABLE `date_dim` (
  `d_date_sk` INT,
  `d_date_id` STRING,
  `d_date` TIMESTAMP,
  `d_month_seq` INT,
  `d_week_seq` INT,
  `d_quarter_seq` INT,
  `d_year` INT,
  `d_dow` INT,
  `d_moy` INT,
  `d_dom` INT,
  `d_qoy` INT,
  `d_fy_year` INT,
  `d_fy_quarter_seq` INT,
  `d_fy_week_seq` INT,
  `d_day_name` STRING,
  `d_quarter_name` STRING,
  `d_holiday` STRING,
  `d_weekend` STRING,
  `d_following_holiday` STRING,
  `d_first_dom` INT,
  `d_last_dom` INT,
  `d_same_day_ly` INT,
  `d_same_day_lq` INT,
  `d_current_day` STRING,
  `d_current_week` STRING,
  `d_current_month` STRING,
  `d_current_quarter` STRING,
  `d_current_year` STRING)
USING orc
LOCATION '/Users/test/tpcds_scale5data/date_dim'"""

spark.sql(table).collect

val u = """select date_dim.d_date_id from date_dim limit 5"""

spark.sql(u).collect
```
Example 2

```
  val table = """CREATE TABLE `test_orc_data` (
  `_col1` INT,
  `_col2` STRING,
  `_col3` INT)
  USING orc"""

spark.sql(table).collect

spark.sql("insert into test_orc_data values(13, '155', 2020)").collect

val df = """select _col2 from test_orc_data limit 5"""
spark.sql(df).collect

```

It's failing with the error below:
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 
(TID 2, 192.168.0.103, executor driver): 
java.lang.ArrayIndexOutOfBoundsException: 1
at 
org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initBatch(OrcColumnarBatchReader.java:156)
at 
org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$7(OrcFileFormat.scala:258)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:141)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:203)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
at 
org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:620)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown
 Source)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:343)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:895)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:895)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:372)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:336)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:133)
at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:445)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1489)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:448)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
```

The reason behind this initBatch is not getting the schema that is needed 
to f
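
The truncated explanation above refers to the schema mismatch behind the crash: ORC files written by Hive carry meaningless field names (_col0, _col1, ...), so the requested columns cannot be matched against the catalog schema by name and must instead be resolved by position. A hypothetical sketch of that idea (names and shape invented for illustration; this is not the actual patch):

```scala
// Hypothetical sketch: when an ORC file written by Hive names its fields
// _col0, _col1, ..., the field names carry no information, so requested
// columns must be resolved by ordinal position in the catalog schema
// instead of by name.
def requestedColumnIds(
    catalogFieldNames: Seq[String],
    orcFieldNames: Seq[String],
    requiredFieldNames: Seq[String]): Seq[Int] = {
  val hiveStyle = orcFieldNames.forall(_.matches("_col\\d+"))
  requiredFieldNames.map { name =>
    if (hiveStyle) catalogFieldNames.indexOf(name) // match by position
    else orcFieldNames.indexOf(name)               // match by name
  }
}

// e.g. catalog (d_date_sk, d_date_id) vs ORC fields (_col0, _col1):
// requestedColumnIds(Seq("d_date_sk", "d_date_id"), Seq("_col0", "_col1"),
//                    Seq("d_date_id")) yields Seq(1)
```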

[spark] branch master updated (bdeb626 -> 6be8b93)

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from bdeb626  [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE
 add 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

No new revisions were added by this update.

Summary of changes:
 .../execution/datasources/orc/OrcFileFormat.scala  | 10 +++---
 .../sql/execution/datasources/orc/OrcUtils.scala   | 41 ++
 .../v2/orc/OrcPartitionReaderFactory.scala | 21 +--
 .../spark/sql/hive/orc/HiveOrcQuerySuite.scala | 28 +++
 4 files changed, 78 insertions(+), 22 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new b745041  [SPARK-32234][SQL] Spark sql commands are failing on 
selecting the orc tables
b745041 is described below

commit b745041f698120be21ab889706880e976a599fdb
Author: SaurabhChawla 
AuthorDate: Thu Jul 16 13:11:47 2020 +

[SPARK-32234][SQL] Spark sql commands are failing on selecting the orc 
tables

### What changes were proposed in this pull request?
Spark SQL commands are failing when selecting from ORC tables.
Steps to reproduce:
Example 1 -
Prerequisite - This is the location (/Users/test/tpcds_scale5data/date_dim) of ORC data generated by Hive.
```
val table = """CREATE TABLE `date_dim` (
  `d_date_sk` INT,
  `d_date_id` STRING,
  `d_date` TIMESTAMP,
  `d_month_seq` INT,
  `d_week_seq` INT,
  `d_quarter_seq` INT,
  `d_year` INT,
  `d_dow` INT,
  `d_moy` INT,
  `d_dom` INT,
  `d_qoy` INT,
  `d_fy_year` INT,
  `d_fy_quarter_seq` INT,
  `d_fy_week_seq` INT,
  `d_day_name` STRING,
  `d_quarter_name` STRING,
  `d_holiday` STRING,
  `d_weekend` STRING,
  `d_following_holiday` STRING,
  `d_first_dom` INT,
  `d_last_dom` INT,
  `d_same_day_ly` INT,
  `d_same_day_lq` INT,
  `d_current_day` STRING,
  `d_current_week` STRING,
  `d_current_month` STRING,
  `d_current_quarter` STRING,
  `d_current_year` STRING)
USING orc
LOCATION '/Users/test/tpcds_scale5data/date_dim'"""

spark.sql(table).collect

val u = """select date_dim.d_date_id from date_dim limit 5"""

spark.sql(u).collect
```
Example 2

```
  val table = """CREATE TABLE `test_orc_data` (
  `_col1` INT,
  `_col2` STRING,
  `_col3` INT)
  USING orc"""

spark.sql(table).collect

spark.sql("insert into test_orc_data values(13, '155', 2020)").collect

val df = """select _col2 from test_orc_data limit 5"""
spark.sql(df).collect

```

It's failing with the error below:
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 
(TID 2, 192.168.0.103, executor driver): 
java.lang.ArrayIndexOutOfBoundsException: 1
at 
org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initBatch(OrcColumnarBatchReader.java:156)
at 
org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$7(OrcFileFormat.scala:258)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:141)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:203)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
at 
org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:620)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown
 Source)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:343)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:895)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:895)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:372)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:336)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:133)
at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:445)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1489)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:448)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
```

The reason behind this initBatch is not getting the schema that is need

[spark] branch master updated (bdeb626 -> 6be8b93)

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from bdeb626  [SPARK-32272][SQL] Add  SQL standard command SET TIME ZONE
 add 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on 
selecting the orc tables

No new revisions were added by this update.

Summary of changes:
 .../execution/datasources/orc/OrcFileFormat.scala  | 10 +++---
 .../sql/execution/datasources/orc/OrcUtils.scala   | 41 ++
 .../v2/orc/OrcPartitionReaderFactory.scala | 21 +--
 .../spark/sql/hive/orc/HiveOrcQuerySuite.scala | 28 +++
 4 files changed, 78 insertions(+), 22 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new b745041  [SPARK-32234][SQL] Spark sql commands are failing on 
selecting the orc tables
b745041 is described below

commit b745041f698120be21ab889706880e976a599fdb
Author: SaurabhChawla 
AuthorDate: Thu Jul 16 13:11:47 2020 +

[SPARK-32234][SQL] Spark sql commands are failing on selecting the orc 
tables

### What changes were proposed in this pull request?
Spark sql commands are failing on selecting the orc tables
Steps to reproduce
Example 1 -
Prerequisite -  This is the location(/Users/test/tpcds_scale5data/date_dim) 
for orc data which is generated by the hive.
```
val table = """CREATE TABLE `date_dim` (
  `d_date_sk` INT,
  `d_date_id` STRING,
  `d_date` TIMESTAMP,
  `d_month_seq` INT,
  `d_week_seq` INT,
  `d_quarter_seq` INT,
  `d_year` INT,
  `d_dow` INT,
  `d_moy` INT,
  `d_dom` INT,
  `d_qoy` INT,
  `d_fy_year` INT,
  `d_fy_quarter_seq` INT,
  `d_fy_week_seq` INT,
  `d_day_name` STRING,
  `d_quarter_name` STRING,
  `d_holiday` STRING,
  `d_weekend` STRING,
  `d_following_holiday` STRING,
  `d_first_dom` INT,
  `d_last_dom` INT,
  `d_same_day_ly` INT,
  `d_same_day_lq` INT,
  `d_current_day` STRING,
  `d_current_week` STRING,
  `d_current_month` STRING,
  `d_current_quarter` STRING,
  `d_current_year` STRING)
USING orc
LOCATION '/Users/test/tpcds_scale5data/date_dim'"""

spark.sql(table).collect

val u = """select date_dim.d_date_id from date_dim limit 5"""

spark.sql(u).collect
```
Example 2

```
  val table = """CREATE TABLE `test_orc_data` (
  `_col1` INT,
  `_col2` STRING,
  `_col3` INT)
  USING orc"""

spark.sql(table).collect

spark.sql("insert into test_orc_data values(13, '155', 2020)").collect

val df = """select _col2 from test_orc_data limit 5"""
spark.sql(df).collect

```

Its Failing with below error
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 
(TID 2, 192.168.0.103, executor driver): 
java.lang.ArrayIndexOutOfBoundsException: 1
at 
org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initBatch(OrcColumnarBatchReader.java:156)
at 
org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$7(OrcFileFormat.scala:258)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:141)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:203)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
at 
org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:620)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown
 Source)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:343)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:895)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:895)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:372)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:336)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:133)
at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:445)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1489)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:448)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)`
```

The reason behind this initBatch is not getting the schema that is need

[spark] branch master updated (bdeb626 -> 6be8b93)

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from bdeb626  [SPARK-32272][SQL] Add  SQL standard command SET TIME ZONE
 add 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on 
selecting the orc tables

No new revisions were added by this update.

Summary of changes:
 .../execution/datasources/orc/OrcFileFormat.scala  | 10 +++---
 .../sql/execution/datasources/orc/OrcUtils.scala   | 41 ++
 .../v2/orc/OrcPartitionReaderFactory.scala | 21 +--
 .../spark/sql/hive/orc/HiveOrcQuerySuite.scala | 28 +++
 4 files changed, 78 insertions(+), 22 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new b745041  [SPARK-32234][SQL] Spark sql commands are failing on 
selecting the orc tables
b745041 is described below

commit b745041f698120be21ab889706880e976a599fdb
Author: SaurabhChawla 
AuthorDate: Thu Jul 16 13:11:47 2020 +

[SPARK-32234][SQL] Spark sql commands are failing on selecting the orc 
tables

### What changes were proposed in this pull request?
Spark sql commands are failing on selecting the orc tables
Steps to reproduce
Example 1 -
Prerequisite -  This is the location(/Users/test/tpcds_scale5data/date_dim) 
for orc data which is generated by the hive.
```
val table = """CREATE TABLE `date_dim` (
  `d_date_sk` INT,
  `d_date_id` STRING,
  `d_date` TIMESTAMP,
  `d_month_seq` INT,
  `d_week_seq` INT,
  `d_quarter_seq` INT,
  `d_year` INT,
  `d_dow` INT,
  `d_moy` INT,
  `d_dom` INT,
  `d_qoy` INT,
  `d_fy_year` INT,
  `d_fy_quarter_seq` INT,
  `d_fy_week_seq` INT,
  `d_day_name` STRING,
  `d_quarter_name` STRING,
  `d_holiday` STRING,
  `d_weekend` STRING,
  `d_following_holiday` STRING,
  `d_first_dom` INT,
  `d_last_dom` INT,
  `d_same_day_ly` INT,
  `d_same_day_lq` INT,
  `d_current_day` STRING,
  `d_current_week` STRING,
  `d_current_month` STRING,
  `d_current_quarter` STRING,
  `d_current_year` STRING)
USING orc
LOCATION '/Users/test/tpcds_scale5data/date_dim'"""

spark.sql(table).collect

val u = """select date_dim.d_date_id from date_dim limit 5"""

spark.sql(u).collect
```
Example 2

```
  val table = """CREATE TABLE `test_orc_data` (
  `_col1` INT,
  `_col2` STRING,
  `_col3` INT)
  USING orc"""

spark.sql(table).collect

spark.sql("insert into test_orc_data values(13, '155', 2020)").collect

val df = """select _col2 from test_orc_data limit 5"""
spark.sql(df).collect

```

Its Failing with below error
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 
(TID 2, 192.168.0.103, executor driver): 
java.lang.ArrayIndexOutOfBoundsException: 1
at 
org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initBatch(OrcColumnarBatchReader.java:156)
at 
org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$7(OrcFileFormat.scala:258)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:141)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:203)
at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
at 
org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:620)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown
 Source)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:343)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:895)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:895)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:372)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:336)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:133)
at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:445)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1489)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:448)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)`
```

The reason behind this initBatch is not getting the schema that is need

[spark] branch branch-3.0 updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new b745041  [SPARK-32234][SQL] Spark sql commands are failing on 
selecting the orc tables
b745041 is described below

commit b745041f698120be21ab889706880e976a599fdb
Author: SaurabhChawla 
AuthorDate: Thu Jul 16 13:11:47 2020 +

[SPARK-32234][SQL] Spark sql commands are failing on selecting the orc 
tables

### What changes were proposed in this pull request?
Spark sql commands are failing on selecting the orc tables
Steps to reproduce
Example 1 -
Prerequisite -  This is the location(/Users/test/tpcds_scale5data/date_dim) 
for orc data which is generated by the hive.
```
val table = """CREATE TABLE `date_dim` (
  `d_date_sk` INT,
  `d_date_id` STRING,
  `d_date` TIMESTAMP,
  `d_month_seq` INT,
  `d_week_seq` INT,
  `d_quarter_seq` INT,
  `d_year` INT,
  `d_dow` INT,
  `d_moy` INT,
  `d_dom` INT,
  `d_qoy` INT,
  `d_fy_year` INT,
  `d_fy_quarter_seq` INT,
  `d_fy_week_seq` INT,
  `d_day_name` STRING,
  `d_quarter_name` STRING,
  `d_holiday` STRING,
  `d_weekend` STRING,
  `d_following_holiday` STRING,
  `d_first_dom` INT,
  `d_last_dom` INT,
  `d_same_day_ly` INT,
  `d_same_day_lq` INT,
  `d_current_day` STRING,
  `d_current_week` STRING,
  `d_current_month` STRING,
  `d_current_quarter` STRING,
  `d_current_year` STRING)
USING orc
LOCATION '/Users/test/tpcds_scale5data/date_dim'"""

spark.sql(table).collect

val u = """select date_dim.d_date_id from date_dim limit 5"""

spark.sql(u).collect
```
Example 2

```
  val table = """CREATE TABLE `test_orc_data` (
  `_col1` INT,
  `_col2` STRING,
  `_col3` INT)
  USING orc"""

spark.sql(table).collect

spark.sql("insert into test_orc_data values(13, '155', 2020)").collect

val df = """select _col2 from test_orc_data limit 5"""
spark.sql(df).collect

```

It's failing with the error below:
```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, 192.168.0.103, executor driver):
java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initBatch(OrcColumnarBatchReader.java:156)
    at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$7(OrcFileFormat.scala:258)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:141)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:203)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:116)
    at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:620)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:343)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:895)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:895)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:372)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:336)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:133)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:445)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1489)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:448)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

The reason behind this is that `initBatch` does not get the schema that it needs.
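For context on why the reader blows up: ORC data written by Hive often stores the physical column names as `_col0`, `_col1`, ..., so looking up the table's logical column names in the file schema finds nothing and the batch reader ends up indexing past the columns it did resolve. Below is a rough, self-contained sketch of the remedy in plain Scala; it is illustrative only and independent of Spark's actual `OrcUtils.requestedColumnIds` implementation (the function name and shape are assumptions):

```scala
// Hypothetical sketch: map requested columns to file-column ordinals.
// `tableSchema` is what the metastore reports; `fileSchema` is what the
// ORC file actually contains. Hive-written files often carry synthetic
// `_col0`, `_col1`, ... names instead of the real column names.
def requestedColumnIds(
    requested: Seq[String],
    tableSchema: Seq[String],
    fileSchema: Seq[String]): Seq[Int] = {
  val byName = fileSchema.zipWithIndex.toMap
  // Detect the Hive case: every file column is named `_col<ordinal>`.
  val positional = fileSchema.zipWithIndex.forall {
    case (name, i) => name == s"_col$i"
  }
  requested.map { col =>
    if (positional) tableSchema.indexOf(col)  // fall back to position
    else byName.getOrElse(col, -1)            // normal name-based match
  }
}
```

With the `date_dim` example above, resolving `d_date_id` against a file schema of `_col0`, `_col1`, ... would return ordinal 1 (its position in the table schema) instead of failing the name lookup.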

[spark] branch branch-3.0 updated: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new b745041  [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables
b745041 is described below

commit b745041f698120be21ab889706880e976a599fdb
Author: SaurabhChawla 
AuthorDate: Thu Jul 16 13:11:47 2020 +

    [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables


[spark] branch master updated (6be8b93 -> c1f160e)

2020-07-16 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 6be8b93  [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables
 add c1f160e  [SPARK-30648][SQL] Support filters pushdown in JSON datasource

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/StructFilters.scala  | 166 +
 .../apache/spark/sql/catalyst/csv/CSVFilters.scala | 124 +++
 .../spark/sql/catalyst/json/JacksonParser.scala|  27 ++--
 .../spark/sql/catalyst/json/JsonFilters.scala  | 156 +++
 .../org/apache/spark/sql/internal/SQLConf.scala|   8 +
 ...FiltersSuite.scala => StructFiltersSuite.scala} |  38 ++---
 .../spark/sql/catalyst/csv/CSVFiltersSuite.scala   | 116 +-
 .../sql/catalyst/json/JacksonParserSuite.scala |  57 +++
 .../sql/catalyst/json/JsonFiltersSuite.scala}  |  14 +-
 sql/core/benchmarks/CSVBenchmark-jdk11-results.txt |  64 
 sql/core/benchmarks/CSVBenchmark-results.txt   |  64 
 .../benchmarks/JsonBenchmark-jdk11-results.txt |  94 ++--
 sql/core/benchmarks/JsonBenchmark-results.txt  |  94 ++--
 .../datasources/json/JsonFileFormat.scala  |   6 +-
 .../datasources/v2/csv/CSVScanBuilder.scala|   4 +-
 .../v2/json/JsonPartitionReaderFactory.scala   |  11 +-
 .../execution/datasources/v2/json/JsonScan.scala   |  12 +-
 .../datasources/v2/json/JsonScanBuilder.scala  |  26 +++-
 .../sql/execution/datasources/csv/CSVSuite.scala   |  30 
 .../execution/datasources/json/JsonBenchmark.scala |  45 +-
 .../sql/execution/datasources/json/JsonSuite.scala | 151 ++-
 21 files changed, 893 insertions(+), 414 deletions(-)
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/StructFilters.scala
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonFilters.scala
 copy sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/{csv/CSVFiltersSuite.scala => StructFiltersSuite.scala} (82%)
 create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/json/JacksonParserSuite.scala
 copy sql/catalyst/src/{main/scala/org/apache/spark/sql/connector/write/LogicalWriteInfoImpl.scala => test/scala/org/apache/spark/sql/catalyst/json/JsonFiltersSuite.scala} (71%)
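The change list above shows where the JSON filter pushdown lives but not how it works. As a rough, self-contained sketch of the general idea (the names `EqualTo` and `parseAndFilter` below are illustrative, not the actual `JsonFilters` API): pushed-down predicates are evaluated as soon as the fields they reference have been parsed, so a non-matching record can be skipped without parsing the rest of it.

```scala
// Illustrative only: check a pushed-down predicate as soon as its field
// is parsed, instead of materializing the whole record and filtering later.
final case class EqualTo(field: String, value: String)

def parseAndFilter(
    record: Map[String, String],   // stand-in for one JSON record
    requiredFields: Seq[String],
    pushed: Seq[EqualTo]): Option[Seq[String]] = {
  val parsed = scala.collection.mutable.Map.empty[String, String]
  for (field <- requiredFields) {
    val v = record.getOrElse(field, null)  // "parse" one field
    parsed(field) = v
    // Evaluate every predicate whose column just became available;
    // bail out early on failure, skipping the remaining parsing work.
    if (pushed.exists(p => p.field == field && p.value != v)) return None
  }
  Some(requiredFields.map(parsed))
}
```

The early `return None` is the whole point of pushdown here: for selective predicates, most records never pay the cost of parsing their remaining fields.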


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (c1f160e -> d5c672a)

2020-07-16 Thread huaxingao
This is an automated email from the ASF dual-hosted git repository.

huaxingao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from c1f160e  [SPARK-30648][SQL] Support filters pushdown in JSON datasource
 add d5c672a  [SPARK-32315][ML] Provide an explanation error message when calling require

No new revisions were added by this update.

Summary of changes:
 mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-16 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 9b6bea5  [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value
9b6bea5 is described below

commit 9b6bea5d5ab4bd3a69ca6eaf7811eb54f9562ee6
Author: Wenchen Fan 
AuthorDate: Thu Jul 16 11:09:27 2020 -0700

    [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

partially backport https://github.com/apache/spark/pull/29026

Closes #29125 from cloud-fan/backport.

Authored-by: Wenchen Fan 
Signed-off-by: Dongjoon Hyun 
---
 .../org/apache/spark/sql/catalyst/expressions/UnsafeRow.java   |  2 +-
 .../src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala   | 10 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
index 034894b..4dc5ce1 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
@@ -288,7 +288,7 @@ public final class UnsafeRow extends InternalRow implements Externalizable, Kryo
       Platform.putLong(baseObject, baseOffset + cursor, 0L);
       Platform.putLong(baseObject, baseOffset + cursor + 8, 0L);
 
-      if (value == null) {
+      if (value == null || !value.changePrecision(precision, value.scale())) {
         setNullAt(ordinal);
         // keep the offset for future update
         Platform.putLong(baseObject, getFieldOffset(ordinal), cursor << 32);
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
index a5f904c..9daa69c 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
@@ -178,4 +178,14 @@ class UnsafeRowSuite extends SparkFunSuite {
     // Makes sure hashCode on unsafe array won't crash
     unsafeRow.getArray(0).hashCode()
   }
+
+  test("SPARK-32018: setDecimal with overflowed value") {
+    val d1 = new Decimal().set(BigDecimal("1000")).toPrecision(38, 18)
+    val row = InternalRow.apply(d1)
+    val unsafeRow = UnsafeProjection.create(Array[DataType](DecimalType(38, 18))).apply(row)
+    assert(unsafeRow.getDecimal(0, 38, 18) === d1)
+    val d2 = (d1 * Decimal(10)).toPrecision(39, 18)
+    unsafeRow.setDecimal(0, d2, 38)
+    assert(unsafeRow.getDecimal(0, 38, 18) === null)
+  }
 }
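The one-line change in the diff makes `setDecimal` treat a value that no longer fits the column's declared precision the same as null, relying on `Decimal.changePrecision` returning `false` on overflow. The shape of that guard can be sketched with plain `BigDecimal`; this is an illustration only, not Spark's `Decimal` implementation (which also has a `Long`-backed fast path and mutates in place):

```scala
import scala.math.BigDecimal.RoundingMode

// Sketch of the guard: a value "fits" DecimalType(precision, scale) only
// if, after rounding to `scale` fractional digits, it needs at most
// `precision - scale` integral digits. Otherwise the caller stores null.
def fitsPrecision(value: BigDecimal, precision: Int, scale: Int): Boolean = {
  val rounded = value.setScale(scale, RoundingMode.HALF_UP)
  rounded.precision - rounded.scale <= precision - scale
}
```

For `DecimalType(38, 18)` the integral part may use at most 20 digits, which is why multiplying a maximal value by 10, as the test above does, must come back as null rather than a silently truncated number.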


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (d5c672a -> 383f5e9)

2020-07-16 Thread huaxingao
This is an automated email from the ASF dual-hosted git repository.

huaxingao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d5c672a  [SPARK-32315][ML] Provide an explanation error message when calling require
 add 383f5e9  [SPARK-32310][ML][PYSPARK] ML params default value parity in classification, regression, clustering and fpm

No new revisions were added by this update.

Summary of changes:
 .../spark/ml/classification/FMClassifier.scala | 10 
 .../apache/spark/ml/classification/LinearSVC.scala | 12 ++---
 .../ml/classification/LogisticRegression.scala | 14 ++
 .../spark/ml/classification/NaiveBayes.scala   |  4 +-
 .../spark/ml/clustering/BisectingKMeans.scala  |  7 +--
 .../spark/ml/clustering/GaussianMixture.scala  |  8 +--
 .../org/apache/spark/ml/clustering/KMeans.scala| 11 ++--
 .../scala/org/apache/spark/ml/clustering/LDA.scala | 11 ++--
 .../ml/clustering/PowerIterationClustering.scala   |  7 +--
 .../evaluation/BinaryClassificationEvaluator.scala |  4 +-
 .../MulticlassClassificationEvaluator.scala|  8 +--
 .../MultilabelClassificationEvaluator.scala|  6 +--
 .../spark/ml/evaluation/RankingEvaluator.scala |  6 +--
 .../spark/ml/evaluation/RegressionEvaluator.scala  |  4 +-
 .../scala/org/apache/spark/ml/fpm/FPGrowth.scala   |  5 +-
 .../ml/regression/AFTSurvivalRegression.scala  | 11 ++--
 .../spark/ml/regression/LinearRegression.scala | 15 ++
 python/pyspark/ml/classification.py| 58 ++
 python/pyspark/ml/clustering.py| 33 
 python/pyspark/ml/fpm.py   |  7 ++-
 python/pyspark/ml/regression.py| 57 +
 21 files changed, 141 insertions(+), 157 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-16 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 9b6bea5  [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null 
with overflowed value
9b6bea5 is described below

commit 9b6bea5d5ab4bd3a69ca6eaf7811eb54f9562ee6
Author: Wenchen Fan 
AuthorDate: Thu Jul 16 11:09:27 2020 -0700

[SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with 
overflowed value

partially backport https://github.com/apache/spark/pull/29026

Closes #29125 from cloud-fan/backport.

Authored-by: Wenchen Fan 
Signed-off-by: Dongjoon Hyun 
---
 .../org/apache/spark/sql/catalyst/expressions/UnsafeRow.java   |  2 +-
 .../src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala   | 10 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
index 034894b..4dc5ce1 100644
--- 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
+++ 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
@@ -288,7 +288,7 @@ public final class UnsafeRow extends InternalRow implements 
Externalizable, Kryo
   Platform.putLong(baseObject, baseOffset + cursor, 0L);
   Platform.putLong(baseObject, baseOffset + cursor + 8, 0L);
 
-  if (value == null) {
+  if (value == null || !value.changePrecision(precision, value.scale())) {
 setNullAt(ordinal);
 // keep the offset for future update
 Platform.putLong(baseObject, getFieldOffset(ordinal), cursor << 32);
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
index a5f904c..9daa69c 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/UnsafeRowSuite.scala
@@ -178,4 +178,14 @@ class UnsafeRowSuite extends SparkFunSuite {
 // Makes sure hashCode on unsafe array won't crash
 unsafeRow.getArray(0).hashCode()
   }
+
+  test("SPARK-32018: setDecimal with overflowed value") {
+val d1 = new Decimal().set(BigDecimal("1000")).toPrecision(38, 18)
+val row = InternalRow.apply(d1)
+val unsafeRow = UnsafeProjection.create(Array[DataType](DecimalType(38, 18))).apply(row)
+assert(unsafeRow.getDecimal(0, 38, 18) === d1)
+val d2 = (d1 * Decimal(10)).toPrecision(39, 18)
+unsafeRow.setDecimal(0, d2, 38)
+assert(unsafeRow.getDecimal(0, 38, 18) === null)
+  }
 }
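
The patched condition above keeps `setDecimal` from writing a decimal whose digits no longer fit the column's declared precision: when `changePrecision` fails, the slot is nulled out instead. A minimal sketch of that fits-or-null check in plain Java follows; the standalone `changePrecision` helper and the sample values are illustrative stand-ins for Spark's `Decimal` internals, not the actual implementation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalOverflowSketch {
    // Hypothetical stand-in for Spark's Decimal.changePrecision: returns false
    // when the value, rounded to `scale`, needs more significant digits than
    // `precision` allows — the case where the row slot must be set to null.
    static boolean changePrecision(BigDecimal value, int precision, int scale) {
        BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
        return rounded.precision() <= precision;
    }

    public static void main(String[] args) {
        BigDecimal fits = new BigDecimal("1000").setScale(18);   // 22 significant digits
        BigDecimal tooBig = new BigDecimal("1E24").setScale(18); // 43 significant digits
        // Mirrors the patched guard: value == null || !value.changePrecision(...)
        System.out.println(changePrecision(fits, 38, 18));   // fits in DecimalType(38, 18)
        System.out.println(changePrecision(tooBig, 38, 18)); // overflow: store null instead
    }
}
```

This mirrors what the new `UnsafeRowSuite` test asserts: a value within `DecimalType(38, 18)` round-trips intact, while an overflowed one reads back as null.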


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (d5c672a -> 383f5e9)

2020-07-16 Thread huaxingao
This is an automated email from the ASF dual-hosted git repository.

huaxingao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d5c672a  [SPARK-32315][ML] Provide an explanation error message when calling require
 add 383f5e9  [SPARK-32310][ML][PYSPARK] ML params default value parity in classification, regression, clustering and fpm

No new revisions were added by this update.

Summary of changes:
 .../spark/ml/classification/FMClassifier.scala | 10 
 .../apache/spark/ml/classification/LinearSVC.scala | 12 ++---
 .../ml/classification/LogisticRegression.scala | 14 ++
 .../spark/ml/classification/NaiveBayes.scala   |  4 +-
 .../spark/ml/clustering/BisectingKMeans.scala  |  7 +--
 .../spark/ml/clustering/GaussianMixture.scala  |  8 +--
 .../org/apache/spark/ml/clustering/KMeans.scala| 11 ++--
 .../scala/org/apache/spark/ml/clustering/LDA.scala | 11 ++--
 .../ml/clustering/PowerIterationClustering.scala   |  7 +--
 .../evaluation/BinaryClassificationEvaluator.scala |  4 +-
 .../MulticlassClassificationEvaluator.scala|  8 +--
 .../MultilabelClassificationEvaluator.scala|  6 +--
 .../spark/ml/evaluation/RankingEvaluator.scala |  6 +--
 .../spark/ml/evaluation/RegressionEvaluator.scala  |  4 +-
 .../scala/org/apache/spark/ml/fpm/FPGrowth.scala   |  5 +-
 .../ml/regression/AFTSurvivalRegression.scala  | 11 ++--
 .../spark/ml/regression/LinearRegression.scala | 15 ++
 python/pyspark/ml/classification.py| 58 ++
 python/pyspark/ml/clustering.py| 33 
 python/pyspark/ml/fpm.py   |  7 ++-
 python/pyspark/ml/regression.py| 57 +
 21 files changed, 141 insertions(+), 157 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (383f5e9 -> fb51925)

2020-07-16 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 383f5e9  [SPARK-32310][ML][PYSPARK] ML params default value parity in classification, regression, clustering and fpm
 add fb51925  [SPARK-32335][K8S][TESTS] Remove Python2 test from K8s IT

No new revisions were added by this update.

Summary of changes:
 .../deploy/k8s/integrationtest/PythonTestsSuite.scala | 19 ---
 1 file changed, 19 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (fb51925 -> 9747e8f)

2020-07-16 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from fb51925  [SPARK-32335][K8S][TESTS] Remove Python2 test from K8s IT
 add 9747e8f  [SPARK-31831][SQL][TESTS][FOLLOWUP] Put mocks for HiveSessionImplSuite in hive version related subdirectories

No new revisions were added by this update.

Summary of changes:
 sql/hive-thriftserver/pom.xml  | 12 ++
 .../hive/thriftserver/HiveSessionImplSuite.scala   | 29 +
 .../thriftserver/GetCatalogsOperationMock.scala| 50 ++
 .../thriftserver/GetCatalogsOperationMock.scala| 50 ++
 4 files changed, 113 insertions(+), 28 deletions(-)
 create mode 100644 sql/hive-thriftserver/v1.2/src/test/scala/org/apache/spark/sql/hive/thriftserver/GetCatalogsOperationMock.scala
 create mode 100644 sql/hive-thriftserver/v2.3/src/test/scala/org/apache/spark/sql/hive/thriftserver/GetCatalogsOperationMock.scala


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (9747e8f -> ea9e8f3)

2020-07-16 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 9747e8f  [SPARK-31831][SQL][TESTS][FOLLOWUP] Put mocks for HiveSessionImplSuite in hive version related subdirectories
 add ea9e8f3  [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

No new revisions were added by this update.

Summary of changes:
 LICENSE|2 +-
 dev/.rat-excludes  |4 +-
 dev/tox.ini|2 +-
 python/pyspark/cloudpickle.py  | 1362 
 python/pyspark/cloudpickle/__init__.py |7 +
 python/pyspark/cloudpickle/cloudpickle.py  |  830 +++
 python/pyspark/cloudpickle/cloudpickle_fast.py |  747 +
 python/pyspark/cloudpickle/compat.py   |   13 +
 python/setup.py|1 +
 9 files changed, 1602 insertions(+), 1366 deletions(-)
 delete mode 100644 python/pyspark/cloudpickle.py
 create mode 100644 python/pyspark/cloudpickle/__init__.py
 create mode 100644 python/pyspark/cloudpickle/cloudpickle.py
 create mode 100644 python/pyspark/cloudpickle/cloudpickle_fast.py
 create mode 100644 python/pyspark/cloudpickle/compat.py


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (ea9e8f3 -> efa70b8)

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ea9e8f3  [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0
 add efa70b8  [SPARK-32145][SQL][FOLLOWUP] Fix type in the error log of SparkOperation

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/hive/thriftserver/SparkOperation.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (efa70b8 -> ffdbbae)

2020-07-16 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from efa70b8  [SPARK-32145][SQL][FOLLOWUP] Fix type in the error log of SparkOperation
 add ffdbbae  [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/deploy/DeployMessage.scala|  7 +++
 .../org/apache/spark/deploy/master/Master.scala| 37 +
 .../spark/deploy/master/ui/MasterWebUI.scala   | 48 -
 .../org/apache/spark/internal/config/UI.scala  | 12 +
 .../apache/spark/deploy/master/MasterSuite.scala   | 61 +-
 .../spark/deploy/master/ui/MasterWebUISuite.scala  | 30 +--
 6 files changed, 190 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org


