[jira] [Commented] (SPARK-30043) Add built-in Array Functions: array_fill

2019-12-29 Thread jiaan.geng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005181#comment-17005181
 ] 

jiaan.geng commented on SPARK-30043:


OK.

> Add built-in Array Functions: array_fill
> 
>
> Key: SPARK-30043
> URL: https://issues.apache.org/jira/browse/SPARK-30043
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> ||Function||Return Type||Description||Example||Result||
> |{{array_fill(anyelement, int[] [, int[]])}}|{{anyarray}}|returns an array 
> initialized with the supplied value and dimensions, optionally with lower 
> bounds other than 1|{{array_fill(7, ARRAY[3], ARRAY[2])}}|{{[2:4]=\{7,7,7}}}|
> [https://www.postgresql.org/docs/11/functions-array.html]
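For reference, the documented PostgreSQL behaviour from the row above, as a
psql sketch (PostgreSQL semantics, not a Spark implementation):

{code:sql}
-- Fill a 3-element array with the value 7, with the lower bound
-- starting at index 2 instead of the default 1.
SELECT array_fill(7, ARRAY[3], ARRAY[2]);
-- Result: [2:4]={7,7,7}
{code}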






[jira] [Commented] (SPARK-30042) Add built-in Array Functions: array_dims

2019-12-29 Thread jiaan.geng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005180#comment-17005180
 ] 

jiaan.geng commented on SPARK-30042:


OK. I doubt its usefulness too.

> Add built-in Array Functions: array_dims
> 
>
> Key: SPARK-30042
> URL: https://issues.apache.org/jira/browse/SPARK-30042
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> ||Function||Return Type||Description||Example||Result||
> |{{array_dims(anyarray)}}|{{text}}|returns a text representation of the 
> array's dimensions|{{array_dims(ARRAY[[1,2,3], [4,5,6]])}}|{{[1:2][1:3]}}|
> [https://www.postgresql.org/docs/11/functions-array.html]
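As above, the documented PostgreSQL behaviour as a quick sketch (PostgreSQL
semantics, not Spark syntax):

{code:sql}
-- Report the dimensions of a 2x3 array; both dimensions are 1-based.
SELECT array_dims(ARRAY[[1,2,3], [4,5,6]]);
-- Result: [1:2][1:3]
{code}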






[jira] [Assigned] (SPARK-30370) Update SqlBase.g4 to combine namespace and database tokens.

2019-12-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-30370:
---

Assignee: Terry Kim

> Update SqlBase.g4 to combine namespace and database tokens.
> ---
>
> Key: SPARK-30370
> URL: https://issues.apache.org/jira/browse/SPARK-30370
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Terry Kim
>Assignee: Terry Kim
>Priority: Minor
>
> Instead of using `(database | NAMESPACE)` in the grammar, create 
> namespace : NAMESPACE | DATABASE | SCHEMA;
> and use it instead.
>  
>  






[jira] [Resolved] (SPARK-30370) Update SqlBase.g4 to combine namespace and database tokens.

2019-12-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-30370.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 27027
[https://github.com/apache/spark/pull/27027]

> Update SqlBase.g4 to combine namespace and database tokens.
> ---
>
> Key: SPARK-30370
> URL: https://issues.apache.org/jira/browse/SPARK-30370
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Terry Kim
>Assignee: Terry Kim
>Priority: Minor
> Fix For: 3.0.0
>
>
> Instead of using `(database | NAMESPACE)` in the grammar, create 
> namespace : NAMESPACE | DATABASE | SCHEMA;
> and use it instead.
>  
>  






[jira] [Resolved] (SPARK-30348) Flaky test: org.apache.spark.deploy.master.MasterSuite.SPARK-27510: Master should avoid dead loop while launching executor failed in Worker

2019-12-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-30348.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 27004
[https://github.com/apache/spark/pull/27004]

> Flaky test:  org.apache.spark.deploy.master.MasterSuite.SPARK-27510: Master 
> should avoid dead loop while launching executor failed in Worker
> 
>
> Key: SPARK-30348
> URL: https://issues.apache.org/jira/browse/SPARK-30348
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Major
> Fix For: 3.0.0
>
>
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115664/testReport/]
>  
> {code:java}
> org.apache.spark.deploy.master.MasterSuite.SPARK-27510: Master should avoid 
> dead loop while launching executor failed in Worker 
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to 
> eventually never returned normally. Attempted 656 times over 10.002408616 
> seconds. Last failure message: Map() did not contain key 
> "app-20191223154506-". 
> sbt.ForkMain$ForkError: 
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to 
> eventually never returned normally. Attempted 656 times over 10.002408616 
> seconds. Last failure message: Map() did not contain key 
> "app-20191223154506-".
>   at 
> org.scalatest.concurrent.Eventually.tryTryAgain$1(Eventually.scala:432)
>   at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:439)
>   at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:391)
>   at 
> org.apache.spark.deploy.master.MasterSuite.eventually(MasterSuite.scala:111)
>   at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:337)
>   at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:336)
>   at 
> org.apache.spark.deploy.master.MasterSuite.eventually(MasterSuite.scala:111)
>   at 
> org.apache.spark.deploy.master.MasterSuite.$anonfun$new$40(MasterSuite.scala:681)
>   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
>   at 
> org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
>   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:286)
>   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
>   at 
> org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
>   at 
> org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
>   at 
> org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
>   at 
> org.apache.spark.deploy.master.MasterSuite.org$scalatest$BeforeAndAfter$$super$runTest(MasterSuite.scala:111)
>   at org.scalatest.BeforeAndAfter.runTest(BeforeAndAfter.scala:203)
>   at org.scalatest.BeforeAndAfter.runTest$(BeforeAndAfter.scala:192)
>   at 
> org.apache.spark.deploy.master.MasterSuite.runTest(MasterSuite.scala:111)
>   at 
> org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
>   at 
> org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:393)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:381)
>   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:376)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:458)
>   at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
>   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>   at org.scalatest.Suite.run(Suite.scala:1124)
>   at org.scalatest.Suite.run$(Suite.scala:1106)
>   at 
> org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
>   at org.scalatest.SuperEngine.runImpl(Engine.scala:518)
>   at org.scalatest.FunSuiteLike.run(FunSuiteLike.scal

[jira] [Assigned] (SPARK-30348) Flaky test: org.apache.spark.deploy.master.MasterSuite.SPARK-27510: Master should avoid dead loop while launching executor failed in Worker

2019-12-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-30348:
---

Assignee: Jungtaek Lim

> Flaky test:  org.apache.spark.deploy.master.MasterSuite.SPARK-27510: Master 
> should avoid dead loop while launching executor failed in Worker
> 
>
> Key: SPARK-30348
> URL: https://issues.apache.org/jira/browse/SPARK-30348
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>Priority: Major
>
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115664/testReport/]
>  
> {code:java}
> org.apache.spark.deploy.master.MasterSuite.SPARK-27510: Master should avoid 
> dead loop while launching executor failed in Worker 
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to 
> eventually never returned normally. Attempted 656 times over 10.002408616 
> seconds. Last failure message: Map() did not contain key 
> "app-20191223154506-". 
> sbt.ForkMain$ForkError: 
> org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to 
> eventually never returned normally. Attempted 656 times over 10.002408616 
> seconds. Last failure message: Map() did not contain key 
> "app-20191223154506-".
>   at 
> org.scalatest.concurrent.Eventually.tryTryAgain$1(Eventually.scala:432)
>   at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:439)
>   at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:391)
>   at 
> org.apache.spark.deploy.master.MasterSuite.eventually(MasterSuite.scala:111)
>   at org.scalatest.concurrent.Eventually.eventually(Eventually.scala:337)
>   at org.scalatest.concurrent.Eventually.eventually$(Eventually.scala:336)
>   at 
> org.apache.spark.deploy.master.MasterSuite.eventually(MasterSuite.scala:111)
>   at 
> org.apache.spark.deploy.master.MasterSuite.$anonfun$new$40(MasterSuite.scala:681)
>   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
>   at 
> org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
>   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:286)
>   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
>   at 
> org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
>   at 
> org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
>   at 
> org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
>   at 
> org.apache.spark.deploy.master.MasterSuite.org$scalatest$BeforeAndAfter$$super$runTest(MasterSuite.scala:111)
>   at org.scalatest.BeforeAndAfter.runTest(BeforeAndAfter.scala:203)
>   at org.scalatest.BeforeAndAfter.runTest$(BeforeAndAfter.scala:192)
>   at 
> org.apache.spark.deploy.master.MasterSuite.runTest(MasterSuite.scala:111)
>   at 
> org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
>   at 
> org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:393)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:381)
>   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:376)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:458)
>   at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
>   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>   at org.scalatest.Suite.run(Suite.scala:1124)
>   at org.scalatest.Suite.run$(Suite.scala:1106)
>   at 
> org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
>   at org.scalatest.SuperEngine.runImpl(Engine.scala:518)
>   at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
>   at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
>   at 
> org.apache.spark.SparkFunSuite.org$scalatest

[jira] [Commented] (SPARK-29098) Test both ANSI mode and Spark mode

2019-12-29 Thread Aman Omer (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005145#comment-17005145
 ] 

Aman Omer commented on SPARK-29098:
---

I will work on this.

> Test both ANSI mode and Spark mode
> --
>
> Key: SPARK-29098
> URL: https://issues.apache.org/jira/browse/SPARK-29098
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gengliang Wang
>Priority: Major
>
> The PostgreSQL test case improves the test coverage of Spark SQL.
> There are SQL files that have different results with/without ANSI 
> flags (spark.sql.failOnIntegralTypeOverflow, spark.sql.parser.ansi.enabled, 
> etc.) enabled.
> We should run tests against these SQL files with both ANSI mode and Spark 
> mode.
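A minimal sketch of what running in both modes could look like, using the flag
names mentioned in the description (whether the harness toggles them via SET or
via SparkConf is an assumption here):

{code:sql}
-- Spark mode
SET spark.sql.parser.ansi.enabled=false;
SET spark.sql.failOnIntegralTypeOverflow=false;
-- ... run the SQL file ...

-- ANSI mode
SET spark.sql.parser.ansi.enabled=true;
SET spark.sql.failOnIntegralTypeOverflow=true;
-- ... run the same SQL file again ...
{code}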






[jira] [Issue Comment Deleted] (SPARK-27148) Support CURRENT_TIME and LOCALTIME when ANSI mode enabled

2019-12-29 Thread Aman Omer (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Omer updated SPARK-27148:
--
Comment: was deleted

(was: I will work on this.)

> Support CURRENT_TIME and LOCALTIME when ANSI mode enabled
> -
>
> Key: SPARK-27148
> URL: https://issues.apache.org/jira/browse/SPARK-27148
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Major
>
> CURRENT_TIME and LOCALTIME should be supported in the ANSI standard;
> {code:java}
> postgres=# select CURRENT_TIME;
>        timetz
> --------------------
>  16:45:43.398109+09
> (1 row)
> 
> postgres=# select LOCALTIME;
>       time
> ----------------
>  16:45:48.60969
> (1 row){code}
> Before this, we need to support TIME types (java.sql.Time).
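Until a TIME type exists, the closest Spark SQL gets today is extracting the
time-of-day from a timestamp; a sketch using functions that already exist:

{code:sql}
SELECT current_timestamp();                           -- full timestamp, supported today
SELECT date_format(current_timestamp(), 'HH:mm:ss');  -- time-of-day, but as a string, not a TIME value
{code}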






[jira] [Issue Comment Deleted] (SPARK-29098) Test both ANSI mode and Spark mode

2019-12-29 Thread Aman Omer (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Omer updated SPARK-29098:
--
Comment: was deleted

(was: I will work on this.)

> Test both ANSI mode and Spark mode
> --
>
> Key: SPARK-29098
> URL: https://issues.apache.org/jira/browse/SPARK-29098
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gengliang Wang
>Priority: Major
>
> The PostgreSQL test case improves the test coverage of Spark SQL.
> There are SQL files that have different results with/without ANSI 
> flags (spark.sql.failOnIntegralTypeOverflow, spark.sql.parser.ansi.enabled, 
> etc.) enabled.
> We should run tests against these SQL files with both ANSI mode and Spark 
> mode.






[jira] [Assigned] (SPARK-27348) HeartbeatReceiver doesn't remove lost executors from CoarseGrainedSchedulerBackend

2019-12-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-27348:
---

Assignee: wuyi

> HeartbeatReceiver doesn't remove lost executors from 
> CoarseGrainedSchedulerBackend
> --
>
> Key: SPARK-27348
> URL: https://issues.apache.org/jira/browse/SPARK-27348
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Shixiong Zhu
>Assignee: wuyi
>Priority: Major
> Fix For: 3.0.0
>
>
> When a heartbeat timeout happens in HeartbeatReceiver, it doesn't remove lost 
> executors from CoarseGrainedSchedulerBackend. When a connection of an 
> executor is not gracefully shut down, CoarseGrainedSchedulerBackend may not 
> receive a disconnect event. In this case, CoarseGrainedSchedulerBackend still 
> thinks a lost executor is still alive. CoarseGrainedSchedulerBackend may ask 
> TaskScheduler to run tasks on this lost executor. This task will never finish 
> and the job will hang forever.






[jira] [Resolved] (SPARK-27348) HeartbeatReceiver doesn't remove lost executors from CoarseGrainedSchedulerBackend

2019-12-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-27348.
-
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 26980
[https://github.com/apache/spark/pull/26980]

> HeartbeatReceiver doesn't remove lost executors from 
> CoarseGrainedSchedulerBackend
> --
>
> Key: SPARK-27348
> URL: https://issues.apache.org/jira/browse/SPARK-27348
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Shixiong Zhu
>Priority: Major
> Fix For: 3.0.0
>
>
> When a heartbeat timeout happens in HeartbeatReceiver, it doesn't remove lost 
> executors from CoarseGrainedSchedulerBackend. When a connection of an 
> executor is not gracefully shut down, CoarseGrainedSchedulerBackend may not 
> receive a disconnect event. In this case, CoarseGrainedSchedulerBackend still 
> thinks a lost executor is still alive. CoarseGrainedSchedulerBackend may ask 
> TaskScheduler to run tasks on this lost executor. This task will never finish 
> and the job will hang forever.






[jira] [Commented] (SPARK-29098) Test both ANSI mode and Spark mode

2019-12-29 Thread Aman Omer (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005136#comment-17005136
 ] 

Aman Omer commented on SPARK-29098:
---

I will work on this.

> Test both ANSI mode and Spark mode
> --
>
> Key: SPARK-29098
> URL: https://issues.apache.org/jira/browse/SPARK-29098
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gengliang Wang
>Priority: Major
>
> The PostgreSQL test case improves the test coverage of Spark SQL.
> There are SQL files that have different results with/without ANSI 
> flags (spark.sql.failOnIntegralTypeOverflow, spark.sql.parser.ansi.enabled, 
> etc.) enabled.
> We should run tests against these SQL files with both ANSI mode and Spark 
> mode.






[jira] [Commented] (SPARK-27148) Support CURRENT_TIME and LOCALTIME when ANSI mode enabled

2019-12-29 Thread Aman Omer (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005135#comment-17005135
 ] 

Aman Omer commented on SPARK-27148:
---

I will work on this.

> Support CURRENT_TIME and LOCALTIME when ANSI mode enabled
> -
>
> Key: SPARK-27148
> URL: https://issues.apache.org/jira/browse/SPARK-27148
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Major
>
> CURRENT_TIME and LOCALTIME should be supported in the ANSI standard;
> {code:java}
> postgres=# select CURRENT_TIME;
>        timetz
> --------------------
>  16:45:43.398109+09
> (1 row)
> 
> postgres=# select LOCALTIME;
>       time
> ----------------
>  16:45:48.60969
> (1 row){code}
> Before this, we need to support TIME types (java.sql.Time).






[jira] [Commented] (SPARK-28122) Binary String Functions: SHA functions

2019-12-29 Thread Aman Omer (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005131#comment-17005131
 ] 

Aman Omer commented on SPARK-28122:
---

I will work on this.

> Binary String Functions:  SHA functions
> ---
>
> Key: SPARK-28122
> URL: https://issues.apache.org/jira/browse/SPARK-28122
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Return Type||Description||Example||Result||
> |{{sha224(bytea)}}|{{bytea}}|SHA-224 
> hash|{{sha224('abc')}}|{{\x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7}}|
> |{{sha256(bytea)}}|{{bytea}}|SHA-256 
> hash|{{sha256('abc')}}|{{\xba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad}}|
> |{{sha384(bytea)}}|{{bytea}}|SHA-384 
> hash|{{sha384('abc')}}|{{\xcb00753f45a35e8bb5a03d699ac65007272c32ab0eded1631a8b605a43ff5bed8086072ba1e7cc2358baeca134c825a7}}|
> |{{sha512(bytea)}}|{{bytea}}|SHA-512 
> hash|{{sha512('abc')}}|{{\xddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f}}|
> More details: https://www.postgresql.org/docs/11/functions-binarystring.html
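Spark SQL already ships a close analogue, {{sha2(expr, bitLength)}} with bit
lengths 224/256/384/512 (plus {{sha1}}); the digest below matches the sha256 row
in the table:

{code:sql}
SELECT sha2('abc', 256);
-- ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
{code}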






[jira] [Commented] (SPARK-30377) Make Regressors extend abstract class Regressor

2019-12-29 Thread zhengruifeng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005099#comment-17005099
 ] 

zhengruifeng commented on SPARK-30377:
--

[~huaxingao] As to making OVR extend Classifier, there is a previous ticket 
about it: https://issues.apache.org/jira/browse/SPARK-8799

> Make Regressors extend abstract class Regressor
> ---
>
> Key: SPARK-30377
> URL: https://issues.apache.org/jira/browse/SPARK-30377
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 3.0.0
>Reporter: zhengruifeng
>Priority: Major
>
> Just found that  {{AFTSurvivalRegression}} , {{DecisionTreeRegressor}}, 
> {{FMRegressor}}, {{GBTRegressor}}, {{RandomForestRegressor}} directly extend 
> {{Predictor}}
>  
> Only {{GeneralizedLinearRegression}} and {{LinearRegression}} now extend 
> {{Regressor}}.
>  
>  






[jira] [Commented] (SPARK-30377) Make Regressors extend abstract class Regressor

2019-12-29 Thread zhengruifeng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005098#comment-17005098
 ] 

zhengruifeng commented on SPARK-30377:
--

There is no API difference now.

> Make Regressors extend abstract class Regressor
> ---
>
> Key: SPARK-30377
> URL: https://issues.apache.org/jira/browse/SPARK-30377
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 3.0.0
>Reporter: zhengruifeng
>Priority: Major
>
> Just found that  {{AFTSurvivalRegression}} , {{DecisionTreeRegressor}}, 
> {{FMRegressor}}, {{GBTRegressor}}, {{RandomForestRegressor}} directly extend 
> {{Predictor}}
>  
> Only {{GeneralizedLinearRegression}} and {{LinearRegression}} now extend 
> {{Regressor}}.
>  
>  






[jira] [Resolved] (SPARK-30376) Unify the computation of numFeatures

2019-12-29 Thread zhengruifeng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengruifeng resolved SPARK-30376.
--
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 27037
[https://github.com/apache/spark/pull/27037]

> Unify the computation of numFeatures
> 
>
> Key: SPARK-30376
> URL: https://issues.apache.org/jira/browse/SPARK-30376
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 3.0.0
>Reporter: zhengruifeng
>Assignee: zhengruifeng
>Priority: Trivial
> Fix For: 3.0.0
>
>
> Try to extract numFeatures from the metadata first; if it is not there, then 
> extract it from the dataset.






[jira] [Assigned] (SPARK-30376) Unify the computation of numFeatures

2019-12-29 Thread zhengruifeng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengruifeng reassigned SPARK-30376:


Assignee: zhengruifeng

> Unify the computation of numFeatures
> 
>
> Key: SPARK-30376
> URL: https://issues.apache.org/jira/browse/SPARK-30376
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 3.0.0
>Reporter: zhengruifeng
>Assignee: zhengruifeng
>Priority: Trivial
>
> Try to extract numFeatures from the metadata first; if it is not there, then 
> extract it from the dataset.






[jira] [Commented] (SPARK-30377) Make Regressors extend abstract class Regressor

2019-12-29 Thread Huaxin Gao (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005086#comment-17005086
 ] 

Huaxin Gao commented on SPARK-30377:


Seems there is no API difference, but it would be more logical to make every 
Regressor extend ml.Regressor.

I quickly checked the Classifiers. Everything extends Classifier except 
OneVsRest. I guess I will open a separate Jira to make OneVsRest extend 
Classifier.

> Make Regressors extend abstract class Regressor
> ---
>
> Key: SPARK-30377
> URL: https://issues.apache.org/jira/browse/SPARK-30377
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 3.0.0
>Reporter: zhengruifeng
>Priority: Major
>
> Just found that  {{AFTSurvivalRegression}} , {{DecisionTreeRegressor}}, 
> {{FMRegressor}}, {{GBTRegressor}}, {{RandomForestRegressor}} directly extend 
> {{Predictor}}
>  
> Only {{GeneralizedLinearRegression}} and {{LinearRegression}} now extend 
> {{Regressor}}.
>  
>  






[jira] [Updated] (SPARK-30082) Zeros are being treated as NaNs

2019-12-29 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-30082:

Labels: correctness  (was: )

> Zeros are being treated as NaNs
> ---
>
> Key: SPARK-30082
> URL: https://issues.apache.org/jira/browse/SPARK-30082
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: John Ayad
>Assignee: John Ayad
>Priority: Major
>  Labels: correctness
> Fix For: 2.4.5, 3.0.0
>
>
> If you attempt to run
> {code:java}
> df = df.replace(float('nan'), somethingToReplaceWith)
> {code}
> it will replace all {{0}}s in columns of type {{Integer}}.
> Example code snippet to repro this:
> {code:java}
> from pyspark.sql import SQLContext
> spark = SQLContext(sc).sparkSession
> df = spark.createDataFrame([(1, 0), (2, 3), (3, 0)], ("index", "value"))
> df.show()
> df = df.replace(float('nan'), 5)
> df.show()
> {code}
> Here's the output I get when I run this code:
> {code:java}
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /__ / .__/\_,_/_/ /_/\_\   version 2.4.4
>       /_/
> Using Python version 3.7.5 (default, Nov  1 2019 02:16:32)
> SparkSession available as 'spark'.
> >>> from pyspark.sql import SQLContext
> >>> spark = SQLContext(sc).sparkSession
> >>> df = spark.createDataFrame([(1, 0), (2, 3), (3, 0)], ("index", "value"))
> >>> df.show()
> +-----+-----+
> |index|value|
> +-----+-----+
> |    1|    0|
> |    2|    3|
> |    3|    0|
> +-----+-----+
> >>> df = df.replace(float('nan'), 5)
> >>> df.show()
> +-----+-----+
> |index|value|
> +-----+-----+
> |    1|    5|
> |    2|    3|
> |    3|    5|
> +-----+-----+
> >>>
> {code}






[jira] [Commented] (SPARK-30196) Bump lz4-java version to 1.7.0

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005071#comment-17005071
 ] 

Takeshi Yamamuro commented on SPARK-30196:
--

Thanks for the report and I've filed an issue in lz4-java: 
[https://github.com/lz4/lz4-java/issues/156]

> Bump lz4-java version to 1.7.0
> --
>
> Key: SPARK-30196
> URL: https://issues.apache.org/jira/browse/SPARK-30196
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Spark Core
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Assignee: Takeshi Yamamuro
>Priority: Major
> Fix For: 3.0.0
>
>







[jira] [Reopened] (SPARK-29390) Add the justify_days(), justify_hours() and justify_interval() functions

2019-12-29 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reopened SPARK-29390:
-

> Add  the justify_days(),  justify_hours() and  justify_interval() functions
> ---
>
> Key: SPARK-29390
> URL: https://issues.apache.org/jira/browse/SPARK-29390
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Kent Yao
>Priority: Major
> Fix For: 3.0.0
>
>
> See *Table 9.31. Date/Time Functions* 
> ([https://www.postgresql.org/docs/12/functions-datetime.html])
> ||Function||Return Type||Description||Example||Result||
> |{{justify_days(interval)}}|{{interval}}|Adjust interval so 30-day 
> time periods are represented as months|{{justify_days(interval '35 
> days')}}|{{1 mon 5 days}}|
> |{{justify_hours(interval)}}|{{interval}}|Adjust interval so 24-hour 
> time periods are represented as days|{{justify_hours(interval '27 
> hours')}}|{{1 day 03:00:00}}|
> |{{justify_interval(interval)}}|{{interval}}|Adjust interval using 
> {{justify_days}} and {{justify_hours}}, with additional sign 
> adjustments|{{justify_interval(interval '1 mon -1 hour')}}|{{29 days 
> 23:00:00}}|
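The documented PostgreSQL semantics, condensed from the table above
(PostgreSQL syntax, not Spark):

{code:sql}
SELECT justify_days(interval '35 days');            -- 1 mon 5 days
SELECT justify_hours(interval '27 hours');          -- 1 day 03:00:00
SELECT justify_interval(interval '1 mon -1 hour');  -- 29 days 23:00:00
{code}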






[jira] [Resolved] (SPARK-29390) Add the justify_days(), justify_hours() and justify_interval() functions

2019-12-29 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li resolved SPARK-29390.
-
Resolution: Later

> Add  the justify_days(),  justify_hours() and  justify_interval() functions
> ---
>
> Key: SPARK-29390
> URL: https://issues.apache.org/jira/browse/SPARK-29390
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Kent Yao
>Priority: Major
> Fix For: 3.0.0
>
>
> See *Table 9.31. Date/Time Functions* 
> ([https://www.postgresql.org/docs/12/functions-datetime.html])
> ||Function||Return Type||Description||Example||Result||
> |{{justify_days(interval)}}|{{interval}}|Adjust interval so 30-day 
> time periods are represented as months|{{justify_days(interval '35 
> days')}}|{{1 mon 5 days}}|
> |{{justify_hours(interval)}}|{{interval}}|Adjust interval so 24-hour 
> time periods are represented as days|{{justify_hours(interval '27 
> hours')}}|{{1 day 03:00:00}}|
> |{{justify_interval(interval)}}|{{interval}}|Adjust interval using 
> {{justify_days}} and {{justify_hours}}, with additional sign 
> adjustments|{{justify_interval(interval '1 mon -1 hour')}}|{{29 days 
> 23:00:00}}|






[jira] [Commented] (SPARK-30196) Bump lz4-java version to 1.7.0

2019-12-29 Thread Lars Francke (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005066#comment-17005066
 ] 

Lars Francke commented on SPARK-30196:
--

FYI: This seems to have broken Spark 3 on Mac OS for me due to
{code:java}
dyld: lazy symbol binding failed: Symbol not found: chkstk_darwin
  Referenced from: 
/private/var/folders/1v/ckh8py712_n_5r628_16w0l4gn/T/liblz4-java-820584040681098780.dylib
 (which was built for Mac OS X 10.15)
  Expected in: /usr/lib/libSystem.B.dylibdyld: Symbol not found: 
chkstk_darwin
  Referenced from: 
/private/var/folders/1v/ckh8py712_n_5r628_16w0l4gn/T/liblz4-java-820584040681098780.dylib
 (which was built for Mac OS X 10.15)
  Expected in: /usr/lib/libSystem.B.dylib
 {code}
 

I did a bit of googling but I'm not sure what's going on. Reverting to 1.6 
works for me. I'm on macOS 10.13. Any hints are appreciated.

If the lz4 stuff really only works with macOS 10.15, that'd be sad, but I can't 
really believe that.
Has anyone tried Spark 3 Preview 2 on a Mac with 10.15/10.14/10.13?

> Bump lz4-java version to 1.7.0
> --
>
> Key: SPARK-30196
> URL: https://issues.apache.org/jira/browse/SPARK-30196
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Spark Core
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Assignee: Takeshi Yamamuro
>Priority: Major
> Fix For: 3.0.0
>
>







[jira] [Resolved] (SPARK-30342) Update LIST JAR/FILE command

2019-12-29 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-30342.
--
Fix Version/s: 3.0.0
   Resolution: Fixed

Issue resolved by pull request 26996
[https://github.com/apache/spark/pull/26996]

> Update LIST JAR/FILE command
> 
>
> Key: SPARK-30342
> URL: https://issues.apache.org/jira/browse/SPARK-30342
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Rakesh Raushan
>Assignee: Rakesh Raushan
>Priority: Minor
> Fix For: 3.0.0
>
>
> The LIST FILE/JAR command is not documented properly.






[jira] [Assigned] (SPARK-30342) Update LIST JAR/FILE command

2019-12-29 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-30342:


Assignee: Rakesh Raushan

> Update LIST JAR/FILE command
> 
>
> Key: SPARK-30342
> URL: https://issues.apache.org/jira/browse/SPARK-30342
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Rakesh Raushan
>Assignee: Rakesh Raushan
>Priority: Minor
>
> The LIST FILE/JAR command is not documented properly.






[jira] [Resolved] (SPARK-29437) CSV Writer should escape 'escapechar' when it exists in the data

2019-12-29 Thread Ankit Raj Boudh (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Raj Boudh resolved SPARK-29437.
-
Resolution: Not A Problem

> CSV Writer should escape 'escapechar' when it exists in the data
> 
>
> Key: SPARK-29437
> URL: https://issues.apache.org/jira/browse/SPARK-29437
> Project: Spark
>  Issue Type: Bug
>  Components: Input/Output
>Affects Versions: 2.4.3
>Reporter: Tomasz Bartczak
>Priority: Trivial
>
> When the data contains the escape character (default '\'), it should either be 
> escaped or quoted.
> Steps to reproduce: 
> [https://gist.github.com/kretes/58f7f66a0780681a44c175a2ac3c0da2]
>  
> The effect can be either bad data being read or sometimes even an inability to 
> properly read the CSV, e.g. when the escape character is the last character in 
> a column it breaks the column reading for that row and effectively breaks 
> e.g. type inference for a dataframe.






[jira] [Commented] (SPARK-30130) Hardcoded numeric values in common table expressions which utilize GROUP BY are interpreted as ordinal positions

2019-12-29 Thread Ankit Raj Boudh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004824#comment-17004824
 ] 

Ankit Raj Boudh commented on SPARK-30130:
-

[~hyukjin.kwon], please confirm whether this requires a fix; if so, I will start 
working on this Jira.

> Hardcoded numeric values in common table expressions which utilize GROUP BY 
> are interpreted as ordinal positions
> 
>
> Key: SPARK-30130
> URL: https://issues.apache.org/jira/browse/SPARK-30130
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: Matt Boegner
>Priority: Minor
>
> Hardcoded numeric values in common table expressions which utilize GROUP BY 
> are interpreted as ordinal positions.
> {code:java}
> val df = spark.sql("""
>  with a as (select 0 as test, count(*) group by test)
>  select * from a
>  """)
>  df.show(){code}
> This results in an error message like {color:#e01e5a}GROUP BY position 0 is 
> not in select list (valid range is [1, 2]){color} .
>  
> However, this error does not appear in a traditional subselect format. For 
> example, this query executes correctly:
> {code:java}
> val df = spark.sql("""
>  select * from (select 0 as test, count(*) group by test) a
>  """)
>  df.show(){code}
>  
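If the literal really is being re-read as an ordinal once the CTE inlines
{{test}}, one workaround to try (untested here) is disabling ordinal resolution
through the existing {{spark.sql.groupByOrdinal}} config:

{code:sql}
SET spark.sql.groupByOrdinal=false;
with a as (select 0 as test, count(*) group by test)
select * from a;
{code}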






[jira] [Commented] (SPARK-30383) Remove meaningless tooltip from Executor Tab

2019-12-29 Thread Ankit Raj Boudh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004794#comment-17004794
 ] 

Ankit Raj Boudh commented on SPARK-30383:
-

I think not only the Executor tab but all pages need to be checked. I will check 
all the pages and submit a PR today.

> Remove meaningless tooltip from Executor Tab 
> --
>
> Key: SPARK-30383
> URL: https://issues.apache.org/jira/browse/SPARK-30383
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Minor
>
> There are tooltips displayed as-is, such as Disk Used and Total Tasks, in the 
> Executor table under the Executor tab.
> We should improve them and remove meaningless tooltips.






[jira] [Commented] (SPARK-30377) Make Regressors extend abstract class Regressor

2019-12-29 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004793#comment-17004793
 ] 

Sean R. Owen commented on SPARK-30377:
--

It seems logical; what would the API difference be?

> Make Regressors extend abstract class Regressor
> ---
>
> Key: SPARK-30377
> URL: https://issues.apache.org/jira/browse/SPARK-30377
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 3.0.0
>Reporter: zhengruifeng
>Priority: Major
>
> Just found that  {{AFTSurvivalRegression}} , {{DecisionTreeRegressor}}, 
> {{FMRegressor}}, {{GBTRegressor}}, {{RandomForestRegressor}} directly extend 
> {{Predictor}}
>  
> Only {{GeneralizedLinearRegression}} and {{LinearRegression now extend 
> Regressor.}} 
>  
>  






[jira] [Commented] (SPARK-30383) Remove meaningless tooltip from Executor Tab

2019-12-29 Thread Ankit Raj Boudh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004792#comment-17004792
 ] 

Ankit Raj Boudh commented on SPARK-30383:
-

I will submit the PR.

> Remove meaningless tooltip from Executor Tab 
> --
>
> Key: SPARK-30383
> URL: https://issues.apache.org/jira/browse/SPARK-30383
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Minor
>
> There are tooltips displayed as-is, such as Disk Used and Total Tasks, in the 
> Executor table under the Executor tab.
> We should improve them and remove meaningless tooltips.






[jira] [Created] (SPARK-30383) Remove meaningless tooltip from Executor Tab

2019-12-29 Thread ABHISHEK KUMAR GUPTA (Jira)
ABHISHEK KUMAR GUPTA created SPARK-30383:


 Summary: Remove meaningless tooltip from Executor Tab 
 Key: SPARK-30383
 URL: https://issues.apache.org/jira/browse/SPARK-30383
 Project: Spark
  Issue Type: Improvement
  Components: Web UI
Affects Versions: 3.0.0
Reporter: ABHISHEK KUMAR GUPTA


There are tooltips displayed as-is, such as Disk Used and Total Tasks, in the 
Executor table under the Executor tab.
We should improve them and remove meaningless tooltips.






[jira] [Commented] (SPARK-27764) Feature Parity between PostgreSQL and Spark

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004786#comment-17004786
 ] 

Takeshi Yamamuro commented on SPARK-27764:
--

I've created two new umbrella tickets, SPARK-30374 and SPARK-30375, for 
ANSI-related issues and implementation-dependent issues, and then moved some 
tickets there (see the description above for more details). If you have any 
problems, please let me know.

> Feature Parity between PostgreSQL and Spark
> ---
>
> Key: SPARK-27764
> URL: https://issues.apache.org/jira/browse/SPARK-27764
> Project: Spark
>  Issue Type: Umbrella
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> PostgreSQL is one of the most advanced open source databases. This umbrella 
> Jira is trying to track the missing features and bugs. 
> UPDATED: This umbrella ticket basically intends to include bug reports and 
> general issues for the feature parity. For implementation-dependent 
> behaviours and ANSI/SQL standard topics, you need to check the two umbrellas 
> below:
>  - SPARK-30374 Feature Parity between PostgreSQL and Spark (ANSI/SQL)
>  - SPARK-30375 Feature Parity between PostgreSQL and Spark 
> (implementation-dependent behaviours)






[jira] [Created] (SPARK-30382) start-thriftserver throws ClassNotFoundException

2019-12-29 Thread Ajith S (Jira)
Ajith S created SPARK-30382:
---

 Summary: start-thriftserver throws ClassNotFoundException
 Key: SPARK-30382
 URL: https://issues.apache.org/jira/browse/SPARK-30382
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.0.0
Reporter: Ajith S


start-thriftserver.sh --help throws

{code}
.
 

Thrift server options:
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/logging/log4j/spi/LoggerContextFactory
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:167)
at 
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:82)
at 
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
Caused by: java.lang.ClassNotFoundException: 
org.apache.logging.log4j.spi.LoggerContextFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 3 more


{code}






[jira] [Updated] (SPARK-28429) SQL Datetime util function being casted to double instead of timestamp

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28429:
-
Parent Issue: SPARK-27764  (was: SPARK-30375)

> SQL Datetime util function being casted to double instead of timestamp
> --
>
> Key: SPARK-28429
> URL: https://issues.apache.org/jira/browse/SPARK-28429
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> In the code below, the '100 days' string in now() + '100 days' is cast to 
> double and then an error is thrown:
> {code:sql}
> CREATE TEMP VIEW v_window AS
> SELECT i, min(i) over (order by i range between '1 day' preceding and '10 
> days' following) as min_i
> FROM range(now(), now()+'100 days', '1 hour') i;
> {code}
> Error:
> {code:sql}
> cannot resolve '(current_timestamp() + CAST('100 days' AS DOUBLE))' due to 
> data type mismatch: differing      types in '(current_timestamp() + CAST('100 
> days' AS DOUBLE))' (timestamp and double).;{code}
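Assuming the intent is plain timestamp-plus-interval arithmetic, an explicit
interval literal avoids the string-to-double cast (a sketch of the workaround,
not a fix for the range() form above):

{code:sql}
SELECT current_timestamp() + interval 100 days;
{code}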






[jira] [Updated] (SPARK-29659) Support COMMENT ON syntax

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29659:
-
Parent Issue: SPARK-27764  (was: SPARK-30375)

> Support COMMENT ON syntax
> -
>
> Key: SPARK-29659
> URL: https://issues.apache.org/jira/browse/SPARK-29659
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Major
>
> [https://www.postgresql.org/docs/current/sql-comment.html]






[jira] [Updated] (SPARK-27987) Support POSIX Regular Expressions

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-27987:
-
Parent Issue: SPARK-27764  (was: SPARK-30375)

> Support POSIX Regular Expressions
> -
>
> Key: SPARK-27987
> URL: https://issues.apache.org/jira/browse/SPARK-27987
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> POSIX regular expressions provide a more powerful means for pattern matching 
> than the LIKE and SIMILAR TO operators. Many Unix tools such as egrep, sed, 
> or awk use a pattern matching language that is similar to the one described 
> here.
> ||Operator||Description||Example||
> |{{~}}|Matches regular expression, case sensitive|{{'thomas' ~ '.*thomas.*'}}|
> |{{~*}}|Matches regular expression, case insensitive|{{'thomas' ~* 
> '.*Thomas.*'}}|
> |{{!~}}|Does not match regular expression, case sensitive|{{'thomas' !~ 
> '.*Thomas.*'}}|
> |{{!~*}}|Does not match regular expression, case insensitive|{{'thomas' !~* 
> '.*vadim.*'}}|
> https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP
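Spark SQL already covers the case-sensitive {{~}} case through {{RLIKE}}; the
other three operators have no direct operator form. A sketch of the existing
analogue:

{code:sql}
SELECT 'thomas' RLIKE '.*thomas.*';  -- true, case-sensitive, like ~
SELECT 'thomas' RLIKE '.*Thomas.*';  -- false; there is no built-in ~* operator
{code}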






[jira] [Updated] (SPARK-27764) Feature Parity between PostgreSQL and Spark

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-27764:
-
Description: 
PostgreSQL is one of the most advanced open source databases. This umbrella 
Jira is trying to track the missing features and bugs. 

UPDATED: This umbrella ticket basically intends to include bug reports and 
general issues for the feature parity. For implementation-dependent behaviours 
and ANSI/SQL standard topics, you need to check the two umbrellas below:
 - SPARK-30374 Feature Parity between PostgreSQL and Spark (ANSI/SQL)
 - SPARK-30375 Feature Parity between PostgreSQL and Spark 
(implementation-dependent behaviours)

  was:
PostgreSQL is one of the most advanced open source databases. This umbrella 
Jira is trying to track the missing features and bugs. 

UPDATE: This umbrella ticket basically intends to include bug reports and 
general issues for the feature parity. For implementation-dependent behaviours 
and ANSI/SQL standard topics, you need to check the two umbrellas below:
 - SPARK-30374 Feature Parity between PostgreSQL and Spark (ANSI/SQL)
 - SPARK-30375 Feature Parity between PostgreSQL and Spark 
(implementation-dependent behaviours)


> Feature Parity between PostgreSQL and Spark
> ---
>
> Key: SPARK-27764
> URL: https://issues.apache.org/jira/browse/SPARK-27764
> Project: Spark
>  Issue Type: Umbrella
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Priority: Major
>
> PostgreSQL is one of the most advanced open source databases. This umbrella 
> Jira is trying to track the missing features and bugs. 
> UPDATED: This umbrella ticket basically intends to include bug reports and 
> general issues for the feature parity. For implementation-dependent 
> behaviours and ANSI/SQL standard topics, you need to check the two umbrellas 
> below:
>  - SPARK-30374 Feature Parity between PostgreSQL and Spark (ANSI/SQL)
>  - SPARK-30375 Feature Parity between PostgreSQL and Spark 
> (implementation-dependent behaviours)






[jira] [Resolved] (SPARK-28402) Array indexing is 1-based

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28402.
--
Resolution: Won't Fix

> Array indexing is 1-based
> -
>
> Key: SPARK-28402
> URL: https://issues.apache.org/jira/browse/SPARK-28402
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Major
>
> Array indexing is 1-based in PostgreSQL: 
> [https://www.postgresql.org/docs/12/arrays.html]
>  
> {quote}The array subscript numbers are written within square brackets. By 
> default PostgreSQL uses a one-based numbering convention for arrays, that is, 
> an array of _n_ elements starts with {{array[1]}} and ends with 
> {{array[n]}}.{quote}
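For comparison, Spark already uses 1-based indexing in {{element_at}}, while
the {{[]}} subscript on arrays stays 0-based, which is part of why changing the
convention is hard; a sketch:

{code:sql}
SELECT element_at(array(10, 20, 30), 1);  -- 10: element_at is 1-based
SELECT array(10, 20, 30)[0];              -- 10: the subscript operator is 0-based
{code}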






[jira] [Resolved] (SPARK-28451) substr returns different values

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28451.
--
Resolution: Won't Fix

> substr returns different values
> ---
>
> Key: SPARK-28451
> URL: https://issues.apache.org/jira/browse/SPARK-28451
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> PostgreSQL:
> {noformat}
> postgres=# select substr('1234567890', -1, 5);
>  substr
> --------
>  123
> (1 row)
> postgres=# select substr('1234567890', 1, -1);
> ERROR:  negative substring length not allowed
> {noformat}
> Spark SQL:
> {noformat}
> spark-sql> select substr('1234567890', -1, 5);
> 0
> spark-sql> select substr('1234567890', 1, -1);
> spark-sql>
> {noformat}






[jira] [Commented] (SPARK-28451) substr returns different values

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004771#comment-17004771
 ] 

Takeshi Yamamuro commented on SPARK-28451:
--

I'll close this for now based on the discussion above. Thanks, all.

> substr returns different values
> ---
>
> Key: SPARK-28451
> URL: https://issues.apache.org/jira/browse/SPARK-28451
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> PostgreSQL:
> {noformat}
> postgres=# select substr('1234567890', -1, 5);
>  substr
> --------
>  123
> (1 row)
> postgres=# select substr('1234567890', 1, -1);
> ERROR:  negative substring length not allowed
> {noformat}
> Spark SQL:
> {noformat}
> spark-sql> select substr('1234567890', -1, 5);
> 0
> spark-sql> select substr('1234567890', 1, -1);
> spark-sql>
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28296) Improved VALUES support

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28296:
-
Parent Issue: SPARK-27764  (was: SPARK-30375)

> Improved VALUES support
> ---
>
> Key: SPARK-28296
> URL: https://issues.apache.org/jira/browse/SPARK-28296
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Peter Toth
>Priority: Major
>
> These are valid queries in PostgreSQL, but they don't work in Spark SQL:
> {noformat}
> values ((select 1));
> values ((select c from test1));
> select (values(c)) from test10;
> with cte(foo) as ( values(42) ) values((select foo from cte));
> {noformat}
> where test1 and test10:
> {noformat}
> CREATE TABLE test1 (c INTEGER);
> INSERT INTO test1 VALUES(1);
> CREATE TABLE test10 (c INTEGER);
> INSERT INTO test10 SELECT generate_series(1, 10);
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27978) Add built-in Aggregate Functions: string_agg

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004762#comment-17004762
 ] 

Takeshi Yamamuro commented on SPARK-27978:
--

I'll close this for now because I think the workaround above is enough. If 
necessary, please reopen this.
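
A minimal sketch of that workaround (the inline VALUES are only illustrative):

{code:sql}
-- string_agg(col, ',') in PostgreSQL ~ concat_ws(',', collect_list(col)) in Spark
SELECT concat_ws(',', collect_list(col))
FROM VALUES ('a'), ('b'), ('c') AS t(col);
-- a,b,c  (note: collect_list does not guarantee element order)
{code}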

> Add built-in Aggregate Functions: string_agg
> 
>
> Key: SPARK-27978
> URL: https://issues.apache.org/jira/browse/SPARK-27978
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Argument Type(s)||Return Type||Partial Mode||Description||
> |string_agg(_{{expression}}_,_{{delimiter}}_)|({{text}}, {{text}}) or 
> ({{bytea}}, {{bytea}})|same as argument types|No|input values concatenated 
> into a string, separated by delimiter|
> https://www.postgresql.org/docs/current/functions-aggregate.html
> We can workaround it by concat_ws(_{{delimiter}}_, 
> collect_list(_{{expression}}_)) currently.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27978) Add built-in Aggregate Functions: string_agg

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-27978.
--
Resolution: Won't Fix

> Add built-in Aggregate Functions: string_agg
> 
>
> Key: SPARK-27978
> URL: https://issues.apache.org/jira/browse/SPARK-27978
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Argument Type(s)||Return Type||Partial Mode||Description||
> |string_agg(_{{expression}}_,_{{delimiter}}_)|({{text}}, {{text}}) or 
> ({{bytea}}, {{bytea}})|same as argument types|No|input values concatenated 
> into a string, separated by delimiter|
> https://www.postgresql.org/docs/current/functions-aggregate.html
> We can workaround it by concat_ws(_{{delimiter}}_, 
> collect_list(_{{expression}}_)) currently.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29891) Add built-in Array Functions: array_length

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004760#comment-17004760
 ] 

Takeshi Yamamuro commented on SPARK-29891:
--

I'll close this for now because our existing size function is enough for this 
use case. If necessary, please reopen this.
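
For the common one-dimensional case, Spark's existing size function already 
returns the element count; a quick illustration:

{code:sql}
SELECT size(array(1, 2, 3));                         -- 3
-- Multi-dimensional arrays are just nested arrays in Spark, so the
-- "requested dimension" argument of array_length has no direct analogue;
-- size on a nested array counts only the outer dimension.
SELECT size(array(array(1, 2, 3), array(4, 5, 6)));  -- 2
{code}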

> Add built-in Array Functions: array_length
> --
>
> Key: SPARK-29891
> URL: https://issues.apache.org/jira/browse/SPARK-29891
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_length}}{{(}}{{anyarray}}{{, }}{{int}}{{)}}|{{int}}|returns the 
> length of the requested array dimension|{{array_length(array[1,2,3], 
> 1)}}|{{3}}|
> | | | | | |
> Other DBs:
> [https://phoenix.apache.org/language/functions.html#array_length]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29891) Add built-in Array Functions: array_length

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-29891.
--
Resolution: Won't Fix

> Add built-in Array Functions: array_length
> --
>
> Key: SPARK-29891
> URL: https://issues.apache.org/jira/browse/SPARK-29891
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_length}}{{(}}{{anyarray}}{{, }}{{int}}{{)}}|{{int}}|returns the 
> length of the requested array dimension|{{array_length(array[1,2,3], 
> 1)}}|{{3}}|
> | | | | | |
> Other DBs:
> [https://phoenix.apache.org/language/functions.html#array_length]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29984) Add built-in Array Functions: array_ndims

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004759#comment-17004759
 ] 

Takeshi Yamamuro commented on SPARK-29984:
--

I'll close this for now because I cannot find a strong reason to support this. 
If necessary, please reopen this.

> Add built-in Array Functions: array_ndims
> -
>
> Key: SPARK-29984
> URL: https://issues.apache.org/jira/browse/SPARK-29984
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_ndims}}{{(}}{{anyarray}}{{)}}|{{int}}|returns the number of 
> dimensions of the array|{{array_ndims(ARRAY[[1,2,3], [4,5,6]])}}|{{2}}|
> [https://www.postgresql.org/docs/11/functions-array.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29984) Add built-in Array Functions: array_ndims

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-29984.
--
Resolution: Won't Fix

> Add built-in Array Functions: array_ndims
> -
>
> Key: SPARK-29984
> URL: https://issues.apache.org/jira/browse/SPARK-29984
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_ndims}}{{(}}{{anyarray}}{{)}}|{{int}}|returns the number of 
> dimensions of the array|{{array_ndims(ARRAY[[1,2,3], [4,5,6]])}}|{{2}}|
> [https://www.postgresql.org/docs/11/functions-array.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28037) Add built-in String Functions: quote_literal

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004756#comment-17004756
 ] 

Takeshi Yamamuro commented on SPARK-28037:
--

I'll close this for now because I'm not sure this feature is useful for Spark. 
If necessary, please reopen this.

> Add built-in String Functions: quote_literal
> 
>
> Key: SPARK-28037
> URL: https://issues.apache.org/jira/browse/SPARK-28037
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Return Type||Description||Example||Result||
> |{{quote_literal(_{{string}}_ }}{{text}}{{)}}|{{text}}|Return the given 
> string suitably quoted to be used as a string literal in an SQL statement 
> string. Embedded single-quotes and backslashes are properly doubled. Note 
> that {{quote_literal}} returns null on null input; if the argument might be 
> null, {{quote_nullable}} is often more suitable. See also [Example 
> 43.1|https://www.postgresql.org/docs/11/plpgsql-statements.html#PLPGSQL-QUOTE-LITERAL-EXAMPLE].|{{quote_literal(E'O\'Reilly')}}|{{'O''Reilly'}}|
> |{{quote_literal(_{{value}}_ }}{{anyelement}}{{)}}|{{text}}|Coerce the given 
> value to text and then quote it as a literal. Embedded single-quotes and 
> backslashes are properly doubled.|{{quote_literal(42.5)}}|{{'42.5'}}|
> https://www.postgresql.org/docs/11/functions-string.html
> https://docs.aws.amazon.com/redshift/latest/dg/r_QUOTE_LITERAL.html
> https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/String/QUOTE_LITERAL.htm?tocpath=SQL%20Reference%20Manual%7CSQL%20Functions%7CString%20Functions%7C_38



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28037) Add built-in String Functions: quote_literal

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28037.
--
Resolution: Won't Fix

> Add built-in String Functions: quote_literal
> 
>
> Key: SPARK-28037
> URL: https://issues.apache.org/jira/browse/SPARK-28037
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Function||Return Type||Description||Example||Result||
> |{{quote_literal(_{{string}}_ }}{{text}}{{)}}|{{text}}|Return the given 
> string suitably quoted to be used as a string literal in an SQL statement 
> string. Embedded single-quotes and backslashes are properly doubled. Note 
> that {{quote_literal}} returns null on null input; if the argument might be 
> null, {{quote_nullable}} is often more suitable. See also [Example 
> 43.1|https://www.postgresql.org/docs/11/plpgsql-statements.html#PLPGSQL-QUOTE-LITERAL-EXAMPLE].|{{quote_literal(E'O\'Reilly')}}|{{'O''Reilly'}}|
> |{{quote_literal(_{{value}}_ }}{{anyelement}}{{)}}|{{text}}|Coerce the given 
> value to text and then quote it as a literal. Embedded single-quotes and 
> backslashes are properly doubled.|{{quote_literal(42.5)}}|{{'42.5'}}|
> https://www.postgresql.org/docs/11/functions-string.html
> https://docs.aws.amazon.com/redshift/latest/dg/r_QUOTE_LITERAL.html
> https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/String/QUOTE_LITERAL.htm?tocpath=SQL%20Reference%20Manual%7CSQL%20Functions%7CString%20Functions%7C_38



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30043) Add built-in Array Functions: array_fill

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-30043.
--
Resolution: Won't Fix

> Add built-in Array Functions: array_fill
> 
>
> Key: SPARK-30043
> URL: https://issues.apache.org/jira/browse/SPARK-30043
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_fill}}{{(}}{{anyelement}}{{, }}{{int[]}}{{ [, 
> {{int[]}}])}}|{{anyarray}}|returns an array initialized with supplied value 
> and dimensions, optionally with lower bounds other than 1|{{array_fill(7, 
> ARRAY[3], ARRAY[2])}}|{{[2:4]=\{7,7,7}}}|
> [https://www.postgresql.org/docs/11/functions-array.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30043) Add built-in Array Functions: array_fill

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004755#comment-17004755
 ] 

Takeshi Yamamuro commented on SPARK-30043:
--

I'll close this for now because I cannot find a strong reason to support this. 
If necessary, please reopen this. Thanks.

> Add built-in Array Functions: array_fill
> 
>
> Key: SPARK-30043
> URL: https://issues.apache.org/jira/browse/SPARK-30043
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_fill}}{{(}}{{anyelement}}{{, }}{{int[]}}{{ [, 
> {{int[]}}])}}|{{anyarray}}|returns an array initialized with supplied value 
> and dimensions, optionally with lower bounds other than 1|{{array_fill(7, 
> ARRAY[3], ARRAY[2])}}|{{[2:4]=\{7,7,7}}}|
> [https://www.postgresql.org/docs/11/functions-array.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30182) Support nested aggregates

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-30182:
-
Parent Issue: SPARK-27764  (was: SPARK-30375)

> Support nested aggregates
> -
>
> Key: SPARK-30182
> URL: https://issues.apache.org/jira/browse/SPARK-30182
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> Spark SQL cannot support a SQL query with nested aggregates like the one below:
> {code:java}
> SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
>  sum(salary) FILTER (WHERE enroll_date > '2007-01-01')
> ) FILTER (WHERE depname <> 'sales') OVER (ORDER BY depname DESC) AS 
> "filtered_sum",
>  depname
> FROM empsalary GROUP BY depname;{code}
> And Spark will throw an exception as follows:
> {code:java}
> org.apache.spark.sql.AnalysisException
> It is not allowed to use an aggregate function in the argument of another 
> aggregate function. Please use the inner aggregate function in a 
> sub-query.{code}
> But PostgreSQL supports this syntax.
> {code:java}
> SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
>  sum(salary) FILTER (WHERE enroll_date > '2007-01-01')
> ) FILTER (WHERE depname <> 'sales') OVER (ORDER BY depname DESC) AS 
> "filtered_sum",
>  depname
> FROM empsalary GROUP BY depname;
>    sum | row_number | filtered_sum |  depname 
> -------+------------+--------------+-----------
>  25100 |          1 |        22600 | develop
>   7400 |          2 |         3500 | personnel
>  14600 |          3 |              | sales
> (3 rows){code}
>  
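
As the error message suggests, the nested aggregate can be rewritten with a 
subquery in Spark. A minimal sketch against the same empsalary table, 
simplified to a single level of nesting:

{code:sql}
-- Fails in Spark: SELECT max(sum(salary)) FROM empsalary GROUP BY depname;
-- Works: compute the inner aggregate in a subquery first.
SELECT max(dep_sum)
FROM (SELECT sum(salary) AS dep_sum FROM empsalary GROUP BY depname) t;
{code}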



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28036) Support negative length at LEFT/RIGHT SQL functions

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28036.
--
Resolution: Won't Fix

> Support negative length at LEFT/RIGHT SQL functions
> ---
>
> Key: SPARK-28036
> URL: https://issues.apache.org/jira/browse/SPARK-28036
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> PostgreSQL:
> {code:sql}
> postgres=# select left('ahoj', -2), right('ahoj', -2);
>  left | right 
> ------+-------
>  ah   | oj
> (1 row)
> {code}
> Spark SQL:
> {code:sql}
> spark-sql> select left('ahoj', -2), right('ahoj', -2);
> spark-sql>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28036) Support negative length at LEFT/RIGHT SQL functions

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004748#comment-17004748
 ] 

Takeshi Yamamuro commented on SPARK-28036:
--

I'll close this for now because I cannot find a strong reason to support this. 
If necessary, please reopen this.
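
For reference, the PostgreSQL semantics ("drop n characters from the end/start") 
can already be expressed with substring in Spark; a hedged sketch:

{code:sql}
-- left('ahoj', -2) in PostgreSQL: everything but the last 2 characters
SELECT substring('ahoj', 1, length('ahoj') - 2);  -- ah
-- right('ahoj', -2) in PostgreSQL: everything but the first 2 characters
SELECT substring('ahoj', 2 + 1);                  -- oj
{code}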

> Support negative length at LEFT/RIGHT SQL functions
> ---
>
> Key: SPARK-28036
> URL: https://issues.apache.org/jira/browse/SPARK-28036
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> PostgreSQL:
> {code:sql}
> postgres=# select left('ahoj', -2), right('ahoj', -2);
>  left | right 
> ------+-------
>  ah   | oj
> (1 row)
> {code}
> Spark SQL:
> {code:sql}
> spark-sql> select left('ahoj', -2), right('ahoj', -2);
> spark-sql>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30042) Add built-in Array Functions: array_dims

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004747#comment-17004747
 ] 

Takeshi Yamamuro commented on SPARK-30042:
--

I'll close this for now based on the discussion above. If necessary, please 
reopen this.

> Add built-in Array Functions: array_dims
> 
>
> Key: SPARK-30042
> URL: https://issues.apache.org/jira/browse/SPARK-30042
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_dims}}{{(}}{{anyarray}}{{)}}|{{text}}|returns a text representation 
> of array's dimensions|{{array_dims(ARRAY[[1,2,3], [4,5,6]])}}|{{[1:2][1:3]}}|
> [https://www.postgresql.org/docs/11/functions-array.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30042) Add built-in Array Functions: array_dims

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-30042.
--
Resolution: Won't Fix

> Add built-in Array Functions: array_dims
> 
>
> Key: SPARK-30042
> URL: https://issues.apache.org/jira/browse/SPARK-30042
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: jiaan.geng
>Priority: Major
>
> |{{array_dims}}{{(}}{{anyarray}}{{)}}|{{text}}|returns a text representation 
> of array's dimensions|{{array_dims(ARRAY[[1,2,3], [4,5,6]])}}|{{[1:2][1:3]}}|
> [https://www.postgresql.org/docs/11/functions-array.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28516) Data Type Formatting Functions: `to_char`

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28516.
--
Resolution: Won't Fix

> Data Type Formatting Functions: `to_char`
> -
>
> Key: SPARK-28516
> URL: https://issues.apache.org/jira/browse/SPARK-28516
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark does not have support for `to_char`. PgSQL, however, 
> [does|https://www.postgresql.org/docs/12/functions-formatting.html]:
> Query example: 
> {code:sql}
> SELECT to_char(SUM(n) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 
> FOLLOWING),'9D9')
> {code}
> ||Function||Return Type||Description||Example||
> |{{to_char(}}{{timestamp}}{{, }}{{text}}{{)}}|{{text}}|convert time stamp to 
> string|{{to_char(current_timestamp, 'HH12:MI:SS')}}|
> |{{to_char(}}{{interval}}{{, }}{{text}}{{)}}|{{text}}|convert interval to 
> string|{{to_char(interval '15h 2m 12s', 'HH24:MI:SS')}}|
> |{{to_char(}}{{int}}{{, }}{{text}}{{)}}|{{text}}|convert integer to 
> string|{{to_char(125, '999')}}|
> |{{to_char}}{{(}}{{double precision}}{{, }}{{text}}{{)}}|{{text}}|convert 
> real/double precision to string|{{to_char(125.8::real, '999D9')}}|
> |{{to_char(}}{{numeric}}{{, }}{{text}}{{)}}|{{text}}|convert numeric to 
> string|{{to_char(-125.8, '999D99S')}}|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28516) Data Type Formatting Functions: `to_char`

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004746#comment-17004746
 ] 

Takeshi Yamamuro edited comment on SPARK-28516 at 12/29/19 11:20 AM:
-

I'll close this for now because I'm not sure this feature is useful for Spark. 
If necessary, please reopen this. Thanks.


was (Author: maropu):
I'll close this because I'm not sure this feature is useful for Spark. If 
necessary, please reopen this. Thanks.

> Data Type Formatting Functions: `to_char`
> -
>
> Key: SPARK-28516
> URL: https://issues.apache.org/jira/browse/SPARK-28516
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark does not have support for `to_char`. PgSQL, however, 
> [does|https://www.postgresql.org/docs/12/functions-formatting.html]:
> Query example: 
> {code:sql}
> SELECT to_char(SUM(n) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 
> FOLLOWING),'9D9')
> {code}
> ||Function||Return Type||Description||Example||
> |{{to_char(}}{{timestamp}}{{, }}{{text}}{{)}}|{{text}}|convert time stamp to 
> string|{{to_char(current_timestamp, 'HH12:MI:SS')}}|
> |{{to_char(}}{{interval}}{{, }}{{text}}{{)}}|{{text}}|convert interval to 
> string|{{to_char(interval '15h 2m 12s', 'HH24:MI:SS')}}|
> |{{to_char(}}{{int}}{{, }}{{text}}{{)}}|{{text}}|convert integer to 
> string|{{to_char(125, '999')}}|
> |{{to_char}}{{(}}{{double precision}}{{, }}{{text}}{{)}}|{{text}}|convert 
> real/double precision to string|{{to_char(125.8::real, '999D9')}}|
> |{{to_char(}}{{numeric}}{{, }}{{text}}{{)}}|{{text}}|convert numeric to 
> string|{{to_char(-125.8, '999D99S')}}|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28516) Data Type Formatting Functions: `to_char`

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004746#comment-17004746
 ] 

Takeshi Yamamuro commented on SPARK-28516:
--

I'll close this because I'm not sure this feature is useful for Spark. If 
necessary, please reopen this. Thanks.
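
If the goal is only formatting, Spark's existing date_format and format_number 
already cover the common timestamp and numeric cases; a sketch (outputs assume 
the shown literal inputs):

{code:sql}
SELECT date_format(timestamp '2019-12-29 11:20:00', 'HH:mm:ss');  -- 11:20:00
SELECT format_number(12332.123456, 4);                            -- 12,332.1235
{code}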

> Data Type Formatting Functions: `to_char`
> -
>
> Key: SPARK-28516
> URL: https://issues.apache.org/jira/browse/SPARK-28516
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Dylan Guedes
>Priority: Major
>
> Currently, Spark does not have support for `to_char`. PgSQL, however, 
> [does|https://www.postgresql.org/docs/12/functions-formatting.html]:
> Query example: 
> {code:sql}
> SELECT to_char(SUM(n) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 
> FOLLOWING),'9D9')
> {code}
> ||Function||Return Type||Description||Example||
> |{{to_char(}}{{timestamp}}{{, }}{{text}}{{)}}|{{text}}|convert time stamp to 
> string|{{to_char(current_timestamp, 'HH12:MI:SS')}}|
> |{{to_char(}}{{interval}}{{, }}{{text}}{{)}}|{{text}}|convert interval to 
> string|{{to_char(interval '15h 2m 12s', 'HH24:MI:SS')}}|
> |{{to_char(}}{{int}}{{, }}{{text}}{{)}}|{{text}}|convert integer to 
> string|{{to_char(125, '999')}}|
> |{{to_char}}{{(}}{{double precision}}{{, }}{{text}}{{)}}|{{text}}|convert 
> real/double precision to string|{{to_char(125.8::real, '999D9')}}|
> |{{to_char(}}{{numeric}}{{, }}{{text}}{{)}}|{{text}}|convert numeric to 
> string|{{to_char(-125.8, '999D99S')}}|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28490) Support `TIME` type in Spark

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28490.
--
Resolution: Duplicate

> Support `TIME` type in Spark
> 
>
> Key: SPARK-28490
> URL: https://issues.apache.org/jira/browse/SPARK-28490
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Zhu, Lipeng
>Priority: Major
>
>  Support the TIME type and related time operators in Spark.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28865) Table inheritance

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28865:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Table inheritance
> -
>
> Key: SPARK-28865
> URL: https://issues.apache.org/jira/browse/SPARK-28865
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> PostgreSQL implements table inheritance, which can be a useful tool for 
> database designers. (SQL:1999 and later define a type inheritance feature, 
> which differs in many respects from the features described here.)
>  
> [https://www.postgresql.org/docs/11/ddl-inherit.html|https://www.postgresql.org/docs/9.5/ddl-inherit.html]
> [https://www.postgresql.org/docs/11/tutorial-inheritance.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28687) Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()`

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28687:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()`
> 
>
> Key: SPARK-28687
> URL: https://issues.apache.org/jira/browse/SPARK-28687
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, we support these fields for EXTRACT: CENTURY, MILLENNIUM, DECADE, 
> YEAR, QUARTER, MONTH, WEEK, DAY, DAYOFWEEK, HOUR, MINUTE, SECOND, DOW, 
> ISODOW, DOY.
> We also need support: EPOCH, MICROSECONDS, MILLISECONDS, TIMEZONE, 
> TIMEZONE_M, TIMEZONE_H, ISOYEAR.
> https://www.postgresql.org/docs/11/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28623) Support `dow`, `isodow` and `doy` at `extract()`

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28623:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Support `dow`, `isodow` and `doy` at `extract()`
> 
>
> Key: SPARK-28623
> URL: https://issues.apache.org/jira/browse/SPARK-28623
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, we support these fields for EXTRACT: YEAR, QUARTER, MONTH, WEEK, 
> DAY, DAYOFWEEK, HOUR, MINUTE, SECOND.
> We also need support: EPOCH, CENTURY, MILLENNIUM, DECADE, MICROSECONDS, 
> MILLISECONDS, DOW, ISODOW, DOY, TIMEZONE, TIMEZONE_M, TIMEZONE_H, JULIAN, 
> ISOYEAR.
> https://www.postgresql.org/docs/11/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
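
Assuming a build that includes this change, the new fields work directly in 
extract; a small example (2019-12-29 is a Sunday, so DOW is 0 under the 
PostgreSQL-style numbering):

{code:sql}
SELECT extract(dow FROM date '2019-12-29');  -- 0 (Sunday)
SELECT extract(doy FROM date '2019-12-29');  -- 363
{code}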



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30381) GBT reuse splits for all trees

2019-12-29 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-30381:


 Summary: GBT reuse splits for all trees
 Key: SPARK-30381
 URL: https://issues.apache.org/jira/browse/SPARK-30381
 Project: Spark
  Issue Type: Improvement
  Components: ML
Affects Versions: 3.0.0
Reporter: zhengruifeng
Assignee: zhengruifeng


In the existing GBT, each tree first computes the available splits of each 
feature (via RandomForest.findSplits), based on the dataset sampled at that 
iteration. It then uses these splits to discretize vectors into 
BaggedPoint[TreePoint]s. The BaggedPoints (of the same size as the input 
vectors) are then cached and used at that iteration. Note that the splits used 
for discretization differ from tree to tree (if subsamplingRate<1), only 
because the sampled vectors differ.

However, the splits at different iterations should be similar if the sampled 
dataset is big enough, and even identical if subsamplingRate=1.

 

In contrast, other well-known GBT implementations with binned features (like 
XGBoost/LightGBM) use the same discretization splits across iterations:
{code:java}
import xgboost as xgb
from sklearn.datasets import load_svmlight_file
X, y = 
load_svmlight_file('/data0/Dev/Opensource/spark/data/mllib/sample_linear_regression_data.txt')
dtrain = xgb.DMatrix(X[:, :2], label=y)
num_round = 3
param = {'max_depth': 2, 'eta': 1, 'objective': 'reg:squarederror', 
'tree_method': 'hist', 'max_bin': 2, 'eta': 0.01, 'subsample':0.5}
bst = xgb.train(param, dtrain, num_round)
bst.trees_to_dataframe('/tmp/bst')
Out[61]: 
Tree  Node   ID Feature     Split  Yes   No Missing        Gain  Cover
0  0 0  0-0  f1  0.000408  0-1  0-2 0-1  170.337143  256.0
1  0 1  0-1  f0  0.003531  0-3  0-4 0-3   44.865482  121.0
2  0 2  0-2  f0  0.003531  0-5  0-6 0-5  125.615570  135.0
3  0 3  0-3Leaf   NaN  NaN  NaN NaN   -0.010050   67.0
4  0 4  0-4Leaf   NaN  NaN  NaN NaN0.002126   54.0
5  0 5  0-5Leaf   NaN  NaN  NaN NaN0.020972   69.0
6  0 6  0-6Leaf   NaN  NaN  NaN NaN0.001714   66.0
7  1 0  1-0  f0  0.003531  1-1  1-2 1-1   50.417793  263.0
8  1 1  1-1  f1  0.000408  1-3  1-4 1-3   48.732742  124.0
9  1 2  1-2  f1  0.000408  1-5  1-6 1-5   52.832161  139.0
10 1 3  1-3Leaf   NaN  NaN  NaN NaN   -0.012784   63.0
11 1 4  1-4Leaf   NaN  NaN  NaN NaN   -0.000287   61.0
12 1 5  1-5Leaf   NaN  NaN  NaN NaN0.008661   64.0
13 1 6  1-6Leaf   NaN  NaN  NaN NaN   -0.003624   75.0
14 2 0  2-0  f1  0.000408  2-1  2-2 2-1   62.136013  242.0
15 2 1  2-1  f0  0.003531  2-3  2-4 2-3  150.537781  118.0
16 2 2  2-2  f0  0.003531  2-5  2-6 2-53.829046  124.0
17 2 3  2-3Leaf   NaN  NaN  NaN NaN   -0.016737   65.0
18 2 4  2-4Leaf   NaN  NaN  NaN NaN0.005809   53.0
19 2 5  2-5Leaf   NaN  NaN  NaN NaN0.005251   60.0
20 2 6  2-6Leaf   NaN  NaN  NaN NaN0.001709   64.0
 {code}
 

We can see that even if we set subsample=0.5, the three trees share the same 
splits.

 

So I think we could reuse the splits and treePoints at all iterations:

At iteration 0, compute the splits on the whole training dataset, and use the 
splits to generate treePoints.

At each subsequent iteration, directly generate baggedPoints based on the 
treePoints.

This way we do not need to persist/unpersist the internal training dataset for 
each tree.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28033) String concatenation should have lower priority than other operators

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28033.
--
Resolution: Won't Fix

> String concatenation should have lower priority than other operators
> -
>
> Key: SPARK-28033
> URL: https://issues.apache.org/jira/browse/SPARK-28033
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.3
>Reporter: Yuming Wang
>Priority: Major
>
> Spark SQL:
> {code:sql}
> spark-sql> explain select 'four: ' || 2 + 2;
> == Physical Plan ==
> *(1) Project [null AS (CAST(concat(four: , CAST(2 AS STRING)) AS DOUBLE) + 
> CAST(2 AS DOUBLE))#2]
> +- Scan OneRowRelation[]
> spark-sql> select 'four: ' || 2 + 2;
> NULL
> {code}
> Hive:
> {code:sql}
> hive> select 'four: ' || 2 + 2;
> OK
> four: 4
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28033) String concatenation should have lower priority than other operators

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004744#comment-17004744
 ] 

Takeshi Yamamuro commented on SPARK-28033:
--

I'll close this because the corresponding PR has been closed. If necessary, 
please reopen this.
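
Until the precedence changes, explicit parentheses give the Hive/PostgreSQL 
result in Spark:

{code:sql}
SELECT 'four: ' || (2 + 2);  -- four: 4
{code}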

> String concatenation should have lower priority than other operators
> -
>
> Key: SPARK-28033
> URL: https://issues.apache.org/jira/browse/SPARK-28033
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.3
>Reporter: Yuming Wang
>Priority: Major
>
> Spark SQL:
> {code:sql}
> spark-sql> explain select 'four: ' || 2 + 2;
> == Physical Plan ==
> *(1) Project [null AS (CAST(concat(four: , CAST(2 AS STRING)) AS DOUBLE) + 
> CAST(2 AS DOUBLE))#2]
> +- Scan OneRowRelation[]
> spark-sql> select 'four: ' || 2 + 2;
> NULL
> {code}
> Hive:
> {code:sql}
> hive> select 'four: ' || 2 + 2;
> OK
> four: 4
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28028) Cast numeric to integral type need round

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28028:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Cast numeric to integral type need round
> 
>
> Key: SPARK-28028
> URL: https://issues.apache.org/jira/browse/SPARK-28028
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> For example
>  Case 1:
> {code:sql}
> select cast(-1.5 as smallint);
> {code}
> Spark SQL returns {{-1}}, but PostgreSQL returns {{-2}}.
>  
> Case 2:
> {code:sql}
> SELECT smallint(float('32767.6'))
> {code}
> Spark SQL returns {{32767}}, but PostgreSQL throws {{ERROR: smallint out of 
> range}}.
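
For case 1, rounding before the cast reproduces the PostgreSQL result in Spark 
today; a minimal sketch (case 2, range checking on the cast itself, has no 
such one-liner workaround):

{code:sql}
-- round() uses HALF_UP, i.e. away from zero, so round(-1.5) = -2
SELECT cast(round(-1.5) AS smallint);  -- -2
{code}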



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27987) Support POSIX Regular Expressions

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-27987:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Support POSIX Regular Expressions
> -
>
> Key: SPARK-27987
> URL: https://issues.apache.org/jira/browse/SPARK-27987
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> POSIX regular expressions provide a more powerful means for pattern matching 
> than the LIKE and SIMILAR TO operators. Many Unix tools such as egrep, sed, 
> or awk use a pattern matching language that is similar to the one described 
> here.
> ||Operator||Description||Example||
> |{{~}}|Matches regular expression, case sensitive|{{'thomas' ~ '.*thomas.*'}}|
> |{{~*}}|Matches regular expression, case insensitive|{{'thomas' ~* 
> '.*Thomas.*'}}|
> |{{!~}}|Does not match regular expression, case sensitive|{{'thomas' !~ 
> '.*Thomas.*'}}|
> |{{!~*}}|Does not match regular expression, case insensitive|{{'thomas' !~* 
> '.*vadim.*'}}|
> https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP
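
Spark's existing rlike covers the matching part; case-insensitivity and 
negation can be expressed with an inline regex flag and NOT, as in this sketch:

{code:sql}
SELECT 'thomas' RLIKE '.*thomas.*';       -- true  (~ in PostgreSQL)
SELECT 'thomas' RLIKE '(?i).*Thomas.*';   -- true  (~* via the (?i) inline flag)
SELECT NOT ('thomas' RLIKE '.*vadim.*');  -- true  (!~ and !~* via NOT)
{code}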



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25061) Spark SQL Thrift Server fails to pick up hiveconf passed parameter

2019-12-29 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004736#comment-17004736
 ] 

Ajith S commented on SPARK-25061:
-

I could reproduce this. As per the documentation, 
https://spark.apache.org/docs/latest/sql-distributed-sql-engine.html, --hiveconf 
can be used to pass Hive properties to the Thrift server. Raising a PR to fix 
this.

>  Spark SQL Thrift Server fails to pick up hiveconf passed parameter
> 
>
> Key: SPARK-25061
> URL: https://issues.apache.org/jira/browse/SPARK-25061
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Zineng Yuan
>Priority: Major
>
> The Spark Thrift Server should use the passed parameter value and overwrite 
> the same conf from hive-site.xml. For example, the server should overwrite 
> what exists in hive-site.xml:
> ./sbin/start-thriftserver.sh --master yarn-client ... \
>   --hiveconf "hive.server2.authentication.kerberos.principal=..." ...
> <property>
>   <name>hive.server2.authentication.kerberos.principal</name>
>   <value>hive/_HOST@...</value>
> </property>
> However, the server takes what is in hive-site.xml.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28027) Missing some mathematical operators

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28027:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Missing some mathematical operators
> ---
>
> Key: SPARK-28027
> URL: https://issues.apache.org/jira/browse/SPARK-28027
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ||Operator||Description||Example||Result||
> |{{^}}|exponentiation (associates left to right)|{{2.0 ^ 3.0}}|{{8}}|
> |{{\|/}}|square root|{{\|/ 25.0}}|{{5}}|
> |{{\|\|/}}|cube root|{{\|\|/ 27.0}}|{{3}}|
> |{{\!}}|factorial|{{5 !}}|{{120}}|
> |{{\!\!}}|factorial (prefix operator)|{{!! 5}}|{{120}}|
> |{{@}}|absolute value|{{@ -5.0}}|{{5}}|
> |{{#}}|bitwise XOR|{{17 # 5}}|{{20}}|
> |{{<<}}|bitwise shift left|{{1 << 4}}|{{16}}|
> |{{>>}}|bitwise shift right|{{8 >> 2}}|{{2}}|
>  
>  Please note that we already have {{^}}, {{\!}} and {{\!\!}}, but they have 
> different meanings.
> [https://www.postgresql.org/docs/11/functions-math.html]
>  
> [https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Operators/BitwiseOperators.htm]
>  [https://docs.aws.amazon.com/redshift/latest/dg/r_OPERATOR_SYMBOLS.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27931) Accept 'on' and 'off' as input for boolean data type

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-27931:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Accept 'on' and 'off' as input for boolean data type
> 
>
> Key: SPARK-27931
> URL: https://issues.apache.org/jira/browse/SPARK-27931
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: YoungGyu Chun
>Priority: Major
> Fix For: 3.0.0
>
>
> This ticket contains three things:
>  1. Accept 'on' and 'off' as input for boolean data type
> {code:sql}
> SELECT cast('no' as boolean) AS false;
> SELECT cast('off' as boolean) AS false;
> {code}
> 2. Accept unique prefixes thereof:
> {code:sql}
> SELECT cast('of' as boolean) AS false;
> SELECT cast('fal' as boolean) AS false;
> {code}
> 3. Trim the string when cast to boolean type
> {code:sql}
> SELECT cast('true   ' as boolean) AS true;
> SELECT cast(' FALSE' as boolean) AS true;
> {code}
> More details:
>  [https://www.postgresql.org/docs/devel/datatype-boolean.html]
>  
> [https://github.com/postgres/postgres/blob/REL_12_BETA1/src/backend/utils/adt/bool.c#L25]
>  
> [https://github.com/postgres/postgres/commit/05a7db05826c5eb68173b6d7ef1553c19322ef48]
>  
> [https://github.com/postgres/postgres/commit/9729c9360886bee7feddc6a1124b0742de4b9f3d]
> Other DBs:
>  [http://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html]
>  [https://my.vertica.com/docs/5.0/HTML/Master/2983.htm]
>  
> [https://github.com/prestosql/presto/blob/b845cd66da3eb1fcece50efba83ea12bc40afbaa/presto-main/src/main/java/com/facebook/presto/type/VarcharOperators.java#L108-L138]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28656) Support `millennium`, `century` and `decade` at `extract()`

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28656:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Support `millennium`, `century` and `decade` at `extract()`
> ---
>
> Key: SPARK-28656
> URL: https://issues.apache.org/jira/browse/SPARK-28656
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, we support these fields for EXTRACT: YEAR, QUARTER, MONTH, WEEK, 
> DAY, DAYOFWEEK, HOUR, MINUTE, SECOND.
> We also need support: EPOCH, CENTURY, MILLENNIUM, DECADE, MICROSECONDS, 
> MILLISECONDS, DOW, ISODOW, DOY, TIMEZONE, TIMEZONE_M, TIMEZONE_H, JULIAN, 
> ISOYEAR.
> https://www.postgresql.org/docs/11/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28674) Spark should support select <columns> into <table> from <table> where <condition> as PostgreSQL supports

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28674:
-
Parent Issue: SPARK-30374  (was: SPARK-27764)

> Spark should support select <columns> into <table> from <table> where 
> <condition> as PostgreSQL supports 
> 
>
> Key: SPARK-28674
> URL: https://issues.apache.org/jira/browse/SPARK-28674
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> Spark should support {{select <columns> into <table> from <table> where 
> <condition>}} as PostgreSQL supports 
> {code}
> create table dup(id int);
> insert into dup values(1);
> insert into dup values(2);
> select id into test_dup from dup where id=1;
> select * from test_dup;
> {code}
> *Result: Success in PostgreSQL*
> But select id into test_dup from dup where id=1; in Spark gives ParseException
> {code}
> scala> sql("show tables").show();
> +--------+---------+-----------+
> |database|tableName|isTemporary|
> +--------+---------+-----------+
> |    func|      dup|      false|
> +--------+---------+-----------+
> {code}
> {code}
> scala> sql("select id into test_dup from dup where id=1").show()
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input 'test_dup' expecting (line 1, pos 15)
> == SQL ==
> select id into test_dup from dup where id=1
> ---^^^
>   at 
> org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
>   at 
> org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
>   at 
> org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>   ... 49 elided
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28718) Support field synonyms at `extract`

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28718:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Support field synonyms at `extract`
> ---
>
> Key: SPARK-28718
> URL: https://issues.apache.org/jira/browse/SPARK-28718
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Minor
> Fix For: 3.0.0
>
>
> Here is the list of field synonyms supported by PostgreSQL at extract:
> https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/datetime.c#L171-L234



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28768) Implement more text pattern operators

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-28768.
--
Resolution: Won't Fix

> Implement more text pattern operators
> -
>
> Key: SPARK-28768
> URL: https://issues.apache.org/jira/browse/SPARK-28768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> {code:sql}
> postgres=# \do ~*~
>  List of operators
>Schema   | Name | Left arg type | Right arg type | Result type |   
> Description
> +--+---++-+-
>  pg_catalog | ~<=~ | character | character  | boolean | less than 
> or equal
>  pg_catalog | ~<=~ | text  | text   | boolean | less than 
> or equal
>  pg_catalog | ~<~  | character | character  | boolean | less than
>  pg_catalog | ~<~  | text  | text   | boolean | less than
>  pg_catalog | ~>=~ | character | character  | boolean | greater 
> than or equal
>  pg_catalog | ~>=~ | text  | text   | boolean | greater 
> than or equal
>  pg_catalog | ~>~  | character | character  | boolean | greater 
> than
>  pg_catalog | ~>~  | text  | text   | boolean | greater 
> than
>  pg_catalog | ~~   | bytea | bytea  | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | character | text   | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | name  | text   | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | text  | text   | boolean | matches 
> LIKE expression
> (12 rows)
> {code}
> {noformat}
> postgres=# select '1' ~<~ '2';
>  ?column?
> --
>  t
> (1 row)
> {noformat}
> https://stackoverflow.com/questions/35807872/operator-in-postgres



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28768) Implement more text pattern operators

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004723#comment-17004723
 ] 

Takeshi Yamamuro commented on SPARK-28768:
--

I'll close this for now because I'm not sure that this feature is useful for 
Spark. If necessary, please reopen this.

> Implement more text pattern operators
> -
>
> Key: SPARK-28768
> URL: https://issues.apache.org/jira/browse/SPARK-28768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> {code:sql}
> postgres=# \do ~*~
>  List of operators
>Schema   | Name | Left arg type | Right arg type | Result type |   
> Description
> +--+---++-+-
>  pg_catalog | ~<=~ | character | character  | boolean | less than 
> or equal
>  pg_catalog | ~<=~ | text  | text   | boolean | less than 
> or equal
>  pg_catalog | ~<~  | character | character  | boolean | less than
>  pg_catalog | ~<~  | text  | text   | boolean | less than
>  pg_catalog | ~>=~ | character | character  | boolean | greater 
> than or equal
>  pg_catalog | ~>=~ | text  | text   | boolean | greater 
> than or equal
>  pg_catalog | ~>~  | character | character  | boolean | greater 
> than
>  pg_catalog | ~>~  | text  | text   | boolean | greater 
> than
>  pg_catalog | ~~   | bytea | bytea  | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | character | text   | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | name  | text   | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | text  | text   | boolean | matches 
> LIKE expression
> (12 rows)
> {code}
> {noformat}
> postgres=# select '1' ~<~ '2';
>  ?column?
> --
>  t
> (1 row)
> {noformat}
> https://stackoverflow.com/questions/35807872/operator-in-postgres



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28768) Implement more text pattern operators

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-28768:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Implement more text pattern operators
> -
>
> Key: SPARK-28768
> URL: https://issues.apache.org/jira/browse/SPARK-28768
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> {code:sql}
> postgres=# \do ~*~
>  List of operators
>Schema   | Name | Left arg type | Right arg type | Result type |   
> Description
> +--+---++-+-
>  pg_catalog | ~<=~ | character | character  | boolean | less than 
> or equal
>  pg_catalog | ~<=~ | text  | text   | boolean | less than 
> or equal
>  pg_catalog | ~<~  | character | character  | boolean | less than
>  pg_catalog | ~<~  | text  | text   | boolean | less than
>  pg_catalog | ~>=~ | character | character  | boolean | greater 
> than or equal
>  pg_catalog | ~>=~ | text  | text   | boolean | greater 
> than or equal
>  pg_catalog | ~>~  | character | character  | boolean | greater 
> than
>  pg_catalog | ~>~  | text  | text   | boolean | greater 
> than
>  pg_catalog | ~~   | bytea | bytea  | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | character | text   | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | name  | text   | boolean | matches 
> LIKE expression
>  pg_catalog | ~~   | text  | text   | boolean | matches 
> LIKE expression
> (12 rows)
> {code}
> {noformat}
> postgres=# select '1' ~<~ '2';
>  ?column?
> --
>  t
> (1 row)
> {noformat}
> https://stackoverflow.com/questions/35807872/operator-in-postgres



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29187) Return null from `date_part()` for the null `field`

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29187:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Return null from `date_part()` for the null `field`
> ---
>
> Key: SPARK-29187
> URL: https://issues.apache.org/jira/browse/SPARK-29187
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Minor
> Fix For: 3.0.0
>
>
> PostgreSQL returns NULL for a NULL field in the date_part() function:
> {code}
> maxim=# select date_part(null, date'2019-09-20');
>  date_part 
> ---
>   
> (1 row)
> {code}
> but Spark fails with the error:
> {code}
> spark-sql> select date_part(null, date'2019-09-20');
> Error in query: null; line 1 pos 7
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29311) Return seconds with fraction from `date_part`/`extract`

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29311:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Return seconds with fraction from `date_part`/`extract`
> ---
>
> Key: SPARK-29311
> URL: https://issues.apache.org/jira/browse/SPARK-29311
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Assignee: Maxim Gekk
>Priority: Minor
> Fix For: 3.0.0
>
>
> The `date_part()` and `extract` should return seconds with fractional part 
> for the `SECOND` field as PostgreSQL does:
> {code}
> # SELECT date_part('SECONDS', timestamp'2019-10-01 00:00:01.01');
>  date_part 
> ---
>   1.01
> (1 row)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23179) Support option to throw exception if overflow occurs during Decimal arithmetic

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-23179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-23179:
-
Parent Issue: SPARK-30374  (was: SPARK-27764)

> Support option to throw exception if overflow occurs during Decimal arithmetic
> --
>
> Key: SPARK-23179
> URL: https://issues.apache.org/jira/browse/SPARK-23179
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Marco Gaido
>Assignee: Marco Gaido
>Priority: Major
> Fix For: 3.0.0
>
>
> ANSI SQL 2011 states that an exception should be thrown when overflow occurs 
> during arithmetic operations. This is what most SQL DBs do (e.g. SQL Server, 
> DB2). Hive currently returns NULL (as Spark does), but HIVE-18291 is open to 
> make it SQL compliant.
> I propose a config option which lets users decide whether Spark should behave 
> according to the SQL standard or in the current way (i.e. returning NULL).
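
A sketch of how such a switch looks from the user side. Using {{spark.sql.ansi.enabled}} (Spark 3.0's umbrella flag for SQL-standard behavior) as the knob here is an assumption about where this option ended up:
{code}
// Legacy behavior: decimal overflow silently yields NULL (assumption: the
// flag below governs this case).
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT CAST(9e37 AS DECIMAL(38,0)) * 2").show()   // expected: NULL

// SQL-standard behavior: the same overflow raises an ArithmeticException.
spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("SELECT CAST(9e37 AS DECIMAL(38,0)) * 2").show()
{code}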



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29096) The exact math method should be called only when there is a corresponding function in Math

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29096:
-
Parent Issue: SPARK-30374  (was: SPARK-27764)

> The exact math method should be called only when there is a corresponding 
> function in Math
> --
>
> Key: SPARK-29096
> URL: https://issues.apache.org/jira/browse/SPARK-29096
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.0.0
>
>
> After https://github.com/apache/spark/pull/21599, if the option 
> "spark.sql.failOnIntegralTypeOverflow" is enabled, all binary arithmetic 
> operators use the exact version of the function. 
> However, only `Add`/`Subtract`/`Multiply` have a corresponding exact function 
> in java.lang.Math. When the option "spark.sql.failOnIntegralTypeOverflow" is 
> enabled, a runtime exception "BinaryArithmetics must override either 
> exactMathMethod or genCode" is thrown when the other binary arithmetic 
> operators, such as "Divide" and "Remainder", are used.
> The exact math method should be called only when there is a corresponding 
> function in Math.
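
The limitation is visible directly in the JDK; a minimal Scala sketch (JDK 8, which Spark targets):
{code}
// java.lang.Math provides overflow-checked variants only for these ops:
Math.addExact(Int.MaxValue - 1, 1)        // 2147483647, no overflow
Math.subtractExact(Int.MinValue + 1, 1)   // -2147483648, no overflow
// Math.multiplyExact(1 << 30, 2)         // would throw ArithmeticException

// There is no divideExact/remainderExact on JDK 8, so Divide/Remainder
// cannot delegate; their single overflowing case wraps silently:
Int.MinValue / -1                         // evaluates to Int.MinValue
{code}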



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26218) Throw exception on overflow for integers

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-26218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-26218:
-
Parent Issue: SPARK-30374  (was: SPARK-27764)

> Throw exception on overflow for integers
> 
>
> Key: SPARK-26218
> URL: https://issues.apache.org/jira/browse/SPARK-26218
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Marco Gaido
>Assignee: Marco Gaido
>Priority: Major
> Fix For: 3.0.0
>
>
> SPARK-24598 only updated the documentation to state that our addition follows 
> Java semantics rather than SQL semantics. To follow the SQL standard, we 
> should instead throw an exception when an overflow occurs.
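
The two semantics in one minimal Scala sketch:
{code}
// Java semantics (current Spark behavior): two's-complement wrap-around.
val wrapped = Int.MaxValue + 1   // -2147483648, no error signal

// SQL-standard semantics: overflow must surface as an error.
try Math.addExact(Int.MaxValue, 1)
catch { case e: ArithmeticException => println(s"overflow: ${e.getMessage}") }
{code}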



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29386) Copy data between a file and a table

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004722#comment-17004722
 ] 

Takeshi Yamamuro commented on SPARK-29386:
--

I'll close this for now because this feature is pg-specific. If necessary, 
please reopen this.

> Copy data between a file and a table 
> -
>
> Key: SPARK-29386
> URL: https://issues.apache.org/jira/browse/SPARK-29386
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Priority: Major
>
> https://www.postgresql.org/docs/12/sql-copy.html
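
Since COPY stays out of scope, the idiomatic Spark route between files and tables is the DataFrame reader/writer API. A minimal sketch; the paths and table name are placeholders, and a running SparkSession named {{spark}} is assumed:
{code}
// File -> table (roughly COPY ... FROM):
val df = spark.read.option("header", "true").csv("/tmp/people.csv")
df.write.mode("overwrite").saveAsTable("people")

// Table -> file (roughly COPY ... TO):
spark.table("people").write.option("header", "true").csv("/tmp/people_out")
{code}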



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29386) Copy data between a file and a table

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-29386.
--
Resolution: Won't Fix

> Copy data between a file and a table 
> -
>
> Key: SPARK-29386
> URL: https://issues.apache.org/jira/browse/SPARK-29386
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Priority: Major
>
> https://www.postgresql.org/docs/12/sql-copy.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29386) Copy data between a file and a table

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29386:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Copy data between a file and a table 
> -
>
> Key: SPARK-29386
> URL: https://issues.apache.org/jira/browse/SPARK-29386
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Maxim Gekk
>Priority: Major
>
> https://www.postgresql.org/docs/12/sql-copy.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29587) Real data type is not supported in Spark SQL though it is supported in postgresql

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29587:
-
Parent Issue: SPARK-30374  (was: SPARK-27764)

> Real data type is not supported in Spark SQL though it is supported in postgresql
> --
>
> Key: SPARK-29587
> URL: https://issues.apache.org/jira/browse/SPARK-29587
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.4.4
>Reporter: jobit mathew
>Assignee: Kent Yao
>Priority: Minor
> Fix For: 3.0.0
>
>
> The real data type is not supported in Spark SQL, though it is supported in 
> PostgreSQL.
> +*In PostgreSQL the query succeeds*+
> CREATE TABLE weather2(prcp real);
> insert into weather2 values(2.5);
> select * from weather2;
>  
> ||  ||prcp||
> |1|2.5|
> +*In Spark SQL it raises an error*+
> spark-sql> CREATE TABLE weather2(prcp real);
> Error in query:
> DataType real is not supported.(line 1, pos 27)
> == SQL ==
> CREATE TABLE weather2(prcp real)
> ---------------------------^^^
> It would be better to also support the "real" data type in Spark SQL.
>  
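
A sketch of the intended behavior after the fix, assuming {{real}} maps to Spark's 4-byte float type and a running SparkSession named {{spark}}:
{code}
spark.sql("CREATE TABLE weather2 (prcp REAL) USING parquet")
spark.sql("INSERT INTO weather2 VALUES (2.5)")
spark.sql("SELECT * FROM weather2").show()   // expected: one row with prcp = 2.5
{code}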



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29583) extract support interval type

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29583:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> extract support interval type
> -
>
> Key: SPARK-29583
> URL: https://issues.apache.org/jira/browse/SPARK-29583
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> {code:sql}
> postgres=# select extract(minute from INTERVAL '1 YEAR 10 DAYS 50 MINUTES');
>  date_part 
> -----------
>         50
> (1 row)
> postgres=# select extract(minute from cast('2019-07-01 17:12:33.068' as timestamp) - cast('2019-07-01 15:57:07.912' as timestamp));
>  date_part 
> -----------
>         15
> (1 row)
> {code}
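
A sketch of the requested Spark-side behavior; the interval literal syntax follows Spark 3.0, and the result shown is what this ticket asks for rather than what current builds return:
{code}
// Expected to return 50, matching the PostgreSQL result above.
spark.sql("SELECT extract(minute FROM INTERVAL '1 year 10 days 50 minutes')").show()
{code}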



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29584) NOT NULL is not supported in Spark

2019-12-29 Thread Takeshi Yamamuro (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004721#comment-17004721
 ] 

Takeshi Yamamuro commented on SPARK-29584:
--

This issue concerns integrity constraints, which are tracked under SPARK-19842.

> NOT NULL is not supported in Spark
> --
>
> Key: SPARK-29584
> URL: https://issues.apache.org/jira/browse/SPARK-29584
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> Spark does not support restricting a column to non-NULL values when creating 
> a table, as shown below.
> PostgreSQL: SUCCESS No Exception
>  CREATE TABLE Persons (ID int *NOT NULL*, LastName varchar(255) *NOT 
> NULL*,FirstName varchar(255) NOT NULL, Age int);
>  insert into Persons values(1,'GUPTA','Abhi',NULL);
>  select * from persons;
>  
> Spark: Parse Exception
> jdbc:hive2://10.18.19.208:23040/default> CREATE TABLE Persons (ID int NOT 
> NULL, LastName varchar(255) NOT NULL,FirstName varchar(255) NOT NULL, Age 
> int);
> Error: org.apache.spark.sql.catalyst.parser.ParseException:
> no viable alternative at input 'CREATE TABLE Persons (ID int NOT'(line 1, pos 
> 29)
>  Parse Exception
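
Until the DDL accepts NOT NULL, nullability can be declared through the programmatic schema API; a minimal Scala sketch (note Spark treats this as schema metadata, not an enforced constraint):
{code}
import org.apache.spark.sql.types._

// nullable = false is the schema-level counterpart of NOT NULL.
val schema = StructType(Seq(
  StructField("ID", IntegerType, nullable = false),
  StructField("LastName", StringType, nullable = false),
  StructField("FirstName", StringType, nullable = false),
  StructField("Age", IntegerType, nullable = true)))
{code}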



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29584) NOT NULL is not supported in Spark

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-29584.
--
Resolution: Duplicate

> NOT NULL is not supported in Spark
> --
>
> Key: SPARK-29584
> URL: https://issues.apache.org/jira/browse/SPARK-29584
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> Spark does not support restricting a column to non-NULL values when creating 
> a table, as shown below.
> PostgreSQL: SUCCESS No Exception
>  CREATE TABLE Persons (ID int *NOT NULL*, LastName varchar(255) *NOT 
> NULL*,FirstName varchar(255) NOT NULL, Age int);
>  insert into Persons values(1,'GUPTA','Abhi',NULL);
>  select * from persons;
>  
> Spark: Parse Exception
> jdbc:hive2://10.18.19.208:23040/default> CREATE TABLE Persons (ID int NOT 
> NULL, LastName varchar(255) NOT NULL,FirstName varchar(255) NOT NULL, Age 
> int);
> Error: org.apache.spark.sql.catalyst.parser.ParseException:
> no viable alternative at input 'CREATE TABLE Persons (ID int NOT'(line 1, pos 
> 29)
>  Parse Exception



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29616) Bankers' rounding for double types

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-29616.
--
Resolution: Won't Fix

> Bankers' rounding for double types
> --
>
> Key: SPARK-29616
> URL: https://issues.apache.org/jira/browse/SPARK-29616
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Trivial
>
> PostgreSQL uses banker's rounding mode for double types;
> {code}
> postgres=# select * from t;
>   a  |  b  
> -----+-----
>  0.5 | 0.5
>  1.5 | 1.5
>  2.5 | 2.5
>  3.5 | 3.5
>  4.5 | 4.5
> (5 rows)
> postgres=# \d t
>                  Table "public.t"
>  Column |       Type       | Collation | Nullable | Default 
> --------+------------------+-----------+----------+---------
>  a      | double precision |           |          | 
>  b      | numeric(2,1)     |           |          | 
> postgres=# select round(a), round(b) from t;
>  round | round 
> -------+-------
>      0 |     1
>      2 |     2
>      2 |     3
>      4 |     4
>      4 |     5
> (5 rows)
> {code}
>  
> In the master;
> {code}
> scala> sql("select * from t").show
> +---+---+
> |  a|  b|
> +---+---+
> |0.5|0.5|
> |1.5|1.5|
> |2.5|2.5|
> |3.5|3.5|
> |4.5|4.5|
> +---+---+
> scala> sql("select * from t").printSchema
> root
>  |-- a: double (nullable = true)
>  |-- b: decimal(2,1) (nullable = true)
> scala> sql("select round(a), round(b) from t").show()
> +-----------+-----------+
> |round(a, 0)|round(b, 0)|
> +-----------+-----------+
> |        1.0|          1|
> |        2.0|          2|
> |        3.0|          3|
> |        4.0|          4|
> |        5.0|          5|
> +-----------+-----------+
> {code}
>  
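
For callers who need half-even rounding today, Spark already exposes it as a separate function; a minimal sketch contrasting the two modes, assuming a running SparkSession named {{spark}}:
{code}
// round() uses HALF_UP, bround() uses HALF_EVEN (banker's rounding):
// expected output is 3.0 for round(2.5D) and 2.0 for bround(2.5D).
spark.sql("SELECT round(2.5D), bround(2.5D)").show()
{code}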



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29616) Bankers' rounding for double types

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29616:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Bankers' rounding for double types
> --
>
> Key: SPARK-29616
> URL: https://issues.apache.org/jira/browse/SPARK-29616
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Trivial
>
> PostgreSQL uses banker's rounding mode for double types;
> {code}
> postgres=# select * from t;
>   a  |  b  
> -----+-----
>  0.5 | 0.5
>  1.5 | 1.5
>  2.5 | 2.5
>  3.5 | 3.5
>  4.5 | 4.5
> (5 rows)
> postgres=# \d t
>                  Table "public.t"
>  Column |       Type       | Collation | Nullable | Default 
> --------+------------------+-----------+----------+---------
>  a      | double precision |           |          | 
>  b      | numeric(2,1)     |           |          | 
> postgres=# select round(a), round(b) from t;
>  round | round 
> -------+-------
>      0 |     1
>      2 |     2
>      2 |     3
>      4 |     4
>      4 |     5
> (5 rows)
> {code}
>  
> In the master;
> {code}
> scala> sql("select * from t").show
> +---+---+
> |  a|  b|
> +---+---+
> |0.5|0.5|
> |1.5|1.5|
> |2.5|2.5|
> |3.5|3.5|
> |4.5|4.5|
> +---+---+
> scala> sql("select * from t").printSchema
> root
>  |-- a: double (nullable = true)
>  |-- b: decimal(2,1) (nullable = true)
> scala> sql("select round(a), round(b) from t").show()
> +-----------+-----------+
> |round(a, 0)|round(b, 0)|
> +-----------+-----------+
> |        1.0|          1|
> |        2.0|          2|
> |        3.0|          3|
> |        4.0|          4|
> |        5.0|          5|
> +-----------+-----------+
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29659) Support COMMENT ON syntax

2019-12-29 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29659:
-
Parent Issue: SPARK-30375  (was: SPARK-27764)

> Support COMMENT ON syntax
> -
>
> Key: SPARK-29659
> URL: https://issues.apache.org/jira/browse/SPARK-29659
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Major
>
> [https://www.postgresql.org/docs/current/sql-comment.html]
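
Until a dedicated COMMENT ON syntax lands, comments can already be attached through existing DDL; a minimal sketch with a placeholder table name, assuming a running SparkSession named {{spark}}:
{code}
spark.sql("CREATE TABLE t (id INT) USING parquet COMMENT 'created with a comment'")
spark.sql("ALTER TABLE t SET TBLPROPERTIES ('comment' = 'updated comment')")
{code}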



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30380) Refactor RandomForest.findSplits

2019-12-29 Thread zhengruifeng (Jira)
zhengruifeng created SPARK-30380:


 Summary: Refactor RandomForest.findSplits
 Key: SPARK-30380
 URL: https://issues.apache.org/jira/browse/SPARK-30380
 Project: Spark
  Issue Type: Improvement
  Components: ML
Affects Versions: 3.0.0
Reporter: zhengruifeng


The current implementation of {{RandomForest.findSplits}} uses {{groupByKey}} to 
collect the non-zero values of each feature, so a single dense feature can pull 
an unbounded number of values into one task and risk an out-of-memory failure; 
see the sketch below.
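
A generic Scala illustration of the risk and the usual refactoring; the data and names are illustrative, and a running SparkSession named {{spark}} is assumed:
{code}
val sc = spark.sparkContext
// (featureIndex, value) pairs standing in for the sampled input.
val nonZero = sc.parallelize(Seq((0, 1.0), (0, 2.0), (0, 2.0), (1, 3.0)))

// groupByKey materializes every value of a feature in a single task,
// so one dense feature can exhaust that task's memory.
val risky = nonZero.groupByKey()

// aggregateByKey combines map-side first; collecting distinct values keeps
// the shuffled payload bounded by distinct values rather than row count.
val safer = nonZero.aggregateByKey(Set.empty[Double])(_ + _, _ ++ _)
{code}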

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org