[jira] [Updated] (SPARK-26919) change maven default compile java home

2019-02-18 Thread daile (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-26919:
--
Attachment: p1.png

> change maven default compile java home
> --
>
> Key: SPARK-26919
> URL: https://issues.apache.org/jira/browse/SPARK-26919
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.4.1
>Reporter: daile
>Priority: Critical
> Attachments: p1.png
>
>
>   When I use "build/mvn -DskipTests clean package", the default java home 
> configuration is "${java.home}". I tried this on macOS and Windows and found 
> that the default java.home points to */jre, but the JRE does not provide the 
> javac compile command. So I think it can be replaced with the system environment 
> variable (JAVA_HOME), and I verified that the build then compiles successfully.
> !image-2019-02-19-10-25-02-872.png!






[jira] [Created] (SPARK-26919) change maven default compile java home

2019-02-18 Thread daile (JIRA)
daile created SPARK-26919:
-

 Summary: change maven default compile java home
 Key: SPARK-26919
 URL: https://issues.apache.org/jira/browse/SPARK-26919
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 2.4.1
Reporter: daile
 Attachments: p1.png

  When I use "build/mvn -DskipTests clean package", the default java home 
configuration is "${java.home}". I tried this on macOS and Windows and found that 
the default java.home points to */jre, but the JRE does not provide the javac 
compile command. So I think it can be replaced with the system environment 
variable (JAVA_HOME), and I verified that the build then compiles successfully.

!image-2019-02-19-10-25-02-872.png!






[jira] [Created] (SPARK-26948) vertex and edge rowkey upgrade and support multiple types?

2019-02-20 Thread daile (JIRA)
daile created SPARK-26948:
-

 Summary: vertex and edge rowkey upgrade and support multiple types?
 Key: SPARK-26948
 URL: https://issues.apache.org/jira/browse/SPARK-26948
 Project: Spark
  Issue Type: Improvement
  Components: GraphX
Affects Versions: 2.4.0
Reporter: daile


Currently only Long is supported as the vertex and edge ID type, but most graph 
databases use strings as their primary keys.
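For context, GraphX pins the ID type through its VertexId alias (a Long). A minimal sketch of the usual workaround today, hashing string keys down to Long, assuming an active SparkSession named spark and hypothetical sample data:
{code:java}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

// String keys as they would come from a graph database (hypothetical sample).
val users = Seq("alice", "bob")

// Lossy mapping to the required Long IDs; hashCode collisions are possible.
val toId: String => VertexId = s => s.hashCode.toLong

val vertices = spark.sparkContext.parallelize(users.map(u => (toId(u), u)))
val edges = spark.sparkContext.parallelize(Seq(Edge(toId("alice"), toId("bob"), "follows")))
val graph = Graph(vertices, edges)
{code}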






[jira] [Resolved] (SPARK-26919) change maven default compile java home

2019-02-20 Thread daile (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile resolved SPARK-26919.
---
   Resolution: Done
Fix Version/s: 2.4.0

> change maven default compile java home
> --
>
> Key: SPARK-26919
> URL: https://issues.apache.org/jira/browse/SPARK-26919
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.4.1
>Reporter: daile
>Priority: Critical
> Fix For: 2.4.0
>
> Attachments: p1.png
>
>
>   When I use "build/mvn -DskipTests clean package", the default java home 
> configuration is "${java.home}". I tried this on macOS and Windows and found 
> that the default java.home points to */jre, but the JRE does not provide the 
> javac compile command. So I think it can be replaced with the system environment 
> variable (JAVA_HOME), and I verified that the build then compiles successfully.
> !image-2019-02-19-10-25-02-872.png!






[jira] [Commented] (SPARK-27336) Incorrect DataSet.summary() result

2019-09-01 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920557#comment-16920557
 ] 

daile commented on SPARK-27336:
---

I will check this issue

> Incorrect DataSet.summary() result
> --
>
> Key: SPARK-27336
> URL: https://issues.apache.org/jira/browse/SPARK-27336
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gengliang Wang
>Priority: Major
> Attachments: test.csv
>
>
> There is a single data point in the minimum_nights column that is 1.0E8 out 
> of 8k records, but .summary() says it is the 75% and the max.
> I compared this with approxQuantile, and approxQuantile for 75% gave the 
> correct value of 30.0.
> To reproduce:
> {code:java}
> scala> val df = 
> spark.read.format("csv").load("test.csv").withColumn("minimum_nights", 
> '_c0.cast("Int"))
> df: org.apache.spark.sql.DataFrame = [_c0: string, minimum_nights: int]
> scala> df.select("minimum_nights").summary().show()
> +-------+------------------+
> |summary|    minimum_nights|
> +-------+------------------+
> |  count|              7072|
> |   mean| 14156.35407239819|
> | stddev|1189128.5444975856|
> |    min|                 1|
> |    25%|                 2|
> |    50%|                 4|
> |    75%|         100000000|
> |    max|         100000000|
> +-------+------------------+
> scala> df.stat.approxQuantile("minimum_nights", Array(0.75), 0.1)
> res1: Array[Double] = Array(30.0)
> scala> df.stat.approxQuantile("minimum_nights", Array(0.75), 0.001)
> res2: Array[Double] = Array(30.0)
> scala> df.stat.approxQuantile("minimum_nights", Array(0.75), 0.0001)
> res3: Array[Double] = Array(1.0E8)
> {code}






[jira] [Commented] (SPARK-28694) Add Java/Scala StructuredKerberizedKafkaWordCount examples

2019-09-02 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920665#comment-16920665
 ] 

daile commented on SPARK-28694:
---

I will work on this

> Add Java/Scala StructuredKerberizedKafkaWordCount examples
> --
>
> Key: SPARK-28694
> URL: https://issues.apache.org/jira/browse/SPARK-28694
> Project: Spark
>  Issue Type: Improvement
>  Components: Examples, Structured Streaming
>Affects Versions: 3.0.0
>Reporter: hong dongdong
>Priority: Minor
>
> Currently, the `StructuredKafkaWordCount` example does not support accessing 
> Kafka with Kerberos authentication. Add a parameter that indicates whether 
> Kerberos is used.
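A minimal sketch of what such an example might pass through to the Kafka consumer (assuming a SASL/GSSAPI setup; the broker address and topic are placeholders, and the JAAS/keytab configuration is assumed to be supplied via spark-submit):
{code:java}
// Options prefixed with "kafka." are forwarded to the underlying Kafka consumer.
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "wordcount")
  .option("kafka.security.protocol", "SASL_PLAINTEXT")
  .option("kafka.sasl.kerberos.service.name", "kafka")
  .load()
{code}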






[jira] [Commented] (SPARK-28694) Add Java/Scala StructuredKerberizedKafkaWordCount examples

2019-09-02 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920677#comment-16920677
 ] 

daile commented on SPARK-28694:
---

ok

> Add Java/Scala StructuredKerberizedKafkaWordCount examples
> --
>
> Key: SPARK-28694
> URL: https://issues.apache.org/jira/browse/SPARK-28694
> Project: Spark
>  Issue Type: Improvement
>  Components: Examples, Structured Streaming
>Affects Versions: 3.0.0
>Reporter: hong dongdong
>Priority: Minor
>
> Currently, the `StructuredKafkaWordCount` example does not support accessing 
> Kafka with Kerberos authentication. Add a parameter that indicates whether 
> Kerberos is used.






[jira] [Created] (SPARK-28956) Make it tighter

2019-09-03 Thread daile (Jira)
daile created SPARK-28956:
-

 Summary: Make it tighter
 Key: SPARK-28956
 URL: https://issues.apache.org/jira/browse/SPARK-28956
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.0.0
Reporter: daile


{code:java}
// code placeholder
private def numStd(s: Double): Double = {
  // TODO: Make it tighter.
  if (s < 6.0) {
12.0
  } else if (s < 16.0) {
9.0
  } else {
6.0
  }
}{code}






[jira] [Updated] (SPARK-28956) Make it tighter

2019-09-03 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-28956:
--
Priority: Minor  (was: Major)

> Make it tighter
> ---
>
> Key: SPARK-28956
> URL: https://issues.apache.org/jira/browse/SPARK-28956
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: daile
>Priority: Minor
>
> {code:java}
> // code placeholder
> private def numStd(s: Double): Double = {
>   // TODO: Make it tighter.
>   if (s < 6.0) {
> 12.0
>   } else if (s < 16.0) {
> 9.0
>   } else {
> 6.0
>   }
> }{code}






[jira] [Commented] (SPARK-28121) String Functions: decode can not accept 'escape' and 'hex' as charset

2019-09-03 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921369#comment-16921369
 ] 

daile commented on SPARK-28121:
---

I will work on this

> String Functions: decode can not accept 'escape' and 'hex' as charset
> -
>
> Key: SPARK-28121
> URL: https://issues.apache.org/jira/browse/SPARK-28121
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> {noformat}
> postgres=# select decode('1234567890','escape');
> decode
> 
> \x31323334353637383930
> (1 row)
> {noformat}
> {noformat}
> spark-sql> select decode('1234567890','escape');
> 19/06/20 01:57:33 ERROR SparkSQLDriver: Failed in [select 
> decode('1234567890','escape')]
> java.io.UnsupportedEncodingException: escape
>   at java.lang.StringCoding.decode(StringCoding.java:190)
>   at java.lang.String.<init>(String.java:426)
>   at java.lang.String.<init>(String.java:491)
> ...
> spark-sql> select decode('ff','hex');
> 19/08/16 21:44:55 ERROR SparkSQLDriver: Failed in [select decode('ff','hex')]
> java.io.UnsupportedEncodingException: hex
>   at java.lang.StringCoding.decode(StringCoding.java:190)
>   at java.lang.String.<init>(String.java:426)
>   at java.lang.String.<init>(String.java:491)
> {noformat}






[jira] [Comment Edited] (SPARK-28121) String Functions: decode can not accept 'escape' and 'hex' as charset

2019-09-03 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921369#comment-16921369
 ] 

daile edited comment on SPARK-28121 at 9/3/19 1:27 PM:
---

You can use SQL like this:
{code:java}
select hex('1234567890');{code}
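A minimal sketch of the workaround in context (assuming the goal is a hex round-trip rather than a character-set conversion): Spark's decode() only accepts character sets, so hex encoding and decoding go through hex()/unhex() instead.
{code:java}
// hex() of a string encodes its UTF-8 bytes; unhex() turns the hex back into binary.
spark.sql("SELECT hex('1234567890')").show()                               // 31323334353637383930
spark.sql("SELECT decode(unhex('31323334353637383930'), 'UTF-8')").show()  // 1234567890
{code}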


was (Author: 726575...@qq.com):
i will work on this

> String Functions: decode can not accept 'escape' and 'hex' as charset
> -
>
> Key: SPARK-28121
> URL: https://issues.apache.org/jira/browse/SPARK-28121
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> {noformat}
> postgres=# select decode('1234567890','escape');
> decode
> 
> \x31323334353637383930
> (1 row)
> {noformat}
> {noformat}
> spark-sql> select decode('1234567890','escape');
> 19/06/20 01:57:33 ERROR SparkSQLDriver: Failed in [select 
> decode('1234567890','escape')]
> java.io.UnsupportedEncodingException: escape
>   at java.lang.StringCoding.decode(StringCoding.java:190)
>   at java.lang.String.<init>(String.java:426)
>   at java.lang.String.<init>(String.java:491)
> ...
> spark-sql> select decode('ff','hex');
> 19/08/16 21:44:55 ERROR SparkSQLDriver: Failed in [select decode('ff','hex')]
> java.io.UnsupportedEncodingException: hex
>   at java.lang.StringCoding.decode(StringCoding.java:190)
>   at java.lang.String.<init>(String.java:426)
>   at java.lang.String.<init>(String.java:491)
> {noformat}






[jira] [Commented] (SPARK-28990) SparkSQL invalid call to toAttribute on unresolved object, tree: *

2019-09-19 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933163#comment-16933163
 ] 

daile commented on SPARK-28990:
---

It seems to have been solved in 3.0

> SparkSQL invalid call to toAttribute on unresolved object, tree: *
> --
>
> Key: SPARK-28990
> URL: https://issues.apache.org/jira/browse/SPARK-28990
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: fengchaoge
>Priority: Major
>
> SparkSQL CREATE TABLE AS SELECT from a table that may not exist throws 
> exceptions like:
> {code}
> org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to 
> toAttribute on unresolved object, tree:
> {code}
> This is not friendly; a Spark user may have no idea what is wrong.
> A simple SQL statement can reproduce it, like this:
> {code}
> spark-sql (default)> create table default.spark as select * from default.dual;
> {code}
> {code}
> 2019-09-05 16:27:24,127 INFO (main) [Logging.scala:logInfo(54)] - Parsing 
> command: create table default.spark as select * from default.dual
> 2019-09-05 16:27:24,772 ERROR (main) [Logging.scala:logError(91)] - Failed in 
> [create table default.spark as select * from default.dual]
> org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to 
> toAttribute on unresolved object, tree: *
> at 
> org.apache.spark.sql.catalyst.analysis.Star.toAttribute(unresolved.scala:245)
> at 
> org.apache.spark.sql.catalyst.plans.logical.Project$$anonfun$output$1.apply(basicLogicalOperators.scala:52)
> at 
> org.apache.spark.sql.catalyst.plans.logical.Project$$anonfun$output$1.apply(basicLogicalOperators.scala:52)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:392)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:296)
> at 
> org.apache.spark.sql.catalyst.plans.logical.Project.output(basicLogicalOperators.scala:52)
> at 
> org.apache.spark.sql.hive.HiveAnalysis$$anonfun$apply$3.applyOrElse(HiveStrategies.scala:160)
> at 
> org.apache.spark.sql.hive.HiveAnalysis$$anonfun$apply$3.applyOrElse(HiveStrategies.scala:148)
> at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
> at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
> at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
> at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:107)
> at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:106)
> at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
> at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsDown(AnalysisHelper.scala:106)
> at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29)
> at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperators(AnalysisHelper.scala:73)
> at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:29)
> at org.apache.spark.sql.hive.HiveAnalysis$.apply(HiveStrategies.scala:148)
> at org.apache.spark.sql.hive.HiveAnalysis$.apply(HiveStrategies.scala:147)
> at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:87)
> at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:84)
> at 
> scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57)
> at 
> scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66)
> at scala.collection.mutable.ArrayBuffer.foldLeft(ArrayBuffer.scala:48)
> at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:84)
> at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:76)
> at scala.collection.immutable.List.foreach(List.scala:392)
> at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:76)
> at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:127)
> at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:12

[jira] [Commented] (SPARK-29174) LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source

2019-09-19 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933181#comment-16933181
 ] 

daile commented on SPARK-29174:
---

/**
 *
 * Expected format:
 * {{{
 * INSERT OVERWRITE DIRECTORY
 * [path]
 * [OPTIONS table_property_list]
 * select_statement;
 * }}}
 */

> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source
> ---
>
> Key: SPARK-29174
> URL: https://issues.apache.org/jira/browse/SPARK-29174
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> *USING does not work for INSERT OVERWRITE into a LOCAL directory, but works 
> when inserting overwrite into an HDFS directory.*
>
>  
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite directory 
> '/user/trash2/' using parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.448 seconds)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' using parquet select * from trash1 a where a.country='PAK';
> Error: org.apache.spark.sql.catalyst.parser.ParseException:
> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source(line 1, 
> pos 0)
>  
> == SQL ==
> insert overwrite local directory '/opt/trash2/' using parquet select * from 
> trash1 a where a.country='PAK'
> ^^^ (state=,code=0)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' stored as parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> | | |
>  






[jira] [Issue Comment Deleted] (SPARK-29174) LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source

2019-09-19 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-29174:
--
Comment: was deleted

(was: /**
 *
 * Expected format:
 * {{{
 * INSERT OVERWRITE DIRECTORY
 * [path]
 * [OPTIONS table_property_list]
 * select_statement;
 * }}}
 */)

> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source
> ---
>
> Key: SPARK-29174
> URL: https://issues.apache.org/jira/browse/SPARK-29174
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> *USING does not work for INSERT OVERWRITE into a LOCAL directory, but works 
> when inserting overwrite into an HDFS directory.*
>
>  
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite directory 
> '/user/trash2/' using parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.448 seconds)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' using parquet select * from trash1 a where a.country='PAK';
> Error: org.apache.spark.sql.catalyst.parser.ParseException:
> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source(line 1, 
> pos 0)
>  
> == SQL ==
> insert overwrite local directory '/opt/trash2/' using parquet select * from 
> trash1 a where a.country='PAK'
> ^^^ (state=,code=0)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' stored as parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> | | |
>  






[jira] [Commented] (SPARK-29174) LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source

2019-09-19 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933194#comment-16933194
 ] 

daile commented on SPARK-29174:
---

I tried removing the check and found that it works. I don't know whether there 
will be any other impact.

> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source
> ---
>
> Key: SPARK-29174
> URL: https://issues.apache.org/jira/browse/SPARK-29174
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> *USING does not work for INSERT OVERWRITE into a LOCAL directory, but works 
> when inserting overwrite into an HDFS directory.*
>
>  
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite directory 
> '/user/trash2/' using parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.448 seconds)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' using parquet select * from trash1 a where a.country='PAK';
> Error: org.apache.spark.sql.catalyst.parser.ParseException:
> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source(line 1, 
> pos 0)
>  
> == SQL ==
> insert overwrite local directory '/opt/trash2/' using parquet select * from 
> trash1 a where a.country='PAK'
> ^^^ (state=,code=0)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' stored as parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> | | |
>  






[jira] [Created] (SPARK-29586) spark jdbc method param lowerBound and upperBound DataType wrong

2019-10-24 Thread daile (Jira)
daile created SPARK-29586:
-

 Summary: spark jdbc method param lowerBound and upperBound 
DataType wrong
 Key: SPARK-29586
 URL: https://issues.apache.org/jira/browse/SPARK-29586
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.4.4, 3.0.0
Reporter: daile




```
private def toBoundValueInWhereClause(
value: Long,
columnType: DataType,
timeZoneId: String): String = {
  def dateTimeToString(): String = {
val dateTimeStr = columnType match {
  case DateType => DateFormatter().format(value.toInt)
  case TimestampType =>
val timestampFormatter = TimestampFormatter.getFractionFormatter(
  DateTimeUtils.getZoneId(timeZoneId))
DateTimeUtils.timestampToString(timestampFormatter, value)
}
s"'$dateTimeStr'"
  }
  columnType match {
case _: NumericType => value.toString
case DateType | TimestampType => dateTimeToString()
  }
}
```

partitionColumn supports NumericType, DateType and TimestampType, but the jdbc 
method only accepts Long bounds

```
test("jdbc Suite2") {
  val df = spark
.read
.option("partitionColumn", "B")
.option("lowerBound", "2017-01-01 10:00:00")
.option("upperBound", "2019-01-01 10:00:00")
.option("numPartitions", 5)
.jdbc(urlWithUserAndPass, "TEST.TIMETYPES",  new Properties())
  df.printSchema()
  df.show()
}
```

it's OK 





```
test("jdbc Suite") {
  val df = spark.read.jdbc(urlWithUserAndPass, "TEST.TIMETYPES", "B", 
1571899768024L, 1571899768024L, 5, new Properties())
  df.printSchema()
  df.show()
}
```

```
java.lang.IllegalArgumentException: Cannot parse the bound value 1571899768024 
as date
at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.$anonfun$toInternalBoundValue$1(JDBCRelation.scala:184)
at scala.Option.getOrElse(Option.scala:189)
at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.parse$1(JDBCRelation.scala:183)
at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.toInternalBoundValue(JDBCRelation.scala:189)
at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.columnPartition(JDBCRelation.scala:88)
at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
at 
org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:339)
at 
org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:240)
at 
org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:229)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:229)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:179)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:255)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:297)
at 
org.apache.spark.sql.jdbc.JDBCSuite.$anonfun$new$186(JDBCSuite.scala:1664)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
at 
org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
at 
org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
at 
org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
at 
org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
at 
org.apache.spark.sql.jdbc.JDBCSuite.org$scalatest$BeforeAndAfter$$super$runTest(JDBCSuite.scala:43)
at org.scalatest.BeforeAndAfter.runTest(BeforeAndAfter.scala:203)
at org.scalatest.BeforeAndAfter.runTest$(BeforeAndAfter.sc

[jira] [Updated] (SPARK-29586) spark jdbc method param lowerBound and upperBound DataType wrong

2019-10-24 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-29586:
--
Description: 
 
{code:java}
private def toBoundValueInWhereClause(
value: Long,
columnType: DataType,
timeZoneId: String): String = {
  def dateTimeToString(): String = {
val dateTimeStr = columnType match {
  case DateType => DateFormatter().format(value.toInt)
  case TimestampType =>
val timestampFormatter = TimestampFormatter.getFractionFormatter(
  DateTimeUtils.getZoneId(timeZoneId))
DateTimeUtils.timestampToString(timestampFormatter, value)
}
s"'$dateTimeStr'"
  }
  columnType match {
case _: NumericType => value.toString
case DateType | TimestampType => dateTimeToString()
  }
}{code}
partitionColumn supports NumericType, DateType and TimestampType, but the jdbc 
method only accepts Long bounds
test("jdbc Suite2") {
  val df = spark
    .read
    .option("partitionColumn", "B")
    .option("lowerBound", "2017-01-01 10:00:00")
    .option("upperBound", "2019-01-01 10:00:00")
    .option("numPartitions", 5)
    .jdbc(urlWithUserAndPass, "TEST.TIMETYPES",  new Properties())
  df.printSchema()
  df.show()
}
test("jdbc Suite2") {
  val df = spark
    .read
    .option("partitionColumn", "B")
    .option("lowerBound", "2017-01-01 10:00:00")
    .option("upperBound", "2019-01-01 10:00:00")
    .option("numPartitions", 5)
    .jdbc(urlWithUserAndPass, "TEST.TIMETYPES",  new Properties())
  df.printSchema()
  df.show()
}
test("jdbc Suite") {
  val df = spark.read.jdbc(urlWithUserAndPass, "TEST.TIMETYPES", "B", 
1571899768024L, 1571899768024L, 5, new Properties())
  df.printSchema()
  df.show()
}
java.lang.IllegalArgumentException: Cannot parse the bound value 1571899768024 
as date
  at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.$anonfun$toInternalBoundValue$1(JDBCRelation.scala:184)
  at scala.Option.getOrElse(Option.scala:189)
  at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.parse$1(JDBCRelation.scala:183)
  at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.toInternalBoundValue(JDBCRelation.scala:189)
  at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.columnPartition(JDBCRelation.scala:88)
  at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
  at 
org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:339)
  at 
org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:240)
  at 
org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:229)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:229)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:179)
  at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:255)
  at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:297)
  at org.apache.spark.sql.jdbc.JDBCSuite.$anonfun$new$186(JDBCSuite.scala:1664)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  at org.scalatest.Transformer.apply(Transformer.scala:20)
  at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
  at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
  at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
  at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
  at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
  at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
  at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
  at 
org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
  at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
  at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
  at 
org.apache.spark.sql.jdbc.JDBCSuite.org$scalatest$BeforeAndAfter$$super$runTest(JDBCSuite.scala:43)
  at org.scalatest.BeforeAndAfter.runTest(BeforeAndAfter.scala:203)
  at org.scalatest.BeforeAndAfter.runTest$(BeforeAndAfter.scala:192)
  at org.apache.spark.sql.jdbc.JDBCSuite.runTest(JDBCSuite.scala:43)
  at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
  at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
  at scala.collection.immutable.List.foreach(List.scala:392)
  at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
  at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
  at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
  at org.sc

[jira] [Updated] (SPARK-29586) spark jdbc method param lowerBound and upperBound DataType wrong

2019-10-24 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-29586:
--
Description: 
 
{code:java}
private def toBoundValueInWhereClause(
value: Long,
columnType: DataType,
timeZoneId: String): String = {
  def dateTimeToString(): String = {
val dateTimeStr = columnType match {
  case DateType => DateFormatter().format(value.toInt)
  case TimestampType =>
val timestampFormatter = TimestampFormatter.getFractionFormatter(
  DateTimeUtils.getZoneId(timeZoneId))
DateTimeUtils.timestampToString(timestampFormatter, value)
}
s"'$dateTimeStr'"
  }
  columnType match {
case _: NumericType => value.toString
case DateType | TimestampType => dateTimeToString()
  }
}{code}
partitionColumn supports NumericType, DateType and TimestampType, but the jdbc 
method only accepts Long bounds
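For reference, the overload used in the second test below declares its bounds as Long; this is how it appears in the public DataFrameReader API (shown here for context only):
{code:java}
def jdbc(
    url: String,
    table: String,
    columnName: String,
    lowerBound: Long,
    upperBound: Long,
    numPartitions: Int,
    connectionProperties: java.util.Properties): DataFrame
{code}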

 
{code:java}
test("jdbc Suite2") {
  val df = spark
.read
.option("partitionColumn", "B")
.option("lowerBound", "2017-01-01 10:00:00")
.option("upperBound", "2019-01-01 10:00:00")
.option("numPartitions", 5)
.jdbc(urlWithUserAndPass, "TEST.TIMETYPES",  new Properties())
  df.printSchema()
  df.show()
}
{code}
it's OK

 
{code:java}
test("jdbc Suite") { val df = spark.read.jdbc(urlWithUserAndPass, 
"TEST.TIMETYPES", "B", 1571899768024L, 1571899768024L, 5, new Properties()) 
df.printSchema() df.show() }
{code}
 
{code:java}
java.lang.IllegalArgumentException: Cannot parse the bound value 1571899768024 
as date
 at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.$anonfun$toInternalBoundValue$1(JDBCRelation.scala:184)
 at scala.Option.getOrElse(Option.scala:189) at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.parse$1(JDBCRelation.scala:183)
 at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.toInternalBoundValue(JDBCRelation.scala:189)
 at 
org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.columnPartition(JDBCRelation.scala:88)
 at 
org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
 at 
org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:339)
 at 
org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:240) at 
org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:229) 
at scala.Option.getOrElse(Option.scala:189) at 
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:229) at 
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:179) at 
org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:255) at 
org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:297) at 
org.apache.spark.sql.jdbc.JDBCSuite.$anonfun$new$186(JDBCSuite.scala:1664) at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at 
org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) at 
org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) at 
org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at 
org.scalatest.Transformer.apply(Transformer.scala:22) at 
org.scalatest.Transformer.apply(Transformer.scala:20) at 
org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186) at 
org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149) at 
org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184) at 
org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196) at 
org.scalatest.SuperEngine.runTestImpl(Engine.scala:289) at 
org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196) at 
org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178) at 
org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
 at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221) at 
org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214) at 
org.apache.spark.sql.jdbc.JDBCSuite.org$scalatest$BeforeAndAfter$$super$runTest(JDBCSuite.scala:43)
 at org.scalatest.BeforeAndAfter.runTest(BeforeAndAfter.scala:203) at 
org.scalatest.BeforeAndAfter.runTest$(BeforeAndAfter.scala:192) at 
org.apache.spark.sql.jdbc.JDBCSuite.runTest(JDBCSuite.scala:43) at 
org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229) at 
org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396) at 
scala.collection.immutable.List.foreach(List.scala:392) at 
org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384) at 
org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379) at 
org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461) at 
org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229) at 
org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228) at 
org.scalatest.FunSuite.runTests(FunSuite.scala:1560) at 
org.scalatest.Suite.run(Suite.scal

[jira] [Commented] (SPARK-29596) Task duration not updating for running tasks

2019-12-27 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004261#comment-17004261
 ] 

daile commented on SPARK-29596:
---

[~hyukjin.kwon] I checked the problem, reproduced it on version 2.4.4, and will 
raise a PR soon.

> Task duration not updating for running tasks
> 
>
> Key: SPARK-29596
> URL: https://issues.apache.org/jira/browse/SPARK-29596
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.4.2
>Reporter: Bharati Jadhav
>Priority: Major
> Attachments: Screenshot_Spark_live_WebUI.png
>
>
> When looking at the task metrics for running tasks in the task table for the 
> related stage, the duration column is not updated until the task has 
> succeeded. The duration values are reported empty or 0 ms until the task has 
> completed. This is a change in behavior, from earlier versions, when the task 
> duration was continuously updated while the task was running. The missing 
> duration values can be observed for both short and long running tasks and for 
> multiple applications.
>  
> To reproduce this, one can run any code from the spark-shell and observe the 
> missing duration values for any running task. Only when the task succeeds is 
> the duration value populated in the UI.
>  
>  






[jira] [Commented] (SPARK-29596) Task duration not updating for running tasks

2020-01-05 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008552#comment-17008552
 ] 

daile commented on SPARK-29596:
---

[~hyukjin.kwon] The task detail list uses task.taskMetrics info, but 
task.taskMetrics is only updated when a task finishes. Is it feasible to get the 
task duration while it is still running? 
https://github.com/apache/spark/pull/27026
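A minimal sketch of one way the UI could derive a duration for a task that is still running (an assumption for illustration, not necessarily what the linked PR does), based on the fields of the REST API's TaskData:
{code:java}
import org.apache.spark.status.api.v1.TaskData

// Fall back to "time since launch" while the completed-task duration is still absent.
def displayedDuration(task: TaskData, now: Long): Long =
  task.duration.getOrElse(now - task.launchTime.getTime)
{code}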

> Task duration not updating for running tasks
> 
>
> Key: SPARK-29596
> URL: https://issues.apache.org/jira/browse/SPARK-29596
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.4.2
>Reporter: Bharati Jadhav
>Priority: Major
> Attachments: Screenshot_Spark_live_WebUI.png
>
>
> When looking at the task metrics for running tasks in the task table for the 
> related stage, the duration column is not updated until the task has 
> succeeded. The duration values are reported empty or 0 ms until the task has 
> completed. This is a change in behavior, from earlier versions, when the task 
> duration was continuously updated while the task was running. The missing 
> duration values can be observed for both short and long running tasks and for 
> multiple applications.
>  
> To reproduce this, one can run any code from the spark-shell and observe the 
> missing duration values for any running task. Only when the task succeeds is 
> the duration value populated in the UI.
>  
>  






[jira] [Commented] (SPARK-31686) Return of String instead of array in function get_json_object

2020-05-12 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105939#comment-17105939
 ] 

daile commented on SPARK-31686:
---

[~bruneltouopi] looks like it was specifically removed
{code:java}
val buf = buffer.getBuffer
if (dirty > 1) {
  g.writeRawValue(buf.toString)
} else if (dirty == 1) {
  // remove outer array tokens
  g.writeRawValue(buf.substring(1, buf.length()-1))
} // else do not write anything
{code}
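A minimal sketch reproducing the report in Spark SQL (assuming an active SparkSession named spark; per the description, the single matching element comes back as a quoted string rather than a one-element JSON array):
{code:java}
spark.sql(
  """SELECT get_json_object(
    |  '{"customer":{"addresses":[{"location":"arizona"}]}}',
    |  '$.customer.addresses[*].location') AS loc""".stripMargin
).show(truncate = false)
// per the report: loc is "arizona" (a string), not ["arizona"]
{code}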

> Return of String instead of array in function get_json_object
> -
>
> Key: SPARK-31686
> URL: https://issues.apache.org/jira/browse/SPARK-31686
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.5
> Environment: {code:json}
> {
>   "customer": {
>     "addresses": [
>       { "location": "arizona" }
>     ]
>   }
> }
> {code}
> get_json_object(string(customer),'$addresses[*].location')
> returns "arizona"
> but the expected result should be
> ["arizona"]
>Reporter: Touopi Touopi
>Priority: Major
>
> When selecting a node of a JSON object that is an array, and the array contains 
> one element, get_json_object returns a string wrapped in " characters instead of 
> an array with one element.
>  






[jira] [Created] (SPARK-31193) set spark.master and spark.app.name conf default value

2020-03-19 Thread daile (Jira)
daile created SPARK-31193:
-

 Summary: set spark.master and spark.app.name conf default value
 Key: SPARK-31193
 URL: https://issues.apache.org/jira/browse/SPARK-31193
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.4.5, 2.4.4, 2.4.3, 2.4.2, 2.4.0, 2.3.3, 2.3.0, 3.1.0
Reporter: daile
 Fix For: 3.1.0









[jira] [Updated] (SPARK-31193) set spark.master and spark.app.name conf default value

2020-03-19 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-31193:
--
Description: 
 

 
I see the default value of the master setting in the spark-submit client:

```scala
// Global defaults. These should be keep to minimum to avoid confusing behavior.
master = Option(master).getOrElse("local[*]")
```

but during our development and debugging we will encounter this kind of problem:

Exception in thread "main" org.apache.spark.SparkException: A master URL must 
be set in your configuration

This conflicts with the default setting.

```scala
// If we do
val sparkConf = new SparkConf().setAppName("app")
// When using the client to submit tasks to the cluster, the master will be
// overwritten by the local value.
sparkConf.set("spark.master", "local[*]")
```

so we have to do it like this:

```scala
val sparkConf = new SparkConf().setAppName("app")
// Because a master set in the program takes priority, we first have to check
// whether it is already set, so that submitting to a cluster still works.
sparkConf.set("spark.master", sparkConf.get("spark.master", "local[*]"))
```

The same applies to spark.app.name.

Would it be better to handle this for users the way the submit client does?

> set spark.master and spark.app.name conf default value
> --
>
> Key: SPARK-31193
> URL: https://issues.apache.org/jira/browse/SPARK-31193
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0, 2.3.3, 2.4.0, 2.4.2, 2.4.3, 2.4.4, 2.4.5, 3.1.0
>Reporter: daile
>Priority: Major
> Fix For: 3.1.0
>
>
>  
>  
> I see the default value of the master setting in the spark-submit client:
> ```scala
> // Global defaults. These should be keep to minimum to avoid confusing behavior.
> master = Option(master).getOrElse("local[*]")
> ```
> but during our development and debugging we will encounter this kind of problem:
> Exception in thread "main" org.apache.spark.SparkException: A master URL must 
> be set in your configuration
> This conflicts with the default setting.
> ```scala
> // If we do
> val sparkConf = new SparkConf().setAppName("app")
> // When using the client to submit tasks to the cluster, the master will be
> // overwritten by the local value.
> sparkConf.set("spark.master", "local[*]")
> ```
> so we have to do it like this:
> ```scala
> val sparkConf = new SparkConf().setAppName("app")
> // Because a master set in the program takes priority, we first have to check
> // whether it is already set, so that submitting to a cluster still works.
> sparkConf.set("spark.master", sparkConf.get("spark.master", "local[*]"))
> ```
> The same applies to spark.app.name.
> Would it be better to handle this for users the way the submit client does?






[jira] [Updated] (SPARK-31193) set spark.master and spark.app.name conf default value

2020-03-19 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-31193:
--
Description: 
I see the default value of the master setting in the spark-submit client:
{code:java}
// Global defaults. These should be keep to minimum to avoid confusing behavior.
master = Option(master).getOrElse("local[*]")
{code}
but during our development and debugging we will encounter this kind of problem:

Exception in thread "main" org.apache.spark.SparkException: A master URL must 
be set in your configuration

This conflicts with the default setting.

{code:java}
// If we do
val sparkConf = new SparkConf().setAppName("app")
// When using the client to submit tasks to the cluster, the master will be
// overwritten by the local value.
sparkConf.set("spark.master", "local[*]"){code}

so we have to do it like this:
{code:java}
val sparkConf = new SparkConf().setAppName("app")
// Because a master set in the program takes priority, we first have to check
// whether it is already set, so that submitting to a cluster still works.
sparkConf.set("spark.master", sparkConf.get("spark.master", "local[*]")){code}

The same applies to spark.app.name.

Would it be better to handle this for users the way the submit client does?

  was:
 

 
I see the default value of the master setting in the spark-submit client:

```scala
// Global defaults. These should be keep to minimum to avoid confusing behavior.
master = Option(master).getOrElse("local[*]")
```

but during our development and debugging we will encounter this kind of problem:

Exception in thread "main" org.apache.spark.SparkException: A master URL must 
be set in your configuration

This conflicts with the default setting.

```scala
// If we do
val sparkConf = new SparkConf().setAppName("app")
// When using the client to submit tasks to the cluster, the master will be
// overwritten by the local value.
sparkConf.set("spark.master", "local[*]")
```

so we have to do it like this:

```scala
val sparkConf = new SparkConf().setAppName("app")
// Because a master set in the program takes priority, we first have to check
// whether it is already set, so that submitting to a cluster still works.
sparkConf.set("spark.master", sparkConf.get("spark.master", "local[*]"))
```

The same applies to spark.app.name.

Would it be better to handle this for users the way the submit client does?


> set spark.master and spark.app.name conf default value
> --
>
> Key: SPARK-31193
> URL: https://issues.apache.org/jira/browse/SPARK-31193
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0, 2.3.3, 2.4.0, 2.4.2, 2.4.3, 2.4.4, 2.4.5, 3.1.0
>Reporter: daile
>Priority: Major
> Fix For: 3.1.0
>
>
> I see the default value of the master setting in the spark-submit client:
> {code:java}
> // Global defaults. These should be keep to minimum to avoid confusing behavior.
> master = Option(master).getOrElse("local[*]")
> {code}
> but during our development and debugging we will encounter this kind of problem:
> Exception in thread "main" org.apache.spark.SparkException: A master URL must 
> be set in your configuration
> This conflicts with the default setting.
> {code:java}
> // If we do
> val sparkConf = new SparkConf().setAppName("app")
> // When using the client to submit tasks to the cluster, the master will be
> // overwritten by the local value.
> sparkConf.set("spark.master", "local[*]"){code}
> so we have to do it like this:
> {code:java}
> val sparkConf = new SparkConf().setAppName("app")
> // Because a master set in the program takes priority, we first have to check
> // whether it is already set, so that submitting to a cluster still works.
> sparkConf.set("spark.master", sparkConf.get("spark.master", "local[*]")){code}
> The same applies to spark.app.name.
> Would it be better to handle this for users the way the submit client does?
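A minimal sketch of the suggestion (an assumption about one possible shape, using SparkConf.setIfMissing so that a value supplied by spark-submit is never overridden):
{code:java}
import org.apache.spark.SparkConf

val sparkConf = new SparkConf().setAppName("app")
// Only applies the local default when no master was supplied (e.g. by spark-submit).
sparkConf.setIfMissing("spark.master", "local[*]")
{code}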






[jira] [Created] (SPARK-31457) spark jdbc read hive created the wrong PreparedStatement

2020-04-15 Thread daile (Jira)
daile created SPARK-31457:
-

 Summary: spark jdbc read hive created the wrong PreparedStatement
 Key: SPARK-31457
 URL: https://issues.apache.org/jira/browse/SPARK-31457
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.3.2, 3.1.0
 Environment: spark 2.3.2

hive 2.1.1

 
Reporter: daile


{code:java}
val res = spark
  .read
  .format("jdbc")
  .option("url", "jdbc:hive2://host:1/default")
  .option("dbtable", "user_info2")
  .option("driver","org.apache.hive.jdbc.HiveDriver")
  .option("user", "")
  .option("password","")
  .load()

res.show(){code}
get the wrong result:

+--------------+--------------+-------------------+
|user_info2.age|user_info2.sex|user_info2.birthday|
+--------------+--------------+-------------------+
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
|user_info2.age|user_info2.sex|user_info2.birthday|
+--------------+--------------+-------------------+
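The symptom above is most likely the classic identifier-quoting mismatch: Spark's default JDBC dialect quotes column names with double quotes, which HiveQL parses as string literals, so every row echoes the column names. A minimal sketch of the usual workaround, registering a Hive-aware dialect (an illustration, not a dialect bundled with Spark):
{code:java}
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

object HiveDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:hive2")
  // Quote identifiers with backticks so Hive does not read them as string literals.
  override def quoteIdentifier(colName: String): String = s"`$colName`"
}

JdbcDialects.registerDialect(HiveDialect)
{code}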

 






[jira] [Updated] (SPARK-31457) spark jdbc read hive created the wrong PreparedStatement

2020-04-15 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-31457:
--
Attachment: hivejdbc3.png

> spark jdbc read hive created the wrong PreparedStatement
> 
>
> Key: SPARK-31457
> URL: https://issues.apache.org/jira/browse/SPARK-31457
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2, 3.1.0
> Environment: spark 2.3.2
> hive 2.1.1
>  
>Reporter: daile
>Priority: Major
> Attachments: hivejdbc2.png, hivejdbc3.png, sparkhivejdbc.png
>
>
> {code:java}
> val res = spark
>   .read
>   .format("jdbc")
>   .option("url", "jdbc:hive2://host:1/default")
>   .option("dbtable", "user_info2")
>   .option("driver","org.apache.hive.jdbc.HiveDriver")
>   .option("user", "")
>   .option("password","")
>   .load()
> res.show(){code}
> get the wrong result:
> +--------------+--------------+-------------------+
> |user_info2.age|user_info2.sex|user_info2.birthday|
> +--------------+--------------+-------------------+
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> +--------------+--------------+-------------------+
>  






[jira] [Updated] (SPARK-31457) spark jdbc read hive created the wrong PreparedStatement

2020-04-15 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-31457:
--
Attachment: hivejdbc2.png

> spark jdbc read hive created the wrong PreparedStatement
> 
>
> Key: SPARK-31457
> URL: https://issues.apache.org/jira/browse/SPARK-31457
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2, 3.1.0
> Environment: spark 2.3.2
> hive 2.1.1
>  
>Reporter: daile
>Priority: Major
> Attachments: hivejdbc2.png, hivejdbc3.png, sparkhivejdbc.png
>
>
> {code:java}
> val res = spark
>   .read
>   .format("jdbc")
>   .option("url", "jdbc:hive2://host:1/default")
>   .option("dbtable", "user_info2")
>   .option("driver","org.apache.hive.jdbc.HiveDriver")
>   .option("user", "")
>   .option("password","")
>   .load()
> res.show(){code}
> get the wrong result:
> +--------------+--------------+-------------------+
> |user_info2.age|user_info2.sex|user_info2.birthday|
> +--------------+--------------+-------------------+
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> +--------------+--------------+-------------------+
>  






[jira] [Updated] (SPARK-31457) spark jdbc read hive created the wrong PreparedStatement

2020-04-15 Thread daile (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-31457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

daile updated SPARK-31457:
--
Attachment: sparkhivejdbc.png

> spark jdbc read hive created the wrong PreparedStatement
> 
>
> Key: SPARK-31457
> URL: https://issues.apache.org/jira/browse/SPARK-31457
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2, 3.1.0
> Environment: spark 2.3.2
> hive 2.1.1
>  
>Reporter: daile
>Priority: Major
> Attachments: hivejdbc2.png, hivejdbc3.png, sparkhivejdbc.png
>
>
> {code:java}
> val res = spark
>   .read
>   .format("jdbc")
>   .option("url", "jdbc:hive2://host:1/default")
>   .option("dbtable", "user_info2")
>   .option("driver","org.apache.hive.jdbc.HiveDriver")
>   .option("user", "")
>   .option("password","")
>   .load()
> res.show(){code}
> get the wrong result:
> +--------------+--------------+-------------------+
> |user_info2.age|user_info2.sex|user_info2.birthday|
> +--------------+--------------+-------------------+
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> |user_info2.age|user_info2.sex|user_info2.birthday|
> +--------------+--------------+-------------------+
>  






[jira] [Commented] (SPARK-36422) Examples can't run in IDE directly

2022-09-07 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17601181#comment-17601181
 ] 

daile commented on SPARK-36422:
---

You can try setting the IDEA run option:
{code:java}
Include dependencies with "Provided" scope. {code}

> Examples can't run in IDE directly
> --
>
> Key: SPARK-36422
> URL: https://issues.apache.org/jira/browse/SPARK-36422
> Project: Spark
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Minor
>
> I found the examples can't run in an IDE (such as IntelliJ).
> For example, if you run `org.apache.spark.examples.sql.JavaUserDefinedScalar` in 
> an IDE, the error message is as follows:
> {code:java}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/spark/sql/SparkSession
>   at 
> org.apache.spark.examples.sql.JavaUserDefinedScalar.main(JavaUserDefinedScalar.java:33)
> Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
>   ... 1 more
> {code}
>  






[jira] [Commented] (SPARK-40099) Merge adjacent CaseWhen branches if their values are the same

2023-02-08 Thread daile (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-40099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686249#comment-17686249
 ] 

daile commented on SPARK-40099:
---

[~yumwang] Could you help review it again?

> Merge adjacent CaseWhen branches if their values are the same
> -
>
> Key: SPARK-40099
> URL: https://issues.apache.org/jira/browse/SPARK-40099
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yuming Wang
>Priority: Major
>
> For example:
> {code:sql}
>   CASE
> WHEN f1.buyer_id IS NOT NULL THEN 1
> WHEN f2.buyer_id IS NOT NULL THEN 1
> ELSE 0
>   END
> {code}
> The expected result:
> {code:sql}
>   CASE
> WHEN f1.buyer_id IS NOT NULL or f2.buyer_id IS NOT NULL 
> THEN 1
> ELSE 0
>   END
> {code}
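A minimal sketch of such a rewrite over Catalyst's CaseWhen (an illustrative assumption, not the actual optimizer rule): adjacent branches whose values are semantically equal are collapsed by OR-ing their conditions.
{code:java}
import org.apache.spark.sql.catalyst.expressions.{CaseWhen, Expression, Or}

def mergeAdjacentBranches(cw: CaseWhen): CaseWhen = {
  val merged = cw.branches.foldLeft(List.empty[(Expression, Expression)]) {
    // Previous branch carries the same value: OR the conditions together.
    case ((prevCond, prevValue) :: rest, (cond, value)) if value.semanticEquals(prevValue) =>
      (Or(prevCond, cond), prevValue) :: rest
    // Otherwise keep the branch as-is.
    case (acc, branch) => branch :: acc
  }.reverse
  CaseWhen(merged, cw.elseValue)
}
{code}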


