[jira] [Assigned] (SPARK-27102) Remove the references to Python's Scala code in R's Scala code

2019-03-09 Thread Hyukjin Kwon (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-27102:


Assignee: Hyukjin Kwon

> Remove the references to Python's Scala code in R's Scala code
> 
>
> Key: SPARK-27102
> URL: https://issues.apache.org/jira/browse/SPARK-27102
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, R, Spark Core
>Affects Versions: 3.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>
> Currently, R's Scala code happens to refer to Python's Scala code for 
> deduplication, which is a bit odd. For instance, when an exception is raised 
> from R, it shows a Python-related code path, which makes debugging confusing.
> There should instead be one shared code base that both R and Python use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27102) Remove the references to Python's Scala code in R's Scala code

2019-03-09 Thread Hyukjin Kwon (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-27102.
--
   Resolution: Fixed
Fix Version/s: 3.0.0

Issue resolved by pull request 24023
[https://github.com/apache/spark/pull/24023]

> Remove the references to Python's Scala code in R's Scala code
> 
>
> Key: SPARK-27102
> URL: https://issues.apache.org/jira/browse/SPARK-27102
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark, R, Spark Core
>Affects Versions: 3.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently, R's Scala code happens to refer to Python's Scala code for 
> deduplication, which is a bit odd. For instance, when an exception is raised 
> from R, it shows a Python-related code path, which makes debugging confusing.
> There should instead be one shared code base that both R and Python use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27097) Avoid embedding platform-dependent offsets literally in whole-stage generated code

2019-03-09 Thread DB Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DB Tsai resolved SPARK-27097.
-
   Resolution: Fixed
Fix Version/s: 2.4.1

Issue resolved by pull request 24032
[https://github.com/apache/spark/pull/24032]

> Avoid embedding platform-dependent offsets literally in whole-stage generated 
> code
> --
>
> Key: SPARK-27097
> URL: https://issues.apache.org/jira/browse/SPARK-27097
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Xiao Li
>Assignee: Kris Mok
>Priority: Critical
>  Labels: correctness
> Fix For: 2.4.1
>
>
> Avoid embedding platform-dependent offsets literally in whole-stage generated 
> code.
> Spark SQL performs whole-stage code generation to speed up query execution. 
> There are two steps to it:
> # Java source code is generated from the physical query plan on the driver. A 
> single version of the source code is generated from a query plan and sent to 
> all executors. It's compiled to bytecode on the driver to catch compilation 
> errors before sending to executors, but currently only the generated source 
> code gets sent to the executors; the bytecode compilation is for fail-fast only.
> # Executors receive the generated source code and compile it to bytecode, then 
> the query runs like a hand-written Java program.
> In this model, there's an implicit assumption about the driver and executors 
> being run on similar platforms. Some code paths accidentally embedded 
> platform-dependent object layout information into the generated code, such as:
> {code:java}
> Platform.putLong(buffer, /* offset */ 24, /* value */ 1);
> {code}
> This code expects a field to be at offset +24 of the buffer object, and sets 
> a value to that field.
> But whole-stage code generation generally uses platform-dependent information 
> from the driver. If the object layout is significantly different on the 
> driver and executors, the generated code can read from or write to wrong 
> offsets on the executors, causing all kinds of data corruption.
> One code pattern that leads to such a problem is the use of Platform.XXX 
> constants in generated code, e.g. Platform.BYTE_ARRAY_OFFSET.
> Bad:
> {code:scala}
> val baseOffset = Platform.BYTE_ARRAY_OFFSET
> // codegen template:
> s"Platform.putLong($buffer, $baseOffset, $value);"
> {code}
> This will embed the value of Platform.BYTE_ARRAY_OFFSET on the driver into 
> the generated code.
> Good:
> {code:scala}
> val baseOffset = "Platform.BYTE_ARRAY_OFFSET"
> // codegen template:
> s"Platform.putLong($buffer, $baseOffset, $value);"
> {code}
> This will generate the offset symbolically -- Platform.putLong(buffer, 
> Platform.BYTE_ARRAY_OFFSET, value) -- which will be able to pick up the correct 
> value on the executors.
> Caveat: these offset constants are declared as runtime-initialized static 
> finals in Java, so they're not compile-time constants from the Java language's 
> perspective. Referencing them symbolically does lead to slightly larger 
> generated code, but this is necessary for correctness.
> NOTE: there can be other patterns that generate platform-dependent code on 
> the driver which is invalid on the executors, e.g. if the endianness differs 
> between the driver and the executors and some generated code makes strong 
> assumptions about endianness, that would also be problematic.
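
Editorial sketch (not from the ticket): a tiny, self-contained illustration of why the two codegen template styles quoted above emit different Java source. The literal offset value 16 is an arbitrary stand-in for whatever the driver JVM happens to report.
{code:scala}
// Runs with plain Scala; "Platform" appears here only as text in the generated
// source, standing in for org.apache.spark.unsafe.Platform.
object OffsetTemplateSketch {
  val buffer = "buffer"
  val value = "value"

  // Bad: the driver's numeric offset is baked into the generated source.
  def badTemplate(driverOffset: Long): String =
    s"Platform.putLong($buffer, $driverOffset, $value);"

  // Good: the constant is referenced by name, so each executor resolves its own value.
  def goodTemplate: String =
    s"Platform.putLong($buffer, Platform.BYTE_ARRAY_OFFSET, $value);"

  def main(args: Array[String]): Unit = {
    println(badTemplate(16L)) // Platform.putLong(buffer, 16, value);
    println(goodTemplate)     // Platform.putLong(buffer, Platform.BYTE_ARRAY_OFFSET, value);
  }
}
{code}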



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27120) Upgrade scalatest version to 3.0.5

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-27120:
--
Priority: Trivial  (was: Major)

> Upgrade scalatest version to 3.0.5
> --
>
> Key: SPARK-27120
> URL: https://issues.apache.org/jira/browse/SPARK-27120
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Trivial
>
> ScalaTest 3.0.5 Release Notes:
> h2. Bug Fixes
>  * Fixed the implicit view not available problem when used with the compile macro.
>  * Fixed a stack depth problem in {{RefSpecLike}} and {{fixture.SpecLike}} 
> under Scala 2.13.
>  * Changed {{Framework}} and {{ScalaTestFramework}} to set 
> {{spanScaleFactor}} for Runner object instances for different Runners using 
> different class loaders. This fixed a problem whereby an incorrect 
> {{Runner.spanScaleFactor}} could be used when the tests for multiple sbt 
> projects were run concurrently.
>  * Fixed a bug in the {{endsWith}} regex matcher.
> h2. Improvements
>  * Removed duplicated parsing code for -C in {{ArgsParser}}.
>  * Improved performance in {{WebBrowser}}.
>  * Documentation typo rectification.
>  * Improved validity of JUnit XML reports.
>  * Improved performance by replacing all {{.size == 0}} and {{.length == 0}} 
> with {{.isEmpty}}.
> h2. Enhancements
>  * Added the {{'C'}} option to {{-P}}, which tells {{-P}} to use a cached 
> thread pool.
> h2. External Dependencies Update
>  * Bumped up the {{scala-js}} version to 0.6.22.
>  * Changed to depend on {{mockito-core}}, not {{mockito-all}}.
>  * Bumped up the {{jmock}} version to 2.8.3.
>  * Bumped up the {{junit}} version to 4.12.
>  * Removed the dependency on {{scala-parser-combinators}}.
> More details:
> http://www.scalatest.org/release_notes/3.0.5



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27120) Upgrade scalatest version to 3.0.5

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27120:


Assignee: Apache Spark

> Upgrade scalatest version to 3.0.5
> --
>
> Key: SPARK-27120
> URL: https://issues.apache.org/jira/browse/SPARK-27120
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Apache Spark
>Priority: Major
>
> ScalaTest 3.0.5 Release Notes:
> h2. Bug Fixes
>  * Fixed the implicit view not available problem when used with the compile macro.
>  * Fixed a stack depth problem in {{RefSpecLike}} and {{fixture.SpecLike}} 
> under Scala 2.13.
>  * Changed {{Framework}} and {{ScalaTestFramework}} to set 
> {{spanScaleFactor}} for Runner object instances for different Runners using 
> different class loaders. This fixed a problem whereby an incorrect 
> {{Runner.spanScaleFactor}} could be used when the tests for multiple sbt 
> projects were run concurrently.
>  * Fixed a bug in the {{endsWith}} regex matcher.
> h2. Improvements
>  * Removed duplicated parsing code for -C in {{ArgsParser}}.
>  * Improved performance in {{WebBrowser}}.
>  * Documentation typo rectification.
>  * Improved validity of JUnit XML reports.
>  * Improved performance by replacing all {{.size == 0}} and {{.length == 0}} 
> with {{.isEmpty}}.
> h2. Enhancements
>  * Added the {{'C'}} option to {{-P}}, which tells {{-P}} to use a cached 
> thread pool.
> h2. External Dependencies Update
>  * Bumped up the {{scala-js}} version to 0.6.22.
>  * Changed to depend on {{mockito-core}}, not {{mockito-all}}.
>  * Bumped up the {{jmock}} version to 2.8.3.
>  * Bumped up the {{junit}} version to 4.12.
>  * Removed the dependency on {{scala-parser-combinators}}.
> More details:
> http://www.scalatest.org/release_notes/3.0.5



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27120) Upgrade scalatest version to 3.0.5

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27120:


Assignee: (was: Apache Spark)

> Upgrade scalatest version to 3.0.5
> --
>
> Key: SPARK-27120
> URL: https://issues.apache.org/jira/browse/SPARK-27120
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> ScalaTest 3.0.5 Release Notes:
> h2. Bug Fixes
>  * Fixed the implicit view not available problem when used with the compile macro.
>  * Fixed a stack depth problem in {{RefSpecLike}} and {{fixture.SpecLike}} 
> under Scala 2.13.
>  * Changed {{Framework}} and {{ScalaTestFramework}} to set 
> {{spanScaleFactor}} for Runner object instances for different Runners using 
> different class loaders. This fixed a problem whereby an incorrect 
> {{Runner.spanScaleFactor}} could be used when the tests for multiple sbt 
> projects were run concurrently.
>  * Fixed a bug in the {{endsWith}} regex matcher.
> h2. Improvements
>  * Removed duplicated parsing code for -C in {{ArgsParser}}.
>  * Improved performance in {{WebBrowser}}.
>  * Documentation typo rectification.
>  * Improved validity of JUnit XML reports.
>  * Improved performance by replacing all {{.size == 0}} and {{.length == 0}} 
> with {{.isEmpty}}.
> h2. Enhancements
>  * Added the {{'C'}} option to {{-P}}, which tells {{-P}} to use a cached 
> thread pool.
> h2. External Dependencies Update
>  * Bumped up the {{scala-js}} version to 0.6.22.
>  * Changed to depend on {{mockito-core}}, not {{mockito-all}}.
>  * Bumped up the {{jmock}} version to 2.8.3.
>  * Bumped up the {{junit}} version to 4.12.
>  * Removed the dependency on {{scala-parser-combinators}}.
> More details:
> http://www.scalatest.org/release_notes/3.0.5



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27120) Upgrade scalatest version to 3.0.5

2019-03-09 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-27120:
---

 Summary: Upgrade scalatest version to 3.0.5
 Key: SPARK-27120
 URL: https://issues.apache.org/jira/browse/SPARK-27120
 Project: Spark
  Issue Type: Improvement
  Components: Tests
Affects Versions: 3.0.0
Reporter: Yuming Wang


ScalaTest 3.0.5 Release Notes:
h2. Bug Fixes
 * Fixed the implicit view not available problem when used with the compile macro.
 * Fixed a stack depth problem in {{RefSpecLike}} and {{fixture.SpecLike}} 
under Scala 2.13.
 * Changed {{Framework}} and {{ScalaTestFramework}} to set {{spanScaleFactor}} 
for Runner object instances for different Runners using different class 
loaders. This fixed a problem whereby an incorrect {{Runner.spanScaleFactor}} 
could be used when the tests for multiple sbt projects were run concurrently.
 * Fixed a bug in the {{endsWith}} regex matcher.

h2. Improvements
 * Removed duplicated parsing code for -C in {{ArgsParser}}.
 * Improved performance in {{WebBrowser}}.
 * Documentation typo rectification.
 * Improved validity of JUnit XML reports.
 * Improved performance by replacing all {{.size == 0}} and {{.length == 0}} 
with {{.isEmpty}}.

h2. Enhancements
 * Added the {{'C'}} option to {{-P}}, which tells {{-P}} to use a cached thread 
pool.

h2. External Dependencies Update
 * Bumped up the {{scala-js}} version to 0.6.22.
 * Changed to depend on {{mockito-core}}, not {{mockito-all}}.
 * Bumped up the {{jmock}} version to 2.8.3.
 * Bumped up the {{junit}} version to 4.12.
 * Removed the dependency on {{scala-parser-combinators}}.

More details:
http://www.scalatest.org/release_notes/3.0.5
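
As a side note, the bump itself is a one-line dependency change in whichever build definition is being touched; a minimal sbt-style sketch (illustrative only -- Spark's own build pins the ScalaTest version in its build files, not here):
{code:scala}
// build.sbt fragment (illustrative): pull in ScalaTest 3.0.5 for the test scope.
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.5" % Test
{code}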



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27118) Upgrade to latest Hive version for Hive Metastore Client 1.1 and 1.0

2019-03-09 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-27118.
---
   Resolution: Fixed
 Assignee: Yuming Wang
Fix Version/s: 3.0.0

This is resolved via https://github.com/apache/spark/pull/24040

> Upgrade to latest Hive version for Hive Metastore Client 1.1 and 1.0
> 
>
> Key: SPARK-27118
> URL: https://issues.apache.org/jira/browse/SPARK-27118
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.0.0
>
>
> Hive 1.1.1 and Hive 1.0.1 releases are available. We should upgrade the Hive 
> Metastore Client versions accordingly.
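
For context, the metastore client version a Spark application talks to is selected through the standard spark.sql.hive.metastore.* settings; a hedged sketch of how a user would pick up the newer client once this lands (the accepted version strings depend on the Spark release in use):
{code:scala}
import org.apache.spark.sql.SparkSession

// Hedged sketch: select the Hive metastore client version at session start-up.
object MetastoreClientVersionExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("metastore-client-version-example")
      .config("spark.sql.hive.metastore.version", "1.1.1")
      .config("spark.sql.hive.metastore.jars", "maven") // resolve the client jars from Maven
      .enableHiveSupport()
      .getOrCreate()
    spark.sql("SHOW DATABASES").show()
    spark.stop()
  }
}
{code}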



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27054) Remove Calcite dependency

2019-03-09 Thread Dongjoon Hyun (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-27054.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

This is resolved via https://github.com/apache/spark/pull/23970

> Remove Calcite dependency
> -
>
> Key: SPARK-27054
> URL: https://issues.apache.org/jira/browse/SPARK-27054
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.0.0
>
>
> Calcite is only used for 
> [runSqlHive|https://github.com/apache/spark/blob/02bbe977abaf7006b845a7e99d612b0235aa0025/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L699-L705]
>  when 
> {{hive.cbo.enable=true}} ([SemanticAnalyzer|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java#L278-L280]).
> So we can disable {{hive.cbo.enable}} and remove the Calcite dependency.
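
Purely as an editorial sketch of the Hive-side switch involved (where exactly Spark sets it is a detail of the PR), forcing CBO off in a HiveConf looks like this; with CBO disabled, Hive's SemanticAnalyzerFactory never takes the Calcite-backed path:
{code:scala}
import org.apache.hadoop.hive.conf.HiveConf

// Editorial sketch only: disable hive.cbo.enable programmatically.
object DisableHiveCbo {
  def main(args: Array[String]): Unit = {
    val conf = new HiveConf()
    conf.setBoolVar(HiveConf.ConfVars.HIVE_CBO_ENABLED, false)
    println(conf.getBoolVar(HiveConf.ConfVars.HIVE_CBO_ENABLED)) // false
  }
}
{code}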



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27060) DDL Commands are accepting Keywords like create, drop as tableName

2019-03-09 Thread Takeshi Yamamuro (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788813#comment-16788813
 ] 

Takeshi Yamamuro commented on SPARK-27060:
--

This issue is currently in progress; please see SPARK-26976.

> DDL Commands are accepting Keywords like create, drop as tableName
> --
>
> Key: SPARK-27060
> URL: https://issues.apache.org/jira/browse/SPARK-27060
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.3.2, 2.4.0
>Reporter: Sachin Ramachandra Setty
>Priority: Minor
>
> This seems to be a compatibility issue compared to other components such as 
> Hive and MySQL. 
> DDL commands succeed even though the tableName is the same as a keyword. 
> Tested with columnNames as well, and the issue exists there too. 
> By contrast, Hive-Beeline throws a ParseException and does not accept keywords 
> as a tableName or columnName, while MySQL accepts keywords only as a columnName.
> Spark-Behaviour :
> {code}
> Connected to: Spark SQL (version 2.3.2.0101)
> CLI_DBMS_APPID
> Beeline version 1.2.1.spark_2.3.2.0101 by Apache Hive
> 0: jdbc:hive2://10.18.3.XXX:23040/default> create table create(id int);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.255 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> create table drop(int int);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.257 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> drop table drop;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.236 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> drop table create;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.168 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> create table tab1(float float);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.111 seconds)
> 0: jdbc:hive2://10.18.XXX:23040/default> create table double(double float);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.093 seconds)
> {code}
> Hive-Behaviour :
> {code}
> Connected to: Apache Hive (version 3.1.0)
> Driver: Hive JDBC (version 3.1.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 3.1.0 by Apache Hive
> 0: jdbc:hive2://10.18.XXX:21066/> create table create(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:13 
> cannot recognize input near 'create' '(' 'id' in table name 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18.XXX:21066/> create table drop(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:13 
> cannot recognize input near 'drop' '(' 'id' in table name 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18XXX:21066/> create table tab1(float float);
> Error: Error while compiling statement: FAILED: ParseException line 1:18 
> cannot recognize input near 'float' 'float' ')' in column name or constraint 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18XXX:21066/> drop table create(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:11 
> cannot recognize input near 'create' '(' 'id' in table name 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18.XXX:21066/> drop table drop(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:11 
> cannot recognize input near 'drop' '(' 'id' in table name 
> (state=42000,code=4)
> mySql :
> CREATE TABLE CREATE(ID integer);
> Error: near "CREATE": syntax error
> CREATE TABLE DROP(ID integer);
> Error: near "DROP": syntax error
> CREATE TABLE TAB1(FLOAT FLOAT);
> Success
> {code}
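
For completeness, the Spark side of the report can also be reproduced from a plain Scala program (editorial sketch, assuming a local SparkSession on one of the affected versions; the USING clause is added only so the example does not require Hive support):
{code:scala}
import org.apache.spark.sql.SparkSession

// Editorial sketch: Spark accepts a reserved word as a table name, while
// Hive-Beeline and MySQL reject the equivalent statement.
object KeywordTableNameRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("keyword-repro").getOrCreate()
    spark.sql("CREATE TABLE create(id INT) USING parquet") // accepted on 2.3.x / 2.4.x
    spark.sql("DROP TABLE create")
    spark.stop()
  }
}
{code}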



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2019-03-09 Thread Takeshi Yamamuro (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788811#comment-16788811
 ] 

Takeshi Yamamuro commented on SPARK-25863:
--

Thanks a lot!

> java.lang.UnsupportedOperationException: empty.max at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala:1475)
> -
>
> Key: SPARK-25863
> URL: https://issues.apache.org/jira/browse/SPARK-25863
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer, Spark Core
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Ruslan Dautkhanov
>Assignee: Takeshi Yamamuro
>Priority: Major
>  Labels: cache, catalyst, code-generation
> Fix For: 2.3.4, 2.4.2, 3.0.0
>
>
> Failing task : 
> {noformat}
> An error occurred while calling o2875.collectToPython.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 58 
> in stage 21413.0 failed 4 times, most recent failure: Lost task 58.3 in stage 
> 21413.0 (TID 4057314, pc1udatahad117, executor 431): 
> java.lang.UnsupportedOperationException: empty.max
> at scala.collection.TraversableOnce$class.max(TraversableOnce.scala:229)
> at scala.collection.AbstractTraversable.max(Traversable.scala:104)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala:1475)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:1418)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1493)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1490)
> at 
> org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
> at 
> org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
> at 
> org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
> at org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
> at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
> at org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
> at 
> org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1365)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$.create(GeneratePredicate.scala:81)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$.create(GeneratePredicate.scala:40)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:1321)
> at 
> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:1318)
> at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:401)
> at 
> org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$filteredCachedBatches$1.apply(InMemoryTableScanExec.scala:263)
> at 
> org.apache.spark.sql.execution.columnar.InMemoryTableScanExec$$anonfun$filteredCachedBatches$1.apply(InMemoryTableScanExec.scala:262)
> at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24.apply(RDD.scala:818)
> at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24.apply(RDD.scala:818)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:109)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
> at 
> 
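
Editorial note: the root failure in the stack trace above is plain Scala behaviour -- calling {{max}} on an empty collection throws {{UnsupportedOperationException: empty.max}} -- which is why updateAndGetCompilationStats needs to guard the empty case. A tiny sketch:
{code:scala}
object EmptyMaxDemo {
  def main(args: Array[String]): Unit = {
    val methodSizes = Seq.empty[Int]
    // methodSizes.max               // throws java.lang.UnsupportedOperationException: empty.max
    val safeMax = if (methodSizes.nonEmpty) methodSizes.max else 0
    println(safeMax)                 // 0
  }
}
{code}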

[jira] [Updated] (SPARK-27111) A continuous query may fail with InterruptedException when the Kafka consumer temporarily has 0 partitions

2019-03-09 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-27111:
-
Fix Version/s: 2.3.4

> A continuous query may fail with InterruptedException when the Kafka 
> consumer temporarily has 0 partitions
> 
>
> Key: SPARK-27111
> URL: https://issues.apache.org/jira/browse/SPARK-27111
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0, 2.4.1
>Reporter: Shixiong Zhu
>Assignee: Shixiong Zhu
>Priority: Major
> Fix For: 2.3.4, 2.4.2, 3.0.0
>
>
> Before a Kafka consumer is assigned any partitions, its offsets will contain 
> 0 partitions. However, runContinuous will still run and launch a Spark job 
> having 0 partitions. In this case, there is a race in which an epoch may 
> interrupt the query execution thread after `lastExecution.toRdd`, and either 
> `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` or the next 
> `runContinuous` will get interrupted unintentionally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27111) A continuous query may fail with InterruptedException when the Kafka consumer temporarily has 0 partitions

2019-03-09 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu resolved SPARK-27111.
--
Resolution: Fixed

> A continuous query may fail with InterruptedException when the Kafka 
> consumer temporarily has 0 partitions
> 
>
> Key: SPARK-27111
> URL: https://issues.apache.org/jira/browse/SPARK-27111
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0, 2.4.1
>Reporter: Shixiong Zhu
>Assignee: Shixiong Zhu
>Priority: Major
> Fix For: 2.3.4, 2.4.2, 3.0.0
>
>
> Before a Kafka consumer is assigned any partitions, its offsets will contain 
> 0 partitions. However, runContinuous will still run and launch a Spark job 
> having 0 partitions. In this case, there is a race in which an epoch may 
> interrupt the query execution thread after `lastExecution.toRdd`, and either 
> `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` or the next 
> `runContinuous` will get interrupted unintentionally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27111) A continuous query may fail with InterruptedException when the Kafka consumer temporarily has 0 partitions

2019-03-09 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-27111:
-
Fix Version/s: 2.4.2

> A continuous query may fail with InterruptedException when the Kafka 
> consumer temporarily has 0 partitions
> 
>
> Key: SPARK-27111
> URL: https://issues.apache.org/jira/browse/SPARK-27111
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0, 2.4.1
>Reporter: Shixiong Zhu
>Assignee: Shixiong Zhu
>Priority: Major
> Fix For: 2.4.2, 3.0.0
>
>
> Before a Kafka consumer is assigned any partitions, its offsets will contain 
> 0 partitions. However, runContinuous will still run and launch a Spark job 
> having 0 partitions. In this case, there is a race in which an epoch may 
> interrupt the query execution thread after `lastExecution.toRdd`, and either 
> `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` or the next 
> `runContinuous` will get interrupted unintentionally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27111) A continuous query may fail with InterruptedException when the Kafka consumer temporarily has 0 partitions

2019-03-09 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-27111:
-
Fix Version/s: 3.0.0

> A continuous query may fail with InterruptedException when the Kafka 
> consumer temporarily has 0 partitions
> 
>
> Key: SPARK-27111
> URL: https://issues.apache.org/jira/browse/SPARK-27111
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0, 2.4.1
>Reporter: Shixiong Zhu
>Assignee: Shixiong Zhu
>Priority: Major
> Fix For: 3.0.0
>
>
> Before a Kafka consumer is assigned any partitions, its offsets will contain 
> 0 partitions. However, runContinuous will still run and launch a Spark job 
> having 0 partitions. In this case, there is a race in which an epoch may 
> interrupt the query execution thread after `lastExecution.toRdd`, and either 
> `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` or the next 
> `runContinuous` will get interrupted unintentionally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27111) A continuous query may fail with InterruptedException when the Kafka consumer temporarily has 0 partitions

2019-03-09 Thread Shixiong Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shixiong Zhu updated SPARK-27111:
-
Affects Version/s: 2.4.1
   2.4.0

> A continuous query may fail with InterruptedException when the Kafka 
> consumer temporarily has 0 partitions
> 
>
> Key: SPARK-27111
> URL: https://issues.apache.org/jira/browse/SPARK-27111
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0, 2.4.1
>Reporter: Shixiong Zhu
>Assignee: Shixiong Zhu
>Priority: Major
>
> Before a Kafka consumer is assigned any partitions, its offsets will contain 
> 0 partitions. However, runContinuous will still run and launch a Spark job 
> having 0 partitions. In this case, there is a race in which an epoch may 
> interrupt the query execution thread after `lastExecution.toRdd`, and either 
> `epochEndpoint.askSync[Unit](StopContinuousExecutionWrites)` or the next 
> `runContinuous` will get interrupted unintentionally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27060) DDL Commands are accepting Keywords like create, drop as tableName

2019-03-09 Thread Sujith Chacko (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788778#comment-16788778
 ] 

Sujith Chacko commented on SPARK-27060:
---

cc [~maropu]

> DDL Commands are accepting Keywords like create, drop as tableName
> --
>
> Key: SPARK-27060
> URL: https://issues.apache.org/jira/browse/SPARK-27060
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.3.2, 2.4.0
>Reporter: Sachin Ramachandra Setty
>Priority: Minor
>
> This seems to be a compatibility issue compared to other components such as 
> Hive and MySQL. 
> DDL commands succeed even though the tableName is the same as a keyword. 
> Tested with columnNames as well, and the issue exists there too. 
> By contrast, Hive-Beeline throws a ParseException and does not accept keywords 
> as a tableName or columnName, while MySQL accepts keywords only as a columnName.
> Spark-Behaviour :
> {code}
> Connected to: Spark SQL (version 2.3.2.0101)
> CLI_DBMS_APPID
> Beeline version 1.2.1.spark_2.3.2.0101 by Apache Hive
> 0: jdbc:hive2://10.18.3.XXX:23040/default> create table create(id int);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.255 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> create table drop(int int);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.257 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> drop table drop;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.236 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> drop table create;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.168 seconds)
> 0: jdbc:hive2://10.18.3.XXX:23040/default> create table tab1(float float);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.111 seconds)
> 0: jdbc:hive2://10.18.XXX:23040/default> create table double(double float);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.093 seconds)
> {code}
> Hive-Behaviour :
> {code}
> Connected to: Apache Hive (version 3.1.0)
> Driver: Hive JDBC (version 3.1.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 3.1.0 by Apache Hive
> 0: jdbc:hive2://10.18.XXX:21066/> create table create(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:13 
> cannot recognize input near 'create' '(' 'id' in table name 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18.XXX:21066/> create table drop(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:13 
> cannot recognize input near 'drop' '(' 'id' in table name 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18XXX:21066/> create table tab1(float float);
> Error: Error while compiling statement: FAILED: ParseException line 1:18 
> cannot recognize input near 'float' 'float' ')' in column name or constraint 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18XXX:21066/> drop table create(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:11 
> cannot recognize input near 'create' '(' 'id' in table name 
> (state=42000,code=4)
> 0: jdbc:hive2://10.18.XXX:21066/> drop table drop(id int);
> Error: Error while compiling statement: FAILED: ParseException line 1:11 
> cannot recognize input near 'drop' '(' 'id' in table name 
> (state=42000,code=4)
> mySql :
> CREATE TABLE CREATE(ID integer);
> Error: near "CREATE": syntax error
> CREATE TABLE DROP(ID integer);
> Error: near "DROP": syntax error
> CREATE TABLE TAB1(FLOAT FLOAT);
> Success
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-09 Thread Martin Loncaric (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788773#comment-16788773
 ] 

Martin Loncaric edited comment on SPARK-26555 at 3/9/19 7:04 PM:
-

You can literally try any dataset with Options in the schema and replicate 
this issue. For example,

sparkSession.createDataset(Seq(
  MyClass(new Timestamp(1L), "b", "c", Some("d"), Some(1.0), 
Some(2.0))
))

I think the code I left is pretty clear: it fails sometimes. Run it once, and 
it may or may not work. I don't run multiple spark-submits in parallel.


was (Author: mwlon):
You can literally try any dataset and replicate this issue. For example,

sparkSession.createDataset(Seq(
  MyClass(new Timestamp(1L), "b", "c", Some("d"), Some(1.0), 
Some(2.0))
))

I think the code I left is pretty clear - it fails sometimes. Run it once, and 
it may or may not work. I don't run multiple spark-submit's in parallel.

> Thread safety issue causes createDataset to fail with misleading errors
> ---
>
> Key: SPARK-26555
> URL: https://issues.apache.org/jira/browse/SPARK-26555
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Martin Loncaric
>Priority: Major
>
> This can be replicated (~2% of the time) with
> {code:scala}
> import java.sql.Timestamp
> import java.util.concurrent.{Executors, Future}
> import org.apache.spark.sql.SparkSession
> import scala.collection.mutable.ListBuffer
> import scala.concurrent.ExecutionContext
> import scala.util.Random
> object Main {
>   def main(args: Array[String]): Unit = {
> val sparkSession = SparkSession.builder
>   .getOrCreate()
> import sparkSession.implicits._
> val executor = Executors.newFixedThreadPool(1)
> try {
>   implicit val xc: ExecutionContext = 
> ExecutionContext.fromExecutorService(executor)
>   val futures = new ListBuffer[Future[_]]()
>   for (i <- 1 to 3) {
> futures += executor.submit(new Runnable {
>   override def run(): Unit = {
> val d = if (Random.nextInt(2) == 0) Some("d value") else None
> val e = if (Random.nextInt(2) == 0) Some(5.0) else None
> val f = if (Random.nextInt(2) == 0) Some(6.0) else None
> println("DEBUG", d, e, f)
> sparkSession.createDataset(Seq(
>   MyClass(new Timestamp(1L), "b", "c", d, e, f)
> ))
>   }
> })
>   }
>   futures.foreach(_.get())
> } finally {
>   println("SHUTDOWN")
>   executor.shutdown()
>   sparkSession.stop()
> }
>   }
>   case class MyClass(
> a: Timestamp,
> b: String,
> c: String,
> d: Option[String],
> e: Option[Double],
> f: Option[Double]
>   )
> }
> {code}
> So it will usually come up during
> {code:bash}
> for i in $(seq 1 200); do
>   echo $i
>   spark-submit --master local[4] target/scala-2.11/spark-test_2.11-0.1.jar
> done
> {code}
> causing a variety of possible errors, such as
> {code}Exception in thread "main" java.util.concurrent.ExecutionException: 
> scala.MatchError: scala.Option[String] (of class 
> scala.reflect.internal.Types$ClassArgsTypeRef)
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> Caused by: scala.MatchError: scala.Option[String] (of class 
> scala.reflect.internal.Types$ClassArgsTypeRef)
>   at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$deserializerFor$1.apply(ScalaReflection.scala:210){code}
> or
> {code}Exception in thread "main" java.util.concurrent.ExecutionException: 
> java.lang.UnsupportedOperationException: Schema for type 
> scala.Option[scala.Double] is not supported
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> Caused by: java.lang.UnsupportedOperationException: Schema for type 
> scala.Option[scala.Double] is not supported
>   at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:789){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-09 Thread Martin Loncaric (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788773#comment-16788773
 ] 

Martin Loncaric commented on SPARK-26555:
-

You can literally try any dataset and replicate this issue. For example,

sparkSession.createDataset(Seq(
  MyClass(new Timestamp(1L), "b", "c", Some("d"), Some(1.0), 
Some(2.0))
))

I think the code I left is pretty clear: it fails sometimes. Run it once, and 
it may or may not work. I don't run multiple spark-submits in parallel.
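
Editorial aside, not part of the comment above: one workaround commonly used for this class of reflection race is to derive the encoder once on the calling thread before submitting work in parallel. A hedged sketch, reusing the MyClass case class from the report:
{code:scala}
import java.sql.Timestamp
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

case class MyClass(a: Timestamp, b: String, c: String,
                   d: Option[String], e: Option[Double], f: Option[Double])

// Hedged workaround sketch (not an official fix): the encoder is derived once,
// up front, so parallel callers never race inside ScalaReflection.
object PreDerivedEncoderExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[4]").appName("encoder-example").getOrCreate()
    implicit val enc: Encoder[MyClass] = Encoders.product[MyClass]
    val ds = spark.createDataset(Seq(
      MyClass(new Timestamp(1L), "b", "c", Some("d"), Some(1.0), Some(2.0))))
    ds.show()
    spark.stop()
  }
}
{code}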

> Thread safety issue causes createDataset to fail with misleading errors
> ---
>
> Key: SPARK-26555
> URL: https://issues.apache.org/jira/browse/SPARK-26555
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Martin Loncaric
>Priority: Major
>
> This can be replicated (~2% of the time) with
> {code:scala}
> import java.sql.Timestamp
> import java.util.concurrent.{Executors, Future}
> import org.apache.spark.sql.SparkSession
> import scala.collection.mutable.ListBuffer
> import scala.concurrent.ExecutionContext
> import scala.util.Random
> object Main {
>   def main(args: Array[String]): Unit = {
> val sparkSession = SparkSession.builder
>   .getOrCreate()
> import sparkSession.implicits._
> val executor = Executors.newFixedThreadPool(1)
> try {
>   implicit val xc: ExecutionContext = 
> ExecutionContext.fromExecutorService(executor)
>   val futures = new ListBuffer[Future[_]]()
>   for (i <- 1 to 3) {
> futures += executor.submit(new Runnable {
>   override def run(): Unit = {
> val d = if (Random.nextInt(2) == 0) Some("d value") else None
> val e = if (Random.nextInt(2) == 0) Some(5.0) else None
> val f = if (Random.nextInt(2) == 0) Some(6.0) else None
> println("DEBUG", d, e, f)
> sparkSession.createDataset(Seq(
>   MyClass(new Timestamp(1L), "b", "c", d, e, f)
> ))
>   }
> })
>   }
>   futures.foreach(_.get())
> } finally {
>   println("SHUTDOWN")
>   executor.shutdown()
>   sparkSession.stop()
> }
>   }
>   case class MyClass(
> a: Timestamp,
> b: String,
> c: String,
> d: Option[String],
> e: Option[Double],
> f: Option[Double]
>   )
> }
> {code}
> So it will usually come up during
> {code:bash}
> for i in $(seq 1 200); do
>   echo $i
>   spark-submit --master local[4] target/scala-2.11/spark-test_2.11-0.1.jar
> done
> {code}
> causing a variety of possible errors, such as
> {code}Exception in thread "main" java.util.concurrent.ExecutionException: 
> scala.MatchError: scala.Option[String] (of class 
> scala.reflect.internal.Types$ClassArgsTypeRef)
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> Caused by: scala.MatchError: scala.Option[String] (of class 
> scala.reflect.internal.Types$ClassArgsTypeRef)
>   at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$deserializerFor$1.apply(ScalaReflection.scala:210){code}
> or
> {code}Exception in thread "main" java.util.concurrent.ExecutionException: 
> java.lang.UnsupportedOperationException: Schema for type 
> scala.Option[scala.Double] is not supported
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> Caused by: java.lang.UnsupportedOperationException: Schema for type 
> scala.Option[scala.Double] is not supported
>   at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:789){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-26770) Misleading/unhelpful error message when wrapping a null in an Option

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-26770.
---
Resolution: Not A Problem

> Misleading/unhelpful error message when wrapping a null in an Option
> 
>
> Key: SPARK-26770
> URL: https://issues.apache.org/jira/browse/SPARK-26770
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.3.2
>Reporter: sam
>Priority: Minor
>
> This
> {code}
> // Using options to indicate nullable fields
> case class Product(productID: Option[Int],
>productName: Option[String])
> val productExtract: Dataset[Product] =
> spark.createDataset(Seq(
>   Product(
> productID = Some(6050286),
> // user mistake here, should be `None` not `Some(null)`
> productName = Some(null)
>   )))
> productExtract.count()
> {code}
> will give an error like the one below.  This error is thrown from quite deep 
> down, but there should be some handling logic further up to check for nulls 
> and to give a more informative error message.  E.g. it could tell the user 
> which field is null, or it could detect the `Some(null)` mistake and suggest 
> using `None`.
> Whatever the exception, it shouldn't be an NPE; since this is clearly a user 
> error, it should be some kind of user-error exception.
> {code}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 9 in stage 1.0 failed 4 times, most recent failure: Lost task 9.3 in 
> stage 1.0 (TID 276, 10.139.64.8, executor 1): java.lang.NullPointerException
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter.write(UnsafeRowWriter.java:194)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.serializefromobject_doConsume_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.mapelements_doConsume_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:620)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:384)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> I've seen quite a few other people with this error, but I don't think it's 
> for the same reason:
> https://docs.databricks.com/spark/latest/data-sources/tips/redshift-npe.html
> https://groups.google.com/a/lists.datastax.com/forum/#!topic/spark-connector-user/Dt6ilC9Dn54
> https://issues.apache.org/jira/browse/SPARK-17195
> https://issues.apache.org/jira/browse/SPARK-18859
> https://github.com/datastax/spark-cassandra-connector/issues/1062
> https://stackoverflow.com/questions/39875711/spark-sql-2-0-nullpointerexception-with-a-valid-postgresql-query
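
Editorial sketch of the user-level distinction at the heart of this report (plain Scala, no Spark needed): {{Option(x)}} is the null-safe constructor, whereas {{Some(null)}} wraps the null and defeats the nullability the field is meant to express.
{code:scala}
object SomeNullVsOption {
  def main(args: Array[String]): Unit = {
    val ok: Option[String]   = Option(null) // None -- almost always what the user wants
    val trap: Option[String] = Some(null)   // Some(null) -- the mistake described above
    println(ok)   // None
    println(trap) // Some(null)
  }
}
{code}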



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-25350) Spark Serving

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-25350.
---
Resolution: Won't Fix

> Spark Serving
> -
>
> Key: SPARK-25350
> URL: https://issues.apache.org/jira/browse/SPARK-25350
> Project: Spark
>  Issue Type: New Feature
>  Components: Structured Streaming
>Affects Versions: 2.3.1
>Reporter: Mark Hamilton
>Priority: Major
>  Labels: features
>
> Microsoft has created a new system to turn Structured Streaming jobs into 
> RESTful web services. We would like to commit this work back to the 
> community. 
> More information can be found at the [MMLSpark website|http://www.aka.ms/spark] 
> and the [Spark Serving Documentation 
> page|https://github.com/Azure/mmlspark/blob/master/docs/mmlspark-serving.md].
>  
> The code can be found in the MMLSpark Repo and a PR will be made soon:
> [https://github.com/Azure/mmlspark/blob/master/src/io/http/src/main/scala/HTTPSource.scala]
>  
> Thanks for your help and feedback!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-25982.
---
Resolution: Not A Problem

> Dataframe write is non blocking in fair scheduling mode
> ---
>
> Key: SPARK-25982
> URL: https://issues.apache.org/jira/browse/SPARK-25982
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.1
>Reporter: Ramandeep Singh
>Priority: Major
>
> Hi,
> I have noticed that the expected blocking behavior of the DataFrame write 
> operation does not hold in fair scheduling mode.
> Ideally, when a DataFrame write is occurring and a future is blocking on 
> AwaitResult, no other job should be started, but this is not the case. I have 
> noticed that other jobs are started while the partitions are being written.
>  
> Regards,
> Ramandeep Singh
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-26261) Spark does not check completeness of temporary files

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-26261.
---
Resolution: Not A Problem

> Spark does not check completeness of temporary files 
> -
>
> Key: SPARK-26261
> URL: https://issues.apache.org/jira/browse/SPARK-26261
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.2
>Reporter: Jialin LIu
>Priority: Minor
>
> Spark does not check the completeness of temporary files. When persisting to 
> disk is enabled on some RDDs, a bunch of temporary files will be created in 
> the blockmgr folder. The block manager is able to detect missing blocks, but 
> it is not able to detect file content being modified during execution. 
> Our initial test shows that if we truncate a block file before it is used 
> by executors, the program finishes without detecting any error, but the 
> result content is totally wrong.
> We believe there should be a checksum on every RDD file block so that these 
> files are protected against corruption.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26555) Thread safety issue causes createDataset to fail with misleading errors

2019-03-09 Thread Sean Owen (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788771#comment-16788771
 ] 

Sean Owen commented on SPARK-26555:
---

What is the fixed data set that reproduces this, to be clear?
And you mean that if you run it once it works, but fails in parallel?

> Thread safety issue causes createDataset to fail with misleading errors
> ---
>
> Key: SPARK-26555
> URL: https://issues.apache.org/jira/browse/SPARK-26555
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Martin Loncaric
>Priority: Major
>
> This can be replicated (~2% of the time) with
> {code:scala}
> import java.sql.Timestamp
> import java.util.concurrent.{Executors, Future}
> import org.apache.spark.sql.SparkSession
> import scala.collection.mutable.ListBuffer
> import scala.concurrent.ExecutionContext
> import scala.util.Random
> object Main {
>   def main(args: Array[String]): Unit = {
> val sparkSession = SparkSession.builder
>   .getOrCreate()
> import sparkSession.implicits._
> val executor = Executors.newFixedThreadPool(1)
> try {
>   implicit val xc: ExecutionContext = 
> ExecutionContext.fromExecutorService(executor)
>   val futures = new ListBuffer[Future[_]]()
>   for (i <- 1 to 3) {
> futures += executor.submit(new Runnable {
>   override def run(): Unit = {
> val d = if (Random.nextInt(2) == 0) Some("d value") else None
> val e = if (Random.nextInt(2) == 0) Some(5.0) else None
> val f = if (Random.nextInt(2) == 0) Some(6.0) else None
> println("DEBUG", d, e, f)
> sparkSession.createDataset(Seq(
>   MyClass(new Timestamp(1L), "b", "c", d, e, f)
> ))
>   }
> })
>   }
>   futures.foreach(_.get())
> } finally {
>   println("SHUTDOWN")
>   executor.shutdown()
>   sparkSession.stop()
> }
>   }
>   case class MyClass(
> a: Timestamp,
> b: String,
> c: String,
> d: Option[String],
> e: Option[Double],
> f: Option[Double]
>   )
> }
> {code}
> So it will usually come up during
> {code:bash}
> for i in $(seq 1 200); do
>   echo $i
>   spark-submit --master local[4] target/scala-2.11/spark-test_2.11-0.1.jar
> done
> {code}
> causing a variety of possible errors, such as
> {code}Exception in thread "main" java.util.concurrent.ExecutionException: 
> scala.MatchError: scala.Option[String] (of class 
> scala.reflect.internal.Types$ClassArgsTypeRef)
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> Caused by: scala.MatchError: scala.Option[String] (of class 
> scala.reflect.internal.Types$ClassArgsTypeRef)
>   at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$org$apache$spark$sql$catalyst$ScalaReflection$$deserializerFor$1.apply(ScalaReflection.scala:210){code}
> or
> {code}Exception in thread "main" java.util.concurrent.ExecutionException: 
> java.lang.UnsupportedOperationException: Schema for type 
> scala.Option[scala.Double] is not supported
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> Caused by: java.lang.UnsupportedOperationException: Schema for type 
> scala.Option[scala.Double] is not supported
>   at 
> org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:789){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27090) Removing old LEGACY_DRIVER_IDENTIFIER ("<driver>")

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-27090:
--
   Docs Text: The executor ID for the driver has been "driver" rather 
than "<driver>" since Spark 1.5. Spark 3 no longer uses or recognizes this ID 
for the driver.
Target Version/s: 3.0.0
  Labels: release-notes  (was: )
Priority: Minor  (was: Major)
  Issue Type: Task  (was: Bug)
 Summary: Removing old LEGACY_DRIVER_IDENTIFIER ("<driver>")  (was: 
Deplementing old LEGACY_DRIVER_IDENTIFIER ("<driver>"))

> Removing old LEGACY_DRIVER_IDENTIFIER ("<driver>")
> --
>
> Key: SPARK-27090
> URL: https://issues.apache.org/jira/browse/SPARK-27090
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Attila Zsolt Piros
>Priority: Minor
>  Labels: release-notes
>
> For legacy reasons LEGACY_DRIVER_IDENTIFIER was checked in a few places, 
> along with the new DRIVER_IDENTIFIER ("driver"), to decide whether a driver 
> or an executor is running.
> The new DRIVER_IDENTIFIER ("driver") was introduced in Spark version 1.4, so 
> I think we have a chance to get rid of the LEGACY_DRIVER_IDENTIFIER.
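
Editorial sketch of the kind of check this removal simplifies (the string values mirror the two identifiers named above; they are written out locally here because the real constants are internal to Spark):
{code:scala}
// Editorial sketch: with the legacy ID gone, "is this the driver?" checks reduce
// to a single comparison against "driver".
object DriverIdCheck {
  val DriverIdentifier = "driver"          // DRIVER_IDENTIFIER, Spark >= 1.4
  val LegacyDriverIdentifier = "<driver>"  // LEGACY_DRIVER_IDENTIFIER, being removed

  def isDriverBefore(id: String): Boolean = id == DriverIdentifier || id == LegacyDriverIdentifier
  def isDriverAfter(id: String): Boolean  = id == DriverIdentifier

  def main(args: Array[String]): Unit = {
    println(isDriverBefore("<driver>")) // true
    println(isDriverAfter("<driver>"))  // false
  }
}
{code}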



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-27114:
--
Priority: Minor  (was: Major)

I don't know much about this area.
Does it actually try to execute twice, or does it just show up twice in the UI?

> SQL Tab shows duplicate executions for some commands
> 
>
> Key: SPARK-27114
> URL: https://issues.apache.org/jira/browse/SPARK-27114
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: Screenshot from 2019-03-09 14-04-07.png
>
>
> Run a simple SQL command:
> {{create table abc ( a int );}}
> Open the SQL tab in the Spark UI; we can see duplicate entries for the execution. 
> The behaviour was tested in both the Thrift server and spark-sql.
> *check attachment*
> The problem seems to be due to eager execution of commands at 
> org.apache.spark.sql.Dataset#logicalPlan.
> After analysis for spark-sql, the call stacks for the duplicate execution ids 
> seem to be:
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> withAction:3346, Dataset (org.apache.spark.sql)
> :203, Dataset (org.apache.spark.sql)
> ofRows:88, Dataset$ (org.apache.spark.sql)
> sql:656, SparkSession (org.apache.spark.sql)
> sql:685, SQLContext (org.apache.spark.sql)
> run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
>  
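> A self-contained sketch of the pattern (simplified stand-ins, not the actual Spark 
> classes): when an eagerly executed command is wrapped in a withNewExecutionId-style 
> helper both by the outer driver and again inside the Dataset action, two execution 
> ids (and so two UI rows) are produced for one command.
> {code:scala}
> import java.util.concurrent.atomic.AtomicLong
> 
> object DuplicateExecutionIdSketch {
>   private val nextId = new AtomicLong(0)
> 
>   // Stand-in for SQLExecution.withNewExecutionId: each call allocates a fresh
>   // id and "posts" a start event, which is what the SQL tab lists as a row.
>   def withNewExecutionId[T](desc: String)(body: => T): T = {
>     val id = nextId.incrementAndGet()
>     println(s"execution start: id=$id desc=$desc")
>     try body finally println(s"execution end: id=$id")
>   }
> 
>   def main(args: Array[String]): Unit = {
>     // Outer wrapper (like SparkSQLDriver.run) around a command whose Dataset
>     // construction eagerly executes and wraps itself again (like withAction).
>     withNewExecutionId("create table abc ( a int )") {
>       withNewExecutionId("create table abc ( a int )") {
>         "command result"
>       }
>     }
>     // Two "execution start" lines are printed for a single command, mirroring
>     // the duplicate entries in the SQL tab.
>   }
> }
> {code}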



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-25838) Remove formatVersion from Saveable

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-25838.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

Issue resolved by pull request 22830
[https://github.com/apache/spark/pull/22830]

> Remove formatVersion from Saveable
> --
>
> Key: SPARK-25838
> URL: https://issues.apache.org/jira/browse/SPARK-25838
> Project: Spark
>  Issue Type: Task
>  Components: MLlib
>Affects Versions: 3.0.0
>Reporter: Marco Gaido
>Assignee: Marco Gaido
>Priority: Trivial
> Fix For: 3.0.0
>
>
> The {{Saveable}} interface introduces a {{formatVersion}} member which is used 
> nowhere and is protected. So this JIRA proposes to get rid of it, since it is 
> useless.
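> For reference, a simplified sketch of the trait as it stands (the exact MLlib 
> signature may differ slightly); {{formatVersion}} is protected and never read by 
> callers, which is why dropping it is proposed:
> {code:scala}
> trait Saveable {
>   // Unused version tag this ticket removes.
>   protected def formatVersion: String
> 
>   def save(sc: org.apache.spark.SparkContext, path: String): Unit
> }
> {code}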



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-25838) Remove formatVersion from Saveable

2019-03-09 Thread Sean Owen (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen reassigned SPARK-25838:
-

Assignee: Marco Gaido

> Remove formatVersion from Saveable
> --
>
> Key: SPARK-25838
> URL: https://issues.apache.org/jira/browse/SPARK-25838
> Project: Spark
>  Issue Type: Task
>  Components: MLlib
>Affects Versions: 3.0.0
>Reporter: Marco Gaido
>Assignee: Marco Gaido
>Priority: Trivial
>
> The {{Saveable}} interface introduces a {{formatVersion}} member which is used 
> nowhere and is protected. So this JIRA proposes to get rid of it, since it is 
> useless.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27118) Upgrade to latest Hive version for Hive Metastore Client 1.1 and 1.0

2019-03-09 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-27118:
---

 Summary: Upgrade to latest Hive version for Hive Metastore Client 
1.1 and 1.0
 Key: SPARK-27118
 URL: https://issues.apache.org/jira/browse/SPARK-27118
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.0
Reporter: Yuming Wang


The Hive 1.1.1 and Hive 1.0.1 releases are available. We should upgrade the Hive 
Metastore Client versions accordingly.
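For context, a hedged sketch of how an application selects the Hive metastore 
client that Spark builds (standard configs; the jar resolution shown here is an 
illustrative choice, not part of this ticket):

{code:scala}
import org.apache.spark.sql.SparkSession

object MetastoreClientVersionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("metastore-client-version-sketch")
      // The 1.1 / 1.0 client lines this ticket proposes to point at the latest
      // maintenance releases (1.1.1 / 1.0.1).
      .config("spark.sql.hive.metastore.version", "1.1.0")
      // Let Spark download the matching Hive client jars from Maven.
      .config("spark.sql.hive.metastore.jars", "maven")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("SHOW DATABASES").show()
    spark.stop()
  }
}
{code}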



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27119) Do not infer schema when reading Hive serde table with native data source

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27119:


Assignee: Wenchen Fan  (was: Apache Spark)

> Do not infer schema when reading Hive serde table with native data source
> -
>
> Key: SPARK-27119
> URL: https://issues.apache.org/jira/browse/SPARK-27119
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27119) Do not infer schema when reading Hive serde table with native data source

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27119:


Assignee: Apache Spark  (was: Wenchen Fan)

> Do not infer schema when reading Hive serde table with native data source
> -
>
> Key: SPARK-27119
> URL: https://issues.apache.org/jira/browse/SPARK-27119
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wenchen Fan
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27119) Do not infer schema when reading Hive serde table with native data source

2019-03-09 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-27119:
---

 Summary: Do not infer schema when reading Hive serde table with 
native data source
 Key: SPARK-27119
 URL: https://issues.apache.org/jira/browse/SPARK-27119
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27118) Upgrade to latest Hive version for Hive Metastore Client 1.1 and 1.0

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27118:


Assignee: Apache Spark

> Upgrade to latest Hive version for Hive Metastore Client 1.1 and 1.0
> 
>
> Key: SPARK-27118
> URL: https://issues.apache.org/jira/browse/SPARK-27118
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Apache Spark
>Priority: Major
>
> The Hive 1.1.1 and Hive 1.0.1 releases are available. We should upgrade the Hive 
> Metastore Client versions accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27118) Upgrade to latest Hive version for Hive Metastore Client 1.1 and 1.0

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27118:


Assignee: (was: Apache Spark)

> Upgrade to latest Hive version for Hive Metastore Client 1.1 and 1.0
> 
>
> Key: SPARK-27118
> URL: https://issues.apache.org/jira/browse/SPARK-27118
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> The Hive 1.1.1 and Hive 1.0.1 releases are available. We should upgrade the Hive 
> Metastore Client versions accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27080) Read parquet file with merging metastore schema should compare schema field in uniform case.

2019-03-09 Thread Wenchen Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-27080.
-
   Resolution: Fixed
Fix Version/s: 2.3.4
   2.4.1
   3.0.0

Issue resolved by pull request 24001
[https://github.com/apache/spark/pull/24001]

> Read parquet file with merging metastore schema should compare schema field 
> in uniform case.
> 
>
> Key: SPARK-27080
> URL: https://issues.apache.org/jira/browse/SPARK-27080
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.2, 2.3.3, 2.4.0
>Reporter: BoMeng
>Priority: Major
> Fix For: 3.0.0, 2.4.1, 2.3.4
>
>
> In our production environment, when we upgraded Spark from version 2.1 to 2.3, 
> the job failed with the exception below:
> ---ERROR stack trace ---
> Exception occur when running Job, 
> org.apache.spark.SparkException: Detected conflicting schemas when merging 
> the schema obtained from the Hive
>  Metastore with the one inferred from the file format. Metastore schema:
> {
>   "type" : "struct",
>   "fields" : [
> ..
> }
> Inferred schema:
> {
>   "type" : "struct",
>   "fields" : [
> ..
> }
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog$.mergeWithMetastoreSchema(HiveMetastoreCatalog.scala:295)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog$$anonfun$11.apply(HiveMetastoreCatalog.scala:243)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog$$anonfun$11.apply(HiveMetastoreCatalog.scala:243)
> at scala.Option.map(Option.scala:146)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$$inferIfNeeded(HiveMetastoreCatalog.scala:243)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog$$anonfun$4$$anonfun$5.apply(HiveMetastoreCatalog.scala:167)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog$$anonfun$4$$anonfun$5.apply(HiveMetastoreCatalog.scala:156)
> at scala.Option.getOrElse(Option.scala:121)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog$$anonfun$4.apply(HiveMetastoreCatalog.scala:156)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog$$anonfun$4.apply(HiveMetastoreCatalog.scala:148)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog.withTableCreationLock(HiveMetastoreCatalog.scala:54)
> at 
> org.apache.spark.sql.hive.HiveMetastoreCatalog.convertToLogicalRelation(HiveMetastoreCatalog.scala:148)
> at 
> org.apache.spark.sql.hive.RelationConversions.org$apache$spark$sql$hive$RelationConversions$$convert(HiveStrategies.scala:195)
> at 
> org.apache.spark.sql.hive.RelationConversions$$anonfun$apply$4.applyOrElse(HiveStrategies.scala:226)
> at 
> org.apache.spark.sql.hive.RelationConversions$$anonfun$apply$4.applyOrElse(HiveStrategies.scala:215)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
> at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:286)
> at 
> 

[jira] [Created] (SPARK-27117) current_date/current_timestamp should not refer to columns with ansi parser mode

2019-03-09 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-27117:
---

 Summary: current_date/current_timestamp should not refer to 
columns with ansi parser mode
 Key: SPARK-27117
 URL: https://issues.apache.org/jira/browse/SPARK-27117
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.0.0
Reporter: Wenchen Fan
Assignee: Wenchen Fan






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27117) current_date/current_timestamp should not refer to columns with ansi parser mode

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27117:


Assignee: Wenchen Fan  (was: Apache Spark)

> current_date/current_timestamp should not refer to columns with ansi parser 
> mode
> 
>
> Key: SPARK-27117
> URL: https://issues.apache.org/jira/browse/SPARK-27117
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27117) current_date/current_timestamp should not refer to columns with ansi parser mode

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27117:


Assignee: Apache Spark  (was: Wenchen Fan)

> current_date/current_timestamp should not refer to columns with ansi parser 
> mode
> 
>
> Key: SPARK-27117
> URL: https://issues.apache.org/jira/browse/SPARK-27117
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Wenchen Fan
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-26004) InMemoryTable support StartsWith predicate push down

2019-03-09 Thread Takeshi Yamamuro (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-26004.
--
   Resolution: Fixed
 Assignee: Yuming Wang
Fix Version/s: 3.0.0

Resolved by https://github.com/apache/spark/pull/23004

> InMemoryTable support StartsWith predicate push down
> 
>
> Key: SPARK-26004
> URL: https://issues.apache.org/jira/browse/SPARK-26004
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.0.0
>
>
> SPARK-24638 adds support for Parquet {{StartsWith}} predicate push down. 
> InMemoryTable can also support this feature.
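> A small usage sketch of the kind of query this benefits (standard DataFrame API; 
> the data and names are made up for illustration): once the cached relation keeps 
> per-batch string statistics, a prefix filter can skip whole batches, as the 
> Parquet source already does.
> {code:scala}
> import org.apache.spark.sql.SparkSession
> 
> object StartsWithOnCachedTableSketch {
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder()
>       .master("local[*]")
>       .appName("startswith-on-cache-sketch")
>       .getOrCreate()
>     import spark.implicits._
> 
>     val df = Seq("alpha", "alps", "beta").toDF("name")
>     df.cache()   // backed by an in-memory (columnar) relation once materialized
>     df.count()   // force materialization of the cache
> 
>     // A StartsWith predicate that the in-memory scan could use to prune batches.
>     df.filter($"name".startsWith("al")).show()
> 
>     spark.stop()
>   }
> }
> {code}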



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26004) InMemoryTable support StartsWith predicate push down

2019-03-09 Thread Yuming Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788643#comment-16788643
 ] 

Yuming Wang commented on SPARK-26004:
-

[~maropu] Could you help close this ticket?

> InMemoryTable support StartsWith predicate push down
> 
>
> Key: SPARK-26004
> URL: https://issues.apache.org/jira/browse/SPARK-26004
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Priority: Major
>
> SPARK-24638 adds support for Parquet {{StartsWith}} predicate push down. 
> InMemoryTable can also support this feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27116) Environment tab must sort Hadoop Configuration by default

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27116:


Assignee: Apache Spark

> Environment tab must sort Hadoop Configuration by default
> -
>
> Key: SPARK-27116
> URL: https://issues.apache.org/jira/browse/SPARK-27116
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Assignee: Apache Spark
>Priority: Minor
>
> The Environment tab in the Spark UI does not have the Hadoop Configuration sorted. 
> All other tables on the same page, like Spark Configurations and System 
> Configuration, are sorted by key by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27116) Environment tab must sort Hadoop Configuration by default

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27116:


Assignee: (was: Apache Spark)

> Environment tab must sort Hadoop Configuration by default
> -
>
> Key: SPARK-27116
> URL: https://issues.apache.org/jira/browse/SPARK-27116
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
>
> The Environment tab in the Spark UI does not have the Hadoop Configuration sorted. 
> All other tables on the same page, like Spark Configurations and System 
> Configuration, are sorted by key by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27116) Environment tab must sort Hadoop Configuration by default

2019-03-09 Thread Ajith S (JIRA)
Ajith S created SPARK-27116:
---

 Summary: Environment tab must sort Hadoop Configuration by default
 Key: SPARK-27116
 URL: https://issues.apache.org/jira/browse/SPARK-27116
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 3.0.0
Reporter: Ajith S


The Environment tab in the Spark UI does not have the Hadoop Configuration sorted. 
All other tables on the same page, like Spark Configurations and System 
Configuration, are sorted by key by default.
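A minimal sketch of the intended ordering (assuming the table is rendered from the 
Hadoop {{Configuration}} key/value pairs; names here are illustrative, not the 
actual UI code):

{code:scala}
import scala.collection.JavaConverters._
import org.apache.hadoop.conf.Configuration

object SortedHadoopConfSketch {
  def main(args: Array[String]): Unit = {
    val hadoopConf = new Configuration()
    // Configuration is an Iterable of Map.Entry; sort the pairs by key before
    // rendering, matching the other Environment-page tables.
    val sortedEntries = hadoopConf.asScala
      .map(entry => entry.getKey -> entry.getValue)
      .toSeq
      .sortBy { case (key, _) => key }

    sortedEntries.take(10).foreach { case (key, value) => println(s"$key=$value") }
  }
}
{code}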



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27115) Exception thrown in UI when click on sort header in SQL Tab

2019-03-09 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788626#comment-16788626
 ] 

Ajith S commented on SPARK-27115:
-

Duplicates SPARK-27075

> Exception thrown in UI when click on sort header in SQL Tab
> ---
>
> Key: SPARK-27115
> URL: https://issues.apache.org/jira/browse/SPARK-27115
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> When clicking on a table header to change the sort order, we get the following 
> exception in the UI:
>  
> java.lang.IllegalArgumentException: Duplicate key [completed.sort] found. at 
> org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>  at 
> org.spark_project.guava.base.Splitter$MapSplitter.split(Splitter.java:480) at 
> org.apache.spark.ui.PagedTable.pageNavigation(PagedTable.scala:201) at 
> org.apache.spark.ui.PagedTable.pageNavigation$(PagedTable.scala:173) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.pageNavigation(AllExecutionsPage.scala:211)
>  at org.apache.spark.ui.PagedTable.table(PagedTable.scala:117) at 
> org.apache.spark.ui.PagedTable.table$(PagedTable.scala:98) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.table(AllExecutionsPage.scala:211)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.executionsTable(AllExecutionsPage.scala:198)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.render(AllExecutionsPage.scala:78)
>  at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:83) at 
> org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
>  at 
> org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:80) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:740)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  at org.spark_project.jetty.server.Server.handle(Server.java:503) at 
> org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:364) at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:103) 
> at org.spark_project.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) 
> at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>  at 
> org.spark_project.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27115) Exception thrown in UI when click on sort header in SQL Tab

2019-03-09 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S resolved SPARK-27115.
-
Resolution: Duplicate

> Exception thrown in UI when click on sort header in SQL Tab
> ---
>
> Key: SPARK-27115
> URL: https://issues.apache.org/jira/browse/SPARK-27115
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> When clicking on a table header to change the sort order, we get the following 
> exception in the UI:
>  
> java.lang.IllegalArgumentException: Duplicate key [completed.sort] found. at 
> org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>  at 
> org.spark_project.guava.base.Splitter$MapSplitter.split(Splitter.java:480) at 
> org.apache.spark.ui.PagedTable.pageNavigation(PagedTable.scala:201) at 
> org.apache.spark.ui.PagedTable.pageNavigation$(PagedTable.scala:173) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.pageNavigation(AllExecutionsPage.scala:211)
>  at org.apache.spark.ui.PagedTable.table(PagedTable.scala:117) at 
> org.apache.spark.ui.PagedTable.table$(PagedTable.scala:98) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.table(AllExecutionsPage.scala:211)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.executionsTable(AllExecutionsPage.scala:198)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.render(AllExecutionsPage.scala:78)
>  at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:83) at 
> org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
>  at 
> org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:80) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:740)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  at org.spark_project.jetty.server.Server.handle(Server.java:503) at 
> org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:364) at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:103) 
> at org.spark_project.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) 
> at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>  at 
> org.spark_project.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27115) Exception thrown in UI when click on sort header in SQL Tab

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27115:


Assignee: Apache Spark

> Exception thrown in UI when click on sort header in SQL Tab
> ---
>
> Key: SPARK-27115
> URL: https://issues.apache.org/jira/browse/SPARK-27115
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Assignee: Apache Spark
>Priority: Major
>
> When clicking on a table header to change the sort order, we get the following 
> exception in the UI:
>  
> java.lang.IllegalArgumentException: Duplicate key [completed.sort] found. at 
> org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>  at 
> org.spark_project.guava.base.Splitter$MapSplitter.split(Splitter.java:480) at 
> org.apache.spark.ui.PagedTable.pageNavigation(PagedTable.scala:201) at 
> org.apache.spark.ui.PagedTable.pageNavigation$(PagedTable.scala:173) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.pageNavigation(AllExecutionsPage.scala:211)
>  at org.apache.spark.ui.PagedTable.table(PagedTable.scala:117) at 
> org.apache.spark.ui.PagedTable.table$(PagedTable.scala:98) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.table(AllExecutionsPage.scala:211)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.executionsTable(AllExecutionsPage.scala:198)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.render(AllExecutionsPage.scala:78)
>  at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:83) at 
> org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
>  at 
> org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:80) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:740)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  at org.spark_project.jetty.server.Server.handle(Server.java:503) at 
> org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:364) at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:103) 
> at org.spark_project.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) 
> at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>  at 
> org.spark_project.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-27115) Exception thrown in UI when click on sort header in SQL Tab

2019-03-09 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-27115:


Assignee: (was: Apache Spark)

> Exception thrown in UI when click on sort header in SQL Tab
> ---
>
> Key: SPARK-27115
> URL: https://issues.apache.org/jira/browse/SPARK-27115
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> When clicking on a table header to change the sort order, we get the following 
> exception in the UI:
>  
> java.lang.IllegalArgumentException: Duplicate key [completed.sort] found. at 
> org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>  at 
> org.spark_project.guava.base.Splitter$MapSplitter.split(Splitter.java:480) at 
> org.apache.spark.ui.PagedTable.pageNavigation(PagedTable.scala:201) at 
> org.apache.spark.ui.PagedTable.pageNavigation$(PagedTable.scala:173) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.pageNavigation(AllExecutionsPage.scala:211)
>  at org.apache.spark.ui.PagedTable.table(PagedTable.scala:117) at 
> org.apache.spark.ui.PagedTable.table$(PagedTable.scala:98) at 
> org.apache.spark.sql.execution.ui.ExecutionPagedTable.table(AllExecutionsPage.scala:211)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.executionsTable(AllExecutionsPage.scala:198)
>  at 
> org.apache.spark.sql.execution.ui.AllExecutionsPage.render(AllExecutionsPage.scala:78)
>  at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:83) at 
> org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at 
> javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
>  at 
> org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:80) 
> at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:740)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  at org.spark_project.jetty.server.Server.handle(Server.java:503) at 
> org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:364) at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:103) 
> at org.spark_project.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) 
> at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>  at 
> org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>  at 
> org.spark_project.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27115) Exception thrown in UI when click on sort header in SQL Tab

2019-03-09 Thread Ajith S (JIRA)
Ajith S created SPARK-27115:
---

 Summary: Exception thrown in UI when click on sort header in SQL 
Tab
 Key: SPARK-27115
 URL: https://issues.apache.org/jira/browse/SPARK-27115
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 3.0.0
Reporter: Ajith S


When clicking on a table header to change the sort order, we get the following 
exception in the UI:

 

java.lang.IllegalArgumentException: Duplicate key [completed.sort] found. at 
org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
 at org.spark_project.guava.base.Splitter$MapSplitter.split(Splitter.java:480) 
at org.apache.spark.ui.PagedTable.pageNavigation(PagedTable.scala:201) at 
org.apache.spark.ui.PagedTable.pageNavigation$(PagedTable.scala:173) at 
org.apache.spark.sql.execution.ui.ExecutionPagedTable.pageNavigation(AllExecutionsPage.scala:211)
 at org.apache.spark.ui.PagedTable.table(PagedTable.scala:117) at 
org.apache.spark.ui.PagedTable.table$(PagedTable.scala:98) at 
org.apache.spark.sql.execution.ui.ExecutionPagedTable.table(AllExecutionsPage.scala:211)
 at 
org.apache.spark.sql.execution.ui.AllExecutionsPage.executionsTable(AllExecutionsPage.scala:198)
 at 
org.apache.spark.sql.execution.ui.AllExecutionsPage.render(AllExecutionsPage.scala:78)
 at org.apache.spark.ui.WebUI.$anonfun$attachPage$1(WebUI.scala:83) at 
org.apache.spark.ui.JettyUtils$$anon$1.doGet(JettyUtils.scala:80) at 
javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at 
javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at 
org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) at 
org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
 at 
org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:80) at 
org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
 at 
org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
 at 
org.spark_project.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
 at 
org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
 at 
org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
 at 
org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) 
at 
org.spark_project.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
 at 
org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
 at 
org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
 at 
org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:740)
 at 
org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
 at 
org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
 at org.spark_project.jetty.server.Server.handle(Server.java:503) at 
org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:364) at 
org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
 at 
org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
 at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:103) at 
org.spark_project.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) at 
org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
 at 
org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
 at 
org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
 at 
org.spark_project.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
 at 
org.spark_project.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
 at 
org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
 at 
org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
 at java.lang.Thread.run(Thread.java:745)
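A standalone reproduction of the underlying failure (plain Guava here, whereas Spark 
uses its shaded copy; the parameter values are made up): the page-navigation code 
splits the query string into a map, and Guava's MapSplitter rejects a key that 
appears twice, which is what happens once the sort parameter is repeated.

{code:scala}
import com.google.common.base.Splitter

object DuplicateSortKeySketch {
  def main(args: Array[String]): Unit = {
    // A query string of the kind PagedTable.pageNavigation splits; clicking the
    // header again appends a second "completed.sort" parameter.
    val query = "completed.sort=Duration&completed.page=1&completed.sort=ID"

    val mapSplitter = Splitter.on('&').trimResults().withKeyValueSeparator("=")

    // Throws java.lang.IllegalArgumentException complaining about the duplicate
    // "completed.sort" key, matching the stack trace above.
    mapSplitter.split(query)
  }
}
{code}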



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788579#comment-16788579
 ] 

Ajith S commented on SPARK-27114:
-

ping [~srowen] [~cloud_fan] [~dongjoon]

> SQL Tab shows duplicate executions for some commands
> 
>
> Key: SPARK-27114
> URL: https://issues.apache.org/jira/browse/SPARK-27114
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
> Attachments: Screenshot from 2019-03-09 14-04-07.png
>
>
> Run a simple SQL command:
> {{create table abc ( a int );}}
> Open the SQL tab in the Spark UI; we can see duplicate entries for the execution. 
> The behaviour was tested in both the Thrift server and spark-sql.
> *check attachment*
> The problem seems to be due to eager execution of commands at 
> org.apache.spark.sql.Dataset#logicalPlan.
> After analysis for spark-sql, the call stacks for the duplicate execution ids 
> seem to be:
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> withAction:3346, Dataset (org.apache.spark.sql)
> :203, Dataset (org.apache.spark.sql)
> ofRows:88, Dataset$ (org.apache.spark.sql)
> sql:656, SparkSession (org.apache.spark.sql)
> sql:685, SQLContext (org.apache.spark.sql)
> run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27114:

Description: 
Run a simple SQL command:

{{create table abc ( a int );}}

Open the SQL tab in the Spark UI; we can see duplicate entries for the execution. 
The behaviour was tested in both the Thrift server and spark-sql.

*check attachment*

The problem seems to be due to eager execution of commands at 
org.apache.spark.sql.Dataset#logicalPlan.

After analysis for spark-sql, the call stacks for the duplicate execution ids seem 
to be:
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
withAction:3346, Dataset (org.apache.spark.sql)
:203, Dataset (org.apache.spark.sql)
ofRows:88, Dataset$ (org.apache.spark.sql)
sql:656, SparkSession (org.apache.spark.sql)
sql:685, SQLContext (org.apache.spark.sql)
run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
 

  was:
run simple sql  command

{{create table abc ( a int );}}

Open SQL tab in SparkUI, we can see duplicate entries for the execution. Tested 
behaviour in thriftserver and sparksql

*check attachment*

The Problem seems be due to eager execution @ 
org.apache.spark.sql.Dataset#logicalPlan

After analysis for spark-sql, the call stacks for duplicate execution id seems 
to be
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
withAction:3346, Dataset (org.apache.spark.sql)
:203, Dataset (org.apache.spark.sql)
ofRows:88, Dataset$ (org.apache.spark.sql)
sql:656, SparkSession (org.apache.spark.sql)
sql:685, SQLContext (org.apache.spark.sql)
run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 

[jira] [Updated] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27114:

Description: 
Run a simple SQL command:

{{create table abc ( a int );}}

Open the SQL tab in the Spark UI; we can see duplicate entries for the execution. 
The behaviour was tested in both the Thrift server and spark-sql.

*check attachment*

After analysis for spark-sql, the call stacks for the duplicate execution ids seem 
to be:
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
withAction:3346, Dataset (org.apache.spark.sql)
<init>:203, Dataset (org.apache.spark.sql)
ofRows:88, Dataset$ (org.apache.spark.sql)
sql:656, SparkSession (org.apache.spark.sql)
sql:685, SQLContext (org.apache.spark.sql)
run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
 

  was:
Run a simple SQL command:

{{create table abc ( a int );}}

Open the SQL tab in the Spark UI and you can see duplicate entries for the execution. The 
behaviour was tested in both the Thrift server and spark-sql.

!image-2019-03-09-14-04-33-409.png!

After analysing spark-sql, the call stacks for the duplicate execution ids appear 
to be:
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
withAction:3346, Dataset (org.apache.spark.sql)
<init>:203, Dataset (org.apache.spark.sql)
ofRows:88, Dataset$ (org.apache.spark.sql)
sql:656, SparkSession (org.apache.spark.sql)
sql:685, SQLContext (org.apache.spark.sql)
run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)

[jira] [Updated] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27114:

Description: 
Run a simple SQL command:

{{create table abc ( a int );}}

Open the SQL tab in the Spark UI and you can see duplicate entries for the execution. The 
behaviour was tested in both the Thrift server and spark-sql.

*check attachment*

The problem seems to be due to eager execution at 
org.apache.spark.sql.Dataset#logicalPlan.
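
To make the two call stacks quoted below easier to read, here is a minimal, hypothetical sketch of the CLI flow they imply. The names are illustrative stand-ins, not the real Spark sources or signatures, and it assumes withNewExecutionId always allocates a fresh execution id: SparkSQLDriver.run first triggers the eager command execution inside the Dataset constructor (first id) and then wraps result collection in a second withNewExecutionId (second id), so one statement shows up twice in the SQL tab.
{code:scala}
import java.util.concurrent.atomic.AtomicLong

// Hypothetical simplification of the flow behind the two stacks below; the
// names are illustrative stand-ins, not the real Spark code.
object DuplicateExecutionSketch {
  private val ids = new AtomicLong(0)

  // Stand-in for SQLExecution.withNewExecutionId: every call allocates a
  // fresh execution id, and each id shows up as one entry in the SQL tab.
  def withNewExecutionId[T](desc: String)(body: => T): T = {
    val id = ids.getAndIncrement()
    println(s"execution $id started: $desc")
    body
  }

  // Stand-in for Dataset.ofRows on a command plan: the command is executed
  // eagerly while the Dataset is being built (Dataset#logicalPlan), which
  // already happens under a new execution id.
  def ofRows(statement: String): Unit =
    withNewExecutionId(s"eager command execution of [$statement]") { () }

  // Stand-in for SparkSQLDriver.run: it first triggers context.sql(statement)
  // (line ~63 in the first stack, first id) and then wraps result collection
  // in another withNewExecutionId (line ~65 in the second stack, second id).
  def run(statement: String): Unit = {
    ofRows(statement)
    withNewExecutionId(s"collect results of [$statement]") { () }
  }

  def main(args: Array[String]): Unit =
    run("create table abc ( a int )") // prints two "execution ... started" lines
}
{code}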

After analysing spark-sql, the call stacks for the duplicate execution ids appear 
to be:
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
withAction:3346, Dataset (org.apache.spark.sql)
<init>:203, Dataset (org.apache.spark.sql)
ofRows:88, Dataset$ (org.apache.spark.sql)
sql:656, SparkSession (org.apache.spark.sql)
sql:685, SQLContext (org.apache.spark.sql)
run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
 

  was:
Run a simple SQL command:

{{create table abc ( a int );}}

Open the SQL tab in the Spark UI and you can see duplicate entries for the execution. The 
behaviour was tested in both the Thrift server and spark-sql.

*check attachment*

After analysing spark-sql, the call stacks for the duplicate execution ids appear 
to be:
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
withAction:3346, Dataset (org.apache.spark.sql)
<init>:203, Dataset (org.apache.spark.sql)
ofRows:88, Dataset$ (org.apache.spark.sql)
sql:656, SparkSession (org.apache.spark.sql)
sql:685, SQLContext (org.apache.spark.sql)
run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit 

[jira] [Commented] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788576#comment-16788576
 ] 

Ajith S commented on SPARK-27114:
-

will be working on this

> SQL Tab shows duplicate executions for some commands
> 
>
> Key: SPARK-27114
> URL: https://issues.apache.org/jira/browse/SPARK-27114
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
> Attachments: Screenshot from 2019-03-09 14-04-07.png
>
>
> Run a simple SQL command:
> {{create table abc ( a int );}}
> Open the SQL tab in the Spark UI and you can see duplicate entries for the execution. 
> The behaviour was tested in both the Thrift server and spark-sql.
> *check attachment*
> After analysing spark-sql, the call stacks for the duplicate execution ids 
> appear to be:
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> withAction:3346, Dataset (org.apache.spark.sql)
> <init>:203, Dataset (org.apache.spark.sql)
> ofRows:88, Dataset$ (org.apache.spark.sql)
> sql:656, SparkSession (org.apache.spark.sql)
> sql:685, SQLContext (org.apache.spark.sql)
> run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27114:

Attachment: Screenshot from 2019-03-09 14-04-07.png

> SQL Tab shows duplicate executions for some commands
> 
>
> Key: SPARK-27114
> URL: https://issues.apache.org/jira/browse/SPARK-27114
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
> Attachments: Screenshot from 2019-03-09 14-04-07.png
>
>
> Run a simple SQL command:
> {{create table abc ( a int );}}
> Open the SQL tab in the Spark UI and you can see duplicate entries for the execution. 
> The behaviour was tested in both the Thrift server and spark-sql.
> !image-2019-03-09-14-04-33-409.png!
> After analysing spark-sql, the call stacks for the duplicate execution ids 
> appear to be:
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> withAction:3346, Dataset (org.apache.spark.sql)
> <init>:203, Dataset (org.apache.spark.sql)
> ofRows:88, Dataset$ (org.apache.spark.sql)
> sql:656, SparkSession (org.apache.spark.sql)
> sql:685, SQLContext (org.apache.spark.sql)
> run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-09 Thread Ajith S (JIRA)
Ajith S created SPARK-27114:
---

 Summary: SQL Tab shows duplicate executions for some commands
 Key: SPARK-27114
 URL: https://issues.apache.org/jira/browse/SPARK-27114
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.0.0
Reporter: Ajith S


Run a simple SQL command:

{{create table abc ( a int );}}

Open the SQL tab in the Spark UI and you can see duplicate entries for the execution. The 
behaviour was tested in both the Thrift server and spark-sql.

!image-2019-03-09-14-04-33-409.png!

After analysing spark-sql, the call stacks for the duplicate execution ids appear 
to be:
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
withAction:3346, Dataset (org.apache.spark.sql)
<init>:203, Dataset (org.apache.spark.sql)
ofRows:88, Dataset$ (org.apache.spark.sql)
sql:656, SparkSession (org.apache.spark.sql)
sql:685, SQLContext (org.apache.spark.sql)
run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
{code:java}
$anonfun$withNewExecutionId$1:78, SQLExecution$ (org.apache.spark.sql.execution)
apply:-1, 2057192703 (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
processLine:376, CliDriver (org.apache.hadoop.hive.cli)
main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
invoke:62, NativeMethodAccessorImpl (sun.reflect)
invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
invoke:498, Method (java.lang.reflect)
start:52, JavaMainApplication (org.apache.spark.deploy)
org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
(org.apache.spark.deploy)
doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
submit:185, SparkSubmit (org.apache.spark.deploy)
doSubmit:87, SparkSubmit (org.apache.spark.deploy)
doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
main:943, SparkSubmit$ (org.apache.spark.deploy)
main:-1, SparkSubmit (org.apache.spark.deploy){code}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org