[jira] [Updated] (SPARK-17617) Remainder(%) expression.eval returns incorrect result

2016-09-20 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17617: --- Description: h2.Problem Remainder(%) expression returns incorrect result when using expression.eval

[jira] [Created] (SPARK-17617) Remainder(%) expression.eval returns incorrect result

2016-09-20 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17617: -- Summary: Remainder(%) expression.eval returns incorrect result Key: SPARK-17617 URL: https://issues.apache.org/jira/browse/SPARK-17617 Project: Spark Issue

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD in memory

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Attachment: Screen Shot 2016-09-12 at 4.34.19 PM.png Screen Shot 2016-09-12 at

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD in memory

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Description: h2.Problem description: The following query triggers out of memory error. {code}

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD in memory

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Affects Version/s: (was: 1.6.2) (was: 2.0.0) Target Version/s:

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD in memory

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Description: h2.Problem description: The following query triggers out of memory error. {code}

[jira] [Commented] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15483389#comment-15483389 ] Sean Zhong commented on SPARK-17503: [~sowen] I have modified the title to mean "cache in memory" >

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD in memory

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Summary: Memory leak in Memory store when unable to cache the whole RDD in memory (was: Memory leak

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Description: h2.Problem description: The following query triggers out of memory error. {code}

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Description: h2.Problem description: The following query triggers out of memory error. {code}

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Description: h2.Problem description: The following query triggers out of memory error. {code}

[jira] [Created] (SPARK-17503) Memory leak in Memory store which unable to cache whole RDD

2016-09-12 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17503: -- Summary: Memory leak in Memory store which unable to cache whole RDD Key: SPARK-17503 URL: https://issues.apache.org/jira/browse/SPARK-17503 Project: Spark

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD

2016-09-12 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Summary: Memory leak in Memory store when unable to cache the whole RDD (was: Memory leak in Memory

[jira] [Commented] (SPARK-17364) Can not query hive table starting with number

2016-09-07 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15472189#comment-15472189 ] Sean Zhong commented on SPARK-17364: I have a trial fix at https://github.com/apache/spark/pull/15006

[jira] [Comment Edited] (SPARK-17364) Can not query hive table starting with number

2016-09-07 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15472182#comment-15472182 ] Sean Zhong edited comment on SPARK-17364 at 9/7/16 11:56 PM: - [~hvanhovell]

[jira] [Comment Edited] (SPARK-17364) Can not query hive table starting with number

2016-09-07 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15472182#comment-15472182 ] Sean Zhong edited comment on SPARK-17364 at 9/7/16 11:55 PM: - [~hvanhovell]

[jira] [Commented] (SPARK-17364) Can not query hive table starting with number

2016-09-07 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15472182#comment-15472182 ] Sean Zhong commented on SPARK-17364: [~hvanhovell] That is because the antlr4 lexer breaks

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Target Version/s: 2.1.0 Description: In SPARK-17356, we fix the OOM issue when Metadata is

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Description: In SPARK-17356, we fix the OOM issue when Metadata is super big. There are other cases

[jira] [Updated] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17426: --- Component/s: SQL > Current TreeNode.toJSON may trigger OOM under some corner cases >

[jira] [Created] (SPARK-17426) Current TreeNode.toJSON may trigger OOM under some corner cases

2016-09-06 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17426: -- Summary: Current TreeNode.toJSON may trigger OOM under some corner cases Key: SPARK-17426 URL: https://issues.apache.org/jira/browse/SPARK-17426 Project: Spark

[jira] [Commented] (SPARK-17302) Cannot set non-Spark SQL session variables in hive-site.xml, spark-defaults.conf, or using --conf

2016-09-05 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15465818#comment-15465818 ] Sean Zhong commented on SPARK-17302: [~rdblue] Can you write a reproducer code sample to describe

[jira] [Comment Edited] (SPARK-17364) Can not query hive table starting with number

2016-09-05 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15465754#comment-15465754 ] Sean Zhong edited comment on SPARK-17364 at 9/5/16 8:47 PM: [~epahomov]

[jira] [Commented] (SPARK-17364) Can not query hive table starting with number

2016-09-05 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15465754#comment-15465754 ] Sean Zhong commented on SPARK-17364: [~epahomov] Spark 2.0 rewrites the Sql parser with antlr. You

[jira] [Created] (SPARK-17374) Improves the error message when fails to parse some json file lines in DataFrameReader

2016-09-02 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17374: -- Summary: Improves the error message when fails to parse some json file lines in DataFrameReader Key: SPARK-17374 URL: https://issues.apache.org/jira/browse/SPARK-17374

[jira] [Created] (SPARK-17369) MetastoreRelation toJSON throws exception

2016-09-01 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17369: -- Summary: MetastoreRelation toJSON throws exception Key: SPARK-17369 URL: https://issues.apache.org/jira/browse/SPARK-17369 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15454553#comment-15454553 ] Sean Zhong commented on SPARK-17356: Reproducer: {code} # Trigger OOM scala> :paste -raw // Entering

[jira] [Comment Edited] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15454495#comment-15454495 ] Sean Zhong edited comment on SPARK-17356 at 9/1/16 6:38 AM: *Root cause:* 1.

[jira] [Commented] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15454495#comment-15454495 ] Sean Zhong commented on SPARK-17356: Root cause: 1. MLLib heavily leverage MetaData to store a lot

[jira] [Commented] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15454488#comment-15454488 ] Sean Zhong commented on SPARK-17356: *Analysis* After looking at the mmap, there is a suspicious

[jira] [Updated] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17356: --- Description: When using MLLib, when calling toJSON on a plan with many level of sub-queries, it may

[jira] [Updated] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17356: --- Attachment: jstack.txt > Out of memory when calling TreeNode.toJSON >

[jira] [Updated] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17356: --- Attachment: jmap.txt > Out of memory when calling TreeNode.toJSON >

[jira] [Updated] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-09-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17356: --- Attachment: queryplan.txt > Out of memory when calling TreeNode.toJSON >

[jira] [Created] (SPARK-17356) Out of memory when calling TreeNode.toJSON

2016-08-31 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17356: -- Summary: Out of memory when calling TreeNode.toJSON Key: SPARK-17356 URL: https://issues.apache.org/jira/browse/SPARK-17356 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-17306) Memory leak in QuantileSummaries

2016-08-29 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17306: --- Component/s: SQL > Memory leak in QuantileSummaries > > >

[jira] [Created] (SPARK-17306) Memory leak in QuantileSummaries

2016-08-29 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17306: -- Summary: Memory leak in QuantileSummaries Key: SPARK-17306 URL: https://issues.apache.org/jira/browse/SPARK-17306 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-17289) Sort based partial aggregation breaks due to SPARK-12978

2016-08-29 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17289: --- Description: For the following query: {code} val df2 = (0 to 1000).map(x => (x % 2,

[jira] [Created] (SPARK-17289) Sort based partial aggregation breaks due to SPARK-12978

2016-08-29 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17289: -- Summary: Sort based partial aggregation breaks due to SPARK-12978 Key: SPARK-17289 URL: https://issues.apache.org/jira/browse/SPARK-17289 Project: Spark Issue

[jira] [Updated] (SPARK-17189) [MINOR] Looses the interface from UnsafeRow to InternalRow in AggregationIterator if UnsafeRow specific method is not used

2016-08-22 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17189: --- Component/s: SQL > [MINOR] Looses the interface from UnsafeRow to InternalRow in >

[jira] [Created] (SPARK-17189) [MINOR] Looses the interface from UnsafeRow to InternalRow in AggregationIterator if UnsafeRow specific method is not used

2016-08-22 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17189: -- Summary: [MINOR] Looses the interface from UnsafeRow to InternalRow in AggregationIterator if UnsafeRow specific method is not used Key: SPARK-17189 URL:

[jira] [Comment Edited] (SPARK-16283) Implement percentile_approx SQL function

2016-08-22 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431136#comment-15431136 ] Sean Zhong edited comment on SPARK-16283 at 8/22/16 4:35 PM: - Created a

[jira] [Commented] (SPARK-16283) Implement percentile_approx SQL function

2016-08-22 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431136#comment-15431136 ] Sean Zhong commented on SPARK-16283: Created a sub-task to move QuantileSummaries to package

[jira] [Updated] (SPARK-17188) Moves QuantileSummaries to project catalyst from sql so that it can be used to implement percentile_approx

2016-08-22 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17188: --- Description: QuantileSummaries is a useful utility class to do statistics. It can be used by

[jira] [Created] (SPARK-17188) Moves QuantileSummaries to project catalyst from sql so that it can be used to implement percentile_approx

2016-08-22 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17188: -- Summary: Moves QuantileSummaries to project catalyst from sql so that it can be used to implement percentile_approx Key: SPARK-17188 URL:

[jira] [Updated] (SPARK-17188) Moves QuantileSummaries to project catalyst from sql so that it can be used to implement percentile_approx

2016-08-22 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17188: --- Description: org.apache.spark.sql.execution.stat > Moves QuantileSummaries to project catalyst from

[jira] [Updated] (SPARK-17187) Support using arbitrary Java object as internal aggregation buffer object

2016-08-22 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17187: --- Description: *Background* For aggregation functions like sum and count, Spark-Sql internally use an

[jira] [Updated] (SPARK-17187) Support using arbitrary Java object as internal aggregation buffer object

2016-08-22 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17187: --- Description: *Background* For aggregation functions like sum and count, Spark-Sql internally use an

[jira] [Created] (SPARK-17187) Support using arbitrary Java object as internal aggregation buffer object

2016-08-22 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17187: -- Summary: Support using arbitrary Java object as internal aggregation buffer object Key: SPARK-17187 URL: https://issues.apache.org/jira/browse/SPARK-17187 Project: Spark

[jira] [Created] (SPARK-17034) Ordinal in ORDER BY or GROUP BY should be treated as an unresolved expression

2016-08-12 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17034: -- Summary: Ordinal in ORDER BY or GROUP BY should be treated as an unresolved expression Key: SPARK-17034 URL: https://issues.apache.org/jira/browse/SPARK-17034 Project:

[jira] [Commented] (SPARK-16666) Kryo encoder for custom complex classes

2016-08-04 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408855#comment-15408855 ] Sean Zhong commented on SPARK-16666: This issue has been fixed in Spark 2.0 and trunk. Can you use

[jira] [Resolved] (SPARK-16666) Kryo encoder for custom complex classes

2016-08-04 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong resolved SPARK-16666. Resolution: Not A Problem > Kryo encoder for custom complex classes >

[jira] [Created] (SPARK-16907) Parquet table reading performance regression when vectorized record reader is not used

2016-08-04 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16907: -- Summary: Parquet table reading performance regression when vectorized record reader is not used Key: SPARK-16907 URL: https://issues.apache.org/jira/browse/SPARK-16907

[jira] [Created] (SPARK-16906) Adds more input type information for TypedAggregateExpression

2016-08-04 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16906: -- Summary: Adds more input type information for TypedAggregateExpression Key: SPARK-16906 URL: https://issues.apache.org/jira/browse/SPARK-16906 Project: Spark

[jira] [Updated] (SPARK-16898) Adds argument type information for typed logical plan like MapElements, TypedFilter, and AppendColumn

2016-08-04 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-16898: --- Summary: Adds argument type information for typed logical plan like MapElements, TypedFilter, and

[jira] [Created] (SPARK-16898) Adds argument type information for typed logical plan likMapElements, TypedFilter, and AppendColumn

2016-08-04 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16898: -- Summary: Adds argument type information for typed logical plan likMapElements, TypedFilter, and AppendColumn Key: SPARK-16898 URL: https://issues.apache.org/jira/browse/SPARK-16898

[jira] [Created] (SPARK-16888) Implements eval method for expression AssertNotNull

2016-08-03 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16888: -- Summary: Implements eval method for expression AssertNotNull Key: SPARK-16888 URL: https://issues.apache.org/jira/browse/SPARK-16888 Project: Spark Issue Type:

[jira] [Closed] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-03 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong closed SPARK-16841. -- Resolution: Not A Problem > Improves the row level metrics performance when reading Parquet table >

[jira] [Comment Edited] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-03 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406306#comment-15406306 ] Sean Zhong edited comment on SPARK-16841 at 8/3/16 5:54 PM: This jira is

[jira] [Commented] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-03 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406306#comment-15406306 ] Sean Zhong commented on SPARK-16841: This PR is created after analyzing the performance impact of

[jira] [Comment Edited] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-03 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406306#comment-15406306 ] Sean Zhong edited comment on SPARK-16841 at 8/3/16 5:54 PM: This jira is

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-02 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405057#comment-15405057 ] Sean Zhong commented on SPARK-16320: [~maver1ck] Did you use the test case in this jira {code} select

[jira] [Created] (SPARK-16853) Analysis error for DataSet typed selection

2016-08-02 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16853: -- Summary: Analysis error for DataSet typed selection Key: SPARK-16853 URL: https://issues.apache.org/jira/browse/SPARK-16853 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403238#comment-15403238 ] Sean Zhong edited comment on SPARK-16320 at 8/2/16 2:22 AM: [~maver1ck] Can

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403238#comment-15403238 ] Sean Zhong commented on SPARK-16320: [~loziniak] Can you check whether the PR works for you? >

[jira] [Updated] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-16841: --- Summary: Improves the row level metrics performance when reading Parquet table (was: Improve the

[jira] [Created] (SPARK-16841) Improve the row level metrics performance when reading Parquet table

2016-08-01 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16841: -- Summary: Improve the row level metrics performance when reading Parquet table Key: SPARK-16841 URL: https://issues.apache.org/jira/browse/SPARK-16841 Project: Spark

[jira] [Resolved] (SPARK-12437) Reserved words (like table) throws error when writing a data frame to JDBC

2016-07-19 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong resolved SPARK-12437. Resolution: Duplicate > Reserved words (like table) throws error when writing a data frame to JDBC

[jira] [Commented] (SPARK-12437) Reserved words (like table) throws error when writing a data frame to JDBC

2016-07-19 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385148#comment-15385148 ] Sean Zhong commented on SPARK-12437: This issue is fixed by SPARK-16387. The column names are quoted

[jira] [Created] (SPARK-16323) Avoid unnecessary cast when doing integral divide

2016-06-30 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16323: -- Summary: Avoid unnecessary cast when doing integral divide Key: SPARK-16323 URL: https://issues.apache.org/jira/browse/SPARK-16323 Project: Spark Issue Type:

[jira] [Created] (SPARK-16322) Avoids unnecessary cast when doing integral divide

2016-06-30 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16322: -- Summary: Avoids unnecessary cast when doing integral divide Key: SPARK-16322 URL: https://issues.apache.org/jira/browse/SPARK-16322 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2016-06-28 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353969#comment-15353969 ] Sean Zhong edited comment on SPARK-14083 at 6/29/16 12:17 AM: -- For typed

[jira] [Commented] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2016-06-28 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353969#comment-15353969 ] Sean Zhong commented on SPARK-14083: For typed operation like map, it will first de-serialize

[jira] [Updated] (SPARK-16034) Checks the partition columns when calling dataFrame.write.mode("append").saveAsTable

2016-06-17 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-16034: --- Issue Type: Sub-task (was: Bug) Parent: SPARK-16032 > Checks the partition columns when

[jira] [Created] (SPARK-16034) Checks the partition columns when calling dataFrame.write.mode("append").saveAsTable

2016-06-17 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16034: -- Summary: Checks the partition columns when calling dataFrame.write.mode("append").saveAsTable Key: SPARK-16034 URL: https://issues.apache.org/jira/browse/SPARK-16034

[jira] [Commented] (SPARK-15340) Limit the size of the map used to cache JobConfs to void OOM

2016-06-17 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336511#comment-15336511 ] Sean Zhong commented on SPARK-15340: [~DoingDone9] I did some tests, and didn't see the OOM you

[jira] [Commented] (SPARK-14048) Aggregation operations on structs fail when the structs have fields with special characters

2016-06-16 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335174#comment-15335174 ] Sean Zhong commented on SPARK-14048: [~simeons] You can use {{sqlContext.sql("query")}} instead of

[jira] [Commented] (SPARK-15786) joinWith bytecode generation calling ByteBuffer.wrap with InternalRow

2016-06-16 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334743#comment-15334743 ] Sean Zhong commented on SPARK-15786: [~yhuai] Sure, we definitely can improve it. > joinWith

[jira] [Commented] (SPARK-14048) Aggregation operations on structs fail when the structs have fields with special characters

2016-06-16 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334657#comment-15334657 ] Sean Zhong commented on SPARK-14048: [~simeons] I can now reproduce this on Databricks community

[jira] [Commented] (SPARK-14048) Aggregation operations on structs fail when the structs have fields with special characters

2016-06-16 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334644#comment-15334644 ] Sean Zhong commented on SPARK-14048: [~simeons] Can you share a complete notebook which we can run

[jira] [Commented] (SPARK-14048) Aggregation operations on structs fail when the structs have fields with special characters

2016-06-16 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333221#comment-15333221 ] Sean Zhong commented on SPARK-14048: [~simeons] Can you try the following script on your

[jira] [Commented] (SPARK-15786) joinWith bytecode generation calling ByteBuffer.wrap with InternalRow

2016-06-16 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333205#comment-15333205 ] Sean Zhong commented on SPARK-15786: Hi [~rmarscher] The reason is that you use the Kryo encoder in

[jira] [Commented] (SPARK-14048) Aggregation operations on structs fail when the structs have fields with special characters

2016-06-15 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332775#comment-15332775 ] Sean Zhong commented on SPARK-14048: [~simeons] Are you able to reproduce this case any longer? I

[jira] [Commented] (SPARK-15786) joinWith bytecode generation calling ByteBuffer.wrap with InternalRow

2016-06-15 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332532#comment-15332532 ] Sean Zhong commented on SPARK-15786: The exception stack is: {code} scala> res4.as[(Option[(Int,

[jira] [Commented] (SPARK-15786) joinWith bytecode generation calling ByteBuffer.wrap with InternalRow

2016-06-15 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332527#comment-15332527 ] Sean Zhong commented on SPARK-15786: Basically, what you described can be shortened to: {code} scala>

[jira] [Updated] (SPARK-15914) Add deprecated method back to SQLContext for source code backward compatiblity

2016-06-13 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-15914: --- Description: We removed some deprecated method in SQLContext in branch Spark 2.0. For example:

[jira] [Updated] (SPARK-15914) Add deprecated method back to SQLContext for source code backward compatiblity

2016-06-13 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-15914: --- Summary: Add deprecated method back to SQLContext for source code backward compatiblity (was: Add

[jira] [Updated] (SPARK-15914) Add deprecated method back to SQLContext for backward compatiblity

2016-06-13 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-15914: --- Summary: Add deprecated method back to SQLContext for backward compatiblity (was: Add deprecated

[jira] [Created] (SPARK-15914) Add deprecated method back to SQLContext for compatiblity

2016-06-13 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15914: -- Summary: Add deprecated method back to SQLContext for compatiblity Key: SPARK-15914 URL: https://issues.apache.org/jira/browse/SPARK-15914 Project: Spark Issue

[jira] [Created] (SPARK-15910) Schema is not checked when converting DataFrame to Dataset using Kryo encoder

2016-06-12 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15910: -- Summary: Schema is not checked when converting DataFrame to Dataset using Kryo encoder Key: SPARK-15910 URL: https://issues.apache.org/jira/browse/SPARK-15910 Project:

[jira] [Created] (SPARK-15792) [SQL] Allows operator to change the verbosity in explain output.

2016-06-06 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15792: -- Summary: [SQL] Allows operator to change the verbosity in explain output. Key: SPARK-15792 URL: https://issues.apache.org/jira/browse/SPARK-15792 Project: Spark

[jira] [Commented] (SPARK-15632) Dataset typed filter operation changes query plan schema

2016-06-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316998#comment-15316998 ] Sean Zhong commented on SPARK-15632: *Root cause analysis:* *The root cause is that the

[jira] [Commented] (SPARK-15632) Dataset typed filter operation changes query plan schema

2016-06-06 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316996#comment-15316996 ] Sean Zhong commented on SPARK-15632: There are more issues linked with this bug which we may to fix

[jira] [Created] (SPARK-15734) Avoids printing internal row in explain output

2016-06-02 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15734: -- Summary: Avoids printing internal row in explain output Key: SPARK-15734 URL: https://issues.apache.org/jira/browse/SPARK-15734 Project: Spark Issue Type:

[jira] [Created] (SPARK-15733) Makes the explain output less verbose by hiding some verbose output like None, null, empty List, and etc..

2016-06-02 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15733: -- Summary: Makes the explain output less verbose by hiding some verbose output like None, null, empty List, and etc.. Key: SPARK-15733 URL:

[jira] [Updated] (SPARK-15495) Improve the output of explain for aggregate operator

2016-06-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-15495: --- Description: We should improves the explain output of Aggregator operator to make it more readable.

[jira] [Created] (SPARK-15692) Improves the explain output of several physical plans by displaying embedded logical plan in tree style

2016-06-01 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15692: -- Summary: Improves the explain output of several physical plans by displaying embedded logical plan in tree style Key: SPARK-15692 URL:

[jira] [Created] (SPARK-15674) Deprecates "CREATE TEMPORARY TABLE USING...", use "CREATE TEMPORARY VIEW USING..." instead.

2016-05-31 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15674: -- Summary: Deprecates "CREATE TEMPORARY TABLE USING...", use "CREATE TEMPORARY VIEW USING..." instead. Key: SPARK-15674 URL: https://issues.apache.org/jira/browse/SPARK-15674

[jira] [Created] (SPARK-15495) Improve the output of explain for aggregate operator

2016-05-23 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-15495: -- Summary: Improve the output of explain for aggregate operator Key: SPARK-15495 URL: https://issues.apache.org/jira/browse/SPARK-15495 Project: Spark Issue Type:

[jira] [Updated] (SPARK-15334) HiveClient facade not compatible with Hive 0.12

2016-05-15 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-15334: --- Summary: HiveClient facade not compatible with Hive 0.12 (was: [SPARK][SQL] HiveClient facade not
