[jira] [Comment Edited] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563038#comment-15563038 ] Gaoxiang Liu edited comment on SPARK-3577 at 10/10/16 6:17 PM: --- I find that

[jira] [Comment Edited] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563038#comment-15563038 ] Gaoxiang Liu edited comment on SPARK-3577 at 10/10/16 6:18 PM: --- I find that

[jira] [Issue Comment Deleted] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaoxiang Liu updated SPARK-3577: Comment: was deleted (was: spill size metrics) > Add task metric to report spill time >

[jira] [Commented] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563038#comment-15563038 ] Gaoxiang Liu commented on SPARK-3577: - I find that the spill size metric has already been added in

[jira] [Updated] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaoxiang Liu updated SPARK-3577: Attachment: spill_size.jpg spill size metrics > Add task metric to report spill time >

[jira] [Commented] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562925#comment-15562925 ] Gaoxiang Liu commented on SPARK-3577: - Hi [~kayousterhout], Just want to make sure that this JIRA is

[jira] [Created] (SPARK-17856) JVM Crash during tests: pyspark.mllib.linalg.distributed

2016-10-10 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17856: -- Summary: JVM Crash during tests: pyspark.mllib.linalg.distributed Key: SPARK-17856 URL: https://issues.apache.org/jira/browse/SPARK-17856 Project: Spark Issue

[jira] [Resolved] (SPARK-17806) Incorrect result when work with data from parquet

2016-10-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17806. Resolution: Fixed Fix Version/s: 2.1.0 2.0.2 Issue resolved by pull

[jira] [Commented] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-10-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1759#comment-1759 ] Davies Liu commented on SPARK-17738: I will look into that. > Flaky test:

[jira] [Assigned] (SPARK-17806) Incorrect result when work with data from parquet

2016-10-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17806: -- Assignee: Davies Liu > Incorrect result when work with data from parquet >

[jira] [Updated] (SPARK-17806) Incorrect result when work with data from parquet

2016-10-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17806: --- Priority: Blocker (was: Critical) > Incorrect result when work with data from parquet >

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-10-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15549494#comment-15549494 ] Davies Liu commented on SPARK-16922: Thanks for the feedback, that's reasonable. > Query with

[jira] [Resolved] (SPARK-15390) Memory management issue in complex DataFrame join and filter

2016-10-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-15390. Resolution: Fixed > Memory management issue in complex DataFrame join and filter >

[jira] [Updated] (SPARK-15390) Memory management issue in complex DataFrame join and filter

2016-10-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-15390: --- Fix Version/s: 2.0.1 > Memory management issue in complex DataFrame join and filter >

[jira] [Comment Edited] (SPARK-15390) Memory management issue in complex DataFrame join and filter

2016-10-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546675#comment-15546675 ] Davies Liu edited comment on SPARK-15390 at 10/4/16 9:11 PM: - @lulian Dragos

[jira] [Updated] (SPARK-15390) Memory management issue in complex DataFrame join and filter

2016-10-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-15390: --- Fix Version/s: (was: 2.0.0) > Memory management issue in complex DataFrame join and filter >

[jira] [Commented] (SPARK-15390) Memory management issue in complex DataFrame join and filter

2016-10-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546675#comment-15546675 ] Davies Liu commented on SPARK-15390: @lulian Dragos I think this is a different issue, fixed by

[jira] [Commented] (SPARK-17767) Spark SQL ExternalCatalog API custom implementation support

2016-10-03 Thread Alex Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543539#comment-15543539 ] Alex Liu commented on SPARK-17767: -- It looks good to me. How about the Hive thrift server? Will you patch

[jira] [Resolved] (SPARK-17679) Remove unnecessary Py4J ListConverter patch

2016-10-03 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17679. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15254

[jira] [Commented] (SPARK-12985) Spark Hive thrift server big decimal data issue

2016-10-03 Thread Alex Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542909#comment-15542909 ] Alex Liu commented on SPARK-12985: -- It's OK to close it; it could be fixed on the Simba side. > Spark

[jira] [Created] (SPARK-17767) Spark SQL ExternalCatalog API custom implementation support

2016-10-03 Thread Alex Liu (JIRA)
Alex Liu created SPARK-17767: Summary: Spark SQL ExternalCatalog API custom implementation support Key: SPARK-17767 URL: https://issues.apache.org/jira/browse/SPARK-17767 Project: Spark Issue

[jira] [Updated] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-09-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17738: --- Fix Version/s: (was: 2.2.0) 2.1.0 > Flaky test:

[jira] [Resolved] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-09-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17738. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 15305

[jira] [Created] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-09-29 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17738: -- Summary: Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract Key: SPARK-17738 URL: https://issues.apache.org/jira/browse/SPARK-17738

[jira] [Updated] (SPARK-17494) Floor/ceil of decimal returns wrong result if it's in compact format

2016-09-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17494: --- Summary: Floor/ceil of decimal returns wrong result if it's in compact format (was: Floor function

[jira] [Updated] (SPARK-17100) pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException

2016-09-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17100: --- Fix Version/s: (was: 2.2.0) 2.1.0 > pyspark filter on a udf column after join

[jira] [Resolved] (SPARK-17100) pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException

2016-09-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17100. Resolution: Fixed Fix Version/s: 2.2.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-16439) Incorrect information in SQL Query details

2016-09-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-16439: --- Fix Version/s: (was: 2.2.0) 2.1.0 > Incorrect information in SQL Query

[jira] [Assigned] (SPARK-17494) Floor function rounds up during join

2016-09-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17494: -- Assignee: Davies Liu > Floor function rounds up during join >

[jira] [Updated] (SPARK-16439) Incorrect information in SQL Query details

2016-09-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-16439: --- Assignee: Davies Liu (was: Maciej Bryński) > Incorrect information in SQL Query details >

[jira] [Resolved] (SPARK-16439) Incorrect information in SQL Query details

2016-09-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16439. Resolution: Fixed Fix Version/s: (was: 2.0.0) 2.2.0

[jira] [Reopened] (SPARK-16439) Incorrect information in SQL Query details

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reopened SPARK-16439: We could bring the separator back for better readability. > Incorrect information in SQL Query

[jira] [Commented] (SPARK-16439) Incorrect information in SQL Query details

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491744#comment-15491744 ] Davies Liu commented on SPARK-16439: The separator was added on purpose, otherwise it's very

[jira] [Assigned] (SPARK-17100) pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17100: -- Assignee: Davies Liu > pyspark filter on a udf column after join gives >

[jira] [Resolved] (SPARK-17472) Better error message for serialization failures of large objects in Python

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17472. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15026

[jira] [Commented] (SPARK-17544) Timeout waiting for connection from pool, DataFrame Reader's not closing S3 connections?

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491397#comment-15491397 ] Davies Liu commented on SPARK-17544: Could you post some code to reproduce the issue? > Timeout

[jira] [Resolved] (SPARK-17514) df.take(1) and df.limit(1).collect() perform differently in Python

2016-09-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17514. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Resolved] (SPARK-17474) Python UDF does not work between Sort and Limit

2016-09-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17474. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Assigned] (SPARK-15621) BatchEvalPythonExec fails with OOM

2016-09-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-15621: -- Assignee: Davies Liu > BatchEvalPythonExec fails with OOM >

[jira] [Resolved] (SPARK-17354) java.lang.ClassCastException: java.lang.Integer cannot be cast to java.sql.Date

2016-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17354. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Created] (SPARK-17482) Analyzer should be able run on top of optimized rule

2016-09-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17482: -- Summary: Analyzer should be able run on top of optimized rule Key: SPARK-17482 URL: https://issues.apache.org/jira/browse/SPARK-17482 Project: Spark Issue Type:

[jira] [Updated] (SPARK-17474) Python UDF does not work between Sort and Limit

2016-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17474: --- Summary: Python UDF does not work between Sort and Limit (was: expressions of QueryPlan does not

[jira] [Updated] (SPARK-17474) expressions of QueryPlan does not include those inside Option[Seq[Expression]]

2016-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17474: --- Affects Version/s: (was: 1.6.2) (was: 1.5.2) > expressions of

[jira] [Created] (SPARK-17474) expressions of QueryPlan does not include those inside Option[Seq[Expression]]

2016-09-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17474: -- Summary: expressions of QueryPlan does not include those inside Option[Seq[Expression]] Key: SPARK-17474 URL: https://issues.apache.org/jira/browse/SPARK-17474 Project:

[jira] [Commented] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468738#comment-15468738 ] Davies Liu commented on SPARK-17381: cc [~cloud_fan] > Memory leak

[jira] [Closed] (SPARK-17384) SQL - Running query with outer join from 1.6 fails

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-17384. -- Resolution: Duplicate Assignee: Herman van Hovell > SQL - Running query with outer join from 1.6

[jira] [Commented] (SPARK-17384) SQL - Running query with outer join from 1.6 fails

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468636#comment-15468636 ] Davies Liu commented on SPARK-17384: This is caused by the SQL parser change, the parsed plan in 1.6:

[jira] [Commented] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468583#comment-15468583 ] Davies Liu commented on SPARK-17377: Tested this with latest master and 2.0 on databricks[1], they

[jira] [Assigned] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17377: -- Assignee: Davies Liu > Joining Datasets read and aggregated from a partitioned Parquet file

[jira] [Updated] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17377: --- Description: Reproduction: 1) Read two Datasets from a partitioned Parquet file with different

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468449#comment-15468449 ] Davies Liu commented on SPARK-17403: [~rhernando] Could you pull out the string column (SL_RD_ColR_N)

[jira] [Updated] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17403: --- Summary: Fatal Error: Scan cached strings (was: Fatal Error: SIGSEGV on Jdbc joins) > Fatal Error:

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468364#comment-15468364 ] Davies Liu commented on SPARK-16922: Is there any performance difference comparing to

[jira] [Resolved] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16922. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Resolved] (SPARK-17211) Broadcast join produces incorrect results when compressed Oops differs between driver, executor

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17211. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-17409) Query in CTAS is Optimized Twice

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17409: --- Assignee: Xiao Li > Query in CTAS is Optimized Twice > > >

[jira] [Commented] (SPARK-17211) Broadcast join produces incorrect results when compressed Oops differs between driver, executor

2016-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459835#comment-15459835 ] Davies Liu commented on SPARK-17211: Could you try the patch ?

[jira] [Resolved] (SPARK-16334) SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException

2016-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16334. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Resolved] (SPARK-17230) Writing decimal to csv will result empty string if the decimal exceeds (20, 18)

2016-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17230. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > Writing decimal to csv

[jira] [Updated] (SPARK-17261) Using HiveContext after re-creating SparkContext in Spark 2.0 throws "Java.lang.illegalStateException: Cannot call methods on a stopped sparkContext"

2016-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17261: --- Assignee: Jeff Zhang > Using HiveContext after re-creating SparkContext in Spark 2.0 throws >

[jira] [Resolved] (SPARK-17261) Using HiveContext after re-creating SparkContext in Spark 2.0 throws "Java.lang.illegalStateException: Cannot call methods on a stopped sparkContext"

2016-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17261. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Commented] (SPARK-17211) Broadcast join produces incorrect results when compressed Oops differs between driver, executor

2016-09-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459040#comment-15459040 ] Davies Liu commented on SPARK-17211: [~migtor] Could you try this patch ?

[jira] [Resolved] (SPARK-16525) Enable Row Based HashMap in HashAggregateExec

2016-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16525. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14176

[jira] [Resolved] (SPARK-16926) Partition columns are present in columns metadata for partition but not table

2016-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16926. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15456605#comment-15456605 ] Davies Liu commented on SPARK-16922: [~sitalke...@gmail.com] I think I found the cause and fixed it,

[jira] [Assigned] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-16922: -- Assignee: Davies Liu > Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

[jira] [Assigned] (SPARK-17211) Broadcast join produces incorrect results on EMR with large driver memory

2016-09-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17211: -- Assignee: Davies Liu > Broadcast join produces incorrect results on EMR with large driver

[jira] [Resolved] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17063. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-17256) spark-submit.cmd cannot work if path has space and cut off double-quoted arguments

2016-08-26 Thread Quanmao Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanmao Liu updated SPARK-17256: Remaining Estimate: 120h (was: 5m) Original Estimate: 120h (was: 5m) > spark-submit.cmd

[jira] [Updated] (SPARK-17256) spark-submit.cmd cannot work if path has space and cut off double-quoted arguments

2016-08-26 Thread Quanmao Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanmao Liu updated SPARK-17256: Description: The key problem is { cmd /V /E /C "~%dp0spark-xxx.cmd" } cannot accept arguments

[jira] [Commented] (SPARK-17256) spark-submit.cmd cannot work if path has space and cut off double-quoted arguments

2016-08-26 Thread Quanmao Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438534#comment-15438534 ] Quanmao Liu commented on SPARK-17256: - I've pushed a pull request :

[jira] [Created] (SPARK-17256) spark-submit.cmd cannot work if path has space and cut off double-quoted arguments

2016-08-26 Thread Quanmao Liu (JIRA)
Quanmao Liu created SPARK-17256: --- Summary: spark-submit.cmd cannot work if path has space and cut off double-quoted arguments Key: SPARK-17256 URL: https://issues.apache.org/jira/browse/SPARK-17256

[jira] [Created] (SPARK-17230) Writing decimal to csv will result empty string if the decimal exceeds (20, 18)

2016-08-24 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17230: -- Summary: Writing decimal to csv will result empty string if the decimal exceeds (20, 18) Key: SPARK-17230 URL: https://issues.apache.org/jira/browse/SPARK-17230 Project:

[jira] [Commented] (SPARK-14560) Cooperative Memory Management for Spillables

2016-08-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433200#comment-15433200 ] Davies Liu commented on SPARK-14560: Even with SPARK-4452, we still can not say that we fixed the OOM

[jira] [Resolved] (SPARK-13286) JDBC driver doesn't report full exception

2016-08-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13286. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Closed] (SPARK-16569) Use Cython to speed up Pyspark internals

2016-08-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-16569. -- Resolution: Won't Fix > Use Cython to speed up Pyspark internals >

[jira] [Commented] (SPARK-16569) Use Cython to speed up Pyspark internals

2016-08-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428843#comment-15428843 ] Davies Liu commented on SPARK-16569: Agreed with [~robert3005]. Another option could be to just use PyPy,

[jira] [Updated] (SPARK-17113) Job failure due to Executor OOM in offheap mode

2016-08-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17113: --- Assignee: Sital Kedia > Job failure due to Executor OOM in offheap mode >

[jira] [Resolved] (SPARK-17113) Job failure due to Executor OOM in offheap mode

2016-08-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17113. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > Job failure due to

[jira] [Assigned] (SPARK-13286) JDBC driver doesn't report full exception

2016-08-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-13286: -- Assignee: Davies Liu > JDBC driver doesn't report full exception >

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427264#comment-15427264 ] Davies Liu commented on SPARK-16922: Which serializer are you using? Java serializer or Kryo? >

[jira] [Comment Edited] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427241#comment-15427241 ] Davies Liu edited comment on SPARK-16922 at 8/18/16 9:58 PM: - Is this failure

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427241#comment-15427241 ] Davies Liu commented on SPARK-16922: Is this failure deterministic or not? Happened on every task or

[jira] [Created] (SPARK-17115) Improve the performance of UnsafeProjection for wide table

2016-08-17 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17115: -- Summary: Improve the performance of UnsafeProjection for wide table Key: SPARK-17115 URL: https://issues.apache.org/jira/browse/SPARK-17115 Project: Spark Issue

[jira] [Resolved] (SPARK-17106) Simplify subquery interface

2016-08-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17106. Resolution: Fixed Fix Version/s: 2.1.0 > Simplify subquery interface >

[jira] [Resolved] (SPARK-17035) Conversion of datetime.max to microseconds produces incorrect value

2016-08-16 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17035. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14631

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15421484#comment-15421484 ] Davies Liu commented on SPARK-16922: Do you also have this one?

[jira] [Created] (SPARK-17063) MSCK REPAIR TABLE is super slow with Hive metastore

2016-08-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17063: -- Summary: MSCK REPAIR TABLE is super slow with Hive metastore Key: SPARK-17063 URL: https://issues.apache.org/jira/browse/SPARK-17063 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419438#comment-15419438 ] Davies Liu commented on SPARK-16922: I think it's fixed by

[jira] [Commented] (SPARK-16922) Query with Broadcast Hash join fails due to executor OOM in Spark 2.0

2016-08-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419434#comment-15419434 ] Davies Liu commented on SPARK-16922: [~sitalke...@gmail.com] There are two integer overflow bugs

[jira] [Resolved] (SPARK-16958) Reuse subqueries within single query

2016-08-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16958. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14548

[jira] [Resolved] (SPARK-16928) Recursive call of ColumnVector::getInt() breaks JIT inlining

2016-08-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16928. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14513

[jira] [Commented] (SPARK-16227) Json schema inference fails when `:` exists in file path

2016-08-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415994#comment-15415994 ] Davies Liu commented on SPARK-16227: Can reproduce this by changing `jsont` to `jsont:1` > Json schema

[jira] [Commented] (SPARK-16227) Json schema inference fails when `:` exists in file path

2016-08-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415991#comment-15415991 ] Davies Liu commented on SPARK-16227: [~brkyvz] I can't reproduce this in master (2.1-snapshot),

[jira] [Assigned] (SPARK-14887) Generated SpecificUnsafeProjection Exceeds JVM Code Size Limits

2016-08-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-14887: -- Assignee: Davies Liu > Generated SpecificUnsafeProjection Exceeds JVM Code Size Limits >

[jira] [Commented] (SPARK-14887) Generated SpecificUnsafeProjection Exceeds JVM Code Size Limits

2016-08-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415710#comment-15415710 ] Davies Liu commented on SPARK-14887: Do you have a large CaseWhen in this query? > Generated

[jira] [Comment Edited] (SPARK-15639) Try to push down filter at RowGroups level for parquet reader

2016-08-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415591#comment-15415591 ] Davies Liu edited comment on SPARK-15639 at 8/10/16 5:07 PM: - Merged

[jira] [Resolved] (SPARK-15639) Try to push down filter at RowGroups level for parquet reader

2016-08-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-15639. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Target

[jira] [Commented] (SPARK-16093) Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold = 1

2016-08-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414361#comment-15414361 ] Davies Liu commented on SPARK-16093: It also could be possible that the stat of Hive table is broken
