[jira] [Updated] (SPARK-21358) Argument of repartitionandsortwithinpartitions at pyspark

2017-07-09 Thread chie hayashida (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chie hayashida updated SPARK-21358: --- Summary: Argument of repartitionandsortwithinpartitions at pyspark (was: variable of

[jira] [Created] (SPARK-21358) variable of repartitionandsortwithinpartitions at pyspark

2017-07-09 Thread chie hayashida (JIRA)
chie hayashida created SPARK-21358: -- Summary: variable of repartitionandsortwithinpartitions at pyspark Key: SPARK-21358 URL: https://issues.apache.org/jira/browse/SPARK-21358 Project: Spark

[jira] [Created] (SPARK-21357) FileInputDStream not remove out of date RDD

2017-07-09 Thread dadazheng (JIRA)
dadazheng created SPARK-21357: - Summary: FileInputDStream not remove out of date RDD Key: SPARK-21357 URL: https://issues.apache.org/jira/browse/SPARK-21357 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-21083) Store zero size and row count after analyzing empty table

2017-07-09 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21083: Fix Version/s: 2.1.2 > Store zero size and row count after analyzing empty table >

[jira] [Issue Comment Deleted] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread fengchaoge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] fengchaoge updated SPARK-21337: --- Comment: was deleted (was: !http://example.com/image.png!) > SQL which has large ‘case when’

[jira] [Updated] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread fengchaoge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] fengchaoge updated SPARK-21337: --- Attachment: test2.JPG > SQL which has large ‘case when’ expressions may cause code generation beyond

[jira] [Commented] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread fengchaoge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079799#comment-16079799 ] fengchaoge commented on SPARK-21337: OK, I will give it a try. Thank you very much. > SQL which has

[jira] [Commented] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079795#comment-16079795 ] Hyukjin Kwon commented on SPARK-21337: -- Probably, I think it would be nicer to narrow down via

[jira] [Resolved] (SPARK-21356) CSV datasource failed to parse a value having newline in its value

2017-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21356. -- Resolution: Invalid I am resolving this as the workaround looks so easy and I am not sure if

[jira] [Commented] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread fengchaoge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079773#comment-16079773 ] fengchaoge commented on SPARK-21337: 1. create database GBD_DM_PAC_SAFE; 2. use GBD_DM_PAC_SAFE; 3.

[jira] [Commented] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread fengchaoge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079790#comment-16079790 ] fengchaoge commented on SPARK-21337: Attachments actually happened i have no idea about code

[jira] [Resolved] (SPARK-21355) JSON datasource failed to parse a value having newline in its value

2017-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21355. -- Resolution: Invalid I am resolving this per https://stackoverflow.com/a/42073. {quote} This

[jira] [Commented] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread fengchaoge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079785#comment-16079785 ] fengchaoge commented on SPARK-21337: Thank you very much. What should I do next? Thank you for

[jira] [Updated] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread fengchaoge (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] fengchaoge updated SPARK-21337: --- Attachment: test1.JPG test.JPG > SQL which has large ‘case when’ expressions may

[jira] [Resolved] (SPARK-19659) Fetch big blocks to disk when shuffle-read

2017-07-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19659. -- Resolution: Fixed Fix Version/s: 2.2.0 The major work is done in 2.2.0. But it's

[jira] [Commented] (SPARK-21337) SQL which has large ‘case when’ expressions may cause code generation beyond 64KB

2017-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079776#comment-16079776 ] Hyukjin Kwon commented on SPARK-21337: -- No, just copying and pasting the SQL without a further

[jira] [Created] (SPARK-21355) JSON datasource failed to parse a value having newline in its value

2017-07-09 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-21355: Summary: JSON datasource failed to parse a value having newline in its value Key: SPARK-21355 URL: https://issues.apache.org/jira/browse/SPARK-21355 Project: Spark

[jira] [Created] (SPARK-21356) CSV datasource failed to parse a value having newline in its value

2017-07-09 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-21356: Summary: CSV datasource failed to parse a value having newline in its value Key: SPARK-21356 URL: https://issues.apache.org/jira/browse/SPARK-21356 Project: Spark

[jira] [Assigned] (SPARK-21354) INPUT FILE related functions do not support more than one sources

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21354: Assignee: Xiao Li (was: Apache Spark) > INPUT FILE related functions do not support more

[jira] [Commented] (SPARK-21354) INPUT FILE related functions do not support more than one sources

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079743#comment-16079743 ] Apache Spark commented on SPARK-21354: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079751#comment-16079751 ] Dongjoon Hyun edited comment on SPARK-21349 at 7/9/17 11:29 PM: This

[jira] [Comment Edited] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079751#comment-16079751 ] Dongjoon Hyun edited comment on SPARK-21349 at 7/9/17 11:28 PM: This

[jira] [Commented] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079751#comment-16079751 ] Dongjoon Hyun commented on SPARK-21349: --- This issue is not about blindly raising the threshold,

[jira] [Commented] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079746#comment-16079746 ] Kay Ousterhout commented on SPARK-21349: Does that mean we should just raise this threshold for

[jira] [Updated] (SPARK-21354) INPUT FILE related functions do not support more than one sources

2017-07-09 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21354: Description: {noformat} hive> select *, INPUT__FILE__NAME FROM t1, t2; FAILED: SemanticException Column

[jira] [Assigned] (SPARK-21354) INPUT FILE related functions do not support more than one sources

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21354: Assignee: Apache Spark (was: Xiao Li) > INPUT FILE related functions do not support more

[jira] [Created] (SPARK-21354) INPUT FILE related functions do not support more than one sources

2017-07-09 Thread Xiao Li (JIRA)
Xiao Li created SPARK-21354: --- Summary: INPUT FILE related functions do not support more than one sources Key: SPARK-21354 URL: https://issues.apache.org/jira/browse/SPARK-21354 Project: Spark

[jira] [Updated] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-21349: -- Description: Since Spark 1.1.0, Spark emits a warning when task size exceeds a threshold,

[jira] [Commented] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079728#comment-16079728 ] Dongjoon Hyun commented on SPARK-21349: --- Thank you for the advice, [~kayousterhout]! For usability, we

[jira] [Commented] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079724#comment-16079724 ] Kay Ousterhout commented on SPARK-21349: Is this a major usability issue (and what's the use case

[jira] [Updated] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-09 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-21349: -- Description: Since Spark 1.1.0, Spark emits a warning when task size exceeds a threshold,

[jira] [Commented] (SPARK-21353) add checkValue in spark.internal.config about how to correctly set configurations

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079633#comment-16079633 ] Apache Spark commented on SPARK-21353: -- User 'heary-cao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21353) add checkValue in spark.internal.config about how to correctly set configurations

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21353: Assignee: Apache Spark > add checkValue in spark.internal.config about how to correctly

[jira] [Assigned] (SPARK-21353) add checkValue in spark.internal.config about how to correctly set configurations

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21353: Assignee: (was: Apache Spark) > add checkValue in spark.internal.config about how to

[jira] [Created] (SPARK-21353) add checkValue in spark.internal.config about how to correctly set configurations

2017-07-09 Thread caoxuewen (JIRA)
caoxuewen created SPARK-21353: - Summary: add checkValue in spark.internal.config about how to correctly set configurations Key: SPARK-21353 URL: https://issues.apache.org/jira/browse/SPARK-21353 Project:

[jira] [Reopened] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-21352: --- > Memory Usage in Spark Streaming > --- > > Key: SPARK-21352

[jira] [Closed] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-21352. - > Memory Usage in Spark Streaming > --- > > Key: SPARK-21352 >

[jira] [Resolved] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21352. --- Resolution: Invalid > Memory Usage in Spark Streaming > --- > >

[jira] [Resolved] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21352. --- Resolution: Fixed That does not imply this is the right place. This is for proposing changes and

[jira] [Updated] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Shubham Gupta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shubham Gupta updated SPARK-21352: -- Description: I am trying to figure out the memory used by executors for a Spark Streaming

[jira] [Reopened] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Shubham Gupta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shubham Gupta reopened SPARK-21352: --- No solution has been provided for the problem, and Stack Overflow is not helping either > Memory

[jira] [Updated] (SPARK-21083) Store zero size and row count after analyzing empty table

2017-07-09 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21083: Fix Version/s: 2.2.1 > Store zero size and row count after analyzing empty table >

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079543#comment-16079543 ] Apache Spark commented on SPARK-18016: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Resolved] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21352. --- Resolution: Invalid Please point questions to StackOverflow or the mailing list. > Memory Usage in

[jira] [Commented] (SPARK-21332) Incorrect result type inferred for some decimal expressions

2017-07-09 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079525#comment-16079525 ] Anton Okolnychyi commented on SPARK-21332: -- I know the root cause and will submit a PR soon. >

[jira] [Updated] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Shubham Gupta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shubham Gupta updated SPARK-21352: -- Issue Type: Improvement (was: Bug) > Memory Usage in Spark Streaming >

[jira] [Created] (SPARK-21352) Memory Usage in Spark Streaming

2017-07-09 Thread Shubham Gupta (JIRA)
Shubham Gupta created SPARK-21352: - Summary: Memory Usage in Spark Streaming Key: SPARK-21352 URL: https://issues.apache.org/jira/browse/SPARK-21352 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21341) Spark 2.1.1: I want to be able to serialize wordVectors on Word2VecModel

2017-07-09 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079507#comment-16079507 ] Yan Facai (颜发才) commented on SPARK-21341: - Yes, [~sowen] is right. Why not use save and load

[jira] [Commented] (SPARK-21083) Store zero size and row count after analyzing empty table

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079493#comment-16079493 ] Apache Spark commented on SPARK-21083: -- User 'wzhfy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21351) Update nullability based on children's output in optimized logical plan

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21351: Assignee: (was: Apache Spark) > Update nullability based on children's output in

[jira] [Assigned] (SPARK-21351) Update nullability based on children's output in optimized logical plan

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21351: Assignee: Apache Spark > Update nullability based on children's output in optimized

[jira] [Commented] (SPARK-21351) Update nullability based on children's output in optimized logical plan

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079480#comment-16079480 ] Apache Spark commented on SPARK-21351: -- User 'maropu' has created a pull request for this issue:

[jira] [Commented] (SPARK-21083) Store zero size and row count after analyzing empty table

2017-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079474#comment-16079474 ] Apache Spark commented on SPARK-21083: -- User 'wzhfy' has created a pull request for this issue:

[jira] [Updated] (SPARK-21351) Update nullability based on children's output in optimized logical plan

2017-07-09 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-21351: - Description: In the master, optimized plans do not respect the nullability that `Filter`

[jira] [Created] (SPARK-21351) Update nullability based on children's output in optimized logical plan

2017-07-09 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-21351: Summary: Update nullability based on children's output in optimized logical plan Key: SPARK-21351 URL: https://issues.apache.org/jira/browse/SPARK-21351