[jira] [Comment Edited] (SPARK-24809) Serializing LongHashedRelation in executor may result in data error

2018-07-17 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547337#comment-16547337 ] zenglinxi edited comment on SPARK-24809 at 7/18/18 4:10 AM:

[jira] [Comment Edited] (SPARK-24809) Serializing LongHashedRelation in executor may result in data error

2018-07-17 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547337#comment-16547337 ] zenglinxi edited comment on SPARK-24809 at 7/18/18 4:09 AM:

[jira] [Commented] (SPARK-24809) Serializing LongHashedRelation in executor may result in data error

2018-07-17 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547337#comment-16547337 ] zenglinxi commented on SPARK-24809: --- [^Spark LongHashedRelation serialization.svg] I

[jira] [Updated] (SPARK-24809) Serializing LongHashedRelation in executor may result in data error

2018-07-17 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-24809: -- Attachment: Spark LongHashedRelation serialization.svg > Serializing LongHashedRelation in executor ma

[jira] [Commented] (SPARK-24607) Distribute by rand() can lead to data inconsistency

2018-06-20 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518230#comment-16518230 ] zenglinxi commented on SPARK-24607: --- [~viirya]  I have some tests, it seems like rand

[jira] [Updated] (SPARK-24607) Distribute by rand() can lead to data inconsistency

2018-06-20 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-24607: -- Description: Noticed the following queries can give different results: {code:java} select count(*) fro

[jira] [Updated] (SPARK-24607) Distribute by rand() can lead to data inconsistency

2018-06-20 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-24607: -- Description: Noticed the following queries can give different results: {code:java} select count(*) fro

[jira] [Created] (SPARK-24607) Distribute by rand() can lead to data inconsistency

2018-06-20 Thread zenglinxi (JIRA)
zenglinxi created SPARK-24607: - Summary: Distribute by rand() can lead to data inconsistency Key: SPARK-24607 URL: https://issues.apache.org/jira/browse/SPARK-24607 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-21223) Thread-safety issue in FsHistoryProvider

2017-06-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-21223: -- Attachment: historyserver_jstack.txt BTW, this causes an infinite loop problem when we restart historyse

[jira] [Comment Edited] (SPARK-21223) Thread-safety issue in FsHistoryProvider

2017-06-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066319#comment-16066319 ] zenglinxi edited comment on SPARK-21223 at 6/28/17 10:42 AM: -

[jira] [Commented] (SPARK-21223) Thread-safety issue in FsHistoryProvider

2017-06-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066319#comment-16066319 ] zenglinxi commented on SPARK-21223: --- OK, I will check SPARK-21078 first. > Thread-safe

[jira] [Created] (SPARK-21223) Thread-safety issue in FsHistoryProvider

2017-06-27 Thread zenglinxi (JIRA)
zenglinxi created SPARK-21223: - Summary: Thread-safety issue in FsHistoryProvider Key: SPARK-21223 URL: https://issues.apache.org/jira/browse/SPARK-21223 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-20240) SparkSQL support limitations of max dynamic partitions when inserting hive table

2017-04-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-20240: -- Affects Version/s: (was: 2.1.0) > SparkSQL support limitations of max dynamic partitions when inser

[jira] [Updated] (SPARK-20240) SparkSQL support limitations of max dynamic partitions when inserting hive table

2017-04-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-20240: -- Affects Version/s: 1.6.3 > SparkSQL support limitations of max dynamic partitions when inserting hive

[jira] [Commented] (SPARK-20240) SparkSQL support limitations of max dynamic partitions when inserting hive table

2017-04-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958573#comment-15958573 ] zenglinxi commented on SPARK-20240: --- There is a configuration parameter in hive that ca

[jira] [Created] (SPARK-20240) SparkSQL support limitations of max dynamic partitions when inserting hive table

2017-04-06 Thread zenglinxi (JIRA)
zenglinxi created SPARK-20240: - Summary: SparkSQL support limitations of max dynamic partitions when inserting hive table Key: SPARK-20240 URL: https://issues.apache.org/jira/browse/SPARK-20240 Project: S

[jira] [Commented] (SPARK-13819) using a regexp_replace in a group by clause raises a nullpointerexception

2016-10-27 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15611589#comment-15611589 ] zenglinxi commented on SPARK-13819: --- We encountered the same problem, is there any prog

[jira] [Commented] (SPARK-16253) make spark sql compatible with hive sql that using python script transform like using 'xxx.py'

2016-08-17 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424388#comment-15424388 ] zenglinxi commented on SPARK-16253: --- wait a minute please, I'm working on this PR > ma

[jira] [Updated] (SPARK-16408) SparkSQL Added file get Exception: is a directory and recursive is not turned on

2016-07-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-16408: -- Description: when using Spark-sql to execute sql like: {noformat} add file hdfs://xxx/user/test; {nofor

[jira] [Commented] (SPARK-16408) SparkSQL Added file get Exception: is a directory and recursive is not turned on

2016-07-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365668#comment-15365668 ] zenglinxi commented on SPARK-16408: --- I think we should add a parameter (spark.input.di

[jira] [Updated] (SPARK-16408) SparkSQL Added file get Exception: is a directory and recursive is not turned on

2016-07-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-16408: -- Description: when use Spark-sql to execute sql like: {noformat} add file hdfs://xxx/user/test; {noforma

[jira] [Comment Edited] (SPARK-16408) SparkSQL Added file get Exception: is a directory and recursive is not turned on

2016-07-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365658#comment-15365658 ] zenglinxi edited comment on SPARK-16408 at 7/7/16 6:09 AM: --- as

[jira] [Comment Edited] (SPARK-16408) SparkSQL Added file get Exception: is a directory and recursive is not turned on

2016-07-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365658#comment-15365658 ] zenglinxi edited comment on SPARK-16408 at 7/7/16 6:07 AM: --- as

[jira] [Commented] (SPARK-16408) SparkSQL Added file get Exception: is a directory and recursive is not turned on

2016-07-06 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365658#comment-15365658 ] zenglinxi commented on SPARK-16408: --- as shown in https://issues.apache.org/jira/browse/

[jira] [Created] (SPARK-16408) SparkSQL Added file get Exception: is a directory and recursive is not turned on

2016-07-06 Thread zenglinxi (JIRA)
zenglinxi created SPARK-16408: - Summary: SparkSQL Added file get Exception: is a directory and recursive is not turned on Key: SPARK-16408 URL: https://issues.apache.org/jira/browse/SPARK-16408 Project: S

[jira] [Commented] (SPARK-16253) make spark sql compatible with hive sql that using python script transform like using 'xxx.py'

2016-06-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352917#comment-15352917 ] zenglinxi commented on SPARK-16253: --- I'm working on this issue... > make spark sql com

[jira] [Created] (SPARK-16253) make spark sql compatible with hive sql that using python script transform like using 'xxx.py'

2016-06-28 Thread zenglinxi (JIRA)
zenglinxi created SPARK-16253: - Summary: make spark sql compatible with hive sql that using python script transform like using 'xxx.py' Key: SPARK-16253 URL: https://issues.apache.org/jira/browse/SPARK-16253

[jira] [Commented] (SPARK-14974) spark sql job create too many files in HDFS when doing insert overwrite hive table

2016-04-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262031#comment-15262031 ] zenglinxi commented on SPARK-14974: --- hi,[~sowen]: Thanks for your reply. "200w" means

[jira] [Updated] (SPARK-14974) spark sql job create too many files in HDFS when doing insert overwrite hive table

2016-04-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-14974: -- Description: Recently, we often encounter problems using spark sql for inserting data into a partition

[jira] [Updated] (SPARK-14974) spark sql job create too many files in HDFS when doing insert overwrite hive table

2016-04-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-14974: -- Description: Recently, we often encounter problems using spark sql for inserting data into a partition

[jira] [Updated] (SPARK-14974) spark sql job create too many files in HDFS when doing insert overwrite hive table

2016-04-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-14974: -- Summary: spark sql job create too many files in HDFS when doing insert overwrite hive table (was: spar

[jira] [Updated] (SPARK-14974) spark sql insert hive table write too many files

2016-04-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-14974: -- Description: Recently, we often encounter problems using spark sql for inserting data into a partition

[jira] [Updated] (SPARK-14974) spark sql insert hive table write too many files

2016-04-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-14974: -- Description: Recently, we often encounter problems using spark sql for inserting data into a partition

[jira] [Updated] (SPARK-14974) spark sql insert hive table write too many files

2016-04-28 Thread zenglinxi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zenglinxi updated SPARK-14974: -- Summary: spark sql insert hive table write too many files (was: spark sql insert hive table write too

[jira] [Created] (SPARK-14974) spark sql insert hive table write too much files

2016-04-28 Thread zenglinxi (JIRA)
zenglinxi created SPARK-14974: - Summary: spark sql insert hive table write too much files Key: SPARK-14974 URL: https://issues.apache.org/jira/browse/SPARK-14974 Project: Spark Issue Type: Task