[jira] [Commented] (SPARK-24891) Fix HandleNullInputsForUDF rule

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555111#comment-16555111 ] Apache Spark commented on SPARK-24891: -- User 'gatorsmile' has created a pull request for this

[jira] [Resolved] (SPARK-23957) Sorts in subqueries are redundant and can be removed

2018-07-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23957. - Resolution: Fixed Assignee: Henry Robinson Fix Version/s: 2.4.0 > Sorts in subqueries

[jira] [Resolved] (SPARK-24890) Short circuiting the `if` condition when `trueValue` and `falseValue` are the same

2018-07-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24890. - Resolution: Fixed Fix Version/s: 2.4.0 > Short circuiting the `if` condition when `trueValue`

[jira] [Updated] (SPARK-24914) totalSize is not a good estimate for broadcast joins

2018-07-24 Thread Bruce Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce Robbins updated SPARK-24914: -- Description: When determining whether to do a broadcast join, Spark estimates the size of

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555080#comment-16555080 ] Saisai Shao commented on SPARK-24615: - Hi [~tgraves] [~irashid] thanks a lot for your comments.

[jira] [Comment Edited] (SPARK-23622) Flaky Test: HiveClientSuites

2018-07-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555041#comment-16555041 ] Yuming Wang edited comment on SPARK-23622 at 7/25/18 2:44 AM: --

[jira] [Resolved] (SPARK-24891) Fix HandleNullInputsForUDF rule

2018-07-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24891. - Resolution: Fixed Assignee: Maryann Xue Fix Version/s: 2.4.0 2.3.2 >

[jira] [Commented] (SPARK-23622) Flaky Test: HiveClientSuites

2018-07-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555041#comment-16555041 ] Yuming Wang commented on SPARK-23622: -

[jira] [Issue Comment Deleted] (SPARK-24663) Flaky test: StreamingContextSuite "stop slow receiver gracefully"

2018-07-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24663: Comment: was deleted (was:

[jira] [Updated] (SPARK-24529) Add spotbugs into maven build process

2018-07-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24529: - Fix Version/s: (was: 2.4.0) > Add spotbugs into maven build process >

[jira] [Reopened] (SPARK-24529) Add spotbugs into maven build process

2018-07-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-24529: -- This was reverted in favour of https://github.com/apache/spark/pull/21865 and SPARK-24895 for

[jira] [Commented] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555027#comment-16555027 ] Hyukjin Kwon commented on SPARK-24895: -- Thank you [~yhuai]. I couldn't foresee this problem. >

[jira] [Commented] (SPARK-24663) Flaky test: StreamingContextSuite "stop slow receiver gracefully"

2018-07-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555028#comment-16555028 ] Yuming Wang commented on SPARK-24663: -

[jira] [Commented] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555023#comment-16555023 ] Apache Spark commented on SPARK-24906: -- User 'habren' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24906: Assignee: (was: Apache Spark) > Enlarge split size for columnar file to ensure the

[jira] [Assigned] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24906: Assignee: Apache Spark > Enlarge split size for columnar file to ensure the task read

[jira] [Commented] (SPARK-24897) DAGScheduler should not unregisterMapOutput and increaseEpoch repeatedly for stage fetchFailed

2018-07-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555002#comment-16555002 ] Sean Owen commented on SPARK-24897: --- I don't understand what you're reporting. I think you should

[jira] [Commented] (SPARK-24911) SHOW CREATE TABLE drops escaping of nested column names

2018-07-24 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554990#comment-16554990 ] Yuming Wang commented on SPARK-24911: - Can we show non printable field delimiter when {{SHOW CREATE

[jira] [Commented] (SPARK-24897) DAGScheduler should not unregisterMapOutput and increaseEpoch repeatedly for stage fetchFailed

2018-07-24 Thread liupengcheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554988#comment-16554988 ] liupengcheng commented on SPARK-24897: -- [~srowen] anybody can verify this issue? Thanks a lot >

[jira] [Commented] (SPARK-24867) Add AnalysisBarrier to DataFrameWriter

2018-07-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554986#comment-16554986 ] Saisai Shao commented on SPARK-24867: - [~smilegator] what's the ETA of this issue? > Add

[jira] [Assigned] (SPARK-24297) Change default value for spark.maxRemoteBlockSizeFetchToMem to be < 2GB

2018-07-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao reassigned SPARK-24297: --- Assignee: Imran Rashid > Change default value for spark.maxRemoteBlockSizeFetchToMem to be

[jira] [Resolved] (SPARK-24297) Change default value for spark.maxRemoteBlockSizeFetchToMem to be < 2GB

2018-07-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao resolved SPARK-24297. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21474

[jira] [Comment Edited] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Jason Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554972#comment-16554972 ] Jason Guo edited comment on SPARK-24906 at 7/25/18 1:03 AM: Thanks [~maropu]

[jira] [Commented] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Jason Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554972#comment-16554972 ] Jason Guo commented on SPARK-24906: --- Thanks [~maropu] and [~viirya] for your comments. Here is my

[jira] [Commented] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554932#comment-16554932 ] Yin Huai commented on SPARK-24895: -- [~hyukjin.kwon] [~kiszk] seems this revert indeed fixed the problem

[jira] [Created] (SPARK-24914) totalSize is not a good estimate for broadcast joins

2018-07-24 Thread Bruce Robbins (JIRA)
Bruce Robbins created SPARK-24914: - Summary: totalSize is not a good estimate for broadcast joins Key: SPARK-24914 URL: https://issues.apache.org/jira/browse/SPARK-24914 Project: Spark Issue

[jira] [Commented] (SPARK-24778) DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed

2018-07-24 Thread Vinitha Reddy Gankidi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554907#comment-16554907 ] Vinitha Reddy Gankidi commented on SPARK-24778: --- [~maropu] Sorry, I missed seeing this

[jira] [Created] (SPARK-24913) Make `AssertTrue` and `AssertNotNull` non-deterministic

2018-07-24 Thread DB Tsai (JIRA)
DB Tsai created SPARK-24913: --- Summary: Make `AssertTrue` and `AssertNotNull` non-deterministic Key: SPARK-24913 URL: https://issues.apache.org/jira/browse/SPARK-24913 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-24908) [R] remove spaces to make lintr happy

2018-07-24 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai resolved SPARK-24908. - Resolution: Fixed Target Version/s: 2.4.0 > [R] remove spaces to make lintr happy >

[jira] [Resolved] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-24895. -- Resolution: Fixed Fix Version/s: 2.4.0 [https://github.com/apache/spark/pull/21865] has been

[jira] [Updated] (SPARK-24912) Broadcast join OutOfMemory stack trace obscures actual cause of OOM

2018-07-24 Thread Bruce Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce Robbins updated SPARK-24912: -- Priority: Minor (was: Major) > Broadcast join OutOfMemory stack trace obscures actual cause

[jira] [Commented] (SPARK-24911) SHOW CREATE TABLE drops escaping of nested column names

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554839#comment-16554839 ] Apache Spark commented on SPARK-24911: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24911) SHOW CREATE TABLE drops escaping of nested column names

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24911: Assignee: Apache Spark > SHOW CREATE TABLE drops escaping of nested column names >

[jira] [Assigned] (SPARK-24911) SHOW CREATE TABLE drops escaping of nested column names

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24911: Assignee: (was: Apache Spark) > SHOW CREATE TABLE drops escaping of nested column

[jira] [Created] (SPARK-24912) Broadcast join OutOfMemory stack trace obscures actual cause of OOM

2018-07-24 Thread Bruce Robbins (JIRA)
Bruce Robbins created SPARK-24912: - Summary: Broadcast join OutOfMemory stack trace obscures actual cause of OOM Key: SPARK-24912 URL: https://issues.apache.org/jira/browse/SPARK-24912 Project: Spark

[jira] [Created] (SPARK-24911) SHOW CREATE TABLE drops escaping of nested column names

2018-07-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24911: -- Summary: SHOW CREATE TABLE drops escaping of nested column names Key: SPARK-24911 URL: https://issues.apache.org/jira/browse/SPARK-24911 Project: Spark Issue

[jira] [Created] (SPARK-24910) Spark Bloom Filter Closure Serialization improvement for very high volume of Data

2018-07-24 Thread Himangshu Ranjan Borah (JIRA)
Himangshu Ranjan Borah created SPARK-24910: -- Summary: Spark Bloom Filter Closure Serialization improvement for very high volume of Data Key: SPARK-24910 URL:

[jira] [Commented] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554803#comment-16554803 ] Thomas Graves commented on SPARK-24909: --- I haven't come up with a fix yet but have been looking at

[jira] [Commented] (SPARK-24307) Support sending messages over 2GB from memory

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554784#comment-16554784 ] Apache Spark commented on SPARK-24307: -- User 'squito' has created a pull request for this issue:

[jira] [Commented] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-07-24 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554748#comment-16554748 ] Imran Rashid commented on SPARK-24909: -- ugh, yeah I think you're right. do you have a fix in mind?

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-24 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554742#comment-16554742 ] Imran Rashid commented on SPARK-24615: -- hi, just catching up here -- agree with a lot of Tom's

[jira] [Updated] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24909: -- Description: The DAGScheduler can hang if the executor was lost (due to fetch failure) and

[jira] [Updated] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24909: -- Summary: Spark scheduler can hang when fetch failures, executor lost, task running on lost

[jira] [Commented] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554715#comment-16554715 ] Apache Spark commented on SPARK-24768: -- User 'gengliangwang' has created a pull request for this

[jira] [Commented] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554716#comment-16554716 ] Liang-Chi Hsieh commented on SPARK-24906: - A {{maxPartitionBytes}} value adapted improperly

[jira] [Commented] (SPARK-24909) Spark scheduler can hang with fetch failures and executor lost and multiple stage attempts

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554701#comment-16554701 ] Thomas Graves commented on SPARK-24909: --- Note this may have been introduced as part of SPARK-23433

[jira] [Updated] (SPARK-24909) Spark scheduler can hang with fetch failures and executor lost and multiple stage attempts

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24909: -- Description: The DAGScheduler can hang if the executor was lost (due to fetch failure) and

[jira] [Created] (SPARK-24909) Spark scheduler can hang with fetch failures and executor lost and multiple stage attempts

2018-07-24 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24909: - Summary: Spark scheduler can hang with fetch failures and executor lost and multiple stage attempts Key: SPARK-24909 URL: https://issues.apache.org/jira/browse/SPARK-24909

[jira] [Updated] (SPARK-24909) Spark scheduler can hang with fetch failures and executor lost and multiple stage attempts

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24909: -- Priority: Critical (was: Major) > Spark scheduler can hang with fetch failures and executor

[jira] [Resolved] (SPARK-24812) Last Access Time in the table description is not valid

2018-07-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24812. - Resolution: Fixed Assignee: Sujith Fix Version/s: 2.4.0 > Last Access Time in the table

[jira] [Assigned] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-24895: -- Assignee: Eric Chang > Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched

[jira] [Updated] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-24895: --- Description: Spark 2.4.0 has Maven build errors because artifacts uploaded to apache maven repo

[jira] [Commented] (SPARK-24894) Invalid DNS name due to hostname truncation

2018-07-24 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554602#comment-16554602 ] Yinan Li commented on SPARK-24894: -- [~mcheah]. We need to make sure the truncation leads to a valid

[jira] [Commented] (SPARK-24908) [R] remove spaces to make lintr happy

2018-07-24 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554597#comment-16554597 ] shane knapp commented on SPARK-24908: - i noticed lintr complaining in my PRB build

[jira] [Assigned] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24895: Assignee: Apache Spark > Spark 2.4.0 Snapshot artifacts has broken metadata due to

[jira] [Commented] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554595#comment-16554595 ] Apache Spark commented on SPARK-24895: -- User 'ericfchang' has created a pull request for this

[jira] [Assigned] (SPARK-24895) Spark 2.4.0 Snapshot artifacts has broken metadata due to mismatched filenames

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24895: Assignee: (was: Apache Spark) > Spark 2.4.0 Snapshot artifacts has broken metadata

[jira] [Resolved] (SPARK-23325) DataSourceV2 readers should always produce InternalRow.

2018-07-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23325. - Resolution: Fixed Assignee: Ryan Blue Fix Version/s: 2.4.0 > DataSourceV2 readers

[jira] [Commented] (SPARK-24908) [R] remove spaces to make lintr happy

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554561#comment-16554561 ] Apache Spark commented on SPARK-24908: -- User 'shaneknapp' has created a pull request for this

[jira] [Assigned] (SPARK-24908) [R] remove spaces to make lintr happy

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24908: Assignee: shane knapp (was: Apache Spark) > [R] remove spaces to make lintr happy >

[jira] [Updated] (SPARK-24908) [R] remove spaces to make lintr happy

2018-07-24 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp updated SPARK-24908: Description: during my travails in porting spark builds to run on our centos worker, i managed

[jira] [Assigned] (SPARK-24908) [R] remove spaces to make lintr happy

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24908: Assignee: Apache Spark (was: shane knapp) > [R] remove spaces to make lintr happy >

[jira] [Created] (SPARK-24908) [R] remove spaces to make lintr happy

2018-07-24 Thread shane knapp (JIRA)
shane knapp created SPARK-24908: --- Summary: [R] remove spaces to make lintr happy Key: SPARK-24908 URL: https://issues.apache.org/jira/browse/SPARK-24908 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554431#comment-16554431 ] Apache Spark commented on SPARK-18874: -- User 'wangyum' has created a pull request for this issue:

[jira] [Commented] (SPARK-24581) Design: BarrierTaskContext.barrier()

2018-07-24 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554409#comment-16554409 ] Jiang Xingbo commented on SPARK-24581: -- Design doc:

[jira] [Commented] (SPARK-21097) Dynamic allocation will preserve cached data

2018-07-24 Thread Brad (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554404#comment-16554404 ] Brad commented on SPARK-21097: -- Hi [~menelaus], I have stalled out on this project, if you would like to

[jira] [Assigned] (SPARK-24903) Allow driver container name to be configurable in k8 spark

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24903: Assignee: Apache Spark > Allow driver container name to be configurable in k8 spark >

[jira] [Commented] (SPARK-24903) Allow driver container name to be configurable in k8 spark

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554379#comment-16554379 ] Apache Spark commented on SPARK-24903: -- User 'yifeih' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24903) Allow driver container name to be configurable in k8 spark

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24903: Assignee: (was: Apache Spark) > Allow driver container name to be configurable in k8

[jira] [Commented] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554340#comment-16554340 ] Takeshi Yamamuro commented on SPARK-24906: -- Ah, I see. It makes some sense to me. In

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554327#comment-16554327 ] Thomas Graves commented on SPARK-24615: --- Right so I think part of this is trying to make it more

[jira] [Commented] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-07-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554306#comment-16554306 ] Thomas Graves commented on SPARK-23128: --- we also did some initial evaluation with it as well and

[jira] [Assigned] (SPARK-24907) Migrate JDBC data source to DataSource API v2

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24907: Assignee: Apache Spark > Migrate JDBC data source to DataSource API v2 >

[jira] [Commented] (SPARK-24907) Migrate JDBC data source to DataSource API v2

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554221#comment-16554221 ] Apache Spark commented on SPARK-24907: -- User 'tengpeng' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24907) Migrate JDBC data source to DataSource API v2

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24907: Assignee: (was: Apache Spark) > Migrate JDBC data source to DataSource API v2 >

[jira] [Commented] (SPARK-24630) SPIP: Support SQLStreaming in Spark

2018-07-24 Thread Jackey Lee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554210#comment-16554210 ] Jackey Lee commented on SPARK-24630: In SQLStreaming, we also support standard SQL as batch queries,

[jira] [Created] (SPARK-24907) Migrate JDBC data source to DataSource API v2

2018-07-24 Thread Teng Peng (JIRA)
Teng Peng created SPARK-24907: - Summary: Migrate JDBC data source to DataSource API v2 Key: SPARK-24907 URL: https://issues.apache.org/jira/browse/SPARK-24907 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Jason Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Guo updated SPARK-24906: -- Description: For columnar file, such as, when spark sql read the table, each split will be 128 MB by

[jira] [Updated] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Jason Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Guo updated SPARK-24906: -- Attachment: image-2018-07-24-20-30-24-552.png > Enlarge split size for columnar file to ensure the

[jira] [Updated] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Jason Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Guo updated SPARK-24906: -- Attachment: image-2018-07-24-20-29-24-797.png > Enlarge split size for columnar file to ensure the

[jira] [Updated] (SPARK-24906) Enlarge split size for columnar file to ensure the task read enough data

2018-07-24 Thread Jason Guo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Guo updated SPARK-24906: -- Attachment: image-2018-07-24-20-28-06-269.png > Enlarge split size for columnar file to ensure the

[jira] [Updated] (SPARK-24905) Spark 2.3 Internal URL env variable

2018-07-24 Thread Björn Wenzel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Björn Wenzel updated SPARK-24905: - Priority: Critical (was: Major) > Spark 2.3 Internal URL env variable >

[jira] [Comment Edited] (SPARK-24630) SPIP: Support SQLStreaming in Spark

2018-07-24 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554043#comment-16554043 ] Genmao Yu edited comment on SPARK-24630 at 7/24/18 11:38 AM: - [~zsxwing]

[jira] [Created] (SPARK-24904) Join with broadcasted dataframe causes shuffle of redundant data

2018-07-24 Thread Shay Elbaz (JIRA)
Shay Elbaz created SPARK-24904: -- Summary: Join with broadcasted dataframe causes shuffle of redundant data Key: SPARK-24904 URL: https://issues.apache.org/jira/browse/SPARK-24904 Project: Spark

[jira] [Commented] (SPARK-24288) Enable preventing predicate pushdown

2018-07-24 Thread Tomasz Gawęda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554109#comment-16554109 ] Tomasz Gawęda commented on SPARK-24288: --- [~smilegator] Yes, you are right. If we don't want to use

[jira] [Assigned] (SPARK-24901) Merge the codegen of RegularHashMap and fastHashMap to reduce compiler maxCodesize when VectorizedHashMap is false

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24901: Assignee: Apache Spark > Merge the codegen of RegularHashMap and fastHashMap to reduce

[jira] [Commented] (SPARK-24901) Merge the codegen of RegularHashMap and fastHashMap to reduce compiler maxCodesize when VectorizedHashMap is false

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554106#comment-16554106 ] Apache Spark commented on SPARK-24901: -- User 'heary-cao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24901) Merge the codegen of RegularHashMap and fastHashMap to reduce compiler maxCodesize when VectorizedHashMap is false

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24901: Assignee: (was: Apache Spark) > Merge the codegen of RegularHashMap and fastHashMap

[jira] [Updated] (SPARK-24902) Add integration tests for PVs

2018-07-24 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24902: Description: PVs and hostpath support has been added recently

[jira] [Updated] (SPARK-24903) Allow driver container name to be configurable in k8 spark

2018-07-24 Thread Yifei Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifei Huang updated SPARK-24903: Description: We'd like to expose the container name as a configurable value. In case it changes

[jira] [Created] (SPARK-24903) Allow driver container name to be configurable in k8 spark

2018-07-24 Thread Yifei Huang (JIRA)
Yifei Huang created SPARK-24903: --- Summary: Allow driver container name to be configurable in k8 spark Key: SPARK-24903 URL: https://issues.apache.org/jira/browse/SPARK-24903 Project: Spark

[jira] [Created] (SPARK-24902) Add integration tests for PVs

2018-07-24 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-24902: --- Summary: Add integration tests for PVs Key: SPARK-24902 URL: https://issues.apache.org/jira/browse/SPARK-24902 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-24434) Support user-specified driver and executor pod templates

2018-07-24 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554099#comment-16554099 ] Stavros Kontopoulos edited comment on SPARK-24434 at 7/24/18 11:06 AM:

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-07-24 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554099#comment-16554099 ] Stavros Kontopoulos commented on SPARK-24434: - [~liyinan926] Should we move with the

[jira] [Assigned] (SPARK-24900) speed up sort when the dataset is small

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24900: Assignee: Apache Spark > speed up sort when the dataset is small >

[jira] [Commented] (SPARK-24900) speed up sort when the dataset is small

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554094#comment-16554094 ] Apache Spark commented on SPARK-24900: -- User 'sddyljsx' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24900) speed up sort when the dataset is small

2018-07-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24900: Assignee: (was: Apache Spark) > speed up sort when the dataset is small >

[jira] [Created] (SPARK-24901) Merge the codegen of RegularHashMap and fastHashMap to reduce compiler maxCodesize when VectorizedHashMap is false

2018-07-24 Thread caoxuewen (JIRA)
caoxuewen created SPARK-24901: - Summary: Merge the codegen of RegularHashMap and fastHashMap to reduce compiler maxCodesize when VectorizedHashMap is false Key: SPARK-24901 URL:

[jira] [Created] (SPARK-24900) speed up sort when the dataset is small

2018-07-24 Thread SongXun (JIRA)
SongXun created SPARK-24900: --- Summary: speed up sort when the dataset is small Key: SPARK-24900 URL: https://issues.apache.org/jira/browse/SPARK-24900 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-24630) SPIP: Support SQLStreaming in Spark

2018-07-24 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554043#comment-16554043 ] Genmao Yu commented on SPARK-24630: --- {{Structured Streaming supports standard SQL as the batch
