[jira] [Commented] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451743#comment-16451743 ] yucai commented on SPARK-24076: --- 1. When shuffle.partition = 8192, tuples in the same parti

[jira] [Commented] (SPARK-20087) Include accumulators / taskMetrics when sending TaskKilled to onTaskEnd listeners

2018-04-24 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451732#comment-16451732 ] Xianjin YE commented on SPARK-20087: cc [~jiangxb1987] [~irashid], I am going to send

[jira] [Created] (SPARK-24082) [Spark SQL] Tables are not listing under DB

2018-04-24 Thread ABHISHEK KUMAR GUPTA (JIRA)
ABHISHEK KUMAR GUPTA created SPARK-24082: Summary: [Spark SQL] Tables are not listing under DB Key: SPARK-24082 URL: https://issues.apache.org/jira/browse/SPARK-24082 Project: Spark I

[jira] [Commented] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451728#comment-16451728 ] yucai commented on SPARK-24076: --- Root cause: very bad hash conflict in hashaggregate. !ima

[jira] [Updated] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-24076: -- Attachment: image-2018-04-25-14-29-39-958.png > very bad performance when shuffle.partition = 8192 > --

[jira] [Updated] (SPARK-24081) Spark SQL drops the table while writing into table in "overwrite" mode.

2018-04-24 Thread Ashish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish updated SPARK-24081: --- Priority: Blocker (was: Major) > Spark SQL drops the table while writing into table in "overwrite" mode. >

[jira] [Commented] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451727#comment-16451727 ] yucai commented on SPARK-24076: --- The query example: {code:sql} insert overwrite table targ

[jira] [Commented] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451699#comment-16451699 ] Nadav Samet commented on SPARK-24074: - Also, this problem doesn't occur with Spark 2.

[jira] [Commented] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451692#comment-16451692 ] Nadav Samet commented on SPARK-24074: - I think that breeze is fine: SBT is able to co

[jira] [Commented] (SPARK-19256) Hive bucketing support

2018-04-24 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451689#comment-16451689 ] Xianjin YE commented on SPARK-19256: Hi [~tejasp] [~cloud_fan], are you still working

[jira] [Updated] (SPARK-24009) spark2.3.0 INSERT OVERWRITE LOCAL DIRECTORY '/home/spark/aaaaab'

2018-04-24 Thread chris_j (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris_j updated SPARK-24009: Description: local mode spark execute "INSERT OVERWRITE LOCAL DIRECTORY " successfully. on yarn spark exe

[jira] [Commented] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451679#comment-16451679 ] Hyukjin Kwon commented on SPARK-24078: -- Would you be able to test this in higher ver

[jira] [Resolved] (SPARK-24077) Why spark SQL not support `CREATE TEMPORARY FUNCTION IF NOT EXISTS`?

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24077. -- Resolution: Invalid Fix Version/s: (was: 3.0.0) Target Version/s: (was: 2

[jira] [Commented] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451677#comment-16451677 ] Hyukjin Kwon commented on SPARK-24074: -- I haven't looked into this yet but doesn't t

[jira] [Commented] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)

2018-04-24 Thread Spark User (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451666#comment-16451666 ] Spark User commented on SPARK-5594: --- In my case, this issue was happening when spark con

[jira] [Created] (SPARK-24081) Spark SQL drops the table while writing into table in "overwrite" mode.

2018-04-24 Thread Ashish (JIRA)
Ashish created SPARK-24081: -- Summary: Spark SQL drops the table while writing into table in "overwrite" mode. Key: SPARK-24081 URL: https://issues.apache.org/jira/browse/SPARK-24081 Project: Spark

[jira] [Commented] (SPARK-24036) Stateful operators in continuous processing

2018-04-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451653#comment-16451653 ] Jungtaek Lim commented on SPARK-24036: -- Hello, I'm quite interested in this issue si

[jira] [Created] (SPARK-24080) Update the nullability of Filter output based on inferred predicates

2018-04-24 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-24080: Summary: Update the nullability of Filter output based on inferred predicates Key: SPARK-24080 URL: https://issues.apache.org/jira/browse/SPARK-24080 Project:

[jira] [Commented] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451650#comment-16451650 ] Apache Spark commented on SPARK-24079: -- User 'maropu' has created a pull request for

[jira] [Assigned] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24079: Assignee: (was: Apache Spark) > Update the nullability of Join output based on inferre

[jira] [Assigned] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24079: Assignee: Apache Spark > Update the nullability of Join output based on inferred predicate

[jira] [Created] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-24079: Summary: Update the nullability of Join output based on inferred predicates Key: SPARK-24079 URL: https://issues.apache.org/jira/browse/SPARK-24079 Project: S

[jira] [Assigned] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-24070: --- Assignee: Takeshi Yamamuro > TPC-DS Performance Tests for Parquet 1.10.0 Upgrade > -

[jira] [Updated] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangsongcheng updated SPARK-24078: --- Description: I try to sample the training sets with each category, and then union all samples t

[jira] [Commented] (SPARK-23799) [CBO] FilterEstimation.evaluateInSet produces devision by zero in a case of empty table with analyzed statistics

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451609#comment-16451609 ] Apache Spark commented on SPARK-23799: -- User 'gatorsmile' has created a pull request

[jira] [Updated] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangsongcheng updated SPARK-24078: --- Description: I try to sample the training sets with each category, and then union all samples t

[jira] [Updated] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangsongcheng updated SPARK-24078: --- Description: I try to sample the training sets with each category, and then union all samples t

[jira] [Created] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
zhangsongcheng created SPARK-24078: -- Summary: reduce with unionAll takes a long time Key: SPARK-24078 URL: https://issues.apache.org/jira/browse/SPARK-24078 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451540#comment-16451540 ] yucai commented on SPARK-24076: --- shuffle.partition = 8192 !p1.png! shuffle.partition = 80

[jira] [Updated] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-24076: -- Attachment: p2.png p1.png > very bad performance when shuffle.partition = 8192 > --

[jira] [Created] (SPARK-24077) Why spark SQL not support `CREATE TEMPORARY FUNCTION IF NOT EXISTS`?

2018-04-24 Thread Benedict Jin (JIRA)
Benedict Jin created SPARK-24077: Summary: Why spark SQL not support `CREATE TEMPORARY FUNCTION IF NOT EXISTS`? Key: SPARK-24077 URL: https://issues.apache.org/jira/browse/SPARK-24077 Project: Spark

[jira] [Created] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
yucai created SPARK-24076: - Summary: very bad performance when shuffle.partition = 8192 Key: SPARK-24076 URL: https://issues.apache.org/jira/browse/SPARK-24076 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-23821) High-order function: flatten(x) → array

2018-04-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-23821. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20938 [https://g

[jira] [Assigned] (SPARK-23821) High-order function: flatten(x) → array

2018-04-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-23821: - Assignee: Marek Novotny > High-order function: flatten(x) → array >

[jira] [Created] (SPARK-24075) [Mesos] Supervised driver upon failure will be retried indefinitely unless explicitly killed

2018-04-24 Thread Yogesh Natarajan (JIRA)
Yogesh Natarajan created SPARK-24075: Summary: [Mesos] Supervised driver upon failure will be retried indefinitely unless explicitly killed Key: SPARK-24075 URL: https://issues.apache.org/jira/browse/SPARK-240

[jira] [Commented] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451499#comment-16451499 ] Nadav Samet commented on SPARK-24074: - I was only able to reproduce this problem with

[jira] [Updated] (SPARK-24064) [Spark SQL] Create table using csv does not support binary column Type

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24064: - Target Version/s: (was: 2.3.1) Please avoid setting a target version which is usually set by a c

[jira] [Commented] (SPARK-24068) CSV schema inferring doesn't work for compressed files

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451483#comment-16451483 ] Hyukjin Kwon commented on SPARK-24068: -- Hm, [~maxgekk], btw is this specific to CSV

[jira] [Updated] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24074: - Priority: Major (was: Critical) Please avoid setting Critical+ which is usually reserved for comm

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451466#comment-16451466 ] Takeshi Yamamuro commented on SPARK-24070: -- ok > TPC-DS Performance Tests for P

[jira] [Resolved] (SPARK-24038) refactor continuous write exec to its own class

2018-04-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-24038. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21116 [https://g

[jira] [Assigned] (SPARK-24038) refactor continuous write exec to its own class

2018-04-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das reassigned SPARK-24038: - Assignee: Jose Torres > refactor continuous write exec to its own class > --

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451427#comment-16451427 ] Xiao Li commented on SPARK-24070: - Yeah, please do it here. Thanks! If you have the bandw

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451404#comment-16451404 ] Takeshi Yamamuro commented on SPARK-24070: -- ok, this ticket means we will put th

[jira] [Updated] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nadav Samet updated SPARK-24074: Environment: (was: {code:java} // code placeholder {code}) > Maven package resolver downloads j

[jira] [Updated] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nadav Samet updated SPARK-24074: Description: {code:java} // code placeholder {code} For some reason spark downloads a javadoc art

[jira] [Created] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
Nadav Samet created SPARK-24074: --- Summary: Maven package resolver downloads javadoc instead of jar Key: SPARK-24074 URL: https://issues.apache.org/jira/browse/SPARK-24074 Project: Spark Issue T

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-24 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451263#comment-16451263 ] Henry Robinson commented on SPARK-23852: Yes it has - the Parquet community are g

[jira] [Resolved] (SPARK-24056) Make consumer creation lazy in Kafka source for Structured streaming

2018-04-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-24056. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21134 [https://g

[jira] [Commented] (SPARK-24051) Incorrect results for certain queries using Java and Python APIs on Spark 2.3.0

2018-04-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451241#comment-16451241 ] Marco Gaido commented on SPARK-24051: - [~hvanhovell] I am not sure that the analysis

[jira] [Assigned] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-20114: - Assignee: Weichen Xu > spark.ml parity for sequential pattern mining - PrefixSpa

[jira] [Commented] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450449#comment-16450449 ] Apache Spark commented on SPARK-23654: -- User 'steveloughran' has created a pull requ

[jira] [Assigned] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23654: Assignee: Apache Spark > Cut jets3t as a dependency of spark-core > --

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20114: -- Shepherd: Joseph K. Bradley > spark.ml parity for sequential pattern mining - PrefixSpa

[jira] [Assigned] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23654: Assignee: (was: Apache Spark) > Cut jets3t as a dependency of spark-core > ---

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20114: -- Target Version/s: 2.4.0 > spark.ml parity for sequential pattern mining - PrefixSpan >

[jira] [Commented] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450443#comment-16450443 ] Apache Spark commented on SPARK-24073: -- User 'rdblue' has created a pull request for

[jira] [Assigned] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24073: Assignee: Apache Spark > DataSourceV2: Rename DataReaderFactory back to ReadTask. > --

[jira] [Assigned] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24073: Assignee: (was: Apache Spark) > DataSourceV2: Rename DataReaderFactory back to ReadTas

[jira] [Updated] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-23654: --- Summary: Cut jets3t as a dependency of spark-core (was: Cut jets3t as a dependency of spark-

[jira] [Created] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-24073: - Summary: DataSourceV2: Rename DataReaderFactory back to ReadTask. Key: SPARK-24073 URL: https://issues.apache.org/jira/browse/SPARK-24073 Project: Spark Issue Type

[jira] [Commented] (SPARK-24043) InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450426#comment-16450426 ] Apache Spark commented on SPARK-24043: -- User 'bersprockets' has created a pull reque

[jira] [Assigned] (SPARK-24043) InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24043: Assignee: Apache Spark > InterpretedPredicate.eval fails if expression tree contains Nonde

[jira] [Assigned] (SPARK-24043) InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24043: Assignee: (was: Apache Spark) > InterpretedPredicate.eval fails if expression tree con

[jira] [Updated] (SPARK-24072) clearly define pushed filters

2018-04-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24072: Summary: clearly define pushed filters (was: remove unused DataSourceV2Relation.pushedFilters) >

[jira] [Resolved] (SPARK-23990) Instruments logging improvements - ML regression package

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23990. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21078 [h

[jira] [Commented] (SPARK-24051) Incorrect results for certain queries using Java and Python APIs on Spark 2.3.0

2018-04-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450293#comment-16450293 ] Herman van Hovell commented on SPARK-24051: --- [~mgaido] do you have any idea why

[jira] [Resolved] (SPARK-23455) Default Params in ML should be saved separately

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23455. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20633 [h

[jira] [Assigned] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24072: Assignee: Wenchen Fan (was: Apache Spark) > remove unused DataSourceV2Relation.pushedFilt

[jira] [Assigned] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24072: Assignee: Apache Spark (was: Wenchen Fan) > remove unused DataSourceV2Relation.pushedFilt

[jira] [Commented] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450273#comment-16450273 ] Apache Spark commented on SPARK-24072: -- User 'cloud-fan' has created a pull request

[jira] [Created] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24072: --- Summary: remove unused DataSourceV2Relation.pushedFilters Key: SPARK-24072 URL: https://issues.apache.org/jira/browse/SPARK-24072 Project: Spark Issue Type: Im

[jira] [Commented] (SPARK-23933) High-order function: map(array, array) → map

2018-04-24 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450262#comment-16450262 ] Kazuaki Ishizaki commented on SPARK-23933: -- cc [~smilegator] > High-order funct

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450243#comment-16450243 ] Xiao Li commented on SPARK-24070: - cc [~maropu] > TPC-DS Performance Tests for Parquet 1

[jira] [Created] (SPARK-24071) Micro-benchmark of Parquet Filter Pushdown

2018-04-24 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24071: --- Summary: Micro-benchmark of Parquet Filter Pushdown Key: SPARK-24071 URL: https://issues.apache.org/jira/browse/SPARK-24071 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24070: --- Summary: TPC-DS Performance Tests for Parquet 1.10.0 Upgrade Key: SPARK-24070 URL: https://issues.apache.org/jira/browse/SPARK-24070 Project: Spark Issue Type: Sub-tas

[jira] [Resolved] (SPARK-23807) Add Hadoop 3 profile with relevant POM fix ups

2018-04-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23807. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20923 [https:/

[jira] [Assigned] (SPARK-23807) Add Hadoop 3 profile with relevant POM fix ups

2018-04-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23807: -- Assignee: Steve Loughran > Add Hadoop 3 profile with relevant POM fix ups > --

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Julien Cuquemelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450186#comment-16450186 ] Julien Cuquemelle commented on SPARK-22683: --- Thanks for all your comments and p

[jira] [Resolved] (SPARK-24052) Support spark version showing on environment page

2018-04-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-24052. --- Resolution: Not A Problem > Support spark version showing on environment page > -

[jira] [Commented] (SPARK-23975) Allow Clustering to take Arrays of Double as input features

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450161#comment-16450161 ] Joseph K. Bradley commented on SPARK-23975: --- I merged https://github.com/apache

[jira] [Assigned] (SPARK-23975) Allow Clustering to take Arrays of Double as input features

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-23975: - Assignee: Lu Wang > Allow Clustering to take Arrays of Double as input features

[jira] [Assigned] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24069: Assignee: (was: Apache Spark) > Add array_max / array_min functions >

[jira] [Commented] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450142#comment-16450142 ] Apache Spark commented on SPARK-24069: -- User 'HyukjinKwon' has created a pull reques

[jira] [Assigned] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24069: Assignee: Apache Spark > Add array_max / array_min functions > ---

[jira] [Created] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-24069: Summary: Add array_max / array_min functions Key: SPARK-24069 URL: https://issues.apache.org/jira/browse/SPARK-24069 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-22683: - Assignee: Julien Cuquemelle > DynamicAllocation wastes resources by allocating container

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450131#comment-16450131 ] Thomas Graves commented on SPARK-22683: --- Note this added a new config spark.dynamic

[jira] [Resolved] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-22683. --- Resolution: Fixed Fix Version/s: 2.4.0 > DynamicAllocation wastes resources by allocat

[jira] [Commented] (SPARK-24000) S3A: Create Table should fail on invalid AK/SK

2018-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450024#comment-16450024 ] Steve Loughran commented on SPARK-24000: We could consider whether or not to rais

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-24 Thread Eric Maynard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450018#comment-16450018 ] Eric Maynard commented on SPARK-23852: -- >There is no upstream release

[jira] [Commented] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-04-24 Thread Eric Maynard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449948#comment-16449948 ] Eric Maynard commented on SPARK-23519: -- Why is the fact that you dynamically generat

[jira] [Comment Edited] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-04-24 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449935#comment-16449935 ] Li Jin edited comment on SPARK-22947 at 4/24/18 2:16 PM: - I came

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-04-24 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449935#comment-16449935 ] Li Jin commented on SPARK-22947: I came across this blog today: [https://databricks.com/

[jira] [Created] (SPARK-24068) CSV schema inferring doesn't work for compressed files

2018-04-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24068: -- Summary: CSV schema inferring doesn't work for compressed files Key: SPARK-24068 URL: https://issues.apache.org/jira/browse/SPARK-24068 Project: Spark Issue Type

[jira] [Updated] (SPARK-23182) Allow enabling of TCP keep alive for master RPC connections

2018-04-24 Thread Petar Petrov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Petar Petrov updated SPARK-23182: - Affects Version/s: 2.2.2 > Allow enabling of TCP keep alive for master RPC connections >

[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449857#comment-16449857 ] Steve Loughran commented on SPARK-18673: It's a big hive patch, but most of it is

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Description: SPARK-17147 fixes a problem with non-consecutive Kafka Offsets. The  [PR w|http

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Affects Version/s: (was: 2.0.0) 2.3.0 > Backport SPARK-17147 to 2.

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Fix Version/s: (was: 2.4.0) > Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Con
