[jira] [Assigned] (SPARK-28227) Spark can’t support TRANSFORM with aggregation

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28227: Assignee: (was: Apache Spark) > Spark can’t support TRANSFORM with aggregation >

[jira] [Assigned] (SPARK-28227) Spark can’t support TRANSFORM with aggregation

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28227: Assignee: Apache Spark > Spark can’t support TRANSFORM with aggregation >

[jira] [Created] (SPARK-28227) Spark can’t support TRANSFORM with aggregation

2019-07-01 Thread angerszhu (JIRA)
angerszhu created SPARK-28227: Summary: Spark can’t support TRANSFORM with aggregation Key: SPARK-28227 URL: https://issues.apache.org/jira/browse/SPARK-28227 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23539) Add support for Kafka headers in Structured Streaming

2019-07-01 Thread Thiru Paramasivan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876655#comment-16876655 ] Thiru Paramasivan commented on SPARK-23539: +1 Seems like the PR is close to getting merged.

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876650#comment-16876650 ] Takeshi Yamamuro commented on SPARK-28186: I also think this is the right behaviour as Marco

[jira] [Comment Edited] (SPARK-28181) Add a filter interface to KVStore to speed up the entities retrieve

2019-07-01 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876645#comment-16876645 ] Lantao Jin edited comment on SPARK-28181 at 7/2/19 4:30 AM: It's basically

[jira] [Updated] (SPARK-28226) Rename Pandas UDF SCALAR_ITER to MAP_ITER and fix documentation

2019-07-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28226: Summary: Rename Pandas UDF SCALAR_ITER to MAP_ITER and fix documentation (was: Document

[jira] [Commented] (SPARK-28181) Add a filter interface to KVStore to speed up the entities retrieve

2019-07-01 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876645#comment-16876645 ] Lantao Jin commented on SPARK-28181: It's basically what KVUtils.viewToSeq does > Add a filter

[jira] [Resolved] (SPARK-28181) Add a filter interface to KVStore to speed up the entities retrieve

2019-07-01 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin resolved SPARK-28181. Resolution: Invalid > Add a filter interface to KVStore to speed up the entities retrieve >

[jira] [Assigned] (SPARK-28226) Document MAP_ITER Pandas UDF

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28226: Assignee: (was: Apache Spark) > Document MAP_ITER Pandas UDF >

[jira] [Assigned] (SPARK-28226) Document MAP_ITER Pandas UDF

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28226: Assignee: Apache Spark > Document MAP_ITER Pandas UDF >

[jira] [Assigned] (SPARK-28202) [Core] [Test] Avoid noises of system props in SparkConfSuite

2019-07-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao reassigned SPARK-28202: Assignee: ShuMing Li > [Core] [Test] Avoid noises of system props in SparkConfSuite >

[jira] [Resolved] (SPARK-28202) [Core] [Test] Avoid noises of system props in SparkConfSuite

2019-07-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao resolved SPARK-28202. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24998

[jira] [Created] (SPARK-28226) Document MAP_ITER Pandas UDF

2019-07-01 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-28226: Summary: Document MAP_ITER Pandas UDF Key: SPARK-28226 URL: https://issues.apache.org/jira/browse/SPARK-28226 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-28226) Document MAP_ITER Pandas UDF

2019-07-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28226: Description: {{mapPartitionsInPandas}} was added as of SPARK-28198. Now the name

[jira] [Resolved] (SPARK-28198) Add mapPartitionsInPandas to allow an iterator of DataFrames

2019-07-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-28198. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24997

[jira] [Assigned] (SPARK-28198) Add mapPartitionsInPandas to allow an iterator of DataFrames

2019-07-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-28198: Assignee: Hyukjin Kwon > Add mapPartitionsInPandas to allow an iterator of DataFrames >

[jira] [Resolved] (SPARK-23098) Migrate Kafka batch source to v2

2019-07-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23098. Resolution: Fixed Assignee: Gabor Somogyi Fix Version/s: 3.0.0 > Migrate Kafka

[jira] [Assigned] (SPARK-27296) User Defined Aggregating Functions (UDAFs) have a major efficiency problem

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27296: Assignee: Apache Spark > User Defined Aggregating Functions (UDAFs) have a major

[jira] [Assigned] (SPARK-27296) User Defined Aggregating Functions (UDAFs) have a major efficiency problem

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-27296: Assignee: (was: Apache Spark) > User Defined Aggregating Functions (UDAFs) have a

[jira] [Created] (SPARK-28225) Unexpected behavior for Window functions

2019-07-01 Thread Andrew Leverentz (JIRA)
Andrew Leverentz created SPARK-28225: Summary: Unexpected behavior for Window functions Key: SPARK-28225 URL: https://issues.apache.org/jira/browse/SPARK-28225 Project: Spark Issue Type:

[jira] [Commented] (SPARK-28224) Sum aggregation returns null on overflow decimals

2019-07-01 Thread Mick Jermsurawong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876565#comment-16876565 ] Mick Jermsurawong commented on SPARK-28224: To reproduce this:

[jira] [Created] (SPARK-28224) Sum aggregation returns null on overflow decimals

2019-07-01 Thread Mick Jermsurawong (JIRA)
Mick Jermsurawong created SPARK-28224: Summary: Sum aggregation returns null on overflow decimals Key: SPARK-28224 URL: https://issues.apache.org/jira/browse/SPARK-28224 Project: Spark

[jira] [Assigned] (SPARK-28223) stream-stream joins should fail unsupported checker in update mode

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28223: Assignee: (was: Apache Spark) > stream-stream joins should fail unsupported checker

[jira] [Assigned] (SPARK-28223) stream-stream joins should fail unsupported checker in update mode

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28223: Assignee: Apache Spark > stream-stream joins should fail unsupported checker in update

[jira] [Created] (SPARK-28223) stream-stream joins should fail unsupported checker in update mode

2019-07-01 Thread Jose Torres (JIRA)
Jose Torres created SPARK-28223: Summary: stream-stream joins should fail unsupported checker in update mode Key: SPARK-28223 URL: https://issues.apache.org/jira/browse/SPARK-28223 Project: Spark

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876509#comment-16876509 ] Marco Gaido commented on SPARK-28186: You're right with that. The equivalent in Postgres is

[jira] [Commented] (SPARK-27977) MicroBatchWriter should use StreamWriter for human-friendly textual representation (toString)

2019-07-01 Thread Alessandro D'Armiento (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876478#comment-16876478 ] Alessandro D'Armiento commented on SPARK-27977: {{MicroBatchWriter}} implementation has

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Alex Kushnir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876386#comment-16876386 ] Alex Kushnir commented on SPARK-28186: I'm porting a Hive workload to Spark. It works in Hive as

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876357#comment-16876357 ] Marco Gaido commented on SPARK-28186: Do you know of any SQL DB with the behavior you are

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Alex Kushnir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876304#comment-16876304 ] Alex Kushnir commented on SPARK-28186: because array ["a","b",null,"c"] clearly does not contain
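
The behaviour debated in the SPARK-28186 thread (null rather than false when the value is absent but the array holds a null) follows SQL three-valued logic. A minimal plain-Python sketch of that rule is below; it is a model for illustration only, not Spark's implementation, and the function name array_contains_3vl is hypothetical, with Python's None standing in for SQL NULL.

```python
from typing import Optional, Sequence


def array_contains_3vl(
    arr: Optional[Sequence[Optional[str]]], value: Optional[str]
) -> Optional[bool]:
    """Illustrative model of a membership test under SQL three-valued logic.

    Hypothetical helper, not Spark code. None stands in for SQL NULL.
    """
    # A NULL array or a NULL search value gives an unknown (NULL) result.
    if arr is None or value is None:
        return None
    saw_null = False
    for item in arr:
        if item is None:
            # A NULL element might or might not equal the value: unknown.
            saw_null = True
        elif item == value:
            # A definite match wins regardless of any NULL elements.
            return True
    # No match: the answer is unknown if a NULL was seen, otherwise false.
    return None if saw_null else False


print(array_contains_3vl(["a", "b", None, "c"], "a"))  # True
print(array_contains_3vl(["a", "b", None, "c"], "d"))  # None
print(array_contains_3vl(["a", "b", "c"], "d"))        # False
```

Under this model, the reporter's expectation of false corresponds to treating a null element as a known non-match, whereas the three-valued reading treats it as an unknown value that could equal the one being searched for.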

[jira] [Commented] (SPARK-23153) Support application dependencies in submission client's local file system

2019-07-01 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876192#comment-16876192 ] Stavros Kontopoulos commented on SPARK-23153: [~cloud_fan] is there going to be another

[jira] [Comment Edited] (SPARK-23153) Support application dependencies in submission client's local file system

2019-07-01 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876192#comment-16876192 ] Stavros Kontopoulos edited comment on SPARK-23153 at 7/1/19 1:41 PM:

[jira] [Created] (SPARK-28222) Feature importance outputs different values in GBT and Random Forest in 2.3.3 and 2.4 pyspark version

2019-07-01 Thread eneriwrt (JIRA)
eneriwrt created SPARK-28222: Summary: Feature importance outputs different values in GBT and Random Forest in 2.3.3 and 2.4 pyspark version Key: SPARK-28222 URL: https://issues.apache.org/jira/browse/SPARK-28222

[jira] [Commented] (SPARK-28090) Spark hangs when an execution plan has many projections on nested structs

2019-07-01 Thread Iskender Unlu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876150#comment-16876150 ] Iskender Unlu commented on SPARK-28090: I will try to work on this issue as my first contribution

[jira] [Issue Comment Deleted] (SPARK-28090) Spark hangs when an execution plan has many projections on nested structs

2019-07-01 Thread Iskender Unlu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iskender Unlu updated SPARK-28090: Comment: was deleted (was: I will work on this issue.) > Spark hangs when an execution plan

[jira] [Commented] (SPARK-28090) Spark hangs when an execution plan has many projections on nested structs

2019-07-01 Thread Iskender Unlu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876145#comment-16876145 ] Iskender Unlu commented on SPARK-28090: I will work on this issue. > Spark hangs when an

[jira] [Commented] (SPARK-25299) Use remote storage for persisting shuffle data

2019-07-01 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876070#comment-16876070 ] Saisai Shao commented on SPARK-25299: Better to post a pdf version [~mcheah] :). > Use remote

[jira] [Commented] (SPARK-24695) Unable to return calendar interval from udf

2019-07-01 Thread Priyanka Garg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876063#comment-16876063 ] Priyanka Garg commented on SPARK-24695: The above PR has been closed and a new PR has been

[jira] [Assigned] (SPARK-28221) Upgrade janino to 3.0.13

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28221: Assignee: (was: Apache Spark) > Upgrade janino to 3.0.13 >

[jira] [Assigned] (SPARK-28221) Upgrade janino to 3.0.13

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28221: Assignee: Apache Spark > Upgrade janino to 3.0.13 >

[jira] [Assigned] (SPARK-28220) join foldable condition not pushed down when parent filter is totally pushed down

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28220: Assignee: (was: Apache Spark) > join foldable condition not pushed down when parent

[jira] [Assigned] (SPARK-28220) join foldable condition not pushed down when parent filter is totally pushed down

2019-07-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28220: Assignee: Apache Spark > join foldable condition not pushed down when parent filter is

[jira] [Created] (SPARK-28221) Upgrade janino to 3.0.13

2019-07-01 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-28221: Summary: Upgrade janino to 3.0.13 Key: SPARK-28221 URL: https://issues.apache.org/jira/browse/SPARK-28221 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-27466) LEAD function with 'ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING' causes exception in Spark

2019-07-01 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875992#comment-16875992 ] Herman van Hovell commented on SPARK-27466: --- Well, we could add it. I am just not sure what

[jira] [Resolved] (SPARK-28205) useV1SourceList configuration should be for all data sources

2019-07-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-28205. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 25004

[jira] [Assigned] (SPARK-28205) useV1SourceList configuration should be for all data sources

2019-07-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-28205: Assignee: Gengliang Wang > useV1SourceList configuration should be for all data sources >

[jira] [Created] (SPARK-28220) join foldable condition not pushed down when parent filter is totally pushed down

2019-07-01 Thread liupengcheng (JIRA)
liupengcheng created SPARK-28220: Summary: join foldable condition not pushed down when parent filter is totally pushed down Key: SPARK-28220 URL: https://issues.apache.org/jira/browse/SPARK-28220