[jira] [Created] (SPARK-24996) Use DSL to simplify DeclarativeAggregate

2018-08-01 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24996: --- Summary: Use DSL to simplify DeclarativeAggregate Key: SPARK-24996 URL: https://issues.apache.org/jira/browse/SPARK-24996 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-23698) Spark code contains numerous undefined names in Python 3

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566372#comment-16566372 ] Apache Spark commented on SPARK-23698: -- User 'cclauss' has created a pull request for this issue:

[jira] [Commented] (SPARK-23698) Spark code contains numerous undefined names in Python 3

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566356#comment-16566356 ] Apache Spark commented on SPARK-23698: -- User 'cclauss' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24994) When the data type of the field is converted to other types, it can also support pushdown to parquet

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24994: Assignee: Apache Spark > When the data type of the field is converted to other types, it

[jira] [Assigned] (SPARK-24994) When the data type of the field is converted to other types, it can also support pushdown to parquet

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24994: Assignee: (was: Apache Spark) > When the data type of the field is converted to

[jira] [Commented] (SPARK-24994) When the data type of the field is converted to other types, it can also support pushdown to parquet

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566336#comment-16566336 ] Apache Spark commented on SPARK-24994: -- User '10110346' has created a pull request for this issue:

[jira] [Created] (SPARK-24995) Flaky tests: FlatMapGroupsWithStateSuite.flatMapGroupsWithState - streaming with processing time timeout

2018-08-01 Thread Jungtaek Lim (JIRA)
Jungtaek Lim created SPARK-24995: Summary: Flaky tests: FlatMapGroupsWithStateSuite.flatMapGroupsWithState - streaming with processing time timeout Key: SPARK-24995 URL:

[jira] [Commented] (SPARK-23742) Filter out redundant AssociationRules

2018-08-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566326#comment-16566326 ] yuhao yang commented on SPARK-23742: [~maropu] Can you be more specific about the suggestion? E.g.

[jira] [Created] (SPARK-24994) When the data type of the field is converted to other types, it can also support pushdown to parquet

2018-08-01 Thread liuxian (JIRA)
liuxian created SPARK-24994: --- Summary: When the data type of the field is converted to other types, it can also support pushdown to parquet Key: SPARK-24994 URL: https://issues.apache.org/jira/browse/SPARK-24994

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 2.0.0

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566300#comment-16566300 ] Apache Spark commented on SPARK-18057: -- User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-23908) High-order function: transform(array, function) → array

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566293#comment-16566293 ] Apache Spark commented on SPARK-23908: -- User 'ueshin' has created a pull request for this issue:

[jira] [Commented] (SPARK-24817) Implement BarrierTaskContext.barrier()

2018-08-01 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566278#comment-16566278 ] Jiang Xingbo commented on SPARK-24817: -- Actually the current implementation of _barrier_ function

[jira] [Commented] (SPARK-23742) Filter out redundant AssociationRules

2018-08-01 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566266#comment-16566266 ] Takeshi Yamamuro commented on SPARK-23742: -- Can't we control this case by a new config

[jira] [Assigned] (SPARK-24992) spark should randomize yarn local dir selection

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24992: Assignee: Apache Spark > spark should randomize yarn local dir selection >

[jira] [Assigned] (SPARK-24992) spark should randomize yarn local dir selection

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24992: Assignee: (was: Apache Spark) > spark should randomize yarn local dir selection >

[jira] [Commented] (SPARK-24992) spark should randomize yarn local dir selection

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566249#comment-16566249 ] Apache Spark commented on SPARK-24992: -- User 'hthuynh2' has created a pull request for this issue:

[jira] [Commented] (SPARK-24993) Make Avro fast again

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566222#comment-16566222 ] Apache Spark commented on SPARK-24993: -- User 'dbtsai' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24993) Make Avro fast again

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24993: Assignee: Apache Spark > Make Avro fast again > > >

[jira] [Commented] (SPARK-24957) Decimal arithmetic can lead to wrong values using codegen

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566177#comment-16566177 ] Apache Spark commented on SPARK-24957: -- User 'gatorsmile' has created a pull request for this

[jira] [Commented] (SPARK-24817) Implement BarrierTaskContext.barrier()

2018-08-01 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566159#comment-16566159 ] Erik Erlandson commented on SPARK-24817: I'm curious about what the {{barrier}} invocations

[jira] [Commented] (SPARK-24580) List scenarios to be handled by barrier execution mode properly

2018-08-01 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566154#comment-16566154 ] Erik Erlandson commented on SPARK-24580: This is blocking SPARK-24582 which is marked as

[jira] [Updated] (SPARK-24957) Decimal arithmetic can lead to wrong values using codegen

2018-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24957: Fix Version/s: 2.2.3 > Decimal arithmetic can lead to wrong values using codegen >

[jira] [Assigned] (SPARK-24914) totalSize is not a good estimate for broadcast joins

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24914: Assignee: Apache Spark > totalSize is not a good estimate for broadcast joins >

[jira] [Commented] (SPARK-24914) totalSize is not a good estimate for broadcast joins

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566105#comment-16566105 ] Apache Spark commented on SPARK-24914: -- User 'bersprockets' has created a pull request for this

[jira] [Assigned] (SPARK-24914) totalSize is not a good estimate for broadcast joins

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24914: Assignee: (was: Apache Spark) > totalSize is not a good estimate for broadcast joins

[jira] [Commented] (SPARK-24912) Broadcast join OutOfMemory stack trace obscures actual cause of OOM

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566097#comment-16566097 ] Apache Spark commented on SPARK-24912: -- User 'bersprockets' has created a pull request for this

[jira] [Created] (SPARK-24992) spark should randomize yarn local dir selection

2018-08-01 Thread Hieu Tri Huynh (JIRA)
Hieu Tri Huynh created SPARK-24992: -- Summary: spark should randomize yarn local dir selection Key: SPARK-24992 URL: https://issues.apache.org/jira/browse/SPARK-24992 Project: Spark Issue

[jira] [Resolved] (SPARK-24960) k8s: explicitly expose ports on driver container

2018-08-01 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24960. Resolution: Fixed Fix Version/s: 2.4.0 > k8s: explicitly expose ports on driver container

[jira] [Resolved] (SPARK-24937) Datasource partition table should load empty static partitions

2018-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24937. - Resolution: Fixed Assignee: Yuming Wang Fix Version/s: 2.4.0 > Datasource partition

[jira] [Commented] (SPARK-24957) Decimal arithmetic can lead to wrong values using codegen

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565861#comment-16565861 ] Apache Spark commented on SPARK-24957: -- User 'mgaido91' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-08-01 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565780#comment-16565780 ] shane knapp edited comment on SPARK-24980 at 8/1/18 7:14 PM: - alright,

[jira] [Comment Edited] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-08-01 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565780#comment-16565780 ] shane knapp edited comment on SPARK-24980 at 8/1/18 7:14 PM: - alright,

[jira] [Comment Edited] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-08-01 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565780#comment-16565780 ] shane knapp edited comment on SPARK-24980 at 8/1/18 7:13 PM: - alright,

[jira] [Assigned] (SPARK-24991) use InternalRow in DataSourceWriter

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24991: Assignee: Apache Spark (was: Wenchen Fan) > use InternalRow in DataSourceWriter >

[jira] [Assigned] (SPARK-24991) use InternalRow in DataSourceWriter

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24991: Assignee: Wenchen Fan (was: Apache Spark) > use InternalRow in DataSourceWriter >

[jira] [Commented] (SPARK-24991) use InternalRow in DataSourceWriter

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565837#comment-16565837 ] Apache Spark commented on SPARK-24991: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Created] (SPARK-24991) use InternalRow in DataSourceWriter

2018-08-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24991: --- Summary: use InternalRow in DataSourceWriter Key: SPARK-24991 URL: https://issues.apache.org/jira/browse/SPARK-24991 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-23915) High-order function: array_except(x, y) → array

2018-08-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23915: --- Assignee: Kazuaki Ishizaki > High-order function: array_except(x, y) → array >

[jira] [Resolved] (SPARK-23915) High-order function: array_except(x, y) → array

2018-08-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23915. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21103

[jira] [Commented] (SPARK-19602) Unable to query using the fully qualified column name of the form ( <dbname>.<tablename>.<columnname>)

2018-08-01 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565802#comment-16565802 ] Sunitha Kambhampati commented on SPARK-19602: - The design doc is also uploaded

[jira] [Commented] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-08-01 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565780#comment-16565780 ] shane knapp commented on SPARK-24980: - alright, pandas 0.19.2 and pyarrow 0.8.0 are installed for

[jira] [Assigned] (SPARK-24632) Allow 3rd-party libraries to use pyspark.ml abstractions for Java wrappers for persistence

2018-08-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-24632: - Assignee: (was: Joseph K. Bradley) > Allow 3rd-party libraries to use

[jira] [Commented] (SPARK-24632) Allow 3rd-party libraries to use pyspark.ml abstractions for Java wrappers for persistence

2018-08-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565769#comment-16565769 ] Joseph K. Bradley commented on SPARK-24632: --- I'm unassigning myself since I don't have time to

[jira] [Commented] (SPARK-24632) Allow 3rd-party libraries to use pyspark.ml abstractions for Java wrappers for persistence

2018-08-01 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565767#comment-16565767 ] Joseph K. Bradley commented on SPARK-24632: --- That's a good point. Let's do it your way. : )

[jira] [Commented] (SPARK-17861) Store data source partitions in metastore and push partition pruning into metastore

2018-08-01 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565749#comment-16565749 ] nirav patel commented on SPARK-17861: - [~rxin] can this also be supported via dataframe? so

[jira] [Commented] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-08-01 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565728#comment-16565728 ] shane knapp commented on SPARK-23874: - i've got a PR ready to go on my end for our ansible to deploy

[jira] [Assigned] (SPARK-24990) merge ReadSupport and ReadSupportWithSchema

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24990: Assignee: Wenchen Fan (was: Apache Spark) > merge ReadSupport and ReadSupportWithSchema

[jira] [Commented] (SPARK-24990) merge ReadSupport and ReadSupportWithSchema

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565688#comment-16565688 ] Apache Spark commented on SPARK-24990: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24990) merge ReadSupport and ReadSupportWithSchema

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24990: Assignee: Apache Spark (was: Wenchen Fan) > merge ReadSupport and ReadSupportWithSchema

[jira] [Created] (SPARK-24990) merge ReadSupport and ReadSupportWithSchema

2018-08-01 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24990: --- Summary: merge ReadSupport and ReadSupportWithSchema Key: SPARK-24990 URL: https://issues.apache.org/jira/browse/SPARK-24990 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24989: Assignee: (was: Apache Spark) > BlockFetcher should retry while getting

[jira] [Assigned] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24989: Assignee: Apache Spark > BlockFetcher should retry while getting OutOfDirectMemoryError

[jira] [Commented] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565577#comment-16565577 ] Apache Spark commented on SPARK-24989: -- User 'xuanyuanking' has created a pull request for this

[jira] [Updated] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-01 Thread Li Yuanjian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Yuanjian updated SPARK-24989: Attachment: FailedStage.png > BlockFetcher should retry while getting OutOfDirectMemoryError >

[jira] [Created] (SPARK-24989) BlockFetcher should retry while getting OutOfDirectMemoryError

2018-08-01 Thread Li Yuanjian (JIRA)
Li Yuanjian created SPARK-24989: --- Summary: BlockFetcher should retry while getting OutOfDirectMemoryError Key: SPARK-24989 URL: https://issues.apache.org/jira/browse/SPARK-24989 Project: Spark

[jira] [Commented] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-08-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565513#comment-16565513 ] Thomas Graves commented on SPARK-24909: --- looking more I think the fix may actually just be to

[jira] [Commented] (SPARK-24988) Add a castBySchema method which casts all the values of a DataFrame based on the DataTypes of a StructType

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565457#comment-16565457 ] Apache Spark commented on SPARK-24988: -- User 'mahmoudmahdi24' has created a pull request for this

[jira] [Assigned] (SPARK-24988) Add a castBySchema method which casts all the values of a DataFrame based on the DataTypes of a StructType

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24988: Assignee: Apache Spark > Add a castBySchema method which casts all the values of a

[jira] [Assigned] (SPARK-24988) Add a castBySchema method which casts all the values of a DataFrame based on the DataTypes of a StructType

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24988: Assignee: (was: Apache Spark) > Add a castBySchema method which casts all the values

[jira] [Commented] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-08-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565437#comment-16565437 ] Thomas Graves commented on SPARK-24909: --- this is unfortunately not a straightforward fix, the

[jira] [Commented] (SPARK-24795) Implement barrier execution mode

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565423#comment-16565423 ] Apache Spark commented on SPARK-24795: -- User 'jiangxb1987' has created a pull request for this

[jira] [Commented] (SPARK-24821) Fail fast when submitted job compute on a subset of all the partitions for a barrier stage

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565377#comment-16565377 ] Apache Spark commented on SPARK-24821: -- User 'jiangxb1987' has created a pull request for this

[jira] [Commented] (SPARK-24988) Add a castBySchema method which casts all the values of a DataFrame based on the DataTypes of a StructType

2018-08-01 Thread mahmoud mehdi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565358#comment-16565358 ] mahmoud mehdi commented on SPARK-24988: --- I am working on it. > Add a castBySchema method which

[jira] [Created] (SPARK-24988) Add a castBySchema method which casts all the values of a DataFrame based on the DataTypes of a StructType

2018-08-01 Thread mahmoud mehdi (JIRA)
mahmoud mehdi created SPARK-24988: - Summary: Add a castBySchema method which casts all the values of a DataFrame based on the DataTypes of a StructType Key: SPARK-24988 URL:

[jira] [Resolved] (SPARK-24971) remove SupportsDeprecatedScanRow

2018-08-01 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24971. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21921

[jira] [Commented] (SPARK-24986) OOM in BufferHolder during writes to a stream

2018-08-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565296#comment-16565296 ] Thomas Graves commented on SPARK-24986: --- fyi [~irashid] I know you were looking at memory related

[jira] [Commented] (SPARK-24630) SPIP: Support SQLStreaming in Spark

2018-08-01 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565252#comment-16565252 ] Genmao Yu commented on SPARK-24630: --- [~Jackey Lee] Pretty good! We also have the SQL Streaming

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Shepherd: Shixiong Zhu (was: Tathagata Das) > Kafka Cached Consumer Leaking File

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Shepherd: Tathagata Das (was: Shixiong Zhu) > Kafka Cached Consumer Leaking File

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Summary: Kafka Cached Consumer Leaking File Descriptors (was: Kafka Cached Consumer

[jira] [Comment Edited] (SPARK-4300) Race condition during SparkWorker shutdown

2018-08-01 Thread liqingan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565104#comment-16565104 ] liqingan edited comment on SPARK-4300 at 8/1/18 10:35 AM: -- i feel upset for this

[jira] [Comment Edited] (SPARK-4300) Race condition during SparkWorker shutdown

2018-08-01 Thread liqingan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565104#comment-16565104 ] liqingan edited comment on SPARK-4300 at 8/1/18 10:29 AM: -- i feel upset for this

[jira] [Commented] (SPARK-4300) Race condition during SparkWorker shutdown

2018-08-01 Thread liqingan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565104#comment-16565104 ] liqingan commented on SPARK-4300: - i feel upset for this issue !

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Description: Setup: * Spark 2.3.1 * Java 1.8.0 (112) * Standalone Cluster Manager * 3

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Description: Spark 2.3.0 introduced a new mechanism for caching Kafka consumers

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Description: Spark 2.3.0 introduced a new mechanism for caching Kafka consumers

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Shepherd: Tathagata Das > Kafka Cached Consumer Leaking Consumers >

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Description: Spark 2.3.0 introduced a new mechanism for caching Kafka consumers

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Environment: Spark 2.3.1 Java(TM) SE Runtime Environment (build 1.8.0_112-b15) Java

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Description: Spark 2.3.0 introduced a new mechanism for caching Kafka consumers

[jira] [Updated] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuval Itzchakov updated SPARK-24987: Environment: Spark 2.3.1 Java(TM) SE Runtime Environment (build 1.8.0_112-b15) Java

[jira] [Created] (SPARK-24987) Kafka Cached Consumer Leaking Consumers

2018-08-01 Thread Yuval Itzchakov (JIRA)
Yuval Itzchakov created SPARK-24987: --- Summary: Kafka Cached Consumer Leaking Consumers Key: SPARK-24987 URL: https://issues.apache.org/jira/browse/SPARK-24987 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-24283) Make standard scaler work without legacy MLlib

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24283: Assignee: (was: Apache Spark) > Make standard scaler work without legacy MLlib >

[jira] [Commented] (SPARK-24283) Make standard scaler work without legacy MLlib

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565020#comment-16565020 ] Apache Spark commented on SPARK-24283: -- User 'sujithjay' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24283) Make standard scaler work without legacy MLlib

2018-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24283: Assignee: Apache Spark > Make standard scaler work without legacy MLlib >

[jira] [Assigned] (SPARK-24653) Flaky test "JoinSuite.test SortMergeJoin (with spill)"

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-24653: Assignee: Marcelo Vanzin > Flaky test "JoinSuite.test SortMergeJoin (with spill)" >

[jira] [Resolved] (SPARK-24653) Flaky test "JoinSuite.test SortMergeJoin (with spill)"

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24653. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21639

[jira] [Commented] (SPARK-13346) Using DataFrames iteratively leads to slow query planning

2018-08-01 Thread Izek Greenfield (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564885#comment-16564885 ] Izek Greenfield commented on SPARK-13346: - What is the status of that? We face this issue too! >

[jira] [Resolved] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24982. - Resolution: Fixed Fix Version/s: 2.4.0 > UDAF resolution should not throw

[jira] [Resolved] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2018-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21274. - Resolution: Fixed Assignee: Dilip Biswal Fix Version/s: 2.4.0 > Implement EXCEPT ALL

[jira] [Commented] (SPARK-23742) Filter out redundant AssociationRules

2018-08-01 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564858#comment-16564858 ] yuhao yang commented on SPARK-23742: The redundant rule may have different confidence and support.

[jira] [Commented] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564813#comment-16564813 ] Hyukjin Kwon commented on SPARK-24980: -- Oh BTW, [~bryanc], do you remember if PyArrow is

[jira] [Comment Edited] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564813#comment-16564813 ] Hyukjin Kwon edited comment on SPARK-24980 at 8/1/18 6:37 AM: -- Oh BTW,

[jira] [Commented] (SPARK-24980) add support for pandas/arrow etc for python2.7 and pypy builds

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564808#comment-16564808 ] Hyukjin Kwon commented on SPARK-24980: -- Thank you for cc'ing me Shane! > add support for

[jira] [Updated] (SPARK-24984) Spark Streaming with xml data

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24984: - Priority: Major (was: Critical) > Spark Streaming with xml data >

[jira] [Resolved] (SPARK-24984) Spark Streaming with xml data

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24984. -- Resolution: Not A Problem It's something the package should implement and support. Please

[jira] [Updated] (SPARK-24984) Spark Streaming with xml data

2018-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24984: - Fix Version/s: (was: 0.8.2) > Spark Streaming with xml data >