[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071002#comment-16071002 ] Wenchen Fan commented on SPARK-21190: - Thanks for your proposal! I have 2 thoughts: 1. How should we

[jira] [Commented] (SPARK-21276) Update lz4-java to remove custom LZ4BlockInputStream

2017-06-30 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070990#comment-16070990 ] Takeshi Yamamuro commented on SPARK-21276: -- cc: [~davies] > Update lz4-java to remove custom

[jira] [Updated] (SPARK-21276) Update lz4-java to remove custom LZ4BlockInputStream

2017-06-30 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-21276: - Description: We currently use custom LZ4BlockInputStream to read concatenated byte

[jira] [Created] (SPARK-21276) Update lz4-java to remove custom LZ4BlockInputStream

2017-06-30 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-21276: Summary: Update lz4-java to remove custom LZ4BlockInputStream Key: SPARK-21276 URL: https://issues.apache.org/jira/browse/SPARK-21276 Project: Spark

[jira] [Updated] (SPARK-21276) Update lz4-java to remove custom LZ4BlockInputStream

2017-06-30 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-21276: - Description: We currently use custom LZ4BlockInputStream to read concatenated byte

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070988#comment-16070988 ] Miao Wang commented on SPARK-20307: --- [~Monday0927!] "Have you also tried to load the model trained in

[jira] [Resolved] (SPARK-21273) Decouple stats propagation from logical plan

2017-06-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21273. - Resolution: Fixed Fix Version/s: 2.3.0 > Decouple stats propagation from logical plan >

[jira] [Assigned] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20307: Assignee: (was: Apache Spark) > SparkR: pass on setHandleInvalid to spark.mllib

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070984#comment-16070984 ] Apache Spark commented on SPARK-20307: -- User 'wangmiao1981' has created a pull request for this

[jira] [Assigned] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20307: Assignee: Apache Spark > SparkR: pass on setHandleInvalid to spark.mllib functions that

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070982#comment-16070982 ] Ruslan Dautkhanov commented on SPARK-21274: --- [~rxin], I wish I could. We only use PySpark and

[jira] [Comment Edited] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070973#comment-16070973 ] Kazuaki Ishizaki edited comment on SPARK-21271 at 7/1/17 3:18 AM: -- I

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070973#comment-16070973 ] Kazuaki Ishizaki commented on SPARK-21271: -- I see. I will work on this. Thank you for letting

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070969#comment-16070969 ] Wenchen Fan commented on SPARK-21271: - Yes, we should. BTW the code seems wrong to me, the length of

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070966#comment-16070966 ] Kazuaki Ishizaki commented on SPARK-21271: -- I see. This issue comes from the violation such as

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070963#comment-16070963 ] Wenchen Fan commented on SPARK-21271: - By word-aligned I mean 8-byte aligned, so the size of

[jira] [Resolved] (SPARK-20953) Add hash map metrics to aggregate and join

2017-06-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh resolved SPARK-20953. - Resolution: Fixed > Add hash map metrics to aggregate and join >

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070961#comment-16070961 ] Kazuaki Ishizaki commented on SPARK-21271: -- I see. For var-length part, its regulation (or

[jira] [Updated] (SPARK-20929) LinearSVC should not use shared Param HasThresholds

2017-06-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20929: -- Target Version/s: 2.2.0 (was: 2.2.1, 2.3.0) > LinearSVC should not use shared Param

[jira] [Updated] (SPARK-20929) LinearSVC should not use shared Param HasThresholds

2017-06-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20929: -- Fix Version/s: (was: 2.2.1) (was: 2.3.0)

[jira] [Resolved] (SPARK-21127) Update statistics after data changing commands

2017-06-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21127. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18334

[jira] [Assigned] (SPARK-21127) Update statistics after data changing commands

2017-06-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-21127: --- Assignee: Zhenhua Wang > Update statistics after data changing commands >

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070944#comment-16070944 ] Wenchen Fan commented on SPARK-21271: - We do have this regulation for var-length part in UnsafeRow,

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070933#comment-16070933 ] Kazuaki Ishizaki commented on SPARK-21271: -- In {{UnsafeRow}}, the length of fixed part should be

[jira] [Resolved] (SPARK-17528) data should be copied properly before saving into InternalRow

2017-06-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17528. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18483

[jira] [Assigned] (SPARK-21275) Update GLM test to use supportedFamilyNames

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21275: Assignee: Apache Spark > Update GLM test to use supportedFamilyNames >

[jira] [Commented] (SPARK-21275) Update GLM test to use supportedFamilyNames

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070893#comment-16070893 ] Apache Spark commented on SPARK-21275: -- User 'actuaryzhang' has created a pull request for this

[jira] [Assigned] (SPARK-21275) Update GLM test to use supportedFamilyNames

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21275: Assignee: (was: Apache Spark) > Update GLM test to use supportedFamilyNames >

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070890#comment-16070890 ] Reynold Xin commented on SPARK-21274: - Do you want to submit a pull request? > Implement EXCEPT ALL

[jira] [Created] (SPARK-21275) Update GLM test to use supportedFamilyNames

2017-06-30 Thread Wayne Zhang (JIRA)
Wayne Zhang created SPARK-21275: --- Summary: Update GLM test to use supportedFamilyNames Key: SPARK-21275 URL: https://issues.apache.org/jira/browse/SPARK-21275 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070884#comment-16070884 ] Ruslan Dautkhanov commented on SPARK-21274: --- For INTERSECT ALL I was also experimenting with

[jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

2017-06-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070883#comment-16070883 ] yuhao yang commented on SPARK-20082: I'm OK with only supporting initialModel for Online LDA now. For

[jira] [Commented] (SPARK-13225) [SQL] Support Intersect All/Distinct

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070881#comment-16070881 ] Ruslan Dautkhanov commented on SPARK-13225: --- Please consider this approach to implement

[jira] [Created] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21274: - Summary: Implement EXCEPT ALL and INTERSECT ALL Key: SPARK-21274 URL: https://issues.apache.org/jira/browse/SPARK-21274 Project: Spark Issue Type:

[jira] [Commented] (SPARK-20597) KafkaSourceProvider falls back on path as synonym for topic

2017-06-30 Thread Jacek Laskowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070865#comment-16070865 ] Jacek Laskowski commented on SPARK-20597: - Go for it, [~Satyajit]! > KafkaSourceProvider falls

[jira] [Commented] (SPARK-19053) Supporting multiple evaluation metrics in DataFrame-based API: discussion

2017-06-30 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070849#comment-16070849 ] yuhao yang commented on SPARK-19053: Not sure if this is still wanted. cc [~josephkb] And I'd like to

[jira] [Assigned] (SPARK-21273) Decouple stats propagation from logical plan

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21273: Assignee: Reynold Xin (was: Apache Spark) > Decouple stats propagation from logical plan

[jira] [Commented] (SPARK-21273) Decouple stats propagation from logical plan

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070837#comment-16070837 ] Apache Spark commented on SPARK-21273: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21273) Decouple stats propagation from logical plan

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21273: Assignee: Apache Spark (was: Reynold Xin) > Decouple stats propagation from logical plan

[jira] [Created] (SPARK-21273) Decouple stats propagation from logical plan

2017-06-30 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21273: --- Summary: Decouple stats propagation from logical plan Key: SPARK-21273 URL: https://issues.apache.org/jira/browse/SPARK-21273 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-21272) SortMergeJoin LeftAnti does not update numOutputRows

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21272: Assignee: Apache Spark > SortMergeJoin LeftAnti does not update numOutputRows >

[jira] [Assigned] (SPARK-21272) SortMergeJoin LeftAnti does not update numOutputRows

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21272: Assignee: (was: Apache Spark) > SortMergeJoin LeftAnti does not update numOutputRows

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Leif Walsh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070809#comment-16070809 ] Leif Walsh commented on SPARK-21190: I think we can get away with doing windowing (deciding which

[jira] [Commented] (SPARK-21272) SortMergeJoin LeftAnti does not update numOutputRows

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070808#comment-16070808 ] Apache Spark commented on SPARK-21272: -- User 'juliuszsompolski' has created a pull request for this

[jira] [Created] (SPARK-21272) SortMergeJoin LeftAnti does not update numOutputRows

2017-06-30 Thread Juliusz Sompolski (JIRA)
Juliusz Sompolski created SPARK-21272: - Summary: SortMergeJoin LeftAnti does not update numOutputRows Key: SPARK-21272 URL: https://issues.apache.org/jira/browse/SPARK-21272 Project: Spark

[jira] [Resolved] (SPARK-21129) Arguments of SQL function call should not be named expressions

2017-06-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21129. - Resolution: Fixed Fix Version/s: 2.2.0 > Arguments of SQL function call should not be named

[jira] [Commented] (SPARK-20889) SparkR grouped documentation for Column methods

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070730#comment-16070730 ] Apache Spark commented on SPARK-20889: -- User 'actuaryzhang' has created a pull request for this

[jira] [Closed] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-21270. --- Resolution: Won't Fix While I absolutely would love to see this feature, I don't think this is

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Joseph Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070659#comment-16070659 ] Joseph Wang commented on SPARK-20307: - Have you also tried to load the model trained in SparkR and call

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-06-30 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070647#comment-16070647 ] Miao Wang commented on SPARK-20307: --- Update: Manual test works. I will submit PR soon. > SparkR: pass

[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070618#comment-16070618 ] Li Jin edited comment on SPARK-21190 at 6/30/17 7:38 PM: - I went another round of

[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070618#comment-16070618 ] Li Jin edited comment on SPARK-21190 at 6/30/17 7:36 PM: - I have some APIs design

[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070618#comment-16070618 ] Li Jin edited comment on SPARK-21190 at 6/30/17 7:34 PM: - I have some APIs design

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-30 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070618#comment-16070618 ] Li Jin commented on SPARK-21190: I have some API designs written down here: Here is how to define a udf:

[jira] [Created] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Bogdan Raducanu (JIRA)
Bogdan Raducanu created SPARK-21271: --- Summary: UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8 Key: SPARK-21271 URL: https://issues.apache.org/jira/browse/SPARK-21271 Project: Spark

[jira] [Assigned] (SPARK-21223) Thread-safety issue in FsHistoryProvider

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-21223: - Assignee: zenglinxi Priority: Minor (was: Major) > Thread-safety issue in

[jira] [Resolved] (SPARK-21223) Thread-safety issue in FsHistoryProvider

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21223. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 18430

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-06-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070525#comment-16070525 ] Michael Armbrust commented on SPARK-18057: -- We should upgrade. Now that Kafka has a good

[jira] [Comment Edited] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070475#comment-16070475 ] Bryan Cutler edited comment on SPARK-13534 at 6/30/17 6:05 PM: --- Hi

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070475#comment-16070475 ] Bryan Cutler commented on SPARK-13534: -- Hi [~jaise...@gmail.com], the DataFrameWriter API is for

[jira] [Resolved] (SPARK-17924) Consolidate streaming and batch write path

2017-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17924. - Resolution: Fixed Fix Version/s: 2.3.0 > Consolidate streaming and batch write path >

[jira] [Assigned] (SPARK-19326) Speculated task attempts do not get launched in few scenarios

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19326: Assignee: (was: Apache Spark) > Speculated task attempts do not get launched in few

[jira] [Commented] (SPARK-19326) Speculated task attempts do not get launched in few scenarios

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070457#comment-16070457 ] Apache Spark commented on SPARK-19326: -- User 'janewangfb' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19326) Speculated task attempts do not get launched in few scenarios

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19326: Assignee: Apache Spark > Speculated task attempts do not get launched in few scenarios >

[jira] [Updated] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-06-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20073: Issue Type: Improvement (was: Bug) > Unexpected Cartesian product when using eqNullSafe in join with a

[jira] [Updated] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-06-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20073: Component/s: (was: Optimizer) SQL > Unexpected Cartesian product when using

[jira] [Updated] (SPARK-20073) Unexpected Cartesian product when using eqNullSafe in join with a derived table

2017-06-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20073: Labels: (was: correctness) > Unexpected Cartesian product when using eqNullSafe in join with a derived

[jira] [Commented] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070336#comment-16070336 ] Sean Owen commented on SPARK-21270: --- How would Spark know how much of the total memory is to be

[jira] [Commented] (SPARK-15533) Deprecate Dataset.explode

2017-06-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070308#comment-16070308 ] Michael Armbrust commented on SPARK-15533: -- Just include the other columns too

[jira] [Commented] (SPARK-21268) Move center calculations to a distributed map in KMeans

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070300#comment-16070300 ] Apache Spark commented on SPARK-21268: -- User 'dardelet' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21268) Move center calculations to a distributed map in KMeans

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21268: Assignee: (was: Apache Spark) > Move center calculations to a distributed map in

[jira] [Assigned] (SPARK-21268) Move center calculations to a distributed map in KMeans

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21268: Assignee: Apache Spark > Move center calculations to a distributed map in KMeans >

[jira] [Updated] (SPARK-21268) Move center calculations to a distributed map in KMeans

2017-06-30 Thread Guillaume Dardelet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Dardelet updated SPARK-21268: --- Description: As I was monitoring the performance of my algorithm with SparkUI, I

[jira] [Updated] (SPARK-21268) Move center calculations to a distributed map in KMeans

2017-06-30 Thread Guillaume Dardelet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Dardelet updated SPARK-21268: --- Summary: Move center calculations to a distributed map in KMeans (was: Move some

[jira] [Updated] (SPARK-21268) Move some calculations to a distributed map in KMeans

2017-06-30 Thread Guillaume Dardelet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Dardelet updated SPARK-21268: --- Description: As I was monitoring the performance of my algorithm with SparkUI, I

[jira] [Updated] (SPARK-21268) Move some calculations to a distributed map in KMeans

2017-06-30 Thread Guillaume Dardelet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Dardelet updated SPARK-21268: --- Summary: Move some calculations to a distributed map in KMeans (was: Redundant

[jira] [Resolved] (SPARK-21156) Spark cannot handle multiple KMS server configuration

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21156. --- Resolution: Not A Problem Closing as not a problem particular to Spark > Spark cannot handle

[jira] [Updated] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jin xing updated SPARK-21270: - Issue Type: Improvement (was: Bug) > Improvement for memory config. > -- >

[jira] [Commented] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread jin xing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070264#comment-16070264 ] jin xing commented on SPARK-21270: -- cc [~rxin] [~cloud_fan] [~joshrosen] > Improvement for memory

[jira] [Created] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread jin xing (JIRA)
jin xing created SPARK-21270: Summary: Improvement for memory config. Key: SPARK-21270 URL: https://issues.apache.org/jira/browse/SPARK-21270 Project: Spark Issue Type: Bug Components:

[jira] [Comment Edited] (SPARK-21156) Spark cannot handle multiple KMS server configuration

2017-06-30 Thread Monica Raj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070256#comment-16070256 ] Monica Raj edited comment on SPARK-21156 at 6/30/17 3:20 PM: - Thank you, the

[jira] [Commented] (SPARK-21156) Spark cannot handle multiple KMS server configuration

2017-06-30 Thread Monica Raj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070256#comment-16070256 ] Monica Raj commented on SPARK-21156: Thank you, the issue was as you said in the hadoop library. We

[jira] [Commented] (SPARK-21268) Redundant collectAsMap in KMeans

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070217#comment-16070217 ] Sean Owen commented on SPARK-21268: --- You can just update this issue. > Redundant collectAsMap in

[jira] [Updated] (SPARK-21269) MetadataFetchFailedException: Missing an output location for shuffle 0

2017-06-30 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-21269: Description: Spark *cluster* can reproduce, *local* can't: 1. Start a spark context with

[jira] [Assigned] (SPARK-21269) MetadataFetchFailedException: Missing an output location for shuffle 0

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21269: Assignee: (was: Apache Spark) > MetadataFetchFailedException: Missing an output

[jira] [Commented] (SPARK-21269) MetadataFetchFailedException: Missing an output location for shuffle 0

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070184#comment-16070184 ] Apache Spark commented on SPARK-21269: -- User 'wangyum' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21269) MetadataFetchFailedException: Missing an output location for shuffle 0

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21269: Assignee: Apache Spark > MetadataFetchFailedException: Missing an output location for

[jira] [Commented] (SPARK-21268) Redundant collectAsMap in KMeans

2017-06-30 Thread Guillaume Dardelet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070177#comment-16070177 ] Guillaume Dardelet commented on SPARK-21268: OK, I understand why it is useful now, thank you.

[jira] [Created] (SPARK-21269) MetadataFetchFailedException: Missing an output location for shuffle 0

2017-06-30 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-21269: --- Summary: MetadataFetchFailedException: Missing an output location for shuffle 0 Key: SPARK-21269 URL: https://issues.apache.org/jira/browse/SPARK-21269 Project: Spark

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-30 Thread Jais Sebastian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070126#comment-16070126 ] Jais Sebastian commented on SPARK-13534: Hi, Do you have any plan to integrate Arrow format for

[jira] [Assigned] (SPARK-21255) NPE when creating encoder for enum

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21255: Assignee: Apache Spark > NPE when creating encoder for enum >

[jira] [Commented] (SPARK-21255) NPE when creating encoder for enum

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070120#comment-16070120 ] Apache Spark commented on SPARK-21255: -- User 'mike0sv' has created a pull request for this issue:

[jira] [Assigned] (SPARK-21255) NPE when creating encoder for enum

2017-06-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21255: Assignee: (was: Apache Spark) > NPE when creating encoder for enum >

[jira] [Commented] (SPARK-21268) Redundant collectAsMap in KMeans

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070114#comment-16070114 ] Sean Owen commented on SPARK-21268: --- I don't think it's redundant, because totalContribs is used later

[jira] [Created] (SPARK-21268) Redundant collectAsMap in KMeans

2017-06-30 Thread Guillaume Dardelet (JIRA)
Guillaume Dardelet created SPARK-21268: -- Summary: Redundant collectAsMap in KMeans Key: SPARK-21268 URL: https://issues.apache.org/jira/browse/SPARK-21268 Project: Spark Issue Type:

[jira] [Commented] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070091#comment-16070091 ] Sean Owen commented on SPARK-21227: --- I don't know, I think there is a real issue here somewhere. I

[jira] [Updated] (SPARK-21253) Cannot fetch big blocks to disk

2017-06-30 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-21253: Description: Spark *cluster* can reproduce, *local* can't: 1. Start a spark context with

[jira] [Comment Edited] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-30 Thread Seydou Dia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070076#comment-16070076 ] Seydou Dia edited comment on SPARK-21227 at 6/30/17 1:24 PM: - Understood,

[jira] [Commented] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-30 Thread Seydou Dia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070076#comment-16070076 ] Seydou Dia commented on SPARK-21227: Understood, thank you so much for those insight! I guess I

[jira] [Commented] (SPARK-21227) Unicode in Json field causes AnalysisException when selecting from Dataframe

2017-06-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070070#comment-16070070 ] Sean Owen commented on SPARK-21227: --- You should certainly make your app less sensitive to issues like
