[jira] [Updated] (SPARK-24362) SUM function precision issue

2018-05-22 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24362: Description:  How to reproduce: {noformat} bin/spark-shell --conf spark.sql.autoBroadcastJoinThresh

[jira] [Created] (SPARK-24362) SUM function precision issue

2018-05-22 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-24362: --- Summary: SUM function precision issue Key: SPARK-24362 URL: https://issues.apache.org/jira/browse/SPARK-24362 Project: Spark Issue Type: Bug Componen

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486799#comment-16486799 ] Hossein Falaki commented on SPARK-24359: Thanks for reviewing [~felixcheung]. #

[jira] [Commented] (SPARK-22055) Port release scripts

2018-05-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486728#comment-16486728 ] Felix Cheung commented on SPARK-22055: -- interesting - I'd definitely be happy to hel

[jira] [Comment Edited] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486719#comment-16486719 ] Felix Cheung edited comment on SPARK-24359 at 5/23/18 5:21 AM:

[jira] [Comment Edited] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486719#comment-16486719 ] Felix Cheung edited comment on SPARK-24359 at 5/23/18 5:18 AM:

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486719#comment-16486719 ] Felix Cheung commented on SPARK-24359: -- # could you include design doc as google doc

[jira] [Commented] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486717#comment-16486717 ] Hyukjin Kwon commented on SPARK-22366: -- Oops, sorry. I mistakenly edited the JIRA. I

[jira] [Updated] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-22366: - Description: +underlined text+There's an existing flag "spark.sql.files.ignoreCorruptFiles" that

[jira] [Updated] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-22366: - Description: There's an existing flag "spark.sql.files.ignoreCorruptFiles" that will quietly ign

[jira] [Closed] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin closed SPARK-24349. -- > obtainDelegationTokens() exits JVM if Driver use JDBC instead of using > metastore > --

[jira] [Resolved] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin resolved SPARK-24349. Resolution: Not A Problem delegationTokensRequired has been checked in SparkSQLCLIDriver.scala > o

[jira] [Assigned] (SPARK-24361) Polish code block manipulation API

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24361: Assignee: (was: Apache Spark) > Polish code block manipulation API > -

[jira] [Assigned] (SPARK-24361) Polish code block manipulation API

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24361: Assignee: Apache Spark > Polish code block manipulation API >

[jira] [Commented] (SPARK-24361) Polish code block manipulation API

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486652#comment-16486652 ] Apache Spark commented on SPARK-24361: -- User 'viirya' has created a pull request for

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Component/s: (was: Optimizer) Spark Core > Large Task prior scheduling to Reduce overall execu

[jira] [Created] (SPARK-24361) Polish code block manipulation API

2018-05-22 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-24361: --- Summary: Polish code block manipulation API Key: SPARK-24361 URL: https://issues.apache.org/jira/browse/SPARK-24361 Project: Spark Issue Type: Improvem

[jira] [Updated] (SPARK-24339) spark sql can not prune column in transform/map/reduce query

2018-05-22 Thread xdcjie (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xdcjie updated SPARK-24339: --- Affects Version/s: (was: 2.2.1) (was: 2.1.2) (was: 2

[jira] [Commented] (SPARK-21945) pyspark --py-files doesn't work in yarn client mode

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486610#comment-16486610 ] Hyukjin Kwon commented on SPARK-21945: -- [~vanzin], I tried spark-submit in client mo

[jira] [Commented] (SPARK-24358) createDataFrame in Python 3 should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486594#comment-16486594 ] Joel Croteau commented on SPARK-24358: -- Done. > createDataFrame in Python 3 should

[jira] [Updated] (SPARK-24358) createDataFrame in Python 3 should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Croteau updated SPARK-24358: - Labels: Python3 (was: ) Description: createDataFrame can infer Python 3's bytearray type

[jira] [Comment Edited] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486584#comment-16486584 ] Hyukjin Kwon edited comment on SPARK-24358 at 5/23/18 1:50 AM:

[jira] [Commented] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486584#comment-16486584 ] Hyukjin Kwon commented on SPARK-24358: -- Yea, let's note that it's specific to Python

[jira] [Comment Edited] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486581#comment-16486581 ] Joel Croteau edited comment on SPARK-24358 at 5/23/18 1:47 AM:

[jira] [Commented] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486581#comment-16486581 ] Joel Croteau commented on SPARK-24358: -- No, I mean the bytes type in Python 3. This

[jira] [Commented] (SPARK-24356) Duplicate strings in File.path managed by FileSegmentManagedBuffer

2018-05-22 Thread Misha Dmitriev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486580#comment-16486580 ] Misha Dmitriev commented on SPARK-24356: I plan to work on this feature. > Dupli

[jira] [Updated] (SPARK-24349) obtainDelegationTokens() exits JVM if Driver use JDBC instead of using metastore

2018-05-22 Thread Lantao Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lantao Jin updated SPARK-24349: --- Description: In [SPARK-23639|https://issues.apache.org/jira/browse/SPARK-23639], use --proxy-user to

[jira] [Commented] (SPARK-24357) createDataFrame in Python infers large integers as long type and then fails silently when converting them

2018-05-22 Thread Joel Croteau (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486571#comment-16486571 ] Joel Croteau commented on SPARK-24357: -- Fair enough, here is some code to reproduce

[jira] [Commented] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486569#comment-16486569 ] Hyukjin Kwon commented on SPARK-24358: -- ? do you mean bytes in Python 2? that's an a

[jira] [Commented] (SPARK-24324) UserDefinedFunction mixes column labels

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486565#comment-16486565 ] Hyukjin Kwon commented on SPARK-24324: -- Ah, I meant shorter reproducer should make o

[jira] [Commented] (SPARK-24339) spark sql can not prune column in transform/map/reduce query

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486561#comment-16486561 ] Hyukjin Kwon commented on SPARK-24339: -- (Don't set the target versions usually reser

[jira] [Updated] (SPARK-24339) spark sql can not prune column in transform/map/reduce query

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24339: - Target Version/s: (was: 2.1.1, 2.1.2, 2.2.0, 2.2.1) > spark sql can not prune column in transfo

[jira] [Updated] (SPARK-24339) spark sql can not prune column in transform/map/reduce query

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24339: - Fix Version/s: (was: 2.2.1) (was: 2.1.2) (was: 2

[jira] [Commented] (SPARK-24354) Adding support for quoteMode in Spark's build in CSV DataFrameWriter

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486557#comment-16486557 ] Hyukjin Kwon commented on SPARK-24354: -- Nope, the library was changed and it doesn't

[jira] [Commented] (SPARK-24357) createDataFrame in Python infers large integers as long type and then fails silently when converting them

2018-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486554#comment-16486554 ] Hyukjin Kwon commented on SPARK-24357: -- It should have been much more readable if th

[jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time

2018-05-22 Thread gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated SPARK-24342: Priority: Major (was: Minor) > Large Task prior scheduling to Reduce overall execution time >

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Description: h1. Background and motivation SparkR supports calling MLlib functionality with

[jira] [Updated] (SPARK-24322) Upgrade Apache ORC to 1.4.4

2018-05-22 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24322: -- Description: ORC 1.4.4 (released on May 14th) includes nine fixes. This issue aims to update S

[jira] [Assigned] (SPARK-24360) Support Hive 3.0 metastore

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24360: Assignee: Apache Spark > Support Hive 3.0 metastore > -- > >

[jira] [Commented] (SPARK-24360) Support Hive 3.0 metastore

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16485646#comment-16485646 ] Apache Spark commented on SPARK-24360: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Assigned] (SPARK-24360) Support Hive 3.0 metastore

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24360: Assignee: (was: Apache Spark) > Support Hive 3.0 metastore > -

[jira] [Created] (SPARK-24360) Support Hive 3.0 metastore

2018-05-22 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-24360: - Summary: Support Hive 3.0 metastore Key: SPARK-24360 URL: https://issues.apache.org/jira/browse/SPARK-24360 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-24313) Collection functions interpreted execution doesn't work with complex types

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484679#comment-16484679 ] Apache Spark commented on SPARK-24313: -- User 'mgaido91' has created a pull request f

[jira] [Created] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-24359: -- Summary: SPIP: ML Pipelines in R Key: SPARK-24359 URL: https://issues.apache.org/jira/browse/SPARK-24359 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-24359) SPIP: ML Pipelines in R

2018-05-22 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-24359: --- Attachment: SparkML_ ML Pipelines in R.pdf > SPIP: ML Pipelines in R > --

[jira] [Commented] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484654#comment-16484654 ] Apache Spark commented on SPARK-24355: -- User 'Victsm' has created a pull request for

[jira] [Assigned] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24355: Assignee: Apache Spark > Improve Spark shuffle server responsiveness to non-ChunkFetch req

[jira] [Assigned] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24355: Assignee: (was: Apache Spark) > Improve Spark shuffle server responsiveness to non-Chu

[jira] [Commented] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-05-22 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484650#comment-16484650 ] Min Shen commented on SPARK-24355: -- [~felixcheung]  [~jinxing6...@126.com] [~cloud_fan]

[jira] [Assigned] (SPARK-24335) Dataset.map schema not applied in some cases

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24335: Assignee: Apache Spark > Dataset.map schema not applied in some cases > --

[jira] [Commented] (SPARK-24335) Dataset.map schema not applied in some cases

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484649#comment-16484649 ] Apache Spark commented on SPARK-24335: -- User 'Victsm' has created a pull request for

[jira] [Assigned] (SPARK-24335) Dataset.map schema not applied in some cases

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24335: Assignee: (was: Apache Spark) > Dataset.map schema not applied in some cases > ---

[jira] [Created] (SPARK-24358) createDataFrame in Python should be able to infer bytes type as Binary type

2018-05-22 Thread Joel Croteau (JIRA)
Joel Croteau created SPARK-24358: Summary: createDataFrame in Python should be able to infer bytes type as Binary type Key: SPARK-24358 URL: https://issues.apache.org/jira/browse/SPARK-24358 Project:

[jira] [Created] (SPARK-24357) createDataFrame in Python infers large integers as long type and then fails silently when converting them

2018-05-22 Thread Joel Croteau (JIRA)
Joel Croteau created SPARK-24357: Summary: createDataFrame in Python infers large integers as long type and then fails silently when converting them Key: SPARK-24357 URL: https://issues.apache.org/jira/browse/SPAR

[jira] [Created] (SPARK-24356) Duplicate strings in File.path managed by FileSegmentManagedBuffer

2018-05-22 Thread Misha Dmitriev (JIRA)
Misha Dmitriev created SPARK-24356: -- Summary: Duplicate strings in File.path managed by FileSegmentManagedBuffer Key: SPARK-24356 URL: https://issues.apache.org/jira/browse/SPARK-24356 Project: Spark

[jira] [Created] (SPARK-24355) Improve Spark shuffle server responsiveness to non-ChunkFetch requests

2018-05-22 Thread Min Shen (JIRA)
Min Shen created SPARK-24355: Summary: Improve Spark shuffle server responsiveness to non-ChunkFetch requests Key: SPARK-24355 URL: https://issues.apache.org/jira/browse/SPARK-24355 Project: Spark

[jira] [Assigned] (SPARK-19185) ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing

2018-05-22 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-19185: -- Assignee: Gabor Somogyi > ConcurrentModificationExceptions with CachedKafkaConsumers w

[jira] [Resolved] (SPARK-19185) ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing

2018-05-22 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19185. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20997 [https:/

[jira] [Comment Edited] (SPARK-24324) UserDefinedFunction mixes column labels

2018-05-22 Thread Cristian Consonni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484138#comment-16484138 ] Cristian Consonni edited comment on SPARK-24324 at 5/22/18 8:17 PM: ---

[jira] [Comment Edited] (SPARK-24324) UserDefinedFunction mixes column labels

2018-05-22 Thread Cristian Consonni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484138#comment-16484138 ] Cristian Consonni edited comment on SPARK-24324 at 5/22/18 8:16 PM: ---

[jira] [Comment Edited] (SPARK-22055) Port release scripts

2018-05-22 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483217#comment-16483217 ] Marcelo Vanzin edited comment on SPARK-22055 at 5/22/18 8:11 PM: --

[jira] [Resolved] (SPARK-24348) scala.MatchError in the "element_at" expression

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24348. - Resolution: Fixed > scala.MatchError in the "element_at" expression > ---

[jira] [Assigned] (SPARK-24348) scala.MatchError in the "element_at" expression

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-24348: --- Assignee: Alex Vayda > scala.MatchError in the "element_at" expression > ---

[jira] [Comment Edited] (SPARK-23899) Built-in SQL Function Improvement

2018-05-22 Thread Alex Vayda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484477#comment-16484477 ] Alex Vayda edited comment on SPARK-23899 at 5/22/18 7:56 PM: -

[jira] [Commented] (SPARK-23899) Built-in SQL Function Improvement

2018-05-22 Thread Alex Vayda (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484477#comment-16484477 ] Alex Vayda commented on SPARK-23899: What do you guys think about adding another set

[jira] [Updated] (SPARK-23780) Failed to use googleVis library with new SparkR

2018-05-22 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-23780: --- Fix Version/s: (was: 2.3.2) 2.3.1 > Failed to use googleVis library wi

[jira] [Updated] (SPARK-24353) Add support for pod affinity/anti-affinity

2018-05-22 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stavros Kontopoulos updated SPARK-24353: Description: Spark on K8s allows to place driver/executor pods on specific k8s node

[jira] [Created] (SPARK-24354) Adding support for quoteMode in Spark's build in CSV DataFrameWriter

2018-05-22 Thread Umesh K (JIRA)
Umesh K created SPARK-24354: --- Summary: Adding support for quoteMode in Spark's build in CSV DataFrameWriter Key: SPARK-24354 URL: https://issues.apache.org/jira/browse/SPARK-24354 Project: Spark I

[jira] [Commented] (SPARK-24333) Add fit with validation set to spark.ml GBT: Python API

2018-05-22 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484376#comment-16484376 ] Huaxin Gao commented on SPARK-24333: I will work on this. Thanks! > Add fit with val

[jira] [Assigned] (SPARK-24121) The API for handling expression code generation in expression codegen

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24121: --- Assignee: Liang-Chi Hsieh > The API for handling expression code generation in expression co

[jira] [Resolved] (SPARK-24121) The API for handling expression code generation in expression codegen

2018-05-22 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24121. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21193 [https://githu

[jira] [Commented] (SPARK-24353) Add support for pod affinity/anti-affinity

2018-05-22 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484346#comment-16484346 ] Stavros Kontopoulos commented on SPARK-24353: - I will create a design doc for

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Target Version/s: 2.3.2 > LongToUnsafeRowMap calculate the new size may be wrong >

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Target Version/s: 2.3.1 (was: 2.3.2) > LongToUnsafeRowMap calculate the new size may be wrong > --

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Labels: correctness (was: ) > LongToUnsafeRowMap calculate the new size may be wrong > ---

[jira] [Updated] (SPARK-24257) LongToUnsafeRowMap calculate the new size may be wrong

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24257: Priority: Blocker (was: Minor) > LongToUnsafeRowMap calculate the new size may be wrong >

[jira] [Comment Edited] (SPARK-13638) Support for saving with a quote mode

2018-05-22 Thread Umesh K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484255#comment-16484255 ] Umesh K edited comment on SPARK-13638 at 5/22/18 4:36 PM: -- [~rxi

[jira] [Commented] (SPARK-13638) Support for saving with a quote mode

2018-05-22 Thread Umesh K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484255#comment-16484255 ] Umesh K commented on SPARK-13638: - Just want to confirm are we ever going to have quoteMo

[jira] [Created] (SPARK-24353) Add support for pod affinity/anti-affinity

2018-05-22 Thread Stavros Kontopoulos (JIRA)
Stavros Kontopoulos created SPARK-24353: --- Summary: Add support for pod affinity/anti-affinity Key: SPARK-24353 URL: https://issues.apache.org/jira/browse/SPARK-24353 Project: Spark Issu

[jira] [Created] (SPARK-24352) Flaky test: StandaloneDynamicAllocationSuite

2018-05-22 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-24352: -- Summary: Flaky test: StandaloneDynamicAllocationSuite Key: SPARK-24352 URL: https://issues.apache.org/jira/browse/SPARK-24352 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24341) Codegen compile error from predicate subquery

2018-05-22 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484225#comment-16484225 ] Xiao Li commented on SPARK-24341: - cc [~dkbiswal] Could you take a look at this too? > C

[jira] [Updated] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread huangtengfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huangtengfei updated SPARK-24351: - Description: In structured streaming, there is a conf spark.sql.streaming.minBatchesToRetain whi

[jira] [Assigned] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24351: Assignee: Apache Spark > offsetLog/commitLog purge thresholdBatchId should be computed wit

[jira] [Commented] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484172#comment-16484172 ] Apache Spark commented on SPARK-24350: -- User 'wajda' has created a pull request for

[jira] [Assigned] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24351: Assignee: (was: Apache Spark) > offsetLog/commitLog purge thresholdBatchId should be c

[jira] [Commented] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484173#comment-16484173 ] Apache Spark commented on SPARK-24351: -- User 'ivoson' has created a pull request for

[jira] [Assigned] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24350: Assignee: Apache Spark > ClassCastException in "array_position" function > ---

[jira] [Assigned] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24350: Assignee: (was: Apache Spark) > ClassCastException in "array_position" function >

[jira] [Created] (SPARK-24351) offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode

2018-05-22 Thread huangtengfei (JIRA)
huangtengfei created SPARK-24351: Summary: offsetLog/commitLog purge thresholdBatchId should be computed with current committed epoch but not currentBatchId in CP mode Key: SPARK-24351 URL: https://issues.apache.o

[jira] [Created] (SPARK-24350) ClassCastException in "array_position" function

2018-05-22 Thread Alex Wajda (JIRA)
Alex Wajda created SPARK-24350: -- Summary: ClassCastException in "array_position" function Key: SPARK-24350 URL: https://issues.apache.org/jira/browse/SPARK-24350 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-22269) Java style checks should be run in Jenkins

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22269: Assignee: Apache Spark > Java style checks should be run in Jenkins >

[jira] [Commented] (SPARK-22269) Java style checks should be run in Jenkins

2018-05-22 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484151#comment-16484151 ] Apache Spark commented on SPARK-22269: -- User 'HyukjinKwon' has created a pull reques
