[jira] [Commented] (SPARK-24718) Timestamp support pushdown to parquet data source

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16538118#comment-16538118 ] Apache Spark commented on SPARK-24718: -- User 'wangyum' has created a pull request f

[jira] [Assigned] (SPARK-24718) Timestamp support pushdown to parquet data source

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24718: Assignee: Apache Spark > Timestamp support pushdown to parquet data source >

[jira] [Assigned] (SPARK-24718) Timestamp support pushdown to parquet data source

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24718: Assignee: (was: Apache Spark) > Timestamp support pushdown to parquet data source > -

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24268: - Fix Version/s: (was: 2.4.0) > DataType in error messages are not coherent >

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24268: - Priority: Minor (was: Trivial) > DataType in error messages are not coherent >

[jira] [Reopened] (SPARK-24268) DataType in error messages are not coherent

2018-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-24268: -- > DataType in error messages are not coherent > --- > >

[jira] [Resolved] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24760. -- Resolution: Not A Problem > Pandas UDF does not handle NaN correctly > ---

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537893#comment-16537893 ] Bryan Cutler commented on SPARK-24760: -- Pandas interprets NaN to be missing data fo

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537864#comment-16537864 ] Mortada Mehyar commented on SPARK-24760: Setting the "new" column to be nullable

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537817#comment-16537817 ] Mortada Mehyar commented on SPARK-24760: [~icexelloss] but NaN is not really a "

[jira] [Commented] (SPARK-24018) Spark-without-hadoop package fails to create or read parquet files with snappy compression

2018-07-09 Thread Jean-Francis Roy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537815#comment-16537815 ] Jean-Francis Roy commented on SPARK-24018: -- Oh, indeed you are right! I was mis

[jira] [Created] (SPARK-24765) Add custom Kubernetes scheduler config parameter to spark-submit

2018-07-09 Thread Nihal Harish (JIRA)
Nihal Harish created SPARK-24765: Summary: Add custom Kubernetes scheduler config parameter to spark-submit Key: SPARK-24765 URL: https://issues.apache.org/jira/browse/SPARK-24765 Project: Spark

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-09 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537662#comment-16537662 ] Li Jin commented on SPARK-24760: I think the issue here is that the output schema for th

[jira] [Assigned] (SPARK-21318) The exception message thrown by `lookupFunction` is ambiguous.

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21318: Assignee: Apache Spark > The exception message thrown by `lookupFunction` is ambiguous. >

[jira] [Assigned] (SPARK-21318) The exception message thrown by `lookupFunction` is ambiguous.

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21318: Assignee: (was: Apache Spark) > The exception message thrown by `lookupFunction` is a

[jira] [Resolved] (SPARK-24759) No reordering keys for broadcast hash join

2018-07-09 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24759. - Resolution: Fixed Fix Version/s: 2.4.0 > No reordering keys for broadcast hash join > ---

[jira] [Resolved] (SPARK-21318) The exception message thrown by `lookupFunction` is ambiguous.

2018-07-09 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21318. - Resolution: Fixed Target Version/s: 2.4.0 > The exception message thrown by `lookupFunction` i

[jira] [Reopened] (SPARK-21318) The exception message thrown by `lookupFunction` is ambiguous.

2018-07-09 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reopened SPARK-21318: - > The exception message thrown by `lookupFunction` is ambiguous. > -

[jira] [Commented] (SPARK-24018) Spark-without-hadoop package fails to create or read parquet files with snappy compression

2018-07-09 Thread Patrick Clay (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537551#comment-16537551 ] Patrick Clay commented on SPARK-24018: -- I believe we are both partially correct in

[jira] [Comment Edited] (SPARK-24018) Spark-without-hadoop package fails to create or read parquet files with snappy compression

2018-07-09 Thread Patrick Clay (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537551#comment-16537551 ] Patrick Clay edited comment on SPARK-24018 at 7/9/18 8:49 PM:

[jira] [Commented] (SPARK-24179) History Server for Kubernetes

2018-07-09 Thread Chaoran Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537456#comment-16537456 ] Chaoran Yu commented on SPARK-24179: A [PR|https://github.com/kubernetes/charts/pull

[jira] [Created] (SPARK-24764) Add ServiceLoader implementation for SparkHadoopUtil

2018-07-09 Thread Shruti Gumma (JIRA)
Shruti Gumma created SPARK-24764: Summary: Add ServiceLoader implementation for SparkHadoopUtil Key: SPARK-24764 URL: https://issues.apache.org/jira/browse/SPARK-24764 Project: Spark Issue Ty

[jira] [Commented] (SPARK-23822) Improve error message for Parquet schema mismatches

2018-07-09 Thread Yuchen Huo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537412#comment-16537412 ] Yuchen Huo commented on SPARK-23822: Yes, the error message would show the column na

[jira] [Commented] (SPARK-23822) Improve error message for Parquet schema mismatches

2018-07-09 Thread nirav patel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537405#comment-16537405 ] nirav patel commented on SPARK-23822: - Does the fix pinpoint what column it fails to

[jira] [Updated] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2018-07-09 Thread shahid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shahid updated SPARK-18230: --- Fix Version/s: 2.4.0 > MatrixFactorizationModel.recommendProducts throws NoSuchElement exception > when the

[jira] [Assigned] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18230: Assignee: (was: Apache Spark) > MatrixFactorizationModel.recommendProducts throws NoS

[jira] [Assigned] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18230: Assignee: Apache Spark > MatrixFactorizationModel.recommendProducts throws NoSuchElement

[jira] [Commented] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537354#comment-16537354 ] Apache Spark commented on SPARK-18230: -- User 'shahidki31' has created a pull reques

[jira] [Commented] (SPARK-22187) Update unsaferow format for saved state such that we can set timeouts when state is null

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537330#comment-16537330 ] Apache Spark commented on SPARK-22187: -- User 'tdas' has created a pull request for

[jira] [Commented] (SPARK-17694) convert DataFrame to DataSet should check columns match

2018-07-09 Thread Paweł Batko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537320#comment-16537320 ] Paweł Batko commented on SPARK-17694: - Another minimal: {code:scala} case class Foo(

[jira] [Commented] (SPARK-24745) Map function does not keep rdd name

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537199#comment-16537199 ] Marco Gaido commented on SPARK-24745: - This makes sense, as the map operation create

[jira] [Commented] (SPARK-21743) top-most limit should not cause memory leak

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537196#comment-16537196 ] Apache Spark commented on SPARK-21743: -- User 'cloud-fan' has created a pull request

[jira] [Assigned] (SPARK-24208) Cannot resolve column in self join after applying Pandas UDF

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24208: Assignee: Apache Spark > Cannot resolve column in self join after applying Pandas UDF > -

[jira] [Assigned] (SPARK-24208) Cannot resolve column in self join after applying Pandas UDF

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24208: Assignee: (was: Apache Spark) > Cannot resolve column in self join after applying Pan

[jira] [Commented] (SPARK-24208) Cannot resolve column in self join after applying Pandas UDF

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537082#comment-16537082 ] Apache Spark commented on SPARK-24208: -- User 'mgaido91' has created a pull request

[jira] [Assigned] (SPARK-24268) DataType in error messages are not coherent

2018-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-24268: Assignee: Marco Gaido > DataType in error messages are not coherent > ---

[jira] [Resolved] (SPARK-24268) DataType in error messages are not coherent

2018-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24268. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21321 [https://gi

[jira] [Commented] (SPARK-16534) Kafka 0.10 Python support

2018-07-09 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537035#comment-16537035 ] Thomas Graves commented on SPARK-16534: --- I agree it seems a bit of a bad user stor

[jira] [Commented] (SPARK-16534) Kafka 0.10 Python support

2018-07-09 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537034#comment-16537034 ] Thomas Graves commented on SPARK-16534: --- If we aren't going to do this we should c

[jira] [Commented] (SPARK-24741) Have a built-in AVRO data source implementation

2018-07-09 Thread Marek Novotny (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536907#comment-16536907 ] Marek Novotny commented on SPARK-24741: --- It would be nice if the built-in data sou

[jira] [Resolved] (SPARK-23936) High-order function: map_concat(map1, map2, ..., mapN) → map

2018-07-09 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-23936. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21073 https://g

[jira] [Assigned] (SPARK-23936) High-order function: map_concat(map1, map2, ..., mapN) → map

2018-07-09 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-23936: - Assignee: Bruce Robbins > High-order function: map_concat(map1, map2, ..., mapN) → > m

[jira] [Commented] (SPARK-24674) Spark on Kubernetes BLAS performance

2018-07-09 Thread Dennis Aumiller (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536784#comment-16536784 ] Dennis Aumiller commented on SPARK-24674: - Yes, exactly. I am aware of the discu

[jira] [Resolved] (SPARK-24438) Empty strings and null strings are written to the same partition

2018-07-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24438. -- Resolution: Won't Fix I checked that and it looks treating empty string like that intentionall

[jira] [Commented] (SPARK-24719) ClusteringEvaluator supports integer type labels

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536747#comment-16536747 ] Marco Gaido commented on SPARK-24719: - [~mengxr] any luck with this? Thanks. > Clus

[jira] [Commented] (SPARK-24605) size(null) should return null

2018-07-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536743#comment-16536743 ] Apache Spark commented on SPARK-24605: -- User 'mgaido91' has created a pull request

[jira] [Commented] (SPARK-24438) Empty strings and null strings are written to the same partition

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536682#comment-16536682 ] Marco Gaido commented on SPARK-24438: - IIRC, Hive has a placeholder string (__HIVE_D

[jira] [Comment Edited] (SPARK-24438) Empty strings and null strings are written to the same partition

2018-07-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16536682#comment-16536682 ] Marco Gaido edited comment on SPARK-24438 at 7/9/18 8:37 AM: -