[jira] [Created] (SPARK-24781) Using a reference from Dataset in Filter might not work.

2018-07-10 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-24781: - Summary: Using a reference from Dataset in Filter might not work. Key: SPARK-24781 URL: https://issues.apache.org/jira/browse/SPARK-24781 Project: Spark

[jira] [Updated] (SPARK-24736) --py-files not functional for non local URLs. It appears to pass non-local URL's into PYTHONPATH directly.

2018-07-10 Thread Jonathan A Weaver (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan A Weaver updated SPARK-24736: -- Summary: --py-files not functional for non local URLs. It appears to pass non-local

[jira] [Resolved] (SPARK-24165) UDF within when().otherwise() raises NullPointerException

2018-07-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24165. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21687

[jira] [Assigned] (SPARK-24165) UDF within when().otherwise() raises NullPointerException

2018-07-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-24165: --- Assignee: Marek Novotny > UDF within when().otherwise() raises NullPointerException >

[jira] [Commented] (SPARK-24697) Fix the reported start offsets in streaming query progress

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539512#comment-16539512 ] Apache Spark commented on SPARK-24697: -- User 'tdas' has created a pull request for this issue:

[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements

2018-07-10 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539490#comment-16539490 ] Takeshi Yamamuro commented on SPARK-24753: -- I closed this as 'not a problem'. Feel free to open

[jira] [Resolved] (SPARK-24753) bad backslah parsing in SQL statements

2018-07-10 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-24753. -- Resolution: Not A Problem > bad backslah parsing in SQL statements >

[jira] [Commented] (SPARK-24778) DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed

2018-07-10 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539478#comment-16539478 ] Takeshi Yamamuro commented on SPARK-24778: -- I think that it is more reasonable to reject the 

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539473#comment-16539473 ] Hyukjin Kwon commented on SPARK-20202: -- Hey [~owen.omalley] and [~rxin], I know I see many

[jira] [Updated] (SPARK-24766) CreateHiveTableAsSelect and InsertIntoHiveDir won't generate decimal column stats in parquet

2018-07-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24766: Summary: CreateHiveTableAsSelect and InsertIntoHiveDir won't generate decimal column stats in

[jira] [Resolved] (SPARK-24730) Add policy to choose max as global watermark when streaming query has multiple watermarks

2018-07-10 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-24730. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21701

[jira] [Updated] (SPARK-24753) bad backslah parsing in SQL statements

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24753: - Priority: Trivial (was: Minor) > bad backslah parsing in SQL statements >

[jira] [Updated] (SPARK-24753) bad backslah parsing in SQL statements

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24753: - Issue Type: Documentation (was: Bug) > bad backslah parsing in SQL statements >

[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539470#comment-16539470 ] Hyukjin Kwon commented on SPARK-24753: -- If the example is wrong, please go ahead for a PR after

[jira] [Commented] (SPARK-24766) CreateHiveTableAsSelect and InsertIntoHiveDir won't generate decimal column stats

2018-07-10 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539469#comment-16539469 ] Takeshi Yamamuro commented on SPARK-24766: -- nit: Can you add `in parquet` in the title? I

[jira] [Resolved] (SPARK-24530) Sphinx doesn't render autodoc_docstring_signature correctly (with Python 2?) and pyspark.ml docs are broken

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24530. -- Resolution: Fixed Fix Version/s: 2.3.2 2.4.0 Issue resolved by pull

[jira] [Updated] (SPARK-24780) DataFrame.column_name should resolve to a distinct ref

2018-07-10 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-24780: Summary: DataFrame.column_name should resolve to a distinct ref (was: DataFrame.column_name should take

[jira] [Updated] (SPARK-24780) DataFrame.column_name should take into account DataFrame alias for future joins

2018-07-10 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-24780: Description: If we join a dataframe with another dataframe which has the same column name of the

[jira] [Created] (SPARK-24780) DataFrame.column_name should take into account DataFrame alias for future joins

2018-07-10 Thread holdenk (JIRA)
holdenk created SPARK-24780: --- Summary: DataFrame.column_name should take into account DataFrame alias for future joins Key: SPARK-24780 URL: https://issues.apache.org/jira/browse/SPARK-24780 Project: Spark

[jira] [Comment Edited] (SPARK-24766) CreateHiveTableAsSelect and InsertIntoHiveDir won't generate decimal column stats

2018-07-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539420#comment-16539420 ] Yuming Wang edited comment on SPARK-24766 at 7/11/18 1:24 AM: -- It works

[jira] [Updated] (SPARK-24766) CreateHiveTableAsSelect and InsertIntoHiveDir won't generate decimal column stats

2018-07-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24766: Summary: CreateHiveTableAsSelect and InsertIntoHiveDir won't generate decimal column stats (was:

[jira] [Updated] (SPARK-24766) CreateHiveTableAsSelectCommand and InsertIntoHiveDirCommand won't generate decimal column stats

2018-07-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24766: Description: How to reproduce: {code:java} INSERT OVERWRITE LOCAL DIRECTORY

[jira] [Commented] (SPARK-24766) CreateHiveTableAsSelectCommand and InsertIntoHiveDirCommand won't generate decimal column stats

2018-07-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539420#comment-16539420 ] Yuming Wang commented on SPARK-24766: - It works after upgrading built-in Hive to 2.3.2 and upgrading

[jira] [Commented] (SPARK-21097) Dynamic allocation will preserve cached data

2018-07-10 Thread John Vincent Thorpe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539381#comment-16539381 ] John Vincent Thorpe commented on SPARK-21097: - Hi Brad, Great work on implementing this

[jira] [Commented] (SPARK-24779) Add sequence / map_concat / map_from_entries / an option in months_between UDF to disable rounding-off

2018-07-10 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539299#comment-16539299 ] Huaxin Gao commented on SPARK-24779: I will work on this.  > Add sequence / map_concat /

[jira] [Created] (SPARK-24779) Add sequence / map_concat / map_from_entries / an option in months_between UDF to disable rounding-off

2018-07-10 Thread Huaxin Gao (JIRA)
Huaxin Gao created SPARK-24779: -- Summary: Add sequence / map_concat / map_from_entries / an option in months_between UDF to disable rounding-off Key: SPARK-24779 URL:

[jira] [Commented] (SPARK-22658) SPIP: TeansorFlowOnSpark as a Scalable Deep Learning Lib of Apache Spark

2018-07-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539266#comment-16539266 ] Ruslan Dautkhanov commented on SPARK-22658: --- Intel's BigDL is somewhat a TensorFlowOnSpark

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-07-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539261#comment-16539261 ] Ruslan Dautkhanov commented on SPARK-22947: --- Oracle has a similar AS OF syntax 

[jira] [Commented] (SPARK-24778) DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed

2018-07-10 Thread Vinitha Reddy Gankidi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539253#comment-16539253 ] Vinitha Reddy Gankidi commented on SPARK-24778: --- Workaround could be something like:

[jira] [Updated] (SPARK-24778) DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed

2018-07-10 Thread Vinitha Reddy Gankidi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinitha Reddy Gankidi updated SPARK-24778: -- Description: {{DateTimeUtils.getTimeZone}} calls java's

[jira] [Assigned] (SPARK-24767) Propagate MDC to spark-submit thread in InProcessAppHandle

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24767: Assignee: (was: Apache Spark) > Propagate MDC to spark-submit thread in

[jira] [Assigned] (SPARK-24767) Propagate MDC to spark-submit thread in InProcessAppHandle

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24767: Assignee: Apache Spark > Propagate MDC to spark-submit thread in InProcessAppHandle >

[jira] [Commented] (SPARK-24767) Propagate MDC to spark-submit thread in InProcessAppHandle

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539128#comment-16539128 ] Apache Spark commented on SPARK-24767: -- User 'yifeih' has created a pull request for this issue:

[jira] [Updated] (SPARK-24778) DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed

2018-07-10 Thread Vinitha Reddy Gankidi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinitha Reddy Gankidi updated SPARK-24778: -- Description: {{DateTimeUtils.getTimeZone}} calls java's

[jira] [Updated] (SPARK-24778) DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed

2018-07-10 Thread Vinitha Reddy Gankidi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinitha Reddy Gankidi updated SPARK-24778: -- Description: {{DateTimeUtils.getTimeZone}} calls java's

[jira] [Created] (SPARK-24778) DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed

2018-07-10 Thread Vinitha Reddy Gankidi (JIRA)
Vinitha Reddy Gankidi created SPARK-24778: - Summary: DateTimeUtils.getTimeZone method returns GMT time if timezone cannot be parsed Key: SPARK-24778 URL: https://issues.apache.org/jira/browse/SPARK-24778

[jira] [Resolved] (SPARK-24765) Add custom Kubernetes scheduler config parameter to spark-submit

2018-07-10 Thread Nihal Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal Harish resolved SPARK-24765. -- Resolution: Information Provided Issue is addressed by 

[jira] [Resolved] (SPARK-24662) Structured Streaming should support LIMIT

2018-07-10 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-24662. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21662

[jira] [Assigned] (SPARK-24662) Structured Streaming should support LIMIT

2018-07-10 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das reassigned SPARK-24662: - Assignee: Mukul Murthy > Structured Streaming should support LIMIT >

[jira] [Comment Edited] (SPARK-24753) bad backslah parsing in SQL statements

2018-07-10 Thread mathieu longtin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539023#comment-16539023 ] mathieu longtin edited comment on SPARK-24753 at 7/10/18 5:59 PM: --

[jira] [Updated] (SPARK-24776) AVRO unit test: use SQLTestUtils and Replace deprecated methods

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24776: --- Summary: AVRO unit test: use SQLTestUtils and Replace deprecated methods (was: Improve

[jira] [Commented] (SPARK-24753) bad backslah parsing in SQL statements

2018-07-10 Thread mathieu longtin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539023#comment-16539023 ] mathieu longtin commented on SPARK-24753: - Thanks for the response. Yes, it does work with

[jira] [Created] (SPARK-24776) Improve AVRO unit test: use

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24776: -- Summary: Improve AVRO unit test: use Key: SPARK-24776 URL: https://issues.apache.org/jira/browse/SPARK-24776 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-24777) Refactor AVRO read/write benchmark

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24777: -- Summary: Refactor AVRO read/write benchmark Key: SPARK-24777 URL: https://issues.apache.org/jira/browse/SPARK-24777 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-24775) support reading AVRO logical types - Duration

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24775: -- Summary: support reading AVRO logical types - Duration Key: SPARK-24775 URL: https://issues.apache.org/jira/browse/SPARK-24775 Project: Spark Issue

[jira] [Created] (SPARK-24774) support reading AVRO logical types - Time with different precisions

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24774: -- Summary: support reading AVRO logical types - Time with different precisions Key: SPARK-24774 URL: https://issues.apache.org/jira/browse/SPARK-24774 Project:

[jira] [Created] (SPARK-24773) support reading AVRO logical types - Timestamp with different precisions

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24773: -- Summary: support reading AVRO logical types - Timestamp with different precisions Key: SPARK-24773 URL: https://issues.apache.org/jira/browse/SPARK-24773

[jira] [Created] (SPARK-24772) support reading AVRO logical types - Decimal

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24772: -- Summary: support reading AVRO logical types - Decimal Key: SPARK-24772 URL: https://issues.apache.org/jira/browse/SPARK-24772 Project: Spark Issue Type:

[jira] [Created] (SPARK-24771) Upgrade AVRO version from 1.7.7 to 1.8

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24771: -- Summary: Upgrade AVRO version from 1.7.7 to 1.8 Key: SPARK-24771 URL: https://issues.apache.org/jira/browse/SPARK-24771 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24770) Supporting to convert a column into binary of AVRO format

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24770: --- Summary: Supporting to convert a column into binary of AVRO format (was: Supporting to

[jira] [Updated] (SPARK-24769) Support for parsing AVRO binary column

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24769: --- Summary: Support for parsing AVRO binary column (was: Support for parsing AVRO string

[jira] [Assigned] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24768: Assignee: Apache Spark > Have a built-in AVRO data source implementation >

[jira] [Assigned] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24768: Assignee: (was: Apache Spark) > Have a built-in AVRO data source implementation >

[jira] [Commented] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539015#comment-16539015 ] Apache Spark commented on SPARK-24768: -- User 'gengliangwang' has created a pull request for this

[jira] [Created] (SPARK-24770) Supporting to convert a column into binary of avro format

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24770: -- Summary: Supporting to convert a column into binary of avro format Key: SPARK-24770 URL: https://issues.apache.org/jira/browse/SPARK-24770 Project: Spark

[jira] [Commented] (SPARK-24765) Add custom Kubernetes scheduler config parameter to spark-submit

2018-07-10 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538999#comment-16538999 ] Yinan Li commented on SPARK-24765: -- Check out https://issues.apache.org/jira/browse/SPARK-24434 and 

[jira] [Created] (SPARK-24769) Support for parsing AVRO string column

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24769: -- Summary: Support for parsing AVRO string column Key: SPARK-24769 URL: https://issues.apache.org/jira/browse/SPARK-24769 Project: Spark Issue Type:

[jira] [Updated] (SPARK-24765) Add custom Kubernetes scheduler config parameter to spark-submit

2018-07-10 Thread Nihal Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal Harish updated SPARK-24765: - Description: spark-submit currently does not accept any config parameter that can enable the

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Attachment: Built-in AVRO Data Source In Spark 2.4.pdf > Have a built-in AVRO data source

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Description: Apache Avro (https://avro.apache.org) is a popular data serialization format.

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Attachment: (was: Design doc-Spark Avro.pdf) > Have a built-in AVRO data source

[jira] [Updated] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-24768: --- Attachment: Design doc-Spark Avro.pdf > Have a built-in AVRO data source implementation >

[jira] [Created] (SPARK-24768) Have a built-in AVRO data source implementation

2018-07-10 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-24768: -- Summary: Have a built-in AVRO data source implementation Key: SPARK-24768 URL: https://issues.apache.org/jira/browse/SPARK-24768 Project: Spark Issue

[jira] [Comment Edited] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-10 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538924#comment-16538924 ] Bryan Cutler edited comment on SPARK-24760 at 7/10/18 4:48 PM: --- Pandas

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-10 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538924#comment-16538924 ] Bryan Cutler commented on SPARK-24760: -- Pandas uses NaNs as a special value that it interprets as a

[jira] [Created] (SPARK-24767) Propagate MDC to spark-submit thread in InProcessAppHandle

2018-07-10 Thread Yifei Huang (JIRA)
Yifei Huang created SPARK-24767: --- Summary: Propagate MDC to spark-submit thread in InProcessAppHandle Key: SPARK-24767 URL: https://issues.apache.org/jira/browse/SPARK-24767 Project: Spark

[jira] [Resolved] (SPARK-22503) Using current processing time to generate windows in streaming processing

2018-07-10 Thread wangsan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangsan resolved SPARK-22503. - Resolution: Not A Problem > Using current processing time to generate windows in streaming processing >

[jira] [Updated] (SPARK-24766) CreateHiveTableAsSelectCommand and InsertIntoHiveDirCommand won't generate decimal column stats

2018-07-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24766: Description: How to reproduce: {code:java} INSERT OVERWRITE LOCAL DIRECTORY

[jira] [Updated] (SPARK-24766) CreateHiveTableAsSelectCommand and InsertIntoHiveDirCommand won't generate decimal column stats

2018-07-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24766: Description: How to reproduce: {code:java} INSERT OVERWRITE LOCAL DIRECTORY

[jira] [Created] (SPARK-24766) CreateHiveTableAsSelectCommand and InsertIntoHiveDirCommand won't generate decimal column stats

2018-07-10 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-24766: --- Summary: CreateHiveTableAsSelectCommand and InsertIntoHiveDirCommand won't generate decimal column stats Key: SPARK-24766 URL: https://issues.apache.org/jira/browse/SPARK-24766

[jira] [Commented] (SPARK-24687) When NoClassDefError thrown during task serialization will cause job hang

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538576#comment-16538576 ] Apache Spark commented on SPARK-24687: -- User 'caneGuy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24687) When NoClassDefError thrown during task serialization will cause job hang

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24687: Assignee: (was: Apache Spark) > When NoClassDefError thrown during task

[jira] [Assigned] (SPARK-24687) When NoClassDefError thrown during task serialization will cause job hang

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24687: Assignee: Apache Spark > When NoClassDefError thrown during task serialization will

[jira] [Updated] (SPARK-24677) Avoid NoSuchElementException from MedianHeap

2018-07-10 Thread dzcxzl (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dzcxzl updated SPARK-24677: --- Summary: Avoid NoSuchElementException from MedianHeap (was: MedianHeap is empty when speculation is

[jira] [Assigned] (SPARK-24678) We should use 'PROCESS_LOCAL' first for Spark-Streaming

2018-07-10 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao reassigned SPARK-24678: --- Assignee: sharkd tu > We should use 'PROCESS_LOCAL' first for Spark-Streaming >

[jira] [Resolved] (SPARK-24678) We should use 'PROCESS_LOCAL' first for Spark-Streaming

2018-07-10 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao resolved SPARK-24678. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21658

[jira] [Assigned] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24268: Assignee: Marco Gaido (was: Apache Spark) > DataType in error messages are not coherent

[jira] [Assigned] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24268: Assignee: Apache Spark (was: Marco Gaido) > DataType in error messages are not coherent

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24268: Description: In SPARK-22893 there was an attempt to unify the way dataTypes are reported in

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24268: Description: In SPARK-22893 there was an attempt to unify the way dataTypes are reported in

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-24268: Description: In SPARK-22893 there was an attempt to unify the way dataTypes are reported in

[jira] [Commented] (SPARK-24745) Map function does not keep rdd name

2018-07-10 Thread Igor Pergenitsa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538281#comment-16538281 ] Igor Pergenitsa commented on SPARK-24745: - {quote}you can still set the name also on the RDD you

[jira] [Assigned] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24268: Assignee: Apache Spark (was: Marco Gaido) > DataType in error messages are not coherent

[jira] [Assigned] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24268: Assignee: Marco Gaido (was: Apache Spark) > DataType in error messages are not coherent

[jira] [Commented] (SPARK-24745) Map function does not keep rdd name

2018-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538260#comment-16538260 ] Marco Gaido commented on SPARK-24745: - An RDD already has a unique ID. I think the name is just

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-10 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538234#comment-16538234 ] Mortada Mehyar commented on SPARK-24760: I still think something is not right here. It is true

[jira] [Resolved] (SPARK-24706) Support ByteType and ShortType pushdown to parquet

2018-07-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-24706. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21682

[jira] [Commented] (SPARK-15613) Incorrect days to millis conversion

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538191#comment-16538191 ] Hyukjin Kwon commented on SPARK-15613: -- This seems merged into 1.6.3 and reverted in 1.6.2. I

[jira] [Updated] (SPARK-15613) Incorrect days to millis conversion

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-15613: - Fix Version/s: (was: 1.6.2) 1.6.3 > Incorrect days to millis conversion

[jira] [Commented] (SPARK-24745) Map function does not keep rdd name

2018-07-10 Thread Igor Pergenitsa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538149#comment-16538149 ] Igor Pergenitsa commented on SPARK-24745: - [~mgaido], yes, I agree, but on the other hand the name

[jira] [Commented] (SPARK-24718) Timestamp support pushdown to parquet data source

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538118#comment-16538118 ] Apache Spark commented on SPARK-24718: -- User 'wangyum' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24718) Timestamp support pushdown to parquet data source

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24718: Assignee: Apache Spark > Timestamp support pushdown to parquet data source >

[jira] [Assigned] (SPARK-24718) Timestamp support pushdown to parquet data source

2018-07-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24718: Assignee: (was: Apache Spark) > Timestamp support pushdown to parquet data source >

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24268: - Fix Version/s: (was: 2.4.0) > DataType in error messages are not coherent >

[jira] [Updated] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24268: - Priority: Minor (was: Trivial) > DataType in error messages are not coherent >

[jira] [Reopened] (SPARK-24268) DataType in error messages are not coherent

2018-07-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-24268: -- > DataType in error messages are not coherent > --- > >