[jira] [Commented] (SPARK-29358) Make unionByName optionally fill missing columns with nulls

2020-04-02 Thread Michael Armbrust (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073951#comment-17073951 ] Michael Armbrust commented on SPARK-29358: -- Sure, but it is very easy to make this not a

[jira] [Commented] (SPARK-29358) Make unionByName optionally fill missing columns with nulls

2020-03-31 Thread Michael Armbrust (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071968#comment-17071968 ] Michael Armbrust commented on SPARK-29358: -- I think we should reconsider closing this as won't

[jira] [Commented] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-12 Thread Michael Armbrust (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058287#comment-17058287 ] Michael Armbrust commented on SPARK-31136: -- How hard would it be to add support for "LOAD DATA

[jira] [Commented] (SPARK-31136) Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

2020-03-12 Thread Michael Armbrust (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058125#comment-17058125 ] Michael Armbrust commented on SPARK-31136: -- What was the default before, hive sequence files?

[jira] [Commented] (SPARK-27911) PySpark Packages should automatically choose correct scala version

2019-07-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886479#comment-16886479 ] Michael Armbrust commented on SPARK-27911: -- You are right, there is nothing pyspark specific

[jira] [Updated] (SPARK-27911) PySpark Packages should automatically choose correct scala version

2019-05-31 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-27911: - Description: Today, users of pyspark (and Scala) need to manually specify the version

[jira] [Created] (SPARK-27911) PySpark Packages should automatically choose correct scala version

2019-05-31 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-27911: Summary: PySpark Packages should automatically choose correct scala version Key: SPARK-27911 URL: https://issues.apache.org/jira/browse/SPARK-27911 Project:

[jira] [Commented] (SPARK-27676) InMemoryFileIndex should hard-fail on missing files instead of logging and continuing

2019-05-10 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837552#comment-16837552 ] Michael Armbrust commented on SPARK-27676: -- I tend to agree that all cases where we chose to

[jira] [Assigned] (SPARK-27453) DataFrameWriter.partitionBy is Silently Dropped by DSV1

2019-04-12 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-27453: Assignee: Liwen Sun > DataFrameWriter.partitionBy is Silently Dropped by DSV1 >

[jira] [Created] (SPARK-27453) DataFrameWriter.partitionBy is Silently Dropped by DSV1

2019-04-12 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-27453: Summary: DataFrameWriter.partitionBy is Silently Dropped by DSV1 Key: SPARK-27453 URL: https://issues.apache.org/jira/browse/SPARK-27453 Project: Spark

[jira] [Commented] (SPARK-23831) Add org.apache.derby to IsolatedClientLoader

2018-11-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685629#comment-16685629 ] Michael Armbrust commented on SPARK-23831: -- Why was it reverted? > Add org.apache.derby to

[jira] [Commented] (SPARK-6459) Warn when Column API is constructing trivially true equality

2018-07-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553330#comment-16553330 ] Michael Armbrust commented on SPARK-6459: - [~tenstriker] this will never happen from a SQL query.

[jira] [Resolved] (SPARK-5517) Add input types for Java UDFs

2018-05-09 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5517. - Resolution: Unresolved > Add input types for Java UDFs > - >

[jira] [Commented] (SPARK-18165) Kinesis support in Structured Streaming

2018-05-07 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466410#comment-16466410 ] Michael Armbrust commented on SPARK-18165: -- This is great! I'm glad there are more connectors

[jira] [Updated] (SPARK-18165) Kinesis support in Structured Streaming

2018-05-07 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18165: - Component/s: (was: DStreams) Structured Streaming > Kinesis support

[jira] [Commented] (SPARK-23337) withWatermark raises an exception on struct objects

2018-04-10 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433222#comment-16433222 ] Michael Armbrust commented on SPARK-23337: -- The checkpoint will only grow if you are doing an

[jira] [Commented] (SPARK-23835) When Dataset.as converts column from nullable to non-nullable type, null Doubles are converted silently to -1

2018-04-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423046#comment-16423046 ] Michael Armbrust commented on SPARK-23835: -- I believe the correct semantics are to throw a

[jira] [Commented] (SPARK-23835) When Dataset.as converts column from nullable to non-nullable type, null Doubles are converted silently to -1

2018-03-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420809#comment-16420809 ] Michael Armbrust commented on SPARK-23835: -- /cc [~cloud_fan] > When Dataset.as converts column

[jira] [Commented] (SPARK-23325) DataSourceV2 readers should always produce InternalRow.

2018-03-08 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391861#comment-16391861 ] Michael Armbrust commented on SPARK-23325: -- It does seem like it would be that hard to stabilize

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373611#comment-16373611 ] Michael Armbrust commented on SPARK-18057: -- My only concern is that it is stable and backwards

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-21 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372138#comment-16372138 ] Michael Armbrust commented on SPARK-18057: -- We generally tend towards "don't break things that

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370780#comment-16370780 ] Michael Armbrust commented on SPARK-18057: -- +1 to upgrading and it would also be great to add

[jira] [Commented] (SPARK-23337) withWatermark raises an exception on struct objects

2018-02-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367933#comment-16367933 ] Michael Armbrust commented on SPARK-23337: -- This is essentially the same issue as SPARK-18084.

[jira] [Updated] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-02-15 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-23173: - Labels: release-notes (was: ) > from_json can produce nulls for fields which are marked

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2018-01-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331141#comment-16331141 ] Michael Armbrust commented on SPARK-20928: -- There is more work to do so I might leave the

[jira] [Commented] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-12 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323883#comment-16323883 ] Michael Armbrust commented on SPARK-23050: -- [~zsxwing] is correct. While it is possible for

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-01-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310423#comment-16310423 ] Michael Armbrust commented on SPARK-22947: -- +1 to [~rxin]'s question. This seems like it might

[jira] [Commented] (SPARK-22929) Short name for "kafka" doesn't work in pyspark with packages

2017-12-31 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307144#comment-16307144 ] Michael Armbrust commented on SPARK-22929: -- Haha, thanks [~sowen], you are right. Kafka is a

[jira] [Created] (SPARK-22929) Short name for "kafka" doesn't work in pyspark with packages

2017-12-30 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-22929: Summary: Short name for "kafka" doesn't work in pyspark with packages Key: SPARK-22929 URL: https://issues.apache.org/jira/browse/SPARK-22929 Project: Spark

[jira] [Created] (SPARK-22862) Docs on lazy elimination of columns missing from an encoder.

2017-12-21 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-22862: Summary: Docs on lazy elimination of columns missing from an encoder. Key: SPARK-22862 URL: https://issues.apache.org/jira/browse/SPARK-22862 Project: Spark

[jira] [Commented] (SPARK-22739) Additional Expression Support for Objects

2017-12-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299232#comment-16299232 ] Michael Armbrust commented on SPARK-22739: -- Sounds good to me. I'm happy to provide pointers on

[jira] [Commented] (SPARK-22739) Additional Expression Support for Objects

2017-12-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299167#comment-16299167 ] Michael Armbrust commented on SPARK-22739: -- Any progress on this? Branch cut is January 1st,

[jira] [Updated] (SPARK-22739) Additional Expression Support for Objects

2017-12-20 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-22739: - Target Version/s: 2.3.0 > Additional Expression Support for Objects >

[jira] [Commented] (SPARK-22824) Spark Structured Streaming Source trait breaking change

2017-12-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295676#comment-16295676 ] Michael Armbrust commented on SPARK-22824: -- This is technically an internal API (as is all of

[jira] [Assigned] (SPARK-22824) Spark Structured Streaming Source trait breaking change

2017-12-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-22824: Assignee: Jose Torres > Spark Structured Streaming Source trait breaking change >

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-12-12 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288391#comment-16288391 ] Michael Armbrust commented on SPARK-20928: -- An update on this. We've started to create subtasks

[jira] [Comment Edited] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-08-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148227#comment-16148227 ] Michael Armbrust edited comment on SPARK-20928 at 8/30/17 11:52 PM:

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-08-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148227#comment-16148227 ] Michael Armbrust commented on SPARK-20928: -- Hey everyone, thanks for your interest in this

[jira] [Updated] (SPARK-20441) Within the same streaming query, one StreamingRelation should only be transformed to one StreamingExecutionRelation

2017-07-07 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20441: - Affects Version/s: (was: 2.2.0) > Within the same streaming query, one

[jira] [Updated] (SPARK-20441) Within the same streaming query, one StreamingRelation should only be transformed to one StreamingExecutionRelation

2017-07-07 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20441: - Fix Version/s: 2.2.0 > Within the same streaming query, one StreamingRelation should

[jira] [Updated] (SPARK-21267) Improvements to the Structured Streaming programming guide

2017-07-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-21267: - Target Version/s: (was: 2.2.0) > Improvements to the Structured Streaming programming

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-06-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070525#comment-16070525 ] Michael Armbrust commented on SPARK-18057: -- We should upgrade. Now that Kafka has a good

[jira] [Commented] (SPARK-15533) Deprecate Dataset.explode

2017-06-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070308#comment-16070308 ] Michael Armbrust commented on SPARK-15533: -- Just include the other columns too

[jira] [Updated] (SPARK-21253) Cannot fetch big blocks to disk

2017-06-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-21253: - Target Version/s: 2.2.0 > Cannot fetch big blocks to disk >

[jira] [Assigned] (SPARK-21253) Cannot fetch big blocks to disk

2017-06-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-21253: Assignee: Shixiong Zhu > Cannot fetch big blocks to disk >

[jira] [Commented] (SPARK-21110) Structs should be usable in inequality filters

2017-06-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16061398#comment-16061398 ] Michael Armbrust commented on SPARK-21110: -- It seems if you can call {{min}} and {{max}} on

[jira] [Updated] (SPARK-21110) Structs should be usable in inequality filters

2017-06-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-21110: - Target Version/s: 2.3.0 > Structs should be usable in inequality filters >

[jira] [Updated] (SPARK-21133) HighlyCompressedMapStatus#writeExternal throws NPE

2017-06-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-21133: - Target Version/s: 2.2.0 Priority: Blocker (was: Major) Description:

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16054596#comment-16054596 ] Michael Armbrust commented on SPARK-20928: -- Hi Cody, I do plan to flesh this out with the other

[jira] [Commented] (SPARK-20980) Rename the option `wholeFile` to `multiLine` for JSON and CSV

2017-06-05 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037416#comment-16037416 ] Michael Armbrust commented on SPARK-20980: -- I already cut RC4, I think we may just need to

[jira] [Closed] (SPARK-20737) Mechanism for cleanup hooks, for structured-streaming sinks on executor shutdown.

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-20737. Resolution: Won't Fix > Mechanism for cleanup hooks, for structured-streaming sinks on

[jira] [Updated] (SPARK-20065) Empty output files created for aggregation query in append mode

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20065: - Target Version/s: 2.3.0 > Empty output files created for aggregation query in append

[jira] [Updated] (SPARK-19903) Watermark metadata is lost when using resolved attributes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: - Target Version/s: 2.3.0 > Watermark metadata is lost when using resolved attributes >

[jira] [Updated] (SPARK-19903) Watermark metadata is lost when using resolved attributes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: - Component/s: (was: PySpark) > Watermark metadata is lost when using resolved

[jira] [Updated] (SPARK-19903) Watermark metadata is lost when using resolved attributes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: - Summary: Watermark metadata is lost when using resolved attributes (was: PySpark Kafka

[jira] [Updated] (SPARK-19903) PySpark Kafka streaming query output append mode not possible

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19903: - Description: PySpark example reads a Kafka stream. There is watermarking set when

[jira] [Commented] (SPARK-20002) Add support for unions between streaming and batch datasets

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035441#comment-16035441 ] Michael Armbrust commented on SPARK-20002: -- I'm not sure that we will ever support this. The

[jira] [Resolved] (SPARK-20147) Cloning SessionState does not clone streaming query listeners

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-20147. -- Resolution: Fixed Assignee: Kunal Khamar Fix Version/s: 2.2.0

[jira] [Updated] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20928: - Description: Given the current Source API, the minimum possible latency for any record

[jira] [Updated] (SPARK-20734) Structured Streaming spark.sql.streaming.schemaInference not handling schema changes

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20734: - Issue Type: New Feature (was: Bug) > Structured Streaming

[jira] [Updated] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20958: - Labels: release-notes (was: ) > Roll back parquet-mr 1.8.2 to parquet-1.8.1 >

[jira] [Resolved] (SPARK-20958) Roll back parquet-mr 1.8.2 to parquet-1.8.1

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-20958. -- Resolution: Won't Fix Thanks everyone. Sounds like we'll just provide directions in

[jira] [Commented] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-02 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035012#comment-16035012 ] Michael Armbrust commented on SPARK-19104: -- I'm about to cut RC3 of 2.2 and there is no pull

[jira] [Updated] (SPARK-15693) Write schema definition out for file-based data sources to avoid schema inference

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15693: - Target Version/s: 2.3.0 (was: 2.2.0) > Write schema definition out for file-based data

[jira] [Updated] (SPARK-15380) Generate code that stores a float/double value in each column from ColumnarBatch when DataFrame.cache() is used

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15380: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate code that stores a float/double value

[jira] [Updated] (SPARK-19084) conditional function: field

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19084: - Target Version/s: 2.3.0 (was: 2.2.0) > conditional function: field >

[jira] [Updated] (SPARK-15691) Refactor and improve Hive support

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15691: - Target Version/s: 2.3.0 (was: 2.2.0) > Refactor and improve Hive support >

[jira] [Updated] (SPARK-14878) Support Trim characters in the string trim function

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-14878: - Target Version/s: 2.3.0 (was: 2.2.0) > Support Trim characters in the string trim

[jira] [Updated] (SPARK-16496) Add wholetext as option for reading text in SQL.

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16496: - Target Version/s: 2.3.0 (was: 2.2.0) > Add wholetext as option for reading text in SQL.

[jira] [Updated] (SPARK-19241) remove hive generated table properties if they are not useful in Spark

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19241: - Target Version/s: 2.3.0 (was: 2.2.0) > remove hive generated table properties if they

[jira] [Updated] (SPARK-16317) Add file filtering interface for FileFormat

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16317: - Target Version/s: 2.3.0 (was: 2.2.0) > Add file filtering interface for FileFormat >

[jira] [Updated] (SPARK-19027) estimate size of object buffer for object hash aggregate

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19027: - Target Version/s: 2.3.0 (was: 2.2.0) > estimate size of object buffer for object hash

[jira] [Updated] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19104: - Target Version/s: 2.3.0 (was: 2.2.0) > CompileException with Map and Case Class in

[jira] [Updated] (SPARK-18245) Improving support for bucketed table

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18245: - Target Version/s: 2.3.0 (was: 2.2.0) > Improving support for bucketed table >

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-14098: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate Java code to build CachedColumnarBatch

[jira] [Updated] (SPARK-19014) support complex aggregate buffer in HashAggregateExec

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19014: - Target Version/s: 2.3.0 (was: 2.2.0) > support complex aggregate buffer in

[jira] [Updated] (SPARK-16011) SQL metrics include duplicated attempts

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16011: - Target Version/s: 2.3.0 (was: 2.2.0) > SQL metrics include duplicated attempts >

[jira] [Updated] (SPARK-18388) Running aggregation on many columns throws SOE

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18388: - Target Version/s: 2.3.0 (was: 2.2.0) > Running aggregation on many columns throws SOE >

[jira] [Updated] (SPARK-19989) Flaky Test: org.apache.spark.sql.kafka010.KafkaSourceStressSuite

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19989: - Target Version/s: 2.3.0 (was: 2.2.0) > Flaky Test:

[jira] [Updated] (SPARK-17915) Prepare ColumnVector implementation for UnsafeData

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17915: - Target Version/s: 2.3.0 (was: 2.2.0) > Prepare ColumnVector implementation for

[jira] [Updated] (SPARK-18134) SQL: MapType in Group BY and Joins not working

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18134: - Target Version/s: 2.3.0 (was: 2.2.0) > SQL: MapType in Group BY and Joins not working >

[jira] [Updated] (SPARK-18455) General support for correlated subquery processing

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18455: - Target Version/s: 2.3.0 (was: 2.2.0) > General support for correlated subquery

[jira] [Updated] (SPARK-15690) Fast single-node (single-process) in-memory shuffle

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15690: - Target Version/s: 2.3.0 (was: 2.2.0) > Fast single-node (single-process) in-memory

[jira] [Updated] (SPARK-15689) Data source API v2

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15689: - Target Version/s: 2.3.0 (was: 2.2.0) > Data source API v2 > -- > >

[jira] [Updated] (SPARK-13184) Support minPartitions parameter for JSON and CSV datasources as options

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13184: - Target Version/s: 2.3.0 (was: 2.2.0) > Support minPartitions parameter for JSON and CSV

[jira] [Updated] (SPARK-13682) Finalize the public API for FileFormat

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-13682: - Target Version/s: 2.3.0 (was: 2.2.0) > Finalize the public API for FileFormat >

[jira] [Updated] (SPARK-9221) Support IntervalType in Range Frame

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9221: Target Version/s: 2.3.0 (was: 2.2.0) > Support IntervalType in Range Frame >

[jira] [Updated] (SPARK-20319) Already quoted identifiers are getting wrapped with additional quotes

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20319: - Target Version/s: 2.3.0 (was: 2.2.0) > Already quoted identifiers are getting wrapped

[jira] [Updated] (SPARK-9576) DataFrame API improvement umbrella ticket (in Spark 2.x)

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-9576: Target Version/s: 2.3.0 (was: 2.2.0) > DataFrame API improvement umbrella ticket (in Spark

[jira] [Updated] (SPARK-18394) Executing the same query twice in a row results in CodeGenerator cache misses

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18394: - Target Version/s: 2.3.0 (was: 2.2.0) > Executing the same query twice in a row results

[jira] [Updated] (SPARK-18891) Support for specific collection types

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18891: - Target Version/s: 2.3.0 (was: 2.2.0) > Support for specific collection types >

[jira] [Updated] (SPARK-14543) SQL/Hive insertInto has unexpected results

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-14543: - Target Version/s: 2.3.0 (was: 2.2.0) > SQL/Hive insertInto has unexpected results >

[jira] [Updated] (SPARK-17556) Executor side broadcast for broadcast joins

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17556: - Target Version/s: 2.3.0 (was: 2.2.0) > Executor side broadcast for broadcast joins >

[jira] [Updated] (SPARK-15694) Implement ScriptTransformation in sql/core

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15694: - Target Version/s: 2.3.0 (was: 2.2.0) > Implement ScriptTransformation in sql/core >

[jira] [Updated] (SPARK-16026) Cost-based Optimizer framework

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16026: - Target Version/s: 2.3.0 (was: 2.2.0) > Cost-based Optimizer framework >

[jira] [Updated] (SPARK-18543) SaveAsTable(CTAS) using overwrite could change table definition

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18543: - Target Version/s: 2.3.0 (was: 2.2.0) > SaveAsTable(CTAS) using overwrite could change

[jira] [Updated] (SPARK-18950) Report conflicting fields when merging two StructTypes.

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18950: - Target Version/s: 2.3.0 (was: 2.2.0) > Report conflicting fields when merging two

[jira] [Updated] (SPARK-15117) Generate code that get a value in each compressed column from CachedBatch when DataFrame.cache() is called

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15117: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate code that get a value in each

[jira] [Updated] (SPARK-17626) TPC-DS performance improvements using star-schema heuristics

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17626: - Target Version/s: 2.3.0 (was: 2.2.0) > TPC-DS performance improvements using

[jira] [Updated] (SPARK-15867) Use bucket files for TABLESAMPLE BUCKET

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15867: - Target Version/s: 2.3.0 (was: 2.2.0) > Use bucket files for TABLESAMPLE BUCKET >
