[jira] [Updated] (SPARK-16275) Implement all the Hive fallback functions

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16275: - Target Version/s: 2.3.0 (was: 2.2.0) > Implement all the Hive fallback functions >

[jira] [Updated] (SPARK-12978) Skip unnecessary final group-by when input data already clustered with group-by keys

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-12978: - Target Version/s: 2.3.0 (was: 2.2.0) > Skip unnecessary final group-by when input data

[jira] [Updated] (SPARK-16412) Generate Java code that gets an array in each column of CachedBatch when DataFrame.cache() is called

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16412: - Target Version/s: 2.3.0 (was: 2.2.0) > Generate Java code that gets an array in each

[jira] [Updated] (SPARK-4502) Spark SQL reads unneccesary nested fields from Parquet

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4502: Target Version/s: 2.3.0 (was: 2.2.0) > Spark SQL reads unneccesary nested fields from

[jira] [Updated] (SPARK-16217) Support SELECT INTO statement

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16217: - Target Version/s: 2.3.0 (was: 2.2.0) > Support SELECT INTO statement >

[jira] [Updated] (SPARK-18084) write.partitionBy() does not recognize nested columns that select() can access

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18084: - Target Version/s: 2.3.0 (was: 2.2.0) > write.partitionBy() does not recognize nested

[jira] [Updated] (SPARK-17924) Consolidate streaming and batch write path

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17924: - Target Version/s: 2.3.0 (was: 2.2.0) > Consolidate streaming and batch write path >

[jira] [Updated] (SPARK-19150) completely support using hive as data source to create tables

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19150: - Target Version/s: 2.3.0 (was: 2.2.0) > completely support using hive as data source to

[jira] [Updated] (SPARK-16452) basic INFORMATION_SCHEMA support

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16452: - Target Version/s: 2.3.0 (was: 2.2.0) > basic INFORMATION_SCHEMA support >

[jira] [Updated] (SPARK-16483) Unifying struct fields and columns

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16483: - Target Version/s: 2.3.0 (was: 2.2.0) > Unifying struct fields and columns >

[jira] [Updated] (SPARK-16390) Dataset API improvements

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16390: - Target Version/s: 2.3.0 (was: 2.2.0) > Dataset API improvements >

[jira] [Updated] (SPARK-16196) Optimize in-memory scan performance using ColumnarBatches

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16196: - Target Version/s: 2.3.0 (was: 2.2.0) > Optimize in-memory scan performance using

[jira] [Updated] (SPARK-7768) Make user-defined type (UDT) API public

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7768: Target Version/s: 2.3.0 (was: 2.2.0) > Make user-defined type (UDT) API public >

[jira] [Updated] (SPARK-17203) data source options should always be case insensitive

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17203: - Target Version/s: 2.3.0 (was: 2.2.0) > data source options should always be case

[jira] [Updated] (SPARK-19242) SHOW CREATE TABLE should generate new syntax to create hive table

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19242: - Target Version/s: 2.3.0 (was: 2.2.0) > SHOW CREATE TABLE should generate new syntax to

[jira] [Updated] (SPARK-17528) MutableProjection should not cache content from the input row

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17528: - Target Version/s: 2.3.0 (was: 2.2.0) > MutableProjection should not cache content from

[jira] [Updated] (SPARK-16323) Avoid unnecessary cast when doing integral divide

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16323: - Target Version/s: 2.3.0 (was: 2.2.0) > Avoid unnecessary cast when doing integral

[jira] [Resolved] (SPARK-20854) extend hint syntax to support any expression, not just identifiers or strings

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-20854. -- Resolution: Fixed https://github.com/apache/spark/pull/18086 > extend hint syntax to

[jira] [Updated] (SPARK-15420) Repartition and sort before Parquet writes

2017-06-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-15420: - Target Version/s: 2.3.0 (was: 2.2.0) > Repartition and sort before Parquet writes >

[jira] [Updated] (SPARK-20940) AccumulatorV2 should not throw IllegalAccessError

2017-05-31 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20940: - Target Version/s: 2.2.0 > AccumulatorV2 should not throw IllegalAccessError >

[jira] [Created] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-05-30 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-20928: Summary: Continuous Processing Mode for Structured Streaming Key: SPARK-20928 URL: https://issues.apache.org/jira/browse/SPARK-20928 Project: Spark

[jira] [Updated] (SPARK-20462) Spark-Kinesis Direct Connector

2017-05-26 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20462: - Component/s: (was: Input/Output) DStreams > Spark-Kinesis Direct

[jira] [Commented] (SPARK-20843) Cannot gracefully kill drivers which take longer than 10 seconds to die

2017-05-26 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026634#comment-16026634 ] Michael Armbrust commented on SPARK-20843: -- I don't have much context here /cc [~zsxwing] and

[jira] [Commented] (SPARK-20897) cached self-join should not fail

2017-05-26 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026588#comment-16026588 ] Michael Armbrust commented on SPARK-20897: -- Is this a regression? If so, can you please make

[jira] [Updated] (SPARK-20865) caching dataset throws "Queries with streaming sources must be executed with writeStream.start()"

2017-05-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20865: - Description: {code} SparkSession .builder .master("local[*]")

[jira] [Created] (SPARK-20844) Remove experimental from API and docs

2017-05-22 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-20844: Summary: Remove experimental from API and docs Key: SPARK-20844 URL: https://issues.apache.org/jira/browse/SPARK-20844 Project: Spark Issue Type:

[jira] [Updated] (SPARK-20599) ConsoleSink should work with write (batch)

2017-05-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20599: - Summary: ConsoleSink should work with write (batch) (was: KafkaSourceProvider should

[jira] [Updated] (SPARK-20666) Flaky test - SparkListenerBus randomly failing java.lang.IllegalAccessError

2017-05-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20666: - Target Version/s: 2.2.0 > Flaky test - SparkListenerBus randomly failing

[jira] [Commented] (SPARK-20376) Make StateStoreProvider plugable

2017-05-09 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16003729#comment-16003729 ] Michael Armbrust commented on SPARK-20376: -- /cc [~tdas] > Make StateStoreProvider plugable >

[jira] [Updated] (SPARK-17939) Spark-SQL Nullability: Optimizations vs. Enforcement Clarification

2017-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17939: - Target Version/s: 2.3.0 (was: 2.2.0) > Spark-SQL Nullability: Optimizations vs.

[jira] [Updated] (SPARK-20569) RuntimeReplaceable functions accept invalid third parameter

2017-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20569: - Affects Version/s: 2.2.0 > RuntimeReplaceable functions accept invalid third parameter >

[jira] [Commented] (SPARK-20569) RuntimeReplaceable functions accept invalid third parameter

2017-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995423#comment-15995423 ] Michael Armbrust commented on SPARK-20569: -- [~rxin] this does seem like a bug. >

[jira] [Updated] (SPARK-20569) RuntimeReplaceable functions accept invalid third parameter

2017-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20569: - Summary: RuntimeReplaceable functions accept invalid third parameter (was: In

[jira] [Updated] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19104: - Affects Version/s: 2.2.0 Target Version/s: 2.2.0 > CompileException with Map and

[jira] [Updated] (SPARK-19104) CompileException with Map and Case Class in Spark 2.1.0

2017-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19104: - Description: The following code will run with Spark 2.0.2 but not with Spark 2.1.0:

[jira] [Commented] (SPARK-20570) The main version number on docs/latest/index.html

2017-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15995248#comment-15995248 ] Michael Armbrust commented on SPARK-20570: -- Hmmm, I did push them, and they show up on the [asf

[jira] [Created] (SPARK-20567) Failure to bind when using explode and collect_set in streaming

2017-05-02 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-20567: Summary: Failure to bind when using explode and collect_set in streaming Key: SPARK-20567 URL: https://issues.apache.org/jira/browse/SPARK-20567 Project:

[jira] [Updated] (SPARK-20547) ExecutorClassLoader's findClass may not work correctly when a task is cancelled.

2017-05-01 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20547: - Affects Version/s: 2.2.0 Target Version/s: 2.2.0 > ExecutorClassLoader's findClass

[jira] [Updated] (SPARK-20364) Parquet predicate pushdown on columns with dots return empty results

2017-04-28 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20364: - Target Version/s: 2.2.0 Priority: Critical (was: Major) > Parquet predicate

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983820#comment-15983820 ] Michael Armbrust commented on SPARK-18057: -- I guess I'd like to understand more about what

[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979156#comment-15979156 ] Michael Armbrust edited comment on SPARK-18057 at 4/21/17 9:10 PM: ---

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979382#comment-15979382 ] Michael Armbrust commented on SPARK-18057: -- Yes, 0.10.2.0 is the first release that promises

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979156#comment-15979156 ] Michael Armbrust commented on SPARK-18057: -- [~srowen], thanks for reporting, but based on the

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979126#comment-15979126 ] Michael Armbrust commented on SPARK-18057: -- If there are multiple reports of 0.10.2.0 being more

[jira] [Reopened] (SPARK-16548) java.io.CharConversionException: Invalid UTF-32 character prevents me from querying my data

2017-04-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-16548: -- I'm not sure I agree. The default behavior for parsing corrupted JSON is to return

[jira] [Reopened] (SPARK-18891) Support for specific collection types

2017-04-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-18891: -- > Support for specific collection types > - > >

[jira] [Resolved] (SPARK-18891) Support for specific collection types

2017-04-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-18891. -- Resolution: Fixed Fix Version/s: 2.2.0 > Support for specific collection types

[jira] [Commented] (SPARK-20299) NullPointerException when null and string are in a tuple while encoding Dataset

2017-04-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15971491#comment-15971491 ] Michael Armbrust commented on SPARK-20299: -- What input are you looking for? >

[jira] [Resolved] (SPARK-16899) Structured Streaming Checkpointing Example invalid

2017-04-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-16899. -- Resolution: Not A Problem This has been fixed. I believe you are using an old version

[jira] [Updated] (SPARK-16899) Structured Streaming Checkpointing Example invalid

2017-04-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-16899: - Component/s: Structured Streaming > Structured Streaming Checkpointing Example invalid >

[jira] [Commented] (SPARK-19067) mapGroupsWithState - arbitrary stateful operations with Structured Streaming (similar to DStream.mapWithState)

2017-04-10 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963353#comment-15963353 ] Michael Armbrust commented on SPARK-19067: -- No, this will be available in Spark 2.2.0 >

[jira] [Commented] (SPARK-20216) Install pandoc on machine(s) used for packaging

2017-04-04 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956045#comment-15956045 ] Michael Armbrust commented on SPARK-20216: -- I think it all runs on

[jira] [Updated] (SPARK-20103) Spark structured steaming from kafka - last message processed again after resume from checkpoint

2017-03-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20103: - Fix Version/s: 2.2.0 > Spark structured steaming from kafka - last message processed

[jira] [Commented] (SPARK-20103) Spark structured steaming from kafka - last message processed again after resume from checkpoint

2017-03-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948035#comment-15948035 ] Michael Armbrust commented on SPARK-20103: -- It is fixed in 2.2 but by [SPARK-19876]. > Spark

[jira] [Updated] (SPARK-20103) Spark structured steaming from kafka - last message processed again after resume from checkpoint

2017-03-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20103: - Description: When the application starts after a failure or a graceful shutdown, it is

[jira] [Updated] (SPARK-20103) Spark structured steaming from kafka - last message processed again after resume from checkpoint

2017-03-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-20103: - Docs Text: (was: object StructuredStreaming { def main(args: Array[String]): Unit = {

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2017-03-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939156#comment-15939156 ] Michael Armbrust commented on SPARK-10816: -- Just a quick note for people interested in this

[jira] [Resolved] (SPARK-18970) FileSource failure during file list refresh doesn't cause an application to fail, but stops further processing

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-18970. -- Resolution: Fixed Fix Version/s: 2.1.0 I'm going to close this, but please

[jira] [Closed] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-17344. Resolution: Won't Fix Unless someone really wants to work on this, i think the fact that

[jira] [Updated] (SPARK-19965) DataFrame batch reader may fail to infer partitions when reading FileStreamSink's output

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19965: - Target Version/s: 2.2.0 > DataFrame batch reader may fail to infer partitions when

[jira] [Updated] (SPARK-19767) API Doc pages for Streaming with Kafka 0.10 not current

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19767: - Component/s: (was: Structured Streaming) DStreams > API Doc pages

[jira] [Resolved] (SPARK-19013) java.util.ConcurrentModificationException when using s3 path as checkpointLocation

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-19013. -- Resolution: Later It seems like [HADOOP-13345] is the right solution here, but since

[jira] [Resolved] (SPARK-19788) DataStreamReader/DataStreamWriter.option shall accept user-defined type

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-19788. -- Resolution: Won't Fix Thanks for the suggestion. However, as [~zsxwing] said, the

[jira] [Resolved] (SPARK-19932) Disallow a case that might cause OOM for steaming deduplication

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-19932. -- Resolution: Won't Fix Thanks for working on this. While I think it would be helpful

[jira] [Assigned] (SPARK-19876) Add OneTime trigger executor

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-19876: Assignee: Tyson Condie > Add OneTime trigger executor >

[jira] [Updated] (SPARK-19876) Add OneTime trigger executor

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19876: - Target Version/s: 2.2.0 > Add OneTime trigger executor > >

[jira] [Updated] (SPARK-19989) Flaky Test: org.apache.spark.sql.kafka010.KafkaSourceStressSuite

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19989: - Description: This test failed recently here:

[jira] [Updated] (SPARK-19989) Flaky Test: org.apache.spark.sql.kafka010.KafkaSourceStressSuite

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19989: - Target Version/s: 2.2.0 > Flaky Test:

[jira] [Created] (SPARK-20063) Trigger without delay when falling behind

2017-03-22 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-20063: Summary: Trigger without delay when falling behind Key: SPARK-20063 URL: https://issues.apache.org/jira/browse/SPARK-20063 Project: Spark Issue

[jira] [Commented] (SPARK-20009) Use user-friendly DDL formats for defining a schema in user-facing APIs

2017-03-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936876#comment-15936876 ] Michael Armbrust commented on SPARK-20009: -- Yeah, the DDL format is certainly a lot easier to

[jira] [Commented] (SPARK-19982) JavaDatasetSuite.testJavaBeanEncoder sometimes fails with "Unable to generate an encoder for inner class"

2017-03-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15929275#comment-15929275 ] Michael Armbrust commented on SPARK-19982: -- I'm not sure if changing weak to strong references

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15922715#comment-15922715 ] Michael Armbrust commented on SPARK-18057: -- So to summarize, it'll be unfortunate if Kafka

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905755#comment-15905755 ] Michael Armbrust commented on SPARK-18057: -- It seems like we can upgrade the existing Kafka10

[jira] [Updated] (SPARK-19888) Seeing offsets not resetting even when reset policy is configured explicitly

2017-03-10 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19888: - Component/s: (was: Spark Core) DStreams > Seeing offsets not

[jira] [Updated] (SPARK-18055) Dataset.flatMap can't work with types from customized jar

2017-03-07 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18055: - Target Version/s: 2.2.0 > Dataset.flatMap can't work with types from customized jar >

[jira] [Assigned] (SPARK-18055) Dataset.flatMap can't work with types from customized jar

2017-03-07 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reassigned SPARK-18055: Assignee: Michael Armbrust > Dataset.flatMap can't work with types from

[jira] [Updated] (SPARK-19813) maxFilesPerTrigger combo latestFirst may miss old files in combination with maxFileAge in FileStreamSource

2017-03-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19813: - Target Version/s: 2.2.0 > maxFilesPerTrigger combo latestFirst may miss old files in

[jira] [Updated] (SPARK-19690) Join a streaming DataFrame with a batch DataFrame may not work

2017-03-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19690: - Priority: Critical (was: Major) > Join a streaming DataFrame with a batch DataFrame may

[jira] [Updated] (SPARK-18258) Sinks need access to offset representation

2017-03-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18258: - Target Version/s: (was: 2.2.0) > Sinks need access to offset representation >

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-24 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883642#comment-15883642 ] Michael Armbrust commented on SPARK-19715: -- This isn't a hypothetical. A user of structured

[jira] [Created] (SPARK-19721) Good error message for version mismatch in log files

2017-02-23 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19721: Summary: Good error message for version mismatch in log files Key: SPARK-19721 URL: https://issues.apache.org/jira/browse/SPARK-19721 Project: Spark

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15881731#comment-15881731 ] Michael Armbrust commented on SPARK-19715: -- [~lwlin] another file source features you might want

[jira] [Updated] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-02-23 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18057: - Summary: Update structured streaming kafka from 10.0.1 to 10.2.0 (was: Update

[jira] [Created] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-23 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19715: Summary: Option to Strip Paths in FileSource Key: SPARK-19715 URL: https://issues.apache.org/jira/browse/SPARK-19715 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-19637) add to_json APIs to SQL

2017-02-17 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872470#comment-15872470 ] Michael Armbrust commented on SPARK-19637: -- From JSON is harder because the second argument is

[jira] [Created] (SPARK-19633) FileSource read from FileSink

2017-02-16 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19633: Summary: FileSource read from FileSink Key: SPARK-19633 URL: https://issues.apache.org/jira/browse/SPARK-19633 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-19553) Add GroupedData.countApprox()

2017-02-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864326#comment-15864326 ] Michael Armbrust commented on SPARK-19553: -- It seems like there are a couple of distinct feature

[jira] [Commented] (SPARK-19477) [SQL] Datasets created from a Dataframe with extra columns retain the extra columns

2017-02-10 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861918#comment-15861918 ] Michael Armbrust commented on SPARK-19477: -- If a lot of people are confused by this being lazy

[jira] [Created] (SPARK-19497) dropDuplicates with watermark

2017-02-07 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19497: Summary: dropDuplicates with watermark Key: SPARK-19497 URL: https://issues.apache.org/jira/browse/SPARK-19497 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-19478) JDBC Sink

2017-02-06 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-19478: - Issue Type: New Feature (was: Bug) > JDBC Sink > - > > Key:

[jira] [Created] (SPARK-19478) JDBC Sink

2017-02-06 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19478: Summary: JDBC Sink Key: SPARK-19478 URL: https://issues.apache.org/jira/browse/SPARK-19478 Project: Spark Issue Type: Bug Components:

[jira] (SPARK-16454) Consider adding a per-batch transform for structured streaming

2017-01-30 Thread Michael Armbrust (JIRA)
Michael Armbrust commented on SPARK-16454

[jira] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.1.0

2017-01-30 Thread Michael Armbrust (JIRA)
Michael Armbrust updated an issue

[jira] [Updated] (SPARK-18682) Batch Source for Kafka

2017-01-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18682: - Assignee: Tyson Condie > Batch Source for Kafka > -- > >

[jira] [Closed] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2017-01-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-18475. Resolution: Won't Fix > Be able to provide higher parallelization for StructuredStreaming

[jira] [Updated] (SPARK-18970) FileSource failure during file list refresh doesn't cause an application to fail, but stops further processing

2017-01-13 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18970: - Description: Spark streaming application uses S3 files as streaming sources. After

[jira] [Created] (SPARK-19067) mapWithState Style API

2017-01-03 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19067: Summary: mapWithState Style API Key: SPARK-19067 URL: https://issues.apache.org/jira/browse/SPARK-19067 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-19065) Bad error when using dropDuplicates in Streaming

2017-01-03 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19065: Summary: Bad error when using dropDuplicates in Streaming Key: SPARK-19065 URL: https://issues.apache.org/jira/browse/SPARK-19065 Project: Spark

[jira] [Created] (SPARK-19031) JDBC Streaming Source

2016-12-29 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19031: Summary: JDBC Streaming Source Key: SPARK-19031 URL: https://issues.apache.org/jira/browse/SPARK-19031 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-17344) Kafka 0.8 support for Structured Streaming

2016-12-22 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-17344: - Target Version/s: (was: 2.1.1) > Kafka 0.8 support for Structured Streaming >
