[jira] [Created] (SPARK-22909) Move Structured Streaming v2 APIs to streaming package

2017-12-27 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22909: Summary: Move Structured Streaming v2 APIs to streaming package Key: SPARK-22909 URL: https://issues.apache.org/jira/browse/SPARK-22909 Project: Spark Issue

[jira] [Commented] (SPARK-22897) Expose stageAttemptId in TaskContext

2017-12-25 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16303567#comment-16303567 ] Shixiong Zhu commented on SPARK-22897: -- +1 > Expose stageAttemptId in TaskContext >

[jira] [Resolved] (SPARK-22789) Add ContinuousExecution for continuous processing queries

2017-12-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22789. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19984

[jira] [Assigned] (SPARK-22789) Add ContinuousExecution for continuous processing queries

2017-12-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22789: Assignee: Jose Torres > Add ContinuousExecution for continuous processing queries >

[jira] [Resolved] (SPARK-19552) Upgrade Netty version to 4.1.x final

2017-12-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19552. -- Resolution: Fixed Assignee: Bryan Cutler Fix Version/s: 2.3.0 Resolved by

[jira] [Created] (SPARK-22863) Make MicroBatchExecution also support MicroBatchRead/WriteSupport

2017-12-21 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22863: Summary: Make MicroBatchExecution also support MicroBatchRead/WriteSupport Key: SPARK-22863 URL: https://issues.apache.org/jira/browse/SPARK-22863 Project: Spark

[jira] [Resolved] (SPARK-22824) Spark Structured Streaming Source trait breaking change

2017-12-20 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22824. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20012

[jira] [Assigned] (SPARK-22781) Support creating streaming dataset with ORC files

2017-12-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22781: Assignee: Dongjoon Hyun > Support creating streaming dataset with ORC files >

[jira] [Resolved] (SPARK-22781) Support creating streaming dataset with ORC files

2017-12-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22781. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 19975

[jira] [Resolved] (SPARK-22733) refactor StreamExecution for extensibility

2017-12-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22733. -- Resolution: Fixed Assignee: Jose Torres Fix Version/s: 2.3.0 > refactor

[jira] [Resolved] (SPARK-22732) Add DataSourceV2 streaming APIs

2017-12-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22732. -- Resolution: Fixed Assignee: Jose Torres Fix Version/s: 2.3.0 > Add

[jira] [Commented] (SPARK-22752) FileNotFoundException while reading from Kafka

2017-12-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290150#comment-16290150 ] Shixiong Zhu commented on SPARK-22752: -- What's your code? You probably hit SPARK-21977 >

[jira] [Commented] (SPARK-22752) FileNotFoundException while reading from Kafka

2017-12-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286584#comment-16286584 ] Shixiong Zhu commented on SPARK-22752: -- What's your "checkpointLocation"? Is it using HDFS? Could

[jira] [Updated] (SPARK-22606) There may be two or more tasks in one executor will use the same kafka consumer at the same time, then it will throw an exception: "KafkaConsumer is not safe for multi-t

2017-12-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22606: - Component/s: (was: Structured Streaming) DStreams > There may be two or

[jira] [Updated] (SPARK-22324) Upgrade Arrow to version 0.8.0

2017-12-08 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22324: - Fix Version/s: 2.3.0 > Upgrade Arrow to version 0.8.0 > -- > >

[jira] [Updated] (SPARK-22187) Update unsaferow format for saved state such that we can set timeouts when state is null

2017-12-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22187: - Target Version/s: (was: 2.3.0) > Update unsaferow format for saved state such that we can set

[jira] [Commented] (SPARK-22187) Update unsaferow format for saved state such that we can set timeouts when state is null

2017-12-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283097#comment-16283097 ] Shixiong Zhu commented on SPARK-22187: -- Reverted by https://github.com/apache/spark/pull/19924 >

[jira] [Resolved] (SPARK-22656) Upgrade Arrow to 0.8.0

2017-12-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22656. -- Resolution: Duplicate > Upgrade Arrow to 0.8.0 > -- > >

[jira] [Resolved] (SPARK-22638) Use a separate query for StreamingQueryListenerBus

2017-12-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22638. -- Resolution: Fixed Fix Version/s: 2.3.0 > Use a separate query for

[jira] [Created] (SPARK-22656) Upgrade Arrow to 0.8.0

2017-11-29 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22656: Summary: Upgrade Arrow to 0.8.0 Key: SPARK-22656 URL: https://issues.apache.org/jira/browse/SPARK-22656 Project: Spark Issue Type: Bug Components:

[jira] [Commented] (SPARK-19552) Upgrade Netty version to 4.1.x final

2017-11-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269711#comment-16269711 ] Shixiong Zhu commented on SPARK-19552: -- [~wesmckinn] I will try. Looks like not a lot of work. >

[jira] [Created] (SPARK-22638) Use a separate query for StreamingQueryListenerBus

2017-11-28 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22638: Summary: Use a separate query for StreamingQueryListenerBus Key: SPARK-22638 URL: https://issues.apache.org/jira/browse/SPARK-22638 Project: Spark Issue

[jira] [Updated] (SPARK-19552) Upgrade Netty version to 4.1.x final

2017-11-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19552: - Summary: Upgrade Netty version to 4.1.x final (was: Upgrade Netty version to 4.1.8 final) >

[jira] [Updated] (SPARK-19552) Upgrade Netty version to 4.1.8 final

2017-11-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19552: - Priority: Major (was: Minor) > Upgrade Netty version to 4.1.8 final >

[jira] [Reopened] (SPARK-19552) Upgrade Netty version to 4.1.8 final

2017-11-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reopened SPARK-19552: -- Netty is deprecating in https://github.com/netty/netty/issues/7439 Reopened this one to discuss

[jira] [Resolved] (SPARK-22544) FileStreamSource should use its own hadoop conf to call globPathIfNecessary

2017-11-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22544. -- Resolution: Fixed Assignee: Shixiong Zhu Fix Version/s: 2.2.2

[jira] [Created] (SPARK-22544) FileStreamSource should use its own hadoop conf to call globPathIfNecessary

2017-11-16 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22544: Summary: FileStreamSource should use its own hadoop conf to call globPathIfNecessary Key: SPARK-22544 URL: https://issues.apache.org/jira/browse/SPARK-22544 Project:

[jira] [Updated] (SPARK-22535) PythonRunner.MonitorThread should give the task a little time to finish before killing the python worker

2017-11-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22535: - Fix Version/s: 2.2.2 > PythonRunner.MonitorThread should give the task a little time to finish

[jira] [Created] (SPARK-22535) PythonRunner.MonitorThread should give the task a little time to finish before killing the python worker

2017-11-15 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22535: Summary: PythonRunner.MonitorThread should give the task a little time to finish before killing the python worker Key: SPARK-22535 URL:

[jira] [Resolved] (SPARK-22509) Spark Streaming: jobs with same batch length all start at the same time, permit jobs to be offset

2017-11-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22509. -- Resolution: Not A Bug I don't think it's worth making such an improvement in Spark Streaming. Even

[jira] [Reopened] (SPARK-22509) Spark Streaming: jobs with same batch length all start at the same time, permit jobs to be offset

2017-11-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reopened SPARK-22509: -- > Spark Streaming: jobs with same batch length all start at the same time, > permit jobs to be

[jira] [Resolved] (SPARK-22509) Spark Streaming: jobs with same batch length all start at the same time, permit jobs to be offset

2017-11-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22509. -- Resolution: Duplicate > Spark Streaming: jobs with same batch length all start at the same

[jira] [Updated] (SPARK-22509) Spark Streaming: jobs with same batch length all start at the same time, permit jobs to be offset

2017-11-13 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22509: - Component/s: (was: Structured Streaming) DStreams > Spark Streaming: jobs

[jira] [Resolved] (SPARK-21667) ConsoleSink should not fail streaming query with checkpointLocation option

2017-11-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21667. -- Resolution: Fixed Assignee: Rekha Joshi Fix Version/s: 2.3.0

[jira] [Resolved] (SPARK-19644) Memory leak in Spark Streaming (Encoder/Scala Reflection)

2017-11-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19644. -- Resolution: Fixed Assignee: Shixiong Zhu Fix Version/s: 2.3.0

[jira] [Assigned] (SPARK-22294) Reset spark.driver.bindAddress when starting a Checkpoint

2017-11-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22294: Assignee: Santiago Saavedra > Reset spark.driver.bindAddress when starting a Checkpoint >

[jira] [Updated] (SPARK-22243) streaming job failed to restart from checkpoint

2017-11-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22243: - Fix Version/s: 2.2.1 > streaming job failed to restart from checkpoint >

[jira] [Resolved] (SPARK-22294) Reset spark.driver.bindAddress when starting a Checkpoint

2017-11-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22294. -- Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > Reset

[jira] [Assigned] (SPARK-22403) StructuredKafkaWordCount example fails in YARN cluster mode

2017-11-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22403: Assignee: Wing Yew Poon > StructuredKafkaWordCount example fails in YARN cluster mode >

[jira] [Resolved] (SPARK-22403) StructuredKafkaWordCount example fails in YARN cluster mode

2017-11-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22403. -- Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 >

[jira] [Updated] (SPARK-22429) Streaming checkpointing code does not retry after failure due to NullPointerException

2017-11-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22429: - Component/s: (was: Structured Streaming) DStreams > Streaming checkpointing

[jira] [Assigned] (SPARK-22243) streaming job failed to restart from checkpoint

2017-11-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22243: Assignee: StephenZou > streaming job failed to restart from checkpoint >

[jira] [Resolved] (SPARK-22243) streaming job failed to restart from checkpoint

2017-11-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22243. -- Resolution: Fixed Fix Version/s: 2.3.0 > streaming job failed to restart from

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming (Encoder/Scala Reflection)

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19644: - Description: I am using streaming on the production for some aggregation and fetching data from

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming (Encoder/Scala Reflection)

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19644: - Description: I am using streaming on the production for some aggregation and fetching data from

[jira] [Commented] (SPARK-19644) Memory leak in Spark Streaming (Encoder/Scala Reflection)

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234655#comment-16234655 ] Shixiong Zhu commented on SPARK-19644: -- I added more components since it also affects them. The

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming (Encoder/Scala Reflection)

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19644: - Description: I am using streaming on the production for some aggregation and fetching data from

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming (Encoder/Scala Reflection)

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19644: - Component/s: Structured Streaming > Memory leak in Spark Streaming (Encoder/Scala Reflection) >

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19644: - Component/s: SQL > Memory leak in Spark Streaming > -- > >

[jira] [Updated] (SPARK-19644) Memory leak in Spark Streaming (Encoder/Scala Reflection)

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19644: - Summary: Memory leak in Spark Streaming (Encoder/Scala Reflection) (was: Memory leak in Spark

[jira] [Commented] (SPARK-19644) Memory leak in Spark Streaming

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234643#comment-16234643 ] Shixiong Zhu commented on SPARK-19644: -- By the way, you can confirm this issue by checking if the

[jira] [Commented] (SPARK-19644) Memory leak in Spark Streaming

2017-11-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234638#comment-16234638 ] Shixiong Zhu commented on SPARK-19644: -- I happened to investigate a similar issue and found out the

[jira] [Updated] (SPARK-21930) When the number of attempting to restart receiver greater than 0,spark do nothing in 'else'

2017-10-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21930: - Component/s: (was: Structured Streaming) DStreams > When the number of

[jira] [Resolved] (SPARK-22305) HDFSBackedStateStoreProvider fails with StackOverflowException when attempting to recover state

2017-10-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22305. -- Resolution: Fixed Fix Version/s: 2.3.0 > HDFSBackedStateStoreProvider fails with

[jira] [Assigned] (SPARK-22305) HDFSBackedStateStoreProvider fails with StackOverflowException when attempting to recover state

2017-10-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22305: Assignee: Jose Torres > HDFSBackedStateStoreProvider fails with StackOverflowException

[jira] [Commented] (SPARK-22403) StructuredKafkaWordCount example fails in YARN cluster mode

2017-10-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227321#comment-16227321 ] Shixiong Zhu commented on SPARK-22403: -- Yeah, feel free to submit a PR to improve the example. >

[jira] [Commented] (SPARK-22403) StructuredKafkaWordCount example fails in YARN cluster mode

2017-10-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227266#comment-16227266 ] Shixiong Zhu commented on SPARK-22403: -- Yeah, Spark creates a temp directory for you. You can set

[jira] [Commented] (SPARK-22305) HDFSBackedStateStoreProvider fails with StackOverflowException when attempting to recover state

2017-10-27 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223000#comment-16223000 ] Shixiong Zhu commented on SPARK-22305: -- Why not just delete the whole checkpoint dir? Dropping state

[jira] [Resolved] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2017-10-26 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22366. -- Resolution: Fixed Assignee: Jose Torres Fix Version/s: 2.3.0 > Support

[jira] [Commented] (SPARK-22305) HDFSBackedStateStoreProvider fails with StackOverflowException when attempting to recover state

2017-10-26 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221432#comment-16221432 ] Shixiong Zhu commented on SPARK-22305: -- [~Yuval.Itzchakov] how many batches per 1 minute in your

[jira] [Updated] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2017-10-26 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22366: - Component/s: (was: Spark Core) SQL > Support ignoreMissingFiles flag

[jira] [Updated] (SPARK-22366) Support ignoreMissingFiles flag parallel to ignoreCorruptFiles

2017-10-26 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22366: - Issue Type: Improvement (was: Bug) > Support ignoreMissingFiles flag parallel to

[jira] [Updated] (SPARK-21988) Add default stats to StreamingRelation and StreamingExecutionRelation

2017-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21988: - Summary: Add default stats to StreamingRelation and StreamingExecutionRelation (was: Add

[jira] [Resolved] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21988. -- Resolution: Fixed > Add default stats to StreamingExecutionRelation >

[jira] [Reopened] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reopened SPARK-21988: -- > Add default stats to StreamingExecutionRelation >

[jira] [Assigned] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-22230: Assignee: Jose Torres > agg(last('attr)) gives weird results for streaming >

[jira] [Resolved] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22230. -- Resolution: Fixed > agg(last('attr)) gives weird results for streaming >

[jira] [Updated] (SPARK-22230) agg(last('attr)) gives weird results for streaming

2017-10-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22230: - Fix Version/s: 2.3.0 > agg(last('attr)) gives weird results for streaming >

[jira] [Resolved] (SPARK-21947) monotonically_increasing_id doesn't work in Structured Streaming

2017-10-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21947. -- Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.3.0 >

[jira] [Updated] (SPARK-22200) Kinesis Receivers stops if Kinesis stream was re-sharded

2017-10-05 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22200: - Component/s: (was: Spark Core) DStreams > Kinesis Receivers stops if

[jira] [Resolved] (SPARK-22203) Add job description for file listing Spark jobs

2017-10-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22203. -- Resolution: Fixed Fix Version/s: 2.3.0 > Add job description for file listing Spark

[jira] [Created] (SPARK-22203) Add job description for file listing Spark jobs

2017-10-04 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22203: Summary: Add job description for file listing Spark jobs Key: SPARK-22203 URL: https://issues.apache.org/jira/browse/SPARK-22203 Project: Spark Issue Type:

[jira] [Updated] (SPARK-22187) Update unsaferow format for saved state such that we can set timeouts when state is null

2017-10-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22187: - Labels: release-notes releasenotes (was: release-notes) > Update unsaferow format for saved

[jira] [Updated] (SPARK-22187) Update unsaferow format for saved state such that we can set timeouts when state is null

2017-10-04 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-22187: - Labels: release-notes (was: ) > Update unsaferow format for saved state such that we can set

[jira] [Commented] (SPARK-22046) Streaming State cannot be scalable

2017-09-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177009#comment-16177009 ] Shixiong Zhu commented on SPARK-22046: -- I don't know what's the exact issue.

[jira] [Commented] (SPARK-21999) ConcurrentModificationException - Spark Streaming

2017-09-22 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175986#comment-16175986 ] Shixiong Zhu commented on SPARK-21999: -- From the stack trace, it seems the problem is in the

[jira] [Resolved] (SPARK-22094) processAllAvailable should not block forever when a query is stopped

2017-09-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-22094. -- Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > processAllAvailable

[jira] [Created] (SPARK-22094) processAllAvailable should not block forever when a query is stopped

2017-09-21 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-22094: Summary: processAllAvailable should not block forever when a query is stopped Key: SPARK-22094 URL: https://issues.apache.org/jira/browse/SPARK-22094 Project: Spark

[jira] [Resolved] (SPARK-21113) Support for read ahead input stream to amortize disk IO cost in the Spill reader

2017-09-18 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21113. -- Resolution: Fixed Assignee: Sital Kedia Fix Version/s: 2.3.0 > Support for

[jira] [Resolved] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-09-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21988. -- Resolution: Fixed Assignee: Jose Torres Fix Version/s: 2.3.0 > Add default

[jira] [Created] (SPARK-21947) monotonically_increasing_id doesn't work in Structured Streaming

2017-09-07 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-21947: Summary: monotonically_increasing_id doesn't work in Structured Streaming Key: SPARK-21947 URL: https://issues.apache.org/jira/browse/SPARK-21947 Project: Spark

[jira] [Updated] (SPARK-21893) Put Kafka 0.8 behind a profile

2017-09-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21893: - Component/s: (was: Structured Streaming) DStreams > Put Kafka 0.8 behind a

[jira] [Resolved] (SPARK-21901) Define toString for StateOperatorProgress

2017-09-06 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21901. -- Resolution: Fixed Assignee: Jacek Laskowski Fix Version/s: 2.3.0

[jira] [Resolved] (SPARK-9104) expose network layer memory usage in shuffle part

2017-09-05 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-9104. - Resolution: Fixed Fix Version/s: 2.3.0 > expose network layer memory usage in shuffle part

[jira] [Reopened] (SPARK-9104) expose network layer memory usage in shuffle part

2017-09-05 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reopened SPARK-9104: - Assignee: Saisai Shao > expose network layer memory usage in shuffle part >

[jira] [Resolved] (SPARK-21880) [spark UI]In the SQL table page, modify jobs trace information

2017-09-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21880. -- Resolution: Fixed Assignee: he.qiao Fix Version/s: 2.3.0 > [spark UI]In the

[jira] [Commented] (SPARK-21869) A cached Kafka producer should not be closed if any task is using it.

2017-08-29 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146347#comment-16146347 ] Shixiong Zhu commented on SPARK-21869: -- [~scrapco...@gmail.com] do you want to take this task? > A

[jira] [Created] (SPARK-21869) A cached Kafka producer should not be closed if any task is using it.

2017-08-29 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-21869: Summary: A cached Kafka producer should not be closed if any task is using it. Key: SPARK-21869 URL: https://issues.apache.org/jira/browse/SPARK-21869 Project: Spark

[jira] [Assigned] (SPARK-21701) Add TCP send/rcv buffer size support for RPC client

2017-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-21701: Assignee: Xu Zhang > Add TCP send/rcv buffer size support for RPC client >

[jira] [Resolved] (SPARK-21701) Add TCP send/rcv buffer size support for RPC client

2017-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21701. -- Resolution: Fixed Fix Version/s: 2.3.0 > Add TCP send/rcv buffer size support for RPC

[jira] [Updated] (SPARK-21788) Handle more exceptions when stopping a streaming query

2017-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21788: - Fix Version/s: (was: 3.0.0) 2.3.0 > Handle more exceptions when stopping

[jira] [Commented] (SPARK-21702) Structured Streaming S3A SSE Encryption Not Visible through AWS S3 GUI when PartitionBy Used

2017-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139675#comment-16139675 ] Shixiong Zhu commented on SPARK-21702: -- I'm not familiar with S3. Just FYI, when using

[jira] [Commented] (SPARK-21760) Structured streaming terminates with Exception

2017-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139665#comment-16139665 ] Shixiong Zhu commented on SPARK-21760: -- It's also weird. Since the file has at least one line, this

[jira] [Commented] (SPARK-21760) Structured streaming terminates with Exception

2017-08-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139660#comment-16139660 ] Shixiong Zhu commented on SPARK-21760: -- Could you provide the file content of previous batches, such

[jira] [Created] (SPARK-21788) Handle more exceptions when stopping a streaming query

2017-08-18 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-21788: Summary: Handle more exceptions when stopping a streaming query Key: SPARK-21788 URL: https://issues.apache.org/jira/browse/SPARK-21788 Project: Spark Issue

[jira] [Updated] (SPARK-21713) Replace LogicalPlan.isStreaming with OutputMode

2017-08-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21713: - Component/s: (was: Spark Core) Structured Streaming SQL >

[jira] [Updated] (SPARK-21732) Lazily init hive metastore client

2017-08-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21732: - Description: Right now when the hive metastore server is down, we cannot create SparkSession. It

[jira] [Created] (SPARK-21732) Lazily init hive metastore client

2017-08-14 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-21732: Summary: Lazily init hive metastore client Key: SPARK-21732 URL: https://issues.apache.org/jira/browse/SPARK-21732 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-21710) ConsoleSink causes OOM crashes with large inputs.

2017-08-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16124200#comment-16124200 ] Shixiong Zhu commented on SPARK-21710: -- `collect` is a workaround for

[jira] [Commented] (SPARK-21667) ConsoleSink should not fail streaming query with checkpointLocation option

2017-08-09 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120423#comment-16120423 ] Shixiong Zhu commented on SPARK-21667: -- Do you mind to submit a PR to fix it? > ConsoleSink should
