Have you tried partitionBy?
Something like:

  hiveWindowsEvents.foreachRDD { rdd =>
    // Assumes import org.apache.spark.sql.SaveMode and the SQLContext
    // implicits (import sqlContext.implicits._) are in scope for toDF()
    val eventsDataFrame = rdd.toDF()
    eventsDataFrame.write
      .mode(SaveMode.Append)
      .partitionBy("windows_event_time_bin")
      .saveAsTable("windows_event")
  }
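If it helps: partitionBy("windows_event_time_bin") should lay the table out
as one subdirectory per distinct bin value (windows_event_time_bin=<value>),
so each append only adds files under the matching partitions.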
On Wed, Oct 28, 2015 at 7:41 AM, Bryan Jeffrey wrote:
directKStream.checkpoint(checkpointDuration)
Just calling checkpoint on the streaming context should be sufficient to
save the metadata.
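For reference, a minimal sketch of that setup against the Spark 1.x direct
Kafka API (checkpointDir, kafkaParams, topics, and checkpointDuration are
hypothetical placeholders, not values from this thread):

  import kafka.serializer.StringDecoder
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka.KafkaUtils

  val checkpointDir = "hdfs:///tmp/checkpoint"                      // hypothetical path
  val kafkaParams = Map("metadata.broker.list" -> "localhost:9092") // hypothetical broker
  val topics = Set("windows_event")                                 // hypothetical topic
  val checkpointDuration = Seconds(50)                              // hypothetical interval

  def createContext(): StreamingContext = {
    val ssc = new StreamingContext(new SparkConf(), Seconds(10))
    // Turns on metadata checkpointing; for the direct stream this is what
    // persists the consumed Kafka offsets across restarts
    ssc.checkpoint(checkpointDir)
    val directKStream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)
    // Optionally also checkpoint the stream's RDDs at a fixed interval
    directKStream.checkpoint(checkpointDuration)
    // ... transformations and output operations go here ...
    ssc
  }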
On Tue, Aug 25, 2015 at 2:36 PM, Susan Zhang suchenz...@gmail.com wrote:
Sure thing!
The main looks like:
affect anything?
On Wed, Aug 26, 2015 at 1:04 PM, Susan Zhang suchenz...@gmail.com wrote:
Thanks for the suggestions! I tried the following: I removed

  createOnError = true

and reran the same process to reproduce. Double-checked that the checkpoint
is loading:
15/08/26 10:10:40 INFO
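For context, createOnError is the fourth parameter of
StreamingContext.getOrCreate in Spark 1.x. A sketch of how it wires
together, reusing the hypothetical createContext from the sketch above:

  import org.apache.spark.streaming.StreamingContext

  // With createOnError = true, getOrCreate silently builds a fresh context
  // (and so loses the saved offsets) when the checkpoint fails to load;
  // with the default false, it throws instead, making checkpoint problems
  // visible on restart
  val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _,
    createOnError = false)
  ssc.start()
  ssc.awaitTermination()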
The log lines you posted indicate that the checkpoint was restored and
those offsets were processed; what are the log lines for the following
KafkaRDD?
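One way to surface those offsets explicitly, as a sketch against the Spark
1.x Kafka integration (directKStream as in the sketch above):

  import org.apache.spark.streaming.kafka.HasOffsetRanges

  directKStream.foreachRDD { rdd =>
    // Each batch of a direct stream is backed by a KafkaRDD that knows the
    // exact offset range it will consume for every topic/partition
    val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
    ranges.foreach { o =>
      println(s"topic=${o.topic} partition=${o.partition} " +
        s"fromOffset=${o.fromOffset} untilOffset=${o.untilOffset}")
    }
  }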
On Wed, Aug 26, 2015 at 2:04 PM, Susan Zhang suchenz...@gmail.com wrote:
Compared offsets, and it continues from checkpoint loading:
15/08/26 11:24:54
Yeah. All messages are lost while the streaming job was down.
On Tue, Aug 25, 2015 at 11:37 AM, Cody Koeninger c...@koeninger.org wrote:
Are you actually losing messages then?
On Tue, Aug 25, 2015 at 1:15 PM, Susan Zhang suchenz...@gmail.com wrote:
No; first batch only contains messages received after the second job starts
(messages come in at a steady rate of about 400/second).
On Tue, Aug 25, 2015 at 11:07 AM, Cody Koeninger c...@koeninger.org wrote:
Does the first batch after restart contain all the messages received while
the job was down? Can you post code that reproduces the issue?