Re: Spark 1.6 Streaming with Checkpointing

Jacek Laskowski Fri, 26 Aug 2016 14:12:05 -0700

On Fri, Aug 26, 2016 at 10:54 PM, Benjamin Kim <[email protected]> wrote:


>         // Create a text file stream on an S3 bucket
>         val csv = ssc.textFileStream("s3a://" + awsS3BucketName + "/")
>
>         csv.foreachRDD(rdd => {
>             if (!rdd.partitions.isEmpty) {
>                 // process data
>             }
>         })

Hi Benjamin,

I barely remember the details of this case now, but I'd recommend moving
the above snippet to *after* the point where you getOrCreate the context, i.e.

>         // Get streaming context from checkpoint data or create a new one
>         val context = StreamingContext.getOrCreate(checkpoint,
>             () => createContext(interval, checkpoint, bucket, database,
>                 table, partitionBy))

Please put the first snippet here so that you only create the pipeline
after you get the context.
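
As a rough sketch of one way to arrange this (not tested against your
setup): here the stream is defined inside createContext, so the pipeline
only exists once getOrCreate has produced a context. That is the layout
the checkpointing section of the Spark Streaming programming guide uses.
The reduced createContext signature, the object name, the app name and
the command-line arguments are my assumptions; your real createContext
also takes database, table and partitionBy.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointedCsvStream {

  // Hypothetical, reduced version of createContext: it takes only the
  // arguments this sketch needs.
  def createContext(interval: Int, checkpoint: String, bucket: String): StreamingContext = {
    val conf = new SparkConf().setAppName("CheckpointedCsvStream") // assumed app name
    val ssc = new StreamingContext(conf, Seconds(interval))

    // Create a text file stream on an S3 bucket
    val csv = ssc.textFileStream("s3a://" + bucket + "/")

    csv.foreachRDD { rdd =>
      if (!rdd.partitions.isEmpty) {
        // process data
      }
    }

    // Enable checkpointing so the context can be recovered on restart
    ssc.checkpoint(checkpoint)
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Assumed argument order: <interval in seconds> <checkpoint dir> <bucket>.
    // A different argument count fails with a MatchError.
    val Array(interval, checkpoint, bucket) = args

    // Get streaming context from checkpoint data or create a new one
    val context = StreamingContext.getOrCreate(checkpoint,
      () => createContext(interval.toInt, checkpoint, bucket))

    context.start()
    context.awaitTermination()
  }
}
```

With this layout, on a restart getOrCreate rebuilds the context and its
streams from the checkpoint data and does not call createContext again.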

I might be mistaken, too. Sorry.

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
