Re: Fatal error when using broadcast variables and checkpointing in Spark Streaming

2016-07-22 Thread Joe Panciera
I realized that there's an error in the code. Corrected: from pyspark import SparkContext from pyspark.streaming import StreamingContext from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream sc = SparkContext(appName="FileAutomation") # Create streaming context from

Fatal error when using broadcast variables and checkpointing in Spark Streaming

2016-07-22 Thread Joe Panciera
Hi, I'm attempting to use broadcast variables to update stateful values used across the cluster for processing. Essentially, I have a function that is executed in .foreachRDD that updates the broadcast variable by calling unpersist() and then rebroadcasting. This works without issues when I