I'm building a real-time visualization of ad-impression counts on my website, fed by data pushed through Spark Streaming.
To avoid clutter, the visualization only looks good with 4 or 5 lines shown at once (corresponding to 4 or 5 different ads), but there are 50+ different ads on my site. What I'd like is a quick way to change which ads flow through Spark Streaming without rebuilding the .jar and pushing it to my edge node. Ideally I'd keep a .csv file on the edge node listing 4 ad names; every time a stream RDD is created, the job would read that tiny file, create a broadcast variable from it, and use that variable as a filter. Then I could just edit and save the .csv file, and the stream would automatically filter correctly. I keep getting errors when I try this. Has anyone had success with a broadcast variable that updates with each new stream RDD?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Read-from-file-and-broadcast-before-every-Spark-Streaming-bucket-tp21433.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
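For context, one common workaround (a sketch, not from the original post) is to skip a long-lived broadcast variable and instead re-read the small file on the driver inside `DStream.transform`, whose closure is evaluated on the driver at every batch interval. The file path, the `loadAdWhitelist` helper, and the `adCounts` stream name below are all hypothetical; the Spark portion is shown as a commented sketch:

```scala
import scala.io.Source

// Hypothetical helper: read the small CSV of ad names from the edge node.
// Assumes one or more comma-separated ad names per line.
def loadAdWhitelist(path: String): Set[String] =
  Source.fromFile(path).getLines()
    .flatMap(_.split(","))   // allow comma-separated names on one line
    .map(_.trim)
    .filter(_.nonEmpty)
    .toSet

// Inside the streaming job (commented sketch; `adCounts` is an assumed
// DStream[(String, Long)] of (adName, count) pairs):
//
// val filtered = adCounts.transform { rdd =>
//   // transform's function runs on the driver once per batch, so the tiny
//   // file is re-read before each micro-batch. Broadcasting the fresh set
//   // here gives each batch an up-to-date filter instead of the stale
//   // value captured at job start-up.
//   val wanted = rdd.sparkContext.broadcast(loadAdWhitelist("/path/to/ads.csv"))
//   rdd.filter { case (ad, _) => wanted.value.contains(ad) }
// }
```

Editing and saving the .csv then takes effect on the next batch, with no .jar rebuild, because the whitelist is loaded fresh each interval rather than captured once in the closure.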