Serialized 3rd party libs

2014-09-02 Thread Matt Narrell
Hello, I’m using Spark Streaming to aggregate data from a Kafka topic in sliding windows. Usually we want to persist this aggregated data to a MongoDB cluster, or republish it to a different Kafka topic. When I include these third-party drivers, I usually get a NotSerializableException due to the

Re: Serialized 3rd party libs

2014-09-02 Thread Matt Narrell
Sean, thanks for pointing this out. I’d have to experiment with the mapPartitions method, but you’re right, this seems to address the issue directly. I’m also connecting to ZooKeeper to retrieve SparkConf parameters. I run into the same issue with my ZooKeeper driver; however, this is before
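
For anyone reading along, here is a rough sketch of why creating the client per partition helps. It is not Spark code: plain Python `pickle` stands in for Java serialization, and `FakeClient`, `CapturedClientTask`, `PerPartitionClientTask`, and `ship_and_run` are hypothetical names made up for illustration. The assumption is that the third-party driver holds something that cannot be serialized (a lock here, a socket or connection pool in practice).

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a third-party driver (MongoDB, Kafka, ZooKeeper)."""

    def __init__(self):
        # A lock cannot be pickled, much like a live socket or connection pool.
        self._lock = threading.Lock()

    def send(self, record):
        with self._lock:
            return "sent:" + str(record)


class CapturedClientTask:
    """Anti-pattern: the client is built on the driver and travels with the task."""

    def __init__(self, client):
        self.client = client

    def __call__(self, partition):
        return [self.client.send(r) for r in partition]


class PerPartitionClientTask:
    """The task carries no client; it builds one where the partition is processed."""

    def __call__(self, partition):
        client = FakeClient()  # created on the "worker", never serialized
        return [client.send(r) for r in partition]


def ship_and_run(task, partition):
    """Simulate sending a task to a worker: serialize, deserialize, then run."""
    remote_task = pickle.loads(pickle.dumps(task))
    return remote_task(partition)


if __name__ == "__main__":
    try:
        ship_and_run(CapturedClientTask(FakeClient()), [1, 2])
    except Exception as exc:
        print("captured client failed to serialize:", type(exc).__name__)
    print(ship_and_run(PerPartitionClientTask(), [1, 2]))
```

In a Spark job, the place that plays the role of `PerPartitionClientTask.__call__` would presumably be the function passed to `mapPartitions` or `foreachPartition`, so the driver object is constructed on the executor and only once per partition rather than once per record.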