Hi, I'm wondering whether it's possible to continuously merge the RDDs coming from a stream into a single RDD efficiently.
One thought is to use the union() method, but with union() I get a new RDD each time I merge, and I'm not sure how to keep track of them all; I recall that Spark discourages keeping an array of RDDs.

Another possible solution is to follow the "StatefulNetworkWordCount" example, which uses the updateStateByKey() method. However, my RDD elements are not key-value pairs (each one is a struct with multiple fields). Is there a workaround?

Thanks,
Cui
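P.S. To make the union() idea concrete, here is roughly the pattern I have in mind. This is only a sketch; Record is a placeholder for my real multi-field struct, and the variable names are made up:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Placeholder for my actual element type (a struct with multiple fields).
case class Record(id: Long, value: Double)

// Keep a single reference that is reassigned each batch, so I never hold
// an array of RDDs -- but the lineage of `merged` still grows every batch.
var merged: RDD[Record] = null

def accumulate(stream: DStream[Record]): Unit = {
  stream.foreachRDD { batch =>
    merged = if (merged == null) batch else merged.union(batch)
    merged.cache() // avoid recomputing the ever-growing lineage on each use
  }
}
```

My worry is that the lineage chain of `merged` keeps growing, so I'd presumably need periodic checkpointing as well. Is this the right direction, or is there a better-supported way?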