Hi, I have yet another question, this time maintaining a global list of centroids.
I am trying to implement the clustream algorithm and for that purpose I have the initial set of centres in a flink dataset. Now I need to update the set of centres for every data tuple that comes from the stream. From what I have read so far on 2 different posts having similar questions, is that, in case of streaming datasets the co-map operator was asked to use and retrieve them in 2 separate map functions. My idea is to broadcast the dataset in each flink partition and whenever a data tuple is mapped to a partition using a map function, update the broadcasted dataset. But as this is currently not possible, thus I was thinking to broadcast the datastream using "ds.broadcast()" so that every partition receives the streamed tuple. Then, use a normal flatmap function for the centres and use the broadcasted tuple to update the centres and return the updated set of centres. My question is, would this work? If yes, may someone give an example of the datastream broadcast function and how to retrieve the broadcasted stream in a map function? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Regarding-Broadcast-of-datasets-in-streaming-context-tp6456.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.