FYI, here is the related Stack Overflow question on the same topic: https://stackoverflow.com/questions/51610482/how-to-do-pca-with-spark-streaming-dataframe

On Tue, Jul 31, 2018 at 3:18 PM, Aakash Basu <aakash.spark....@gmail.com> wrote:
> Hi,
>
> Just curious to know, how can we run a Principal Component Analysis on
> streaming data in distributed mode? If we can, is it mathematically valid
> enough?
>
> Has anyone done that before? Can you share your experience with it?
> Is there any API Spark provides to do the same in Spark Streaming mode?
>
> Thanks,
> Aakash.