Hi Gurus, sorry for the naive question; I am new to this.
I seem to remember reading somewhere that Spark itself still does batch learning, but that Spark Streaming could allow online learning. I cannot find this on the website now:
http://spark.apache.org/docs/latest/streaming-programming-guide.html

I know MLlib uses incremental or iterative algorithms, and I wonder whether that also holds across batches in Spark Streaming.

So the question is: when I call MLlib linear regression in Spark Streaming, does the training use one batch of data as its training set? If yes, is the model update between batches already taken care of? That is, will the model eventually have been trained on all data that arrived from the start of the stream up to the time of scoring, or does it only use data from a limited number of recent batches?

Many thanks!
J
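
P.S. To make the two possibilities concrete, here is a small plain-Python sketch of what I mean. It is not Spark or MLlib code, and I am not claiming either option is what MLlib does; the function names, the learning rate, and the toy stream are all made up for illustration.

```python
# Plain-Python illustration only (no Spark involved). All names are hypothetical.

def gd_on_batch(w, b, batch, lr=0.05, epochs=50):
    """Run full-batch gradient descent for y ~ w*x + b on one batch, starting from (w, b)."""
    for _ in range(epochs):
        n = len(batch)
        grad_w = sum((w * x + b - y) * x for x, y in batch) / n
        grad_b = sum((w * x + b - y) for x, y in batch) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def train_carry_over(batches):
    """Option A: the weights persist across batches, so every past batch has influenced the model."""
    w, b = 0.0, 0.0
    for batch in batches:
        w, b = gd_on_batch(w, b, batch)
    return w, b

def train_last_window(batches, window=1):
    """Option B: refit from zero using only the most recent `window` batches."""
    recent = [point for batch in batches[-window:] for point in batch]
    return gd_on_batch(0.0, 0.0, recent)

# Toy stream: three batches of (x, y) points drawn from y = 2x + 1.
batches = [[(x / 10.0, 2 * x / 10.0 + 1) for x in range(i, i + 5)] for i in (0, 5, 10)]

print("carry-over model :", train_carry_over(batches))
print("last-batch model :", train_last_window(batches, window=1))
```

In option A the model at scoring time depends on the whole history of the stream; in option B it depends only on the last `window` batches. My question is which of these (if either) matches what happens between Spark Streaming batches.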