Hi again

I tried the same example, examples/src/main/scala/org/apache/spark/examples/mllib/StreamingLinearRegression.scala, from 1.3.0.
When the testing file content is:

and the answer:

What is wrong?
I can see that the model's weights change when I put new data into the training dir.

On 14/03/15 09:05, Margus Roo wrote:

I am trying to understand the example provided in https://spark.apache.org/docs/1.2.1/mllib-linear-methods.html (Streaming linear regression):

import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.dstream.DStream

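object StreamingLinReg {

  def main(args: Array[String]): Unit = {

    val conf = new SparkConf().setAppName("StreamLinReg").setMaster("local[2]")
    // Each batch covers 10 seconds of newly arrived files
    val ssc = new StreamingContext(conf, Seconds(10))

    // Only files that appear in these directories after the stream starts are read
    val trainingData = ssc.textFileStream("/Users/margusja/Documents/workspace/sparcdemo/training/").map(LabeledPoint.parse).cache()
    val testData = ssc.textFileStream("/Users/margusja/Documents/workspace/sparcdemo/testing/").map(LabeledPoint.parse)

    val numFeatures = 3
    val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(numFeatures))

    // Update the weights on every training batch (as in the documentation example)
    model.trainOn(trainingData)
    // Prints (label, prediction) pairs for every test batch
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()
  }
}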

I compiled the code and ran it.
I put a file containing
into the training directory.

I can see that the model's weights change:
15/03/14 08:53:40 INFO StreamingLinearRegressionWithSGD: Current model: weights, [7.333333333333333,7.333333333333333,7.333333333333333]

Now I can put whatever I like into the testing directory, but I cannot understand the answer. For example, I can put the same file I used for training into the testing directory. The file content is

And answer will be

And in case my file content is

the answer will be:

I expect to get the label predicted by the model.
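
For reference, predictOnValues takes a stream of (key, features) pairs and returns (key, prediction) pairs, so with the code above the first number printed is the label read from the test file and the second is the model's prediction. Below is a minimal plain-Scala sketch of that shape, with no Spark involved. It reuses the weights from the log line above; the zero intercept, the object name PredictSketch, and the test point (label 1.0, features 1.0 1.0 1.0) are assumptions made only for illustration, not data from the original files.

```scala
// Hypothetical stand-in for the model's linear prediction; no Spark required.
object PredictSketch {
  // Weights copied from the log line in the message above.
  val weights: Array[Double] = Array.fill(3)(7.333333333333333)
  // Assumption: no intercept is fitted, so it stays at zero.
  val intercept: Double = 0.0

  // Linear prediction: dot product of weights and features, plus intercept.
  def predict(features: Array[Double]): Double =
    weights.zip(features).map { case (w, x) => w * x }.sum + intercept

  // Same shape as predictOnValues: the key (here the label) passes through
  // unchanged and only the features are replaced by the prediction.
  def predictOnValues(data: Seq[(Double, Array[Double])]): Seq[(Double, Double)] =
    data.map { case (label, features) => (label, predict(features)) }

  def main(args: Array[String]): Unit = {
    // Hypothetical test point, not taken from the original test file.
    val testData = Seq((1.0, Array(1.0, 1.0, 1.0)))
    predictOnValues(testData).foreach(println)
  }
}
```

If the printed pairs look unexpected, comparing the second element against this dot product for the weights logged at that moment may show whether the model is predicting from partially trained weights.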
