Get the value of DStream[(String, Iterable[String])]

2014-12-17 Thread Guillermo Ortiz
I'm a newbie with Spark,,, a simple question val errorLines = lines.filter(_.contains(h)) val mapErrorLines = errorLines.map(line = (key, line)) val grouping = errorLinesValue.groupByKeyAndWindow(Seconds(8), Seconds(4)) I get something like: 604: --- 605:

Re: Get the value of DStream[(String, Iterable[String])]

2014-12-17 Thread Gerard Maas
You can create a DStream that contains the count, transforming the grouped windowed RDD, like this: val errorCount = grouping.map{case (k,v) = v.size } If you need to preserve the key: val errorCount = grouping.map{case (k,v) = (k,v.size) } or you if you don't care about the content of the

Re: Get the value of DStream[(String, Iterable[String])]

2014-12-17 Thread Guillermo Ortiz
What I would like to do it's to count the number of elements and if it's greater than a number, I have to iterate all them and store them in mysql or another system. So, I need to count them and preserve the values because saving in other system. I know about this map(line = (key, line)), it was

Re: Get the value of DStream[(String, Iterable[String])]

2014-12-17 Thread Guillermo Ortiz
Basically what I want to do it'd be something like.. val errorLines = lines.filter(_.contains(h)) val mapErrorLines = errorLines.map(line = (key, line)) val grouping = errorLinesValue.groupByKeyAndWindow(Seconds(8), Seconds(4)) if (errorLinesValue.getValue().size() X){ //iterate values and