The problem lies in the way you are doing the processing. You are calling ssc.start() only after g.foreach(x => {println(x); println("************")}). That means that, up to that point, you have only set up the computation steps; Spark has not started any real processing. So when you call g.foreach, it iterates over the empty list you are returning from the g method, and hence nothing is printed.
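To make the ordering concrete, here is a rough sketch of how I would restructure it: do the printing inside foreachRDD, and only then start the context. This is a standalone variant rather than your exact spark-shell session, so a few things are my own assumptions: the object name FileNamesFromStream, the helper lineageLines, the local[2] master, and taking the directory from args(0) are all hypothetical placeholders. It prints every line of the debug string; your own filter and split for the file names would go where the println is.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FileNamesFromStream {
  // Hypothetical helper: splits an RDD debug string into trimmed, non-empty lines.
  def lineageLines(debug: String): List[String] =
    debug.split("\n").map(_.trim).filter(_.nonEmpty).toList

  def main(args: Array[String]): Unit = {
    // Placeholder (assumption): the directory to watch is passed as the first argument.
    val inputDir = args(0)
    val conf = new SparkConf().setAppName("FileNamesFromStream").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.textFileStream(inputDir)

    // This only registers the output operation; nothing runs yet.
    // Once the context is started, the body executes on the driver for each batch.
    lines.foreachRDD { rdd =>
      if (!rdd.isEmpty()) {
        lineageLines(rdd.toDebugString).foreach { l =>
          println(l)
          println("************")
        }
      }
    }

    // Processing begins here, after all operations have been set up.
    ssc.start()
    ssc.awaitTermination()
  }
}
```

In spark-shell you would reuse the existing sc (new StreamingContext(sc, Seconds(10))) instead of building a SparkConf, and you can skip awaitTermination() if you want the prompt back.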

On Thu, May 28, 2015 at 7:02 PM, Animesh Baranawal <animeshbarana...@gmail.com> wrote:

> I also started the streaming context by running ssc.start() but still
> apart from logs nothing of g gets printed.
>
> ---------- Forwarded message ----------
> From: Animesh Baranawal <animeshbarana...@gmail.com>
> Date: Thu, May 28, 2015 at 6:57 PM
> Subject: SPARK STREAMING PROBLEM
> To: user@spark.apache.org
>
>
> Hi,
>
> I am trying to extract the filenames from which a Dstream is generated by
> parsing the toDebugString method on RDD
> I am implementing the following code in spark-shell:
>
> import org.apache.spark.streaming.{StreamingContext, Seconds}
> val ssc = new StreamingContext(sc,Seconds(10))
> val lines = ssc.textFileStream(// directory //)
>
> def g : List[String] = {
>   var res = List[String]()
>   lines.foreachRDD{ rdd => {
>     if(rdd.count > 0){
>       val files = rdd.toDebugString.split("\n").filter(_.contains(":\"))
>       files.foreach{ ms => {
>         res = ms.split(" ")(2)::res
>       }} }
>   }}
>   res
> }
>
> g.foreach(x => {println(x); println("************")})
>
> However when I run the code, nothing gets printed on the console apart
> from the logs. Am I doing something wrong?
> And is there any better way to extract the file names from DStream ?
>
> Thanks in advance
>
>
> Animesh
>
>

--
Sourav Chandra
Senior Software Engineer
sourav.chan...@livestream.com
o: +91 80 4121 8723
m: +91 988 699 3746
skype: sourav.chandra
Livestream
"Ajmera Summit", First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main, 3rd Block, Koramangala Industrial Area, Bangalore 560034
www.livestream.com