Hi, sorry, please ignore my message; it was sent by mistake. I am still drafting.
Regards,
Vinti

On Mon, Feb 1, 2016 at 2:25 PM, Vinti Maheshwari <vinti.u...@gmail.com> wrote:
> Hi All,
>
> I recently started learning Spark. I need to use Spark Streaming.
>
> 1) Input: I need to read from MongoDB.
>
> db.event_gcovs.find({executions:"56791a746e928d7b176d03c0", valid:1,
>   infofile:{$exists:1}, geo:"sunnyvale"}, {infofile:1}).count()
>
> Number of info files: 24441
>
> /* 0 */
> {
>     "_id" : ObjectId("568eaeda71404e5c563ccb86"),
>     "infofile" : "/volume/testtech/datastore/code-coverage/p//infos/svl/6/56791a746e928d7b176d03c0/69958.pcp_napt44_20368.pl.30090.exhibit.R0-re0.15.1I20151218_1934_jammyc.pfe.i386.TC011.fail.FAIL.gcov.info"
> }
>
> One info file can contain thousands of these blocks (each block starts
> with the "SF" delimiter and ends with end_of_record).
>
>