Hi Helena and all,

I have found an example that reads a multi-line JSON file into an RDD using https://github.com/alexholmes/json-mapreduce:

    val data = sc.newAPIHadoopFile(
      filepath,
      classOf[MultiLineJsonInputFormat],
      classOf[LongWritable],
      classOf[Text],
      conf
    ).map(p => (p._1.get, p._2.toString))
    data.count

It expects a conf object. What value do I need to set in conf, and how do I set it? The MultiLineJsonInputFormat class expects a "member" value. How do I pass this "member" value? Without it I get the exception below:

java.io.IOException: Missing configuration value for multilinejsoninputformat.member
    at com.alexholmes.json.mapreduce.MultiLineJsonInputFormat.createRecordReader(MultiLineJsonInputFormat.java:30)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:115)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:103)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
    at org.apache.spark.scheduler.Task.run(Task.scala:54)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Please let me know how to resolve this issue.

Regards,
Rajesh

On Sun, Dec 14, 2014 at 7:21 PM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:
>
> Thank you Yanbo
>
> Regards,
> Rajesh
>
> On Sun, Dec 14, 2014 at 3:15 PM, Yanbo <yanboha...@gmail.com> wrote:
>>
>> Pay attention to your JSON file, try to change it like following.
>> Each record represent as a JSON string.
>>
>> { "NAME" : "Device 1",
>> "GROUP" : "1",
>> "SITE" : "qqq",
>> "DIRECTION" : "East",
>> }
>> { "NAME" : "Device 2",
>> "GROUP" : "2",
>> "SITE" : "sss",
>> "DIRECTION" : "North",
>> }
>>
>> > On Dec 14, 2014, at 5:01 PM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:
>> >
>> > { "Device 1" :
>> > { "NAME" : "Device 1",
>> > "GROUP" : "1",
>> > "SITE" : "qqq",
>> > "DIRECTION" : "East",
>> > }
>> > "Device 2" :
>> > { "NAME" : "Device 2",
>> > "GROUP" : "2",
>> > "SITE" : "sss",
>> > "DIRECTION" : "North",
>> > }
>> > }
>> >
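
For reference, here is a minimal sketch of one way the missing value reported in the exception above might be supplied. The key name multilinejsoninputformat.member is taken directly from the exception message. Everything else is an assumption: that a SparkContext named sc and a String named filepath already exist (as in spark-shell), that the json-mapreduce jar is on the classpath, and that "NAME" is a suitable member because it appears in every record quoted in this thread. As far as I understand it, the input format uses the member name to locate the enclosing JSON object for each record, but I have not confirmed that against the library source.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import com.alexholmes.json.mapreduce.MultiLineJsonInputFormat

// Assumption: `sc` (SparkContext) and `filepath` (String) are already defined.
// Copy the context's Hadoop settings so filesystem configuration is preserved.
val conf = new Configuration(sc.hadoopConfiguration)

// Key name comes from the IOException message; "NAME" is an assumed member
// picked from the sample records in this thread.
conf.set("multilinejsoninputformat.member", "NAME")

val data = sc.newAPIHadoopFile(
  filepath,
  classOf[MultiLineJsonInputFormat],
  classOf[LongWritable],
  classOf[Text],
  conf
).map { case (offset, json) => (offset.get, json.toString) }

data.count
```

The Hadoop objects are converted to plain Long and String values inside map because Hadoop record readers typically reuse the same Writable instances between records.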