Hi Helena and All,

I have found an example that loads a multi-line JSON file into an RDD using
https://github.com/alexholmes/json-mapreduce.

val data = sc.newAPIHadoopFile(
    filepath,
    classOf[MultiLineJsonInputFormat],
    classOf[LongWritable],
    classOf[Text],
    conf).map(p => (p._1.get, p._2.toString))
data.count

It expects a conf object. What value do I need to specify for conf, and how
do I specify it?
The MultiLineJsonInputFormat class expects a "member" value. How do I
pass the "member" value? Without it, I get the exception below:

java.io.IOException: Missing configuration value for multilinejsoninputformat.member
    at com.alexholmes.json.mapreduce.MultiLineJsonInputFormat.createRecordReader(MultiLineJsonInputFormat.java:30)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:115)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:103)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:65)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
    at org.apache.spark.scheduler.Task.run(Task.scala:54)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
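
Going by the key named in the exception, one thing that might work is to set
that key directly on a Hadoop Configuration and pass it in as conf. The
following is only a sketch under assumptions: it assumes a spark-shell session
where sc and filepath are already defined, and "NAME" is a placeholder guess
for the member name (the library's README should say what the member is meant
to denote).

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import com.alexholmes.json.mapreduce.MultiLineJsonInputFormat

// Copy the SparkContext's Hadoop settings so the shared configuration
// is not modified.
val conf = new Configuration(sc.hadoopConfiguration)

// Key name taken from the exception message above.
// "NAME" is an assumed placeholder, not a confirmed value.
conf.set("multilinejsoninputformat.member", "NAME")

val data = sc.newAPIHadoopFile(
    filepath,
    classOf[MultiLineJsonInputFormat],
    classOf[LongWritable],
    classOf[Text],
    conf).map(p => (p._1.get, p._2.toString))
data.count
```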

Please let me know how to resolve this issue.

Regards,
Rajesh


On Sun, Dec 14, 2014 at 7:21 PM, Madabhattula Rajesh Kumar <
mrajaf...@gmail.com> wrote:
>
> Thank you Yanbo
>
> Regards,
> Rajesh
>
> On Sun, Dec 14, 2014 at 3:15 PM, Yanbo <yanboha...@gmail.com> wrote:
>>
>> Pay attention to your JSON file. Try changing it as follows, so that
>> each record is represented as its own JSON string.
>>
>>  {    "NAME" : "Device 1",
>>       "GROUP" : "1",
>>       "SITE" : "qqq",
>>       "DIRECTION" : "East"
>>  }
>>  {    "NAME" : "Device 2",
>>       "GROUP" : "2",
>>       "SITE" : "sss",
>>       "DIRECTION" : "North"
>>  }
>>
>> > On Dec 14, 2014, at 5:01 PM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com>
>> wrote:
>> >
>> > { "Device 1" :
>> >  {    "NAME" : "Device 1",
>> >       "GROUP" : "1",
>> >       "SITE" : "qqq",
>> >       "DIRECTION" : "East",
>> >  }
>> >  "Device 2" :
>> >  {    "NAME" : "Device 2",
>> >       "GROUP" : "2",
>> >       "SITE" : "sss",
>> >       "DIRECTION" : "North",
>> >  }
>> > }
>>
>
