Re: JSON Input files

2014-12-15 Thread Madabhattula Rajesh Kumar
Hi Helena and All, I have found one example that reads a multi-line JSON file into an RDD using https://github.com/alexholmes/json-mapreduce. val data = sc.newAPIHadoopFile(filepath, classOf[MultiLineJsonInputFormat], classOf[LongWritable], classOf[Text], conf).map(p => (p._1.get,

Re: JSON Input files

2014-12-15 Thread Michael Armbrust
Underneath the covers, jsonFile uses TextInputFormat, which will split files correctly based on new lines. Thus, there is no fixed maximum size for a json object (other than the fact that it must fit into memory on the executors). On Mon, Dec 15, 2014 at 7:22 AM, Madabhattula Rajesh Kumar

Re: JSON Input files

2014-12-15 Thread Madabhattula Rajesh Kumar
Thank you Peter for the clarification. Regards, Rajesh On Tue, Dec 16, 2014 at 12:42 AM, Michael Armbrust mich...@databricks.com wrote: Underneath the covers, jsonFile uses TextInputFormat, which will split files correctly based on new lines. Thus, there is no fixed maximum size for a json

Re: JSON Input files

2014-12-14 Thread Madabhattula Rajesh Kumar
Hi Helena and All, I have the example JSON file format below. My use case is to read the NAME field. When I run it, I get the following exception: *Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 'NAME, tree:Project ['NAME] Subquery device*

Re: JSON Input files

2014-12-14 Thread Yanbo
Pay attention to your JSON file; try changing it as follows. Each record should be represented as a single JSON string. {"NAME" : "Device 1", "GROUP" : 1, "SITE" : "qqq", "DIRECTION" : "East"} {"NAME" : "Device 2", "GROUP" : 2, "SITE" : "sss", "DIRECTION" : "North"} On
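A minimal sketch of the reshaping suggested above, in plain Scala with no Spark dependency: it collapses each pretty-printed top-level JSON object onto a single line so that a line-based reader sees one record per line. The names `JsonLines` and `toJsonLines` are hypothetical, not part of Spark or any library. It is not a JSON parser; it only tracks brace depth and string literals, and it assumes the input is a sequence of top-level objects.

```scala
object JsonLines {
  /** Collapse every top-level JSON object in `text` onto one line.
    * Assumption: the input consists of top-level { ... } objects;
    * anything between objects (whitespace, commas, brackets) is skipped.
    */
  def toJsonLines(text: String): Vector[String] = {
    val records = Vector.newBuilder[String]
    val current = new StringBuilder
    var depth = 0         // brace nesting depth outside string literals
    var inString = false  // true while inside a "..." literal
    var escaped = false   // true right after a backslash inside a string

    for (c <- text) {
      if (depth == 0) {
        // Between records: ignore everything until the next object opens.
        if (c == '{') { depth = 1; current += c }
      } else if (inString) {
        // Inside a string literal every character is kept verbatim.
        current += c
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else c match {
        case '"' => inString = true; current += c
        case '{' => depth += 1; current += c
        case '}' =>
          depth -= 1
          current += c
          if (depth == 0) { records += current.toString; current.clear() }
        case ws if ws.isWhitespace => // drop formatting whitespace outside strings
        case other => current += other
      }
    }
    records.result()
  }
}
```

Each returned string can then be written out as one line of a new file, which is the one-record-per-line layout described in this thread. Braces or whitespace inside string values are left untouched because the sketch tracks whether it is inside a string literal.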

Re: JSON Input files

2014-12-14 Thread Madabhattula Rajesh Kumar
Thank you, Yanbo. Regards, Rajesh On Sun, Dec 14, 2014 at 3:15 PM, Yanbo yanboha...@gmail.com wrote: Pay attention to your JSON file; try changing it as follows. Each record should be represented as a single JSON string. {"NAME" : "Device 1", "GROUP" : 1, "SITE" : "qqq", "DIRECTION" : "East",

JSON Input files

2014-12-13 Thread Madabhattula Rajesh Kumar
Hi Team, I have a large JSON file in Hadoop. Could you please let me know: 1. How to read the JSON file; 2. How to parse the JSON file. Please share an example program in Scala. Regards, Rajesh

Re: JSON Input files

2014-12-13 Thread Helena Edelson
One solution can be found here: https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#json-datasets - Helena @helenaedelson On Dec 13, 2014, at 11:18 AM, Madabhattula Rajesh Kumar mrajaf...@gmail.com wrote: Hi Team, I have a large JSON file in Hadoop. Could you please let me know