In SQLContext:

    def jsonFile(path: String, samplingRatio: Double): SchemaRDD = {
      val json = sparkContext.textFile(path)
      jsonRDD(json, samplingRatio)
    }

Looks like jsonFile() can be enhanced with a call to sparkContext.newAPIHadoopFile() with the proper input file format.
Cheers

On Wed, Dec 17, 2014 at 8:33 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
> Hi Ted,
>
> Thanks for your help.
> I'm able to read lzo files using sparkContext.newAPIHadoopFile but I
> couldn't do the same for sqlContext because sqlContext.jsonFile does not
> provide a way to configure the input file format. Do you know if there are
> some APIs to do that?
>
> Best Regards,
>
> Jerry
>
> On Wed, Dec 17, 2014 at 11:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> See this thread: http://search-hadoop.com/m/JW1q5HAuFv
>> which references https://issues.apache.org/jira/browse/SPARK-2394
>>
>> Cheers
>>
>> On Wed, Dec 17, 2014 at 8:21 AM, Jerry Lam <chiling...@gmail.com> wrote:
>>>
>>> Hi spark users,
>>>
>>> Do you know how to read json files using Spark SQL that are LZO
>>> compressed?
>>>
>>> I'm looking into sqlContext.jsonFile but I don't know how to configure
>>> it to read lzo files.
>>>
>>> Best Regards,
>>>
>>> Jerry
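For anyone finding this thread later: a minimal sketch of the workaround Jerry describes, reading the LZO splits as text via sparkContext.newAPIHadoopFile and handing the decompressed lines to sqlContext.jsonRDD instead of jsonFile. This assumes the hadoop-lzo package (which provides com.hadoop.mapreduce.LzoTextInputFormat) and its native libraries are installed on the cluster; the input path and table name are hypothetical, and `sc`/`sqlContext` are the usual Spark 1.x contexts.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
// LzoTextInputFormat comes from hadoop-lzo (assumed to be on the classpath)
import com.hadoop.mapreduce.LzoTextInputFormat

// Read the LZO-compressed JSON as (byte offset, line) pairs;
// LzoTextInputFormat handles the decompression.
val lzoPairs = sc.newAPIHadoopFile(
  "hdfs:///data/events/*.json.lzo",  // hypothetical path
  classOf[LzoTextInputFormat],
  classOf[LongWritable],
  classOf[Text])

// Drop the byte-offset keys and convert hadoop Text to String.
val jsonLines = lzoPairs.map { case (_, line) => line.toString }

// Hand the decompressed lines to Spark SQL, bypassing jsonFile()'s
// hard-coded sparkContext.textFile() call.
val events = sqlContext.jsonRDD(jsonLines)
events.registerTempTable("events")
```

This is essentially the enhancement Ted suggests for jsonFile() itself, done by hand at the call site until something like SPARK-2394 makes the input format configurable.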