In SQLContext:

    def jsonFile(path: String, samplingRatio: Double): SchemaRDD = {
      val json = sparkContext.textFile(path)
      jsonRDD(json, samplingRatio)
    }

Looks like jsonFile() can be enhanced with a call to sparkContext.newAPIHadoopFile() with the proper input file format.
Cheers

On Wed, Dec 17, 2014 at 8:33 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
> Hi Ted,
>
> Thanks for your help.
> I'm able to read lzo files using sparkContext.newAPIHadoopFile but I
> couldn't do the same for sqlContext because sqlContext.jsonFile does not
> provide a way to configure the input file format. Do you know if there are
> some APIs to do that?
>
> Best Regards,
>
> Jerry
>
> On Wed, Dec 17, 2014 at 11:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> See this thread: http://search-hadoop.com/m/JW1q5HAuFv
>> which references https://issues.apache.org/jira/browse/SPARK-2394
>>
>> Cheers
>>
>> On Wed, Dec 17, 2014 at 8:21 AM, Jerry Lam <chiling...@gmail.com> wrote:
>>>
>>> Hi spark users,
>>>
>>> Do you know how to read json files using Spark SQL that are LZO
>>> compressed?
>>>
>>> I'm looking into sqlContext.jsonFile but I don't know how to configure
>>> it to read lzo files.
>>>
>>> Best Regards,
>>>
>>> Jerry
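For anyone finding this thread later: a minimal sketch of the workaround Jerry describes, reading the LZO splits as text via sparkContext.newAPIHadoopFile and handing the decompressed lines to sqlContext.jsonRDD instead of jsonFile. This assumes the hadoop-lzo package (which provides com.hadoop.mapreduce.LzoTextInputFormat) and its native libraries are installed on the cluster; the input path and table name are hypothetical, and `sc`/`sqlContext` are the usual Spark 1.x contexts.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
// LzoTextInputFormat comes from hadoop-lzo (assumed to be on the classpath)
import com.hadoop.mapreduce.LzoTextInputFormat

// Read the LZO-compressed JSON as (byte offset, line) pairs;
// LzoTextInputFormat handles the decompression.
val lzoPairs = sc.newAPIHadoopFile(
  "hdfs:///data/events/*.json.lzo",  // hypothetical path
  classOf[LzoTextInputFormat],
  classOf[LongWritable],
  classOf[Text])

// Drop the byte-offset keys and convert hadoop Text to String.
val jsonLines = lzoPairs.map { case (_, line) => line.toString }

// Hand the decompressed lines to Spark SQL, bypassing jsonFile()'s
// hard-coded sparkContext.textFile() call.
val events = sqlContext.jsonRDD(jsonLines)
events.registerTempTable("events")
```

This is essentially the enhancement Ted suggests for jsonFile() itself, done by hand at the call site until something like SPARK-2394 makes the input format configurable.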