I don't know whether this is common, but we might also allow another separator 
for JSON objects, such as two blank lines.
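
To make the idea concrete, here is a minimal sketch in plain Python, not Spark code. It assumes the writer always puts two blank lines (three consecutive newlines) between top-level objects and never inside one; the `SEPARATOR` constant and the `split_records` helper are hypothetical names used only for illustration.

```python
import json

# Assumption: two blank lines between records means three newlines in a row.
SEPARATOR = "\n\n\n"


def split_records(text, separator=SEPARATOR):
    """Split on the separator and parse each non-empty chunk as one JSON value."""
    return [json.loads(chunk) for chunk in text.split(separator) if chunk.strip()]


# Two pretty-printed (multi-line) objects separated by two blank lines.
sample = (
    '{\n  "id": 1,\n  "tags": ["a", "b"]\n}'
    "\n\n\n"
    '{\n  "id": 2,\n  "note": "a { inside a string"\n}\n'
)

records = split_records(sample)
for record in records:
    print(record)
```

Since JSON strings cannot contain raw newlines (they have to be escaped), the separator can never show up inside a string value; it could only collide with whitespace between tokens, which the writer would have to avoid. On the Hadoop side, I believe the `textinputformat.record.delimiter` setting of the text input format covers this kind of custom record delimiter, but I have not checked how it would be wired into the JSON loading path.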

Matei

> On May 4, 2015, at 2:28 PM, Reynold Xin <r...@databricks.com> wrote:
> 
> Joe - I think that's a legit and useful thing to do. Do you want to give it
> a shot?
> 
> On Mon, May 4, 2015 at 12:36 AM, Joe Halliwell <joe.halliw...@gmail.com>
> wrote:
> 
>> I think Reynold’s argument shows the impossibility of the general case.
>> 
>> But a “maximum object depth” hint could enable a new input format to do
>> its job both efficiently and correctly in the common case where the input
>> is an array of similarly structured objects! I’d certainly be interested in
>> an implementation along those lines.
>> 
>> Cheers,
>> Joe
>> 
>> http://www.joehalliwell.com
>> @joehalliwell
>> 
>> 
>> On Mon, May 4, 2015 at 7:55 AM, Reynold Xin <r...@databricks.com> wrote:
>> 
>>> I took a quick look at that implementation. I'm not sure if it actually
>>> handles JSON correctly, because it attempts to find the first { starting
>>> from a random point. However, that random point could be in the middle of
>>> a string, and thus the first { might just be part of a string, rather than
>>> a real JSON object starting position.
>>> 
>>> 
>>> On Sun, May 3, 2015 at 11:13 PM, Emre Sevinc <emre.sev...@gmail.com>
>>> wrote:
>>> 
>>>> You can check out the following library:
>>>> 
>>>> https://github.com/alexholmes/json-mapreduce
>>>> 
>>>> --
>>>> Emre Sevinç
>>>> 
>>>> 
>>>> On Sun, May 3, 2015 at 10:04 PM, Olivier Girardot <
>>>> o.girar...@lateral-thoughts.com> wrote:
>>>> 
>>>>> Hi everyone,
>>>>> Is there any way in Spark SQL to load multi-line JSON data efficiently?
>>>>> I think there was a reference in the mailing list to
>>>>> http://pivotal-field-engineering.github.io/pmr-common/ for its
>>>>> JSONInputFormat.
>>>>>
>>>>> But it's rather inaccessible, considering the dependency is not
>>>>> available in any public Maven repo (if you know of one, I'd be glad to
>>>>> hear it).
>>>>>
>>>>> Is there any plan to address this, or any public recommendation?
>>>>> (considering the documentation clearly states that sqlContext.jsonFile
>>>>> will not work for multi-line JSON)
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> Olivier.
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Emre Sevinc
>>>> 
>>> 
>> 
>> 
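
The objection quoted above, that the first { found from an arbitrary offset may sit inside a string, can be reproduced in a few lines. This is only an illustrative sketch in plain Python: `naive_first_object` is a hypothetical stand-in for a splitter that decodes from the first { it sees, not the code of the library mentioned in the thread.

```python
import json


def naive_first_object(text, offset):
    """Hypothetical naive splitter: decode from the first '{' at or after offset."""
    start = text.find("{", offset)
    value, _end = json.JSONDecoder().raw_decode(text, start)
    return start, value


# An array of similarly structured objects; one string value contains braces.
doc = '[{"name": "{}", "n": 1}, {"name": "x", "n": 2}]'

# Simulate a split boundary that falls just before the string value "{}".
offset = doc.index('"{}"')
start, value = naive_first_object(doc, offset)

# The scan stops on the brace inside the string and decodes a bogus empty object.
print(start, value)
print(json.loads(doc)[0])  # the record that actually covers this offset
```

Here the decoder happens to succeed and silently returns a wrong record. With other string contents it would raise a parse error instead, so a splitter cannot rely on failures to detect the problem.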