I don't know whether this is common, but we might also allow another separator for JSON objects, such as two blank lines.
Matei

> On May 4, 2015, at 2:28 PM, Reynold Xin <r...@databricks.com> wrote:
>
> Joe - I think that's a legit and useful thing to do. Do you want to give it
> a shot?
>
> On Mon, May 4, 2015 at 12:36 AM, Joe Halliwell <joe.halliw...@gmail.com>
> wrote:
>
>> I think Reynold’s argument shows the impossibility of the general case.
>>
>> But a “maximum object depth” hint could enable a new input format to do
>> its job both efficiently and correctly in the common case where the input
>> is an array of similarly structured objects! I’d certainly be interested in
>> an implementation along those lines.
>>
>> Cheers,
>> Joe
>>
>> http://www.joehalliwell.com
>> @joehalliwell
>>
>>
>> On Mon, May 4, 2015 at 7:55 AM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> I took a quick look at that implementation. I'm not sure if it actually
>>> handles JSON correctly, because it attempts to find the first { starting
>>> from a random point. However, that random point could be in the middle of
>>> a string, and thus the first { might just be part of a string, rather than
>>> a real JSON object starting position.
>>>
>>>
>>> On Sun, May 3, 2015 at 11:13 PM, Emre Sevinc <emre.sev...@gmail.com>
>>> wrote:
>>>
>>>> You can check out the following library:
>>>>
>>>> https://github.com/alexholmes/json-mapreduce
>>>>
>>>> --
>>>> Emre Sevinç
>>>>
>>>>
>>>> On Sun, May 3, 2015 at 10:04 PM, Olivier Girardot
>>>> <o.girar...@lateral-thoughts.com> wrote:
>>>>
>>>>> Hi everyone,
>>>>> Is there any way in Spark SQL to load multi-line JSON data efficiently, I
>>>>> think there was in the mailing list a reference to
>>>>> http://pivotal-field-engineering.github.io/pmr-common/ for its
>>>>> JSONInputFormat
>>>>>
>>>>> But it's rather inaccessible considering the dependency is not available in
>>>>> any public maven repo (If you know of one, I'd be glad to hear it).
>>>>>
>>>>> Is there any plan to address this or any public recommendation ?
>>>>> (considering the documentation clearly states that sqlContext.jsonFile will
>>>>> not work for multi-line json(s))
>>>>>
>>>>> Regards,
>>>>>
>>>>> Olivier.
>>>>>
>>>>
>>>>
>>>> --
>>>> Emre Sevinc
>>>>
>>>
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
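
For illustration, the separator idea from the top of this thread (records delimited by two blank lines) might look roughly like the sketch below. It is plain Python rather than a Spark input format, and the names `RECORD_SEPARATOR` and `split_records` are hypothetical, not part of any Spark API. It assumes Unix line endings and that the writer never emits two consecutive blank lines inside a single object. Because a JSON string literal cannot contain a raw newline, the separator cannot fall in the middle of a string, which avoids the problem with scanning for the first { from an arbitrary offset.

```python
import json
import re

# Assumption: records are separated by two or more blank lines, i.e. a newline
# followed by at least two lines holding only spaces or tabs.
RECORD_SEPARATOR = re.compile(r"\n(?:[ \t]*\n){2,}")


def split_records(text):
    """Yield one parsed JSON value per separator-delimited chunk of text."""
    for chunk in RECORD_SEPARATOR.split(text):
        # Skip empty chunks caused by leading or trailing separators.
        if chunk.strip():
            yield json.loads(chunk)


if __name__ == "__main__":
    sample = '{\n  "a": 1\n}\n\n\n{\n  "b": "{ not a start"\n}\n'
    for record in split_records(sample):
        print(record)
```

A splittable input format would additionally need to locate the first separator after an arbitrary byte offset; that part is not shown here.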