On 19 Oct 2016, at 21:46, Jakob Odersky wrote:

> Another reason I could imagine is that files are often read from HDFS,
> which by default uses line terminators to separate records.
>
> It is possible to implement your own HDFS delimiter finder; however,
> for arbitrary JSON data, finding that delimiter would require stateful
> parsing of the file.
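Jakob's point can be sketched with a minimal, hypothetical splitter (not Spark's actual code): to find record boundaries in concatenated arbitrary JSON, a scanner must track nesting depth, whether it is inside a string, and escape characters — simply splitting on a delimiter character is not enough, which is why newline-separated records are so much easier for HDFS-style splitting.

```python
import json

def split_top_level_json(stream: str):
    """Illustrative sketch: split concatenated JSON objects into records.

    Unlike splitting on '\n', this requires stateful parsing: braces
    inside strings must not count toward nesting depth, and escaped
    quotes must not end a string.
    """
    records, depth, start = [], 0, 0
    in_string = escaped = False
    for i, ch in enumerate(stream):
        if in_string:
            if escaped:
                escaped = False        # previous char was a backslash
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False      # closing quote of a string
        elif ch == '"':
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i              # a new top-level record begins
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                records.append(stream[start:i + 1])
    return records

# Note the '}' hidden inside a string value — a naive delimiter
# finder would split in the wrong place here.
parts = split_top_level_json('{"a": "}"}{"b": {"c": 1}}')
```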
Regarding his recent PR[1], I guess he meant multi-line JSON.
As far as I know, single-line JSON also complies with the standard. I left
a comment with the RFC in the PR, but please let me know if I am wrong at
any point.
Thanks!

[1] https://github.com/apache/spark/pull/15511
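To illustrate the point that single-line JSON complies with the standard: in the line-delimited layout, each line is itself a complete JSON document, parseable by any standards-conforming parser. A sketch using Python's stdlib `json` module with made-up sample records:

```python
import json

# Hypothetical sample in the line-delimited layout: every line is a
# complete, standards-compliant JSON document on its own.
json_lines = "\n".join([
    '{"name": "alice", "age": 30}',
    '{"name": "bob", "age": 25}',
])

# Each line parses independently -- no state is shared across lines,
# which is exactly what makes the format splittable.
records = [json.loads(line) for line in json_lines.splitlines()]
```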
On 19 Oct 2016 7:00 a.m., Koert Kuipers wrote:
A single JSON object would mean that, for most parsers, it needs to fit in
memory when reading or writing.
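The memory point above can be sketched with Python's stdlib `json` (an illustrative contrast, not Spark's implementation): a single top-level JSON document must be materialized in full before any element is usable, while line-delimited records can be consumed one at a time, keeping memory bounded regardless of file size.

```python
import io
import json

# A single top-level JSON array: typical parsers must read and build
# the entire document before returning anything.
whole_doc = io.StringIO('[{"id": 1}, {"id": 2}]')
all_at_once = json.load(whole_doc)      # whole document in memory

# Line-delimited records: each line is parsed independently, so a
# reader can process and discard one record before reading the next.
line_file = io.StringIO('{"id": 1}\n{"id": 2}\n')
one_at_a_time = [json.loads(line) for line in line_file]
```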
Note that codlife didn't seem to be asking about /single-object/ JSON files,
but about /standard-format/ JSON files.
On Oct 15, 2016 11:09, "codlife" <1004910...@qq.com> wrote:

> Hi:
> I have a doubt about the design of spark.read.json: why is the JSON file
> not a standard JSON file? Can anyone tell me the internal reason? Any
> advice is appreciated.