On first look I could read all the files, but doing a flatten caused all
kinds of problems. The worst was a repeatable kernel panic.

I think I'm back to making the initial files smaller for the larger file sets.

I have some files that are, say, 100M in size. Each file is a single line
containing one JSON object with an array:
{"MyArrayInTheFile":[{"a":"1","b":"2"},{"a":"1","b":"2"},...]}
What is the best way to represent that so it can be explored? Do I do what
was suggested before and put each array entry on its own line?
{"MyArrayInTheFile":[
{"a":"1","b":"2"},
{"a":"1","b":"2"},
...
]}
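If it helps to see the reformatting step concretely, here is a minimal Python sketch that rewrites a file into the one-entry-per-line layout above. The key name and file paths are just taken from the example; adjust them for real data. It assumes the whole file fits in memory, which a 100M file should.

```python
import json

def explode_array(src_path, dst_path, key="MyArrayInTheFile"):
    """Rewrite a single-line JSON file so each array entry is on its own line.

    `key` is the top-level array name from the example above; change it
    for real data. A 100 MB file still fits in memory, so a plain
    json.load is fine here.
    """
    with open(src_path) as src:
        entries = json.load(src)[key]
    with open(dst_path, "w") as dst:
        dst.write('{"%s":[\n' % key)
        for i, entry in enumerate(entries):
            # Comma after every entry except the last, then a newline.
            sep = ",\n" if i < len(entries) - 1 else "\n"
            dst.write(json.dumps(entry, separators=(",", ":")) + sep)
        dst.write("]}\n")

# Tiny demo on the example shape:
with open("demo.json", "w") as f:
    f.write('{"MyArrayInTheFile":[{"a":"1","b":"2"},{"a":"1","b":"2"}]}')
explode_array("demo.json", "demo_lines.json")
```

The output is still valid JSON (same document, just with line breaks inside the array), so it parses back to exactly the original content.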

What works best for the 0.8 code?


On Thu, Mar 19, 2015 at 12:59 PM, Jim Bates <jba...@maprtech.com> wrote:

> Ok, went to drill-0.8.0.31020-1 and it was 1000% better.
>
> On Thu, Mar 19, 2015 at 12:16 PM, Sudhakar Thota <sth...@maprtech.com>
> wrote:
>
>> I got the same issue; engineering recommended I use drill-0.8.0.
>>
>> Sudhakar Thota
>> Sent from my iPhone
>>
>> > On Mar 19, 2015, at 9:22 AM, Jim Bates <jba...@maprtech.com> wrote:
>> >
>> > I constantly, constantly, constantly hit this.
>> >
>> > I have json files that are just a huge collection of an array of json
>> > objects
>> >
>> > example
>> > "MyArrayInTheFile":
>> > [{"a":"1","b":"2","c":"3"},{"a":"1","b":"2","c":"3"},...]
>> >
>> > My issue is in exploring the data; I hit this:
>> >
>> > Query failed: Query stopped., Record was too large to copy into vector.
>> [
>> > 39186288-2e01-408c-b886-dcee0a2c25c5 on maprdemo:31010 ]
>> >
>> > I can explore csv, tab, maprdb, hive at fairly large data sets and limit
>> > the response to what fits in my system limitations but not json in this
>> > format.
>> >
>> > The two options I have come up with to move forward are:
>> >
>> >   1. Strip out 90% of the array values in a file and explore that to
>> >   get to my view, then go to a larger system and see if I have enough
>> >   to get the job done.
>> >   2. Move to the larger system and explore there, taking resources
>> >   that don't need to be spent on a science project.
>> >
>> > Hoping the smart people have a different option for me,
>> >
>> > Jim
>>
>
>