Thanks for flagging the problem in the docs, guys. In some cases, removing
an empty array can be used as a workaround to query a JSON file that
otherwise fails, as shown in this example:

http://drill.apache.org/docs/json-data-model/#example:-access-a-map-field-in-an-array

Maybe this example, assuming it still works as shown in the docs, should
remain because empty arrays sometimes cause problems.

In the workaround section
http://drill.apache.org/docs/json-data-model/#empty-array, I'll remove the
ambiguous example {"a":[]} and reword the section to say that an empty array
sometimes causes problems and that removing it is worth trying if a query
fails.
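
For anyone who wants to try the "remove empty arrays" workaround on a
line-delimited JSON file before querying it, here is a minimal sketch in
Python. It is only an illustration, not something from the Drill docs: the
function names drop_empty_arrays and clean_json_lines are hypothetical, and it
assumes one JSON object per line, like the x.json sample quoted below.

```python
import json


def drop_empty_arrays(value):
    """Return a copy of value with object fields equal to [] removed.

    Only fields of JSON objects are dropped; an empty array nested
    directly inside another array is left in place.
    """
    if isinstance(value, dict):
        return {
            key: drop_empty_arrays(item)
            for key, item in value.items()
            if item != []
        }
    if isinstance(value, list):
        return [drop_empty_arrays(item) for item in value]
    return value


def clean_json_lines(lines):
    """Yield cleaned JSON text for each non-blank input line."""
    for line in lines:
        if line.strip():
            yield json.dumps(drop_empty_arrays(json.loads(line)))


if __name__ == "__main__":
    # The two sample records from the thread.
    sample = ['{"b":1, "a":[] }', '{"a":[1,2], "b":3}']
    for cleaned in clean_json_lines(sample):
        print(cleaned)
```

With the two sample records, the first loses its "a" field entirely and the
second is unchanged, so a query would see "a" as missing (null) for the first
row rather than as an empty array.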

Kristine Hahn
Sr. Technical Writer
415-497-8107 @krishahn skype:krishahn


On Mon, Oct 19, 2015 at 1:03 PM, Andries Engelbrecht <
aengelbre...@maprtech.com> wrote:

> Word of caution: Flatten may be better, since a[0] is not null tests only the
> first element, and it may be that only the first element is null.
>
> —Andries
>
>
> > On Oct 19, 2015, at 12:59 PM, John Omernik <j...@omernik.com> wrote:
> >
> > Awesome that worked.
> >
> > *The documentation should probably be updated on the array stuff, it's
> > not accurate as it pertains to empty arrays.
> >
> >
> >
> > On Mon, Oct 19, 2015 at 2:52 PM, Andries Engelbrecht <
> > aengelbre...@maprtech.com> wrote:
> >
> >> Use where a[0] is not null
> >>
> >> 0: jdbc:drill:> select * from `./array.json`;
> >> +----+--------+
> >> | b  |   a    |
> >> +----+--------+
> >> | 1  | []     |
> >> | 3  | [1,2]  |
> >> +----+--------+
> >> 2 rows selected (0.13 seconds)
> >> 0: jdbc:drill:> select * from `./array.json` where a[0] is not null;
> >> +----+--------+
> >> | b  |   a    |
> >> +----+--------+
> >> | 3  | [1,2]  |
> >> +----+--------+
> >> 1 row selected (0.151 seconds)
> >>
> >> —Andries
> >>
> >>
> >>> On Oct 19, 2015, at 12:32 PM, John Omernik <j...@omernik.com> wrote:
> >>>
> >>> Well, you are in a sense confirming my suspicions that an empty array,
> >>> specified in the docs as "error causing", doesn't actually cause an
> >>> error, and that this is expected. That is, empty arrays are not the big
> >>> meanies that the docs make them out to be (my results are the same as
> >>> yours, that is, no errors).
> >>>
> >>> I like the flatten approach, but is there a simple way to say select *
> >>> from dfs.tdunning.`x.json` where
> >>>
> >>> a is not empty
> >>>
> >>> or
> >>>
> >>> size(a) == 0
> >>>
> >>> or
> >>>
> >>> a != []
> >>>
> >>>
> >>> I guess some functions for working with arrays would be handy. I'll
> >>> play with flatten to see if it gives me what I am looking for, but are
> >>> there other ways to play with arrays (now that I have confirmed that
> >>> empty arrays aren't evil)?
> >>>
> >>> John
> >>>
> >>>
> >>> On Mon, Oct 19, 2015 at 1:40 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >>>
> >>>> John,
> >>>>
> >>>> I don't understand what you are seeing.  Here is what I am seeing (and
> >>>> hopefully you can tell what I am missing).
> >>>>
> >>>> First the input is:
> >>>>
> >>>> $ cat x.json
> >>>> {"b":1, "a":[] }
> >>>> {"a":[1,2], "b":3}
> >>>>
> >>>> And then with this input, I get this:
> >>>>
> >>>> 0: jdbc:drill:> select * from dfs.tdunning.`x.json`;
> >>>> +----+------------+
> >>>> | b  |     a      |
> >>>> +----+------------+
> >>>> | 1  | []         |
> >>>> | 3  | ["1","2"]  |
> >>>> +----+------------+
> >>>> 2 rows selected (0.443 seconds)
> >>>> 0: jdbc:drill:> select a,b from dfs.tdunning.`x.json`;
> >>>> +------------+----+
> >>>> |     a      | b  |
> >>>> +------------+----+
> >>>> | []         | 1  |
> >>>> | ["1","2"]  | 3  |
> >>>> +------------+----+
> >>>> 2 rows selected (0.473 seconds)
> >>>> 0: jdbc:drill:> select flatten(a),b from dfs.tdunning.`x.json`;
> >>>> +---------+----+
> >>>> | EXPR$0  | b  |
> >>>> +---------+----+
> >>>> | 1       | 3  |
> >>>> | 2       | 3  |
> >>>> +---------+----+
> >>>> 2 rows selected (0.499 seconds)
> >>>>
> >>>>
> >>>> On Mon, Oct 19, 2015 at 7:03 AM, John Omernik <j...@omernik.com> wrote:
> >>>>
> >>>>> In https://drill.apache.org/docs/json-data-model/ there is a section
> >>>>> that goes as laid out below. This is actually not occurring for me. I
> >>>>> have a json dump from Mongo that has a field called tags where many
> >>>>> records have "tags":[] and it's outputting that without error. (It
> >>>>> just shows [] as the output.)
> >>>>>
> >>>>> So, my question is this: based on the documentation, what I am seeing
> >>>>> is NOT expected. Is it a miss in the docs, or something that is fixed
> >>>>> in the 1.2 release that I have?
> >>>>>
> >>>>> If it is fixed so we can have empty arrays in a field like tags, are
> >>>>> there functions I can use to determine whether that field is empty?
> >>>>> For example, isemptyarray(tags) returning true if the array is empty,
> >>>>> or perhaps something to get me the length of said array? These
> >>>>> functions would be very valuable in queries (if the empty arrays thing
> >>>>> is not a weird quirk I am seeing).
> >>>>>
> >>>>> Empty array
> >>>>>
> >>>>> Drill cannot read an empty array, shown in the following example, and
> >>>>> attempting to do so causes an error.
> >>>>>
> >>>>>   { "a":[] }
> >>>>>
> >>>>> Workaround: Remove empty arrays.
> >>>>>
> >>>>
> >>
> >>
>
>
