Thank you.
On Thu, Jul 23, 2015 at 7:24 PM, Ted Dunning ted.dunn...@gmail.com wrote:
On Thu, Jul 23, 2015 at 3:55 AM, Stefán Baxter ste...@activitystream.com wrote:
Someone must review the underlying optimization errors to prevent this from
happening to others.
Jinfeng and Parth
Hi Stefán,
Thanks a lot for bringing up this issue, which is really helpful to improve
Drill.
I tried to reproduce the issues. I could reproduce the missing data issue
with the CTAS Parquet table, but I could not reproduce it when querying the
JSON file directly.
Here is
hi,
I can provide you with a JSON file and statements to reproduce it if you wish.
thank you for looking into this.
regards,
-Stefan
On Jul 23, 2015 9:03 PM, Jinfeng Ni jinfengn...@gmail.com wrote:
Hi Stefán,
Thanks a lot for bringing up this issue, which is really helpful to improve
Drill.
Hi Stefan,
Sorry to hear about your misadventure in Drill land. I will try to give you
some more information, but I also have limited knowledge of this specific
case and other developers will probably jump in to correct me.
When you try to read schema-less data, Drill will first investigate
Hi Abdel,
Thank you for taking the time to respond. I know my frustration is leaking
through but that does not mean I don't appreciate everything you and the
Drill team are doing, I do.
I also understand the premise of the optimization but I find it too
restrictive and it certainly does not fit our
On 23 Jul 2015, at 10:53, Abdel Hakim Deneche wrote:
When you try to read schema-less data, Drill will first investigate the
first 1000 rows to figure out a schema for your data, then it will use this
schema for the remainder of the query.
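To check whether a file is likely to be affected by the sampling described above, a small script can list the fields that first appear after the sampled rows. This is only a sketch and not Drill's implementation: the function name `late_fields` and the `sample_size` parameter are hypothetical, and it assumes newline-delimited JSON with one object per line, looking at top-level fields only.

```python
import json


def late_fields(lines, sample_size=1000):
    """Return top-level fields that first appear after the first
    `sample_size` records (hypothetical helper, not part of Drill)."""
    sampled = set()   # fields seen within the sampled records
    late = set()      # fields seen only after the sample
    count = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines without counting them as records
        keys = set(json.loads(line))  # assumes each record is a JSON object
        if count < sample_size:
            sampled |= keys
        else:
            late |= keys - sampled
        count += 1
    return sorted(late)
```

Calling it as `late_fields(open("test.json"))` would report the fields a schema based only on the first 1000 records could not know about.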
To clarify, if the JSON schema changes on the 1001st or 1MMth
Hi,
The only right answer to this question must be to a) adapt to additional
information and b) try its hardest to accommodate changes.
The current behavior must be seen as completely worthless (sorry for the
strong language).
Regards,
-Stefan
On Thu, Jul 23, 2015 at 4:16 PM, Matt
I don't think Drill is supposed to ignore data. My understanding is that
the reader will read the new fields which will cause a schema change, and
depending on the query (if all operators involved can handle the schema
change or not) the query should either succeed or fail.
My understanding is
Hi,
The workaround for this was to edit the first line in the JSON file and
fake a value for the additional field.
That way the optimizer could not decide to ignore it.
Someone must review the underlying optimization errors to prevent this from
happening to others.
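The manual workaround above (faking a value in the first record) could be scripted roughly as follows. This is a sketch under assumptions: `merge_missing` and `pad_first_record` are hypothetical helpers, records are already parsed into Python dicts, and the fake value is simply a copy of the first value observed for each missing field. It changes the data in the first record, so it is a stopgap rather than a fix.

```python
import copy


def merge_missing(target, source):
    """Copy into `target` every key that `source` has and `target` lacks,
    recursing into nested objects (hypothetical helper)."""
    for key, value in source.items():
        if key not in target:
            target[key] = copy.deepcopy(value)  # fake value taken from a later record
        elif isinstance(target[key], dict) and isinstance(value, dict):
            merge_missing(target[key], value)
    return target


def pad_first_record(records):
    """Return a new list whose first record carries every field
    (including nested ones) that appears in any later record."""
    if not records:
        return []
    padded = copy.deepcopy(records[0])  # leave the caller's data untouched
    for record in records[1:]:
        merge_missing(padded, record)
    return [padded] + list(records[1:])
```

Writing the padded records back out one JSON object per line gives a file whose first line mentions every field, which is what the manual edit achieved.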
JSON data, which is
in addition to this.
selecting: select some, t.others, t.others.additional from dfs.tmp.`/test.json` as t;
- returns this: yes, {additional:last entries only}, last entries only
finding the previously missing value but then ignoring all the other values
of the sub structure.
- Stefan
On Wed,