I’m not an Avro user, but I’d definitely vote for improving this. — C
> On Aug 17, 2017, at 10:17, John Omernik <j...@omernik.com> wrote:
>
> I was guessing you would chime in with a response ;)
>
> Are you still using Drill w/ Avro? How have things been lately?
>
> On Thu, Aug 17, 2017 at 8:00 AM, Stefán Baxter <ste...@activitystream.com> wrote:
>
>> woha!!!
>>
>> (sorry, I just had to)
>>
>> Best of luck with that!
>>
>> Regards,
>> -Stefán
>>
>> On Thu, Aug 17, 2017 at 12:37 PM, John Omernik <j...@omernik.com> wrote:
>>
>>> I know Avro is the unwanted child of the Drill world. (I know others
>>> have tried to mature the Avro support, and it is still in an
>>> "experimental" state.)
>>>
>>> That said, isn't it time for us to clean it up?
>>>
>>> I am sure there are some open JIRAs out there; the Avro page (last doc
>>> update Nov 21, 2016) points to this:
>>> https://issues.apache.org/jira/browse/DRILL/component/12328941/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel
>>>
>>> And I just ran into an issue... I am going to run it by the list to see
>>> if it's JIRA-worthy or already known:
>>>
>>> I have two directories, one JSON (brodns) and one Avro (brodnsavro).
>>>
>>> They both have subdirectories named by YYYY-MM-DD dates.
>>>
>>> When I run
>>>
>>> select dir0, count(*) from `brodns` group by dir0
>>>
>>> this works great!
>>>
>>> When I run
>>>
>>> select dir0, count(*) from `brodnsavro` group by dir0
>>>
>>> I get:
>>>
>>> VALIDATION ERROR: From line 1, column 58 to line 1, column 61: Column
>>> 'dir0' not found in any table
>>>
>>> If I run
>>>
>>> select count(*) from `brodnsavro/2017-08-17`
>>>
>>> this works.
>>>
>>> If I run
>>>
>>> select count(*) from `brodnsavro`
>>>
>>> this also works.
>>>
>>> But dir0 doesn't appear to be applied to Avro.
>>>
>>> I really feel this should be consistent (in addition to fixing the
>>> other issues in Avro), and let's make Avro a first-class citizen of
>>> the Drill world.
>>>
>>> (If folks are interested, I'd be happy to discuss my use case. It
>>> involves applying a schema to JSON records on Kafka/MapR Streams in
>>> StreamSets, and then outputting to Avro files. From there I hope to
>>> convert to Parquet, but I don't want to use MapReduce, hence Drill!)