Re: Avro - Let's talk Avro again

Charles Givre Thu, 17 Aug 2017 07:20:30 -0700

I’m not an Avro user, but I’d definitely vote for improving this.
— C


> On Aug 17, 2017, at 10:17, John Omernik <j...@omernik.com> wrote:
> 
> I was guessing you would chime in with a response ;)
> 
> Are you still using Drill w/ Avro? How have things been lately?
> 
> On Thu, Aug 17, 2017 at 8:00 AM, Stefán Baxter <ste...@activitystream.com>
> wrote:
> 
>> woha!!!
>> 
>> 
>> (sorry, I just had to)
>> 
>> 
>> Best of luck with that!
>> 
>> Regards,
>> -Stefán
>> 
>> On Thu, Aug 17, 2017 at 12:37 PM, John Omernik <j...@omernik.com> wrote:
>> 
>>> I know Avro is the unwanted child of the Drill world. (I know others have
>>> tried to mature the Avro support, and it is still in an "experimental"
>>> state.)
>>> 
>>> That said, isn't it time for us to clean it up?
>>> 
>>> I am sure there are some open JIRAs out there; the Avro page (last doc
>>> update Nov 21, 2016) points to this:
>>> https://issues.apache.org/jira/browse/DRILL/component/12328941/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel
>>> 
>>> And I just ran into an issue... I am going to run it by here to see if
>>> it's JIRA-worthy or known:
>>> 
>>> I have two directories, one json (brodns) and one avro (brodnsavro)
>>> 
>>> They both have subdirectories that are YYYY-MM-DD dates.
>>>
>>> When I run
>>> 
>>> select dir0, count(*) from `brodns` group by dir0  - This works great!
>>> 
>>> when I run
>>> 
>>> select dir0, count(*) from `brodnsavro` group by dir0 - I get:
>>> 
>>> VALIDATION ERROR: From line 1, column 58 to line 1, column 61: Column
>>> 'dir0' not found in any table
>>> 
>>> 
>>> If I run
>>> 
>>> 
>>> select count(*) from `brodnsavro/2017-08-17` this works
>>> 
>>> if I run
>>> 
>>> 
>>> select count(*) from `brodnsavro` this also works
>>> 
>>> 
>>> But dir0 doesn't appear to be applied to Avro.
>>> 
>>> 
>>> 
>>> I really feel this should be consistent (in addition to fixing the
>>> other issues in Avro), and let's make Avro a first-class citizen of the
>>> Drill world.
>>> 
>>> 
>>> (If folks are interested, I'd be happy to discuss my use case. It involves
>>> applying a schema to JSON records on Kafka/MapR Streams in StreamSets, and
>>> then outputting to Avro files. From there I hope to convert to Parquet, but
>>> I don't want to use MapReduce, hence Drill!)
>>> 
>>
