Hi Jason,

Is it possible that the Avro plugin does not use any parallelism and that
all the target files are scanned sequentially by the same process? (This is
on Drill 1.5.)

- Stefán

On Fri, Feb 26, 2016 at 8:04 PM, Stefán Baxter <ste...@activitystream.com>
wrote:

> Thank you Jason.
>
> I do realize that this is an open-source project and that everyone is doing
> their best.
>
> There are just a few things I wish I had realized before switching over
> from JSON to Avro that have caused us a lot of problems and taken a long
> time.
>
> Your work is appreciated and I apologize for letting my frustration get
> the better of me.
>
> - Stefán
>
> On Fri, Feb 26, 2016 at 8:00 PM, Jason Altekruse <altekruseja...@gmail.com>
> wrote:
>
>> Stefan,
>>
>> I'm sorry that we have not been better about getting back to the issues you
>> have filed against the Avro reader. We do appreciate all of the effort you
>> have put into filing thorough bugs and being active in the discussions on
>> the list. I have responded on the bug you filed on this issue [1] with a
>> workaround and will be posting a patch shortly with a fix.
>>
>> - Jason
>>
>> [1] - https://issues.apache.org/jira/browse/DRILL-4441
>>
>> On Thu, Feb 25, 2016 at 12:29 PM, Stefán Baxter <
>> ste...@activitystream.com>
>> wrote:
>>
>> > Hi,
>> >
>> > This query targets Avro files in the latest 1.5 release:
>> >
>> > 0: jdbc:drill:zk=local> select count(*) from
>> > dfs.asa.`/streaming/venuepoint/transactions/` as s where s.sold_to =
>> > 'Customer/4-2492847';
>> > +---------+
>> > | EXPR$0  |
>> > +---------+
>> > | 5788    |
>> > +---------+
>> >
>> > 0: jdbc:drill:zk=local> select count(*) from
>> > dfs.asa.`/streaming/venuepoint/transactions/` as s where s.sold_to IN
>> > ('Customer/4-2492847');
>> > +---------+
>> > | EXPR$0  |
>> > +---------+
>> > | 0       |
>> > +---------+
>> >
>> > It shows that the IN operator does not work with Avro (works with
>> > Parquet).
>> >
>> > This finally tips us over. We have invested hundreds of hours moving all
>> > streaming/fresh data from JSON to Avro, but the Avro part of Drill is
>> > broken in too many ways to recommend its use to anyone.
>> >
>> > Attempts to report Avro errors and shortcomings, like the missing support
>> > for dirX, have had no results.
>> >
>> > I think it would be prudent to warn people on the Drill website that the
>> > Avro support is experimental, at best.
>> >
>> > - Stefán Baxter
>> >
>>
>
>
