File does not exist errors across cluster

2018-11-27 Thread Matt Keranen
Have 4 nodes running drillbits version 1.14 for queries over JSON files in
the regular filesystem (not HDFS).

Each node has an identical directory structure, but not all file names
exist on all nodes, and any query in the form of "SELECT ... FROM
dfs.logs.`logs*.json.gz`" fails with:

Error: DATA_READ ERROR: Failure reading JSON file - File
file:/localdata/logs/logs.xhr.json.gz does not exist

where the filename may change, but is always one that exists on some but
not all nodes.

Is there a configuration for Drill where drillbits querying non-distributed
filesystems don't expect all files to exist on all nodes?


File "does not exist" error on non-distributed filesystem cluster

2018-11-27 Thread Matt Keranen
Have 4 nodes running drillbits version 1.14 for queries over JSON files in
the regular filesystem (not HDFS).

Each node has an identical directory structure, but not all file names
exist on all nodes, and any query in the form of "SELECT ... FROM
dfs.logs.`logs*.json.gz`" fails with:

Error: DATA_READ ERROR: Failure reading JSON file - File
file:/localdata/logs/logs.xxx.json.gz does not exist

where the filename may change, but is always one that exists on some but
not all nodes.

Is there a configuration for Drill where drillbits querying non-distributed
filesystems don't expect all files to exist on all nodes?
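
The error is consistent with a file being assigned to a drillbit whose local
disk does not hold it, though that reading is an assumption and not something
confirmed in this thread. The sketch below does not answer the configuration
question; it only shows one way to list which files matching the query glob
are absent on some nodes. The node names, the sample listings, and the helper
name files_missing_on_some_nodes are hypothetical, and the per-node directory
listings are assumed to have been collected by some other means.

```python
from fnmatch import fnmatchcase


def files_missing_on_some_nodes(listings, pattern="logs*.json.gz"):
    """Map each matching file to the sorted list of nodes that lack it.

    `listings` maps a node name to the set of file names found in that
    node's copy of the directory. Files present on every node are omitted.
    """
    # Keep only the names that the query glob would select on each node.
    matching = {
        node: {name for name in names if fnmatchcase(name, pattern)}
        for node, names in listings.items()
    }
    # Union of every matching file name seen on any node.
    all_files = set().union(*matching.values()) if matching else set()
    report = {}
    for name in sorted(all_files):
        absent = sorted(node for node, names in matching.items() if name not in names)
        if absent:
            report[name] = absent
    return report


# Hypothetical listings for a three-node cluster.
listings = {
    "node1": {"logs.aaa.json.gz", "logs.xxx.json.gz", "notes.txt"},
    "node2": {"logs.aaa.json.gz"},
    "node3": {"logs.aaa.json.gz", "logs.xxx.json.gz"},
}
missing = files_missing_on_some_nodes(listings)
```

With these sample listings, logs.xxx.json.gz is reported as absent on node2,
which is the kind of file named in the error above.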


Re: Apache Drill Meetup on Nov 14th!

2018-11-27 Thread Pritesh Maker
Thanks again to the speakers Nitin and Aman for fantastic talks! I have
updated the meetup with the links to the slides from the presenters as well
as a recording of the meetup.

Take a look at the Meetup to see the slides and a few pictures of the event.
https://www.meetup.com/Bay-Area-Apache-Drill-User-Group/events/255727785/

Link to slides/recording:
https://drive.google.com/drive/folders/10HAyVVUSq8LsFOYG8J8beeIUloG34o_6

Aman's Slideshare Link:
https://www.slideshare.net/AmanSinha6/accelerating-sql-queries-in-nosql-databases-using-apache-drill-and-secondary-indexes


Thanks,
Pritesh

On Sun, Nov 4, 2018 at 2:46 PM Pritesh Maker wrote:

> Hello, Drillers!
>
> We are restarting meetups for Apache Drill! The next meetup will be on
> Nov 14th at 6:30 PM at the MapR Headquarters.
>
> We will have two speakers for the meetup:
> - Nitin Sharma @ Netflix who will talk about Netflix's Personalization
> Infrastructure
> - Aman Sinha @ MapR who will talk about a brand new feature in Apache
> Drill to leverage Secondary Indexes
>
> You can find more details of their proposed talks at the meetup site.
> Please register soon since we have limited seating!
> https://www.meetup.com/Bay-Area-Apache-Drill-User-Group/events/255727785/
>
> We look forward to seeing you there!
>
> Thank you,
> Pritesh
>


Re: Write custom aggregate function

2018-11-27 Thread Charles Givre
Hi Andrea,
Can you post your code on GitHub?

> On Nov 27, 2018, at 17:44, Andrea Sella wrote:
> 
> Hi,
> 
> I tried to implement it using ObjectHolder and ComplexWriter and it doesn't
> seem possible to achieve something general at this point.
> 
> Andrea
> 
> On Wed, Nov 21, 2018 at 6:02 PM Andrea Sella wrote:
> 
>> Hi,
>> 
>> I've just started with Apache Drill and I'd like to write a custom
>> aggregate function in order to concatenate array fields.
>> 
>> I have seen that the feature is still in alpha and before trying to make
>> it work I would like to know if at this stage my custom function is
>> possible.
>> 
>> The main idea is trying to achieve something like this:
>> a,b
>> foo, [1,2,3]
>> foo, [3,5,6]
>> 
>>> select my_custom_fun(b) from table group by a
>>> foo, [1,2,3,3,5,6]
>> 
>> If it is possible, is there any documentation available other than
>> this section [1]?
>> 
>> Thank you,
>> Andrea
>> 
>> [1] https://drill.apache.org/docs/developing-an-aggregate-function/
>> 
> 
> 
> -- 
> BR,
> Andrea



Re: Write custom aggregate function

2018-11-27 Thread Andrea Sella
Hi,

I tried to implement it using ObjectHolder and ComplexWriter and it doesn't
seem possible to achieve something general at this point.

Andrea

On Wed, Nov 21, 2018 at 6:02 PM Andrea Sella wrote:

> Hi,
>
> I've just started with Apache Drill and I'd like to write a custom
> aggregate function in order to concatenate array fields.
>
> I have seen that the feature is still in alpha and before trying to make
> it work I would like to know if at this stage my custom function is
> possible.
>
> The main idea is trying to achieve something like this:
> a,b
> foo, [1,2,3]
> foo, [3,5,6]
>
> > select my_custom_fun(b) from table group by a
> > foo, [1,2,3,3,5,6]
>
> If it is possible, is there any documentation available other than
> this section [1]?
>
> Thank you,
> Andrea
>
> [1] https://drill.apache.org/docs/developing-an-aggregate-function/
>


-- 
BR,
Andrea
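
To pin down the result the aggregate in this thread is meant to produce, here
is a minimal sketch in plain Python. It is not a Drill UDF and uses none of
Drill's holder or writer classes; the function name concat_arrays_by_key is
hypothetical, and the rows are the sample data from the original question.

```python
from collections import defaultdict


def concat_arrays_by_key(rows):
    """Group (key, array) rows by key and concatenate the arrays in row order."""
    grouped = defaultdict(list)
    for key, values in rows:
        # Append this row's elements to the running array for its group.
        grouped[key].extend(values)
    return dict(grouped)


# Sample rows from the question: column a is the key, column b is the array.
rows = [
    ("foo", [1, 2, 3]),
    ("foo", [3, 5, 6]),
]
result = concat_arrays_by_key(rows)
```

For the sample rows, the single group foo ends up with [1, 2, 3, 3, 5, 6],
matching the output asked for in the original message. Duplicates are kept,
and the order follows the order in which rows are read.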