Drill performance

Tom Barber Tue, 17 May 2016 02:45:55 -0700

Hey Merlijn

I've not scaled up to 200GB, but we did run a 20-30GB HDFS test with adequate
performance and the load spread over the drillbits. The guys on the Drill
mailing list are pretty good at resolving performance issues though, so you
should certainly chat to them, and with backing from the new Drill startup,
MapR Tech, Dell and a bunch of other firms, there is a decent amount of
development resource on the platform for getting stuff fixed.


That said, I'm sure there are other solutions that run faster, Impala etc.
I also come from an OLAP background, which is why I hooked up with the Kylin
guys, as that would give you an alternative entry point.

Another reason for Drill is the data federation and non-Hadoop support. For
example, I could spin up HDFS, Mongo, and MySQL and have Drill hook up to
all 3 of them at the same time and do:

select * from HDFS.mytable a, MONGODB.mytable b, MySQL.mytable c where a.c1 =
b.c1 and b.c2 = c.c1

and have it return a nice federated result set, which is pretty powerful.
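
As a rough sketch of how that federated query could be submitted from a
script, the snippet below builds a request for Drill's REST query endpoint
(POST to /query.json with a queryType and query field). The storage plugin
names (HDFS, MONGODB, MySQL), the table and column names, and the
localhost:8047 address are all assumptions carried over from the example
above or picked for illustration, so adjust them to your own setup. The join
is written with explicit JOIN ... ON clauses, which is equivalent to the
comma-and-where form.

```python
import json
import urllib.request

# Hypothetical storage plugin, table, and column names taken from the
# example query in this mail; they are placeholders, not a real schema.
FEDERATED_SQL = (
    "SELECT * "
    "FROM HDFS.mytable a "
    "JOIN MONGODB.mytable b ON a.c1 = b.c1 "
    "JOIN MySQL.mytable c ON b.c2 = c.c1"
)


def build_drill_request(sql, base_url="http://localhost:8047"):
    """Build (but do not send) a POST request for Drill's REST query endpoint.

    base_url is an assumption: 8047 is Drill's default web/REST port, but
    your Drillbit may listen elsewhere.
    """
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, base_url="http://localhost:8047"):
    """Send the query to a running Drillbit and return the decoded JSON reply."""
    with urllib.request.urlopen(build_drill_request(sql, base_url)) as resp:
        return json.load(resp)
```

Calling run_query(FEDERATED_SQL) needs a running Drillbit with all three
storage plugins enabled; build_drill_request on its own just prepares the
request so you can inspect it first.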

Of course with all this tech YMMV, but personally I've had decent results
with it.

Tom

--------------

Director Meteorite.bi - Saiku Analytics Founder
Tel: +44(0)5603641316

(Thanks to the Saiku community we reached our Kickstart
<http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
goal, but you can always help by sponsoring the project
<http://www.meteorite.bi/products/saiku/sponsorship>)

On 17 May 2016 at 10:37, Merlijn Sebrechts <merlijn.sebrec...@gmail.com>
wrote:

> Hi Tom
>
>
> Slightly off-topic, but have you ever worked with Drill? We did some tests
> with a 200GB and a 100MB dataset in an HDFS cluster, and the performance we're
> seeing is so bad that Drill is unusable for us.
>
> Some initial debugging revealed that drill isn't able to distribute the
> workload over the cluster. The entire query runs on one server... Have you
> been able to get better performance out of it?
>
>
>
> Kind regards
> Merlijn
>
>
> On Tuesday 17 May 2016, Tom Barber <t...@analytical-labs.com> wrote:
> > Okay so I've been asking around, as you all know, and we're considering
> this Apache-specific Juju Charms page, so I figured it would be useful to
> round up which communities I have spoken to who have shown definite interest
> in collaboration.
> > We have:
> > Apache Bigtop (we all know about)
> > Apache Zeppelin (we all know about)
> > Apache Karaf
> > Apache Nutch
> > Apache OODT
> > Apache Joshua (Incubating)
> > Apache Kylin
> > I'm sure there will be more, and probably some I've just forgotten about
> or that other people spoke to, but I think that's a pretty good start.
> > As Kevin and I also discussed, Drill is a pretty important one from
> a personal perspective, as it offers the best (IMHO) route to getting SQL
> over a bunch of your NoSQL charms with minimal effort, which then helps
> Saiku and any other BI tooling you guys get into the platform. It's great
> having all the big data stuff, but we need ways for end users to get this
> stuff back out!
> >
> > Tom
> > --------------
> > Director Meteorite.bi - Saiku Analytics Founder
> > Tel: +44(0)5603641316
> > (Thanks to the Saiku community we reached our Kickstart goal, but you
> can always help by sponsoring the project)
>

-- 
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju
