Q about current capabilities

Ted Dunning Thu, 04 Sep 2014 22:47:27 -0700

How close is Drill to being able to do the following?

    select * from primary_table, index_table
    where index_table.key >= limit1 and index_table.key <= limit2
        and primary_table.key = index_table.ref


where both primary_table and index_table are MapR DB tables?

In both tables, the primary key is called key, and the ref field of
index_table is exactly the key of primary_table.

I have prototyped this query using Java and the simplest possible
implementation in which I scanned the index_table for values of ref and
then inserted every value of ref into a table of tasks which I executed
using a thread bound worker pool.  Performance was quite acceptable for the
desired application.
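
For concreteness, here is a minimal sketch of the shape of that prototype.
It is not the actual prototype code: the class and method names are made up
for illustration, and in-memory maps stand in for the two MapR DB tables (a
sorted map for index_table so that a key range scan is possible, a plain map
for primary_table so that each ref becomes a point lookup).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; the maps are stand-ins for MapR DB tables.
class IndexJoinSketch {

    // Scan index entries with limit1 <= key <= limit2, then fetch each
    // referenced primary row with a point lookup run on a fixed-size pool.
    static List<String> rangeJoin(NavigableMap<Integer, String> indexTable,
                                  Map<String, String> primaryTable,
                                  int limit1, int limit2,
                                  int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per ref, in index-scan order; refs are not sorted.
            List<Future<String>> tasks = new ArrayList<>();
            for (String ref : indexTable.subMap(limit1, true, limit2, true).values()) {
                Callable<String> lookup = () -> primaryTable.get(ref);
                tasks.add(pool.submit(lookup));
            }
            // Collect results; refs with no primary row are dropped,
            // which matches inner-join semantics.
            List<String> rows = new ArrayList<>();
            for (Future<String> task : tasks) {
                String row = task.get();
                if (row != null) {
                    rows.add(row);
                }
            }
            return rows;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        NavigableMap<Integer, String> index = new TreeMap<>();
        index.put(1, "a");
        index.put(2, "b");
        index.put(3, "c");
        Map<String, String> primary = new HashMap<>();
        primary.put("a", "row-a");
        primary.put("b", "row-b");
        primary.put("c", "row-c");
        System.out.println(rangeJoin(index, primary, 2, 3, 4));
    }
}
```

The maps are only read while the lookups run, so sharing them across the
pool's threads is safe here; a real table client would need its own
thread-safety story.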

This test suggests to me that we wouldn't even need to sort the references
from index_table for them to be handled nicely by a single thread.  Nor
would it be strictly necessary to distribute the computation, although that
would be fun.

Your thoughts?
