How close is Drill to being able to do the following?

    select * from primary_table, index_table
    where index_table.key >= limit1 and index_table.key <= limit2
    and primary_table.key = index_table.ref

where both primary_table and index_table are MapR DB tables?
In both tables, the primary key is called key, and the ref field of
index_table is exactly the key of the primary_table.
I have prototyped this query using Java and the simplest possible
implementation in which I scanned the index_table for values of ref and
then inserted every value of ref into a table of tasks which I executed
using a worker pool with a bounded number of threads. Performance was quite acceptable for the
desired application.
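
For concreteness, here is a minimal sketch of the shape of that prototype. It is
not the actual code: the class name IndexJoinSketch and the method rangeJoin are
made up for illustration, and the two MapR DB tables are replaced by in-memory
maps (a sorted map for index_table, a plain map for primary_table) so that the
example runs on its own. A real version would do a range scan and point gets
through the table client API instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.HashMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: in-memory maps stand in for the two MapR DB tables.
class IndexJoinSketch {

    // indexTable maps index key -> ref; primaryTable maps key -> row.
    static List<String> rangeJoin(NavigableMap<Integer, String> indexTable,
                                  Map<String, String> primaryTable,
                                  int limit1, int limit2, int threads) {
        // Worker pool with a fixed (bounded) number of threads.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            // Range scan: index_table.key >= limit1 and index_table.key <= limit2.
            for (String ref : indexTable.subMap(limit1, true, limit2, true).values()) {
                // One task per ref: a point lookup of primary_table.key = ref.
                Callable<String> lookup = () -> primaryTable.get(ref);
                pending.add(pool.submit(lookup));
            }
            // Collect results in scan order; refs are not sorted first.
            List<String> rows = new ArrayList<>();
            for (Future<String> f : pending) {
                String row = f.get();
                if (row != null) {
                    rows.add(row);
                }
            }
            return rows;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> indexTable = new TreeMap<>();
        Map<String, String> primaryTable = new HashMap<>();
        String[] refs = {"a", "b", "c", "d"};
        for (int i = 0; i < refs.length; i++) {
            indexTable.put(i + 1, refs[i]);
            primaryTable.put(refs[i], "row-" + refs[i]);
        }
        // limit1 = 2, limit2 = 3, four worker threads.
        System.out.println(rangeJoin(indexTable, primaryTable, 2, 3, 4));
        // prints [row-b, row-c]
    }
}
```

The lookups are read-only here, so sharing the plain map across threads is safe
in this sketch; with a real table client the thread-safety of the handle would
need checking.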
This test indicates to me that we wouldn't even need to sort the references
from index_table for them to be handled nicely by a single thread. Nor would it
even strictly be necessary to distribute the computation although that
would be fun.
Your thoughts?