In its current avatar, Drill can do this, much as it does for CSV data sources:
0: jdbc:drill:schema=dfs.m7> select * from nation_mdb100 n, region_mdb100 r
. . . . . . . . . . . . . .> where r.r_regionkey >=2 and r.r_regionkey <= 3
. . . . . . . . . . . . . .> and n.n_regionkey = r.r_regionkey;
+-------------+------------+-------------+------------+-------------+------------+------------+
| n_nationkey | n_name | n_regionkey | n_comment | r_regionkey | r_name | r_comment |
+-------------+------------+-------------+------------+-------------+------------+------------+
| 19 | ROMANIA | 3 | ular asymptotes are about the furious multipliers. express dependencies nag above the ironically ironic account | 3 | EUROPE | ly final courts cajole furiously final excuse |
| 22 | RUSSIA | 3 | requests against the platelets use never according to the quickly regular pint | 3 | EUROPE | ly final courts cajole furiously final excuse |
| 23 | UNITED KINGDOM | 3 | eans boost carefully special requests. accounts are. carefull | 3 | EUROPE | ly final courts cajole furiously final excuse |
| 6 | FRANCE | 3 | refully final requests. regular, ironi | 3 | EUROPE | ly final courts cajole furiously final excuse |
| 7 | GERMANY | 3 | l platelets. regular accounts x-ray: unusual, regular acco | 3 | EUROPE | ly final courts cajole furiously final excuse |
| 12 | JAPAN | 2 | ously. final, express gifts cajole a | 2 | ASIA | ges. thinly even pinto beans ca |
| 18 | CHINA | 2 | c dependencies. furiously express notornis sleep slyly regular accounts. ideas sleep. depos | 2 | ASIA | ges. thinly even pinto beans ca |
| 21 | VIETNAM | 2 | hely enticingly express accounts. even, final | 2 | ASIA | ges. thinly even pinto beans ca |
| 8 | INDIA | 2 | ss excuses cajole slyly across the packages. deposits print aroun | 2 | ASIA | ges. thinly even pinto beans ca |
| 9 | INDONESIA | 2 | slyly express asymptotes. regular deposits haggle slyly. carefully ironic hockey players sleep blithely. carefull | 2 | ASIA | ges. thinly even pinto beans ca |
+-------------+------------+-------------+------------+-------------+------------+------------+
10 rows selected (3.378 seconds)

As with the CSV source, I've used views that cast the String-based keys into numeric values. The range filters are not pushed all the way down to M7, so Drill must read all the keys. I believe there is a plan to support filter push-down... provided the data stored in the table is a byte representation of a numeric data type and not a String (as I have it).

~ Kunal

-----Original Message-----
From: Ted Dunning [mailto:[email protected]]
Sent: Thursday, September 04, 2014 10:46 PM
To: drill
Subject: Q about current capabilities

How close is Drill to being able to do the following?

    select * from primary_table, index_table
    where index_table.key >= limit1 and index_table.key <= limit2
    and primary_table.key = index_table.ref

where both primary_table and index_table are MapR DB tables?

In both tables, the primary key is called key, and the ref field of index_table is exactly the key of the primary_table.

I have prototyped this query using Java and the simplest possible implementation, in which I scanned the index_table for values of ref and then inserted every value of ref into a table of tasks which I executed using a thread bound worker pool. Performance was quite acceptable for the desired application.

This test indicates to me that we wouldn't even need to sort the references from index_table to be handled nicely by a single thread. Nor would it even strictly be necessary to distribute the computation, although that would be fun.

Your thoughts?
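
For readers following the thread, here is a rough sketch of the prototype approach described in the quoted message: range-scan the index table, then hand each ref to a worker pool that looks it up in the primary table. It is written in Python rather than the Java of the original prototype, and it is not the actual implementation. Plain dicts stand in for the MapR DB tables, and the table contents, the function names (scan_index, fetch_primary, index_join) and the pool size are all assumptions made for illustration. "Thread bound worker pool" is read here as a pool with a fixed number of threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical in-memory stand-ins for the two tables.
# index_table: key -> ref, primary_table: key -> row.
INDEX_TABLE: Dict[int, str] = {1: "a", 2: "b", 3: "c", 4: "missing"}
PRIMARY_TABLE: Dict[str, dict] = {
    "a": {"name": "alpha"},
    "b": {"name": "beta"},
    "c": {"name": "gamma"},
}


def scan_index(index_table: Dict[int, str], low: int, high: int) -> Iterator[Tuple[int, str]]:
    """Stand-in for a range scan: yield (key, ref) with low <= key <= high."""
    for key in sorted(index_table):
        if low <= key <= high:
            yield key, index_table[key]


def fetch_primary(primary_table: Dict[str, dict], ref: str) -> Optional[dict]:
    """Stand-in for a point lookup (get) on the primary table."""
    return primary_table.get(ref)


def index_join(
    index_table: Dict[int, str],
    primary_table: Dict[str, dict],
    low: int,
    high: int,
    workers: int = 4,  # assumed pool size
) -> List[Tuple[int, str, dict]]:
    """Join index rows in [low, high] to primary rows via a fixed-size pool."""
    pairs = list(scan_index(index_table, low, high))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One lookup task per ref; refs are not sorted or batched first.
        rows = list(pool.map(lambda p: fetch_primary(primary_table, p[1]), pairs))
    # Inner-join semantics: drop index rows whose ref has no primary row.
    return [(key, ref, row) for (key, ref), row in zip(pairs, rows) if row is not None]


if __name__ == "__main__":
    for key, ref, row in index_join(INDEX_TABLE, PRIMARY_TABLE, 2, 4):
        print(key, ref, row)
```

The lookups are independent of one another, which is why nothing in the sketch sorts the refs before submitting them. Against a real table, fetch_primary would be a remote get and the pool would hide its latency.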
