Thanks for the feedback. Apache HBase and Apache Phoenix are an important part of my work. I'm not sure whether anyone has started on `HBase to EVF` for Drill, but this improvement would be valuable. In particular, I saw a big improvement over the Phoenix 4.x and HBase 1.x series when I recently used Phoenix 5.1 + HBase 2.3 on Hadoop 3.3. I look forward to seeing Drill benefit from these advantages.

> On Aug 24, 2021, at 23:16, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> I know somebody who is querying a very large table and has trouble with
> pushdown.
>
> They are looking for values indexed by primary key with a query like
> "select * from table where key in s". If s has a very small number of
> values, this turns into primary key access, but if there are more than just
> a few, it becomes a scan.
>
> The situation that would be interesting to detect is where s has a few
> tightly clustered groups. The ideal strategy would be to scan each group.
> How this might be detected isn't clear to me, but it would make a massive
> difference to this kind of query.
>
> Currently, the best alternative is to try to avoid this kind of query and
> build a data flow such that each cluster of keys flows into a separate
> query. This would be made easier if a common table expression (CTE) query
> could be done without having the optimizer try to globally optimize back to
> a single big scan.
>
> Anyway, I have absolutely no concrete suggestions for making this work, but
> the need is there.
>
>
>> On Tue, Aug 24, 2021 at 4:39 AM luoc <l...@apache.org> wrote:
>>
>> Hello Guys,
>> Will you use Drill to query Apache HBase? If so, what new feature would
>> you like to see in HBase storage plugin? In addition, Drill supported the
>> Apache Cassandra since 1.19.
>> Absolutely… Could you tell me what your most common storage plugin (or
>> data format) are? Thanks for your time.
>>
>>
>> -- luoc
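
P.S. To make the "few tightly clustered groups" case from Ted's message a bit more concrete, here is a rough sketch of how the key set s might be split into ranges so that each group could become its own small scan. This is only an illustration and not something Drill or the HBase plugin does today. The function name `cluster_keys` and the `max_gap` threshold are made up for this example, and it assumes integer keys, whereas real HBase row keys are byte arrays and would need a different notion of distance.

```python
from typing import Iterable, List, Tuple


def cluster_keys(keys: Iterable[int], max_gap: int) -> List[Tuple[int, int]]:
    """Group keys into inclusive (start, end) ranges.

    A new range starts whenever the gap to the previous key is larger
    than max_gap (a hypothetical tuning knob, not a Drill option).
    """
    ranges: List[Tuple[int, int]] = []
    for key in sorted(set(keys)):
        if ranges and key - ranges[-1][1] <= max_gap:
            # Close enough to the current group: extend its upper bound.
            ranges[-1] = (ranges[-1][0], key)
        else:
            # Too far away (or first key): start a new group.
            ranges.append((key, key))
    return ranges


if __name__ == "__main__":
    s = [101, 100, 103, 5000, 5002, 9]
    for start, end in cluster_keys(s, max_gap=10):
        print(f"scan keys in [{start}, {end}]")
```

Each resulting range could then be issued as a separate bounded scan instead of one scan over the whole table. Choosing `max_gap` well is the hard part, since it trades the number of scans against the number of unwanted rows read.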