GitHub user mattf-horton commented on the issue:

    https://github.com/apache/metron/pull/622
  
    @nickwallen brought up the issue of wildcard queries on our rowkeys.  It 
has always bothered me that we can't do wildcard queries on groups.  If you 
have, for example, a single groupBy based on day of week, that's just 7 
possible values, and if you want them all you can simply run 7 queries and 
combine the results.  But if you have three groupBys with 7, 31, and 256 
possible values, then simulating a wildcard query takes 7 x 31 x 256 = 55,552 
individual queries!  Of course you would do an HBase scan instead, but with 
the current rowkey layout that would require a full table scan to select the 
desired time range.
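
    As a quick check of that arithmetic, here is a minimal Python sketch. The 
cardinalities are the ones from the example above; the variable names are 
purely illustrative.

```python
from itertools import product
from math import prod

# Cardinalities of the three groupBys in the example above.
group_cardinalities = [7, 31, 256]

# Simulating a wildcard means one point query per combination of group values.
num_queries = prod(group_cardinalities)

# Enumerating the combinations lazily gives the same count.
enumerated = sum(1 for _ in product(*(range(n) for n in group_cardinalities)))

print(num_queries)  # 55552
```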
    
    I propose that we re-order the rowkey elements to support prefix queries on 
Profile and time range, with wildcarding primarily for groups and secondarily 
for entities, i.e.:
    <salt><magic><profileHash><period><entity><groups>
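
    To make the layout concrete, here is a rough Python sketch of how such a 
key could be assembled.  Everything about the encoding is an assumption made 
for illustration only: the field widths, the MD5 hashes, the value of MAGIC, 
a salt derived from the period alone, and length-prefixed strings.  None of 
it is taken from the actual Metron row key builder.

```python
import hashlib
import struct

MAGIC = b"\x00\x01"   # hypothetical 2-byte marker
SALT_DIVISOR = 1000   # hypothetical number of salt buckets


def _field(value: str) -> bytes:
    """Length-prefix a string so that one value is never a prefix of another."""
    raw = value.encode("utf-8")
    return struct.pack(">H", len(raw)) + raw


def key_prefix(profile: str, period: int, entity=None) -> bytes:
    """Build <salt><magic><profileHash><period>, optionally followed by <entity>."""
    period_bytes = struct.pack(">Q", period)  # big-endian: byte order follows numeric order
    # Assumption: the salt depends only on the period, so it is known at query time.
    digest = hashlib.md5(period_bytes).digest()
    salt = struct.pack(">H", int.from_bytes(digest[:4], "big") % SALT_DIVISOR)
    profile_hash = hashlib.md5(profile.encode("utf-8")).digest()
    prefix = salt + MAGIC + profile_hash + period_bytes
    if entity is not None:
        prefix += _field(entity)
    return prefix


def build_rowkey(profile: str, period: int, entity: str, groups) -> bytes:
    """Full key: the prefix above followed by each group value in order."""
    return key_prefix(profile, period, entity) + b"".join(_field(g) for g in groups)
```

    With groups last, every row for a given profile, period, and entity 
shares the bytes returned by key_prefix(profile, period, entity), and every 
row for a profile and period shares key_prefix(profile, period).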
    
    So if I want the results for all rows in a time range regarding entity 
"192.168.222.123" regardless of group, I can query it, and if I want all rows 
in a time range regardless of entity value or group, I can query that too, as 
efficiently as an ordinary time range query.  What do you think?
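
    Here is a small Python simulation of those two queries.  It is only a 
sketch under stated assumptions: a sorted list of tuples stands in for the 
HBase table, the salt and magic fields are left out, and the rows and the 
scan() helper are made up for illustration.  Note that in this sketch the 
entity match is applied as a filter over the rows of the time range, which 
is one possible reading of the proposal, not a confirmed design.

```python
from bisect import bisect_left

# Rows sorted by (profile, period, entity, groups), mimicking rowkey order.
rows = sorted([
    ("profileA", 100, "10.0.0.1", ("mon",)),
    ("profileA", 100, "192.168.222.123", ("mon",)),
    ("profileA", 101, "192.168.222.123", ("tue",)),
    ("profileA", 105, "192.168.222.123", ("sat",)),
    ("profileB", 100, "192.168.222.123", ("mon",)),
])


def scan(rows, profile, start, end, entity=None):
    """Return rows for `profile` with start <= period <= end.

    The time range maps to one contiguous slice of the sorted rows, so only
    that slice is read; groups are never inspected (the wildcard).
    """
    lo = bisect_left(rows, (profile, start))
    hi = bisect_left(rows, (profile, end + 1))
    return [r for r in rows[lo:hi] if entity is None or r[2] == entity]


# All rows in the time range, regardless of entity or group.
all_rows = scan(rows, "profileA", 100, 101)

# Rows in the time range for one entity, regardless of group.
entity_rows = scan(rows, "profileA", 100, 101, entity="192.168.222.123")
```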


