GitHub user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622

@nickwallen brought up the issue of wildcard queries on our rowkeys. It has always bothered me that we can't do wildcard queries on groups.

If you have, for example, a single groupBy based on day of week, that's just 7 possible values, and if you want them all you could just do 7 queries and combine the results. But if you have three groupBys with 7, 31, and 256 possible values, then simulating a wildcard query would take 7 x 31 x 256 = 55,552 individual queries! Of course, you would do an HBase scan instead, but selecting the desired time range would then require a full table scan.

I propose that we re-order the rowkey elements to support prefix queries on profile and time range, with wildcarding primarily for groups and secondarily for entities, i.e.:

<salt><magic><profileHash><period><entity><groups>

So if I want the results for all rows in a time range regarding entity "192.168.222.123" regardless of group, I can query it, and if I want all rows in a time range regardless of entity value or group, I can query that too, as efficiently as an ordinary time range query.

What do you think?
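To make the proposal concrete, here is a minimal sketch in Python rather than against the actual HBase or Metron APIs. It models the table as a sorted list of tuple keys in the proposed order, leaving out the salt and magic elements, and shows that "all entities and groups in a time range" becomes one contiguous range. The names `row_key` and `scan`, the profile names, and the toy data are hypothetical, and the single-entity case here is handled by filtering inside the range scan, which is only one possible way to do it.

```python
import bisect
from itertools import product

# Point lookups needed to emulate a wildcard over three groupBys
# with 7, 31, and 256 possible values.
point_queries = 7 * 31 * 256

def row_key(profile, period, entity, groups):
    """Hypothetical key in the proposed order; salt and magic are omitted."""
    return (profile, period, entity, groups)

def scan(table, start, stop):
    """Return the keys in [start, stop), like a range scan over a sorted table."""
    lo = bisect.bisect_left(table, start)
    hi = bisect.bisect_left(table, stop)
    return table[lo:hi]

# Toy table: 2 profiles x 10 periods x 2 entities x 3 group values.
profiles = ["p1", "p2"]
entities = ["10.0.0.1", "192.168.222.123"]
group_values = ["mon", "tue", "wed"]
table = sorted(
    row_key(p, t, e, (g,))
    for p, t, e, g in product(profiles, range(10), entities, group_values)
)

# Every entity and group for profile "p1" in periods 3..5:
# a single contiguous range, because profile and period lead the key.
all_rows = scan(table, ("p1", 3), ("p1", 6))

# One entity, any group: same range scan, filtered on the entity element.
one_entity = [k for k in all_rows if k[2] == "192.168.222.123"]
```

In this toy model the range scan touches only the rows for the requested profile and periods, regardless of how many entity or group values exist.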