Github user mattf-horton commented on the issue:

    https://github.com/apache/metron/pull/622
  
    Your proposal has the advantage of making data in HBase self-identifying 
(if one has the key), which I always like.  However, it's a large change and 
introduces yet more complexity.  There's an alternative I've been noodling 
on occasionally, which I put forward here for consideration:
    
    Create a Profile Audit Log table in HBase.  Every time a Profiler is 
configured, started, or stopped, make one entry in the audit log.  The idea is 
to be able to answer exactly the kinds of questions posed in METRON-450, so the 
records should include things like the configuration, the first and last 
timestamps, and perhaps the key builder parameters.  This would prevent 
historical profiles from being "lost" because the would-be querier doesn't have 
access to the exact config parameters used to write the profile.
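    
    As a rough sketch of what such audit entries might look like (Python only 
for brevity; every name here is hypothetical, the row key layout is just one 
possibility, and a dict stands in for the HBase table):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class AuditEvent(Enum):
    CONFIGURED = "configured"
    STARTED = "started"
    STOPPED = "stopped"


@dataclass(frozen=True)
class AuditEntry:
    profile: str                 # profile name
    event: AuditEvent            # what happened
    timestamp_ms: int            # when it happened (epoch millis)
    config_json: str             # full configuration in effect at that moment
    key_builder_params: Optional[str] = None  # e.g. salt divisor, period

    def row_key(self) -> str:
        # Zero-padded timestamp so lexicographic order matches time order.
        return f"{self.profile}:{self.timestamp_ms:013d}:{self.event.value}"


class AuditLog:
    """In-memory stand-in for the proposed HBase audit table."""

    def __init__(self) -> None:
        self._rows: Dict[str, AuditEntry] = {}

    def record(self, entry: AuditEntry) -> None:
        self._rows[entry.row_key()] = entry

    def history(self, profile: str) -> List[AuditEntry]:
        prefix = profile + ":"
        return [self._rows[k] for k in sorted(self._rows) if k.startswith(prefix)]

    def config_at(self, profile: str, timestamp_ms: int) -> Optional[str]:
        # Most recent configuration recorded at or before the given time.
        seen = [e for e in self.history(profile) if e.timestamp_ms <= timestamp_ms]
        return seen[-1].config_json if seen else None
```

    With entries like these, the first and last timestamps of a run fall out 
of the matching "started" and "stopped" events, and config_at() answers "what 
parameters were used to write this profile at time T?".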
    
    For the sake of housekeeping, one might do a scan, daily and/or at system 
restart, to ensure that (a) the set of Profiles with a "start" but not an "end" 
recorded in the audit log, and (b) the set of currently running Profiles, are 
actually consistent, and record "inferred end" entries in the audit log for 
any orphans found.
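    
    The housekeeping scan could be sketched roughly as follows.  This assumes 
audit entries reduce to (profile, event, timestamp) tuples and that the set of 
running Profiles is available from somewhere; the event names and function 
names are made up for illustration:

```python
from typing import Iterable, List, Set, Tuple

# (profile name, event, timestamp in epoch millis); a hypothetical shape.
Entry = Tuple[str, str, int]


def open_profiles(entries: Iterable[Entry]) -> Set[str]:
    """Profiles whose latest start has no later end in the audit log."""
    is_open = {}
    for profile, event, _ts in sorted(entries, key=lambda e: e[2]):
        if event == "start":
            is_open[profile] = True
        elif event in ("end", "inferred_end"):
            is_open[profile] = False
    return {p for p, still_open in is_open.items() if still_open}


def reconcile(entries: Iterable[Entry], running: Set[str], now_ms: int) -> List[Entry]:
    """Return "inferred_end" entries for orphans: open in the log, not running."""
    orphans = open_profiles(entries) - set(running)
    return [(p, "inferred_end", now_ms) for p in sorted(orphans)]
```

    The caller would append the returned entries to the audit log.  The 
timestamp of an inferred end is only an upper bound (the time of the scan), 
which is one reason to mark it as inferred rather than as a real "end".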
    
    This solution is somewhat backward-applicable to existing Profile data; I 
think there are brute-force ways to scan the existing HBase tables and infer 
audit log entries, especially if historical configuration data is still 
available.  We could write such a scanner.
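
    One possible shape for the core of such a scanner, assuming the hard part 
(decoding existing row keys into a profile name and timestamp, which is where 
the historical configuration is needed) has already been done.  The function 
and event names are hypothetical:

```python
from typing import Dict, Iterable, List, Tuple


def infer_audit_entries(rows: Iterable[Tuple[str, int]]) -> List[Tuple[str, str, int]]:
    """Derive start/end audit entries from decoded (profile, timestamp_ms) rows.

    Brute force: one pass over every row, tracking the earliest and latest
    timestamp seen for each profile.
    """
    bounds: Dict[str, Tuple[int, int]] = {}
    for profile, ts in rows:
        first, last = bounds.get(profile, (ts, ts))
        bounds[profile] = (min(first, ts), max(last, ts))

    entries = []
    for profile in sorted(bounds):
        first, last = bounds[profile]
        entries.append((profile, "inferred_start", first))
        entries.append((profile, "inferred_end", last))
    return entries
```

    This collapses each profile into a single interval, so gaps where a 
Profile was stopped and later restarted would not be detected without some 
further heuristic.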


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
