Github user mattf-horton commented on the issue:

https://github.com/apache/metron/pull/622

Your proposal has the advantage of making data in HBase self-identifying (if one has the key), which I always like. However, it's a large change and induces yet more complexity. There's an alternative I've been noodling occasionally, which I put forward here for consideration:

Create a Profile Audit Log table in HBase. Every time a Profiler is configured, started, or stopped, make one entry in the audit log. The idea is to be able to answer exactly the kinds of questions posed in METRON-450, so the records should include things like the configuration, the first and last timestamps, and perhaps the key builder parameters. This would prevent historical profiles from being "lost" because the would-be querier doesn't have access to the exact config parameters used to write the profile.

For the sake of housekeeping, one might do a scan, daily and/or at system restart, to ensure that (a) the set of Profiles with a "start" but no "end" recorded in the audit log, and (b) the set of currently running Profiles, are actually consistent, and to record "inferred end" entries in the audit log for any orphans found.

This solution is somewhat backward-applicable to existing Profile data; I think there are brute-force ways to scan the existing HBase tables and infer audit log entries, especially if historical configuration data is still available. We could write such a scanner.
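To make the housekeeping idea concrete, here is a minimal sketch of the reconciliation pass in Python. It works on an in-memory list rather than on HBase, and every name in it (AuditEntry, the event strings, open_profiles, reconcile) is hypothetical and chosen for illustration only; none of it is existing Metron code or a settled schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class AuditEntry:
    """One hypothetical audit log record; field names are assumptions."""
    profile: str
    event: str                    # "configured", "started", "stopped", or "inferred_end"
    timestamp_ms: int
    config: Optional[str] = None  # serialized profile definition, if recorded


END_EVENTS = {"stopped", "inferred_end"}


def open_profiles(log: Iterable[AuditEntry]) -> Set[str]:
    """Return profiles whose latest "started" entry has no later end entry."""
    is_open: Dict[str, bool] = {}
    for entry in sorted(log, key=lambda e: e.timestamp_ms):
        if entry.event == "started":
            is_open[entry.profile] = True
        elif entry.event in END_EVENTS:
            is_open[entry.profile] = False
    return {name for name, flag in is_open.items() if flag}


def reconcile(log: Iterable[AuditEntry],
              running: Iterable[str],
              now_ms: int) -> List[AuditEntry]:
    """Build "inferred_end" entries for profiles the log believes are still
    open but that are not in the set of currently running profiles."""
    orphans = open_profiles(log) - set(running)
    return [AuditEntry(name, "inferred_end", now_ms) for name in sorted(orphans)]
```

In a real deployment the log would be read by scanning the audit table and the returned entries written back to it; the sketch only shows the set comparison between (a) and (b).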