Github user cestella commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/450#discussion_r102342075 --- Diff: metron-analytics/metron-profiler-client/README.md --- @@ -91,37 +60,268 @@ want to change the global Client configuration so as not to disrupt the work of | profiler.client.salt.divisor | The salt divisor used to store profile data. | Optional | 1000 | | hbase.provider.impl | The name of the HBaseTableProvider implementation class. | Optional | | + +### Profile Selectors + +You will notice that the third argument for `PROFILE_GET` is a list of `ProfilePeriod` objects. This list is expected to +be produced by another Stellar function. There are a couple options available. + +#### `PROFILE_FIXED` + +The profiler periods associated with a fixed lookback starting from now. These are ProfilePeriod objects. +``` +REQUIRED: + durationAgo - How long ago should values be retrieved from? + units - The units of 'durationAgo'. +OPTIONAL: + config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter + of the same name. Default is the empty Map, meaning no overrides. + +e.g. To retrieve all the profiles for the last 5 hours. PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS')) +``` + +Note that the `config_overrides` parameter operates exactly as the `config_overrides` argument in `PROFILE_GET`. +The only available parameters for override are: +* `profiler.client.period.duration` +* `profiler.client.period.duration.units` + +#### `PROFILE_WINDOW` + +`PROFILE_WINDOW` is intended to provide a finer-level of control over selecting windows for profiles: +* Specify windows relative to the data timestamp (see the optional `now` parameter below) +* Specify non-contiguous windows to better handle seasonal data (e.g. the last hour for every day for the last month) +* Specify profile output excluding holidays +* Specify only profile output on a specific day of the week + +It does this by a domain specific language mimicking natural language that defines the windows excluded. + +``` +REQUIRED: + windowSelector - The statement specifying the window to select. + now - Optional - The timestamp to use for now. +OPTIONAL: + config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter + of the same name. Default is the empty Map, meaning no overrides. + +e.g. To retrieve all the measurements written for 'profile' and 'entity' for the last hour +on the same weekday excluding weekends and US holidays across the last 14 days: +PROFILE_GET('profile', 'entity', PROFILE_WINDOW('1 hour window every 24 hours starting from 14 days ago including the current day of the week excluding weekends, holidays:us')) +``` + +Note that the `config_overrides` parameter operates exactly as the `config_overrides` argument in `PROFILE_GET`. +The only available parameters for override are: +* `profiler.client.period.duration` +* `profiler.client.period.duration.units` + +##### The Profile Selector Language + +The domain specific language can be broken into a series of clauses, some optional +* <span style="color:blue">Total Temporal Duration</span> - The total range of time in which windows may be specified +* <span style="color:red">Temporal Window Width</span> - How large each temporal window +* <span style="color:green">Skip distance</span> (optional)- How far to skip between when one window starts and when the next begins +* <span style="color:purple">Inclusion/Exclusion specifiers</span> (optional) - The set of specifiers to further filter the window + +One *must* specify either a total temporal duration or a temporal window width. +The remaining clauses are optional. +During the course of the following discussion, we will color code the clauses in the examples. + +From a high level, the language fits the following three forms: + +* <span style="color:red">`time_interval WINDOW?`</span><span style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING specifier_list)?`</span> +* <span style="color:red">`time_interval WINDOW?`</span><span style="color:green">`EVERY time_interval`</span><span style="color:blue">`FROM time_interval (TO time_interval)?`</span><span style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING specifier_list)?`</span> +* <span style="color:blue">`FROM time_interval (TO time_interval)?`</span> + +with +* `time_interval` representing a time amount followed by a unit (e.g. "1 hour") +* `specifier_list` representing a comma separated list of inclusion or exclusion specifiers (e.g. "holidays:us, tuesdays") + + +###### <span style="color:blue">Total Temporal Duration</span> + +Total temporal duration is specified by a phrase: `FROM time_interval AGO TO time_interval AGO` +This indicates the beginning and ending of a time interval. +* `FROM` - Can be the words "from" or "starting from" +* `time_interval` - A time amount followed by a unit (e.g. 1 hour). The unit may be "minute", "day", "hour" with any pluralization. +* `TO` - Can be the words "until" or "to" +* `AGO` - Optionally the word "ago" + +The `TO time_interval AGO` portion is optional. If unspecified then it is expected that the time interval ends now. + +Due to the vagaries of the english language, the from and the to portions, if both specified, are interchangeable +with regard to which one specifies the start and which specifies the end. + +In other words <span style="color:blue">`starting from 1 hour ago to 30 minutes ago`</span> and --- End diff -- All time intervals are inclusive, but it's worth mentioning in the docs.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---