Github user nickwallen commented on a diff in the pull request:

https://github.com/apache/incubator-metron/pull/450#discussion_r102339076

--- Diff: metron-analytics/metron-profiler-client/README.md ---
@@ -91,37 +60,268 @@ want to change the global Client configuration so as not to disrupt the work of
 | profiler.client.salt.divisor | The salt divisor used to store profile data. | Optional | 1000 |
 | hbase.provider.impl | The name of the HBaseTableProvider implementation class. | Optional | |
+
+### Profile Selectors
+
+You will notice that the third argument for `PROFILE_GET` is a list of `ProfilePeriod` objects. This list is expected to
+be produced by another Stellar function. There are a couple options available.
+
+#### `PROFILE_FIXED`
+
+The profiler periods associated with a fixed lookback starting from now. These are ProfilePeriod objects.
+```
+REQUIRED:
+  durationAgo - How long ago should values be retrieved from?
+  units - The units of 'durationAgo'.
+OPTIONAL:
+  config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+    of the same name. Default is the empty Map, meaning no overrides.
+
+e.g. To retrieve all the profiles for the last 5 hours. PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'))
+```
+
+Note that the `config_overrides` parameter operates exactly as the `config_overrides` argument in `PROFILE_GET`.
+The only available parameters for override are:
+* `profiler.client.period.duration`
+* `profiler.client.period.duration.units`
+
+#### `PROFILE_WINDOW`
+
+`PROFILE_WINDOW` is intended to provide a finer-level of control over selecting windows for profiles:
+* Specify windows relative to the data timestamp (see the optional `now` parameter below)
+* Specify non-contiguous windows to better handle seasonal data (e.g. the last hour for every day for the last month)
+* Specify profile output excluding holidays
+* Specify only profile output on a specific day of the week
+
+It does this by a domain specific language mimicking natural language that defines the windows excluded.
+
+```
+REQUIRED:
+  windowSelector - The statement specifying the window to select.
+  now - Optional - The timestamp to use for now.
+OPTIONAL:
+  config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+    of the same name. Default is the empty Map, meaning no overrides.
+
+e.g. To retrieve all the measurements written for 'profile' and 'entity' for the last hour
+on the same weekday excluding weekends and US holidays across the last 14 days:
+PROFILE_GET('profile', 'entity', PROFILE_WINDOW('1 hour window every 24 hours starting from 14 days ago including the current day of the week excluding weekends, holidays:us'))
+```
+
+Note that the `config_overrides` parameter operates exactly as the `config_overrides` argument in `PROFILE_GET`.
+The only available parameters for override are:
+* `profiler.client.period.duration`
+* `profiler.client.period.duration.units`
+
+##### The Profile Selector Language
+
+The domain specific language can be broken into a series of clauses, some optional
+* <span style="color:blue">Total Temporal Duration</span> - The total range of time in which windows may be specified
+* <span style="color:red">Temporal Window Width</span> - How large each temporal window
+* <span style="color:green">Skip distance</span> (optional)- How far to skip between when one window starts and when the next begins
+* <span style="color:purple">Inclusion/Exclusion specifiers</span> (optional) - The set of specifiers to further filter the window
+
+One *must* specify either a total temporal duration or a temporal window width.
+The remaining clauses are optional.
+During the course of the following discussion, we will color code the clauses in the examples.
+
+From a high level, the language fits the following three forms:
+
+* <span style="color:red">`time_interval WINDOW?`</span><span style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING specifier_list)?`</span>
+* <span style="color:red">`time_interval WINDOW?`</span><span style="color:green">`EVERY time_interval`</span><span style="color:blue">`FROM time_interval (TO time_interval)?`</span><span style="color:purple">`(INCLUDING specifier_list)? (EXCLUDING specifier_list)?`</span>
+* <span style="color:blue">`FROM time_interval (TO time_interval)?`</span>
+
+with
+* `time_interval` representing a time amount followed by a unit (e.g. "1 hour")
+* `specifier_list` representing a comma separated list of inclusion or exclusion specifiers (e.g. "holidays:us, tuesdays")
+
+
+###### <span style="color:blue">Total Temporal Duration</span>
+
+Total temporal duration is specified by a phrase: `FROM time_interval AGO TO time_interval AGO`
+This indicates the beginning and ending of a time interval.
+* `FROM` - Can be the words "from" or "starting from"
+* `time_interval` - A time amount followed by a unit (e.g. 1 hour). The unit may be "minute", "day", "hour" with any pluralization.
+* `TO` - Can be the words "until" or "to"
+* `AGO` - Optionally the word "ago"
+
+The `TO time_interval AGO` portion is optional. If unspecified then it is expected that the time interval ends now.
+
+Due to the vagaries of the english language, the from and the to portions, if both specified, are interchangeable
+with regard to which one specifies the start and which specifies the end.
+
+In other words <span style="color:blue">`starting from 1 hour ago to 30 minutes ago`</span> and
+<span style="color:blue">`starting from 30 minutes ago to 1 hour ago`</span> specify the same
+temporal duration.
+
+**Examples**
+
+* A duration starting 1 hour ago and ending now
+  * <span style="color:blue">`from 1 hour ago`</span>
+  * <span style="color:blue">`from 1 hour`</span>
+  * <span style="color:blue">`starting from 1 hour ago`</span>
+  * <span style="color:blue">`starting from 1 hour`</span>
+* A duration starting 1 hour ago and ending 30 minutes ago:
+  * <span style="color:blue">`from 1 hour ago until 30 minutes ago`</span>
+  * <span style="color:blue">`from 30 minutes ago until 1 hour ago`</span>
+  * <span style="color:blue">`starting from 1 hour ago to 30 minutes ago`</span>
+  * <span style="color:blue">`starting from 1 hour to 30 minutes`</span>
+
+###### <span style="color:red">Temporal Window Width</span>
+
+Temporal window width is the specification of a window.
+A window is may either repeat within total temporal duration or may fill the total temporal duration.
+A window is specified by the phrase: `time_interval WINDOW`
+* `time_interval` - A time amount followed by a unit (e.g. 1 hour). The unit may be "minute", "day", "hour" with any pluralization.
+* `WINDOW` - Optionally the word "window"
+
+**Examples**
+
+* A fixed window starting 2 hours ago and going until now
+  * <span style="color:red">`2 hour`</span>
+  * <span style="color:red">`2 hours`</span>
+  * <span style="color:red">`2 hours window`</span>
+* A repeating 30 minute window starting 2 hours ago and repeating every hour until now.
+This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
+  * <span style="color:red">`30 minute window`</span><span style="color:green">`every 1 hour`</span><span style="color:blue">`starting from 2 hours ago`</span>
+  * <span style="color:red">`30 minutes window`</span><span style="color:green">`every 1 hour`</span><span style="color:blue">`from 2 hours ago`</span>
+* A repeating 30 minute window starting 2 hours ago and repeating every hour until 30 minutes ago.
+This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
+  * <span style="color:red">`30 minute window`</span><span style="color:green">`every 1 hour`</span><span style="color:blue">`starting from 2 hours ago until 30 minutes ago`</span>
+  * <span style="color:red">`30 minutes window`</span><span style="color:green">`every 1 hour`</span><span style="color:blue">`from 2 hours ago to 30 minutes ago`</span>
+  * <span style="color:red">`30 minutes window`</span><span style="color:green">`for every 1 hour`</span><span style="color:blue">`from 30 minutes ago to 2 hours ago`</span>
+
+###### <span style="color:green">Skip distance</span>
+
+Skip distance is the amount of time between temporal window beginnings that the next window starts.
+It is, in effect, the window period.
+
+It is specified by the phrase `EVERY time_interval`
+* `time_interval` - A time amount followed by a unit (e.g. 1 hour). The unit may be "minute", "day", "hour" with any pluralization.
+* `EVERY` - The word/phrase "every" or "for every"
+
+**Examples**
+
+* A repeating 30 minute window starting 2 hours ago and repeating every hour until now.
--- End diff --

How do I select the last 1 hour window for the past 8 Tuesdays (assuming today is Tuesday)?

How do I select the last 1 hour window for the past 8 'current day of the week'?
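
To make the question concrete, here is a minimal Python sketch of how I read the repeating-window semantics described in the diff above. It is not Metron code: `enumerate_windows` and `same_weekday` are hypothetical helpers, and the rule that a window must fit entirely inside the total temporal duration is my assumption, chosen because it reproduces the "2 30-minute wide windows" result stated in the README examples.

```python
from datetime import datetime, timedelta


def enumerate_windows(now, width, skip, start_ago, end_ago=timedelta(0)):
    """List (start, end) windows of `width`, one every `skip`, inside the duration.

    Hypothetical helper, not part of Metron. Assumption: a window is kept
    only if it ends at or before the end of the total temporal duration.
    """
    if skip <= timedelta(0):
        raise ValueError("skip must be positive")
    # The README says FROM and TO are interchangeable, so order them here.
    earliest = now - max(start_ago, end_ago)
    latest = now - min(start_ago, end_ago)
    windows = []
    start = earliest
    while start + width <= latest:
        windows.append((start, start + width))
        start += skip
    return windows


def same_weekday(windows, now):
    """Keep only windows that start on the same day of the week as `now`."""
    return [w for w in windows if w[0].weekday() == now.weekday()]


if __name__ == "__main__":
    now = datetime(2017, 2, 21, 12, 0)  # a Tuesday
    # "1 hour window every 24 hours starting from 56 days ago"
    daily = enumerate_windows(now, timedelta(hours=1), timedelta(hours=24),
                              timedelta(days=56))
    for start, end in same_weekday(daily, now):
        print(start.isoformat(), "->", end.isoformat())
```

Under this reading, a 1 hour window every 24 hours starting from 56 days ago, filtered to the current day of the week, yields 8 windows when today is a Tuesday. Whether the selector language expresses that filter as `including tuesdays` or as `including the current day of the week` is what I would like the README to spell out.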