Good idea; I think percent-encoding sounds great. Unless there are any
objections, I'll go with that approach.

On Fri, Jul 6, 2018 at 5:32 PM, Benjamin Mahler <[email protected]> wrote:

> Do we also want:
>
> 3. Has an unambiguous decoding.
>
> Replacing '/' with '#%$' means I don't know if the user actually supplied
> '#%$' or '/'. But using something like percent-encoding would have property
> 3.
>
> On Fri, Jul 6, 2018 at 10:25 AM, Greg Mann <[email protected]> wrote:
>
>> Thanks for the reply Ben!
>>
>> Yea I suspect the lack of normalization there was not intentional, and it
>> means that you can no longer reliably split on '/' unless you apply some
>> external controls to user input. Yep, this is bad :)
>>
>> One thing we should consider when normalizing metadata embedded in metric
>> keys (like framework name/ID) is that operators will likely want to
>> de-normalize this information in their metrics tooling. For example,
>> ideally something like the 'mesos_exporter' [1] could expose the framework
>> name/ID as tags which could be easily consumed by the cluster's metrics
>> infrastructure.
>>
>> To accommodate de-normalization, any substitutions we perform while
>> normalizing should be:
>>
>>    1. Unique - we should substitute a single, unique string for each
>>    disallowed character
>>    2. Verbose - we should substitute strings which are unlikely to
>>    appear in user input. (Examples: '#&#', '*@*', '%!%', etc.)
>>
>>
>> My thought was that since operators can already not simply "split on
>> slash", we might as well avoid normalization to avoid the above
>> difficulties with de-normalization. There are other ways that the metric
>> keys can be processed (e.g., regex), but this requires that the tooling
>> have prior knowledge of the keys it should expect, which is not ideal.
>>
>> However, it also seems reasonable to me to normalize in a way that
>> accommodates de-normalization as noted above.
>>
>> *What would folks think about normalizing by substituting '/' with '#%$'?*
>>
>> Another character that came up as a possible candidate for substitution
>> is the space character, which could very well appear in framework names. I
>> don't think we have a good reason on our side to substitute whitespace, but
>> perhaps its presence in the metric keys would cause issues with external
>> tooling?
>>
>> Greg
>>
>> [1] https://github.com/mesosphere/mesos_exporter
>>
>>
>> On Tue, Jul 3, 2018 at 5:56 PM, Benjamin Mahler <[email protected]>
>> wrote:
>>
>>> I don't think the lack of principal normalization was intentional. Why
>>> spread that further? Don't we also have some normalization today?
>>>
>>> Having slashes show up in components complicates parsing (can no longer
>>> split on '/'), no? For example, if we were to introduce the ability to
>>> query a subset of metrics with a simple matcher (e.g.
>>> /frameworks/*/messages_received), then this would be complicated by the
>>> presence of slashes in the principal or other user supplied strings.
>>>
>>> On Tue, Jul 3, 2018 at 3:17 PM, Greg Mann <[email protected]> wrote:
>>>
>>>> Hi all!
>>>> I'm currently working on adding a suite of new per-framework metrics to
>>>> help schedulers better debug unexpected/unwanted behavior (MESOS-8842
>>>> <https://issues.apache.org/jira/browse/MESOS-8842>). One issue that
>>>> has come up during this work is how we should handle strings like the
>>>> framework name or role name in metric keys, since those strings may contain
>>>> characters like '/' which already have a meaning in our metrics interface.
>>>> I intend to place the framework name and ID in the keys for the new
>>>> per-framework metrics, delimited by a sufficiently-unique separator so that
>>>> operators can decode the name/ID in their metrics tooling. An example
>>>> per-framework metric key:
>>>>
>>>>     master/frameworks/<framework_name>###<framework_id>/tasks/ta
>>>> sk_running
>>>>
>>>>
>>>> I recently realized that we actually already allow the '/' character in
>>>> metric keys, since we include the framework principal in these keys:
>>>>
>>>>     frameworks/<framework_principal>/messages_received
>>>>     frameworks/<framework_principal>/messages_processed
>>>>
>>>> We don't disallow any characters in the principal, so anything could
>>>> appear in those keys.
>>>>
>>>> *Since we don't normalize the principal in the above keys, my proposal
>>>> is that we do not normalize the framework name at all when constructing the
>>>> new per-framework metric keys.*
>>>>
>>>>
>>>> Let me know what you think!
>>>>
>>>> Cheers,
>>>> Greg
>>>>
>>>
>>>
>>
>

Reply via email to