Good idea; I think percent-encoding sounds great. Unless there are any objections, I'll go with that approach.
On Fri, Jul 6, 2018 at 5:32 PM, Benjamin Mahler <[email protected]> wrote: > Do we also want: > > 3. Has an unambiguous decoding. > > Replacing '/' with '#%$' means I don't know if the user actually supplied > '#%$' or '/'. But using something like percent-encoding would have property > 3. > > On Fri, Jul 6, 2018 at 10:25 AM, Greg Mann <[email protected]> wrote: > >> Thanks for the reply Ben! >> >> Yea I suspect the lack of normalization there was not intentional, and it >> means that you can no longer reliably split on '/' unless you apply some >> external controls to user input. Yep, this is bad :) >> >> One thing we should consider when normalizing metadata embedded in metric >> keys (like framework name/ID) is that operators will likely want to >> de-normalize this information in their metrics tooling. For example, >> ideally something like the 'mesos_exporter' [1] could expose the framework >> name/ID as tags which could be easily consumed by the cluster's metrics >> infrastructure. >> >> To accommodate de-normalization, any substitutions we perform while >> normalizing should be: >> >> 1. Unique - we should substitute a single, unique string for each >> disallowed character >> 2. Verbose - we should substitute strings which are unlikely to >> appear in user input. (Examples: '#&#', '*@*', '%!%', etc.) >> >> >> My thought was that since operators can already not simply "split on >> slash", we might as well avoid normalization to avoid the above >> difficulties with de-normalization. There are other ways that the metric >> keys can be processed (e.g., regex), but this requires that the tooling >> have prior knowledge of the keys it should expect, which is not ideal. >> >> However, it also seems reasonable to me to normalize in a way that >> accommodates de-normalization as noted above. >> >> *What would folks think about normalizing by substituting '/' with '#%$'?* >> >> Another character that came up as a possible candidate for substitution >> is the space character, which could very well appear in framework names. I >> don't think we have a good reason on our side to substitute whitespace, but >> perhaps its presence in the metric keys would cause issues with external >> tooling? >> >> Greg >> >> [1] https://github.com/mesosphere/mesos_exporter >> >> >> On Tue, Jul 3, 2018 at 5:56 PM, Benjamin Mahler <[email protected]> >> wrote: >> >>> I don't think the lack of principal normalization was intentional. Why >>> spread that further? Don't we also have some normalization today? >>> >>> Having slashes show up in components complicates parsing (can no longer >>> split on '/'), no? For example, if we were to introduce the ability to >>> query a subset of metrics with a simple matcher (e.g. >>> /frameworks/*/messages_received), then this would be complicated by the >>> presence of slashes in the principal or other user supplied strings. >>> >>> On Tue, Jul 3, 2018 at 3:17 PM, Greg Mann <[email protected]> wrote: >>> >>>> Hi all! >>>> I'm currently working on adding a suite of new per-framework metrics to >>>> help schedulers better debug unexpected/unwanted behavior (MESOS-8842 >>>> <https://issues.apache.org/jira/browse/MESOS-8842>). One issue that >>>> has come up during this work is how we should handle strings like the >>>> framework name or role name in metric keys, since those strings may contain >>>> characters like '/' which already have a meaning in our metrics interface. >>>> I intend to place the framework name and ID in the keys for the new >>>> per-framework metrics, delimited by a sufficiently-unique separator so that >>>> operators can decode the name/ID in their metrics tooling. An example >>>> per-framework metric key: >>>> >>>> master/frameworks/<framework_name>###<framework_id>/tasks/ta >>>> sk_running >>>> >>>> >>>> I recently realized that we actually already allow the '/' character in >>>> metric keys, since we include the framework principal in these keys: >>>> >>>> frameworks/<framework_principal>/messages_received >>>> frameworks/<framework_principal>/messages_processed >>>> >>>> We don't disallow any characters in the principal, so anything could >>>> appear in those keys. >>>> >>>> *Since we don't normalize the principal in the above keys, my proposal >>>> is that we do not normalize the framework name at all when constructing the >>>> new per-framework metric keys.* >>>> >>>> >>>> Let me know what you think! >>>> >>>> Cheers, >>>> Greg >>>> >>> >>> >> >
