If anyone wants to do some further testing on their own datasets, it would
definitely be interesting to see what range they are in.

I’ll start addressing the latest round of comments and tie up the tests.

On Wed, Aug 19, 2020 at 4:53 AM Brian Brazil <
brian.bra...@robustperception.io> wrote:

> On Wed, 19 Aug 2020 at 09:47, Rob Skillington <r...@chronosphere.io> wrote:
>
>> To add a bit more detail to that example, I was actually using a fairly
>> tuned remote write queue config that sent large batches, since the batch
>> send deadline was set to a longer-than-default 1 minute with a max samples
>> per send of 5,000. Here's that config:
>> ```
>> remote_write:
>>   - url: http://localhost:3030/remote/write
>>     remote_timeout: 30s
>>     queue_config:
>>       capacity: 10000
>>       max_shards: 10
>>       min_shards: 3
>>       max_samples_per_send: 5000
>>       batch_send_deadline: 1m
>>       min_backoff: 50ms
>>       max_backoff: 1s
>> ```
>>
>> Using the default config we get worse utilization for both the before/after
>> numbers, but the delta is smaller:
>> - steady state ~177kb/sec without this change
>> - steady state ~210kb/sec with this change
>> - roughly 20% increase
>>
>
> I think 20% is okay all things considered.
>
> Brian
>
>
>>
>> Using config:
>> ```
>> remote_write:
>>   - url: http://localhost:3030/remote/write
>>     remote_timeout: 30s
>> ```
>>
>> Implicitly, the values for this config are (written out explicitly below):
>> - min shards 1
>> - max shards 1000
>> - max samples per send 100
>> - capacity 500
>> - batch send deadline 5s
>> - min backoff 30ms
>> - max backoff 100ms
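>>
>> Written out as an explicit queue_config, those defaults would look roughly
>> like the following (a sketch derived from the values listed above, not
>> copied from the Prometheus source, so double-check before relying on it):
>> ```
>> remote_write:
>>   - url: http://localhost:3030/remote/write
>>     remote_timeout: 30s
>>     queue_config:
>>       capacity: 500
>>       max_shards: 1000
>>       min_shards: 1
>>       max_samples_per_send: 100
>>       batch_send_deadline: 5s
>>       min_backoff: 30ms
>>       max_backoff: 100ms
>> ```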
>>
>> On Wed, Aug 19, 2020 at 4:26 AM Brian Brazil <
>> brian.bra...@robustperception.io> wrote:
>>
>>> On Wed, 19 Aug 2020 at 09:20, Rob Skillington <r...@chronosphere.io>
>>> wrote:
>>>
>>>> Here are the results from testing:
>>>> - node_exporter exporting 309 metrics each by turning on a lot of optional
>>>>   collectors; all have help set, very few have unit set
>>>> - running 8 on the host at a 1s scrape interval, each with a unique
>>>>   instance label
>>>> - steady state ~137kb/sec without this change
>>>> - steady state ~172kb/sec with this change
>>>> - roughly 30% increase
>>>>
>>>> Graph here:
>>>>
>>>> https://github.com/prometheus/prometheus/pull/7771#issuecomment-675923976
>>>>
>>>> How do we want to proceed? This could be fairly close to the higher end
>>>> of the spectrum in terms of expected increase, given the node_exporter
>>>> metric density and fairly verbose metadata.
>>>>
>>>> Even having said that, however, 30% is a fairly big increase and a
>>>> relatively large egress cost to have to swallow without any way to back
>>>> out of this behavior.
>>>>
>>>> What do folks think of next steps?
>>>>
>>>
>>> It is on the high end, however this is going to be among the worst cases
>>> as there's not going to be a lot of per-metric cardinality from the node
>>> exporter. I bet if you greatly increased the number of targets (and reduced
>>> the scrape interval to compensate) it'd be more reasonable. I think this is
>>> just about okay.
>>>
>>> Brian
>>>
>>>
>>>>
>>>>
>>>> On Tue, Aug 11, 2020 at 11:55 AM Rob Skillington <r...@chronosphere.io>
>>>> wrote:
>>>>
>>>>> Agreed - I'll see what I can do in getting some numbers for a workload
>>>>> collecting cAdvisor metrics, it seems to have a significant amount of
>>>>> HELP set:
>>>>>
>>>>> https://github.com/google/cadvisor/blob/8450c56c21bc5406e2df79a2162806b9a23ebd34/metrics/testdata/prometheus_metrics
>>>>>
>>>>>
>>>>> On Tue, Aug 11, 2020 at 6:15 AM Brian Brazil <
>>>>> brian.bra...@robustperception.io> wrote:
>>>>>
>>>>>> On Tue, 11 Aug 2020 at 11:07, Julien Pivotto <
>>>>>> roidelapl...@prometheus.io> wrote:
>>>>>>
>>>>>>> On 11 Aug 11:05, Brian Brazil wrote:
>>>>>>> > On Tue, 11 Aug 2020 at 04:09, Callum Styan <callumst...@gmail.com> wrote:
>>>>>>> >
>>>>>>> > > I'm hesitant to add anything that significantly increases the network
>>>>>>> > > bandwidth usage of remote write while at the same time not giving users a
>>>>>>> > > way to tune the usage to their needs.
>>>>>>> > >
>>>>>>> > > I agree with Brian that we don't want the protocol itself to become
>>>>>>> > > stateful by introducing something like negotiation. I'd also prefer not to
>>>>>>> > > introduce multiple ways to do things, though I'm hoping we can find a way
>>>>>>> > > to accommodate your use case while not ballooning the average user's
>>>>>>> > > network egress bill.
>>>>>>> > >
>>>>>>> > > I am fine with forcing the consuming end to be somewhat stateful, like in
>>>>>>> > > the case of Josh's PR where all metadata is sent periodically and must be
>>>>>>> > > stored by the remote storage system.
>>>>>>> > >
>>>>>>> > > Overall I'd like to see some numbers regarding current network bandwidth
>>>>>>> > > of remote write, remote write with metadata via Josh's PR, and remote
>>>>>>> > > write with sending metadata for every series in a remote write payload.
>>>>>>> >
>>>>>>> > I agree, I noticed that in Rob's PR and had the same thought.
>>>>>>>
>>>>>>> Remote bandwidth is likely to affect only people using remote write.
>>>>>>>
>>>>>>> Getting a view on the on-disk size of the WAL would be great too, as
>>>>>>> that will affect everyone.
>>>>>>
>>>>>> I'm not worried about that, it's only really on series creation so
>>>>>> won't be noticed unless you have really high levels of churn.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> > Brian
>>>>>>> >
>>>>>>> > > Rob, I'll review your PR tomorrow but it looks like Julien and Brian may
>>>>>>> > > already have that covered.
>>>>>>> > >
>>>>>>> > > On Sun, Aug 9, 2020 at 9:36 PM Rob Skillington <r...@chronosphere.io> wrote:
>>>>>>> > >
>>>>>>> > >> Update: The PR now sends the fields over remote write from the WAL, and
>>>>>>> > >> metadata is also updated in the WAL when any field changes.
>>>>>>> > >>
>>>>>>> > >> Now opened the PR against the primary repo:
>>>>>>> > >> https://github.com/prometheus/prometheus/pull/7771
>>>>>>> > >>
>>>>>>> > >> I have tested this end-to-end with a modified M3 branch:
>>>>>>> > >> https://github.com/m3db/m3/compare/r/test-prometheus-metadata
>>>>>>> > >>
>>>>>>> > >> > {... "msg":"received series","labels":"{__name__="prometheus_rule_group_...
>>>>>>> > >> > iterations_total",instance="localhost:9090",job="prometheus01",role=...
>>>>>>> > >> > "remote"}","type":"counter","unit":"","help":"The total number of scheduled...
>>>>>>> > >> > rule group evaluations, whether executed or missed."}
>>>>>>> > >>
>>>>>>> > >> Tests still haven't been updated. Any feedback on the approach /
>>>>>>> > >> data structures would be greatly appreciated.
>>>>>>> > >>
>>>>>>> > >> Would be good to know what others' thoughts are on next steps.
>>>>>>> > >>
>>>>>>> > >> On Sat, Aug 8, 2020 at 11:21 AM Rob Skillington <r...@chronosphere.io> wrote:
>>>>>>> > >>
>>>>>>> > >>> Here's a draft PR that propagates metadata to the WAL, and the WAL
>>>>>>> > >>> reader can read it back:
>>>>>>> > >>> https://github.com/robskillington/prometheus/pull/1/files
>>>>>>> > >>>
>>>>>>> > >>> Would like a little bit of feedback on the datatypes and structure
>>>>>>> > >>> before going further, if folks are open to that.
>>>>>>> > >>>
>>>>>>> > >>> There are a few things not happening yet:
>>>>>>> > >>> - Remote write queue manager does not use or send these extra fields yet.
>>>>>>> > >>> - Head does not reset the "metadata" slice (not sure where the "series"
>>>>>>> > >>>   slice is reset in the head for pending series writes to the WAL; want to
>>>>>>> > >>>   do it in the same place).
>>>>>>> > >>> - Metadata is not re-written on change yet.
>>>>>>> > >>> - Tests.
>>>>>>> > >>>
>>>>>>> > >>> On Sat, Aug 8, 2020 at 9:37 AM Rob Skillington <r...@chronosphere.io> wrote:
>>>>>>> > >>>
>>>>>>> > >>>> Sounds good, I've updated the proposal with details on places in which
>>>>>>> > >>>> changes are required given the new approach:
>>>>>>> > >>>>
>>>>>>> > >>>> https://docs.google.com/document/d/1LY8Im8UyIBn8e3LJ2jB-MoajXkfAqW2eKzY735aYxqo/edit#
>>>>>>> > >>>>
>>>>>>> > >>>> On Fri, Aug 7, 2020 at 2:09 PM Brian Brazil <
>>>>>>> > >>>> brian.bra...@robustperception.io> wrote:
>>>>>>> > >>>>
>>>>>>> > >>>>> On Fri, 7 Aug 2020 at 15:48, Rob Skillington <r...@chronosphere.io> wrote:
>>>>>>> > >>>>>
>>>>>>> > >>>>>> True - I mean this could also be a blacklist by config perhaps, so if
>>>>>>> > >>>>>> you really don't want to have increased egress you can optionally turn
>>>>>>> > >>>>>> off sending the TYPE, HELP, UNIT or send them at different frequencies
>>>>>>> > >>>>>> via config. We could package some sensible defaults so folks don't need
>>>>>>> > >>>>>> to update their config.
>>>>>>> > >>>>>>
>>>>>>> > >>>>>> The main intention is to enable these added features and make it
>>>>>>> > >>>>>> possible for various consumers to adjust some of these parameters if
>>>>>>> > >>>>>> required, since backends can be so different in their implementation.
>>>>>>> > >>>>>> For M3 I would be totally fine with the extra egress, which should be
>>>>>>> > >>>>>> mitigated fairly considerably by Snappy and the fact that HELP is
>>>>>>> > >>>>>> common across certain metric families, even when receiving it with
>>>>>>> > >>>>>> every single Remote Write request.
>>>>>>> > >>>>>>
>>>>>>> > >>>>>
>>>>>>> > >>>>> That's really a micro-optimisation. If you are that worried about
>>>>>>> > >>>>> bandwidth you'd run a sidecar specific to your remote backend that was
>>>>>>> > >>>>> stateful and far more efficient overall. Sending the full label names and
>>>>>>> > >>>>> values on every request is going to be far more than the overhead of
>>>>>>> > >>>>> metadata on top of that, so I don't see a need as it stands for any of
>>>>>>> > >>>>> this to be configurable.
>>>>>>> > >>>>>
>>>>>>> > >>>>> Brian
>>>>>>> > >>>>>
>>>>>>> > >>>>>> On Fri, Aug 7, 2020 at 3:56 AM Brian Brazil <
>>>>>>> > >>>>>> brian.bra...@robustperception.io> wrote:
>>>>>>> > >>>>>>
>>>>>>> > >>>>>>> On Thu, 6 Aug 2020 at 22:58, Rob Skillington <r...@chronosphere.io> wrote:
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>>> Hey Björn,
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> Thanks for the detailed response. I've had a few back and forths on
>>>>>>> > >>>>>>>> this with Brian and Chris over IRC and CNCF Slack now too.
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> I agree that fundamentally it seems naive to idealistically model this
>>>>>>> > >>>>>>>> around per metric name. It needs to be per series given what may happen
>>>>>>> > >>>>>>>> w.r.t. collisions across targets, etc.
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> Perhaps we can separate these discussions into two considerations:
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> 1) Modeling of the data such that it is kept around for transmission
>>>>>>> > >>>>>>>> (primarily we're focused on the WAL here).
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> 2) Transmission (which, as you allude to, has many areas for improvement).
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> For (1) - it seems like this needs to be done per time series; thankfully
>>>>>>> > >>>>>>>> we have actually already modeled this to be stored per series just once
>>>>>>> > >>>>>>>> in a single WAL file. I will write up my proposal here, but it will amount
>>>>>>> > >>>>>>>> to essentially encoding the HELP, UNIT and TYPE to the WAL per series,
>>>>>>> > >>>>>>>> similar to how labels for a series are encoded once per series in the WAL.
>>>>>>> > >>>>>>>> Since this optimization is in place, there's already a huge dampening
>>>>>>> > >>>>>>>> effect on how expensive it is to write out data about a series (e.g.
>>>>>>> > >>>>>>>> labels). We can always go and collect a sample WAL file and measure how
>>>>>>> > >>>>>>>> much extra size with/without HELP, UNIT and TYPE this would add, but it
>>>>>>> > >>>>>>>> seems like it won't fundamentally change the order of magnitude of
>>>>>>> > >>>>>>>> "information about a timeseries storage size" vs "datapoints about a
>>>>>>> > >>>>>>>> timeseries storage size". One extra change would be re-encoding the series
>>>>>>> > >>>>>>>> into the WAL if the HELP changed for that series, just so that when HELP
>>>>>>> > >>>>>>>> does change it can be up to date from the view of whoever is reading the
>>>>>>> > >>>>>>>> WAL (i.e. the Remote Write loop). Since this entry needs to be loaded into
>>>>>>> > >>>>>>>> memory for Remote Write today anyway, with string interning as suggested
>>>>>>> > >>>>>>>> by Chris it won't change the memory profile algorithmically of a
>>>>>>> > >>>>>>>> Prometheus with Remote Write enabled. There will be some overhead that at
>>>>>>> > >>>>>>>> most would likely be similar to the label data, but we aren't altering
>>>>>>> > >>>>>>>> data structures (so we won't change the big-O magnitude of memory being
>>>>>>> > >>>>>>>> used), we're adding fields to data structures that already exist, and
>>>>>>> > >>>>>>>> string interning should actually make it much less onerous since there is
>>>>>>> > >>>>>>>> a large duplicative effect with HELP among time series.
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> For (2) - now we have basically TYPE, HELP and UNIT all available for
>>>>>>> > >>>>>>>> transmission if we wanted to send it with every single datapoint. While I
>>>>>>> > >>>>>>>> think we should definitely examine HPACK-like compression features as you
>>>>>>> > >>>>>>>> mentioned, Björn, I think we should separate that kind of work into a
>>>>>>> > >>>>>>>> Milestone 2 where this is considered.
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>>> For the time being it's very plausible we could do some negotiation with
>>>>>>> > >>>>>>>> the receiving Remote Write endpoint by sending a "GET" to the remote write
>>>>>>> > >>>>>>>> endpoint and seeing if it responds with a "capabilities + preferences"
>>>>>>> > >>>>>>>> response, and whether the endpoint specifies that it would like to receive
>>>>>>> > >>>>>>>> metadata all the time on every single request and let Snappy take care of
>>>>>>> > >>>>>>>> keeping the size from ballooning too much, or whether it would like TYPE
>>>>>>> > >>>>>>>> on every single datapoint, and HELP and UNIT every DESIRED_SECONDS or so.
>>>>>>> > >>>>>>>> To enable a "send HELP every 10 minutes" feature we would have to add to
>>>>>>> > >>>>>>>> the datastructure that holds the LABELS, TYPE, HELP and UNIT for each
>>>>>>> > >>>>>>>> series a "last sent" timestamp to know when to resend to that backend, but
>>>>>>> > >>>>>>>> that seems entirely plausible and would not use more than 4 extra bytes.
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>> Negotiation is fundamentally stateful, as the process that receives the
>>>>>>> > >>>>>>> first request may be a very different one from the one that receives the
>>>>>>> > >>>>>>> second - such as if an upgrade is in progress. Remote write is intended to
>>>>>>> > >>>>>>> be a very simple thing that's easy to implement on the receiver end and is
>>>>>>> > >>>>>>> a send-only request-based protocol, so request-time negotiation is
>>>>>>> > >>>>>>> basically out. Any negotiation needs to happen via the config file, and
>>>>>>> > >>>>>>> even then it'd be better if nothing ever needed to be configured. Getting
>>>>>>> > >>>>>>> all the users of a remote write to change their config file or restart all
>>>>>>> > >>>>>>> their Prometheus servers is not an easy task after all.
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>> Brian
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>>> These thoughts are based on the discussion I've had and the thoughts on
>>>>>>> > >>>>>>>> this thread. What's the feedback on this before I go ahead and re-iterate
>>>>>>> > >>>>>>>> the design to more closely map to what I'm suggesting here?
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> Best,
>>>>>>> > >>>>>>>> Rob
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>> On Thu, Aug 6, 2020 at 2:01 PM Bjoern Rabenstein <bjo...@rabenste.in> wrote:
>>>>>>> > >>>>>>>>
>>>>>>> > >>>>>>>>> On 03.08.20 03:04, Rob Skillington wrote:
>>>>>>> > >>>>>>>>> > Ok - I have a proposal which could be broken up into two pieces, first
>>>>>>> > >>>>>>>>> > delivering TYPE per datapoint, the second consistently and reliably
>>>>>>> > >>>>>>>>> > HELP and UNIT once per unique metric name:
>>>>>>> > >>>>>>>>> > https://docs.google.com/document/d/1LY8Im8UyIBn8e3LJ2jB-MoajXkfAqW2eKzY735aYxqo/edit#heading=h.bik9uwphqy3g
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Thanks for the doc. I have commented on it, but while doing so, I felt
>>>>>>> > >>>>>>>>> the urge to comment more generally, which would not fit well into the
>>>>>>> > >>>>>>>>> margin of a Google doc. My thoughts are also a bit out of scope of
>>>>>>> > >>>>>>>>> Rob's design doc and more about the general topic of remote write and
>>>>>>> > >>>>>>>>> the equally general topic of metadata (about which we have an ongoing
>>>>>>> > >>>>>>>>> discussion among the Prometheus developers).
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Disclaimer: I don't know the remote-write protocol very well. My hope
>>>>>>> > >>>>>>>>> here is that my somewhat distant perspective is of some value as it
>>>>>>> > >>>>>>>>> allows me to take a step back. However, I might just miss crucial
>>>>>>> > >>>>>>>>> details that completely invalidate my thoughts. We'll see...
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> I do care a lot about metadata, though. (And ironically, the reason
>>>>>>> > >>>>>>>>> why I declared remote write "somebody else's problem" is that I've
>>>>>>> > >>>>>>>>> always disliked how it fundamentally ignores metadata.)
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Rob's document embraces the fact that metadata can change over time,
>>>>>>> > >>>>>>>>> but it assumes that at any given time, there is only one set of
>>>>>>> > >>>>>>>>> metadata per unique metric name. It takes into account that there can
>>>>>>> > >>>>>>>>> be drift, but it considers that an irregularity that will only happen
>>>>>>> > >>>>>>>>> occasionally and iron out over time.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> In practice, however, metadata can be legitimately and deliberately
>>>>>>> > >>>>>>>>> different for different time series of the same name. Instrumentation
>>>>>>> > >>>>>>>>> libraries and even the exposition format inherently require one set of
>>>>>>> > >>>>>>>>> metadata per metric name, but this is all only enforced (and meant to
>>>>>>> > >>>>>>>>> be enforced) _per target_. Once the samples are ingested (or even sent
>>>>>>> > >>>>>>>>> onwards via remote write), they have no notion of what target they
>>>>>>> > >>>>>>>>> came from. Furthermore, samples created by rule evaluation don't have
>>>>>>> > >>>>>>>>> an originating target in the first place. (Which raises the question
>>>>>>> > >>>>>>>>> of metadata for recording rules, which is another can of worms I'd
>>>>>>> > >>>>>>>>> like to open eventually...)
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> (There is also the technical difficulty that the WAL has no notion of
>>>>>>> > >>>>>>>>> bundling or referencing all the series with the same metric name. That
>>>>>>> > >>>>>>>>> was commented about in the doc but is not my focus here.)
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Rob's doc sees TYPE as special because it is so cheap to just add to
>>>>>>> > >>>>>>>>> every data point. That's correct, but it's giving me an itch: Should
>>>>>>> > >>>>>>>>> we really create different ways of handling metadata, depending on its
>>>>>>> > >>>>>>>>> expected size?
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Compare this with labels. There is no upper limit to their number or
>>>>>>> > >>>>>>>>> size. Still, we have no plan of treating "large" labels differently
>>>>>>> > >>>>>>>>> from "short" labels.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> On top of that, we have by now gained the insight that metadata is
>>>>>>> > >>>>>>>>> changing over time and essentially has to be tracked per series.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Or in other words: From a pure storage perspective, metadata behaves
>>>>>>> > >>>>>>>>> exactly the same as labels! (There are certainly huge differences
>>>>>>> > >>>>>>>>> semantically, but those only manifest themselves on the query level,
>>>>>>> > >>>>>>>>> i.e. how you treat it in PromQL etc.)
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> (This is not exactly a new insight. This is more or less what I said
>>>>>>> > >>>>>>>>> during the 2016 dev summit, when we first discussed remote write. But
>>>>>>> > >>>>>>>>> I don't want to dwell on "told you so" moments... :o)
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> There is a good reason why we don't just add metadata as "pseudo
>>>>>>> > >>>>>>>>> labels": As discussed a lot in the various design docs including Rob's
>>>>>>> > >>>>>>>>> one, it would blow up the data size significantly because HELP strings
>>>>>>> > >>>>>>>>> tend to be relatively long.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> And that's the point where I would like to take a step back: We are
>>>>>>> > >>>>>>>>> discussing to essentially treat something that is structurally the
>>>>>>> > >>>>>>>>> same thing in three different ways: Way 1 for labels as we know them.
>>>>>>> > >>>>>>>>> Way 2 for "small" metadata. Way 3 for "big" metadata.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> However, while labels tend to be shorter than HELP strings, there is
>>>>>>> > >>>>>>>>> the occasional use case with long or many labels. (Infamously, at
>>>>>>> > >>>>>>>>> SoundCloud, a binary accidentally put a whole HTML page into a label.
>>>>>>> > >>>>>>>>> That wasn't a use case, it was a bug, but the Prometheus server
>>>>>>> > >>>>>>>>> ingesting that was just chugging along as if nothing special had
>>>>>>> > >>>>>>>>> happened. It looked weird in the expression browser, though...) I'm
>>>>>>> > >>>>>>>>> sure any vendor offering Prometheus remote storage as a service will
>>>>>>> > >>>>>>>>> have a customer or two that use excessively long label names. If we
>>>>>>> > >>>>>>>>> have to deal with that, why not bite the bullet and treat metadata in
>>>>>>> > >>>>>>>>> the same way as labels in general? Or to phrase it in another way: Any
>>>>>>> > >>>>>>>>> solution for "big" metadata could be used for labels, too, to
>>>>>>> > >>>>>>>>> alleviate the pain with excessively long label names.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Or most succinctly: A robust and really good solution for "big"
>>>>>>> > >>>>>>>>> metadata in remote write will make remote write much more efficient if
>>>>>>> > >>>>>>>>> applied to labels, too.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Imagine an NALSD tech interview question that boils down to "design
>>>>>>> > >>>>>>>>> Prometheus remote write". I bet that most of the better candidates
>>>>>>> > >>>>>>>>> will recognize that most of the payload will consist of series
>>>>>>> > >>>>>>>>> identifiers (call them labels or whatever) and they will suggest to
>>>>>>> > >>>>>>>>> first transmit some kind of index and from then on only transmit short
>>>>>>> > >>>>>>>>> series IDs. The best candidates will then find out about all the
>>>>>>> > >>>>>>>>> problems with that: How to keep the protocol stateless, how to re-sync
>>>>>>> > >>>>>>>>> the index, how to update it if new series arrive etc. Those are
>>>>>>> > >>>>>>>>> certainly all good reasons why remote write as we know it does not
>>>>>>> > >>>>>>>>> transfer an index of series IDs.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> However, my point here is that we are now discussing exactly those
>>>>>>> > >>>>>>>>> problems when we talk about metadata transmission. Let's solve those
>>>>>>> > >>>>>>>>> problems and apply them to remote write in general!
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Some thoughts about that:
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> Current remote write essentially transfers all labels for _every_
>>>>>>> > >>>>>>>>> sample. This works reasonably well. Even if metadata blows up the data
>>>>>>> > >>>>>>>>> size by 5x or 10x, transferring the whole index of metadata and labels
>>>>>>> > >>>>>>>>> should remain feasible as long as we do it less frequently than once
>>>>>>> > >>>>>>>>> every 10 samples. It's something that could be done each time a
>>>>>>> > >>>>>>>>> remote-write receiver connects. From then on, we "only" have to track
>>>>>>> > >>>>>>>>> when new series (or series with new metadata) show up and transfer
>>>>>>> > >>>>>>>>> those. (I know it's not trivial, but we are already discussing
>>>>>>> > >>>>>>>>> possible solutions in the various design docs.) Whenever a
>>>>>>> > >>>>>>>>> remote-write receiver gets out of sync for some reason, it can simply
>>>>>>> > >>>>>>>>> cut the connection and start with a complete re-sync again. As long as
>>>>>>> > >>>>>>>>> that doesn't happen more often than once every 10 samples, we still
>>>>>>> > >>>>>>>>> have a net gain. Combining this with sharding is another challenge,
>>>>>>> > >>>>>>>>> but it doesn't appear unsolvable.
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>>> --
>>>>>>> > >>>>>>>>> Björn Rabenstein
>>>>>>> > >>>>>>>>> [PGP-ID] 0x851C3DA17D748D03
>>>>>>> > >>>>>>>>> [email] bjo...@rabenste.in
>>>>>>> > >>>>>>>>>
>>>>>>> > >>>>>>>
>>>>>>> > >>>>>>> --
>>>>>>> > >>>>>>> Brian Brazil
>>>>>>> > >>>>>>> www.robustperception.io
>>>>>>> > >>>>>>>
>>>>>>> > >>>>> --
>>>>>>> > >>>>> Brian Brazil
>>>>>>> > >>>>> www.robustperception.io
>>>>>>> > >>>>>
>>>>>>> > >>>> --
>>>>>>> >
>>>>>>> > --
>>>>>>> > Brian Brazil
>>>>>>> > www.robustperception.io
>>>>>>> >
>>>>>>> --
>>>>>>> Julien Pivotto
>>>>>>> @roidelapluie
>>>>>>
>>>>>> --
>>>>>> Brian Brazil
>>>>>> www.robustperception.io
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>> --
>>> Brian Brazil
>>> www.robustperception.io
>>>
>>>
>>>
>>
>>
>
> --
> Brian Brazil
> www.robustperception.io
>
