All right, sorry about the noise. I wasn’t aware that there are two parsers, and I found the slow but correct one first. I quickly edited my stupid benchmark, and the lex parser is indeed much faster. It is still slower than the VictoriaMetrics one (roughly 1.8x on the minimal input and 3x on the big one), but probably not enough to justify a change. And lex parsers are neat.

BenchmarkPrometheusTextParserMinimal-8            411118        2733 ns/op
BenchmarkVictoriaMetricsTextParserMinimal-8      4048662         291.1 ns/op
BenchmarkPrometheusLexMinimal-8                  2301423         526.0 ns/op
BenchmarkPrometheusTextParserBig-8                  2158      536188 ns/op
BenchmarkVictoriaMetricsTextParserBig-8           122446        9699 ns/op
BenchmarkPrometheusLexBig-8                        33807       29730 ns/op

Thanks for your answers.

From: prometheus-developers@googlegroups.com <prometheus-developers@googlegroups.com> on behalf of Bryan Boreham <bjbore...@gmail.com>
Date: Friday, 23 June 2023 at 20:08
To: Prometheus Developers <prometheus-developers@googlegroups.com>
Subject: Re: [prometheus-developers] Potential big performance gain in common/expfmt/TextParser

For completeness, the parsers that Prometheus uses for scraping live here:
https://github.com/prometheus/prometheus/tree/main/model/textparse

Labels are cached by the scraper, so parsing is only done once for the lifetime of each target. This means that Prometheus is not too sensitive to the performance of parsing. But there is a request to make this cache optional:
https://github.com/prometheus/prometheus/issues/12443

Bryan

On Friday, 23 June 2023 at 09:42:38 UTC+1 Julien Pivotto wrote:

On 23 Jun 10:07, Ben Kochie wrote:
> We've done at least one rewrite of the parser in the past. We do
> substantial changes to our subsystems all the time. For example, the
> "stringlabels" changes were a recent substantial change to the internals of
> in-memory label storage.
>
> The only things we want to avoid are breaking existing users and reducing
> the correctness of the parser.

Yes, that parser is meant for correctness and is not used in the Prometheus server itself.

>
> On Fri, Jun 23, 2023 at 9:35 AM 'Antoine Pultier' via Prometheus Developers
> <prometheus...@googlegroups.com> wrote:
>
> > Hi,
> >
> > I am parsing a large number of metrics, and I noticed that the Prometheus
> > expfmt.TextParser takes a significant amount of CPU time on my machine.
> >
> > I also noticed that VictoriaMetrics has an entirely different parsing
> > implementation that is faster on my machine. I have not conducted extensive
> > benchmarking; I'm unsure if I want to. But you can find a small comparison
> > at the end of the email with a small string to parse and a 5MB string full
> > of metrics and labels to parse.
> >
> > I read both implementations, both open-source with the Apache 2.0 license,
> > and I guess the main difference is the extensive use of strings.IndexByte
> > in the VictoriaMetrics parser. Go provides a fast implementation to
> > look for a byte in a string, which is much faster than scanning and
> > comparing byte by byte (on common CPU architectures).
> > Example for arm64:
> > https://github.com/golang/go/blob/e45202f2154839f713b603fd6e5f8a8ad8d527e0/src/internal/bytealg/indexbyte_arm64.s
> > I discovered the existence of such optimisations while reading this article
> > about ripgrep: https://blog.burntsushi.net/ripgrep/#literal-optimizations
> >
> > I'm not a Prometheus developer, but I would guess that completely
> > replacing the parser with another one is not on the table, but doing some
> > changes to the existing one could be possible.
> >
> > However, it seems to require significant changes to gain performance. I'm
> > wondering whether the Prometheus project would welcome substantial changes
> > inside the parser at this point. One change would be to load more data at
> > once. Perhaps the whole data into a string in memory like VictoriaMetrics
> > does, which has some implications. And also the use of strings.IndexByte
> > and slices instead of constructing many strings byte by byte. These changes
> > will probably make the parser less elegant, but that may or may not be
> > worth it.
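
To make the strings.IndexByte point above concrete, here is a minimal Go sketch. It is not code from either parser: splitSample is a hypothetical helper, and it assumes a simplified line of the form name{labels} value (or name value), with no timestamp, no escaping, and no '}' inside label values. It only shows the general idea of locating delimiters with strings.IndexByte and returning sub-slices of the input instead of building strings byte by byte.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSample is a hypothetical helper for illustration only.
// It splits one simplified exposition line into name, raw label text
// and value, using strings.IndexByte to find the delimiters.
// Assumptions: no timestamp, no escape sequences, no '}' in label values.
func splitSample(line string) (name, labels, value string, ok bool) {
	rest := line
	if open := strings.IndexByte(line, '{'); open >= 0 {
		// Look for the closing brace only in the part after '{'.
		end := strings.IndexByte(line[open:], '}')
		if end < 0 {
			return "", "", "", false
		}
		end += open
		name = line[:open]
		labels = line[open+1 : end]
		rest = line[end+1:]
	} else {
		// No labels: the name ends at the first space.
		sp := strings.IndexByte(line, ' ')
		if sp < 0 {
			return "", "", "", false
		}
		name = line[:sp]
		rest = line[sp:]
	}
	value = strings.TrimSpace(rest)
	if name == "" || value == "" {
		return "", "", "", false
	}
	return name, labels, value, true
}

func main() {
	name, labels, value, ok := splitSample(`http_requests_total{method="post",code="200"} 1027`)
	fmt.Println(name, labels, value, ok)
}
```

The three returned strings are slices of the input line, so nothing is copied per byte. A real parser would still have to handle escapes, timestamps, comments and malformed input, which is where most of the complexity sits.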
> >
> > ---
> > The tiny benchmark:
> > ---
> > goos: darwin
> > goarch: arm64
> > pkg: simple-bench
> > BenchmarkPrometheusTextParserMinimal-8            416382        2798 ns/op
> > BenchmarkVictoriaMetricsTextParserMinimal-8      3622894         296.1 ns/op
> > BenchmarkPrometheusTextParserBig-8                     4   287416010 ns/op
> > BenchmarkVictoriaMetricsTextParserBig-8              142     8374695 ns/op
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Prometheus Developers" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to prometheus-devel...@googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/prometheus-developers/31a41b4f-cbcb-40c7-9df8-f1deddd15a32n%40googlegroups.com
> >

--
Julien Pivotto
@roidelapluie