fprof is great at telling you what is taking time in a given workflow, but comparing fprof results won't tell you by how much the code got faster. For that you will have to benchmark it again. For tight loops, though, I can see how removing the version check, option handling, and everything else improves performance. I think it is fine to go that route if you need to.
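A minimal sketch of that route, assuming the regex is compiled on the same Erlang/OTP runtime that executes it. Calling :re.run directly skips the version check, so a pattern precompiled against a different PCRE version would not be recompiled for you. The name `run_scheme` is hypothetical and only for illustration; the regex and the capture options are the ones used elsewhere in this thread.

```elixir
# The regex from this thread: it captures the scheme part of a URL/IRI.
regex = ~r/^([a-z][a-z0-9\+\-\.]*):/i

# Extract the compiled pattern once, outside the tight loop.
re_pattern = Regex.re_pattern(regex)

# Hypothetical helper: returns the same shape as Regex.run/2
# (a list of captures, or nil when there is no match).
run_scheme = fn string ->
  case :re.run(string, re_pattern, [{:capture, :all, :binary}]) do
    {:match, captures} -> captures
    :nomatch -> nil
  end
end

IO.inspect(run_scheme.("http://example.com"))
IO.inspect(run_scheme.("no scheme here"))
```

Whether this is worth it should still be decided by a benchmark of the real workload rather than by comparing fprof output.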
I am also not sure whether fprof accounts for the time spent in NIFs. I assume most of the time is spent in the regex engine, and if fprof does not fully account for it, that could skew the measurements. But I am speculating here, I truly don't know. :)

On Fri, Mar 15, 2024 at 8:31 AM Jan Krüger <jan.krue...@gmail.com> wrote:

> Alright. If you can't see it, then it must have been something in my
> environment. What I did when working on this was run fprof to identify
> potential performance problems, and the version check showed up as a
> substantial part of the time spent in the regex code. Is that a valid use
> of fprof in your opinion? Since we're running this in a very tight loop, I
> also wanted to get rid of the Keyword.get calls when running regexes, so I
> swapped out Regex.run for :re.run, and that substantially improved the
> performance overall.
>
> I didn't then go and profile specifically whether removing the version
> check alone improves the performance. So all I have to back up that the
> version check is the root cause is fprof.
>
> On Friday, March 15, 2024 at 8:22:29 AM UTC+1 José Valim wrote:
>
>> The 5% also takes into account the option processing and result handling.
>> The version check itself is a subset of that. I was not able to measure
>> sensible gains after removing it.
>>
>>> On Fri, Mar 15, 2024 at 12:23 PM 'marcel...@googlemail.com' via
>>> elixir-lang-core <elixir-l...@googlegroups.com> wrote:
>>>
>>>> The benchmark results I'm getting are indeed not as dramatic as the
>>>> fprof results, but the difference is still larger than the 5% mentioned
>>>> in the PR which introduced the check:
>>>> https://github.com/elixir-lang/elixir/pull/9040
>>>>
>>>> ```elixir
>>>> regex = ~r/^([a-z][a-z0-9\+\-\.]*):/i
>>>> re_pattern = regex.re_pattern
>>>>
>>>> Benchee.run(%{
>>>>   "Regex.run/2" => fn -> Regex.run(regex, "foo") end,
>>>>   ":re.run/3" => fn -> :re.run("foo", re_pattern, [{:capture, :all, :binary}]) end
>>>> })
>>>> ```
>>>>
>>>> ```
>>>> Name                  ips        average  deviation         median         99th %
>>>> :re.run/3          2.88 M      346.90 ns  ±3623.51%         333 ns         417 ns
>>>> Regex.run/2        1.98 M      504.74 ns  ±5851.21%         416 ns         542 ns
>>>>
>>>> Comparison:
>>>> :re.run/3          2.88 M
>>>> Regex.run/2        1.98 M - 1.46x slower +157.84 ns
>>>> ```
>>>> On Friday 15 March 2024 at 07:20:11 UTC+1 jan.k...@gmail.com wrote:
>>>>
>>>>> The difference was definitely measurable just in pure running time of
>>>>> the code, setting aside fprof. I'll post what I have after work today.
>>>>>
>>>>> On Thursday, March 14, 2024 at 10:21:25 PM UTC+1 José Valim wrote:
>>>>>
>>>>>> Do you have benchmarks or only the fprof results? fprof is not a
>>>>>> benchmarking tool: comparing fprof results from different code may be
>>>>>> misleading. Proper benchmarking is preferable. I am benchmarking locally
>>>>>> and I cannot measure any relevant difference even with the whole version
>>>>>> checking removed.
>>>>>>
>>>>>> On Thu, Mar 14, 2024 at 6:01 PM Jan Krüger <jan.k...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks a lot. I'm also happy to share our case, and my fprof
>>>>>>> results, if that helps. I am very sure that my Erlang and Elixir
>>>>>>> versions match on the machine where I've tested this.
>>>>>>> Replacing Regex.run with an identical call to :re.run should show
>>>>>>> the performance improvement I've mentioned. The regex we've tested
>>>>>>> this on is:
>>>>>>>
>>>>>>> ~r/^([a-z][a-z0-9\+\-\.]*):/i
>>>>>>>
>>>>>>> On Thursday, March 14, 2024 at 5:55:47 PM UTC+1
>>>>>>> marcel...@googlemail.com wrote:
>>>>>>>
>>>>>>>> I'm the maintainer of the RDF.ex library with the RDF.IRI module
>>>>>>>> mentioned in the OP. I can confirm that this fix doesn't affect the
>>>>>>>> problem, since we're actually not using `URI.parse/1` most of the
>>>>>>>> time (we use it only when dealing with relative URIs). Even in this
>>>>>>>> case the `Regex.version/0` call in `Regex.safe_run/3` (
>>>>>>>> https://github.com/elixir-lang/elixir/blob/b8fca42e58850b56f65d0fb8a2086f2636141f61/lib/elixir/lib/regex.ex#L533)
>>>>>>>> still performs the `:erlang.system_info/1` call.
>>>>>>>>
>>>>>>>> On Thursday 14 March 2024 at 17:15:40 UTC+1 jan.k...@gmail.com
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I read the commit, and I don't think it fixes what our actual
>>>>>>>>> problem was. See my comment above. The problem is the actual call
>>>>>>>>> to :re.version, not the recompilation of the regex.
>>>>>>>>>
>>>>>>>>> On Thursday, March 14, 2024 at 4:37:43 PM UTC+1 José Valim wrote:
>>>>>>>>>
>>>>>>>>>> I have pushed a fix to main. But also note we provide precompiled
>>>>>>>>>> Elixir versions per OTP version. Using a matching version will
>>>>>>>>>> always give you the best results and that's not only about
>>>>>>>>>> regexes. :)
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 14, 2024 at 2:20 PM Jan Krüger <jan.k...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I've recently had to work on a code base that parses largish RDF
>>>>>>>>>>> XML files. Part of the code base does relatively simple regular
>>>>>>>>>>> expression matches, but since the files are large, it makes quite
>>>>>>>>>>> a lot of Regex.run calls.
>>>>>>>>>>> While profiling, I noticed that there are callouts to
>>>>>>>>>>> :erlang.system_info, which fetches the PCRE version BEAM was
>>>>>>>>>>> compiled against.
>>>>>>>>>>>
>>>>>>>>>>> An example regular expression from the code base in question
>>>>>>>>>>> matches the scheme part of a URL. I've replaced Regex.run with
>>>>>>>>>>> Erlang's :re.run for testing purposes, and at least for this
>>>>>>>>>>> case, the performance gain is quite dramatic.
>>>>>>>>>>>
>>>>>>>>>>> Comparing fprof results:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> RDF.IRI.scheme/1
>>>>>>>>>>> 1176473 30615.618 2354.355
>>>>>>>>>>> ---
>>>>>>>>>>> RDF.IRI.scheme/1
>>>>>>>>>>> 1176473 3531.955 2353.905
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> I found this thread in the Google group, which talks about the
>>>>>>>>>>> reasoning for fetching the version and proposes an alternative:
>>>>>>>>>>>
>>>>>>>>>>> https://groups.google.com/g/elixir-lang-core/c/CgFdxIONvGg/m/HN9ryeVXAwAJ?pli=1
>>>>>>>>>>>
>>>>>>>>>>> Especially
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> Taking a further look at the code, the issue with recompiling
>>>>>>>>>>> regexes on the fly is that it makes executing the regexes more
>>>>>>>>>>> expensive, as we need to compute the version on every execution.
>>>>>>>>>>> We could store the version in ETS but that would have performance
>>>>>>>>>>> issues. Storing in a persistent_term would be great, but at the
>>>>>>>>>>> moment we support Erlang/OTP 20+. Thoughts?
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Since this has a fairly noticeable impact, at least in all the
>>>>>>>>>>> tests I've run, I wanted to start a discussion on whether this
>>>>>>>>>>> could be implemented/improved now.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>> Google Groups "elixir-lang-core" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>> it, send an email to elixir-lang-co...@googlegroups.com.
>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/44d498c7-82a4-46d2-89be-7919400e0297n%40googlegroups.com