fprof is great at telling what in a given workflow is taking time but
comparing fprof results won't tell you by how much it got faster. For that
you will have to benchmark it again. For tight loops though, I can see how
removing the version check, option handling and everything else improves
performance. I think it is fine to go that route if you need to.
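
A rough sketch of what "benchmark it again" could look like using only the standard library, as a quick sanity check before reaching for Benchee (which is the better tool, and the one used in the results quoted below). The module name `RegexTiming`, the iteration count and the subject string are assumptions made for illustration. `regex.re_pattern` is used the same way as in the quoted benchmark.

```elixir
# Hypothetical helper for illustration only; not part of Elixir or RDF.ex.
defmodule RegexTiming do
  @iterations 100_000

  # Returns the average wall-clock time of `fun` in nanoseconds per call.
  # :timer.tc/1 reports the elapsed time in microseconds.
  def ns_per_call(fun, iterations \\ @iterations) do
    {micros, :ok} =
      :timer.tc(fn ->
        Enum.each(1..iterations, fn _ -> fun.() end)
      end)

    micros * 1000 / iterations
  end
end

regex = ~r/^([a-z][a-z0-9\+\-\.]*):/i
re_pattern = regex.re_pattern
subject = "http://example.com/"

# Sanity check: both code paths must produce the same captures.
["http:", "http"] = Regex.run(regex, subject)
{:match, ["http:", "http"]} = :re.run(subject, re_pattern, [{:capture, :all, :binary}])

regex_ns = RegexTiming.ns_per_call(fn -> Regex.run(regex, subject) end)
re_ns = RegexTiming.ns_per_call(fn -> :re.run(subject, re_pattern, [{:capture, :all, :binary}]) end)

IO.puts("Regex.run/2: #{Float.round(regex_ns, 1)} ns/call")
IO.puts(":re.run/3:   #{Float.round(re_ns, 1)} ns/call")
```

The absolute numbers from a loop like this are noisy (no warmup, no statistics), so only treat a large and repeatable gap as meaningful.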

I am also not sure whether fprof accounts for the time spent in NIFs. I
assume most of the time is spent in the regex engine, but if fprof does not
fully account for that, it could skew the measurements. But I am speculating
here, I truly don't know. :)

On Fri, Mar 15, 2024 at 8:31 AM Jan Krüger <jan.krue...@gmail.com> wrote:

> Alright. If you can't see it, then it must have been something in my
> environment. What I did when working on this was run fprof to identify
> potential performance problems, and the version check showed up as a
> substantial part of the time spent in the regex code. Is that a valid use
> of fprof in your opinion? Since we're running this in a very tight loop I
> also wanted to get rid of the Keyword.get calls when running regexes, so I
> swapped out Regex.run for :re.run, and that substantially improved the
> performance overall.
>
> I don't think I then went on to profile specifically whether removing the
> version check alone improves the performance. So all I have to back up
> that the version check is the root cause is fprof.
>
> On Friday, March 15, 2024 at 8:22:29 AM UTC+1 José Valim wrote:
>
>> The 5% also take into account the option processing and result handling.
>> The version check itself is a subset of that. I was not able to measure
>> sensible gains after removing it.
>>
>>> On Fri, Mar 15, 2024 at 12:23 PM 'marcel...@googlemail.com' via
>>> elixir-lang-core <elixir-l...@googlegroups.com> wrote:
>>>
>>>> The benchmark results I'm getting are indeed not as dramatic as the
>>>> fprof results, but on the other hand also more than the 5% mentioned in the
>>>> PR which introduced the check:
>>>> https://github.com/elixir-lang/elixir/pull/9040
>>>>
>>>> ```elixir
>>>> regex = ~r/^([a-z][a-z0-9\+\-\.]*):/i
>>>> re_pattern = regex.re_pattern
>>>>
>>>> Benchee.run(%{
>>>>   "Regex.run/2" => fn -> Regex.run(regex, "foo") end,
>>>>   ":re.run/3" => fn -> :re.run("foo", re_pattern, [{:capture, :all, :binary}]) end
>>>> })
>>>> ```
>>>>
>>>> ```
>>>> Name                  ips        average  deviation         median         99th %
>>>> :re.run/3          2.88 M      346.90 ns  ±3623.51%         333 ns         417 ns
>>>> Regex.run/2        1.98 M      504.74 ns  ±5851.21%         416 ns         542 ns
>>>>
>>>> Comparison:
>>>> :re.run/3          2.88 M
>>>> Regex.run/2        1.98 M - 1.46x slower +157.84 ns
>>>> ```
>>>> On Friday 15 March 2024 at 07:20:11 UTC+1 jan.k...@gmail.com wrote:
>>>>
>>>>> The difference was definitely measurable just in pure running time of
>>>>> the code, setting aside fprof. I'll post what I have after work today.
>>>>>
>>>>> On Thursday, March 14, 2024 at 10:21:25 PM UTC+1 José Valim wrote:
>>>>>
>>>>>> Do you have benchmarks or only the fprof results? fprof is not a
>>>>>> benchmarking tool: comparing fprof results from different code may be
>>>>>> misleading. Proper benchmarking is preferable. I am benchmarking locally
>>>>>> and I cannot measure any relevant difference even with the whole version
>>>>>> checking removed.
>>>>>>
>>>>>> On Thu, Mar 14, 2024 at 6:01 PM Jan Krüger <jan.k...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks a lot. I'm also happy to share our case, and my fprof
>>>>>>> results, if that helps. I am very sure that my Erlang and Elixir
>>>>>>> versions match on the machine where I've tested this. Replacing
>>>>>>> Regex.run with an identical call to :re.run should show the
>>>>>>> performance improvement I've mentioned. The regex we've tested this
>>>>>>> on is:
>>>>>>>
>>>>>>> ~r/^([a-z][a-z0-9\+\-\.]*):/i
>>>>>>>
>>>>>>> On Thursday, March 14, 2024 at 5:55:47 PM UTC+1
>>>>>>> marcel...@googlemail.com wrote:
>>>>>>>
>>>>>>>> I'm the maintainer of the RDF.ex library with the RDF.IRI module
>>>>>>>> mentioned in the OP. I can confirm that this fix doesn't affect the
>>>>>>>> problem, since we're actually not using `URI.parse/1` most of the
>>>>>>>> time (we use it only when dealing with relative URIs). Even in this
>>>>>>>> case the `Regex.version/0` call in `Regex.safe_run/3` (
>>>>>>>> https://github.com/elixir-lang/elixir/blob/b8fca42e58850b56f65d0fb8a2086f2636141f61/lib/elixir/lib/regex.ex#L533)
>>>>>>>> still performs the `:erlang.system_info/1` call.
>>>>>>>>
>>>>>>>> On Thursday 14 March 2024 at 17:15:40 UTC+1 jan.k...@gmail.com
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I read the commit, and I don't think it fixes what our actual
>>>>>>>>> problem was. See my comment above. The problem is the actual call to
>>>>>>>>> :re.version, not the recompilation of the regex.
>>>>>>>>>
>>>>>>>>> On Thursday, March 14, 2024 at 4:37:43 PM UTC+1 José Valim wrote:
>>>>>>>>>
>>>>>>>>>> I have pushed a fix to main. But also note we provide precompiled
>>>>>>>>>> Elixir versions per OTP version. Using a matching version will
>>>>>>>>>> always give you the best results and that's not only about regexes. :)
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 14, 2024 at 2:20 PM Jan Krüger <jan.k...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I've recently had to work on a code base that parses largish RDF
>>>>>>>>>>> XML files. Part of the code base does relatively simple regular
>>>>>>>>>>> expression matches, but since the files are large, it makes quite a
>>>>>>>>>>> lot of Regex.run calls. While profiling I noticed that there are
>>>>>>>>>>> callouts to :erlang.system_info, which fetches the PCRE version BEAM
>>>>>>>>>>> was compiled against.
>>>>>>>>>>>
>>>>>>>>>>> An example regular expression from the code base in question
>>>>>>>>>>> matches the scheme part of a URL. I've replaced Regex.run with
>>>>>>>>>>> Erlang's :re.run for testing purposes, and at least for this case,
>>>>>>>>>>> the performance gain is quite dramatic.
>>>>>>>>>>>
>>>>>>>>>>> Comparing fprof results:
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> RDF.IRI.scheme/1
>>>>>>>>>>> 1176473   30615.618    2354.355
>>>>>>>>>>> ---
>>>>>>>>>>> RDF.IRI.scheme/1
>>>>>>>>>>> 1176473    3531.955    2353.905
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> I found this thread in the Google group, which actually talks
>>>>>>>>>>> about the reasoning for fetching the version, and proposes an
>>>>>>>>>>> alternative.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://groups.google.com/g/elixir-lang-core/c/CgFdxIONvGg/m/HN9ryeVXAwAJ?pli=1
>>>>>>>>>>>
>>>>>>>>>>> Especially
>>>>>>>>>>>
>>>>>>>>>>> ```
>>>>>>>>>>> Taking a further look at the code, the issue with recompiling
>>>>>>>>>>> regexes on the fly is that it makes executing the regexes more
>>>>>>>>>>> expensive, as we need to compute the version on every execution. We
>>>>>>>>>>> could store the version in ETS but that would have performance
>>>>>>>>>>> issues. Storing in a persistent_term would be great, but at the
>>>>>>>>>>> moment we support Erlang/OTP 20+. Thoughts?
>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Since this has a fairly noticeable impact, at least on all tests
>>>>>>>>>>> I've run, I wanted to start a discussion about whether this could be
>>>>>>>>>>> implemented/improved now.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>> Google Groups "elixir-lang-core" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>> it, send an email to elixir-lang-co...@googlegroups.com.
>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/44d498c7-82a4-46d2-89be-7919400e0297n%40googlegroups.com
>>>>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/44d498c7-82a4-46d2-89be-7919400e0297n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>> .
>>>>>>>>>>>
