I read the commit, and I don't think it fixes our actual problem. See my 
comment above. The problem is the call to :re.version itself, not the 
recompilation of the regex.
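
To make that concrete, here is a minimal sketch of what caching the version 
could look like. It assumes an OTP release that has :persistent_term 
(21.3 or later for get/2). CachedReVersion is a hypothetical module name, not 
anything that exists in Elixir, and the tuple shape is only my reading of 
what Regex.version/0 returns:

```elixir
defmodule CachedReVersion do
  # Hypothetical helper, not part of Elixir or OTP.
  # Looks the PCRE version up once and keeps it in :persistent_term,
  # so later reads avoid calling :re.version/0 again.
  @key {__MODULE__, :re_version}

  def get do
    case :persistent_term.get(@key, nil) do
      nil ->
        # First call on this node: compute and store the value.
        # Assumption: this mirrors the tuple Regex.version/0 builds.
        version = {:re.version(), :erlang.system_info(:endian)}
        :persistent_term.put(@key, version)
        version

      version ->
        # Every later call is a plain persistent_term read.
        version
    end
  end
end
```

With something like this, the per-call comparison in Regex.run would read 
CachedReVersion.get() instead of recomputing the version every time. 
Whether that is actually cheaper would need to be measured.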

On Thursday, March 14, 2024 at 4:37:43 PM UTC+1 José Valim wrote:

> I have pushed a fix to main. But also note we provide precompiled Elixir 
> versions per OTP version. Using a matching version will always give you the 
> best results and that's not only about regexes. :)
>
> On Thu, Mar 14, 2024 at 2:20 PM Jan Krüger <jan.k...@gmail.com> wrote:
>
>> I've recently had to work on a code base that parses largish RDF XML 
>> files. Part of the code base does relatively simple regular expression 
>> matches, but since the files are large, that adds up to quite a lot of 
>> Regex.run calls. While profiling, I noticed that there are callouts to 
>> :erlang.system_info, which fetches the PCRE version BEAM was compiled 
>> against.
>>
>> An example regular expression from the code base in question matches the 
>> scheme part of a URL. I've replaced Regex.run with Erlang's :re.run for 
>> testing purposes, and at least for this case, the performance gain is 
>> quite dramatic.
>>
>> Comparing fprof results:
>>
>> ```
>> RDF.IRI.scheme/1    1176473    30615.618    2354.355
>> ---
>> RDF.IRI.scheme/1    1176473     3531.955    2353.905
>> ```
>>
>> I found this thread in the Google group, which actually talks about the 
>> reasoning for fetching the version, and proposes an alternative.
>>
>>
>> https://groups.google.com/g/elixir-lang-core/c/CgFdxIONvGg/m/HN9ryeVXAwAJ?pli=1
>>
>> Especially
>>
>> ```
>> Taking a further look at the code, the issue with recompiling regexes on 
>> the fly is that it makes executing the regexes more expensive, as we need 
>> to compute the version on every execution. We could store the version in 
>> ETS but that would have performance issues. Storing in a persistent_term 
>> would be great, but at the moment we support Erlang/OTP 20+. Thoughts?
>> ```
>>
>> Since this has a fairly noticeable impact, at least in all the tests I've 
>> run, I wanted to start a discussion about whether this could be 
>> implemented/improved now.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elixir-lang-core" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elixir-lang-co...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elixir-lang-core/44d498c7-82a4-46d2-89be-7919400e0297n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elixir-lang-core/44d498c7-82a4-46d2-89be-7919400e0297n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
