[Issue 16457] std.regex postprocesses ctRegex every time at runtime

2022-12-17 Thread d-bugmail--- via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=16457

Iain Buclaw  changed:

   What|Removed |Added

   Priority|P1  |P4

--


[Issue 16457] std.regex postprocesses ctRegex every time at runtime

2017-09-08 Thread via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=16457

Dmitry Olshansky  changed:

   What|Removed |Added

   Assignee|nob...@puremagic.com|dmitry.o...@gmail.com

--


[Issue 16457] std.regex postprocesses ctRegex every time at runtime

2017-07-21 Thread via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=16457

Vladimir Panteleev  changed:

   What|Removed |Added

   Hardware|x86_64  |All
   See Also||https://issues.dlang.org/sh
   ||ow_bug.cgi?id=13532
 OS|Linux   |All

--- Comment #3 from Vladimir Panteleev  ---
I think the obvious reasonable improvement here is to improve the
documentation, and mention to not use ctRegex with enum. Not sure much can be
done in Phobos to work around an underlying language / compiler implementation
issue.

--


[Issue 16457] std.regex postprocesses ctRegex every time at runtime

2016-09-03 Thread via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=16457

--- Comment #2 from greenify  ---
thanks a lot for looking at this so quickly :)

> The problem is enum re = ... line. Enum means ctRegex!`((c)(s)?)?ti` is 
> copy-pasted at the place of usage, basically reconstructing regex on each 
> loop iteration b/c compile-time version can't cache compiled patterns.

Okay that makes partially sense, but to give some background I found this
problem because the D changelog generation crashed on my machine (16G) while
parsing a 300kB log file. The dummy file is a dustmited & reduced version of
the changelog generator:

https://github.com/dlang/tools/blob/master/changed.d#L77

And using `enum ... = ctRegex!...` seems to be a very common pattern:

https://github.com/search?q=enum+ctRegex=Code=%E2%9C%93

> Replace enum with static and all should be good.

Unfortunately I can still observe the same behavior, I also tried to move
ctRegex to the matchAll.
Funnily with regex I don't get this allocation / out of memory -problem.

FYI static immutable doens't work:

foo.d(10): Error: template std.regex.matchAll cannot deduce function from
argument types !()(string, immutable(StaticRegex!char)), candidates are:
/usr/include/dlang/dmd/std/regex/package.d(860):std.regex.matchAll(R,
RegEx)(R input, RegEx re) if (isSomeString!R && is(RegEx ==
Regex!(BasicElementOf!R)))
/usr/include/dlang/dmd/std/regex/package.d(868):std.regex.matchAll(R,
String)(R input, String re) if (isSomeString!R && isSomeString!String)
/usr/include/dlang/dmd/std/regex/package.d(875):std.regex.matchAll(R,
RegEx)(R input, RegEx re) if (isSomeString!R && is(RegEx ==
StaticRegex!(BasicElementOf!R)))

--


[Issue 16457] std.regex postprocesses ctRegex every time at runtime

2016-09-03 Thread via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=16457

Dmitry Olshansky  changed:

   What|Removed |Added

 CC||dmitry.o...@gmail.com

--- Comment #1 from Dmitry Olshansky  ---
(In reply to greensunny12 from comment #0)
> Consider the following program, it will crash as it allocates _a lot_ of
> memory before the garbage can be collected. 
> 
> void main()
> {
> import std.regex;
> enum re = ctRegex!(`((c)(s)?)?ti`);
> import core.memory : GC;
> string text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
> 
> foreach (i; 0..500_000_000)
> {
> if (auto m = matchAll(text, re)) {}
> //if (i % 1_000_000)
> //GC.collect();
> }
> }
> 
> On my machine (16G) it crashes at about 5M iterations.
> The GC profile finds two hotspots (here 2M iterations): 
> 
>  204800   200 uint[] D main
> std/regex/internal/parser.d:1607
>   18400   200 std.regex.internal.ir.Bytecode[] D main
> std/array.d:852
> 
> (the latter is insertInPlace)
> 
> After looking at the code it seems pretty weird, because "postprocess"
> should be called while constructing the RegEx and once only.

The problem is enum re = ... line. Enum means ctRegex!`((c)(s)?)?ti` is
copy-pasted at the place of usage, basically reconstructing regex on each loop
iteration b/c compile-time version can't cache compiled patterns.

Replace enum with static and all should be good.

--


[Issue 16457] std.regex postprocesses ctRegex every time at runtime

2016-09-01 Thread via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=16457

greenify  changed:

   What|Removed |Added

 CC||greeen...@gmail.com
Summary|std.regex postprocesses at  |std.regex postprocesses
   |ctRegex every time at   |ctRegex every time at
   |runtime |runtime

--