Re: Reading .txt File into String and Matching with RegEx

2023-12-11 Thread BoQsc via Digitalmars-d-learn
Matches function declarations and captures function names from 
`.d` Source Code file



**regexcapture.d**

```
import std.stdio : writeln;
import std.regex : matchAll, regex;
import std.file  : read;

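void main(){
    // read() returns void[]; the cast assumes the file holds valid UTF-8 text.
    string input = cast(string)read("sourcecode.d");

    // Group 1 is the identifier directly in front of a parenthesised parameter list.
    foreach(match; matchAll(input, regex(r"\b([A-Za-z_]\w*)\s*\([^)]*\)\s*", "g"))){
        writeln(match.captures()[1]);
    }
}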
```

**Input(sourcecode.d)**
```
BOOL WaitNamedPipeA(LPCSTR, DWORD);
BOOL WaitNamedPipeW(LPCWSTR, DWORD);
BOOL WinLoadTrustProvider(GUID*);
BOOL WriteFile(HANDLE, PCVOID, DWORD, PDWORD, LPOVERLAPPED);
BOOL WriteFileEx(HANDLE, PCVOID, DWORD, LPOVERLAPPED, 
LPOVERLAPPED_COMPLETION_ROUTINE);

BOOL WritePrivateProfileSectionA(LPCSTR, LPCSTR, LPCSTR);
BOOL WritePrivateProfileSectionW(LPCWSTR, LPCWSTR, LPCWSTR);
BOOL WritePrivateProfileStringA(LPCSTR, LPCSTR, LPCSTR, 
LPCSTR);
BOOL WritePrivateProfileStringW(LPCWSTR, LPCWSTR, LPCWSTR, 
LPCWSTR);

```
Note: This small input excerpt was taken from a real source code 
file: 
https://github.com/dlang/dmd/blob/master/druntime/src/core/sys/windows/winbase.d#L2069-L2078


**Output**
```

C:\Users\Windows10\Documents\matchtest>rdmd regexcapture.d
WaitNamedPipeA
WaitNamedPipeW
WinLoadTrustProvider
WriteFile
WriteFileEx
WritePrivateProfileSectionA
WritePrivateProfileSectionW
WritePrivateProfileStringA
WritePrivateProfileStringW
```

---
Relevant links:
https://dlang.org/phobos/std_regex.html#regex
https://dlang.org/phobos/std_regex.html#.RegexMatch.captures
https://regexr.com/


Re: Reading .txt File into String and Matching with RegEx

2023-12-11 Thread BoQsc via Digitalmars-d-learn

On Monday, 11 December 2023 at 05:18:45 UTC, thinkunix wrote:

BoQsc via Digitalmars-d-learn wrote:
This is something I've searched on the forum and couldn't find 
exact answer.


TLDR: `r"^."` is matching the very first two character in the 
`input` string.


Don't you need two dots to match two characters?
Each dot being the regex to match a single character,
so `r"^.."` instead of `r"^."` to get the first two characters.

When I run your program (on linux with rdmd from DMD 2.106.0), 
I get:


[["H"]]


Yeah, that's true, my mistake, forgot to update the snippet and 
note properly. Thanks!


```
import std.stdio : writeln;
import std.regex : matchAll;
import std.file  : read;

void main(){

string input = cast(string)read("example.txt");
writeln(matchAll(input, r"^.."));

}
```


Re: Reading .txt File into String and Matching with RegEx

2023-12-10 Thread thinkunix via Digitalmars-d-learn

BoQsc via Digitalmars-d-learn wrote:
This is something I've searched on the forum and couldn't find exact 
answer.


TLDR: `r"^."` is matching the very first two character in the `input` 
string.


Don't you need two dots to match two characters?
Each dot being the regex to match a single character,
so `r"^.."` instead of `r"^."` to get the first two characters.

When I run your program (on linux with rdmd from DMD 2.106.0), I get:

[["H"]]
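
The difference between the two patterns can be checked without a file. A minimal sketch, assuming the input is the same `HelloWorld` text as in `example.txt`; the helper name `firstHit` is made up for illustration:

```d
import std.regex : matchFirst;
import std.stdio : writeln;

/// Hypothetical helper: text of the first match, or null if nothing matches.
string firstHit(string input, string pattern)
{
    auto m = matchFirst(input, pattern);
    return m.empty ? null : m.hit;
}

void main()
{
    writeln(firstHit("HelloWorld", `^.`));  // one dot: one character after the start
    writeln(firstHit("HelloWorld", `^..`)); // two dots: two characters after the start
}
```

Each `.` consumes exactly one character, so the number of dots decides how much of the start of the string is returned.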


Reading .txt File into String and Matching with RegEx

2023-12-10 Thread BoQsc via Digitalmars-d-learn
I searched the forum for this and couldn't find an 
exact answer.


TLDR: `r"^."` is matching the very first two characters in the 
`input` string.


**matchtest.d**
```
import std.stdio : writeln;
import std.regex : matchAll;
import std.file  : read;

void main(){

string input = cast(string)read("example.txt");
writeln(matchAll(input, r"^."));

}
```

**Input(example.txt)**
```
HelloWorld
```

**Output**
```
rdmd matchtest.d
[["He"]]
```

https://dlang.org/phobos/std_regex.html#matchAll
https://dlang.org/library/std/file/read.html


Re: regex matching but not capturing

2023-04-06 Thread Ali Çehreli via Digitalmars-d-learn

On 4/6/23 11:08, Paul wrote:
ways to access 
those repetitive ", cc" s on the end.  I don't think my regex is 
capturing them.


Some internets think you are in parser territory:


https://stackoverflow.com/questions/1407435/how-do-i-regex-match-with-grouping-with-unknown-number-of-groups

Ali



Re: regex matching but not capturing

2023-04-06 Thread Paul via Digitalmars-d-learn

On Thursday, 6 April 2023 at 16:27:23 UTC, Alex Bryan wrote:

My understanding from browsing the documentation is that matchAll 
returns a range of Captures (struct documented at 
https://dlang.org/phobos/std_regex.html#Captures). In your foreach 
loop I think c[0] will contain the current full match (the current 
line that matches), c[1] will contain the first captured group 
("AA" for the first line), c[2] will contain "0" for the first 
line, etc.




Thanks Alex.  Read some more and tried some different ways to 
access those repetitive ", cc" s on the end.  I don't think my 
regex is capturing them.





Re: regex matching but not capturing

2023-04-06 Thread Alex Bryan via Digitalmars-d-learn

On Thursday, 6 April 2023 at 15:52:16 UTC, Paul wrote:
My regex is matching but doesn't seem to be capturing.  You may 
recognize this from the AOC challenges.


file contains...
**Valve AA has flow rate=0; tunnels lead to valves DD, II, BB**
**Valve BB has flow rate=13; tunnels lead to valves CC, AA**
**Valve CC has flow rate=2; tunnels lead to valves DD, BB**
**... etc**

```d
auto s = readText(filename);
auto ctr = ctRegex!(`Valve ([A-Z]{2}).*=(\d+).+valves(,* [A-Z]{2})+`);

foreach(c;matchAll(s, ctr)) {
fo.writeln(c);
}
```

produces...
**["Valve AA has flow rate=0; tunnels lead to valves DD, II, 
BB", "AA", "0", ", BB"]**
**["Valve BB has flow rate=13; tunnels lead to valves CC, AA", 
"BB", "13", ", AA"]**
**["Valve CC has flow rate=2; tunnels lead to valves DD, BB", 
"CC", "2", ", BB"]**


what I'm attempting to achieve and expect is, for instance, on 
the 1st line...
[lead to valves DD, II, BB", "AA", "0", **", DD", ", II", 
", BB"]**


My understanding from browsing the documentation is that matchAll 
returns a range of Captures (struct documented at 
https://dlang.org/phobos/std_regex.html#Captures). In your foreach 
loop I think c[0] will contain the current full match (the current 
line that matches), c[1] will contain the first captured group 
("AA" for the first line), c[2] will contain "0" for the first 
line, etc.


There's probably logic somewhere that decides, when a Captures is 
passed as an argument to writeln, to just print the full match 
(the line that matches), making writeln(capture) the same as 
writeln(capture[0]).


regex matching but not capturing

2023-04-06 Thread Paul via Digitalmars-d-learn
My regex is matching but doesn't seem to be capturing.  You may 
recognize this from the AOC challenges.


file contains...
**Valve AA has flow rate=0; tunnels lead to valves DD, II, BB**
**Valve BB has flow rate=13; tunnels lead to valves CC, AA**
**Valve CC has flow rate=2; tunnels lead to valves DD, BB**
**... etc**

```d
auto s = readText(filename);
auto ctr = ctRegex!(`Valve ([A-Z]{2}).*=(\d+).+valves(,* [A-Z]{2})+`);

foreach(c;matchAll(s, ctr)) {
fo.writeln(c);
}
```

produces...
**["Valve AA has flow rate=0; tunnels lead to valves DD, II, BB", 
"AA", "0", ", BB"]**
**["Valve BB has flow rate=13; tunnels lead to valves CC, AA", 
"BB", "13", ", AA"]**
**["Valve CC has flow rate=2; tunnels lead to valves DD, BB", 
"CC", "2", ", BB"]**


what I'm attempting to achieve and expect is, for instance, on 
the 1st line...
[lead to valves DD, II, BB", "AA", "0", **", DD", ", II", ", 
BB"]**
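
The output above shows the usual behaviour of a repeated group: `(...)+` is still a single capture slot, and it keeps only the text of its last iteration (`", BB"`). One possible way around it, sketched here under assumptions, is to capture the whole list as one group and split it afterwards. The `Valve` struct and the `parseValve` name are hypothetical, and the pattern expects the plural "valves" exactly as the original one does:

```d
import std.array : split;
import std.conv : to;
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical record type for one input line.
struct Valve
{
    string name;
    int rate;
    string[] tunnels;
}

/// Parses one line; group 3 holds the complete comma-separated list.
Valve parseValve(string line)
{
    auto re = regex(`Valve ([A-Z]{2}).*=(\d+).+valves ([A-Z]{2}(?:, [A-Z]{2})*)`);
    auto c = matchFirst(line, re);
    if (c.empty)
        throw new Exception("line does not match: " ~ line);
    // Split the single captured list into its individual valve names.
    return Valve(c[1], c[2].to!int, c[3].split(", "));
}

void main()
{
    auto v = parseValve("Valve AA has flow rate=0; tunnels lead to valves DD, II, BB");
    writeln(v.name, " ", v.rate, " ", v.tunnels);
}
```

An alternative with the same effect would be a second `matchAll` with `[A-Z]{2}` over the captured list.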


Re: Read a text file at once for regex searching

2023-03-20 Thread Paul via Digitalmars-d-learn

On Monday, 20 March 2023 at 17:47:19 UTC, Adam D Ruppe wrote:

On Monday, 20 March 2023 at 17:42:17 UTC, Paul wrote:

Do we have some such function in our std library?


Try

static import std.file;
string s = std.file.readText("filename.txt");


http://phobos.dpldocs.info/std.file.readText.html


Thanks Adam.


Re: Read a text file at once for regex searching

2023-03-20 Thread Adam D Ruppe via Digitalmars-d-learn

On Monday, 20 March 2023 at 17:42:17 UTC, Paul wrote:

Do we have some such function in our std library?


Try

static import std.file;
string s = std.file.readText("filename.txt");


http://phobos.dpldocs.info/std.file.readText.html
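
A minimal sketch of how `readText` combines with a regex search. The file name and the `numbersIn` helper are made up for illustration; the demo writes its own small input file so that it is self-contained:

```d
import std.file : readText, remove, write;
import std.regex : matchAll, regex;
import std.stdio : writeln;

/// Hypothetical helper: every run of digits found in the file at `path`.
string[] numbersIn(string path)
{
    string text = readText(path); // the whole file at once, validated as UTF
    string[] hits;
    foreach (m; matchAll(text, regex(`\d+`)))
        hits ~= m.hit;
    return hits;
}

void main()
{
    enum path = "readtext_demo.txt"; // made-up file name
    write(path, "alpha 1\nbeta 22\n");
    scope (exit) remove(path);
    writeln(numbersIn(path));
}
```

Unlike `cast(string) read(...)`, `readText` checks that the content is valid UTF before handing it back as a string.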




Read a text file at once for regex searching

2023-03-20 Thread Paul via Digitalmars-d-learn
I've been looking through our Library documentation and having 
trouble finding what I want.  **I'd like to read a text file in 
all at once** and do some searching and analytics on it instead 
of reading it bit by bit or line by line.  Do we have some such 
function in our std library?


Thanks in advance.


Re: How to select the regex that matches the first token of a string?

2021-07-03 Thread vnr via Digitalmars-d-learn

On Saturday, 3 July 2021 at 09:28:32 UTC, user1234 wrote:

On Saturday, 3 July 2021 at 09:05:28 UTC, vnr wrote:

Hello,

I am trying to make a small generic lexer that bases its token 
analysis on regular expressions. The principle I have in mind 
is to define a token type table with its corresponding regular 
expression, here is the code I currently have:


[...]


storing the regex in a token is an antipattern.


Thank you for the answer,

I know it's not clean. I'll modify my code to define one table of 
token types with their regular expressions and a second table of 
token types with what was matched; the former defines the lexer, 
the latter is its result.


But for now and to keep it simple, I did everything in one.


Re: How to select the regex that matches the first token of a string?

2021-07-03 Thread user1234 via Digitalmars-d-learn

On Saturday, 3 July 2021 at 09:05:28 UTC, vnr wrote:

Hello,

I am trying to make a small generic lexer that bases its token 
analysis on regular expressions. The principle I have in mind 
is to define a token type table with its corresponding regular 
expression, here is the code I currently have:


[...]


storing the regex in a token is an antipattern.


How to select the regex that matches the first token of a string?

2021-07-03 Thread vnr via Digitalmars-d-learn

Hello,

I am trying to make a small generic lexer that bases its token 
analysis on regular expressions. The principle I have in mind is 
to define a token type table with its corresponding regular 
expression, here is the code I currently have:


```d
import std.regex;

/// ditto
struct Token
{
/// The token type
string type;
/// The regex to match the token
Regex!char re;
/// The matched string
string matched = null;
}

/// Function to find the right token in the given table
Token find(Token[] table, const(Captures!string delegate(Token) 
pure @safe) fn)

{
foreach (token; table)
if (fn(token)) return token;
return Token("", regex(r""));
}

/// The lexer class
class Lexer
{
private Token[] tokens;

/// ditto
this(Token[] tkns = [])
{
this.tokens = tkns;
}


override string toString() const
{
import std.algorithm : map;
import std.conv : to;
import std.format : format;

return to!string
(this.tokens.map!(tok =>
format("(%s, %s)", tok.type, 
tok.matched)));

}

// Others useful methods ...
}

/// My token table
static Token[] table =
[ Token("NUMBER", regex(r"(?:\d+(?:\.\d*)?|\.\d+)"))
    , Token("MINS", regex(r"\-"))
, Token("PLUS", regex(r"\+")) ];

/// Build a new lexer
Lexer lex(string text)
{
Token[] result = [];

while (text.length > 0)
{
Token token = table.find((Token t) => matchFirst(text, t.re));
const string tmatch = matchFirst(text, token.re)[0];

result ~= Token(token.type, token.re, tmatch);
text = text[tmatch.length .. $];
}
return new Lexer(result);
}

void main()
{
import std.stdio : writeln;

const auto l = lex("3+2");
writeln(l);
}

```

When I run this program, it gives the following sequence:

```
["(NUMBER, 3)", "(NUMBER, 2)", "(NUMBER, 2)"]
```

While I want this:

```
["(NUMBER, 3)", "(PLUS, +)", "(NUMBER, 2)"]
```

The problem seems to come from the `find` function, which returns 
the first regex that matches anywhere in the text and not the regex 
that matches at the start of the text (I hope I am clear enough 😅).


I'm not used to manipulating regex, especially in D, so I'm not 
sure how to consider a solution to this problem.


I thank you in advance for your help.
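
One possible way to get the expected sequence, sketched under assumptions rather than taken from the thread: anchor every pattern with `^` so that a rule only counts when it matches at the start of the remaining text, then cut the match off and repeat. The `Rule` and `Lexeme` types and the `defaultRules` function are hypothetical names; keeping the regex out of the result type also follows the remark about not storing it in the token.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

struct Rule   { string type; Regex!char re; } // defines the lexer
struct Lexeme { string type; string text; }   // result of lexing

/// Hypothetical rule table; every pattern is anchored with ^.
Rule[] defaultRules()
{
    return [
        Rule("NUMBER", regex(`^(?:\d+(?:\.\d*)?|\.\d+)`)),
        Rule("MINUS",  regex(`^-`)),
        Rule("PLUS",   regex(`^\+`)),
    ];
}

Lexeme[] lex(string text, Rule[] rules)
{
    Lexeme[] result;
    scan: while (text.length > 0)
    {
        foreach (rule; rules)
        {
            auto m = matchFirst(text, rule.re);
            // Accept only non-empty matches so the loop always advances.
            if (!m.empty && m.hit.length > 0)
            {
                result ~= Lexeme(rule.type, m.hit);
                text = m.post; // continue with what follows the match
                continue scan;
            }
        }
        throw new Exception("no rule matches at: " ~ text);
    }
    return result;
}

void main()
{
    foreach (l; lex("3+2", defaultRules()))
        writeln(l.type, " ", l.text);
}
```

With unanchored patterns, `NUMBER` matches the `2` further along in `+2`, which is why the original program never selects `PLUS`.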




Re: find regex in backward direction ?

2020-12-19 Thread Виталий Фадеев via Digitalmars-d-learn

On Sunday, 20 December 2020 at 04:33:21 UTC, Виталий Фадеев wrote:

On Saturday, 19 December 2020 at 23:16:18 UTC, kdevel wrote:
On Saturday, 19 December 2020 at 12:52:54 UTC, Виталий Фадеев 
wrote:

...


"retro" possible when using simple expression "abc".
For complex "ab\w" or "(?Pregex)" should be parsing: [ "a", 
"b", "\w" ],  [ "(", "?", "P", "", "regex", ")"]..., i 
think.


up.


Re: find regex in backward direction ?

2020-12-19 Thread Виталий Фадеев via Digitalmars-d-learn

On Saturday, 19 December 2020 at 23:16:18 UTC, kdevel wrote:
On Saturday, 19 December 2020 at 12:52:54 UTC, Виталий Фадеев 
wrote:

Goal:
size_t pos = findRegexBackward( r"abc"d );
assert( pos == 4 );



module LastOccurrence;

size_t findRegexBackward_1 (dstring s, dstring pattern)
{
   import std.regex : matchAll;
   auto results = matchAll (s, pattern);
   if (results.empty)
  throw new Exception ("could not match");
   size_t siz;
   foreach (rm; results)
  siz = rm.pre.length;
   return siz;
}

size_t findRegexBackward_2 (dstring s, dstring pattern)
// this does not work with irreversible patterns ...
{
   import std.regex : matchFirst;
   import std.array : array;
   import std.range: retro;
   auto result = matchFirst (s.retro.array, 
pattern.retro.array);

   if (result.empty)
  throw new Exception ("could not match");
   return result.post.length;
}

unittest {
   import std.exception : assertThrown;
   static foreach (f; [&findRegexBackward_1, 
&findRegexBackward_2]) {

  assert (f ("abc3abc7", r""d) == 8);
  assert (f ("abc3abc7", r"abc"d) == 4);
  assertThrown (f ("abc3abc7", r"abx"d));
  assert (f ("abababababab", r"ab"d) == 10);
   }
}


Thanks.
But, not perfect.

We can't use reverse, because "ab\w" will become "w\ba" (expected 
to match "abc"; reversed, that is "cba").



size_t findRegexBackward_2 (dstring s, dstring pattern)
...
   assert (f ("abc3abc7", r"ab\w"d) == 4);
...


Of course, I am using matchAll, but it scans the whole text in the 
forward direction.



  size_t findRegexBackward_1 (dstring s, dstring pattern)


/** */
size_t findRegexBackwardMatchCase( dstring s, dstring needle, 
out size_t matchedLength )

{
auto matches = matchAll( s, needle );
if ( matches.empty )
{
return -1;
}
else
{
auto last = matches.front;
foreach ( m; matches )
{
last = m;
}
matchedLength = last.hit.length;
return last.pre.length;
}
}

Thanks!
Fastest solution wanted!

Maybe something like the "RightToLeft" option in .NET...

https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regexoptions?view=net-5.0#System_Text_RegularExpressions_RegexOptions_RightToLeft

But what about Linux? Are the MS regex and the Linux regex identical?



Re: find regex in backward direction ?

2020-12-19 Thread kdevel via Digitalmars-d-learn
On Saturday, 19 December 2020 at 12:52:54 UTC, Виталий Фадеев 
wrote:

Goal:
size_t pos = findRegexBackward( r"abc"d );
assert( pos == 4 );



module LastOccurrence;

size_t findRegexBackward_1 (dstring s, dstring pattern)
{
   import std.regex : matchAll;
   auto results = matchAll (s, pattern);
   if (results.empty)
  throw new Exception ("could not match");
   size_t siz;
   foreach (rm; results)
  siz = rm.pre.length;
   return siz;
}

size_t findRegexBackward_2 (dstring s, dstring pattern)
// this does not work with irreversible patterns ...
{
   import std.regex : matchFirst;
   import std.array : array;
   import std.range: retro;
   auto result = matchFirst (s.retro.array, pattern.retro.array);
   if (result.empty)
  throw new Exception ("could not match");
   return result.post.length;
}

unittest {
   import std.exception : assertThrown;
   static foreach (f; [&findRegexBackward_1, 
&findRegexBackward_2]) {

  assert (f ("abc3abc7", r""d) == 8);
  assert (f ("abc3abc7", r"abc"d) == 4);
  assertThrown (f ("abc3abc7", r"abx"d));
  assert (f ("abababababab", r"ab"d) == 10);
   }
}


find regex in backward direction ?

2020-12-19 Thread Виталий Фадеев via Digitalmars-d-learn

We have:
dstring s = "abc3abc7";

Source:
https://run.dlang.io/is/PtjN4T

Goal:
size_t pos = findRegexBackward( r"abc"d );
assert( pos == 4 );


How to find regex in backward direction ?



Re: regex: ] in a character class

2020-12-12 Thread Tobias Pankrath via Digitalmars-d-learn

On Saturday, 12 December 2020 at 12:03:49 UTC, kdevel wrote:

I don't have a suggestion for better wording yet.

[1] https://dlang.org/phobos/std_regex.html


This [1] is how I would word it.

[1] https://github.com/dlang/phobos/pull/7724



Re: regex: ] in a character class

2020-12-12 Thread Tobias Pankrath via Digitalmars-d-learn

On Saturday, 12 December 2020 at 12:03:49 UTC, kdevel wrote:

In some situations a ] must be escaped as in

   auto re = regex(`^[a\]]$`); // match a and ] only

Unfortunately dmd/phobos does not warn if you forget the 
backslash:


   auto re = regex(`^[a]]$`); // match a]

This leads me to the documentation [1] which says

   \c where c is one of [|*+?() Matches the character c itself.

] must be added to this list since \] obviously matches ]. 
Additionally

the statement

   any character except [{|*+?()^$ Matches the character 
itself.


is not true since ] does not match itself when ] denotes the 
end of
a character class. I don't have a suggestion for better wording 
yet.


[1] https://dlang.org/phobos/std_regex.html


As I understand it, the statement is indeed true and a regex 
`]]]` would match and only match the string `]]]`. What should be 
added somewhere is


  Inside character classes the character ']' has to be written 
as '\]'.
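
The two patterns from the original post can be compared directly. A small sketch; the `matches` helper is a made-up name:

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical helper: does `pattern` match `input`?
bool matches(string input, string pattern)
{
    return !matchFirst(input, regex(pattern)).empty;
}

void main()
{
    // Escaped: one character class containing 'a' and ']'.
    writeln(matches("]",  `^[a\]]$`));
    writeln(matches("a]", `^[a\]]$`));
    // Unescaped: the class [a] followed by a literal ']'.
    writeln(matches("a]", `^[a]]$`));
    writeln(matches("]",  `^[a]]$`));
}
```

Both patterns compile without any warning, so only a test like this reveals which of the two was actually written.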


regex: ] in a character class

2020-12-12 Thread kdevel via Digitalmars-d-learn

In some situations a ] must be escaped as in

   auto re = regex(`^[a\]]$`); // match a and ] only

Unfortunately dmd/phobos does not warn if you forget the 
backslash:


   auto re = regex(`^[a]]$`); // match a]

This leads me to the documentation [1] which says

   \c where c is one of [|*+?() Matches the character c itself.

] must be added to this list since \] obviously matches ]. 
Additionally

the statement

   any character except [{|*+?()^$ Matches the character 
itself.


is not true since ] does not match itself when ] denotes the end 
of
a character class. I don't have a suggestion for better wording 
yet.


[1] https://dlang.org/phobos/std_regex.html


Re: Regex and manipulating files

2020-11-16 Thread Jack via Digitalmars-d-learn

On Monday, 16 November 2020 at 10:51:51 UTC, Bloris wrote:
I have to convert a Linux dash script because it is too slow, and 
I decided to do it in D. I'm totally new to this and I think it 
could be a good exercise to learn this language.


The shell script does some simple jobs like:
0) Run the script with some options
1) sed/grep regex to catch a portion of a file.
For example: it finds the line that match "1234" and take 
all the lines until the line that match "abcd".

2) sed regex to catch some strings
For example: "abc sdfs#=8 // some text" i've to take "8" 
and "some text"

3) Creates dirs and copy files
4) Add specific char to a specific column position at every row 
of a file

Original file:
abcdefghij
1234567890
c34vt59erj
04jèoàòr4t
14sdf7g784

Edited file:
ab;cde;f;g;hij
12;345;6;7;890
c3;4vt;5;9;erj
04;jèo;à;ò;r4t
14;sdf;7;g;784

I would like to know what could be the best approach i would 
have to take with D to write simple, elegant and fast code, 
scanning files with more than 3000+ columns per line.


Thank you,
Loris


For regex you can use the std.regex module: 
https://dlang.org/phobos/std_regex.html
For IO stuff (reading files, creating folders, etc.): 
https://devdocs.io/d/std_stdio
I'm not sure about performance; if you find it slow, maybe 
there is something better at https://code.dlang.org/. It also 
depends on your algorithm/code, of course.
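
Two of the listed jobs (2 and 4) can be sketched with Phobos alone. This is an illustration under assumptions, not a tuned solution: the function names are made up, the pattern is guessed from the single example line, and the column positions 2, 5, 6, 7 are read off the "Edited file" sample.

```d
import std.array : appender;
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Job 2: from "abc sdfs#=8 // some text" take "8" and "some text".
string[] valueAndComment(string line)
{
    auto c = matchFirst(line, regex(`#=(\d+)\s*//\s*(.*)`));
    if (c.empty)
        throw new Exception("unexpected line: " ~ line);
    return [c[1], c[2]];
}

/// Job 4: put `sep` after the given 1-based columns (sorted ascending).
/// Columns are counted in code points, so 'è' or 'ò' count as one column.
string insertAt(string line, const size_t[] cols, dchar sep)
{
    auto result = appender!string();
    size_t col = 0;
    size_t next = 0;
    foreach (dchar ch; line) // decodes UTF-8 one code point at a time
    {
        result.put(ch);
        ++col;
        if (next < cols.length && col == cols[next])
        {
            result.put(sep);
            ++next;
        }
    }
    return result.data;
}

void main()
{
    writeln(valueAndComment("abc sdfs#=8 // some text"));
    size_t[] cols = [2, 5, 6, 7];
    writeln(insertAt("abcdefghij", cols, ';'));
    writeln(insertAt("04jèoàòr4t", cols, ';'));
}
```

For very long lines the single pass with an appender avoids repeated string concatenation; whether that is fast enough for 3000+ columns would have to be measured.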


Regex and manipulating files

2020-11-16 Thread Bloris via Digitalmars-d-learn
I have to convert a Linux dash script because it is too slow, and I 
decided to do it in D. I'm totally new to this and I think it 
could be a good exercise to learn this language.


The shell script does some simple jobs like:
0) Run the script with some options
1) sed/grep regex to catch a portion of a file.
For example: it finds the line that match "1234" and take all 
the lines until the line that match "abcd".

2) sed regex to catch some strings
For example: "abc sdfs#=8 // some text" i've to take "8" and 
"some text"

3) Creates dirs and copy files
4) Add specific char to a specific column position at every row 
of a file

Original file:
abcdefghij
1234567890
c34vt59erj
04jèoàòr4t
14sdf7g784

Edited file:
ab;cde;f;g;hij
12;345;6;7;890
c3;4vt;5;9;erj
04;jèo;à;ò;r4t
14;sdf;7;g;784

I would like to know what could be the best approach i would have 
to take with D to write simple, elegant and fast code, scanning 
files with more than 3000+ columns per line.


Thank you,
Loris


Re: Regex split ignoore empty and whitespace

2020-02-20 Thread Ali Çehreli via Digitalmars-d-learn

On 2/20/20 4:46 PM, Ali Çehreli wrote:

>auto range = std.regex.splitter!(No.keepSeparators)(l,
> ctRegex!`[\s-\)\(\.]+`);

After realizing that No.keepSeparators is the default value anyway, I 
tried 'split' and it worked the way you wanted. So, perhaps all you 
needed was that extra '+' in the regex pattern:


  std.regex.split(l, ctRegex!`[\s-\)\(\.]+`)

Ali




Re: Regex split ignoore empty and whitespace

2020-02-20 Thread Ali Çehreli via Digitalmars-d-learn
On 2/20/20 2:02 PM, AlphaPurned wrote:> std.regex.split(l, 
ctRegex!`[\s-\)\(\.]`);

>
> I'm trying too split a string on spaces and stuff... but it is returning
> empty strings and other matches(e.g., ()).
>
> I realize I can delete afterwards but is there a direct way from split
> or ctRegex?

It turns out, split uses splitter, which is more capable, like allowing 
to say "do not keep the separators". The difference is, splitter returns 
a range, so I called .array to give you an array but it's not necessary.


I took the liberty of adding a '+' to your pattern, which may not be useful in 
your case:


import std.regex;
import std.stdio;
import std.typecons : No, Yes;
import std.array;

void main() {
  auto l = "hello world\t  and  moon";
  auto range = std.regex.splitter!(No.keepSeparators)(l, 
ctRegex!`[\s-\)\(\.]+`);

  auto array = range.array;

  writeln(range);
  writeln(array);
}

Both lines print ["hello", "world", "and", "moon"].

Ali



Regex split ignoore empty and whitespace

2020-02-20 Thread AlphaPurned via Digitalmars-d-learn

std.regex.split(l, ctRegex!`[\s-\)\(\.]`);

I'm trying to split a string on spaces and stuff... but it is 
returning empty strings and other matches (e.g., ()).


I realize I can delete afterwards but is there a direct way from 
split or ctRegex?


Re: Error on using regex in dmd v2.088.1

2020-02-03 Thread Andrea Fontana via Digitalmars-d-learn

On Monday, 3 February 2020 at 07:11:34 UTC, Dharmil Patel wrote:

On Monday, 3 February 2020 at 07:03:03 UTC, Dharmil Patel wrote:

In my code I am using regex like this:

   auto rgxComma = regex(r",");

On compiling with dmd v2.076.1, it compiles successfully, but 
on compiling with dmd v2.088.1, I am getting lots of errors 
like:


/src/phobos/std/regex/internal/thompson.d-mixin-836(837): 
Error: template instance 
std.regex.internal.thompson.ThompsonOps!(EngineType!(char, 
Input!char), State, true).op!cast(IR)164u error instantiating


Can someone please help me solve this error?

Thanks
Dharmil


Hi Dharmil!
It works for me using 2.088.1 with docker.
Are you sure your installation is ok?

Andrea


Re: Error on using regex in dmd v2.088.1

2020-02-02 Thread Dharmil Patel via Digitalmars-d-learn

On Monday, 3 February 2020 at 07:03:03 UTC, Dharmil Patel wrote:

In my code I am using regex like this:

   auto rgxComma = regex(r",");

On compiling with dmd v2.076.1, it compiles successfully, but 
on compiling with dmd v2.088.1, I am getting lots of errors 
like:


/src/phobos/std/regex/internal/thompson.d-mixin-836(837): 
Error: template instance 
std.regex.internal.thompson.ThompsonOps!(EngineType!(char, 
Input!char), State, true).op!cast(IR)164u error instantiating


Can someone please help me solve this error?

Thanks
Dharmil


Error on using regex in dmd v2.088.1

2020-02-02 Thread Dharmil Patel via Digitalmars-d-learn

In my code I am using regex like this:

   auto rgxComma = regex(r",");

On compiling with dmd v2.076.1, it compiles successfully, but on 
compiling with dmd v2.088.1, I am getting lots of errors like:


/src/phobos/std/regex/internal/thompson.d-mixin-836(837): Error: 
template instance 
std.regex.internal.thompson.ThompsonOps!(EngineType!(char, 
Input!char), State, true).op!cast(IR)164u error instantiating




Re: CT regex in AA at compile time

2020-01-07 Thread Steven Schveighoffer via Digitalmars-d-learn

On 1/7/20 11:00 AM, Taylor Hillegeist wrote:

On Tuesday, 7 January 2020 at 15:51:21 UTC, MoonlightSentinel wrote:

On Tuesday, 7 January 2020 at 15:40:58 UTC, Taylor Hillegeist wrote:
but I can't get it to work. it says its an Error: non-constant 
expression.


I imagine this has to do with the ctRegex template or something. 
maybe there is a better way? Does anyone know?


This issue is unrelated to ctRegex, AA literals are non-constant 
expressions (probably due to their implementation).


Correct. A compile-time AA is much different in implementation/layout 
than a runtime AA. So you can't initialize them at compile time.


also, the solution you used is very cool. I had no idea you could put 
shared static this()
just anywhere and have it execute just like it was in main! Does this 
work for dlls as well?


dlls run static constructors upon loading. Though I'm not too familiar 
with exactly how it works, I know that there can be some trickiness when 
using dlls on Windows.


See information about static constructors/destructors here: 
https://dlang.org/spec/class.html#static-constructor


No idea why it's in the class section, potentially it was only for 
classes at one point?


Information about using DLLs in windows with D is here: 
https://wiki.dlang.org/Win32_DLLs_in_D


-Steve


Re: CT regex in AA at compile time

2020-01-07 Thread Patrick Schluter via Digitalmars-d-learn
On Tuesday, 7 January 2020 at 15:40:58 UTC, Taylor Hillegeist 
wrote:

I'm trying to trick the following code snippet into compilation.

enum TokenType{
//Terminal
Plus,
Minus,
LPer,
RPer,
Number,
}

static auto Regexes =[
  TokenType.Plus:   ctRegex!(`^ *\+`),
  TokenType.Minus:  ctRegex!(`^ *\-`),
  TokenType.LPer:   ctRegex!(`^ *\(`),
  TokenType.RPer:   ctRegex!(`^ *\)`),
  TokenType.Number: ctRegex!(`^ *[0-9]+(.[0-9]+)?`)
];

but I can't get it to work. it says its an Error: non-constant 
expression.


I imagine this has to do with the ctRegex template or 
something. maybe there is a better way? Does anyone know?


In that specific case: why don't you use an array indexed on 
TokenType? TokenType are consecutive integrals so indexing is the 
fastest possible access method.
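
A minimal sketch of that suggestion, with assumptions labeled: it uses runtime `regex()` in a module constructor rather than `ctRegex`, the names `regexes` and `firstToken` are made up, and the dot in the number pattern is escaped as `\.` on the assumption that a literal decimal point was intended.

```d
import std.regex : Regex, matchFirst, regex;
import std.stdio : writeln;

enum TokenType
{
    Plus,
    Minus,
    LPer,
    RPer,
    Number,
}

// One slot per enum member; a TokenType converts implicitly to an integer index.
Regex!char[TokenType.max + 1] regexes;

// Thread-local module constructor: fills the table before main() runs.
static this()
{
    regexes[TokenType.Plus]   = regex(`^ *\+`);
    regexes[TokenType.Minus]  = regex(`^ *\-`);
    regexes[TokenType.LPer]   = regex(`^ *\(`);
    regexes[TokenType.RPer]   = regex(`^ *\)`);
    regexes[TokenType.Number] = regex(`^ *[0-9]+(\.[0-9]+)?`);
}

/// Hypothetical helper: the matched prefix for `type`, or null.
string firstToken(string text, TokenType type)
{
    auto m = matchFirst(text, regexes[type]);
    return m.empty ? null : m.hit;
}

void main()
{
    writeln(firstToken("  42 + 1", TokenType.Number));
}
```

The lookup is a plain array index, so no hashing is involved and no associative array literal has to be built.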


Re: CT regex in AA at compile time

2020-01-07 Thread Taylor Hillegeist via Digitalmars-d-learn
On Tuesday, 7 January 2020 at 15:51:21 UTC, MoonlightSentinel 
wrote:
On Tuesday, 7 January 2020 at 15:40:58 UTC, Taylor Hillegeist 
wrote:
but I can't get it to work. it says its an Error: non-constant 
expression.


I imagine this has to do with the ctRegex template or 
something. maybe there is a better way? Does anyone know?


This issue is unrelated to ctRegex, AA literals are 
non-constant expressions (probably due to their 
implementation). You can work around this by using module 
constructors or lazy initialisation inside of a function:


static Regex!char[TokenType] Regexes;

shared static this()
{
Regexes = [
TokenType.Plus:   ctRegex!(`^ *\+`),
TokenType.Minus:  ctRegex!(`^ *\-`),
TokenType.LPer:   ctRegex!(`^ *\(`),
TokenType.RPer:   ctRegex!(`^ *\)`),
TokenType.Number: ctRegex!(`^ *[0-9]+(.[0-9]+)?`)
];
}


Thank you for bringing this to my attention.

also, the solution you used is very cool. I had no idea you could 
put shared static this()
just anywhere and have it execute just like it was in main! Does 
this work for dlls as well?


Re: CT regex in AA at compile time

2020-01-07 Thread MoonlightSentinel via Digitalmars-d-learn
On Tuesday, 7 January 2020 at 15:40:58 UTC, Taylor Hillegeist 
wrote:
but I can't get it to work. it says its an Error: non-constant 
expression.


I imagine this has to do with the ctRegex template or 
something. maybe there is a better way? Does anyone know?


This issue is unrelated to ctRegex, AA literals are non-constant 
expressions (probably due to their implementation). You can work 
around this by using module constructors or lazy initialisation 
inside of a function:


static Regex!char[TokenType] Regexes;

shared static this()
{
Regexes = [
TokenType.Plus:   ctRegex!(`^ *\+`),
TokenType.Minus:  ctRegex!(`^ *\-`),
TokenType.LPer:   ctRegex!(`^ *\(`),
TokenType.RPer:   ctRegex!(`^ *\)`),
TokenType.Number: ctRegex!(`^ *[0-9]+(.[0-9]+)?`)
];
}


CT regex in AA at compile time

2020-01-07 Thread Taylor Hillegeist via Digitalmars-d-learn

I'm trying to trick the following code snippet into compilation.

enum TokenType{
//Terminal
Plus,
Minus,
LPer,
RPer,
Number,
}

static auto Regexes =[
  TokenType.Plus:   ctRegex!(`^ *\+`),
  TokenType.Minus:  ctRegex!(`^ *\-`),
  TokenType.LPer:   ctRegex!(`^ *\(`),
  TokenType.RPer:   ctRegex!(`^ *\)`),
  TokenType.Number: ctRegex!(`^ *[0-9]+(.[0-9]+)?`)
];

but I can't get it to work. it says its an Error: non-constant 
expression.


I imagine this has to do with the ctRegex template or something. 
maybe there is a better way? Does anyone know?


Re: using regex at compile time errors out! Error: static variable `thompsonFactory` cannot be read at compile time

2019-10-04 Thread Adam D. Ruppe via Digitalmars-d-learn

On Thursday, 3 October 2019 at 23:47:17 UTC, Brett wrote:
Error: static variable `thompsonFactory` cannot be read at 
compile time


std.regex isn't ctfe compatible, alas.

even the ctRegex doesn't work at ctfe; it *compiles* the regex at 
compile time, but it is not capable of actually *running* it at 
compile time.


(the regular regex is compiled and run at runtime)
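
The distinction can be sketched as follows; `maskY` is a made-up function name, and the commented-out line only marks the kind of use that, per the thread, does not work:

```d
import std.regex : ctRegex, replaceAll;
import std.stdio : writeln;

/// Hypothetical helper: replaces every 'Y' with 'X'.
string maskY(string s)
{
    // The pattern is turned into a matcher at compile time...
    auto re = ctRegex!(`Y`);
    // ...but the matching and replacing happen at run time.
    return replaceAll(s, re, "X");
}

void main()
{
    writeln(maskY("aYbY"));
    // enum atCompileTime = maskY("aYbY"); // would need CTFE; per the thread this is not supported
}
```

So `ctRegex` moves the cost of building the matcher to compile time but does not make the result usable in an `enum` initializer.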


Re: using regex at compile time errors out! Error: static variable `thompsonFactory` cannot be read at compile time

2019-10-04 Thread Brett via Digitalmars-d-learn

On Friday, 4 October 2019 at 10:07:40 UTC, kinke wrote:

Have you tried ctRegex?


Yes, just another error about something else that I don't 
remember.




Re: using regex at compile time errors out! Error: static variable `thompsonFactory` cannot be read at compile time

2019-10-04 Thread kinke via Digitalmars-d-learn

Have you tried ctRegex?


using regex at compile time errors out! Error: static variable `thompsonFactory` cannot be read at compile time

2019-10-03 Thread Brett via Digitalmars-d-learn

auto r = replaceAll!((C)
{
return "X";
}
    )(s, regex(`Y`));

Error: static variable `thompsonFactory` cannot be read at 
compile time


This is when the result is tried to be determined at compile 
time, e.g., assigning it to an enum even though s is known at 
compile time.


Regex driving me nuts

2019-06-17 Thread Bart via Digitalmars-d-learn
Error: static variable `thompsonFactory` cannot be read at 
compile time, Trying to regex an import file.


Also I have a group (...)* and it always fails or matches only 
one but if I do (...)(...)(...) it matches all 3(fails if more or 
less of course. ... is the regex).


Also when I ignore a group (?:Text) I get Text as a matched group 
;/


But all this is irrelevant if I can't get the code to work at 
compile time. I tried ctRegex






// A workaround for R-T enum re = regex(...)
template defaultFactory(Char)
{
    @property MatcherFactory!Char defaultFactory(const ref Regex!Char re) @safe
    {
        import std.regex.internal.backtracking : BacktrackingMatcher;
        import std.regex.internal.thompson : ThompsonMatcher;
        import std.algorithm.searching : canFind;
        static MatcherFactory!Char backtrackingFactory;
        static MatcherFactory!Char thompsonFactory;
        if (re.backrefed.canFind!"a != 0")
        {
            if (backtrackingFactory is null)
                backtrackingFactory = new RuntimeFactory!(BacktrackingMatcher, Char);
            return backtrackingFactory;
        }
        else
        {
            if (thompsonFactory is null)
                thompsonFactory = new RuntimeFactory!(ThompsonMatcher, Char);
            return thompsonFactory;
        }
    }
}

The workaround seems to workaround working.


splitter and matcher combined regex

2019-06-16 Thread Amex via Digitalmars-d-learn
Having to split and match separately seems slow (50%). Surely the regex 
splitter and matcher can be combined? Sometimes we just need to 
extract and remove information simultaneously.


I propose a new function called extractor that returns the 
matchAll and splitter's results but is optimized.
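
A rough sketch of what such a function could look like, built on top of matchAll. The names `Extracted` and `extractAll` are invented here and nothing like this exists in std.regex; it makes a single pass over the input and collects both the hits and the text between them. Whether it is actually faster than calling matchAll and splitter separately has not been measured.

```d
import std.regex : matchAll, regex;

// Hypothetical result type: the matches plus the pieces between them.
struct Extracted
{
    string[] hits;
    string[] pieces;
}

// Hypothetical helper, not part of std.regex.
Extracted extractAll(string input, string pattern)
{
    string[] hits, pieces;
    size_t last = 0; // end of the previous hit
    foreach (m; matchAll(input, regex(pattern)))
    {
        // m.pre is everything before the hit, so its length is the hit's offset.
        immutable start = m.pre.length;
        pieces ~= input[last .. start];
        hits ~= m.hit;
        last = start + m.hit.length;
    }
    pieces ~= input[last .. $]; // trailing piece after the last hit
    return Extracted(hits, pieces);
}
```

For example, `extractAll("a1b22c", `\d+`)` yields the hits "1" and "22" and the pieces "a", "b" and "c".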


Re: Poor regex performance?

2019-04-04 Thread Jon Degenhardt via Digitalmars-d-learn

On Thursday, 4 April 2019 at 10:31:43 UTC, Julian wrote:
On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole 
wrote:

If you need performance use ldc not dmd (assumed).

LLVM's code optimizers are many times better than dmd's.


Thanks! I already had dmd installed from a brief look at D a 
long
time ago, so I missed the details at 
https://dlang.org/download.html


ldc2 -O3 does a lot better, but the result is still 30x slower
without PCRE.


Try:
ldc2 -O3 -release -flto=thin 
-defaultlib=phobos2-ldc-lto,druntime-ldc-lto -enable-inlining


This will improve inlining and optimization across the runtime 
library boundaries. This can help in certain types of code.


Re: Poor regex performance?

2019-04-04 Thread H. S. Teoh via Digitalmars-d-learn
On Thu, Apr 04, 2019 at 09:53:06AM +, Julian via Digitalmars-d-learn wrote:
[...]
>   auto re = ctRegex!(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");
[...]

ctRegex is a crock; use regex() instead and it might actually work
better.


T

-- 
Stop staring at me like that! It's offens... no, you'll hurt your eyes!


Re: Poor regex performance?

2019-04-04 Thread Stefan Koch via Digitalmars-d-learn

On Thursday, 4 April 2019 at 10:31:43 UTC, Julian wrote:
On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole 
wrote:

If you need performance use ldc not dmd (assumed).

LLVM's code optimizers are many times better than dmd's.


Thanks! I already had dmd installed from a brief look at D a 
long
time ago, so I missed the details at 
https://dlang.org/download.html


ldc2 -O3 does a lot better, but the result is still 30x slower
without PCRE.


You need to disable the GC
by importing core.memory : GC;
and calling GC.disable();

The next thing is to avoid the .idup and cast to string instead.
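
A small sketch of the first suggestion, with the counting loop pulled into a function so it can be tried in isolation. The function name `countSenders` and the use of a `string[]` instead of a file are assumptions made for this example. Because the lines are already immutable strings, no .idup is needed here; with File.byLine the buffer is reused, so the key would still have to be copied somehow.

```d
import core.memory : GC;
import std.regex : matchFirst, regex;

// Hypothetical helper: counts sender addresses in exim-style log lines.
ulong[string] countSenders(string[] lines)
{
    GC.disable();             // no collections while the table is built
    scope (exit) GC.enable(); // turn the GC back on when we are done

    auto re = regex(`(?:\S+ ){3,4}<= ([^@]+@(\S+))`);
    ulong[string] counts;
    foreach (line; lines)
    {
        auto m = matchFirst(line, re);
        if (!m.empty)
            ++counts[m[1]]; // m[1] is a slice of an immutable string
    }
    return counts;
}
```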



Re: Poor regex performance?

2019-04-04 Thread Julian via Digitalmars-d-learn

On Thursday, 4 April 2019 at 09:57:26 UTC, rikki cattermole wrote:

If you need performance use ldc not dmd (assumed).

LLVM's code optimizers are many times better than dmd's.


Thanks! I already had dmd installed from a brief look at D a long
time ago, so I missed the details at 
https://dlang.org/download.html


ldc2 -O3 does a lot better, but the result is still 30x slower
without PCRE.


Re: Poor regex performance?

2019-04-04 Thread XavierAP via Digitalmars-d-learn

On Thursday, 4 April 2019 at 09:53:06 UTC, Julian wrote:


Relatedly, how can I add custom compiler flags to rdmd, in a D 
script?

For example, -L-lpcre


Configuration variable "DFLAGS". On Windows you can specify it in 
the sc.ini file. On Linux: https://dlang.org/dmd-linux.html


Re: Poor regex performance?

2019-04-04 Thread rikki cattermole via Digitalmars-d-learn

If you need performance use ldc not dmd (assumed).

LLVM's code optimizers are many times better than dmd's.


Poor regex performance?

2019-04-04 Thread Julian via Digitalmars-d-learn
The following code, that just runs a regex against a large exim 
log
to report on top senders, is 140 times slower than similar C code 
using
PCRE, when compiled with just -O. With a bunch of other flags I 
got it
down to only 13x slower than C code that's using libc 
regcomp/regexec.


  import std.stdio, std.string, std.regex, std.array, std.algorithm;

  T min(T)(T a, T b) {
      if (a < b) return a;
      return b;
  }

  void main() {
      ulong[string] emailcounts;
      auto re = ctRegex!(r"(?:\S+ ){3,4}<= ([^@]+@(\S+))");

      foreach (line; File("exim_mainlog").byLine()) {
          auto m = line.match(re);
          if (m) {
              ++emailcounts[m.front[1].idup];
          }
      }

      string[] senders = emailcounts.keys;
      sort!((a, b) { return emailcounts[a] > emailcounts[b]; })(senders);

      foreach (i; 0 .. min(senders.length, 5)) {
          writefln("%5s %s", emailcounts[senders[i]], senders[i]);
      }
  }

Other code's available at 
https://github.com/jrfondren/topsender-bench

I get D down to 1.2x slower with PCRE and getline()

I wrote this part of the way through chapter 1 of "The D 
Programming Language",
so my question is mainly: is this a fair result? std.regex is 
very slow and
I should reach for PCRE if regex speed matters? Or is this code 
severely
flawed somehow? I'm using a random production log; not trying to 
make things

difficult.

Relatedly, how can I add custom compiler flags to rdmd, in a D 
script?

For example, -L-lpcre


Re: Redundant "g" flag for regex?

2018-06-23 Thread biocyberman via Digitalmars-d-learn

On Saturday, 23 June 2018 at 13:45:32 UTC, Basile B. wrote:

On Saturday, 23 June 2018 at 12:17:08 UTC, biocyberman wrote:


I get the same output with or without "g" flag at line 6:
https://run.dlang.io/is/9n7iz6

So I don't understand when I have to use "g" flag.


My bet is that regex results in D are lazy, so "g" doesn't make 
sense in this context. However, I'm able to see an effect with 
"match":


match("12000 + 42100 = 54100", regex(r"(?<=\d)(?=(\d\d\d)+\b)", "")).writeln;
match("12000 + 42100 = 54100", regex(r"(?<=\d)(?=(\d\d\d)+\b)", "g")).writeln;


matchFirst would be like without "g"
matchAll would be like with "g"


I should have read the doc more thoroughly:

https://dlang.org/phobos/std_regex.html#match
Delegating the kind of operation to "g" flag is soon to be 
phased out along with the ability to choose the exact matching 
scheme.


So case closed for me


Re: Redundant "g" flag for regex?

2018-06-23 Thread Basile B. via Digitalmars-d-learn

On Saturday, 23 June 2018 at 12:17:08 UTC, biocyberman wrote:


I get the same output with or without "g" flag at line 6:
https://run.dlang.io/is/9n7iz6

So I don't understand when I have to use "g" flag.


My bet is that regex results in D are lazy, so "g" doesn't make 
sense in this context. However, I'm able to see an effect with 
"match":


match("12000 + 42100 = 54100", regex(r"(?<=\d)(?=(\d\d\d)+\b)", "")).writeln;
match("12000 + 42100 = 54100", regex(r"(?<=\d)(?=(\d\d\d)+\b)", "g")).writeln;


matchFirst would be like without "g"
matchAll would be like with "g"
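
A short sketch of that last point, using a simpler pattern than the one above (the helper names `firstNumber` and `allNumbers` are made up for the example). With matchFirst and matchAll the choice is made by the function you call, so no "g" flag is involved.

```d
import std.algorithm : map;
import std.array : array;
import std.regex : matchAll, matchFirst, regex;

// Hypothetical helper: first run of digits, or null if there is none.
string firstNumber(string s)
{
    auto m = matchFirst(s, regex(`\d+`));
    return m.empty ? null : m.hit;
}

// Hypothetical helper: every run of digits, in order.
string[] allNumbers(string s)
{
    // matchAll returns a lazy range of matches; .array collects them.
    return matchAll(s, regex(`\d+`)).map!(m => m.hit).array;
}
```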


Re: why explicitly use "flags" in regex does not work?

2018-06-23 Thread Basile B. via Digitalmars-d-learn

On Saturday, 23 June 2018 at 12:50:17 UTC, biocyberman wrote:

On Saturday, 23 June 2018 at 12:20:10 UTC, biocyberman wrote:


I got "Error: undefined identifier flags" in here:

https://run.dlang.io/is/wquscz

Removing "flags =" works.


I kinda found an answer. It's a bit of a surprise anyway: 
https://forum.dlang.org/thread/wokfqqbexazcguffw...@forum.dlang.org?page=1


Long story short, "named" parameter function calling still does 
not work.


Indeed, and I'm surprised you didn't know that. The topic has been 
discussed several times and will be again in the coming weeks and 
months, since there are two DIPs for the feature:


[1] https://github.com/dlang/DIPs/pull/123
[2] https://github.com/dlang/DIPs/pull/126



Re: why explicitly use "flags" in regex does not work?

2018-06-23 Thread biocyberman via Digitalmars-d-learn

On Saturday, 23 June 2018 at 12:20:10 UTC, biocyberman wrote:


I got "Error: undefined identifier flags" in here:

https://run.dlang.io/is/wquscz

Removing "flags =" works.


I kinda found an answer. It's a bit of a surprise anyway: 
https://forum.dlang.org/thread/wokfqqbexazcguffw...@forum.dlang.org?page=1


Long story short, "named" parameter function calling still does 
not work. IMHO, this goes against the readability tendency of D. 
And I still don't know how to do something like this:



auto func(string a = "a", string b = "b", string c = "c")
{
   write("a: ", a, " b: ", b, " c: ", c);
}

void main()
{
func();
func(b ="B"); // Changes default for b only
func(c = "C"); // Changes default for c only

}



why explicitly use "flags" in regex does not work?

2018-06-23 Thread biocyberman via Digitalmars-d-learn



I got "Error: undefined identifier flags" in here:

https://run.dlang.io/is/wquscz

Removing "flags =" works.


Redundant "g" flag for regex?

2018-06-23 Thread biocyberman via Digitalmars-d-learn



I get the same output with or without "g" flag at line 6:
https://run.dlang.io/is/9n7iz6

So I don't understand when I have to use "g" flag.


Re: forcing tabs in regex

2018-02-27 Thread Dmitry Olshansky via Digitalmars-d-learn
On Wednesday, 28 February 2018 at 05:09:03 UTC, psychoticRabbit 
wrote:

On Wednesday, 28 February 2018 at 01:06:30 UTC, dark777 wrote:

This regex validates dates in leap and non-leap years in the format:
const std::regex 
pattern(R"(^(?:(?:(0?[1-9]|1\d|2[0-8])([-/.])(0?[1-9]|1[0-2]|[Jj](?:an|u[nl])|[Mm]a[ry]|[Aa](?:pr|ug)|[Ss]ep|[Oo]ct|[Nn]ov|[Dd]ec|[Ff]eb)|(29|30)([-/.])(0?[13-9]|1[0-2]|[Jj](?:an|u[nl])|[Mm]a[ry]|[Aa](?:pr|ug)|[Ss]ep|[Oo]ct|[Nn]ov|[Dd]ec)|(31)([-/.])(0?[13578]|1[02]|[Jj]an|[Mm]a[ry]|[Jj]ul|[Aa]ug|[Oo]ct|[Dd]ec))(?:\2|\5|\8)(0{2,3}[1-9]|0{1,2}[1-9]\d|0?[1-9]\d{2}|[1-9]\d{3})|(29)([-/.])(0?2|[Ff]eb)\12(\d{1,2}(?:0[48]|[2468][048]|[13579][26])|(?:0?[48]|[13579][26]|[2468][048])00))$)");


this regex above validates the formats through backreferences.



what is this evil dark magic?


Something that is horribly slow and might be incorrect. There are 
very few reasons to write large regexes like that, and they 
usually boil down to “it only accepts regex”; otherwise parser 
combinators are a much better fit.





Re: forcing tabs in regex

2018-02-27 Thread psychoticRabbit via Digitalmars-d-learn

On Wednesday, 28 February 2018 at 01:06:30 UTC, dark777 wrote:

This regex validates dates in leap and non-leap years in the format:
const std::regex 
pattern(R"(^(?:(?:(0?[1-9]|1\d|2[0-8])([-/.])(0?[1-9]|1[0-2]|[Jj](?:an|u[nl])|[Mm]a[ry]|[Aa](?:pr|ug)|[Ss]ep|[Oo]ct|[Nn]ov|[Dd]ec|[Ff]eb)|(29|30)([-/.])(0?[13-9]|1[0-2]|[Jj](?:an|u[nl])|[Mm]a[ry]|[Aa](?:pr|ug)|[Ss]ep|[Oo]ct|[Nn]ov|[Dd]ec)|(31)([-/.])(0?[13578]|1[02]|[Jj]an|[Mm]a[ry]|[Jj]ul|[Aa]ug|[Oo]ct|[Dd]ec))(?:\2|\5|\8)(0{2,3}[1-9]|0{1,2}[1-9]\d|0?[1-9]\d{2}|[1-9]\d{3})|(29)([-/.])(0?2|[Ff]eb)\12(\d{1,2}(?:0[48]|[2468][048]|[13579][26])|(?:0?[48]|[13579][26]|[2468][048])00))$)");


this regex above validates the formats through backreferences.



what is this evil dark magic?



forcing tabs in regex

2018-02-27 Thread dark777 via Digitalmars-d-learn

This regex validates dates in leap and non-leap years in the format:
const std::regex 
pattern(R"(^(?:(?:(0?[1-9]|1\d|2[0-8])([-/.])(0?[1-9]|1[0-2]|[Jj](?:an|u[nl])|[Mm]a[ry]|[Aa](?:pr|ug)|[Ss]ep|[Oo]ct|[Nn]ov|[Dd]ec|[Ff]eb)|(29|30)([-/.])(0?[13-9]|1[0-2]|[Jj](?:an|u[nl])|[Mm]a[ry]|[Aa](?:pr|ug)|[Ss]ep|[Oo]ct|[Nn]ov|[Dd]ec)|(31)([-/.])(0?[13578]|1[02]|[Jj]an|[Mm]a[ry]|[Jj]ul|[Aa]ug|[Oo]ct|[Dd]ec))(?:\2|\5|\8)(0{2,3}[1-9]|0{1,2}[1-9]\d|0?[1-9]\d{2}|[1-9]\d{3})|(29)([-/.])(0?2|[Ff]eb)\12(\d{1,2}(?:0[48]|[2468][048]|[13579][26])|(?:0?[48]|[13579][26]|[2468][048])00))$)");


The regex above validates the following formats through backreferences:

dd-mm-yyyy or dd-str-yyyy or dd-Str-yyyy
dd/mm/yyyy or dd/str/yyyy or dd/Str/yyyy
dd.mm.yyyy or dd.str.yyyy or dd.Str.yyyy



This regex validates dates in leap and non-leap years in the format:
const std::regex 
pattern(R"(^(?:\d{4}([-/.])(?:(?:(?:(?:0?[13578]|1[02]|[Jj](?:an|ul)|[Mm]a[ry]|[Aa]ug|[Oo]ct|[Dd]ec)([-/.])(?:0?[1-9]|[1-2][0-9]|3[01]))|(?:(?:0?[469]|11|[Aa]pr|[Jj]un|[Ss]ep|[Nn]ov)([-/.])(?:0?[1-9]|[1-2][0-9]|30))|(?:(0?2|[Ff]eb)([-/.])(?:0?[1-9]|1[0-9]|2[0-8]|(?:(?:\d{2}(?:0[48]|[2468][048]|[13579][26]))|(?:(?:[02468][048])|[13579][26])00)([-/.])(0?2|[Ff]eb)([-/.])29)$)");


The regex above was supposed to validate the formats through 
backreferences,


but it also accepts mixed separators, as in the following formats:
yyyy-mm/dd or yyyy-str/dd or yyyy-Str/dd
yyyy/mm.dd or yyyy/str.dd or yyyy/Str.dd
yyyy.mm-dd or yyyy.str-dd or yyyy.Str-dd


when it should accept only the following formats:
yyyy-mm-dd or yyyy-str-dd or yyyy-Str-dd
yyyy/mm/dd or yyyy/str/dd or yyyy/Str/dd
yyyy.mm.dd or yyyy.str.dd or yyyy.Str.dd

How do I make it accept a date only when both separators are the same?
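
One common way to force the same separator twice is to capture the first one in a group and use a backreference for the second. The sketch below shows the idea with D's std.regex rather than C++ std::regex, and with a much simplified numeric-only pattern; the function name `sameSeparators` is made up, and the month names and leap-year logic from the patterns above are deliberately left out.

```d
import std.regex : matchFirst, regex;

// Hypothetical helper: yyyy<sep>mm<sep>dd with the same separator twice.
bool sameSeparators(string date)
{
    // ([-/.]) captures the first separator, \1 demands the same one again.
    auto re = regex(`^\d{4}([-/.])\d{1,2}\1\d{1,2}$`);
    return !matchFirst(date, re).empty;
}
```

With this, "2018-02-27" and "2018/02/27" are accepted, while "2018-02/27" is rejected.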




Re: Convert user input string to Regex

2017-09-16 Thread Ky-Anh Huynh via Digitalmars-d-learn
On Saturday, 16 September 2017 at 03:23:14 UTC, Adam D. Ruppe 
wrote:
On Saturday, 16 September 2017 at 03:18:31 UTC, Ky-Anh Huynh 
wrote:
Is there a way to transform user input string to a regular 
expression? For example, I want to write a `grep`-like program


import std.regex;

auto re = regex(user_pattern, user_flags);


You'll probably want to split it on the '/' to split the 
pattern and the flags since they are two separate variables to 
the regex function, but that's all you need to do.


http://dpldocs.info/experimental-docs/std.regex.regex.2.html



Thanks Adam. I will give it a try.


Re: Convert user input string to Regex

2017-09-15 Thread Adam D. Ruppe via Digitalmars-d-learn
On Saturday, 16 September 2017 at 03:18:31 UTC, Ky-Anh Huynh 
wrote:
Is there a way to transform user input string to a regular 
expression? For example, I want to write a `grep`-like program


import std.regex;

auto re = regex(user_pattern, user_flags);


You'll probably want to split it on the '/' to split the pattern 
and the flags since they are two separate variables to the regex 
function, but that's all you need to do.


http://dpldocs.info/experimental-docs/std.regex.regex.2.html
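
A minimal sketch of the splitting step, assuming the user input has the form /pattern/flags. The function name `parseUserRegex` is invented for this example, and input without the slashes is simply treated as a plain pattern.

```d
import std.regex : Regex, matchFirst, regex;
import std.string : lastIndexOf;

// Hypothetical helper: turns "/pattern/flags" into a Regex.
Regex!char parseUserRegex(string spec)
{
    if (spec.length >= 2 && spec[0] == '/')
    {
        // Everything after the last '/' is taken to be the flags.
        immutable end = spec.lastIndexOf('/');
        if (end > 0)
            return regex(spec[1 .. end], spec[end + 1 .. $]);
    }
    // No surrounding slashes: use the whole string as the pattern.
    return regex(spec);
}
```

An unknown flag or a malformed pattern makes regex() throw a RegexException, so a real program would want to catch that and report it to the user.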


Convert user input string to Regex

2017-09-15 Thread Ky-Anh Huynh via Digitalmars-d-learn

Hi,

Is there a way to transform user input string to a regular 
expression? For example, I want to write a `grep`-like program


```
mygrep -E '/pattern/i' file.txt
```

and here the user's parameter `/pattern/i` would be converted to 
a Regex object.


Fyi, in Ruby, `to_regexp` is a useful gem: 
https://rubygems.org/gems/to_regexp


Thanks a lot.


Trouble with regex backreferencing

2017-06-12 Thread Murp via Digitalmars-d-learn
I was working around with regex trying to match certain patterns 
of repeating patterns before and after a space and I came across 
some unexpected behavior.


writeln("ABC ABC CBA".replaceAll(regex(r"([A-Z]) ([A-Z])"), "D"));
//ABDBDBA
//Makes sense, replaces the 3 characters surrounding a space with a single D

writeln("ABC ABC CBA".replaceAll(regex(r"([A-Z]) \1"), "D"));
//ABC ABDBA
//Same idea, but this time only if the 2 surrounding letters are the same

writeln("ABC ABC CBA".replaceAll(regex(r"([A-Z]+) \1"), "D"));
//D CBA
//Same idea again, but this time match any amount of characters as long as they are in the same order

writeln("ABCABC ABC CBA".replaceAll(regex(r"([A-Z]+) \1"), "D"));
//ABCABC ABC CBA
//Hold on, shouldn't this be "ABCD CBA"?
writeln("ABC ABCABC CBA".replaceAll(regex(r"([A-Z]+) \1"), "D"));
//DABC CBA
//Works the other way

The problem I've come across is that the regex should match 
the largest portion of the subexpression that works for both the 
capture and its backreference, but instead it captures as much as it can for the 
group without any care as to its later use. This makes it 
work only if the entire first word is contained at the start 
of the second, when it should work both ways.
Is there any gross hack I can use to get around this? And if this 
is for some reason intended behavior, why?


Re: Regex multiple matches

2017-04-13 Thread rikki cattermole via Digitalmars-d-learn

On 14/04/2017 3:54 AM, Jethro wrote:

using the rule (?P<name>regex)

e.g., (?P\w*)*

how do we get at all the matches, e.g., Joe Bob Buddy?

When I access the resulting captures they are not arrays and I only
ever get the first match even when I'm using matchAll.


Pseudo code:

foreach(result; matcher) {
...
}

It returns an input range that can "act" as an array without the lookup 
by index, it is a very powerful abstraction.
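
To make the pseudo code concrete, here is a small sketch. The group name `word` and the function name `allWords` are chosen just for this example. The pattern matches a single word, and matchAll is what walks over the repetitions, instead of putting a * on the group.

```d
import std.regex : matchAll, regex;

// Hypothetical helper: collects every word matched by the named group.
string[] allWords(string input)
{
    string[] words;
    // Each element of the range is one match with its own captures.
    foreach (m; matchAll(input, regex(`(?P<word>\w+)`)))
        words ~= m["word"];
    return words;
}
```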


Regex multiple matches

2017-04-13 Thread Jethro via Digitalmars-d-learn

using the rule (?P<name>regex)

e.g., (?P\w*)*

how do we get at all the matches, e.g., Joe Bob Buddy?

When I access the resulting captures they are not arrays and I 
only ever get the first match even when I'm using matchAll.







How do I get names of regex captures during iteration? Populate AAs with captures?

2017-02-28 Thread Chad Joan via Digitalmars-d-learn
Is there a way to get the name of a named capture when iterating 
over captures from a regular expression match?  I've looked at 
the std.regex code and it seems like "no" to my eyes, but I 
wonder if others here have... a way.


My original problem is this: I need to populate an associative 
array (AA) with all named captures that successfully matched 
during a regex match (and none of the captures that failed).  I 
was wondering what the best way to do this might be.


Thanks!

Please see comments in the below program for details and my 
current progress:


void main()
{
    import std.compiler;
    import std.regex;
    import std.range;
    import std.stdio;

    writefln("Compiler name:%s", std.compiler.name);
    writefln("Compiler version: %s.%s", version_major, version_minor);
    writeln("");

    enum pattern = `(?P<var>\w+)\s*=\s*(?P<value>\d+)?;`;
    writefln("Regular expression: `%s`", pattern);
    writeln("");

    auto re = regex(pattern);

    auto c = matchFirst("a = 42;", re);
    reportCaptures(re, c);

    c = matchFirst("a = ;", re);
    reportCaptures(re, c);
}

void reportCaptures(Regex, RegexCaptures)(Regex re, RegexCaptures captures)
{
    import std.range;
    import std.regex;
    import std.stdio;

    writefln("Captures from matched string '%s'", captures[0]);

    string[string] captureList;

    // I am trying to read the captures from a regular expression match
    // into the above AA.
    //
    // ...
    //
    // This kind of works, but requires a string lookup for each capture
    // and using it in practice relies on undocumented behavior regarding
    // the return value of std.regex.Capture's opIndex[string] method
    // when the string index is a valid named capture that was not actually
    // captured during the match (ex: the named capture was qualified with
    // the ? operator or the * operator in the regex and never appeared in
    // the matched string).
    foreach( captureName; re.namedCaptures )
    {
        auto capture = captures[captureName];
        if ( capture is null )
            writefln("  captures[%s] is null", captureName);
        else if ( capture.empty )
            writefln("  captures[%s] is empty", captureName);
        else
        {
            writefln("  captures[%s] is '%s'", captureName, capture);
            captureList[captureName] = capture;
        }
    }

    writefln("Total captures: %s", captureList);

    /+
    // I really want to do something like this, instead:
    foreach( capture; captures )
        captureList[capture.name] = capture.value;

    // And, in reality, it might need to be more like this:
    foreach( capture; captures )
        foreach ( valueIndex, value; capture.values )
            captureList[format("%s-%s",capture.name,valueIndex)] = value;

    // Because, logically, named captures qualified with the
    // *, +, or {} operators in regular expressions may capture
    // multiple slices.

    writefln("Total captures: %s", captureList);
    +/

    writeln("");
}


//Output, DMD64 D Compiler v2.073.1:
//---
//
//Compiler name:Digital Mars D
//Compiler version: 2.73
//
//Regular expression: `(?P<var>\w+)\s*=\s*(?P<value>\d+)?;`
//
//Captures from matched string 'a = 42;'
//  captures[value] is '42'
//  captures[var] is 'a'
//Total captures: ["value":"42", "var":"a"]
//
//Captures from matched string 'a = ;'
//  captures[value] is empty
//  captures[var] is 'a'
//Total captures: ["var":"a"]
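
For reference, the same approach condensed into one reusable function. The name `namedCaptureMap` is invented here, and the group names `var` and `value` are the ones shown in the program output above. Like the program, this sketch relies on an unmatched named group coming back empty, which is the behavior observed above rather than something the documentation promises.

```d
import std.regex : matchFirst, regex;

// Hypothetical helper: maps each named group that matched to its text.
string[string] namedCaptureMap(string input, string pattern)
{
    auto re = regex(pattern);
    auto caps = matchFirst(input, re);
    string[string] result;
    if (caps.empty)
        return result; // no match at all
    foreach (name; re.namedCaptures)
    {
        auto value = caps[name];
        if (value.length) // skip groups that did not capture anything
            result[name] = value;
    }
    return result;
}
```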


Regex replace followed by number.

2016-06-01 Thread Taylor Hillegeist via Digitalmars-d-learn
So I have run into an issue where I want to replace part of a string 
using a regex,


but I can't figure out how to replace items followed by a number.

I use "$1001" to paste the first match followed by 001, but this is 
taken to mean match 1001,
and if I try ${1}001 it gives me an error saying that it can't 
match the other "}".


Perhaps D uses a different syntax, but I couldn't find any 
documentation on the replace side.



The following code renames files.
arg 1 - path
arg 2 - regex match
arg 3 - regex replace

--
import std.file;
import std.path;
import std.regex;
import std.range;
import std.stdio : writeln;

void main(string[] args){

    bool preview;
    Regex!char myreg;
    string replacement;

    if(!args[1].buildNormalizedPath.isValidPath){writeln("Path is invalid! "); return;}

    try{myreg = regex(args[2]);}
    catch(RegexException e) {writeln("Invalid Regex command"); return;}

    try{replacement = args[3];}
    catch(Exception e) {writeln("Needs replacement string"); return;}

    // Without an extra argument only a preview is printed.
    if(args.length < 5){
        preview = true;
        writeln("result is preview only add extra arg for action");
    }else{preview = false;}

    // First pass: find the longest base name so the preview columns line up.
    size_t longest=0;
    foreach (string name; dirEntries(buildNormalizedPath(args[1].driveName(),args[1].stripDrive()), SpanMode.shallow))
    {
        if(name.isFile){
            // Compare and store the base name length (the original stored name.length).
            longest = (longest>name.baseName.length) ? longest : name.baseName.length;
        }
    }

    // Second pass: preview or perform the rename.
    foreach (string name; dirEntries(buildNormalizedPath(args[1].driveName(),args[1].stripDrive()), SpanMode.shallow))
    {
        if(name.isFile){
            if(preview){
                writeln("From:",name.baseName," ".repeat(longest-name.baseName.length).join,"to:",replaceAll(name.baseName, myreg,replacement ));
            }else{
                std.file.rename(name,replaceAll(name, myreg,replacement ));
            }
        }
    }
}
-
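
One way to sidestep the "$1001" ambiguity altogether is the overload of replaceAll that takes a function instead of a format string, so the group and the literal digits are joined in code. A minimal sketch, where the function name `appendDigits` and the pattern are made up for the example:

```d
import std.regex : regex, replaceAll;

// Hypothetical helper: appends "001" to every run of digits.
string appendDigits(string name)
{
    auto re = regex(`(\d+)`);
    // m[1] is the first group; the literal "001" is concatenated in code,
    // so no format-string parsing is involved.
    return replaceAll!(m => m[1] ~ "001")(name, re);
}
```

For instance, "img7.png" becomes "img7001.png".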


Re: regex - match/matchAll and bmatch - different output

2016-01-02 Thread Ivan Kazmenko via Digitalmars-d-learn

On Friday, 1 January 2016 at 12:29:01 UTC, anonymous wrote:

On 30.12.2015 12:06, Ivan Kazmenko wrote:
As you can see, bmatch (usage discouraged in the docs) gives 
me the
result I want, but match (also discouraged) and matchAll (way 
to go) don't.


Am I misusing matchAll, or is this a bug?


The `\1` there is a backreference. Backreferences are not part 
of regular expressions, in the sense that they allow you to 
describe more than regular languages. [1]


As far as I know, bmatch uses a widespread matching mechanism, 
while match/matchAll use a different, less common one. It 
wouldn't surprise me if match/matchAll simply didn't support 
backreferences.


Backreferences are not documented, as far as I can see, but 
they're working in other patterns. So, yeah, this is possibly a 
bug.



[1] 
https://en.wikipedia.org/wiki/Regular_expression#Patterns_for_non-regular_languages


The overview by the module author 
(http://dlang.org/regular-expression.html) does mention in the 
last paragraph that backreferences are supported.  Looks like it 
is a common feature in other programming languages, too.


The "\1" part behaves correctly in simple cases: it matches "abab", "abxab" and 
"ababx" but not "abac".  This means it is probably intended to 
work, and handling "xabab" incorrectly is a bug.


Also, as I understand it from the docs, matchAll/matchFirst use 
the most appropriate of match/bmatch internally, so if match does 
not properly support the particular backreference but bmatch 
does, the bug is in using the incorrect one to handle a pattern.


At any rate, a wrong result with an 8-character pattern produces a 
"regex don't work" impression, and I hope something can be done 
about it.


Re: Why can't a Regex object be immutable?

2016-01-01 Thread Shriramana Sharma via Digitalmars-d-learn
cym13 wrote:

> Is it that you
> can't make an immutable regex()? In that case it is a
> runtime-related issue and those variables just have to be
> mutable. Or is it that you want to be able to use an immutable or
> const regex (be it from regex() or ctRegex!()) with matchAll()?
> In the latter case it is matchAll's fault for not guaranteeing the
> immutability of the regex (and may even qualify as a bug IMHO)
> but you can « cast(Regex!char)numbers » it if you must so it
> isn't hard to work around it.

Yes after your comments I realized that it's not so much that I cannot 
create an immutable or const symbol referring to a Regex object but that I 
cannot use it with matchAll etc. But of what use is a Regex object if it 
cannot be used with matchAll etc?

-- 
Shriramana Sharma, Penguin #395953


Re: Why can't a Regex object be immutable?

2016-01-01 Thread cym13 via Digitalmars-d-learn

On Saturday, 2 January 2016 at 02:56:35 UTC, cym13 wrote:
On Saturday, 2 January 2016 at 02:39:36 UTC, Shriramana Sharma 
wrote:
Aw come on. The immutability of the variable is *after* it has 
been created at runtime.


Sure, but still...


> you'll find that using
ctRegex() instead will allow you to declare it immutable for 
example. I didn't look at the implementation to identify a 
precise cause though.


You mean ctRegex!(), but nope:

immutable numbers = ctRegex!r"\d+";

or doing const there gives the same error and using auto 
doesn't.


... I definitely get no error with this line (DMD v2.069, GDC 
5.3.0, LDC 0.16.1). The exact code I used is below.

void main(string[] args) {
    import std.regex;
    immutable numbers = ctRegex!r"\d+";
}

So yes immutability occurs after its creation, but it clearly 
seems linked to a runtime-related issue nonetheless. I don't know 
what you used to get an error with ctRegex as I couldn't 
reproduce one, maybe the solution lies there.


On Saturday, 2 January 2016 at 02:56:35 UTC, cym13 wrote:
[...]

While playing with your original code, I realised that maybe what 
you meant by "the same error" is the « Error: template 
std.regex.matchAll cannot deduce function from argument types 
!()(string, const(StaticRegex!char)) » one. But that error has 
nothing to do with the first one (« Error: cannot implicitly 
convert expression (regex("\\d+", "")) of type Regex!char to 
immutable(Regex!char) ») which is far more interesting.


So my question would be, what's your problem? Is it that you 
can't make an immutable regex()? In that case it is a 
runtime-related issue and those variables just have to be 
mutable. Or is it that you want to be able to use an immutable or 
const regex (be it from regex() or ctRegex!()) with matchAll()? 
In the latter case it is matchAll's fault for not guaranteeing the 
immutability of the regex (and may even qualify as a bug IMHO) 
but you can « cast(Regex!char)numbers » it if you must so it 
isn't hard to work around it.


Re: Why can't a Regex object be immutable?

2016-01-01 Thread cym13 via Digitalmars-d-learn
On Saturday, 2 January 2016 at 02:39:36 UTC, Shriramana Sharma 
wrote:
Aw come on. The immutability of the variable is *after* it has 
been created at runtime.


Sure, but still...


> you'll find that using
ctRegex() instead will allow you to declare it immutable for 
example. I didn't look at the implementation to identify a 
precise cause though.


You mean ctRegex!(), but nope:

immutable numbers = ctRegex!r"\d+";

or doing const there gives the same error and using auto 
doesn't.


... I definitely get no error with this line (DMD v2.069, GDC 
5.3.0, LDC 0.16.1). The exact code I used is below.

void main(string[] args) {
    import std.regex;
    immutable numbers = ctRegex!r"\d+";
}

So yes immutability occurs after its creation, but it clearly 
seems linked to a runtime-related issue nonetheless. I don't know 
what you used to get an error with ctRegex as I couldn't 
reproduce one, maybe the solution lies there.



Re: Why can't a Regex object be immutable?

2016-01-01 Thread Shriramana Sharma via Digitalmars-d-learn
cym13 wrote:

> I think it's because regex() only compiles the regex at runtime
> so it needs to be modified later ; 

Aw come on. The immutability of the variable is *after* it has been created 
at runtime.

> > you'll find that using
> ctRegex() instead will allow you to declare it immutable for
> example. I didn't look at the implementation to identify a
> precise cause though.

You mean ctRegex!(), but nope:

immutable numbers = ctRegex!r"\d+";

or doing const there gives the same error and using auto doesn't.

-- 
Shriramana Sharma, Penguin #395953


Re: Why can't a Regex object be immutable?

2016-01-01 Thread cym13 via Digitalmars-d-learn
On Saturday, 2 January 2016 at 02:03:13 UTC, Shriramana Sharma 
wrote:

Shriramana Sharma wrote:


Why is it impossible for a Regex object to be
`immutable`?


I find that I can't declare it as `const` either... This is 
most curious!


I think it's because regex() only compiles the regex at runtime 
so it needs to be modified later ; you'll find that using 
ctRegex() instead will allow you to declare it immutable for 
example. I didn't look at the implementation to identify a 
precise cause though.


Why can't a Regex object be immutable?

2016-01-01 Thread Shriramana Sharma via Digitalmars-d-learn
Hello. With this code:

import std.stdio, std.regex;
void main()
{
immutable numbers = regex(r"\d+");
foreach (match; "a1b2c3d4e5".matchAll(numbers))
writeln(match[0]);
}

compiling gives the error:

(4): Error: cannot implicitly convert expression (regex("\\d+", "")) of 
type Regex!char to immutable(Regex!char)
(5): Error: template std.regex.matchAll cannot deduce function from 
argument types !()(string, immutable(Regex!char)), candidates are:
/usr/include/dmd/phobos/std/regex/package.d(859):    
std.regex.matchAll(R, RegEx)(R input, RegEx re) if (isSomeString!R && 
is(RegEx == Regex!(BasicElementOf!R)))
/usr/include/dmd/phobos/std/regex/package.d(867):
std.regex.matchAll(R, String)(R input, String re) if (isSomeString!R && 
isSomeString!String)
/usr/include/dmd/phobos/std/regex/package.d(874):
std.regex.matchAll(R, RegEx)(R input, RegEx re) if (isSomeString!R && 
is(RegEx == StaticRegex!(BasicElementOf!R)))

If I use `auto` all is fine. Why is it impossible for a Regex object to be 
`immutable`?

-- 
Shriramana Sharma, Penguin #395953


Re: Why can't a Regex object be immutable?

2016-01-01 Thread Shriramana Sharma via Digitalmars-d-learn
Shriramana Sharma wrote:

> Why is it impossible for a Regex object to be
> `immutable`?

I find that I can't declare it as `const` either... This is most curious!

-- 
Shriramana Sharma, Penguin #395953


Re: regex - match/matchAll and bmatch - different output

2016-01-01 Thread anonymous via Digitalmars-d-learn

On 30.12.2015 12:06, Ivan Kazmenko wrote:

import std.regex, std.stdio;
void main ()
{
 writeln (bmatch   ("abab",  r"(..).*\1"));  // [["abab", "ab"]]
 writeln (match("abab",  r"(..).*\1"));  // [["abab", "ab"]]
 writeln (matchAll ("abab",  r"(..).*\1"));  // [["abab", "ab"]]
 writeln (bmatch   ("xabab", r"(..).*\1"));  // [["abab", "ab"]]
 writeln (match("xabab", r"(..).*\1"));  // []
 writeln (matchAll ("xabab", r"(..).*\1"));  // []
}

As you can see, bmatch (usage discouraged in the docs) gives me the
result I want, but match (also discouraged) and matchAll (way to go) don't.

Am I misusing matchAll, or is this a bug?


The `\1` there is a backreference. Backreferences are not part of 
regular expressions, in the sense that they allow you to describe more 
than regular languages. [1]


As far as I know, bmatch uses a widespread matching mechanism, while 
match/matchAll use a different, less common one. It wouldn't surprise me 
if match/matchAll simply didn't support backreferences.


Backreferences are not documented, as far as I can see, but they're 
working in other patterns. So, yeah, this is possibly a bug.



[1] 
https://en.wikipedia.org/wiki/Regular_expression#Patterns_for_non-regular_languages


Re: regex - match/matchAll and bmatch - different output

2015-12-31 Thread Ivan Kazmenko via Digitalmars-d-learn
On Wednesday, 30 December 2015 at 11:06:55 UTC, Ivan Kazmenko 
wrote:

...

As you can see, bmatch (usage discouraged in the docs) gives me 
the result I want, but match (also discouraged) and matchAll 
(way to go) don't.


Am I misusing matchAll, or is this a bug?


Reported as https://issues.dlang.org/show_bug.cgi?id=15489.



regex - match/matchAll and bmatch - different output

2015-12-30 Thread Ivan Kazmenko via Digitalmars-d-learn

Hi,

While solving Advent of Code problems for fun (already discussed 
in the forum: 
http://forum.dlang.org/post/cwdkmblukzptsrsrv...@forum.dlang.org), I ran into an issue.  I wanted to test for the pattern "two consecutive characters, arbitrary sequence, the same two consecutive characters".  Sadly, my solution using regular expressions gave a wrong result, but a hand-written one was accepted.


The problem reduced to the following:

import std.regex, std.stdio;
void main ()
{
writeln (bmatch   ("abab",  r"(..).*\1"));  // [["abab", "ab"]]
writeln (match("abab",  r"(..).*\1"));  // [["abab", "ab"]]
writeln (matchAll ("abab",  r"(..).*\1"));  // [["abab", "ab"]]
writeln (bmatch   ("xabab", r"(..).*\1"));  // [["abab", "ab"]]
writeln (match("xabab", r"(..).*\1"));  // []
writeln (matchAll ("xabab", r"(..).*\1"));  // []
}

As you can see, bmatch (usage discouraged in the docs) gives me 
the result I want, but match (also discouraged) and matchAll (way 
to go) don't.


Am I misusing matchAll, or is this a bug?

Ivan Kazmenko.
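
For reference, a hand-written check for "two consecutive characters, arbitrary sequence, the same two consecutive characters" might look like the sketch below. This is not the code from the original post; `hasRepeatedPair` is a hypothetical name, and the function assumes ASCII input because it slices by code unit.

```d
import std.algorithm.searching : canFind;

/// Returns true if some two-character substring of `s` occurs again
/// later in `s` without overlapping its first occurrence.
/// Assumption: ASCII input (slicing is done by code unit).
bool hasRepeatedPair(string s)
{
    for (size_t i = 0; i + 2 <= s.length; ++i)
    {
        // Look for the pair s[i .. i + 2] in the text that follows it.
        if (s[i + 2 .. $].canFind(s[i .. i + 2]))
            return true;
    }
    return false;
}

void main()
{
    assert(hasRepeatedPair("abab"));
    assert(hasRepeatedPair("xabab"));
    assert(!hasRepeatedPair("aaa")); // the two "aa" pairs overlap
}
```

Unlike the regex, this does not depend on backreference support in any particular matching engine.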



Re: How to replace inside regex?

2015-12-17 Thread Ali Çehreli via Digitalmars-d-learn

On 12/17/2015 04:57 AM, Suliman wrote:

> find all commas in strings inside quotes and replace them.
>
> foo, bar, "hello, user", baz
[...]
> auto partWithComma = matchAll(line, r).replaceAll(",", " ");

For this particular case, do you really want to replace with spaces, or 
do you want to eliminate them?


1) If the latter, I am sure you already know that you can call filter on 
the whole string:


import std.algorithm;

void main() {
auto s = `foo, bar, "hello, user", baz`;
auto result = s.filter!(c => c != '"');
assert(result.equal(`foo, bar, hello, user, baz`));
}

Note that 'result' above is a range that is produced lazily. If you need 
the result to be a proper array, then append a .array at the end (but 
don't forget to import std.array or std.range in that case):


import std.array;
auto result = s.filter!(c => c != '"').array;

Now 'result' is an array.


2) Also consider std.array.replace:

  http://dlang.org/phobos/std_array.html#.replace


3) For your general question about regular expressions, there may be 
other solutions but the following style works for me:


import std.stdio;
import std.string;
import std.regex;
import std.array;

void main() {

auto data = [ "abc=1", "def=2", "xyz=3" ];

/* Matches patterns like a=1
 *
 * Note the parentheses around the two patterns. Those parentheses
 * allow us to refer to the matched parts with indexes 1, 2, etc. later.
 */
enum re = regex(`([a-z]*)=([0-9]*)`);

foreach (line; data) {
if (matchFirst(line, re)) {
/* This line matched. */

/* Instead of such a "sink" delegate, you can use
 * std.array.appender to collect the replaced lines. This one makes
 * use of the replacement right away by sending it to standard output.
 */
auto useTheLine = (const(char)[] replaced) {
if (replaced.empty) {
/* QUESTION TO OTHERS: Why is this function called with
 * empty strings twice, apparently before and after each
 * actual replacement? Is that intentional? */

} else {
// This is where we actually use the replacement
writeln(replaced);
}
};

replaceAllInto!makeReplacement(useTheLine, line, re);
}
}
}

string makeReplacement(Captures!(string) matched) {
// Note: matched[0] would be the whole matched line
const label = matched[1];
const value = matched[2];

// We produce the new string here:
return format("((%s)) ((%s))", label, value);
}
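
For the specific task in the question (commas inside quoted sections), a shorter sketch is possible with the callback form of `replaceAll`. The helper name `blankQuotedCommas` is hypothetical, and the pattern assumes quotes are balanced and never escaped.

```d
import std.array : replace;
import std.regex : regex, replaceAll;

/// Replaces every comma inside a double-quoted section with a space.
/// Assumption: quotes are balanced and never escaped.
string blankQuotedCommas(string line)
{
    // Match one quoted section at a time, quotes included.
    auto quoted = regex(`"[^"]*"`);
    // For each match, rewrite only the matched text (m.hit).
    return line.replaceAll!(m => m.hit.replace(",", " "))(quoted);
}

void main()
{
    auto line = `foo, bar, "hello, user", baz`;
    assert(blankQuotedCommas(line) == `foo, bar, "hello  user", baz`);
}
```

The regex only locates the quoted parts; the comma replacement itself is plain `std.array.replace`, so text outside the quotes is left untouched.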

Ali



How to replace inside regex?

2015-12-17 Thread Suliman via Digitalmars-d-learn

I can't work out how to do a replacement with regex. I have the following task:
find all commas in strings inside quotes and replace them.

foo, bar, "hello, user", baz

I wrote a regexp that finds the part that includes commas inside the 
quotes:

auto partWithComma = matchAll(line, r);

but I can't work out how to replace the commas here. My only 
idea is to do something like:


auto partWithComma = matchAll(line, r).replaceAll(",", " ");







Re: failing regex

2015-11-23 Thread Rikki Cattermole via Digitalmars-d-learn

On 23/11/15 9:30 PM, yawniek wrote:

regex from
https://github.com/ua-parser/uap-core/blob/master/regexes.yaml#L38
seems to work in other languages, not so in D:

auto r2 = r"(?:\/[A-Za-z0-9\.]+)? *([A-Za-z0-9
_\!\[\]:]*(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]*))
(\d+)(?:\.(\d+)(?:\.(\d+))?)?".regex();

( https://gist.github.com/4334e35e68497c0517db )

results in

```
dmd -run failing_regex.d
std.regex.internal.ir.RegexException@/usr/local/Cellar/dmd/2.069.0/include/d2/std/regex/internal/parser.d(1392):
invalid escape sequence
Pattern with error: `(?:\/[A-Za-z0-9\.]+)? *([A-Za-z0-9 _\!` <--HERE--
`\[\]:]*(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]*))
(\d+)(?:\.(\d+)(?:\.(\d+))?)?`

4   dmd_run68HuB5   0x0001044d211d @trusted void
std.regex.internal.parser.Parser!(immutable(char)[]).Parser.error(immutable(char)[])
+ 297
5   dmd_run68HuB5   0x0001044da604 ref @trusted
std.regex.internal.parser.Parser!(immutable(char)[]).Parser
std.regex.internal.parser.Parser!(immutable(char)[]).Parser.__ctor!(const(char)[]).__ctor(immutable(char)[],
const(char)[]) + 160
6   dmd_run68HuB5   0x0001044cc732 @safe
std.regex.internal.ir.Regex!(char).Regex
std.regex.regexImpl!(immutable(char)[]).regexImpl(immutable(char)[],
const(char)[]) + 86
7   dmd_run68HuB5   0x0001044e944f
std.regex.internal.ir.Regex!(char).Regex
std.functional.__T7memoizeS95_D3std5regex18__T9regexImplTAyaZ9regexImplFNfAyaAxaZS3std5regex8internal2ir12__T5RegexTaZ5RegexVii8Z.memoize(immutable(char)[],
const(char)[]) + 475
8   dmd_run68HuB5   0x0001044cc6bc @trusted
std.regex.internal.ir.Regex!(char).Regex
std.regex.regex!(immutable(char)[]).regex(immutable(char)[],
const(char)[]) + 64
9   dmd_run68HuB5   0x0001044cc5de _Dmain + 46
10  dmd_run68HuB5   0x000104509ac3
D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ6runAllMFZ9__lambda1MFZv + 39
11  dmd_run68HuB5   0x0001045099fb void
rt.dmain2._d_run_main(int, char**, extern (C) int
function(char[][])*).tryExec(scope void delegate()) + 55
12  dmd_run68HuB5   0x000104509a68 void
rt.dmain2._d_run_main(int, char**, extern (C) int
function(char[][])*).runAll() + 44
13  dmd_run68HuB5   0x0001045099fb void
rt.dmain2._d_run_main(int, char**, extern (C) int
function(char[][])*).tryExec(scope void delegate()) + 55
14  dmd_run68HuB5   0x00010450994d _d_run_main +
497
15  dmd_run68HuB5   0x0001044cc677 main + 15
16  libdyld.dylib   0x7fff8e5185c8 start + 0
17  ??? 0x 0x0 + 0

```


bug or did i do something wrong?


It's the \! inside the character class.
Don't escape the !.
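
A reduced sketch of the fix, using only the character class from the pattern above with the `!` written literally. The helper name `firstRun` and the sample input are made up for illustration.

```d
import std.regex : matchFirst, regex;

/// Returns the first run of characters drawn from the character class
/// in the thread's pattern. Assumes `s` contains at least one such character.
string firstRun(string s)
{
    // '!' is written literally; '[' and ']' are still escaped.
    auto re = regex(r"[A-Za-z0-9 _!\[\]:]+");
    return matchFirst(s, re).hit;
}

void main()
{
    assert(firstRun("Yahoo! Slurp/3.0") == "Yahoo! Slurp");
}
```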


Re: regex format string problem

2015-11-23 Thread Rikki Cattermole via Digitalmars-d-learn

On 23/11/15 9:22 PM, yawniek wrote:

Hi Rikki,

On Monday, 23 November 2015 at 03:57:06 UTC, Rikki Cattermole wrote:

I take it that browscap[0] does not do what you want?
I have a generator at [1].
Feel free to steal.


This looks interesting, thanks for the hint. However it might be a bit
limited,
i have 15M+ different User Agents with all kind of weird cases,
sometimes not even the extensive ua-core regexs work. (if you're
interested for testing let me know)


Also once you do get yours working, you'll want to use ctRegex and
generate a file with all of them in it. That'll increase performance
significantly.


that was my plan.


Regarding regex, if you want a named sub part use:
(?P<name>[a-z]*)
Where name and [a-z]* are just examples.

I would recommend learning how input ranges work. They are how you
get the matches out, e.g.

auto rgx = ctRegex!`([a-z])[123]`;
foreach(match; "b3".matchAll(rgx)) {
writeln(match.hit);
}


i'm aware how this works, the problem is a different  one:

i do have a second string that contains $n's which can occur in any order.
now of course i can just go and write another regex and replace it, job
done.
but from looking at std.regex this seems to be built in, i just failed
to get it to work properly, see my gist. i hoped this to be a 1liner.


So like this?

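import std.regex;
import std.stdio : readln, writeln, write, stdout;
import std.string : chomp;

auto REG = ctRegex!(`(\S+)(?: (.*))?`);

void main() {
    for (;;) {
        write("> ");
        stdout.flush;
        // chomp strips the trailing newline and is safe on the empty
        // string that readln returns at end of input
        string line = readln().chomp;

        if (line.length == 0)
            return;

        // $1 in the format string refers to the first capture group
        writeln("< ", line.replaceAll(REG, "Unknown program: $1"));
    }
}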
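
If the goal is only the formatted string, with nothing left over from the input, one possible sketch is to anchor the pattern so the match spans the whole input and let `replaceFirst` expand the `$n` placeholders. The helper name `formatAgent`, the pattern and the sample input are assumptions for illustration, not taken from ua-parser.

```d
import std.regex : regex, replaceFirst;

/// Hypothetical helper: renders `fmt` (with $1, $2, ... placeholders)
/// from the captures of a whole-string match of `ua`.
/// If `ua` does not match, it is returned unchanged.
string formatAgent(string ua, string fmt)
{
    // Anchored with ^ and $ so the replacement covers the entire input.
    auto re = regex(`^(\w+)/(\d+)\.(\d+)$`);
    return ua.replaceFirst(re, fmt);
}

void main()
{
    assert(formatAgent("Firefox/42.0", "$1 v$2.$3") == "Firefox v42.0");
    // The placeholders may appear in any order.
    assert(formatAgent("Firefox/42.0", "$3-$2-$1") == "0-42-Firefox");
}
```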



failing regex

2015-11-23 Thread yawniek via Digitalmars-d-learn

regex from
https://github.com/ua-parser/uap-core/blob/master/regexes.yaml#L38
seems to work in other languages, not so in D:

auto r2 = r"(?:\/[A-Za-z0-9\.]+)? *([A-Za-z0-9 
_\!\[\]:]*(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]*)) (\d+)(?:\.(\d+)(?:\.(\d+))?)?".regex();


( https://gist.github.com/4334e35e68497c0517db )

results in

```
dmd -run failing_regex.d
std.regex.internal.ir.RegexException@/usr/local/Cellar/dmd/2.069.0/include/d2/std/regex/internal/parser.d(1392):
 invalid escape sequence
Pattern with error: `(?:\/[A-Za-z0-9\.]+)? *([A-Za-z0-9 _\!` 
<--HERE-- 
`\[\]:]*(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]*)) (\d+)(?:\.(\d+)(?:\.(\d+))?)?`


4   dmd_run68HuB5   0x0001044d211d 
@trusted void 
std.regex.internal.parser.Parser!(immutable(char)[]).Parser.error(immutable(char)[]) + 297
5   dmd_run68HuB5   0x0001044da604 ref 
@trusted 
std.regex.internal.parser.Parser!(immutable(char)[]).Parser 
std.regex.internal.parser.Parser!(immutable(char)[]).Parser.__ctor!(const(char)[]).__ctor(immutable(char)[], const(char)[]) + 160
6   dmd_run68HuB5   0x0001044cc732 @safe 
std.regex.internal.ir.Regex!(char).Regex 
std.regex.regexImpl!(immutable(char)[]).regexImpl(immutable(char)[], const(char)[]) + 86
7   dmd_run68HuB5   0x0001044e944f 
std.regex.internal.ir.Regex!(char).Regex 
std.functional.__T7memoizeS95_D3std5regex18__T9regexImplTAyaZ9regexImplFNfAyaAxaZS3std5regex8internal2ir12__T5RegexTaZ5RegexVii8Z.memoize(immutable(char)[], const(char)[]) + 475
8   dmd_run68HuB5   0x0001044cc6bc 
@trusted std.regex.internal.ir.Regex!(char).Regex 
std.regex.regex!(immutable(char)[]).regex(immutable(char)[], 
const(char)[]) + 64
9   dmd_run68HuB5   0x0001044cc5de _Dmain 
+ 46
10  dmd_run68HuB5   0x000104509ac3 
D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ6runAllMFZ9__lambda1MFZv + 39
11  dmd_run68HuB5   0x0001045099fb void 
rt.dmain2._d_run_main(int, char**, extern (C) int 
function(char[][])*).tryExec(scope void delegate()) + 55
12  dmd_run68HuB5   0x000104509a68 void 
rt.dmain2._d_run_main(int, char**, extern (C) int 
function(char[][])*).runAll() + 44
13  dmd_run68HuB5   0x0001045099fb void 
rt.dmain2._d_run_main(int, char**, extern (C) int 
function(char[][])*).tryExec(scope void delegate()) + 55
14  dmd_run68HuB5   0x00010450994d 
_d_run_main + 497
15  dmd_run68HuB5   0x0001044cc677 main + 
15
16  libdyld.dylib   0x7fff8e5185c8 start 
+ 0

17  ??? 0x 0x0 + 0

```


bug or did i do something wrong?



Re: regex format string problem

2015-11-23 Thread yawniek via Digitalmars-d-learn

Hi Rikki,

On Monday, 23 November 2015 at 03:57:06 UTC, Rikki Cattermole 
wrote:

I take it that browscap[0] does not do what you want?
I have a generator at [1].
Feel free to steal.


This looks interesting, thanks for the hint. However it might be 
a bit limited,
i have 15M+ different User Agents with all kind of weird cases, 
sometimes not even the extensive ua-core regexs work. (if you're 
interested for testing let me know)


Also once you do get yours working, you'll want to use ctRegex 
and generate a file with all of them in it. That'll increase 
performance significantly.


that was my plan.


Regarding regex, if you want a named sub part use:
(?P<name>[a-z]*)
Where name and [a-z]* are just examples.

I would recommend learning how input ranges work. They are 
how you get the matches out, e.g.


auto rgx = ctRegex!`([a-z])[123]`;
foreach(match; "b3".matchAll(rgx)) {
writeln(match.hit);
}


i'm aware how this works, the problem is a different  one:

i do have a second string that contains $n's which can occur in 
any order.
now of course i can just go and write another regex and replace 
it, job done.
but from looking at std.regex this seems to be built in, i just 
failed to get it to work properly, see my gist. i hoped this to 
be a 1liner.





Re: regex format string problem

2015-11-22 Thread Rikki Cattermole via Digitalmars-d-learn

On 23/11/15 12:41 PM, yawniek wrote:

hi!

how can i format  a string with captures from a regular expression?
basically make this pass:
https://gist.github.com/f17647fb2f8ff2261d42


context: i'm trying to write an implementation for
https://github.com/ua-parser
where the regular expression as well as the format strings are given.


I take it that browscap[0] does not do what you want?
I have a generator at [1].
Feel free to steal.

Also once you do get yours working, you'll want to use ctRegex and 
generate a file with all of them in it. That'll increase performance 
significantly.


Regarding regex, if you want a named sub part use:
(?P<name>[a-z]*)
Where name and [a-z]* are just examples.

I would recommend learning how input ranges work. They are how you 
get the matches out, e.g.


auto rgx = ctRegex!`([a-z])[123]`;
foreach(match; "b3".matchAll(rgx)) {
writeln(match.hit);
}

Or something along those lines, I did it off the top of my head.

[0] 
https://github.com/rikkimax/Cmsed/blob/master/tools/browser_detection/browscap.ini
[1] 
https://github.com/rikkimax/Cmsed/blob/master/tools/browser_detection/generator.d




regex format string problem

2015-11-22 Thread yawniek via Digitalmars-d-learn

hi!

how can i format  a string with captures from a regular 
expression?

basically make this pass:
https://gist.github.com/f17647fb2f8ff2261d42


context: i'm trying to write an implementation for 
https://github.com/ua-parser
where the regular expression as well as the format strings are 
given.





Re: Error in trying to use an inout(char)[] with a regex

2015-10-16 Thread Shriramana Sharma via Digitalmars-d-learn
Ali Çehreli wrote:

> Meanwhile, can you try the following template which works at least for
> the reduced code:
> 
> import std.range;
> import std.traits : isSomeString;
> 
> auto foo(S)(S text)
> if (isSomeString!S) {
> import std.regex;
> static auto inlineRE = ctRegex!`\$\(ta (.*?)\)`;
> return text.replaceAll!(m => textAttr(m[1]))(inlineRE);
> }

Well I didn't think isSomeString was appropriate since it also allows 
wstring-s and dstring-s which I'm not sure my other functions will work well 
with, but the following worked nicely for me: thank you!

auto applyTextAttr(T)(T text) if (is(T: const(char)[]))
{
import std.regex;
static auto inlineRE = ctRegex!`\$\(ta (.*?)\)`;
return text.replaceAll!(m => textAttr(m[1]))(inlineRE);
}
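
A self-contained sketch of the same idea, for anyone who wants to try it. The `textAttr` here is a made-up stand-in that just wraps its argument in angle brackets. The original error comes from `inout(char)[]` ending up as the type of a field inside the regex structs, which the compiler rejects because inout is only allowed on parameters and stack variables; with a template, `T` is a concrete type such as `string` or `const(char)[]`, so the problem does not arise.

```d
import std.format : format;
import std.regex : ctRegex, replaceAll;

// Hypothetical stand-in for the thread's textAttr.
string textAttr(const(char)[] spec)
{
    return format("<%s>", spec);
}

// T is deduced per call, so no inout type reaches std.regex.
auto applyTextAttr(T)(T text) if (is(T : const(char)[]))
{
    static auto inlineRE = ctRegex!`\$\(ta (.*?)\)`;
    return text.replaceAll!(m => textAttr(m[1]))(inlineRE);
}

void main()
{
    string s = "a $(ta u g) b";
    const(char)[] c = "a $(ta off) b";
    assert(applyTextAttr(s) == "a <u g> b");
    assert(applyTextAttr(c) == "a <off> b");
}
```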

-- 
Shriramana Sharma, Penguin #395953


Re: Error in trying to use an inout(char)[] with a regex

2015-10-16 Thread Ali Çehreli via Digitalmars-d-learn

On 10/16/2015 07:06 PM, Shriramana Sharma wrote:

> /usr/include/dmd/phobos/std/regex/package.d(557): Error: variable
> std.regex.RegexMatch!(inout(char)[], 
BacktrackingMatcher).RegexMatch._input

> only parameters or stack based variables can be inout

Reduced:

inout(char)[] foo(inout(char)[] text) {
import std.regex;
static auto inlineRE = ctRegex!`\$\(ta (.*?)\)`;
return text.replaceAll!(m => textAttr(m[1]))(inlineRE);
}

void main() {
}

This may be an oversight in the regex module that it may not be 
well-tested with inout data. If others agree, please open a bug report.


Meanwhile, can you try the following template which works at least for 
the reduced code:


import std.range;
import std.traits : isSomeString;

auto foo(S)(S text)
if (isSomeString!S) {
import std.regex;
static auto inlineRE = ctRegex!`\$\(ta (.*?)\)`;
return text.replaceAll!(m => textAttr(m[1]))(inlineRE);
}

void main() {
}

Ali



Error in trying to use an inout(char)[] with a regex

2015-10-16 Thread Shriramana Sharma via Digitalmars-d-learn
Hello. Please see the following code:

import std.range;
string textAttr(T)(T spec) if (is(ElementType!(T): const(char)[]))
{
return "";

// made dummy for illustration; actually this runs a foreach loop on the
// individual items in spec, analyses them and passes them to
// appropriate subroutines for processing and returns their output to 
// the caller
//
// made into a separate function taking either a range or array so that
// it can be passed either the output of splitter or args[1 .. $]
}

string textAttr(const(char)[] spec)
{
import std.algorithm: splitter;
return textAttr(splitter(spec));
}

inout(char)[] applyTextAttr(inout(char)[] text)
{
import std.regex;
static auto inlineRE = ctRegex!`\$\(ta (.*?)\)`;
return text.replaceAll!(m => textAttr(m[1]))(inlineRE);
}

void main(string [] args)
{
import std.stdio;
if (args.length == 1) return;
alias ta = textAttr;
writeln(ta(args[1 .. $]), "text1", ta("u g /w"), "text2", ta("off"));
}

Now upon trying to compile this I'm getting the errors:

$ dmd inout_test.d
/usr/include/dmd/phobos/std/regex/package.d(557): Error: variable 
std.regex.RegexMatch!(inout(char)[], BacktrackingMatcher).RegexMatch._input 
only parameters or stack based variables can be inout
/usr/include/dmd/phobos/std/regex/package.d(374): Error: variable 
std.regex.Captures!(inout(char)[], ulong).Captures._input only parameters or 
stack based variables can be inout
/usr/include/dmd/phobos/std/regex/package.d(419): Error: inout on return 
means inout must be on a parameter as well for @property R()
/usr/include/dmd/phobos/std/regex/package.d(425): Error: inout on return 
means inout must be on a parameter as well for @property R()
/usr/include/dmd/phobos/std/regex/package.d(431): Error: inout on return 
means inout must be on a parameter as well for @property R()
/usr/include/dmd/phobos/std/regex/package.d(438): Error: inout on return 
means inout must be on a parameter as well for @property R()
/usr/include/dmd/phobos/std/regex/package.d(445): Error: inout on return 
means inout must be on a parameter as well for @property R()
/usr/include/dmd/phobos/std/regex/package.d(558): Error: template instance 
std.regex.Captures!(inout(char)[], ulong) error instantiating
/usr/include/dmd/phobos/std/regex/package.d(684):instantiated from 
here: RegexMatch!(inout(char)[], BacktrackingMatcher)
/usr/include/dmd/phobos/std/regex/package.d(878):instantiated from 
here: matchMany!(BacktrackingMatcher, StaticRegex!char, inout(char)[])
/usr/include/dmd/phobos/std/regex/package.d(751):instantiated from 
here: matchAll!(inout(char)[], StaticRegex!char)
/usr/include/dmd/phobos/std/regex/package.d(1210):instantiated from 
here: replaceAllWith!((m, sink) => sink.put(fun(m)), matchAll, inout(char)
[], StaticRegex!char)
inout_test.d(22):instantiated from here: replaceAll!((m) => 
textAttr(m[1]), inout(char)[], StaticRegex!char)
/usr/include/dmd/phobos/std/regex/package.d(599): Error: inout on return 
means inout must be on a parameter as well for @property R()
/usr/include/dmd/phobos/std/regex/package.d(605): Error: inout on return 
means inout must be on a parameter as well for @property R()
/usr/include/dmd/phobos/std/regex/package.d(611): Error: inout on return 
means inout must be on a parameter as well for @property R()

I'm totally not clear as to what the error about inout means, and how I can 
fix this. Please advise. Thanks!

-- 
Shriramana Sharma, Penguin #395953



Re: Regex start/end position of match?

2015-10-01 Thread Gerald via Digitalmars-d-learn
Thanks Adam, that was the hint I needed. For a given RegexMatch, 
pre.length is essentially the start position, and 
pre.length + hit.length gives the end position, so I think this 
should be OK for my needs.
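
A minimal sketch of that approach. The helper name `matchSpans` is hypothetical, and the offsets are in code units (bytes for a `string`), not characters.

```d
import std.regex : matchAll, regex;
import std.typecons : Tuple, tuple;

/// Collects the [start, end) code-unit offsets of every match of
/// `pattern` in `line`.
Tuple!(size_t, size_t)[] matchSpans(string line, string pattern)
{
    Tuple!(size_t, size_t)[] spans;
    foreach (m; matchAll(line, regex(pattern)))
    {
        // pre is everything in the input before this match.
        size_t start = m.pre.length;
        spans ~= tuple(start, start + m.hit.length);
    }
    return spans;
}

void main()
{
    auto line = "foo bar foo";
    auto spans = matchSpans(line, "foo");
    assert(spans.length == 2);
    assert(spans[0][0] == 0 && spans[0][1] == 3);
    assert(spans[1][0] == 8 && spans[1][1] == 11);
    // The offsets slice the original line back to the matched text.
    assert(line[spans[1][0] .. spans[1][1]] == "foo");
}
```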


Re: Regex start/end position of match?

2015-10-01 Thread Adam D. Ruppe via Digitalmars-d-learn

On Thursday, 1 October 2015 at 03:29:29 UTC, Gerald wrote:

I'm stuck though on how to get the start/end index of a match?


I couldn't find one either so I did the pre/post/hit things 
broken up.


Take a look at this little program I wrote:

http://arsdnet.net/dcode/replacer/

All the files it needs are there, so you can browse by http or 
download them somewhere and compile+run to give it a try.


Lines 73-78 show the pre/hit/post thing, with me changing the 
colors of the output so they match.


Afterward, I replace one instance and draw it again to show what 
the new line would look like. (The point of my program is to 
interactively request confirmation of every individual change you 
are going to make.)



Something similar will probably work for you too.


Regex start/end position of match?

2015-09-30 Thread Gerald via Digitalmars-d-learn
I'm using the std.regex API as part of a Linux GUI grep utility I'm 
trying to create. I've got the GUI going fine using gtkd, the 
code to iterate over files (wow that was succinct in D, very 
impressive!), and getting matches via regex using the matchAll 
function.


I'm stuck though on how to get the start/end index of a match? 
Looking at RegexMatch, I don't see an obvious way to get this? 
I'll admit that coming from Java I'm not very comfortable with 
the Range concept in D, however I read the Range and Regex D 
documentation as well as Andrei's article on ranges referenced in 
the docs and I'm still not seeing what I'm missing.


By comparison, the Java API includes the start/end match index in 
the MatchResult interface. I have a feeling I'm not grokking 
something fundamental about Ranges in D, what am I missing?


BTW in case anyone asks, the reason I'm looking for this is to 
highlight the matches when displaying matching lines in the GUI.


Re: Regex-Fu

2015-05-25 Thread Chris via Digitalmars-d-learn

On Monday, 25 May 2015 at 11:20:46 UTC, novice2 wrote:

I cannot get the longest possible

it matches the longest for the first group ([a-z]+)

try

^([a-z]+?)(hula|ula)$


Namespace, novice2:

Ah, I see. The problem was with the first group that was too 
greedy, not with the second. I was focusing on the latter. 
Thanks, this works now!


Re: Regex-Fu

2015-05-25 Thread novice2 via Digitalmars-d-learn

I cannot get the longest possible

it matches the longest for the first group ([a-z]+)

try

^([a-z]+?)(hula|ula)$



Re: Regex-Fu

2015-05-25 Thread Namespace via Digitalmars-d-learn

On Monday, 25 May 2015 at 11:11:50 UTC, Chris wrote:
I'm a bit at a loss here. I cannot get the longest possible 
match. I tried several versions with eager operators and stuff, 
but D's regex engine(s) always seem to return the shortest 
match. Is there something embarrassingly simple I'm missing?


void main()
{
  import std.regex : regex, matchFirst;
  import std.stdio : writeln;

  auto word = "blablahula";
  auto m = matchFirst(word, regex("^([a-z]+)(hula|ula)$"));
  writeln(m);  // prints ["blablahula", "blablah", "ula"]
}

I want it to return "hula" not "ula".


Make the + operator less greedy:

matchFirst(word, regex("^([a-z]+?)(hula|ula)$"));
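
A small sketch comparing the two patterns side by side. The helper name `groups` is hypothetical, and it assumes the pattern matches and has two capture groups.

```d
import std.regex : matchFirst, regex;

/// Returns capture groups 1 and 2 of the first match.
/// Assumption: the pattern matches and has at least two groups.
string[] groups(string word, string pattern)
{
    auto m = matchFirst(word, regex(pattern));
    return [m[1], m[2]];
}

void main()
{
    // Greedy +: the first group takes as much as it can, leaving "ula".
    assert(groups("blablahula", "^([a-z]+)(hula|ula)$") == ["blablah", "ula"]);
    // Reluctant +?: the first group takes as little as it can, leaving "hula".
    assert(groups("blablahula", "^([a-z]+?)(hula|ula)$") == ["blabla", "hula"]);
}
```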


Regex-Fu

2015-05-25 Thread Chris via Digitalmars-d-learn
I'm a bit at a loss here. I cannot get the longest possible 
match. I tried several versions with eager operators and stuff, 
but D's regex engine(s) always seem to return the shortest match. 
Is there something embarrassingly simple I'm missing?


void main()
{
  import std.regex : regex, matchFirst;
  import std.stdio : writeln;

  auto word = "blablahula";
  auto m = matchFirst(word, regex("^([a-z]+)(hula|ula)$"));
  writeln(m);  // prints ["blablahula", "blablah", "ula"]
}

I want it to return "hula" not "ula".


Re: Degenerate Regex Case

2015-04-26 Thread Guillaume via Digitalmars-d-learn
On Saturday, 25 April 2015 at 09:30:55 UTC, Dmitry Olshansky 
wrote:


A quick investigation shows that it gets stuck at the end of 
pattern compilation stage.


The problem is that as a last pass D's regex goes to optimize 
the pattern to construct simple bit-scanning engine as 
approximation for prefix of original pattern. And that process 
is a lot like Thompson NFA ... _BUT_ the trick of merging 
equivalent threads wasn't applied there.


So in short: file a bug, optimizer absolutely should do 
de-duplication of threads.


---
Dmitry Olshansky


Thanks for your help, I'll go file a bug.


Re: Degenerate Regex Case

2015-04-25 Thread Dmitry Olshansky via Digitalmars-d-learn

On Friday, 24 April 2015 at 18:28:16 UTC, Guillaume wrote:
Hello, I'm trying to make a regex comparison with D, based off 
of this article: https://swtch.com/~rsc/regexp/regexp1.html


I've written my code like so:

import std.stdio, std.regex;

void main(string[] argv) {

string m = argv[1];
	auto p = 
ctRegex!("a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaa");

if (match(m, p)) {
writeln("match");
} else {
writeln("no match");
}

}




And the compiler goes into swap. Doing it at runtime is no 
better. I was under the impression that this particular regex 
was used for showcasing the Thompson NFA which D claims to be 
using.




A quick investigation shows that it gets stuck at the end of 
pattern compilation stage.


The problem is that as a last pass D's regex goes to optimize the 
pattern to construct simple bit-scanning engine as approximation 
for prefix of original pattern. And that process is a lot like 
Thompson NFA ... _BUT_ the trick of merging equivalent threads 
wasn't applied there.


So in short: file a bug, optimizer absolutely should do 
de-duplication of threads.



The golang code version of this runs fine, which makes me think 
that maybe D isn't using the correct regex engine for this 
particular regex. Or perhaps I'm using this wrong?


It uses 2 kinds of engines, run-time one is Thompson NFA. 
Compile-time is (for now) still backtracking.


---
Dmitry Olshansky


Re: Degenerate Regex Case

2015-04-24 Thread TheFlyingFiddle via Digitalmars-d-learn

On Friday, 24 April 2015 at 18:28:16 UTC, Guillaume wrote:
Hello, I'm trying to make a regex comparison with D, based off 
of this article: https://swtch.com/~rsc/regexp/regexp1.html


I've written my code like so:

import std.stdio, std.regex;

void main(string[] argv) {

string m = argv[1];
	auto p = 
ctRegex!("a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaa");

if (match(m, p)) {
writeln("match");
} else {
writeln("no match");
}

}

And the compiler goes into swap. Doing it at runtime is no 
better. I was under the impression that this particular regex 
was used for showcasing the Thompson NFA which D claims to be 
using.


The regex 
"a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaa" 
can be written as a counted repetition (if I counted correctly it is 30 
optional a's followed by "aaa", so strictly "a{3,33}"); I tried "a{30,60}".


The regex "a{30,60}" works fine.

[Speculation]
I don't have a good understanding of how D's regex engine work 
but I am guessing that it does not do any simplification of the 
regex input causing it to generate larger engines for each 
additional ? symbol. Thus needing more memory. Eventually as in 
this case the compiler runs out of memory.
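
As a sketch of the counted form: assuming the pattern really is 30 optional a's followed by "aaa", it accepts between 3 and 33 a's, which can be written as `a{3,33}`. The helper name `matchesCounted` is made up, and the anchors are added here only to make the bounds easy to check.

```d
import std.array : replicate;
import std.regex : matchFirst, regex;

/// True if a string of `n` letters 'a' is matched in full by a{3,33}.
bool matchesCounted(size_t n)
{
    // ^ and $ force the whole input to be consumed by the repetition.
    auto re = regex("^a{3,33}$");
    return !matchFirst("a".replicate(n), re).empty;
}

void main()
{
    assert(!matchesCounted(2));
    assert(matchesCounted(3));
    assert(matchesCounted(33));
    assert(!matchesCounted(34));
}
```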







Degenerate Regex Case

2015-04-24 Thread Guillaume via Digitalmars-d-learn
Hello, I'm trying to make a regex comparison with D, based off of 
this article: https://swtch.com/~rsc/regexp/regexp1.html


I've written my code like so:

import std.stdio, std.regex;

void main(string[] argv) {

string m = argv[1];
	auto p = 
ctRegex!("a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaa");

if (match(m, p)) {
writeln("match");
} else {
writeln("no match");
}

}

And the compiler goes into swap. Doing it at runtime is no 
better. I was under the impression that this particular regex was 
used for showcasing the Thompson NFA which D claims to be using.


The golang code version of this runs fine, which makes me think 
that maybe D isn't using the correct regex engine for this 
particular regex. Or perhaps I'm using this wrong?


Re: regex on binary data

2014-12-31 Thread ketmar via Digitalmars-d-learn
On Wed, 31 Dec 2014 15:36:16 +
Darrell via Digitalmars-d-learn 
wrote:

> So far attempts to run regex on binary data causes
> "Invalid UTF-8 sequence".
> 
> Attempts to pass ubyte also didn't work out.

current regex engine assumes that you are using UTF-8 encoded text. i
really want regex engine to support user-supplied input ranges instead,
so decoding can be done by range (and regex engine can work on
anything, not only on strings), but i'm not ready for that challenge
yet. maybe i'll try to do something with it in 2015. ;-)




Re: regex on binary data

2014-12-31 Thread Tobias Pankrath via Digitalmars-d-learn

On Wednesday, 31 December 2014 at 15:36:19 UTC, Darrell wrote:

So far attempts to run regex on binary data causes
"Invalid UTF-8 sequence".

Attempts to pass ubyte also didn't work out.



I doubt using anything except (d,w)string is supported or 
possible.


regex on binary data

2014-12-31 Thread Darrell via Digitalmars-d-learn

So far attempts to run regex on binary data causes
"Invalid UTF-8 sequence".

Attempts to pass ubyte also didn't work out.


