Using regular expressions when reading a file

2022-05-05 Thread Alexander Zhirov via Digitalmars-d-learn
I want to use a configuration file with external settings. I'm 
trying to use regular expressions to read the `Property = Value` 
settings. I would like to do it all more beautifully. Is there 
any way to get rid of the line break character? How much does 
everything look "right"?


**settings.conf:**

```sh
host = 127.0.0.1
port = 5432
dbname = database
user = postgres
```

**code:**

```d
auto file = File("settings.conf", "r");
string[string] properties;
auto p_property = regex(r"^\w+ *= *.+", "s");
while (!file.eof())
{
  string line = file.readln();
  auto m = matchAll(line, p_property);
  if (!m.empty())
  {
    string property = matchAll(line, regex(r"^\w+", "m")).hit;
    string value = replaceAll(line, regex(r"^\w+ *= *", "m"), "");
    properties[property] = value;
  }
}
file.close();
writeln(properties);
```

**output:**

```sh
["host":"127.0.0.1\n", "dbname":"mydb\n", "user":"postgres", 
"port":"5432\n"]

```


Re: Using regular expressions when reading a file

2022-05-05 Thread H. S. Teoh via Digitalmars-d-learn
On Thu, May 05, 2022 at 05:53:57PM +0000, Alexander Zhirov via 
Digitalmars-d-learn wrote:
> I want to use a configuration file with external settings. I'm trying
> to use regular expressions to read the `Property = Value` settings. I
> would like to do it all more beautifully. Is there any way to get rid
> of the line break character? How much does everything look "right"?
[...]
> ```d
> auto file = File("settings.conf", "r");
> string[string] properties;
> auto p_property = regex(r"^\w+ *= *.+", "s");
> while (!file.eof())
> {
>   string line = file.readln();
>   auto m = matchAll(line, p_property);
>   if (!m.empty())
>   {
>     string property = matchAll(line, regex(r"^\w+", "m")).hit;
>     string value = replaceAll(line, regex(r"^\w+ *= *", "m"), "");
>     properties[property] = value;
>   }
> }

Your regex already matches the `Property = Value` pattern; why not just
use captures to extract the relevant parts of the match, instead of
doing it all over again inside the if-statement?

    // I added captures (parentheses) to extract the property name
    // and value directly from the pattern.
    auto p_property = regex(r"^(\w+) *= *(.+)", "s");

    // I assume you only want one `Property = Value` pair per input
    // line, so you really don't need matchAll; matchFirst will do
    // the job.
    auto m = matchFirst(line, p_property);

    if (m) {
        // No need to run a match again, just extract the
        // captures
        string property = m[1];
        string value = m[2];
        properties[property] = value;
    }


T

-- 
"You are a very disagreeable person." "NO."


Re: Using regular expressions when reading a file

2022-05-05 Thread Alexander Zhirov via Digitalmars-d-learn

On Thursday, 5 May 2022 at 18:15:28 UTC, H. S. Teoh wrote:

auto m = matchFirst(line, p_property);


Yes, it looks more attractive. Thanks! I just don't quite 
understand how `matchFirst` works. I seem to have read the 
[description](https://dlang.org/phobos/std_regex.html#Captures), 
but I can't understand something.


And yet I have to manually remove the line break:
```sh
["host":"192.168.100.236\n", "dbname":"belpig\n", 
"user":"postgres", "port":"5432\n"]

```
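
On how `matchFirst` works: a minimal sketch, assuming the behaviour documented for `std.regex.Captures`. `matchFirst` returns a `Captures` object for the first match only; index 0 (also available as `.hit`) is the whole match, and indices 1, 2, ... are the parenthesised groups in order. The input lines below are made up for illustration.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

void main()
{
    // Pattern with two capture groups: property name and value.
    auto p_property = regex(r"^(\w+) *= *(.+)");

    // Illustrative input line without a terminator.
    auto m = matchFirst("port = 5432", p_property);

    assert(!m.empty);               // a match was found
    assert(m.hit == "port = 5432"); // whole match, same as m[0]
    assert(m[1] == "port");         // first group
    assert(m[2] == "5432");         // second group

    // A line that does not fit the pattern yields an empty Captures.
    auto none = matchFirst("# just a comment", p_property);
    assert(none.empty);

    writeln(m[1], " -> ", m[2]);    // prints: port -> 5432
}
```

Because an empty `Captures` converts to `false`, `if (m) { ... }` is enough to guard the access to `m[1]` and `m[2]`.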


Re: Using regular expressions when reading a file

2022-05-05 Thread H. S. Teoh via Digitalmars-d-learn
On Thu, May 05, 2022 at 06:50:17PM +0000, Alexander Zhirov via 
Digitalmars-d-learn wrote:
> On Thursday, 5 May 2022 at 18:15:28 UTC, H. S. Teoh wrote:
> > auto m = matchFirst(line, p_property);
> 
> Yes, it looks more attractive. Thanks! I just don't quite understand how
> `matchFirst` works. I seem to have read the
> [description](https://dlang.org/phobos/std_regex.html#Captures), but I can't
> understand something.
> 
> And yet I have to manually remove the line break:
> ```sh
> ["host":"192.168.100.236\n", "dbname":"belpig\n", "user":"postgres",
> "port":"5432\n"]
> ```

You don't have to. Just add a `$` to the end of your regex, and it
should match the newline. If you put it outside the capture parentheses,
it will not be included in the value.


T

-- 
In a world without fences, who needs Windows and Gates? -- Christian Surchi


Re: Using regular expressions when reading a file

2022-05-05 Thread Alexander Zhirov via Digitalmars-d-learn

On Thursday, 5 May 2022 at 18:58:41 UTC, H. S. Teoh wrote:
You don't have to. Just add a `$` to the end of your regex, and 
it should match the newline. If you put it outside the capture 
parentheses, it will not be included in the value.


In fact, it turned out to be much easier. It was just necessary 
to use the `m` flag instead of the `s` flag:


```d
auto p_property = regex(r"^(\w+) *= *(.+)", "m");
```
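
A likely explanation, going by the flag descriptions in the `std.regex` documentation: the `s` flag makes `.` match `\n` as well, so `(.+)` swallows the terminator that `readln()` leaves on the line. Without `s`, `.` stops at the newline; the `m` flag itself only changes where `^` and `$` may match. A small sketch with a made-up input line:

```d
import std.regex : matchFirst, regex;

void main()
{
    // A line as readln() would return it, terminator included.
    string line = "port = 5432\n";

    // "s": '.' also matches '\n', so the value keeps the line break.
    auto withS = matchFirst(line, regex(r"^(\w+) *= *(.+)", "s"));
    assert(withS[2] == "5432\n");

    // No flags: '.' does not match '\n', so the value ends before it.
    auto plain = matchFirst(line, regex(r"^(\w+) *= *(.+)"));
    assert(plain[2] == "5432");

    // "m" gives the same value here, since '.' still excludes '\n'.
    auto withM = matchFirst(line, regex(r"^(\w+) *= *(.+)", "m"));
    assert(withM[2] == "5432");
}
```

So dropping the `s` flag appears to be the change that matters; for one line at a time, the pattern should behave the same with no flags at all.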



Re: Using regular expressions when reading a file

2022-05-05 Thread Ali Çehreli via Digitalmars-d-learn

On 5/5/22 12:05, Alexander Zhirov wrote:

On Thursday, 5 May 2022 at 18:58:41 UTC, H. S. Teoh wrote:
You don't have to. Just add a `$` to the end of your regex, and it 
should match the newline. If you put it outside the capture 
parentheses, it will not be included in the value.


In fact, it turned out to be much easier. It was just necessary to use 
the `m` flag instead of the `s` flag:


```d
auto p_property = regex(r"^(\w+) *= *(.+)", "m");
```



Couldn't help myself from improving. :) The following regex works in my 
Linux console. No issues with '\n'. (?) It also allows for leading and 
trailing spaces:


import std.regex;
import std.stdio;
import std.algorithm;
import std.array;
import std.typecons;
import std.functional;

void main() {
  auto p_property = regex(r"^ *(\w+) *= *(\w+) *$");
  const properties = File("settings.conf")
    .byLineCopy
    .map!(line => matchFirst(line, p_property))
    .filter!(not!empty) // OR: .filter!(m => !m.empty)
    .map!(m => tuple(m[1], m[2]))
    .assocArray;

  writeln(properties);
}

Ali
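
Two details seem worth spelling out here. First, `byLineCopy` drops the line terminator by default, which would explain why `'\n'` never shows up. Second, `(\w+)` as the value group does not match `127.0.0.1`, because `.` is not a word character, so the `host` line from the sample file would be filtered out. The sketch below compares that pattern with a variant using `\S+` for the value; the variant assumes values contain no spaces. `parse` is a hypothetical helper, and it works on in-memory lines rather than a file so that it runs on its own.

```d
import std.algorithm : filter, map;
import std.array : assocArray;
import std.regex : matchFirst, regex;
import std.typecons : tuple;

// Hypothetical helper: same pipeline as above, over an array of lines.
string[string] parse(string[] lines, string pattern)
{
    auto re = regex(pattern);
    return lines
        .map!(line => matchFirst(line, re))
        .filter!(m => !m.empty)          // skip lines that do not match
        .map!(m => tuple(m[1], m[2]))    // (key, value) pairs
        .assocArray;
}

void main()
{
    // Lines as byLineCopy would yield them (no terminators).
    auto lines = ["host = 127.0.0.1", "  port = 5432  ", "user = postgres"];

    // \w+ for the value: the dotted address does not match.
    auto strict = parse(lines, r"^ *(\w+) *= *(\w+) *$");
    assert("host" !in strict);
    assert(strict["port"] == "5432");

    // \S+ for the value: any run of non-space characters is accepted.
    auto relaxed = parse(lines, r"^ *(\w+) *= *(\S+) *$");
    assert(relaxed["host"] == "127.0.0.1");
    assert(relaxed["port"] == "5432");
}
```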


Re: Using regular expressions when reading a file

2022-05-05 Thread Alexander Zhirov via Digitalmars-d-learn

On Thursday, 5 May 2022 at 19:19:26 UTC, Ali Çehreli wrote:
Couldn't help myself from improving. :) The following regex 
works in my Linux console. No issues with '\n'. (?) It also 
allows for leading and trailing spaces:


import std.regex;
import std.stdio;
import std.algorithm;
import std.array;
import std.typecons;
import std.functional;

void main() {
  auto p_property = regex(r"^ *(\w+) *= *(\w+) *$");
  const properties = File("settings.conf")
    .byLineCopy
    .map!(line => matchFirst(line, p_property))
    .filter!(not!empty) // OR: .filter!(m => !m.empty)
    .map!(m => tuple(m[1], m[2]))
    .assocArray;

  writeln(properties);
}


It will need to be sorted out with a fresh head. 😀 Thanks!



Re: Using regular expressions when reading a file

2022-05-05 Thread forkit via Digitalmars-d-learn

On Thursday, 5 May 2022 at 17:53:57 UTC, Alexander Zhirov wrote:
I want to use a configuration file with external settings. I'm 
trying to use regular expressions to read the `Property = 
Value` settings. I would like to do it all more beautifully. Is 
there any way to get rid of the line break character? How much 
does everything look "right"?


regex never looks right ;-)

try something else perhaps??

// 

module test;

import std;

void main()
{
    auto file = File("d:\\settings.conf", "r");
    string[string] aa;

    // create an associative array of settings -> [key:value]
    foreach (line; file.byLine().filter!(a => !a.empty))
    {
        auto myTuple = line.split(" = ");
        aa[myTuple[0].to!string] = myTuple[1].to!string;
    }

    // write out all the settings.
    foreach (key, value; aa.byPair)
        writefln("%s:%s", key, value);

    writeln;

    // write just the host value
    writeln(aa["host"]);
}


// 



Re: Using regular expressions when reading a file

2022-05-06 Thread Alexander Zhirov via Digitalmars-d-learn

On Friday, 6 May 2022 at 05:40:52 UTC, forkit wrote:

auto myTuple = line.split(" = ");


Well, only if as a strict form :)



Re: Using regular expressions when reading a file

2022-05-06 Thread forkit via Digitalmars-d-learn

On Friday, 6 May 2022 at 07:51:01 UTC, Alexander Zhirov wrote:

On Friday, 6 May 2022 at 05:40:52 UTC, forkit wrote:

auto myTuple = line.split(" = ");


Well, only if as a strict form :)


well.. a settings file should be following a strict format.

..otherwise...anything goes... and good luck with that...

regex won't help you either in that case...

e.g:

user =som=eu=ser  (how you going to deal with this ?)
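
For what it is worth, the capture-based pattern from earlier in the thread does give a defined answer for this line: everything after the first `=` (and any spaces following it) becomes the value. Whether that is the desired reading is a policy decision, not something the regex decides for you. A small sketch, which also shows what the strict `split(" = ")` form returns for the same input:

```d
import std.array : split;
import std.regex : matchFirst, regex;

void main()
{
    string line = "user =som=eu=ser";

    // Capture pattern from earlier in the thread, without flags.
    auto m = matchFirst(line, regex(r"^(\w+) *= *(.+)"));
    assert(m[1] == "user");
    assert(m[2] == "som=eu=ser");   // later '=' characters stay in the value

    // The strict separator " = " does not occur, so split returns one
    // element and indexing [1] would be out of range.
    auto parts = line.split(" = ");
    assert(parts.length == 1);
}
```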



Re: Using regular expressions when reading a file

2022-05-06 Thread novice2 via Digitalmars-d-learn

imho, regexp is overkill here.
as for me, i usually just split line for first '=', then trim 
spaces left and right parts.
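
A minimal sketch of that approach, assuming `findSplit` from `std.algorithm.searching` for the split at the first `=` and `strip` from `std.string` for the trimming. `parseSettings` is a hypothetical helper name, and skipping lines without `=` or with an empty key is an assumption about the desired behaviour.

```d
import std.algorithm.searching : findSplit;
import std.string : strip;

// Hypothetical helper: split each line at the first '=', then trim both parts.
string[string] parseSettings(string[] lines)
{
    string[string] result;
    foreach (line; lines)
    {
        auto parts = line.findSplit("=");
        if (!parts)                 // no '=' on this line: skip it
            continue;

        auto key = parts[0].strip;
        if (key.length == 0)        // nothing before '=': skip it
            continue;

        result[key] = parts[2].strip;
    }
    return result;
}

void main()
{
    auto settings = parseSettings([
        "host = 127.0.0.1",
        "  port=5432  ",
        "user =som=eu=ser",
        "",
        "not a setting",
    ]);

    assert(settings["host"] == "127.0.0.1");
    assert(settings["port"] == "5432");
    assert(settings["user"] == "som=eu=ser"); // only the first '=' splits
    assert(settings.length == 3);
}
```

When reading from a file, the lines could come from `File.byLineCopy`, which drops line terminators by default.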