hi Richard,

thanks for the suggestion! I have added some examples of commenting
regular expressions directly to the FAQ, which is probably more convenient
for the reader:
https://simple-evcorr.github.io/FAQ.html#13

kind regards,
risto

Richard Ostrochovský (<richard.ostrochov...@gmail.com>) wrote on
Mon, 20 January 2020 at 16:57:

> Thank you for the comprehensive answer, Risto. Maybe a hyperlink to it could
> be added to that FAQ item.
>
> (I found searching in this forum a bit harder, due to the traditional e-mail
> form, and maybe we could categorize threads in this forum into topics on ,
> I am inspired in this by
> https://www.perlmonks.org/?node=Categorized%20Questions%20and%20Answers.
> Maybe I could help with something like this when I am less busy.)
>
> On Mon, 9 Dec 2019 at 16:43, Risto Vaarandi <risto.vaara...@gmail.com>
> wrote:
>
>> hi Richard,
>>
>> Richard Ostrochovský (<richard.ostrochov...@gmail.com>) wrote on
>> Mon, 9 December 2019 at 01:57:
>>
>>> Hello colleagues,
>>>
>>> I was searching for the answer here:
>>> https://simple-evcorr.github.io/man.html
>>> https://sourceforge.net/p/simple-evcorr/mailman/simple-evcorr-users/
>>> and haven't found the answer, so I am posting a new question here:
>>>
>>> Does SEC in pattern= parameters support RegExp modifiers (
>>> https://perldoc.perl.org/perlre.html#Modifiers) somehow?
>>>
>>
>> If you enclose a regular expression within /.../, SEC does not treat the
>> slashes as separators but rather as part of the regular expression, so
>> you can't provide modifiers at the end of the regular expression after /.
>> However, Perl regular expressions allow modifiers to be provided with the
>> (?<modifiers>) construct. For example, the following pattern matches the
>> string "test" in a case-insensitive way:
>> pattern=(?i)test
>> In addition, you can use such modifiers anywhere in a regular expression,
>> which makes them more flexible than modifiers after /. For example, the
>> following pattern matches the strings "test" and "tesT":
>> pattern=tes(?i)t
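
A minimal sketch of the two inline-modifier forms, using Python's re module as a stand-in for the Perl regex engine that SEC relies on (an assumption made for illustration; the two engines are not identical). Recent Python versions accept (?i) only at the very start of an expression, so the mid-pattern case is written here with the scoped form (?i:...), which Perl also supports. The names WHOLE, LAST_LETTER and matches are hypothetical.

```python
import re

# (?i) at the start makes the whole expression case-insensitive.
WHOLE = re.compile(r"(?i)test")

# Scoped form: only the final "t" is matched case-insensitively.
LAST_LETTER = re.compile(r"tes(?i:t)")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ("test", "tesT", "TEST"):
        print(text, matches(WHOLE, text), matches(LAST_LETTER, text))
```

With the scoped form, "test" and "tesT" match while "TEST" does not, which mirrors the behaviour described for pattern=tes(?i)t.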
>>
>> The SEC FAQ also has a short discussion on this topic:
>> https://simple-evcorr.github.io/FAQ.html#13
>>
>>
>>> E.g. the modifiers /x or /xx allow writing more readable expressions by
>>> ignoring unescaped whitespace (which makes multi-line regular expressions
>>> possible). This could be practical for more complex expressions, so that
>>> they can be written more legibly. A simpler example:
>>>
>>> pattern=/\
>>> ^\s*([A-Z]\s+)?\
>>> (?<data_source_timestamp>\
>>>    (\
>>>       ([\[\d\-\.\:\s\]]*[\d\]]) |\
>>>       (\
>>>          (Mon|Tue|Wed|Thu|Fri|Sat|Sun) \s+\
>>>          (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \s+ \d+ \s+ \d\d:\d\d:\d\d \s+ ([A-Z]+\s+)?\d\d\d\d\
>>>       )\
>>>    )\
>>> ) (?<message>.*)/x
>>>
>>>
>> It is a tricky question, since the SEC configuration file format natively
>> allows regular expressions to span multiple lines without the use of the
>> (?x) modifier. If any line in a rule definition ends with a backslash, the
>> following line is appended to the current line and the backslash is removed
>> during configuration file parsing. For example, the following two pattern
>> definitions are equivalent:
>>
>> pattern=test: \
>> (\S+) \
>> (\S+)$
>>
>> pattern=test: (\S+) (\S+)$
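
The joining step described above can be sketched roughly as follows. This is not SEC's actual parser; join_continuation_lines is a hypothetical helper that only mimics the stated rule (a trailing backslash is removed and the next line is appended).

```python
def join_continuation_lines(text: str) -> list[str]:
    """Join every line ending in a backslash with the line that follows it."""
    joined = []
    buffer = ""
    for line in text.splitlines():
        if line.endswith("\\"):
            # Drop the trailing backslash and keep collecting.
            buffer += line[:-1]
        else:
            joined.append(buffer + line)
            buffer = ""
    if buffer:
        # Input ended on a continuation line; keep what was collected.
        joined.append(buffer)
    return joined


MULTI_LINE = r"""pattern=test: \
(\S+) \
(\S+)$"""

SINGLE_LINE = r"pattern=test: (\S+) (\S+)$"

if __name__ == "__main__":
    print(join_continuation_lines(MULTI_LINE) == [SINGLE_LINE])
```

Note that whitespace before each backslash is preserved, which is why the joined form keeps the spaces between the groups.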
>>
>> However, it is important to remember that SEC converts multi-line rule
>> fields into single-line format before any other processing, including
>> the compilation of regular expressions. In other words, for the first
>> multi-line pattern definition above, SEC actually sees
>> "test: (\S+) (\S+)$" when it compiles the expression. This introduces the
>> following caveat: when the (?x) modifier is used to introduce a comment
>> into a multi-line regular expression, the expression is converted into
>> single-line format before it is compiled and (?x) takes effect, and
>> therefore the comment unexpectedly runs until the end of the regular
>> expression. Consider the following example:
>>
>> pattern=(?x)test:\
>> # this is a comment \
>> (\S+)$
>>
>> Internally, this definition is first converted to single line format:
>> pattern=(?x)test:# this is a comment (\S+)$
>> However, this means that once the comment is dropped, the expression becomes
>> (?x)test:
>> which is not what we want.
>>
>> To address this issue, you can limit the scope of comments with (?#...)
>> constructs, which don't require (?x). For example:
>>
>> pattern=test:\
>> (?# this is a comment )\
>> (\S+)$
>>
>> During configuration file parsing this expression is converted into
>> "test:(?# this is a comment )(\S+)$", and after dropping the comment it
>> becomes "test:(\S+)$" as we expect.
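
The difference between the two comment styles can be checked with a small sketch. Python's re module is used as a stand-in for Perl's regex engine (an assumption; both support (?x) and (?#...), but they differ in other respects), and the two strings are the single-line forms quoted above. The helper first_group is hypothetical.

```python
import re

# Single-line forms, as they look after continuation lines have been joined.
VERBOSE_COMMENT = r"(?x)test:# this is a comment (\S+)$"
SCOPED_COMMENT = r"test:(?# this is a comment )(\S+)$"


def first_group(pattern: str, text: str):
    """Return the first capture group, or None if there is no match or no group."""
    match = re.search(pattern, text)
    if match is None or match.re.groups == 0:
        return None
    return match.group(1)


if __name__ == "__main__":
    # With (?x), the "#" comment swallows the capture group.
    print(re.compile(VERBOSE_COMMENT).groups)
    # With (?#...), the comment ends at ")" and the group survives.
    print(re.compile(SCOPED_COMMENT).groups)
    print(first_group(SCOPED_COMMENT, "test:abc"))
```

In the (?x) variant the compiled expression has no capture groups at all, so nothing after "test:" is extracted from the input.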
>>
>> Hope this helps,
>> risto
>>
>>
>>
>
_______________________________________________
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
