I should have been a bit clearer when I said my program didn’t work. It does,
but only for fixed strings: there is none of the RE special character magic.
And, I agree, the crucial question is how to construct a pattern from a string
that treats the special characters as special characters, rather than just
literals.
In passing, write(type(<abc>)) writes “string”, whereas write(type(<a[bc]*>))
writes “pattern”, which isn’t quite what I expected.
I had quite high hopes for Arbno(), but soon realised that it wanted a pattern
for its argument, not a string and, even when I fed it with a variable that had
the type of pattern, it still didn’t work how I might have expected it to. At
that stage, I asked my original question. If Clint’s “option #1” is “write a
library procedure that parses the regex and builds the corresponding pattern”,
I wonder whether Arbno() might be a suitable interface: i.e. if it’s a pattern
already, do what it does now, otherwise turn the string into a pattern and then
do it. Perhaps a separate procedure might be clearer.
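For what it’s worth, here is a minimal sketch of the flavour of option #1. It
is an illustration under loud assumptions, not a proposal for the library: it
handles only literal characters, [bracket] classes and a postfix *, and it does
not build a pattern value at all. Instead it swaps in ordinary Icon string
scanning with csets. The names item, reparse, matchfrom and rematch are made up
for this example.

```unicon
record item(cs, star)   # one regex element: a cset, plus a flag if followed by *

# Parse a small regex subset (literal characters, [classes], postfix *)
# into a list of item records. Fails on an unterminated [class.
# (Hypothetical helper, not part of any Unicon library.)
procedure reparse(re)
   local items, cs, star
   items := []
   re ? {
      while not pos(0) do {
         if ="[" then {
            cs := cset(tab(find("]"))) | fail
            move(1)                        # skip the closing ]
         }
         else cs := cset(move(1))          # any other character is a literal
         star := if ="*" then 1 else &null
         put(items, item(cs, star))
      }
   }
   return items
end

# Try to match items[i], items[i+1], ... at the current position of the
# enclosing scan, backtracking over the alternatives for starred items.
procedure matchfrom(items, i)
   local it
   if i > *items then return &pos
   it := items[i]
   if \it.star then
      # zero occurrences first, then one occurrence followed by the same item
      suspend matchfrom(items, i + 1) | (tab(any(it.cs)) & matchfrom(items, i))
   else
      suspend tab(any(it.cs)) & matchfrom(items, i + 1)
end

# Succeed if some substring of line matches the parsed regex, as grep does.
procedure rematch(line, items)
   return line ? (tab(1 to *line + 1) & matchfrom(items, 1))
end

procedure main(args)
   local items, f, line
   items := reparse(pop(args)) | stop("usage: gerp2 regex file...")
   every f := open(!args, "r") do {
      while line := read(f) do
         if rematch(line, items) then write(line)
      close(f)
   }
end
```

Compiled as, say, gerp2, it would be run in the same way as the program below,
e.g. ./gerp2 "[dne][edn][den]" gerp.icn. Everything else in the RE syntax
(alternation, grouping, ranges, anchors, "." as a wildcard) is ignored or
treated as a literal, which is exactly the part a real library procedure would
have to do properly.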
The reader with no time for trivia may profitably skip the rest of this message
...
I may have found a use for Succeed(). Suppose I modify my program as below
(compared with the original, the additions are the “|| Succeed()”, the
write(type(re)) call and the trailing comment, whose purpose will be clear in a
moment):
procedure main(args)
   local f, line, re := pop(args) || Succeed()
   write(type(re))
   every f := open(!args, "r") do {
      every line := !f do {
         if ( line ?? re ) then write(line)
      }
   }
end
#[dne][edn][den]
If I use grep on this program source I get
bash-3.2$ grep "[dne][edn][den]" gerp.icn
local f, line, re := pop(args) || Succeed()
end
#[dne][edn][den]
as expected: grep has found “eed”, “end” and the regular expression itself in
the final comment. Whereas, if I use the program on its own source code I get
bash-3.2$ ./gerp "[dne][edn][den]" gerp.icn
pattern
#[dne][edn][den]
showing that although I have a pattern, it isn’t interpreting the special
characters.
If I miss off the “|| Succeed()” from the initialisation of re and try again I
get
string
#[dne][edn][den]
I still get a pattern match, even though it’s a string not a pattern, but it’s
the literal string that is matching.
Therefore Succeed() may be used to turn a string into a pattern! Unfortunately
not in a useful way.
> On 29 Nov 2016, at 22:27, Jeffery, Clint ([email protected])
> <[email protected]> wrote:
>
> My thanks to Don, Jay, and anyone else who is trying out stuff related to
> patterns. I am on the road ATM but will work on improving the diagnostics
> related to Jay's experiments. Regarding Don's original request and Jay's
> comments on it: backquotes in patterns is not a full "eval" interpreter that
> will take arbitrary Icon strings and turn them into code. Maybe we need
> that, and maybe someone will build it some day. In the meantime, after
> figuring out the best workarounds that may be available, you can judge for
> yourself whether the patterns are still useful, or whether they remain
> unfinished business.
>
> The basic question is: given a regular expression supplied as string data s,
> how best should we construct a corresponding pattern. The answer sadly is not
> <`s`>. The Unicon translator has a parser for regular expressions and emits
> pattern function calls for them, but we want to do it from the Unicon VM.
> Options include: write a library procedure that parses the regex and builds
> the corresponding pattern; write a library procedure that invokes the
> translator to do the work and use dynamic loading to get the code loaded;
> extend the language with a new built-in that does the same or similar; extend
> the backquotes operator to do what we want here; or use another idea that you
> think up.
>
> Don: great minds think alike. When I started to update the Unicon book to
> talk about patterns, I immediately figured we needed to update the "grep"
> example to use patterns, and came up against the same issue you're asking
> about. I haven't implemented a solution yet, but perhaps we should do option
> #1 and see what that looks like.
>
> Cheers,
> Clint
> From: Jay Hammond <[email protected]>
> Sent: Tuesday, November 29, 2016 1:55:07 PM
> To: Don Ward; [email protected]
> Subject: Re: [Unicon-group] Converting strings to patterns
>
> Hi Don,
> I tried running your program.
> To get it to do anything I had to change line 2, separating the local
> declaration from the assignment.
> To clarify, repat is a (new) variable that I intend to hold a pattern:
> procedure main(args)
>    local f, line, re
>    re := pop(args)
>
>    write(re)
>
>    repat := re
>
>    every f := open(!args, "r") do {
>       while ( line := read(f) ) do {
>          if line ?? repat then write(line)
>       }
>    }
> end
> I created qqq.txt with the lines
> QQQ
> qqq
> cqcqcq
> and ran testpat QQQ qqq.txt after compiling testpat.icn
> Output was
> QQQ
> then the contents of qqq.txt
> as if repat always matches. (it has the null value??)
>
> I tried forcing repat to be a pattern (utr18 says that patterns are composed
> of strings concatenated or alternated) so I tried
> repat := re .| fail()
> repat := re .| re
> but the pattern building process gave me node errors at compile time.
> dopat2.icn:6: # "re": syntax error (237;349)
> File dopat2.icn; Line 16 # traverse: undefined node type
> line 16 is the line after end in main, i.e. program source end.
>
> I tried using the -f s option at the compile step, so as to use unevaluated
> expressions in patterns
> like
> repat := < `re` >
> # that syntax ought to force a pattern!
> node traversal errors again. And the backquotes were not recognised. Perhaps
> I put the -f s option in the wrong place?
>
> I also tried
> repat := < Q > || < Q > || < Q >
> dopat2.icn:6: # "repat": invalid argument in augmented assignment
> File dopat2.icn; Line 16 # traverse: undefined node type
> so it is not considering || to be pattern concatenation
> repat := < Q || Q || Q >
> gave the same error!
>
> So although UTR18 seems to give options for converting strings to patterns I
> have not had any luck so far.
>
> Jay
>
> On 29/11/2016 14:33, Don Ward wrote:
>> Here is a very simple (and simple minded) grep program. The idea being to
>> apply a Unicon regexp pattern to a series of files, just like grep.
>>
>> procedure main(args)
>>    local f, line, re := pop(args)
>>
>>    every f := open(!args, "r") do {
>>       every line := !f do {
>>          if line ?? re then write(line)
>>       }
>>    }
>> end
>>
>> Of course, it doesn’t work because in line 6 I have a string instead of a
>> pattern.
>>
>> Is there any way to convert the contents of re from a string to a pattern?
>>
>> After reading UTR18 again (and again), I’ve come to the conclusion that
>> there isn’t any way to do it.
>> The pertinent extract from UTR18 is in section 4.5.3 “Limitations due to
>> lack of eval()”.
>> But before I give up on the idea entirely, I thought I’d check to see if my
>> understanding is correct.
>>
>> Don
>>
>>
>> ------------------------------------------------------------------------------
>>
>>
>> _______________________________________________
>> Unicon-group mailing list
>> [email protected]
>> <mailto:[email protected]>
>> https://lists.sourceforge.net/lists/listinfo/unicon-group
>> <https://lists.sourceforge.net/lists/listinfo/unicon-group>
>
>