Hello jb!

On 01-Apr-00, you wrote:

[Forwarding to the list]

 j> Gabrielle,

Just one "l"... :-)

 j> For the past two months when I have the time to review the
 j> rebol posts, I've appreciated reading your succinct, thoughtful
 j> and clear examples. Thank you. Your method outlined below is a
 j> fine example of this. You are both an excellent teacher and
 j> certainly you must be a world class programmer ! :->

Thank you. You're surely exaggerating --- I'm just a student! :)

 j> Perhaps you will be so kind as to provide urls for the best
 j> examples of pattern matching techniques for REBOL? For

I think you can find a lot of examples on rebol.org. I've written
a site-saver too (downloads an HTML document plus all of the links
etc.); it still needs some work, so I haven't published it yet, as
you can get a couple of other scripts doing the same job on
rebol.org, but if you're interested...

 j> example, let's assume a useful utility will crawl through a
 j> web site and gather only the links on the site saving them to
 j> a file, how does a good rebol programmer think about and then
 j> execute the following:

 j> find all instances of the following pattern
 j> http://yyyyyy.yyy/zzzz (i.e. any and all URLs in a web page)
 j> and return and then store only the http://yyyyy.yyy portion to
 j> a file appending a newline to each ?

If you assume that the initial "http://" is always present, as
well as the "/" after the domain name, the task is really easy:

   file-port: open/lines %destination-file.txt
   parse text-containing-links [
      any [
         to "http://" copy url ["http://" to "/"]
         (insert tail file-port url)
      ]
   ]
   close file-port

If you just want to assume that the URLs begin with "http://":

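   ; reopen the destination file (the port was closed above)
   file-port: open/lines %destination-file.txt
   ; a domain ends at a control character, a space, or "/"
   domain-chars: complement charset [#"^(00)" - #" " #"/"]
   url-rule: ["http://" some domain-chars]
   parse text-containing-links [
      any [
         to "http://" copy url url-rule
         (insert tail file-port url)
      ]
   ]
   close file-port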
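
To try the rule without touching a file, the matches can be collected
into a block instead. This is only a sketch: the sample string and the
word "found" are made up for illustration, and the rest follows the
rule above.

```rebol
REBOL [Title: "Link extraction sketch"]

; Hypothetical sample input, invented for this example.
text-containing-links: {See <a href="http://www.rebol.com/docs.html">docs</a> and <a href="http://www.rebol.org/">scripts</a>.}

; A domain ends at a control character, a space, or "/".
domain-chars: complement charset [#"^(00)" - #" " #"/"]
url-rule: ["http://" some domain-chars]

; "found" is a hypothetical name for the block of collected URLs.
found: copy []
parse text-containing-links [
    any [
        to "http://" copy url url-rule
        (append found url)
    ]
]

; One URL per line, as they would be written to the file.
foreach url found [print url]
```

Once the block looks right, swapping (append found url) back to
(insert tail file-port url) gives the file-writing version.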

 j> Now with perl the matching expression is relatively short and
 j> the assignment of the value is rather straightforward, as is
 j> the printing to append to the file.

 j> Feel free to forward this letter to the list with your
 j> response. I'd love to see how others will answer it.

 j> TIA,
 j> JB

 j> PS
 j> *I'm replying off list because my temporary isp doesn't
 j> forward my mail when I use my subscribed email account.

Regards,
    Gabriele.
-- 
o--------------------) .-^-. (----------------------------------o
| Gabriele Santilli / /_/_\_\ \ Amiga Group Italia --- L'Aquila |
| GIESSE on IRC     \ \-\_/-/ /  http://www.amyresource.it/AGI/ |
o--------------------) `-v-' (----------------------------------o

