[REBOL] Newbie question #2 - Pattern making and using? Re:(11)

icimjs Sun, 02 Apr 2000 13:36:17 -0700
Hi Gabriele,

thanks for posting this to the mailing list. 

In general, I would appreciate it if questions that are probably of general
interest were posted to the mailing list, since we can all learn from the
questions as well as the answers.

At 12:00 PM 4/2/00 +0200, you wrote:
>Hello jb!
>
>On 01-Apr-00, you wrote:
>
>[Forwarding to the list]
>
> j> Gabrielle,
>
>Just one "l"... :-)
>
> j> For the past two months when I have the time to review the
> j> rebol posts, I've appreciated reading your succinct, thoughtful
> j> and clear examples. Thank you. Your method outlined below is a
> j> fine example of this. You are both an excellent teacher and
> j> certainly you must be a world class programmer ! :->
>
>Thank you. You're surely exaggerating --- I'm just a student! :)
>
> j> Perhaps you will be so kind as to provide urls for the best
> j> examples of pattern matching techniques for REBOL? For
>
>I think you can find a lot of examples on rebol.org. I've written
>a site-saver too (it downloads an HTML document plus all of the links
>etc.); it still needs some work, so I haven't published it yet, as
>you can get a couple of other scripts doing the same job on
>rebol.org, but if you're interested...
>
> j> example, let's assume a useful utility will crawl through a
> j> web site and gather only the links on the site saving them to
> j> a file, how does a good rebol programmer think about and then
> j> execute the following:
>
> j> find all instances of the following pattern
> j> http://yyyyyy.yyy/zzzz (i.e. any and all URLs in a web page)
> j> and return and then store only the http://yyyyy.yyy portion to
> j> a file, appending a newline to each?
>
>If you assume that the initial "http://" is always present, as
>well as the "/" after the domain name, the task is really easy:
>
>   file-port: open/lines %destination-file.txt
>   parse text-containing-links [
>      any [
>         to "http://" copy url ["http://" to "/"]
>         (insert tail file-port url)
>      ]
>   ]
>   close file-port
>
>If you just want to assume that the URLs begin with "http://":
>
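>   ; open and close the port around the parse, as in the first example
>   file-port: open/lines %destination-file.txt
>   domain-chars: complement charset [#"^(00)" - #" " #"/"]
>   url-rule: ["http://" some domain-chars]
>   parse text-containing-links [
>      any [
>         to "http://" copy url url-rule
>         (insert tail file-port url)
>      ]
>   ]
>   close file-port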
>
> j> Now with Perl the matching expression is relatively short, and
> j> the assignment of the value is rather straightforward, as is
> j> appending the output to the file.
>
> j> Feel free to forward this letter to the list with your
> j> response. I'd love to see how others will answer it.
>
> j> TIA,
> j> JB
>
> j> PS
> j> *I'm replying off-list because my temporary ISP doesn't
> j> forward my mail when I use my subscribed email account.
>
>Regards,
>    Gabriele.
>-- 
>o--------------------) .-^-. (----------------------------------o
>| Gabriele Santilli / /_/_\_\ \ Amiga Group Italia --- L'Aquila |
>| GIESSE on IRC     \ \-\_/-/ /  http://www.amyresource.it/AGI/ |
>o--------------------) `-v-' (----------------------------------o
>
>
>
>
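
For anyone who wants to try the second rule quoted above without a
port, here is a minimal sketch that runs on an in-memory string. It
assumes REBOL 2 (parse/all and write/lines). The sample text, the word
found and the file name %links.txt are made up for illustration, and
the extra excluded characters (double quote and angle brackets) are my
own addition so that a URL with no "/" after the host stops at the end
of the HTML attribute. It is not the only way to do this.

```rebol
REBOL [Title: "Collect http:// host prefixes from a string (sketch)"]

; Hypothetical sample input; any string holding page text would do.
text-containing-links: {<a href="http://www.rebol.com/docs.html">docs</a>
<a href="http://www.rebol.org/index.html">library</a>}

; Host characters: everything except control characters, space,
; slash, double quote and angle brackets (the last three are an
; assumption added for HTML input).
domain-chars: complement charset [#"^(00)" - #" " #"/" #"^"" #"<" #">"]
url-rule: ["http://" some domain-chars]

; Collect matches in a block instead of inserting into a file port.
found: copy []
parse/all text-containing-links [
    any [
        to "http://" copy url url-rule
        (append found url)
    ]
]

; Write one URL per line; %links.txt is an assumed file name.
write/lines %links.txt found
```

Collecting into a block first makes it easy to probe the result or
remove duplicates with unique before writing. To add to an existing
file instead of replacing it, write/append/lines should do the job.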

;- Elan >> [: - )]