RE: [PHP] hyperlink parser - a bit new view :)

2002-08-18 Thread MET

Look up Regular Expressions.  Basically they allow you to search for
matches in strings, such as <a> followed by any number of characters
between the opening and closing tags.  So you'd need to read the entire
file into a string, strip out all the pesky line breaks ('\n') and then
run your function using regular expressions inside a loop of some
sort.
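
The steps described above could be sketched roughly as follows. This is only an illustration: the sample markup in $html and the pattern itself are assumptions, not code from the linked articles, and in practice the string would come from reading a file or URL.

```php
<?php
// Hypothetical sample markup; in practice this could come from
// file_get_contents() on a local file or URL.
$html = "<p><a\nhref=\"http://www.php.net/\">PHP</a> and\n"
      . "<a href='http://www.phpbuilder.com/'>PHPBuilder</a></p>";

// Replace line breaks with spaces so that a tag split across lines
// becomes one continuous run of text.
$flat = str_replace(array("\r", "\n"), ' ', $html);

// Capture what sits between the opening and closing <a> tags.
// Group 1 is the attribute text, group 2 is the link text.
$count = preg_match_all('|<a\s+([^>]*)>(.*?)</a>|i', $flat, $matches, PREG_SET_ORDER);

// Loop over the matches and print link text and attributes.
foreach ($matches as $m) {
    echo $m[2], ' => ', $m[1], "\n";
}
```

Each entry in $matches still holds the whole attribute text, so the href value would have to be picked out of group 1 in a second step.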

Check out these links:

http://www.php.net/manual/en/ref.regex.php

http://www.phpbuilder.com/columns/dario19990616.php3


~ Matthew

-----Original Message-----
From: Roman [mailto:[EMAIL PROTECTED]] 
Sent: Sunday, August 18, 2002 4:20 PM
To: [EMAIL PROTECTED]
Subject: RE: [PHP] hyperlink parser - a bit new view :)


I don't think there is an existing message about this same idea.

I need to parse an HTML file and get everything within href="" or href=''.
So I don't want to get all text that could be a link, but only the content
of the href attribute within a tag.

Is there any easy way? I have read hundreds of messages on this list
today but without finding any positive idea :(

Roman





-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php






RE: [PHP] hyperlink parser - a bit new view :)

2002-08-18 Thread Roman

Hi Mat,

I would not be asking for help if I had not read those texts. Of course I've
read them both.

But on that phpbuilder page there are also similar questions, but the
scripts given as answers don't work, for example:

$text="<a target=\"top\" href=\"http://aaa\">ttrtert</a>";
preg_match_all("|href=\"?([^\"' ]+)|i(+[ ])", $text,$ar);
print_r($ar);

returns this:
Warning: Unknown modifier '(' in test.php on line 5 [line 5 is that line
with preg_match_all]
Array ( )


I really don't know :(

Roman

-----Original Message-----
From: MET [mailto:[EMAIL PROTECTED]]
Look up Regular Expressions.  Basically they allow you to search out
Check out these links:
http://www.php.net/manual/en/ref.regex.php
http://www.phpbuilder.com/columns/dario19990616.php3
~ Matthew







Re: [PHP] hyperlink parser - a bit new view :)

2002-08-18 Thread DL Neil

Hi Roman,

> But on that phpbuilder page there are also similar questions, but the
> scripts given as answers don't work, for example:
> ...
> preg_match_all("|href=\"?([^\"' ]+)|i(+[ ])", $text,$ar);
> ...
> I really don't know :(


There's definitely something wrong with the RegEx above. The pattern
begins with the delimiter |, so it must end with a closing | followed
only by optional modifier letters, in this case i meaning case-insensitive.
For some reason the |i is not at the end of the string, so PHP treats the
( that follows it as an unknown modifier - either you miscopied or
they misprinted.
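
To illustrate the delimiter rule, here is a minimal sketch with the closing |i at the very end of the pattern. The sample string is modelled on the test input from the earlier message, and the character class (which also stops at >) is an assumption rather than the phpbuilder original.

```php
<?php
// Sample input modelled on the test string from the earlier message.
$text = '<a target="top" href="http://aaa">ttrtert</a>';

// The delimiter | opens the pattern; the closing | is followed only by
// the modifier letter i. The character class is an assumption.
$pattern = '|href="?([^"\' >]+)|i';

// $ar[1] holds the text captured by the first parenthesised group.
$found = preg_match_all($pattern, $text, $ar);
print_r($ar[1]);
```

With nothing after the modifier letter, the "Unknown modifier" warning should no longer appear.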

However the RegEx doesn't strike me as covering all correct HTML anyway, because
there can be spaces between elements, e.g. between href and =, as in href =
www.homepage.com. Secondly, if the URL is enclosed in quotation marks, either
single or double quotes may be used (' or ").
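
One way to sidestep these spacing and quoting variations is to let an HTML parser read the attributes instead of a RegEx. The sketch below uses PHP's DOM extension, which is a different technique from the one discussed in this thread and assumes the extension is available; the sample markup in $HTML is hypothetical.

```php
<?php
// Hypothetical sample markup with spacing and quoting variations.
$HTML = '<p><a href = "http://www.php.net/">one</a> '
      . "<a target='top' HREF='two.html'>two</a></p>";

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($HTML);
libxml_clear_errors();

// Gather the href attribute of every <a> element that has one.
$aLinks = array();
foreach ($doc->getElementsByTagName('a') as $node) {
    if ($node->hasAttribute('href')) {
        $aLinks[] = $node->getAttribute('href');
    }
}
print_r($aLinks);
```

The parser deals with attribute case, spaces around = and either quote style, so none of that needs to be encoded in a pattern.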

Unfortunately my RegEx skills are self-taught, so who knows how good/useless
my advice is! Here is my current best attempt (abstracted from my web site
links checker routine):


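// Optional spaces around =, an optional opening quote, then the URL
// up to the next quote or space.
$RegEx = "/(" . "href *" . "= *['\"]?)([^'\" ]*)(['\" ])/i";

if ( DEBUG ) echo "<br>RegEx=$RegEx~";

// preg_match_all() returns the number of full matches found.
$bValidity = $iFound = preg_match_all( $RegEx, $HTML, $aRegExOut );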

An improvement might be for the closing quotes to refer back to (any)
opening quotes. I am willing to watch, listen, and learn, if anyone can
offer improvements/wisdom.
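
The back-reference idea mentioned above might look something like this. It is a sketch under assumptions, not the links checker code: the sample markup is hypothetical, and the split into a quoted and an unquoted alternative is one possible design.

```php
<?php
// Hypothetical sample markup covering double-quoted, single-quoted,
// and unquoted href values.
$HTML = '<a href = "http://www.php.net/">one</a> '
      . "<a HREF='page two.html'>two</a> "
      . '<a href=three.html>three</a>';

// Group 1 captures the opening quote of a quoted value.
// Group 2 is the quoted value, closed by the same quote via \1.
// Group 3 is an unquoted value, ended by whitespace, a quote or >.
$RegEx = '/href\s*=\s*(?:(["\'])(.*?)\1|([^\s"\'>]+))/i';

$iFound = preg_match_all($RegEx, $HTML, $aRegExOut, PREG_SET_ORDER);

$aLinks = array();
foreach ($aRegExOut as $aMatch) {
    // Use the unquoted group when it is set; otherwise the quoted one.
    $aLinks[] = isset($aMatch[3]) ? $aMatch[3] : $aMatch[2];
}
print_r($aLinks);
```

Because the closing quote must equal the opening one, a value in double quotes may contain a single quote (and the reverse), and a quoted value may contain spaces.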


Regards,
=dn







[PHP] hyperlink parser

2002-08-17 Thread Roman

Hello,

I'd need to parse html page and get all hyperlinks there.

Is there any easy way to do it ? I'm not so good at regexps :(

$file=file("http://url.html");

could be good start :)

Thanks for any help,

Roman








Re: [PHP] hyperlink parser

2002-08-17 Thread Jason Wong

On Saturday 17 August 2002 19:25, Roman wrote:
> Hello,
>
> I'd need to parse html page and get all hyperlinks there.
>
> Is there any easy way to do it ? I'm not so good at regexps :(
>
> $file=file("http://url.html");
>
> could be good start :)

Even better, search the archives.

-- 
Jason Wong - Gremlins Associates - www.gremlins.com.hk
Open Source Software Systems Integrators
* Web Design & Hosting * Internet & Intranet Applications Development *

/*
Mother is far too clever to understand anything she does not like.
-- Arnold Bennett
*/






RE: [PHP] hyperlink parser

2002-08-17 Thread Roman


On Saturday 17 August 2002 19:25, Roman wrote:
> > I'd need to parse html page and get all hyperlinks there.
> > Is there any easy way to do it ? I'm not so good at regexps :(
> Even better, search the archives.

Please, how do I reach the archives and what should I search for? I have the
last 7,000 messages in my mailer :( I would be here for 2 weeks if I had to
read them one by one :(

Roman







Re: [PHP] hyperlink parser

2002-08-17 Thread Jason Wong

On Saturday 17 August 2002 20:07, Roman wrote:
> On Saturday 17 August 2002 19:25, Roman wrote:
> > I'd need to parse html page and get all hyperlinks there.
> > Is there any easy way to do it ? I'm not so good at regexps :(
>
> Even better, search the archives.

> Please, how do I reach the archives

Look on the php website for where the archives are found.

> and what should I search for?

I don't know. Try various combinations of

extract, url, regex, hyperlink, parse, link

> I have the last 7,000 messages in my mailer :( I would be here for
> 2 weeks if I had to read them one by one :(

In that case it might be quicker for you to write your own ;-)

-- 
Jason Wong - Gremlins Associates - www.gremlins.com.hk
Open Source Software Systems Integrators
* Web Design & Hosting * Internet & Intranet Applications Development *

/*
Love may laugh at locksmiths, but he has a profound respect for money bags.
-- Sidney Paternoster, The Folly of the Wise
*/

