RE: [PHP] hyperlink parser - a bit new view :)
Look up regular expressions. Basically, they let you search strings for matches, such as an <a> tag and any variable number of letters between the opening and closing tags. So you'd need to read the entire file into a string, strip out all the pesky line breaks ('\n'), and then run your function using regular expressions inside a loop of some sort.

Check out these links:
http://www.php.net/manual/en/ref.regex.php
http://www.phpbuilder.com/columns/dario19990616.php3

~ Matthew

-----Original Message-----
From: Roman [mailto:[EMAIL PROTECTED]]
Sent: Sunday, August 18, 2002 4:20 PM
To: [EMAIL PROTECTED]
Subject: RE: [PHP] hyperlink parser - a bit new view :)

I don't think there has been a message about quite the same idea. I need to parse an HTML file and get everything within href="" or href=''. So I don't want all the text that could be a link, only the content of the href attribute within a tag. Is there any easy way? I have read hundreds of messages on this list today without finding any positive idea :(

Roman

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
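The steps Matthew describes (read the page into one string, remove the line breaks, then apply a regular expression) could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample markup, the variable names, and the pattern itself are made up for the example and are not taken from either linked article.

```php
<?php
// Hypothetical sample input; in practice this string would come from the page being parsed.
$html = "<p>See <a\nhref=\"http://www.php.net/\">PHP</a> and\n"
      . "<a href='http://example.com/page'>this page</a>.</p>";

// Replace line breaks with spaces so a tag split across lines ends up on one line.
$flat = str_replace(array("\r\n", "\r", "\n"), ' ', $html);

// Match href, an optional run of spaces around "=", an opening quote, the value,
// and the same quote again. Group 2 holds the URL.
preg_match_all('/href\s*=\s*(["\'])(.*?)\1/i', $flat, $matches);

$links = $matches[2];
print_r($links);
```

preg_match_all() already collects every match in one call, so the explicit loop is only needed afterwards, to walk over $links.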
RE: [PHP] hyperlink parser - a bit new view :)
Hi Mat,

I would not be asking for help if I had not read those texts. Of course I've read them both. But the phpbuilder page also has similar questions, and the scripts given as answers don't work. For example:

    $text="<a target=\"top\" href=\"http://aaa\">ttrtert</a>";
    preg_match_all("|href=\"?([^\"' ]+)|i(+[ ])", $text,$ar);
    print_r($ar);

returns this:

    Warning: Unknown modifier '(' in test.php on line 5
    [line 5 is the line with preg_match_all]
    Array ( )

I really don't know :(

Roman

-----Original Message-----
From: MET [mailto:[EMAIL PROTECTED]]

Look up Regular Expressions. Basically they allow you to search out

Check out these links:
http://www.php.net/manual/en/ref.regex.php
http://www.phpbuilder.com/columns/dario19990616.php3

~ Matthew
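The warning arises because PCRE treats everything after the closing delimiter as modifier letters, and "(" is not one. A minimal sketch of a call that avoids the warning is shown below. It assumes the pattern was meant to end at "|i" and that ">" belongs in the excluded characters; both are guesses about the intent, not the phpbuilder author's code.

```php
<?php
// Sample input modeled on the message above.
$text = "<a target=\"top\" href=\"http://aaa\">ttrtert</a>";

// Only modifier letters may follow the closing "|", so the pattern ends at "|i".
// The character class stops the match at a quote, a space, or ">".
preg_match_all("|href=\"?([^\"' >]+)|i", $text, $ar);

// $ar[0] holds the full matches, $ar[1] the captured URLs.
print_r($ar);
```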
Re: [PHP] hyperlink parser - a bit new view :)
Hi Roman,

> But on those phpbuilder's page there are also similar questions but answered
> scripts don't work, for example:
> ...
> preg_match_all("|href=\"?([^\"' ]+)|i(+[ ])", $text,$ar);
> ...
> I really don't know :(

There's definitely something wrong with the RegEx above. The pattern begins with |, so its last characters must be |a, where a is a modifier letter, in this case i, meaning case-insensitive. For some reason the |i is not at the end of the string: either you miscopied or they misprinted.

However, the RegEx doesn't strike me as handling HTML correctly anyway. First, there can be spaces between elements, e.g. between href and =, as in href = www.homepage.com. Second, if the URL is enclosed in quotation marks, either single or double quotes may be used (' or ").

Unfortunately my RegEx skills are self-taught, so who knows how good or useless my advice is! Here is my current best attempt (abstracted from my web site links checker routine):

    $RegEx = "/( . href * . = *['\"]?)([^'\" ]*)(['\" ])/i";
    if ( DEBUG ) echo "<br>RegEx=$RegEx~";
    $bValidity = $iFound = preg_match_all( $RegEx, $HTML, $aRegExOut );

An improvement might be for the closing quote to refer back to the opening quote (if any).

I am willing to watch, listen, and learn, if anyone can offer improvements/wisdom.

Regards,
=dn
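The improvement dn suggests, having the closing quote refer back to the opening one, can be expressed with a PCRE backreference. The following is a minimal sketch under assumptions: the pattern, the sample markup, and the variable names are illustrative only and are not dn's links checker code.

```php
<?php
// Hypothetical sample markup covering double-quoted, single-quoted, and unquoted values.
$html = '<a href = "http://www.homepage.com/">Home</a> '
      . "<a HREF='page two.html'>Two</a> "
      . '<a href=plain.html>Three</a>';

// Group 1 captures the opening quote, or nothing for an unquoted value.
// \1 then demands the same character (or nothing) after the value.
// The lookahead requires whitespace or ">" next, which ends unquoted values.
$regex = '/href\s*=\s*(["\']?)([^"\'>]*?)\1(?=[\s>])/i';

preg_match_all($regex, $html, $out);

// Group 2 holds the URLs without their quotes.
$urls = $out[2];
print_r($urls);
```

Because \1 repeats whatever group 1 captured, a value opened with a double quote cannot be closed by a single quote, and the space inside 'page two.html' stays part of the value.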
[PHP] hyperlink parser
Hello,

I need to parse an HTML page and get all the hyperlinks in it. Is there an easy way to do it? I'm not so good at regexps :(

$file=file("http://url.html"); could be a good start :)

Thanks for any help,
Roman
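One detail about that starting point: file() returns an array with one element per line rather than a single string, so the lines usually need joining before a pattern is applied to the whole page. A small sketch follows, using a temporary local file as a stand-in for the remote URL; the file contents and variable names are assumptions made for the example.

```php
<?php
// Hypothetical local file standing in for the remote page.
$path = tempnam(sys_get_temp_dir(), 'links');
$page = "<a href=\"one.html\">One</a>\n<a href=\"two.html\">Two</a>\n";
file_put_contents($path, $page);

// file() returns one array element per line, each with its line ending attached.
$lines = file($path);

// Join the lines back into a single string before searching it.
$html = implode('', $lines);
unlink($path);

echo count($lines), " lines read\n";
```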
Re: [PHP] hyperlink parser
On Saturday 17 August 2002 19:25, Roman wrote:
> Hello,
>
> I need to parse an HTML page and get all the hyperlinks in it. Is there an
> easy way to do it? I'm not so good at regexps :(
>
> $file=file("http://url.html"); could be a good start :)

Even better, search the archives.

--
Jason Wong - Gremlins Associates - www.gremlins.com.hk
Open Source Software Systems Integrators
* Web Design Hosting * Internet Intranet Applications Development *

/*
Mother is far too clever to understand anything she does not like.
-- Arnold Bennett
*/
RE: [PHP] hyperlink parser
> On Saturday 17 August 2002 19:25, Roman wrote:
> > I need to parse an HTML page and get all the hyperlinks in it. Is there an
> > easy way to do it? I'm not so good at regexps :(
>
> Even better, search the archives.

Please, how do I reach the archives, and what should I search for? I have the last 7,000 messages in my mailer :( I would be here for 2 weeks if I had to read them one by one :(

Roman
Re: [PHP] hyperlink parser
On Saturday 17 August 2002 20:07, Roman wrote:
> > On Saturday 17 August 2002 19:25, Roman wrote:
> > > I need to parse an HTML page and get all the hyperlinks in it. Is there
> > > an easy way to do it? I'm not so good at regexps :(
> >
> > Even better, search the archives.
>
> Please, how do I reach the archives

Look on the PHP website for where the archives are found.

> and what should I search for?

I don't know. Try various combinations of "extract", "url", "regex", "hyperlink", "parse", and "link".

> I have the last 7,000 messages in my mailer :( I would be here for 2 weeks
> if I had to read them one by one :(

In that case it might be quicker for you to write your own ;-)

--
Jason Wong

/*
Love may laugh at locksmiths, but he has a profound respect for money bags.
-- Sidney Paternoster, The Folly of the Wise
*/