php-general Digest 23 Oct 2009 19:00:51 -0000 Issue 6406

Topics (messages 299249 through 299274):

Re: Spam opinions please
        299249 by: Peter Ford
        299252 by: Ashley Sheridan

Re: Is there any way to get all the function name being called in a process?
        299250 by: Andrea Giammarchi
        299253 by: Ashley Sheridan
        299254 by: Andrea Giammarchi
        299255 by: Satya Narayan Singh
        299257 by: kranthi
        299259 by: Eddie Drapkin
        299260 by: Andrea Giammarchi

Re: input form save and display conflict
        299251 by: Ashley Sheridan

Re: php mail() function
        299256 by: John Black
        299258 by: Bob McConnell

Fedora 11 PHP install problems
        299261 by: Ashley Sheridan
        299262 by: Israel Ekpo
        299264 by: Ashley Sheridan
        299266 by: Israel Ekpo
        299268 by: Ashley Sheridan

regex pattern for extracting URLs
        299263 by: Brad Fuller
        299265 by: Jim Lucas
        299267 by: Ashley Sheridan
        299269 by: Brad Fuller
        299270 by: Ashley Sheridan
        299271 by: Israel Ekpo
        299272 by: Brad Fuller
        299273 by: Brad Fuller

Re: Sessions seems to kill db connection
        299274 by: Kim Madsen

Administrivia:

To subscribe to the digest, e-mail:
        php-general-digest-subscr...@lists.php.net

To unsubscribe from the digest, e-mail:
        php-general-digest-unsubscr...@lists.php.net

To post to the list, e-mail:
        php-gene...@lists.php.net


----------------------------------------------------------------------
--- Begin Message ---
Ashley Sheridan wrote:
> 
> 
> Won't stop a bot worth it's salt either, hence the need for more complex
> and confusing captchas. The best way to stop spam, is to use linguistic
> testing on the content being offered, which protects against bot and
> human spammer alike.
> 
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk
> 
> 
> 

Unfortunately, it might also confound someone who doesn't speak the language.
Admittedly, they would probably already be struggling with the rest of the 
site...

I guess locale-dependent captchas are a possibility.


-- 
Peter Ford                              phone: 01580 893333
Developer                               fax:   01580 893399
Justcroft International Ltd., Staplehurst, Kent

--- End Message ---
--- Begin Message ---
On Fri, 2009-10-23 at 08:55 +0100, Peter Ford wrote:

> Ashley Sheridan wrote:
> > 
> > 
> > Won't stop a bot worth it's salt either, hence the need for more complex
> > and confusing captchas. The best way to stop spam, is to use linguistic
> > testing on the content being offered, which protects against bot and
> > human spammer alike.
> > 
> > Thanks,
> > Ash
> > http://www.ashleysheridan.co.uk
> > 
> > 
> > 
> 
> Unfortunately, it might also confound someone who doesn't speak the language.
> Admittedly, they would probably already be struggling with the rest of the 
> site...
> 
> I guess locale-dependent captchas are a possibility.
> 
> 
> -- 
> Peter Ford                              phone: 01580 893333
> Developer                               fax:   01580 893399
> Justcroft International Ltd., Staplehurst, Kent
> 


I'm not talking about language problems for the user to solve. This
question originally started with the OP asking for solutions to human
spam, but most of what I've seen so far in the thread is about how
to stop bots. By linguistic analysis, I mean passing the user-submitted
content through a filter that estimates the probability that it is
spam. This goes beyond just looking for spammy words: it looks at the
relationships between words, the frequency of words, and much more. It's
very complex, but at the end of it each post gets a rating value, which
can be compared against a threshold to decide whether the post is
automatically approved.
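
As a rough illustration of the "rating value plus threshold" idea only, here
is a toy PHP sketch. It simply averages per-word spam probabilities, which is
far cruder than the analysis described above. The function name spam_score(),
the $wordSpamProb table and the 0.4 default for unknown words are all
assumptions made for this example, not taken from any real filter.

```php
<?php
// Toy scorer: averages per-word spam probabilities.
// $wordSpamProb is hypothetical data that would normally come from
// training on known spam and non-spam posts.
function spam_score($text, array $wordSpamProb, $unknown = 0.4)
{
    // Split into lowercase words, dropping punctuation and empty pieces.
    $words = preg_split('/[^a-z0-9\']+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) === 0) {
        return 0.0;
    }

    $sum = 0.0;
    foreach ($words as $word) {
        // Words never seen in training get a neutral-ish default.
        $sum += isset($wordSpamProb[$word]) ? $wordSpamProb[$word] : $unknown;
    }
    return $sum / count($words);
}

// Hypothetical probabilities and threshold, for illustration only.
$wordSpamProb = array('cheap' => 0.9, 'pills' => 0.9, 'php' => 0.1, 'regex' => 0.1);
$threshold    = 0.5;

$score = spam_score('Cheap pills', $wordSpamProb);
echo $score > $threshold ? "hold for moderation\n" : "auto-approve\n";
```

A real filter would also weigh word pairs and relative frequencies, as
described above; this only shows where the threshold fits in.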

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
http://uk3.php.net/manual/en/function.get-defined-functions.php
get_defined_functions

Regards

> Date: Fri, 23 Oct 2009 11:54:34 +0530
> From: astra.sat...@gmail.com
> To: php-gene...@lists.php.net
> Subject: [PHP] Is there any way to get all the function name being called in 
> a        process?
> 
> Hi,
> 
> I am working on reverse engineering for a web project. I was trying to know
> that, is there any way(function by PHP, Zend, extension etc)
> to find out how many function has been called to perform a task.
> 
> If no, can you suggest is it possible/feasible or not?
> 
> 
> Thanks in advance
> -- 
> Satya
> Bangalore.
                                          

--- End Message ---
--- Begin Message ---
On Fri, 2009-10-23 at 10:27 +0200, Andrea Giammarchi wrote:

> http://uk3.php.net/manual/en/function.get-defined-functions.php
> get_defined_functions
> 
> Regards
> 
> > Date: Fri, 23 Oct 2009 11:54:34 +0530
> > From: astra.sat...@gmail.com
> > To: php-gene...@lists.php.net
> > Subject: [PHP] Is there any way to get all the function name being called 
> > in a      process?
> > 
> > Hi,
> > 
> > I am working on reverse engineering for a web project. I was trying to know
> > that, is there any way(function by PHP, Zend, extension etc)
> > to find out how many function has been called to perform a task.
> > 
> > If no, can you suggest is it possible/feasible or not?
> > 
> > 
> > Thanks in advance
> > -- 
> > Satya
> > Bangalore.
>                                         


That won't do what the OP asked; it will just return a list of all the
functions defined, which could be a lot more than are actually called
in a process, such as in the case of included libraries of functions.

Would some form of PHP debugger help here? I've not used any debuggers
before, but I would imagine that this is something which could be
achieved quite easily with one.

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---

> That won't do what the OP asked, it will just return a list of all the
> functions defined, which could be a lot more than is actually being used
> in a process, such as in the case of included libraries of functions.

uhm, right, I guess APD then:
http://uk3.php.net/manual/en/book.apd.php

Regards
                                          

--- End Message ---
--- Begin Message ---
Thanks a lot.

APD does just what I was looking for.




On Fri, Oct 23, 2009 at 5:13 PM, Andrea Giammarchi <an_...@hotmail.com>wrote:

>
> > That won't do what the OP asked, it will just return a list of all the
> > functions defined, which could be a lot more than is actually being used
> > in a process, such as in the case of included libraries of functions.
>
> uhm, right, I guess APD then:
> http://uk3.php.net/manual/en/book.apd.php
>
> Regards
>
>



-- 
Satya
Bangalore.

--- End Message ---
--- Begin Message ---
Even APD is not up to the task.

An xdebug trace (http://devzone.zend.com/article/2871) is sufficient, but
the output will be written to a separate file.

--- End Message ---
--- Begin Message ---
There's xdebug, as mentioned, that'll do it as an extension.

What you REALLY probably are looking for is http://php.net/debug_backtrace

And what kind of reverse engineering would you be doing without
reflection? ( http://php.net/reflection ) ;]
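
For what it's worth, a minimal sketch of the debug_backtrace() approach
follows. Note that it only reports the functions on the call stack at the
moment it is called, not every function a request ever ran, so it is no
substitute for a full trace. The helper name call_chain() and the two demo
functions are made up for this example.

```php
<?php
// Returns the names of the functions currently on the call stack,
// innermost first, excluding this helper itself.
function call_chain()
{
    $names = array();
    // Frame 0 is the call to call_chain() itself, so skip it.
    $frames = array_slice(debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS), 1);
    foreach ($frames as $frame) {
        $names[] = isset($frame['class'])
            ? $frame['class'] . $frame['type'] . $frame['function']
            : $frame['function'];
    }
    return $names;
}

// Hypothetical demo functions.
function inner()
{
    return call_chain();
}

function outer()
{
    return inner();
}

// Run as a top-level script, this lists "inner" and then "outer".
print_r(outer());
```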

--- End Message ---
--- Begin Message ---

> even APD is not up to the task....
> 
> xdebug trace http://devzone.zend.com/article/2871 is sufficient, but
> the output will be in a separate file.



> Thank a lot.
> 
> APD is just doing what I was looking for.
> 
> -- 
> Satya
> Bangalore.

Regards
                                          

--- End Message ---
--- Begin Message ---
On Thu, 2009-10-22 at 21:32 -0400, PJ wrote:

> I have several input fields to update a book database. There seems to be
> a conflict in the way tags and text are input through php/mysql and
> phpMyAdmin. If I enter the data with phpMyAdmin the input fields in the
> php page see quotation marks differently than what is input in phpMyAdmin.
> example:
> if the data is input through the update form, single quotes cause an
> error. Double quotes update the db but when the edit(update) form
> displays the text for modification outside the input field except for
> the first part, precisely where the first quotation mark appears in the
> text - as below:
> 
> *<b>Reviewed by <a href=*"mailto:recipi...@somewhere.com";>Recipient:
> blah, blah, blah...religion." _size="50" />_
> The text in square brackets is displayed outside the input field and
> includes part of the code at the end.
> bold is within the field, the rest is outside and the underlined is part
> of code.
> 
> If the same text is entered with phpMyAdmin using single quotes and the
> &quot; characters, the display in the editing field shows correctly...
> but it will not update, that is, the update query generates errors and
> only accepts the double quotes within the tags.
> 
> So, the question is, are there some kind of metacharacters to be used to
> have mysql accept the " ? I have triee backslashing, forward slashing
> and they don't do it.
> 
> Or is there an encoding conflict here? It looks like a display and save
> mismatch somewhere...
> 
> below is another example:
> <a
> href='http://www.amazon.com/exec/obidos/ASIN/0773468943/frankiesbibliogo'><IMG
> height=68 alt="Order This Book From Amazon.com"
> src="../images/amazon1.gif" width=90 border=0 /></a>
> 
> The single quotes for the href seem to work. But the " does not work;
> and using &quot; or &rsquo;  also also do not display correctly; again,
> from "Order... the image is not displayed but only the image blank with
> "Order.. " in it.
> I'm rather puzzled.
> 
> 
> 
> 
> 
> 
> 
> 


Single quotes need to be escaped if you are using them inside a string
value in a query. For example:

$query = "UPDATE table SET title='This is a title with \"quoted\" \'characters\''";

Note that here double quotes delimit the whole query string in PHP (this
is generally the preferred way), while the value of the title field is
delimited by single quotes. Where I've wanted double quotes inside the
value, I've escaped them with a backslash. That escapes them for PHP
only; MySQL is using single quotes as its string delimiters, so plain
double quotes are fine in the query itself. The single quotes are also
escaped with backslashes, but this time for MySQL: PHP leaves \' untouched
inside a double-quoted string, so the backslash reaches MySQL and tells it
that the quote is part of the value rather than the end of the string.
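
The other half of the problem, text spilling out of the input field at the
first quotation mark, looks like the stored value being printed unescaped
into the value="..." attribute. A minimal sketch of escaping it on output
follows; render_text_input() and the sample $review string are assumptions
for this example, not PJ's actual code. For the query side, hand-escaping
as above works, but mysqli_real_escape_string() or a prepared statement is
less error-prone.

```php
<?php
// Hypothetical helper: renders a text input whose value may contain
// quotes and HTML tags.
function render_text_input($name, $value)
{
    // ENT_QUOTES converts both " and ' to entities, so neither can
    // close the attribute early; < and > are converted as well.
    return '<input type="text" name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8')
         . '" value="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
         . '" size="50" />';
}

// Sample value with the kind of markup described in the question.
$review = '<b>Reviewed by <a href="mailto:someone@example.com">Recipient</a></b>';
echo render_text_input('review', $review), "\n";
```

The browser turns the entities back into the original characters when the
form is submitted, so the database still receives the raw text.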

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
Paul M Foster wrote:
Regarding the rejection of dynamic IPs by smarthosts, are you saying
that it's a "blacklist" of sorts that lets them know an IP is dynamic?
(Serious question. I don't know the mechanism by which they determine
what is and isn't a dynamic IP.)

I run my own mail server and use the zen blocklist from spamhaus.org. The zen list combines all of the anti-spam lists plus all IPs designated to end users (http://www.spamhaus.org/zen/).

The reason for blocking end users is that a lot of spam is sent out by compromised machines on home Internet connections. I am currently getting about 5 connections every 2 seconds from compromised computers attempting to spam my server. The origin is usually a dynamically assigned IP from Sprint or Comcast (USA ISPs). So blocking the end users from sending spam tends to cut down on a LOT of junk.

Here is a bit more info about the end user blocklist.
http://www.spamhaus.org/pbl/
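
For anyone curious how such a list is consulted: DNS blocklists are queried
by reversing the octets of the IPv4 address, appending the list's zone and
doing an ordinary DNS lookup; if a record comes back, the address is listed.
Below is a small PHP sketch of that idea. The function names
dnsbl_query_name() and is_listed() are made up for this example, and the
live lookup needs working DNS, so treat it as an illustration rather than
a drop-in check.

```php
<?php
// Build the DNS name that a blocklist lookup for an IPv4 address uses:
// octets reversed, followed by the blocklist zone.
function dnsbl_query_name($ip, $zone = 'zen.spamhaus.org')
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null; // not a valid IPv4 address
    }
    return implode('.', array_reverse(explode('.', $ip))) . '.' . $zone;
}

// Hypothetical wrapper: true if the list has an A record for the address.
// Requires network access and a resolver the list operator will answer.
function is_listed($ip, $zone = 'zen.spamhaus.org')
{
    $name = dnsbl_query_name($ip, $zone);
    return $name !== null && checkdnsrr($name, 'A');
}

// 192.0.2.99 is a documentation-range address used purely as an example.
echo dnsbl_query_name('192.0.2.99'), "\n";
```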

--
John
Question / Answer based CAPTCHA
http://www.network-technologies.org/tiny.php?id=1

--- End Message ---
--- Begin Message ---
From: Paul M Foster

> Regarding the rejection of dynamic IPs by smarthosts, are you saying
> that it's a "blacklist" of sorts that lets them know an IP is dynamic?
> (Serious question. I don't know the mechanism by which they determine
> what is and isn't a dynamic IP.)

You are talking about two different mechanisms here. The black or grey
lists are services that track known open relays and other sources of
spam, viruses and assorted malware. Anyone can subscribe to them and use
them to validate relay requests.

There are also services that keep track of valid domain addresses and
the IP assigned to them. They are usually called DNS hosts. These can be
polled to identify the IP address for authorized domains and hosts.
There are even special records (MX records) for the email servers within a domain.
Most dynamically allocated IP addresses will not show up on these
servers unless you have access to a service authorized to inject
records.

So basically, qmail did a DNS lookup on your host/domain name and did
not find a record pointing to your server. Therefore it rejected your
request.

You will have to ask your ISP for the address of their SMTP and POP
servers if you don't know them. But usually your email client is already
configured to talk to them. I would just go into my Thunderbird setup
and look up those addresses.

Bob McConnell

--- End Message ---
--- Begin Message ---
Hiya,

I know this isn't the best place to ask, but I figured enough people
here would have at some point installed PHP on some Linux variant, and
I'm hoping that some of you may have even done it on Fedora 11.

I'm using the built-in GUI package manager in Fedora (KPackageKit) to
install php, mysql and php-mysql. I let it resolve the dependencies and
install, but phpMyAdmin refuses to connect to the MySQL database with
the default username and password (root, no password) even though I can
successfully connect to plain old MySQL over the command line with
"mysql -u root -p".

I thought it was some problem with the built in repositories, so I
attempted to build PHP from source, and after sorting out the
dependencies manually, I got PHP installed and working from the command
line. However, when I try to start httpd up again, it gives me the
following error:

libphp5.so: undefined symbol: OnUpdateLong

I ran an nm command against the libphp5.so file, and the first line is
" U OnUpdateLong" (intentional spaces there).

Has anyone come across either of these problems and do you know of a way
to overcome either one? All I need really is a way to get it working, so
don't mind whether it's via the GUI or the command line!

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
On Fri, Oct 23, 2009 at 11:54 AM, Ashley Sheridan
<a...@ashleysheridan.co.uk>wrote:

> Hiya,
>
> I know this isn't the best place to ask, but I figured enough people
> here would have at some point installed PHP on some Linux variant, and
> I'm hoping that some of you may have even done it on Fedora 11.
>
> I'm using the built in gui package manager in Fedora (kpackagekit) to
> install php, mysql and php-mysql. I let it resolve the dependencies and
> install, but phpMyAdmin refuses to connect to the Mysql database with
> the default username and password (root, no password) even though I can
> successfully connect to plain old Mysql over the command line with
> "mysql -u root -p"
>
> I thought it was some problem with the built in repositories, so I
> attempted to build PHP from source, and after sorting out the
> dependencies manually, I got PHP installed and working from the command
> line. However, when I try to start httpd up again, it gives me the
> following error:
>
> libphp5.so: undefined symbol: OnUpdateLong
>
> I ran a nm command against the libphp5.so file, and the first line is "
> U OnUpdateLong" (intentional spaces there)
>
> Has anyone come across either of these problems and do you know of a way
> to overcome either one? All I need really is a way to get it working, so
> don't mind whether it's via the GUI or the command line!
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>
>
>

Hi Ashley,

What version of PHP did you compile?

What were your ./configure options?

It looks like the php source is using OnUpdateLong but the source file
containing OnUpdateLong is not included or linked properly.

I have run into this kind of problem before.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.

--- End Message ---
--- Begin Message ---
On Fri, 2009-10-23 at 13:19 -0400, Israel Ekpo wrote:

> 
> 
> 
> On Fri, Oct 23, 2009 at 11:54 AM, Ashley Sheridan
> <a...@ashleysheridan.co.uk> wrote:
> 
>         Hiya,
>         
>         I know this isn't the best place to ask, but I figured enough
>         people
>         here would have at some point installed PHP on some Linux
>         variant, and
>         I'm hoping that some of you may have even done it on Fedora
>         11.
>         
>         I'm using the built in gui package manager in Fedora
>         (kpackagekit) to
>         install php, mysql and php-mysql. I let it resolve the
>         dependencies and
>         install, but phpMyAdmin refuses to connect to the Mysql
>         database with
>         the default username and password (root, no password) even
>         though I can
>         successfully connect to plain old Mysql over the command line
>         with
>         "mysql -u root -p"
>         
>         I thought it was some problem with the built in repositories,
>         so I
>         attempted to build PHP from source, and after sorting out the
>         dependencies manually, I got PHP installed and working from
>         the command
>         line. However, when I try to start httpd up again, it gives me
>         the
>         following error:
>         
>         libphp5.so: undefined symbol: OnUpdateLong
>         
>         I ran a nm command against the libphp5.so file, and the first
>         line is "
>         U OnUpdateLong" (intentional spaces there)
>         
>         Has anyone come across either of these problems and do you
>         know of a way
>         to overcome either one? All I need really is a way to get it
>         working, so
>         don't mind whether it's via the GUI or the command line!
>         
>         Thanks,
>         Ash
>         http://www.ashleysheridan.co.uk
>         
>         
> 
> 
> 
> Hi Ashey,
> 
> What version of PHP did you compile?
> 
> What were your ./configure options?
> 
> It looks like the php source is using OnUpdateLong but the source file
> containing OnUpdateLong is not included or linked properly.
> 
> I have ran into this kind of problem before.
> -- 
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.


I've actually got some steps closer to figuring out the problem. It
seems that the connection between PHP and MySQL was fine, it just
refused a connection as root. If I create another user it connects via
phpMyAdmin just fine.

The thing now is figuring out either how to get it to connect as root
(it currently keeps refusing root) or create another user with all the
privileges of root so I can connect with that. As of yet I haven't been
able to figure out either!
 

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
What about the error

libphp5.so: undefined symbol: OnUpdateLong

Are you still observing that error?

On Fri, Oct 23, 2009 at 1:23 PM, Ashley Sheridan
<a...@ashleysheridan.co.uk>wrote:

>  On Fri, 2009-10-23 at 13:19 -0400, Israel Ekpo wrote:
>
>
>
>  On Fri, Oct 23, 2009 at 11:54 AM, Ashley Sheridan <
> a...@ashleysheridan.co.uk> wrote:
>
> Hiya,
>
> I know this isn't the best place to ask, but I figured enough people
> here would have at some point installed PHP on some Linux variant, and
> I'm hoping that some of you may have even done it on Fedora 11.
>
> I'm using the built in gui package manager in Fedora (kpackagekit) to
> install php, mysql and php-mysql. I let it resolve the dependencies and
> install, but phpMyAdmin refuses to connect to the Mysql database with
> the default username and password (root, no password) even though I can
> successfully connect to plain old Mysql over the command line with
> "mysql -u root -p"
>
> I thought it was some problem with the built in repositories, so I
> attempted to build PHP from source, and after sorting out the
> dependencies manually, I got PHP installed and working from the command
> line. However, when I try to start httpd up again, it gives me the
> following error:
>
> libphp5.so: undefined symbol: OnUpdateLong
>
> I ran a nm command against the libphp5.so file, and the first line is "
> U OnUpdateLong" (intentional spaces there)
>
> Has anyone come across either of these problems and do you know of a way
> to overcome either one? All I need really is a way to get it working, so
> don't mind whether it's via the GUI or the command line!
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>
>
>
>
> Hi Ashey,
>
> What version of PHP did you compile?
>
> What were your ./configure options?
>
> It looks like the php source is using OnUpdateLong but the source file
> containing OnUpdateLong is not included or linked properly.
>
> I have ran into this kind of problem before.
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
>
>
> I've actually got some steps closer to figuring out the problem. It seems
> that the connection between PHP and MySQL was fine, it just refused a
> connection as root. If I create another user it connects via phpMyAdmin just
> fine.
>
> The thing now is figuring out either how to get it to connect as root (it
> currently keeps refusing root) or create another user with all the
> privileges of root so I can connect with that. As of yet I haven't been able
> to figure out either!
>
>
>   Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>
>
>


-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.

--- End Message ---
--- Begin Message ---
On Fri, 2009-10-23 at 13:29 -0400, Israel Ekpo wrote:

> What about the error
> 
> libphp5.so: undefined symbol: OnUpdateLong
> 
> Are you still observing that error?
> 
> On Fri, Oct 23, 2009 at 1:23 PM, Ashley Sheridan
> <a...@ashleysheridan.co.uk>wrote:
> 
> >  On Fri, 2009-10-23 at 13:19 -0400, Israel Ekpo wrote:
> >
> >
> >
> >  On Fri, Oct 23, 2009 at 11:54 AM, Ashley Sheridan <
> > a...@ashleysheridan.co.uk> wrote:
> >
> > Hiya,
> >
> > I know this isn't the best place to ask, but I figured enough people
> > here would have at some point installed PHP on some Linux variant, and
> > I'm hoping that some of you may have even done it on Fedora 11.
> >
> > I'm using the built in gui package manager in Fedora (kpackagekit) to
> > install php, mysql and php-mysql. I let it resolve the dependencies and
> > install, but phpMyAdmin refuses to connect to the Mysql database with
> > the default username and password (root, no password) even though I can
> > successfully connect to plain old Mysql over the command line with
> > "mysql -u root -p"
> >
> > I thought it was some problem with the built in repositories, so I
> > attempted to build PHP from source, and after sorting out the
> > dependencies manually, I got PHP installed and working from the command
> > line. However, when I try to start httpd up again, it gives me the
> > following error:
> >
> > libphp5.so: undefined symbol: OnUpdateLong
> >
> > I ran a nm command against the libphp5.so file, and the first line is "
> > U OnUpdateLong" (intentional spaces there)
> >
> > Has anyone come across either of these problems and do you know of a way
> > to overcome either one? All I need really is a way to get it working, so
> > don't mind whether it's via the GUI or the command line!
> >
> > Thanks,
> > Ash
> > http://www.ashleysheridan.co.uk
> >
> >
> >
> >
> > Hi Ashey,
> >
> > What version of PHP did you compile?
> >
> > What were your ./configure options?
> >
> > It looks like the php source is using OnUpdateLong but the source file
> > containing OnUpdateLong is not included or linked properly.
> >
> > I have ran into this kind of problem before.
> > --
> > "Good Enough" is not good enough.
> > To give anything less than your best is to sacrifice the gift.
> > Quality First. Measure Twice. Cut Once.
> >
> >
> > I've actually got some steps closer to figuring out the problem. It seems
> > that the connection between PHP and MySQL was fine, it just refused a
> > connection as root. If I create another user it connects via phpMyAdmin just
> > fine.
> >
> > The thing now is figuring out either how to get it to connect as root (it
> > currently keeps refusing root) or create another user with all the
> > privileges of root so I can connect with that. As of yet I haven't been able
> > to figure out either!
> >
> >
> >   Thanks,
> > Ash
> > http://www.ashleysheridan.co.uk
> >
> >
> >
> 
> 


Well, I'm not compiling from the source now, I went back to trying to
use the Fedora repositories (which was the only reason I ended up trying
to compile from source in the first place)

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
I'm looking for a regular expression to accomplish a specific task.

I'm hoping someone who's really good at regex patterns can lend a quick hand.

I need a regex pattern that will grab URLs out of HTML that have a
certain link text (e.g. the word "Continue").

This is what I have so far but it does not work properly (If there are
other attributes in the <a> tag it returns them as part of the URL.)

    
preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
$html, $matches);

It needs to be able to extract the URL and disregard arbitrary
attributes in the HTML tag

Test it with the following examples:

<a href=/path/to/url.html>Continue</a>
<a href='/path/to/url.html'>Continue</a>
<a href="http://example.com/path/to/url.html" class="link">Continue</a>
<a style="font-size: 12px" href="http://example.com/path/to/url.html"
onclick="someFunction('foo','bar')">Continue</a>

Please reply

Your help is much appreciated.

Thanks in advance,
Brad F.

--- End Message ---
--- Begin Message ---
Brad Fuller wrote:
> I'm looking for a regular expression to accomplish a specific task.
> 
> I'm hoping someone who's really good at regex patterns can lend a quick hand.
> 
> I need a regex pattern that will grab URLs out of HTML that have a
> certain link text. (i.e. the word "Continue")
> 
> This is what I have so far but it does not work properly (If there are
> other attributes in the <a> tag it returns them as part of the URL.)
> 
>     
> preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
> $html, $matches);
> 
> It needs to be able to extract the URL and disregard arbitrary
> attributes in the HTML tag
> 
> Test it with the following examples:
> 
> <a href=/path/to/url.html>Continue</a>
> <a href='/path/to/url.html'>Continue</a>
> <a href="http://example.com/path/to/url.html"; class="link">Continue</a>
> <a style="font-size: 12px" href="http://example.com/path/to/url.html";
> onlick="someFunction('foo','bar')">Continue</a>
> 
> Please reply
> 
> Your help is much appreciated.
> 
> Thanks in advance,
> Brad F.
> 

Looking at this from an XML standpoint, I could see doing this rather
easily, without having to use regex. You might look into using DOMDocument
and SimpleXML to complete the task.
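
A minimal sketch of what that could look like with DOMDocument, assuming the
dom extension is available. The function name continue_links() is made up
for this example, and the link text is compared after trimming whitespace.

```php
<?php
// Sketch: return the href of every <a> whose text is exactly $text.
function continue_links($html, $text = 'Continue')
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings
    // internally instead of emitting them, then discard them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = array();
    foreach ($doc->getElementsByTagName('a') as $a) {
        if (trim($a->textContent) === $text && $a->hasAttribute('href')) {
            $urls[] = $a->getAttribute('href');
        }
    }
    return $urls;
}

$html = '<a href=/path/to/url.html>Continue</a>'
      . '<a href="/somewhere/else.html" class="link">Back</a>';
print_r(continue_links($html));
```

Because the parser handles quoting and attribute order, extra attributes
before or after href do not need any special treatment.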

--- End Message ---
--- Begin Message ---
On Fri, 2009-10-23 at 13:23 -0400, Brad Fuller wrote:

> I'm looking for a regular expression to accomplish a specific task.
> 
> I'm hoping someone who's really good at regex patterns can lend a quick hand.
> 
> I need a regex pattern that will grab URLs out of HTML that have a
> certain link text. (i.e. the word "Continue")
> 
> This is what I have so far but it does not work properly (If there are
> other attributes in the <a> tag it returns them as part of the URL.)
> 
>     
> preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
> $html, $matches);
> 
> It needs to be able to extract the URL and disregard arbitrary
> attributes in the HTML tag
> 
> Test it with the following examples:
> 
> <a href=/path/to/url.html>Continue</a>
> <a href='/path/to/url.html'>Continue</a>
> <a href="http://example.com/path/to/url.html"; class="link">Continue</a>
> <a style="font-size: 12px" href="http://example.com/path/to/url.html";
> onlick="someFunction('foo','bar')">Continue</a>
> 
> Please reply
> 
> Your help is much appreciated.
> 
> Thanks in advance,
> Brad F.
> 


preg_match_all('#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+?).+?>Continue</a>#i', $html, $matches);

I just changed your regex a bit. What your regex was previously doing
was matching everything from the first quote after the href= right up
until the first > it found, which would usually be the one that closes
the opening tag. You could make it a bit more intelligent if you wished
with backreferencing to make sure it matches against the same type of
quotation character it matched as the start of the href's value.
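
A sketch of the backreferencing idea, offered as one possible variant rather
than a tested drop-in: the quoted branch remembers which quote opened the
value and reads up to the same quote again, and a second branch handles
unquoted values. The helper name extract_continue_urls() is made up for this
example, and the pattern would also accept attributes that merely end in
"href" (such as data-href).

```php
<?php
// Sketch: collect href values of <a> tags whose link text is "Continue".
function extract_continue_urls($html)
{
    $pattern = '#<a\s[^>]*?\bhref\s*=\s*'
             . '(?:(["\'])((?:(?!\1).)*)\1'  // quoted: group 1 = quote, group 2 = URL
             . '|([^\s>"\']+))'              // unquoted: group 3 = URL
             . '[^>]*>\s*Continue\s*</a>#is';

    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $urls = array();
    foreach ($matches as $m) {
        // Group 3 is only present when the unquoted branch matched.
        $urls[] = (isset($m[3]) && $m[3] !== '') ? $m[3] : $m[2];
    }
    return $urls;
}

$html = <<<'HTML'
<a href=/path/to/url.html>Continue</a>
<a href='/path/to/url.html'>Continue</a>
<a href="http://example.com/path/to/url.html" class="link">Continue</a>
<a style="font-size: 12px" href="http://example.com/path/to/url.html"
onclick="someFunction('foo','bar')">Continue</a>
HTML;

print_r(extract_continue_urls($html));
```

The attributes after the href value are consumed by [^>]*, so they never
end up in the captured URL.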

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
On Fri, Oct 23, 2009 at 1:28 PM, Ashley Sheridan
<a...@ashleysheridan.co.uk>wrote:

>  On Fri, 2009-10-23 at 13:23 -0400, Brad Fuller wrote:
>
> I'm looking for a regular expression to accomplish a specific task.
>
> I'm hoping someone who's really good at regex patterns can lend a quick hand.
>
> I need a regex pattern that will grab URLs out of HTML that have a
> certain link text. (i.e. the word "Continue")
>
> This is what I have so far but it does not work properly (If there are
> other attributes in the <a> tag it returns them as part of the URL.)
>
>     
> preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
> $html, $matches);
>
> It needs to be able to extract the URL and disregard arbitrary
> attributes in the HTML tag
>
> Test it with the following examples:
>
> <a href=/path/to/url.html>Continue</a>
> <a href='/path/to/url.html'>Continue</a>
> <a href="http://example.com/path/to/url.html"; class="link">Continue</a>
> <a style="font-size: 12px" href="http://example.com/path/to/url.html";
> onlick="someFunction('foo','bar')">Continue</a>
>
> Please reply
>
> Your help is much appreciated.
>
> Thanks in advance,
> Brad F.
>
>
>
> preg_match_all('#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+?).+?>Continue</a>#i',
> $html, $matches);
>
> I just changed your regex a bit. What your regex was previously doing was
> matching everything from the first quote after the href= right up until the
> first > it found, which would usually be the one that closes the opening
> tag. You could make it a bit more intelligent if you wished with
> backreferencing to make sure it matches against the same type of quotation
> character it matched as the start of the href's value.
>
>   Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>
>
>

I appreciate the help.  However, when I try this I only get the first
character of the URL.  Can you double-check it, please?

Thanks again

--- End Message ---
--- Begin Message ---
On Fri, 2009-10-23 at 13:45 -0400, Brad Fuller wrote:

> On Fri, Oct 23, 2009 at 1:28 PM, Ashley Sheridan
> <a...@ashleysheridan.co.uk>wrote:
> 
> >  On Fri, 2009-10-23 at 13:23 -0400, Brad Fuller wrote:
> >
> > I'm looking for a regular expression to accomplish a specific task.
> >
> > I'm hoping someone who's really good at regex patterns can lend a quick 
> > hand.
> >
> > I need a regex pattern that will grab URLs out of HTML that have a
> > certain link text. (i.e. the word "Continue")
> >
> > This is what I have so far but it does not work properly (If there are
> > other attributes in the <a> tag it returns them as part of the URL.)
> >
> >     
> > preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
> > $html, $matches);
> >
> > It needs to be able to extract the URL and disregard arbitrary
> > attributes in the HTML tag
> >
> > Test it with the following examples:
> >
> > <a href=/path/to/url.html>Continue</a>
> > <a href='/path/to/url.html'>Continue</a>
> > <a href="http://example.com/path/to/url.html" class="link">Continue</a>
> > <a style="font-size: 12px" href="http://example.com/path/to/url.html"
> > onlick="someFunction('foo','bar')">Continue</a>
> >
> > Please reply
> >
> > Your help is much appreciated.
> >
> > Thanks in advance,
> > Brad F.
> >
> >
> >
> > preg_match_all('#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+?).+?>Continue</a>#i',
> > $html, $matches);
> >
> > I just changed your regex a bit. What your regex was previously doing was
> > matching everything from the first quote after the href= right up until the
> > first > it found, which would usually be the one that closes the opening
> > tag. You could make it a bit more intelligent if you wished with
> > backreferencing to make sure it matches against the same type of quotation
> > character it matched as the start of the href's value.
> >
> >   Thanks,
> > Ash
> > http://www.ashleysheridan.co.uk
> >
> >
> >
> 
> I appreciate the help.  However, when try this I only get the first
> character of the URL.  Can you double check it please.
> 
> Thanks again


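I think it's probably the first ? in ([^\"\']+?). It makes that group
lazy, so it stops after a single character and the .+? that follows
swallows the rest of the URL.

Remove that and it should do the trick.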
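
A quick way to see the difference is to run both versions of the pattern from this thread against one of the quoted test links. The sample string and the variable names below are only for illustration.

```php
<?php
// One of the test links from this thread, used as sample input.
$html = '<a href="http://example.com/path/to/url.html" class="link">Continue</a>';

// Lazy capture: "+?" takes as little as possible, and the ".+?" after it
// is free to consume everything else up to ">Continue</a>".
$lazy = '#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+?).+?>Continue</a>#i';

// Greedy capture: "+" runs up to the closing quote before ".+?" takes over.
$greedy = '#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+).+?>Continue</a>#i';

preg_match($lazy, $html, $lazyMatch);
preg_match($greedy, $html, $greedyMatch);

echo $lazyMatch[1], "\n";   // only the first character of the URL
echo $greedyMatch[1], "\n"; // the full URL
```

Note that both patterns require a quote after href=, so the unquoted test case is not matched by either version.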

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
On Fri, Oct 23, 2009 at 1:48 PM, Ashley Sheridan
<a...@ashleysheridan.co.uk>wrote:

> On Fri, 2009-10-23 at 13:45 -0400, Brad Fuller wrote:
>
> > On Fri, Oct 23, 2009 at 1:28 PM, Ashley Sheridan
> > <a...@ashleysheridan.co.uk>wrote:
> >
> > >  On Fri, 2009-10-23 at 13:23 -0400, Brad Fuller wrote:
> > >
> > > I'm looking for a regular expression to accomplish a specific task.
> > >
> > > I'm hoping someone who's really good at regex patterns can lend a quick
> > > hand.
> > >
> > > I need a regex pattern that will grab URLs out of HTML that have a
> > > certain link text. (i.e. the word "Continue")
> > >
> > > This is what I have so far but it does not work properly (If there are
> > > other attributes in the <a> tag it returns them as part of the URL.)
> > >
> > >
> > > preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
> > > $html, $matches);
> > >
> > > It needs to be able to extract the URL and disregard arbitrary
> > > attributes in the HTML tag
> > >
> > > Test it with the following examples:
> > >
> > > <a href=/path/to/url.html>Continue</a>
> > > <a href='/path/to/url.html'>Continue</a>
> > > <a href="http://example.com/path/to/url.html" class="link">Continue</a>
> > > <a style="font-size: 12px" href="http://example.com/path/to/url.html"
> > > onlick="someFunction('foo','bar')">Continue</a>
> > >
> > > Please reply
> > >
> > > Your help is much appreciated.
> > >
> > > Thanks in advance,
> > > Brad F.
> > >
> > >
> > >
> > >
> > > preg_match_all('#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+?).+?>Continue</a>#i',
> > > $html, $matches);
> > >
> > > I just changed your regex a bit. What your regex was previously doing was
> > > matching everything from the first quote after the href= right up until the
> > > first > it found, which would usually be the one that closes the opening
> > > tag. You could make it a bit more intelligent if you wished with
> > > backreferencing to make sure it matches against the same type of quotation
> > > character it matched as the start of the href's value.
> > >
> > >   Thanks,
> > > Ash
> > > http://www.ashleysheridan.co.uk
> > >
> > >
> > >
> >
> > I appreciate the help.  However, when try this I only get the first
> > character of the URL.  Can you double check it please.
> >
> > Thanks again
>
>
> I think it's probably the first ? in ([^\"\']+?)
>
> Remove that and it should do the trick
>
> Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>
>
>
Hi Brad,

I agree with Jim.

Take a look at this. It might help.

<?php

// Sample markup; it has to be well-formed XML for SimpleXML to parse it.
$xml_string = <<<TEXT_BOUNDARY
<html>
    <head>
        <title></title>
    </head>
    <body>
        <div>
            <a href="http://example.com/path/to/urlA.html">Continue</a>
            <a href="http://example.com/path/to/url2.html">Brad Fuller</a>
            <a href="http://example.com/path/to/urlB.html">Continue</a>
            <a href="http://example.com/path/to/url4.html">PHP.net</a>
            <a href="http://example.com/path/to/urlC.html" class="link">Continue</a>
            <a style="font-size: 12px" href="http://example.com/path/to/urlD.html" onclick="someFunction('foo','bar')">Continue</a>
        </div>
    </body>
</html>
TEXT_BOUNDARY;

$xml = simplexml_load_string($xml_string);

// Select the href attribute of every <a> whose text is exactly "Continue".
$continue_hrefs = $xml->xpath("//a[text() = 'Continue']/@href");

print_r($continue_hrefs);

?>
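
One caveat with the SimpleXML approach: simplexml_load_string() expects well-formed XML, so real-world HTML such as the unquoted href in Brad's first test case will not parse. As a hedged alternative, the same XPath query can be run through DOMDocument::loadHTML(), which is more forgiving. The sample markup and variable names below are assumptions for illustration, and the sketch assumes the DOM extension is available.

```php
<?php
// Hypothetical sample markup, including an unquoted href that is not valid XML.
$html = <<<'HTML'
<html><body><div>
<a href=/path/to/urlA.html>Continue</a>
<a href="http://example.com/path/to/url2.html">Brad Fuller</a>
<a style="font-size: 12px" href="http://example.com/path/to/urlD.html" onclick="someFunction('foo','bar')">Continue</a>
</div></body></html>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Same XPath expression as in the SimpleXML example.
$xpath = new DOMXPath($doc);

$hrefs = array();
foreach ($xpath->query("//a[text() = 'Continue']/@href") as $attr) {
    // Each result is an attribute node; nodeValue is the URL string.
    $hrefs[] = $attr->nodeValue;
}

print_r($hrefs);
```

The trade-off is a little more code than the regex one-liner in exchange for not having to care about attribute order or quoting style.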

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.

--- End Message ---
--- Begin Message ---
On Fri, Oct 23, 2009 at 1:48 PM, Ashley Sheridan
<a...@ashleysheridan.co.uk>wrote:

>  On Fri, 2009-10-23 at 13:45 -0400, Brad Fuller wrote:
>
> On Fri, Oct 23, 2009 at 1:28 PM, Ashley Sheridan
> <a...@ashleysheridan.co.uk>wrote:
>
> >  On Fri, 2009-10-23 at 13:23 -0400, Brad Fuller wrote:
> >
> > I'm looking for a regular expression to accomplish a specific task.
> >
> > I'm hoping someone who's really good at regex patterns can lend a quick 
> > hand.
> >
> > I need a regex pattern that will grab URLs out of HTML that have a
> > certain link text. (i.e. the word "Continue")
> >
> > This is what I have so far but it does not work properly (If there are
> > other attributes in the <a> tag it returns them as part of the URL.)
> >
> >     
> > preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
> > $html, $matches);
> >
> > It needs to be able to extract the URL and disregard arbitrary
> > attributes in the HTML tag
> >
> > Test it with the following examples:
> >
> > <a href=/path/to/url.html>Continue</a>
> > <a href='/path/to/url.html'>Continue</a>
> > <a href="http://example.com/path/to/url.html" class="link">Continue</a>
> > <a style="font-size: 12px" href="http://example.com/path/to/url.html"
> > onlick="someFunction('foo','bar')">Continue</a>
> >
> > Please reply
> >
> > Your help is much appreciated.
> >
> > Thanks in advance,
> > Brad F.
> >
> >
> >
> > preg_match_all('#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+?).+?>Continue</a>#i',
> > $html, $matches);
> >
> > I just changed your regex a bit. What your regex was previously doing was
> > matching everything from the first quote after the href= right up until the
> > first > it found, which would usually be the one that closes the opening
> > tag. You could make it a bit more intelligent if you wished with
> > backreferencing to make sure it matches against the same type of quotation
> > character it matched as the start of the href's value.
> >
> >   Thanks,
> > Ash
> > http://www.ashleysheridan.co.uk
> >
> >
> >
>
> I appreciate the help.  However, when try this I only get the first
> character of the URL.  Can you double check it please.
>
> Thanks again
>
>
> I think it's probably the first ? in ([^\"\']+?)
>
> Remove that and it should do the trick
>
>   Thanks,
> Ash
> http://www.ashleysheridan.co.uk
>
>
>
That did the trick.  Thanks, Ash, you are awesome!

Also, thanks Jim for your suggestion.  I may move to SimpleXML if the project
grows much bigger.  But for now I was looking for a nice one-liner and this
is it.

Cheers,
Brad

--- End Message ---
--- Begin Message ---
On Fri, Oct 23, 2009 at 1:54 PM, Israel Ekpo <israele...@gmail.com> wrote:
>
>
> On Fri, Oct 23, 2009 at 1:48 PM, Ashley Sheridan <a...@ashleysheridan.co.uk>
> wrote:
>>
>> On Fri, 2009-10-23 at 13:45 -0400, Brad Fuller wrote:
>>
>> > On Fri, Oct 23, 2009 at 1:28 PM, Ashley Sheridan
>> > <a...@ashleysheridan.co.uk>wrote:
>> >
>> > >  On Fri, 2009-10-23 at 13:23 -0400, Brad Fuller wrote:
>> > >
>> > > I'm looking for a regular expression to accomplish a specific task.
>> > >
>> > > I'm hoping someone who's really good at regex patterns can lend a
>> > > quick hand.
>> > >
>> > > I need a regex pattern that will grab URLs out of HTML that have a
>> > > certain link text. (i.e. the word "Continue")
>> > >
>> > > This is what I have so far but it does not work properly (If there are
>> > > other attributes in the <a> tag it returns them as part of the URL.)
>> > >
>> > >
>> > > preg_match_all('#<a[\s]+[^>]*href\s*=\s*([\"\']+)([^>]+?)(\1|>)>Continue</a>#i',
>> > > $html, $matches);
>> > >
>> > > It needs to be able to extract the URL and disregard arbitrary
>> > > attributes in the HTML tag
>> > >
>> > > Test it with the following examples:
>> > >
>> > > <a href=/path/to/url.html>Continue</a>
>> > > <a href='/path/to/url.html'>Continue</a>
>> > > <a href="http://example.com/path/to/url.html"
>> > > class="link">Continue</a>
>> > > <a style="font-size: 12px" href="http://example.com/path/to/url.html"
>> > > onlick="someFunction('foo','bar')">Continue</a>
>> > >
>> > > Please reply
>> > >
>> > > Your help is much appreciated.
>> > >
>> > > Thanks in advance,
>> > > Brad F.
>> > >
>> > >
>> > >
>> > >
>> > > preg_match_all('#<a[\s]+[^>]*href\s*=\s*[\"\']+([^\"\']+?).+?>Continue</a>#i',
>> > > $html, $matches);
>> > >
>> > > I just changed your regex a bit. What your regex was previously doing
>> > > was
>> > > matching everything from the first quote after the href= right up
>> > > until the
>> > > first > it found, which would usually be the one that closes the
>> > > opening
>> > > tag. You could make it a bit more intelligent if you wished with
>> > > backreferencing to make sure it matches against the same type of
>> > > quotation
>> > > character it matched as the start of the href's value.
>> > >
>> > >   Thanks,
>> > > Ash
>> > > http://www.ashleysheridan.co.uk
>> > >
>> > >
>> > >
>> >
>> > I appreciate the help.  However, when try this I only get the first
>> > character of the URL.  Can you double check it please.
>> >
>> > Thanks again
>>
>>
>> I think it's probably the first ? in ([^\"\']+?)
>>
>> Remove that and it should do the trick
>>
>> Thanks,
>> Ash
>> http://www.ashleysheridan.co.uk
>>
>>
>
> Hi Brad,
>
> I agree with Jim.
>
> Take a look at this. It might help.
>
> <?php
>
> $xml_string = <<<TEXT_BOUNDARY
> <html>
>     <head>
>         <title></title>
>     </head>
>     <body>
>         <div>
>             <a href="http://example.com/path/to/urlA.html">Continue</a>
>             <a href="http://example.com/path/to/url2.html">Brad Fuller</a>
>             <a href="http://example.com/path/to/urlB.html">Continue</a>
>             <a href="http://example.com/path/to/url4.html">PHP.net</a>
>             <a href="http://example.com/path/to/urlC.html" class="link">Continue</a>
>             <a style="font-size: 12px" href="http://example.com/path/to/urlD.html" onclick="someFunction('foo','bar')">Continue</a>
>         </div>
>     </body>
> </html>
> TEXT_BOUNDARY;
>
> $xml = simplexml_load_string($xml_string);
>
> $continue_hrefs = $xml->xpath("//a[text() = 'Continue']/@href");
>
> print_r($continue_hrefs);
>
> ?>
>

Thanks, I'm sure I will use this at some point in the future :)

--- End Message ---
--- Begin Message ---
Kim Madsen wrote on 2009-10-22 17:51:
Hi PHPeople

I have an odd problem at my new work and wonder if it's some sort of odd setup that is causing this problem when using sessions:

Like I said, new work and an odd setup: an include file had a mysql_close() at the bottom.

Speaking of mysql_close(), I think I've read somewhere that in PHP6 a db connection will not be closed when the script is done. Is this true? Because then it would definitely be best practice to _always_ have a mysql_close() at the end of the main file.

--
Kind regards
Kim Emax - masterminds.dk

--- End Message ---
