On Wed, 2006-08-09 at 19:07 +0900, Dave M G wrote:
> Robert,
>
> Thank you for your quick response and helpful advice.
>
>> Use preg_match() and pay special attention to the manual as it refers to
>> the third parameter :) The expression you need follows:
>> "#^([^\\s]*)\\s(.*)$#U"
>
> This works perfectly.
>
> I now see that the preg_match() function returns an array with the
> original text, the selected text, and the discarded text. That wasn't
> clear to me before when I wasn't looking for that kind of behavior. But
> now that you point it out I see how it works.
>
> But I am still confused about the expression you used. I can't quite
> break it down:
> The opening "^" says to start at the beginning of the line.
> The brackets indicate a sub-expression.
> The square brackets indicate a character class (?).
> The "^" inside the square brackets means "not".
>
> First question, why is there an extra backslash before the space marker
> "\s"? Isn't that an escape character, so wouldn't that turn the
> following space marker into a literal backslash followed by an "s"?
>
> The "*" says to select everything matching the preceeding conditions.
>
> There's that double backslash and "s" again.
>
> Hmm... does the (.*) after the second "\s" mean to match all the
> whitespace found? For example if there happened to be two space
> characters instead of just one?
>
> The PHP manual says the "$" means to "assert end of subject". Which I
> think means "stop looking for any more matches".
>
> So basically I'm confused about the extra escape slashes.

The extra slashes are there to escape the backslash character itself,
since the pattern is written in a double-quoted string. It is true that
your expression also works with a single backslash, because \s is not an
escape sequence that PHP recognizes in double quotes and so the backslash
is passed through untouched, but you can't rely on that being
indefinitely true. You would be relying on a side effect of an
unrecognized escape sequence. In all honesty, I should have escaped the
$ character as well, since it denotes a variable in double quotes (it
only survives here because # cannot start a variable name).

In case you didn't know, double quotes indicate interpolated strings and
single quotes indicate literal strings. This is why you can produce
newline, tab, and other special characters in double quotes, but not in
single quotes. Since you do not intend to perform any interpolation, the
best way to write your string is the following:

    '#^([^\s]*)\s(.*)$#'

As for the meaning of the individual parts of the pattern...

The first ^ anchors the match to the beginning of the string being
matched. The opening square bracket starts a character class (which can
itself include shorthand classes such as \s for whitespace). The ^
inside the class negates it, so, as you say, do NOT match the enclosed
characters. The * following the square brackets says that zero or more
such characters can be matched (this means that if the value being
matched has a leading space you can get an empty first match... you may
actually want + here instead of *, but I think you're trimming anyway,
and I followed your original).

The following \s then matches the first whitespace character after that
initial block. You might actually want to put a ? after it, so that if
the end of the value is reached without any whitespace you still match
the initial portion. The .* portion says to match zero or more
characters of any value, and the trailing $ anchors the match to the end
of the value, thus forcing .* to match all characters from the space to
the end of the string.

As I said, you may want the following pattern for cases where no space
exists:

    '#^([^\s]*)\s?(.*)$#'

In this usage the ? makes the \s match only if it is present. It's the
same as \s{0,1}, but obviously shorter :)

Cheers,
Rob.
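For what it's worth, here is a minimal sketch of the points above: single versus double quoting, the third parameter of preg_match(), and the optional \s? variant. The sample subjects and variable names ($subject, $matches, $strict, $loose and so on) are made up for illustration and are not taken from the original code.

```php
<?php
// Sketch only: the subjects and variable names below are illustrative assumptions.

// Single quotes: the backslash reaches the regex engine untouched.
$pattern = '#^([^\s]*)\s(.*)$#';

// Double quotes: the backslashes (and the $) are escaped to get the same string.
$doubleQuoted = "#^([^\\s]*)\\s(.*)\$#";
var_dump($pattern === $doubleQuoted); // both spellings produce the same pattern

$subject = 'hello big wide world';

// The third parameter receives the matches:
// [0] the whole match, [1] the first word, [2] everything after the first whitespace.
if (preg_match($pattern, $subject, $matches)) {
    print_r($matches);
}

// With \s? the pattern still matches when the value contains no whitespace.
$optional = '#^([^\s]*)\s?(.*)$#';
preg_match($pattern, 'hello', $strict);   // no match: $strict is left empty
preg_match($optional, 'hello', $loose);   // match: $loose[1] is 'hello', $loose[2] is ''
```

For the sample subject, $matches[1] should hold 'hello' and $matches[2] should hold 'big wide world'.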