Re: [PHP] str_replace on words with an array
On 03/11/06, Richard Lynch <[EMAIL PROTECTED]> wrote:
> On Fri, November 3, 2006 5:30 am, Dotan Cohen wrote:
> > To all others who took part in this thread: I was unclear on another
> > point as well, the issue of sql-injection. As I'm removing the
> > symbols, signs, and other non-alpha characters from the query, I
> > expect it to be sql-injection proof. Am I wrong? ie, could an attacker
> > successfully inject sql if he has nothing but alpha characters at his
> > disposal? I think not, but I'd like to hear it from someone with more
> > experience than I.
>
> In Latin1, ISO-8859-1 or whatever, plain old not-quite-ASCII, yeah,
> you should be safe, I think...
>
> I'm making *no* promises if your DB is configured to accept some
> *other* character set, or the Bad Guy manages to trick it into
> thinking it should be using that charset.

Yep, configured to accept UTF-8. Us Hebrew-speakers and our funny letters :)

> Why the big deal about just calling mysql_real_escape_string() on
> your data?

No biggie- I'm doing that too.

> Or using prepared statements and that ilk?
>
> Then you'd be 100% sure, and not worrying about it, eh?

Well, abstinence is not an option! I can't use prepared statements on a full-text search.

Thanks, Richard. When is that Uranus office opening? I've been waiting almost five years!!

Dotan Cohen

nirot.com
http://what-is-what.com/what_is/ubuntu.html

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
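On the UTF-8 point: a byte-oriented character class such as [^a-zA-Z] would strip Hebrew letters along with the symbols. Below is a minimal sketch of a Unicode-aware alternative, assuming the query arrives as UTF-8. The helper name keepLettersOnly() is made up for illustration and is not from the thread, and this is only a cleanup step, not a replacement for escaping or bound parameters.

```php
<?php
// Hypothetical helper (not from the thread): keep letters and spaces only.
// Assumes $query is meant to be UTF-8.
function keepLettersOnly(string $query): string
{
    // \p{L} matches a letter in any script; the u modifier enables UTF-8 mode.
    $clean = preg_replace('/[^\p{L}\s]+/u', ' ', $query);
    if ($clean === null) {
        // preg_replace() returns null on invalid UTF-8: refuse the input.
        return '';
    }
    // Collapse the runs of whitespace left behind by the removed symbols.
    return trim(preg_replace('/\s+/u', ' ', $clean));
}

echo keepLettersOnly("What is 'php'; -- ?"), "\n";
// "\u{05DE}\u{05D4}" is a two-letter Hebrew word, written as escapes.
echo keepLettersOnly("\u{05DE}\u{05D4} php?"), "\n";
```

Digits are dropped as well, which matches the "nothing but alpha characters" goal stated above; adding \p{N} to the class would keep them.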
Re: [PHP] str_replace on words with an array
On Fri, November 3, 2006 5:30 am, Dotan Cohen wrote:
> To all others who took part in this thread: I was unclear on another
> point as well, the issue of sql-injection. As I'm removing the
> symbols, signs, and other non-alpha characters from the query, I
> expect it to be sql-injection proof. Am I wrong? ie, could an attacker
> successfully inject sql if he has nothing but alpha characters at his
> disposal? I think not, but I'd like to hear it from someone with more
> experience than I.

In Latin1, ISO-8859-1 or whatever, plain old not-quite-ASCII, yeah, you should be safe, I think...

I'm making *no* promises if your DB is configured to accept some *other* character set, or the Bad Guy manages to trick it into thinking it should be using that charset.

Why the big deal about just calling mysql_real_escape_string() on your data?

Or using prepared statements and that ilk?

Then you'd be 100% sure, and not worrying about it, eh?

--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some starving artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
Re: [PHP] str_replace on words with an array
# [EMAIL PROTECTED] / 2006-10-30 21:18:33 +:
> Dotan Cohen wrote:
> > $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);
>
> Ok, this is what the compiler will see...
>
> $searchQuery=str_replace("^Array$", " ", $searchQuery);
>
> Yes, that's a literal Array in the string. You cannot, and you should
> remember this, you cannot concatenate strings and arrays. What would you
> expect it to do?

DTRT? This is what e.g. zsh does with the right configuration:

    [EMAIL PROTECTED] ~ 1108:0 > echo x-{aa,bb,cc}-y
    x-aa-y x-bb-y x-cc-y

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991
Re: [PHP] str_replace on words with an array
On 31/10/06, Larry Garfield <[EMAIL PROTECTED]> wrote:
> From your original message, it sounds like you want to strip selected
> complete words, not substrings, from a string for indexing or searching
> or such. Right?

I think that was my mistake- not differentiating between the two. Symbols and such I wanted to replace as substrings, yet noise words I wanted to replace as words. Now that I've created two arrays, one with symbols and one with noise words, things are on track.

> Try something like this:
>
> $string = "The quick sly fox jumped over a fence and ran away";
> $words = array('the', 'a', 'and');
>
> function make_regex($str) {
>   return '/\b' . $str . '\b/i';
> }
>
> $search = array_map('make_regex', $words);
> $string = preg_replace($search, '', $string);
> print $string . "\n";

I was completely unaware of the array_map function. Thank you- that is exactly what I needed.

> What you really need to do is match word boundaries, NOT string
> boundaries. So you take your list of words and mutate *each one*
> (that's what the array_map() is about) into a regex pattern that finds
> that word, case-insensitively. Then you use preg_replace() to replace
> all matches of any of those patterns with an empty string.

Yep.

> You were close. What you were missing was the array_map(), because you
> needed to concatenate stuff to each element of the array rather than
> trying to concatenate a string to an array, which as others have said
> will absolutely not work.

Yep.

> I can't guarantee that the above code is the best performant method,
> but it works. :-)

It certainly does. Of course I'm not using it exactly how you pasted it, but you got me on track. Thank you very much.

To all others who took part in this thread: I was unclear on another point as well, the issue of sql-injection. As I'm removing the symbols, signs, and other non-alpha characters from the query, I expect it to be sql-injection proof. Am I wrong? ie, could an attacker successfully inject sql if he has nothing but alpha characters at his disposal? I think not, but I'd like to hear it from someone with more experience than I.

Thank you.

Dotan Cohen
http://what-is-what.com
Re: [PHP] str_replace on words with an array
On Oct 30, 2006, at 1:10 PM, Dotan Cohen wrote:
> On 30/10/06, Stut <[EMAIL PROTECTED]> wrote:
> > Ed Lazor wrote:
> > > It looks like you guys are coming up with some cool solutions, but I
> > > have a question. Wasn't the original purpose of this thread to
> > > prevent sql injection attacks in input from user forms? If so,
> > > wouldn't mysql_real_escape_string be an easier solution?
> >
> > Me thinkie nottie. From the OP... "I need to remove the noise words
> > from a search string."
>
> Yes, that is also part of the aim.

How come? Not trying to be facetious here. I'm just wondering if you see a benefit that I don't. For example, say the hacker injects some sql and you use mysql_real_escape_string. You end up with something like this... actually, I'll go one step further and just use the quote_smart function described in the mysql_real_escape_string page of the php manual:

    $query = sprintf("SELECT * FROM users WHERE user=%s AND password=%s",
        quote_smart($_POST['username']),
        quote_smart($_POST['password'])
    );

Say the user tried to inject sql in $_POST['username'] and it looked something like:

    root';drop all;

Having used quote_smart, the value of $query ends up

    SELECT * FROM users WHERE user='root\';drop all;' AND password='something'

The sql injection fails. The quote is escaped (the semicolons need no escaping, since they sit inside the quoted string), so the data is seen as a literal. The database is going to think there's no user with that name. That means that even if the user did include extra words, they're just part of the value that is checked against user names - rather than being seen as potential commands.

I'm not sure if I'm describing this well, so let me know what you think and I'll go from there.

-Ed
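The "data is seen as a literal" idea above can also be expressed with bound parameters instead of manual quoting. What follows is a minimal sketch, assuming PDO is available. The findUser() helper and the users table layout are assumptions made for illustration, and an in-memory SQLite database stands in for MySQL only so that the snippet is self-contained.

```php
<?php
// Hypothetical helper: look up a user with bound parameters.
// Plain-text passwords only mirror the example above; real code would
// store and compare a hash instead.
function findUser(PDO $db, string $user, string $password): array
{
    $stmt = $db->prepare('SELECT user FROM users WHERE user = ? AND password = ?');
    // The values travel separately from the SQL text, so quotes and
    // semicolons inside them are never parsed as SQL.
    $stmt->execute([$user, $password]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Demo against an in-memory SQLite database (an assumption, see above).
if (extension_loaded('pdo_sqlite')) {
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec('CREATE TABLE users (user TEXT, password TEXT)');
    $db->exec("INSERT INTO users VALUES ('root', 'something')");

    var_dump(findUser($db, "root';drop all;", 'something')); // no such user
    var_dump(findUser($db, 'root', 'something'));            // finds root
}
```

The injection attempt simply becomes a user name that does not exist, which is the same outcome described for quote_smart.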
Re: [PHP] str_replace on words with an array
On Monday 30 October 2006 15:10, Dotan Cohen wrote:
> Er, so how would it be done? I've been trying for two days now with no
> success.

From your original message, it sounds like you want to strip selected complete words, not substrings, from a string for indexing or searching or such. Right?

Try something like this:

    $string = "The quick sly fox jumped over a fence and ran away";
    $words = array('the', 'a', 'and');

    function make_regex($str) {
      return '/\b' . $str . '\b/i';
    }

    $search = array_map('make_regex', $words);
    $string = preg_replace($search, '', $string);
    print $string . "\n";

What you really need to do is match word boundaries, NOT string boundaries. So you take your list of words and mutate *each one* (that's what the array_map() is about) into a regex pattern that finds that word, case-insensitively. Then you use preg_replace() to replace all matches of any of those patterns with an empty string.

You were close. What you were missing was the array_map(), because you needed to concatenate stuff to each element of the array rather than trying to concatenate a string to an array, which as others have said will absolutely not work.

I can't guarantee that the above code is the best performant method, but it works. :-)

--
Larry Garfield    AIM: LOLG42
[EMAIL PROTECTED]    ICQ: 6817012

"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
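A possible variation on the approach above, offered as a sketch rather than a drop-in replacement: build one alternation pattern instead of one pattern per word, and pass each word through preg_quote() in case the list ever contains regex metacharacters. The function name stripNoiseWords() is an assumption for illustration.

```php
<?php
// Hypothetical helper: strip whole noise words using a single pattern.
function stripNoiseWords(string $text, array $words): string
{
    // Quote each word so regex metacharacters are treated literally.
    $quoted = array_map(static function ($w) {
        return preg_quote($w, '/');
    }, $words);

    // \b anchors on word boundaries; the i modifier ignores case.
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    $text = preg_replace($pattern, '', $text);

    // Tidy up the spaces left where words were removed.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo stripNoiseWords(
    'The quick sly fox jumped over a fence and ran away',
    array('the', 'a', 'and')
), "\n";
// prints: quick sly fox jumped over fence ran away
```

Note that "away" survives even though "a" is in the list, because \b only matches where a word starts or ends.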
Re: [PHP] str_replace on words with an array
On Oct 30, 2006, at 9:19 AM, Stut wrote:
> Ed Lazor wrote:
> > It looks like you guys are coming up with some cool solutions, but I
> > have a question. Wasn't the original purpose of this thread to
> > prevent sql injection attacks in input from user forms? If so,
> > wouldn't mysql_real_escape_string be an easier solution?
>
> Me thinkie nottie. From the OP... "I need to remove the noise words
> from a search string."

You sure? This is what they said originally:

"Nothing else is relevant, but $searchQuery will get passed to the database, so it should be protected from SQL injection. That's why I want to remove characters such as quotes, dashes, and the equals sign."

Maybe that doesn't account for all of the extra words they're trying to remove... dunno, thus my question.

> However, until the OPer accepts that people are right when they say
> you can't append strings to an array it's never going to work. Every
> bit of sample code posted retains the following line of code rather
> than fixing it according to several other previous posts...
>
> "^".$noiseArray."$"
>
> Happy happy joy joy, oh look, the spring's broken. Doing!!

Persistence is a virtue? hehe

> -Stut (slightly drunk, but feeling generally good about the world)

Hey. That's not fair. No bragging unless you plan on sharing :)

-Ed
Re: [PHP] str_replace on words with an array
Dotan Cohen wrote:
> Er, so how would it be done? I've been trying for two days now with no
> success.

Ok, I guess my original reply didn't get through, or you ignored it. Here it is again for your convenience.

Dotan Cohen wrote:
> > $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);

Ok, this is what the compiler will see...

    $searchQuery=str_replace("^Array$", " ", $searchQuery);

Yes, that's a literal Array in the string. You cannot, and you should remember this, you cannot concatenate strings and arrays. What would you expect it to do?

Now, the answer is this...

    $searchQuery = str_replace($noiseArray, ' ', $searchQuery);

However, what you seem to be doing is putting regex syntax where it would have no effect even if $noiseArray was not an array. If your intention is to replace the words rather than just the strings then you need to look at preg_replace (http://php.net/preg_replace) and you'll need to decorate the strings in $noiseArray with the appropriate characters to make them the search pattern you need - full details on the preg_replace manual page.

-Stut (painfully sober now :( )
Re: [PHP] str_replace on words with an array
On 30/10/06, Stut <[EMAIL PROTECTED]> wrote:
> Ed Lazor wrote:
> > It looks like you guys are coming up with some cool solutions, but I
> > have a question. Wasn't the original purpose of this thread to
> > prevent sql injection attacks in input from user forms? If so,
> > wouldn't mysql_real_escape_string be an easier solution?
>
> Me thinkie nottie. From the OP... "I need to remove the noise words
> from a search string."

Yes, that is also part of the aim.

> However, until the OPer accepts that people are right when they say
> you can't append strings to an array it's never going to work. Every
> bit of sample code posted retains the following line of code rather
> than fixing it according to several other previous posts...
>
> "^".$noiseArray."$"

Er, so how would it be done? I've been trying for two days now with no success.

> Happy happy joy joy, oh look, the spring's broken. Doing!!

Boing!!!

> -Stut (slightly drunk, but feeling generally good about the world)

Dotan Cohen
http://lyricslist.com/
http://what-is-what.com/
Re: [PHP] str_replace on words with an array
Ed Lazor wrote:
> It looks like you guys are coming up with some cool solutions, but I
> have a question. Wasn't the original purpose of this thread to
> prevent sql injection attacks in input from user forms? If so,
> wouldn't mysql_real_escape_string be an easier solution?

Me thinkie nottie. From the OP... "I need to remove the noise words from a search string."

However, until the OPer accepts that people are right when they say you can't append strings to an array it's never going to work. Every bit of sample code posted retains the following line of code rather than fixing it according to several other previous posts...

    "^".$noiseArray."$"

Happy happy joy joy, oh look, the spring's broken. Doing!!

-Stut (slightly drunk, but feeling generally good about the world)
Re: [PHP] str_replace on words with an array
It looks like you guys are coming up with some cool solutions, but I have a question. Wasn't the original purpose of this thread to prevent sql injection attacks in input from user forms? If so, wouldn't mysql_real_escape_string be an easier solution? On Oct 30, 2006, at 8:17 AM, Jochem Maas wrote: Dotan Cohen wrote: I need to remove the noise words from a search string. I can't seem to get str_replace to go through the array and remove the words, and I'd rather avoid a redundant foreach it I can. According to TFM str_replace should automatically go through the whole array, no? Does anybody see anything wrong with this code: $noiseArray = array("1", "2", "3", "4", "5", "6", "7", "8", "9", "0", "\"", "'", ":", ";", "|", "\\", "<", ">", ",", ".", "?", "$", "!", "@", "#", "$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[", "]", "{", "}", "about", "after", "all", "also", "an", "and", "another", "any", "are", "as", "at", "be", "because", "been", "before", "being", "between", "both", "but", "by", "came", "can", "come", "could", "did", "do", "does", "each", "else", "for", "from", "get", "got", "has", "had", "he", "have", "her", "here", "him", "himself", "his", "how", "if", "in", "into", "is", "it", "its", "just", "like", "make", "many", "me", "might", "more", "most", "much", "must", "my", "never", "now", "of", "on", "only", "or", "other", "our", "out", "over", "re", "said", "same", "see", "should", "since", "so", "some", "still", "such", "take", "than", "that", "the", "their", "them", "then", "there", "these", "they", "this", "those", "through", "to", "too", "under", "up", "use", "very", "want", "was", "way", "we", "well", "were", "what", "when", "where", "which", "while", "who", "will", "with", "would", "you", "your", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"); $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery); // another idea based on further reading of the 
thread:

    function pregify($val)
    {
        // cast first so a single string is handled like a one-element array
        $val = (array)$val;
        // each pattern needs delimiters; quote the word for use inside them
        foreach ($val as $k => $v) $val[$k] = '/\b'.preg_quote($v, '/').'\b/';
        return $val;
    }

    $searchQuery = preg_replace(pregify($noiseArray), " ", $searchQuery);

Thanks in advance.

Dotan Cohen

http://essentialinux.com
http://what-is-what.com/what_is/sitepoint.html
Re: [PHP] str_replace on words with an array
Dotan Cohen wrote:
> I need to remove the noise words from a search string. I can't seem to
> get str_replace to go through the array and remove the words, and I'd
> rather avoid a redundant foreach if I can. According to TFM
> str_replace should automatically go through the whole array, no? Does
> anybody see anything wrong with this code:
>
> $noiseArray = array("1", "2", "3", "4", "5", "6", "7", "8", "9", "0",
> "\"", "'", ":", ";", "|", "\\", "<", ">", ",", ".", "?", "$", "!",
> "@", "#", "$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[",
> "]", "{", "}", "about", "after", "all", "also", "an", "and",
> "another", "any", "are", "as", "at", "be", "because", "been",
> "before", "being", "between", "both", "but", "by", "came", "can",
> "come", "could", "did", "do", "does", "each", "else", "for", "from",
> "get", "got", "has", "had", "he", "have", "her", "here", "him",
> "himself", "his", "how", "if", "in", "into", "is", "it", "its",
> "just", "like", "make", "many", "me", "might", "more", "most", "much",
> "must", "my", "never", "now", "of", "on", "only", "or", "other",
> "our", "out", "over", "re", "said", "same", "see", "should", "since",
> "so", "some", "still", "such", "take", "than", "that", "the", "their",
> "them", "then", "there", "these", "they", "this", "those", "through",
> "to", "too", "under", "up", "use", "very", "want", "was", "way", "we",
> "well", "were", "what", "when", "where", "which", "while", "who",
> "will", "with", "would", "you", "your", "a", "b", "c", "d", "e", "f",
> "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t",
> "u", "v", "w", "x", "y", "z");
>
> $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);

// another idea based on further reading of the thread:

    function pregify($val)
    {
        // cast first so a single string is handled like a one-element array
        $val = (array)$val;
        // each pattern needs delimiters; quote the word for use inside them
        foreach ($val as $k => $v) $val[$k] = '/\b'.preg_quote($v, '/').'\b/';
        return $val;
    }

    $searchQuery = preg_replace(pregify($noiseArray), " ", $searchQuery);

>
> Thanks in advance.
>
> Dotan Cohen
>
> http://essentialinux.com
> http://what-is-what.com/what_is/sitepoint.html
>
Re: [PHP] str_replace on words with an array
Dotan Cohen wrote: > I need to remove the noise words from a search string. I can't seem to > get str_replace to go through the array and remove the words, and I'd > rather avoid a redundant foreach it I can. According to TFM > str_replace should automatically go through the whole array, no? Does > anybody see anything wrong with this code: > > $noiseArray = array("1", "2", "3", "4", "5", "6", "7", "8", "9", "0", > "\"", "'", ":", ";", "|", "\\", "<", ">", ",", ".", "?", "$", "!", > "@", "#", "$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[", > "]", "{", "}", "about", "after", "all", "also", "an", "and", > "another", "any", "are", "as", "at", "be", "because", "been", > "before", "being", "between", "both", "but", "by", "came", "can", > "come", "could", "did", "do", "does", "each", "else", "for", "from", > "get", "got", "has", "had", "he", "have", "her", "here", "him", > "himself", "his", "how", "if", "in", "into", "is", "it", "its", > "just", "like", "make", "many", "me", "might", "more", "most", "much", > "must", "my", "never", "now", "of", "on", "only", "or", "other", > "our", "out", "over", "re", "said", "same", "see", "should", "since", > "so", "some", "still", "such", "take", "than", "that", "the", "their", > "them", "then", "there", "these", "they", "this", "those", "through", > "to", "too", "under", "up", "use", "very", "want", "was", "way", "we", > "well", "were", "what", "when", "where", "which", "while", "who", > "will", "with", "would", "you", "your", "a", "b", "c", "d", "e", "f", > "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", > "u", "v", "w", "x", "y", "z"); > > $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery); it's string replacement not regexp replacement, therefore the '^' and '$' are bogus. the first argument to the str_replace() call is the string literal: '^Array$' ... you can't concatenate strings and arrays like that (and get the result you want)! 
instead, your function call should look like this: $searchQuery = str_replace($noiseArray, ' ', $searchQuery); > > Thanks in advance. > > Dotan Cohen > > http://essentialinux.com > http://what-is-what.com/what_is/sitepoint.html > -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] str_replace on words with an array
Check out the function mysql_real_escape_string()

On Oct 29, 2006, at 3:13 PM, Dotan Cohen wrote:
> On 30/10/06, Paul Novitski <[EMAIL PROTECTED]> wrote:
> > Hi Dotan,
> >
> > To get help with your problem, share more of your PHP code with the
> > list so we can look at what you're doing. Also, give us a link to the
> > PHP script on your server so we can see the output.
> >
> > Regards,
> > Paul
>
> Nothing else is relevant, but $searchQuery will get passed to the
> database, so it should be protected from SQL injection. That's why I
> want to remove characters such as quotes, dashes, and the equals sign.
>
> I set up a test page:
> http://what-is-what.com/test.php
>
> with this code:
>
> <?php
> $noiseArray = array("[:alnum:]", "[:punct:]", "|", "\\", "<", ">",
> "#", "@", "\$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[",
> "]", "{", "}", "about", "after", "all", "also", "an", "and",
> "another", "any", "are", "as", "at", "be", "because", "been",
> "before", "being", "between", "both", "but", "by", "came", "can",
> "come", "could", "did", "do", "does", "each", "else", "for", "from",
> "get", "got", "has", "had", "he", "have", "her", "here", "him",
> "himself", "his", "how", "if", "in", "into", "is", "it", "its",
> "just", "like", "make", "many", "me", "might", "more", "most", "much",
> "must", "my", "never", "now", "of", "on", "only", "or", "other",
> "our", "out", "over", "re", "said", "same", "see", "should", "since",
> "so", "some", "still", "such", "take", "than", "that", "the", "their",
> "them", "then", "there", "these", "they", "this", "those", "through",
> "to", "too", "under", "up", "use", "very", "want", "was", "way", "we",
> "well", "were", "what", "when", "where", "which", "while", "who",
> "will", "with", "would", "you", "your");
>
> $searchQuery=preg_replace( "/^".$noiseArray."$/", " ", $_POST["query"]);
> $searchQuery=trim($searchQuery);
> print "$searchQuery";
> ?>
>
> Dotan Cohen
>
> http://song-lirics.com
> http://what-is-what.com/what_is/distribution.html
Re: [PHP] str_replace on words with an array
On 30/10/06, Paul Novitski <[EMAIL PROTECTED]> wrote:
> Hi Dotan,
>
> To get help with your problem, share more of your PHP code with the
> list so we can look at what you're doing. Also, give us a link to the
> PHP script on your server so we can see the output.
>
> Regards,
> Paul

Nothing else is relevant, but $searchQuery will get passed to the database, so it should be protected from SQL injection. That's why I want to remove characters such as quotes, dashes, and the equals sign.

I set up a test page:
http://what-is-what.com/test.php

with this code:

    <?php
    $noiseArray = array("[:alnum:]", "[:punct:]", "|", "\\", "<", ">",
    "#", "@", "\$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[",
    "]", "{", "}", "about", "after", "all", "also", "an", "and",
    "another", "any", "are", "as", "at", "be", "because", "been",
    "before", "being", "between", "both", "but", "by", "came", "can",
    "come", "could", "did", "do", "does", "each", "else", "for", "from",
    "get", "got", "has", "had", "he", "have", "her", "here", "him",
    "himself", "his", "how", "if", "in", "into", "is", "it", "its",
    "just", "like", "make", "many", "me", "might", "more", "most", "much",
    "must", "my", "never", "now", "of", "on", "only", "or", "other",
    "our", "out", "over", "re", "said", "same", "see", "should", "since",
    "so", "some", "still", "such", "take", "than", "that", "the", "their",
    "them", "then", "there", "these", "they", "this", "those", "through",
    "to", "too", "under", "up", "use", "very", "want", "was", "way", "we",
    "well", "were", "what", "when", "where", "which", "while", "who",
    "will", "with", "would", "you", "your");

    $searchQuery=preg_replace( "/^".$noiseArray."$/", " ", $_POST["query"]);
    $searchQuery=trim($searchQuery);
    print "$searchQuery";
    ?>

Dotan Cohen

http://song-lirics.com
http://what-is-what.com/what_is/distribution.html
Re: [PHP] str_replace on words with an array
Dotan Cohen wrote: > Thanks all for the heads up with the str_replace not working with > regexes. Duh! I've switched to preg_replace, but still no luck. (nor > skill, on my part) > > I'm trying to use array_walk to go through the array and deliminate > each item with /b so that the preg_replace function will know to only > operate on whole words, but I just can't seem to get it. I am of > course Ring TFM and Sing THW but with no luck. A push (link to TFA or > tutorial, whatever) would be most appreciated. IMHO, unless you're going to be doing a lot with each element of the array it's easier to do it without using array_walk. foreach ($arr as $key => $val) $arr[$key] = "\\b".$val."\\b"; -Stut -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] str_replace on words with an array
Thanks all for the heads up with the str_replace not working with regexes. Duh! I've switched to preg_replace, but still no luck. (nor skill, on my part)

I'm trying to use array_walk to go through the array and delimit each item with \b so that the preg_replace function will know to only operate on whole words, but I just can't seem to get it. I am of course Ring TFM and Sing THW but with no luck. A push (link to TFA or tutorial, whatever) would be most appreciated.

Thanks again.

Dotan Cohen

http://what-is-what.com/what_is/love.html
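For reference, here is one way the array_walk() idea described above might look. This is a sketch under the assumption that the list holds plain words; the variable name $noiseWords and the sample sentence are made up for illustration.

```php
<?php
$noiseWords = array('the', 'a', 'and');

// array_walk() hands each element to the callback; declaring the
// parameter by reference (&$word) lets the callback modify the array
// in place.
array_walk($noiseWords, function (&$word) {
    // Wrap the word in delimiters and \b word boundaries, ignoring case.
    $word = '/\b' . preg_quote($word, '/') . '\b/i';
});

print_r($noiseWords);

// preg_replace() accepts the array of patterns and applies each in turn.
$result = preg_replace($noiseWords, '', 'A cat and the hat');
echo trim(preg_replace('/\s+/', ' ', $result)), "\n";
```

Without the & in the callback signature the array would be left unchanged, which is an easy thing to miss when first using array_walk().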
Re: [PHP] str_replace on words with an array
Paul Novitski wrote: If you go this route, perhaps you could enclose each member of your original array in \b word boundary sequences using an array_walk routine so that you don't have to muddy your original array declaration statement. At 10/29/2006 01:54 PM, rich gray wrote: IIRC str_replace() does not interpret or understand regular expression syntax - you'd need preg_replace() for that You're absolutely right -- I was focusing so much on the regexp syntax that I failed to broaden my gaze... When the OP corrects his PHP to use preg_replace() instead of str_replace(), I believe he'll still need to provide word boundaries around each member of his noise-word array, otherwise the function will simply remove all letters and digits from the words in the search-string he considers meaningful and he'll end up searching thin air. Aside, without knowing the context of his search, it seems a bit extreme to remove all single characters from the search string. It's not hard to think of examples of them occurring as part of valid entity names in our universe -- she got an A, Plan B From Outer Space, Vitamin C, etc. An alternative strategy might be to trust the user to enter a valid search string and simply warn them that the quality of the answer they get will depend on the quality of their input. Regards, Paul -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] str_replace on words with an array
On 29/10/06, Børge Holen <[EMAIL PROTECTED]> wrote: Yes you need to put some \ in front of some of those characters On Sunday 29 October 2006 21:05, Dotan Cohen wrote: > I need to remove the noise words from a search string. I can't seem to > get str_replace to go through the array and remove the words, and I'd > rather avoid a redundant foreach it I can. According to TFM > str_replace should automatically go through the whole array, no? Does > anybody see anything wrong with this code: > > $noiseArray = array("1", "2", "3", "4", "5", "6", "7", "8", "9", "0", > "\"", "'", ":", ";", "|", "\\", "<", ">", ",", ".", "?", "$", "!", > "@", "#", "$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[", > "]", "{", "}", "about", "after", "all", "also", "an", "and", > "another", "any", "are", "as", "at", "be", "because", "been", > "before", "being", "between", "both", "but", "by", "came", "can", > "come", "could", "did", "do", "does", "each", "else", "for", "from", > "get", "got", "has", "had", "he", "have", "her", "here", "him", > "himself", "his", "how", "if", "in", "into", "is", "it", "its", > "just", "like", "make", "many", "me", "might", "more", "most", "much", > "must", "my", "never", "now", "of", "on", "only", "or", "other", > "our", "out", "over", "re", "said", "same", "see", "should", "since", > "so", "some", "still", "such", "take", "than", "that", "the", "their", > "them", "then", "there", "these", "they", "this", "those", "through", > "to", "too", "under", "up", "use", "very", "want", "was", "way", "we", > "well", "were", "what", "when", "where", "which", "while", "who", > "will", "with", "would", "you", "your", "a", "b", "c", "d", "e", "f", > "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", > "u", "v", "w", "x", "y", "z"); > > $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery); > > Thanks in advance. 
> I improved the $noiseArray to this: $noiseArray = array("[:alnum:]", "[:punct:]", "|", "\\", "<", ">", "#", "@", "\$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[", "]", "{", "}", "about", "after", "all", "also", "an", "and", "another", "any", "are", "as", "at", "be", "because", "been", "before", "being", "between", "both", "but", "by", "came", "can", "come", "could", "did", "do", "does", "each", "else", "for", "from", "get", "got", "has", "had", "he", "have", "her", "here", "him", "himself", "his", "how", "if", "in", "into", "is", "it", "its", "just", "like", "make", "many", "me", "might", "more", "most", "much", "must", "my", "never", "now", "of", "on", "only", "or", "other", "our", "out", "over", "re", "said", "same", "see", "should", "since", "so", "some", "still", "such", "take", "than", "that", "the", "their", "them", "then", "there", "these", "they", "this", "those", "through", "to", "too", "under", "up", "use", "very", "want", "was", "way", "we", "well", "were", "what", "when", "where", "which", "while", "who", "will", "with", "would", "you", "your"); Do any of the characters in there need escaping (other than the $ which is already escaped)? How does on go about looping through the array, matching only whole words? This didn't quite do it: $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery); And neither did this: $searchQuery=str_replace( "/^".$noiseArray."$/", " ", $searchQuery); Thanks. Dotan Cohen http://dotancohen.com http://what-is-what.com/what_is/drm.html
Re: [PHP] str_replace on words with an array
Dotan Cohen wrote:
> $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);

Ok, this is what the compiler will see...

    $searchQuery=str_replace("^Array$", " ", $searchQuery);

Yes, that's a literal Array in the string. You cannot, and you should remember this, you cannot concatenate strings and arrays. What would you expect it to do?

Now, the answer is this...

    $searchQuery = str_replace($noiseArray, ' ', $searchQuery);

However, what you seem to be doing is putting regex syntax where it would have no effect even if $noiseArray was not an array. If your intention is to replace the words rather than just the strings then you need to look at preg_replace (http://php.net/preg_replace) and you'll need to decorate the strings in $noiseArray with the appropriate characters to make them the search pattern you need - full details on the preg_replace manual page.

Hope that made sense.

-Stut
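The "literal Array" point above is easy to check. In this small sketch the sample values are made up, and the @ operator is used only to silence the "Array to string conversion" notice or warning that PHP raises here.

```php
<?php
$noiseArray = array('the', 'a', 'and');

// Concatenating an array converts it to the literal string "Array".
$search = @('^' . $noiseArray . '$');
var_dump($search); // string(7) "^Array$"

// Passing the array itself makes str_replace() try every element,
// as a plain substring, which suits symbols well.
$symbols = array('-', '=');
echo str_replace($symbols, ' ', 'a-b=c'), "\n"; // prints: a b c
```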
Re: [PHP] str_replace on words with an array
Paul Novitski wrote:
> If you go this route, perhaps you could enclose each member of your original array in \b word boundary sequences using an array_walk routine so that you don't have to muddy your original array declaration statement.

IIRC, str_replace() does not interpret or understand regular expression syntax; you'd need preg_replace() for that.

rich
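A small sketch of that difference, using a made-up query string: the same \b-decorated text is handed once to str_replace() and once (with delimiters added) to preg_replace().

```php
<?php
$searchQuery = "this is php";

// str_replace() searches for the literal text \bis\b, which is not
// present, so the string comes back unchanged.
$unchanged = str_replace('\bis\b', '', $searchQuery);

// preg_replace() reads \b as a word boundary, so only the standalone
// word "is" is removed and "this" is left alone.
$cleaned = preg_replace('/\bis\b/', '', $searchQuery);

echo $unchanged, "\n"; // this is php
echo $cleaned, "\n";   // "this  php" (two spaces remain)
```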
Re: [PHP] str_replace on words with an array
I never use this function, since I always use regular expressions, but according to the manual:

    If you don't need fancy replacing rules (like regular expressions), you should always use this function instead of ereg_replace() or preg_replace().

So I assume your initial problem was that you were using regular expression syntax, and this function doesn't accept regular expressions. Is your aim to rid strings of all the items in the $noiseArray? Your $noiseArray includes the entire alphabet, so it's no wonder that you end up with empty strings!

Myron

Dotan Cohen wrote:
> On 29/10/06, Alan Milnes <[EMAIL PROTECTED]> wrote:
> > Dotan Cohen wrote:
> > > $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);
> >
> > Can you explain what you are trying to do with the ^ and $? What is a typical value of the original $searchQuery?
> >
> > Alan
>
> The purpose of the ^ and the $ is to define the beginning and the end of a word: http://il2.php.net/regex
>
> I also tried str_replace( $noiseArray, " ", $searchQuery) but that was replacing the insides of words as well. And with the addition of the individual letters, that emptied the entire $searchQuery string!
>
> A typical value of $searchQuery could be "What is php?" or "What is open source". See this site for details: http://what-is-what.com
>
> Dotan Cohen
> http://technology-sleuth.com/

--
Myron Turner
http://www.room535.org
http://www.bstatzero.org
http://www.mturner.org/XML_PullParser/
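The effect of the single-letter entries is easy to reproduce. In this sketch range('a', 'z') stands in for the 26 one-letter strings in the array, and the query is lowercased because str_replace() is case-sensitive:

```php
<?php
// range('a', 'z') is a stand-in for "a", "b", ... "z" in $noiseArray.
$letters = range('a', 'z');

// Every letter is a substring match, so every letter becomes a space.
$result = str_replace($letters, ' ', "what is php");

var_dump(strlen($result));       // int(11): same length, all spaces
var_dump(trim($result) === '');  // bool(true): nothing is left
```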
Re: [PHP] str_replace on words with an array
At 10/29/2006 01:07 PM, Dotan Cohen wrote:
> The purpose of the ^ and the $ is to define the beginning and the end of a word: http://il2.php.net/regex

No, actually, ^ and $ match the beginning and end of the entire string being searched, not the boundaries of a single word. Therefore searching for ^mouse$ will locate "mouse" only if it's the only word in the entire string, which I gather is not what you want.

I suspect what you want is either this:

    (^| )WORD( |$)

(the word bounded by either the start/end of the string or a space) or, perhaps better yet, this:

    \bWORD\b

(the word bounded by word boundaries). See [PCRE regex] Pattern Syntax:
http://il2.php.net/manual/en/reference.pcre.pattern.syntax.php

Further to your problem, I believe this is incorrect:

    $noiseArray = array("1", ... ...
    $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);

Since $noiseArray is your entire array, it doesn't make sense to enclose it in word boundaries of any kind. Instead, I imagine each member of the array needs to be bounded individually.

If you go this route, perhaps you could enclose each member of your original array in \b word boundary sequences using an array_walk routine so that you don't have to muddy your original array declaration statement.

Regards,
Paul
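The array_walk idea might look roughly like the sketch below. The short noise list, the $patterns copy, the case-insensitive /i flag and the final whitespace clean-up are all assumptions added for the illustration, and the anonymous function needs PHP 5.3 or later (a named callback would do the same job on older versions).

```php
<?php
// Hypothetical, shortened noise list for illustration.
$noiseArray = array("what", "is", "a");

// ^ and $ anchor to the whole subject string, not to a single word.
var_dump(preg_match('/^mouse$/', 'mouse'));       // int(1)
var_dump(preg_match('/^mouse$/', 'a mouse ran')); // int(0)

// Work on a copy so the original declaration stays clean, and wrap each
// entry in \b...\b. preg_quote() escapes any regex metacharacters.
$patterns = $noiseArray;
array_walk($patterns, function (&$word) {
    $word = '/\b' . preg_quote($word, '/') . '\b/i';
});

$searchQuery = "What is a mouse";
$cleaned = preg_replace($patterns, ' ', $searchQuery);

// Collapse the leftover runs of spaces.
$cleaned = trim(preg_replace('/\s+/', ' ', $cleaned));
echo $cleaned, "\n"; // mouse
```

The \b form also copes with words next to punctuation, which the (^| )WORD( |$) form would miss.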
Re: [PHP] str_replace on words with an array
On 29/10/06, Alan Milnes <[EMAIL PROTECTED]> wrote:
> Dotan Cohen wrote:
> > $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);
>
> Can you explain what you are trying to do with the ^ and $? What is a typical value of the original $searchQuery?
>
> Alan

The purpose of the ^ and the $ is to define the beginning and the end of a word: http://il2.php.net/regex

I also tried str_replace( $noiseArray, " ", $searchQuery) but that was replacing the insides of words as well. And with the addition of the individual letters, that emptied the entire $searchQuery string!

A typical value of $searchQuery could be "What is php?" or "What is open source". See this site for details: http://what-is-what.com

Dotan Cohen
http://technology-sleuth.com/
Re: [PHP] str_replace on words with an array
Yes, you need to put a \ in front of some of those characters.

On Sunday 29 October 2006 21:05, Dotan Cohen wrote:
> I need to remove the noise words from a search string. I can't seem to
> get str_replace to go through the array and remove the words, and I'd
> rather avoid a redundant foreach if I can. According to TFM
> str_replace should automatically go through the whole array, no? Does
> anybody see anything wrong with this code:
>
> $noiseArray = array("1", "2", "3", "4", "5", "6", "7", "8", "9", "0",
> "\"", "'", ":", ";", "|", "\\", "<", ">", ",", ".", "?", "$", "!",
> "@", "#", "$", "%", "^", "&", "*", "(", ")", "-", "_", "+", "=", "[",
> "]", "{", "}", "about", "after", "all", "also", "an", "and",
> "another", "any", "are", "as", "at", "be", "because", "been",
> "before", "being", "between", "both", "but", "by", "came", "can",
> "come", "could", "did", "do", "does", "each", "else", "for", "from",
> "get", "got", "has", "had", "he", "have", "her", "here", "him",
> "himself", "his", "how", "if", "in", "into", "is", "it", "its",
> "just", "like", "make", "many", "me", "might", "more", "most", "much",
> "must", "my", "never", "now", "of", "on", "only", "or", "other",
> "our", "out", "over", "re", "said", "same", "see", "should", "since",
> "so", "some", "still", "such", "take", "than", "that", "the", "their",
> "them", "then", "there", "these", "they", "this", "those", "through",
> "to", "too", "under", "up", "use", "very", "want", "was", "way", "we",
> "well", "were", "what", "when", "where", "which", "while", "who",
> "will", "with", "would", "you", "your", "a", "b", "c", "d", "e", "f",
> "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t",
> "u", "v", "w", "x", "y", "z");
>
> $searchQuery=str_replace( "^".$noiseArray."$", " ", $searchQuery);
>
> Thanks in advance.
>
> Dotan Cohen
>
> http://essentialinux.com
> http://what-is-what.com/what_is/sitepoint.html

--
Børge
Kennel Arivene
http://www.arivene.net
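Which characters need a backslash seems to depend on which function receives them. As far as I can tell, str_replace() takes the symbols literally, so only PHP's own string quoting matters there, while preg_replace() needs the regex metacharacters escaped, and preg_quote() can add those backslashes. A rough sketch with a handful of the symbols (the $symbols subset and the character-class pattern are made up for the example):

```php
<?php
// A few of the symbol entries, single-quoted so PHP itself needs no
// escaping for them.
$symbols = array('$', '^', '*', '(', '+');

// str_replace(): each entry is matched as literal text.
$plain = str_replace($symbols, ' ', 'a+b');
echo $plain, "\n"; // a b

// preg_replace(): escape each entry, then build one character class.
$quoted = array_map(function ($s) {
    return preg_quote($s, '/');
}, $symbols);
$pattern = '/[' . implode('', $quoted) . ']/';

echo $pattern, "\n";                             // /[\$\^\*\(\+]/
echo preg_replace($pattern, ' ', 'a+b'), "\n";   // a b
```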