RE: Backslash and full text searches

2003-11-17 Thread Jason Ramsey
Matt,

Yes, database could be cleaned with a query, but I don't even want to think
about the code problem.  I was being hyperbolic with the 500,000; it's
actually more like 50,000 lines of code - but still, who wants to change
50,000 lines of code.  And yes, it's PHP.

However, as an update, I did some tests on another installation of the
software we have on a different server, and it was actually matching the
\.  The problem server will match with something like ... WHERE Field
REGEXP 'somebody.?'s' so mysql knows there's a character there, it just
won't allow you to escape the \ to be able to use it.  I'm going to do a
configuration comparison for both php and mysql between the two boxes and
try to hunt down the problem.

-Original Message-
From: Matt W [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 15, 2003 3:20 PM
To: Jason Ramsey; [EMAIL PROTECTED]
Subject: Re: Backslash and full text searches


Hi Jason,

Sorry for the late reply.

If all of the 100,000 entries have extra backslashes, that can be fixed
with a query. But then you have all your code. :-(

Speaking of code, what language are you using? I always assume PHP, but
with 500k lines I'm thinking you're using C?

Is somebody\'s the only word you're trying to match? I just tried it and
I can't get somebody\'s to match even as a phrase in BOOLEAN mode
because the somebody part is a stopword. However, Matt\'s works. If
you can't match a non-stopword, are you sure the correct amount of
backslashes are making it to MySQL, or are they being lost as escape
characters in your programming language first?

My results:

mysql CREATE TABLE test (test TEXT NOT NULL, FULLTEXT (test));
Query OK, 0 rows affected (0.01 sec)

mysql INSERT INTO test VALUES ('This is Matt\\\'s text.'), ('Text by
Matt.');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql SELECT * FROM test;
+---+
| test  |
+---+
| This is Matt\'s text. |
| Text by Matt. |
+---+
2 rows in set (0.00 sec)

mysql -- This just matches the Matt part
mysql SELECT * FROM test WHERE MATCH (test) AGAINST('Matt\\\'s' IN
BOOLEAN MODE);
+---+
| test  |
+---+
| This is Matt\'s text. |
| Text by Matt. |
+---+
2 rows in set (0.00 sec)

mysql -- Exact phrase
mysql SELECT * FROM test WHERE MATCH (test) AGAINST(' Matt\\\'s ' IN
BOOLEAN MODE);
+---+
| test  |
+---+
| This is Matt\'s text. |
+---+
1 row in set (0.00 sec)


Matt


- Original Message -
From: Jason Ramsey
Sent: Thursday, November 13, 2003 6:41 PM
Subject: RE: Backslash and full text searches


 Thanks for your answer, you can see my comments below.

 -Original Message-
 From: Matt W
 Sent: Thursday, November 13, 2003 3:36 PM
 To: Jason Ramsey; [EMAIL PROTECTED]
 Subject: Re: Backslash and full text searches


 Hi Jason,

 Of course somebody's isn't going to match somebody\'s. :-) somebody\'s
 should match somebody\'s -- because it's really matching the
somebody
 part.

 ^ That makes sense to me, and the behavior I would have expected;
except, it
 doesn't seem to work that way.  Anything I try I can't get
somebody\'s to
 match a field in the database.

 If you're using 4+, you can use IN BOOLEAN MODE to match
 somebody\'s exactly:

 MATCH (col) AGAINST (' somebody\\\'s ' IN BOOLEAN MODE)

 ^ Unfortunatley, this doesn't work.  I've tried several variations,
but
 can't seem to match useing MATCH, LIKE or even =.

 However, I get the feeling that you're not really wanting to match a
 word that has a backslash in it. e.g. You're getting extra
backslashes.
 mysql_escape_string()/addslashes() *DOES NOT* store the backslashes in
 the table -- it's not even *possible* since they're only escape
 characters in the query. If you're getting any extra backslashes when
 retrieving data, too many were added during insertion -- you ran
 mysql_escape_string/addslashes too many times. This happens if PHP's
 stupid magic_quotes_gpc is on and you don't check for that.

 ^hmm.  Well, that certainly is different than what I understood, and
helpful
 information.  It looks like magic_quotes_gpc is on.  However, now I
have a
 problem that we have a 100,000 entries in the database like this, and
 500,000 lines of code expecting the backslash.  I really need to find
a
 solution so that I can search and somehow account for the \ in my
 searches.  Any help would be appreciated.


 Your text should come out exactly the way it was intended. Never, ever
 any need for stripslashes(), etc. if it was inserted correctly. :-)


 Hope that helps.


 Matt



 - Original Message -
 From: Jason Ramsey
 Sent: Thursday, November 13, 2003 4:10 PM
 Subject: Backslash and full text searches


  We make extensive use of full text searches, but have run into some
 problems
  with backslashes

RE: Backslash and full text searches

2003-11-16 Thread Michael McTernan
Hi there,

I had a similar problem wanting to match 'words' like 42.3.1.1 as
technical references.  I couldn't get them to match using a fulltext index,
I assume because the full stops are used by the parser in MySql to break the
text into words for indexing - maybe this is the same for the slashes.  I've
considered changing the full stops to underscores as I add the text to the
database, or re-coding the function that does the splitting - I think I
found the function, but it is quite hard to penetrate :-/

Thanks,

Mike

 -Original Message-
 From: Jason Ramsey [mailto:[EMAIL PROTECTED]
 Sent: 13 November 2003 22:11
 To: [EMAIL PROTECTED]
 Subject: Backslash and full text searches


 We make extensive use of full text searches, but have run into
 some problems
 with backslashes.  If a word like somebody's is entered into
 our database,
 we escape the string using mysql_escapes_string in php.  So,
 mysql_escape_string(somebody's) becomes somebody\'s when it
 is saved in
 the database.  The problem is, we don't seem to be able to match against
 this in the database.

 Let's say we saved somebody's in the data base.  The following
 will match
 fine and pull up the results expected...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)

  But if somebody\'s is stored in the database, there seems
 to be no way
 to match the \.  We've tried all of the following...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\\\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')

  Any ideas?


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]







-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Backslash and full text searches

2003-11-15 Thread Matt W
Hi Jason,

Sorry for the late reply.

If all of the 100,000 entries have extra backslashes, that can be fixed
with a query. But then you have all your code. :-(

Speaking of code, what language are you using? I always assume PHP, but
with 500k lines I'm thinking you're using C?

Is somebody\'s the only word you're trying to match? I just tried it and
I can't get somebody\'s to match even as a phrase in BOOLEAN mode
because the somebody part is a stopword. However, Matt\'s works. If
you can't match a non-stopword, are you sure the correct amount of
backslashes are making it to MySQL, or are they being lost as escape
characters in your programming language first?

My results:

mysql CREATE TABLE test (test TEXT NOT NULL, FULLTEXT (test));
Query OK, 0 rows affected (0.01 sec)

mysql INSERT INTO test VALUES ('This is Matt\\\'s text.'), ('Text by
Matt.');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql SELECT * FROM test;
+---+
| test  |
+---+
| This is Matt\'s text. |
| Text by Matt. |
+---+
2 rows in set (0.00 sec)

mysql -- This just matches the Matt part
mysql SELECT * FROM test WHERE MATCH (test) AGAINST('Matt\\\'s' IN
BOOLEAN MODE);
+---+
| test  |
+---+
| This is Matt\'s text. |
| Text by Matt. |
+---+
2 rows in set (0.00 sec)

mysql -- Exact phrase
mysql SELECT * FROM test WHERE MATCH (test) AGAINST(' Matt\\\'s ' IN
BOOLEAN MODE);
+---+
| test  |
+---+
| This is Matt\'s text. |
+---+
1 row in set (0.00 sec)


Matt


- Original Message -
From: Jason Ramsey
Sent: Thursday, November 13, 2003 6:41 PM
Subject: RE: Backslash and full text searches


 Thanks for your answer, you can see my comments below.

 -Original Message-
 From: Matt W
 Sent: Thursday, November 13, 2003 3:36 PM
 To: Jason Ramsey; [EMAIL PROTECTED]
 Subject: Re: Backslash and full text searches


 Hi Jason,

 Of course somebody's isn't going to match somebody\'s. :-) somebody\'s
 should match somebody\'s -- because it's really matching the
somebody
 part.

 ^ That makes sense to me, and the behavior I would have expected;
except, it
 doesn't seem to work that way.  Anything I try I can't get
somebody\'s to
 match a field in the database.

 If you're using 4+, you can use IN BOOLEAN MODE to match
 somebody\'s exactly:

 MATCH (col) AGAINST (' somebody\\\'s ' IN BOOLEAN MODE)

 ^ Unfortunatley, this doesn't work.  I've tried several variations,
but
 can't seem to match useing MATCH, LIKE or even =.

 However, I get the feeling that you're not really wanting to match a
 word that has a backslash in it. e.g. You're getting extra
backslashes.
 mysql_escape_string()/addslashes() *DOES NOT* store the backslashes in
 the table -- it's not even *possible* since they're only escape
 characters in the query. If you're getting any extra backslashes when
 retrieving data, too many were added during insertion -- you ran
 mysql_escape_string/addslashes too many times. This happens if PHP's
 stupid magic_quotes_gpc is on and you don't check for that.

 ^hmm.  Well, that certainly is different than what I understood, and
helpful
 information.  It looks like magic_quotes_gpc is on.  However, now I
have a
 problem that we have a 100,000 entries in the database like this, and
 500,000 lines of code expecting the backslash.  I really need to find
a
 solution so that I can search and somehow account for the \ in my
 searches.  Any help would be appreciated.


 Your text should come out exactly the way it was intended. Never, ever
 any need for stripslashes(), etc. if it was inserted correctly. :-)


 Hope that helps.


 Matt



 - Original Message -
 From: Jason Ramsey
 Sent: Thursday, November 13, 2003 4:10 PM
 Subject: Backslash and full text searches


  We make extensive use of full text searches, but have run into some
 problems
  with backslashes.  If a word like somebody's is entered into our
 database,
  we escape the string using mysql_escapes_string in php.  So,
  mysql_escape_string(somebody's) becomes somebody\'s when it is
 saved in
  the database.  The problem is, we don't seem to be able to match
 against
  this in the database.
 
  Let's say we saved somebody's in the data base.  The following
will
 match
  fine and pull up the results expected...
 
  SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)
 
  ... But if somebody\'s is stored in the database, there seems to
be
 no way
  to match the \.  We've tried all of the following...
 
  SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)
  SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
  SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
  SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\'s)
  SELECT * FROM Table WHERE MATCH (Field) AGAINST

Re: Backslash and full text searches

2003-11-13 Thread Matt W
Hi Jason,

Of course somebody's isn't going to match somebody\'s. :-) somebody\'s
should match somebody\'s -- because it's really matching the somebody
part. If you're using 4+, you can use IN BOOLEAN MODE to match
somebody\'s exactly:

MATCH (col) AGAINST (' somebody\\\'s ' IN BOOLEAN MODE)

However, I get the feeling that you're not really wanting to match a
word that has a backslash in it. e.g. You're getting extra backslashes.
mysql_escape_string()/addslashes() *DOES NOT* store the backslashes in
the table -- it's not even *possible* since they're only escape
characters in the query. If you're getting any extra backslashes when
retrieving data, too many were added during insertion -- you ran
mysql_escape_string/addslashes too many times. This happens if PHP's
stupid magic_quotes_gpc is on and you don't check for that.

Your text should come out exactly the way it was intended. Never, ever
any need for stripslashes(), etc. if it was inserted correctly. :-)


Hope that helps.


Matt



- Original Message -
From: Jason Ramsey
Sent: Thursday, November 13, 2003 4:10 PM
Subject: Backslash and full text searches


 We make extensive use of full text searches, but have run into some
problems
 with backslashes.  If a word like somebody's is entered into our
database,
 we escape the string using mysql_escapes_string in php.  So,
 mysql_escape_string(somebody's) becomes somebody\'s when it is
saved in
 the database.  The problem is, we don't seem to be able to match
against
 this in the database.

 Let's say we saved somebody's in the data base.  The following will
match
 fine and pull up the results expected...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)

 ... But if somebody\'s is stored in the database, there seems to be
no way
 to match the \.  We've tried all of the following...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\\\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')

 ... Any ideas?


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Backslash and full text searches

2003-11-13 Thread Jason Ramsey
Thanks for your answer, you can see my comments below.

-Original Message-
From: Matt W [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 13, 2003 3:36 PM
To: Jason Ramsey; [EMAIL PROTECTED]
Subject: Re: Backslash and full text searches


Hi Jason,

Of course somebody's isn't going to match somebody\'s. :-) somebody\'s
should match somebody\'s -- because it's really matching the somebody
part.

^ That makes sense to me, and the behavior I would have expected; except, it
doesn't seem to work that way.  Anything I try I can't get somebody\'s to
match a field in the database.

If you're using 4+, you can use IN BOOLEAN MODE to match
somebody\'s exactly:

MATCH (col) AGAINST (' somebody\\\'s ' IN BOOLEAN MODE)

^ Unfortunatley, this doesn't work.  I've tried several variations, but
can't seem to match useing MATCH, LIKE or even =.

However, I get the feeling that you're not really wanting to match a
word that has a backslash in it. e.g. You're getting extra backslashes.
mysql_escape_string()/addslashes() *DOES NOT* store the backslashes in
the table -- it's not even *possible* since they're only escape
characters in the query. If you're getting any extra backslashes when
retrieving data, too many were added during insertion -- you ran
mysql_escape_string/addslashes too many times. This happens if PHP's
stupid magic_quotes_gpc is on and you don't check for that.

^hmm.  Well, that certainly is different than what I understood, and helpful
information.  It looks like magic_quotes_gpc is on.  However, now I have a
problem that we have a 100,000 entries in the database like this, and
500,000 lines of code expecting the backslash.  I really need to find a
solution so that I can search and somehow account for the \ in my
searches.  Any help would be appreciated.


Your text should come out exactly the way it was intended. Never, ever
any need for stripslashes(), etc. if it was inserted correctly. :-)


Hope that helps.


Matt



- Original Message -
From: Jason Ramsey
Sent: Thursday, November 13, 2003 4:10 PM
Subject: Backslash and full text searches


 We make extensive use of full text searches, but have run into some
problems
 with backslashes.  If a word like somebody's is entered into our
database,
 we escape the string using mysql_escapes_string in php.  So,
 mysql_escape_string(somebody's) becomes somebody\'s when it is
saved in
 the database.  The problem is, we don't seem to be able to match
against
 this in the database.

 Let's say we saved somebody's in the data base.  The following will
match
 fine and pull up the results expected...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)

 ... But if somebody\'s is stored in the database, there seems to be
no way
 to match the \.  We've tried all of the following...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\\\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')

 ... Any ideas?


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]