RE: Backslash and full text searches

2003-11-17 Thread Jason Ramsey
Matt,

Yes, database could be cleaned with a query, but I don't even want to think
about the code problem.  I was being hyperbolic with the 500,000; it's
actually more like 50,000 lines of code - but still, who wants to change
50,000 lines of code.  And yes, it's PHP.

However, as an update, I did some tests on another installation of the
software we have on a different server, and it was actually matching the
\.  The problem server will match with something like ... WHERE Field
REGEXP 'somebody.?'s' so mysql knows there's a character there, it just
won't allow you to escape the \ to be able to use it.  I'm going to do a
configuration comparison for both php and mysql between the two boxes and
try to hunt down the problem.

-Original Message-
From: Matt W [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 15, 2003 3:20 PM
To: Jason Ramsey; [EMAIL PROTECTED]
Subject: Re: Backslash and full text searches


Hi Jason,

Sorry for the late reply.

If all of the 100,000 entries have extra backslashes, that can be fixed
with a query. But then you have all your code. :-(

Speaking of code, what language are you using? I always assume PHP, but
with 500k lines I'm thinking you're using C?

Is somebody\'s the only word you're trying to match? I just tried it and
I can't get somebody\'s to match even as a phrase in BOOLEAN mode
because the somebody part is a stopword. However, Matt\'s works. If
you can't match a non-stopword, are you sure the correct amount of
backslashes are making it to MySQL, or are they being lost as escape
characters in your programming language first?

My results:

mysql CREATE TABLE test (test TEXT NOT NULL, FULLTEXT (test));
Query OK, 0 rows affected (0.01 sec)

mysql INSERT INTO test VALUES ('This is Matt\\\'s text.'), ('Text by
Matt.');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql SELECT * FROM test;
+---+
| test  |
+---+
| This is Matt\'s text. |
| Text by Matt. |
+---+
2 rows in set (0.00 sec)

mysql -- This just matches the Matt part
mysql SELECT * FROM test WHERE MATCH (test) AGAINST('Matt\\\'s' IN
BOOLEAN MODE);
+---+
| test  |
+---+
| This is Matt\'s text. |
| Text by Matt. |
+---+
2 rows in set (0.00 sec)

mysql -- Exact phrase
mysql SELECT * FROM test WHERE MATCH (test) AGAINST(' Matt\\\'s ' IN
BOOLEAN MODE);
+---+
| test  |
+---+
| This is Matt\'s text. |
+---+
1 row in set (0.00 sec)


Matt


- Original Message -
From: Jason Ramsey
Sent: Thursday, November 13, 2003 6:41 PM
Subject: RE: Backslash and full text searches


 Thanks for your answer, you can see my comments below.

 -Original Message-
 From: Matt W
 Sent: Thursday, November 13, 2003 3:36 PM
 To: Jason Ramsey; [EMAIL PROTECTED]
 Subject: Re: Backslash and full text searches


 Hi Jason,

 Of course somebody's isn't going to match somebody\'s. :-) somebody\'s
 should match somebody\'s -- because it's really matching the
somebody
 part.

 ^ That makes sense to me, and the behavior I would have expected;
except, it
 doesn't seem to work that way.  Anything I try I can't get
somebody\'s to
 match a field in the database.

 If you're using 4+, you can use IN BOOLEAN MODE to match
 somebody\'s exactly:

 MATCH (col) AGAINST (' somebody\\\'s ' IN BOOLEAN MODE)

 ^ Unfortunatley, this doesn't work.  I've tried several variations,
but
 can't seem to match useing MATCH, LIKE or even =.

 However, I get the feeling that you're not really wanting to match a
 word that has a backslash in it. e.g. You're getting extra
backslashes.
 mysql_escape_string()/addslashes() *DOES NOT* store the backslashes in
 the table -- it's not even *possible* since they're only escape
 characters in the query. If you're getting any extra backslashes when
 retrieving data, too many were added during insertion -- you ran
 mysql_escape_string/addslashes too many times. This happens if PHP's
 stupid magic_quotes_gpc is on and you don't check for that.

 ^hmm.  Well, that certainly is different than what I understood, and
helpful
 information.  It looks like magic_quotes_gpc is on.  However, now I
have a
 problem that we have a 100,000 entries in the database like this, and
 500,000 lines of code expecting the backslash.  I really need to find
a
 solution so that I can search and somehow account for the \ in my
 searches.  Any help would be appreciated.


 Your text should come out exactly the way it was intended. Never, ever
 any need for stripslashes(), etc. if it was inserted correctly. :-)


 Hope that helps.


 Matt



 - Original Message -
 From: Jason Ramsey
 Sent: Thursday, November 13, 2003 4:10 PM
 Subject: Backslash and full text searches


  We make extensive use of full text searches, but have run into some
 problems
  with backslashes

Format of stopwords file

2003-11-13 Thread Jason Ramsey
In the docs is says that you can define your own stopwords file for fulltext
searching.  The following is under show variables...

ft_stopword_file The file from which to read the list of stopwords for
full-text searches. All the words from the file will be used; comments are
not honored. By default, built-in list of stopwords is used (as defined in
`myisam/ft_static.c'). Setting this parameter to an empty string () will
disable stopword filtering. Note: FULLTEXT indexes must be rebuilt after
changing this variable. (This option is new for MySQL 4.0.10)

.. However, it doesn't say what format this file should be in.  Should it be
a text document with one word per line?  Is there some other format?

Also, is there a way to list the words mysql is currently using as
stopwords?


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Backslash and full text searches

2003-11-13 Thread Jason Ramsey
We make extensive use of full text searches, but have run into some problems
with backslashes.  If a word like somebody's is entered into our database,
we escape the string using mysql_escapes_string in php.  So,
mysql_escape_string(somebody's) becomes somebody\'s when it is saved in
the database.  The problem is, we don't seem to be able to match against
this in the database.

Let's say we saved somebody's in the data base.  The following will match
fine and pull up the results expected...

SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)

... But if somebody\'s is stored in the database, there seems to be no way
to match the \.  We've tried all of the following...

SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)
SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\'s)
SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\\\'s)
SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')
SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')

... Any ideas?


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



RE: Backslash and full text searches

2003-11-13 Thread Jason Ramsey
Thanks for your answer, you can see my comments below.

-Original Message-
From: Matt W [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 13, 2003 3:36 PM
To: Jason Ramsey; [EMAIL PROTECTED]
Subject: Re: Backslash and full text searches


Hi Jason,

Of course somebody's isn't going to match somebody\'s. :-) somebody\'s
should match somebody\'s -- because it's really matching the somebody
part.

^ That makes sense to me, and the behavior I would have expected; except, it
doesn't seem to work that way.  Anything I try I can't get somebody\'s to
match a field in the database.

If you're using 4+, you can use IN BOOLEAN MODE to match
somebody\'s exactly:

MATCH (col) AGAINST (' somebody\\\'s ' IN BOOLEAN MODE)

^ Unfortunatley, this doesn't work.  I've tried several variations, but
can't seem to match useing MATCH, LIKE or even =.

However, I get the feeling that you're not really wanting to match a
word that has a backslash in it. e.g. You're getting extra backslashes.
mysql_escape_string()/addslashes() *DOES NOT* store the backslashes in
the table -- it's not even *possible* since they're only escape
characters in the query. If you're getting any extra backslashes when
retrieving data, too many were added during insertion -- you ran
mysql_escape_string/addslashes too many times. This happens if PHP's
stupid magic_quotes_gpc is on and you don't check for that.

^hmm.  Well, that certainly is different than what I understood, and helpful
information.  It looks like magic_quotes_gpc is on.  However, now I have a
problem that we have a 100,000 entries in the database like this, and
500,000 lines of code expecting the backslash.  I really need to find a
solution so that I can search and somehow account for the \ in my
searches.  Any help would be appreciated.


Your text should come out exactly the way it was intended. Never, ever
any need for stripslashes(), etc. if it was inserted correctly. :-)


Hope that helps.


Matt



- Original Message -
From: Jason Ramsey
Sent: Thursday, November 13, 2003 4:10 PM
Subject: Backslash and full text searches


 We make extensive use of full text searches, but have run into some
problems
 with backslashes.  If a word like somebody's is entered into our
database,
 we escape the string using mysql_escapes_string in php.  So,
 mysql_escape_string(somebody's) becomes somebody\'s when it is
saved in
 the database.  The problem is, we don't seem to be able to match
against
 this in the database.

 Let's say we saved somebody's in the data base.  The following will
match
 fine and pull up the results expected...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)

 ... But if somebody\'s is stored in the database, there seems to be
no way
 to match the \.  We've tried all of the following...

 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody's)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody's')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST (somebody\\\'s)
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\'s')
 SELECT * FROM Table WHERE MATCH (Field) AGAINST ('somebody\\\'s')

 ... Any ideas?


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



runnaway mysql children processes

2003-09-02 Thread Jason Ramsey
Hey everyone,

I'd appreciate it if anyone on this list might be able to help us out with a
problem we've been having with mysql lately.

We are using mysql as a backend for a content management system on our site.
We've been using it for awhile, and it has been working fine.  However, we
recently launched a new version of our site, and have begun to see some
strange behavior.  Every few hours (and when we restart mysql) mysql will
suddenly open roughly a hundred child processes and bring the server to a
complete standstill (top will show an idle time of 0% all the processes
will be running around 30% each).  This will last for a few minutes, and
then each of the child processes will go away.  Everything will be fine for
a few hours, and then it will happen again.

We are running Apache 1.3, MySQL 4.0.12 and PHP 4.3.2 on Redhat 8.  The
server is a dual pentium 4 box with 2Gigs of memory.  Our main content table
has roughly 20,000 entries and numerious fulltext indexes.  The data is
about 37M and the indexes 22M.

Here are the settings we're currently running with; though we've tried some
other variations - all with the same results.

# The MySQL server
[mysqld]
# port  = @MYSQL_TCP_PORT@
# socket= @MYSQL_UNIX_ADDR@
socket=/var/lib/mysql/mysql.sock
#skip-locking
key_buffer = 128M
max_allowed_packet = 1M
table_cache = 128
sort_buffer_size = 4M
read_buffer_size = 1M
myisam_sort_buffer_size = 16M
thread_cache = 8
query_cache_size= 50M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 4
datadir=/var/lib/mysql
log-slow-queries

Our mysql instances do seem to be rather large - around 150M - though we
chalked this up to extensive sorting, etc. that needs to be done.  Since we
have 2Gigs of member - this really isn't a problem for us.

After it has gone through this process of opening all these child
processes - bring the machine to a stand still and then coming back - SQL
queries will be extremely fast after that.

Any help would be greatly appreciated.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]