from:"Gordan Bobic"

Re: MySQL upgrade from 5.0.32 to 5.0.42 broke replication

2007-06-12 Thread Gordan Bobic

On Tue, 12 Jun 2007, Ian P. Christian wrote:

> Ian P. Christian wrote:
> > I upgraded my slave server a few weeks ago, and the slave failed, with
> > an error similar to the one shown below.
> 
> 
> I have figured out what happened here now - and I'm part of the way
> though fixing it.
> 
> It turned out the defaults had changed somewhere, and rather then using
> /var/lib/mysql/-bin, it was using /var/run/mysql/mysqld-bin
> (and the same change fro relay logs too).

I think you've just discovered why non-packaged distributions have 
no place in production environment. Compiling things with defaults and 
expecting it to work the same is asking for trouble most of the time.

> Now... I've changed the slave to use it's correct logs now - however, if
> I do the same on the master, I'll have the last 4 hours of logs in
> /var/run/mysql/mysqld-bin ignored.
> 
> Somehow, I need to get the slave to catch up with the master's old logs
> in /var/lib/mysql/-bin, and then continue from the brand new
> logs in /var/run/mysql/mysqld-bin
> 
> This is an awful mess, and I'm not sure it's recoverable - perhaps it is.
> 
> In theory, I should be able to find out where the slave was up to in the
> old logs, extract them manually and replay them on the slave, and then
> reset the slave to use the new logs - however i'm not sure how reliable
> that's going to be - or even how to go about doing it yet.
> 
> Ideas anyone?

Has the slave started replicating from the new logs? If it has, you'll 
find it will be quicker to re-seed the slave - even if it takes a day or 
two. The only difference will be that in one case the slave will catch up 
on it's own, and in the other, you'll also lose a day or two of your time 
trawling through the logs manually trying to re-construct the data.

If it hasn't, you can probably change the big-log sequencing numbers and 
change the pointers to file names and offsets in the index files on the 
master and the slave, and hope for the best. With some luck, it'll work, 
but I wouldn't count on it.

Gordan

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Translation of sql into mysql

2007-06-12 Thread Gordan Bobic

On Wed, 13 Jun 2007, David Scott wrote:

> 
> I am trying to update the column CreditCode in a table extract using data 
> from another table CreditCodes, which has columns CreditCode and 
> Consumer_No.
> 
> I have been given the following sql which works on another database (not 
> sure which and it is late at night so I can't ring up and ask)
> 
> update extract
>  set CustomerCreditCode = b.CreditCode
>  from extract a
>  inner join CreditCodes b
>  on a.ConsumerNO = b.Consumer_No;
> 
> This gives an error in mysql:
> ERROR 1064: You have an error in your SQL syntax; check the manual that 
> corresponds to your MySQL server version for the right syntax to use near 
> 'from extract a
> inner join CreditCodes b
> on a.ConsumerNO = b.Cons
> 
> Can anyone translate it into correct mysql syntax for me?

Try:

UPDATE  extract,
CreditCodes
SET extract.CustomerCreditCode = CreditCodes.CreditCode
WHERE   extract.ConsumerNO = CreditCodes.Consumer_No;

You should probably try this on a scratch database or at least take a 
backup first.

Gordan


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: MySQL upgrade from 5.0.32 to 5.0.42 broke replication

2007-06-12 Thread Gordan Bobic

Just to clarify - are you asking for suggestions regarding avoiding 
re-seeding the slave or regarding what is likely to have gone wrong?

Generally, a newer slave can cope with an older master, but not the other 
way around. If you updated the master while slave was out of date, you may 
be out of options.

For what it's worth, LOAD DATA FROM MASTER tends to be much faster and 
more reliable than copying tar balls across if you have to re-seed.

The error indicates that there's a problem with that version, though. Have 
you tried any versions between 32 and 42?

Gordan

On Tue, 12 Jun 2007, Ian P. Christian wrote:

> I upgraded my slave server a few weeks ago, and the slave failed, with
> an error similar to the one shown below.
> 
> I rolled back my upgrade, and it started working again, so I forgot
> about it.
> 
> Today, I upgraded the master (again, from 5.0.32 to 5.0.42) - and hte
> slave failed again.
> 
> I thought upgrading the slave to match the master might help, but still
> it failed.  Below is the error.
> 
> The hostname did *not* change.
> 
> 
> 070612 13:35:09 [Warning] No argument was provided to --log-bin, and
> --log-bin-index was not used; so replication may break when this MySQL
> server acts as a master and has his hostname changed!! Please use
> '--log-bin=/var/run/mysqld/mysqld-bin' to avoid this problem.
> 
> 070612 13:35:09  InnoDB: Started; log sequence number 40 824537593
> 070612 13:35:09 [Warning] Neither --relay-log nor --relay-log-index were
> used; so replication may break when this MySQL server acts as a slave
> and has his hostname changed!! Please use
> '--relay-log=/var/run/mysqld/mysqld-relay-bin' to avoid this problem.
> 070612 13:35:09 [ERROR] Failed to open the relay log
> './xian-relay-bin.000962' (relay_log_pos 284157529)
> 070612 13:35:09 [ERROR] Could not find target log during relay log
> initialization
> 070612 13:35:09 [ERROR] Failed to initialize the master info structure
> 070612 13:35:09 [Note] /usr/sbin/mysqld: ready for connections.
> Version: '5.0.42-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306
>  Gentoo Linux mysql-5.0.42
> 
> Any ideas/suggestions welcome, reseeding the slave will literally take days.
> 
> 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Building RPMs with ICC

2007-06-11 Thread Gordan Bobic

Having just tried building MySQL 6.0 SRPM on CentOS/RHEL 5, it would
appear that it doesn't even build cleanly with the default settings with
GCC! Has anyone got any information as to why this might be? It seems to
build OK, but then all the tests (mysql-test-run) seem to fail.

Is this known not to work, or am I just unlucky enough to be the first
one to try it?

Gordan

Gordan Bobic wrote:
> Hi,
> 
> I'm trying to build the RPMs with ICC, but I'm not having a lot of luck. I 
> saw the documentation for what paremeters are used to combile the tar 
> balled binaries, but can anyone give me some guidance on how to get the 
> RPMs to build?
> 
> The distributions I want to build for are CentOS 5.0 / x86_64 and RedHat 9 
> / pentium3.
> 
> At the moment I have tried the following:
> export CC=icc
> export CXX=icpc
> export MYSQL_BUILD_PATH=$PATH
> export CPATH=/path/to/my/icc/includes
> export LIBRARY_PATH=/path/to/my/icc/lib
> 
> Added ICC libs to ld.so.conf
> 
> Changed parameters on icc in ~/.rpmrc:
> optflags: x86_64 -O3 -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -msse3 
> -xP -no-gcc -restrict -ip -mp -funroll-loops
> 
> which is pretty close to the documented example. I had to add the 
> -march/-mtune/-mcpu parameters because it looks like the default Makefile 
> somewhere was trying to substitute x86_64 in there, and this is not a 
> valid value for ICC v9.1 (thus causing it to abort). Instead it now just 
> issues a warning and says that it is overriding pentium4 with -msse3, but 
> proceeds with building.
> 
> Then I tried:
> rpmbuild --define "__cc icc" --rebuild MySQL-6.0.0-0.glibc23.src.rpm
> 
> Then things fail at various points. Then I doctored the rpm file to 
> exclude building of NDB, InnoDB and most other optional DB engines (I only 
> really require MyISAM, but I'm tryign to be a bit future facing so I'm 
> usiong 6.0 instead of 5.x). Eventually I get a segfault from the linker 
> and static mysqld library fails to build, at which point the process dies.
> 
> Is there any more detailed guidance available for building RPMs with ICC?
> 
> Many thanks.
> 
> Gordan
> 
> 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: MySQL Magazine - Issue 1 available NOW!!!!

2007-06-11 Thread Gordan Bobic

Oh dear... Without getting into any religious arguments, if you have to
use it, PHP already provides a perfectly good interface for preventing
any SQL injections - ever. Use MySQLi and bound parameters. And if
somebody manages to invent some quasi-valid reason for not using MySQLi
(e.g. version of PHP used), then there is always mysql_escape_string() /
mysql_real_escape_string().

I don't understand this never-ending fascination with re-inventing a
square wheel for an application for which the standard round type has
already been kindly provided since year dot.

/RANT

Gordan

Daevid Vincent wrote:
> Yes, you are correct. In a cruel, ironic twist, that actually bit me in
> the ass, as it turns out we tried to import some "HTML" output from MS
> Word, which adds all kinds of crazy XHTML comment tag thingys that look
> like:
> 
> 
> Etc.
> 
> *sigh*
>  
> 
>> -Original Message-
>> From: Yves Goergen [mailto:[EMAIL PROTECTED] 
>> Sent: Saturday, June 09, 2007 4:34 AM
>> To: Daevid Vincent
>> Cc: 'B. Keith Murphy'; 'MySQL General'
>> Subject: Re: MySQL Magazine - Issue 1 available NOW
>>
>> On 04.06.2007 23:44 CE(S)T, Daevid Vincent wrote:
>>> Thanks for the magazine. I already incorporated a little extra SQL
>>> injection checking into my db.inc.php wrapper...
>>>
>>> //[dv] added to remove all comments (which may help with 
>> SQL injections
>>> as well.
>>> $sql = preg_replace("/#.*?[\r\n]/s", '', $sql);
>>> $sql = preg_replace("/--.*?[\r\n]/s", '', $sql);
>>> $sql = preg_replace("@/\*(.*?)\*/@s", '', $sql); 
>> I'm not aware of the context, but I guess you can imagine 
>> that this will
>> corrupt any SQL queries that contain "#" or "--" or "/* ... 
>> */" inside a
>> string. So I would highly recommend not using those.
>>
>> -- 
>> Yves Goergen "LonelyPixel" <[EMAIL PROTECTED]>
>> Visit my web laboratory at http://beta.unclassified.de
>>
> 
> 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: How to set table-specific privileges

2007-06-11 Thread Gordan Bobic

On Mon, 11 Jun 2007, Mahmoud Badreddine wrote:

> Hello,
> I am trying to revoke all privileges from a user on a certain table in a
> database.
> I want the user to continue to have all his/her privileges intact for the
> other tables though.
> 
> I used the statement "REVOKE ALL PRIVILEGES on dbName.tableName from
> thisUser";
> 
> But that doesn't seem to change anything.
> The message always gives "Query 0K, 0 rows affected..."
> What could I be missing.

I seem to remember that this always says 0 rows affected.
To make sure the privilege changes stick, follow that up with:

mysql> FLUSH PRIVILEGES;

That should do the trick.

Gordan


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Maximizing number of connections per client session

2007-06-11 Thread Gordan Bobic

On Mon, 11 Jun 2007, Dave Shariff Yadallee -  System Administrator a.k.a. The 
Root of the Problem wrote:

> REcently my mysql  server has been overload with a lot
> of connections; in fact losing the mysql.user table and
> one whole database.

Define "losing". You lost access, or the tables got corrupted?

> How can this be prevented?

If your load is so high that the whole system is grinding to a halt, that 
indicates one or more of the following:

1) Your database isn't configured properly.
1.1) This may mean you have to allocate more memory to it in my.conf, but 
more likely,
1.2) you either have a poorly designed query somewhere in the calling 
code, or most likely
1.3) a missing index on a frequently used table

2) Your server isn't up to the job

Have you enabled logging of slow queries in my.conf to see what queries 
are taking a lot of time to execute?

What table type are you using? Is your data access mainly reads, mainly 
writes, or a combination of both? If it's both, you may benefit from using 
InnoDB tables instead of MyISAM, assuming this isn't what you are doing 
already.

The first thing to find is what queries are taking up a lot of resources 
and making sure they are running fully indexed, and/or possibly changing 
the data structure to better suit the query requirements.

Gordan

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Building RPMs with ICC

2007-06-11 Thread Gordan Bobic

Hi,

I'm trying to build the RPMs with ICC, but I'm not having a lot of luck. I 
saw the documentation for what paremeters are used to combile the tar 
balled binaries, but can anyone give me some guidance on how to get the 
RPMs to build?

The distributions I want to build for are CentOS 5.0 / x86_64 and RedHat 9 
/ pentium3.

At the moment I have tried the following:
export CC=icc
export CXX=icpc
export MYSQL_BUILD_PATH=$PATH
export CPATH=/path/to/my/icc/includes
export LIBRARY_PATH=/path/to/my/icc/lib

Added ICC libs to ld.so.conf

Changed parameters on icc in ~/.rpmrc:
optflags: x86_64 -O3 -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -msse3 
-xP -no-gcc -restrict -ip -mp -funroll-loops

which is pretty close to the documented example. I had to add the 
-march/-mtune/-mcpu parameters because it looks like the default Makefile 
somewhere was trying to substitute x86_64 in there, and this is not a 
valid value for ICC v9.1 (thus causing it to abort). Instead it now just 
issues a warning and says that it is overriding pentium4 with -msse3, but 
proceeds with building.

Then I tried:
rpmbuild --define "__cc icc" --rebuild MySQL-6.0.0-0.glibc23.src.rpm

Then things fail at various points. Then I doctored the rpm file to 
exclude building of NDB, InnoDB and most other optional DB engines (I only 
really require MyISAM, but I'm tryign to be a bit future facing so I'm 
usiong 6.0 instead of 5.x). Eventually I get a segfault from the linker 
and static mysqld library fails to build, at which point the process dies.

Is there any more detailed guidance available for building RPMs with ICC?

Many thanks.

Gordan


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Seriously.. When are we going to get subqueries?!

2005-06-09 Thread Gordan Bobic


Jay Blanchard wrote:

[snip]


Well, George, you never mentioned that this was your problem. And you
would run into the same problem, given your definition above,


regardless


of database (unless the database product has a hack to account for it,
I am not aware of any).



Not true. PostgreSQL can do it. If you want the timestamp modified every

time, a record is changed, you can use triggers to achieve this 
transparently. In PostgreSQL you can also set the default value (at 
creation time) to the output of a function.

[/snip]

I was unaware of that, but then you have to create a trigger...which is
a hack.


Not true. Triggers are in the SQL specification. They are a legitimate 
feature with specified expected behaviour. A hack is only implementing 
function returns as default field values on timestamps, rather than 
handling all field types in a unified way and allowing them to be set to 
the return value of a function by default.



A timestamp column will update anytime the tuple is updated
without additional triggers. And as Jeff points out MySQL 4.1 has a way
to control when the field gets populated.



[snip]
My understanding was the timestamp fields were only set when the record 
is created. They are not changed when the record is modified.

[/snip]

Not true, see above. And you can use the table creation statement I
provided earlier to make a table to test this with.

[snip] 


As far as MySQL is concerned it has been documented that there are
more 
than several large scale database application being utilized today,

including projects at Fortune 500 companies.



Indeed, but it depends on your application. If you are running something
big but very simple (e.g. 1 daily batch if INSERTs over night, and the 
rest of the day of millions of SELECTs), MySQL is fine. On any project 
where I actually have to manipulate the data and do more complex things,


I have been finding that MySQL simply isn't up to it.
[/snip]

Really? We do some very complex stuff with the data each day and have
had relatively little problem with these issues.


You really cannot have tried very hard, then. :-)


[snip]
Horses for courses, as ever. If MySQL isn't capable enough for your 
application, the correct solution is to find a more suitable database - 
not moaning about how MySQL isn't good enough for your specific 
application, just because you are afraid of learning how to use 
something slightly different.

[/snip]

Spot on.


Thank you. :-)

Gordan

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Seriously.. When are we going to get subqueries?!

2005-06-09 Thread Gordan Bobic


Jay Blanchard wrote:

[snip]
The issue with timestamp is this:

You can only have one timestamp with a default of the current date.

My app has two fields in one table

Created_datedatetime default now()
Last_update datetime default now()

This doesn't work with timestamp because timestamp doesn't support two
columns with default current_timestamp. Additionally, it looks like if
you
default to current_timestamp then any time the record is modified, that
value is changed.
[/snip]

Well, George, you never mentioned that this was your problem. And you
would run into the same problem, given your definition above, regardless
of database (unless the database product has a hack to account for it, I
am not aware of any).


Not true. PostgreSQL can do it. If you want the timestamp modified every 
time, a record is changed, you can use triggers to achieve this 
transparently. In PostgreSQL you can also set the default value (at 
creation time) to the output of a function.




But it is simply fixed. When creating the row you
include as your value for Created_date, NOW()

INSERT INTO tblFoo (Created_date) VALUES (NOW())

...and then you never modify the Created_date again. The column with the
timestamp will continue to update properly when the record is UPDATED.


My understanding was the timestamp fields were only set when the record 
is created. They are not changed when the record is modified.



[snip]
While tinker-toys were wildly popular (I had them), they are wholly
unsuited
for large scale building projects.
[/snip]

I disagree. As shown by this link, a computer desk has been made from
tinker toys. http://www.charm.net/~jriley/tinkertoy.html :)

As far as MySQL is concerned it has been documented that there are more
than several large scale database application being utilized today,
including projects at Fortune 500 companies.


Indeed, but it depends on your application. If you are running something 
big but very simple (e.g. 1 daily batch if INSERTs over night, and the 
rest of the day of millions of SELECTs), MySQL is fine. On any project 
where I actually have to manipulate the data and do more complex things, 
I have been finding that MySQL simply isn't up to it.



I personally  manage
several MySQL databases containing 100's of millions of records on
OpenBSD and Linux systems. I have seen examples of MySQL databases
larger than the ones I am intimately familiar with. There are several
folks on this list who operate MySQL databases for large scale projects.


Size isn't the problem here. Functionality for dealing with such 
databases in a clean way is.


Horses for courses, as ever. If MySQL isn't capable enough for your 
application, the correct solution is to find a more suitable database - 
not moaning about how MySQL isn't good enough for your specific 
application, just because you are afraid of learning how to use 
something slightly different.


Gordan

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Seriously.. When are we going to get subqueries?!

2005-06-09 Thread Gordan Bobic




I think MySQL has a little ways to go yet before I would subjectively
call it best.

I posted twice to the list with questions about porting my application
that
runs on (SQL Server, Oracle, PostgreSQL, Sybase SQL Anywhere, MS Access,
and
DB2) to MySQL. No one on the mysql list, or the internals list responded
to my pretty basic issues:

1)  Why can't I declare a datetime field with DEFAULT NOW()


Because MySQL only lets you declare constants as defaults. Timestamp 
fields are an exception hack. I can understand why you might want to 
port applications from SQL Server/Oracle/Sybase/Access/DB2 to MySQL, but 
if you already have an app on PostgreSQL, why would you want to port it 
to MySQL? In that case you already have a "free" database that does what 
you want.



2)  Since the SQL standard states that identifiers are not case
sensitive, how can I use the DB without case sensitivity, when I don't
have authority to change the system wide lowercase setting? I wouldn't have
authority to change the setting in a hosted environment.


If this is your worst gripe about MySQL's compliance with SQL standards, 
you have obviously not tried to use it for very complex things...



I have to say, MySQL still looks like a tinker-toy to me.


It is - but for most purposes it is "good enough".

If your applications grows to exceeds it's capabilities, provided you 
have written your queries in a reasonable and sensible way (_especially_ 
including NOT using MySQL proprietary extensions, and particularly using 
enum() fields (which are an insane idea in what is supposed to be a 
RELATIONAL database - but that's a whole different rant)), porting to a 
database with the extra features you require can be fairly painless. 
Porting the data and table structures is certainly simple enough. (I 
have written scripts to do this in a matter of hours in the past - the 
existing porting scripts are next to useless).


Gordan

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Bug Report (LOAD DATA FROM MASTER on MERGE Tables)

2005-06-07 Thread Gordan Bobic

It would appear that LOAD DATA FROM MASTER processes databases and 
tables alphabetically. When a merge table is being copied, and it's name 
is alphabetically before some/any/all of it's components, the process 
fails with a 1017 couldn't find file error.


Has this been fixed? If so, as of which version? Is this a bug on the 
master or the slave side? I ask that because I am replicating from 4.1.x 
to 5.0.x.


Many thanks.

Gordan

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Full Text Search with and without index - possible bug?

2002-01-19 Thread Gordan Bobic


Hi!

> > 1.1) Full Text Search can, according to the manual, be performed without
> > the FTS index, but it is slower.
> > 1.2) MySQL can only use 1 index per join per table.
> >
> > => This means that if I specify the USE INDEX (some_non_fulltext_index),
> > the FTS will be performed without the index, and this will only work IN
> > BOOLEAN MODE. Otherwise, MySQL returns an error, complaining about the
> > lack of an index.
>
> Gordan, you're right. It's the way it was expected to behave :-)

The part that I thought could be a "bug" was the one you didn't quote. The 
problem I am experiencing is that the boolean MATCH/AGAINST FTS WITH an FTS 
index doesn't match the results returned WITHOUT the FTS index when the same 
query/parameters are used. I don't know if this is the case when there is no 
index at all (it would be difficult to test, too). I just found this out by 
using the USE INDEX clause to switch to using different indices.

Incidentally, I have set the fts minimum word length to 1, so the cause of 
this is not the fact that the index skips some values. In fact, it is the 
unindexed search that seems to get things wrong. The indexed FTS verifiably 
works on my data set. The un-indexed FTS demonstrably doesn't work as 
expected. The unindexed search returns much fewer values.

> It's not a "known" bug as the code is rather new.
>
> Can you create a test case for this ?

Possibly, but it could be difficult as my data set changes daily. It looks 
like a problem that ought to be replicable with fairly generic data.

Incidentally, could it be caused by the fact that I am using a multi-column 
FTS? Could this be where the operation without the index is falling over?

Anyway, from what I can see, the indexed FTS works great. The unindexed FTS 
doesn't (both in boolean mode).

Another thing - is it possible to combine a FULLTEXT index with another 
column/index to achieve a multi-column index with FTS capabilities? It just 
seems wrong that FTS should take a fixed amount of time regardless of what 
other indexable restrictions are made on the data set, and if there is a 
limitation on 1 index per table per query, then the multi-column index is the 
only way around it.

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Full Text Search with and without index - possible bug?

2002-01-18 Thread Gordan Bobic


Hi.

I thought it would be useful to share my findings. They all relate to the 
4.0.1 release. It would be nice to have some clarification on whether this is 
expected behaviour, whether this behaviour is wrong (i.e. bug, corrupted 
index, etc), and what you guys think could be causing it.

1.1) Full Text Search can, according to the manual, be performed without the 
FTS index, but it is slower.
1.2) MySQL can only use 1 index per join per table.

=> This means that if I specify the USE INDEX (some_non_fulltext_index), the 
FTS will be performed without the index, and this will only work IN BOOLEAN 
MODE. Otherwise, MySQL returns an error, complaining about the lack of an 
index.

HOWEVER, please someone explain why the following results are happening:

Two nearly identical queries, similar to:

SELECT  Table1.ID,
Table1.Title,
Table1.Type,
Table1.Description,
DATE_FORMAT(Retrieved, '%d-%b-%Y %H:%i:%S') AS Retrieved
FROMTable1
WHERE   Type = 'SomeType'
AND Retrieved > '2002011800'
MATCH (Title, Description)  AGAINST ('some words to match' IN 
BOOLEAN MODE)
ORDER BY MATCH (Title, Description) AGAINST ('some words to match' IN 
BOOLEAN MODE) DESC;

and

SELECT  Table1.ID,
Table1.Title,
Table1.Type,
Table1.Description,
DATE_FORMAT(Retrieved, '%d-%b-%Y %H:%i:%S') AS Retrieved
FROMTable1 USE INDEX (Table1_Retrieved_Index)
WHERE   Type = 'SomeType'
AND Retrieved > '2002011800'
MATCH (Title, Description)  AGAINST ('some words to match' IN 
BOOLEAN MODE)
ORDER BY Retrieved DESC;

These two queries return DIFFERENT numbers of records!

If my understanding of the documentation is correct, the second example 
should be slower because the FTS index isn't used. But the results should be 
the same right? Well, that definitely isn't the case in my database.

I have just done a REPAIR TABLE Table1, Table2... EXTENDED, so the tables 
definitely aren't corrupted.

The FTS index search returns 24 records on my data set (~ 60K records), and 
the non-fts search returns 7 records.

The reason I have been even trying this is because FTS is a bit slow for some 
of the things I am doing. By limiting the data set through the "Retrieved" 
date field, I can usually cut the data down to about 10% of the total size, 
hoping that non-indexed FTS on that will be faster.

Well, it turned out to be faster for cases where the data set was cut down a 
lot by the index, but the IN BOOLEAN MODE FTS doesn't seem to be reacting to 
things like '-word' in the MATCH/AGAINST clause, as it should per the FTS 
search. Sometimes, specifying a '-word' that should only remove a few results 
returns 0 rows - which is clearly wrong in some cases.

Is there a know bug in the indexless FTS that causes this? The indexed FTS is 
behaving well, but I was really hoping to gain some speed by using a 
different index in some specific cases...

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Porting from MS SQL to MySQL

2002-01-18 Thread Gordan Bobic

On Fri, 18 Jan 2002, Markus Lervik wrote:

>
> Hello all!
>
> We've requested a database from different companies, and specifically
> said we wanted MySQL or PostgreSQL because of the open source angle
> and we're a library.
> One company offered MS SQL as the platform and said that they can later on
> port it to MySQL. For this they wanted 18 000 euro. Now, what I want to know
> is, how easy is it to port a (fairly complicated) database from MS SQL to
> MySQL? It can't be work worth 18 000 euro, now can it?

I'm not sure I'm understaidning your question completely, but I'll try to
answer it anyway.

1) Porting from MS SQL to MySQL is not the easiest of things. There are
several options, and they include connecting to both via MS Access and
ODBC, or using the Sybase perl drivers with the FreeTDS library to get it
to talk to MS SQL. They you can write a program in perl to port the data.

2) If you want MySQL or PostgreSQL, why exactly would you want to install
MSSQL, and then "upgrade" later? Both PostgreSQL and MySQL will run on
Windows (although again, I'm not sure why you would want to do that if you
have a choice).

All in all, it can be complicated and time consuming. You don't want to go
there. Just get MySQL or PostgreSQL in the first place, and save yourself
the trouble of porting the data later.

If you are about to implement a system from scratch, it's is always
easiest and cheapest to do it right in the first place, with the software
you want.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MS Access to MySQL convert

2001-12-27 Thread Gordan Bobic

On Thursday 27 Dec 2001 23:31, Billy Reed wrote:
> I am new to MySQL and am trying to convert :
>
> SELECT SUM(DistKm) AS Distance FROM (SELECT TOP 5 DistKm FROM XCLeague
> WHERE LoginID='billyreed' AND HGPG='PG' ORDER BY DistKm DESC)
>
> Can anyone suggest how this is done in MySQL. It doesnt seem to like the
> TOP 5 syntax.

I am not sure sub-selects are actually supported in MySQL, but in the SQL 
dialect of MySQL IF sub-selects were supported, it would be something like:

SELECT SUM(DistKm) AS Distance FROM (SELECT DistKm FROM XCLeague
WHERE LoginID='billyreed' AND HGPG='PG' ORDER BY DistKm DESC LIMIT 5).

Without sub-selects, you will probably want something like:

CREATE TEMPORARY TABLE IF NOT EXISTS TempTable Type = Heap
(
DistKm  integer unsigned
)
SELECT  DistKm
FROMXCLeague
WHERE   LoginID = 'billyreed' AND
HGPG= 'PG'
ORDER BYDistKm DESC
LIMIT   5;

SELECT  Sum(DistKm)
AS  Distance
FROMTempTable;

DROP TABLE TempTable;

Make sure the CREATE TABLE above returns '1' before proceeding with the rest, 
as there could be race-condition where the new table gets created by another 
user before the old TempTable gets dropped below. If CREATE fails, try until 
it works, or it times out... It could be that TEMPORARY tables only exist in 
the scope of current connection (they do on PostgreSQL, I'm not sure about 
MySQL as I never used this feature), but then you could potentially run into 
problems with persistent connections in Perl and/or PHP, so you probably want 
to drop it just to make sure.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

SMP, Parallel queries, MERGE, RAID

2001-12-26 Thread Gordan Bobic


Hi.

Just wondering if this is something that has been planned for MySQL...

If MERGE table are used, from the documentation, my understanding is that 
they are searched in sequence when a SELECT is issued. However, would it not 
be reasonably easy to implement a parallel search algorythm, so that the 
SELECT on each of the MERGE tables is done simultaneously, and then UNIONed 
together before it is returned?

I understand that there is the fork penalty for this, but for big MERGE 
tables on an SMP system, this could improve the performance nearly linearly 
with the number of CPUs in the system. It could be limited to, say 
"thread_concurrency" SELECTS in parallel or something like that, in order to 
minimize the fork penalty to a useful level.

Is this something that is planned? It just strikes me as a reasonably obvious 
place to implement parallel query execution for UNION SELECTs.

This could then (I can dream, right?) be extended to performing UNION SELECTs 
across the whole replicated cluster (execute the SELECT on the "latest" MERGE 
table on the master/"insert/update host" to insure most up-to-date data is 
returned).

That should REALLY start giving Oracle some competition in the "run a query 
across a cluster" feature set. :-)

Am I missing something, and is there something in the current MySQL design 
that makes this difficult/implausible? It doesn't appear to be in the TODO 
list (but I might be looking at slightly out-of-date documentation)...

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL 4.0.1 released

2001-12-25 Thread Gordan Bobic

On Monday 24 Dec 2001 06:37, Michael Widenius wrote:

> - A Query cache to cache results for SELECT statements. This should
>   give a major speed improvement for queries on read-intensive tables.
>   More information can be found at:
>   http://www.mysql.com/doc/Q/u/Query_Cache.html

This is a rather useful feature, but I am not sure it is behaving quite the 
way it is supposed to. The cache "expiration" seems to be triggered by all 
insert statements, including the ones that fail, e.g. due to unique index 
violations. Would it be possible to have the cache expire only if the data is 
actually changed?

In my particular application, I have a lot of INSERT DELAYED queries running 
in the background, and I rely on unique indices to keep the data, well, 
unique. However, every time such a query is executed, indluding when all the 
insertion attempts fail due to unique key violation, the cache seems to 
"expire", and the search is performaed again "from scratch".

Is this something that can easily be "fixed" for the next release?

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: linux and hdparm

2001-12-22 Thread Gordan Bobic

On Saturday 22 Dec 2001 19:42, Gary Huntress wrote:
> I've just checked my IDE parameters for one of my linux mysql servers.
> Using hdparm -c -d I've discovered that I'm running at the default 16 bit
> non DMA.   I'm assuming that enabling both of these should increase
> performance, possibly significantly.
>
> However, I'm concerned about doing this 1)  on the fly with the server up
> and 2) doing it at all.  Can anyone comment on linux hdparm performance
> tuning for mysql and the possibility of table corruption?

I have found that using hdparm will either work, or it will hang the disk/bus 
immediately. Therefore, you should NEVER do it to a live, running server with 
disk I/O constantly occuring. Shut down server deamons, such as MySQL before 
you use hdparm or anything else to tune disks.

As for performance, I have often seen performance improve by 2-10 fold when 
proper hdparm parameters were used to tune up the disks.

So yes, it is very much worth it, but no, do not do it on a running server - 
reserve a few minutes of downtime before you start tuning things, as you will 
inevitably eventually push things to far and it will crash. Also never set 
the -k1 -K1 parameters before you've insured that the settings will work, as 
sometimes an automatic bus re-set will prevent a thorough crash.

I don't know what distribution you are running, but on RedHat, there is a 
file /etc/sysconfig/harddisks where you can specify the parameters when you 
are done with tuning, so they get re-applied every time the machine boots up. 
Bear this in mind when doing disk-swaps.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Performance

2001-12-13 Thread Gordan Bobic

On Tuesday 13 Nov 2001 17:05, René Fournier wrote:

> And concerning database connections, my ISP asks that I always close a
> MySQL connection with a mysql_close()--which I can understand.

They probably use persistent database connections for Apache. If that is the 
case, if you keep opening connections without closing them, eventually the 
system will run out of available connections. Similar things happen with 
mod_perl. You ALWAYS close your database connections.

> But I'm
> curious, if I have to make, say, 10 SELECTs throughout a page (in the
> header, body, and footer), is it perhaps faster to use the mysql_close()
> at the very end of the page (in the bottom of the footer.inc)? In other
> words, is there any disadvantage performance-wise (or stability reason)
> to open and immediately thereafter close mysql connections?

Yes, there is a performance hit. Open the database connection at the 
beginning of your page, and then close it at the end. It takes time to set up 
a database connection, albeit this is somewhat alleviated by using persistent 
connections. But you STILL have to do some work to access a connection from 
the connection pool.

> Can anyone recommend an article or two on coding techniques for
> improving PHP and MySQL performance? I don't wan to spend the rest of my
> life optimizing my php code, but if I knew a few basic rules about its
> performance characteristics, I might be able to make better decisions
> when coding. Thanks.

Using persistent connections, is a good start. Opening and closing database 
connections as rarely as possible is going to help, too, as long as you make 
sure all your open connections get closed when the script terminates.

Everything else is likely to be optimizations of PHP code or SQL code, 
separately.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Browser caching problem OR mysql/perl code?

2001-12-12 Thread Gordan Bobic

On Thursday 13 Dec 2001 03:42, Jack A. Fobel wrote:
> Hello,
>
> Coming from an ASP world, I'm trying to learn how to break the browser from
> caching. I am using variables and their values in the querystring to 
> update a mysql database with perl. I am also re-querying the database to
> pull back the record. I was hoping to pull back the updated record but it
> appears to either cache the info in the browser and/or mysql is caching the
> query; dunno.

I use perl/mysql quite extensively, and I have never run into anything like 
the second possibility you are describing. AFAIK, the only way that could 
happen is if you use INSERT DELAYED queries, and they don't finish running 
before your next refresh happens.

There are, however, bugs in IE that make it not requery, and when it doesn, 
it doesn't do it properly, thus possibly resulting in stale data. IIRC, squid 
has a workaround kludge for this behaviour.

> I have tried a few things, they are 1) turn off cache in IE, 2) pass a
> datestamp through the querystring, 3) HTML cache buster code. Being none of
> these worked, I am resorting to believe it is the queries I am using in
> perl.

Try sending back the last modified time in the headers that is always the 
current timestamp. You may also want to play with the "Pragma:" http header.

If you use the standard CGI modules in perl, you shouldn't really be having 
any of those problems, though. Try a few different browsers (including 
versions) - that should help you pinpoint the problem.

> Any ideas how I can write my queries in a select so that it is sure to
> pick up any recently changed data from the database that might have just
> been written?

It sounds like you have a browser problem.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: [OT] Undescriptive subjects.

2001-12-10 Thread Gordan Bobic

On Monday 10 Dec 2001 14:08, Etienne Marcotte wrote:
> PLEASE
>
> can everybody use a descriptive subject?

I've got an additional idea - how about configuring the mailing list remailer 
to put something like [MySQL-General] or [MySQL-Internals] in the subject 
line? For some of us who are on multiple lists, it gets really hard to work 
out what is what without reading the whole lot.

Cheers.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Optimizing date queries

2001-12-05 Thread Gordan Bobic

On Wednesday 05 Dec 2001 09:30, Emil Rasmussen wrote:
> > CREADE INDEX myDateField_Index
> > ON tblTabel
> > (
> > DATE_FORMAT(myDateField,'%Y%m%d')
> > );
> >
> > This may or may not work, but it's worth a shot, if you prefer your query
>
> in
>
> > the format it was in before.
>
> It did not work with MySQL 3.23.32, it whould have been very nice, because
> i still have a problem with hours, eg.:
>
> SELECT count(id)
> FROM tblTable
> WHERE myField != '' AND
> (
> myDateField >= '23' AND
> myDateField < '235959'
> );
>
> Wich offcourse does not work, so again i am looking for some help?

Don't you mean

myDateField >= '2001120523' AND
myDateField < '20011205235959'
?

You have to include the full format to make it work, i.e. including the 
MMDD section of the MMDDHHMMSS format. The above query will probably 
yield no results because you have no data in the given range (year ).

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Optimizing date queries

2001-12-04 Thread Gordan Bobic


> > Does anyone have some optimization tips for this query:
> >
> > SELECT count(id) FROM tblTabel WHERE fieldName != '' AND
> > DATE_FORMAT(myDateField,'%Y%m%d') = 20011120;
> >
> > It does not use the index i have created, so it scans the entire table,
> > which is a bit to slow.
> >
> > The problem off course is, that MySQL has to do a calculation on the
> > myDateField before it can decide if the row matches. Any "workarounds"
> > for this?
>
> How about:
>
> SELECTcount(id)
> FROM  tblTabel
> WHERE fieldName != '' AND
>   (
>   myDateField >= 2001112000   AND
>   myDateField < 2001113000
>   );
>
> That will use indices, and yield the same results. It will also consume
> less CPU time because you don't have to play with string parsing for the
> date formatting.

Just thought of something else. I don't know if MySQL can do this, but here 
is something that PostgreSQL lets you do. I haven't tried it on MySQL, but 
here it is anyway.

Your query would use an index if the index was created on:

DATE_FORMAT(myDateField,'%Y%m%d')

as stated in your query.

So, you would want to do something like:

CREADE INDEX myDateField_Index
ON tblTabel
(
DATE_FORMAT(myDateField,'%Y%m%d')
);

This may or may not work, but it's worth a shot, if you prefer your query in 
the format it was in before.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Optimizing date queries

2001-12-04 Thread Gordan Bobic

On Tuesday 04 Dec 2001 14:58, Emil Rasmussen wrote:
> > SELECT count(id)
> > FROM tblTabel
> > WHERE fieldName != '' AND
> > (
> > myDateField >= 2001112000 AND
> > myDateField < 2001113000
> > );
>
> Thank you, that was defiantly part of the solution!
>
> I had created an index on both fieldName og myDateField, but MySQL did not
> use that. But creating an index only on myDateField did the trick.

If you don't search on the first field in the index, the index doesn't get 
used. In a majority of cases, what you want is a separate index for each 
field. There are exceptions, of course, for example when you want to create 
unique constraints on a table...

> Thanks for the help!

Any time.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Optimizing date queries

2001-12-04 Thread Gordan Bobic

On Tuesday 04 Dec 2001 13:49, Emil Rasmussen wrote:
> Hey
>
> Does anyone have some optimization tips for this query:
>
> SELECT count(id) FROM tblTabel WHERE fieldName != '' AND
> DATE_FORMAT(myDateField,'%Y%m%d') = 20011120;
>
> It does not use the index i have created, so it scans the entire table,
> which is a bit to slow.
>
> The problem off course is, that MySQL has to do a calculation on the
> myDateField before it can decide if the row matches. Any "workarounds" for
> this?

How about:

SELECT  count(id)
FROMtblTabel
WHERE   fieldName != '' AND
(
myDateField >= 2001112000   AND
myDateField < 2001113000
);

That will use indices, and yield the same results. It will also consume less 
CPU time because you don't have to play with string parsing for the date 
formatting.

HTH.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL on SMP machine

2001-12-01 Thread Gordan Bobic

On Saturday 01 Dec 2001 21:29, Ady Wicaksono wrote:

> How to get an advantage from Dual ProcessorMachine when i'm using MySQL,
> should i recompile MySQL what option should be activated ?

You need to run more than one query at a time, for a start.

I am not sure about how good SMP support for various platforms is. It 
definitely works on Linux (that's what I'm using).

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL v.s. Oracle

2001-12-01 Thread Gordan Bobic


> We are currently using MySQL for our database driven website. Currently,
> we have about 100,000 users. In January, we will be getting 2 million
> new registered users on our website.

Yes, but how many hits are you expecting, and what sort of queries will be 
ran?

> We're buying a $50,000 Sun box to run the database server on. We're
> deciding whether we should switch to Oracle. Can MySQL handle this kind
> of load? The president (who doesn't know much about databases) was
> thinking about buying Oracle, but from what I've heard, Oracle is
> actually slower than MySQL since it needs to check FOREIGN KEYs,
> TRIGGERs, ASSERTIONs, etc.

AFAIK Oracle is slower, even if you don't use foreign keys, triggers, etc.

> Can someone provide some advice? Thanks. My main concern is whether that
> massive scale (2 million registered users, along with all the data and
> CPU load they generate) is supported by MySQL.

You may find that the problem is not in the database - it may be that your 
hardware will be insufficient. With this sort of load, and a 50K budged for 
hardware, you may be better off getting a bunch of cheap intel/amd based 
servers (say, 10-20 of them with that sort of budget) and setting up 
replication for a server farm. You could also gain reliability in this way. 
With a lot of simultaneous read-write database access, you may also want to 
consider PostgreSQL (something about row-level instead of table level locking 
on inserts). It's replication, however, doesn't appear to be as mature as 
that of MySQL, and if your queries are likely to be mostly SELECTs, then 
MySQL is probably the way forward.

You may also want to take a look at the benchmark comparison charts between 
MySQL and PostgreSQL, Oracle, and other databases.

Having said that, what server do you have at the moment? And what is the 
server load like? If you are expecting a 20 fold increase in users, and a 
linear increase in server load, you can work out roughly what sort of 
hardware you will need.

Another thing you may want to look at is the "What we used MySQL for" section 
in the MySQL manual for an indication of what MySQL can deal with.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Comparing text documents

2001-12-01 Thread Gordan Bobic

On Saturday 01 Dec 2001 10:41, Ovanes Manucharyan wrote:
> I have lots of text documents archived in a mysql database. I need
> to compare these documents to see which ones are similar (either they
> may have a small formatting difference or have a word or two different).
> Is there a function/method in mysql that will easily accomplish this?
> If not could you suggest web sites/literature on this topic. (Lexical
> analyis?)

IMHO, this is not really something that should be handled by the database 
itself, as it is too complicated. It would, however, probably be most 
efficiently dealt with at the database end.

However, I have been running into the same issue myself, and here is 
something  have come up with.

Strip down the documents you want to compare at insert time. Make them all 
lower case, compress white space, erase all non-alpha characters 
(punctuation, etc). Split the document into two-word groups, with a sliding 
2-word window. That should give you n-1 elements, where n = number of words 
in the document.

In Perl you could handle this efficiently using a hash. Increment the value 
of a hash for each element by one, for each element in the array you've split 
the document into.

For example.

Document = "This is a test message."
StrippedDocument = "this is a test message"
Array = ("this is", "is a", "a test", "test message")
Hash = ("this is" => 1, "is a" => 1, "a test" => 1, "test message" => 1)

Save the contents of this has to a separate field for quick and easy 
retrieval (not sure what the best way to do this is yet, still working on it).

Then, at each insert, compare the new document with all the others. If you 
can constrain the cross-check test to begin with (e.g. by author, source, 
etc), that will speed things up massively.

Then just compare the hashes in some way, for example, by checking whether 
the number of two-word construct counts exceeds some similarity threshold 
that you can define yourself.

You could then store this similarity cross-list in another table.

It may be an idea to implement it as a stored procedure, but AFAIK, MySQL 
can't do that (please correct me if I am wrong). If you combine this with 
triggers, you could intercept INSERT and UPDATE queries on the relevant 
tablea, to make sure that things are always kept in sync.

I am not sure how this could be handled in MySQL, but in PostgreSQL you could 
use PL/Perl stored procedures and set up triggers. That is how the PostgreSQL 
Full Text Index contrib used to work. It is likely, though, that this will be 
very slow, especially on a big table...

You could probably handle some of this at the application level, too, but in 
any case, the INSERT time would go through the roof because you would have to 
compare every new record against every existing record (again, constraints in 
this comparison would be a very beneficial thing).

This is not a prticularly scientiffic approach - it is a 
not-too-well-thought-out pragmatic one, but I only ran into the same problem 
myself yesterday...

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL and RAID question

2001-11-30 Thread Gordan Bobic

On Friday 30 Nov 2001 16:33, Jason Burfield wrote:

First of all, the idea of data loss because one disk has gone bad on RAID-5 
is not possible. RAID-5 specifically allows you to loose up to 1 disk at any 
one time, and stay active.

The reason why RAID-5 is not recommended for MySQL is speed. RAID-5 requires 
parity bits to be calculated and spread across all your disks for every write 
you do. For every read, this all has to be decoded. This means that the file 
system is slower, which is not very helpful for a database.

I am running SCSI RAID-5 (Linux - software only), and I have had no problems. 
I have lost a disk from a stripe before, and no data was ever corrupted. You 
just put in a new disk and rebuild the data on it. It's as simple as that. 

RAID tends not to mess up the performance too much is you are running SMP (if 
you've got multiple CPUs, it is easier to find time on them to do the RAID-5 
checksumming) and you have lots of RAM to keep things nicely cached.

Regard.

Gordan

> I currently run a system with Raid 5. I have had one disk drop, however,
> no data was lost. That is one of the beauties of RAID 5, you have a
> spare disk that takes over.
>
> I too had heard originally that running MySQL on a RAID 5 was a bad idea
> (this, after I already had it running...), however, I have not had any
> problems at all.

> On Thu, 2001-11-29 at 17:34, Lionlike MySQL Email List wrote:
> > Just one quick question...  I've heard that running a database in a
> > system using RAID isn't a good idea on RAID 5, but works fine with
> > mirroring (0, 0/1, 0/5).  Does anyone out there have experience with
> > MySQL on RAID 5, or know how MySQL performs in a RAID 5 environment?  My
> > concern is loss of data due to a drive going bad.  At least that what
> > I've heard could happen. Thanks,

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: mySql vs Interbase

2001-11-29 Thread Gordan Bobic

On Thursday 29 Nov 2001 16:18, Fabien R. wrote:

> > I saw somewhere that this comparaison existed but I can't remember
> >  where.
> > Does it talk to anyone ?

I haven't seen one between InterBase and MySQL, but I do remember seeing one 
between PostgreSQL and InterBase. Allegedly, they thought that InterBase was 
faster in the single-user scenario.

However, in my personal comparative test as of a few weeks ago, I have seen 
MySQL to be about twice as fast as PostgreSQL, and about 5-10 times as fast 
as InterBase.

Unless you have a particular need for InterBase, I'd stay away from it. It 
doesn't particularly exceed the features of MySQL or PostgreSQL, and it is 
MUCH slower... It could be a performance issue due to Perl DBI, but I doubt 
it - just about all the CPU time was consumed by the database back ends. 
Additionally, InterBase, according even to their own documentation, doesn't 
scale very well with SMP. Since it didn't measure up even in the single-user 
scenario, I never got as far as testing it in the SMP type environment.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL v4 FTS speed

2001-11-28 Thread Gordan Bobic


On Wednesday 28 Nov 2001 10:37, Sergei Golubchik wrote:

> Just added:
>
> % ./ft_dump
>
> Use: ./ft_dump [-dscvh]  
>
> -d  dump index (incl. data offsets and word weights)
> -s  report global stats
> -c  calculate per-word stats (counts and global weights)
> -v  be verbose
> -h  this text

Ah, it was the -h!

> > > For now - there's one bit changed - and one has to rebuild the table.
> > > The easiest way is 'ALTER TABLE ... TYPE=MYISAM', though dump+restore
> > > will work too, of course.
> >
> > I'm not sure I follow what you mean. All my tables are already of type
> > MyISAM (that's all I compiled MySQL with!) Are you saying that I should
> > install the new binaries "on top", restart the server, and just do the
> > ALTER TABLE as above on each table?
>
> Yes. OPTIMIZE or REPAIR TABLE is too weak, as these commands preserve
> MYI header. ALTER TABLE will recreate the table from scratch.

But the new server will still be able to read the old format tables? 
Otherwise it couldn't rebuild them, right?

> > > Still, I'd like to make file formats fully compatible
>
> Unfortunately it's hardly possible.
> Sorting order is changed, so index has to be rebuilt.

Will I have to do this manually, of will ALTER TABLE do all that for me?

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Can't make after ./configure

2001-11-27 Thread Gordan Bobic


On Tuesday 27 Nov 2001 23:33, you wrote:

> gcc version 3.0.2

It would appear that you tried compiling using the GCC 3.0.2 compiler. Try 
downgrading to 2.95.3 (the last stable release) and try again.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL v4 FTS speed

2001-11-27 Thread Gordan Bobic

On Tuesday 27 Nov 2001 23:28, Sergei Golubchik wrote:

> > Another question - is there a way to acquire a list of words in the FTS
> > index? Someting like
> >
> > SELECT  Word,
> > count(*) AS Frequency
> > FROMFTSIndex
> > GROUP BYWord
> > ORDER BYFREQUENCY ASC
> > LIMIT   100;
>
> There's myisam/ft_dump utility that can dump fulltext index ot of MYI file.

Is there any documentation on how to use it and what the parameters mean? The 
help it gives isn't all that descriptive... :-(

> > BTW, will the file formats be compatible? Or will it require a dump +
> > restore of the database, when going from 4.0.0 to 4.0.1?
>
> For now - there's one bit changed - and one has to rebuild the table.
> The easiest way is 'ALTER TABLE ... TYPE=MYISAM', though dump+restore
> will work too, of course.

I'm not sure I follow what you mean. All my tables are already of type MyISAM 
(that's all I compiled MySQL with!) Are you saying that I should install the 
new binaries "on top", restart the server, and just do the ALTER TABLE as 
above on each table?

> Still, I'd like to make file formats fully compatible - so,
> you'd better take a look at the ChangeLog section of the
> manual included in the 4.0.1 distribution.

Will do. When I actually get my hands on the 4.0.1 tar ball.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL v4 FTS speed

2001-11-27 Thread Gordan Bobic

On Tuesday 27 Nov 2001 20:17, Mike Wexler wrote:
> > The query
> >
> >   SELECT * FROM table
> >   WHERE MATCH (description) AGAINST ("+A +really +nice +toy" IN BOOLEAN
> > MODE) ORDER BY MATCH (description) AGAINST ("+A +really +nice +toy" IN
> > BOOLEAN MODE) LIMIT 10
> >
> > will abort the search when it retrieves the 500th document with all the
> > three words. It's impossible to say, how many index entries it will look
> > at.

I take it the "IN BOOLEAN MODE" part of the AGAINST() is going to be new to 
4.0.1.

Incidentally, how are the WHERE clauses handled when MATCH/AGAINST is used 
for FTS? Given that I am seeing a fairly linear increase in query time with 
the increase in number of matched terms, I would guess that the FTS is 
performed first. Especially since limiting other constraints in the WHERE 
clause produces no noticeable reduction in query time. This seems to be 
wasteful.

Considering that FTS is likely the slowest part of the query, it would 
probably be beneficial in terms of performance to have it execute last, with 
all other "simpler" constraints being satisfied first, so fewer records need 
to be searched.

Another question - is there a way to acquire a list of words in the FTS 
index? Someting like

SELECT  Word,
count(*) AS Frequency
FROMFTSIndex
GROUP BYWord
ORDER BYFREQUENCY ASC
LIMIT   100;

This would allow for easier overview of what "dead" words are being indexed, 
and therefore allow for easier isolation of new "stop words", and reduction 
in unnecessary searching that FTS would have to perform, thus increasing 
performance. Considering that I'm really after SELECT speed, would more 
careful tuning of stop words be likeky to yield signifficant performance 
improvements?

It would also be REALLY nice to have a "dynamic" list of stop words. I know 
you said that this is definitely planned, but it would be nice to know how 
soon...

Another thing - it would probably be useful to gather some statistics about 
FTS queries performed. For example, say the terms 'perl' or 'linux' appear 
1 times in the queries (since the last checkpoint interval, or whatever). 
But the words 'nt' and 'vb' appear 0 times. It might be beneficial to tune 
the FTS in a slightly dynamic way, so that the frequently searched terms have 
a sort of "keep-alive", while the words that are never searched for (but are 
indexed) get added to the stop-word list.

OTOH, if a stop word does end up being searched for a number of times that 
exceeds some threshold (say, 0.1%), it could be removed from the stop word 
list. The FTS index would then need to be updated, but if the database 
structure is dynamic (for example a search engine of some sort) where the 
records get deleted and inserted all the time, the database would be able to 
"learn" what are "start words" and what are "stop words".

These statistics could be analyzed and acted upon when, say, ANALYZE is 
issued. It would make the FTS "self-tuning". Of course, the FTS feature in 
the TODO that allows specifiying explicitly strings that get indexed "as they 
are" becomes much more useful, then, as it can prevent things from being
automatically added to the "stop word" list during ANALYZE/OPTIMIZE stage.

Has any of this been at least thought about? I've just checked the TODO, and 
it doesn't appear to be there...

Looking forward to 4.0.1.

BTW, will the file formats be compatible? Or will it require a dump + restore 
of the database, when going from 4.0.0 to 4.0.1?

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySQL - Got Signal 11 - URGENT-RedHat-gcc2.96

2001-11-26 Thread Gordan Bobic

On Tuesday 27 Nov 2001 04:11, Ady Wicaksono wrote:
> gcc 2.96 has bugs that affect MySQL, what bug ?

A number of bugs, and they don't just affect MySQL - there is a number of 
packages that break when compiled using GCC-2.96. Some examples that I 
presonally ran into at some point are:

mPlayer
InterBase DBD driver

There are more, I just can't remember right now. You will find that the 
kernel, for example, is compiled using kgcc under RedHat v7.X. kgcc is an 
older version of gcc (egcs, actually) that actually works. Try kgcc --version.

> RedHat 7.1 and RedHat 7.2 comes with gcc 2.96

And 7.0. 7.2 Also ships with GCC-3.0, which is actually a production release 
compiler, unlike the 2.96 version, which was a pre-release development 
snapshot. RedHat have been flamed repeatedly since the release of RH7 over 
this.

> so even we download the source code and recompile it 
> the bug will hit MySQL, is that right ?

Sometimes. YMMV.

Just download GCC-2.95.2.1 and the PGCC patches, and compile it yourself. 
Install it under /usr/local/pgcc, and prepend /usr/local/pgcc/bin to your 
path, and prepend /usr/local/pgcc/lib/ to your /etc/ld.so.conf 
file. That works great. It also has the added bonus that there are some extra 
optimizations available for the i586 and the i686 CPUs. You can also go silly 
on the optimizer and push it as far as -O6, and actually see a difference 
from -O3. DO NOT use special pentium optimizations if you are running on an 
AMD, as PGCC will produce pentium optimized code that won't run on non-intel 
CPUs. Hopefully we should all start using GCC v3 soon...

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: opinion - voating

2001-11-26 Thread Gordan Bobic

On Monday 26 Nov 2001 20:00, Neil Zanella wrote:
> Now there is one thing that we must be aware of. There are things that
> mysql does not support (yet). These include foreign keys, views,
> subselects, triggers, and procedural SQL, and I can't remember
> if mySQL supports transaction processing either, perhaps someone
> can confirm. If your database is not complex or if you do not need
> subselects then go with mysql. 

And if you need subselects, transparently unlimited table sizes or procedural 
SQL and other languages at the database level, triggers, foreign keys and 
views you will probably be better of with PostgreSQL than with Oracle.

The only feature that comes to mind that Oracle has and neither MySQL nor 
PostgreSQL have is the ability to set up a transparent server cluster with 
distributed data and have multiple servers execute (some) queries in parallel 
transparently, thus providing a bit of added scaleability.

In practice, though, you can get a nearly same effect with PostgreSQL and 
MySQL if you set up replication, and have your application round-robin the 
database servers.

As someone said earlier in this thread - if you don't _know_ you need Oracle, 
then you don't need Oracle.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

MySQL v4 FTS speed (Was: MySQL 4.0 fulltext search truncation bug?)

2001-11-26 Thread Gordan Bobic


> Second - probably it's a bug. I cannot say much as the code used by this
> query was removed from MySQL source tree. It was replaced by completely
> new boolean search engine written from scratch. Unfortunately it was
> done several days _after_ 4.0.0 release.
>
> Wait for 4.0.1 - it should be out in a few days.

Question - how much (if any) performance improvement is to be expected from 
the new boolean search engine for FTS in 4.0.1?

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: MySql on redhat 6.x and above

2001-11-23 Thread Gordan Bobic

On Friday 23 Nov 2001 11:27, you wrote:
> Hi,
> I have some doubts regarding MySql on Linux
> environment.
> I'd like to install MySql on linux -redhat 6.x and
> above.
> Could you please guide me how to proceed.
> I did not find a specific document on redhat 6.x and
> above in the documentations provided in the site.

Try with the RPMs. You may want to recompile it yourself to get the 
optimizations right for your application, though.

I'm running it on RH7.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Query Execution Time Perl/DBD

2001-11-21 Thread Gordan Bobic

On Wednesday 21 Nov 2001 12:11, Christian Andersson wrote:
> The obvious way to do this would be to get the time before and after the
> execution and then make a simple diff (after - before)  I guess that is how
> the mysql client does it (butI'm not sure)
> I do not think that mysql itself provides this information..

I was kind of hoping that the DBD object would have a special property where 
this can be read.

Thanks.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Query Execution Time Perl/DBD

2001-11-21 Thread Gordan Bobic


Hi.

How do I get the query execution time from Perl/DBD? MySQL client returns 
this in the console window. Is this value available through the Perl DBD 
interface?

Thanks.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Full Text Search - indexing 1-character words

2001-11-16 Thread Gordan Bobic


Hi.

How can I make FTS index single-character words? I have tried changing the 
ft_* values in my.cnf and rebuilding the tables/indices, but I still seem to 
be unable to get indexing of single-character words to work.

For example, if I need to index occurences of c++, it doesn't work. I 
understand that this gets reduced to "c", and that is not a problem. However, 
just "c" doesn't get indexed which IS a problem.

Is there a solution for this? I don't recompiling the source if required.

I'm using MySQL 4.0 Alpha.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Fulltext search - How to index "c++"

2001-11-12 Thread Gordan Bobic


Hi.

What do I need to do to make FTS index things like c++? I have tried setting

ft_min_word_len=0
and
ft_min_word_len=1

in /etc/my.cnf.

Restarted the mysqld and did a RENAME on the table followed by a new CREATE 
TABLE and an INSERT ... SELECT to import the data back into the correctly 
named table, but I still cannot search for "c", any other 1 character term or 
"c++" in my data set. I can search for two-character terms fine, though...

Is this not possible, or have I missed a setting somewhere?

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Default forced on MySQL

2001-11-10 Thread Gordan Bobic

On Saturday 10 Nov 2001 21:00, Franklin Schmidt wrote:

> The worst thing any development tool can do is silent unwanted favors.
> Java and C don't do unwanted favors which is why they are great.
> Javascript and most Microsoft tools do lots of unwanted favors which is
> why I hate them.  This is a personal preference but I suspect most
> programmers with experience in large projects share this preference.
> MySQL does many unwanted favors to avoid rollback.  This was a bad
> design decision because update speed usually doesn't matter much.  What
> matters is the speed of reading and reliability of the data.

That is very arguable. IMHO, the design decision was the correct one because 
the primary design goal of MySQL is speed. There are lots of databases out 
there. Use the one that agrees with you. If you want referential integrity 
checks, triggers, detailed data checking, defaults, stored procedures, etc, 
you should probably be using a different database.

Personally, I use PostgreSQL a lot of the time, when I need the features 
listed above. When the primary requirement of my application is speed, I use 
MySQL, because it is AT LEAST twice as fast as PostgreSQL on most things, 
sometimes much faster.

You cannot have it all. Now, can we please stop the criticism thread, and 
stick to doing the sensible things, i.e. using the right tool for the job?

> My point about unique indexes is that I assume that this forces MySQL to
> deal with the rollback problem anyway.

Any reduction in the frequency of the rollback problem is beneficial if 
performance is the primary requirement.

> If you have a multi-row insert
> and one of these rows violates a unique index, what does MySQL do?

Besides, uniqueness checking on an index for the insert is likely a lot 
faster, as typically it only needs checking on one of two fields per table, 
as opposed to EVERY field in the table. There is also less data transferred 
and overheads than doing a SELECT first to see if there's a records already 
there.

> You
> either do a rollback or lose atomicity.  This looks like exactly the
> same problem that you would have implementing NO DEFAULT or any other
> column value sanity check in the database.

But it unarguably reduces the frequency of the problem, and thus the speed of 
operation in most cases.

Just MHO...

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

How to improve FTI speed?

2001-11-09 Thread Gordan Bobic


Hi.

I've just set up my prototype so I can benchmark comparatively tthe profmance 
between PostgreSQL and MySQL.

There's good news, and there's bad news.

The good news is - MySQL is faster by about 2 times.
The bad news is - MySQL is ONLY faster by about 2 times.

Now, I DO understand that this is a LOT, but I'd really like to make this go 
faster.

Here is the setup I have used:

Number of Master records: 49,508
Number of FTI entries (PostgreSQL only): 1,574,136

Optimize/Vacuum ran just before the test:
PostgreSQL: VACUUM VERBOSE ANALYZE;
MySQL:  OPTIMIZE TABLE Master

The data set used on the prototype benchmark is a random set of old job 
adverts that I had lying around from a previous project.

On PostgreSQL a custom subrouting was written to handle the FTI text 
splitting.

Query used on PostgreSQL:

SELECT DISTINCT Jobs.*
FROMJobs,
JobsFTI
WHERE   Retrieved   >   '08-Nov-2000'   AND
Retrieved   <   '10-Nov-2000AND
Jobs.ID =   JobsFTI.ID  AND
(
JobsFTI.String = 'perl' OR
JobsFTI.String = 'linux'OR
JobsFTI.String = 'apache'   OR
JobsFTI.String = 'html' OR
JobsFTI.String = 'cgi'  OR
JobsFTI.String = 'mod_perl' OR
JobsFTI.String = 'modperl'  OR
JobsFTI.String = 'postgres' OR
JobsFTI.String = 'nt'   OR
JobsFTI.String = 'postgresql'
);

Query used on MySQL:

SELECT  Jobs.*
FROMJobs
WHERE   Retrieved > '20001108'  AND
Retrieved < '20001110   AND
MATCH   (Title, Description)
AGAINST ('perl linux apache html cgi mod_perl modperl postgres postgresql 
nt');

The test machine is a dual P3/1GHz with 1 GB of RAM. In the test, a perl 
program using DBI executed the above queries 100 times. The test was ran 
twice on each database, to avoid any influence of disk access on the 
performance (i.e. it was all running from cache). The second value is 
reported here.

Both PostgreSQL and MySQL were set to use 128 MB of buffer cache (key_buffer 
in MySQL, shared_buffers in PostgreSQL), with a similar amount of sort memory 
allowed.

Results:
PostgreSQL: 119 seconds. (1.19 seconds/query)
MySQL: 58 seconds. (0.58 seconds/query)

I must say I am impressed with the speed increase. In all honesty I expected 
less of a difference.

I just tried reducing the query string to just 'perl linux', and now the time 
taken is 9 seconds for the test (100 runs), compared to 15 seconds for 
PostgreSQL. MySQL sits happily at about 16 MB memory consumption, while 
PostgreSQL consumes about 130 MB. The reason for this is, apparently, because 
PostgreSQL handles it's own caching (in addition to the OS).

The figure of 0.09 seconds/query is very nice for a reasonable sounding case 
of two search parameters. With 35 search parameters MySQL reports 1.94 
seconds search time, which is still impressive for such a large query.

Now for the punch-line question - how do I speed this up further? I need to 
at least maintain this sort of performance on a signifficantly smaller server 
than this benchmark test-bed. Are there any tuning parameters I should be 
aware of? I have already gone through the documentation and applied 
reasonable sounding values to variables. Is there some sort of query 
"profiling" mechanism to try to see where things are taking longest?

I'm using MySQL v4.0-alpha BTW.

I will try recompiling with pgcc at some point with in-line functions, -O9, 
and without frame pointers etc, but it is likely that this system will 
eventually run on an AMD Athlon processor, so it may not work with pgcc. It 
is also almost certain that the database server will be separate from the 
application server, which is likely to slow things down further.

I have just done "strip" on the mysqld. No percievable difference in my 
custom test.
I compiled it all without debug options.

I haven't compiled the server with -static.
I haven't tried starting the deamon with --skip-safemalloc. Is this likely to 
make much difference/cause problems?
How much difference is compiling with pgcc with -O9, in-line functions, and 
without frame pointers likely to make?

Are any of those tweaks deprecated in v4?

Or am I simply pushing against the limits of what is reasonably possible here?

Thanks for any input you can provide.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Re: OT: Default forced on MySQL;

2001-11-09 Thread Gordan Bobic


On Friday 09 Nov 2001 12:08, you wrote:
MySQL
> On Friday 09 Nov 2001 11:37, you wrote:
> > Note on what is going on here...
> >
> > The paragraph below was from a private email I sent to Gordon.  And now
> > Gordon just responded to it on the public list.
> >
> > Gordon, I was strongly hinting to you in private that it was time to stop
> > now.  Can you please stop??
>
> My apologies for my rude behaviour. It was not my deliberate intention to
> mis-address the reply. :-(
>
> I agree with what you say here - let's stop this thread.
>
> Regards.
>
> Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: OT: Default forced on MySQL;

2001-11-09 Thread Gordan Bobic

On Friday 09 Nov 2001 12:11, DownloadFAST.com wrote:
> >Considering the features you want (IIRC you only mentioned the
> >default values in tables) - why don't you try PostgreSQL instead?
>
> My understanding is it is much slower.

That is what I am hoping for. Or rather, I am hoping that MySQL proves to be 
signifficantly faster, as that is the main reason for the migrtion I am 
testing at the moment.

> But the more important thing that has been learned here, is to find out
> what kind of people are in the community.  Because one can learn a lot
> about the future direction and success based on the people who are
> involved.

Indeed. All the MySQL developers who bothered responding to this thread 
completely stayed out of the somewhat heated discussion that we have recently 
been having. IMHO, that indicates a great amount of self-control and patience 
- all of which speak only the best for the MySQL development effort.

> And what I learned is that some (I don't know what % yet) are very
> dogmatic, closed-minded, intolerant, rude, unprofessional, and down right
> blood thirsty.  Not the kind of people I would want to contribute to or be
> a part of.

I think you may be (although I am not sure what particular responses have 
made you form that opinion) mistaking in your identification of the actual 
developers. As I am quite happy to confess, I am not a developer/contributor 
to the MySQL project - I just use it. I just wasn't happy with the way you 
attacked them, for no (IMHO) particularly good reason. This is not 
specifically related to any particular development effort. So, if you have a 
disagreement with my opinion, it would be a good idea not to make the mistake 
of assuming that is how the actual developers/contributors feel as well. I am 
sure it wasn't necessary for me to point this out, but I don't want my 
argument with you to be responsible for any sort of grudge against any of the 
hard working MySQL developers.

> And I also learned that MySQL has a license which says we can take their
> great work, and basically resell it and use it to make a profit for
> ourselves, as long as we give back any source code changes.

Indeed - a growing number of companies are trying to do this sort of thing 
with other GPLed developments.

> In 3.23 they basically stabilized all the major issues for a typical
> low-end web database.  So really the thing is there for the taking, if one
> wanted to just focus on a small set of goals.  It may not be me, but
> someone else my have already started on that business opportunity.

The problem is - what happnes when/if you start needing extra features that 
are not easy to just patch in? Personally, I would handle such minor lack of 
features in a database at the application level. When it comes to default 
values, I know I could handle that in just about all of my applications by 
fairly minor modifications to my code. In fact, that may well be what I will 
have to do to some extent when porting from PostgreSQL to MySQL. If nothing 
else, it seems like a much easier and quicker fix to the problem (or at least 
it is when your code is tidy) than insisting that a new feature be 
implemented, especially if it can be handled at the application level.

In the end, I think we have come full circle here. The only way to really 
protect your investment is to take over all of the software development, and 
have a definite say in what happens. It just seems a little to excessive to 
fork off the whole project, when a patch to either MySQL or your application 
may well be the far less time consuming option...

> I really did not want to learn the source.

I can certainly relate to that. But the alternatives often come to the rescue 
in that case.

> I have too many other things to
> do.  But if I have to learn the source to protect my investment, then I
> might as well do action and not beg people who I do not like.

That is a reasonable point, even if it may not be the most "efficient" (for 
want of better word) way do handle the problem. A slightly more cooperative 
attitude would have been much more likely to produce better results with less 
effort.

> The best way to erase all this noise and measure who is smarter, is in the
> competitive arena.

Open source arena is both cooperative and competitive - that is why it works.

Good luck with your forking efforts.

Regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: OT: Default forced on MySQL;

2001-11-09 Thread Gordan Bobic

On Friday 09 Nov 2001 03:36, DownloadFAST.com wrote:
> Mr. Bobic,

[snip]

> The only thing that matters is that every time you respond on the list, you
> are only making a fool of yourself in the eyes of the many people who are
> lurking.  They can see right thru your motives in sending this post below.

LOL. So much for "no personal attacks" that you called for. At least you're 
being consistent, I suppose.

As for growing up, the things that one tends to learn when they grow up is 
that life is about being reasonable and about compromise. Taking on the 
entire MySQL (or any established open source development effort) with the "I 
want this feature, and I want it now, because I _know_ it must be simple to 
implement even though I haven't exactly looked at what is involved" attitude 
isn't necessarily a compromising nor reasonable thing to do, in terms of time 
or effort. Considering the features you want (IIRC you only mentioned the 
default values in tables) - why don't you try PostgreSQL instead? I have said 
it before and I'll say it again - use the product that works best for YOUR 
particular application.

There are at least 5 free open source SQL DBMSes (MySQL, PostgreSQL, SAP, 
Interbase, MiniSQL). Before you hopelessly start trying to make one into 
another, you may find that a more rewarding exercise in terms of your time is 
to just use the one that is better geared toward your target application. 
That would at least show that you have studied the problems in your 
application and done the relevant homework. Complacency, resistance to 
learning, just wanting a "quick patch" and the "I'm OK, you are not OK" 
attitude don't do much for the display of how serious you are being (or 
trying to be).

The gauntlet has been thrown by you, and the response was your invitation to 
join the development effort (see Arjen's post on the list). The whole point 
of open source development is that if you don't like what something is like, 
you can just go and do it better. It's the ultimate put-up or shut-up 
competitive environment. You lead, you follow, or you get out of the way. To 
lead, you need followers that believe you are fit to do so. To follow, you 
need to find a leader you approve of and that approves of you. Getting out of 
the way is the easiest, but also the least rewarding action to take.

Humble regards.

Gordan

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

48 matches

Mail list logo