Creating these indexes will only make things worse for normal operation: they 
slow down inserts on the File table (more index maintenance). They do help 
dbcheck, but not everyday use, so you would rather create them just before 
running dbcheck and drop them again afterwards.
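
For example (just a sketch, reusing one of the wiki indexes quoted below; 
adjust the names to whatever you actually create):

  -- just before running dbcheck
  CREATE INDEX idxPJ ON File (PathId, JobId);
  -- ... run dbcheck ...
  -- then drop it again
  DROP INDEX idxPJ ON File;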

For the batch insert queries, all you need are indexes on
Filename.Name and Path.Name; those are the columns we join the batch table on.

Maybe one of these indexes is corrupted and not usable?
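
A quick way to check, without touching any data (plain MySQL statements; the 
index name in the last line is arbitrary, and the key-length prefix is needed 
because Name is a BLOB column in the stock schema):

  SHOW INDEX FROM Filename;
  SHOW INDEX FROM Path;
  CHECK TABLE Filename, Path;
  -- if an index is missing, recreate it, e.g.:
  CREATE INDEX idxFilenameName ON Filename (Name(255));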



On Monday 22 June 2009 21:05:10 Bruno Friedmann wrote:
> Hi Jason, I've been looking through your configuration and the table status
> output, comparing them with some configurations here, but nothing obviously
> "wrong" stands out.
> I'm simply using utf8_general_ci as the default collation for the database
> and tables.
>
>
> I have just two questions: have you already run dbcheck?
> (Be careful: if not, it could take days, especially against your database.)
>
> I have added these indexes (taken from the wiki):
> CREATE INDEX idxPathId ON File (PathId);
> CREATE INDEX idxFilenameId ON File (FilenameId);
> CREATE INDEX idxJPF ON File (JobId, PathId, FilenameId);
> CREATE INDEX idxFJ ON File (FileId,JobId);
> CREATE INDEX idxPJ ON File (PathId,JobId);
> so that dbcheck runs quickly.
> But with the database size you have, building the indexes could take a long
> time...
>
> Apart from that, I can't explain why you are having trouble.
> Perhaps it would be useful (while your backup server is not backing up :-)
> to rerun the batch-insert query with EXPLAIN to see how MySQL would execute
> it (the difficulty, of course, is capturing the batch table).
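>
> Something like this, as a sketch (MySQL 5.0 can only EXPLAIN a SELECT, so
> you would explain the SELECT part of the Filename query, and you need some
> batch table to run it against, even a small hand-filled copy, since the real
> one is a temporary table):
>
>   EXPLAIN SELECT a.Name
>   FROM (SELECT DISTINCT Name FROM batch) AS a
>   WHERE NOT EXISTS
>     (SELECT Name FROM Filename AS f WHERE f.Name = a.Name);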
>
> A call to any MySQL administrators who have an Enterprise support contract:
> if someone has one, they could put this question directly to the MySQL
> engineers.
>
> Jason A. Kates wrote:
> > Bruno,
> > Thanks for taking a look at the batch MySQL issue. We started with InnoDB
> > and moved to MyISAM; MyISAM seemed marginally faster. I have included the
> > output of show table status.
> >
> > Without batch inserts, my troubled backup is no longer troubled: it is
> > working at a decent rate given the number of small files it contains. If a
> > backup of the same size consisted of large files, the transfer rate would
> > be higher.
> >
> > 22-Jun 06:58 backup-server JobId 70468: Sending spooled attrs to the
> > Director. Despooling 2,897,722,241 bytes ...
> > 22-Jun 07:33 backup-server JobId 70468: Bacula qwest-netbackup-dir 3.0.2
> > (18Jun09): 22-Jun-2009 07:33:15
> >   Build OS:               x86_64-unknown-linux-gnu redhat Enterprise
> > release
> >   JobId:                  70468
> >   Job:                    backup-client1.2009-06-22_01.03.05_56
> >   Backup Level:           Full
> >   Client:                 "backup-client1" 3.0.1 (30Apr09)
> > x86_64-unknown-linux-gnu,redhat,Enterprise release
> >   FileSet:                "LINUX_ALL_LOCAL_FS" 2009-05-17 11:11:12
> >   Pool:                   "Daily" (From Job resource)
> >   Catalog:                "MyCatalog" (From Client resource)
> >   Storage:                "Autochanger" (From Job resource)
> >   Scheduled time:         22-Jun-2009 01:03:05
> >   Start time:             22-Jun-2009 01:03:20
> >   End time:               22-Jun-2009 07:33:15
> >   Elapsed time:           6 hours 29 mins 55 secs
> >   Priority:               20
> >   FD Files Written:       9,448,749
> >   SD Files Written:       9,448,749
> >   FD Bytes Written:       242,193,953,548 (242.1 GB)
> >   SD Bytes Written:       243,431,447,949 (243.4 GB)
> >   Rate:                   10352.4 KB/s
> >   Software Compression:   None
> >   VSS:                    no
> >   Encryption:             no
> >   Accurate:               yes
> >   Volume name(s):         CA3833L4|CA3831L4
> >   Volume Session Id:      213
> >   Volume Session Time:    1245547751
> >   Last Volume Bytes:      871,293,198,336 (871.2 GB)
> >   Non-fatal FD errors:    1
> >   SD Errors:              0
> >   FD termination status:  OK
> >   SD termination status:  OK
> >   Termination:            Backup OK -- with warnings
> >
> >                                  -Jason
> >
> > On Mon, 2009-06-22 at 09:02 +0200, Bruno Friedmann wrote:
> >> Hi Jason, the hardware should be more than sufficient.
> >>
> >> Could you post your my.cnf ?
> >>
> >> and tell us what storage engine is used (MyISAM / InnoDB)?
> >>
> >> PS: I'm off to work, so it may take a while before I answer.
> >>
> >> Jason A. Kates wrote:
> >>> In my /etc/my.cnf  I have this set:
> >>> tmpdir = /tmp
> >>>
> >>> /tmp is a memory file system to avoid I/O contention.
> >>>
> >>> When watching, you can see that temporary tables are created in /tmp,
> >>> so /var/tmp should be out of the loop.
> >>>
> >>> The database resides in /export/mysql, which is part of a 6-disk,
> >>> 15K RPM hardware RAID 10 set.
> >>>
> >>> df -Th
> >>> Filesystem    Type    Size  Used Avail Use% Mounted on
> >>> /dev/cciss/c0d0p1
> >>>               ext3    996M  489M  456M  52% /
> >>> /dev/mapper/Volume00-vartest
> >>>               ext3    2.0G  368M  1.5G  20% /var
> >>> /dev/mapper/Volume00-export
> >>>               ext3    186G   47G  130G  27% /export
> >>> /dev/mapper/Volume00-usr
> >>>               ext3    3.9G  1.8G  2.0G  47% /usr
> >>> none         tmpfs    8.0G  8.0K  8.0G   1% /tmp
> >>> /dev/cciss/c1d0p1
> >>>               ext3    943G  316G  579G  36% /var/spool/bacula
> >>>
> >>>
> >>> We are running Redhat ES 5 with Kernel 2.6.18-128.el5.
> >>> If you would like me to test a mount option, I would be glad to try it.
> >>>
> >>>                                  -Jason
> >>>
> >>> On Mon, 2009-06-22 at 07:38 +0200, Bruno Friedmann wrote:
> >>>> Hi Jason, can you tell us whether the location where MySQL keeps its
> >>>> data (typically /var/lib/mysql) is on the same partition as /tmp or
> >>>> /var/tmp?
> >>>>
> >>>> I've found (in another project) that with MySQL MyISAM tables you can
> >>>> kill disk I/O if the two are on the same disk. It's worse when you
> >>>> insert a lot of data from that same disk.
> >>>>
> >>>> In a perfect world, after carefully tuning /etc/my.cnf (the "huge"
> >>>> sample config isn't sufficient in many big setups), you would have one
> >>>> dedicated I/O path for MySQL temporary files, one for indexes, one for
> >>>> tables AND one for the data being imported (a completely separate /tmp,
> >>>> for example).
> >>>>
> >>>> In many installations /tmp and /var are on the same device; add LVM or
> >>>> a RAID 1 configuration on top of that and you get I/O trouble.
> >>>>
> >>>> I won't go through all of the MySQL fine tuning here; there are many
> >>>> settings and each of them should be reviewed and tested.
> >>>>
> >>>> There is also ext3 tuning for those who use it, as many distributions
> >>>> are very conservative (newer kernels, > 2.6.30, reduce the impact).
> >>>> Mount options such as noatime, nodiratime and the data= journalling
> >>>> mode could change your life :-)
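> >>>>
> >>>> As an illustration only (the device and mount point here are just
> >>>> placeholders for wherever the MySQL data lives; noatime/nodiratime are
> >>>> safe, while the data= journalling mode is a durability trade-off to
> >>>> test first):
> >>>>
> >>>>   /dev/sdb1  /var/lib/mysql  ext3  defaults,noatime,nodiratime  1 2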
> >>>>
> >>>>
> >>>> The first time Bacula shipped with batch insert enabled, I tested it
> >>>> and got really poor performance, so I disabled it (LVM on top of
> >>>> software RAID 1). My mistake was not mentioning it on the users/devel
> >>>> list.
> >>>>
> >>>> At some point in the future we plan to migrate every possible
> >>>> application here to PostgreSQL.
> >>>>
> >>>> For debugging and building some sort of test case, it would be nice to
> >>>> have a dummy database dump containing lots of directories and files,
> >>>> so that we can test large queries.
> >>>>
> >>>> Perhaps someone with a very large catalog could produce one by applying
> >>>> MD5() to the filenames and directories?
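> >>>>
> >>>> For instance, on a throwaway copy of the catalog (never on the live
> >>>> one, this is destructive), something like:
> >>>>
> >>>>   UPDATE Filename SET Name = MD5(Name);
> >>>>   UPDATE Path SET Path = MD5(Path);
> >>>>
> >>>> and then dump that copy: the sizes and row counts stay realistic while
> >>>> the names are anonymised.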
> >>>>
> >>>> Jason A. Kates wrote:
> >>>>> Based on following the thread, I recompiled Bacula with the following
> >>>>> options:
> >>>>> ./configure --enable-smartalloc --with-mysql
> >>>>> --with-tcp-wrappers=/usr/lib/libwrap.so --enable-static-fd
> >>>>> --enable-static-dir --enable-static-sd --disable-batch-insert
> >>>>> My 18+ hour backup then ran in 9 hours 22 mins 27 secs last night.
> >>>>> It seems to be back to its pre-3.0 time.
> >>>>>
> >>>>> As for my tables, I have an Admin job that optimizes my Bacula tables
> >>>>> within MySQL:
> >>>>> Job {
> >>>>>         Name = optimize_table_JobMedia
> >>>>>         JobDefs = "DefaultJob"
> >>>>>         Type = Admin
> >>>>>         Client = bacula-local-fd
> >>>>>         RunBeforeJob = "/etc/bacula/optimize_table_JobMedia.sh"
> >>>>>         Priority = 40
> >>>>>         Cancel Queued Duplicates=yes
> >>>>>         Maximum Concurrent Jobs = 1
> >>>>>         Run Before Job = "/etc/bacula/check-duplicate-queued-job.sh
> >>>>> '%n' '%i'"
> >>>>> }
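> >>>>>
> >>>>> (The script itself is not shown here; presumably it boils down to
> >>>>> feeding something like the statement below to the mysql client. Note
> >>>>> that OPTIMIZE TABLE locks the table while it runs, so keeping it in a
> >>>>> separate, scheduled Admin job makes sense.)
> >>>>>
> >>>>>   OPTIMIZE TABLE JobMedia;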
> >>>>>
> >>>>> Please let me know what else I can do to provide useful data for the
> >>>>> improvement of Bacula (the world's best backup software!).
> >>>>>                              Thanks -Jason
> >>>>>
> >>>>> On Sun, 2009-06-21 at 08:11 +0200, Marc Cousin wrote:
> >>>>>> The first thing to know is which of the queries takes the time (there
> >>>>>> are the many inserts into the batch table, then the inserts of missing
> >>>>>> filenames/paths, and the final insert). Or is it slow with all of the
> >>>>>> queries? Which storage engine?
> >>>>>>
> >>>>>> There are two possibilities:
> >>>>>>
> >>>>>> - either you have a bad plan for one of the queries (but I don't see
> >>>>>> how, MySQL having a very simple optimizer),
> >>>>>> - or you have a tuning problem with MySQL (and there I can't really
> >>>>>> help you: not enough memory? fragmented indexes? I don't really know
> >>>>>> how MySQL works).
> >>>>>>
> >>>>>> Kern, I have not been following the 3.0 development closely: do the
> >>>>>> xattrs go into the lstat field?
> >>>>>>
> >>>>>> On Sunday 21 June 2009 01:52:34 Jason A. Kates wrote:
> >>>>>>> I have a backup that runs for 3 hours then spends another 15 hours
> >>>>>>> in status "Dir inserting Attributes".
> >>>>>>>
> >>>>>>> The backup has 9,442,578 files and 239,718,461,832 bytes. It was
> >>>>>>> much faster with Bacula before, and I thought the change in time was
> >>>>>>> due to the additional data from xattr support, which didn't exist
> >>>>>>> before 3.0. The job spools data and spools attributes too.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> We are using mysql-server-5.0.45-7.el5. I would be glad to collect
> >>>>>>> and share data. Please let me know what you would like me to set or
> >>>>>>> do to get the appropriate data.
> >>>>>>>
> >>>>>>>                              Thanks -Jason
> >>>>>>>
> >>>>>>> On Sat, 2009-06-20 at 17:48 +0200, Marc Cousin wrote:
> >>>>>>>> There may be something. But first we have to know why MySQL behaves
> >>>>>>>> that slowly for Tom Sommer. The problem is that I don't have a big
> >>>>>>>> MySQL database to test on... so we are stuck as long as we don't
> >>>>>>>> have the query plan or some other explanation.
> >>>>>>>>
> >>>>>>>> On Saturday 20 June 2009 14:36:44 Kern Sibbald wrote:
> >>>>>>>>> Hello Marc,
> >>>>>>>>>
> >>>>>>>>> Thanks for looking at this and for your responses.  If I
> >>>>>>>>> understand you correctly, there is nothing to be done, which does
> >>>>>>>>> not surprise me :-)
> >>>>>>>>>
> >>>>>>>>> Best regards,
> >>>>>>>>>
> >>>>>>>>> Kern
> >>>>>>>>>
> >>>>>>>>> On Saturday 20 June 2009 13:16:44 Marc Cousin wrote:
> >>>>>>>>>>> Subject: Re: [Bacula-users] Performance with MySQL queries
> >>>>>>>>>>> since 3.0.0 (Dir inserting attributes hang)
> >>>>>>>>>>> Date: Saturday 20 June 2009
> >>>>>>>>>>> From: Kern Sibbald <k...@sibbald.com>
> >>>>>>>>>>> To: Tom Sommer <m...@tomsommer.dk>
> >>>>>>>>>>>
> >>>>>>>>>>> On Saturday 20 June 2009 08:51:53 Tom Sommer wrote:
> >>>>>>>>>>>> Tom Sommer wrote:
> >>>>>>>>>>>>> Mike Holden wrote:
> >>>>>>>>>>>>>> Jari Fredriksson wrote:
> >>>>>>>>>>>>>>>> INSERT INTO Filename( Name )
> >>>>>>>>>>>>>>>> SELECT a.Name
> >>>>>>>>>>>>>>>> FROM (
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> SELECT DISTINCT Name
> >>>>>>>>>>>>>>>> FROM batch
> >>>>>>>>>>>>>>>> ) AS a
> >>>>>>>>>>>>>>>> WHERE NOT
> >>>>>>>>>>>>>>>> EXISTS (
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> SELECT Name
> >>>>>>>>>>>>>>>> FROM Filename AS f
> >>>>>>>>>>>>>>>> WHERE f.Name = a.Name
> >>>>>>>>>>>>>>>> )
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> That looks silly.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I would write it shorter as
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> INSERT INTO Filename(Name)
> >>>>>>>>>>>>>>> SELECT DISTINCT Name
> >>>>>>>>>>>>>>> FROM batch AS a
> >>>>>>>>>>>>>>> WHERE NOT EXISTS
> >>>>>>>>>>>>>>> (
> >>>>>>>>>>>>>>>     SELECT Name
> >>>>>>>>>>>>>>>     FROM Filename AS f
> >>>>>>>>>>>>>>>     WHERE f.Name = a.Name
> >>>>>>>>>>>>>>> )
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> You may also want to consider using a JOIN rather than a
> >>>>>>>>>>>>>> subquery with a NOT EXISTS, something like (untested and
> >>>>>>>>>>>>>> unvalidated!):
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> INSERT INTO filename(name)
> >>>>>>>>>>>>>> SELECT DISTINCT name
> >>>>>>>>>>>>>> FROM batch AS a
> >>>>>>>>>>>>>> LEFT JOIN filename AS f USING (name)
> >>>>>>>>>>>>>> WHERE f.name IS NULL
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I know from experience as an Oracle DBA (my day job) that
> >>>>>>>>>>>>>> this can often produce far more efficient results.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Of course, all options need testing for both speed of
> >>>>>>>>>>>>>> execution and resource usage, bearing in mind that data
> >>>>>>>>>>>>>> varies from one installation to another, and one size may
> >>>>>>>>>>>>>> not fit all!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Good suggestions, sounds like there might be an overall
> >>>>>>>>>>>>> performance problem with the current query for batch-inserts
> >>>>>>>>>>>>> with lots of data. I'm a bit unsure if I dare test these
> >>>>>>>>>>>>> queries on my current installation.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'll CC Kern on the thread, perhaps he has some insights.
> >>>>>>>>>>>>
> >>>>>>>>>>>> So this morning I had to kill the above query because it's
> >>>>>>>>>>>> been running for 24+ hours, preventing the new daily jobs from
> >>>>>>>>>>>> running. I think I'm going to try and disable batch-inserts,
> >>>>>>>>>>>> the current situation is simply not good enough, it's become a
> >>>>>>>>>>>> major headache to run backups suddenly. When I hit the end of
> >>>>>>>>>>>> this month and ALL servers have to run FULL backups, I'm gonna
> >>>>>>>>>>>> be in a world of trouble I think - I just don't understand
> >>>>>>>>>>>> what has changed, because it's all been running great up until
> >>>>>>>>>>>> now.
> >>>>>>>>>>>>
> >>>>>>>>>>>> // Tom
> >>>>>>>>>>>
> >>>>>>>>>>> Hello,
> >>>>>>>>>>>
> >>>>>>>>>>> We certainly can use help optimizing SQL since we are not DBAs,
> >>>>>>>>>>> and we will look into the SQL optimization suggestions given
> >>>>>>>>>>> above, keeping in mind that there are often rather radical
> >>>>>>>>>>> differences in timing of particular SQL queries depending on
> >>>>>>>>>>> the database engine used.
> >>>>>>>>>>>
> >>>>>>>>>>> To the best of my knowledge, nothing has changed in the batch
> >>>>>>>>>>> insert queries since they were implemented, and it is *very*
> >>>>>>>>>>> unlikely (though I haven't checked the code) that something
> >>>>>>>>>>> changed from 2.4.4 to 3.0.x.
> >>>>>>>>>>>
> >>>>>>>>>>> More likely, your workload or MySQL has changed in some way --
> >>>>>>>>>>> e.g. more Jobs, more machines backed up, Director machine with
> >>>>>>>>>>> less memory or other jobs that use memory, a new version of
> >>>>>>>>>>> MySQL, ...
> >>>>>>>>>>>
> >>>>>>>>>>> - I would suggest that you ensure that your database has all
> >>>>>>>>>>> the recommended indexes (see the make_mysql_tables file), and
> >>>>>>>>>>> that you are running with the large memory /etc/my.cnf file.
> >>>>>>>>>>>
> >>>>>>>>>>> - Another thing to do is to compact your database.  One way to
> >>>>>>>>>>> do it is to write it to an ASCII file and then re-insert it.
> >>>>>>>>>>>
> >>>>>>>>>>> - If you are running certain programs that create and delete
> >>>>>>>>>>> lots of temporary files with different names, your Filename
> >>>>>>>>>>> table may need cleaning.
> >>>>>>>>>>>
> >>>>>>>>>>> - I would strongly recommend not starting *lots* of Full
> >>>>>>>>>>> backups at the same time or on the same day.  By lots, I mean
> >>>>>>>>>>> more than say 10 or 20 (depends on the size of your system). 
> >>>>>>>>>>> It is generally far better to stage 1/4 of the backup every
> >>>>>>>>>>> week for a full backup so that the peak workload is spread out
> >>>>>>>>>>> over the month.
> >>>>>>>>>>>
> >>>>>>>>>>> - If the bottleneck is in MySQL, you might consider moving it
> >>>>>>>>>>> to another machine that has more memory and faster disks.
> >>>>>>>>>>>
> >>>>>>>>>>> - If you really have a huge number of backups (say 50 or more)
> >>>>>>>>>>> that all run at the same time, it might be advisable to
> >>>>>>>>>>> consider using PostgreSQL, but in that case, you will probably
> >>>>>>>>>>> need an onsite DBA to properly tune and maintain it.
> >>>>>>>>>>>
> >>>>>>>>>>> - Regardless of what hardware you have, there are certain
> >>>>>>>>>>> limitations on how many simultaneous jobs you can run (this
> >>>>>>>>>>> kicks in on many systems around 50). Once a certain number is
> >>>>>>>>>>> exceeded, the total throughput can rather radically decrease so
> >>>>>>>>>>> careful monitoring is necessary.  Bweb can help a lot in these
> >>>>>>>>>>> situations.
> >>>>>>>>>>>
> >>>>>>>>>>> - If you are really "in a world of trouble" and it is a
> >>>>>>>>>>> performance issue, there is not much we (the Bacula project)
> >>>>>>>>>>> can do for you other than the above tips.  However, ($$) Bacula
> >>>>>>>>>>> Systems has tools that can help more precisely identify
> >>>>>>>>>>> bottlenecks and help balance loads.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>>
> >>>>>>>>>>> Kern
> >>>>>>>>>>>
> >>>>>>>>>>> -------------------------------------------------------
> >>>>>>>>>>
> >>>>>>>>>> Sorry Kern, I wasn't monitoring this thread (mysql in the title 
> >>>>>>>>>> :) )
> >>>>>>>>>>
> >>>>>>>>>> I wrote these queries, so maybe I should explain (not why I'm
> >>>>>>>>>> right, I'm not sure of that, but why we've done it this way).
> >>>>>>>>>>
> >>>>>>>>>> A few forewords:
> >>>>>>>>>> - Batch insert was originally written for PostgreSQL. I don't
> >>>>>>>>>> have much experience with MySQL, beyond knowing that at the time
> >>>>>>>>>> we programmed batch insert its optimizer didn't really do anything
> >>>>>>>>>> but nested loops...
> >>>>>>>>>> - The main reason for batch inserts with PostgreSQL was to use
> >>>>>>>>>> COPY instead of INSERT, which is much faster (we also had good
> >>>>>>>>>> results with stored procedures, but batch insert was even faster).
> >>>>>>>>>> - Our current File table is nearly 200 GB (1 billion rows), in a
> >>>>>>>>>> 250 GB database, and we insert around 50 million records a day
> >>>>>>>>>> (rough estimate).
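> >>>>>>>>>>
> >>>>>>>>>> (For illustration, the kind of statement batch insert relies on
> >>>>>>>>>> with PostgreSQL; the column list is only indicative of the
> >>>>>>>>>> temporary batch table, and the rows are streamed through the COPY
> >>>>>>>>>> protocol rather than sent as individual INSERTs:)
> >>>>>>>>>>
> >>>>>>>>>>   COPY batch (FileIndex, JobId, Path, Name, LStat, MD5) FROM STDIN;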
> >>>>>>>>>>
> >>>>>>>>>> * First question I've seen in the thread:
> >>>>>>>>>>
> >>>>>>>>>> Why
> >>>>>>>>>>
> >>>>>>>>>> INSERT INTO Filename( Name )
> >>>>>>>>>> SELECT a.Name
> >>>>>>>>>> FROM (
> >>>>>>>>>>    SELECT DISTINCT Name
> >>>>>>>>>>    FROM batch
> >>>>>>>>>> ) AS a
> >>>>>>>>>> WHERE NOT EXISTS (
> >>>>>>>>>>    SELECT Name
> >>>>>>>>>>    FROM Filename AS f
> >>>>>>>>>>    WHERE f.Name = a.Name
> >>>>>>>>>> )
> >>>>>>>>>>
> >>>>>>>>>> instead of the much simpler
> >>>>>>>>>>
> >>>>>>>>>> INSERT INTO Filename(Name)
> >>>>>>>>>> SELECT DISTINCT Name
> >>>>>>>>>> FROM batch AS a
> >>>>>>>>>> WHERE NOT EXISTS
> >>>>>>>>>> (
> >>>>>>>>>>    SELECT Name
> >>>>>>>>>>    FROM Filename AS f
> >>>>>>>>>>    WHERE f.Name = a.Name
> >>>>>>>>>> )
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> => Because the first one removes the duplicate rows from the
> >>>>>>>>>> batch table before checking whether they exist in the Filename
> >>>>>>>>>> table, while the second one checks every batch row and removes
> >>>>>>>>>> the duplicates afterwards. Performance was better in our
> >>>>>>>>>> benchmarks with the first one (with PostgreSQL, with our data).
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> * Second question :
> >>>>>>>>>> Why not :
> >>>>>>>>>>
> >>>>>>>>>> INSERT INTO filename(name)
> >>>>>>>>>> SELECT DISTINCT name
> >>>>>>>>>> FROM batch AS a
> >>>>>>>>>> LEFT JOIN filename AS f USING (name)
> >>>>>>>>>> WHERE f.name IS NULL
> >>>>>>>>>>
> >>>>>>>>>> In fact, if the engine chooses the right execution path, the
> >>>>>>>>>> execution plan will be roughly the same (nested loop) as
> >>>>>>>>>>
> >>>>>>>>>> INSERT INTO Filename(Name)
> >>>>>>>>>> SELECT DISTINCT Name
> >>>>>>>>>> FROM batch AS a
> >>>>>>>>>> WHERE NOT EXISTS
> >>>>>>>>>> (
> >>>>>>>>>> SELECT Name
> >>>>>>>>>> FROM Filename AS f
> >>>>>>>>>> WHERE f.Name = a.Name
> >>>>>>>>>> )
> >>>>>>>>>>
> >>>>>>>>>> Except when batch is close in size to Filename (on my Bacula
> >>>>>>>>>> database, for instance, that would be a 50-million-file backup).
> >>>>>>>>>> Anyway, if I recall correctly about MySQL (and it hasn't changed
> >>>>>>>>>> recently), MySQL will do a nested loop for both, as it doesn't
> >>>>>>>>>> know how to do merge or hash joins...
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> But I agree that this second syntax theoretically gives more
> >>>>>>>>>> latitude to the engine. That said, recent Oracle and PostgreSQL
> >>>>>>>>>> versions can convert between the EXISTS/NOT EXISTS form and the
> >>>>>>>>>> join (or IN) form in this case anyway, as they are semantically
> >>>>>>>>>> equivalent.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Now, I don't know how MySQL would react to all these queries. An
> >>>>>>>>>> EXPLAIN plan would be great for understanding it... but I would
> >>>>>>>>>> have thought that the NOT EXISTS query would suit its optimizer
> >>>>>>>>>> well, as it tells MySQL exactly what to do :)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> A last note for the PostgreSQL users on this mailing list:
> >>>>>>>>>> you can try to get better performance from batch inserts by
> >>>>>>>>>> lowering seq_page_cost and random_page_cost, so that the engine
> >>>>>>>>>> favours a nested loop over a hash join for the INSERT INTO File
> >>>>>>>>>> query. Most of the time, most of Filename and Path will already
> >>>>>>>>>> be in the database or filesystem cache, and that plan will be
> >>>>>>>>>> much faster (and will spare you big sorts).
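> >>>>>>>>>>
> >>>>>>>>>> For example (the values are only a starting point to experiment
> >>>>>>>>>> with; they can be set per session or in postgresql.conf):
> >>>>>>>>>>
> >>>>>>>>>>   SET seq_page_cost = 0.5;
> >>>>>>>>>>   SET random_page_cost = 2.0;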
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Please tell me if you need more information.



