Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-23 Thread Jeronimo Zucco

Citando James Harper james.har...@bendigoit.com.au:


 Does MySQL have a 'profiler' tool like MSSQL does? I spend most of my
 time in MSSQL and if some operation is running slow I just attach the
 profiler to it and capture the queries and focus on the ones that are
 taking most of the time.

 James


What is the impact of use this option in my.cnf:

low_priority_updates=1

I ran a lot of concurrent jobs, and this option help me in mysql  
performance issues.



-- 
Jeronimo Zucco
LPIC-1 Linux Professional Institute Certified
Universidade de Caxias do Sul - NPDU

http://jczucco.blogspot.com

---
Essa mensagem foi enviada pelo UCS Mail



--
Are you an open source citizen? Join us for the Open Source Bridge conference!
Portland, OR, June 17-19. Two days of sessions, one day of unconference: $250.
Need another reason to go? 24-hour hacker lounge. Register today!
http://ad.doubleclick.net/clk;215844324;13503038;v?http://opensourcebridge.org
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-20 Thread Tom Sommer
Tom Sommer wrote:
 Mike Holden wrote:
   
 Jari Fredriksson wrote:
   
 
 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )

   
 
 That looks silly.

 I would write it shorter as

 INSERT INTO Filename(Name)
 SELECT DISTINCT Name
 FROM batch AS a
 WHERE NOT EXISTS
 (
 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
 
   
 You may also want to consider using a JOIN rather than a subquery with a
 NOT EXISTS, something like (untested and unvalidated!):

 INSERT INTO filename(name)
 SELECT DISTINCT name
 FROM batch AS a
 LEFT JOIN filename AS f USING (name)
 WHERE f.name IS NULL

 I know from experience as an Oracle DBA (my day job) that this can often
 produce far more efficient results.

 Of course, all options need testing for both speed of execution and
 resource usage, bearing in mind that data varies from one installation to
 another, and one size may not fit all!
   
 
 Good suggestions, sounds like there might be an overall performance
 problem with the current query for batch-inserts with lots of data. I'm
 a bit unsure if I dare test these queries on my current installation.

 I'll CC Kern on the thread, perhaps he has some insights.
   
So this morning I had to kill the above query because it's been running 
for 24+ hours, preventing the new daily jobs from running.
I think I'm going to try and disable batch-inserts, the current 
situation is simply not good enough, it's become a major headache to run 
backups suddenly. When I hit the end of this month and ALL servers have 
to run FULL backups, I'm gonna be in a world of trouble I think - I just 
don't understand what has changed, because it's all been running great 
up until now.

// Tom

--
Are you an open source citizen? Join us for the Open Source Bridge conference!
Portland, OR, June 17-19. Two days of sessions, one day of unconference: $250.
Need another reason to go? 24-hour hacker lounge. Register today!
http://ad.doubleclick.net/clk;215844324;13503038;v?http://opensourcebridge.org
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-20 Thread Kern Sibbald
On Saturday 20 June 2009 08:51:53 Tom Sommer wrote:
 Tom Sommer wrote:
  Mike Holden wrote:
  Jari Fredriksson wrote:
  INSERT INTO Filename( Name )
  SELECT a.Name
  FROM (
 
  SELECT DISTINCT Name
  FROM batch
  ) AS a
  WHERE NOT
  EXISTS (
 
  SELECT Name
  FROM Filename AS f
  WHERE f.Name = a.Name
  )
 
  That looks silly.
 
  I would write it shorter as
 
  INSERT INTO Filename(Name)
  SELECT DISTINCT Name
  FROM batch AS a
  WHERE NOT EXISTS
  (
  SELECT Name
  FROM Filename AS f
  WHERE f.Name = a.Name
  )
 
  You may also want to consider using a JOIN rather than a subquery with a
  NOT EXISTS, something like (untested and unvalidated!):
 
  INSERT INTO filename(name)
  SELECT DISTINCT name
  FROM batch AS a
  LEFT JOIN filename AS f USING (name)
  WHERE f.name IS NULL
 
  I know from experience as an Oracle DBA (my day job) that this can often
  produce far more efficient results.
 
  Of course, all options need testing for both speed of execution and
  resource usage, bearing in mind that data varies from one installation
  to another, and one size may not fit all!
 
  Good suggestions, sounds like there might be an overall performance
  problem with the current query for batch-inserts with lots of data. I'm
  a bit unsure if I dare test these queries on my current installation.
 
  I'll CC Kern on the thread, perhaps he has some insights.

 So this morning I had to kill the above query because it's been running
 for 24+ hours, preventing the new daily jobs from running.
 I think I'm going to try and disable batch-inserts, the current
 situation is simply not good enough, it's become a major headache to run
 backups suddenly. When I hit the end of this month and ALL servers have
 to run FULL backups, I'm gonna be in a world of trouble I think - I just
 don't understand what has changed, because it's all been running great
 up until now.

 // Tom

Hello,

We certainly can use help optimizing SQL since we are not DBAs, and we will 
look into the SQL optimization suggestions given above, keeping in mind that 
there are often rather radical differences in timing of particular SQL 
queries depending on the database engine used.

To the best of my knowledge nothing has changed in terms of the Batch insert 
queries since when it was implemented, and it is *very* unlikely (though I 
haven't checked the code) that something changed from 2.4.4 to 3.0.x.

More likely, your workload or MySQL has changed in some way -- e.g. more Jobs, 
more machines backed up, Director machine with less memory or other jobs that 
use memory, a new version of MySQL, ...

- I would suggest that you ensure that your database has all the recommended 
indexes (see the make_mysql_tables file), and that you are running with the 
large memory /etc/my.cnf file.

- Another thing to do is to compact your database.  One way to do it is to 
write it to an ASCII file and then re-insert it.

- If you are running certain programs that create and delete lots of temporary 
files with different names, you Filename table may need cleaning.

- I would strongly recommend not starting *lots* of Full backups at the same 
time or on the same day.  By lots, I mean more than say 10 or 20 (depends on 
the size of your system).  It is generally far better to stage 1/4 of the 
backup every week for a full backup so that the peak workload is spread out 
over the month.

- If the bottleneck is in MySQL, you might consider moving it to another 
machine that has more memory and faster disks.

- If you really have a huge number of backups (say 50 or more) that all run at 
the same time, it might be advisable to consider using PostgreSQL, but in 
that case, you will probably need an onsite DBA to properly tune and maintain 
it.

- Regardless of what hardware you have, there are certain limitations on how 
many simultaneous jobs you can run (this kicks in on many systems around 50).  
Once a certain number is exceeded, the total throughput can rather radically 
decrease so careful monitoring is necessary.  Bweb can help a lot in these 
situations.

- If you are really in a world of trouble and it is a performance issue, 
there is not much we (the Bacula project) can do for you other than the above 
tips.  However, ($$) Bacula Systems has tools that can help more precisely 
identify bottlenecks and help balance loads.

Regards,

Kern

--
Are you an open source citizen? Join us for the Open Source Bridge conference!
Portland, OR, June 17-19. Two days of sessions, one day of unconference: $250.
Need another reason to go? 24-hour hacker lounge. Register today!
http://ad.doubleclick.net/clk;215844324;13503038;v?http://opensourcebridge.org
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Tom Sommer
Jari Fredriksson wrote:
 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )

 

 That looks silly.

 I would write it shorter as

 INSERT INTO Filename(Name)
 SELECT DISTINCT Name
 FROM batch AS a
 WHERE NOT EXISTS
 (
 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
Make a bug report :)

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Tom Sommer
Martin Simmons wrote:
 On Thu, 18 Jun 2009 17:11:04 +0200, Michel Meyers said:
 
 Martin Simmons wrote:
 
 On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
 
 Martin Simmons wrote:
 
 On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
 
 
 Hi,

 I have a somewhat pressing problem with the performance of my Bacula
 installation.

 My MySQL database currently holds 247,342,127 (36GB) records in the File
 table, and 78,576,199 (10GB) records in the Filename table.

 Since 3.0.0, but even more since 3.0.1, I have a problem with queries
 being really slow. Basically when doing a full backup of a server
 (mailserver, LOTS of small files), I can have my MySQL hanging for up to
 24+ hours on queries like this:

 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
 
 [...]
 
 Sorry, I don't know how to optimize it.

 These queries are part of the new batch insert code.  You can turn that 
 off
 by passing --disable-batch-insert to configure when building Bacula.
   
   
 what does this mean in practice? 1 query per file? or?
 
 The non-batch approach makes several queries per file.
   
 But it will do so as it backs up the files, as opposed to the batch
 approach where it collects all the queries and does them at the end of
 the job, right?
 

 Yes, but if you have spooling turned on, then it will spool the attributes too
 and insert them all at the end using several queries per file.
   
Well, the problem is really that first is spends time storing the files,
and then it spends time saving information about the backup in the database.
It would make more sense, for larger clients, to spend the idle database
time (while taking the file backup) more wisely.

I'm having a hard time judging if turning off batch-inserts is the right
action, if I would gain anything from this.

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Martin Simmons
 On Fri, 19 Jun 2009 03:00:54 +0300, Jari Fredriksson said:
 
  
  INSERT INTO Filename( Name )
  SELECT a.Name
  FROM (
  
  SELECT DISTINCT Name
  FROM batch
  ) AS a
  WHERE NOT
  EXISTS (
  
  SELECT Name
  FROM Filename AS f
  WHERE f.Name = a.Name
  )
  
 
 That looks silly.
 
 I would write it shorter as
 
 INSERT INTO Filename(Name)
 SELECT DISTINCT Name
 FROM batch AS a
 WHERE NOT EXISTS
 (
 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )

Maybe, but perhaps it quicker to remove the duplicates from batch.name before
comparing them to the rows in Filename.name?

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Martin Simmons
 On Fri, 19 Jun 2009 09:51:20 +0200, Tom Sommer said:
 
 Martin Simmons wrote:
  On Thu, 18 Jun 2009 17:11:04 +0200, Michel Meyers said:
  
  Martin Simmons wrote:
  
  On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
  
  Martin Simmons wrote:
  
  On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
  
  
  Hi,
 
  I have a somewhat pressing problem with the performance of my Bacula
  installation.
 
  My MySQL database currently holds 247,342,127 (36GB) records in the 
  File
  table, and 78,576,199 (10GB) records in the Filename table.
 
  Since 3.0.0, but even more since 3.0.1, I have a problem with queries
  being really slow. Basically when doing a full backup of a server
  (mailserver, LOTS of small files), I can have my MySQL hanging for up 
  to
  24+ hours on queries like this:
 
  INSERT INTO Filename( Name )
  SELECT a.Name
  FROM (
 
  SELECT DISTINCT Name
  FROM batch
  ) AS a
  WHERE NOT
  EXISTS (
 
  SELECT Name
  FROM Filename AS f
  WHERE f.Name = a.Name
  )
  
  [...]
  
 Sorry, I don't know how to optimize it.
 
 These queries are part of the new batch insert code.  You can turn that 
 off
 by passing --disable-batch-insert to configure when building Bacula.
   
   
  what does this mean in practice? 1 query per file? or?
  
  The non-batch approach makes several queries per file.

  But it will do so as it backs up the files, as opposed to the batch
  approach where it collects all the queries and does them at the end of
  the job, right?
  
 
  Yes, but if you have spooling turned on, then it will spool the attributes 
  too
  and insert them all at the end using several queries per file.

 Well, the problem is really that first is spends time storing the files,
 and then it spends time saving information about the backup in the database.
 It would make more sense, for larger clients, to spend the idle database
 time (while taking the file backup) more wisely.
 
 I'm having a hard time judging if turning off batch-inserts is the right
 action, if I would gain anything from this.

AFAIK, the non-batch approach in Bacula 3.x does the same queies as Bacula
2.x.  Since you started the thread by saying that backups are slower in 3.x
because of the queries, turning off batch-inserts might solve that.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Mike Holden
Jari Fredriksson wrote:

 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )


 That looks silly.

 I would write it shorter as

 INSERT INTO Filename(Name)
 SELECT DISTINCT Name
 FROM batch AS a
 WHERE NOT EXISTS
 (
 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )

You may also want to consider using a JOIN rather than a subquery with a
NOT EXISTS, something like (untested and unvalidated!):

INSERT INTO filename(name)
SELECT DISTINCT name
FROM batch AS a
LEFT JOIN filename AS f USING (name)
WHERE f.name IS NULL

I know from experience as an Oracle DBA (my day job) that this can often
produce far more efficient results.

Of course, all options need testing for both speed of execution and
resource usage, bearing in mind that data varies from one installation to
another, and one size may not fit all!
-- 
Mike Holden

http://www.by-ang.com - the place to shop for all manner of hand crafted
items, including Jewellery, Greetings Cards and Gifts




--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Tom Sommer
Mike Holden wrote:
 Jari Fredriksson wrote:
   
 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )

   
 That looks silly.

 I would write it shorter as

 INSERT INTO Filename(Name)
 SELECT DISTINCT Name
 FROM batch AS a
 WHERE NOT EXISTS
 (
 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
 

 You may also want to consider using a JOIN rather than a subquery with a
 NOT EXISTS, something like (untested and unvalidated!):

 INSERT INTO filename(name)
 SELECT DISTINCT name
 FROM batch AS a
 LEFT JOIN filename AS f USING (name)
 WHERE f.name IS NULL

 I know from experience as an Oracle DBA (my day job) that this can often
 produce far more efficient results.

 Of course, all options need testing for both speed of execution and
 resource usage, bearing in mind that data varies from one installation to
 another, and one size may not fit all!
   
Good suggestions, sounds like there might be an overall performance
problem with the current query for batch-inserts with lots of data. I'm
a bit unsure if I dare test these queries on my current installation.

I'll CC Kern on the thread, perhaps he has some insights.

// Tom

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread James Harper
  INSERT INTO Filename(Name)
  SELECT DISTINCT Name
  FROM batch AS a
  WHERE NOT EXISTS
  (
  SELECT Name
  FROM Filename AS f
  WHERE f.Name = a.Name
  )
 
 You may also want to consider using a JOIN rather than a subquery with
a
 NOT EXISTS, something like (untested and unvalidated!):
 
 INSERT INTO filename(name)
 SELECT DISTINCT name
 FROM batch AS a
 LEFT JOIN filename AS f USING (name)
 WHERE f.name IS NULL
 
 I know from experience as an Oracle DBA (my day job) that this can
often
 produce far more efficient results.
 
 Of course, all options need testing for both speed of execution and
 resource usage, bearing in mind that data varies from one installation
to
 another, and one size may not fit all!

The subquery returns a very small result set (0 or 1, assuming you use
DISTINCT) and so isn't too inefficient. It's when you say 'WHERE NOT
EXISTS (SOME QUERY WITH LOTS OF RESULTS)' that you start to really bog
down

James

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Mike Holden
James Harper wrote:
 The subquery returns a very small result set (0 or 1, assuming you use
 DISTINCT) and so isn't too inefficient. It's when you say 'WHERE NOT
 EXISTS (SOME QUERY WITH LOTS OF RESULTS)' that you start to really bog
 down

True, but if the outer query contains a very large number of records, the
inner query is executed over and over and over and ...

This multiplies the small execution time by a large number.

Think of it like a WHILE loop that executes a block of code each time
round. The inner block of code might be quick, but if the loop goes round
several million times, the cumulative time becomes significant.

One general rule of thumb of SQL execution is that bulk processing is
almost always faster than processing records individually in a loop.
-- 
Mike Holden


--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread James Harper
 
 James Harper wrote:
  The subquery returns a very small result set (0 or 1, assuming you
use
  DISTINCT) and so isn't too inefficient. It's when you say 'WHERE NOT
  EXISTS (SOME QUERY WITH LOTS OF RESULTS)' that you start to really
bog
  down
 
 True, but if the outer query contains a very large number of records,
the
 inner query is executed over and over and over and ...
 
 This multiplies the small execution time by a large number.
 
 Think of it like a WHILE loop that executes a block of code each time
 round. The inner block of code might be quick, but if the loop goes
round
 several million times, the cumulative time becomes significant.
 
 One general rule of thumb of SQL execution is that bulk processing is
 almost always faster than processing records individually in a loop.

I would have thought that the query planner would have reduced both
solutions to the same thing (eg for every record in the outer or left
part, match it up with the inner or left part). I agree that the JOIN
lends itself better to optimization though, and that's certainly the way
I would have written it as it is far clearer to read and scales to far
more situations than the 'NOT EXISTS' solution does.

I constructed a simple test scenario - created a table 'james' with a
few hundred thousand rows in it of both real filenames and corrupted
versions of real filenames, and ran the following tests:

RESET QUERY CACHE; (can't remember if that was the exact syntax)
SELECT COUNT(DISTINCT Name)
FROM james AS a
WHERE NOT EXISTS
(
SELECT Name
FROM Filename AS f
WHERE f.Name = a.Name
);

And

RESET QUERY CACHE;
SELECT COUNT(DISTINCT Name)
FROM james AS a
LEFT JOIN filename AS f ON a.name = f.name
WHERE f.name IS NULL;

And compared the times over a half a dozen runs. There wasn't a
measurable difference between the results, but the server was busy
enough at the time that a significant amount of 'noise' was added in -
both queries ran for between 2 and 6 seconds. The simple timing test
also completely fails to measure memory and io usage.

I'm sure someone could construct some more precise tests that would
eliminate any speculation, but on face value you aren't going to get an
order-of-magnitude improvement or anything close just by tinkering with
this query, unless you can identify some indexes that could be added
onto the batch table (which is temporary isn't it?). Although given that
Filename already has an index on the only field being used, I suspect
that even that won't do much for you.

Does MySQL have a 'profiler' tool like MSSQL does? I spend most of my
time in MSSQL and if some operation is running slow I just attach the
profiler to it and capture the queries and focus on the ones that are
taking most of the time.

James


--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Jari Fredriksson
 James Harper wrote:
 The subquery returns a very small result set (0 or 1,
 assuming you use DISTINCT) and so isn't too inefficient.
 It's when you say 'WHERE NOT EXISTS (SOME QUERY WITH
 LOTS OF RESULTS)' that you start to really bog down
 
 True, but if the outer query contains a very large number
 of records, the inner query is executed over and over and
 over and ... 
 
 This multiplies the small execution time by a large
 number. 
 
 Think of it like a WHILE loop that executes a block of
 code each time round. The inner block of code might be
 quick, but if the loop goes round several million times,
 the cumulative time becomes significant. 
 
 One general rule of thumb of SQL execution is that bulk
 processing is almost always faster than processing
 records individually in a loop. 

All right then. Then there is this version:

INSERT INTO Filename(Name)
SELECT DISTINCT Name
FROM batch AS
WHERE Name NOT IN
(
SELECT DISTINCT Name
FROM Filename
);

I guess it depends on RDMS, but I wonder it it executes the subquery more than 
once.



--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Bruno Friedmann
Jari Fredriksson wrote:
 James Harper wrote:
 The subquery returns a very small result set (0 or 1,
 assuming you use DISTINCT) and so isn't too inefficient.
 It's when you say 'WHERE NOT EXISTS (SOME QUERY WITH
 LOTS OF RESULTS)' that you start to really bog down
 True, but if the outer query contains a very large number
 of records, the inner query is executed over and over and
 over and ... 

 This multiplies the small execution time by a large
 number. 

 Think of it like a WHILE loop that executes a block of
 code each time round. The inner block of code might be
 quick, but if the loop goes round several million times,
 the cumulative time becomes significant. 

 One general rule of thumb of SQL execution is that bulk
 processing is almost always faster than processing
 records individually in a loop. 
 
 All right then. Then there is this version:
 
 INSERT INTO Filename(Name)
 SELECT DISTINCT Name
 FROM batch AS
 WHERE Name NOT IN
 (
 SELECT DISTINCT Name
 FROM Filename
 );
 
 I guess it depends on RDMS, but I wonder it it executes the subquery more 
 than once.
 
 
 

The keyword EXPLAIN before the query should show how mysql would react.

There's some tools also mysql-administrator which can be use to detect / search 
point of slow


-- 

 Bruno Friedmann



--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Bruno Friedmann
Jari Fredriksson wrote:
 James Harper wrote:
 The subquery returns a very small result set (0 or 1,
 assuming you use DISTINCT) and so isn't too inefficient.
 It's when you say 'WHERE NOT EXISTS (SOME QUERY WITH
 LOTS OF RESULTS)' that you start to really bog down
 True, but if the outer query contains a very large number
 of records, the inner query is executed over and over and
 over and ... 

 This multiplies the small execution time by a large
 number. 

 Think of it like a WHILE loop that executes a block of
 code each time round. The inner block of code might be
 quick, but if the loop goes round several million times,
 the cumulative time becomes significant. 

 One general rule of thumb of SQL execution is that bulk
 processing is almost always faster than processing
 records individually in a loop. 
 
 All right then. Then there is this version:
 
 INSERT INTO Filename(Name)
 SELECT DISTINCT Name
 FROM batch AS
 WHERE Name NOT IN
 (
 SELECT DISTINCT Name
 FROM Filename
 );
 
 I guess it depends on RDMS, but I wonder it it executes the subquery more 
 than once.
 
 
 

The keyword EXPLAIN before the query should show how mysql would react.

There's some tools also mysql-administrator which can be use to detect / search 
point of slow


-- 

 Bruno Friedmann



--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-18 Thread Michel Meyers
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Martin Simmons wrote:
 On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
 Martin Simmons wrote:
 On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
 
 Hi,

 I have a somewhat pressing problem with the performance of my Bacula
 installation.

 My MySQL database currently holds 247,342,127 (36GB) records in the File
 table, and 78,576,199 (10GB) records in the Filename table.

 Since 3.0.0, but even more since 3.0.1, I have a problem with queries
 being really slow. Basically when doing a full backup of a server
 (mailserver, LOTS of small files), I can have my MySQL hanging for up to
 24+ hours on queries like this:

 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
[...]
 Sorry, I don't know how to optimize it.

 These queries are part of the new batch insert code.  You can turn that 
 off
 by passing --disable-batch-insert to configure when building Bacula.
   
 what does this mean in practice? 1 query per file? or?
 
 The non-batch approach makes several queries per file.

But it will do so as it backs up the files, as opposed to the batch
approach where it collects all the queries and does them at the end of
the job, right?

- - Michel
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (MingW32)

iEYEARECAAYFAko6WQgACgkQ2Vs+MkscAyWeTQCglDfuICz6cZ1eYz06yIl8VzDS
jw8An1Xoa1cje6vI2ChFttOqjvG5FzQa
=KKVz
-END PGP SIGNATURE-


--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-18 Thread Martin Simmons
 On Thu, 18 Jun 2009 17:11:04 +0200, Michel Meyers said:
 
 Martin Simmons wrote:
  On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
  Martin Simmons wrote:
  On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
  
  Hi,
 
  I have a somewhat pressing problem with the performance of my Bacula
  installation.
 
  My MySQL database currently holds 247,342,127 (36GB) records in the File
  table, and 78,576,199 (10GB) records in the Filename table.
 
  Since 3.0.0, but even more since 3.0.1, I have a problem with queries
  being really slow. Basically when doing a full backup of a server
  (mailserver, LOTS of small files), I can have my MySQL hanging for up to
  24+ hours on queries like this:
 
  INSERT INTO Filename( Name )
  SELECT a.Name
  FROM (
 
  SELECT DISTINCT Name
  FROM batch
  ) AS a
  WHERE NOT
  EXISTS (
 
  SELECT Name
  FROM Filename AS f
  WHERE f.Name = a.Name
  )
 [...]
  Sorry, I don't know how to optimize it.
 
  These queries are part of the new batch insert code.  You can turn that 
  off
  by passing --disable-batch-insert to configure when building Bacula.

  what does this mean in practice? 1 query per file? or?
  
  The non-batch approach makes several queries per file.
 
 But it will do so as it backs up the files, as opposed to the batch
 approach where it collects all the queries and does them at the end of
 the job, right?

Yes, but if you have spooling turned on, then it will spool the attributes too
and insert them all at the end using several queries per file.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-18 Thread Jari Fredriksson
 
 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (
 
 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (
 
 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
 

That looks silly.

I would write it shorter as

INSERT INTO Filename(Name)
SELECT DISTINCT Name
FROM batch AS a
WHERE NOT EXISTS
(
SELECT Name
FROM Filename AS f
WHERE f.Name = a.Name
)

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-17 Thread Martin Simmons
 On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
 
 Hi,
 
 I have a somewhat pressing problem with the performance of my Bacula
 installation.
 
 My MySQL database currently holds 247,342,127 (36GB) records in the File
 table, and 78,576,199 (10GB) records in the Filename table.
 
 Since 3.0.0, but even more since 3.0.1, I have a problem with queries
 being really slow. Basically when doing a full backup of a server
 (mailserver, LOTS of small files), I can have my MySQL hanging for up to
 24+ hours on queries like this:
 
 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (
 
 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (
 
 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
 
 with the status Sending data, and a lot of other similar queries in
 queue with the status Locked. One of these queries take approx 10.000
 seconds to execute, but is just followed by another similar (identical)
 query with the same duration. This is a problem mostly because it
 prevents me doing restores while backups are Dir inserting attributes
 
 Obviously I would think it's a MySQL performance issue, but I was
 wondering if anything had been done to the queries? They seem to be a
 LOT slower and a LOT heavier.
 I've just put more RAM into the server, but it's done little to improve
 the duration of the queries. My server now has 4GB RAM (will update to
 6GB) - but again it's a recent issue, because Bacula has been running
 perfectly for many months on 2GB RAM, until I updated to 3.0.1
 
 I've done REPAIR TABLE and OPTIMIZE TABLE and seen no improvement.
 
 Finally if anyone have any specific ideas to improve performance on my
 huge SQL database, please share :)

Sorry, I don't know how to optimize it.

These queries are part of the new batch insert code.  You can turn that off
by passing --disable-batch-insert to configure when building Bacula.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-17 Thread Tom Sommer
Martin Simmons wrote:
 On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
 
 Hi,

 I have a somewhat pressing problem with the performance of my Bacula
 installation.

 My MySQL database currently holds 247,342,127 (36GB) records in the File
 table, and 78,576,199 (10GB) records in the Filename table.

 Since 3.0.0, but even more since 3.0.1, I have a problem with queries
 being really slow. Basically when doing a full backup of a server
 (mailserver, LOTS of small files), I can have my MySQL hanging for up to
 24+ hours on queries like this:

 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )

 with the status Sending data, and a lot of other similar queries in
 queue with the status Locked. One of these queries take approx 10.000
 seconds to execute, but is just followed by another similar (identical)
 query with the same duration. This is a problem mostly because it
 prevents me doing restores while backups are Dir inserting attributes

 Obviously I would think it's a MySQL performance issue, but I was
 wondering if anything had been done to the queries? They seem to be a
 LOT slower and a LOT heavier.
 I've just put more RAM into the server, but it's done little to improve
 the duration of the queries. My server now has 4GB RAM (will update to
 6GB) - but again it's a recent issue, because Bacula has been running
 perfectly for many months on 2GB RAM, until I updated to 3.0.1

 I've done REPAIR TABLE and OPTIMIZE TABLE and seen no improvement.

 Finally if anyone have any specific ideas to improve performance on my
 huge SQL database, please share :)
 

 Sorry, I don't know how to optimize it.

 These queries are part of the new batch insert code.  You can turn that off
 by passing --disable-batch-insert to configure when building Bacula.
   
what does this mean in practice? 1 query per file? or?

My real problem is these huge queries lock the File table, so no other
jobs can touch it for many hours.

// Tom

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-17 Thread Martin Simmons
 On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
 
 Martin Simmons wrote:
  On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
  
  Hi,
 
  I have a somewhat pressing problem with the performance of my Bacula
  installation.
 
  My MySQL database currently holds 247,342,127 (36GB) records in the File
  table, and 78,576,199 (10GB) records in the Filename table.
 
  Since 3.0.0, but even more since 3.0.1, I have a problem with queries
  being really slow. Basically when doing a full backup of a server
  (mailserver, LOTS of small files), I can have my MySQL hanging for up to
  24+ hours on queries like this:
 
  INSERT INTO Filename( Name )
  SELECT a.Name
  FROM (
 
  SELECT DISTINCT Name
  FROM batch
  ) AS a
  WHERE NOT
  EXISTS (
 
  SELECT Name
  FROM Filename AS f
  WHERE f.Name = a.Name
  )
 
  with the status Sending data, and a lot of other similar queries in
  queue with the status Locked. One of these queries take approx 10.000
  seconds to execute, but is just followed by another similar (identical)
  query with the same duration. This is a problem mostly because it
  prevents me doing restores while backups are Dir inserting attributes
 
  Obviously I would think it's a MySQL performance issue, but I was
  wondering if anything had been done to the queries? They seem to be a
  LOT slower and a LOT heavier.
  I've just put more RAM into the server, but it's done little to improve
  the duration of the queries. My server now has 4GB RAM (will update to
  6GB) - but again it's a recent issue, because Bacula has been running
  perfectly for many months on 2GB RAM, until I updated to 3.0.1
 
  I've done REPAIR TABLE and OPTIMIZE TABLE and seen no improvement.
 
  Finally if anyone have any specific ideas to improve performance on my
  huge SQL database, please share :)
  
 
  Sorry, I don't know how to optimize it.
 
  These queries are part of the new batch insert code.  You can turn that 
  off
  by passing --disable-batch-insert to configure when building Bacula.

 what does this mean in practice? 1 query per file? or?

The non-batch approach makes several queries per file.


 My real problem is these huge queries lock the File table, so no other
 jobs can touch it for many hours.

The batch queries update the Filename and Path tables atomically, so locking
is expected.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users