Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-23 Thread Jeronimo Zucco

Citando James Harper :


> Does MySQL have a 'profiler' tool like MSSQL does? I spend most of my
> time in MSSQL and if some operation is running slow I just attach the
> profiler to it and capture the queries and focus on the ones that are
> taking most of the time.
>
> James


What is the impact of use this option in my.cnf:

low_priority_updates=1

I ran a lot of concurrent jobs, and this option help me in mysql  
performance issues.



-- 
Jeronimo Zucco
LPIC-1 Linux Professional Institute Certified
Universidade de Caxias do Sul - NPDU

http://jczucco.blogspot.com

---
Essa mensagem foi enviada pelo UCS Mail



--
Are you an open source citizen? Join us for the Open Source Bridge conference!
Portland, OR, June 17-19. Two days of sessions, one day of unconference: $250.
Need another reason to go? 24-hour hacker lounge. Register today!
http://ad.doubleclick.net/clk;215844324;13503038;v?http://opensourcebridge.org
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-20 Thread Kern Sibbald
On Saturday 20 June 2009 08:51:53 Tom Sommer wrote:
> Tom Sommer wrote:
> > Mike Holden wrote:
> >> Jari Fredriksson wrote:
>  INSERT INTO Filename( Name )
>  SELECT a.Name
>  FROM (
> 
>  SELECT DISTINCT Name
>  FROM batch
>  ) AS a
>  WHERE NOT
>  EXISTS (
> 
>  SELECT Name
>  FROM Filename AS f
>  WHERE f.Name = a.Name
>  )
> >>>
> >>> That looks silly.
> >>>
> >>> I would write it shorter as
> >>>
> >>> INSERT INTO Filename(Name)
> >>> SELECT DISTINCT Name
> >>> FROM batch AS a
> >>> WHERE NOT EXISTS
> >>> (
> >>> SELECT Name
> >>> FROM Filename AS f
> >>> WHERE f.Name = a.Name
> >>> )
> >>
> >> You may also want to consider using a JOIN rather than a subquery with a
> >> NOT EXISTS, something like (untested and unvalidated!):
> >>
> >> INSERT INTO filename(name)
> >> SELECT DISTINCT name
> >> FROM batch AS a
> >> LEFT JOIN filename AS f USING (name)
> >> WHERE f.name IS NULL
> >>
> >> I know from experience as an Oracle DBA (my day job) that this can often
> >> produce far more efficient results.
> >>
> >> Of course, all options need testing for both speed of execution and
> >> resource usage, bearing in mind that data varies from one installation
> >> to another, and one size may not fit all!
> >
> > Good suggestions, sounds like there might be an overall performance
> > problem with the current query for batch-inserts with lots of data. I'm
> > a bit unsure if I dare test these queries on my current installation.
> >
> > I'll CC Kern on the thread, perhaps he has some insights.
>
> So this morning I had to kill the above query because it's been running
> for 24+ hours, preventing the new daily jobs from running.
> I think I'm going to try and disable batch-inserts, the current
> situation is simply not good enough, it's become a major headache to run
> backups suddenly. When I hit the end of this month and ALL servers have
> to run FULL backups, I'm gonna be in a world of trouble I think - I just
> don't understand what has changed, because it's all been running great
> up until now.
>
> // Tom

Hello,

We certainly can use help optimizing SQL since we are not DBAs, and we will 
look into the SQL optimization suggestions given above, keeping in mind that 
there are often rather radical differences in timing of particular SQL 
queries depending on the database engine used.

To the best of my knowledge nothing has changed in terms of the Batch insert 
queries since when it was implemented, and it is *very* unlikely (though I 
haven't checked the code) that something changed from 2.4.4 to 3.0.x.

More likely, your workload or MySQL has changed in some way -- e.g. more Jobs, 
more machines backed up, Director machine with less memory or other jobs that 
use memory, a new version of MySQL, ...

- I would suggest that you ensure that your database has all the recommended 
indexes (see the make_mysql_tables file), and that you are running with the 
large memory /etc/my.cnf file.

- Another thing to do is to compact your database.  One way to do it is to 
write it to an ASCII file and then re-insert it.

- If you are running certain programs that create and delete lots of temporary 
files with different names, you Filename table may need cleaning.

- I would strongly recommend not starting *lots* of Full backups at the same 
time or on the same day.  By lots, I mean more than say 10 or 20 (depends on 
the size of your system).  It is generally far better to stage 1/4 of the 
backup every week for a full backup so that the peak workload is spread out 
over the month.

- If the bottleneck is in MySQL, you might consider moving it to another 
machine that has more memory and faster disks.

- If you really have a huge number of backups (say 50 or more) that all run at 
the same time, it might be advisable to consider using PostgreSQL, but in 
that case, you will probably need an onsite DBA to properly tune and maintain 
it.

- Regardless of what hardware you have, there are certain limitations on how 
many simultaneous jobs you can run (this kicks in on many systems around 50).  
Once a certain number is exceeded, the total throughput can rather radically 
decrease so careful monitoring is necessary.  Bweb can help a lot in these 
situations.

- If you are really "in a world of trouble" and it is a performance issue, 
there is not much we (the Bacula project) can do for you other than the above 
tips.  However, ($$) Bacula Systems has tools that can help more precisely 
identify bottlenecks and help balance loads.

Regards,

Kern

--
Are you an open source citizen? Join us for the Open Source Bridge conference!
Portland, OR, June 17-19. Two days of sessions, one day of unconference: $250.
Need another reason to go? 24-hour hacker lounge. Register today!
http://ad.doubleclick.net/clk;215844324;13503038;v?http://opensourcebridge.org
___

Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Tom Sommer
Tom Sommer wrote:
> Mike Holden wrote:
>   
>> Jari Fredriksson wrote:
>>   
>> 
 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )

   
 
>>> That looks silly.
>>>
>>> I would write it shorter as
>>>
>>> INSERT INTO Filename(Name)
>>> SELECT DISTINCT Name
>>> FROM batch AS a
>>> WHERE NOT EXISTS
>>> (
>>> SELECT Name
>>> FROM Filename AS f
>>> WHERE f.Name = a.Name
>>> )
>>> 
>>>   
>> You may also want to consider using a JOIN rather than a subquery with a
>> NOT EXISTS, something like (untested and unvalidated!):
>>
>> INSERT INTO filename(name)
>> SELECT DISTINCT name
>> FROM batch AS a
>> LEFT JOIN filename AS f USING (name)
>> WHERE f.name IS NULL
>>
>> I know from experience as an Oracle DBA (my day job) that this can often
>> produce far more efficient results.
>>
>> Of course, all options need testing for both speed of execution and
>> resource usage, bearing in mind that data varies from one installation to
>> another, and one size may not fit all!
>>   
>> 
> Good suggestions, sounds like there might be an overall performance
> problem with the current query for batch-inserts with lots of data. I'm
> a bit unsure if I dare test these queries on my current installation.
>
> I'll CC Kern on the thread, perhaps he has some insights.
>   
So this morning I had to kill the above query because it's been running 
for 24+ hours, preventing the new daily jobs from running.
I think I'm going to try and disable batch-inserts, the current 
situation is simply not good enough, it's become a major headache to run 
backups suddenly. When I hit the end of this month and ALL servers have 
to run FULL backups, I'm gonna be in a world of trouble I think - I just 
don't understand what has changed, because it's all been running great 
up until now.

// Tom

--
Are you an open source citizen? Join us for the Open Source Bridge conference!
Portland, OR, June 17-19. Two days of sessions, one day of unconference: $250.
Need another reason to go? 24-hour hacker lounge. Register today!
http://ad.doubleclick.net/clk;215844324;13503038;v?http://opensourcebridge.org
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Bruno Friedmann
Jari Fredriksson wrote:
>> James Harper wrote:
>>> The subquery returns a very small result set (0 or 1,
>>> assuming you use DISTINCT) and so isn't too inefficient.
>>> It's when you say 'WHERE NOT EXISTS (SOME QUERY WITH
>>> LOTS OF RESULTS)' that you start to really bog down
>> True, but if the outer query contains a very large number
>> of records, the inner query is executed over and over and
>> over and ... 
>>
>> This multiplies the small execution time by a large
>> number. 
>>
>> Think of it like a WHILE loop that executes a block of
>> code each time round. The inner block of code might be
>> quick, but if the loop goes round several million times,
>> the cumulative time becomes significant. 
>>
>> One general rule of thumb of SQL execution is that bulk
>> processing is almost always faster than processing
>> records individually in a loop. 
> 
> All right then. Then there is this version:
> 
> INSERT INTO Filename(Name)
> SELECT DISTINCT Name
> FROM batch AS
> WHERE Name NOT IN
> (
> SELECT DISTINCT Name
> FROM Filename
> );
> 
> I guess it depends on RDMS, but I wonder it it executes the subquery more 
> than once.
> 
> 
> 

The keyword EXPLAIN before the query should show how mysql would react.

There's some tools also mysql-administrator which can be use to detect / search 
point of slow


-- 

 Bruno Friedmann



--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Bruno Friedmann
Jari Fredriksson wrote:
>> James Harper wrote:
>>> The subquery returns a very small result set (0 or 1,
>>> assuming you use DISTINCT) and so isn't too inefficient.
>>> It's when you say 'WHERE NOT EXISTS (SOME QUERY WITH
>>> LOTS OF RESULTS)' that you start to really bog down
>> True, but if the outer query contains a very large number
>> of records, the inner query is executed over and over and
>> over and ... 
>>
>> This multiplies the small execution time by a large
>> number. 
>>
>> Think of it like a WHILE loop that executes a block of
>> code each time round. The inner block of code might be
>> quick, but if the loop goes round several million times,
>> the cumulative time becomes significant. 
>>
>> One general rule of thumb of SQL execution is that bulk
>> processing is almost always faster than processing
>> records individually in a loop. 
> 
> All right then. Then there is this version:
> 
> INSERT INTO Filename(Name)
> SELECT DISTINCT Name
> FROM batch AS
> WHERE Name NOT IN
> (
> SELECT DISTINCT Name
> FROM Filename
> );
> 
> I guess it depends on RDMS, but I wonder it it executes the subquery more 
> than once.
> 
> 
> 

The keyword EXPLAIN before the query should show how mysql would react.

There's some tools also mysql-administrator which can be use to detect / search 
point of slow


-- 

 Bruno Friedmann



--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Jari Fredriksson
> James Harper wrote:
>> The subquery returns a very small result set (0 or 1,
>> assuming you use DISTINCT) and so isn't too inefficient.
>> It's when you say 'WHERE NOT EXISTS (SOME QUERY WITH
>> LOTS OF RESULTS)' that you start to really bog down
> 
> True, but if the outer query contains a very large number
> of records, the inner query is executed over and over and
> over and ... 
> 
> This multiplies the small execution time by a large
> number. 
> 
> Think of it like a WHILE loop that executes a block of
> code each time round. The inner block of code might be
> quick, but if the loop goes round several million times,
> the cumulative time becomes significant. 
> 
> One general rule of thumb of SQL execution is that bulk
> processing is almost always faster than processing
> records individually in a loop. 

All right then. Then there is this version:

INSERT INTO Filename(Name)
SELECT DISTINCT Name
FROM batch AS
WHERE Name NOT IN
(
SELECT DISTINCT Name
FROM Filename
);

I guess it depends on RDMS, but I wonder it it executes the subquery more than 
once.



--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread James Harper
> 
> James Harper wrote:
> > The subquery returns a very small result set (0 or 1, assuming you
use
> > DISTINCT) and so isn't too inefficient. It's when you say 'WHERE NOT
> > EXISTS (SOME QUERY WITH LOTS OF RESULTS)' that you start to really
bog
> > down
> 
> True, but if the outer query contains a very large number of records,
the
> inner query is executed over and over and over and ...
> 
> This multiplies the small execution time by a large number.
> 
> Think of it like a WHILE loop that executes a block of code each time
> round. The inner block of code might be quick, but if the loop goes
round
> several million times, the cumulative time becomes significant.
> 
> One general rule of thumb of SQL execution is that bulk processing is
> almost always faster than processing records individually in a loop.

I would have thought that the query planner would have reduced both
solutions to the same thing (eg for every record in the outer or left
part, match it up with the inner or left part). I agree that the JOIN
lends itself better to optimization though, and that's certainly the way
I would have written it as it is far clearer to read and scales to far
more situations than the 'NOT EXISTS' solution does.

I constructed a simple test scenario - created a table 'james' with a
few hundred thousand rows in it of both real filenames and corrupted
versions of real filenames, and ran the following tests:

RESET QUERY CACHE; (can't remember if that was the exact syntax)
SELECT COUNT(DISTINCT Name)
FROM james AS a
WHERE NOT EXISTS
(
SELECT Name
FROM Filename AS f
WHERE f.Name = a.Name
);

And

RESET QUERY CACHE;
SELECT COUNT(DISTINCT Name)
FROM james AS a
LEFT JOIN filename AS f ON a.name = f.name
WHERE f.name IS NULL;

And compared the times over a half a dozen runs. There wasn't a
measurable difference between the results, but the server was busy
enough at the time that a significant amount of 'noise' was added in -
both queries ran for between 2 and 6 seconds. The simple timing test
also completely fails to measure memory and io usage.

I'm sure someone could construct some more precise tests that would
eliminate any speculation, but on face value you aren't going to get an
order-of-magnitude improvement or anything close just by tinkering with
this query, unless you can identify some indexes that could be added
onto the batch table (which is temporary isn't it?). Although given that
Filename already has an index on the only field being used, I suspect
that even that won't do much for you.

Does MySQL have a 'profiler' tool like MSSQL does? I spend most of my
time in MSSQL and if some operation is running slow I just attach the
profiler to it and capture the queries and focus on the ones that are
taking most of the time.

James


--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Mike Holden
James Harper wrote:
> The subquery returns a very small result set (0 or 1, assuming you use
> DISTINCT) and so isn't too inefficient. It's when you say 'WHERE NOT
> EXISTS (SOME QUERY WITH LOTS OF RESULTS)' that you start to really bog
> down

True, but if the outer query contains a very large number of records, the
inner query is executed over and over and over and ...

This multiplies the small execution time by a large number.

Think of it like a WHILE loop that executes a block of code each time
round. The inner block of code might be quick, but if the loop goes round
several million times, the cumulative time becomes significant.

One general rule of thumb of SQL execution is that bulk processing is
almost always faster than processing records individually in a loop.
-- 
Mike Holden


--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread James Harper
> > INSERT INTO Filename(Name)
> > SELECT DISTINCT Name
> > FROM batch AS a
> > WHERE NOT EXISTS
> > (
> > SELECT Name
> > FROM Filename AS f
> > WHERE f.Name = a.Name
> > )
> 
> You may also want to consider using a JOIN rather than a subquery with
a
> NOT EXISTS, something like (untested and unvalidated!):
> 
> INSERT INTO filename(name)
> SELECT DISTINCT name
> FROM batch AS a
> LEFT JOIN filename AS f USING (name)
> WHERE f.name IS NULL
> 
> I know from experience as an Oracle DBA (my day job) that this can
often
> produce far more efficient results.
> 
> Of course, all options need testing for both speed of execution and
> resource usage, bearing in mind that data varies from one installation
to
> another, and one size may not fit all!

The subquery returns a very small result set (0 or 1, assuming you use
DISTINCT) and so isn't too inefficient. It's when you say 'WHERE NOT
EXISTS (SOME QUERY WITH LOTS OF RESULTS)' that you start to really bog
down

James

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Tom Sommer
Mike Holden wrote:
> Jari Fredriksson wrote:
>   
>>> INSERT INTO Filename( Name )
>>> SELECT a.Name
>>> FROM (
>>>
>>> SELECT DISTINCT Name
>>> FROM batch
>>> ) AS a
>>> WHERE NOT
>>> EXISTS (
>>>
>>> SELECT Name
>>> FROM Filename AS f
>>> WHERE f.Name = a.Name
>>> )
>>>
>>>   
>> That looks silly.
>>
>> I would write it shorter as
>>
>> INSERT INTO Filename(Name)
>> SELECT DISTINCT Name
>> FROM batch AS a
>> WHERE NOT EXISTS
>> (
>> SELECT Name
>> FROM Filename AS f
>> WHERE f.Name = a.Name
>> )
>> 
>
> You may also want to consider using a JOIN rather than a subquery with a
> NOT EXISTS, something like (untested and unvalidated!):
>
> INSERT INTO filename(name)
> SELECT DISTINCT name
> FROM batch AS a
> LEFT JOIN filename AS f USING (name)
> WHERE f.name IS NULL
>
> I know from experience as an Oracle DBA (my day job) that this can often
> produce far more efficient results.
>
> Of course, all options need testing for both speed of execution and
> resource usage, bearing in mind that data varies from one installation to
> another, and one size may not fit all!
>   
Good suggestions, sounds like there might be an overall performance
problem with the current query for batch-inserts with lots of data. I'm
a bit unsure if I dare test these queries on my current installation.

I'll CC Kern on the thread, perhaps he has some insights.

// Tom

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Mike Holden
Jari Fredriksson wrote:
>>
>> INSERT INTO Filename( Name )
>> SELECT a.Name
>> FROM (
>>
>> SELECT DISTINCT Name
>> FROM batch
>> ) AS a
>> WHERE NOT
>> EXISTS (
>>
>> SELECT Name
>> FROM Filename AS f
>> WHERE f.Name = a.Name
>> )
>>
>
> That looks silly.
>
> I would write it shorter as
>
> INSERT INTO Filename(Name)
> SELECT DISTINCT Name
> FROM batch AS a
> WHERE NOT EXISTS
> (
> SELECT Name
> FROM Filename AS f
> WHERE f.Name = a.Name
> )

You may also want to consider using a JOIN rather than a subquery with a
NOT EXISTS, something like (untested and unvalidated!):

INSERT INTO filename(name)
SELECT DISTINCT name
FROM batch AS a
LEFT JOIN filename AS f USING (name)
WHERE f.name IS NULL

I know from experience as an Oracle DBA (my day job) that this can often
produce far more efficient results.

Of course, all options need testing for both speed of execution and
resource usage, bearing in mind that data varies from one installation to
another, and one size may not fit all!
-- 
Mike Holden

http://www.by-ang.com - the place to shop for all manner of hand crafted
items, including Jewellery, Greetings Cards and Gifts




--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Martin Simmons
> On Fri, 19 Jun 2009 09:51:20 +0200, Tom Sommer said:
> 
> Martin Simmons wrote:
> >> On Thu, 18 Jun 2009 17:11:04 +0200, Michel Meyers said:
> >> 
> >> Martin Simmons wrote:
> >> 
>  On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
>  
>  Martin Simmons wrote:
>  
> >> On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
> >> 
> >> 
> >> Hi,
> >>
> >> I have a somewhat pressing problem with the performance of my Bacula
> >> installation.
> >>
> >> My MySQL database currently holds 247,342,127 (36GB) records in the 
> >> File
> >> table, and 78,576,199 (10GB) records in the Filename table.
> >>
> >> Since 3.0.0, but even more since 3.0.1, I have a problem with queries
> >> being really slow. Basically when doing a full backup of a server
> >> (mailserver, LOTS of small files), I can have my MySQL hanging for up 
> >> to
> >> 24+ hours on queries like this:
> >>
> >> INSERT INTO Filename( Name )
> >> SELECT a.Name
> >> FROM (
> >>
> >> SELECT DISTINCT Name
> >> FROM batch
> >> ) AS a
> >> WHERE NOT
> >> EXISTS (
> >>
> >> SELECT Name
> >> FROM Filename AS f
> >> WHERE f.Name = a.Name
> >> )
> >> 
> >> [...]
> >> 
> Sorry, I don't know how to optimize it.
> >
> These queries are part of the new "batch insert" code.  You can turn that 
> off
> by passing --disable-batch-insert to configure when building Bacula.
>   
>   
>  what does this mean in practice? 1 query per file? or?
>  
> >>> The non-batch approach makes several queries per file.
> >>>   
> >> But it will do so as it backs up the files, as opposed to the batch
> >> approach where it collects all the queries and does them at the end of
> >> the job, right?
> >> 
> >
> > Yes, but if you have spooling turned on, then it will spool the attributes 
> > too
> > and insert them all at the end using several queries per file.
> >   
> Well, the problem is really that first is spends time storing the files,
> and then it spends time saving information about the backup in the database.
> It would make more sense, for larger clients, to spend the idle database
> time (while taking the file backup) more wisely.
> 
> I'm having a hard time judging if turning off batch-inserts is the right
> action, if I would gain anything from this.

AFAIK, the non-batch approach in Bacula 3.x does the same queies as Bacula
2.x.  Since you started the thread by saying that backups are slower in 3.x
because of the queries, turning off batch-inserts might solve that.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Martin Simmons
> On Fri, 19 Jun 2009 03:00:54 +0300, Jari Fredriksson said:
> 
> > 
> > INSERT INTO Filename( Name )
> > SELECT a.Name
> > FROM (
> > 
> > SELECT DISTINCT Name
> > FROM batch
> > ) AS a
> > WHERE NOT
> > EXISTS (
> > 
> > SELECT Name
> > FROM Filename AS f
> > WHERE f.Name = a.Name
> > )
> > 
> 
> That looks silly.
> 
> I would write it shorter as
> 
> INSERT INTO Filename(Name)
> SELECT DISTINCT Name
> FROM batch AS a
> WHERE NOT EXISTS
> (
> SELECT Name
> FROM Filename AS f
> WHERE f.Name = a.Name
> )

Maybe, but perhaps it quicker to remove the duplicates from batch.name before
comparing them to the rows in Filename.name?

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Tom Sommer
Martin Simmons wrote:
>> On Thu, 18 Jun 2009 17:11:04 +0200, Michel Meyers said:
>> 
>> Martin Simmons wrote:
>> 
 On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
 
 Martin Simmons wrote:
 
>> On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
>> 
>> 
>> Hi,
>>
>> I have a somewhat pressing problem with the performance of my Bacula
>> installation.
>>
>> My MySQL database currently holds 247,342,127 (36GB) records in the File
>> table, and 78,576,199 (10GB) records in the Filename table.
>>
>> Since 3.0.0, but even more since 3.0.1, I have a problem with queries
>> being really slow. Basically when doing a full backup of a server
>> (mailserver, LOTS of small files), I can have my MySQL hanging for up to
>> 24+ hours on queries like this:
>>
>> INSERT INTO Filename( Name )
>> SELECT a.Name
>> FROM (
>>
>> SELECT DISTINCT Name
>> FROM batch
>> ) AS a
>> WHERE NOT
>> EXISTS (
>>
>> SELECT Name
>> FROM Filename AS f
>> WHERE f.Name = a.Name
>> )
>> 
>> [...]
>> 
> Sorry, I don't know how to optimize it.
>
> These queries are part of the new "batch insert" code.  You can turn that 
> off
> by passing --disable-batch-insert to configure when building Bacula.
>   
>   
 what does this mean in practice? 1 query per file? or?
 
>>> The non-batch approach makes several queries per file.
>>>   
>> But it will do so as it backs up the files, as opposed to the batch
>> approach where it collects all the queries and does them at the end of
>> the job, right?
>> 
>
> Yes, but if you have spooling turned on, then it will spool the attributes too
> and insert them all at the end using several queries per file.
>   
Well, the problem is really that first is spends time storing the files,
and then it spends time saving information about the backup in the database.
It would make more sense, for larger clients, to spend the idle database
time (while taking the file backup) more wisely.

I'm having a hard time judging if turning off batch-inserts is the right
action, if I would gain anything from this.

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-19 Thread Tom Sommer
Jari Fredriksson wrote:
>> INSERT INTO Filename( Name )
>> SELECT a.Name
>> FROM (
>>
>> SELECT DISTINCT Name
>> FROM batch
>> ) AS a
>> WHERE NOT
>> EXISTS (
>>
>> SELECT Name
>> FROM Filename AS f
>> WHERE f.Name = a.Name
>> )
>>
>> 
>
> That looks silly.
>
> I would write it shorter as
>
> INSERT INTO Filename(Name)
> SELECT DISTINCT Name
> FROM batch AS a
> WHERE NOT EXISTS
> (
> SELECT Name
> FROM Filename AS f
> WHERE f.Name = a.Name
> )
Make a bug report :)

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-18 Thread Jari Fredriksson
> 
> INSERT INTO Filename( Name )
> SELECT a.Name
> FROM (
> 
> SELECT DISTINCT Name
> FROM batch
> ) AS a
> WHERE NOT
> EXISTS (
> 
> SELECT Name
> FROM Filename AS f
> WHERE f.Name = a.Name
> )
> 

That looks silly.

I would write it shorter as

INSERT INTO Filename(Name)
SELECT DISTINCT Name
FROM batch AS a
WHERE NOT EXISTS
(
SELECT Name
FROM Filename AS f
WHERE f.Name = a.Name
)

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-18 Thread Martin Simmons
> On Thu, 18 Jun 2009 17:11:04 +0200, Michel Meyers said:
> 
> Martin Simmons wrote:
> >> On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
> >> Martin Simmons wrote:
>  On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
>  
>  Hi,
> 
>  I have a somewhat pressing problem with the performance of my Bacula
>  installation.
> 
>  My MySQL database currently holds 247,342,127 (36GB) records in the File
>  table, and 78,576,199 (10GB) records in the Filename table.
> 
>  Since 3.0.0, but even more since 3.0.1, I have a problem with queries
>  being really slow. Basically when doing a full backup of a server
>  (mailserver, LOTS of small files), I can have my MySQL hanging for up to
>  24+ hours on queries like this:
> 
>  INSERT INTO Filename( Name )
>  SELECT a.Name
>  FROM (
> 
>  SELECT DISTINCT Name
>  FROM batch
>  ) AS a
>  WHERE NOT
>  EXISTS (
> 
>  SELECT Name
>  FROM Filename AS f
>  WHERE f.Name = a.Name
>  )
> [...]
> >>> Sorry, I don't know how to optimize it.
> >>>
> >>> These queries are part of the new "batch insert" code.  You can turn that 
> >>> off
> >>> by passing --disable-batch-insert to configure when building Bacula.
> >>>   
> >> what does this mean in practice? 1 query per file? or?
> > 
> > The non-batch approach makes several queries per file.
> 
> But it will do so as it backs up the files, as opposed to the batch
> approach where it collects all the queries and does them at the end of
> the job, right?

Yes, but if you have spooling turned on, then it will spool the attributes too
and insert them all at the end using several queries per file.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-18 Thread Michel Meyers
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Martin Simmons wrote:
>> On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
>> Martin Simmons wrote:
 On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
 
 Hi,

 I have a somewhat pressing problem with the performance of my Bacula
 installation.

 My MySQL database currently holds 247,342,127 (36GB) records in the File
 table, and 78,576,199 (10GB) records in the Filename table.

 Since 3.0.0, but even more since 3.0.1, I have a problem with queries
 being really slow. Basically when doing a full backup of a server
 (mailserver, LOTS of small files), I can have my MySQL hanging for up to
 24+ hours on queries like this:

 INSERT INTO Filename( Name )
 SELECT a.Name
 FROM (

 SELECT DISTINCT Name
 FROM batch
 ) AS a
 WHERE NOT
 EXISTS (

 SELECT Name
 FROM Filename AS f
 WHERE f.Name = a.Name
 )
[...]
>>> Sorry, I don't know how to optimize it.
>>>
>>> These queries are part of the new "batch insert" code.  You can turn that 
>>> off
>>> by passing --disable-batch-insert to configure when building Bacula.
>>>   
>> what does this mean in practice? 1 query per file? or?
> 
> The non-batch approach makes several queries per file.

But it will do so as it backs up the files, as opposed to the batch
approach where it collects all the queries and does them at the end of
the job, right?

- - Michel
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (MingW32)

iEYEARECAAYFAko6WQgACgkQ2Vs+MkscAyWeTQCglDfuICz6cZ1eYz06yIl8VzDS
jw8An1Xoa1cje6vI2ChFttOqjvG5FzQa
=KKVz
-END PGP SIGNATURE-


--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-17 Thread Martin Simmons
> On Wed, 17 Jun 2009 13:48:58 +0200, Tom Sommer said:
> 
> Martin Simmons wrote:
> >> On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
> >> 
> >> Hi,
> >>
> >> I have a somewhat pressing problem with the performance of my Bacula
> >> installation.
> >>
> >> My MySQL database currently holds 247,342,127 (36GB) records in the File
> >> table, and 78,576,199 (10GB) records in the Filename table.
> >>
> >> Since 3.0.0, but even more since 3.0.1, I have a problem with queries
> >> being really slow. Basically when doing a full backup of a server
> >> (mailserver, LOTS of small files), I can have my MySQL hanging for up to
> >> 24+ hours on queries like this:
> >>
> >> INSERT INTO Filename( Name )
> >> SELECT a.Name
> >> FROM (
> >>
> >> SELECT DISTINCT Name
> >> FROM batch
> >> ) AS a
> >> WHERE NOT
> >> EXISTS (
> >>
> >> SELECT Name
> >> FROM Filename AS f
> >> WHERE f.Name = a.Name
> >> )
> >>
> >> with the status "Sending data", and a lot of other similar queries in
> >> queue with the status "Locked". One of these queries take approx 10.000
> >> seconds to execute, but is just followed by another similar (identical)
> >> query with the same duration. This is a problem mostly because it
> >> prevents me doing restores while backups are "Dir inserting attributes"
> >>
> >> Obviously I would think it's a MySQL performance issue, but I was
> >> wondering if anything had been done to the queries? They seem to be a
> >> LOT slower and a LOT heavier.
> >> I've just put more RAM into the server, but it's done little to improve
> >> the duration of the queries. My server now has 4GB RAM (will update to
> >> 6GB) - but again it's a recent issue, because Bacula has been running
> >> perfectly for many months on 2GB RAM, until I updated to 3.0.1
> >>
> >> I've done REPAIR TABLE and OPTIMIZE TABLE and seen no improvement.
> >>
> >> Finally if anyone have any specific ideas to improve performance on my
> >> huge SQL database, please share :)
> >> 
> >
> > Sorry, I don't know how to optimize it.
> >
> > These queries are part of the new "batch insert" code.  You can turn that 
> > off
> > by passing --disable-batch-insert to configure when building Bacula.
> >   
> what does this mean in practice? 1 query per file? or?

The non-batch approach makes several queries per file.


> My real problem is these huge queries lock the File table, so no other
> jobs can touch it for many hours.

The batch queries update the Filename and Path tables atomically, so locking
is expected.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-17 Thread Tom Sommer
Martin Simmons wrote:
>> On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
>> 
>> Hi,
>>
>> I have a somewhat pressing problem with the performance of my Bacula
>> installation.
>>
>> My MySQL database currently holds 247,342,127 (36GB) records in the File
>> table, and 78,576,199 (10GB) records in the Filename table.
>>
>> Since 3.0.0, but even more since 3.0.1, I have a problem with queries
>> being really slow. Basically when doing a full backup of a server
>> (mailserver, LOTS of small files), I can have my MySQL hanging for up to
>> 24+ hours on queries like this:
>>
>> INSERT INTO Filename( Name )
>> SELECT a.Name
>> FROM (
>>
>> SELECT DISTINCT Name
>> FROM batch
>> ) AS a
>> WHERE NOT
>> EXISTS (
>>
>> SELECT Name
>> FROM Filename AS f
>> WHERE f.Name = a.Name
>> )
>>
>> with the status "Sending data", and a lot of other similar queries in
>> queue with the status "Locked". One of these queries take approx 10.000
>> seconds to execute, but is just followed by another similar (identical)
>> query with the same duration. This is a problem mostly because it
>> prevents me doing restores while backups are "Dir inserting attributes"
>>
>> Obviously I would think it's a MySQL performance issue, but I was
>> wondering if anything had been done to the queries? They seem to be a
>> LOT slower and a LOT heavier.
>> I've just put more RAM into the server, but it's done little to improve
>> the duration of the queries. My server now has 4GB RAM (will update to
>> 6GB) - but again it's a recent issue, because Bacula has been running
>> perfectly for many months on 2GB RAM, until I updated to 3.0.1
>>
>> I've done REPAIR TABLE and OPTIMIZE TABLE and seen no improvement.
>>
>> Finally if anyone have any specific ideas to improve performance on my
>> huge SQL database, please share :)
>> 
>
> Sorry, I don't know how to optimize it.
>
> These queries are part of the new "batch insert" code.  You can turn that off
> by passing --disable-batch-insert to configure when building Bacula.
>   
what does this mean in practice? 1 query per file? or?

My real problem is these huge queries lock the File table, so no other
jobs can touch it for many hours.

// Tom

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-17 Thread Martin Simmons
> On Tue, 16 Jun 2009 15:05:18 +0200, Tom Sommer said:
> 
> Hi,
> 
> I have a somewhat pressing problem with the performance of my Bacula
> installation.
> 
> My MySQL database currently holds 247,342,127 (36GB) records in the File
> table, and 78,576,199 (10GB) records in the Filename table.
> 
> Since 3.0.0, but even more since 3.0.1, I have a problem with queries
> being really slow. Basically when doing a full backup of a server
> (mailserver, LOTS of small files), I can have my MySQL hanging for up to
> 24+ hours on queries like this:
> 
> INSERT INTO Filename( Name )
> SELECT a.Name
> FROM (
> 
> SELECT DISTINCT Name
> FROM batch
> ) AS a
> WHERE NOT
> EXISTS (
> 
> SELECT Name
> FROM Filename AS f
> WHERE f.Name = a.Name
> )
> 
> with the status "Sending data", and a lot of other similar queries in
> queue with the status "Locked". One of these queries take approx 10.000
> seconds to execute, but is just followed by another similar (identical)
> query with the same duration. This is a problem mostly because it
> prevents me doing restores while backups are "Dir inserting attributes"
> 
> Obviously I would think it's a MySQL performance issue, but I was
> wondering if anything had been done to the queries? They seem to be a
> LOT slower and a LOT heavier.
> I've just put more RAM into the server, but it's done little to improve
> the duration of the queries. My server now has 4GB RAM (will update to
> 6GB) - but again it's a recent issue, because Bacula has been running
> perfectly for many months on 2GB RAM, until I updated to 3.0.1
> 
> I've done REPAIR TABLE and OPTIMIZE TABLE and seen no improvement.
> 
> Finally if anyone have any specific ideas to improve performance on my
> huge SQL database, please share :)

Sorry, I don't know how to optimize it.

These queries are part of the new "batch insert" code.  You can turn that off
by passing --disable-batch-insert to configure when building Bacula.

__Martin

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


[Bacula-users] Performance with MySQL queries since 3.0.0 (Dir inserting attributes hang)

2009-06-16 Thread Tom Sommer
Hi,

I have a somewhat pressing problem with the performance of my Bacula
installation.

My MySQL database currently holds 247,342,127 (36GB) records in the File
table, and 78,576,199 (10GB) records in the Filename table.

Since 3.0.0, but even more since 3.0.1, I have a problem with queries
being really slow. Basically when doing a full backup of a server
(mailserver, LOTS of small files), I can have my MySQL hanging for up to
24+ hours on queries like this:

INSERT INTO Filename( Name )
SELECT a.Name
FROM (

SELECT DISTINCT Name
FROM batch
) AS a
WHERE NOT
EXISTS (

SELECT Name
FROM Filename AS f
WHERE f.Name = a.Name
)

with the status "Sending data", and a lot of other similar queries in
queue with the status "Locked". One of these queries take approx 10.000
seconds to execute, but is just followed by another similar (identical)
query with the same duration. This is a problem mostly because it
prevents me doing restores while backups are "Dir inserting attributes"

Obviously I would think it's a MySQL performance issue, but I was
wondering if anything had been done to the queries? They seem to be a
LOT slower and a LOT heavier.
I've just put more RAM into the server, but it's done little to improve
the duration of the queries. My server now has 4GB RAM (will update to
6GB) - but again it's a recent issue, because Bacula has been running
perfectly for many months on 2GB RAM, until I updated to 3.0.1

I've done REPAIR TABLE and OPTIMIZE TABLE and seen no improvement.

Finally if anyone have any specific ideas to improve performance on my
huge SQL database, please share :)

Thanks
--
Tom

--
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing 
server and web deployment.
http://p.sf.net/sfu/businessobjects
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users