Re: [firebird-support] why Blob is so slow ?

2012-05-02 Thread Alexandre Benson Smith
On 1/5/2012 14:51, Fabricio Araujo wrote:
 Remember, Alexandre, GBAK (and the Services API) is a data-pump-style backup,
 different from NBAK (which, AFAIR, restores database pages instead of the
 logical structure),
 which makes me think: did you try that restore on heavily fragmented
 storage?

 Since GBAK works as a data pump, it certainly makes the server grow the
 *.fdb file many times. It would be nice if we could (if it doesn't already
 do that) specify a file size on restore, so that the file could be created
 using instant file initialization
 (I know it has something to do with a volume operation, since the
 service user needs to have the disk volume operations permission on
 Windows; MSSQL uses that and cuts multi-GB restore times in half, or less).



Yes, I know, but the same happens with a growing database file that holds 
just simple types (varchar, date, integer, etc.), and the time is 
considerably different. There is a new feature in Fb (I can't remember in 
which version) that grows the database by more than one page at a time. It 
was implemented to avoid disk-full problems, but as a side effect it could 
improve the restore time a lot.
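
To put a rough number on the "more than one page at a time" point, here is a minimal sketch that only counts how many times a file has to be extended to reach a given size. The 300 MB figure is roughly the size of the test database mentioned elsewhere in this thread; the 4 KB page size and the 128 MB growth increment are assumptions picked for illustration, not values read from any Firebird configuration.

```python
def extension_count(final_size: int, increment: int) -> int:
    """Number of extend operations needed to reach final_size bytes
    when the file grows by `increment` bytes each time."""
    if final_size < 0 or increment <= 0:
        raise ValueError("final_size must be >= 0 and increment must be > 0")
    # Integer ceiling division, so no floating point is involved.
    return (final_size + increment - 1) // increment


MB = 1024 * 1024
db_size = 300 * MB    # roughly the size of the test database
page_size = 4096      # assumed page size
chunk = 128 * MB      # assumed growth increment

print("extends when growing page by page:", extension_count(db_size, page_size))
print("extends when growing in chunks:   ", extension_count(db_size, chunk))
```

The count says nothing about how expensive each extend is on a given filesystem; it only shows why fewer, larger extends could plausibly help a restore.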

But the case I faced has something to do with my hardware and/or 
filesystem. The very same back-up restored in under 3s on Cantu's and 
Kuzmenko's computers, whereas on my server it needs more than 10 minutes 
to finish. I did another test on my notebook, outside my VM, and it took 
under 3s too.

I ruled hardware out too fast. I had faced the slow restore at a 
customer and then did a test on my server, and on both the time was very 
long. This led me to rule out hardware, but perhaps the filesystem on both 
machines is the same. Unfortunately I did not have remote access to 
that customer's server, and I have not visited them since, so I cannot 
tell anything about the customer's filesystem.

Something is very different about restoring blobs compared to simple types. 
I know it's comparing apples to oranges, but two databases of the same size 
have considerably different back-up/restore times depending on whether they 
are made of simple datatypes or blob content. I know it's not a fair 
comparison, but I think it's not completely invalid either.

I am very busy these days, but I will perform more tests on distinct 
hardware to get some numbers on it.

Thanks for your message.

see you !




++

Visit http://www.firebirdsql.org and click the Resources item
on the main (top) menu.  Try Knowledgebase and FAQ links !

Also search the knowledgebases at http://www.ibphoenix.com 

++


Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Alexandre Benson Smith
Hi Roberto,

On 19/4/2012 08:52, Tupy... nambá wrote:
 Alexandre,
 From my point of view, I prefer to avoid using BLOB fields. First of all, 
 because this kind of field is not suited to searches of any kind (most of 
 them are pictures). Second, because they normally have very large content, 
 which makes the DB grow by a large amount. I think the most important 
 property of DBs is the capability to search. Fields that don't allow us to 
 do that undermine the functionality of DBs.
 I prefer to store files outside the DB, storing inside it the path to 
 the files. That way you keep the speed of all operations (searches and 
 backups/restores) without a meaningful increase in DB size.

 I'm not sure about the reasons for the backup/restore speed problem, but I 
 believe that inside the DB almost the same thing happens as at the OS level: 
 when adjacent areas are full, the OS or the DB manager application must 
 look for distant areas to store parts of the data, causing data 
 fragmentation. And to access the complete data, the OS or DB manager must 
 reassemble the parts before delivering them to the client. And the DB itself 
 suffers from fragmentation of the DB file at disk level.
 On file servers, file fragmentation is normally low (you don't edit files 
 directly on the server) and you can still defragment the files. 
 For SQL Server, you find discussions about internal table and index 
 fragmentation, and you have commands to repair fragmentation.
 For Firebird/InterBase, nobody talks about that, but we know it happens and 
 can become a problem when the DB gets larger. BLOBs are the worst for 
 causing that, affecting not only the BLOB fields and data themselves, but 
 also fields and data of other data types. And you don't have (I have never 
 seen) commands for internal DB defragmentation.
 Try some experiments on that, comparing different solutions to the same 
 problem. A DB filled all at once may not show great differences, but DBs 
 filled normally (day by day) will, after a long time, show meaningful 
 differences. 
 Roberto Camargo, Rio de Janeiro / Brazil


In the past I used the approach of storing just the filename, and I still 
use it in some cases, but when everything is inside the database it's easier 
to be sure that back-up/restore covers everything, to move the content 
around, and to get transaction control (all the ACID features), which would 
need to be re-implemented if I worked at the filesystem level. Since you are 
in Brazil I can point to a case where storing blobs is almost 
mandatory:
The storage of the XML files of the Nota Fiscal Eletronica (electronic 
invoice). We need to keep the data for the legal periods specified in 
our legislation, and handling thousands (millions?) of individual 
files on the filesystem is not the best option in my point of view. It's 
much easier to be sure that everything is secure inside the database.

I disagree with you that the main feature of an RDBMS is search. Search 
is a part of the whole system, but the main feature, in my point of view, 
is to store data. :) Of course there is no sense in storing something if 
you cannot search for it, but a product that stores data efficiently and 
does not search it so efficiently could still be called an RDBMS, while 
the other way around is not possible. Quoting Ann Harrison off the top 
of my head (probably not the exact words): if you don't need a correct 
answer, the answer is 13.

I don't use blobs that much, but in some cases I think they are a good 
solution.

Anyway, thanks for sharing your thoughts. I know that storing large binary 
data inside or outside the database is the kind of thing for which there is 
no rule of thumb; I use both approaches myself for distinct use cases.

My concern is that something is strange regarding blob manipulation. 
It's too slow for me.

see you !

Alexandre


Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Tupy . . . nambá
Hi, Alexandre,
For the example you gave (NFE), I agree with you: the number of files 
generated will be very large and each file is not so big, so they will 
probably not become a problem. And, in this case, they are part of a 
transaction. Probably not, but I'm not sure; one has to make comparisons to 
be sure about the best solution. I was speaking in a generic way, especially 
about cases where we have contracts, photos, and other non-transactional 
documents.

But with many NFEs (as many as the transactions), don't you agree that these 
BLOBs will be a great source of fragmentation inside the DB?
And, if I'm right in my thinking, as Firebird doesn't have a way to 
defragment inside the DB, you don't have a way to resolve this.
Maybe, to have a good solution for this kind of business, one would have to 
use MS SQL Server to periodically defragment the DB, or another DB that has 
this functionality. I searched for something like this in Postgres and found 
a command named VACUUM that does something like this. Think about all of 
this, if you want. If you have to have BLOBs, I think Firebird is not a good 
solution for a great number of them. My thoughts; you don't need to agree.
Friendly, best regards, Roberto Camargo.



Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Alexandre Benson Smith
On 19/4/2012 12:13, Tupy... nambá wrote:
 Hi, Alexandre,
 For the example you gave (NFE), I agree with you: the number of files 
 generated will be very large and each file is not so big, so they will 
 probably not become a problem. And, in this case, they are part of a 
 transaction. Probably not, but I'm not sure; one has to make comparisons 
 to be sure about the best solution. I was speaking in a generic way, 
 especially about cases where we have contracts, photos, and other 
 non-transactional documents.

 But with many NFEs (as many as the transactions), don't you agree that 
 these BLOBs will be a great source of fragmentation inside the DB?
 And, if I'm right in my thinking, as Firebird doesn't have a way to 
 defragment inside the DB, you don't have a way to resolve this.
 Maybe, to have a good solution for this kind of business, one would have to 
 use MS SQL Server to periodically defragment the DB, or another DB that 
 has this functionality. I searched for something like this in Postgres and 
 found a command named VACUUM that does something like this. Think about all 
 of this, if you want. If you have to have BLOBs, I think Firebird is not a 
 good solution for a great number of them. My thoughts; you don't need to 
 agree.
 Friendly, best regards, Roberto Camargo.



I used MSSQL 6.5 (yes, that was a long time ago), so I can't comment on the 
need for defragmentation.
I don't know Postgres, but I think VACUUM is similar to FB garbage 
collection.

There is a way to defragment FB: do a back-up/restore. But I don't 
think it's needed; at least I have never had the need for such an operation.

A big blob will be stored in a bunch of pages that tend to be 
contiguous at the end of the file (yes, I know unused pages are reused), 
so I don't think that's the reason.

A typical NFE is around 10KB. Depending on the page size it could 
be stored with the record, or be stored in two blob pages with just the 
blob id on the record page. Anyway, I prefer to have a separate table to 
hold the blobs, because in my case the blobs are not accessed that 
often, so I prefer to have as many records per page as I can, and to read a 
separate table (and therefore separate pages) to get the blob contents when 
I need them.
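
As a rough illustration of the page-size reasoning above, this sketch estimates how many whole pages a blob of a given size would occupy. It ignores page headers and any other per-page overhead (an assumption, so real numbers can only be the same or slightly worse), and the page sizes in the loop are just common values, so treat the output as a back-of-the-envelope figure rather than Firebird's actual layout rules.

```python
def pages_for_blob(blob_bytes: int, page_size: int) -> int:
    """Whole pages needed to hold blob_bytes, ignoring per-page overhead."""
    if blob_bytes < 0 or page_size <= 0:
        raise ValueError("blob_bytes must be >= 0 and page_size must be > 0")
    # Integer ceiling division.
    return (blob_bytes + page_size - 1) // page_size


nfe_bytes = 10 * 1024  # "around 10KB" per NFE, as stated above

for page_size in (4096, 8192, 16384):
    print(f"page size {page_size:>5}: {pages_for_blob(nfe_bytes, page_size)} page(s)")
```

With 8 KB pages the estimate comes out at two pages, which matches the "two blob pages" case; with 16 KB pages the whole blob is smaller than a single page, which is the situation where it could end up stored with the record.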

It's good to read your thoughts, I am just arguing about the options :)

see you !


Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Alexandre Benson Smith
On 19/4/2012 12:28, Carlos H. Cantu wrote:
 Sorry, but the discussion is going off-topic from the original
 question, which is: why is backup/restore of blobs so much slower
 than that of non-blob data? I'm also curious about this.

 Carlos
 Firebird Performance in Detail - http://videos.firebirddevelopersday.com
 www.firebirdnews.org - www.FireBase.com.br


I have noticed this slowness for some time, but never created a test case 
so that it could be measured.

I am sending a back-up to Dmitry Kuzmenko (as he asked) so he can 
take a look.

I really don't know what's happening, but it's strange to me.

I think that profiling the gbak and fb server processes during the 
restore could show where the time is spent and shed some light.
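
Short of a real profiler, a first step is simply to time the restore in a repeatable way on each machine. The helper below is a minimal sketch that times any command. The gbak arguments shown in the comment are placeholders modeled on the command line quoted elsewhere in this thread, and the file names are hypothetical.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, completed.returncode


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. For a real measurement,
    # replace it with the restore command, for example (placeholder values):
    #   ["/opt/firebird/bin/gbak", "-c", "blob_test.fbk", "restored.fdb",
    #    "-user", "sysdba", "-password", "..."]
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, elapsed {elapsed:.3f}s")
```

Running the same helper against a blob-heavy backup and a simple-types backup of similar size, on each machine, would give comparable wall-clock numbers before digging deeper.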

see you !


Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Tupy . . . nambá
MSSQL has two DBCC commands that allow you to defragment. Defragmentation 
is not garbage collection; it puts all the parts of an object (file or 
columns, depending on the level: disk or DB) side by side, so that reading 
the data is faster, because all the data is found close together. Normally, 
this is the way to get quick reads of data. Garbage collection is more like 
removing deleted data.
From what I quickly read on some PostgreSQL pages, VACUUM has to be a 
defragment command for PostgreSQL.

Since you know that you can defragment a Firebird DB by restoring it, you 
can do a restore and compare the read times in the two situations. If you 
get a meaningful increase in read speed (SELECTs and so on) after the 
restore, it will mean that your problem is high fragmentation.
Also, after the restore, you can do a new backup and then a second restore, 
and see whether the time goes down. The first restore should take long, but 
the second should not, because the second backup will store defragmented 
data.
If you can, please try it; so far, all I have are theories. Your results 
will be interesting for all of us.

Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Ann Harrison
On Thu, Apr 19, 2012 at 11:13 AM, Tupy... nambá anhangu...@yahoo.com wrote:


 But with many NFEs (as many as the transactions), don't you agree that
 these BLOBs will be a great source of fragmentation inside the DB?


Err, no.  It's not.  I'm not 100% sure what you mean by fragmentation, but
all data, metadata, blobs, internal structure and state are kept on fixed
sized pages in a single file.  Yes, if you're running on a disk that's full
and fragmented, that file will be scattered around the disk, but inside,
it's quite tidy.


 And, if I'm right in my thinking, as Firebird doesn't have a way to
 defragment inside the DB, you don't have a way to resolve this.


When pages are released, they're reused.


 Maybe, to have a good solution for this kind of business, one would have to
 use MS SQL Server to periodically defragment the DB, or another DB
 that has this functionality. I searched for something like this in Postgres
 and found a command named VACUUM that does something like this. Think about
 all of this, if you want. If you have to have BLOBs, I think Firebird is not
 a good solution for a great number of them. My thoughts; you don't need to
 agree.


The PostgreSQL vacuum is similar to Firebird's continuous, on-line garbage
collection, except that it's a separate, off-line command.

Good luck,

Ann





Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Alexandre Benson Smith
On 19/4/2012 13:18, Tupy... nambá wrote:
 MSSQL has two DBCC commands that allow you to defragment. Defragmentation 
 is not garbage collection; it puts all the parts of an object (file or 
 columns, depending on the level: disk or DB) side by side, so that reading 
 the data is faster, because all the data is found close together. Normally, 
 this is the way to get quick reads of data. Garbage collection is more like 
 removing deleted data.
 From what I quickly read on some PostgreSQL pages, VACUUM has to be a 
 defragment command for PostgreSQL.

 Since you know that you can defragment a Firebird DB by restoring it, you 
 can do a restore and compare the read times in the two situations. If you 
 get a meaningful increase in read speed (SELECTs and so on) after the 
 restore, it will mean that your problem is high fragmentation.
 Also, after the restore, you can do a new backup and then a second restore, 
 and see whether the time goes down. The first restore should take long, but 
 the second should not, because the second backup will store defragmented 
 data.
 If you can, please try it; so far, all I have are theories. Your results 
 will be interesting for all of us.


I didn't say that garbage collection is the same as defragmentation in 
MSSQL. I said that I don't know about PG, but I *think* VACUUM is the 
same as FB garbage collection :) and I didn't say that I am sure about it.

All the tests were done on a freshly restored DB, so it's not fragmented. 
The slowness is in the back-up/restore of a freshly created test database.

At this moment I am doing tests with Carlos Cantu and Dmitry Kuzmenko, 
and the culprit so far is my machine: on their machines (both!) the 
restore took 3s, on mine 10 minutes!

I am testing on ext3 and ext4 partitions and I will do more tests on 
another machine, so I can isolate hardware as a factor.

see you !


Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Mark Rotteveel
On 19-4-2012 18:34, Tupy... nambá wrote:
 One more thing: it doesn't matter if you have the blob field in a separate 
 table. Since everything is together in the same DB file, blobs may cause 
 fragmentation; no one can ensure where in the DB file they will be written, 
 and they will probably be written in the middle of other non-blob 
 columns/fields.
 If you have a separate DB for the blob-field tables, you will not have 
 this problem, but then you will have new ACIDity problems. If Firebird had 
 something like MSSQL Server's Linked Servers, then you could still have 
 integration between the two DBs, having the best of both (no fragmentation 
 in one / blobs in the other).

Firebird has distributed transactions, so if you really want to use two 
databases then you can use a two-phase commit to maintain ACID.
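
For readers unfamiliar with the idea, here is a toy, in-memory sketch of the two-phase commit protocol itself. It does not use Firebird or any database driver; the `Participant` class and the `two_phase_commit` function are hypothetical names invented for this illustration, standing in for the two databases (one for the regular tables, one for the blobs).

```python
class Participant:
    """Stand-in for one database taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulated outcome of the prepare step
        self.state = "active"

    def prepare(self):
        # Phase 1: promise to commit (vote yes) or refuse (vote no).
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes in phase 1."""
    if all(p.prepare() for p in participants):
        # Phase 2a: unanimous yes, so make the change permanent everywhere.
        for p in participants:
            p.commit()
        return True
    # Phase 2b: at least one no, so undo the work everywhere.
    for p in participants:
        p.rollback()
    return False


main_db = Participant("main")
blob_db = Participant("blobs")
print(two_phase_commit([main_db, blob_db]), main_db.state, blob_db.state)
```

The point of the protocol is that the invoice row and its blob either both become visible or both disappear, even though they live in different database files.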

Mark
-- 
Mark Rotteveel


Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Lester Caine
Alexandre Benson Smith wrote:
 At this moment I am doing tests with Carlos Cantu and Dmitry Kuzmenko,
 and the culprit so far is my machine: on their machines (both!) the
 restore took 3s, on mine 10 minutes!

 I am testing on ext3 and ext4 partitions and I will make more tests on
 another machine, so I can isolate hardware as a factor.

It is a little amazing at times when some things work fast on one machine and a 
lot slower on another, but with the sort of problem you are seeing I would check 
that there is not a problem with the hard disc. I've seen that sort of effect when 
the controller is having trouble reading a disk. It WILL read the data 
eventually, but keeps winding the heads back to '0' and repositioning for each 
block read. Replacing the hard disk and restoring the data invariably cleared 
the problem. Had it a couple of times now - 'Maxtor' discs have been stripped 
from all my customer machines now!

-- 
Lester Caine - G8HFL
-
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php


Re: [firebird-support] why Blob is so slow ?

2012-04-19 Thread Alexandre Benson Smith
On 19/4/2012 16:28, Carlos H. Cantu wrote:
 LC  It is a little amazing at times when some things work fast on one machine
 LC  and a lot slower on another, but with the sort of problem you are seeing I
 LC  would check that there is not a problem with the hard disc. I've seen that
 LC  sort of effect when the controller is having trouble reading a disk. It
 LC  WILL read the data eventually, but keeps winding the heads back to '0' and
 LC  repositioning for each block read. Replacing the hard disk and restoring
 LC  the data invariably cleared the problem. Had it a couple of times now -
 LC  'Maxtor' discs have been stripped from all my customer machines now!

 My guess is that the time differences are also related to the
 configuration of the file system used on his Linux server (i.e. barrier
 and other params). Kuzmenko and I tested on Windows machines.

 Carlos
 Firebird Performance in Detail - http://videos.firebirddevelopersday.com
 www.firebirdnews.org - www.FireBase.com.br


I am still doing some tests to try to identify the culprit.

I tested on another Linux machine and the restore took under 3s, but I 
can't compare because that machine uses SCSI disks in RAID, and mine has 
a simple (and pretty old) SATA disk.

I will test on some real hardware and report back.

I had ruled out hardware/file system too fast; that's the reason I posted 
the original message. The reason I ruled out hardware/file system 
configuration is that I noticed the slowdown at a client site and, when I 
then tested on my server, I saw the same speed problem... But I think that 
both servers (mine and my customer's) have something weird (perhaps 
filesystem options, as pointed out by Carlos).
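
One way to check the filesystem-options theory without involving Firebird at all is to measure how long small synchronous writes take on the suspect disk. The sketch below writes a number of page-sized blocks and calls fsync after each one. The block size, block count and function name are arbitrary choices for illustration, and how closely this pattern resembles what the server does during a restore is an assumption.

```python
import os
import tempfile
import time


def timed_sync_writes(directory, blocks=200, block_size=4096):
    """Write `blocks` blocks of `block_size` bytes into a temporary file in
    `directory`, calling fsync after every block; return elapsed seconds."""
    payload = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(blocks):
            os.write(fd, payload)
            os.fsync(fd)  # force the block to stable storage before continuing
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # The system temp directory may not be on the disk under suspicion;
    # pass the directory that holds the database file instead.
    elapsed = timed_sync_writes(tempfile.gettempdir())
    print(f"{elapsed:.3f}s for 200 synced 4 KB writes")
```

If the same run takes milliseconds on one machine and many seconds on another, the difference is in the storage stack (disk, controller, or mount options) rather than in the blob handling.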

Unfortunately I had no remote access to that server.

Thanks for all the input and to Carlos and Dmitry for the time to 
perform the tests.

see you !


Re: [firebird-support] why Blob is so slow ?

2012-04-18 Thread Dmitry Kuzmenko
Hello, Alexandre!

Thursday, April 19, 2012, 2:12:02 AM, you wrote:

ABS # time /opt/firebird/bin/gbak blob_test.fdb blob_test.fbk -user sysdba
ABS -password masterkey -t

Stop using the -t option, it's already the default :-)

ABS real 10m8.894s

10 minutes to restore ~300mb database? incredible, can't believe that.
Can you put zip/rar of this backup somewhere on ftp or http?

-- 
Dmitry Kuzmenko, www.ib-aid.com