[firebird-support] Re: Best way to delete millions of rows
Hey everyone First of all thanks for helping me, finding a solution to my problem. I was unable to reproduce the problem once i used sequential generated guid's. To do so i used the c# code from this site, if it has any interest. http://www.codeproject.com/Articles/388157/GUIDs-as-fast-primary-keys-under-multiple-database http://www.codeproject.com/Articles/388157/GUIDs-as-fast-primary-keys-under-multiple-database Another problem i found during my tests is that, if garbage collection is deferred to backup/sweep, by executing backup/sweep instead of "select count(*) from table", Firebird will become unresponsive for minutes. Later today or tomorrow i will file a bug in the firebird tracker, about the problems I have encountered with index garbage collection on varchar index, causing the server to become unresponsive. Because there is a lot of scenarios where we use index on varchar columns that cannot be made sequential and we need to delete old data on a regular basis. In our domain this is social security number, login tokens, oauth access/refresh tokens, guids from third party applications and so on. //Thomas
Re: [firebird-support] Re: Best way to delete millions of rows
23.11.2016 9:06, Norbert Saint Georges n...@tetrasys.eu [firebird-support] wrote: > For tables of several hundred millions of recordings, if we organize > the first eight bytes in a less cahotic way, this will lighten Firebird > by making it faster? Yes, that's why sequential GUID was invented. -- WBR, SD. ++ Visit http://www.firebirdsql.org and click the Documentation item on the main (top) menu. Try FAQ and other links from the left-side menu there. Also search the knowledgebases at http://www.ibphoenix.com/resources/documents/ ++ Yahoo Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/firebird-support/ <*> Your email settings: Individual Email | Traditional <*> To change settings online go to: http://groups.yahoo.com/group/firebird-support/join (Yahoo! ID required) <*> To change settings via email: firebird-support-dig...@yahoogroups.com firebird-support-fullfeatu...@yahoogroups.com <*> To unsubscribe from this group, send an email to: firebird-support-unsubscr...@yahoogroups.com <*> Your use of Yahoo Groups is subject to: https://info.yahoo.com/legal/us/yahoo/utos/terms/
[firebird-support] Re: Best way to delete millions of rows
Dimitry Sibiryakov s...@ibphoenix.com [firebird-support] a écrit : >For that Microsoft invented "sequential GUID" quite a lot time ago. > >BTW: CHAR(16) CHARACTER SET OCTETS is a better way to keep GUID in > database. For tables of several hundred millions of recordings, if we organize the first eight bytes in a less cahotic way, this will lighten Firebird by making it faster? -- Norbert Saint Georges http://tetrasys.fi ++ Visit http://www.firebirdsql.org and click the Documentation item on the main (top) menu. Try FAQ and other links from the left-side menu there. Also search the knowledgebases at http://www.ibphoenix.com/resources/documents/ ++ Yahoo Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/firebird-support/ <*> Your email settings: Individual Email | Traditional <*> To change settings online go to: http://groups.yahoo.com/group/firebird-support/join (Yahoo! ID required) <*> To change settings via email: firebird-support-dig...@yahoogroups.com firebird-support-fullfeatu...@yahoogroups.com <*> To unsubscribe from this group, send an email to: firebird-support-unsubscr...@yahoogroups.com <*> Your use of Yahoo Groups is subject to: https://info.yahoo.com/legal/us/yahoo/utos/terms/
Re: [firebird-support] Re: Best way to delete millions of rows
22.11.2016 21:01, 'Leyne, Sean' s...@broadviewsoftware.com [firebird-support] wrote: >> There would be a lot of advantage, for Firebird, in using this kind of guid? > > Yes, Firebird indexes use prefix compression so the leading 14 chars of the > above values > would be stored under a single entry. For that Microsoft invented "sequential GUID" quite a lot time ago. BTW: CHAR(16) CHARACTER SET OCTETS is a better way to keep GUID in database. -- WBR, SD. ++ Visit http://www.firebirdsql.org and click the Documentation item on the main (top) menu. Try FAQ and other links from the left-side menu there. Also search the knowledgebases at http://www.ibphoenix.com/resources/documents/ ++ Yahoo Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/firebird-support/ <*> Your email settings: Individual Email | Traditional <*> To change settings online go to: http://groups.yahoo.com/group/firebird-support/join (Yahoo! ID required) <*> To change settings via email: firebird-support-dig...@yahoogroups.com firebird-support-fullfeatu...@yahoogroups.com <*> To unsubscribe from this group, send an email to: firebird-support-unsubscr...@yahoogroups.com <*> Your use of Yahoo Groups is subject to: https://info.yahoo.com/legal/us/yahoo/utos/terms/
RE: [firebird-support] Re: Best way to delete millions of rows
> > @Ann > > I will run some test and see what happens if the guids are generated > > in way where the last part varies. Like this: > > 39db9ec6-178e-77b4-5d7b-d4e969b0cd98 > > 39db9ec6-178e-e4ba-54ed-92347a131663 > > 39db9ec6-178e-c95b-c709-a42e349410df > > There would be a lot of advantage, for Firebird, in using this kind of guid? Yes, Firebird indexes use prefix compression so the leading 14 chars of the above values would be stored under a single entry. The reality is that UUID/GUID while "universally" unique do have parts which are unique to a source system, so they are common for all UUID/GUID from that system. Sean
[firebird-support] Re: Best way to delete millions of rows
kragh.tho...@yahoo.com [firebird-support] a écrit : > @Ann > I will run some test and see what happens if the guids are generated in way > where the last part varies. Like this: > 39db9ec6-178e-77b4-5d7b-d4e969b0cd98 > 39db9ec6-178e-e4ba-54ed-92347a131663 > 39db9ec6-178e-c95b-c709-a42e349410df There would be a lot of advantage, for Firebird, in using this kind of guid? -- Norbert Saint Georges http://tetrasys.fi ++ Visit http://www.firebirdsql.org and click the Documentation item on the main (top) menu. Try FAQ and other links from the left-side menu there. Also search the knowledgebases at http://www.ibphoenix.com/resources/documents/ ++ Yahoo Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/firebird-support/ <*> Your email settings: Individual Email | Traditional <*> To change settings online go to: http://groups.yahoo.com/group/firebird-support/join (Yahoo! ID required) <*> To change settings via email: firebird-support-dig...@yahoogroups.com firebird-support-fullfeatu...@yahoogroups.com <*> To unsubscribe from this group, send an email to: firebird-support-unsubscr...@yahoogroups.com <*> Your use of Yahoo Groups is subject to: https://info.yahoo.com/legal/us/yahoo/utos/terms/
[firebird-support] Re: Best way to delete millions of rows
@Vlad I tried with 1024, but with no luck - same thing happens. @Ann I will run some test and see what happens if the guids are generated in way where the last part varies. Like this: 39db9ec6-178e-77b4-5d7b-d4e969b0cd98 39db9ec6-178e-e4ba-54ed-92347a131663 39db9ec6-178e-c95b-c709-a42e349410df
[firebird-support] Re: Best way to delete millions of rows
@Vlad I tried with 1024, but with no luck - same thing happens. @Ann I will run some test and see what happens if the guids are generated in way where the last part varies. Like this: 39db9ec6-178e-77b4-5d7b-d4e969b0cd98 39db9ec6-178e-e4ba-54ed-92347a131663 39db9ec6-178e-c95b-c709-a42e349410df
[firebird-support] Re: Best way to delete millions of rows
---In firebird-support@yahoogroups.com, wrote : > Pagesize is 16384 It is OK > and pagebuffers is 256. It is *terrible* ! Set it to 1024 and look what happens > Servermode is superclassic. Not matters in this test > If I drop the primary key/index everything works as expected. Sure :) Regards, Vlad
Re: [firebird-support] Re: Best way to delete millions of rows
On Nov 22, 2016, at 6:56 AM, kragh.tho...@yahoo.com [firebird-support] wrote: > > Pagesize is 16384 and pagebuffers is 256. Servermode is superclassic. > If I drop the primary key/index everything works as expected. > Interesting. GUIDs produce really fat indexes. Firebird uses prefix compression and GUIDs vary in the first bytes so they don't compress well. Keys generated sequentially will be on different index leaf pages. Since records are deleted and garbage collected in the order they were stored, index garbage collection could easily touch a different page for each key. With only 256 buffers, index garbage collection is running pages through the cache wildly - with luck they're cached by the file system. I'm not saying that shutting down the machine for seconds or minutes at a time is acceptable, but that the place to look is in index garbage collection with a small page cache and large randomly generated key values. For what little it's worth, I'd bet that a 64bit primary key generated by a sequence would not show this problem. Nor, for that matter would a GUID that had been re-organized to put the volatile bytes at the end... Good luck, Ann
[firebird-support] Re: Best way to delete millions of rows
Pagesize is 16384 and pagebuffers is 256. Servermode is superclassic. If I drop the primary key/index everything works as expected. //Thomas
[firebird-support] Re: Best way to delete millions of rows
---In firebird-support@yahoogroups.com, wrote : > I have finally had some more time to look into the problem I have been > having, and have isolated and > reproduced the problem. The cause of the problem seams to be a Varchar(36) ID > column containing a > guid/uuid. What is database page size and page cache setting ? Regards, Vlad
Re: [firebird-support] Re: Best way to delete millions of rows
22.11.2016 9:18, kragh.tho...@yahoo.com [firebird-support] wrote: > I know that this will trigger garbage collection. However i don't believe > this is expected > behavior, when a action from one user crashes the entire server and causes > the server to > be unresponsive for 1-5 minutes while GC is running, even users on other > databases on the > same server is affected by this. You are right. Create a ticket in tracker with your testcase, please. -- WBR, SD. ++ Visit http://www.firebirdsql.org and click the Documentation item on the main (top) menu. Try FAQ and other links from the left-side menu there. Also search the knowledgebases at http://www.ibphoenix.com/resources/documents/ ++ Yahoo Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/firebird-support/ <*> Your email settings: Individual Email | Traditional <*> To change settings online go to: http://groups.yahoo.com/group/firebird-support/join (Yahoo! ID required) <*> To change settings via email: firebird-support-dig...@yahoogroups.com firebird-support-fullfeatu...@yahoogroups.com <*> To unsubscribe from this group, send an email to: firebird-support-unsubscr...@yahoogroups.com <*> Your use of Yahoo Groups is subject to: https://info.yahoo.com/legal/us/yahoo/utos/terms/
Re: [firebird-support] Re: Best way to delete millions of rows
Hi, i agree with you i suppose that time spend by GC should be measured and if it take to much time then stop GC and try again leter. E.g. when i do delete from milions_table and then do SELECT COUNT(*) form milions_table it should not clear whole garbages and stop query until GC finished. e.g. if GC spend 1 seconds, it should stop taking this execution anymore and my SELECT COUNT(*) form milions_table should run now without GC actions. regards, Karol Bieniaszewski W dniu 2016-11-22 09:18:50 użytkownik kragh.tho...@yahoo.com [firebird-support] napisał: Hey I know that this will trigger garbage collection. However i don't believe this is expected behavior, when a action from one user crashes the entire server and causes the server to be unresponsive for 1-5 minutes while GC is running, even users on other databases on the same server is affected by this. Posted by: kragh.tho...@yahoo.com
[firebird-support] Re: Best way to delete millions of rows
Hey I know that this will trigger garbage collection. However i don't believe this is expected behavior, when a action from one user crashes the entire server and causes the server to be unresponsive for 1-5 minutes while GC is running, even users on other databases on the same server is affected by this.
[firebird-support] Re: Best way to delete millions of rows
This is not a bug, this is expected behaviour : sweep/garbage collection occures.
[firebird-support] Re: Best way to delete millions of rows
Hey I have finally had some more time to look into the problem I have been having, and have isolated and reproduced the problem. The cause of the problem seams to be a Varchar(36) ID column containing a guid/uuid. Steps to reproduce: 1) Create a table(TEST) with a single varchar(36) column named ID (not null). 2) Load the table with data 400 rows using this: (select lower( uuid_to_char(gen_uuid())) from rdb$database); 3) Creat a primary key on the ID column. 4) Delete all rows in "TEST" with delete from test and commit 5) from another connection issue "select count(*) from test" This will cause the database server to become unstable and hang for minutes at a time, where no new connections is made and no other queries get processed. To observe this behavior, i wrote a small program that would select current_timestamp from the server(select current_timestamp from mon$database) and print how long time it took to the console. Then sleep for 500ms and then select the time time again. This loop ran during step 4 and 5. I am unsure if there is a solution to this problem, or should I just create it as a bug in the tracker? //Thomas
[firebird-support] Re: Best way to delete millions of rows
We use ext3, no problem there. Windows : If Win is Domain Controllet, its a problem (messing with disc cache settings) RAID : if RAID card has no battery (very low end card) (no disc cache at all)
[firebird-support] Re: Best way to delete millions of rows
Thanks for your input. What file system should be used instead? As i understood our sysadmin, the file system was mounted with barriers off. And as i wrote last night, the problem is also present on windows virtual servers.
[firebird-support] Re: Best way to delete millions of rows
Hi! Firebird does not like ext4. http://www.firebirdnews.org/forced-writes-performance-impact-on-ubuntu/ http://www.firebirdnews.org/forced-writes-performance-impact-on-ubuntu/ http://www.firebirdnews.org/understanding-barrier-on-linux/ http://www.firebirdnews.org/understanding-barrier-on-linux/
[firebird-support] Re: Best way to delete millions of rows
Hey Vlad The table has 4 index, one from a primary key, another from a foreign key, and 2 on other columns. The system is a virtual machine with 8 Gb of ram, page size is 16K and page buffers 512. The IO sub system is a Dell DAS, with 16 SAS 15K disks in raid 10. Our sysadmin ran a test on the disk, and the result was 7500 IOPS. The filesystem is EXT4 mounted without barrier, and using LVM volumes. I hope this is meaningful to you, if not, I will ask the sysadmin again. I was able to reproduce the problem on our production server witch is a Virtual windows server with 16 cores and 128 Gb of Ram. The IO system is a fiber channel SAN, no more than 6 months old. I have also reproduced the problem on a pre prod environment(Virtual windows server). However i was unable to reproduce the problem on my laptop. Could this be a issue with running Firebird on virtual servers? The production database server was moved to a virtual server about 6 months ago, and our tests showed a 30-50% performance increase on all our bulk inserts/updates and general program/website performance. However we did not test how deletes performed. Regarding gbak, I was under the impression that gbak would perform garbage collection during backup? Thanks for your help so far.
[firebird-support] Re: Best way to delete millions of rows
Hi Thomas, only from curiosity: what is your filesystem? I had a similar issue some time ago, when we used an OpenSuse with btrfs. After switching to ext4 everything was ok. I never investigated the problem deeper, but I think it could be the snapshot feature of btrfs that has slowed down the database. But just maybe. Regards Martin
[firebird-support] Re: Best way to delete millions of rows
> ---In firebird-support@yahoogroups.com, wrote : > > Hey Vlad > If i only count the deleted rows, the problem is the same, and > > the server hangs while performing garbage collection. Sad. Probably, table have a lot of indices ? IO subsystem, amount of RAM, page size and page cache size also matters. Also interesting if filesystem is mounted with "barrier=on" option > On the other hand, if I count the rows not deleted, no garbage collection is > performed - that makes sens as the index > on created makes sure that no delete records/rows are accessed. This is > useful to know as i can make sure that other > parts of the application never issues a query for rows older that 6 months. It still will clean up record versions left after UPDATE's > Now the garbage collection will be deferred and happen during backup. Is it > possible that the same "hang" behavior will > happen during backup as a result of the deferred garbage collection? gbak have no magic, it just reads the tables and put data into backup file. Regards, Vlad
[firebird-support] Re: Best way to delete millions of rows
Hey Vlad If i only count the deleted rows, the problem is the same, and the server hangs while performing garbage collection. On the other hand, if I count the rows not deleted, no garbage collection is performed - that makes sens as the index on created makes sure that no delete records/rows are accessed. This is useful to know as i can make sure that other parts of the application never issues a query for rows older that 6 months. Now the garbage collection will be deferred and happen during backup. Is it possible that the same "hang" behavior will happen during backup as a result of the deferred garbage collection?
[firebird-support] Re: Best way to delete millions of rows
Hey Thanks for your input. What Karol suggested worked - thanks I will have a look at fb3 to see if it suffers from the same "hang" behavior when massive garbage collection is executed.
Re: [firebird-support] Re: Best way to delete millions of rows
---In firebird-support@yahoogroups.com, wrote : >> Here you delete some records, but then count whole table. >>Add same "where" condition as in "count" query > > I think the execution of select count is for garbage collection. Sure > If count of records that does not exists is executed, will be the garbage > collection executed for the deleted records? Exactly. No need to count not deleted records :) Regards, Vlad
[firebird-support] Re: Best way to delete millions of rows
Hey Svein I don't really buy your premise about your car. No DBMS should stop process statements on one database as a result of statements running on another database on the same server. You are talking about a slowdown in "traffic" what i am talking about is a complete standstill on the server. /Thomas
Re: [firebird-support] Re: Best way to delete millions of rows
> > Here you delete some records, but then count whole table. > Add same "where" condition as in "count" query > > Regards, > Vlad I think the execution of select count is for garbage collection. If count of records that does not exists is executed, will be the garbage collection executed for the deleted records? Jesus >
[firebird-support] Re: Best way to delete millions of rows
---In firebird-support@yahoogroups.com, wrote : Hey I have a database that needs some cleanup, in some tables i need to delete millions of rows. I performed the cleanup on a copy of the production database in a development environment, using: delete from table where created < dateadd(-6 month to current_date) followed by a count(*) on the table Here you delete some records, but then count whole table. Add same "where" condition as in "count" query Regards, Vlad