In my case I have two separate steps. First SELECT primary keys of those records to be deleted (in your case SELECT id FROM mytable WHERE created_at < some_fixed_millis). And then I issue DELETE for those primary keys in batches of N statements (N being configuration parameter). You could create stored procedure for this with two parameters (some_fixed_millis, batch_size).
Your idea DELETE WHERE SELECT and limiting rows needs to be run for every DELETE step making unnecessary read I/O. On Mon, 7 Oct 2019 at 14:10, Peter <tableyourt...@gmail.com> wrote: > Hi Peter, > > Thanks a lot for the suggestion.This would be nice if it performs better. > > Is the idea to split one request into smaller parts or will "Select+Delete > IDs" just perform better? > > And regarding the latter option - is this possible in one SQL request? So > something like > > DELETE FROM mytable WHERE id IN > > ( SELECT id FROM mytable WHERE created_at < some_fixed_millis OFFSET 0 ROWS > FETCH NEXT 1000 ROWS ONLY ) > > > And then loop through the results via changing OFFSET and ROWS? (Btw: the > column created_at is indexed) > > Or would you recommend doing this as 2 separate statements in Java/JDBC? > Or via maybe even just issuing the original DELETE request more frequent? > > Regards > Peter > > On 06.10.19 03:50, Peter Ondruška wrote: > > Peter, try this if it makes a difference: > > 1. Select entries to be deleted, note their primary keys. > 2. Issue delete using keys to be deleted (1.) and use short transaction > batches. > > On Sun, 6 Oct 2019, 01:33 Peter, <tableyourt...@gmail.com> wrote: > >> Hi, >> >> I have a table "mytable" with columns "id", "created_at" and "json" >> (VARCHAR, BIGINT, LONG VARCHAR), where data is coming in like new 200k >> entries every hour and I would like to keep only entries of the last 1 >> or 2 hours. It is expected behaviour for the user if too old entries >> gets lost as it is some kind of a LRU cache. >> >> The current solution is to delete entries older than 4 hours every 30 >> minutes: >> >> DELETE FROM mytable WHERE created_at < ? >> >> I'm using this in a prepared statement where ? is "4 hours ago" in >> milliseconds (new DateTime().getMillis()). >> >> This works, but some (not all) INSERT statement get a bigger delay in >> the same order (2-5 seconds) that this DELETE takes, which is ugly. >> These INSERT statements are executed independently (using different >> threads) of the DELETE. >> >> Is there a better way? Can I somehow avoid locking the unrelated INSERT >> operations? >> >> What helps a bit is when I make those deletes more frequently than the >> delays will get smaller, but then the number of those delayed requests >> will increase. >> >> What also helps a bit (currently have not seen a negative impact) is >> increasing the page size for the Derby Network Server: >> -Dderby.storage.pageSize=32768 >> >> Regards >> Peter >> >> >