In my case I have two separate steps. First, SELECT the primary keys of
the records to be deleted (in your case: SELECT id FROM mytable WHERE
created_at < some_fixed_millis). Then I issue DELETEs for those primary
keys in batches of N statements (N being a configuration parameter). You
could create a stored procedure for this with two parameters
(some_fixed_millis, batch_size); a sketch of both follows below.
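
A rough, untested JDBC sketch of what I mean (class and method names are
mine; "mytable", "id" and "created_at" are from your schema):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class BatchDeleter {

        // Assumes autocommit is already disabled on conn. Step 1 collects
        // the primary keys to delete; step 2 deletes them in batches of
        // batchSize, committing after each batch so write locks are held
        // only briefly.
        public static void deleteOld(Connection conn, long cutoffMillis,
                                     int batchSize) throws SQLException {
            List<String> ids = new ArrayList<>();
            try (PreparedStatement sel = conn.prepareStatement(
                    "SELECT id FROM mytable WHERE created_at < ?")) {
                sel.setLong(1, cutoffMillis);
                try (ResultSet rs = sel.executeQuery()) {
                    while (rs.next()) {
                        ids.add(rs.getString(1));
                    }
                }
            }
            conn.commit(); // end the read transaction

            try (PreparedStatement del = conn.prepareStatement(
                    "DELETE FROM mytable WHERE id = ?")) {
                int inBatch = 0;
                for (String id : ids) {
                    del.setString(1, id);
                    del.addBatch();
                    if (++inBatch == batchSize) {
                        del.executeBatch();
                        conn.commit(); // one short transaction per batch
                        inBatch = 0;
                    }
                }
                if (inBatch > 0) {
                    del.executeBatch();
                    conn.commit();
                }
            }
        }
    }

You would call it from your cleanup job roughly like this:

    conn.setAutoCommit(false);
    BatchDeleter.deleteOld(conn,
            System.currentTimeMillis() - 4L * 3600 * 1000, 1000);

The point is that each DELETE transaction touches at most batchSize rows,
so your INSERT threads wait at most for one small batch instead of for one
huge delete.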

Your idea of DELETE ... WHERE id IN (SELECT ...) with a row limit would
re-run the SELECT for every DELETE step, causing unnecessary read I/O.
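
If you prefer to keep it server-side, a Derby stored procedure is just a
Java static method registered with CREATE PROCEDURE. Another untested
sketch (again my own names; this is from memory, so verify the details
about nested connections against the Derby docs):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class PurgeProc {
        public static void purgeOld(long cutoffMillis, int batchSize)
                throws SQLException {
            // Inside a Derby routine this URL returns the calling
            // session's own connection; do not close it or change its
            // autocommit setting.
            Connection conn =
                    DriverManager.getConnection("jdbc:default:connection");
            // Commit/rollback on a nested connection should be allowed
            // when the procedure is invoked directly via CALL (check the
            // Derby docs to be sure).
            BatchDeleter.deleteOld(conn, cutoffMillis, batchSize);
        }
    }

registered once with

    CREATE PROCEDURE PURGE_OLD(IN CUTOFF_MILLIS BIGINT, IN BATCH_SIZE INTEGER)
    LANGUAGE JAVA PARAMETER STYLE JAVA MODIFIES SQL DATA
    EXTERNAL NAME 'PurgeProc.purgeOld'

and then run as CALL PURGE_OLD(?, ?) from your scheduler.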


On Mon, 7 Oct 2019 at 14:10, Peter <tableyourt...@gmail.com> wrote:

> Hi Peter,
>
> Thanks a lot for the suggestion. This would be nice if it performs better.
>
> Is the idea to split one request into smaller parts, or will
> "Select+Delete IDs" just perform better?
>
> And regarding the latter option - is this possible in one SQL request? So
> something like
>
> DELETE FROM mytable WHERE id IN
>
> ( SELECT id FROM mytable WHERE created_at < some_fixed_millis OFFSET 0 ROWS 
> FETCH NEXT 1000 ROWS ONLY )
>
>
> And then loop through the results by changing the OFFSET and FETCH
> values? (Btw: the column created_at is indexed.)
>
> Or would you recommend doing this as 2 separate statements in Java/JDBC?
> Or maybe even just issuing the original DELETE request more frequently?
>
> Regards
> Peter
>
> On 06.10.19 03:50, Peter Ondruška wrote:
>
> Peter, try this if it makes a difference:
>
> 1. Select the entries to be deleted, note their primary keys.
> 2. Issue deletes using the keys from step 1, in short transaction
> batches.
>
> On Sun, 6 Oct 2019, 01:33 Peter, <tableyourt...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a table "mytable" with columns "id", "created_at" and "json"
>> (VARCHAR, BIGINT, LONG VARCHAR). Data comes in at roughly 200k new
>> entries every hour, and I would like to keep only the entries of the
>> last 1 or 2 hours. It is expected behaviour for the user that entries
>> that are too old get lost, as it is some kind of LRU cache.
>>
>> The current solution is to delete entries older than 4 hours every 30
>> minutes:
>>
>> DELETE FROM mytable WHERE created_at < ?
>>
>> I'm using this in a prepared statement where ? is "4 hours ago" in
>> milliseconds (new DateTime().getMillis()).
>>
>> This works, but some (not all) INSERT statements get a delay of the
>> same order (2-5 seconds) as this DELETE takes, which is ugly. These
>> INSERT statements are executed independently of the DELETE (using
>> different threads).
>>
>> Is there a better way? Can I somehow avoid locking the unrelated INSERT
>> operations?
>>
>> What helps a bit: when I run those deletes more frequently, the delays
>> get smaller, but then the number of delayed requests increases.
>>
>> What also helps a bit (so far I have not seen a negative impact) is
>> increasing the page size for the Derby Network Server:
>> -Dderby.storage.pageSize=32768
>>
>> Regards
>> Peter
>>
>>
>
