Re: Infinite loop - stack overflow

2019-10-07 Thread Rick Hillegas

Hi Geraldine,

I have logged https://issues.apache.org/jira/browse/DERBY-7055 to track 
this issue. If you discover a repro, please update that issue.


Thanks,
-Rick

On 10/4/19 7:12 AM, Geraldine McCormack wrote:

I can certainly log the issue, but unfortunately I do not know how to
recreate it...





Re: Avoid locking on DELETE

2019-10-07 Thread Peter
Hi Peter,

Thanks! I have implemented this and indeed the maximum delays are lower,
but a delete batch now takes roughly 3-4 times longer to complete (for
batchSize=500, total deleted items around ~1). The problem is likely
that I have VARCHAR for the ID column.

If I increase the frequency of issuing the original DELETE statement:

DELETE FROM mytable WHERE created_at < ?

Won't it have a similar effect due to smaller batches?

Regards
Peter

On 07.10.19 16:31, Peter Ondruška wrote:
> In my case I have two separate steps. First I SELECT the primary keys
> of the records to be deleted (in your case SELECT id FROM mytable
> WHERE created_at < some_fixed_millis). Then I issue DELETEs for those
> primary keys in batches of N statements (N being a configuration
> parameter). You could create a stored procedure for this with two
> parameters (some_fixed_millis, batch_size).
> Your idea of DELETE with a row-limited subselect needs to re-run that
> SELECT for every DELETE step, causing unnecessary read I/O.
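
A minimal JDBC sketch of this two-step approach (the class and method
names are made up; the table and column names come from the thread, and
committing once per batch with autoCommit off is one way to keep each
transaction short, as suggested):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class BatchedDelete {

    // Step 1: SELECT the primary keys of the rows to be deleted.
    // Step 2: DELETE those keys in batches, one short transaction per batch.
    public static void deleteOlderThan(Connection conn, long someFixedMillis,
                                       int batchSize) throws SQLException {
        List<String> ids = new ArrayList<>();
        try (PreparedStatement sel = conn.prepareStatement(
                "SELECT id FROM mytable WHERE created_at < ?")) {
            sel.setLong(1, someFixedMillis);
            try (ResultSet rs = sel.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getString(1));
                }
            }
        }

        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement del = conn.prepareStatement(
                "DELETE FROM mytable WHERE id = ?")) {
            int inBatch = 0;
            for (String id : ids) {
                del.setString(1, id);
                del.addBatch();
                if (++inBatch == batchSize) {
                    del.executeBatch();
                    conn.commit();   // short transaction; row locks released
                    inBatch = 0;
                }
            }
            if (inBatch > 0) {       // flush the final partial batch
                del.executeBatch();
                conn.commit();
            }
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}

To wrap this as the suggested stored procedure, Derby can register a
public static Java method via CREATE PROCEDURE ... PARAMETER STYLE JAVA,
with the connection obtained inside the method from
"jdbc:default:connection" instead of being passed in.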
>
>
> On Mon, 7 Oct 2019 at 14:10, Peter wrote:
>
> Hi Peter,
>
> Thanks a lot for the suggestion. This would be nice if it performs
> better.
>
> Is the idea to split one request into smaller parts or will
> "Select+Delete IDs" just perform better?
>
> And regarding the latter option - is this possible in one SQL
> request? So something like
>
> DELETE FROM mytable WHERE id IN
>   ( SELECT id FROM mytable WHERE created_at < some_fixed_millis
>     OFFSET 0 ROWS FETCH NEXT 1000 ROWS ONLY )
>
>
> And then loop through the results by changing OFFSET and ROWS?
> (Btw: the column created_at is indexed.)
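
For what it's worth, a sketch of that loop as one repeated statement
(assuming the Derby version in use accepts OFFSET/FETCH inside an IN
subquery; the method name is made up). OFFSET would stay at 0 rather
than advance, since each pass removes the rows it fetched and the
remaining old rows move to the front:

// Uses java.sql.Connection, PreparedStatement, SQLException as above.
// With autoCommit on, each executeUpdate() is its own short transaction
// deleting at most 1000 rows; stop once a pass deletes nothing.
static void deleteInChunks(Connection conn, long someFixedMillis)
        throws SQLException {
    String sql = "DELETE FROM mytable WHERE id IN "
            + "(SELECT id FROM mytable WHERE created_at < ? "
            + "OFFSET 0 ROWS FETCH NEXT 1000 ROWS ONLY)";
    try (PreparedStatement del = conn.prepareStatement(sql)) {
        del.setLong(1, someFixedMillis);
        while (del.executeUpdate() > 0) {
            // loop until no old rows remain
        }
    }
}

As the reply above notes, each pass re-runs the subselect, which is the
extra read I/O being warned about.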
>
> Or would you recommend doing this as 2 separate statements in
> Java/JDBC? Or maybe even just issuing the original DELETE request
> more frequently?
>
> Regards
> Peter
>
> On 06.10.19 03:50, Peter Ondruška wrote:
>> Peter, try this if it makes a difference:
>>
>> 1. Select entries to be deleted, note their primary keys.
>> 2. Issue deletes using the keys from step 1, in short transaction
>> batches.
>>
>> On Sun, 6 Oct 2019, 01:33 Peter wrote:
>>
>> Hi,
>>
>> I have a table "mytable" with columns "id", "created_at" and "json"
>> (VARCHAR, BIGINT, LONG VARCHAR), where roughly 200k new entries come
>> in every hour, and I would like to keep only the entries of the last
>> 1 or 2 hours. It is expected behaviour for the user that old entries
>> get lost, as it is some kind of LRU cache.
>>
>> The current solution is to delete entries older than 4 hours every
>> 30 minutes:
>>
>> DELETE FROM mytable WHERE created_at < ?
>>
>> I'm using this in a prepared statement where ? is "4 hours ago" in
>> milliseconds (new DateTime().getMillis()).
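
In JDBC terms, roughly (the minusHours(4) call is an assumed way of
computing the cutoff; conn is a java.sql.Connection):

long cutoff = new org.joda.time.DateTime().minusHours(4).getMillis();
try (PreparedStatement del = conn.prepareStatement(
        "DELETE FROM mytable WHERE created_at < ?")) {
    del.setLong(1, cutoff);       // "4 hours ago" in milliseconds
    del.executeUpdate();          // one statement deletes everything older
}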
>>
>> This works, but some (not all) INSERT statements get a delay of the
>> same order (2-5 seconds) as this DELETE takes, which is ugly. These
>> INSERT statements are executed independently of the DELETE (using
>> different threads).
>>
>> Is there a better way? Can I somehow avoid locking the unrelated
>> INSERT operations?
>>
>> What helps a bit: if I run those deletes more frequently, the delays
>> get smaller, but then the number of delayed requests increases.
>>
>> What also helps a bit (so far I have not seen a negative impact) is
>> increasing the page size for the Derby Network Server:
>> -Dderby.storage.pageSize=32768
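
For reference, the same property can also be set in the derby.properties
file; as far as I know the page size only applies to tables and indexes
created after the property takes effect:

# derby.properties
derby.storage.pageSize=32768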
>>
>> Regards
>> Peter
>>
>


