Re: Logical locking beyond pg_advisory

2018-09-17 Thread marcelo



On 17/09/2018 14:27 , Chris Travers wrote:



On Mon, Sep 17, 2018 at 6:04 PM marcelo > wrote:




I´m using an ORM (Devart´s) to access the database, so, I cannot
"select ... FOR UPDATE". The application paradigm is that a user
have a list of records (after a query) and she could update or
delete any of them as the business rules allows it. So, at least
an advisory lock is a must.
I´m convinced by now: I would stay with advisory locks...
expecting no app crash could occur...


I would say to fix this in the ORM rather than reinvent what the 
database already gives you in the database.



You are right. But you know...


Thank you all.
Marcelo



Libre de virus. www.avast.com




<#m_-9091154853724945458_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor 
lock-in.

http://www.efficito.com/learn_more




---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus


Re: Logical locking beyond pg_advisory

2018-09-17 Thread Chris Travers
On Mon, Sep 17, 2018 at 6:04 PM marcelo  wrote:

>
>
> I´m using an ORM (Devart´s) to access the database, so, I cannot "select
> ... FOR UPDATE". The application paradigm is that a user have a list of
> records (after a query) and she could update or delete any of them as the
> business rules allows it. So, at least an advisory lock is a must.
> I´m convinced by now: I would stay with advisory locks... expecting no app
> crash could occur...
>

I would say to fix this in the ORM rather than reinvent what the database
already gives you in the database.



> Thank you all.
> Marcelo
>
>
> 
>  Libre
> de virus. www.avast.com
> 
> <#m_-9091154853724945458_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>


-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Logical locking beyond pg_advisory

2018-09-17 Thread marcelo



On 17/09/2018 12:21 , Chris Travers wrote:



On Mon, Sep 17, 2018 at 5:09 PM Merlin Moncure > wrote:


On Sun, Sep 16, 2018 at 3:53 PM marcelo mailto:marcelo.nico...@gmail.com>> wrote:
>
> I need a mechanism of "logical locking" more ductile than the
pg_advisory family.
> I'm thinking of a table ("lock_table") that would be part of the
database, with columns
> * tablename varchar - name of the table "locked"
> * rowid integer, - id of the row "locked"
> * ownerid varchar, - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks
after a certain time
>
> and a group of functions
> 1) lock_table (tablename varchar, ownerid varchar) bool - get to
lock over the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool -
unlock the table, if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask
if the table is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar)
bool - similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid
varchar) bool - similar to pg_advisory_unlock
> 6) unlock_all (ownerid varchar) bool - unlock all locks owned by
ownerid
>
> The timeout (default, maybe 15 minutes) is implicitly applied if
the lock is taken by another user (there will be no notification).
> Redundant locks are not queued, they simply return true, may be
after an update of the acquired column.
> Successful locks insert a new row, except the rare case of a
timeout, which becomes an update (ownerid and acquired)
> Unlock operations deletes the corresponding row
>
> My question is double
> a) What is the opinion on the project?
> b) What are the consequences of the large number of inserts and
deletions
> c) Performance. In fact, pg_advisory* implies a network
roundtrip, but (I think) no table operations.

Why can't you use the advisory lock functions?  The challenge with
manually managed locks are they they are slow and you will lose the
coordination the database provides you.  For example, if your
application crashes you will have to clean up all held locks yourself.
Building out that infrastructure will be difficult.


First, I think in an ideal world, you wouldn't handle this problem 
with either approach but sometimes you have to.


I have done both approaches actually.  LedgerSMB uses its own lock 
table because locks have to persist across multiple HTTP requests and 
we have various automatic cleanup processes.


When I was working on  the queue management stuff at Novozymes we used 
advisory locks extensively.


These two approaches have serious downsides:
1.  Lock tables are *slow* and require careful thinking through 
cleanup scenarios.  In LedgerSMB we tied to the application session 
with an ON DELETE event that would unlock the row.  We estimated that 
for every 2 seconds that the db spent doing useful work, it spent 42 
seconds managing the locks.  Additionally the fact that locks take 
effect on snapshot advance is a problem here.


2.  In my talk, "PostgreSQL at 10TB and  Beyond" I talk about a 
problem we had using advisory locks for managing rows that were being 
processed for deletion.  Since the deletion was the scan for items at 
the head of an index, under heavy load we could spend long enough 
checking dead rows that the locks could go away with our snapshot 
failing to advance.  This would result in duplicate processing.  So 
the fact that advisory locks don't really follow snapshot semantics is 
a really big problem here since it means you can have race conditions 
in advisory locks that can't happen with other locking issues.  I 
still love advisory locks but they are not a good tool for this.


The real solution most of the time is actually to lock the rows by 
selecting FOR UPDATE and possibly SKIP LOCKED. The way update/delete 
row locking in PostgreSQL works is usually good enough except in a few 
rare edge cases.  Only in *very rare* cases do lock tables or advisory 
locks make sense for actual row processing.



merlin



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor 
lock-in.

http://www.efficito.com/learn_more
I´m using an ORM (Devart´s) to access the database, so, I cannot "select 
... FOR UPDATE". The application paradigm is that a user have a list of 
records (after a query) and she could update or delete any of them as 
the business rules allows it. So, at least an advisory lock is a must.
I´m convinced by now: I would stay with advisory locks... expecting no 
app crash could occur...

Thank you all.
Marcelo


---
El software de antivirus Avast ha analizado este correo electrónico en busca 

Re: Logical locking beyond pg_advisory

2018-09-17 Thread Chris Travers
On Mon, Sep 17, 2018 at 5:09 PM Merlin Moncure  wrote:

> On Sun, Sep 16, 2018 at 3:53 PM marcelo  wrote:
> >
> > I need a mechanism of "logical locking" more ductile than the
> pg_advisory family.
> > I'm thinking of a table ("lock_table") that would be part of the
> database, with columns
> > * tablename varchar - name of the table "locked"
> > * rowid integer, - id of the row "locked"
> > * ownerid varchar, - identifier of the "user" who acquired the lock
> > * acquired timestamp - to be able to release "abandoned" locks after a
> certain time
> >
> > and a group of functions
> > 1) lock_table (tablename varchar, ownerid varchar) bool - get to lock
> over the entire table, setting rowid to zero
> > 2) unlock_table (tablename varchar, ownerid varchar) bool - unlock the
> table, if the owner is the recorded one
> > 3) locked_table (tablename varchar, ownerid varchar) bool - ask if the
> table is locked by some user other than the ownerid argument
> > 4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool -
> similar to pg_try_advisory_lock
> > 5) unlock_row (tablename varchar, rowid integer, ownerid varchar) bool -
> similar to pg_advisory_unlock
> > 6) unlock_all (ownerid varchar) bool - unlock all locks owned by ownerid
> >
> > The timeout (default, maybe 15 minutes) is implicitly applied if the
> lock is taken by another user (there will be no notification).
> > Redundant locks are not queued, they simply return true, may be after an
> update of the acquired column.
> > Successful locks insert a new row, except the rare case of a timeout,
> which becomes an update (ownerid and acquired)
> > Unlock operations deletes the corresponding row
> >
> > My question is double
> > a) What is the opinion on the project?
> > b) What are the consequences of the large number of inserts and deletions
> > c) Performance. In fact, pg_advisory* implies a network roundtrip, but
> (I think) no table operations.
>
> Why can't you use the advisory lock functions?  The challenge with
> manually managed locks are they they are slow and you will lose the
> coordination the database provides you.  For example, if your
> application crashes you will have to clean up all held locks yourself.
> Building out that infrastructure will be difficult.
>

First, I think in an ideal world, you wouldn't handle this problem with
either approach but sometimes you have to.

I have done both approaches actually.  LedgerSMB uses its own lock table
because locks have to persist across multiple HTTP requests and we have
various automatic cleanup processes.

When I was working on  the queue management stuff at Novozymes we used
advisory locks extensively.

These two approaches have serious downsides:
1.  Lock tables are *slow* and require careful thinking through cleanup
scenarios.  In LedgerSMB we tied to the application session with an ON
DELETE event that would unlock the row.  We estimated that for every 2
seconds that the db spent doing useful work, it spent 42 seconds managing
the locks.  Additionally the fact that locks take effect on snapshot
advance is a problem here.

2.  In my talk, "PostgreSQL at 10TB and  Beyond" I talk about a problem we
had using advisory locks for managing rows that were being processed for
deletion.  Since the deletion was the scan for items at the head of an
index, under heavy load we could spend long enough checking dead rows that
the locks could go away with our snapshot failing to advance.  This would
result in duplicate processing.  So the fact that advisory locks don't
really follow snapshot semantics is a really big problem here since it
means you can have race conditions in advisory locks that can't happen with
other locking issues.  I still love advisory locks but they are not a good
tool for this.

The real solution most of the time is actually to lock the rows by
selecting FOR UPDATE and possibly SKIP LOCKED.  The way update/delete row
locking in PostgreSQL works is usually good enough except in a few rare
edge cases.  Only in *very rare* cases do lock tables or advisory locks
make sense for actual row processing.

>
> merlin
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: Logical locking beyond pg_advisory

2018-09-17 Thread Merlin Moncure
On Sun, Sep 16, 2018 at 3:53 PM marcelo  wrote:
>
> I need a mechanism of "logical locking" more ductile than the pg_advisory 
> family.
> I'm thinking of a table ("lock_table") that would be part of the database, 
> with columns
> * tablename varchar - name of the table "locked"
> * rowid integer, - id of the row "locked"
> * ownerid varchar, - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks after a 
> certain time
>
> and a group of functions
> 1) lock_table (tablename varchar, ownerid varchar) bool - get to lock over 
> the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool - unlock the table, 
> if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask if the table 
> is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool - 
> similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid varchar) bool - 
> similar to pg_advisory_unlock
> 6) unlock_all (ownerid varchar) bool - unlock all locks owned by ownerid
>
> The timeout (default, maybe 15 minutes) is implicitly applied if the lock is 
> taken by another user (there will be no notification).
> Redundant locks are not queued, they simply return true, may be after an 
> update of the acquired column.
> Successful locks insert a new row, except the rare case of a timeout, which 
> becomes an update (ownerid and acquired)
> Unlock operations deletes the corresponding row
>
> My question is double
> a) What is the opinion on the project?
> b) What are the consequences of the large number of inserts and deletions
> c) Performance. In fact, pg_advisory* implies a network roundtrip, but (I 
> think) no table operations.

Why can't you use the advisory lock functions?  The challenge with
manually managed locks are they they are slow and you will lose the
coordination the database provides you.  For example, if your
application crashes you will have to clean up all held locks yourself.
Building out that infrastructure will be difficult.

merlin



Re: Logical locking beyond pg_advisory

2018-09-17 Thread marcelo
/Would be nice if you explain more about what kind of problem you want 
to solve./


There are two main questions "to solve"
a) generally speaking, I like to get some form of "exclusive access" to 
the row before updating or deleting. None of the optimistic / pesimistic 
automatic variants of concurrency management glad me. Nor, of course, 
the "versioning" variants.
b) some of the tables I´m working on have a "number" column (_not the 
PK_) which sometimes come from material, external sources, but sometimes 
must be assigned by the system. This could be solved two main ways
b.1) Use a trigger to get the famous "max(n) + 1". At least in one of 
the cases, the number automatically generated must fall into different 
ranges conditioned by the value of another column, and the ranges vary 
between versions of the database.
b.2) "Lock the entire table", get programmatically the next number for 
the correct range, assign it and free the table lock.


Of course, and beforehand, all database manipulations are done thru 
applications.
Till yesterday, I was working with the "advisory" family. Having a 
bigint as the only "lock identifier" I was working with a hash of the 
table name XORed with the id of the row or zero for the entire table. 
(All my tables have an autosequential integer id as PK).
Even if I found a very robust hash algorithm for the table name, I 
cannot discard some collision once the id was xored. I tested five or 
six table names, along 2 ids every one, without collision. But...
Of course, I need the "full table lock" for inserts. So, it´s a very 
separated concern with updates and deletions. But...


TIA


On 17/09/2018 03:19 , Fabrízio de Royes Mello wrote:


Em dom, 16 de set de 2018 às 17:53, marcelo > escreveu:

>
> I need a mechanism of "logical locking" more ductile than the 
pg_advisory family.
> I'm thinking of a table ("lock_table") that would be part of the 
database, with columns

> * tablename varchar - name of the table "locked"
> * rowid integer, - id of the row "locked"
> * ownerid varchar, - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks after 
a certain time

>
> and a group of functions
> 1) lock_table (tablename varchar, ownerid varchar) bool - get to 
lock over the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool - unlock 
the table, if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask if 
the table is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool 
- similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid varchar) 
bool - similar to pg_advisory_unlock

> 6) unlock_all (ownerid varchar) bool - unlock all locks owned by ownerid
>
> The timeout (default, maybe 15 minutes) is implicitly applied if the 
lock is taken by another user (there will be no notification).
> Redundant locks are not queued, they simply return true, may be 
after an update of the acquired column.
> Successful locks insert a new row, except the rare case of a 
timeout, which becomes an update (ownerid and acquired)

> Unlock operations deletes the corresponding row
>
> My question is double
> a) What is the opinion on the project?

Would be nice if you explain more about what kind of problem you want 
to solve.


> b) What are the consequences of the large number of inserts and 
deletions


The first thing came to my mind with this approach is table bloat.

> c) Performance. In fact, pg_advisory* implies a network roundtrip, 
but (I think) no table operations.


Yeap... no table operations.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




---
El software de antivirus Avast ha analizado este correo electrónico en busca de 
virus.
https://www.avast.com/antivirus


Re: Logical locking beyond pg_advisory

2018-09-17 Thread Fabrízio de Royes Mello
Em dom, 16 de set de 2018 às 17:53, marcelo 
escreveu:
>
> I need a mechanism of "logical locking" more ductile than the pg_advisory
family.
> I'm thinking of a table ("lock_table") that would be part of the
database, with columns
> * tablename varchar - name of the table "locked"
> * rowid integer, - id of the row "locked"
> * ownerid varchar, - identifier of the "user" who acquired the lock
> * acquired timestamp - to be able to release "abandoned" locks after a
certain time
>
> and a group of functions
> 1) lock_table (tablename varchar, ownerid varchar) bool - get to lock
over the entire table, setting rowid to zero
> 2) unlock_table (tablename varchar, ownerid varchar) bool - unlock the
table, if the owner is the recorded one
> 3) locked_table (tablename varchar, ownerid varchar) bool - ask if the
table is locked by some user other than the ownerid argument
> 4) lock_row (tablename varchar, rowid integer, ownerid varchar) bool -
similar to pg_try_advisory_lock
> 5) unlock_row (tablename varchar, rowid integer, ownerid varchar) bool -
similar to pg_advisory_unlock
> 6) unlock_all (ownerid varchar) bool - unlock all locks owned by ownerid
>
> The timeout (default, maybe 15 minutes) is implicitly applied if the lock
is taken by another user (there will be no notification).
> Redundant locks are not queued, they simply return true, may be after an
update of the acquired column.
> Successful locks insert a new row, except the rare case of a timeout,
which becomes an update (ownerid and acquired)
> Unlock operations deletes the corresponding row
>
> My question is double
> a) What is the opinion on the project?

Would be nice if you explain more about what kind of problem you want to
solve.

> b) What are the consequences of the large number of inserts and deletions

The first thing came to my mind with this approach is table bloat.

> c) Performance. In fact, pg_advisory* implies a network roundtrip, but (I
think) no table operations.

Yeap... no table operations.

Regards,

--
   Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento