[Sequoia] [JIRA] Commented: (SEQUOIA-955) Raidb1 load balancer and WaitForCompletionPolicy=first doesnot work

Emmanuel Cecchet (JIRA) Tue, 15 Apr 2008 08:48:30 -0700

    [ 
https://forge.continuent.org/jira/browse/SEQUOIA-955?page=comments#action_14428 
]


Emmanuel Cecchet commented on SEQUOIA-955:
------------------------------------------

Additional thread from Andrew Lawrenson on the Sequoia mailing list:

Emmanuel,

        I've done this with both a single controller, with 2 databases 
attached, as well as two replicating controllers (each with two databases).
However, at the moment, when I've done it with a single controller, it is still 
set up to replicate - it's just that the other controller isn't started (if that 
makes any difference).

        thanks,

        Andrew Lawrenson

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Emmanuel Cecchet
Sent: 26 February 2008 17:33
To: Sequoia general mailing list
Subject: Re: [Sequoia] waitForCompletionPolicy (was RE: priority inversion?)

Andrew,

Thanks for the update. Could you confirm that your 2 databases are attached to 
the same controller?
Are you replicating controllers at all or do you have a single controller?

Thanks again for your feedback,
Emmanuel

> > I think this issue is related to SEQUOIA-955 - I was using a 
> > waitForCompletionPolicy of "first".  When set to "all", the problem 
> > disappears.
> >
> > Looking in various places, I assume the current recommendation is to use 
> > "all" & "first" is less reliable.
> >
> > If this is correct, it would probably be worth changing the example 
> > configurations in config/virtualdatabase to use "all" - they currently 
> > (2.10.9) use "first" - which is where I copied my configuration from.
> >
> >         thanks,
> >
> >         Andrew Lawrenson
> >
> >
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of
> > Andrew Lawrenson
> > Sent: 22 February 2008 14:50
> > To: Sequoia general mailing list
> > Subject: [Sequoia] RE: priority inversion?
> >
> > As an update:
> >
> > The same occurs when only using a single controller.
> > The problem occurs because the locks held in sequoia are conflicting with 
> > locks held in the database, causing deadlock (which is undetected as both 
> > sides see only half the locks).
> >
> > I.e.
> > In the database:
> >    Transaction A exclusively locks row R1.
> >    Transaction B waits to lock row R1.
> >
> > In Sequoia:
> >    Sequoia thinks A is waiting for B, and appears not to commit 
> > transaction.  Eventually, transaction B fails with a lock timeout.
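The "each side sees only half the locks" point can be illustrated with two wait-for graphs. This is a minimal sketch, not Sequoia or Derby code: the class SplitWaitForGraph and its methods are hypothetical, and the two edges are taken from the A/B example above. Neither graph has a cycle on its own, so neither side reports a deadlock, but their union does.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration; not part of Sequoia or Derby. */
final class SplitWaitForGraph {
    /** Adds the edge "waiter waits for holder" to a wait-for graph. */
    static void addWait(Map<String, Set<String>> graph, String waiter, String holder) {
        graph.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /** Returns true if the graph contains a cycle (depth-first search). */
    static boolean hasCycle(Map<String, Set<String>> graph) {
        Set<String> done = new HashSet<>();
        for (String node : graph.keySet()) {
            if (visit(node, graph, new HashSet<>(), done)) return true;
        }
        return false;
    }

    private static boolean visit(String node, Map<String, Set<String>> graph,
                                 Set<String> onPath, Set<String> done) {
        if (onPath.contains(node)) return true;   // back edge: cycle found
        if (done.contains(node)) return false;    // already fully explored
        onPath.add(node);
        for (String next : graph.getOrDefault(node, Set.of())) {
            if (visit(next, graph, onPath, done)) return true;
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    /** Merges two wait-for graphs into a new one. */
    static Map<String, Set<String>> union(Map<String, Set<String>> a,
                                          Map<String, Set<String>> b) {
        Map<String, Set<String>> merged = new HashMap<>();
        a.forEach((k, v) -> merged.computeIfAbsent(k, x -> new HashSet<>()).addAll(v));
        b.forEach((k, v) -> merged.computeIfAbsent(k, x -> new HashSet<>()).addAll(v));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> database = new HashMap<>();
        Map<String, Set<String>> middleware = new HashMap<>();
        addWait(database, "B", "A");    // database view: B waits for A's row lock
        addWait(middleware, "A", "B");  // middleware view: A is held back behind B
        System.out.println(hasCycle(database));                    // false
        System.out.println(hasCycle(middleware));                  // false
        System.out.println(hasCycle(union(database, middleware))); // true
    }
}
```

Only a component that could see both sets of edges would detect the cycle; here the database's lock timeout is what eventually breaks it.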
> >
> >
> > The full scenario is:
> >
> > I'm executing two statements (from application logs):
> >
> > 2008-02-22 13:05:36,985 DEBUG DRS.DAO - about to execute ' update users set 
> > failcount = 0 where ( id = 611 OR id = 100073 ) '
> > 2008-02-22 13:05:36,995 DEBUG DRS.DAO - executed ' update users set 
> > failcount = 0 where ( id = 611 OR id = 100073 ) '
> > 2008-02-22 13:05:37,033 DEBUG DRS.DAO - about to execute ' update users set 
> > failcount = 0 where ( id = 611 OR id = 100074 ) '
> > 2008-02-22 13:05:37,044 DEBUG DRS.DAO - executed ' update users set 
> > failcount = 0 where ( id = 611 OR id = 100074 ) '
> >
> > The first one (100073) is Sequoia transaction #285.
> > The second one (100074) is Sequoia transaction #287.
> >
> > I have two backends - DRSPRI & DRSPRI-BACKUP (both apache derby databases).
> >
> > Both queries perform as expected on DRSPRI (these are a mixture of sequoia 
> > & derby logs):
> >
> > for #285 on 1st back-end (DRSPRI):
> > 2008-02-22 13:05:36,987 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI - Adding task StatementExecuteUpdateTask from transaction 285
> > (update users set failcount = 0 where ( id = 611 OR id = 100073 )) to
> > pending request queue
> > 2008-02-22 13:05:36,987 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Executing task StatementExecuteUpdateTask from transaction 285 (update
> > users set failcount = 0 where ( id = 611 OR id = 100073 ))
> > 2008-02-22 13:05:36.987 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81701),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Begin
> > compiling prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100073 ) :End prepared statement
> > 2008-02-22 13:05:36.989 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81701),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), End compiling
> > prepared statement: update users set failcount = 0 where ( id = 611 OR
> > id = 100073 ) :End prepared statement
> > 2008-02-22 13:05:36.992 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81701),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Executing
> > prepared statement: update users set failcount = 0 where ( id = 611 OR
> > id = 100073 ) :End prepared statement
> > 2008-02-22 13:05:36,993 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Task StatementExecuteUpdateTask from transaction 285 (update users set
> > failcount = 0 where ( id = 611 OR id = 100073 )) completed
> > 2008-02-22 13:05:36,997 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI - Adding task CommitTask (285) to pending request queue
> > 2008-02-22 13:05:36,997 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Executing task CommitTask (285)
> > 2008-02-22 13:05:36.997 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81701),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Committing
> > 2008-02-22 13:05:37.007 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81701),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Committing
> > 2008-02-22 13:05:37,007 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Task CommitTask (285) completed
> >
> > for #287 on 1st back-end (DRSPRI):
> > 2008-02-22 13:05:37,035 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI - Adding task StatementExecuteUpdateTask from transaction 287
> > (update users set failcount = 0 where ( id = 611 OR id = 100074 )) to
> > pending request queue
> > 2008-02-22 13:05:37,036 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Executing task StatementExecuteUpdateTask from transaction 287 (update
> > users set failcount = 0 where ( id = 611 OR id = 100074 ))
> > 2008-02-22 13:05:37.036 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81706),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Begin
> > compiling prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100074 ) :End prepared statement
> > 2008-02-22 13:05:37.038 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81706),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), End compiling
> > prepared statement: update users set failcount = 0 where ( id = 611 OR
> > id = 100074 ) :End prepared statement
> > 2008-02-22 13:05:37.041 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81706),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Executing
> > prepared statement: update users set failcount = 0 where ( id = 611 OR
> > id = 100074 ) :End prepared statement
> > 2008-02-22 13:05:37,042 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Task StatementExecuteUpdateTask from transaction 287 (update users set
> > failcount = 0 where ( id = 611 OR id = 100074 )) completed
> > 2008-02-22 13:05:37,045 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI - Adding task CommitTask (287) to pending request queue
> > 2008-02-22 13:05:37,045 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Executing task CommitTask (287)
> > 2008-02-22 13:05:37.045 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81706),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Committing
> > 2008-02-22 13:05:37.054 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI' with RAIDb level:1,5,main] (XID = 81706),
> > (SESSIONID = 22), (DATABASE = DRSPRI), (DRDAID = null), Committing
> > 2008-02-22 13:05:37,055 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI -
> > Task CommitTask (287) completed
> >
> > However, problems occur on the second backend:
> >
> > for #285 on 2nd back-end (DRSPRI-BACKUP):
> > 2008-02-22 13:05:36,989 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Adding task StatementExecuteUpdateTask from transaction
> > 285 (update users set failcount = 0 where ( id = 611 OR id = 100073 ))
> > to pending request queue
> > 2008-02-22 13:05:37,091 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Priority inversion for already started request update
> > users set failcount = 0 where ( id = 611 OR id = 100073 )
> > 2008-02-22 13:05:37,091 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Moving StatementExecuteUpdateTask from transaction 285
> > (update users set failcount = 0 where ( id = 611 OR id = 100073 )) to
> > non conflicting queue
> > 2008-02-22 13:05:37,091 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI-BACKU
> > P - Executing task StatementExecuteUpdateTask from transaction 285
> > (update users set failcount = 0 where ( id = 611 OR id = 100073 ))
> > 2008-02-22 13:05:37.091 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80696),
> > (SESSIONID = 23), (DATABASE = DRSPRI-BACKUP), (DRDAID = null), Begin
> > compiling prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100073 ) :End prepared statement
> > 2008-02-22 13:05:37.094 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80696),
> > (SESSIONID = 23), (DATABASE = DRSPRI-BACKUP), (DRDAID = null), End
> > compiling prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100073 ) :End prepared statement
> > 2008-02-22 13:05:37.099 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80696),
> > (SESSIONID = 23), (DATABASE = DRSPRI-BACKUP), (DRDAID = null),
> > Executing prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100073 ) :End prepared statement
> >
> > ...but, it hits a lock in the database created by 287 & waits for 1
> > minute before failing
> > 2008-02-22 13:06:37.137 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80696),
> > (SESSIONID = 23), (DATABASE = DRSPRI-BACKUP), (DRDAID = null), Cleanup
> > action starting
> > 2008-02-22 13:06:37.138 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80696),
> > (SESSIONID = 23), (DATABASE = DRSPRI-BACKUP), (DRDAID = null), Failed
> > Statement is: update users set failcount = 0 where ( id = 611 OR id =
> > 100073 )
> > 2008-02-22 13:06:37,155 WARN  
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.DRSPRI-BACKUP
> >  - A worker thread was still processing task StatementExecuteUpdateTask 
> > from transaction 285 (update users set failcount = 0 where ( id = 611 OR id 
> > = 100073 )), aborting the request execution.
> > 2008-02-22 13:06:37,821 WARN
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Unable to remove task StatementExecuteUpdateTask from
> > transaction 285 (update users set failcount = 0 where ( id = 611 OR id
> > = 100073 )) from pending request queue
> > 2008-02-22 13:06:37,821 ERROR
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Failed to remove task StatementExecuteUpdateTask from
> > transaction 285 (update users set failcount = 0 where ( id = 611 OR id
> > = 100073 )) from []
> > 2008-02-22 13:06:37,821 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI-BACKU
> > P - Task StatementExecuteUpdateTask from transaction 285 (update users
> > set failcount = 0 where ( id = 611 OR id = 100073 )) completed
> >
> > #287 on 2nd back-end (DRSPRI-BACKUP)
> > 2008-02-22 13:05:37,038 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Adding task StatementExecuteUpdateTask from transaction
> > 287 (update users set failcount = 0 where ( id = 611 OR id = 100074 ))
> > to pending request queue
> > 2008-02-22 13:05:37,091 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Priority inversion for already started request update
> > users set failcount = 0 where ( id = 611 OR id = 100074 )
> > 2008-02-22 13:05:37,091 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Moving StatementExecuteUpdateTask from transaction 287
> > (update users set failcount = 0 where ( id = 611 OR id = 100074 )) to
> > non conflicting queue
> > 2008-02-22 13:05:37,091 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI-BACKU
> > P - Executing task StatementExecuteUpdateTask from transaction 287
> > (update users set failcount = 0 where ( id = 611 OR id = 100074 ))
> > 2008-02-22 13:05:37.091 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80697),
> > (SESSIONID = 22), (DATABASE = DRSPRI-BACKUP), (DRDAID = null), Begin
> > compiling prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100074 ) :End prepared statement
> > 2008-02-22 13:05:37.094 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80697),
> > (SESSIONID = 22), (DATABASE = DRSPRI-BACKUP), (DRDAID = null), End
> > compiling prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100074 ) :End prepared statement
> > 2008-02-22 13:05:37.098 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 80697),
> > (SESSIONID = 22), (DATABASE = DRSPRI-BACKUP), (DRDAID = null),
> > Executing prepared statement: update users set failcount = 0 where (
> > id = 611 OR id = 100074 ) :End prepared statement
> > 2008-02-22 13:05:37,100 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSPRI-BACKU
> > P - Task StatementExecuteUpdateTask from transaction 287 (update users
> > set failcount = 0 where ( id = 611 OR id = 100074 )) completed
> >
> > then, 30 secs later:
> > 2008-02-22 13:06:08,656 DEBUG
> > org.continuent.sequoia.controller.loadbalancer - 287 waits for USERS
> > locked by 285 (and others)
> >
> >
> >
> > so, overall, the summary of the sequence is:
> >
> > 13:05:36,985 app executes #285: ' update users set failcount = 0 where ( id 
> > = 611 OR id = 100073 ) '
> > 13:05:36.987 drspri executes query #285
> > 13:05:36,993 drspri finishes query #285
> >
> > 13:05:36,995 app sees query #285 as finished.
> > ??:??:??,??? app commits #285
> > 13:05:36,997 drspri starts commit for #285
> > 13:05:37,007 drspri finishes commit for #285
> >
> > 13:05:37,033 app executes #287: ' update users set failcount = 0 where ( id 
> > = 611 OR id = 100074 ) '
> > 13:05:37.036 drspri executes query #287
> > 13:05:37.042 drspri finishes query #287
> >
> > 13:05:37,044 app sees query #287 as finished.
> > ??:??:??,??? app commits #287
> > 13:05:37,045 drspri starts commit for #287
> > 13:05:37,055 drspri finishes commit for #287
> >
> > 13:05:37.091 drspri-backup performs Priority Inversion on #285
> > 13:05:37.091 drspri-backup performs Priority Inversion on #287
> > 13:05:37.091 drspri-backup executes query #285
> > 13:05:37.091 drspri-backup executes query #287 in database, #287 runs 
> > first, gets exclusive lock on row.
> > 13:05:37,100 drspri-backup finishes query #287
> > 13:06:08,656 sequoia thinks #287 is waiting for #285, hasn't committed.
> > 13:06:37.137 drspri-backup db fails query #285 with lock timeout
> >
> >
> >
> > So... Does anyone know if this is me doing something stupid, or a sequoia 
> > problem?
> >
> > My RequestScheduler config is:
> >
> >       <RequestScheduler>
> >          <RAIDb-1Scheduler level="passThrough"/>
> >       </RequestScheduler>
> >
> > And the docs say "The load balancer will just ensure that the writes are 
> > sent in the same order to all backends." - but that doesn't seem to be the 
> > case here?
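The ordering claim can be checked against the timeline above. This is a small sketch under stated assumptions: WriteOrderCheck and sameOrderEverywhere are hypothetical names, and the two orders are read off the logs in this message (DRSPRI ran #285 then #287; on DRSPRI-BACKUP, #287 took the row lock first).

```java
import java.util.List;

/** Hypothetical helper; compares the order in which backends executed the same writes. */
final class WriteOrderCheck {
    /** Returns true if every backend executed the transactions in the same order as the first one. */
    static boolean sameOrderEverywhere(List<List<Integer>> perBackendOrder) {
        for (List<Integer> order : perBackendOrder) {
            if (!order.equals(perBackendOrder.get(0))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> drspri = List.of(285, 287);        // order observed on DRSPRI
        List<Integer> drspriBackup = List.of(287, 285);  // #287 locked the row first on DRSPRI-BACKUP
        System.out.println(sameOrderEverywhere(List.of(drspri, drspriBackup))); // false
    }
}
```

The check fails for the logged sequence, which matches the observation that the two backends did not apply the conflicting updates in the same order.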
> >
> > Many thanks,
> >
> >         Andrew Lawrenson
> >
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of
> > Andrew Lawrenson
> > Sent: 21 February 2008 17:21
> > To: [email protected]
> > Subject: [Sequoia] priority inversion?
> >
> > Hi all,
> >
> >     I'm attempting to use sequoia (2.10.9) to cluster some Apache Derby 
> > databases, which generally seems to work as expected.
> > However, there is a problem that seems to keep cropping up, and appears 
> > like it might be related to Priority Inversion - and I'm not quite sure 
> > what that is.
> >
> > The problem I'm seeing is that sometimes when two queries (which are 
> > sequential & from the same client) both update the same row, they are 
> > hitting a locking situation which wouldn't normally occur.
> >
> > Normally, the requests from sequoia come through to derby looking like this 
> > (these are Derby Logs, for transaction #40738):
> >
> > 2008-02-21 16:13:34.190 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 40738),
> > (SESSIONID = 21), (DATABASE = DRSPRI-BACKUP), (DRDAID = null),
> > Executing prepared statement: update users set failcount = 0 where id
> > = ? :End prepared statement with 1 parameters begin parameter #1: 611
> > :end parameter
> > 2008-02-21 16:13:34.193 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 40738),
> > (SESSIONID = 21), (DATABASE = DRSPRI-BACKUP), (DRDAID = null),
> > Committing
> > 2008-02-21 16:13:34.295 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 40738),
> > (SESSIONID = 21), (DATABASE = DRSPRI-BACKUP), (DRDAID = null),
> > Committing
> > 2008-02-21 16:13:34.295 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 40738),
> > (SESSIONID = 21), (DATABASE = DRSPRI-BACKUP), (DRDAID = null),
> > Committing
> >
> > However, every so often, all I get is the execution, without any commit:
> > 2008-02-21 16:13:34.295 GMT Thread[DRSCluster - BackendWorkerThread
> > for backend 'DRSPRI-BACKUP' with RAIDb level:1,5,main] (XID = 40739),
> > (SESSIONID = 21), (DATABASE = DRSPRI-BACKUP), (DRDAID = null),
> > Executing prepared statement: update users set failcount = 0 where id
> > = ? :End prepared statement with 1 parameters begin parameter #1: 611
> > :end parameter
> >
> > (this is the only activity visible in derby for transaction #40739)
> >
> > but at the same time, Sequoia performs Priority Inversion on (presumably) 
> > this query:
> >
> > 2008-02-21 16:13:34,295 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Priority inversion for already started request update
> > users set failcount = 0 where id = ?/<!%I|611|!%>
> > 2008-02-21 16:13:34,295 DEBUG
> > org.continuent.sequoia.controller.backend.DatabaseBackend.DRSCluster.D
> > RSPRI-BACKUP - Moving StatementExecuteUpdateTask from transaction
> > 4222124650660804 (update users set failcount = 0 where id =
> > ?/<!%I|611|!%>) to non conflicting queue
> >
> > when this happens, it appears that the commit (which is issued by the 
> > client) never gets presented to derby, so that a lock is now kept on the 
> > updated row, and subsequent updates of this row fail with lock timeout 
> > errors (confirmed from Derby lock diagnostics).
> >
> > I've reproduced this several times, and every time there is a Priority 
> > Inversion when the commit disappears.  Without sequoia, I've had no locking 
> > issues.
> >
> > I've also just noticed that in all cases, the commit for the previous
> > transaction, the executing of the failing transaction and the priority
> > inversion are all timestamped to the same millisecond (although the
> > previous commit always comes through to derby before the execution)
> >
> > So, can anyone explain what Priority Inversion is, and where the problem 
> > might be?
> >
> > If it has any relevance, my setup has two or three distributed controllers 
> > (using appia w/tcp sequencer), each of which has two derby backends.  
> > Although there are distributed controllers, these failures are all local 
> > (e.g. the failures occur in the backends attached to the controller which 
> > receives the request).
> >
> >
> > many thanks in advance,
> >
> >     Andrew Lawrenson
> >
> >
> >


> Raidb1 load balancer and WaitForCompletionPolicy=first doesnot work
> -------------------------------------------------------------------
>
>          Key: SEQUOIA-955
>          URL: https://forge.continuent.org/jira/browse/SEQUOIA-955
>      Project: Sequoia
>         Type: Bug

>   Components: Core
>     Versions: Sequoia 2.10.8
>  Environment: Windows XP
>     Reporter: xavier roques
>     Assignee: Emmanuel Cecchet
>     Priority: Critical

>
>
> I have a VirtualDatabase with two backends, db1 and db2; the load balancer is 
> RAIDb-1 and the waitForCompletionPolicy is "first".
> DB1 and DB2 are up. My application makes a lot of updates, and after some time 
> one of the databases is disabled.
> I did not understand why, because the database was up.
> After some searching I found out that the database was disabled because of a 
> timeout while trying to get a new connection.
> My connection pool (size 50) was full, even though I had released all my 
> connections.
> Here is why it does not work.
> The load balancer has two backends, DB1 and DB2. Each backend has its 
> own queue, and each queue is managed by a different thread.
> A lot of updates are sent to the virtual database.
> The load balancer dispatches the requests to each backend. Now suppose 
> that the following two requests arrive:
> Request 1 (update)
> Request 2 (commit)
> The load balancer sends request 1 to each backend and waits for the first 
> completion.
> If the thread managing DB1 executes request 1 and sends the result to 
> the load balancer, the load balancer may dispatch request 2 even though 
> request 1 has not yet started on DB2.
> The code of the load balancer's RAIDDB1.commit is:
> {code}
>  for (int i = 0; i < nbOfThreads; i++)
>     {
>       DatabaseBackend backend = (DatabaseBackend) enabledBackends.get(i);
>       if (backend.isStartedTransaction(lTid))
>         commitList.add(backend);
>     }
> {code}
> Unfortunately, when this code is executed, backend DB2 may not yet have started 
> to execute request 1, so the commit is pushed only to backend DB1.
> When request 1 is later executed on DB2, a new connection is opened 
> and never released, because the commit is never executed on 
> backend DB2.
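
The race described in this report can be replayed deterministically in a few lines. This is a minimal sketch, not Sequoia's code: Backend, startTransaction, and backendsToCommit are hypothetical names, and only the isStartedTransaction check mirrors the quoted loop. It assumes, as the report does, that with policy "first" the commit is dispatched as soon as one backend has answered.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a backend; not the real DatabaseBackend class. */
final class Backend {
    final String name;
    private final Set<Long> startedTransactions = new HashSet<>();

    Backend(String name) { this.name = name; }

    /** Called when the backend's worker thread actually begins executing a write. */
    void startTransaction(long tid) { startedTransactions.add(tid); }

    boolean isStartedTransaction(long tid) { return startedTransactions.contains(tid); }
}

final class CommitRaceSketch {
    /** Same selection rule as the quoted loop: commit only where the transaction has started. */
    static List<String> backendsToCommit(List<Backend> enabledBackends, long tid) {
        List<String> commitList = new ArrayList<>();
        for (Backend backend : enabledBackends) {
            if (backend.isStartedTransaction(tid)) {
                commitList.add(backend.name);
            }
        }
        return commitList;
    }

    public static void main(String[] args) {
        Backend db1 = new Backend("DB1");
        Backend db2 = new Backend("DB2");
        List<Backend> backends = List.of(db1, db2);
        long tid = 1L;

        // Request 1 (update) is queued on both backends, but only DB1's worker has run it.
        db1.startTransaction(tid);

        // DB1's result is returned to the client, so Request 2 (commit) is dispatched now.
        System.out.println("commit sent to: " + backendsToCommit(backends, tid));

        // DB2's worker finally runs Request 1; no commit is queued behind it.
        db2.startTransaction(tid);
    }
}
```

Running it prints "commit sent to: [DB1]": DB2 is left with an open transaction and no commit, which is the state that keeps the connection (and any row locks) held.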

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   https://forge.continuent.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia
