Re: All subsequent CAS requests time out after heavy use of new CAS feature
Oh yes it is, like Counters :-)

On Sat, Dec 24, 2016 at 4:02 AM, Edward Capriolo wrote:
> Anecdotally, CAS works differently than the typical Cassandra workload. If
> you run a stress instance against 3 nodes on one host, you find that you
> typically run into CPU issues, but if you are doing a CAS workload you see
> things timing out before you hit 100% CPU. It is a strange beast.
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Anecdotally, CAS works differently than the typical Cassandra workload. If you run a stress instance against 3 nodes on one host, you find that you typically run into CPU issues, but if you are doing a CAS workload you see things timing out before you hit 100% CPU. It is a strange beast.

On Fri, Dec 23, 2016 at 7:28 AM, horschi wrote:
> Update: I replaced all quorum reads on that table with serial reads, and
> now these errors occur less often. Somehow quorum reads on CAS values cause
> most of these WriteTimeoutExceptions (WTEs).
>
> Also I found two tickets on that topic:
> https://issues.apache.org/jira/browse/CASSANDRA-9328
> https://issues.apache.org/jira/browse/CASSANDRA-8672
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Update: I replaced all quorum reads on that table with serial reads, and now these errors occur less often. Somehow quorum reads on CAS values cause most of these WriteTimeoutExceptions (WTEs).

Also I found two tickets on that topic:
https://issues.apache.org/jira/browse/CASSANDRA-9328
https://issues.apache.org/jira/browse/CASSANDRA-8672

On Thu, Dec 15, 2016 at 3:14 PM, horschi wrote:
> I would like to warm up this old thread. I did some debugging and found out
> that the timeouts are coming from StorageProxy.proposePaxos():
> callback.isFullyRefused() returns false and therefore triggers a
> WriteTimeout.
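For reference, switching those reads from QUORUM to SERIAL is a one-line change on the client side. Below is a minimal sketch with the DataStax java-driver; the contact point, keyspace name and key values are placeholders loosely based on the MDS.Lock table in the logs further down this thread, not anything from the original setup.

import com.datastax.driver.core.*;

public class SerialReadExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("MDS");   // placeholder keyspace

        // Read the lock row at SERIAL instead of QUORUM.
        Statement read = new SimpleStatement(
                "SELECT value FROM \"Lock\" WHERE lockname = ? AND id = ?",
                "locktest_1", "1")
                .setConsistencyLevel(ConsistencyLevel.SERIAL);

        Row row = session.execute(read).one();
        System.out.println(row == null ? "no lock row" : row.getBytes("value"));

        cluster.close();
    }
}

A SERIAL read runs through the Paxos prepare phase and completes any in-progress CAS round it encounters rather than racing with it, which may be why replacing the quorum reads reduced the timeouts here.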
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Hi,

I would like to warm up this old thread. I did some debugging and found out that the timeouts are coming from StorageProxy.proposePaxos(): callback.isFullyRefused() returns false and therefore triggers a WriteTimeout.

Looking at my ccm cluster logs, I can see that two replica nodes return different results in their ProposeVerbHandler. In my opinion the coordinator should not throw an exception in such a case, but instead retry the operation.

What do the CAS/Paxos experts on this list say to this? Feel free to instruct me to do further tests/code changes. I'd be glad to help.

Log:

node1/logs/system.log:WARN [SharedPool-Worker-5] 2016-12-15 14:48:36,896 PaxosState.java:124 - Rejecting proposal for Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1 columns=[[] | [value]]
node1/logs/system.log-Row: id=@ | value=) because inProgress is now Commit(2d8146b0-c2cd-11e6-f996-e5c8d88a1da4, [MDS.Lock] key=locktest_ 1 columns=[[] | [value]]
--
node1/logs/system.log:ERROR [SharedPool-Worker-12] 2016-12-15 14:48:36,980 StorageProxy.java:506 - proposePaxos: Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1 columns=[[] | [value]]
node1/logs/system.log-Row: id=@ | value=)//1//0
--
node2/logs/system.log:WARN [SharedPool-Worker-7] 2016-12-15 14:48:36,969 PaxosState.java:117 - Accepting proposal: Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1 columns=[[] | [value]]
node2/logs/system.log-Row: id=@ | value=)
--
node3/logs/system.log:WARN [SharedPool-Worker-2] 2016-12-15 14:48:36,897 PaxosState.java:124 - Rejecting proposal for Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1 columns=[[] | [value]]
node3/logs/system.log-Row: id=@ | value=) because inProgress is now Commit(2d8146b0-c2cd-11e6-f996-e5c8d88a1da4, [MDS.Lock] key=locktest_ 1 columns=[[] | [value]]

kind regards,
Christian

On Fri, Apr 15, 2016 at 8:27 PM, Denise Rogers wrote:
> My thinking was that, due to the size of the data, there may be I/O issues.
> But it sounds more like you're competing for locks and hit a deadlock issue.
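Until something changes on the coordinator side, the "retry the operation" idea can at least be approximated in the client: a CAS WriteTimeoutException means the outcome is unknown, and for an IF-based lock update a retry is harmless because it either applies or comes back with [applied] = false. The sketch below is purely illustrative (the helper, retry limit and backoff are assumptions, not Cassandra or driver internals):

import com.datastax.driver.core.*;
import com.datastax.driver.core.exceptions.WriteTimeoutException;

public class CasRetryExample {

    // Executes a conditional (CAS) statement and retries on write timeouts.
    // Returns true if the condition was applied. Illustrative only.
    static boolean executeCasWithRetry(Session session, Statement cas, int maxAttempts)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Row row = session.execute(cas).one();
                return row.getBool("[applied]");    // result column of a CAS statement
            } catch (WriteTimeoutException e) {
                // Outcome unknown: the proposal may or may not have been accepted.
                Thread.sleep(100L * attempt);       // simple backoff before retrying
            }
        }
        return false;                               // gave up after maxAttempts
    }
}

Note that if the timed-out attempt was in fact committed, the retry will report [applied] = false, so the caller still needs the read-back-and-compare-UUID check from the original post to decide whether it actually holds the lock.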
Re: All subsequent CAS requests time out after heavy use of new CAS feature
My thinking was that, due to the size of the data, there may be I/O issues. But it sounds more like you're competing for locks and hit a deadlock issue.

Regards,
Denise
Cell - (860)989-3431

Sent from my iPhone

On Apr 15, 2016, at 9:00 AM, horschi wrote:
> You ask because large values are known to cause issues? Anything special
> you have in mind?
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Hi Denise,

in my case it's a small blob I am writing (should be around 100 bytes):

CREATE TABLE "Lock" (
    lockname varchar,
    id varchar,
    value blob,
    PRIMARY KEY (lockname, id)
) WITH COMPACT STORAGE
  AND COMPRESSION = { 'sstable_compression' : 'SnappyCompressor', 'chunk_length_kb' : '8' };

You ask because large values are known to cause issues? Anything special you have in mind?

kind regards,
Christian

On Fri, Apr 15, 2016 at 2:42 PM, Denise Rogers wrote:
> Also, what type of data were you reading/writing?
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Also, what type of data were you reading/writing?

Regards,
Denise

Sent from my iPad

On Apr 15, 2016, at 8:29 AM, horschi wrote:
> Hi Jan,
>
> were you able to resolve your problem?
>
> We are trying the same and also see a lot of WriteTimeouts:
> WriteTimeoutException: Cassandra timeout during write query at consistency
> SERIAL (2 replica were required but only 1 acknowledged the write)
>
> How many clients were competing for a lock in your case? In our case it's
> only two :-(
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Hi Jan,

were you able to resolve your problem?

We are trying the same and also see a lot of WriteTimeouts:
WriteTimeoutException: Cassandra timeout during write query at consistency SERIAL (2 replica were required but only 1 acknowledged the write)

How many clients were competing for a lock in your case? In our case it's only two :-(

cheers,
Christian

On Tue, Sep 24, 2013 at 12:18 AM, Robert Coli wrote:
> 1) Upgrade to 2.0.1 release.
> 2) Try to reproduce symptoms.
> 3) If able to, file a JIRA at https://issues.apache.org/jira/secure/Dashboard.jspa including repro steps
> 4) Reply to this thread with the JIRA ticket URL
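For anyone instrumenting this on the client side: the numbers in the exception above ("2 replica were required but only 1 acknowledged the write") are exposed on the driver exception object. A minimal sketch with the DataStax java-driver; the keyspace, table and literal values are placeholders, not the poster's actual statement.

import com.datastax.driver.core.*;
import com.datastax.driver.core.exceptions.WriteTimeoutException;

public class CasTimeoutDetails {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        try {
            // Placeholder conditional update with the same shape as the lock
            // updates discussed in this thread.
            session.execute("UPDATE \"MDS\".\"Lock\" SET value = 0x01 " +
                            "WHERE lockname = 'locktest_1' AND id = '1' IF value = null");
        } catch (WriteTimeoutException e) {
            // For a conditional update the write type should be CAS and the
            // consistency level SERIAL, matching the message quoted above.
            System.out.printf("type=%s cl=%s acked=%d required=%d%n",
                    e.getWriteType(),
                    e.getConsistencyLevel(),
                    e.getReceivedAcknowledgements(),
                    e.getRequiredAcknowledgements());
        } finally {
            cluster.close();
        }
    }
}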
Re: All subsequent CAS requests time out after heavy use of new CAS feature
On Mon, Sep 16, 2013 at 9:09 AM, Jan Algermissen wrote:
> I am experimenting with C* 2.0 (and today's java-driver 2.0 snapshot) for
> implementing distributed locks.

[ and I'm experiencing the problem described in the subject ... ]

> Any idea how to approach this problem?

1) Upgrade to 2.0.1 release.
2) Try to reproduce symptoms.
3) If able to, file a JIRA at https://issues.apache.org/jira/secure/Dashboard.jspa including repro steps
4) Reply to this thread with the JIRA ticket URL

=Rob
All subsequent CAS requests time out after heavy use of new CAS feature
Hi,

I am experimenting with C* 2.0 (and today's java-driver 2.0 snapshot) for implementing distributed locks.

Basically, I have a table of 'states' I want to serialize access to:

create table state (
    id text,
    lock uuid,
    data text,
    primary key (id)
)

(3 nodes, replication factor 3)

insert into state (id) values ('foo');

I try to acquire the lock for state 'foo' like this:

update state set lock = myUUID where id = 'foo' if lock = null;

and check whether I got it by comparing the lock against my supplied UUID:

select lock from state where id = 'foo';

... do work on 'foo' state ...

release the lock:

update state set lock = null where id = 'foo' if lock = myUUID;

This works pretty well, and if I increase the number of clients competing for the lock I start seeing timeouts on the client side. That is natural so far, and the lock also remains in a consistent state (I am able to work around the failing clients and the uncertainty over whether they got the lock or not).

However, after pausing the clients for a while the timeouts do not disappear. Meaning that when I send a single request after everything calms down, I still get a timeout:

Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency SERIAL (-1 replica were required but only -1 acknowledged the write)

I do not see any reaction in the C* logs for these follow-up requests that still time out.

Any idea how to approach this problem?

Jan
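For readers following along, the acquire/check/release cycle above maps onto the java-driver roughly as follows. This is only an illustrative sketch of the statements in this post; the contact point, keyspace name and error handling are assumptions.

import java.util.UUID;

import com.datastax.driver.core.*;
import com.datastax.driver.core.exceptions.WriteTimeoutException;

public class LockExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mykeyspace");   // placeholder keyspace
        UUID myUuid = UUID.randomUUID();

        // Try to acquire the lock for state 'foo' (a CAS / lightweight transaction).
        try {
            session.execute("UPDATE state SET lock = ? WHERE id = 'foo' IF lock = null", myUuid);
        } catch (WriteTimeoutException e) {
            // Outcome unknown - the check below resolves it either way.
        }

        // Check whether we got the lock by comparing it against our own UUID.
        Row check = session.execute("SELECT lock FROM state WHERE id = 'foo'").one();
        boolean acquired = check != null && myUuid.equals(check.getUUID("lock"));

        if (acquired) {
            // ... do work on the 'foo' state ...

            // Release the lock again.
            session.execute("UPDATE state SET lock = null WHERE id = 'foo' IF lock = ?", myUuid);
        }

        cluster.close();
    }
}

The ResultSet of the conditional UPDATE also carries an [applied] column that could be inspected directly; the read-back above simply mirrors the check described in the post.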