[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-30 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887295#comment-13887295
 ] 

Benedict commented on CASSANDRA-5549:
-

Awesome! I've fixed the formatting, added a couple of comments and opened a 
pull request.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887229#comment-13887229
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

(Please submit a pull request for jamm-0.2.6.  Note that jamm uses "normal" 
brace-on-same-line convention.)

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887225#comment-13887225
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Committed!

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886121#comment-13886121
 ] 

Benedict commented on CASSANDRA-5549:
-

I've rebased from trunk and fixed the bugs and pushed (-f) to 
[merge-5549|https://github.com/belliottsmith/cassandra/tree/merge-5549]. There 
were two small issues: in RangeTombstoneList I was setting start[i] = null 
before I copied its value; and in FBUtilities.hashToBigInteger() my 
"equivalency" rewrite wasn't so equivalent after all, so I reverted it. Whoops.

There are still three new tests failing, but this is because we fixed a bug and 
so DefsTest no longer times out; CASSANDRA-6636 addresses these new issues.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884283#comment-13884283
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

(tried to squash only-5549 onto trunk to see if it was something that got fixed 
recently in trunk and got a zillion conflicts, so rebasing that would be useful 
in any case)

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884285#comment-13884285
 ] 

Benedict commented on CASSANDRA-5549:
-

Right, I'll take a look first thing tomorrow. Don't trust myself to actually 
correct anything right now.

I've got a squashed/rebased version that's only a week or so old, so I'll 
update that against your changes and upload.



> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884276#comment-13884276
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

I'm seeing tests fail on both your branch and mine that do not fail in trunk, 
e.g. RangeTombstoneListTest.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882598#comment-13882598
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Fair enough.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-26 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882591#comment-13882591
 ] 

Benedict commented on CASSANDRA-5549:
-

I'm not totally convinced this is better, but I think I can make it worth with 
the off-heap stuff. I think it will get a little ugly, as we'll need two 
allocators as well - we'll need some method in the AbstractAllocator interface 
to getOffHeapPartner() and getOnHeapPartner(), or something, so that Memtable 
can avoid special casing the OffHeapAllocator. But the OnHeapPartner won't 
actually support allocating BBs, it will just be for querying and updating 
amounts of memory.

Not relishing it, but don't mind pushing it off until we have to address that 
problem.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882570#comment-13882570
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Okay, went with plan B of unifying MT/Pool and MO/Allocator.  Pretty happy with 
how this worked out: https://github.com/jbellis/cassandra/commits/5549-3

In my mind, off heap stuff can now be added by creating an appropriate 
Allocator class, and a Pool implementation that owns two sub-pools.

(I'm seeing a bunch of test failures but I don't think it's from this refactor.)

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881668#comment-13881668
 ] 

Benedict commented on CASSANDRA-5549:
-

MemoryTracker -> PoolLedger
MemoryOwner -> AllocatorLedger
?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881666#comment-13881666
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

bq. I can see your criticism of Pool being in fact two pools, but I think it 
can also be viewed as just one. It only ever allocates one kind of resource, 
but in some cases those resources may occupy memory both on and off heap.

Well, the real point is that we have two things (Pool/Allocator, MT/MO) that 
operate on (connected on-off heap reservoirs / a single reservoir) and the 
names should reflect that relationship.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-24 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881626#comment-13881626
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. I think what I'm having a hard time with is, Pool should be a reservoir 
that we allocate out of and instead it's a container of two such reservoirs (MT)

I can see your criticism of Pool being in fact two pools, but I think it can 
also be viewed as just one. It only ever allocates one kind of resource, but in 
some cases those resources may occupy memory both on and off heap. The 
confusing thing, perhaps, is that we overload the trackers/owners to track not 
only what the pool itself is using, but also what overheads we incur in using 
the pool from Memtable. This is a bit unclean conceptually, but much cleaner in 
implementation.

I also think that renaming MT to Pool would be confusing. Certainly an 
OffHeapPool will legitimately _pool_ (and cache) the resources it allocates, so 
it could get confusing if we call the thing logging the amount the pool, and 
the thing actually pooling something else. MemoryOwner -> Allocator I'm very 
much a -1 on, as it really doesn't allocate anything.

That said, I can see where you're coming from. Only, I have already introduced 
AllocatorGroup into my follow on patch, which I think probably makes 
LinkedAllocatorsGroup pretty confusing :-)

I want to offer some alternatives, but I can't think of anything better.

MemoryTracker -> Ledger?
MemoryOwner -> Account? 

Pool -> Reservoir works for me

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881589#comment-13881589
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

I kind of like the Linked prefix since we can apply that at the MO level too:

MemoryTracker -> Pool
Pool -> LinkedPools
MemoryOwner -> Allocator
PoolAllocator -> LinkedAllocators

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881588#comment-13881588
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Aside: we have one (two) MemoryOwner per Memtable right?  I don't think it's 
worth obfuscating {{owns}} with an updater instead of just using AtomicLong.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881584#comment-13881584
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

I think what I'm having a hard time with is, Pool should be a reservoir that we 
allocate out of and instead it's a container of two such reservoirs (MT).  I 
propose renaming:

MemoryTracker -> Pool
Pool -> LinkedPools?  DualPool?  OnOffHeapPools?  Reservoir?  PoolPool?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875379#comment-13875379
 ] 

Benedict commented on CASSANDRA-5549:
-

Merged your changes, and pushed a version without NBQ et al.

One thing I rolled back was your deletion of the commented out code. I 
uncommented it. That was a massive oversight of something I was using for 
stress testing the memory management stuff for slab allocators :-/

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874878#comment-13874878
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Pushed seal back to issue and includes to isAfter.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874840#comment-13874840
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. The problem is that it's not "just" a memory fence, but is tied to the 
Group behavior. (Once issued, the old group is sealed and a new Group started.)

True, but from the point of view of the Barrier API, that doesn't really seem 
like important information to me, although it is a necessity of function that 
it happens. I think it is important to convey, especially, that the Barrier 
itself doesn't really exist until you call this method, it's just a place 
holder, which seal() doesn't seem to. A few alternatives: fix(), impose(), 
slice().

Coming up a bit empty on really good alternatives, though, and probably not 
worth much more discussion. I am not completely against seal(), as the word 
sounds good when said with Barrier. So whatever works for you.


> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874831#comment-13874831
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

bq. issue() is the normal parlance when talking about memory fences

The problem is that it's not "just" a memory fence, but is tied to the Group 
behavior.  (Once issued, the old group is sealed and a new Group started.)

bq. isAfter()?

WFM.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874824#comment-13874824
 ] 

Benedict commented on CASSANDRA-5549:
-

issue() is the normal parlance when talking about memory fences, in my 
experience, which this is roughly a high level model of, so I would like to 
keep it. However we could go with partition()? Or I can wrack my brains some 
more...

OpOrdering -> OpOrder is good. I'm definitely with you on Group as well, now.

I'm not fond of includes(). Maybe isAfter()?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874817#comment-13874817
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

In the meantime, I've pushed a few more renames for your enjoyment.  
(OpOrdering -> OpOrder, issue -> seal, accept -> includes.)

Of these, I am least fond of {{seal}} which I think is better than issue (which 
could mean a lot of different things including creation) but still not super 
explanatory withoutreading javadoc.  Maybe {{sealGroup}} or even 
{{sealCurrentGroup}}?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874815#comment-13874815
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. and we'll be good to go here.

Awesome.

bq. let's build this on top of CLQ for now

I'll upload a patch later on today. Got errands to run and don't want to rush 
swapping it in WaitQueue, as it requires a slightly different approach.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874798#comment-13874798
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

With an alternative identified for CASSANDRA-6557, let's build this on top of 
CLQ for now.  I'll finish review of the Pool stuff and we'll be good to go here.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-17 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874786#comment-13874786
 ] 

Benedict commented on CASSANDRA-5549:
-

I realised as I was going to sleep this morning (strangely the best time to 
realise you've made a mistake, I find) that my fix for the bugfix was itself 
busted. Re-uploaded minor tweak to the branch.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874262#comment-13874262
 ] 

Benedict commented on CASSANDRA-5549:
-

FYI, on an unrelated note I've merged your changes and pushed an unrelated bug 
fix to [5549|https://github.com/belliottsmith/cassandra/tree/only-5549]

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874191#comment-13874191
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. When does CLQ block? Are you thinking of LBQ?

It's double locked, so lock to read, lock to write. Also double-lock to 
Iterator remove. So by NonBlockingQueue, I mean WaitFreeQueue, not 
NoBlockingMethodQueue :-)

bq. So if we can call NBQ an optimization

If it weren't for WaitQueue I'd say it were just an optimisation, but it's 
likely to cause real wasted cycles. I've seen it happening, it's not a 
theoretical risk, the only question is how big the problem would be given the 
use case of WaitQueue and the answer is, I'm not sure. There's a reasonable 
chance it will only trigger wasted cycles infrequently, in which case we can 
address it later. 

CASSANDRA-6557 really needs it to fix the bug I spotted, though, so we'll have 
to put it in before release anyway. Not sure that buys you much respite!


> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874178#comment-13874178
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

When does CLQ block?  Are you thinking of LBQ?

As a progress update: I'm done with OpOrdering, CFS, Memtable, et al., and now 
looking at the plumbing (NBQ and the Allocator stuff).  So if we can call NBQ 
an optimization and split it out, great.  Otherwise, I don't see a good place 
to split things. :)

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874175#comment-13874175
 ] 

Benedict commented on CASSANDRA-5549:
-

Maybe split :

- WaitQueue, NBQ
- BTree UpdateFunction
- OpOrdering + CL utilisation
- WaitQueue timing + CL timing
- ObjectSizes + JAMM
- RoW

?


> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874168#comment-13874168
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. WaitQueue has changed enough that git no longer recognizes the new one as 
sharing history with the old one. Can you summarize?

Sure:

- It includes the aforementioned improvements with NBQ (including now being 
non-blocking, except in the case of signaller/waiter trampling on each other 
occasionally in pthread world).
- It abstracts the Signal, so that we can have other implementations: wait for 
all, wait for any, for example.
- It supports timing the amount of time spent waiting. I have toyed with 
splitting this out into another ticket, but I did it whilst I was going, as I 
thought having the amount of time spent waiting on Memtable (or CL) flush would 
be a helpful metric.

That said, perhaps we could try splitting this whole thing into a few smaller 
tickets, make it more manageable for you to review. I kind of suspect the main 
body won't diminish a lot though, which is why I didn't in the first place. But 
every little probably helps :-)


> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874158#comment-13874158
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. We have ANBQ -> NBQV -> NBQ, with no other implementations.

NBQV and NBQ should be kept separate, as having two different queues modifying 
state could get ugly. They could strictly speaking probably be merged at the 
moment, but unpicking them in future might be difficult, and I'm not sure it's 
helpful. ABQ I use for OffHeap memtables, as I have a more memory efficient 
queue for storing references necessary for GC (at least until we move more 
off-heap).

bq. What is the difference between NBQ and CLQ? I'd be inclined to introduce 
NBQ in a separate ticket.

FIrstly, NBQ is non-blocking, CLQ is not. This is a nice property to have in 
and of itself, as we now have an almost totally non-blocking read and write 
path (actually codahale is now one of the only main blocking points, other than 
the obvious memtable/CL flushes), which maybe not immediately useful may allow 
us some neat optimisations in the near future.

Secondly, NBQ provides some very useful methods, most specifically 
appendIfTail() and removeHeadIf(). Even though CLQ *could* offer these methods, 
it doesn't, and it makes certain things very difficult to write. These are 
mostly useful for the off-heap memtables, but I also use them for 
CASSANDRA-6557 to fix a potentially dangerous bug, and make the code easier to 
understand.

Lastly, and the only reason to include it in _this_ ticket, is the snap() 
functionality that allows a persistent view of the queue that (mostly) does not 
change. In WaitQueue, since it is now used extensively in more highly 
concurrent places with spurious wake ups, this is really essential: waiting 
threads can (actually very easily) wake-up, find they have no work to do, and 
go back to sleep again _before the signalling thread has reached the end of the 
queue_, meaning they get woken up again... potentially repeatedly. This is 
especially a risk due to the use of pthread mutexes to implement 
LockSupport.park/unpark - whilst the signalling thread releases the waiting 
thread's mutex before waking it, if that thread is state changing at that time 
anyway, the signalling thread will wait (a full scheduler delay), and this can 
happen repeatedly.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874043#comment-13874043
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

We have ANBQ -> NBQV -> NBQ, with no other implementations.  Can we combine 
these?

What is the difference between NBQ and CLQ?  I'd be inclined to introduce NBQ 
in a separate ticket.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874005#comment-13874005
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

WaitQueue has changed enough that git no longer recognizes the new one as 
sharing history with the old one.  Can you summarize?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-16 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873300#comment-13873300
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. Do we rely on accept returning true before issue, or can we make that 
IllegalState?

It's a necessity of the semantics. For accept() to be properly synchronised 
with issue(), it needs to be called as soon as the Barrier is exposed, so that 
false can happen the instant issue() happens. Probably should have commented 
that better. The semantics of accept remain consistent, though - namely that it 
only returns true for any operations started before the (eventual) issue().

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-15 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873026#comment-13873026
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Do we rely on accept returning true before issue, or can we make that 
IllegalState?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-15 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871924#comment-13871924
 ] 

Benedict commented on CASSANDRA-5549:
-

For accepts() to work correctly the Ordered on which it is called MUST be 
exposed atomically wrt changing the "current" Ordered/Group. This is at the 
very least difficult with the scheme you suggest, and I think probably 
impossible. At least, not without synchronising on the OpOrdered, and doing 
everything that Barrier does, in the caller.

Note I have scrapped a lengthy message about how *expiring* atomically would 
also be ugly (and still would be necessary to some extent, making the code even 
uglier in the caller than it is in Barrier), as I hadn't yet realised that 
exposing atomically would be nearly impossible, which I hope underlines how 
difficult this is, and that encapsulating as much of it as possible in Barrier 
is a Good Thing. So that we only have to get it right the once.

Stylistically I also think it is *much* clearer to separate the two concepts.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-14 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871641#comment-13871641
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Pushed more refactorage to my branch.

I have a nagging feeling that OpOrdering could be done with two classes instead 
of three, merging Barrier and Ordered:

{code}
public void consume()
{
SharedState state = this.state;
state.setReplacement(new State())
state.doSomethingToPrepareForBarrier();

state.opGroup = ordering.currentActiveGroup();
state.opGroup.expire()
state.opGroup.await();

this.state = state.getReplacement();
state.doSomethingWithExclusiveAccess();
}

public void produce()
{
Group opGroup = ordering.start();
try
{
state.doProduceWork();
}
finally
{
opGroup.finishOne();
}
}
{code}

(We could still provide an accepts() method for the benefit of getMemtableFor, 
but I don't see that requiring a 3rd class either.)

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-10 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867933#comment-13867933
 ] 

Benedict commented on CASSANDRA-5549:
-

I've merged your changes and removed all of the mentioned unnecessary changes, 
as well as improved some javadoc, and fixed the bug in PostFlush.

Same repository as before.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-09 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867399#comment-13867399
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. I get that, but it's good hygiene to keep things self contained (and I 
don't have to review it until then .

Sure, and I shouldn't get into bad habits anyway. I was already feeling a 
guilty about this. I should have split the branch a long time ago and never got 
into this problem :-)

bq. But if it's not critical then I'm definitely partial to this form as being 
easier to understand.

It's definitely not, at least not yet. We allocate so much on the write path 
this will disappear in the noise. But it would be nice to be slowly moving 
towards a day when that isn't the case, and baby steps in the right direction 
help.

bq. How good are you at reading JIT assembly?

Not good enough, but this is probably a good opportunity to fix that a little. 
As much as it isn't super important right now, it's a useful thing in and of 
itself.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867396#comment-13867396
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

bq. I should mention that I hope to introduce a patch next week that relies on 
most of these things we'll be stripping out

I get that, but it's good hygiene to keep things self contained (and I don't 
have to review it until then :).

bq. It's also not trivial to design a test that tells you for sure that it 
does/doesn't

How good are you at reading JIT assembly? :)

But if it's not critical then I'm definitely partial to this form as being 
easier to understand.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-09 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867373#comment-13867373
 ] 

Benedict commented on CASSANDRA-5549:
-

As to your change at 6e029a3ee4943a30af078a5c2b6d483c348ffc1f, I'm not totally 
convinced it will remain stack allocated, due to the allocation being split 
into two locations, that would need to be coalesced into one by the compiler. 
It's possible it does this, but it's not obvious that it would. It's also not 
trivial to design a test that tells you for sure that it does/doesn't.

I doubt it will make a huge difference to performance, especially at the 
moment, though.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-09 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867345#comment-13867345
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. Looks like the memtables.isEmpty path in Flush.run will result in PostFlush 
running with a null lastReplayPosition which will cause NPE.

Damn. Should have been more careful with my ninja bugfix.

bq. PoolAllocator.stateUpdater is unused. Should transition be CAS-ing?

Used by OffHeapAllocator. Can remove it until then.

bq. PoolAllocator.Gc is also unused. Let's take this out and add back in when 
needed.

Makes sense

bq. PoolAllocator.setDiscarding and writeBarrier.markBlocking are both 
commented as allowing allocations to exceed the memory limit. Only isBlocking 
actually appears to affect this.

You're absolutely correct: PoolAllocator.setDiscarding() no longer does this. I 
narrowed the scope under which overshoot was permitted by introducing 
markBlocking, but apparently didn't keep the javadocs up to date.

bq. There are several unused methods in OpOrdering. (If we're planning to use 
this in a future patch, let's also leave these out for now.)

I am a little less comfortable about this, as the most important unused methods 
are safe*, which are somewhat interwoven with functionality. Although we can 
remove them relatively cleanly if you prefer.

bq. Signal javadoc ends mid-sentence. Would be nice to have javadoc on all the 
methods too.

Whoops.

bq. Why wrap Pool constructor parameters in Setup object? Simpler to just 
inline the helper methods into the constructor.

Do you mean to have protected methods we can call from within the constructor 
and overridden by the extending class? This is another case of not needed just 
yet, but if you want to provide a custom PoolCleaner or MemoryTracker, you need 
to construct them in the Pool constructor so that the object is fully 
constructed and can be passed to the PoolCleaner and MemoryTracker 
constructors. Can always introduce this in my next patch though.

bq. Should we bother allowing Pool cleaner to be null?

Good point. We could not. At the moment the only pools I intend to create 
without a cleaner in the near future, I probably intend to create "unbounded", 
so it actually should be fine (the cleaner will never be called). We can always 
introduce the checks later if we need them.

I should mention that I hope to introduce a patch next week that relies on most 
of these things we'll be stripping out. But it shouldn't make a huge difference 
either way.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867306#comment-13867306
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Comments so far:

Looks like the memtables.isEmpty path in Flush.run will result in PostFlush 
running with a null lastReplayPosition which will cause NPE.

PoolAllocator.stateUpdater is unused.  Should transition be CAS-ing?

PoolAllocator.Gc is also unused.  Let's take this out and add back in when 
needed.

PoolAllocator.setDiscarding and writeBarrier.markBlocking are both commented as 
allowing allocations to exceed the memory limit.  Only isBlocking actually 
appears to affect this.

There are several unused methods in OpOrdering.  (If we're planning to use this 
in a future patch, let's also leave these out for now.)

Signal javadoc ends mid-sentence.  Would be nice to have javadoc on all the 
methods too.

Why wrap Pool constructor parameters in Setup object?  Simpler to just inline 
the helper methods into the constructor.

Should we bother allowing Pool cleaner to be null?

OCD at https://github.com/jbellis/cassandra/commits/5549-2.  
6e029a3ee4943a30af078a5c2b6d483c348ffc1f is the only non-cosmetic one.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-07 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864173#comment-13864173
 ] 

Benedict commented on CASSANDRA-5549:
-

I just realised there was a fairly glaring bug with Flush/PostFlush. I was 
'optimising' the path to discard CL segments when there were no 2is, but this 
was dangerous because of potential reordering of flushes (so a later flush 
could mark CLS unused when an earlier flush was still in progress). I've fixed 
and uploaded.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863148#comment-13863148
 ] 

Benedict commented on CASSANDRA-5549:
-

Largely the same as before, the mutation thread blocks until enough memory 
becomes available to complete the request. The difference is, depending on 
where/why the breach occurs, it may not block until after it completes its 
modification (as some of the memory bookkeeping is batched to the end for ease 
and speed).

Any call to MemoryOwner.allocate() is potentially a blocking call, if there 
isn't enough room available to satisfy the allocation.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863143#comment-13863143
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

What happens now when I hit my memtable memory ceiling, before flush makes more 
room?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2014-01-03 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861886#comment-13861886
 ] 

Benedict commented on CASSANDRA-5549:
-

I have a patch available for this 
[here|https://github.com/belliottsmith/cassandra/tree/only-5549]

I've been a little reticent to post it, as it's a bit of a monster of a patch, 
but I think I've now done my best to keep it well commented and mostly limit 
unnecessary changes. There are some changes that may appear over engineered for 
their current use, but I am using these in a continuation of this patch for 
off-heap memtables. I'll describe some of these below, but unpicking still 
useful changes seemed wasteful. If they get in the way of review we can revisit 
that decision.

There are several main areas of updates:

1) Removal of switchLock itself: The main work here is actually in the 
OpOrdering synchronisation class. This class explains itself, so I won't go 
into detail here, but provides an easy mechanism for ensuring we can coordinate 
our updates to Memtables so that we know what CL position they contain data to, 
and to know when the memtable is safe to be written to disk. The actual 
flushing of the memtable has been refactored a little also, to keep ordering 
guarantees.

2) Allocators and Memory Management: by removing the switch lock, we get rid of 
our ability to control heap growth by row mutations. To fix this, I've created 
the concept of a PoolAllocator, with associated Pool that has fixed memory 
limits. Any allocation requires the pool to allot room from its limit to the 
allocator (this is dealt with by MemoryTracker and MemoryOwner). This required 
a lot of minor modifications all over the place, to make measurement of object 
sizes at modification time cheap and accurate. Mostly I've achieved this by 
modifying jamm - a new branch is 
[here|https://github.com/belliottsmith/jamm/tree/guess] so that it will always 
give us a useful answer. Wherever we used to be using ObjectSizes adhoc in a 
class (generally incorrectly it turns out, not unsurprisingly as the API isn't 
obvious) I now *always* call measure() on an instance of the object and store 
that in a static field, and use simpler methods for any dynamic space use.

Worth noting: I've renamed IMeasureableMemory.memorySize() to excessHeapSize(), 
and I've modified (where applicable) its value to only count data we wouldn't 
otherwise be storing. This only makes a difference in a few places, but I think 
is an important distinction.

This change also makes any limit on flush queue size irrelevant, so the metric 
we use for controlling flushing is instead a ratio of in-use-memory to 
memory-limit, ignoring any already flushing data, which once breached will 
trigger a flush of the largest CFS.

3) Some concurrency primitives: NonBlockingQueue (and related classes) and 
WaitQueue. NonBlockingQueue is used more extensively in the off heap changes, 
but I leave it in here because it improves WaitQueue a lot, and we rely on 
WaitQueue much more with the proliferation of the OpOrdering operations. It 
helps us move much closer to completely non-blocking read/write operations 
also. We also use it to get rid of the Thread.yield() in SlabAllocator. I've 
aimed to keep NBQ as simple as possible.

4) CommitLog has been updated to use OpOrdering, and also includes a bug fix. I 
considered splitting this into a separate ticket, but it's such a tiny 
proportion of the overall changes I'm not sure it warrants it. The bug fix we 
may want to split out if this takes a while to go through.


> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-04 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838802#comment-13838802
 ] 

Benedict commented on CASSANDRA-5549:
-

For a given run of the JVM the overhead is constant for each type of object 
allocated, and the objects allocated can be predicted accurately given the 
number of columns we are storing. I've done object size measurement before :-)

I don't see anything in CASSANDRA-4860 that is surprising, but perhaps I've 
missed something specific you're worrying about? In general it's dealing with 
"miscalculating" the portion of a ByteBuffer we're referencing. This is a 
concern for live bytes, not retained bytes, and my scheme outlined above was 
for retained bytes which is what we care about for memory constraints, but I 
will also be replacing the live bytes calculation, since it will be easy to do 
at the same time. But the same approach works, care is just needed.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-04 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838796#comment-13838796
 ] 

Vijay commented on CASSANDRA-5549:
--

Well it is not exactly a constant overhead you might want to look at 
o.a.c.u.ObjectSizes (CASSANDRA-4860)...

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-04 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838782#comment-13838782
 ] 

Benedict commented on CASSANDRA-5549:
-

bq. how does the switch RW Lock to a kind of CAS operation change this 
schematics?

In forceFlush() we obtain the writeLock, but do not relinquish it until we have 
successfully added to the flushWriters queue. The flushWriters queue length 
also configures how often we should flush, so that once it is full, we are 
effectively "out of memory". This is hardly a *precise* mechanism for memory 
control, but it is the one we currently use, and it definitely needs a 
replacement.

bq. IMHO, that might not be good enough since Java's memory over head is not 
considered. And calculating the object size is not cheap either

I don't see your concerns here? We can easily and cheaply calculate the costs - 
we precompute the overheads, and simply apply them on a per row and per key 
basis. The overheads are pretty fixed for both - for SnapTreeMap they're 
exactly the same for each key, and with CASSANDRA-6271 they are relatively 
easily computable (or simply countable, in log(32,N) time - probably I will opt 
for computing first, then counting after insertion to get exact amount used). 
Java's memory overhead is included in any calculation.



> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Benedict
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-03 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838627#comment-13838627
 ] 

Vijay commented on CASSANDRA-5549:
--

{quote}
Without switch lock, we won't have anything preventing writes coming through 
when we're over-burdened with memory use by memtables.
{quote}
I should be missing something, how does the switch RW Lock to a kind of CAS 
operation change this schematics?
Are we talking about additional requirement/enhancements to this ticket?

{quote}
 When we flush a memtable we release permits equal to the estimated size of 
each RM
{quote}
IMHO, that might not be good enough since Java's memory over head is not 
considered. And calculating the object size is not cheap either

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Vijay
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-02 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836818#comment-13836818
 ] 

Benedict commented on CASSANDRA-5549:
-

Without switch lock, we won't have anything preventing writes coming through 
when we're over-burdened with memory use by memtables.

What I'd like to suggest is effectively a global Semaphore, with permits equal 
to the size allocated for memtables; on KS.apply(RM) we estimate the size of 
the RM and take that many permits. Once we've added the RM and know better how 
much it occupies, we adjust the Semaphore to (more) accurately reflect the 
amount of memory in use. When we flush a memtable we release permits equal to 
the *estimated size* of each RM.

This may be pushing the boat out, but would probably result in not relying on 
memtable live metering/scanning for size estimation, which we could retire. 
Either way we're estimating the size, but with this approach we're keeping 
*tight* control over the (estimated) memory allocated to memtables, whereas at 
the moment we have some tricks that we hope keep it there. If we estimate space 
used cautiously, we should be able to better guarantee no OOM, at least from 
this part of the code. 

I have a *reasonably* straight forward scheme for estimating size used by a RM 
that should be as good as we currently have. Basic premise is to calculate 
average space used by an item in ConcurrentSkipListMap using metering at 
startup with a map of size, say, 1M entries, rounded up. If we depend on 
CASSANDRA-6271 we can easily calculate exact overhead for the BTrees, or 
otherwise can do a similar metering approach for SnapTreeMap. So we have an 
overhead per row and per value. Separately we track how much space we are using 
for a given memtable's slab allocator. We use the RM's data size only for the 
initial estimation, to decide if we have room, and ignore it once it's actually 
added, as it will be accounted for in the slaballocator.



> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Vijay
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-11-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835123#comment-13835123
 ] 

Benedict commented on CASSANDRA-5549:
-

I suggest the following (somewhat complex seeming approach), building on my 
patch for 3578:

We extract the CLS.Allocation object into a CommitState object that is 
allocated in CFS.apply() prior to performing CL.add(). The CLS.AppendLock 
object is rewritten to be a more custom job which we will for now call 
MutationBarrier, and extracted along with CommitState.

MutationBarrier will be a synchronisation primitive that permits issuing 
periodic barriers that ensure operations started prior to the barrier have all 
completed, and also can be used to create a token on an about-to-be-issued 
barrier that ensures operations started after the barrier (when it *is* issued) 
know not to interfere with any state of an object that is using the barrier. 
This was probably unclear, but the steps of CFS.apply() using it may clarify it:

# Allocate CommitState, register operation with MutationBarrier
# CL.add() - on exit, has updated CommitState with position and segment of 
replay position.*
# Checks Memtable's BarrierToken to see if we are permitted to modify: 
#* if it's absent we simply make our modification and scoot;
#* if it's present and permits us to make our modification, we do so but ALSO 
update a ReplayPosition property (with cas, ensuring it is >= the one we have 
in CommitState)
#* if it's present and does not permit us to modify, we look up its replacement 
Memtable and repeat
# Release our hold on the MutationBarrier, signalling any waiters

When we flush a memtable, we:
# Set the ReplayPosition
# Create the replacement Memtable and chain it from ourselves
# Set the BarrierToken
# Wait on the Barrier
# Flush to disk, using the ReplayPosition we have maintained

Note that we will no longer perform any reference counting on the Memtables. I 
will ensure that the mutation calls are all non-blocking, but may for 
correctness and simplicity make those attempting to flush/issue a new barrier 
take out a (possibly spin-) lock on the MB in order to issue a Token atomically.

Thoughts?

\*CL.sync() will use the MB to fulfil the AppendLock role also. CL.add() will 
release a hold on the MB that is related to CL only, to permit CL to proceed 
immediately.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Vijay
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-09-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781214#comment-13781214
 ] 

Vijay commented on CASSANDRA-5549:
--

Hi Ryan, Can you give a shot at 
https://github.com/Vijay2win/cassandra/commits/5549-v2 on 10M keys atleast.

Rebased [~jbellis] branch I moved the CommitLogAllocator forceFlush back to 
separate thread, removed isDirty boolean since isClean is called in a separate 
thread and hence shouldn't help performance on writes, rest is all Jonathan...

My benchmark on 32 physical core machine shows a better performance than 
earlier. ~72 vs ~84
http://pastebin.com/GRPMUcSB


> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Vijay
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-08-26 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750764#comment-13750764
 ] 

Ryan McGuire commented on CASSANDRA-5549:
-

Looks pretty similar on the eight core machines:

!5549-sunnyvale.png!

[Data for this benchmark 
here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.remove_switchlock.sunnyvale.json&metric=interval_op_rate&operation=stress-write&smoothing=1]

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-08-26 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750213#comment-13750213
 ] 

Ryan McGuire commented on CASSANDRA-5549:
-

Yes, this is on the quad cores. I'll try the other cluster.

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-08-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750203#comment-13750203
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Was this on the 4 core machines?  Can you test on 8?

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-08-26 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750184#comment-13750184
 ] 

Ryan McGuire commented on CASSANDRA-5549:
-

[~jbellis] I ran a benchmark against your branch comparing it to the prior 
commit. I didn't see any difference in write performance:

!5549-removed-switchlock.png!

[Data for this benchmark 
here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.removed_switchlock.json&metric=interval_op_rate&operation=stress-write&smoothing=1]

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-05-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651624#comment-13651624
 ] 

Jonathan Ellis commented on CASSANDRA-5549:
---

Rebased my switchLock-removal patch from CASSANDRA-5064.

Sylvain's comments from then:

# I see nothing that prevents flushing the same memtable multiple times.
# getting the commit log context and switching the memtable is not done 
atomically with respect to writes. So a write can be pushed in the commit log 
after the context we're getting but still reach the memtable we're about to 
flush. For normal update, this is mostly inefficient in that we'll kept commit 
logs around long than necessary and potentially replay some update 
unnecessarily, but for counter this is a bug.
# it's also possible that for postFlush tasks to not be scheduled in the order 
the commit log context were acquired. So we could discard a commitlog for which 
the data is not yet fully flushed.

I haven't yet addressed these concerns; it's just a straight-up rebase so far.  
However, unit tests pass, so it's probably Good Enough to see what kind of 
performance impact this would have. /cc [~danielnorberg]

> Remove Table.switchLock
> ---
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
> Fix For: 2.0
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira