[jira] [Comment Edited] (CASSANDRA-5417) Push composites support in the storage engine

2013-11-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821243#comment-13821243
 ] 

Sylvain Lebresne edited comment on CASSANDRA-5417 at 11/13/13 5:11 PM:
---

I've pushed a rebased/heavily modified version at 
https://github.com/pcmanus/cassandra/commits/5417-v2.

Compared to the first version, this new version cleans things up a bit but also 
tries to be a lot more careful about not allocating too many objects, for 
performance reasons. Still, the main principle is the same: make cell names 
no longer opaque ByteBuffers but rather custom objects (of type CellName in 
the patch), with 3 main objectives:
# to abstract away the details of the cell name layouts. With CQL3 we have 4 
different layouts of cell names, depending on whether 1) the underlying 
comparator is a CompositeType or not and 2) whether one of the components of the 
cell name is used to store a CQL3 column name (and if yes, which one). 
Currently, there is a lot of ad hoc and error-prone code all over the place to 
deal with those differences manually. The CellName interface of this patch 
encapsulates those differences a lot better imo (in the different 
implementations of that interface).
# to keep the components of the cell name separated in memory. Currently, because a 
cell name is an opaque buffer, any code that is concerned with only one of 
the components of the cell name has to manually re-split the byte buffer. On top 
of the ad hoc concern described above (every time you split a cell name, you 
need to worry about the different layouts), the other concern is that we may 
split the same cell name over and over, which has a cost. Typically, each time 
we compare two composite cell names we have to re-decode each component one by 
one.  And during reads, on top of all the cell name comparisons, we also have 
to split the cell name to 1) count results in SliceQueryFilter and 2) process 
the storage engine result into a ResultSet. By keeping components separated 
behind the CellName interface, the attached patch avoids duplicating this work.
# keeping the components of composite names separated should make 
component-specific optimizations a lot easier. Typically, possible 
optimizations like sharing common prefixes, using ids for CQL3 column names 
(CASSANDRA-4175) or even, why not, per-component-type specific implementations 
that use native Java types to save allocations (using an int when the component 
is Int32Type) are likely to be easier if we can deal with individual 
components directly rather than with opaque ByteBuffers. Note that I'm *not* 
claiming that this patch does those optimizations yet, just that it very likely 
makes them a lot easier to implement properly.
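To illustrate the idea behind points 1 and 2 above, here is a rough sketch (the names and layout below are illustrative only, not the actual CellName API from the patch): the interface exposes components individually, and each layout gets its own implementation, so any splitting happens at most once when the name is built.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only -- not the actual CellName API from the patch.
// The point: components live separately in memory, so reading one component
// never requires re-splitting an opaque composite buffer.
interface CellNameSketch {
    int size();              // number of components in this name
    ByteBuffer get(int i);   // i-th component, returned without any decoding
}

// Layout for a non-composite comparator: the whole name is one component.
final class SimpleName implements CellNameSketch {
    private final ByteBuffer name;
    SimpleName(ByteBuffer name) { this.name = name; }
    public int size() { return 1; }
    public ByteBuffer get(int i) { return name; }
}

// Layout for a CompositeType comparator: components are split exactly once,
// when the name object is built, and then reused for every comparison.
final class CompositeName implements CellNameSketch {
    private final ByteBuffer[] components;
    CompositeName(ByteBuffer... components) { this.components = components; }
    public int size() { return components.length; }
    public ByteBuffer get(int i) { return components[i]; }
}
```

Code that compares names or extracts a single component can then call {{get(i)}} directly instead of re-splitting a buffer and worrying about which of the 4 layouts it is looking at.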

The patch itself is pretty big but most of the interesting bits are in the 
newly added {{db.composites}} package. The rest of the patch mainly updates 
the code to use those new interfaces, with some related simplification of the 
CQL3 code. One interesting bit though is the new AtomDeserializer class and 
its use in sstable iterators. The reason is that for composite cell names, 
this patch moves the work of splitting the names read from disk into their 
components from every time we need to access a component to once at 
deserialization time.  That's better, except that when we deserialize an 
index block from disk, the current code fully deserializes atoms even if they are 
not used afterwards by the query (either because it's a query by name and we 
only care about a handful of names in the block, or because the name is before 
the beginning of the slice we're looking for). In that case, the first version 
of this patch was actually doing more work, splitting/allocating all the 
components for names of cells the query doesn't care about, which is why Jake's 
test of the first version was showing slower reads. So this new version 
introduces the AtomDeserializer class, which is basically a lazy deserializer of 
OnDiskAtom (and their cell names) that makes sure we don't do unnecessary work 
for atoms we don't care about.
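The lazy-deserialization pattern can be sketched as follows. This is a toy format and a hypothetical API, not Cassandra's actual on-disk layout or the real AtomDeserializer: each atom here is a length-prefixed name followed by a length-prefixed value, with composite components ':'-joined inside the name. The key property is that peeking at a name and skipping an atom never split the name into components.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.*;

// Illustrative sketch of the lazy-deserialization idea; the real
// AtomDeserializer works on Cassandra's on-disk format, which this is not.
// Toy format: [u16 name length][name][u16 value length][value], repeated,
// terminated by a zero-length name. Composite components are ':'-joined.
class LazyAtomReader {
    private final DataInputStream in;
    private String pendingName;   // raw name of the next atom, not yet split

    LazyAtomReader(DataInputStream in) { this.in = in; }

    // Cheap: reads only the raw name, performs no component splitting.
    String peekName() throws IOException {
        if (pendingName == null) {
            int len = in.readUnsignedShort();
            if (len == 0) return null;   // end-of-block marker
            byte[] b = new byte[len];
            in.readFully(b);
            pendingName = new String(b, StandardCharsets.UTF_8);
        }
        return pendingName;
    }

    // Expensive path: split the name into components and consume the atom.
    // Only taken for atoms the query actually cares about.
    List<String> readComponents() throws IOException {
        String name = peekName();
        pendingName = null;
        skipValue();
        return Arrays.asList(name.split(":"));
    }

    // Skip an atom entirely: no splitting, no per-component allocation.
    void skipAtom() throws IOException {
        pendingName = null;
        skipValue();
    }

    private void skipValue() throws IOException {
        in.skipBytes(in.readUnsignedShort());
    }

    // Helper to build a block in the toy format, for demonstration.
    static byte[] encode(String[][] atoms) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        for (String[] a : atoms) {
            byte[] n = a[0].getBytes(StandardCharsets.UTF_8);
            out.writeShort(n.length); out.write(n);
            byte[] v = a[1].getBytes(StandardCharsets.UTF_8);
            out.writeShort(v.length); out.write(v);
        }
        out.writeShort(0);
        return bos.toByteArray();
    }
}
```

A names-query style consumer would peek at each name, call skipAtom() for the ones it doesn't want, and pay the splitting cost only for the handful it does.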

On the performance side I did 2 simple tests:
# I checked basic stress writes and reads (with 5M keys but default parameters 
otherwise). On writes, the patch is slower than trunk by about 10%.  I've 
yourkited it but I'm still not entirely sure why tbh (at least why the 
difference is that big). Some of my tests suggest that the MemoryMeter is 
taking more time and that this impacts the overall performance, but that could be 
a red herring. For reads however, there is no noticeable difference.
# Then I did a read test similar to Jake's test above (I wasn't able to reuse 
his sstables because it seems trunk can't read them).  I used the following 
CQL3 table:
{noformat}
CREATE TABLE timeline (
pk text,
ck1 text,
ck2 int,
val 

[jira] [Comment Edited] (CASSANDRA-5417) Push composites support in the storage engine

2013-04-21 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637737#comment-13637737
 ] 

T Jake Luciani edited comment on CASSANDRA-5417 at 4/22/13 2:45 AM:


I've posted a couple minor commits to 
https://github.com/tjake/cassandra/tree/5417
There was one failing test and one performance regression.

I ran some benchmarks to gauge the effectiveness of this change.
I created a COMPACT table with a few wide rows and created 1k sstables. I 
benchmarked the time it took to perform a major compaction (since that causes 
the most Composite comparisons). 

First the base 
{code}
INFO 20:54:41,504 Compacted 942 sstables to 
[/var/lib/cassandra/data/mjff/data/mjff-data-ja-963,].  1,004,572,909 bytes to 
999,124,671 (~99% of original) in 372,638ms = 2.557011MB/s.  947 total rows, 6 
unique.
{code}


With this change.
{code}
INFO 22:25:46,747 Compacted 942 sstables to 
[/var/lib/cassandra/data/mjff/data/mjff-data-ja-963,].  1,004,572,909 bytes to 
999,124,671 (~99% of original) in 341,939ms = 2.786578MB/s.  947 total rows, 6 
unique.
{code}

So a bit of an improvement (~10%).  
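Since both runs compacted the same 1,004,572,909 bytes down to the same size, the speedup can be double-checked directly from the wall-clock times in the two log lines above:

```java
// Sanity-check the reported numbers: identical input/output sizes, so the
// speedup is just the ratio of the compaction times (equivalently, of the
// MB/s figures: 2.786578 / 2.557011).
public class SpeedupCheck {
    public static void main(String[] args) {
        double baseMs = 372_638.0;      // base compaction time
        double patchedMs = 341_939.0;   // compaction time with the change
        double speedup = baseMs / patchedMs;
        System.out.printf("%.1f%% faster%n", (speedup - 1.0) * 100.0);
        // prints: 9.0% faster
    }
}
```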


Then I tested the performance of reading as quickly as possible.  This 
was unfortunately ~30% slower.
I've attached a screenshot of the current bottleneck if you want to take a look.

http://i.imgur.com/rrp3fsX.png




  
 Push composites support in the storage engine
 -

 Key: CASSANDRA-5417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5417
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 2.0


 CompositeType happens to be very useful and is now widely used: CQL3 heavily 
 relies on it, and super columns now use it internally too. Besides, 
 CompositeType has been advised as a replacement for super columns on the 
 thrift side for a while, so it's safe to assume that it's generally used 
 there too.
 CompositeType was initially introduced as just another AbstractType, 
 meaning that the storage engine has no knowledge whatsoever of composites 
 being, well, composite. This has the following drawbacks:
 * Because internally a composite value is handled as just a ByteBuffer, we 
 end up doing a lot of extra work. Typically, each time we compare 2 composite 
 values, we end up deserializing the components (which, while it doesn't copy 
 data per se because we just slice the global ByteBuffer, still wastes some CPU 
 cycles and allocates a bunch of ByteBuffer objects). And since compare can be 
 called *a lot*, this is likely not negligible.
 * This makes CQL3 code uglier than necessary. Basically, CQL3 makes extensive 
 use of composites, and since it gets back ByteBuffers from the internal 
 columns, it always has to check whether it's actually a CompositeType or not, 
 and then split it and pick the different parts it needs. It's only an API 
 problem, but having things exposed as composites directly would definitely 
 make things cleaner. In particular, in most cases, CQL3 doesn't care whether it 
 has a composite with only one component or a truly non-composite value, but 
 we still always distinguish both cases.  Lastly, if we do expose composites 
 more directly internally, it's not a lot more work to better internalize 
 the different parts of the cell name that CQL3 uses (what's the clustering 
 key, what's the actual CQL3 column name, what's the collection element), 
 making things cleaner. Last but not least, there is currently a bunch of 
 places where methods take a ByteBuffer as argument and it's hard to know 
 whether