[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine

Branimir Lambov (JIRA) Tue, 26 May 2015 06:56:46 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559121#comment-14559121
 ]


Branimir Lambov commented on CASSANDRA-8099:
--------------------------------------------

Issues I am seeing in the changes in compaction 
(org.apache.cassandra.db.compaction package), in order of significance:

- Side effects of iteration:
-- [CompactionIterable 
125|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L125]:
 I doubt index update belongs here, as side effect of iteration. Ideally index 
should be collected, not updated. Presumably mergedCell is already being added 
by writer, perhaps cleaning other entries there is a better way to go.
-- [CompactionIterable 
237|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L237]:
 Another side effect of iteration that preferably should be handled by writer.
-- Validation compaction now uses {{CompactionIterable}} and thus has side 
effects (index & cache removal).
- Iterator closing:
-- [CompactionManager 
900|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L900]:
 Several issues:
--* {{partition}} can be used after it is closed.
--* {{partition}} is not always closed: this closes it, but 895 and 870 don't.
--* Why is the iterator closed here rather than in the scope that opened it?
--* Please use explicit close of {{partition}}. This is an abuse of the try 
syntax (not obvious behavior, hidden from view).
-- [CompactionManager 
1052|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1052],
 
[1160|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1160]:
 Partition is not closed.
- Stuff that could be confusing:
-- In {{guide_8099}}, description of {{RangeTombstoneMarkers}}: add that there 
is never content between two corresponding tombstone markers on any iterator. 
{{PurgingPartitionIterator}} for one relies on this.
-- [CompactionIterable 
83|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L83]:
 Are we interested in highest-index nowInSec() only? Not max/min? Could you add 
an explanation in comment if that's the case?
-- [CompactionIterable 
47|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L47]:
 AtomicInteger unnecessary-- remove as it could be understood to mean 
thread-safety where there's none.
-- \[Abstract]CompactionIterable are misnamed. They are now iterators.
- Nits:
-- [CompactionIterable 
48|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L48]:
 {{format}} is never used.
-- [CompactionIterable 
100|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionIterable.java#L100]:
 {{column}} is not used.
-- [CompactionManager 
785|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L785]:
 Revert Collections.singleton(sstable) to sstableSet
-- [CompactionManager 
943|https://github.com/pcmanus/cassandra/blob/75b98620e30b5df31431618cc21e090481f33967/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L943]:
 Revert removal of {{break}}.
-- Imports could be cleaned.

I haven't examined correctness/equivalence of 
{{(Partition|Atom)Iterators.merge()}} to older code yet.

Looking at 8099 overall, I don't know if that's a discussion you have had 
before and I haven't participated in, but I have strong objections to using 
iterators as the representations of tables and partitions in the abstraction. 
This appears to be carried over from the old code, but they now gain additional 
content and behaviour which make them even less suited to the role. These 
objects should be Iterable instead. Having that would give clear separation 
between the iteration process and the entity-level data and behaviour which is 
currently a cause of multiple problems, including the top two categories above.

I also think {{OpOrder.Group.close()}} does not belong in an iterator close. To 
minimize the potential for trouble, grouping/locking should be explicit in the 
user of the iterator/entity. For the instances where this is not feasible, 
perhaps the initiating methods could be called "openX" or "startX" vs "X" (e.g. 
openSearch vs search).


> Refactor and modernize the storage engine
> -----------------------------------------
>
>                 Key: CASSANDRA-8099
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0 beta 1
>
>         Attachments: 8099-nit
>
>
> The current storage engine (which for this ticket I'll loosely define as "the 
> code implementing the read/write path") is suffering from old age. One of the 
> main problem is that the only structure it deals with is the cell, which 
> completely ignores the more high level CQL structure that groups cell into 
> (CQL) rows.
> This leads to many inefficiencies, like the fact that during a reads we have 
> to group cells multiple times (to count on replica, then to count on the 
> coordinator, then to produce the CQL resultset) because we forget about the 
> grouping right away each time (so lots of useless cell names comparisons in 
> particular). But outside inefficiencies, having to manually recreate the CQL 
> structure every time we need it for something is hindering new features and 
> makes the code more complex that it should be.
> Said storage engine also has tons of technical debt. To pick an example, the 
> fact that during range queries we update {{SliceQueryFilter.count}} is pretty 
> hacky and error prone. Or the overly complex ways {{AbstractQueryPager}} has 
> to go into to simply "remove the last query result".
> So I want to bite the bullet and modernize this storage engine. I propose to 
> do 2 main things:
> # Make the storage engine more aware of the CQL structure. In practice, 
> instead of having partitions be a simple iterable map of cells, it should be 
> an iterable list of row (each being itself composed of per-column cells, 
> though obviously not exactly the same kind of cell we have today).
> # Make the engine more iterative. What I mean here is that in the read path, 
> we end up reading all cells in memory (we put them in a ColumnFamily object), 
> but there is really no reason to. If instead we were working with iterators 
> all the way through, we could get to a point where we're basically 
> transferring data from disk to the network, and we should be able to reduce 
> GC substantially.
> Please note that such refactor should provide some performance improvements 
> right off the bat but it's not it's primary goal either. It's primary goal is 
> to simplify the storage engine and adds abstraction that are better suited to 
> further optimizations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine

Reply via email to