[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601530#comment-14601530 ]
Branimir Lambov commented on CASSANDRA-8099: -------------------------------------------- Continuing my review with the {{db.filter}} package, apart from {{ColumnFilter.Expression.Serializer.serializedSize}} which appears to be missing a byte when the messaging version is 3.0, I see no correctness problems with the code. More unit testing would definitely help, though. Below are some suggestions that should not be treated as blocking commit. {{PartitionFilter}}: * The name implies it is something that filters partitions, in fact it applies to rows within partition. I'd rename this to {{RowFilter}} or, if there's a reason to keep partition in its name, {{PartitionContentFilter}}. * {{forPaging}} javadoc is contradictory-- on one hand strictly after, on the other could be inclusive. Please clarify. * {{filter(UnfilteredRowIterator)}} is risky, people will call the wrong filter overload by mistake. Rename? * {{filter(SliceableUnfilteredRowIterator)}} does not seem to apply {{queriedColumns}} in either {{NamesPartitionFilter}} or {{SlicePartitionFilter}}. Is this on purpose? If it is, the method above is not its counterpart. Please document. {{NamesPartitionFilter}}: * {{clusterings}} should be {{NavigableSet}} to enable reverse traversal without duplication ({{TreeSet}} implements that too). This simplifies other code in file. * {{toString}} misses a closing bracket. {{SliceableUnfilteredRowIterator}}: * JavaDoc: what is 'seekTo'? 'slice'? {{ColumnSubselection}}: * ELT_SELECTION is not easily decipherable. Why not just ELEMENT instead? And 'element' elsewhere? (elt also) {{ColumnSelection}}: * {{includes/newTester}}: It appears these should only be called if the columns filter is already applied. This is not obvious; I'd state it in JavaDoc and/or assert that {{columns.contains(cell.column())}}. {{DataLimits}}: * {{DISTINCT_NONE}} is not the same as {{CQLLimits.distinct(MAX_VALUE)}}. Would you add a comment to say why it is so? * {{countCells}} should be {{countsCells}}. * {{CQLLimits.isDistinct}} does not appear to be used for anything. Throughout, there are lots of easy-to-fix (just add <?>) raw type warnings. I would also use a virtual method instead of an instance field for {{kind}} everywhere, as done in {{ColumnSubselection}}. [Here|https://github.com/pcmanus/cassandra/compare/8099...blambov:8099-db-filter-suggestions] I've uploaded a branch that includes fixes to a few of the above. > Refactor and modernize the storage engine > ----------------------------------------- > > Key: CASSANDRA-8099 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 > Project: Cassandra > Issue Type: Improvement > Reporter: Sylvain Lebresne > Assignee: Sylvain Lebresne > Fix For: 3.0 beta 1 > > Attachments: 8099-nit > > > The current storage engine (which for this ticket I'll loosely define as "the > code implementing the read/write path") is suffering from old age. One of the > main problem is that the only structure it deals with is the cell, which > completely ignores the more high level CQL structure that groups cell into > (CQL) rows. > This leads to many inefficiencies, like the fact that during a reads we have > to group cells multiple times (to count on replica, then to count on the > coordinator, then to produce the CQL resultset) because we forget about the > grouping right away each time (so lots of useless cell names comparisons in > particular). But outside inefficiencies, having to manually recreate the CQL > structure every time we need it for something is hindering new features and > makes the code more complex that it should be. > Said storage engine also has tons of technical debt. To pick an example, the > fact that during range queries we update {{SliceQueryFilter.count}} is pretty > hacky and error prone. Or the overly complex ways {{AbstractQueryPager}} has > to go into to simply "remove the last query result". > So I want to bite the bullet and modernize this storage engine. I propose to > do 2 main things: > # Make the storage engine more aware of the CQL structure. In practice, > instead of having partitions be a simple iterable map of cells, it should be > an iterable list of row (each being itself composed of per-column cells, > though obviously not exactly the same kind of cell we have today). > # Make the engine more iterative. What I mean here is that in the read path, > we end up reading all cells in memory (we put them in a ColumnFamily object), > but there is really no reason to. If instead we were working with iterators > all the way through, we could get to a point where we're basically > transferring data from disk to the network, and we should be able to reduce > GC substantially. > Please note that such refactor should provide some performance improvements > right off the bat but it's not it's primary goal either. It's primary goal is > to simplify the storage engine and adds abstraction that are better suited to > further optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)