GitHub user dineshjoshi commented on a diff in the pull request:
https://github.com/apache/cassandra/pull/239#discussion_r204586299
--- Diff: src/java/org/apache/cassandra/db/streaming/CassandraOutgoingFile.java ---
@@ -114,13 +155,51 @@ public void write(StreamSession session, DataOutputStreamPlus out, int version)
         CassandraStreamHeader.serializer.serialize(header, out, version);
         out.flush();
-        CassandraStreamWriter writer = header.compressionInfo == null ?
-                                       new CassandraStreamWriter(sstable, header.sections, session) :
-                                       new CompressedCassandraStreamWriter(sstable, header.sections,
-                                                                           header.compressionInfo, session);
+        IStreamWriter writer;
+        if (shouldStreamFullSSTable())
+        {
+            writer = new CassandraBlockStreamWriter(sstable, session, components);
+        }
+        else
+        {
+            writer = (header.compressionInfo == null) ?
+                     new CassandraStreamWriter(sstable, header.sections, session) :
+                     new CompressedCassandraStreamWriter(sstable, header.sections,
+                                                         header.compressionInfo, session);
+        }
         writer.write(out);
     }
+    @VisibleForTesting
+    public boolean shouldStreamFullSSTable()
+    {
+        return isFullSSTableTransfersEnabled && isFullyContained;
+    }
+
+    @VisibleForTesting
+    public boolean fullyContainedIn(List<Range<Token>> normalizedRanges, SSTableReader sstable)
+    {
+        if (normalizedRanges == null)
+            return false;
+
+        RangeOwnHelper rangeOwnHelper = new RangeOwnHelper(normalizedRanges);
+        try (KeyIterator iter = new KeyIterator(sstable.descriptor, sstable.metadata()))
+        {
+            while (iter.hasNext())
+            {
+                DecoratedKey key = iter.next();
+                try
+                {
+                    rangeOwnHelper.check(key);
+                } catch(RuntimeException e)
--- End diff --
@iamaleksey thank you for the useful feedback. I discussed this with
@krummas, and although there was room for improvement, the thinking back then
was that the benefits would outweigh the cost. I looked through the codebase,
and iterating over the keys was the best way I found to definitively verify
range containment, since I was going for correctness. That said, what you
suggest is obviously better. I am concerned about scope creep in this PR.
Would it be OK if we address it in a separate PR?

It would also be useful if we could design the effective range computation
and how it is stored in the metadata. I am not sure what sort of gotchas I
might run into.
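
For illustration only, the trade-off between an exact per-key containment check and a cheaper bounds-based check can be sketched as below. This is a minimal sketch under stated assumptions, not the code in this PR and not necessarily the approach suggested in review: tokens are modelled as plain `long` values, ranges are non-wrapping `(left, right]` intervals that are already sorted and non-overlapping, and `RangeContainmentSketch`, `TokenRange`, `fullyContainedPerKey`, and `fullyContainedByBounds` are hypothetical names that do not exist in Cassandra.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; none of these types are Cassandra classes.
class RangeContainmentSketch
{
    // A non-wrapping token range (left, right]: left exclusive, right inclusive.
    static final class TokenRange
    {
        final long left;
        final long right;

        TokenRange(long left, long right)
        {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token)
        {
            return left < token && token <= right;
        }
    }

    // Exact check: visits every key token (assumed sorted ascending) and walks
    // the sorted ranges in a single pass. Cost grows with the number of keys.
    static boolean fullyContainedPerKey(List<TokenRange> normalizedRanges, long[] sortedTokens)
    {
        if (normalizedRanges == null)
            return false;

        int r = 0;
        for (long token : sortedTokens)
        {
            // Skip ranges that end before this token.
            while (r < normalizedRanges.size() && normalizedRanges.get(r).right < token)
                r++;
            if (r == normalizedRanges.size() || !normalizedRanges.get(r).contains(token))
                return false;
        }
        return true;
    }

    // Conservative check: looks only at the first and last token. If both fall in
    // the same range, every token between them does too. It can return false for
    // data that the per-key check would accept (keys spread over several ranges).
    static boolean fullyContainedByBounds(List<TokenRange> normalizedRanges, long first, long last)
    {
        if (normalizedRanges == null)
            return false;

        for (TokenRange range : normalizedRanges)
            if (range.contains(first) && range.contains(last))
                return true;
        return false;
    }

    public static void main(String[] args)
    {
        List<TokenRange> ranges = Arrays.asList(new TokenRange(0, 10), new TokenRange(20, 30));
        long[] tokens = { 5, 25 };
        System.out.println(fullyContainedPerKey(ranges, tokens));
        System.out.println(fullyContainedByBounds(ranges, tokens[0], tokens[tokens.length - 1]));
    }
}
```

In this toy model the bounds check needs only two token comparisons per range, which is the kind of information that could plausibly be precomputed and kept alongside other metadata, at the price of rejecting some data that the per-key walk would accept.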