[ 
https://issues.apache.org/jira/browse/CASSANDRA-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290398#comment-13290398
 ] 

Jonathan Ellis commented on CASSANDRA-4297:
-------------------------------------------

Comments:

- Would prefer to have CDIS just implement IS, and let callers wrap in DIS when 
desired, similar to how we use SnappyInputStream in IncomingTcpConnection, or 
LZFInputStream in ISR
- Why the changes to OutboundTcpConnection?
- Re MS changes: when would header.file be null?
- Chunk[] sort can use Guava Longs.compare
- I suggest adding a comment to explain why sort is necessary (b/c ranges are 
from replication strategy, so may not be sorted?)
- Instead of using Set + copy into array, why not use an ArrayList + 
trimToSize()
- is the FST comment {{// TODO just use a raw RandomAccessFile since we're 
managing our own buffer here}} obsolete?
- is the CompressedRandomAccessReader path used at all in FST anymore?
- Nit: avoid double negation in if statements with else clauses, e.g. instead of
{code}
.           if (remoteFile.compressionInfo != null)
                dis = new CompressedDataInputStream(socket.getInputStream(), 
remoteFile.compressionInfo);
            else
                dis = new DataInputStream(new 
LZFInputStream(socket.getInputStream()));
{code}
prefer
{code}
[           if (remoteFile.compressionInfo == null)
                dis = new DataInputStream(new 
LZFInputStream(socket.getInputStream()));
            else
                dis = new CompressedDataInputStream(socket.getInputStream(), 
remoteFile.compressionInfo);
{code}
- Nit: suggest moving serialization code for Chunk and CompressionParameters 
into ChunkSerializer and ChunkParametersSerializer classes, respectively, just 
to make the code discoverable for re-use later

At a higher level,
- Should we make nio transfer the default for uncompressed sstables as well, 
and add an option to enable compression?  Alternatively, now that compression 
is the default for new sstables, I'd be okay with removing LZF stream 
compression entirely
- Does this over-transfer data on chunk boundaries?  Put another way, do we 
stream data that doesn't actually belong on the target node?  (I'm okay with 
this, just want to be clear about what's happening.)
                
> Use java NIO as much as possible when streaming compressed SSTables
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-4297
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4297
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Yuki Morishita
>            Assignee: Yuki Morishita
>            Priority: Minor
>              Labels: streaming
>             Fix For: 1.2
>
>         Attachments: 4297.txt
>
>
> Back in 0.8, streaming uses java NIO (FileChannel#transferTo/transferFrom) to 
> perform zero copy file transfer between nodes. Since 1.0, in order to add new 
> features like sstable compression and internode encryption we had to switch 
> to java IO Input/OutputStreams. What we currently do to transfer compressed 
> SSTable is, in source node, 1) decompress chunk in SSTable, 2) compress using 
> LZF for network, and in destination node, 3) decompress using LZF as reading 
> from socket, 4) compress for SSTable on disk.
> Now, 1.1 comes out with SSTable compression turned on by default. It is 
> reasonable to transfer compressed file as is using NIO instead of 
> decompress/compress in source node.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to