[ 
https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141660#comment-15141660
 ] 

Paulo Motta edited comment on CASSANDRA-10990 at 2/10/16 9:33 PM:
------------------------------------------------------------------

Thanks for the comments [~yukim].

bq. What's the difference between MemoryCachedInputStream and 
BufferedInputStream? 

The main difference between {{MemoryCachedInputStream}} and 
{{BufferedInputStream}} is that the former has the ability to mark/reset a 
parent/source stream when it runs out of capacity without losing its mark 
state, allowing us to cascade a {{FileCachedInputStream}} with a 
{{MemoryCachedInputStream}} to provide a multi-tiered cached input stream.
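
For comparison, here is a minimal sketch of how a plain {{BufferedInputStream}} loses its mark once the reader goes past both the buffer size and the read limit. The class and method names are made up for this illustration, and the sizes (a 4-byte buffer, a read limit of 2) are arbitrary; the behaviour shown is what I would expect from the OpenJDK implementation.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Hypothetical demo class, not part of the patch.
class BufferedMarkDemo {
    // Returns true if reset() fails after reading past the buffer and read limit.
    static boolean markLostAfterOverflow() throws IOException {
        byte[] data = new byte[16];
        // Tiny 4-byte buffer so that it overflows quickly.
        BufferedInputStream in =
            new BufferedInputStream(new ByteArrayInputStream(data), 4);
        in.mark(2); // promise to read at most 2 bytes before reset()
        for (int i = 0; i < 8; i++) {
            in.read(); // read well past both the read limit and the buffer
        }
        try {
            in.reset();
            return false; // mark survived
        } catch (IOException e) {
            return true;  // mark was invalidated when the buffer was refilled
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("mark lost: " + markLostAfterOverflow());
    }
}
```

Once the mark is dropped there is nothing to fall back on, which is the gap the cascading to a parent stream is meant to close.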

Another, less relevant difference is that {{BufferedInputStream}} always does 
buffered reads of up to the capacity of its buffer, while 
{{MemoryCachedInputStream}} only buffers reads while it is marked, and only the 
bytes that were actually consumed via its {{read}}/{{skip}} methods.
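
To make the "only buffers while marked" behaviour concrete, here is a rough sketch. It is not the {{MemoryCachedInputStream}} from the patch: the class name {{MarkOnlyCachingInputStream}} is hypothetical, the read limit is ignored, and there is no capacity bound or parent-stream cascading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not the class from the patch.
class MarkOnlyCachingInputStream extends FilterInputStream {
    private ByteArrayOutputStream cache; // non-null only once mark() was called
    private byte[] replay;               // bytes to serve again after reset()
    private int replayPos;

    MarkOnlyCachingInputStream(InputStream in) { super(in); }

    @Override public boolean markSupported() { return true; }

    @Override public synchronized void mark(int readlimit) {
        // Assumption: readlimit is ignored, so the cache can grow without bound.
        cache = new ByteArrayOutputStream();
    }

    @Override public synchronized void reset() throws IOException {
        if (cache == null) throw new IOException("stream was not marked");
        // Bytes consumed since the mark, followed by any bytes not yet replayed.
        byte[] consumed = cache.toByteArray();
        int remaining = (replay == null) ? 0 : replay.length - replayPos;
        byte[] next = new byte[consumed.length + remaining];
        System.arraycopy(consumed, 0, next, 0, consumed.length);
        if (remaining > 0) {
            System.arraycopy(replay, replayPos, next, consumed.length, remaining);
        }
        replay = next;
        replayPos = 0;
        cache.reset(); // the mark stays in place; start recording again
    }

    @Override public int read() throws IOException {
        int b = (replay != null && replayPos < replay.length)
              ? (replay[replayPos++] & 0xFF)
              : in.read();
        // Nothing is cached unless the stream is marked.
        if (b >= 0 && cache != null) cache.write(b);
        return b;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int n = 0;
        for (; n < len; n++) {
            int b = read();
            if (b < 0) break;
            buf[off + n] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override public long skip(long n) throws IOException {
        // Skipped bytes count as consumed, so they go through read() and get cached.
        long skipped = 0;
        while (skipped < n && read() >= 0) skipped++;
        return skipped;
    }
}
```

Before {{mark}} is called, reads pass straight through to the source and nothing is retained; after it, only the bytes the caller actually consumed are kept for replay.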

bq. Why can't we use the latter? 

I tried extending {{BufferedInputStream}} to add the ability to mark a parent 
stream when it runs out of capacity, but that meant reimplementing and/or 
changing most of its methods: {{BufferedInputStream}} always reads from its 
internal buffer, refilling it when necessary, and most of its methods rely on 
that logic. Reading from a parent stream when the buffer is full would break 
this assumption, which would require a significant refactor of most of its 
methods. I'm open to suggestions if you see an easy way of adapting 
{{BufferedInputStream}} to fulfil that requirement.

bq. {{MemoryCachedInputStream}} uses default {{ByteArrayOutputStream}} 
constructor which has only size of 32 bytes. Isn't this too small to use for 
cache?

Probably. I will try to find a better value for this. Do you happen to remember 
whether there is a way to retrieve the average partition size for a given 
table? I remember seeing something along those lines, but I'm not sure where 
it is.
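
One possible shape for this, purely as a sketch: size the {{ByteArrayOutputStream}} from a size estimate, clamped to sane bounds. The {{CacheSizing}} class, the 1 MiB cap and the {{estimatedBytes}} parameter are all assumptions for illustration; where the estimate would come from is still the open question above.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of the patch.
class CacheSizing {
    static final int MIN_CAPACITY = 32;      // ByteArrayOutputStream's default size
    static final int MAX_CAPACITY = 1 << 20; // arbitrary 1 MiB cap (assumption)

    // Clamp an estimated size to [MIN_CAPACITY, MAX_CAPACITY].
    static int initialCapacity(long estimatedBytes) {
        return (int) Math.max(MIN_CAPACITY, Math.min(MAX_CAPACITY, estimatedBytes));
    }

    static ByteArrayOutputStream newCache(long estimatedBytes) {
        return new ByteArrayOutputStream(initialCapacity(estimatedBytes));
    }
}
```

The buffer still grows on demand, so the initial capacity only saves the early reallocations.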

I will start work on the remaining TODO points and review comments. Please let 
me know if you have something to add.


was (Author: pauloricardomg):
Thanks for the comments.

bq. What's the difference between MemoryCachedInputStream and 
BufferedInputStream? Why can't we use the latter? 

The main difference between {{MemoryCachedInputStream}} and 
{{BufferedInputStream}} is that the former has the ability to mark/reset a 
parent/source stream when it runs out of capacity without losing its mark 
state, allowing us to cascade a {{FileCachedInputStream}} with a 
{{MemoryCachedInputStream}} to provide a multi-tiered cached input stream. 


> Support streaming of older version sstables in 3.0
> --------------------------------------------------
>
>                 Key: CASSANDRA-10990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10990
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Jeremy Hanna
>            Assignee: Paulo Motta
>
> In 2.0 we introduced support for streaming older versioned sstables 
> (CASSANDRA-5772).  In 3.0, because of the rewrite of the storage layer, this 
> became no longer supported.  So currently, while 3.0 can read sstables in the 
> 2.1/2.2 format, it cannot stream the older versioned sstables.  We should do 
> some work to make this still possible to be consistent with what 
> CASSANDRA-5772 provided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
