[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141660#comment-15141660 ]
Paulo Motta edited comment on CASSANDRA-10990 at 2/10/16 9:33 PM:
------------------------------------------------------------------

Thanks for the comments [~yukim].

bq. What's the difference between MemoryCachedInputStream and BufferedInputStream?

The main difference between {{MemoryCachedInputStream}} and {{BufferedInputStream}} is that the former can mark/reset a parent/source stream when it runs out of capacity without losing its own mark state. This allows us to cascade a {{FileCachedInputStream}} with a {{MemoryCachedInputStream}} to provide a multi-tiered cached input stream.

Another, less relevant difference is that {{BufferedInputStream}} always does buffered reads of up to the capacity of its buffer, while {{MemoryCachedInputStream}} only buffers reads while it is marked, and only the bytes that were consumed via its {{read}}/{{skip}} methods.

bq. Why can't we use the latter?

I tried extending {{BufferedInputStream}} to add the ability to mark a parent stream when it runs out of capacity, but that involved reimplementing and/or changing most of its methods: {{BufferedInputStream}} always reads from its internal buffer and refills it when necessary, and most of its methods rely on that logic. Reading from a parent stream when the buffer is full would break this assumption, which would require a significant refactor of most of its methods. I'm open to suggestions if you see a way of easily adapting {{BufferedInputStream}} to fulfil that requirement.

bq. {{MemoryCachedInputStream}} uses default {{ByteArrayOutputStream}} constructor which has only size of 32 bytes. Isn't this too small to use for cache?

Probably. I will try to find a better value for this. Do you happen to remember if there is a way to retrieve the average partition size for a given table? I remember seeing something along those lines, but I'm not sure where it is.

I will start work on the remaining TODO points and review comments. Please let me know if you have something to add.
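To make the behaviour described above concrete, here is a minimal sketch of a stream that caches only while marked and hands over to the source stream's own mark once its cache is full. It is not the {{MemoryCachedInputStream}} from the patch: the class name {{MarkCachingInputStream}}, the fixed-size byte array (instead of a {{ByteArrayOutputStream}}) and the {{capacity}} parameter are assumptions made for illustration, and the source stream is assumed to support mark/reset itself, as a {{FileCachedInputStream}} would.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; names and layout are illustrative, not taken from the patch.
class MarkCachingInputStream extends InputStream {
    private final InputStream source; // assumed to support mark/reset itself
    private final byte[] cache;       // bytes consumed since the last mark()
    private int cached;               // number of valid bytes in cache
    private int replayPos;            // next cached byte to replay; equals cached when reading from source
    private boolean marked;
    private boolean spilled;          // true once the source was marked because the cache filled up

    MarkCachingInputStream(InputStream source, int capacity) {
        this.source = source;
        this.cache = new byte[capacity];
    }

    @Override
    public int read() throws IOException {
        if (replayPos < cached)
            return cache[replayPos++] & 0xFF; // replaying cached bytes after reset()

        // Cache is full while marked: delegate the rest of the mark to the source.
        if (marked && !spilled && cached == cache.length) {
            source.mark(Integer.MAX_VALUE);
            spilled = true;
        }

        int b = source.read();
        // Only bytes consumed while marked (and before spilling) are cached.
        if (b >= 0 && marked && !spilled) {
            cache[cached++] = (byte) b;
            replayPos = cached;
        }
        return b;
    }

    @Override
    public void mark(int readLimit) {
        // Keep only cached bytes that were not replayed yet; they lie ahead of the new mark.
        int remaining = cached - replayPos;
        System.arraycopy(cache, replayPos, cache, 0, remaining);
        cached = remaining;
        replayPos = 0;
        marked = true;
        if (remaining == 0)
            spilled = false; // the new mark starts at the current source position
    }

    @Override
    public void reset() throws IOException {
        if (!marked)
            throw new IOException("mark() was never called");
        replayPos = 0;          // replay the in-memory tier first
        if (spilled)
            source.reset();     // then continue from where the source was marked
    }

    @Override
    public boolean markSupported() {
        return true;
    }
}
```

With a capacity of 2 over the bytes 1..7, reading one byte, calling {{mark}}, reading four bytes and calling {{reset}} replays 2 and 3 from memory and 4 and 5 from the source's own mark. The inherited {{read(byte[], int, int)}} and {{skip}} methods go through {{read()}}, so bytes consumed that way are cached as well.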

was (Author: pauloricardomg):
Thanks for the comments.

bq. What's the difference between MemoryCachedInputStream and BufferedInputStream? Why can't we use the latter?

The main difference between {{MemoryCachedInputStream}} and {{BufferedInputStream}} is that the former has the ability to mark/reset a parent/source stream when it runs out of capacity without losing its mark state, allowing us to cascade a {{FileCachedInputStream}} with a {{MemoryCachedInputStream}} to provide a multi-tiered cached input stream.

> Support streaming of older version sstables in 3.0
> --------------------------------------------------
>
>                 Key: CASSANDRA-10990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10990
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Jeremy Hanna
>            Assignee: Paulo Motta
>
> In 2.0 we introduced support for streaming older versioned sstables
> (CASSANDRA-5772). In 3.0, because of the rewrite of the storage layer, this
> became no longer supported. So currently, while 3.0 can read sstables in the
> 2.1/2.2 format, it cannot stream the older versioned sstables. We should do
> some work to make this still possible to be consistent with what
> CASSANDRA-5772 provided.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)