[ https://issues.apache.org/jira/browse/HADOOP-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559208#action_12559208 ]
stack commented on HADOOP-1398:
-------------------------------
Patch looks great, Tom.
You pass 'length' in the snippet below, but it's not used:
{code}
+ protected FSDataInputStream openFile(FileSystem fs, Path file,
+ int bufferSize, long length) throws IOException {
+ return fs.open(file, bufferSize);
{code}
I presume you have plans for it later?
Do you have confidence in the LruMap class? It has no unit tests (though
these things are hard to test). I ask because, though small, these kinds of
classes can sometimes prove a little tricky.
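For comparison, here is a minimal sketch of how an LRU map is commonly built on java.util.LinkedHashMap in access order. This is not the LruMap from the patch, which may work quite differently; the class name SimpleLruMap and the maxEntries parameter are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; not the LruMap class from the patch.
class SimpleLruMap<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;

  // Assumed cap on the number of cached entries.
  private final int maxEntries;

  SimpleLruMap(int maxEntries) {
    // accessOrder = true: entries are ordered from least to most recently accessed.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Invoked after each insertion; returning true evicts the eldest
    // (least recently used) entry.
    return size() > maxEntries;
  }
}
```

One of the tricky bits with this approach: in access order even get() reorders the underlying list, so a map shared by concurrent readers needs external synchronization (for example Collections.synchronizedMap), and that is the sort of thing a small unit test could pin down.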
Do you have any numbers for how much it improves throughput when cached blocks
are 'hot'? You also talked of a slight 'cost'. Do you have rough numbers for
that too? (When I played with adjusting the size of the CRC blocks on the
datanode, a similar type of blocking to what you have here, there was no
discernible difference across sizes.)
What do we need to add to make it easy to enable/disable this feature on a
per-column basis? Currently, edits to column config require taking the column
offline. Changing this configuration looks safe to do while the column stays
online. Would you agree?
> Add in-memory caching of data
> -----------------------------
>
> Key: HADOOP-1398
> URL: https://issues.apache.org/jira/browse/HADOOP-1398
> Project: Hadoop
> Issue Type: New Feature
> Components: contrib/hbase
> Reporter: Jim Kellerman
> Priority: Trivial
> Attachments: hadoop-blockcache.patch
>
>
> Bigtable provides two in-memory caches: one for row/column data and one for
> disk block caches.
> The size of each cache should be configurable, data should be loaded lazily,
> and the cache managed by an LRU mechanism.
> One complication of the block cache is that all data is read through a
> SequenceFile.Reader which ultimately reads data off of disk via an RPC proxy
> for ClientProtocol. This would imply that the block caching would have to be
> pushed down to either the DFSClient or SequenceFile.Reader.