Re: "If a variant of hdfs-347 was committed,"

I agree with what Ryan is saying here, and I'd like to second (third? fourth?) the call to keep pushing for HDFS improvements. Anything else is coding around the bigger I/O issue.

On 7/9/11 6:13 PM, "Ryan Rawson" <ryano...@gmail.com> wrote:

>I think my general point is we could hack up the hbase source, add
>refcounting, circumvent the gc, etc or we could demand more from the dfs.
>
>If a variant of hdfs-347 was committed, reads could come from the Linux
>buffer cache and life would be good.
>
>The choice isn't fast hbase vs slow hbase, there are elements of bugs
>there as well.
>On Jul 9, 2011 12:25 PM, "M. C. Srivas" <mcsri...@gmail.com> wrote:
>> On Fri, Jul 8, 2011 at 6:47 PM, Jason Rutherglen
>> <jason.rutherg...@gmail.com> wrote:
>>
>>> There are couple of things here, one is direct byte buffers to put the
>>> blocks outside of heap, the other is MMap'ing the blocks directly from
>>> the underlying HDFS file.
>>
>>
>>> I think they both make sense. And I'm not sure MapR's solution will
>>> be that much better if the latter is implemented in HBase.
>>>
>>
>> There're some major issues with mmap'ing the local hdfs file (the
>> "block") directly:
>> (a) no checksums to detect data corruption from bad disks
>> (b) when a disk does fail, the dfs could start reading from an alternate
>> replica ... but that option is lost when mmap'ing and the RS will crash
>> immediately
>> (c) security is completely lost, but that is minor given hbase's current
>> status
>>
>> For those hbase deployments that don't care about the absence of the (a)
>> and (b), especially (b), its definitely a viable option that gives good
>> perf.
>>
>> At MapR, we did consider similar direct-access capability and rejected
>> it due to the above concerns.
>>
>>>
>>> On Fri, Jul 8, 2011 at 6:26 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>> > The overhead in a byte buffer is the extra integers to keep track of
>>> > the mark, position, limit.
>>> >
>>> > I am not sure that putting the block cache in to heap is the way to
>>> > go.
>>> > Getting faster local dfs reads is important, and if you run hbase on
>>> > top of Mapr, these things are taken care of for you.
>>> > On Jul 8, 2011 6:20 PM, "Jason Rutherglen" <jason.rutherg...@gmail.com>
>>> > wrote:
>>> >> Also, it's for a good cause, moving the blocks out of main heap using
>>> >> direct byte buffers or some other more native-like facility (if DBB's
>>> >> don't work).
>>> >>
>>> >> On Fri, Jul 8, 2011 at 5:34 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>> >>> Where? Everywhere? An array is 24 bytes, bb is 56 bytes. Also the
>>> >>> API is...annoying.
>>> >>> On Jul 8, 2011 4:51 PM, "Jason Rutherglen" <jason.rutherg...@gmail.com>
>>> >>> wrote:
>>> >>>> Is there an open issue for this? How hard will this be? :)
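
For anyone skimming the quoted thread, here is a rough Java sketch of the two options Jason describes: copying a block into a direct (off-heap) byte buffer, and mmap'ing a local file. It is not HBase or HDFS code. The class name, method names, and temp file are all hypothetical, and it uses only the standard java.nio and java.util.zip APIs. The CRC32 step illustrates Srivas's point (a): with a raw mmap nothing checks the bytes for you, so the reader would have to verify them itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

// Hypothetical illustration only; none of these names come from HBase or HDFS.
class OffHeapBlockSketch {

    // Option 1: copy the block into a direct buffer so the bytes live outside the Java heap.
    static ByteBuffer toDirect(byte[] block) {
        ByteBuffer buf = ByteBuffer.allocateDirect(block.length);
        buf.put(block);
        buf.flip(); // switch from writing to reading: position = 0, limit = block.length
        return buf;
    }

    // Option 2: map a local file read-only; reads are then served from the OS page cache.
    // The mapping stays valid after the channel is closed.
    static MappedByteBuffer mapBlock(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Application-level checksum over the readable bytes of a buffer.
    // duplicate() leaves the caller's position and limit untouched.
    static long crc32(ByteBuffer buf) {
        CRC32 crc = new CRC32();
        crc.update(buf.duplicate());
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] block = "hypothetical block contents".getBytes(StandardCharsets.UTF_8);

        // Stand-in for a local block file; a real deployment would not use a temp file.
        Path file = Files.createTempFile("block", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, block);

        ByteBuffer direct = toDirect(block);
        ByteBuffer mapped = mapBlock(file);

        System.out.println("direct buffer is off-heap: " + direct.isDirect());
        System.out.println("checksums match: " + (crc32(direct) == crc32(mapped)));
    }
}
```

Neither path addresses point (b) in the quoted message: if the disk behind the mapped file fails, this code has no alternate replica to fall back to.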