[ 
https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353791#comment-14353791
 ] 

Colin Patrick McCabe commented on HDFS-7844:
--------------------------------------------

bq. Stack wrote: High level, any notion of difference in perf when comparing 
native to offheap to current implementation?

Reading off-heap memory using Unsafe#getLong is very quick.  The main overhead 
from off-heap will be creating wrapper objects for things.  But those are very 
short-lived objects that should never make it past the GC's young generation.  
The off-heap implementation may be able to use less memory for some things 
because we control the packing, which would speed things up (since fetching 
memory is a big cost in the block manager).  We will see some numbers soon.
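
For reference, here is a minimal sketch of the kind of off-heap read involved; the 
16-byte slot layout is purely illustrative, not the layout the patch actually uses:

{code:java}
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Sketch only: read/write packed off-heap slots with Unsafe.  The reads are
// plain loads; the main extra cost is any short-lived wrapper object built
// around the returned values afterwards.
public class OffHeapReadSketch {
  private static final Unsafe UNSAFE = getUnsafe();

  private static Unsafe getUnsafe() {
    try {
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      return (Unsafe) f.get(null);
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException("sun.misc.Unsafe not available", e);
    }
  }

  public static void main(String[] args) {
    long addr = UNSAFE.allocateMemory(16);   // one 16-byte slot
    UNSAFE.putLong(addr, 42L);               // packed field #0
    UNSAFE.putLong(addr + 8, 7L);            // packed field #1
    System.out.println(UNSAFE.getLong(addr) + ", " + UNSAFE.getLong(addr + 8));
    UNSAFE.freeMemory(addr);
  }
}
{code}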

bq. If we fail to pick up the configured memory manager (or the default), it's 
worth a WARN log. Otherwise, folks may be confounded that they are getting the 
native memory manager though they asked for something else:

We shouldn't need it because the creation of the hash table will log the name 
of the memory manager and its type at INFO.

bq. This an arbitrary max? private final static long MAX_ADDRESS = 
0x3fffffffffffffffL;

It's just nice because it allows the code to be provably correct.  I realize 
that the address will never get there in any reasonable length of time.
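
To spell out what "provably correct" buys us (my own sketch, not the patch's code): 
with addresses capped at 2^62 - 1, a bounds check like addr + length can never 
overflow a signed long, so the check itself is always valid.

{code:java}
// Sketch only: illustrates why a 2^62 - 1 cap makes range checks overflow-free.
public class AddressCheckSketch {
  private static final long MAX_ADDRESS = 0x3fffffffffffffffL;

  static void checkRange(long addr, long length) {
    if (addr < 0 || addr > MAX_ADDRESS || length < 0 || length > MAX_ADDRESS) {
      throw new IllegalArgumentException("address or length out of bounds");
    }
    // Safe: both operands are at most 2^62 - 1, so the sum fits in a long.
    if (addr + length > MAX_ADDRESS) {
      throw new IllegalArgumentException("range extends past MAX_ADDRESS");
    }
  }
}
{code}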

bq. nit: make a method rather than dup the below...:

ok :)

bq. Does logging open at DEBUG but close at TRACE lead to confusion? Stumped 
debugger?

{{MemoryManager#close}} is really only a unit test thing.  But you're right, 
let's make it DEBUG since the open was DEBUG.

bq. The close has to let out an IOE? What is the caller going to do w/ this 
IOE? The ByteArrayMemoryManager close error string construction is the same as 
close on ProbingHashTable?

It's a (mis)feature of {{java.io.Closeable}}.  But I use that interface anyway, 
since FindBugs knows to nag us about it if we forget the close.  A user-defined 
interface wouldn't be known to FindBugs (although maybe there are annotations 
for that these days?).
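
For context, the IOE comes straight from the interface: {{java.io.Closeable}} fixes 
the signature to {{void close() throws IOException}}, so every implementor inherits 
the checked exception whether or not close can actually fail.  A hypothetical 
implementor looks like this:

{code:java}
import java.io.Closeable;
import java.io.IOException;

// Hypothetical implementor (not the patch's actual class body): Closeable forces
// "throws IOException" onto close(), and in exchange FindBugs' resource-leak
// detectors know to flag a forgotten close().
class SketchMemoryManager implements Closeable {
  @Override
  public void close() throws IOException {
    // release backing buffers here; nothing in this sketch can actually fail
  }
}
{code}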

bq. I like the compromise put upon the Iterator (that resize is allowed while 
Iteration...) Seems appropriate given where this is to be deployed.

Yeah, I think it will be useful.

bq. On TestMemoryManager, maybe parameterize so once through with 
ByteArrayMemoryManager and then a run with the offheap implementation rather 
than have dedicated test for each: 
https://github.com/junit-team/junit/wiki/Parameterized-tests

That's pretty cool.  I think we should do that in a follow-on where we do more 
coverage stuff as well, though...
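
For that follow-on, the JUnit 4 pattern from the wiki page would look roughly like 
this; the parameter strings and the factory hinted at in the comments are 
illustrative, not the patch's actual API:

{code:java}
import java.util.Arrays;
import java.util.Collection;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

// Rough sketch of a parameterized TestMemoryManager.
@RunWith(Parameterized.class)
public class TestMemoryManager {
  @Parameters(name = "{0}")
  public static Collection<Object[]> data() {
    return Arrays.asList(new Object[][] { { "byteArray" }, { "offHeap" } });
  }

  private final String kind;

  public TestMemoryManager(String kind) {
    this.kind = kind;
  }

  @Test
  public void testAllocateAndFree() throws Exception {
    // a hypothetical createMemoryManager(kind) factory would go here; every
    // @Test method then runs once per memory manager implementation.
  }
}
{code}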

bq. Yi wrote: It's better to assert maxLoadFactor < 1 (maybe < 0.8?); an incorrect 
value will cause the hash table to fail.

Good idea.
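
Something along these lines in the constructor should do it (a sketch using Guava's 
Preconditions, which Hadoop already pulls in; the exact upper bound is up for debate):

{code:java}
import com.google.common.base.Preconditions;

// Sketch of the check; rejects load factors that would break probing.
static void validateLoadFactor(float maxLoadFactor) {
  Preconditions.checkArgument(maxLoadFactor > 0.0f && maxLoadFactor < 1.0f,
      "maxLoadFactor must be in (0, 1), got %s", maxLoadFactor);
}
{code}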

bq. \[maintainCompactness\] looks brief, but I think it's not effective. 
putInternal needs probing if the slot was not in the right place, so it's not 
effective.

{{putInternal}} does do probing, though.  Maybe I'm missing something but I 
think this should work.  Also, I can tell from the log messages that 
{{maintainCompactness}} is getting some testing.  I didn't like the original 
implementation because it was duplicating a lot of code from {{putInternal}}.
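
For reference, the standard way to keep a linear-probing table compact after a 
removal is to pull each entry in the rest of the cluster out and re-insert it with 
ordinary probing, which is exactly the role {{putInternal}} plays here.  A generic 
sketch of that textbook algorithm (not the patch's {{maintainCompactness}}):

{code:java}
import java.util.function.ToIntFunction;

// Generic sketch of post-removal compaction for a linear-probing table.
public class CompactionSketch {
  static <E> void removeAndCompact(E[] slots, int removed, ToIntFunction<E> hash) {
    slots[removed] = null;
    int i = removed;
    while (true) {
      i = (i + 1) % slots.length;
      E e = slots[i];
      if (e == null) {
        return;                    // end of the cluster: nothing left to move
      }
      slots[i] = null;             // take the entry out...
      int j = Math.floorMod(hash.applyAsInt(e), slots.length);
      while (slots[j] != null) {   // ...and re-insert it with ordinary probing
        j = (j + 1) % slots.length;
      }
      slots[j] = e;
    }
  }
}
{code}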

bq. I think ByteArrayMemoryManager can only used for test for it's performance 
reason. If SUN Unsafe is not available, we should use current implemention on 
Hadoop trunk. We will not remove current implementation on trunk, right?

To my knowledge, all JVMs that are used in real Hadoop clusters have access to 
{{sun.misc.Unsafe}}.  If we want to support a better on-heap memory allocator we 
can always work on that later.  A more efficient on-heap implementation would be 
to take a big byte array and basically hand out offsets into it, much the way 
malloc itself does.  We're not going to keep around the old BlockManager code 
because that would be impossible.
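
For the record, the kind of on-heap allocator described above would look roughly 
like this: one big byte array whose "addresses" are offsets handed out by the 
allocator.  The sketch below is purely illustrative (a bump allocator with no 
free list, alignment, or growth):

{code:java}
// Illustrative bump allocator over one big byte[]; a real allocator would also
// need a free list, alignment, and growth, roughly the way malloc manages its heap.
class ByteArrayArena {
  private final byte[] heap;
  private int next = 0;

  ByteArrayArena(int capacity) {
    this.heap = new byte[capacity];
  }

  /** Hand out an offset for {@code size} bytes, or -1 if the arena is full. */
  int allocate(int size) {
    if (next + size > heap.length) {
      return -1;
    }
    int offset = next;
    next += size;
    return offset;
  }

  void putLong(int offset, long value) {
    for (int i = 0; i < 8; i++) {
      heap[offset + i] = (byte) (value >>> (8 * i));
    }
  }

  long getLong(int offset) {
    long v = 0;
    for (int i = 0; i < 8; i++) {
      v |= (heap[offset + i] & 0xffL) << (8 * i);
    }
    return v;
  }
}
{code}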

> Create an off-heap hash table implementation
> --------------------------------------------
>
>                 Key: HDFS-7844
>                 URL: https://issues.apache.org/jira/browse/HDFS-7844
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7836
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-7844-scl.001.patch, HDFS-7844-scl.002.patch, 
> HDFS-7844-scl.003.patch
>
>
> Create an off-heap hash table implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
