[ https://issues.apache.org/jira/browse/HBASE-15506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217374#comment-15217374 ]

Lars Hofhansl edited comment on HBASE-15506 at 3/30/16 4:51 AM:
----------------------------------------------------------------

My point is that that's not a problem as such; that's how Java is designed to work.
It's only a problem when it is a problem :)  As in, when it demonstrably slows things down, causes long GC pauses, etc.

I don't have to tell you, but just for completeness of the discussion here:
The principal costs to the garbage collector are (1) tracing all reachable objects from the "root" objects and (2) collecting all unreachable objects.
Obviously #1 is expensive when many objects need to be traced, and #2 is expensive when objects have to be moved (for example, to reduce memory fragmentation).
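
If we want to put numbers on those two costs without attaching a profiler, the JDK's management beans already expose them. A minimal sketch (class name is mine, nothing HBase-specific about it):

{code:java}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCost {
    // Prints cumulative collection count and time per collector. Poll it
    // around a workload to see how much tracing and copying actually happens.
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
{code}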

64KB objects do not worry me. Even if we have many GBs of them, that is just not many references to track. Further, since they are all the same size, we won't fragment the heap in bad ways.
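
The arithmetic, just to make that concrete (the 4 GB figure is an assumption for illustration):

{code:java}
public class RefCount {
    public static void main(String[] args) {
        long heap = 4L * 1024 * 1024 * 1024;  // assume 4 GB held in buffers
        System.out.printf("64KB buffers: %,d objects to trace%n", heap / (64 * 1024)); // 65,536
        System.out.printf(" 1KB buffers: %,d objects to trace%n", heap / 1024);        // 4,194,304
    }
}
{code}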

Reusing objects (IMHO) is simply a very questionable technique, especially when you have to reset the objects, which is more expensive than fitting a new object into a free slot of the same size.
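
A naive way to see that cost (a rough sketch, not a proper JMH benchmark, so treat the absolute numbers with suspicion):

{code:java}
import java.util.Arrays;

public class ResetVsAllocate {
    static final int SIZE = 64 * 1024;
    static final int ITERS = 200_000;
    static volatile byte[] sink;   // keep the JIT from discarding the work

    public static void main(String[] args) {
        byte[] reused = new byte[SIZE];
        long t0 = System.nanoTime();
        for (int i = 0; i < ITERS; i++) {
            Arrays.fill(reused, (byte) 0);   // "reuse": explicit reset before refilling
            sink = reused;
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < ITERS; i++) {
            sink = new byte[SIZE];           // "allocate": TLAB bump, JVM zeroes it
        }
        long t2 = System.nanoTime();
        System.out.printf("reset: %d ms, allocate: %d ms%n",
            (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
{code}

Note the JVM zeroes fresh arrays too, so the comparison is really bump-pointer allocation plus GC pressure versus an explicit fill on every round.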

For what it's worth, I have seen bad behaviour during heavy loading phases. I was always able to configure the GC accordingly, though.
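
For reference, these are the kinds of knobs I mean (illustrative CMS settings only; the right values depend on heap size and workload, and the log path is made up):

{noformat}
# in hbase-env.sh
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly \
  -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/hbase-gc.log"
{noformat}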

In any case, we should be able to create some single-server test workload that exhibits the problems, if there are any. Those are good tests to have anyway, not just a way to appease me.
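
Something along these lines could serve as a starting point. It's a purely hypothetical harness (names are mine) that only mimics the one-fresh-packet-per-write pattern from DFSOutputStream.createPacket, to be run with GC logging enabled:

{code:java}
import java.util.concurrent.ArrayBlockingQueue;

public class PacketChurn {
    public static void main(String[] args) throws Exception {
        // Bounded queue stands in for the client's in-flight packet window
        // (depth of 80 is an arbitrary choice).
        ArrayBlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(80);
        Thread streamer = new Thread(() -> {   // plays the role of the DataStreamer
            try {
                while (queue.take().length > 0) {
                    // "send" the packet and drop the reference
                }
            } catch (InterruptedException ignored) { }
        });
        streamer.start();
        long toWrite = 10L * 1024 * 1024 * 1024;   // simulate 10 GB of writes
        for (long sent = 0; sent < toWrite; sent += 64 * 1024) {
            queue.put(new byte[64 * 1024]);        // fresh 64KB packet per write
        }
        queue.put(new byte[0]);                    // zero-length packet = stop
        streamer.join();
    }
}
{code}

Run it once as-is and once with a single reused packet buffer, compare the GC logs, and we would know whether there is anything to fix here.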



> FSDataOutputStream.write() allocates new byte buffer on each operation
> ----------------------------------------------------------------------
>
>                 Key: HBASE-15506
>                 URL: https://issues.apache.org/jira/browse/HBASE-15506
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>
> The allocation happens deep inside the write stack, in DFSOutputStream.createPacket.
> This should be opened in HDFS; this JIRA is to track the HDFS work.



