[ 
https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706763#comment-14706763
 ] 

Yi Liu edited comment on HDFS-8900 at 8/21/15 2:08 PM:
-------------------------------------------------------

Thanks [~andrew.wang], I update the patch for your comments.

*About hard limit.*
{quote}
I talked with Colin Patrick McCabe about compatibility offline ....
{quote}
Yes, I agree you and Colin. In fact, when I added the hard limit, the first 
thing I considered is the compatibility. I did it based on two reasons: 1). as 
you said too, it's unlikely anyone's ever changed the max size. 2). The max 
limit is to restrict user's (user/trusted) namespace xattrs, and it doesn't 
break the existing HDFS features. 
I felt it was OK to do this modification on trunk and branch-2 directly 
(Certainly we can still use 4 bytes, but I wanted to save 2 bytes from it :)). 
I see you agree with the hard limit generally, one thing is do we need to only 
have the hard limit in trunk? How about do it in the branch-2 if we think it's 
fine?

{quote}
Could we exit on setting an xattr size bigger than the hard limit, rather than 
doing a silent min? We should mention this new hard limit somewhere as well.
{quote}
Sure, updated in the new patch. User will see the hard limit from error msg. I 
updated the description of the xattr max size in hdfs-default.xml and mentioned 
the hard limit too.

{quote}
Any comment on the max size supported by other filesystems like ext4 or btrfs, 
for reference as to what's reasonable here?
{quote}
It's a long time since I read xattr of ext4 and other fs last time. I remember 
some fs have limit about the xattr length, and some fs don't have. It's detail, 
small difference should be OK.

{quote}
Typo in FSDirectory: "should" -> "should be". Also the line below it says 
"unlimited" but that'll never get triggered now.
{quote}
sure, I forgot to remove the unlimited...

{quote}
Regarding configuration, we could simplify by just using the hard limit. Admins 
would still have the option of disabling xattrs entirely; is there really any 
value in being able to set something smaller than 32KB? This would definitely 
make it a trunk change.
{quote}
Agree, How about doing it in a follow-on if we want later?

*More review comments:* 
{quote}
FSDirectory#addToInodeMap, do we need that new return?
{quote}
right, no need. I removed it. Maybe I added it suddenly :)

{quote}
A bit out of scope and so optional, but I think everywhere we say "prefixName" 
we really want to say "prefixedName" because "prefixName" sounds more like the 
name of the prefix rather than a name with a prefix.
{quote}
Good idea, {{prefixedName}} is much more better, I updated all those in the new 
patch.

{quote}
Some unrelated import changes in FSNamesystemLock and INodeAttributeProvider
{quote}
OK, I reverted this modification since they are unrelated (I saw them, I was 
thinking to remove them). 

{quote}
XAttrFeature has an extra import
{quote}
fixed

{quote}
What's the reason for switching from ImmutableList to List in some places? The 
switch is also not complete, since I still see some usages of ImmutableList. I 
remember we liked ImmutableList originally since it made the need to set very 
explicit.
{quote}
I think originally we use {{ImmutableList}} is mainly because it's immutable, 
and we keep it in {{XAttrFeature}}, we don't want outside modification affects 
it. Now it's packed to {{byte[]}} in {{XAttrFeature}}, so no need immutable 
list. 
I use ArrayList instead of ImmutableList is because when building 
ImmutableList, it needs additional list copy (from an internal ArrayList). Then 
the performance is a bit better.
I missed one place in XAttrStorage and fixed it now.

{quote}
Mind adding Javadoc for SerialNumberMap, and an interface audience private 
annotation?
XAttrsFormat's class javadoc goes over 80 chars, could use interface audience 
private also.
{quote}
Sure, updated them.  One thing is {{XAttrFormat}} is {{package-private}}, so no 
need to add audience private annotation for it.

{quote}
Can we add an IllegalStateException to SerialNumberMap#get(T) for Integer 
overflow? Also there's the case that the int from the map doesn't fit in the 29 
bits in XAttrFormat; check that in XAttrsFormattoBytes?
{quote}
Sure, I also add the check in XAttrFormat#toBytes.   Actually I want to create 
a follow-on to restrict the total number of xattr names for user's (user/trust) 
xattrs. For HDFS kernel, the number of xattr names are less than 10 currently, 
but if users create many various xattrs (maybe by mistake), then it will cause 
unexpected behavior. 

{quote}
Consider also renaming XAttrsFormat to XAttrFormat, so it's named like 
XAttrStorage
{quote}
Good idea.

{quote}
Is it worthwhile to do the same dictionary encoding for the FSImage as well? If 
the # xattrs is large enough to affect memory footprint, it'd also affect 
loading times which can already be minutes. Can be a follow-on JIRA for sure.
{quote}
Actually we already have this compaction for XAttr in FSImage, please see 
{{XAttrCompactProto}} in fsimage.proto.


was (Author: hitliuyi):
Thanks [~andrew.wang], I update the patch for your comments.

*About hard limit.*
{quote}
I talked with Colin Patrick McCabe about compatibility offline ....
{quote}
Yes, I agree you and Colin. In fact, when I added the hard limit, the first 
thing I considered is the compatibility. I did it based on two reasons: 1). as 
you said too, it's unlikely anyone's ever changed the max size. 2). The max 
limit is to restrict user's (user/trusted) namespace xattrs, and it doesn't 
break the existing HDFS features. 
I felt it was OK to do this modification on trunk and branch-2 directly 
(Certainly we can still use 4 bytes, but I wanted to save 2 bytes from it :)). 
I see you agree with the hard limit generally, one thing is do we need to only 
have the hard limit in trunk? How about do it in the branch-2 if we think it's 
fine?

{quote}
Could we exit on setting an xattr size bigger than the hard limit, rather than 
doing a silent min? We should mention this new hard limit somewhere as well.
{quote}
Sure, updated in the new patch. User will see the hard limit from error msg. I 
updated the description of the xattr max size in hdfs-default.xml and mentioned 
the hard limit too.

{quote}
Any comment on the max size supported by other filesystems like ext4 or btrfs, 
for reference as to what's reasonable here?
{quote}
It's a long time since I read xattr of ext4 and other fs last time. I remember 
some fs have limit about the xattr length, and some fs don't have. It's detail, 
small difference should be OK.

{quote}
Typo in FSDirectory: "should" -> "should be". Also the line below it says 
"unlimited" but that'll never get triggered now.
{quote}
sure, I forgot to remove the unlimited...

{quote}
Regarding configuration, we could simplify by just using the hard limit. Admins 
would still have the option of disabling xattrs entirely; is there really any 
value in being able to set something smaller than 32KB? This would definitely 
make it a trunk change.
{quote}
Agree, How about doing it in a follow-on if we want later?

*More review comments:* 
{quote}
FSDirectory#addToInodeMap, do we need that new return?
{quote}
right, no need. I removed it. Maybe I added it suddenly :)

{quote}
A bit out of scope and so optional, but I think everywhere we say "prefixName" 
we really want to say "prefixedName" because "prefixName" sounds more like the 
name of the prefix rather than a name with a prefix.
{quote}
Good idea, {{prefixedName}} is much more better, I updated all those in the new 
patch.

{quote}
Some unrelated import changes in FSNamesystemLock and INodeAttributeProvider
{quote}
OK, I reverted this modification since they are unrelated (I saw them, I was 
thinking to remove them). 

{quote}
XAttrFeature has an extra import
{quote}
fixed

{quote}
What's the reason for switching from ImmutableList to List in some places? The 
switch is also not complete, since I still see some usages of ImmutableList. I 
remember we liked ImmutableList originally since it made the need to set very 
explicit.
{quote}
I think originally we use {{ImmutableList}} is mainly because it's immutable, 
and we keep it in {{XAttrFeature}}, we don't want outside modification affects 
it. Now it's packed to {{byte[]}} in {{XAttrFeature}}, so no need immutable 
list. 
I use ArrayList instead of ImmutableList is because when building 
ImmutableList, it needs additional list copy (from an internal ArrayList). Then 
the performance is a bit better.

{quote}
Mind adding Javadoc for SerialNumberMap, and an interface audience private 
annotation?
XAttrsFormat's class javadoc goes over 80 chars, could use interface audience 
private also.
{quote}
Sure, updated them.  One thing is {{XAttrFormat}} is {{package-private}}, so no 
need to add audience private annotation for it.

{quote}
Can we add an IllegalStateException to SerialNumberMap#get(T) for Integer 
overflow? Also there's the case that the int from the map doesn't fit in the 29 
bits in XAttrFormat; check that in XAttrsFormattoBytes?
{quote}
Sure, I also add the check in XAttrFormat#toBytes.   Actually I want to create 
a follow-on to restrict the total number of xattr names for user's (user/trust) 
xattrs. For HDFS kernel, the number of xattr names are less than 10 currently, 
but if users create many various xattrs (maybe by mistake), then it will cause 
unexpected behavior. 

{quote}
Consider also renaming XAttrsFormat to XAttrFormat, so it's named like 
XAttrStorage
{quote}
Good idea.

{quote}
Is it worthwhile to do the same dictionary encoding for the FSImage as well? If 
the # xattrs is large enough to affect memory footprint, it'd also affect 
loading times which can already be minutes. Can be a follow-on JIRA for sure.
{quote}
Actually we already have this compaction for XAttr in FSImage, please see 
{{XAttrCompactProto}} in fsimage.proto.

> Optimize XAttr memory footprint.
> --------------------------------
>
>                 Key: HDFS-8900
>                 URL: https://issues.apache.org/jira/browse/HDFS-8900
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>         Attachments: HDFS-8900.001.patch, HDFS-8900.002.patch
>
>
> {code}
> private final ImmutableList<XAttr> xAttrs;
> {code}
> Currently we use above in XAttrFeature, it's not efficient from memory point 
> of view, since {{ImmutableList}} and {{XAttr}} have object memory overhead, 
> and each object has memory alignment. 
> We can use a {{byte[]}} in XAttrFeature and do some compact in {{XAttr}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to