[jira] [Comment Edited] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory

2018-01-24 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338473#comment-16338473
 ] 

Manoj Govindassamy edited comment on HDFS-12051 at 1/25/18 12:23 AM:
-

Thanks for working on this [~mi...@cloudera.com]. A few comments on 
HDFS-12051.07.patch (a small illustrative sketch follows this list):

{{NameCache.java}}
 * line 97: {{cache = new byte[cacheSize][];}} Since this allocates one 
contiguous array, we need to restrict the cache size to something much 
smaller than the current MAX size of 1 << 30. Your thoughts?
 * {{#cache}} now follows the {{open addressing}} model. Any reason you moved 
to this model from your initial design?
 * {{#put()}}
 ** line 118: on the first cache fill, shouldn't the entry be a new byte 
array constructed from the passed-in name? Why store the same array the 
caller passed in?
 ** With the {{open addressing}} model, when you overwrite a cache slot with 
a new name, there could be INodes already referring to the old name that are 
now cut off from the cache. Though their references remain valid, I want to 
understand why new names are preferred over old ones.
 * I don't see any cache invalidation, even when INodes are removed. This 
takes up memory. Though not huge, design-wise it's not clean to leave stale 
values in the cache and incur a lookup penalty on future put() calls.
 * {{#getSize()}} Since there is no cache invalidation, and since slots are 
overwritten under this open addressing model, the size returned is not 
accurate.
 * line 149: {{cacheSizeFor}} Does this round up or down to the nearest power 
of 2? Please add a reference to {{HashMap#tableSizeFor()}} in the comment to 
show where the code is inspired from.
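For reference, here is a minimal stand-alone sketch of the two behaviors 
questioned above: slot overwriting under open addressing, and rounding up to 
a power of two in the style of {{HashMap#tableSizeFor()}}. The class and 
names are illustrative only, not the actual patch code.

{code:java}
// Illustrative only: a stripped-down open-addressing intern cache.
class TinyNameCache {
  private final byte[][] cache;
  private final int mask;

  TinyNameCache(int requestedSize) {
    int size = cacheSizeFor(requestedSize);
    cache = new byte[size][];      // one contiguous array of references
    mask = size - 1;               // works because size is a power of two
  }

  // Returns the interned copy on a hit; otherwise overwrites the slot.
  // Overwriting is what "cuts off" earlier INodes from the cache; the
  // byte[] references they already hold remain valid.
  byte[] put(byte[] name) {
    int slot = java.util.Arrays.hashCode(name) & mask;
    byte[] cached = cache[slot];
    if (cached != null && java.util.Arrays.equals(cached, name)) {
      return cached;
    }
    cache[slot] = name;
    return name;
  }

  // Rounds UP to the next power of two, mirroring HashMap#tableSizeFor().
  static int cacheSizeFor(int c) {
    int n = c - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : n + 1;
  }
}
{code}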


was (Author: manojg):
Thanks for working on this [~mi...@cloudera.com]. A few comments on 
HDFS-12051.07.patch:

{{NameCache.java}}
 * line 97: {{cache = new byte[cacheSize][];}} Since this allocates one 
contiguous array, we need to restrict the cache size to something much 
smaller than the current MAX size of 1 << 30. Your thoughts?
 * {{#cache}} now follows the {{open addressing}} model. Any reason you moved 
to this model from your initial design?
 * {{#put()}}
 ** line 118: on the first cache fill, shouldn't the entry be a new byte 
array constructed from the passed-in name? Why store the same array the 
caller passed in?
 ** With the {{open addressing}} model, when you overwrite a cache slot with 
a new name, there could be INodes already referring to the old name that are 
now cut off from the cache.
 * I don't see any cache invalidation, even when INodes are removed. This 
takes up memory. Though not huge, design-wise it's not clean to leave stale 
values in the cache and incur a lookup penalty on future put() calls.
 * {{#getSize()}} Since there is no cache invalidation, and since slots are 
overwritten under this open addressing model, the size returned is not 
accurate.
 * line 149: {{cacheSizeFor}} Does this round up or down to the nearest power 
of 2? Please add a reference to {{HashMap#tableSizeFor()}} in the comment to 
show where the code is inspired from.


[jira] [Commented] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory

2018-01-24 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338473#comment-16338473
 ] 

Manoj Govindassamy commented on HDFS-12051:
---

Thanks for working on this [~mi...@cloudera.com]. A few comments on 
HDFS-12051.07.patch:

{{NameCache.java}}
 * line 97: {{cache = new byte[cacheSize][];}} Since this allocates one 
contiguous array, we need to restrict the cache size to something much 
smaller than the current MAX size of 1 << 30. Your thoughts?
 * {{#cache}} now follows the {{open addressing}} model. Any reason you moved 
to this model from your initial design?
 * {{#put()}}
 ** line 118: on the first cache fill, shouldn't the entry be a new byte 
array constructed from the passed-in name? Why store the same array the 
caller passed in?
 ** With the {{open addressing}} model, when you overwrite a cache slot with 
a new name, there could be INodes already referring to the old name that are 
now cut off from the cache.
 * I don't see any cache invalidation, even when INodes are removed. This 
takes up memory. Though not huge, design-wise it's not clean to leave stale 
values in the cache and incur a lookup penalty on future put() calls.
 * {{#getSize()}} Since there is no cache invalidation, and since slots are 
overwritten under this open addressing model, the size returned is not 
accurate.
 * line 149: {{cacheSizeFor}} Does this round up or down to the nearest power 
of 2? Please add a reference to {{HashMap#tableSizeFor()}} in the comment to 
show where the code is inspired from.

> Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly 
> those denoting file/directory names) to save memory
> -
>
> Key: HDFS-12051
> URL: https://issues.apache.org/jira/browse/HDFS-12051
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, 
> HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, 
> HDFS-12051.06.patch, HDFS-12051.07.patch
>
>
> When a snapshot diff operation is performed in a NameNode that manages 
> several million HDFS files/directories, the NN needs a lot of memory. 
> Analyzing one heap dump with jxray (www.jxray.com), we observed that 
> duplicate byte[] arrays result in 6.5% memory overhead, and most of these 
> arrays are referenced by 
> {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}}
>  and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}:
> {code:java}
> 19. DUPLICATE PRIMITIVE ARRAYS
> Types of duplicate objects:
>  Ovhd Num objs  Num unique objs   Class name
> 3,220,272K (6.5%)   104749528  25760871 byte[]
> 
>   1,841,485K (3.7%), 53194037 dup arrays (13158094 unique)
> 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 
> of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, 
> 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 
> 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 
> of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, 
> 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...)
> ... and 45902395 more arrays, of which 13158084 are unique
>  <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name 
> <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode 
> <--  {j.u.ArrayList} <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs 
> <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 
> elements) ... <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
>  <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java 
> Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER
>   409,830K (0.8%), 13482787 dup arrays (13260241 unique)
> 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of 

[jira] [Commented] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long

2018-01-22 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335277#comment-16335277
 ] 

Manoj Govindassamy commented on HDFS-11225:
---

[~shashikant], please feel free to take ownership of this bug and follow up 
with your proposal. I have moved on to other things and am not able to spend 
time on this. My apologies.

> NameNode crashed because deleteSnapshot held FSNamesystem lock too long
> ---
>
> Key: HDFS-11225
> URL: https://issues.apache.org/jira/browse/HDFS-11225
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
> Environment: CDH5.8.2, HA
>Reporter: Wei-Chiu Chuang
>Assignee: Manoj Govindassamy
>Priority: Critical
>  Labels: high-availability
> Attachments: Snaphot_Deletion_Design_Proposal.pdf
>
>
> The deleteSnapshot operation is synchronous. In certain situations this 
> operation may hold the FSNamesystem lock for too long, bringing almost every 
> NameNode operation to a halt.
> We have observed one incident where it took so long that ZKFC believed the 
> NameNode was down. All other IPC threads were waiting to acquire the 
> FSNamesystem lock. This specific deleteSnapshot took ~70 seconds. ZKFC has a 
> connection timeout of 45 seconds by default, and if all IPC threads wait for 
> the FSNamesystem lock and can't accept new incoming connections, ZKFC times 
> out and advances the epoch, and the NameNode therefore loses its active NN 
> role and then fails.
> Relevant log:
> {noformat}
> Thread 154 (IPC Server handler 86 on 8020):
>   State: RUNNABLE
>   Blocked count: 2753455
>   Waited count: 89201773
>   Stack:
> 
> org.apache.hadoop.hdfs.server.namenode.INode$BlocksMapUpdateInfo.addDeleteBlock(INode.java:879)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.destroyAndCollectBlocks(INodeFile.java:508)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference.destroyAndCollectBlocks(INodeReference.java:339)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.destroyAndCollectBlocks(INodeReference.java:606)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.destroyDeletedList(DirectoryWithSnapshotFeature.java:119)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.access$400(DirectoryWithSnapshotFeature.java:61)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:319)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:167)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:83)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:745)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747)
> 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:789)
> {noformat}
> After ZKFC determined the NameNode was down and advanced the epoch, the NN 
> finished deleting the snapshot and sent the edit to the journal nodes, but 
> it was rejected because the epoch had been updated. See the following 
> stacktrace:
> {noformat}
> 10.0.16.21:8485: IPC's epoch 17 is less than the last promised epoch 18
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:457)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:352)
> at 
> 

[jira] [Commented] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2018-01-17 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329062#comment-16329062
 ] 

Manoj Govindassamy commented on HDFS-11847:
---

[~jlowe], 

  My bad. My intention was only to use branch-3.0, not to create a new 
branch. I will check my scripts. Thanks for spotting this. Please let me know 
the corrective actions and I will follow them.

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch, 
> HDFS-11847.03.patch, HDFS-11847.04.patch, HDFS-11847.05.patch
>
>
> HDFS-10480 adds a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand-plus node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.
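
As a rough illustration of the proposed filter described above (hypothetical 
stand-in types, not the HDFS implementation): an open file would be reported 
only if at least one of its blocks lives on a decommissioning DataNode.

{code:java}
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class DecommissionFilterSketch {
  // Hypothetical stand-in: open file path -> datanodes holding its blocks.
  static List<String> blockingDecommission(
      Map<String, Set<String>> openFileReplicas,
      Set<String> decommissioningNodes) {
    return openFileReplicas.entrySet().stream()
        .filter(e -> e.getValue().stream()
            .anyMatch(decommissioningNodes::contains))
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }
}
{code}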






[jira] [Commented] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2018-01-12 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324606#comment-16324606
 ] 

Manoj Govindassamy commented on HDFS-11847:
---

Given that HDFS-10480 is available in 3.0, backported both HDFS-11847 and 
HDFS-11848 to branch-3.








[jira] [Commented] (HDFS-11848) Enhance dfsadmin listOpenFiles command to list files under a given path

2018-01-12 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324604#comment-16324604
 ] 

Manoj Govindassamy commented on HDFS-11848:
---

[~linyiqun],
   Given that HDFS-10480 is available in 3.0, backported both HDFS-11847 and 
HDFS-11848 to branch-3. Hope this is OK with you. Please let me know if 
otherwise.


> Enhance dfsadmin listOpenFiles command to list files under a given path
> ---
>
> Key: HDFS-11848
> URL: https://issues.apache.org/jira/browse/HDFS-11848
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Yiqun Lin
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-11848.001.patch, HDFS-11848.002.patch, 
> HDFS-11848.003.patch, HDFS-11848.004.patch
>
>
> HDFS-10480 adds a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> One more thing that would be nice here is to filter the output by a given 
> path or DataNode. Use case: an admin might already know a stale file by path 
> (perhaps from fsck's -openforwrite) and wants to figure out who the lease 
> holder is. The proposal here is to add suboptions to {{listOpenFiles}} to 
> list files filtered by path.
> {{LeaseManager#getINodeWithLeases(INodeDirectory)}} can be used to get the 
> open file list for any given ancestor directory.






[jira] [Updated] (HDFS-11848) Enhance dfsadmin listOpenFiles command to list files under a given path

2018-01-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11848:
--
Fix Version/s: 3.0.1







[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2018-01-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Fix Version/s: 3.0.1







[jira] [Commented] (HDFS-12994) TestReconstructStripedFile.testNNSendsErasureCodingTasks fails due to socket timeout

2018-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319031#comment-16319031
 ] 

Manoj Govindassamy commented on HDFS-12994:
---

Got it. Patch v01 looks good to me. +1, thanks for working on this.

> TestReconstructStripedFile.testNNSendsErasureCodingTasks fails due to socket 
> timeout
> 
>
> Key: HDFS-12994
> URL: https://issues.apache.org/jira/browse/HDFS-12994
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-12994.00.patch, HDFS-12994.01.patch
>
>
> Occasionally, {{testNNSendsErasureCodingTasks}} fails due to socket timeout
> {code}
> 2017-12-26 20:35:19,961 [StripedBlockReconstruction-0] INFO  
> datanode.DataNode (StripedBlockReader.java:createBlockReader(132)) - 
> Exception while creating remote block reader, datanode 127.0.0.1:34145
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReader.newConnectedPeer(StripedBlockReader.java:148)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReader.createBlockReader(StripedBlockReader.java:123)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReader.(StripedBlockReader.java:83)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.createReader(StripedReader.java:169)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.initReaders(StripedReader.java:150)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.init(StripedReader.java:133)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:56)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> while the target datanode is removed in the test:
> {code}
> 2017-12-26 20:35:18,710 [Thread-2393] INFO  net.NetworkTopology 
> (NetworkTopology.java:remove(219)) - Removing a node: 
> /default-rack/127.0.0.1:34145
> {code}






[jira] [Comment Edited] (HDFS-12994) TestReconstructStripedFile.testNNSendsErasureCodingTasks fails due to socket timeout

2018-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319031#comment-16319031
 ] 

Manoj Govindassamy edited comment on HDFS-12994 at 1/9/18 7:51 PM:
---

Got it. Patch v01 looks good to me. +1, thanks for working on this.


was (Author: manojg):
Got it. Patch v02 looks good to me. +1, thanks for working on this.







[jira] [Comment Edited] (HDFS-12994) TestReconstructStripedFile.testNNSendsErasureCodingTasks fails due to socket timeout

2018-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319031#comment-16319031
 ] 

Manoj Govindassamy edited comment on HDFS-12994 at 1/9/18 7:51 PM:
---

Got it. Patch v02 looks good to me. +1, thanks for working on this.


was (Author: manojg):
Got it. Patch v01 looks good to me. +1, thanks for working on this.







[jira] [Commented] (HDFS-12994) TestReconstructStripedFile.testNNSendsErasureCodingTasks fails due to socket timeout

2018-01-09 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319023#comment-16319023
 ] 

Manoj Govindassamy commented on HDFS-12994:
---

[~eddyxu], 
  Is the intention to let the client detect DN issues more quickly? And 
shouldn't the problem always happen when the DN is removed? Just trying to 
understand the core issue. Thanks.







[jira] [Comment Edited] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-08 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317352#comment-16317352
 ] 

Manoj Govindassamy edited comment on HDFS-12985 at 1/9/18 12:31 AM:


Thanks for the review [~yzhangal]. Committed it to trunk and branch-2. 


was (Author: manojg):
Thanks for the review [~yzhangal]. Committed it to trunk. 

> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.1.0, 2.10.0
>
> Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with an NPE at startup when trying to find the 
> total number of under-construction blocks. This crash happens after an open 
> file that was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}






[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-08 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
  Resolution: Fixed
   Fix Version/s: 2.10.0
  3.1.0
Target Version/s: 3.1.0, 2.10.0  (was: 3.1.0)
  Status: Resolved  (was: Patch Available)







[jira] [Commented] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-08 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317352#comment-16317352
 ] 

Manoj Govindassamy commented on HDFS-12985:
---

Thanks for the review [~yzhangal]. Committed it to trunk. 







[jira] [Commented] (HDFS-11848) Enhance dfsadmin listOpenFiles command to list files under a given path

2018-01-05 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313613#comment-16313613
 ] 

Manoj Govindassamy commented on HDFS-11848:
---

Thanks for the patch revision. Looks good overall. +1, with a few more nit 
questions below (see the sketch after this list).
1. {{TestDFSAdmin:778}} Since the path is "", shouldn't it list all open 
files, and hence shouldn't the validation be against {{openFilesMap}} instead 
of {{openFiles1}}?
2. The input paths are treated as plain strings, right? That is, even an 
input that is not a valid path can still filter the results: say, 
"/dir1/dir2/d" filters files under both "/dir1/dir2/dir3/" and 
"/dir1/dir2/dir4/".
3. If (2) is true, would it be useful to accept the input path as a regex 
pattern? Totally OK with not doing this, or taking it up in a different jira.
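
For illustration, a minimal sketch of the plain-prefix matching described in 
(2), with hypothetical names (not the patch code):

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class OpenFilePathFilterSketch {
  // Plain String#startsWith() matching: "/dir1/dir2/d" matches entries
  // under both /dir1/dir2/dir3/ and /dir1/dir2/dir4/, even though it is
  // not itself a valid directory path.
  static List<String> filter(List<String> openFilePaths, String prefix) {
    return openFilePaths.stream()
        .filter(p -> p.startsWith(prefix))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> open = Arrays.asList(
        "/dir1/dir2/dir3/a.log",
        "/dir1/dir2/dir4/b.log",
        "/dir1/other/c.log");
    // Prints the first two entries only.
    System.out.println(filter(open, "/dir1/dir2/d"));
  }
}
{code}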







[jira] [Commented] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long

2018-01-04 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312365#comment-16312365
 ] 

Manoj Govindassamy commented on HDFS-11225:
---

[~shashikant],  Thanks for the proposal. Will take a look and get back to you.


[jira] [Commented] (HDFS-11848) Enhance dfsadmin listOpenFiles command to list files under a given path

2018-01-04 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312361#comment-16312361
 ] 

Manoj Govindassamy commented on HDFS-11848:
---

Thanks for the patch contribution [~linyiqun]. Overall looks good to me. Here 
are a few minor comments (see the sketch after this list):
1. {{DFSAdmin:467}} To be consistent with the rest of the options, the 
indentation change can be reverted.
2. {{DFSAdmin:935}} Any benefit to using StringUtils here? The implementation 
is missing a trim() before the empty check.
3. {{DFSAdmin:2148}} Would this catch the case where the -path option is 
given without a path argument?
4. {{HDFSCommands.md:412}} (1) The -path option is missing. (2) Should "Open 
files list can filtered by given type or path." be "Open files list will be 
filtered by given type and path."?
5. {{TestDFSAdmin:761}} Please add a test case to verify -path without any 
path argument, and with an empty path "".
6. Nit: In a few places the default value for the path is given as null, 
while everywhere else it is OpenFilesIterator.FILTER_PATH_DEFAULT. Better to 
be consistent with the default value for simplicity.
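
A minimal sketch of the trim-before-empty check suggested in (2), using plain 
JDK calls rather than the actual DFSAdmin code:

{code:java}
class PathArgCheckSketch {
  // Trim first, then check for empty, so a whitespace-only argument
  // like "   " is treated the same as "".
  static boolean isBlank(String pathArg) {
    return pathArg == null || pathArg.trim().isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(isBlank("   "));        // true
    System.out.println(isBlank("/dir1/dir2")); // false
  }
}
{code}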







[jira] [Commented] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-04 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312158#comment-16312158
 ] 

Manoj Govindassamy commented on HDFS-12985:
---

The unit test failures above are not related to the patch. I will take care 
of the checkstyle issue in the next patch revision, after review.







[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-04 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
Status: Patch Available  (was: Open)







[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-04 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
Attachment: HDFS-12985.01.patch

Attached v01 to address the following (a simplified sketch of the failure 
mode follows this list):
1. {{INodeFile#cleanSubtree()}} updates {{ReclaimContext#removedUCFiles}} 
after deleting the snapshot file.
2. {{FSDirDeleteOp#deleteInternal}} already takes care of removing the leases 
for removedUCFiles and removedINodes.
3. A new unit test, 
{{TestOpenFilesWithSnapshot#testOpenFileDeletionAndNNRestart}}, demonstrates 
the problem and verifies the fix.
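
A simplified sketch of the failure mode, with stand-in types rather than the 
real LeaseManager/FSDirectory classes: a lease left behind for a deleted file 
resolves to a null inode, and an unguarded dereference during block counting 
is what crashes startup. The actual fix removes the stale lease at deletion 
time (items 1 and 2 above); the null guard below only marks where the NPE 
would arise.

{code:java}
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class StaleLeaseSketch {
  // Stand-in for the inode map: file id -> under-construction block count.
  static final Map<Long, Integer> inodes = new HashMap<>();

  static long countUnderConstructionBlocks(Iterable<Long> leasedFileIds) {
    long total = 0;
    for (long id : leasedFileIds) {
      Integer blocks = inodes.get(id);  // null if the file was deleted
      if (blocks == null) {
        continue;  // stale lease: without this guard, unboxing would NPE
      }
      total += blocks;
    }
    return total;
  }

  public static void main(String[] args) {
    inodes.put(1L, 3);
    // File 2 was deleted along with its snapshot, but its lease remains.
    System.out.println(countUnderConstructionBlocks(Arrays.asList(1L, 2L)));
  }
}
{code}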

> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with an NPE at startup when trying to find the 
> total number of under-construction blocks. This crash happens after an open 
> file, which was also part of a snapshot, gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-04 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311928#comment-16311928
 ] 

Manoj Govindassamy edited comment on HDFS-12985 at 1/4/18 7:46 PM:
---

Attached v01 to address the following:
1. {{INodeFile#cleanSubtree()}} updates {{ReclaimContext#removedUCFiles}} after 
deleting the snapshot file.
2. {{FSDirDeleteOp#deleteInternal}} already takes care of removing the leases 
for removedUCFiles and removedINodes.
3. A new unit test, {{TestOpenFilesWithSnapshot#testOpenFileDeletionAndNNRestart}}, 
was added to reproduce the problem and verify the fix.
[~yzhangal], [~eddyxu], can you please take a look at the patch?


was (Author: manojg):
Attached v01 to address the following:
1. {{INodeFile#cleanSubtree()}} updates {{ReclaimContext#removedUCFiles}} after 
deleting the snapshot file.
2. {{FSDirDeleteOp#deleteInternal}} already takes care of removing the leases 
for removedUCFiles and removedINodes.
3. A new unit test, {{TestOpenFilesWithSnapshot#testOpenFileDeletionAndNNRestart}}, 
was added to reproduce the problem and verify the fix.

> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with an NPE at startup when trying to find the 
> total number of under-construction blocks. This crash happens after an open 
> file, which was also part of a snapshot, gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-03 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
Description: 
NameNode crashes repeatedly with an NPE at startup when trying to find the 
total number of under-construction blocks. This crash happens after an open 
file, which was also part of a snapshot, gets deleted along with the snapshot.

{noformat}
Failed to start namenode.
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
{noformat}




  was:
NameNode crashes repeatedly with an NPE at startup when trying to find the 
total number of under-construction blocks. This crash happens after an open 
file, which was also part of a snapshot, gets deleted along with the snapshot.

{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:144)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:4456)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1158)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:825)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:751)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:968)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:947)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1674)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2110)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2075)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testSnapshotsForOpenFilesAndDeletion3(TestOpenFilesWithSnapshot.java:747)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}





> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> NameNode crashes repeatedly with an NPE at startup when trying to find the 
> total number of under-construction blocks. This crash happens after an open 
> file, which was also part of a snapshot, gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>  

[jira] [Created] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-03 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12985:
-

 Summary: NameNode crashes during restart after an OpenForWrite 
file present in the Snapshot got deleted
 Key: HDFS-12985
 URL: https://issues.apache.org/jira/browse/HDFS-12985
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 2.8.0
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


NameNode crashes repeatedly with an NPE at startup when trying to find the 
total number of under-construction blocks. This crash happens after an open 
file, which was also part of a snapshot, gets deleted along with the snapshot.

{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:144)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:4456)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1158)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:825)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:751)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:968)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:947)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1674)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2110)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2075)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testSnapshotsForOpenFilesAndDeletion3(TestOpenFilesWithSnapshot.java:747)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11848) Enhance dfsadmin listOpenFiles command to list files under a given path

2018-01-03 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310469#comment-16310469
 ] 

Manoj Govindassamy commented on HDFS-11848:
---

Thanks for posting a patch revision, [~linyiqun]. Sorry for the delay; I will 
review it this week. 

> Enhance dfsadmin listOpenFiles command to list files under a given path
> ---
>
> Key: HDFS-11848
> URL: https://issues.apache.org/jira/browse/HDFS-11848
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Yiqun Lin
> Attachments: HDFS-11848.001.patch, HDFS-11848.002.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> One more thing that would be nice here is to filter the output on a passed 
> path or DataNode. Use case: an admin might already know a stale file by path 
> (perhaps from fsck's -openforwrite) and wants to figure out who the lease 
> holder is. The proposal here is to add suboptions to {{listOpenFiles}} to 
> list files filtered by path.
> {{LeaseManager#getINodeWithLeases(INodeDirectory)}} can be used to get the 
> open file list for any given ancestor directory.
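
A minimal client-side sketch of the admin workflow this jira wants to 
simplify, assuming the {{HdfsAdmin#listOpenFiles()}} client API added by 
HDFS-10480; the namenode URI and the path prefix are placeholders. Until the 
suboption exists, the path filter has to be applied on the client over the 
full cluster-wide listing.

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.protocol.OpenFileEntry;

public class ListOpenFilesUnderPath {
  public static void main(String[] args) throws Exception {
    String pathPrefix = args.length > 0 ? args[0] : "/data/ingest";
    HdfsAdmin admin =
        new HdfsAdmin(URI.create("hdfs://namenode:8020"), new Configuration());
    // Full listing of open files; the filtering below is what the proposed
    // suboption would push down to the NameNode.
    RemoteIterator<OpenFileEntry> it = admin.listOpenFiles();
    while (it.hasNext()) {
      OpenFileEntry e = it.next();
      if (e.getFilePath().startsWith(pathPrefix)) {
        System.out.println(e.getFilePath() + " held by " + e.getClientName());
      }
    }
  }
}
{code}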



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2018-01-02 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12629:
--
   Resolution: Fixed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Thanks for the review [~eddyxu]. Committed to trunk.

> NameNode UI should report total blocks count by type - replicated and erasure 
> coded
> ---
>
> Key: HDFS-12629
> URL: https://issues.apache.org/jira/browse/HDFS-12629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.1.0
>
> Attachments: HDFS-12629.01.patch, HDFS-12629.02.patch, 
> NN_UI_Summary_BlockCount_AfterFix.png, NN_UI_Summary_BlockCount_BeforeFix.png
>
>
> Currently the NameNode UI displays total files and directories and total 
> blocks in the cluster under the Summary tab. But the total block count split 
> by type is missing. It would be good if we could display total block counts 
> by type (provided by HDFS-12573) along with the total block count. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2018-01-02 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308884#comment-16308884
 ] 

Manoj Govindassamy commented on HDFS-11847:
---

Thanks for the review [~xiaochen]. Took care of the checkstyle and javadoc 
issues. Committed to trunk. 

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch, 
> HDFS-11847.03.patch, HDFS-11847.04.patch, HDFS-11847.05.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2018-01-02 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12629:
--
Attachment: HDFS-12629.02.patch

Thanks for the review [~eddyxu]. Will commit soon to trunk. Re-attaching the 
same patch as v02 to overcome the HDFS precommit build issue. 

> NameNode UI should report total blocks count by type - replicated and erasure 
> coded
> ---
>
> Key: HDFS-12629
> URL: https://issues.apache.org/jira/browse/HDFS-12629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12629.01.patch, HDFS-12629.02.patch, 
> NN_UI_Summary_BlockCount_AfterFix.png, NN_UI_Summary_BlockCount_BeforeFix.png
>
>
> Currently the NameNode UI displays total files and directories and total 
> blocks in the cluster under the Summary tab. But the total block count split 
> by type is missing. It would be good if we could display total block counts 
> by type (provided by HDFS-12573) along with the total block count. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-29 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Attachment: HDFS-11847.05.patch

Thanks [~xiaochen] for the review. Attached v05 patch to address the following. 
Please take a look at the latest patch.
1. HDFS-12969 is tracking the enhancements needed for the {{dfsAdmin 
-listOpenFiles}} command.
2. Restored the old API in the client packages.
3. {{FSN#getFilesBlockingDecom}} now returns a batched list honoring 
{{maxListOpenFilesResponses}}.
4. Restored the old reporting format.
5. Surprisingly, I don't see this change in the IDE. I was able to get this 
unnecessary change removed after a fresh pull.
Also updated the test case to cover the batch response for listing open files 
by type.

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch, 
> HDFS-11847.03.patch, HDFS-11847.04.patch, HDFS-11847.05.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-12969) DfsAdmin listOpenFiles should report files by type

2017-12-29 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12969:
-

 Summary: DfsAdmin listOpenFiles should report files by type
 Key: HDFS-12969
 URL: https://issues.apache.org/jira/browse/HDFS-12969
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.1.0
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


HDFS-11847 introduced a new {{-blockingDecommission}} option to the existing 
{{dfsadmin -listOpenFiles}} command. But the reporting done by the command 
doesn't differentiate the files based on type (like blocking decommission). In 
order to change the reporting style, the proto format used by the base command 
has to be updated to carry additional fields, which is better done in a new 
jira outside of HDFS-11847. This jira is to track the end-to-end enhancements 
needed for the dfsadmin -listOpenFiles console output.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2017-12-22 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12629:
--
Status: Patch Available  (was: Open)

> NameNode UI should report total blocks count by type - replicated and erasure 
> coded
> ---
>
> Key: HDFS-12629
> URL: https://issues.apache.org/jira/browse/HDFS-12629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12629.01.patch, 
> NN_UI_Summary_BlockCount_AfterFix.png, NN_UI_Summary_BlockCount_BeforeFix.png
>
>
> Currently the NameNode UI displays total files and directories and total 
> blocks in the cluster under the Summary tab. But the total block count split 
> by type is missing. It would be good if we could display total block counts 
> by type (provided by HDFS-12573) along with the total block count. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2017-12-22 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12629:
--
Attachment: HDFS-12629.01.patch

Attached v01 patch to report separate block stats -- Replicated blocks and 
Erasure Coded block groups in the NN UI Summary page.
[~eddyxu], can you please take a look at the patch?

> NameNode UI should report total blocks count by type - replicated and erasure 
> coded
> ---
>
> Key: HDFS-12629
> URL: https://issues.apache.org/jira/browse/HDFS-12629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12629.01.patch, 
> NN_UI_Summary_BlockCount_AfterFix.png, NN_UI_Summary_BlockCount_BeforeFix.png
>
>
> Currently the NameNode UI displays total files and directories and total 
> blocks in the cluster under the Summary tab. But the total block count split 
> by type is missing. It would be good if we could display total block counts 
> by type (provided by HDFS-12573) along with the total block count. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2017-12-22 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12629:
--
Attachment: NN_UI_Summary_BlockCount_AfterFix.png

> NameNode UI should report total blocks count by type - replicated and erasure 
> coded
> ---
>
> Key: HDFS-12629
> URL: https://issues.apache.org/jira/browse/HDFS-12629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12629.01.patch, 
> NN_UI_Summary_BlockCount_AfterFix.png, NN_UI_Summary_BlockCount_BeforeFix.png
>
>
> Currently the NameNode UI displays total files and directories and total 
> blocks in the cluster under the Summary tab. But the total block count split 
> by type is missing. It would be good if we could display total block counts 
> by type (provided by HDFS-12573) along with the total block count. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12959) Fix TestOpenFilesWithSnapshot redundant configurations

2017-12-21 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12959:
--
   Resolution: Fixed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

> Fix TestOpenFilesWithSnapshot redundant configurations
> --
>
> Key: HDFS-12959
> URL: https://issues.apache.org/jira/browse/HDFS-12959
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Minor
> Fix For: 3.1.0
>
> Attachments: HDFS-12959.01.patch
>
>
> Fix the redundant configurations that are set in 
> {{TestOpenFilesWithSnapshot#testPointInTimeSnapshotCopiesForOpenFiles}} and 
> {{TestOpenFilesWithSnapshot#testOpenFilesSnapChecksumWithTrunkAndAppend}}. 
> These redundant configurations give the impression that they are needed for 
> the tests to pass, but in fact they are not. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12959) Fix TestOpenFilesWithSnapshot redundant configurations

2017-12-21 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300734#comment-16300734
 ] 

Manoj Govindassamy commented on HDFS-12959:
---

Thanks for the review [~eddyxu]. The test failure is not related to the patch. 
Pushed the changes to trunk.

> Fix TestOpenFilesWithSnapshot redundant configurations
> --
>
> Key: HDFS-12959
> URL: https://issues.apache.org/jira/browse/HDFS-12959
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Minor
> Attachments: HDFS-12959.01.patch
>
>
> Fix the redundant configurations that are set in 
> {{TestOpenFilesWithSnapshot#testPointInTimeSnapshotCopiesForOpenFiles}} and 
> {{TestOpenFilesWithSnapshot#testOpenFilesSnapChecksumWithTrunkAndAppend}}. 
> These redundant configurations give the impression that they are needed for 
> the tests to pass, but in fact they are not. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12959) Fix TestOpenFilesWithSnapshot redundant configurations

2017-12-21 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12959:
--
Status: Patch Available  (was: Open)

> Fix TestOpenFilesWithSnapshot redundant configurations
> --
>
> Key: HDFS-12959
> URL: https://issues.apache.org/jira/browse/HDFS-12959
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Minor
> Attachments: HDFS-12959.01.patch
>
>
> Fix the redundant configurations that are set in 
> {{TestOpenFilesWithSnapshot#testPointInTimeSnapshotCopiesForOpenFiles}} and 
> {{TestOpenFilesWithSnapshot#testOpenFilesSnapChecksumWithTrunkAndAppend}}. 
> These redundant configurations give the impression that they are needed for 
> the tests to pass, but in fact they are not. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12959) Fix TestOpenFilesWithSnapshot redundant configurations

2017-12-21 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12959:
--
Attachment: HDFS-12959.01.patch

Attached v01 patch to remove the redundant configurations in 
TestOpenFilesWithSnapshot.
[~eddyxu], can you please take a look at the patch?

> Fix TestOpenFilesWithSnapshot redundant configurations
> --
>
> Key: HDFS-12959
> URL: https://issues.apache.org/jira/browse/HDFS-12959
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Minor
> Attachments: HDFS-12959.01.patch
>
>
> Fix the redundant configurations that are set in 
> {{TestOpenFilesWithSnapshot#testPointInTimeSnapshotCopiesForOpenFiles}} and 
> {{TestOpenFilesWithSnapshot#testOpenFilesSnapChecksumWithTrunkAndAppend}}. 
> These redundant configurations give the impression that they are needed for 
> the tests to pass, but in fact they are not. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-12959) Fix TestOpenFilesWithSnapshot redundant configurations

2017-12-21 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12959:
-

 Summary: Fix TestOpenFilesWithSnapshot redundant configurations
 Key: HDFS-12959
 URL: https://issues.apache.org/jira/browse/HDFS-12959
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.0.0
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy
Priority: Minor


Fix the redundant configurations that are set in 
{{TestOpenFilesWithSnapshot#testPointInTimeSnapshotCopiesForOpenFiles}} and 
{{TestOpenFilesWithSnapshot#testOpenFilesSnapChecksumWithTrunkAndAppend}}. 
These redundant configurations give the impression that they are needed for the 
tests to pass, but in fact they are not. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-21 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Attachment: HDFS-11847.04.patch

Attached v04 patch to address the TestAnnotations failure in the previous test 
run. Other test failures are not related to the patch. 

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch, 
> HDFS-11847.03.patch, HDFS-11847.04.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12953) XORRawDecoder.doDecode throws NullPointerException

2017-12-21 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy reassigned HDFS-12953:
-

Assignee: Manoj Govindassamy  (was: Lei (Eddy) Xu)

> XORRawDecoder.doDecode throws NullPointerException
> --
>
> Key: HDFS-12953
> URL: https://issues.apache.org/jira/browse/HDFS-12953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Lei (Eddy) Xu
>Assignee: Manoj Govindassamy
>
> Thanks [~danielpol] for the report on HDFS-12860.
> {noformat}
> 17/11/30 04:19:55 INFO mapreduce.Job: map 0% reduce 0%
> 17/11/30 04:20:01 INFO mapreduce.Job: Task Id : 
> attempt_1512036058655_0003_m_02_0, Status : FAILED
> Error: java.lang.NullPointerException
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.XORRawDecoder.doDecode(XORRawDecoder.java:83)
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:106)
> at 
> org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:170)
> at 
> org.apache.hadoop.hdfs.StripeReader.decodeAndFillBuffer(StripeReader.java:423)
> at 
> org.apache.hadoop.hdfs.StatefulStripeReader.decode(StatefulStripeReader.java:94)
> at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:382)
> at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:318)
> at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813)
> at java.io.DataInputStream.read(DataInputStream.java:149)
> at 
> org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:563)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:794)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-20 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Attachment: HDFS-11847.03.patch

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch, 
> HDFS-11847.03.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-20 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Attachment: (was: HDFS-11847.03.patch)

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-20 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Attachment: HDFS-11847.03.patch

Thanks for the detailed review [~xiaochen]. Attached v03 patch to address the 
following. Please take a look. A caller-side sketch of the new API shape from 
(1) follows after this list.
1. Deprecated the old API and added a new one which accepts an additional type 
argument to filter the result by.
2. Updated {{FSN#listOpenFiles}} to check for the {{ALL_OPEN_FILES}} type first 
and then for the combination filtering option. But the result set and the 
reporting don't differentiate the entries by type. For this, we need to add 
the type to the {{OpenFileEntry}}. Will do this.
3. About printing DataNode details in the results, I am planning to take this 
enhancement along with the pending item in (2) in a separate jira if you are 
ok. I need to change the proto, the handling of the 
4. Yes, it is better to return as many results as possible. Made 
{{DatanodeAdminManager#processBlocksInternal}} log a warning message on 
unexpected open files and continue to the next one.
5. In {{DatanodeAdminManager#processBlocksInternal}}, the computation is at the 
DataNode level. There can be multiple blocks across DNs for the same file, and 
the full count needs to be tracked for JMX reporting purposes. So, retaining 
the existing lowRedundancyBlocksInOpenFiles field. When I removed this field 
and piggybacked on {{lowRedundancyOpenFiles.size()}}, the actual count was 
lower than expected for a few tests.
6. In {{LeavingServiceStatus}}, both members are needed due to (5).

7. Updated the comment for the class {{LeavingServiceStatus}}.
8. {{FSN#getFilesBlockingDecom}}: added hasReadLock().
9. {{TestDecommission#verifyOpenFilesBlockingDecommission}}: the PrintStream is 
now copied before the exchange and restored to the old one. Good find.
10. The {{DFS_NAMENODE_REDUNDANCY_INTERVAL_SECONDS_KEY}} value is actually in 
seconds, so it is 1000 seconds and not 1 sec. Anyway, updated this to the max 
value.
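
Here is the caller-side sketch mentioned above, assuming the {{HdfsAdmin}} 
wrapper exposes the type-filtered listing; the exact class and enum names 
follow the committed patch, so treat them as assumptions here.

{code:java}
import java.net.URI;
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.protocol.OpenFileEntry;
import org.apache.hadoop.hdfs.protocol.OpenFilesIterator.OpenFilesType;

public class ListFilesBlockingDecom {
  public static void main(String[] args) throws Exception {
    HdfsAdmin admin =
        new HdfsAdmin(URI.create("hdfs://namenode:8020"), new Configuration());
    // Ask only for open files that are blocking DataNode decommissioning;
    // results come back in batches behind the RemoteIterator, per point (3)
    // in the v05 notes above.
    RemoteIterator<OpenFileEntry> it =
        admin.listOpenFiles(EnumSet.of(OpenFilesType.BLOCKING_DECOMMISSION));
    while (it.hasNext()) {
      System.out.println(it.next().getFilePath());
    }
  }
}
{code}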


> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch, 
> HDFS-11847.03.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-18 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Attachment: HDFS-11847.02.patch

Attached v02 patch with more unit tests added.

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-13 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Status: Patch Available  (was: Open)

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11847) Enhance dfsadmin listOpenFiles command to list files blocking datanode decommissioning

2017-12-13 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--
Attachment: HDFS-11847.01.patch

Attached v01 patch to address the following:
1. Ability to query for open files of interested types, like 
BLOCKING_DECOMMISSION, ALL, etc.
2. A new method FSNamesystem#getFilesBlockingDecom() to get the list of all 
open files blocking decommission.
3. Basic tests. More sophisticated tests pending.

> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --
>
> Key: HDFS-11847
> URL: https://issues.apache.org/jira/browse/HDFS-11847
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-11847.01.patch
>
>
> HDFS-10480 added a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. With thousand+ node clusters, where 
> machines may be added and removed regularly for maintenance, any option to 
> monitor and debug decommissioning status is very helpful. The proposal here 
> is to add suboptions to {{listOpenFiles}} for the above case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-12918) NameNode fails to start after upgrade - Missing state in ECPolicy Proto

2017-12-12 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288591#comment-16288591
 ] 

Manoj Govindassamy edited comment on HDFS-12918 at 12/13/17 5:46 AM:
-

We have an upgrade-incompatible fix landed in 3.0 at 
e565b5277d5b890dad107fe85e295a3907e4bfc1. The fix is necessary, and it verifies 
the EC policy state when loading the FSImage. This issue has nothing to do with 
the default value for the ECPolicyState field in the ErasureCodingPolicyProto. 
While the ECPolicyState field is optional in the ECPolicyProto message for 
over-the-wire communication, it is mandatory in the FSImage for EC files. I 
hope the upgrade-incompatible changes before the C6 GA are ok. Please let me 
know if you have other thoughts. 


was (Author: manojg):
We have an upgrade-incompatible fix landed in C6 at 
e565b5277d5b890dad107fe85e295a3907e4bfc1. The fix is necessary, and it verifies 
the EC policy state when loading the FSImage. This issue has nothing to do with 
the default value for the ECPolicyState field in the ErasureCodingPolicyProto. 
While the ECPolicyState field is optional in the ECPolicyProto message for 
over-the-wire communication, it is mandatory in the FSImage for EC files. I 
hope the upgrade-incompatible changes before the C6 GA are ok. Please let me 
know if you have other thoughts. 

> NameNode fails to start after upgrade - Missing state in ECPolicy Proto 
> 
>
> Key: HDFS-12918
> URL: https://issues.apache.org/jira/browse/HDFS-12918
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Zach Amsden
>Assignee: Manoj Govindassamy
>Priority: Critical
>
> According to documentation and code comments, the default setting for erasure 
> coding policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the 
> protobuf to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
> optional string name = 1;
> optional ECSchemaProto schema = 2;
> optional uint32 cellSize = 3;
> required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional; it must always be 
> included, and existing serialized data without this optional field will be 
> incorrectly interpreted as having erasure coding enabled.
> This unnecessarily breaks compatibility and will require existing HDFS 
> installations that store metadata in protobufs to be reformatted.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-12918) NameNode fails to start after upgrade - Missing state in ECPolicy Proto

2017-12-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy resolved HDFS-12918.
---
Resolution: Won't Fix

We have an upgrade-incompatible fix landed in C6 at 
e565b5277d5b890dad107fe85e295a3907e4bfc1. The fix is necessary, and it verifies 
the EC policy state when loading the FSImage. This issue has nothing to do with 
the default value for the ECPolicyState field in the ErasureCodingPolicyProto. 
While the ECPolicyState field is optional in the ECPolicyProto message for 
over-the-wire communication, it is mandatory in the FSImage for EC files. I 
hope the upgrade-incompatible changes before the C6 GA are ok. Please let me 
know if you have other thoughts. 

> NameNode fails to start after upgrade - Missing state in ECPolicy Proto 
> 
>
> Key: HDFS-12918
> URL: https://issues.apache.org/jira/browse/HDFS-12918
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Zach Amsden
>Assignee: Manoj Govindassamy
>Priority: Critical
>
> According to documentation and code comments, the default setting for erasure 
> coding policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the 
> protobuf to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
> optional string name = 1;
> optional ECSchemaProto schema = 2;
> optional uint32 cellSize = 3;
> required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional; it must always be 
> included, and existing serialized data without this optional field will be 
> incorrectly interpreted as having erasure coding enabled.
> This unnecessarily breaks compatibility and will require existing HDFS 
> installations that store metadata in protobufs to be reformatted.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12918) NameNode fails to start after upgrade - Missing state in ECPolicy Proto

2017-12-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12918:
--
Affects Version/s: 3.0.0-beta1
  Component/s: hdfs

> NameNode fails to start after upgrade - Missing state in ECPolicy Proto 
> 
>
> Key: HDFS-12918
> URL: https://issues.apache.org/jira/browse/HDFS-12918
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Zach Amsden
>Assignee: Manoj Govindassamy
>Priority: Critical
>
> According to documentation and code comments, the default setting for erasure 
> coding policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the 
> protobuf to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
> optional string name = 1;
> optional ECSchemaProto schema = 2;
> optional uint32 cellSize = 3;
> required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional; it must always be 
> included, and existing serialized data without this optional field will be 
> incorrectly interpreted as having erasure coding enabled.
> This unnecessarily breaks compatibility and will require existing HDFS 
> installations that store metadata in protobufs to be reformatted.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12918) NameNode fails to start after upgrade - Missing state in ECPolicy Proto

2017-12-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12918:
--
Summary: NameNode fails to start after upgrade - Missing state in ECPolicy 
Proto   (was: EC Policy defaults incorrectly to enabled in protobufs)

> NameNode fails to start after upgrade - Missing state in ECPolicy Proto 
> 
>
> Key: HDFS-12918
> URL: https://issues.apache.org/jira/browse/HDFS-12918
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zach Amsden
>Assignee: Manoj Govindassamy
>Priority: Critical
>
> According to documentation and code comments, the default setting for erasure 
> coding policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the 
> protobuf to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
> optional string name = 1;
> optional ECSchemaProto schema = 2;
> optional uint32 cellSize = 3;
> required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional; it must always be 
> included, and existing serialized data without this optional field will be 
> incorrectly interpreted as having erasure coding enabled.
> This unnecessarily breaks compatibility and will require existing HDFS 
> installations that store metadata in protobufs to be reformatted.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12918) EC Policy defaults incorrectly to enabled in protobufs

2017-12-12 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288521#comment-16288521
 ] 

Manoj Govindassamy commented on HDFS-12918:
---


A new check added in the convert does not seem to be backward compatible. It is 
going to break the upgrade from the previous image format, where the 
ErasureCodingPolicyProto didn't have the state field. It is supposed to be an 
optional field, and the check below needs to be relaxed as well. [~xiaochen], 
your thoughts please?

{noformat}
  /**
   * Convert the protobuf to a {@link ErasureCodingPolicyInfo}. This should only
   * be needed when the caller is interested in the state of the policy.
   */
  public static ErasureCodingPolicyInfo convertErasureCodingPolicyInfo(
  ErasureCodingPolicyProto proto) {
ErasureCodingPolicy policy = convertErasureCodingPolicy(proto);
ErasureCodingPolicyInfo info = new ErasureCodingPolicyInfo(policy);
Preconditions.checkArgument(proto.hasState(),<==
"Missing state field in ErasureCodingPolicy proto");
info.setState(convertECState(proto.getState()));
return info;
  }
{noformat}
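
One possible relaxation is sketched below: fall back to the documented default 
of {{DISABLED}} when the proto carries no state. This is only an illustration 
of the idea, not a committed patch:
{code:java}
// Sketch: accept older fsimages whose ErasureCodingPolicyProto predates the
// state field by defaulting to DISABLED instead of failing the precondition.
public static ErasureCodingPolicyInfo convertErasureCodingPolicyInfo(
    ErasureCodingPolicyProto proto) {
  ErasureCodingPolicy policy = convertErasureCodingPolicy(proto);
  ErasureCodingPolicyInfo info = new ErasureCodingPolicyInfo(policy);
  if (proto.hasState()) {
    info.setState(convertECState(proto.getState()));
  } else {
    // Pre-state protos: treat them as the documented default state.
    info.setState(ErasureCodingPolicyState.DISABLED);
  }
  return info;
}
{code}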


> EC Policy defaults incorrectly to enabled in protobufs
> --
>
> Key: HDFS-12918
> URL: https://issues.apache.org/jira/browse/HDFS-12918
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zach Amsden
>Assignee: Manoj Govindassamy
>Priority: Critical
>
> According to documentation and code comments, the default setting for erasure 
> coding policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the 
> protobuf to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
> optional string name = 1;
> optional ECSchemaProto schema = 2;
> optional uint32 cellSize = 3;
> required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional; it must always be 
> included, and existing serialized data without this optional field will be 
> incorrectly interpreted as having erasure coding enabled.
> This unnecessarily breaks compatibility and will force existing HDFS 
> installations that store metadata in protobufs to reformat.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-5926) Documentation should clarify dfs.datanode.du.reserved impact from reserved disk capacity

2017-12-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-5926:
-
Summary: Documentation should clarify dfs.datanode.du.reserved impact from 
reserved disk capacity  (was: documation should clarify 
dfs.datanode.du.reserved wrt reserved disk capacity)

> Documentation should clarify dfs.datanode.du.reserved impact from reserved 
> disk capacity
> 
>
> Key: HDFS-5926
> URL: https://issues.apache.org/jira/browse/HDFS-5926
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 0.20.2
>Reporter: Alexander Fahlke
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-5926-1.patch
>
>
> I'm using hadoop-0.20.2 on Debian Squeeze and ran into the same confusion as 
> many others with the parameter for dfs.datanode.du.reserved. One day some 
> data nodes got out of disk errors although there was space left on the disks.
> The following values are rounded to make the problem more clear:
> - the disk for the DFS data has 1000GB and only one Partition (ext3) for DFS 
> data
> - you plan to set the dfs.datanode.du.reserved to 20GB
> - the reserved-blocks-percentage set by tune2fs is 5% (the default)
> That gives all users, except root, 5% less capacity than they can use, 
> although the system reports the total of 1000GB as usable for all users via 
> df. The hadoop daemons are not running as root.
> If I read it right, then hadoop gets the free capacity via df.
>  
> Starting in 
> {{/src/hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java}} on line 
> 350: {{return usage.getCapacity()-reserved;}}
> going to {{/src/core/org/apache/hadoop/fs/DF.java}} which says:
> {{"Filesystem disk space usage statistics. Uses the unix 'df' program"}}
> When you have 5% reserved by tune2fs (in our case 50GB) and you give 
> dfs.datanode.du.reserved only 20GB, then you can possibly run into 
> out-of-disk errors that hadoop can't handle.
> In this case you must add the planned 20GB of du reserved space to the 
> capacity reserved by tune2fs. This results in (at least) 70GB for 
> dfs.datanode.du.reserved in my case.
> Two ideas:
> # The documentation must be clear at this point to avoid this problem.
> # Hadoop could check for reserved space by tune2fs (or other tools) and add 
> this value to the dfs.datanode.du.reserved parameter.
> This ticket is a follow up from the Mailinglist: 
> https://mail-archives.apache.org/mod_mbox/hadoop-common-user/201312.mbox/%3CCAHodO=Kbv=13T=2otz+s8nsodbs1icnzqyxt_0wdfxy5gks...@mail.gmail.com%3E
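
For illustration, a minimal sketch of the capacity math described above; the 
constants and class name are made up for the example:
{code:java}
// Illustrative only: interplay of the tune2fs reserve and
// dfs.datanode.du.reserved on a 1000GB ext3 partition.
public class DuReservedExample {
  public static void main(String[] args) {
    long capacityGb = 1000;                        // df-reported capacity
    long tune2fsReserveGb = capacityGb * 5 / 100;  // 5% root-only reserve = 50GB
    long plannedDuReservedGb = 20;                 // intended du.reserved

    // Non-root daemons can never touch the tune2fs reserve, so to really
    // keep 20GB free the setting must cover both reserves: 50 + 20 = 70GB.
    long effectiveDuReservedGb = tune2fsReserveGb + plannedDuReservedGb;
    System.out.println("dfs.datanode.du.reserved should be at least "
        + effectiveDuReservedGb + "GB");
  }
}
{code}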



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12918) EC Policy defaults incorrectly to enabled in protobufs

2017-12-12 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288378#comment-16288378
 ] 

Manoj Govindassamy commented on HDFS-12918:
---

[~zamsden],
  There is an addendum patch, HDFS-12682, after HDFS-12258 that makes the 
policy immutable by pulling the EC state into {{ErasureCodingPolicyInfo}}. As 
you pointed out, the {{hdfs.proto}} default value looks wrong to me as well. 
But in the PBHelperClient code there is explicit handling for this, both while 
saving the ECPolicy and while retrieving it. So, an ECPolicyInfo saved to and 
retrieved from the FSImage should be correct. 

{{PBHelperClient}}
{noformat}
  /**
   * Convert the protobuf to a {@link ErasureCodingPolicyInfo}. This should only
   * be needed when the caller is interested in the state of the policy.
   */
  public static ErasureCodingPolicyInfo convertErasureCodingPolicyInfo(
  ErasureCodingPolicyProto proto) {
ErasureCodingPolicy policy = convertErasureCodingPolicy(proto);
ErasureCodingPolicyInfo info = new ErasureCodingPolicyInfo(policy);
Preconditions.checkArgument(proto.hasState(),
"Missing state field in ErasureCodingPolicy proto");
info.setState(convertECState(proto.getState()));  <
return info;
  }

  /**
   * Convert a {@link ErasureCodingPolicyInfo} to protobuf.
   * The protobuf will have the policy, and state. State is relevant when:
   * 1. Persisting a policy to fsimage
   * 2. Returning the policy to the RPC call
   * {@link DistributedFileSystem#getAllErasureCodingPolicies()}
   */
  public static ErasureCodingPolicyProto convertErasureCodingPolicy(
  ErasureCodingPolicyInfo info) {
final ErasureCodingPolicyProto.Builder builder =
createECPolicyProtoBuilder(info.getPolicy());
builder.setState(convertECState(info.getState()));  <===
return builder.build();
  }

{noformat}

Listing Policies:
{noformat}
$ hdfs ec -listPolicies
Erasure Coding Policies:
ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=ENABLED
ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
CellSize=1048576, Id=3], State=DISABLED
ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=DISABLED
{noformat}

But there is another version of {{convertErasureCodingPolicy}} which takes in 
only an {{ErasureCodingPolicy}}, where the state is missing, so the default 
state from {{ErasureCodingPolicyProto}} will be used.

{noformat}
  /**
   * Convert a {@link ErasureCodingPolicy} to protobuf.
   * This means no state of the policy will be set on the protobuf.
   */
  public static ErasureCodingPolicyProto convertErasureCodingPolicy(
  ErasureCodingPolicy policy) {
return createECPolicyProtoBuilder(policy).build();
  }
{noformat}

You are probably seeing the default value of the EC state from the callers 
(like ListStatus, BlockRecovery, BlockGroupChecksum, etc.) of the above convert 
util. Can you please confirm where you are seeing the inconsistent EC state? 
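
For illustration, a quick sketch of the default-value behavior, assuming the 
generated protobuf classes in {{HdfsProtos}} (the id value here is made up):
{code:java}
import org.apache.hadoop.hdfs.protocol.proto.HdfsProtos.ErasureCodingPolicyProto;

// Sketch: a proto built without an explicit state still answers getState()
// with the declared proto default, ENABLED, even though hasState() is false.
// This is what callers of the state-less converter would observe.
public class EcStateDefaultDemo {
  public static void main(String[] args) {
    ErasureCodingPolicyProto proto = ErasureCodingPolicyProto.newBuilder()
        .setId(5)  // required field; illustrative value
        .build();
    System.out.println(proto.hasState() + " " + proto.getState());
    // prints: false ENABLED
  }
}
{code}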

> EC Policy defaults incorrectly to enabled in protobufs
> --
>
> Key: HDFS-12918
> URL: https://issues.apache.org/jira/browse/HDFS-12918
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zach Amsden
>Assignee: Manoj Govindassamy
>Priority: Critical
>
> According to documentation and code comments, the default setting for erasure 
> coding policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the 
> protobuf to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
> optional string name = 1;
> optional ECSchemaProto schema = 2;
> optional uint32 cellSize = 3;
> required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional; it must always be 
> included, and existing serialized data without this optional field will be 
> incorrectly interpreted as having erasure coding enabled.
> This unnecessarily breaks compatibility and will force existing HDFS 
> installations that store metadata in protobufs to reformat.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA

[jira] [Assigned] (HDFS-12918) EC Policy defaults incorrectly to enabled in protobufs

2017-12-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy reassigned HDFS-12918:
-

Assignee: Manoj Govindassamy

I can take a look at this if you haven't already started to work on the patch. 
Please let me know.

> EC Policy defaults incorrectly to enabled in protobufs
> --
>
> Key: HDFS-12918
> URL: https://issues.apache.org/jira/browse/HDFS-12918
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zach Amsden
>Assignee: Manoj Govindassamy
>Priority: Critical
>
> According to documentation and code comments, the default setting for erasure 
> coding policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the 
> protobuf to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
> optional string name = 1;
> optional ECSchemaProto schema = 2;
> optional uint32 cellSize = 3;
> required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional; it must always be 
> included, and existing serialized data without this optional field will be 
> incorrectly interpreted as having erasure coding enabled.
> This unnecessarily breaks compatibility and will force existing HDFS 
> installations that store metadata in protobufs to reformat.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12855) Fsck violates namesystem locking

2017-12-08 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy reassigned HDFS-12855:
-

Assignee: Manoj Govindassamy

> Fsck violates namesystem locking 
> -
>
> Key: HDFS-12855
> URL: https://issues.apache.org/jira/browse/HDFS-12855
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.4
>Reporter: Konstantin Shvachko
>Assignee: Manoj Govindassamy
>
> {{NamenodeFsck}} accesses {{FSNamesystem}} structures, such as INodes and 
> BlockInfo, without holding a lock. See e.g. {{NamenodeFsck.blockIdCK()}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12825) Fsck report shows config key name for min replication issues

2017-12-08 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284172#comment-16284172
 ] 

Manoj Govindassamy commented on HDFS-12825:
---

Thanks for the contribution [~gabor.bota]. Committed to trunk.

> Fsck report shows config key name for min replication issues
> 
>
> Key: HDFS-12825
> URL: https://issues.apache.org/jira/browse/HDFS-12825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Fix For: 3.1.0
>
> Attachments: HDFS-12825.001.patch, error.JPG
>
>
> Scenario:
> Corrupt the block on any datanode.
> Take the *FSCK* report for that file.
> Actual Output:
> ==
> The fsck report prints the raw configuration key 
> {{dfs.namenode.replication.min}}
> Expected Output:
> 
> It should be {{MINIMAL BLOCK REPLICATION}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12825) Fsck report shows config key name for min replication issues

2017-12-08 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12825:
--
   Resolution: Fixed
 Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

> Fsck report shows config key name for min replication issues
> 
>
> Key: HDFS-12825
> URL: https://issues.apache.org/jira/browse/HDFS-12825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Fix For: 3.1.0
>
> Attachments: HDFS-12825.001.patch, error.JPG
>
>
> Scenario:
> Corrupt the block on any datanode.
> Take the *FSCK* report for that file.
> Actual Output:
> ==
> The fsck report prints the raw configuration key 
> {{dfs.namenode.replication.min}}
> Expected Output:
> 
> It should be {{MINIMAL BLOCK REPLICATION}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12825) Fsck report shows config key name for min replication issues

2017-12-07 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12825:
--
Summary: Fsck report shows config key name for min replication issues  
(was: After Block Corrupted, FSCK Report printing the Direct configuration.  )

> Fsck report shows config key name for min replication issues
> 
>
> Key: HDFS-12825
> URL: https://issues.apache.org/jira/browse/HDFS-12825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-12825.001.patch, error.JPG
>
>
> Scenario:
> Corrupt the block on any datanode.
> Take the *FSCK* report for that file.
> Actual Output:
> ==
> The fsck report prints the raw configuration key 
> {{dfs.namenode.replication.min}}
> Expected Output:
> 
> It should be {{MINIMAL BLOCK REPLICATION}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12825) Fsck report shows config key name for min replication issues

2017-12-07 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12825:
--
Labels: incompatibleChange newbie  (was: newbie)

> Fsck report shows config key name for min replication issues
> 
>
> Key: HDFS-12825
> URL: https://issues.apache.org/jira/browse/HDFS-12825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: incompatibleChange, newbie
> Attachments: HDFS-12825.001.patch, error.JPG
>
>
> Scenario:
> Corrupt the block on any datanode.
> Take the *FSCK* report for that file.
> Actual Output:
> ==
> The fsck report prints the raw configuration key 
> {{dfs.namenode.replication.min}}
> Expected Output:
> 
> It should be {{MINIMAL BLOCK REPLICATION}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12825) Fsck report shows config key name for min replication issues

2017-12-07 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12825:
--
  Labels: newbie  (was: incompatibleChange newbie)
Hadoop Flags: Incompatible change

> Fsck report shows config key name for min replication issues
> 
>
> Key: HDFS-12825
> URL: https://issues.apache.org/jira/browse/HDFS-12825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-12825.001.patch, error.JPG
>
>
> Scenario:
> Corrupt the block on any datanode.
> Take the *FSCK* report for that file.
> Actual Output:
> ==
> The fsck report prints the raw configuration key 
> {{dfs.namenode.replication.min}}
> Expected Output:
> 
> It should be {{MINIMAL BLOCK REPLICATION}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12825) After Block Corrupted, FSCK Report printing the Direct configuration.

2017-12-07 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282239#comment-16282239
 ] 

Manoj Govindassamy commented on HDFS-12825:
---

Patch looks good to me. +1. Thanks for working on this [~gabor.bota] and thanks 
for reporting [~Harsha1206], [~usharani].
[~gabor.bota], I would prefer labelling this jira as Incompatible change since 
it changes the fsck output format.

> After Block Corrupted, FSCK Report printing the Direct configuration.  
> ---
>
> Key: HDFS-12825
> URL: https://issues.apache.org/jira/browse/HDFS-12825
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>Assignee: Gabor Bota
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-12825.001.patch, error.JPG
>
>
> Scenario:
> Corrupt the block on any datanode.
> Take the *FSCK* report for that file.
> Actual Output:
> ==
> The fsck report prints the raw configuration key 
> {{dfs.namenode.replication.min}}
> Expected Output:
> 
> It should be {{MINIMAL BLOCK REPLICATION}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12855) Fsck violates namesystem locking

2017-11-28 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269303#comment-16269303
 ] 

Manoj Govindassamy commented on HDFS-12855:
---

The latest trunk code is also similar to the 2.7.x line. 
{{NamenodeFsck.blockIdCK()}} works on the BlockManager and FSNamesystem layers 
directly without holding namesystem locks. One race I can think of is fsck with 
the block id option running in parallel with the deletion of a file which 
contains the same block. Since the BlockInfo is obtained without holding a 
lock, the file could get deleted, the later INode retrieval could return null, 
and we could hit an NPE when accessing INode members. Haven't proved this with 
a test yet though.  

{noformat}
  public void blockIdCK(String blockId) {
...
try {
  //get blockInfo
  Block block = new Block(Block.getBlockId(blockId));
  //find which file this block belongs to
  BlockInfo blockInfo = blockManager.getStoredBlock(block);
  if(blockInfo == null) {
out.println("Block "+ blockId +" " + NONEXISTENT_STATUS);
LOG.warn("Block "+ blockId + " " + NONEXISTENT_STATUS);
return;
  }
  final INodeFile iNode = 
namenode.getNamesystem().getBlockCollection(blockInfo);
  NumberReplicas numberReplicas= blockManager.countNodes(blockInfo);
  out.println("Block Id: " + blockId);
  out.println("Block belongs to: "+iNode.getFullPathName());
  out.println("No. of Expected Replica: " +
  blockManager.getExpectedRedundancyNum(blockInfo));
{noformat}
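
A minimal sketch of one way to close the race, assuming the {{FSNamesystem}} 
read lock API ({{readLock()}}/{{readUnlock()}}) plus a null check on the 
resolved INode; illustrative only, not a tested patch:
{code:java}
// Sketch of the blockIdCK() body: resolve block -> INode under the namesystem
// read lock and null-check the INode so a concurrent delete is reported
// instead of causing an NPE.
namenode.getNamesystem().readLock();
try {
  BlockInfo blockInfo = blockManager.getStoredBlock(block);
  if (blockInfo == null) {
    out.println("Block " + blockId + " " + NONEXISTENT_STATUS);
    return;
  }
  final INodeFile iNode =
      namenode.getNamesystem().getBlockCollection(blockInfo);
  if (iNode == null) {
    // The file containing the block was deleted between the two lookups.
    out.println("Block " + blockId + " " + NONEXISTENT_STATUS);
    return;
  }
  out.println("Block belongs to: " + iNode.getFullPathName());
} finally {
  namenode.getNamesystem().readUnlock();
}
{code}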

> Fsck violates namesystem locking 
> -
>
> Key: HDFS-12855
> URL: https://issues.apache.org/jira/browse/HDFS-12855
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.4
>Reporter: Konstantin Shvachko
>
> {{NamenodeFsck}} accesses {{FSNamesystem}} structures, such as INodes and 
> BlockInfo, without holding a lock. See e.g. {{NamenodeFsck.blockIdCK()}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-11-20 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12730:
--
   Resolution: Fixed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Thanks for the review [~yzhangal] and [~hanishakoneru]. Committed it to trunk.

> Verify open files captured in the snapshots across config disable and enable
> 
>
> Key: HDFS-12730
> URL: https://issues.apache.org/jira/browse/HDFS-12730
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.1.0
>
> Attachments: HDFS-12730.01.patch, HDFS-12730.02.patch
>
>
> Open files captured in the snapshots have their meta data preserved based on 
> the config 
> _dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
> upgrade scenario or when the NameNode gets restarted with config turned on or 
> off,  the attributes of the open files captured in the snapshots are 
> influenced accordingly. Better to have a test case to verify open file 
> attributes across config turn on and off, and the current expected behavior 
> with HDFS-11402 so as to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-11-17 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257326#comment-16257326
 ] 

Manoj Govindassamy commented on HDFS-12730:
---

Test failures are not related to the patch. 

> Verify open files captured in the snapshots across config disable and enable
> 
>
> Key: HDFS-12730
> URL: https://issues.apache.org/jira/browse/HDFS-12730
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12730.01.patch, HDFS-12730.02.patch
>
>
> Open files captured in the snapshots have their meta data preserved based on 
> the config 
> _dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
> upgrade scenario or when the NameNode gets restarted with config turned on or 
> off,  the attributes of the open files captured in the snapshots are 
> influenced accordingly. Better to have a test case to verify open file 
> attributes across config turn on and off, and the current expected behavior 
> with HDFS-11402 so as to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12823) Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to branch-2.7

2017-11-16 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256324#comment-16256324
 ] 

Manoj Govindassamy commented on HDFS-12823:
---

v02 LGTM, +1. Thanks [~xkrogen].

> Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to 
> branch-2.7
> 
>
> Key: HDFS-12823
> URL: https://issues.apache.org/jira/browse/HDFS-12823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, hdfs-client
>Reporter: Erik Krogen
>Assignee: Erik Krogen
> Attachments: HDFS-12823-branch-2.7.000.patch, 
> HDFS-12823-branch-2.7.001.patch, HDFS-12823-branch-2.7.002.patch
>
>
> Given the pretty significant performance implications of HDFS-9259 (see 
> discussion in HDFS-10326) when doing transfers across high latency links, it 
> would be helpful to have this configurability exist in the 2.7 series. 
> Opening a new JIRA since the original HDFS-9259 has been closed for a while 
> and there are conflicts due to a few classes moving.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12823) Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to branch-2.7

2017-11-16 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256137#comment-16256137
 ] 

Manoj Govindassamy commented on HDFS-12823:
---

Thanks for the extra efforts [~xkrogen]. Much appreciated. +1, pending Jenkins. 
 

> Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to 
> branch-2.7
> 
>
> Key: HDFS-12823
> URL: https://issues.apache.org/jira/browse/HDFS-12823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, hdfs-client
>Reporter: Erik Krogen
>Assignee: Erik Krogen
> Attachments: HDFS-12823-branch-2.7.000.patch, 
> HDFS-12823-branch-2.7.001.patch
>
>
> Given the pretty significant performance implications of HDFS-9259 (see 
> discussion in HDFS-10326) when doing transfers across high latency links, it 
> would be helpful to have this configurability exist in the 2.7 series. 
> Opening a new JIRA since the original HDFS-9259 has been closed for a while 
> and there are conflicts due to a few classes moving.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12823) Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to branch-2.7

2017-11-16 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256043#comment-16256043
 ] 

Manoj Govindassamy commented on HDFS-12823:
---

[~xkrogen],
  Yes, it's not a good idea to introduce getters and setters for all those 50+ 
fields as part of this jira. Adding a getter for the newly added ones would be 
better though. Otherwise, the v0 patch LGTM, +1. Thanks for working on this.

> Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to 
> branch-2.7
> 
>
> Key: HDFS-12823
> URL: https://issues.apache.org/jira/browse/HDFS-12823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, hdfs-client
>Reporter: Erik Krogen
>Assignee: Erik Krogen
> Attachments: HDFS-12823-branch-2.7.000.patch
>
>
> Given the pretty significant performance implications of HDFS-9259 (see 
> discussion in HDFS-10326) when doing transfers across high latency links, it 
> would be helpful to have this configurability exist in the 2.7 series. 
> Opening a new JIRA since the original HDFS-9259 has been closed for a while 
> and there are conflicts due to a few classes moving.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12823) Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to branch-2.7

2017-11-16 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16255937#comment-16255937
 ] 

Manoj Govindassamy commented on HDFS-12823:
---

[~xkrogen],

Can we please make use of {{getSocketSendBufferSize()}} instead of directly 
referring to the member variable in the check below in {{DFSOutputStream}}?
{noformat}
if (client.getConf().socketSendBufferSize > 0) {
  sock.setSendBufferSize(client.getConf().socketSendBufferSize);
}
{noformat}
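
For reference, a quick sketch of the suggested form, assuming a 
{{getSocketSendBufferSize()}} getter is (or will be) available on the client 
conf object:
{code:java}
// Sketch of the review suggestion: go through the accessor instead of the
// field. Assumes a getSocketSendBufferSize() getter exists on the conf.
if (client.getConf().getSocketSendBufferSize() > 0) {
  sock.setSendBufferSize(client.getConf().getSocketSendBufferSize());
}
{code}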


> Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to 
> branch-2.7
> 
>
> Key: HDFS-12823
> URL: https://issues.apache.org/jira/browse/HDFS-12823
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, hdfs-client
>Reporter: Erik Krogen
>Assignee: Erik Krogen
> Attachments: HDFS-12823-branch-2.7.000.patch
>
>
> Given the pretty significant performance implications of HDFS-9259 (see 
> discussion in HDFS-10326) when doing transfers across high latency links, it 
> would be helpful to have this configurability exist in the 2.7 series. 
> Opening a new JIRA since the original HDFS-9259 has been closed for a while 
> and there are conflicts due to a few classes moving.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-11-16 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12730:
--
Attachment: HDFS-12730.02.patch

Attached v02 patch to address the comment.
-- added a case to verify the config switching from on to off and the effect 
on file lengths for the open files in the newly taken snapshots.
[~yzhangal], [~hanishakoneru], can you please take a look? 
  

> Verify open files captured in the snapshots across config disable and enable
> 
>
> Key: HDFS-12730
> URL: https://issues.apache.org/jira/browse/HDFS-12730
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12730.01.patch, HDFS-12730.02.patch
>
>
> Open files captured in the snapshots have their meta data preserved based on 
> the config 
> _dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
> upgrade scenario or when the NameNode gets restarted with config turned on or 
> off,  the attributes of the open files captured in the snapshots are 
> influenced accordingly. Better to have a test case to verify open file 
> attributes across config turn on and off, and the current expected behavior 
> with HDFS-11402 so as to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-11-15 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254642#comment-16254642
 ] 

Manoj Govindassamy edited comment on HDFS-12730 at 11/16/17 2:16 AM:
-

Thanks for the review [~hanishakoneru]. That's right: after the config change 
and a subsequent fresh metadata change, all the previously opened files will 
turn immutable. 
[~yzhangal], [~eddyxu], can you also please take a look at the patch?


was (Author: manojg):
Thanks for the review [~hanishakoneru]. Thats right, after the config change 
and after a fresh meta data change all the previously opened files will turn 
immutable. 
[~yzhangal], can you also please take a look at the patch?

> Verify open files captured in the snapshots across config disable and enable
> 
>
> Key: HDFS-12730
> URL: https://issues.apache.org/jira/browse/HDFS-12730
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12730.01.patch
>
>
> Open files captured in the snapshots have their meta data preserved based on 
> the config 
> _dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
> upgrade scenario or when the NameNode gets restarted with config turned on or 
> off,  the attributes of the open files captured in the snapshots are 
> influenced accordingly. Better to have a test case to verify open file 
> attributes across config turn on and off, and the current expected behavior 
> with HDFS-11402 so as to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-10-27 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12730:
--
Status: Patch Available  (was: Open)

> Verify open files captured in the snapshots across config disable and enable
> 
>
> Key: HDFS-12730
> URL: https://issues.apache.org/jira/browse/HDFS-12730
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12730.01.patch
>
>
> Open files captured in the snapshots have their meta data preserved based on 
> the config 
> _dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
> upgrade scenario or when the NameNode gets restarted with config turned on or 
> off,  the attributes of the open files captured in the snapshots are 
> influenced accordingly. Better to have a test case to verify open file 
> attributes across config turn on and off, and the current expected behavior 
> with HDFS-11402 so as to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-10-26 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12730:
--
Attachment: HDFS-12730.01.patch

Attached v01 patch to verify the attributes of the open files captured in 
snapshots with/without the config. [~yzhangal], can you please take a look at 
the patch?

> Verify open files captured in the snapshots across config disable and enable
> 
>
> Key: HDFS-12730
> URL: https://issues.apache.org/jira/browse/HDFS-12730
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12730.01.patch
>
>
> Open files captured in the snapshots have their meta data preserved based on 
> the config 
> _dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
> upgrade scenario or when the NameNode gets restarted with config turned on or 
> off,  the attributes of the open files captured in the snapshots are 
> influenced accordingly. Better to have a test case to verify open file 
> attributes across config turn on and off, and the current expected behavior 
> with HDFS-11402 so as to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-10-26 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12730:
--
Description: 
Open files captured in the snapshots have their meta data preserved based on 
the config 
_dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
upgrade scenario or when the NameNode gets restarted with config turned on or 
off,  the attributes of the open files captured in the snapshots are influenced 
accordingly. Better to have a test case to verify open file attributes across 
config turn on and off, and the current expected behavior with HDFS-11402 so as 
to catch any regressions in the future.

  was:
Open files captured in the snapshots have their meta data preserved based on 
the config 
_dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). It is possible 
for the NameNode to get restarted with config turned on or off and the 
attributes of the open files captured in the snapshots are influenced 
accordingly. Better to have a test case to verify open file attributes across 
config turn on and off, and the current expected behavior with HDFS-11402 so as 
to catch any regressions in the future.


> Verify open files captured in the snapshots across config disable and enable
> 
>
> Key: HDFS-12730
> URL: https://issues.apache.org/jira/browse/HDFS-12730
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> Open files captured in the snapshots have their meta data preserved based on 
> the config 
> _dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). During the 
> upgrade scenario or when the NameNode gets restarted with config turned on or 
> off,  the attributes of the open files captured in the snapshots are 
> influenced accordingly. Better to have a test case to verify open file 
> attributes across config turn on and off, and the current expected behavior 
> with HDFS-11402 so as to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-12730) Verify open files captured in the snapshots across config disable and enable

2017-10-26 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12730:
-

 Summary: Verify open files captured in the snapshots across config 
disable and enable
 Key: HDFS-12730
 URL: https://issues.apache.org/jira/browse/HDFS-12730
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs
Affects Versions: 3.0.0
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


Open files captured in the snapshots have their meta data preserved based on 
the config 
_dfs.namenode.snapshot.capture.openfiles_ (refer HDFS-11402). It is possible 
for the NameNode to get restarted with config turned on or off and the 
attributes of the open files captured in the snapshots are influenced 
accordingly. Better to have a test case to verify open file attributes across 
config turn on and off, and the current expected behavior with HDFS-11402 so as 
to catch any regressions in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-25 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219923#comment-16219923
 ] 

Manoj Govindassamy commented on HDFS-12544:
---

The build failure doesn't look related to the commit. 
{noformat}
[INFO] Apache Hadoop Cloud Storage Project  FAILURE [  3.193 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 17:45 min
[INFO] Finished at: 2017-10-26T01:22:04+00:00
[INFO] Final Memory: 440M/4669M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-deploy-plugin:2.8.1:deploy (default-deploy) on 
project hadoop-cloud-storage-project: Failed to retrieve remote metadata 
org.apache.hadoop:hadoop-main:3.1.0-SNAPSHOT/maven-metadata.xml: Could not 
transfer metadata 
org.apache.hadoop:hadoop-main:3.1.0-SNAPSHOT/maven-metadata.xml from/to 
apache.snapshots.https 
(https://repository.apache.org/content/repositories/snapshots): Failed to 
transfer file: 
https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-main/3.1.0-SNAPSHOT/maven-metadata.xml.
 Return code is: 503 , ReasonPhrase:Service Unavailable. -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-cloud-storage-project
Build step 'Execute shell' marked build as failure
[JIRA] Updating issue HDFS-12544
[JIRA] Updating issue HADOOP-14957
[JIRA] Updating issue HDFS-12579
[JIRA] Updating issue YARN-4827
[JIRA] Updating issue HADOOP-14840
ERROR: No tool found matching LATEST1_8_HOME
Setting MAVEN_3_3_3_HOME=/home/jenkins/tools/maven/apache-maven-3.3.3
Finished: FAILURE
{noformat}

> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.0.0
>
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch, HDFS-12544.04.patch, HDFS-12544.05.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any two 
> given snapshots under a snapshot root directory. The command today only 
> accepts the path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at the higher level directory but the diff 
> report needed is only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> huge, the snapshot diff report generation can take minutes even if we are 
> interested to know the changes only in a small directory. So, it would be 
> highly performant if the diff report calculation can be limited to only the 
> interesting sub-directory of the snapshot root instead of the whole snapshot 
> root.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-25 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12544:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.0.0
>
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch, HDFS-12544.04.patch, HDFS-12544.05.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any two 
> given snapshots under a snapshot root directory. The command today only 
> accepts the path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at the higher level directory but the diff 
> report needed is only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> huge, the snapshot diff report generation can take minutes even if we are 
> interested in knowing the changes only in a small directory. So, it would be 
> much more efficient if the diff report calculation could be limited to only 
> the interesting sub-directory of the snapshot root instead of the whole 
> snapshot root.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-25 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219224#comment-16219224
 ] 

Manoj Govindassamy commented on HDFS-12544:
---

Thanks for the review [~yzhangal]. Fixed the checkstyle issue and committed to 
trunk.
Filed HADOOP-14983 to track the DistCp enhancements to support snap root 
descendant directories.

> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch, HDFS-12544.04.patch, HDFS-12544.05.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any two 
> given snapshots under a snapshot root directory. The command today only 
> accepts the path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at the higher level directory but the diff 
> report needed is only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> huge, the snapshot diff report generation can take minutes even if we are 
> interested in knowing the changes only in a small directory. So, it would be 
> much more efficient if the diff report calculation could be limited to only 
> the interesting sub-directory of the snapshot root instead of the whole 
> snapshot root.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-24 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12544:
--
Attachment: HDFS-12544.05.patch

Attached v05 patch with test updated to cover more cases discussed in the 
previous comment.
[~yzhangal], can you please take a look at the latest patch?

> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch, HDFS-12544.04.patch, HDFS-12544.05.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any two 
> given snapshots under a snapshot root directory. The command today only 
> accepts the path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at the higher level directory but the diff 
> report needed is only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> huge, the snapshot diff report generation can take minutes even if we are 
> interested in knowing the changes only in a small directory. So, it would be 
> much more efficient if the diff report calculation could be limited to only 
> the interesting sub-directory of the snapshot root instead of the whole 
> snapshot root.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-24 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217386#comment-16217386
 ] 

Manoj Govindassamy commented on HDFS-12544:
---

[~yzhangal],
1. Yes, just like the files moved out of the scope directory show as 
"Deleted", the files moved in under a scope directory as part of renames will 
show as "Added".

2. The newly created directories/files are available in the current version. 
So, even these newly created dirs can be requested for the scoped diff. It's 
just that they are not part of any older snapshots, so we will get an empty 
diff list.

Will post a new patch revision with tests updated to cover the above cases. 
Thanks.

> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch, HDFS-12544.04.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any two 
> given snapshots under a snapshot root directory. The command today only 
> accepts the path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at the higher level directory but the diff 
> report needed is only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> huge, the snapshot diff report generation can take minutes even if we are 
> interested in knowing the changes only in a small directory. So, it would be 
> much more efficient if the diff report calculation could be limited to only 
> the interesting sub-directory of the snapshot root instead of the whole 
> snapshot root.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12653) Implement toArray() and subArray() for ReadOnlyList

2017-10-20 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12653:
--
Attachment: HDFS-12653.01.patch

Attached v01 patch to address the following:
1. Implemented {{ReadOnlyList#toArray()}} and {{ReadOnlyList#subArray()}} to 
return an array view of the backing list.
2. TestReadOnly - unit tests to verify various contracts in ReadOnlyList. 
{{ReadOnlyList#toArray()}} and {{ReadOnlyList#subArray()}} can be made use of 
when getting attributes from INodeAttributesProvider (HDFS-12652) and when 
working on the children list for a snapshot. Will follow up on these after 
completing this jira.
[~eddyxu], [~yzhangal], [~daryn], can you please take a look at the patch? 
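
For context, a minimal sketch of what view-style {{toArray()}}/{{subArray()}} 
methods could look like on a read-only wrapper; the class name and shape are 
illustrative, not the attached patch:
{code:java}
import java.util.List;

// Illustrative only: a read-only wrapper exposing array views of its backing
// list. The real ReadOnlyList lives in the HDFS util package; this sketch
// just shows the toArray()/subArray() idea.
public final class ReadOnlyListSketch<E> {
  private final List<E> backing;

  public ReadOnlyListSketch(List<E> backing) {
    this.backing = backing;
  }

  // Shallow array of the element references; the elements themselves are
  // not copied.
  public Object[] toArray() {
    return backing.toArray();
  }

  // Shallow array of the references in [fromIndex, toIndex), analogous to
  // List#subList(fromIndex, toIndex).
  public Object[] subArray(int fromIndex, int toIndex) {
    return backing.subList(fromIndex, toIndex).toArray();
  }
}
{code}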

> Implement toArray() and subArray() for ReadOnlyList
> ---
>
> Key: HDFS-12653
> URL: https://issues.apache.org/jira/browse/HDFS-12653
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12653.01.patch
>
>
> {{ReadOnlyList}} today gives an unmodifiable view of the backing List. This 
> list supports the following util methods for easy construction of read-only 
> views of any given list. 
> {noformat}
> public static <E> ReadOnlyList<E> asReadOnlyList(final List<E> list)
> public static <E> List<E> asList(final ReadOnlyList<E> list)
> {noformat}
> {{asList}} above additionally overrides {{Object[] toArray()}} of the 
> {{java.util.List}} interface. Unlike {{java.util.List}}, the above one 
> returns an array of Objects referring to the backing list and avoids any 
> copying of objects. Given that we have many usages of read-only lists:
> 1. Let's have a light-weight / shared-view {{toArray()}} implementation for 
> {{ReadOnlyList}} as well. 
> 2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, 
> let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}}.






[jira] [Updated] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-19 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12544:
--
Attachment: HDFS-12544.04.patch

Attached v04 patch to address the following:
1. Handled the file rename/move case for the snapshot scope directory.
2. New unit test for the file rename.
3. Added more comments in the test and snapshot manager.
4. Fixed typos pointed out by Yongjun in the previous comment.
[~yzhangal], can you please take a look at the patch?


> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch, HDFS-12544.04.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any 
> two given snapshots under a snapshot root directory. The command today only 
> accepts a path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at a higher-level directory but the diff 
> report is needed only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> large, snapshot diff report generation can take minutes even if we are only 
> interested in the changes in a small directory. So, it would greatly improve 
> performance if the diff report calculation could be limited to just the 
> interesting sub-directory of the snapshot root instead of the whole snapshot 
> root.






[jira] [Commented] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-19 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211876#comment-16211876
 ] 

Manoj Govindassamy commented on HDFS-12544:
---

Thanks for the review comments [~yzhangal]. Good discussion on the file rename 
behavior w.r.t. snapshot diff for a descendant directory. That's right, the 
renamed files still show up in the diff report as "R" entries even though they 
are moved out of the scope (descendant) directory. To get the same behavior as 
the normal snapshot diff report, renamed files whose target is not under the 
scoped directory should be shown as "D" (deleted) entries in the report. Will 
post a new patch to handle this case; a rough sketch of the idea follows below.
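A rough sketch of that mapping, for illustration only (the helper below is 
hypothetical, not the actual patch; {{DiffReportEntry}} and {{DiffType}} are 
the existing types nested in {{SnapshotDiffReport}}):
{noformat}
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffReportEntry;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffType;

// Hypothetical post-processing step: a rename whose target falls outside
// the diff scope directory is reported as a delete of the source path.
static DiffReportEntry scopeEntry(DiffReportEntry entry, String scopePrefix) {
  if (entry.getType() == DiffType.RENAME && entry.getTargetPath() != null) {
    String target = DFSUtil.bytes2String(entry.getTargetPath());
    if (!target.startsWith(scopePrefix)) {
      // Rename target left the scoped directory: surface it as "D".
      return new DiffReportEntry(DiffType.DELETE, entry.getSourcePath());
    }
  }
  return entry;
}
{noformat}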

> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any 
> two given snapshots under a snapshot root directory. The command today only 
> accepts a path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at a higher-level directory but the diff 
> report is needed only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> large, snapshot diff report generation can take minutes even if we are only 
> interested in the changes in a small directory. So, it would greatly improve 
> performance if the diff report calculation could be limited to just the 
> interesting sub-directory of the snapshot root instead of the whole snapshot 
> root.






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-16 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.0.0
>
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.04.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Commented] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-16 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206840#comment-16206840
 ] 

Manoj Govindassamy commented on HDFS-12614:
---

Thanks for the review [~daryn] and [~yzhangal]. Committed to trunk.

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.0.0
>
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.04.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Commented] (HDFS-12653) Implement toArray() and subArray() for ReadOnlyList

2017-10-13 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204273#comment-16204273
 ] 

Manoj Govindassamy commented on HDFS-12653:
---

[~daryn],
  Currently ReadOnlyList is predominantly used by the Directory and Snapshot 
subsystems for storing their children inodes / snapshots in a _sorted_ order. 
I see it as a SortedList, and many times the users of this list make use of 
the sorted nature of the elements for searching - 
{{ReadOnlyList.Util#binarySearch(ReadOnlyList<E>, K key)}}. On top of these 
sorting benefits, {{ReadOnlyList.Util#asList()}} gives a {{List<E>}} where 
{{toArray()}} differs significantly from the Collections toArray -- the 
returned array is more of a _view_ of the backing read-only list, without 
copying any elements. 
 
  I believe we can make use of ReadOnlyList to enhance the performance of 
{{INodeAttributesProvider#getAttributes()}} by converting the byte[][] 
bPathComponents to a ReadOnlyList<String> sPathComponents only one time and 
getting a _view_ of the string path components using toArray() or 
subArray(start, end); see the sketch below. Collections doesn't have a 
subArray() concept; there's only subList().
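For illustration, a hedged sketch of that one-time conversion idea (not 
committed code; {{subArray()}} here is the view method proposed in this jira, 
and the null check follows the HDFS-12614 discussion):
{noformat}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.util.ReadOnlyList;

// Convert the byte[][] path components to Strings exactly once.
static ReadOnlyList<String> toStringComponents(byte[][] bPathComponents) {
  List<String> sPathComponents = new ArrayList<>(bPathComponents.length);
  for (byte[] component : bPathComponents) {
    // The root component can be null for paths like "/" (HDFS-12614).
    sPathComponents.add(
        component == null ? "" : DFSUtil.bytes2String(component));
  }
  return ReadOnlyList.Util.asReadOnlyList(sPathComponents);
}

// Each permission-check level would then take a no-copy view instead of
// re-converting the byte[]s, using the subArray() proposed in this jira:
//   Object[] elements = ReadOnlyList.Util.subArray(roList, 0, pathIdx + 1);
{noformat}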


> Implement toArray() and subArray() for ReadOnlyList
> ---
>
> Key: HDFS-12653
> URL: https://issues.apache.org/jira/browse/HDFS-12653
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> {{ReadOnlyList}} today gives an unmodifiable view of the backing List. This 
> list supports the following Util methods for easy construction of read-only 
> views of any given list. 
> {noformat}
> public static <E> ReadOnlyList<E> asReadOnlyList(final List<E> list)
> public static <E> List<E> asList(final ReadOnlyList<E> list)
> {noformat}
> {{asList}} above additionally overrides {{Object[] toArray()}} of the 
> {{java.util.List}} interface. Unlike {{java.util.List}}, this one returns an 
> array of Objects referring to the backing list and avoids any copying of 
> objects. Given that we have many usages of read-only lists,
> 1. Let's have a light-weight / shared-view {{toArray()}} implementation for 
> {{ReadOnlyList}} as well. 
> 2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, 
> let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}}.






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-13 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
Attachment: HDFS-12614.04.patch

Thanks for the review [~daryn]. 
That's right, string literals and compile-time constant String expressions are 
already interned by the JVM. Attached v04 patch, removing the explicit string 
intern; a small illustration follows below. Please take a look at the latest 
revision.
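A small illustration of why the explicit intern was redundant:
{noformat}
// String literals and compile-time constant String expressions are placed
// in the JVM's intern pool automatically; calling intern() on them is a no-op.
String a = "name";
String b = "na" + "me";               // constant-folded at compile time
System.out.println(a == b);           // true: same interned instance
String c = new String("name");        // explicit new: a distinct heap object
System.out.println(a == c);           // false
System.out.println(a == c.intern());  // true: intern() returns the pooled one
{noformat}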

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.04.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Created] (HDFS-12653) Implement toArray() and toSubArray() for ReadOnlyList

2017-10-12 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12653:
-

 Summary: Implement toArray() and toSubArray() for ReadOnlyList
 Key: HDFS-12653
 URL: https://issues.apache.org/jira/browse/HDFS-12653
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


{{ReadOnlyList}} today gives an unmodifiable view of the backing List. This 
list supports the following Util methods for easy construction of read-only 
views of any given list. 

{noformat}
public static <E> ReadOnlyList<E> asReadOnlyList(final List<E> list)

public static <E> List<E> asList(final ReadOnlyList<E> list)
{noformat}

{{asList}} above additionally overrides {{Object[] toArray()}} of the 
{{java.util.List}} interface. Unlike {{java.util.List}}, this one returns an 
array of Objects referring to the backing list and avoids any copying of 
objects. Given that we have many usages of read-only lists,

1. Let's have a light-weight / shared-view {{toArray()}} implementation for 
{{ReadOnlyList}} as well. 
2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, 
let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}}.






[jira] [Updated] (HDFS-12653) Implement toArray() and subArray() for ReadOnlyList

2017-10-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12653:
--
Summary: Implement toArray() and subArray() for ReadOnlyList  (was: 
Implement toArray() and toSubArray() for ReadOnlyList)

> Implement toArray() and subArray() for ReadOnlyList
> ---
>
> Key: HDFS-12653
> URL: https://issues.apache.org/jira/browse/HDFS-12653
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> {{ReadOnlyList}} today gives an unmodifiable view of the backing List. This 
> list supports the following Util methods for easy construction of read-only 
> views of any given list. 
> {noformat}
> public static <E> ReadOnlyList<E> asReadOnlyList(final List<E> list)
> public static <E> List<E> asList(final ReadOnlyList<E> list)
> {noformat}
> {{asList}} above additionally overrides {{Object[] toArray()}} of the 
> {{java.util.List}} interface. Unlike {{java.util.List}}, this one returns an 
> array of Objects referring to the backing list and avoids any copying of 
> objects. Given that we have many usages of read-only lists,
> 1. Let's have a light-weight / shared-view {{toArray()}} implementation for 
> {{ReadOnlyList}} as well. 
> 2. Additionally, similar to {{java.util.List#subList(fromIndex, toIndex)}}, 
> let's have {{ReadOnlyList#subArray(fromIndex, toIndex)}}.






[jira] [Commented] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-12 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202794#comment-16202794
 ] 

Manoj Govindassamy commented on HDFS-12614:
---

Filed HDFS-12652 to track the {{INodeAttributeProvider#getAttributes()}} 
performance improvement task detailed by [~daryn] in the previous comments. I 
am assuming the request is not to change the 
{{INodeAttributesProvider#getAttributes()}} interface.

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Created] (HDFS-12652) INodeAttributesProvider#getAttributes(): Avoid multiple conversions of path components byte[][] to String[] when requesting INode attributes

2017-10-12 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12652:
-

 Summary: INodeAttributesProvider#getAttributes(): Avoid multiple 
conversions of path components byte[][] to String[] when requesting INode 
attributes
 Key: HDFS-12652
 URL: https://issues.apache.org/jira/browse/HDFS-12652
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.0.0-beta1
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


{{INodeAttributesProvider#getAttributes}} needs the path components passed in 
to be an array of Strings, whereas the INode and related layers maintain path 
components as an array of byte[]. So, these layers are required to convert each 
byte[] component of the path back into a String, multiple times, when 
requesting INode attributes from the provider. 

That is, the path "/a/b/c" requires calling the attribute provider with: (1) 
"", (2) "", "a", (3) "", "a", "b", (4) "", "a", "b", "c". Every single one of 
those strings is freshly (re)converted from a byte[]. Say a file listing is 
done on a huge directory containing hundreds of millions of files; these 
repeated, redundant conversions of byte[][] to String[] then create lots of 
tiny garbage objects, occupying memory and hurting performance. It would be 
better if we could avoid creating redundant copies of the path component 
strings.
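A small sketch of the pattern described above, loosely modeled on 
{{FSPermissionChecker#getINodeAttrs()}} (illustrative only; the method below 
is hypothetical):
{noformat}
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.server.namenode.INodeAttributeProvider;
import org.apache.hadoop.hdfs.server.namenode.INodeAttributes;

// For "/a/b/c", the permission check walks every path level, and each level
// rebuilds its String[] prefix from the byte[][] components from scratch.
static void checkAllLevels(byte[][] components,
    INodeAttributeProvider provider, INodeAttributes inodeAttrs) {
  for (int pathIdx = 0; pathIdx < components.length; pathIdx++) {
    String[] elements = new String[pathIdx + 1];
    for (int i = 0; i <= pathIdx; i++) {
      // The same byte[] is re-converted at every deeper level, producing
      // O(n^2) short-lived Strings for an n-component path.
      elements[i] = DFSUtil.bytes2String(components[i]);
    }
    provider.getAttributes(elements, inodeAttrs);
  }
}
// Converting each component to a String once, up front, would avoid the
// quadratic re-conversion and the associated garbage.
{noformat}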
  






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-12 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
Attachment: HDFS-12614.03.patch

Attached v03 patch with more comments. [~yzhangal], [~daryn], can you please 
take a look at the latest patch revision? Thanks.


> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-10 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
Attachment: HDFS-12614.02.patch

Thanks for the review [~daryn]. I had the same dilemma on whether to change the 
semantics for the root path component. I didn't see any functionality failing 
because of this change, though. But I do concur that the semantic change would 
be riskier.

Attached v02 patch to work around the issue in 
{{FSPermissionChecker#getINodeAttrs()}} for the null root path component; a 
sketch of the idea follows below. Please take a look. I will track the other 
enhancement you talked about in a new jira.
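For illustration, a minimal sketch of the workaround idea (not the exact v02 
patch): map a null root component to the empty string when building the 
provider's String[] elements.
{noformat}
// Inside FSPermissionChecker#getINodeAttrs() -- illustrative only; pathIdx
// and pathByNameArr are the method's existing parameters.
String[] elements = new String[pathIdx + 1];
for (int i = 0; i < elements.length; i++) {
  // For paths like "/", the first split component can be null; substitute
  // the empty string instead of letting bytes2String() hit an NPE.
  elements[i] = (pathByNameArr[i] == null)
      ? "" : DFSUtil.bytes2String(pathByNameArr[i]);
}
{noformat}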

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Updated] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-10 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12544:
--
Attachment: HDFS-12544.03.patch

Thanks for the review [~yzhangal]. Attached v03 patch to address the following 
comments. Can you please review the latest patch?

bq. It seems to make sense to include a new field snapshotDiffScopeDir in the 
SnapshotDiffInfo class, and initialize it in the constructor. 
Done.

bq. suggest to move the checking from 
SnapshotManager#getSnapshottableAncestorDir to its caller, ..
Done.

bq. suggest to remove the method 
SnapshotManager#setSnapshotDiffAllowSnapRootDescendant, and use the config 
property to pass on the value to the cluster..
Done.

bq. Nit. In SnapshotManager.java, change "directories" to "directory" in the 
following text...
Done.


> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using the snapshot diff command, we can generate a diff report between any 
> two given snapshots under a snapshot root directory. The command today only 
> accepts a path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at a higher-level directory but the diff 
> report is needed only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> large, snapshot diff report generation can take minutes even if we are only 
> interested in the changes in a small directory. So, it would greatly improve 
> performance if the diff report calculation could be limited to just the 
> interesting sub-directory of the snapshot root instead of the whole snapshot 
> root.






[jira] [Updated] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2017-10-10 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12629:
--
Attachment: NN_UI_Summary_BlockCount_BeforeFix.png

> NameNode UI should report total blocks count by type - replicated and erasure 
> coded
> ---
>
> Key: HDFS-12629
> URL: https://issues.apache.org/jira/browse/HDFS-12629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: NN_UI_Summary_BlockCount_BeforeFix.png
>
>
> Currently, the NameNode UI displays the total files and directories and the 
> total blocks in the cluster under the Summary tab. But the total block count 
> split by type is missing. It would be good if we could display the total 
> block counts by type (provided by HDFS-12573) along with the total block 
> count. 






[jira] [Updated] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2017-10-10 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12629:
--
Description: 
Currently, the NameNode UI displays the total files and directories and the 
total blocks in the cluster under the Summary tab. But the total block count 
split by type is missing. It would be good if we could display the total block 
counts by type (provided by HDFS-12573) along with the total block count. 



  was:Currently NameNode UI displays total files and directories and total 
blocks in the cluster under the Summary tab. But, the total blocks count split 
by type is missing. It would be good if we can have these total blocks counts 
also displayed along with the total block count. 


> NameNode UI should report total blocks count by type - replicated and erasure 
> coded
> ---
>
> Key: HDFS-12629
> URL: https://issues.apache.org/jira/browse/HDFS-12629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: NN_UI_Summary_BlockCount_BeforeFix.png
>
>
> Currently, the NameNode UI displays the total files and directories and the 
> total blocks in the cluster under the Summary tab. But the total block count 
> split by type is missing. It would be good if we could display the total 
> block counts by type (provided by HDFS-12573) along with the total block 
> count. 






[jira] [Created] (HDFS-12629) NameNode UI should report total blocks count by type - replicated and erasure coded

2017-10-10 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12629:
-

 Summary: NameNode UI should report total blocks count by type - 
replicated and erasure coded
 Key: HDFS-12629
 URL: https://issues.apache.org/jira/browse/HDFS-12629
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.0.0-beta1
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


Currently, the NameNode UI displays the total files and directories and the 
total blocks in the cluster under the Summary tab. But the total block count 
split by type is missing. It would be good if we could have these total block 
counts also displayed along with the total block count. 






[jira] [Updated] (HDFS-12573) Divide the total block metrics into replica and ec

2017-10-10 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12573:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch contribution [~tasanuma0829].
Committed to trunk.

> Divide the total block metrics into replica and ec
> --
>
> Key: HDFS-12573
> URL: https://issues.apache.org/jira/browse/HDFS-12573
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, metrics, namenode
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
> Fix For: 3.0.0
>
> Attachments: HDFS-12573.1.patch, HDFS-12573.2.patch, 
> HDFS-12573.3.patch
>
>
> Following HDFS-10999, let's separate the total block metrics. It would be 
> useful for administrators.






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-09 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
Affects Version/s: 3.0.0-beta1
 Target Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-09 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
Attachment: HDFS-12614.01.patch

Attached v01 patch to address the issue in {{FSDirectory#resolvePath()}} when 
{{INodeAttributesProvider}} is enabled.
[~eddyxu], [~kihwal], [~daryn], can you please take a look?

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.01.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.






[jira] [Updated] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-06 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12614:
--
Attachment: HDFS-12614.test.01.patch

Attached a test case to show the problem with INodeAttributesProvider and path 
resolution.

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving a path (like 
> "/") and checking for permission, the following code, while working on 
> {{pathByNameArr}}, throws a NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> Looks like for paths like "/", where the components split on the delimiter 
> "/" can be null, the pathByNameArr array can have null elements and can 
> throw an NPE.





