[jira] [Assigned] (HDFS-12228) [SPS]: Add storage policy satisfier related metrics
[ https://issues.apache.org/jira/browse/HDFS-12228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S reassigned HDFS-12228:
------------------------------
    Assignee: Ajith S

> [SPS]: Add storage policy satisfier related metrics
> ---------------------------------------------------
>
>                 Key: HDFS-12228
>                 URL: https://issues.apache.org/jira/browse/HDFS-12228
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>            Reporter: Rakesh R
>            Assignee: Ajith S
>
> This jira is to discuss and implement the metrics needed for the SPS feature.
> Below are a few metrics:
> # count of {{inprogress}} block movements
> # count of {{successful}} block movements
> # count of {{failed}} block movements
> Need to analyse and add more.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
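The three counters listed in the description could be tracked with something as simple as atomic counters. A minimal sketch follows; the class and method names here are hypothetical illustrations, not the eventual SPS metrics API (which would presumably use Hadoop's metrics2 framework):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the three SPS counters from the description.
// Names are invented for illustration; not part of any committed API.
class SpsBlockMovementMetrics {
    private final AtomicLong inProgress = new AtomicLong();
    private final AtomicLong successful = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();

    // A block movement was scheduled and is now in flight.
    public void movementStarted() { inProgress.incrementAndGet(); }

    // An in-flight movement completed successfully.
    public void movementSucceeded() {
        inProgress.decrementAndGet();
        successful.incrementAndGet();
    }

    // An in-flight movement failed (and may be retried later).
    public void movementFailed() {
        inProgress.decrementAndGet();
        failed.incrementAndGet();
    }

    public long getInProgress() { return inProgress.get(); }
    public long getSuccessful() { return successful.get(); }
    public long getFailed() { return failed.get(); }
}
```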
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HDFS-8693:
--------------------------
    Status: Patch Available  (was: Open)

Attached rebased patch

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>                 URL: https://issues.apache.org/jira/browse/HDFS-8693
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, ha
>    Affects Versions: 2.6.0
>            Reporter: Jian Fang
>            Assignee: Ajith S
>            Priority: Critical
>         Attachments: HDFS-8693.02.patch, HDFS-8693.03.patch, HDFS-8693.1.patch
>
>
> I tried to run the following command on a Hadoop 2.6.0 cluster with HA support
>
> $ hdfs dfsadmin -refreshNamenodes datanode-host:port
>
> to refresh the name nodes on the data nodes after I replaced one name node
> with a new one, so that I wouldn't need to restart the data nodes. However,
> I got the following error:
>
> refreshNamenodes: HA does not currently support adding a new standby to a
> running DN. Please do a rolling restart of DNs to reconfigure the list of NNs.
>
> I checked the 2.6.0 code and the error was thrown by the following code
> snippet, which led me to this JIRA:
>
> void refreshNNList(ArrayList<InetSocketAddress> addrs) throws IOException {
>   Set<InetSocketAddress> oldAddrs = Sets.newHashSet();
>   for (BPServiceActor actor : bpServices) {
>     oldAddrs.add(actor.getNNSocketAddress());
>   }
>   Set<InetSocketAddress> newAddrs = Sets.newHashSet(addrs);
>   if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty()) {
>     // Keep things simple for now -- we can implement this at a later date.
>     throw new IOException(
>         "HA does not currently support adding a new standby to a running DN. "
>         + "Please do a rolling restart of DNs to reconfigure the list of NNs.");
>   }
> }
>
> It looks like the refreshNamenodes command is an incomplete feature.
> Unfortunately, picking up the new name node on a replacement instance is
> critical for auto-provisioning a Hadoop cluster with HDFS HA support; without
> it, the HA feature cannot really be used. I also observed that the new
> standby name node on the replacement instance could get stuck in safe mode
> because no data nodes check in with it. Even with a rolling restart, it may
> take quite some time to restart all data nodes on a big cluster, for example
> one with 4000 data nodes, let alone that restarting DNs is far too intrusive
> and not a preferable operation in production. It also increases the chance of
> a double failure, because the standby name node is not really ready for a
> failover if the current active name node fails.
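The rejection in the snippet above hinges on the symmetric difference between the DN's current NN address set and the refreshed one. A small sketch of that check, using only java.util in place of Guava's Sets.symmetricDifference (addresses here are plain strings for illustration):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the refresh check: the refresh is rejected whenever the old and
// new NN address sets differ in EITHER direction (added or removed entries).
class RefreshCheck {
    static boolean refreshWouldBeRejected(Set<String> oldAddrs,
                                          Set<String> newAddrs) {
        Set<String> removed = new HashSet<>(oldAddrs);
        removed.removeAll(newAddrs);   // addresses no longer configured
        Set<String> added = new HashSet<>(newAddrs);
        added.removeAll(oldAddrs);     // addresses newly configured
        removed.addAll(added);         // symmetric difference of the two sets
        return !removed.isEmpty();
    }
}
```

This is why the reporter hit the IOException: replacing one standby with another keeps the set size the same but still yields a non-empty symmetric difference.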
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HDFS-8693:
--------------------------
    Attachment: HDFS-8693.03.patch

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>         Attachments: HDFS-8693.02.patch, HDFS-8693.03.patch, HDFS-8693.1.patch
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HDFS-8693:
--------------------------
    Status: Open  (was: Patch Available)

Will rebase and upload a new patch

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>         Attachments: HDFS-8693.02.patch, HDFS-8693.1.patch
[jira] [Commented] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062555#comment-16062555 ]

Ajith S commented on HDFS-8693:
-------------------------------

Hi [~zhangchen]

I had uploaded the patch some time back; let me try to rebase and address the test case failure for this.

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>         Attachments: HDFS-8693.02.patch, HDFS-8693.1.patch
[jira] [Commented] (HDFS-11191) Datanode Capacity is misleading if the dfs.datanode.data.dir is configured with two directories from the same file system.
[ https://issues.apache.org/jira/browse/HDFS-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815074#comment-15815074 ]

Ajith S commented on HDFS-11191:
--------------------------------

Similar to HDFS-8610 ?

> Datanode Capacity is misleading if the dfs.datanode.data.dir is configured
> with two directories from the same file system.
> --------------------------------------------------------------------------
>
>                 Key: HDFS-11191
>                 URL: https://issues.apache.org/jira/browse/HDFS-11191
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.5.0
>         Environment: SLES 11SP3
>                      HDP 2.5.0
>            Reporter: Deepak Chander
>            Assignee: Weiwei Yang
>              Labels: capacity, datanode, storage, user-experience
>         Attachments: HDFS-11191.01.patch, HDFS-11191.02.patch,
>                      HDFS-11191.03.patch, HDFS-11191.04.patch,
>                      HDFS-11191.05.patch, HDFS-11191.06.patch,
>                      HDFS-11191.07.patch
>
>
> In the command "hdfs dfsadmin -report", the Configured Capacity is misleading
> if dfs.datanode.data.dir is configured with two directories from the same
> file system.
>
> hdfs@kimtest1:~> hdfs dfsadmin -report
> Configured Capacity: 239942369274 (223.46 GB)
> Present Capacity: 207894724602 (193.62 GB)
> DFS Remaining: 207894552570 (193.62 GB)
> DFS Used: 172032 (168 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
>
> -------------------------------------------------
> Live datanodes (3):
>
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9528000512 (8.87 GB)
> DFS Remaining: 70452731902 (65.61 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
>
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 13010952192 (12.12 GB)
> DFS Remaining: 66969780222 (62.37 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 83.73%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
>
> Name: 172.26.79.86:50010 (kimtest2)
> Hostname: kimtest2
> Decommission Status : Normal
> Configured Capacity: 79980789758 (74.49 GB)
> DFS Used: 57344 (56 KB)
> Non DFS Used: 9508691968 (8.86 GB)
> DFS Remaining: 70472040446 (65.63 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 06:59:02 PST 2016
>
> If you see my datanode root file system size, it is only 38GB:
>
> kimtest3:~ # df -h /
> Filesystem               Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
>
> kimtest4:~ # df -h /
> Filesystem               Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  4.2G   32G  12% /
>
> kimtest2:~ # df -h /
> Filesystem               Size  Used Avail Use% Mounted on
> /dev/mapper/system-root   38G  2.6G   33G   8% /
>
> The below is from the hdfs-site.xml file:
>
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn, file:///grid1/hadoop/hdfs/dn</value>
> </property>
>
> I have removed the other directory grid1 and restarted the datanode process:
>
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///grid/hadoop/hdfs/dn</value>
> </property>
>
> Now the size is reflecting correctly:
>
> hdfs@kimtest1:/grid> hdfs dfsadmin -report
> Configured Capacity: 119971184637 (111.73 GB)
> Present Capacity: 103947243517 (96.81 GB)
> DFS Remaining: 103947157501 (96.81 GB)
> DFS Used: 86016 (84 KB)
> DFS Used%: 0.00%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> Missing blocks (with replication factor 1): 0
>
> -------------------------------------------------
> Live datanodes (3):
>
> Name: 172.26.79.87:50010 (kimtest3)
> Hostname: kimtest3
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 28672 (28 KB)
> Non DFS Used: 4764057600 (4.44 GB)
> DFS Remaining: 35226308607 (32.81 GB)
> DFS Used%: 0.00%
> DFS Remaining%: 88.09%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 2
> Last contact: Tue Nov 29 07:34:02 PST 2016
>
> Name: 172.26.80.38:50010 (kimtest4)
> Hostname: kimtest4
> Decommission Status : Normal
> Configured Capacity: 39990394879 (37.24 GB)
> DFS Used: 2867
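The arithmetic behind the misleading report above: each configured data dir contributes the capacity of the filesystem it lives on, and the DataNode sums the per-dir figures, so two dirs on one filesystem double the reported capacity. A sketch of that summing (plain arithmetic for illustration, not the actual DataNode accounting code):

```java
// Why two data dirs on one filesystem double-count: the DN adds up a
// per-directory capacity, and each directory on the same filesystem
// reports that filesystem's full capacity. Illustrative sketch only.
class CapacityDoubleCount {
    static long reportedCapacity(long fsCapacityBytes, int dirsOnSameFs) {
        return fsCapacityBytes * dirsOnSameFs;   // per-dir figures summed
    }
}
```

With the numbers from the report: one dir gives 39990394879 bytes (37.24 GB) per node, and two dirs on the same filesystem give exactly twice that, the 79980789758 bytes (74.49 GB) shown per node in the first report.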
[jira] [Commented] (HDFS-11107) TestStartup#testStorageBlockContentsStaleAfterNNRestart flaky failure
[ https://issues.apache.org/jira/browse/HDFS-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15659861#comment-15659861 ]

Ajith S commented on HDFS-11107:
--------------------------------

Would like to work on this. Please assign it to me.

> TestStartup#testStorageBlockContentsStaleAfterNNRestart flaky failure
> ---------------------------------------------------------------------
>
>                 Key: HDFS-11107
>                 URL: https://issues.apache.org/jira/browse/HDFS-11107
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>            Reporter: Xiaobing Zhou
>            Priority: Minor
>              Labels: unit-test
>
> It's noticed that this failed in the last Jenkins run of HDFS-11085, but it's
> not reproducible and passed with and without the patch.
> {noformat}
> Error Message
> expected:<0> but was:<2>
>
> Stacktrace
> java.lang.AssertionError: expected:<0> but was:<2>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.junit.Assert.assertEquals(Assert.java:542)
> 	at org.apache.hadoop.hdfs.server.namenode.TestStartup.testStorageBlockContentsStaleAfterNNRestart(TestStartup.java:726)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11104) BlockPlacementPolicyDefault choose favoredNodes in turn which may cause imbalance
[ https://issues.apache.org/jira/browse/HDFS-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15659720#comment-15659720 ]

Ajith S commented on HDFS-11104:
--------------------------------

Would like to work on this, can you please assign it to me.

> BlockPlacementPolicyDefault choose favoredNodes in turn which may cause
> imbalance
> -----------------------------------------------------------------------
>
>                 Key: HDFS-11104
>                 URL: https://issues.apache.org/jira/browse/HDFS-11104
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Doris Gu
>
> If a client passes favoredNodes when it writes files into HDFS, chooseTarget
> in BlockPlacementPolicyDefault picks from the favored nodes in order:
> {quote}
> DatanodeStorageInfo[] chooseTarget(String src,
>     int numOfReplicas,
>     Node writer,
>     Set<Node> excludedNodes,
>     long blocksize,
>     List<DatanodeDescriptor> favoredNodes,
>     BlockStoragePolicy storagePolicy) {
>   try {
>     ...
>     *for (int i = 0; i < favoredNodes.size() && results.size() < numOfReplicas; i++)* {
>       DatanodeDescriptor favoredNode = favoredNodes.get(i);
>       // Choose a single node which is local to favoredNode.
>       // 'results' is updated within chooseLocalNode
>       final DatanodeStorageInfo target = chooseLocalStorage(favoredNode,
>           favoriteAndExcludedNodes, blocksize, maxNodesPerRack,
>           results, avoidStaleNodes, storageTypes, false);
>     ...
> {quote}
> Why not shuffle it here? That would make blocks more balanced, save the cost
> the balancer would otherwise pay, and make the cluster more stable.
> {quote}
> for (DatanodeDescriptor favoredNode : DFSUtil.shuffle(
>     favoredNodes.toArray(new DatanodeDescriptor[favoredNodes.size()])))
> {quote}
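The shuffle proposed in the report can be sketched in isolation: visiting favoredNodes in a random order per block spreads first-replica placement across the whole favored set instead of always starting from index 0. Here java.util.Collections.shuffle stands in for DFSUtil.shuffle, and plain strings stand in for DatanodeDescriptor:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative stand-in for the proposed change: shuffle a copy of the
// favored-node list so iteration order varies per block, avoiding a bias
// toward the nodes at the front of the list.
class FavoredNodeOrder {
    static List<String> shuffled(List<String> favoredNodes) {
        List<String> copy = new ArrayList<>(favoredNodes);  // don't mutate caller's list
        Collections.shuffle(copy);
        return copy;
    }
}
```

The set of candidate nodes is unchanged; only the visit order is randomized, which is why the change affects balance but not correctness of placement.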
[jira] [Commented] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643993#comment-15643993 ]

Ajith S commented on HDFS-8693:
-------------------------------

Attaching rebased patch. Please review.

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>         Attachments: HDFS-8693.02.patch, HDFS-8693.1.patch
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HDFS-8693:
--------------------------
    Attachment: HDFS-8693.02.patch

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>         Attachments: HDFS-8693.02.patch, HDFS-8693.1.patch
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HDFS-8693:
--------------------------
    Status: Patch Available  (was: Open)

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>         Attachments: HDFS-8693.02.patch, HDFS-8693.1.patch
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HDFS-8693:
--------------------------
    Status: Open  (was: Patch Available)

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>         Attachments: HDFS-8693.1.patch
[jira] [Resolved] (HDFS-8094) Cluster web console (dfsclusterhealth.jsp) is not working
[ https://issues.apache.org/jira/browse/HDFS-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S resolved HDFS-8094.
---------------------------
    Resolution: Duplicate

> Cluster web console (dfsclusterhealth.jsp) is not working
> ---------------------------------------------------------
>
>                 Key: HDFS-8094
>                 URL: https://issues.apache.org/jira/browse/HDFS-8094
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 2.7.0
>            Reporter: Ajith S
>            Assignee: Ajith S
>
> According to the documentation, the cluster can be monitored at
> http:///dfsclusterhealth.jsp
> Currently, this URL doesn't seem to be working. It seems to have been removed
> as part of HDFS-6252.
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8693: -- Status: Patch Available (was: Open)
> refreshNamenodes does not support adding a new standby to a running DN
[jira] [Updated] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8693: -- Attachment: HDFS-8693.1.patch Attaching Patch. Please review
> refreshNamenodes does not support adding a new standby to a running DN
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217226#comment-15217226 ] Ajith S commented on HDFS-9478: --- Thanks [~arpitagarwal]
> Reason for failing ipc.FairCallQueue contruction should be thrown
> -
> Key: HDFS-9478
> URL: https://issues.apache.org/jira/browse/HDFS-9478
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Archana T
> Assignee: Ajith S
> Priority: Minor
> Fix For: 2.7.3
> Attachments: HDFS-9478.2.patch, HDFS-9478.3.patch, HDFS-9478.patch
>
> When FairCallQueue construction fails, the NN fails to start, throwing a RuntimeException without any indication of why it failed.
> 2015-11-30 17:45:26,661 INFO org.apache.hadoop.ipc.FairCallQueue: FairCallQueue is in use with 4 queues.
> 2015-11-30 17:45:26,665 DEBUG org.apache.hadoop.metrics2.util.MBeans: Registered Hadoop:service=ipc.65110,name=DecayRpcScheduler
> 2015-11-30 17:45:26,666 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> java.lang.RuntimeException: org.apache.hadoop.ipc.FairCallQueue could not be constructed.
> at org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance(CallQueueManager.java:96)
> at org.apache.hadoop.ipc.CallQueueManager.<init>(CallQueueManager.java:55)
> at org.apache.hadoop.ipc.Server.<init>(Server.java:2241)
> at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:942)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:534)
> at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509)
> at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:784)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:346)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:750)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:687)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:889)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:872)
> Example: the reason for the above failure could have been --
> 1. the weights were not equal to the number of queues configured.
> 2. decay-scheduler.thresholds not in sync with the number of queues.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-9478: -- Attachment: HDFS-9478.3.patch Thanks for the input, I have updated the patch per the review comments.
> Reason for failing ipc.FairCallQueue contruction should be thrown
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214017#comment-15214017 ] Ajith S commented on HDFS-9478: --- e.getCause() returns a Throwable, so rethrowing it directly would require a method signature change (adding a throws clause to every method in the above stack trace). I feel it is better to wrap e.getCause() in a RuntimeException (an unchecked exception) instead.
> Reason for failing ipc.FairCallQueue contruction should be thrown
[jira] [Updated] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-9478: -- Attachment: HDFS-9478.2.patch Modified as per comments. Please review
> Reason for failing ipc.FairCallQueue contruction should be thrown
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182465#comment-15182465 ] Ajith S commented on HDFS-9478: ---
{code:java}
catch (RuntimeException e) {
  throw e;
} catch (InvocationTargetException e) {
  throw new RuntimeException(theClass.getName() + " could not be constructed.", e);
} catch (Exception e) {
  // empty
}
{code}
I am planning to replace the existing catch blocks in org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance with the set above. I have a few concerns here:
1. I would expect FairCallQueue to log the exception cause itself instead of depending on createCallQueueInstance.
2. Catching InvocationTargetException changes the code flow here (in the scenario where the CallQueue object may have been created using the default constructor).
> Reason for failing ipc.FairCallQueue contruction should be thrown
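The behavior this thread proposes can be exercised with a small stand-alone sketch. The class and method names below (QueueFactory, FailingQueue, create) are hypothetical stand-ins, not Hadoop's actual code: reflective construction wraps the constructor's failure in an InvocationTargetException, and re-wrapping its cause into the RuntimeException is what surfaces the real reason.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

public class QueueFactory {
    // Hypothetical stand-in for FairCallQueue: its constructor validates
    // configuration and throws, as in the weights/thresholds mismatch case.
    public static class FailingQueue {
        public FailingQueue() {
            throw new IllegalArgumentException("Failed to parse custom weights");
        }
    }

    // Sketch of the proposed fix: surface the constructor's own exception
    // as the RuntimeException cause instead of swallowing it.
    static Object create(Class<?> theClass) {
        try {
            Constructor<?> ctor = theClass.getDeclaredConstructor();
            return ctor.newInstance();
        } catch (InvocationTargetException e) {
            // The real failure reason is the wrapped cause.
            throw new RuntimeException(
                theClass.getName() + " could not be constructed.", e.getCause());
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(
                theClass.getName() + " could not be constructed.", e);
        }
    }
}
```

With this shape, the NameNode startup log would carry the underlying "Caused by" detail (here, the weights message) rather than only the opaque "could not be constructed" wrapper.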
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179669#comment-15179669 ] Ajith S commented on HDFS-9478: --- I guess we had better refactor *org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance* and check for _InvocationTargetException_. Any thoughts?
> Reason for failing ipc.FairCallQueue contruction should be thrown
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179665#comment-15179665 ] Ajith S commented on HDFS-9478: --- I think the *RuntimeException* thrown by *FairCallQueue* is wrapped as an *InvocationTargetException*, so the *RuntimeException* catch block is bypassed and the exception is instead caught by the following *Exception* catch block.
Reference: http://docs.oracle.com/javase/8/docs/api/java/lang/reflect/Constructor.html#newInstance-java.lang.Object...-
_InvocationTargetException - if the underlying constructor throws an exception._
> Reason for failing ipc.FairCallQueue contruction should be thrown
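The wrapping behavior described in this comment is easy to demonstrate with a toy class (names below are hypothetical): a RuntimeException thrown inside a constructor invoked via Constructor.newInstance never reaches a catch (RuntimeException) block directly; it arrives wrapped in an InvocationTargetException.

```java
import java.lang.reflect.InvocationTargetException;

public class WrapDemo {
    // Toy class whose constructor fails with an unchecked exception,
    // standing in for a FairCallQueue constructor rejecting its config.
    public static class Throwing {
        public Throwing() { throw new RuntimeException("boom"); }
    }

    // Reflectively instantiates c and reports which exception type actually
    // reached the catch blocks, showing the RuntimeException arrives wrapped.
    static String thrownBy(Class<?> c) {
        try {
            c.getDeclaredConstructor().newInstance();
            return "none";
        } catch (InvocationTargetException e) {
            return e.getClass().getSimpleName() + " wrapping "
                 + e.getCause().getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

This is why a plain catch (RuntimeException e) before the generic catch (Exception e) is bypassed for constructor failures, as observed in the original patch discussion.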
[jira] [Updated] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-9478: -- Status: Patch Available (was: Open)
> Reason for failing ipc.FairCallQueue contruction should be thrown
[jira] [Updated] (HDFS-9478) Reason for failing ipc.FairCallQueue contruction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-9478: -- Attachment: HDFS-9478.patch Please review
> Reason for failing ipc.FairCallQueue contruction should be thrown
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059359#comment-15059359 ] Ajith S commented on HDFS-9011: --- This would cause https://issues.apache.org/jira/browse/HDFS-8610
> Support splitting BlockReport of a storage into multiple RPC
>
> Key: HDFS-9011
> URL: https://issues.apache.org/jira/browse/HDFS-9011
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-9011.000.patch, HDFS-9011.001.patch, HDFS-9011.002.patch
>
> Currently, if a DataNode has too many blocks (more than 1M by default), it sends multiple RPCs to the NameNode for the block report, each RPC containing the report for a single storage. However, in practice we have seen that sometimes even a single storage can contain a large number of blocks, and the report can exceed the max RPC data length. It may be helpful to support sending multiple RPCs for the block report of a single storage.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Patch Available (was: Open) Addressed related test case failure and checkstyle errors. Please review
> Add support for restoring/rolling back to a snapshot
>
> Key: HDFS-4167
> URL: https://issues.apache.org/jira/browse/HDFS-4167
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Suresh Srinivas
> Assignee: Ajith S
> Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, HDFS-4167.05.patch, HDFS-4167.06.patch, HDFS-4167.07.patch, HDFS-4167.08.patch
>
> This jira tracks work related to restoring a directory/file to a snapshot.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Attachment: HDFS-4167.08.patch
> Add support for restoring/rolling back to a snapshot
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Open (was: Patch Available)
> Add support for restoring/rolling back to a snapshot
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Patch Available (was: Open) Resolved conflicts and resubmitting patch
> Add support for restoring/rolling back to a snapshot
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Attachment: HDFS-4167.07.patch > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch, HDFS-4167.06.patch, HDFS-4167.07.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Open (was: Patch Available) > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch, HDFS-4167.06.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Patch Available (was: Open) Please review > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch, HDFS-4167.06.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Attachment: HDFS-4167.06.patch fixed related testcases > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch, HDFS-4167.06.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Open (was: Patch Available) > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Patch Available (was: Open) > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Open (was: Patch Available) > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Status: Patch Available (was: Open) > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-4167: -- Attachment: HDFS-4167.05.patch Updated to trunk and added support for CLI. Please review > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch, > HDFS-4167.05.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8940) Support for large-scale multi-tenant inotify service
[ https://issues.apache.org/jira/browse/HDFS-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715919#comment-14715919 ] Ajith S commented on HDFS-8940: --- Hi all. The design document suggests polling from the SbNN (without inotify). As [~cmccabe] and [~surendrasingh] mentioned, the "read-inotify-from-standby" approach also seems good, but we may also have to account for the delayed namespace information on the SbNN. [~mingma], thanks for the design. > Support for large-scale multi-tenant inotify service > > > Key: HDFS-8940 > URL: https://issues.apache.org/jira/browse/HDFS-8940 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma > Attachments: Large-Scale-Multi-Tenant-Inotify-Service.pdf > > > HDFS-6634 provides the core inotify functionality. We would like to extend > that to provide a large-scale service that tens of thousands of clients can > subscribe to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8956) Not able to start Datanode
[ https://issues.apache.org/jira/browse/HDFS-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712785#comment-14712785 ] Ajith S commented on HDFS-8956: --- I suggest checking your DN address rather than the port. When a port is already occupied you get *Caused by: java.net.BindException: Address already in use*, but when an address (for example, the hostname) is wrong you get *Caused by: java.net.BindException: Cannot assign requested address*. So check whether the HTTP server address configured for the datanode is correct (the stack trace says *Cannot assign requested address*), or there may be a misconfiguration in the *hosts* file. > Not able to start Datanode > -- > > Key: HDFS-8956 > URL: https://issues.apache.org/jira/browse/HDFS-8956 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, namenode >Affects Versions: 2.7.0 > Environment: Centos >Reporter: sreelakshmi > > Data node service is not started on one of the data nodes, "java.net.bind > exception" is thrown. > Verified that ports 50010,50070 and 50075 are not in use by any other > application. 
> 15/08/26 01:50:15 INFO http.HttpServer2: HttpServer.start() threw a non Bind > IOException > java.net.BindException: Port in use: localhost:0 > at > org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:919) > at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:856) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:779) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1134) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:434) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2404) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2291) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2338) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2515) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2539) > Caused by: java.net.BindException: Cannot assign requested address -- This message was sent by Atlassian JIRA (v6.3.4#6332)
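The diagnosis in the comment above can be checked programmatically: "Cannot assign requested address" usually means the configured hostname does not resolve to any address owned by the local machine. A minimal sketch (the class and method names here are hypothetical, not HDFS code):

```java
import java.net.InetAddress;
import java.net.NetworkInterface;

// Distinguishing the two common BindException causes:
//   "Address already in use"          -> the port is taken by another process.
//   "Cannot assign requested address" -> the hostname/IP does not belong to
//                                        any local interface (misconfiguration,
//                                        e.g. a stale entry in /etc/hosts).
public class BindCheck {
    // Returns true if the given host resolves to an address that is
    // bindable on this machine (any-local, loopback, or a local interface).
    public static boolean isLocalAddress(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            if (addr.isAnyLocalAddress() || addr.isLoopbackAddress()) {
                return true;
            }
            // Non-null only when some local interface owns this address.
            return NetworkInterface.getByInetAddress(addr) != null;
        } catch (Exception e) {
            return false; // unresolvable hostname: also a configuration problem
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " bindable locally: " + isLocalAddress(host));
    }
}
```

Running this against the value of the datanode's HTTP server address would tell the reporter whether the failure is an address problem rather than a port conflict.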
[jira] [Commented] (HDFS-8763) Incremental blockreport order may replicate unnecessary block
[ https://issues.apache.org/jira/browse/HDFS-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712355#comment-14712355 ] Ajith S commented on HDFS-8763: --- Hi all, I agree with changing {{dfs.namenode.replication.interval}}. However, I think this only narrows the window in which the problem can occur rather than eliminating it: at some point the replication monitor thread may still overlap with the file close, and the same issue would reoccur. Please correct me if I am wrong :) > Incremental blockreport order may replicate unnecessary block > - > > Key: HDFS-8763 > URL: https://issues.apache.org/jira/browse/HDFS-8763 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.4.0 >Reporter: jiangyu >Assignee: Walter Su >Priority: Minor > > For our cluster, the NameNode is always very busy, so for every incremental > block report the lock contention is heavy. > The logic of an incremental block report is as follows: the client sends a block to dn1, > dn1 mirrors it to dn2, and dn2 mirrors it to dn3. After finishing the block, all > datanodes report the newly received block to the namenode. On the NameNode side, > all reports go to the method processIncrementalBlockReport in the BlockManager > class. But the status of the block reported from dn2 and dn3 is RECEIVING_BLOCK, > while for dn1 it is RECEIVED_BLOCK. It is fine if dn2 and dn3 report before dn1 (which is > common), but in a busy environment it is easy to see dn1 report before > dn2 or dn3; let's assume dn2 reports first, dn1 second, and dn3 > third. > So addStoredBlock for dn1 will find that the replica count of this block has not reached > the original number (which is 3), the block will be added to the > neededReplications structure, and soon some node in the pipeline (dn1 or > dn2) will be asked to replicate it to dn4. After some time, dn4 and dn3 both report this block, > and one node is then chosen to invalidate. 
> Here is one log i found in our cluster: > 2015-07-08 01:05:34,675 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > allocateBlock: > /logs/***_bigdata_spam/logs/application_1435099124107_470749/xx.xx.4.62_45454.tmp. > BP-1386326728-xx.xx.2.131-1382089338395 > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-a7c0f8f6-2399-4980-9479-efa08487b7b3:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-c75145a0-ed63-4180-87ee-d48ccaa647c5:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW]]} > 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.7.75:50010 is added to > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} > size 0 > 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.4.62:50010 is added to > blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, > primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], > > ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} > size 0 > 2015-07-08 01:05:35,003 INFO BlockStateChange: BLOCK* ask xx.xx.4.62:50010 to > replicate blk_3194502674_2121080184 to datanode(s) xx.xx.4.65:50010 > 2015-07-08 01:05:35,403 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: xx.xx.7.73:50010 is added to blk_3194502674_2121080184 size > 67750 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* addStoredBlock: > blockMap updated: 
xx.xx.4.65:50010 is added to blk_3194502674_2121080184 size > 67750 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* InvalidateBlocks: add > blk_3194502674_2121080184 to xx.xx.7.75:50010 > 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* chooseExcessReplicates: > (xx.xx.7.75:50010, blk_3194502674_2121080184) is added to invalidated blocks > set > 2015-07-08 01:05:35,852 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > InvalidateBlocks: ask xx.xx.7.75:50010 to delete [blk_3194502674_2121080184, > blk_3194497594_2121075104] > On some days such occurrences can reach 40, which hurts > performance and wastes network bandwidth. > Our base version is Hadoop 2.4; I have checked Hadoop 2.7.1 and found > no difference. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
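The report-ordering race described in HDFS-8763 can be illustrated with a deliberately simplified model (this is a sketch, not actual BlockManager code; the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the race: a block expects 3 replicas; each DN report
// is either "RECEIVING" (still writing) or "RECEIVED" (finalized). If the
// replication check runs right after dn1's RECEIVED report, while dn2 and
// dn3 are still RECEIVING, the NN sees too few finalized replicas and
// schedules an unnecessary extra replica (later invalidated again).
public class ReportRaceModel {
    static final int EXPECTED_REPLICATION = 3;

    // Evaluates the check at the point in the report sequence where the
    // replication monitor happens to run.
    public static boolean schedulesExtraReplica(List<String> reportsSoFar) {
        long finalized = reportsSoFar.stream()
                .filter(s -> s.equals("RECEIVED")).count();
        return finalized < EXPECTED_REPLICATION;
    }

    public static void main(String[] args) {
        // Busy-cluster ordering: dn2 reports RECEIVING, then dn1 reports
        // RECEIVED; dn3 has not reported yet when the check runs.
        List<String> reports = new ArrayList<>();
        reports.add("RECEIVING"); // dn2
        reports.add("RECEIVED");  // dn1
        System.out.println("extra replica scheduled: "
                + schedulesExtraReplica(reports));
    }
}
```

This also illustrates the comment above: lengthening {{dfs.namenode.replication.interval}} makes it less likely the check lands inside the bad window, but any ordering where the check overlaps the close still triggers the extra replication.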
[jira] [Assigned] (HDFS-5711) Removing memory limitation of the Namenode by persisting Block - Block location mappings to disk.
[ https://issues.apache.org/jira/browse/HDFS-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HDFS-5711: - Assignee: Ajith S > Removing memory limitation of the Namenode by persisting Block - Block > location mappings to disk. > - > > Key: HDFS-5711 > URL: https://issues.apache.org/jira/browse/HDFS-5711 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Rohan Pasalkar >Assignee: Ajith S >Priority: Minor > > This jira is to track changes to be made to remove HDFS name-node memory > limitation to hold block - block location mappings. > It is a known fact that the single Name-node architecture of HDFS has > scalability limits. The HDFS federation project alleviates this problem by > using horizontal scaling. This helps increase the throughput of metadata > operation and also the amount of data that can be stored in a Hadoop cluster. > The Name-node stores all the filesystem metadata in memory (even in the > federated architecture), the > Name-node design can be enhanced by persisting part of the metadata onto > secondary storage and retaining > the popular or recently accessed metadata information in main memory. This > design can benefit a HDFS deployment > which doesn't use federation but needs to store a large number of files or > large number of blocks. Lin Xiao from Hortonworks attempted a similar > project [1] in the Summer of 2013. They used LevelDB to persist the Namespace > information (i.e file and directory inode information). > A patch with this change is yet to be submitted to code base. We also intend > to use LevelDB to persist metadata, and plan to > provide a complete solution, by not just persisting the Namespace > information but also the Blocks Map onto secondary storage. > We did implement the basic prototype which stores the block-block location > mapping metadata to the persistent key-value store i.e. levelDB. 
The prototype > also maintains an in-memory cache of the recently used block-block location > mapping metadata. > References: > [1] Lin Xiao, Hortonworks, Removing Name-node’s memory limitation, HDFS-5389, > http://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
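The "persistent store plus in-memory cache of recently used mappings" design described in HDFS-5711 can be sketched as follows. This is a hedged illustration, not the actual prototype: a plain HashMap stands in for the LevelDB store, and all class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a block -> block-location map whose full contents live in a
// persistent key-value store (LevelDB in the prototype; a HashMap stands
// in for it here), with a bounded LRU cache of recently used entries.
public class CachedBlockMap {
    private final Map<Long, String> persistentStore = new HashMap<>(); // LevelDB stand-in
    private final Map<Long, String> lruCache;

    public CachedBlockMap(final int cacheSize) {
        // Access-ordered LinkedHashMap evicts the least recently used
        // entry once the cache exceeds its bound.
        this.lruCache = new LinkedHashMap<Long, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
                return size() > cacheSize;
            }
        };
    }

    public void put(long blockId, String locations) {
        persistentStore.put(blockId, locations); // always persisted
        lruCache.put(blockId, locations);        // and kept hot
    }

    public String get(long blockId) {
        String loc = lruCache.get(blockId);
        if (loc == null) {                        // cache miss: hit the store
            loc = persistentStore.get(blockId);
            if (loc != null) {
                lruCache.put(blockId, loc);       // repopulate the cache
            }
        }
        return loc;
    }

    public boolean isCached(long blockId) {
        return lruCache.containsKey(blockId);
    }
}
```

The design choice mirrors the jira's goal: the full mapping no longer needs to fit in NameNode heap; only the hot working set does, at the cost of a store lookup on cache misses.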
[jira] [Assigned] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HDFS-4167: - Assignee: Ajith S (was: Jing Zhao) > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Ajith S > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4167) Add support for restoring/rolling back to a snapshot
[ https://issues.apache.org/jira/browse/HDFS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707815#comment-14707815 ] Ajith S commented on HDFS-4167: --- Hi [~jingzhao], may I work on this patch if you are not currently working on it? > Add support for restoring/rolling back to a snapshot > > > Key: HDFS-4167 > URL: https://issues.apache.org/jira/browse/HDFS-4167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Suresh Srinivas >Assignee: Jing Zhao > Attachments: HDFS-4167.000.patch, HDFS-4167.001.patch, > HDFS-4167.002.patch, HDFS-4167.003.patch, HDFS-4167.004.patch > > > This jira tracks work related to restoring a directory/file to a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700792#comment-14700792 ] Ajith S commented on HDFS-8852: --- Thanks for the input [~ajisakaa] Uploaded the new patch as per your comments > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > Attachments: HDFS-8852.2.patch, HDFS-8852.patch > > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? > (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8852: -- Attachment: HDFS-8852.2.patch > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > Attachments: HDFS-8852.2.patch, HDFS-8852.patch > > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? > (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692803#comment-14692803 ] Ajith S commented on HDFS-8808: --- +1 > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8852: -- Status: Patch Available (was: Open) > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > Attachments: HDFS-8852.patch > > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? > (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8852: -- Attachment: HDFS-8852.patch Please review the patch > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > Attachments: HDFS-8852.patch > > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? > (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661679#comment-14661679 ] Ajith S commented on HDFS-8852: --- +1 will update accordingly. Thanks [~ajisakaa] > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? > (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661219#comment-14661219 ] Ajith S commented on HDFS-8852: --- Maybe we can update it to: "Appending content to the end of a file is supported, but a file cannot be updated at an arbitrary offset, and multiple concurrent writers are not supported. Files can only be written by a single writer." > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? > (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HDFS-8852: - Assignee: Ajith S > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? > (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659680#comment-14659680 ] Ajith S commented on HDFS-8808: --- +1 on Zhang's comment. Hi [~zhz], if I could suggest, maybe we could add a configuration to set the bandwidth specifically for bootstrap? I think whether we want to throttle bandwidth during bootstrap is a case-by-case decision. Please correct me if I am wrong. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
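If such a bootstrap-specific knob were added, it might look like the hdfs-site.xml fragment below. The property name is illustrative only (it mirrors the naming of the existing {{dfs.image.transfer.bandwidthPerSec}} throttle), with 0 meaning no throttling during {{-bootstrapStandby}}:

```xml
<!-- Hypothetical bootstrap-specific throttle, sketched for this discussion;
     it mirrors the existing dfs.image.transfer.bandwidthPerSec property.
     A value of 0 would disable throttling for -bootstrapStandby only. -->
<property>
  <name>dfs.image.transfer-bootstrap-standby.bandwidthPerSec</name>
  <value>0</value>
</property>
```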
[jira] [Updated] (HDFS-8413) Directories are not listed recursively when fs.defaultFs is viewFs
[ https://issues.apache.org/jira/browse/HDFS-8413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8413: -- Status: Open (was: Patch Available) > Directories are not listed recursively when fs.defaultFs is viewFs > -- > > Key: HDFS-8413 > URL: https://issues.apache.org/jira/browse/HDFS-8413 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > Labels: viewfs > Attachments: HDFS-8413.patch > > > Mount a cluster on the client through a viewFs mount table > Example: > {quote} > <property><name>fs.defaultFS</name><value>viewfs:///</value></property> > <property><name>fs.viewfs.mounttable.default.link./nn1</name><value>hdfs://ns1/</value></property> > <property><name>fs.viewfs.mounttable.default.link./user</name><value>hdfs://host-72:8020/</value></property> > {quote} > Try to list the files recursively *(hdfs dfs -ls -R / or hadoop fs -ls -R /)* > only the parent folders are listed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8574) When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
[ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S resolved HDFS-8574. --- Resolution: Not A Problem > When block count for a volume exceeds dfs.blockreport.split.threshold, block > report causes exception > > > Key: HDFS-8574 > URL: https://issues.apache.org/jira/browse/HDFS-8574 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > > This piece of code in > {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} > {code} > // Send one block report per message. > for (int r = 0; r < reports.length; r++) { > StorageBlockReport singleReport[] = { reports[r] }; > DatanodeCommand cmd = bpNamenode.blockReport( > bpRegistration, bpos.getBlockPoolId(), singleReport, > new BlockReportContext(reports.length, r, reportId)); > numReportsSent++; > numRPCs++; > if (cmd != null) { > cmds.add(cmd); > } > {code} > when a single volume contains many blocks, i.e more than the threshold, it is > trying to send the entire blockreport in one RPC, causing exception > {code} > java.lang.IllegalStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message was too > large. May be malicious. Use CodedInputStream.setSizeLimit() to increase > the size limit. > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8574) When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
[ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653196#comment-14653196 ] Ajith S commented on HDFS-8574: --- Closing this issue as per comments > When block count for a volume exceeds dfs.blockreport.split.threshold, block > report causes exception > > > Key: HDFS-8574 > URL: https://issues.apache.org/jira/browse/HDFS-8574 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > > This piece of code in > {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} > {code} > // Send one block report per message. > for (int r = 0; r < reports.length; r++) { > StorageBlockReport singleReport[] = { reports[r] }; > DatanodeCommand cmd = bpNamenode.blockReport( > bpRegistration, bpos.getBlockPoolId(), singleReport, > new BlockReportContext(reports.length, r, reportId)); > numReportsSent++; > numRPCs++; > if (cmd != null) { > cmds.add(cmd); > } > {code} > when a single volume contains many blocks, i.e more than the threshold, it is > trying to send the entire blockreport in one RPC, causing exception > {code} > java.lang.IllegalStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message was too > large. May be malicious. Use CodedInputStream.setSizeLimit() to increase > the size limit. 
> at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653150#comment-14653150 ] Ajith S commented on HDFS-8693: --- Hi [~john.jian.fang] and [~kihwal], agreed, refreshNameNodes needs to be fixed. In refreshNNList, can we just add an actor for the new NN and replace the old NN's actor in the block pool service? I would like to work on this issue :) > refreshNamenodes does not support adding a new standby to a running DN > -- > > Key: HDFS-8693 > URL: https://issues.apache.org/jira/browse/HDFS-8693 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ha >Affects Versions: 2.6.0 >Reporter: Jian Fang >Priority: Critical > > I tried to run the following command on a Hadoop 2.6.0 cluster with HA > support > $ hdfs dfsadmin -refreshNamenodes datanode-host:port > to refresh name nodes on data nodes after I replaced one name node with a new > one so that I don't need to restart the data nodes. However, I got the > following error: > refreshNamenodes: HA does not currently support adding a new standby to a > running DN. Please do a rolling restart of DNs to reconfigure the list of NNs. > I checked the 2.6.0 code and the error was thrown by the following code > snippet, which led me to this JIRA. > void refreshNNList(ArrayList<InetSocketAddress> addrs) throws IOException { > Set<InetSocketAddress> oldAddrs = Sets.newHashSet(); > for (BPServiceActor actor : bpServices) > { oldAddrs.add(actor.getNNSocketAddress()); } > Set<InetSocketAddress> newAddrs = Sets.newHashSet(addrs); > if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty()) > { // Keep things simple for now -- we can implement this at a later date. > throw new IOException( "HA does not currently support adding a new standby to > a running DN. " + "Please do a rolling restart of DNs to reconfigure the list > of NNs."); } > } > Looks like this the refreshNameNodes command is an uncompleted feature. 
> Unfortunately, the new name node on a replacement is critical for auto > provisioning a hadoop cluster with HDFS HA support. Without this support, the > HA feature could not really be used. I also observed that the new standby > name node on the replacement instance could stuck in safe mode because no > data nodes check in with it. Even with a rolling restart, it may take quite > some time to restart all data nodes if we have a big cluster, for example, > with 4000 data nodes, let alone restarting DN is way too intrusive and it is > not a preferable operation in production. It also increases the chance for a > double failure because the standby name node is not really ready for a > failover in the case that the current active name node fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
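The idea in the comment above can be sketched as pure set logic, outside Hadoop (the class and method names here are hypothetical, not the actual BPOfferService API): instead of rejecting any symmetric difference between the old and new NN address sets, compute which addresses were added and which were removed, so actors could be started and stopped accordingly.

```java
import java.util.HashSet;
import java.util.Set;

public class NNListDiff {
    // Addresses present in newAddrs but not oldAddrs: NNs that need a new actor.
    static Set<String> added(Set<String> oldAddrs, Set<String> newAddrs) {
        Set<String> result = new HashSet<>(newAddrs);
        result.removeAll(oldAddrs);
        return result;
    }

    // Addresses present in oldAddrs but not newAddrs: actors that should be stopped.
    static Set<String> removed(Set<String> oldAddrs, Set<String> newAddrs) {
        Set<String> result = new HashSet<>(oldAddrs);
        result.removeAll(newAddrs);
        return result;
    }

    public static void main(String[] args) {
        Set<String> oldAddrs = Set.of("nn1:8020", "nn2:8020");
        Set<String> newAddrs = Set.of("nn1:8020", "nn3:8020");
        System.out.println("start actors for: " + added(oldAddrs, newAddrs));
        System.out.println("stop actors for:  " + removed(oldAddrs, newAddrs));
    }
}
```

The symmetric difference the current code rejects is exactly the union of these two sets; handling each half separately is what would let a running DN pick up a replacement NN.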
[jira] [Assigned] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S reassigned HDFS-8693: - Assignee: Ajith S > refreshNamenodes does not support adding a new standby to a running DN > -- > > Key: HDFS-8693 > URL: https://issues.apache.org/jira/browse/HDFS-8693 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ha >Affects Versions: 2.6.0 >Reporter: Jian Fang >Assignee: Ajith S >Priority: Critical > > I tried to run the following command on a Hadoop 2.6.0 cluster with HA > support > $ hdfs dfsadmin -refreshNamenodes datanode-host:port > to refresh name nodes on data nodes after I replaced one name node with a new > one so that I don't need to restart the data nodes. However, I got the > following error: > refreshNamenodes: HA does not currently support adding a new standby to a > running DN. Please do a rolling restart of DNs to reconfigure the list of NNs. > I checked the 2.6.0 code and the error was thrown by the following code > snippet, which led me to this JIRA. > void refreshNNList(ArrayList addrs) throws IOException { > Set oldAddrs = Sets.newHashSet(); > for (BPServiceActor actor : bpServices) > { oldAddrs.add(actor.getNNSocketAddress()); } > Set newAddrs = Sets.newHashSet(addrs); > if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty()) > { // Keep things simple for now -- we can implement this at a later date. > throw new IOException( "HA does not currently support adding a new standby to > a running DN. " + "Please do a rolling restart of DNs to reconfigure the list > of NNs."); } > } > Looks like this the refreshNameNodes command is an uncompleted feature. > Unfortunately, the new name node on a replacement is critical for auto > provisioning a hadoop cluster with HDFS HA support. Without this support, the > HA feature could not really be used. 
I also observed that the new standby > name node on the replacement instance could stuck in safe mode because no > data nodes check in with it. Even with a rolling restart, it may take quite > some time to restart all data nodes if we have a big cluster, for example, > with 4000 data nodes, let alone restarting DN is way too intrusive and it is > not a preferable operation in production. It also increases the chance for a > double failure because the standby name node is not really ready for a > failover in the case that the current active name node fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
[ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653147#comment-14653147 ] Ajith S commented on HDFS-8693: --- Hi [~kihwal], I tested with a federated HA cluster: when adding a new nameservice, the command works. Is there a specific scenario in which you found it doesn't work for a federated HA cluster? > refreshNamenodes does not support adding a new standby to a running DN > -- > > Key: HDFS-8693 > URL: https://issues.apache.org/jira/browse/HDFS-8693 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ha >Affects Versions: 2.6.0 >Reporter: Jian Fang >Priority: Critical > > I tried to run the following command on a Hadoop 2.6.0 cluster with HA > support > $ hdfs dfsadmin -refreshNamenodes datanode-host:port > to refresh name nodes on data nodes after I replaced one name node with a new > one so that I don't need to restart the data nodes. However, I got the > following error: > refreshNamenodes: HA does not currently support adding a new standby to a > running DN. Please do a rolling restart of DNs to reconfigure the list of NNs. > I checked the 2.6.0 code and the error was thrown by the following code > snippet, which led me to this JIRA. > void refreshNNList(ArrayList<InetSocketAddress> addrs) throws IOException { > Set<InetSocketAddress> oldAddrs = Sets.newHashSet(); > for (BPServiceActor actor : bpServices) > { oldAddrs.add(actor.getNNSocketAddress()); } > Set<InetSocketAddress> newAddrs = Sets.newHashSet(addrs); > if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty()) > { // Keep things simple for now -- we can implement this at a later date. > throw new IOException( "HA does not currently support adding a new standby to > a running DN. " + "Please do a rolling restart of DNs to reconfigure the list > of NNs."); } > } > Looks like this the refreshNameNodes command is an uncompleted feature. > Unfortunately, the new name node on a replacement is critical for auto > provisioning a hadoop cluster with HDFS HA support. 
Without this support, the > HA feature could not really be used. I also observed that the new standby > name node on the replacement instance could stuck in safe mode because no > data nodes check in with it. Even with a rolling restart, it may take quite > some time to restart all data nodes if we have a big cluster, for example, > with 4000 data nodes, let alone restarting DN is way too intrusive and it is > not a preferable operation in production. It also increases the chance for a > double failure because the standby name node is not really ready for a > failover in the case that the current active name node fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653015#comment-14653015 ] Ajith S commented on HDFS-8808: --- Hi [~ggop], why not bootstrap the standby without that property, and once the bootstrap is complete, add dfs.image.transfer.bandwidthPerSec back before starting the standby? > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Gautam Gopalakrishnan > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8610) if set several dirs which belongs to one disk in "dfs.datanode.data.dir", NN calculate capacity wrong
[ https://issues.apache.org/jira/browse/HDFS-8610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587964#comment-14587964 ] Ajith S commented on HDFS-8610: --- Hi [~brahmareddy], yes, you are right. Please refer to HDFS-8574: when a very high number of small files is present, we encounter {{InvalidProtocolBufferException}}. To avoid that, data.dir was split into subfolders so that each block report would be smaller. It looks like we have hit a dead end for this scenario. > if set several dirs which belongs to one disk in "dfs.datanode.data.dir", NN > calculate capacity wrong > - > > Key: HDFS-8610 > URL: https://issues.apache.org/jira/browse/HDFS-8610 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.0 >Reporter: tongshiquan >Assignee: Ajith S >Priority: Minor > > In my machine, disk info as below: > /dev/sdc1 8.1T 2.0T 5.7T 27% /export2 > /dev/sdd1 8.1T 2.0T 5.7T 27% /export3 > /dev/sde1 8.1T 2.8T 5.0T 36% /export4 > then set "dfs.datanode.data.dir" as below, each disk has 10 dirs: > 
/export2/BigData/hadoop/data/dn,/export2/BigData/hadoop/data/dn1,/export2/BigData/hadoop/data/dn2,/export2/BigData/hadoop/data/dn3,/export2/BigData/hadoop/data/dn4,/export2/BigData/hadoop/data/dn5,/export2/BigData/hadoop/data/dn6,/export2/BigData/hadoop/data/dn7,/export2/BigData/hadoop/data/dn8,/export2/BigData/hadoop/data/dn9,/export2/BigData/hadoop/data/dn10,/export3/BigData/hadoop/data/dn,/export3/BigData/hadoop/data/dn1,/export3/BigData/hadoop/data/dn2,/export3/BigData/hadoop/data/dn3,/export3/BigData/hadoop/data/dn4,/export3/BigData/hadoop/data/dn5,/export3/BigData/hadoop/data/dn6,/export3/BigData/hadoop/data/dn7,/export3/BigData/hadoop/data/dn8,/export3/BigData/hadoop/data/dn9,/export3/BigData/hadoop/data/dn10,/export4/BigData/hadoop/data/dn,/export4/BigData/hadoop/data/dn1,/export4/BigData/hadoop/data/dn2,/export4/BigData/hadoop/data/dn3,/export4/BigData/hadoop/data/dn4,/export4/BigData/hadoop/data/dn5,/export4/BigData/hadoop/data/dn6,/export4/BigData/hadoop/data/dn7,/export4/BigData/hadoop/data/dn8,/export4/BigData/hadoop/data/dn9,/export4/BigData/hadoop/data/dn10 > then NN will think in this DN have 8.1T * 30 = 243 TB, but actually it only > have 24.3TB -- This message was sent by Atlassian JIRA (v6.3.4#6332)
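The miscalculation described above can be illustrated with a small standalone sketch (the names here are hypothetical, not actual DataNode code): if capacity is summed per configured directory, a disk is counted once per data.dir on it; summing per distinct underlying disk gives the correct total.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class CapacityDedup {
    // Naive total: counts a disk's capacity once per data.dir placed on it,
    // which is how the reported 8.1T * 30 = 243TB figure arises.
    static long perDirTotal(Map<String, String> dirToDisk, Map<String, Long> diskCapacity) {
        long total = 0;
        for (String disk : dirToDisk.values()) {
            total += diskCapacity.get(disk);
        }
        return total;
    }

    // Deduplicated total: counts each distinct disk once, regardless of how
    // many configured directories live on it.
    static long perDiskTotal(Map<String, String> dirToDisk, Map<String, Long> diskCapacity) {
        Set<String> disks = new LinkedHashSet<>(dirToDisk.values());
        long total = 0;
        for (String disk : disks) {
            total += diskCapacity.get(disk);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two dirs on sdc1, one dir on sdd1 (toy capacities).
        Map<String, String> dirToDisk = Map.of(
            "/export2/dn", "sdc1", "/export2/dn1", "sdc1", "/export3/dn", "sdd1");
        Map<String, Long> diskCapacity = Map.of("sdc1", 100L, "sdd1", 50L);
        System.out.println("per-dir total:  " + perDirTotal(dirToDisk, diskCapacity));
        System.out.println("per-disk total: " + perDiskTotal(dirToDisk, diskCapacity));
    }
}
```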
[jira] [Commented] (HDFS-8574) When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
[ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587324#comment-14587324 ] Ajith S commented on HDFS-8574: --- Hi [~arpitagarwal], thanks for the input. Yes, you are right, HDFS was not designed for tiny blocks. My scenario was as follows: I wanted to test the NameNode's limits, so I inserted 10 million files of ~10KB each (10KB because I had a smaller disk). My DN had one {{data.dir}} directory when I faced this exception, but when I increased the number of {{data.dir}} directories to 5, the issue was resolved. Later I checked and came across this piece of code, where one block report is sent per DN volume. My question is: if we check for overflow based on the number of blocks, why do we split based on storage reports? A single report may still overflow the given {{dfs.blockreport.split.threshold}} limit. Please correct me if I am wrong. > When block count for a volume exceeds dfs.blockreport.split.threshold, block > report causes exception > > > Key: HDFS-8574 > URL: https://issues.apache.org/jira/browse/HDFS-8574 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > > This piece of code in > {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} > {code} > // Send one block report per message. > for (int r = 0; r < reports.length; r++) { > StorageBlockReport singleReport[] = { reports[r] }; > DatanodeCommand cmd = bpNamenode.blockReport( > bpRegistration, bpos.getBlockPoolId(), singleReport, > new BlockReportContext(reports.length, r, reportId)); > numReportsSent++; > numRPCs++; > if (cmd != null) { > cmds.add(cmd); > } > {code} > when a single volume contains many blocks, i.e more than the threshold, it is > trying to send the entire blockreport in one RPC, causing exception > {code} > java.lang.IllegalStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message was too > large. May be malicious. 
Use CodedInputStream.setSizeLimit() to increase > the size limit. > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
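As a stop-gap for the protobuf size limit in the stack trace above, Hadoop exposes the {{ipc.maximum.data.length}} property in core-site.xml to raise the maximum RPC message size (the default is 64MB, i.e. 67108864, to the best of my knowledge; verify against your version's core-default.xml). This only masks oversized block reports rather than fixing the splitting logic:

```xml
<!-- Workaround sketch: raise the RPC message size cap from the 64MB default.
     This hides oversized block reports; splitting them is the real fix. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
```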
[jira] [Commented] (HDFS-8574) When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
[ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581677#comment-14581677 ] Ajith S commented on HDFS-8574: --- Updated the issue based on your comment > When block count for a volume exceeds dfs.blockreport.split.threshold, block > report causes exception > > > Key: HDFS-8574 > URL: https://issues.apache.org/jira/browse/HDFS-8574 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > > This piece of code in > {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} > {code} > // Send one block report per message. > for (int r = 0; r < reports.length; r++) { > StorageBlockReport singleReport[] = { reports[r] }; > DatanodeCommand cmd = bpNamenode.blockReport( > bpRegistration, bpos.getBlockPoolId(), singleReport, > new BlockReportContext(reports.length, r, reportId)); > numReportsSent++; > numRPCs++; > if (cmd != null) { > cmds.add(cmd); > } > {code} > when a single volume contains many blocks, i.e more than the threshold, it is > trying to send the entire blockreport in one RPC, causing exception > {code} > java.lang.IllegalStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message was too > large. May be malicious. Use CodedInputStream.setSizeLimit() to increase > the size limit. 
> at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8574) When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception
[ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8574: -- Description: This piece of code in {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} {code} // Send one block report per message. for (int r = 0; r < reports.length; r++) { StorageBlockReport singleReport[] = { reports[r] }; DatanodeCommand cmd = bpNamenode.blockReport( bpRegistration, bpos.getBlockPoolId(), singleReport, new BlockReportContext(reports.length, r, reportId)); numReportsSent++; numRPCs++; if (cmd != null) { cmds.add(cmd); } {code} when a single volume contains many blocks, i.e more than the threshold, it is trying to send the entire blockreport in one RPC, causing exception {code} java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369) at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347) at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325) at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473) {code} was: This piece of code in {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} {code} // Send one block report per message. 
for (int r = 0; r < reports.length; r++) { StorageBlockReport singleReport[] = { reports[r] }; DatanodeCommand cmd = bpNamenode.blockReport( bpRegistration, bpos.getBlockPoolId(), singleReport, new BlockReportContext(reports.length, r, reportId)); numReportsSent++; numRPCs++; if (cmd != null) { cmds.add(cmd); } {code} is creating many cmds in case the block count exceeds the {{dfs.blockreport.split.threshold}} limit. A better way for this will be spliting the block reports in equal number of buckets of size {{dfs.blockreport.split.threshold}} therefore reducing the number of RPCs in block reporting Summary: When block count for a volume exceeds dfs.blockreport.split.threshold, block report causes exception (was: When block count exceeds dfs.blockreport.split.threshold, the block report are sent in one per message) > When block count for a volume exceeds dfs.blockreport.split.threshold, block > report causes exception > > > Key: HDFS-8574 > URL: https://issues.apache.org/jira/browse/HDFS-8574 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > > This piece of code in > {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} > {code} > // Send one block report per message. > for (int r = 0; r < reports.length; r++) { > StorageBlockReport singleReport[] = { reports[r] }; > DatanodeCommand cmd = bpNamenode.blockReport( > bpRegistration, bpos.getBlockPoolId(), singleReport, > new BlockReportContext(reports.length, r, reportId)); > numReportsSent++; > numRPCs++; > if (cmd != null) { > cmds.add(cmd); > } > {code} > when a single volume contains many blocks, i.e more than the threshold, it is > trying to send the entire blockreport in one RPC, causing exception > {code} > java.lang.IllegalStateException: > com.google.protobuf.InvalidProtocolBufferException: Protocol message was too > large. May be malicious. Use CodedInputStream.setSizeLimit() to increase > the size limit. 
> at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347) > at > org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8574) When block count exceeds dfs.blockreport.split.threshold, the block report are sent in one per message
[ https://issues.apache.org/jira/browse/HDFS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581674#comment-14581674 ] Ajith S commented on HDFS-8574: --- Hi Walter, thanks for that info. You are right, the number of RPCs equals the number of volumes. But in my scenario, there is one volume that contains far more files than {{dfs.blockreport.split.threshold}} (maybe 10 times as many). So the previous loop has created one report with the entire block list of that volume {code} for (Map.Entry<DatanodeStorage, BlockListAsLongs> kvPair : perVolumeBlockLists.entrySet()) { BlockListAsLongs blockList = kvPair.getValue(); reports[i++] = new StorageBlockReport(kvPair.getKey(), blockList); totalBlockCount += blockList.getNumberOfBlocks(); } {code} So next, when it tries to send this block report to the NN, it receives {code} java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:369) at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:347) at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:325) at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:473) {code} So maybe we can redesign this so that multiple block reports can be sent per volume? What do you suggest? 
> When block count exceeds dfs.blockreport.split.threshold, the block report > are sent in one per message > -- > > Key: HDFS-8574 > URL: https://issues.apache.org/jira/browse/HDFS-8574 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > > This piece of code in > {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} > {code} > // Send one block report per message. > for (int r = 0; r < reports.length; r++) { > StorageBlockReport singleReport[] = { reports[r] }; > DatanodeCommand cmd = bpNamenode.blockReport( > bpRegistration, bpos.getBlockPoolId(), singleReport, > new BlockReportContext(reports.length, r, reportId)); > numReportsSent++; > numRPCs++; > if (cmd != null) { > cmds.add(cmd); > } > {code} > is creating many cmds in case the block count exceeds the > {{dfs.blockreport.split.threshold}} limit. A better way for this will be > spliting the block reports in equal number of buckets of size > {{dfs.blockreport.split.threshold}} therefore reducing the number of RPCs in > block reporting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8574) When block count exceeds dfs.blockreport.split.threshold, the block report are sent in one per message
Ajith S created HDFS-8574: - Summary: When block count exceeds dfs.blockreport.split.threshold, the block report are sent in one per message Key: HDFS-8574 URL: https://issues.apache.org/jira/browse/HDFS-8574 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ajith S Assignee: Ajith S This piece of code in {{org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport()}} {code} // Send one block report per message. for (int r = 0; r < reports.length; r++) { StorageBlockReport singleReport[] = { reports[r] }; DatanodeCommand cmd = bpNamenode.blockReport( bpRegistration, bpos.getBlockPoolId(), singleReport, new BlockReportContext(reports.length, r, reportId)); numReportsSent++; numRPCs++; if (cmd != null) { cmds.add(cmd); } {code} is creating many cmds in case the block count exceeds the {{dfs.blockreport.split.threshold}} limit. A better way for this will be spliting the block reports in equal number of buckets of size {{dfs.blockreport.split.threshold}} therefore reducing the number of RPCs in block reporting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
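The bucketing proposal in the description above can be sketched as pure logic, outside of Hadoop (class and method names are illustrative): split a volume's total block count into chunks no larger than the threshold, so that each chunk fits in its own block-report RPC.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockReportSplitter {
    // Splits totalBlocks into bucket sizes of at most `threshold` blocks each,
    // so each bucket could be sent as a separate block-report RPC instead of
    // one oversized report for the whole volume.
    static List<Integer> bucketSizes(int totalBlocks, int threshold) {
        List<Integer> buckets = new ArrayList<>();
        for (int remaining = totalBlocks; remaining > 0; remaining -= threshold) {
            buckets.add(Math.min(remaining, threshold));
        }
        return buckets;
    }

    public static void main(String[] args) {
        // e.g. 2.5M blocks on one volume with a 1M threshold -> three RPCs
        System.out.println(bucketSizes(2_500_000, 1_000_000));
    }
}
```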
[jira] [Updated] (HDFS-8413) Directories are not listed recursively when fs.defaultFs is viewFs
[ https://issues.apache.org/jira/browse/HDFS-8413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8413: -- Status: Patch Available (was: Open) *Resolution*: decide whether the mount point is a directory by querying the target filesystem. Submitting the patch for the same. Please review > Directories are not listed recursively when fs.defaultFs is viewFs > -- > > Key: HDFS-8413 > URL: https://issues.apache.org/jira/browse/HDFS-8413 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > Labels: viewfs > Attachments: HDFS-8413.patch > > > Mount a cluster on the client through the viewFs mount table > Example: > {quote} > <property> > <name>fs.defaultFS</name> > <value>viewfs:///</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./nn1</name> > <value>hdfs://ns1/</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./user</name> > <value>hdfs://host-72:8020/</value> > </property> > {quote} > Try to list the files recursively *(hdfs dfs -ls -R / or hadoop fs -ls -R /)*; > only the parent folders are listed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8413) Directories are not listed recursively when fs.defaultFs is viewFs
[ https://issues.apache.org/jira/browse/HDFS-8413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8413: -- Attachment: HDFS-8413.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8413) Directories are not listed recursively when fs.defaultFs is viewFs
[ https://issues.apache.org/jira/browse/HDFS-8413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547829#comment-14547829 ] Ajith S commented on HDFS-8413: --- I think the problem is in {quote} FileStatus[] org.apache.hadoop.fs.viewfs.ViewFileSystem.InternalDirOfViewFs.listStatus(Path f) {quote} even if an inode is an instance of INodeLink, it can still be a directory, i.e. the mount points in viewFs on the client can be treated as symbolic-link directories. Please correct me if I am wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
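The comment above can be made concrete with a toy model. Everything here is hypothetical (a MountEntry stand-in rather than the real INodeLink/FileStatus types): the point is only that a listing which consults the link target can report directory mount points as directories, so a recursive listing descends into them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ViewFsListingSketch {

    /** Toy mount-table entry: link name plus whether its target is a directory. */
    static class MountEntry {
        final String name;
        final boolean targetIsDir;
        MountEntry(String name, boolean targetIsDir) {
            this.name = name;
            this.targetIsDir = targetIsDir;
        }
    }

    /**
     * Returns one type code per mount point: "l" (symlink) or "d" (directory).
     * With queryTarget == false every link is a symlink, which is where a
     * recursive listing stops; with queryTarget == true, links whose target
     * is a directory are listed as directories instead.
     */
    public static List<String> listTypes(List<MountEntry> mounts, boolean queryTarget) {
        List<String> types = new ArrayList<>();
        for (MountEntry e : mounts) {
            types.add(queryTarget && e.targetIsDir ? "d" : "l");
        }
        return types;
    }

    public static void main(String[] args) {
        List<MountEntry> mounts = Arrays.asList(
                new MountEntry("/nn1", true), new MountEntry("/user", true));
        System.out.println(listTypes(mounts, false)); // all symlinks: recursion stops
        System.out.println(listTypes(mounts, true));  // directories: recursion descends
    }
}
```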
[jira] [Updated] (HDFS-8413) Directories are not listed recursively when fs.defaultFs is viewFs
[ https://issues.apache.org/jira/browse/HDFS-8413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8413: -- Affects Version/s: 2.7.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8413) Directories are not listed recursively when fs.defaultFs is viewFs
Ajith S created HDFS-8413: - Summary: Directories are not listed recursively when fs.defaultFs is viewFs Key: HDFS-8413 URL: https://issues.apache.org/jira/browse/HDFS-8413 Project: Hadoop HDFS Issue Type: Bug Reporter: Ajith S Assignee: Ajith S Mount a cluster on the client through the viewFs mount table Example: {quote} <property> <name>fs.defaultFS</name> <value>viewfs:///</value> </property> <property> <name>fs.viewfs.mounttable.default.link./nn1</name> <value>hdfs://ns1/</value> </property> <property> <name>fs.viewfs.mounttable.default.link./user</name> <value>hdfs://host-72:8020/</value> </property> {quote} Try to list the files recursively *(hdfs dfs -ls -R / or hadoop fs -ls -R /)*; only the parent folders are listed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7998) HDFS Federation : Command mentioned to add a NN to existing federated cluster is wrong
[ https://issues.apache.org/jira/browse/HDFS-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-7998: -- Labels: BB2015-05-RFC (was: BB2015-05-TBR) > HDFS Federation : Command mentioned to add a NN to existing federated cluster > is wrong > --- > > Key: HDFS-7998 > URL: https://issues.apache.org/jira/browse/HDFS-7998 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Ajith S >Assignee: Ajith S >Priority: Minor > Labels: BB2015-05-RFC > Attachments: HDFS-7998.patch > > > HDFS Federation documentation > http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/Federation.html > has the following command to add a namenode to existing cluster > > $HADOOP_PREFIX_HOME/bin/hdfs dfadmin -refreshNameNode > > : > this command is incorrect, actual correct command is > > $HADOOP_PREFIX_HOME/bin/hdfs dfsadmin -refreshNamenodes > > : > need to update the same in documentation -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7998) HDFS Federation : Command mentioned to add a NN to existing federated cluster is wrong
[ https://issues.apache.org/jira/browse/HDFS-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533908#comment-14533908 ] Ajith S commented on HDFS-7998: --- The test failure is not because of the patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8274) NFS configuration nfs.dump.dir not working
[ https://issues.apache.org/jira/browse/HDFS-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533902#comment-14533902 ] Ajith S commented on HDFS-8274: --- The whitespace error is not related to the patch. Test cases are not required > NFS configuration nfs.dump.dir not working > -- > > Key: HDFS-8274 > URL: https://issues.apache.org/jira/browse/HDFS-8274 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > Labels: BB2015-05-RFC > Attachments: HDFS-8274.patch > > > As per the document > http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html > we can configure > {quote} > nfs.dump.dir > {quote} > as the nfs file dump directory, but setting it in *hdfs-site.xml* > doesn't work; when the nfs gateway is started, the default location is used, i.e. > /tmp/.hdfs-nfs > The reason is the key expected in *NfsConfigKeys.java* > {code} > public static final String DFS_NFS_FILE_DUMP_DIR_KEY = "nfs.file.dump.dir"; > {code} > we can change it to *nfs.dump.dir* instead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8274) NFS configuration nfs.dump.dir not working
[ https://issues.apache.org/jira/browse/HDFS-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8274: -- Labels: BB2015-05-RFC (was: BB2015-05-TBR) > NFS configuration nfs.dump.dir not working > -- > > Key: HDFS-8274 > URL: https://issues.apache.org/jira/browse/HDFS-8274 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S > Labels: BB2015-05-RFC > Attachments: HDFS-8274.patch > > > As per the document > http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html > we can configure > {quote} > nfs.dump.dir > {quote} > as nfs file dump directory, but using this configuration in *hdfs-site.xml* > doesn't work and when nfs gateway is started, default location is used i.e > \tmp\.hdfs-nfs > The reason being the key expected in *NfsConfigKeys.java* > {code} > public static final String DFS_NFS_FILE_DUMP_DIR_KEY = "nfs.file.dump.dir"; > {code} > we can change it to *nfs.dump.dir* instead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533898#comment-14533898 ] Ajith S commented on HDFS-8340: --- The whitespace warning is not related to the patch > NFS transfer size configuration should include > nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY) > - > > Key: HDFS-8340 > URL: https://issues.apache.org/jira/browse/HDFS-8340 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation, nfs >Affects Versions: 2.7.0 >Reporter: Ajith S >Assignee: Ajith S >Priority: Minor > Labels: BB2015-05-RFC > Attachments: HDFS-8340.patch > > > According to documentation > http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html > bq. For larger data transfer size, one needs to update “nfs.rtmax” and > “nfs.rtmax” in hdfs-site.xml. > nfs.rtmax is mentioned twice, instead it should be “nfs.rtmax” and > “nfs.wtmax” -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8067) haadmin prints out stale help messages
[ https://issues.apache.org/jira/browse/HDFS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8067: -- Labels: BB2015-05-RFC (was: BB2015-05-TBR) > haadmin prints out stale help messages > -- > > Key: HDFS-8067 > URL: https://issues.apache.org/jira/browse/HDFS-8067 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Ajith S >Assignee: Ajith S >Priority: Minor > Labels: BB2015-05-RFC > Attachments: HDFS-8067-01.patch, HDFS-8067-02.patch > > > Scenario : Setting up multiple nameservices with HA configuration for each > nameservice (manual failover) > After starting the journal nodes and namenodes, both the nodes are in standby > mode. > all the following haadmin commands > *haadmin* >-transitionToActive >-transitionToStandby >-failover >-getServiceState >-checkHealth > failed with exception > _Illegal argument: Unable to determine the nameservice id._ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8340: -- Labels: BB2015-05-RFC (was: BB2015-05-TBR) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8340: -- Component/s: documentation -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8340: -- Labels: BB2015-05-TBR (was: ) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8340: -- Status: Patch Available (was: Open) Fixed as per the analysis. Please review and commit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8340: -- Attachment: HDFS-8340.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8340: -- Component/s: nfs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
[ https://issues.apache.org/jira/browse/HDFS-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8340: -- Labels: (was: nfs) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8340) NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY)
Ajith S created HDFS-8340: - Summary: NFS transfer size configuration should include nfs.wtmax(DFS_NFS_MAX_WRITE_TRANSFER_SIZE_KEY) Key: HDFS-8340 URL: https://issues.apache.org/jira/browse/HDFS-8340 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ajith S Assignee: Ajith S Priority: Minor According to documentation http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html bq. For larger data transfer size, one needs to update “nfs.rtmax” and “nfs.rtmax” in hdfs-site.xml. nfs.rtmax is mentioned twice, instead it should be “nfs.rtmax” and “nfs.wtmax” -- This message was sent by Atlassian JIRA (v6.3.4#6332)
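For reference, once the documentation typo is fixed, an hdfs-site.xml fragment setting both transfer-size keys would look like the following sketch (the 1048576-byte value is purely illustrative, not a recommended setting):

```
<property>
  <name>nfs.rtmax</name>
  <value>1048576</value>
</property>
<property>
  <name>nfs.wtmax</name>
  <value>1048576</value>
</property>
```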
[jira] [Updated] (HDFS-8274) NFS configuration nfs.dump.dir not working
[ https://issues.apache.org/jira/browse/HDFS-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8274: -- Labels: BB2015-05-TBR (was: ) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8274) NFS configuration nfs.dump.dir not working
[ https://issues.apache.org/jira/browse/HDFS-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8274: -- Status: Patch Available (was: Open) Please review the patch. As per the analysis, corrected it to use the expected property key -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8274) NFS configuration nfs.dump.dir not working
[ https://issues.apache.org/jira/browse/HDFS-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8274: -- Attachment: HDFS-8274.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8274) NFS configuration nfs.dump.dir not working
[ https://issues.apache.org/jira/browse/HDFS-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516529#comment-14516529 ] Ajith S commented on HDFS-8274: --- Seems to have been affected by HDFS-6056 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8274) NFS configuration nfs.dump.dir not working
Ajith S created HDFS-8274: - Summary: NFS configuration nfs.dump.dir not working Key: HDFS-8274 URL: https://issues.apache.org/jira/browse/HDFS-8274 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Ajith S Assignee: Ajith S As per the document http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html we can configure {quote} nfs.dump.dir {quote} as the nfs file dump directory, but setting it in *hdfs-site.xml* doesn't work; when the nfs gateway is started, the default location is used, i.e. /tmp/.hdfs-nfs The reason is the key expected in *NfsConfigKeys.java* {code} public static final String DFS_NFS_FILE_DUMP_DIR_KEY = "nfs.file.dump.dir"; {code} we can change it to *nfs.dump.dir* instead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
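The key mismatch is easy to reproduce with plain java.util.Properties standing in for Hadoop's Configuration (a hypothetical sketch; the key names come from the description above, the /data path is made up):

```java
import java.util.Properties;

public class DumpDirLookup {
    // Key the gateway code reads (NfsConfigKeys.DFS_NFS_FILE_DUMP_DIR_KEY):
    static final String CODE_KEY = "nfs.file.dump.dir";
    // Key the documentation tells users to set:
    static final String DOC_KEY = "nfs.dump.dir";
    static final String DEFAULT_DUMP_DIR = "/tmp/.hdfs-nfs";

    /**
     * Looks the dump directory up under the key the code expects; a value
     * set under the documented key is silently ignored, so the default wins.
     */
    public static String dumpDir(Properties conf) {
        return conf.getProperty(CODE_KEY, DEFAULT_DUMP_DIR);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DOC_KEY, "/data/nfs-dump"); // what the docs say to do
        System.out.println(dumpDir(conf));           // falls back to the default
    }
}
```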
[jira] [Commented] (HDFS-8067) haadmin prints out stale help messages
[ https://issues.apache.org/jira/browse/HDFS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486659#comment-14486659 ] Ajith S commented on HDFS-8067: --- Updated with HDFS-8067-02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8067) haadmin prints out stale help messages
[ https://issues.apache.org/jira/browse/HDFS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-8067: -- Attachment: HDFS-8067-02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8067) haadmin prints out stale help messages
[ https://issues.apache.org/jira/browse/HDFS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486657#comment-14486657 ] Ajith S commented on HDFS-8067: --- Yes, I'll update the patch for the help message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8094) Cluster web console (dfsclusterhealth.jsp) is not working
Ajith S created HDFS-8094: - Summary: Cluster web console (dfsclusterhealth.jsp) is not working Key: HDFS-8094 URL: https://issues.apache.org/jira/browse/HDFS-8094 Project: Hadoop HDFS Issue Type: Bug Reporter: Ajith S Assignee: Ajith S According to the documentation, the cluster can be monitored at http:///dfsclusterhealth.jsp Currently, this URL doesn't seem to be working. It seems to have been removed as part of HDFS-6252 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7998) HDFS Federation : Command mentioned to add a NN to existing federated cluster is wrong
[ https://issues.apache.org/jira/browse/HDFS-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-7998: -- Target Version/s: 2.7.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7998) HDFS Federation : Command mentioned to add a NN to existing federated cluster is wrong
[ https://issues.apache.org/jira/browse/HDFS-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-7998: -- Status: Patch Available (was: Open) Submitting the patch. Updated the document with the correct command. Please review the same -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7998) HDFS Federation : Command mentioned to add a NN to existing federated cluster is wrong
[ https://issues.apache.org/jira/browse/HDFS-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-7998: -- Attachment: HDFS-7998.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8067) haadmin commands doesn't work in Federation with HA

[ https://issues.apache.org/jira/browse/HDFS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484769#comment-14484769 ]

Ajith S commented on HDFS-8067:
-------------------------------
Thanks for the issue update.

> haadmin commands doesn't work in Federation with HA
> ---------------------------------------------------
>
>                 Key: HDFS-8067
>                 URL: https://issues.apache.org/jira/browse/HDFS-8067
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>            Reporter: Ajith S
>            Assignee: Ajith S
>            Priority: Blocker
>         Attachments: HDFS-8067-01.patch
>
> Scenario: setting up multiple nameservices with an HA configuration for each
> nameservice (manual failover).
> After starting the journal nodes and namenodes, both namenodes are in standby
> mode, and all of the following haadmin commands:
> *haadmin*
>    -transitionToActive
>    -transitionToStandby
>    -failover
>    -getServiceState
>    -checkHealth
> fail with the exception:
> _Illegal argument: Unable to determine the nameservice id._
[jira] [Updated] (HDFS-8067) haadmin commands doesn't work in Federation with HA

[ https://issues.apache.org/jira/browse/HDFS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajith S updated HDFS-8067:
--------------------------
    Status: Patch Available  (was: Open)

Submitting the patch (reverts the changes in HDFS-7808 and corrects HDFS-7324). Please review.

> haadmin commands doesn't work in Federation with HA
> ---------------------------------------------------
>
>                 Key: HDFS-8067
>                 URL: https://issues.apache.org/jira/browse/HDFS-8067
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>            Reporter: Ajith S
>            Assignee: Ajith S
>            Priority: Blocker
>         Attachments: HDFS-8067-01.patch
>
> Scenario: setting up multiple nameservices with an HA configuration for each
> nameservice (manual failover).
> After starting the journal nodes and namenodes, both namenodes are in standby
> mode, and all of the following haadmin commands:
> *haadmin*
>    -transitionToActive
>    -transitionToStandby
>    -failover
>    -getServiceState
>    -checkHealth
> fail with the exception:
> _Illegal argument: Unable to determine the nameservice id._
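The failure in HDFS-8067 arises because, with multiple federated nameservices configured, the haadmin client cannot infer which nameservice a bare namenode ID belongs to. A minimal sketch of disambiguating the target, assuming the `-ns` option that current Hadoop releases expose on haadmin for this purpose; the `ns1`/`nn1` identifiers are placeholders for values defined in hdfs-site.xml (`dfs.nameservices`, `dfs.ha.namenodes.<nameservice>`):

```shell
# Without a nameservice hint, haadmin cannot choose between federated
# nameservices and fails with:
#   Illegal argument: Unable to determine the nameservice id.
#   hdfs haadmin -getServiceState nn1

# Passing the nameservice ID explicitly resolves the ambiguity
# (ns1 and nn1 are placeholder IDs from the cluster configuration):
hdfs haadmin -ns ns1 -getServiceState nn1
hdfs haadmin -ns ns1 -transitionToActive nn1
```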