Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
The fsimage file size is 1658934155. 2013/8/13 Harsh J ha...@cloudera.com: How large are your checkpointed fsimage files? On Mon, Aug 12, 2013 at 3:42 PM, lei liu liulei...@gmail.com wrote: When the Standby NameNode is doing a checkpoint and uploads the image file to the Active NameNode, the Active NameNode becomes very slow. What is the reason the Active NameNode is slow? Thanks, LiuLei -- Harsh J
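For reference, the size of the most recent checkpointed image can be confirmed directly on the NameNode's disk; a minimal sketch, assuming a Hadoop 2.x HA layout where images are named fsimage_<txid>, and using a hypothetical metadata directory (substitute the value of dfs.namenode.name.dir from hdfs-site.xml):

# List the checkpointed fsimage files and their sizes on the NameNode host
ls -lh /data/dfs/nn/current/fsimage_*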
Re: Jobtracker page hangs ..again.
Thanks Harsh, appreciate your input, as always. On Aug 12, 2013, at 20:01, Harsh J ha...@cloudera.com wrote: If you're not already doing it, run a local name caching daemon (such as nscd) on each cluster node. Hadoop does a lot of lookups and a local cache would do good in reducing the load on your DNS. On Tue, Aug 13, 2013 at 3:09 AM, Patai Sangbutsarakum silvianhad...@gmail.com wrote: Update: after adjusting the network routing, DNS query speed is in microseconds, as it is supposed to be. The issue is completely solved. The Jobtracker page doesn't hang anymore when launching a 100k-mapper job. Cheers, On Mon, Aug 12, 2013 at 1:29 PM, Patai Sangbutsarakum silvianhad...@gmail.com wrote: Ok, after some sweat, I think I found the root cause but still need another team to help me fix it. It lies in the DNS. Somehow, for each of the tip:task lines, I saw (through tcpdump) a DNS query to the DNS server. The timestamps seem to match.
2013-08-11 20:39:23,493 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201308111631_0006_m_00 has split on node:/rack1/host1 127 ms
2013-08-11 20:39:23,620 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201308111631_0006_m_00 has split on node:/rack1/host2 126 ms
2013-08-11 20:39:23,746 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201308111631_0006_m_00 has split on node:/rack2/host3
20:39:23.367337 IP jtk.53110 dns1.domain: 41717+ A? host1. (37)
20:39:23.367345 IP jtk.53110 dns1.domain: 7221+ ? host1. (37)
20:39:23.493486 IP dns1.domain jtk.53110: 7221* 0/1/0 (89)
20:39:23.493505 IP dns1.domain : jtk.41717* 1/4/2 A xx.xx.xx.xx (189)
20:39:23.493766 IP jtk.48042 dns1.domain: 35450+ A? host2. (37)
20:39:23.493774 IP jtk.48042 dns1.domain: 56007+ ? host2. (37)
20:39:23.619903 IP dns1.domain jtk.48042: 35450* 1/4/2 A yy.yy.yy.yy (189)
20:39:23.619921 IP dns1.domain jtk.48042: 56007* 0/1/0 (89)
20:39:23.620208 IP jtk.41237 dns2.domain: 49511+ A? host3. (37)
20:39:23.620215 IP jtk.41237 dns2.domain: 29199+ ? host3. (37)
20:39:23.746358 IP dns2.domain jtk.41237: 49511* 1/4/2 A zz.zz.zz.zz (189)
I looked at the jobtracker log in the other datacenter when the same query was submitted; the timestamps in the tip:task lines advance much, much faster. The question that arises here is: does job initialization really query the DNS, and if so, is there any way to suppress that? The topology file is already in place, with names and IPs already there. Hope this makes sense. Patai On Fri, Aug 9, 2013 at 6:57 PM, Patai Sangbutsarakum silvianhad...@gmail.com wrote: Appreciate your input Bryan, I will try to reproduce and see the namenode log before, while, and after it pauses. Wish me luck. On Fri, Aug 9, 2013 at 2:09 PM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: When I've had problems with a slow jobtracker, I've found the issue to be one of the following two (so far) possibilities: a long GC pause (I'm guessing this is not it based on your email), or HDFS is slow. I haven't dived into the code yet, but circumstantially I've found that when you submit a job the jobtracker needs to put a bunch of files in HDFS, such as the job.xml, the job jar, etc. I'm not sure how this scales with larger and larger jobs, aside from the size of the splits serialization in the job.xml, but if your HDFS is slow for any reason it can cause pauses in your jobtracker. This affects other jobs being able to submit, as well as the 50030 web UI. I'd take a look at your namenode logs. When the jobtracker logs pause, do you see a corresponding pause in the namenode logs? What gets spewed before and after that pause?
On Fri, Aug 9, 2013 at 4:41 PM, Patai Sangbutsarakum silvianhad...@gmail.com wrote: A while back, I was fighting with jobtracker page hangs: when I browse to http://jobtracker:50030 the browser doesn't show job info as usual, which turned out to be caused by allowing too much job history to be kept in the jobtracker. Currently, I am setting up a new cluster with 40g heap on the namenode and jobtracker, on dedicated machines. The not-fun part starts here: a developer tried to test out the cluster by launching a 76k-map job (the cluster has around 6k-ish mappers). Job initialization was successful, and the job finished. However, before the job actually starts running, I can't access the jobtracker page anymore, same symptom as above. I see a bunch of this in the jobtracker log:
2013-08-08 00:23:00,509 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201307291733_0619_m_076796 has split on node: /rack/node
.. .. ..
until I see this:
INFO org.apache.hadoop.mapred.JobInProgress: job_201307291733_0619 LOCALITY_WAIT_FACTOR=1.0
2013-08-08 00:23:00,509 INFO org.apache.hadoop.mapred.JobInProgress: Job job_201307291733_0619 initialized successfully with 76797 map tasks and 10 reduce tasks.
That's when I can access the jobtracker page again.
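The local name-caching daemon suggested earlier in this thread can be checked and enabled with standard OS tooling; a minimal sketch, assuming a Red Hat-style node (the package/service names and the hostname used for the timing check are illustrative and may differ on your distribution):

# Measure how long a forward lookup of a worker hostname takes from the jobtracker host
time getent hosts host1

# Install and enable the name service cache daemon so repeated lookups are served locally
sudo yum install -y nscd
sudo service nscd start
sudo chkconfig nscd on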
Re: Exceptions in Name node and Data node logs
Sorry for not giving version details I am using Hadoop version - 1.1.2 and Hbase version - 0.94.7 On Tue, Aug 13, 2013 at 1:53 PM, Vimal Jain vkj...@gmail.com wrote: Hi, I have configured Hadoop and Hbase in pseudo distributed mode. So far things were working fine , but suddenly i started receiving some exceptions in my namenode and datanode log files. It keeps repeating and thus fills up my disk space. Please help here. *Exception in data node :-* 2013-07-31 19:39:51,094 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Got blockRec eived message from unregistered or dead node blk_-4787262105551508952_28369 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) at org.apache.hadoop.ipc.Client.call(Client.java:1107) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) at sun.proxy.$Proxy5.blockReceived(Unknown Source) at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1006) at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527) at java.lang.Thread.run(Thread.java:662) *Exception in name node :- * 2013-07-31 19:39:50,671 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.blockReceived: blk_-4787262105551508952_28369 is received from dead or unregistered node 192.168.20.30:50010 2013-07-31 19:39:50,671 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: Got blo ckReceived message from unregistered or dead node blk_-4787262105551508952_28369 2013-07-31 19:39:50,671 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call blockReceived(DatanodeRegistration( 192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@64f2d559, [Ljava.l ang.String;@294f9d6) from 192.168.20.30:59764: error: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551 508952_28369 java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: Exceptions in Name node and Data node logs
Along with these exceptions i am seeing some exceptions in hbase logs too. Here it is : *Exception in Master log :* 2013-07-31 15:51:04,694 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1266874891120ms instead of 1ms, this is likely due to a long garbage c ollecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2013-07-31 15:51:04,798 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Failed scan of catalog table org.apache.hadoop.hbase.client.ScannerTimeoutException: 82253ms passed since the last invocation, timeout is currently set to 6 at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:283) at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:727) at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:184) at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:169) at org.apache.hadoop.hbase.master.CatalogJanitor.getSplitParents(CatalogJanitor.java:123) at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:134) at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:92) at org.apache.hadoop.hbase.Chore.run(Chore.java:67) at java.lang.Thread.run(Thread.java:662) Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: -8839286818925700393 at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2544) at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:143) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:42) at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:164) at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:274) ... 8 more 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. 
starting at key '' 2013-07-31 15:54:42,526 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Advancing internal scanner to startKey at '' 2013-07-31 15:54:42,531 DEBUG org.apache.hadoop.hbase.client.ClientScanner: Finished with scanning at {NAME = '.META.,,1', STARTKEY = '', ENDKEY = '', ENCODED = 1028785192,} 2013-07-31 15:54:42,532 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 5 catalog row(s) and gc'd 0 unreferenced parent region(s) 2013-07-31 15:54:42,751 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=1 regions=5 averag e=5.0 mostloaded=5 leastloaded=5 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 25771ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2013-07-31 16:43:23,358 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30091ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28613ms instead of 1ms, this is likely due to a long garbage collectin g pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2013-07-31 16:43:23,361 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 27457ms instead of 1ms, this is likely due to a long garbage collectin g pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2013-07-31 16:43:23,362 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 34587ms instead of 1ms, this is likely due to a long garbage collectin g pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2013-07-31 16:43:23,367 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 78600ms instead of 6ms, this is likely due to a long garbage collectin g pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired 2013-07-31 16:43:23,369 WARN
Re: Exceptions in Name node and Data node logs
Hi, One of your DN is marked as dead because NN is not able to get heartbeat message from DN but NN still getting block information from dead node. This error is similar to a bug *HDFS-1250* reported 2 years back and fixed in 0.20 release. Can you please check the status of DN's in cluster. #bin/hadoop dfsadmin -report Thanks On Tue, Aug 13, 2013 at 1:53 PM, Vimal Jain vkj...@gmail.com wrote: Hi, I have configured Hadoop and Hbase in pseudo distributed mode. So far things were working fine , but suddenly i started receiving some exceptions in my namenode and datanode log files. It keeps repeating and thus fills up my disk space. Please help here. *Exception in data node :-* 2013-07-31 19:39:51,094 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Got blockRec eived message from unregistered or dead node blk_-4787262105551508952_28369 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) at org.apache.hadoop.ipc.Client.call(Client.java:1107) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) at sun.proxy.$Proxy5.blockReceived(Unknown Source) at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1006) at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527) at java.lang.Thread.run(Thread.java:662) *Exception in name node :- * 2013-07-31 19:39:50,671 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.blockReceived: blk_-4787262105551508952_28369 is received from dead or unregistered node 192.168.20.30:50010 2013-07-31 19:39:50,671 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: Got blo ckReceived message from unregistered or dead node blk_-4787262105551508952_28369 2013-07-31 19:39:50,671 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call blockReceived(DatanodeRegistration( 192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@64f2d559, [Ljava.l ang.String;@294f9d6) from 192.168.20.30:59764: error: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551 508952_28369 java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at 
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) -- Thanks and Regards, Vimal Jain
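To act on the suggestion above, the DataNode's registration state can be checked and, if it shows as dead while the process is still up, the daemon restarted so it re-registers with the NameNode; a minimal sketch for an Apache Hadoop 1.x layout (paths are relative to the Hadoop install directory, and restarting is only a stopgap until the underlying cause is understood):

# Ask the NameNode whether it currently lists the DataNode as live or dead
bin/hadoop dfsadmin -report

# Confirm the DataNode JVM is actually running on the host
jps | grep -i datanode

# Restart the DataNode so it re-registers and heartbeats with the NameNode
bin/hadoop-daemon.sh stop datanode
bin/hadoop-daemon.sh start datanode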
Re: Exceptions in Name node and Data node logs
Hi Jitendra, Thanks for your reply. Currently my hadoop/hbase is down in production as it had filled up the disk space with above exceptions in log files and had to be brought down. Also i am using hadoop/hbase in pseudo distributed mode , so there is only one node which hosts all 6 processes ( 3 from hadoop and 3 from hbase). On Tue, Aug 13, 2013 at 2:50 PM, Jitendra Yadav jeetuyadav200...@gmail.comwrote: Hi, One of your DN is marked as dead because NN is not able to get heartbeat message from DN but NN still getting block information from dead node. This error is similar to a bug *HDFS-1250* reported 2 years back and fixed in 0.20 release. Can you please check the status of DN's in cluster. #bin/hadoop dfsadmin -report Thanks On Tue, Aug 13, 2013 at 1:53 PM, Vimal Jain vkj...@gmail.com wrote: Hi, I have configured Hadoop and Hbase in pseudo distributed mode. So far things were working fine , but suddenly i started receiving some exceptions in my namenode and datanode log files. It keeps repeating and thus fills up my disk space. Please help here. *Exception in data node :-* 2013-07-31 19:39:51,094 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Got blockRec eived message from unregistered or dead node blk_-4787262105551508952_28369 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) at org.apache.hadoop.ipc.Client.call(Client.java:1107) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) at sun.proxy.$Proxy5.blockReceived(Unknown Source) at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1006) at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527) at java.lang.Thread.run(Thread.java:662) *Exception in name node :- * 2013-07-31 19:39:50,671 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.blockReceived: blk_-4787262105551508952_28369 is received from dead or unregistered node 192.168.20.30:50010 2013-07-31 19:39:50,671 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: Got blo ckReceived message from unregistered or dead node blk_-4787262105551508952_28369 2013-07-31 19:39:50,671 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call blockReceived(DatanodeRegistration( 192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@64f2d559, [Ljava.l ang.String;@294f9d6) from 192.168.20.30:59764: error: java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551 508952_28369 java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
I write one programm to test NameNode performance. Please see the EditLogPerformance.java I use 60 threads to execute the EditLogPerformance.javacode, the testing result is below content: 2013-08-13 17:43:01,479 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055 2013-08-13 17:43:11,482 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725 2013-08-13 17:43:21,484 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2 2013-08-13 17:43:31,487 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6 2013-08-13 17:43:41,490 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6 2013-08-13 17:43:51,493 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53 2013-08-13 17:44:01,496 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:1040 speed:676 2013-08-13 17:44:11,499 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149 2013-08-13 17:44:21,502 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997 2013-08-13 17:44:31,505 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072 2013-08-13 17:44:41,508 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008 2013-08-13 17:44:51,511 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994 2013-08-13 17:45:01,514 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071 2013-08-13 17:45:11,517 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090 2013-08-13 17:45:21,520 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170 2013-08-13 17:45:31,523 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059 2013-08-13 17:45:41,526 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076 2013-08-13 17:45:51,529 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021 2013-08-13 17:46:01,532 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:1064 speed:1067 2013-08-13 17:46:11,534 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066 2013-08-13 17:46:21,537 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093 2013-08-13 17:46:31,540 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023 2013-08-13 17:46:41,543 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971 2013-08-13 17:46:51,546 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041 2013-08-13 17:47:01,549 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031 2013-08-13 17:47:11,552 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079 2013-08-13 17:47:21,555 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069 2013-08-13 17:47:31,558 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009 2013-08-13 17:47:41,561 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - 
totalCount:10885046 speed:978 2013-08-13 17:47:51,564 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028 2013-08-13 17:48:01,567 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062 2013-08-13 17:48:11,570 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979 2013-08-13 17:48:21,573 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005 2013-08-13 17:48:31,576 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012 2013-08-13 17:48:41,579 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034 2013-08-13 17:48:51,581 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045 2013-08-13 17:49:01,584 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11049312 speed:1045 2013-08-13 17:49:11,587 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11070396 speed:1054 2013-08-13 17:49:21,590 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11087408 speed:850 2013-08-13 17:49:31,593 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11087418 speed:0 2013-08-13 17:49:41,596 INFO
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Perhaps turning on fsimage compression may help. See documentation of dfs.image.compress at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml. You can also try to throttle the bandwidth it uses via dfs.image.transfer.bandwidthPerSec. On Tue, Aug 13, 2013 at 3:45 PM, lei liu liulei...@gmail.com wrote: I write one programm to test NameNode performance. Please see the EditLogPerformance.java I use 60 threads to execute the EditLogPerformance.javacode, the testing result is below content: 2013-08-13 17:43:01,479 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055 2013-08-13 17:43:11,482 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725 2013-08-13 17:43:21,484 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2 2013-08-13 17:43:31,487 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6 2013-08-13 17:43:41,490 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6 2013-08-13 17:43:51,493 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53 2013-08-13 17:44:01,496 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:1040 speed:676 2013-08-13 17:44:11,499 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149 2013-08-13 17:44:21,502 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997 2013-08-13 17:44:31,505 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072 2013-08-13 17:44:41,508 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008 2013-08-13 17:44:51,511 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994 2013-08-13 17:45:01,514 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071 2013-08-13 17:45:11,517 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090 2013-08-13 17:45:21,520 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170 2013-08-13 17:45:31,523 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059 2013-08-13 17:45:41,526 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076 2013-08-13 17:45:51,529 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021 2013-08-13 17:46:01,532 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:1064 speed:1067 2013-08-13 17:46:11,534 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066 2013-08-13 17:46:21,537 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093 2013-08-13 17:46:31,540 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023 2013-08-13 17:46:41,543 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10760854 speed:971 2013-08-13 17:46:51,546 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041 2013-08-13 17:47:01,549 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031 2013-08-13 17:47:11,552 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - 
totalCount:10823888 speed:1079 2013-08-13 17:47:21,555 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069 2013-08-13 17:47:31,558 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009 2013-08-13 17:47:41,561 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978 2013-08-13 17:47:51,564 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028 2013-08-13 17:48:01,567 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062 2013-08-13 17:48:11,570 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979 2013-08-13 17:48:21,573 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005 2013-08-13 17:48:31,576 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10986794 speed:1012 2013-08-13 17:48:41,579 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11007484 speed:1034 2013-08-13 17:48:51,581 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:11028400 speed:1045 2013-08-13 17:49:01,584 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) -
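For reference, the two settings mentioned above go in hdfs-site.xml on the NameNodes; a minimal sketch (the codec and bandwidth values are only illustrative, and throttling trades a longer image transfer for less load on the active NameNode during the upload):

<property>
  <name>dfs.image.compress</name>
  <value>true</value>
</property>
<property>
  <name>dfs.image.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>
<property>
  <name>dfs.image.transfer.bandwidthPerSec</name>
  <!-- bytes per second; 0 (the default) means unthrottled -->
  <value>10485760</value>
</property>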
Maven Cloudera Configuration problem
Hi, I'm currently using maven to build the jars necessary for my map-reduce program to run, and it works for a single node cluster. For a multi node cluster, how do I specify my map-reduce program to use the cluster settings instead of the localhost settings? I don't know how to specify this when using maven to build my jar. I'm using the CDH distribution, by the way. -- Regards- Pavan
Re: when Standby Namenode is doing checkpoint, the Active NameNode is slow.
Hi, I'm agreed with Harsh comment on image file compression and transfer bandwidth parameter for optimizing checkpoint process. In addition I'm not able to correlate your performance program log timings(less then 10) and file transfer logs timing on active/stand by nodes. Thanks On Tue, Aug 13, 2013 at 4:06 PM, Harsh J ha...@cloudera.com wrote: Perhaps turning on fsimage compression may help. See documentation of dfs.image.compress at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml . You can also try to throttle the bandwidth it uses via dfs.image.transfer.bandwidthPerSec. On Tue, Aug 13, 2013 at 3:45 PM, lei liu liulei...@gmail.com wrote: I write one programm to test NameNode performance. Please see the EditLogPerformance.java I use 60 threads to execute the EditLogPerformance.javacode, the testing result is below content: 2013-08-13 17:43:01,479 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10392810 speed:1055 2013-08-13 17:43:11,482 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407310 speed:725 2013-08-13 17:43:21,484 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407358 speed:2 2013-08-13 17:43:31,487 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407490 speed:6 2013-08-13 17:43:41,490 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10407624 speed:6 2013-08-13 17:43:51,493 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10408690 speed:53 2013-08-13 17:44:01,496 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:1040 speed:676 2013-08-13 17:44:11,499 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10445216 speed:1149 2013-08-13 17:44:21,502 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10465166 speed:997 2013-08-13 17:44:31,505 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10486614 speed:1072 2013-08-13 17:44:41,508 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10506778 speed:1008 2013-08-13 17:44:51,511 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10526660 speed:994 2013-08-13 17:45:01,514 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10548092 speed:1071 2013-08-13 17:45:11,517 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10569892 speed:1090 2013-08-13 17:45:21,520 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10593296 speed:1170 2013-08-13 17:45:31,523 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10614478 speed:1059 2013-08-13 17:45:41,526 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10636006 speed:1076 2013-08-13 17:45:51,529 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10656430 speed:1021 2013-08-13 17:46:01,532 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:1064 speed:1067 2013-08-13 17:46:11,534 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10699096 speed:1066 2013-08-13 17:46:21,537 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10720970 speed:1093 2013-08-13 17:46:31,540 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10741432 speed:1023 2013-08-13 17:46:41,543 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10760854 
speed:971 2013-08-13 17:46:51,546 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10781680 speed:1041 2013-08-13 17:47:01,549 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10802302 speed:1031 2013-08-13 17:47:11,552 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10823888 speed:1079 2013-08-13 17:47:21,555 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10845276 speed:1069 2013-08-13 17:47:31,558 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10865470 speed:1009 2013-08-13 17:47:41,561 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10885046 speed:978 2013-08-13 17:47:51,564 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10905606 speed:1028 2013-08-13 17:48:01,567 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10926854 speed:1062 2013-08-13 17:48:11,570 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10946446 speed:979 2013-08-13 17:48:21,573 INFO my.EditLogPerformance (EditLogPerformance.java:run(37)) - totalCount:10966554 speed:1005 2013-08-13
Re: Maven Cloudera Configuration problem
You need to configure your namenode and jobtracker information in the configuration files within your application. Only set the relevant properties in the copy of the files that you are bundling in your job. For the rest, the default values would be used from the default configuration files (core-default.xml, mapred-default.xml) already bundled in the lib/jar provided by cloudera/hadoop. The assumption is that this is for MRv1. Anyway, you should go through this for details: http://hadoop.apache.org/docs/stable/cluster_setup.html

*core-site.xml* (the security ones are optional; if you are not using anything special you can remove them and rely on the defaults, which is also 'simple'):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://server:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>DEFAULT</value>
  </property>
</configuration>

*mapred-site.xml*:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://server:</value>
  </property>
</configuration>

Regards, Shahab

On Tue, Aug 13, 2013 at 7:19 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi, I'm currently using maven to build the jars necessary for my map-reduce program to run and it works for a single node cluster.. For a multi node cluster, how do i specify my map-reduce program to ingest the cluster settings instead of localhost settings? I don't know how to specify this using maven to build my jar. I'm using the cdh distribution by the way.. -- Regards- Pavan
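Once the cluster's namenode and jobtracker addresses are visible to the client (either bundled as above or via the host's configuration directory), the job is built with maven and submitted through the hadoop launcher rather than run as a plain java program; a hedged example where the jar name, driver class, and HDFS paths are hypothetical:

# Build the application jar with maven
mvn clean package

# Submit from a host whose Hadoop configuration points at the cluster
hadoop jar target/myapp-1.0.jar com.example.MyDriver /user/pavan/input /user/pavan/output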
Re: Maven Cloudera Configuration problem
Hi Pavan, Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from. Is there an issue you're coming up against when trying to run your job on a cluster? -Sandy (iphnoe tpying) On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi, I'm currently using maven to build the jars necessary for my map-reduce program to run and it works for a single node cluster.. For a multi node cluster, how do i specify my map-reduce program to ingest the cluster settings instead of localhost settings? I don't know how to specify this using maven to build my jar. I'm using the cdh distribution by the way.. -- Regards- Pavan
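Following the explanation above, a quick way to confirm which configuration the submitting host will actually use is to inspect the config directory it resolves; a minimal sketch assuming a CDH-style /etc/hadoop/conf layout (the path may differ on other installs):

# HADOOP_CONF_DIR, if set, overrides the default configuration directory
echo $HADOOP_CONF_DIR

# Check that the jobtracker and filesystem point at the cluster, not localhost/local
grep -A1 "mapred.job.tracker" /etc/hadoop/conf/mapred-site.xml
grep -A1 "fs.default" /etc/hadoop/conf/core-site.xml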
Re: Exceptions in Name node and Data node logs
Hi, As Jitendra pointed out , this issue was fixed in .20 version. I am using Hadoop 1.1.2 so why its occurring again ? Please help here. On Tue, Aug 13, 2013 at 2:56 PM, Vimal Jain vkj...@gmail.com wrote: Hi Jitendra, Thanks for your reply. Currently my hadoop/hbase is down in production as it had filled up the disk space with above exceptions in log files and had to be brought down. Also i am using hadoop/hbase in pseudo distributed mode , so there is only one node which hosts all 6 processes ( 3 from hadoop and 3 from hbase). On Tue, Aug 13, 2013 at 2:50 PM, Jitendra Yadav jeetuyadav200...@gmail.com wrote: Hi, One of your DN is marked as dead because NN is not able to get heartbeat message from DN but NN still getting block information from dead node. This error is similar to a bug *HDFS-1250* reported 2 years back and fixed in 0.20 release. Can you please check the status of DN's in cluster. #bin/hadoop dfsadmin -report Thanks On Tue, Aug 13, 2013 at 1:53 PM, Vimal Jain vkj...@gmail.com wrote: Hi, I have configured Hadoop and Hbase in pseudo distributed mode. So far things were working fine , but suddenly i started receiving some exceptions in my namenode and datanode log files. It keeps repeating and thus fills up my disk space. Please help here. *Exception in data node :-* 2013-07-31 19:39:51,094 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Got blockRec eived message from unregistered or dead node blk_-4787262105551508952_28369 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) at org.apache.hadoop.ipc.Client.call(Client.java:1107) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) at sun.proxy.$Proxy5.blockReceived(Unknown Source) at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:1006) at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1527) at java.lang.Thread.run(Thread.java:662) *Exception in name node :- * 2013-07-31 19:39:50,671 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.blockReceived: blk_-4787262105551508952_28369 is received from dead or unregistered node 192.168.20.30:50010 2013-07-31 19:39:50,671 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: Got blo ckReceived message from unregistered or dead node blk_-4787262105551508952_28369 2013-07-31 19:39:50,671 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call blockReceived(DatanodeRegistration( 192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@64f2d559, [Ljava.l ang.String;@294f9d6) from 192.168.20.30:59764: error: java.io.IOException: 
Got blockReceived message from unregistered or dead node blk_-4787262105551 508952_28369 java.io.IOException: Got blockReceived message from unregistered or dead node blk_-4787262105551508952_28369 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:4188) at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:1069) at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387) -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: Maven Cloudera Configuration problem
Hi Shabab and Sandy, The thing is we have a 6 node cloudera cluster running.. For development purposes, i was building a map-reduce application on a single node apache distribution hadoop with maven.. To be frank, i don't know how to deploy this application on a multi node cloudera cluster. I am fairly well versed with Multi Node Apache Hadoop Distribution.. So, how can i go forward? Thanks for all the help :) On Tue, Aug 13, 2013 at 9:22 PM, sandy.r...@cloudera.com wrote: Hi Pavan, Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from. Is there an issue you're coming up against when trying to run your job on a cluster? -Sandy (iphnoe tpying) On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi, I'm currently using maven to build the jars necessary for my map-reduce program to run and it works for a single node cluster.. For a multi node cluster, how do i specify my map-reduce program to ingest the cluster settings instead of localhost settings? I don't know how to specify this using maven to build my jar. I'm using the cdh distribution by the way.. -- Regards- Pavan -- Regards- Pavan
Re: YARN with local filesystem
I was able to execute the example by running the job as the yarn user. For example the following successfully completes:

sudo -u yarn yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out

Whereas this fails with the local user rpaulk:

yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out

On Wed, Jul 31, 2013 at 2:28 PM, Rod Paulk rmang...@gmail.com wrote: I am having an issue running 2.0.5-alpha (BigTop-0.6.0) YARN-MapReduce on the local filesystem instead of HDFS. The appTokens file that the error states is missing, does exist after the job fails. I saw other 'similar' issues noted in YARN-917, YARN-513, YARN-993. When I switch to HDFS, the jobs run fine.

In core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>file:</value>
</property>

In mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

2013-07-29 16:13:06,549 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1375138534137_0003_01_01 by user rpaulk
2013-07-29 16:13:06,549 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1375138534137_0003
2013-07-29 16:13:06,549 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk IP=172.20.130.215 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1375138534137_0003 CONTAINERID=container_1375138534137_0003_01_01
2013-07-29 16:13:06,551 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1375138534137_0003 transitioned from NEW to INITING
2013-07-29 16:13:06,551 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1375138534137_0003_01_01 to application application_1375138534137_0003
2013-07-29 16:13:06,554 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1375138534137_0003 transitioned from INITING to RUNNING
2013-07-29 16:13:06,555 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1375138534137_0003_01_01 transitioned from NEW to LOCALIZING
*2013-07-29 16:13:06,555 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens transitioned from INIT to DOWNLOADING*
2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.jar transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.splitmetainfo transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.split transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1375138534137_0003_01_01
2013-07-29 16:13:06,559 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_01.tokens. Credentials list:
2013-07-29 16:13:06,560 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user rpaulk
2013-07-29 16:13:06,564 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_01.tokens to /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_01.tokens
2013-07-29 16:13:06,564 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set to /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003 =
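Since the job succeeds as the yarn user but fails as rpaulk, one hedged diagnostic is to compare what the two users can see and write under the shared staging and NodeManager local directories (paths taken from the log above); this is only a sketch of where to look, not a confirmed root cause:

# Compare ownership and permissions on the shared staging area used for job resources
ls -ld /nas/scratch/localfs-1/hadoop-yarn/staging
sudo -u yarn ls -l /nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging
sudo -u rpaulk ls -l /nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging

# And on the NodeManager user cache directory for the failing user
ls -ld /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk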
Re: Maven Cloudera Configuration problem
When i actually run the job on the multi node cluster, logs shows it uses localhost configurations which i don't want.. I just have a pom.xml which lists all the dependencies like standard hadoop, standard hbase, standard zookeeper etc., Should i remove these dependencies? I want the cluster settings to apply in my map-reduce application.. So, this is where i'm stuck at.. On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi Shabab and Sandy, The thing is we have a 6 node cloudera cluster running.. For development purposes, i was building a map-reduce application on a single node apache distribution hadoop with maven.. To be frank, i don't know how to deploy this application on a multi node cloudera cluster. I am fairly well versed with Multi Node Apache Hadoop Distribution.. So, how can i go forward? Thanks for all the help :) On Tue, Aug 13, 2013 at 9:22 PM, sandy.r...@cloudera.com wrote: Hi Pavan, Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from. Is there an issue you're coming up against when trying to run your job on a cluster? -Sandy (iphnoe tpying) On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi, I'm currently using maven to build the jars necessary for my map-reduce program to run and it works for a single node cluster.. For a multi node cluster, how do i specify my map-reduce program to ingest the cluster settings instead of localhost settings? I don't know how to specify this using maven to build my jar. I'm using the cdh distribution by the way.. -- Regards- Pavan -- Regards- Pavan -- Regards- Pavan
Requesting set of containers on a single node
Hi, My application has a group of processes that need to communicate with each other either through shared memory or TCP/IP, depending on whether the containers are allocated on the same machine or on different machines. Obviously I would like them to be allocated on the same node whenever possible, which requires all of the containers to be on the same node. But I don't want to specify a node name in my request, because I don't care where they run in the cluster as long as all of them end up on the same node. Is there a way to make such a request for containers currently? If not, I think this would be good to have, because many applications could have this kind of requirement. Thanks, Kishore
Re: Maven Cloudera Configuration problem
I've been stuck on the same question lately so don't take this as definitive, just my best guess at what's required. Using maven as your hadoop source is going to give you a vanilla hadoop; one that runs on localhost. You need one that you've customized to point to your remote cluster and you can't get that via maven. So my *GUESS* is you need to do a plain local install of hadoop and point HADOOP_HOME at that. Customize as required, then convince eclipse to use that instead of going thru maven (i.e. remove hadoop from the dependency list). Everyone; is this on the right path? Anyone know of exact instructions? On Aug 13, 2013, at 12:07 PM, Pavan Sudheendra pavan0...@gmail.com wrote: When i actually run the job on the multi node cluster, logs shows it uses localhost configurations which i don't want.. I just have a pom.xml which lists all the dependencies like standard hadoop, standard hbase, standard zookeeper etc., Should i remove these dependencies? I want the cluster settings to apply in my map-reduce application.. So, this is where i'm stuck at.. On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi Shabab and Sandy, The thing is we have a 6 node cloudera cluster running.. For development purposes, i was building a map-reduce application on a single node apache distribution hadoop with maven.. To be frank, i don't know how to deploy this application on a multi node cloudera cluster. I am fairly well versed with Multi Node Apache Hadoop Distribution.. So, how can i go forward? Thanks for all the help :) On Tue, Aug 13, 2013 at 9:22 PM, sandy.r...@cloudera.com wrote: Hi Pavan, Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from. Is there an issue you're coming up against when trying to run your job on a cluster? -Sandy (iphnoe tpying) On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi, I'm currently using maven to build the jars necessary for my map-reduce program to run and it works for a single node cluster.. For a multi node cluster, how do i specify my map-reduce program to ingest the cluster settings instead of localhost settings? I don't know how to specify this using maven to build my jar. I'm using the cdh distribution by the way.. -- Regards- Pavan -- Regards- Pavan -- Regards- Pavan Dr. Brad J. CoxCell: 703-594-1883 Blog: http://bradjcox.blogspot.com http://virtualschool.edu
Re: Maven Cloudera Configuration problem
Nothing in your pom.xml should affect the configurations your job runs with. Are you running your job from a node on the cluster? When you say localhost configurations, do you mean it's using the LocalJobRunner? -sandy (iphnoe tpying) On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra pavan0...@gmail.com wrote: When i actually run the job on the multi node cluster, logs shows it uses localhost configurations which i don't want.. I just have a pom.xml which lists all the dependencies like standard hadoop, standard hbase, standard zookeeper etc., Should i remove these dependencies? I want the cluster settings to apply in my map-reduce application.. So, this is where i'm stuck at.. On Tue, Aug 13, 2013 at 9:30 PM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi Shabab and Sandy, The thing is we have a 6 node cloudera cluster running.. For development purposes, i was building a map-reduce application on a single node apache distribution hadoop with maven.. To be frank, i don't know how to deploy this application on a multi node cloudera cluster. I am fairly well versed with Multi Node Apache Hadoop Distribution.. So, how can i go forward? Thanks for all the help :) On Tue, Aug 13, 2013 at 9:22 PM, sandy.r...@cloudera.com wrote: Hi Pavan, Configuration properties generally aren't included in the jar itself unless you explicitly set them in your java code. Rather they're picked up from the mapred-site.xml file located in the Hadoop configuration directory on the host you're running your job from. Is there an issue you're coming up against when trying to run your job on a cluster? -Sandy (iphnoe tpying) On Aug 13, 2013, at 4:19 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi, I'm currently using maven to build the jars necessary for my map-reduce program to run and it works for a single node cluster.. For a multi node cluster, how do i specify my map-reduce program to ingest the cluster settings instead of localhost settings? I don't know how to specify this using maven to build my jar. I'm using the cdh distribution by the way.. -- Regards- Pavan -- Regards- Pavan -- Regards- Pavan
Re: Maven Cloudera Configuration problem
Yes Sandy, I'm referring to the LocalJobRunner. I'm actually running the job on one datanode. What changes should I make so that my application would take advantage of the cluster as a whole?
Re: Maven Cloudera Configuration problem
You should not use the LocalJobRunner. Make sure that the mapred.job.tracker property does not point to 'local' but instead points to your job-tracker host and port. *But before that*, as Sandy said, your client machine (from where you will be kicking off your jobs and apps) should be using config files which have your cluster's configuration. This is the alternative to follow if you don't want to bundle the configs for your cluster in the application itself (either in Java code or in separate copies of the relevant config files). This is something I was suggesting early on, just to get you started using your cluster instead of local mode. By the way, have you seen the following link? It gives you step-by-step information about how to generate config files specific to your cluster, and then how to place them and use them from any machine you want to designate as your client. Running your jobs from one of the datanodes without the proper config would not work. https://ccp.cloudera.com/display/FREE373/Generating+Client+Configuration Regards, Shahab
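As a concrete illustration of the above (a sketch only, not a drop-in solution): a driver written with ToolRunner picks up whatever *-site.xml files are on the client's classpath and also honours the standard -conf, -fs, -jt and -D options, so the same jar can be pointed at the cluster once the client configuration is in place. The class name MyJobDriver is hypothetical, and the mapper/reducer wiring is left as a placeholder.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() is already populated from the *-site.xml files on the classpath
    // plus any -conf/-fs/-jt/-D arguments parsed by ToolRunner.
    Configuration conf = getConf();
    Job job = new Job(conf, "my-cluster-job");
    job.setJarByClass(MyJobDriver.class);
    // job.setMapperClass(...);   // wire up your own mapper/reducer here
    // job.setReducerClass(...);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
  }
}

With the client configs in place on an edge node, submitting with something like "hadoop jar myapp.jar MyJobDriver <input> <output>" should then go to the cluster's JobTracker rather than the LocalJobRunner.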
Re: Maven Cloudera Configuration problem
That link got my hopes up, but Cloudera Manager (which I'm running, on CDH4) does not offer an Export Client Config option. What am I missing? Dr. Brad J. Cox  Cell: 703-594-1883  Blog: http://bradjcox.blogspot.com  http://virtualschool.edu
Re: Maven Cloudera Configuration problem
In our Cloudera 4.2.0 cluster, I log in with the *admin* user (do you have the appropriate permissions, by the way?). Then I click on any one of the 3 services (hbase, mapred, hdfs, excluding zookeeper) from the top-leftish menu. Then for each of these I can click the *Configuration* tab, which is in the top-middlish section of the page. Once the configuration page opens, I click on the Action menu on the top-right. One of its sub-menus is *Download Client Configuration*, which, as the name says, downloads the config files (a zip file, to be exact) to be used on client machines. Regards, Shahab
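If the downloaded client configuration is not unzipped into a directory that is already on the client's classpath (or referenced via HADOOP_CONF_DIR, the more usual route), one option is to load the files explicitly. This is a minimal sketch only; the paths below are hypothetical examples of wherever the zip was extracted.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class LoadClientConfig {

  public static Configuration clusterConf() {
    Configuration conf = new Configuration();
    // Hypothetical location of the unzipped client configuration.
    conf.addResource(new Path("/opt/cluster-client-conf/core-site.xml"));
    conf.addResource(new Path("/opt/cluster-client-conf/hdfs-site.xml"));
    conf.addResource(new Path("/opt/cluster-client-conf/mapred-site.xml"));
    return conf;
  }

  public static void main(String[] args) {
    Configuration conf = clusterConf();
    // Should now print the cluster's JobTracker address rather than "local".
    System.out.println("mapred.job.tracker = " + conf.get("mapred.job.tracker"));
  }
}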
Re: Maven Cloudera Configuration problem
Folks, can you please take this thread to CDH related mailing list?
updated to 1.2.1, map completed percentage keeps oscillating
Hi everyone, I recently updated my cluster to 1.2.1 and now the percentage of completed map tasks keeps oscillating while the job is running:

13/08/13 16:53:01 INFO mapred.JobClient: Running job: job_201308131452_0007
13/08/13 16:53:02 INFO mapred.JobClient: map 0% reduce 0%
13/08/13 16:53:19 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:53:22 INFO mapred.JobClient: map 34% reduce 0%
13/08/13 16:53:25 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:53:49 INFO mapred.JobClient: map 44% reduce 0%
13/08/13 16:53:52 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:54:02 INFO mapred.JobClient: map 38% reduce 0%
13/08/13 16:54:05 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:54:14 INFO mapred.JobClient: map 44% reduce 0%
13/08/13 16:54:17 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:54:26 INFO mapred.JobClient: map 24% reduce 0%
13/08/13 16:54:29 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:54:53 INFO mapred.JobClient: map 24% reduce 0%
13/08/13 16:54:56 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:55:05 INFO mapred.JobClient: map 32% reduce 0%
13/08/13 16:55:08 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:55:17 INFO mapred.JobClient: map 20% reduce 0%
13/08/13 16:55:20 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:55:32 INFO mapred.JobClient: map 4% reduce 0%
13/08/13 16:55:35 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:55:47 INFO mapred.JobClient: map 19% reduce 0%
13/08/13 16:55:50 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:56:02 INFO mapred.JobClient: map 46% reduce 0%
13/08/13 16:56:06 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:56:36 INFO mapred.JobClient: map 29% reduce 0%
13/08/13 16:56:39 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:57:07 INFO mapred.JobClient: map 48% reduce 0%
13/08/13 16:57:10 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:58:16 INFO mapred.JobClient: map 39% reduce 0%
13/08/13 16:58:20 INFO mapred.JobClient: map 2% reduce 0%
13/08/13 16:58:23 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:58:32 INFO mapred.JobClient: map 44% reduce 0%
13/08/13 16:58:35 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:58:50 INFO mapred.JobClient: map 18% reduce 0%
13/08/13 16:58:53 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:59:08 INFO mapred.JobClient: map 16% reduce 0%
13/08/13 16:59:11 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:59:42 INFO mapred.JobClient: map 18% reduce 0%
13/08/13 16:59:45 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 16:59:54 INFO mapred.JobClient: map 11% reduce 0%
13/08/13 16:59:57 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 17:00:09 INFO mapred.JobClient: map 33% reduce 0%
13/08/13 17:00:12 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 17:00:24 INFO mapred.JobClient: map 39% reduce 0%
13/08/13 17:00:27 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 17:00:51 INFO mapred.JobClient: map 37% reduce 0%
13/08/13 17:00:54 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 17:01:12 INFO mapred.JobClient: map 50% reduce 0%
13/08/13 17:01:15 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 17:01:39 INFO mapred.JobClient: map 44% reduce 0%
13/08/13 17:01:42 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 17:01:54 INFO mapred.JobClient: map 36% reduce 0%
13/08/13 17:01:57 INFO mapred.JobClient: map 51% reduce 0%
13/08/13 17:02:24 INFO mapred.JobClient: map 11% reduce 0%
13/08/13 17:02:27 INFO mapred.JobClient: map 51% reduce 0%

This is the output of one job with one map task. There was no failure; the map task was not re-spawned. It just ran and finished on the node on which it was started, but this is the output. What gives? -- Kaveh Minooie
Calling a MATLAB library in map reduce program
Hi, I have to run some analytics on files present in HDFS using MATLAB code. I am thinking of compiling the MATLAB code into a C++ library and calling it from my map-reduce code. How can I implement this? I read that Hadoop Streaming or Hadoop Pipes can be used for this, but I have not tried either on my own. Please share your valuable suggestions. Regards, Anand.C
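One possible pattern, sketched below under the assumption that the MATLAB code has been compiled into a standalone executable, is to call that binary from a Java mapper via ProcessBuilder. The binary name analyze_record is hypothetical; it is assumed to read one line on stdin and write one result line on stdout. Hadoop Streaming, mentioned above, would instead run the executable directly as the mapper, and launching one process per record as done here for simplicity would be far too slow for real workloads; this is only an illustration of the wiring.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MatlabWrapperMapper extends Mapper<LongWritable, Text, Text, Text> {

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Launch the compiled binary (hypothetical name); in practice it would be
    // shipped to the task nodes, e.g. via the distributed cache.
    Process proc = new ProcessBuilder("./analyze_record").start();

    BufferedWriter toProc =
        new BufferedWriter(new OutputStreamWriter(proc.getOutputStream()));
    BufferedReader fromProc =
        new BufferedReader(new InputStreamReader(proc.getInputStream()));

    // Feed the record to the binary on stdin and close the stream so it sees EOF.
    toProc.write(value.toString());
    toProc.newLine();
    toProc.close();

    // Assume the binary writes a single result line to stdout.
    String result = fromProc.readLine();
    fromProc.close();
    proc.waitFor();

    if (result != null) {
      context.write(new Text("result"), new Text(result));
    }
  }
}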
Reduce Task Clarification
I am working on a MapReduce job where I would like to have the output sorted by a LongWritable value. I read the Anatomy of a MapReduce Job Run section in the Definitive Guide, and it didn't say explicitly whether reduce() gets called only once per map output key. If it does get called only once, I was thinking that I could use this: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html#setSortComparatorClass(java.lang.Class) to do the sorting. Thank you for your time. -- Sam Garrett ActionX, NYC
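For reference, a minimal sketch of the comparator approach mentioned above: in the standard MapReduce API, reduce() is invoked once per distinct key within a partition, and setSortComparatorClass controls the order in which those keys reach the reducer. The job below sorts LongWritable map output keys in descending order using the built-in LongWritable.DecreasingComparator; the pass-through mapper/reducer and the input/output paths are illustrative placeholders, not taken from the original post.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class DescendingKeySortJob {

  // Pass-through mapper: uses the file offset as the key purely for illustration.
  public static class PassMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      ctx.write(key, value);
    }
  }

  // reduce() runs once per distinct key in this partition, with the keys
  // arriving in the order defined by the sort comparator.
  public static class PassReducer extends Reducer<LongWritable, Text, LongWritable, Text> {
    @Override
    protected void reduce(LongWritable key, Iterable<Text> values, Context ctx)
        throws IOException, InterruptedException {
      for (Text v : values) {
        ctx.write(key, v);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "descending-key-sort");
    job.setJarByClass(DescendingKeySortJob.class);
    job.setMapperClass(PassMapper.class);
    job.setReducerClass(PassReducer.class);
    job.setMapOutputKeyClass(LongWritable.class);
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    // Sort map output keys in descending numeric order before they reach reduce().
    job.setSortComparatorClass(LongWritable.DecreasingComparator.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}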