RE: Any answer ? Candidate application for map reduce
Hello Yanbo Liang,

My issue is that I have neither data nor an application! I was looking for an open source benchmark application to test this, preferably one (or simple test code) from the image processing domain.

Regards
Bala

From: Yanbo Liang [mailto:yanboha...@gmail.com]
Sent: 25 March 2013 14:55
To: user@hadoop.apache.org
Subject: Re: Any answer ? Candidate application for map reduce

From your description, "split the data into chunks, feed the chunks to the application, and merge the processed chunks to get A back" is exactly what the MapReduce paradigm does: feed the split chunks to the Mapper and merge the processed chunks in the Reducer. Why not use the MapReduce paradigm?

2013/3/25 Raymond Tay raymondtay1...@gmail.com

Hello Bala,

I bumped into your email by chance. I'm not sure what you are after, really, because if you want something like image processing and feature detection then Hadoop's mailing list is probably not the place; you might want to join NVIDIA's or ATI's CUDA / OpenCL forums.

Hope this helps
Ray

On 25 Mar, 2013, at 1:47 PM, AMARNATH, Balachandar balachandar.amarn...@airbus.com wrote:

Any answers from anyone of you :)

Regards
Bala

From: AMARNATH, Balachandar [mailto:BALACHANDAR.AMARNATH@airbus.com]
Sent: 22 March 2013 10:25
To: user@hadoop.apache.org
Subject: Candidate application for map reduce

Hello,

I am looking for a sample application (preferably image processing, feature detection, etc.) that can be a good candidate for the map reduce paradigm. To be very specific, I am looking for a simple open source application that processes data and produces some result (A), such that you can split the data into chunks, feed the chunks to the application, and merge the processed chunks to get A back. Is there any website where I can look at such benchmark applications? Any pointers and thoughts will be helpful here.

With thanks and regards
Balachandar
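To make the split/process/merge suggestion above concrete, here is a minimal Hadoop MapReduce sketch in Java. The class and method names (ChunkProcessor, processChunk) are placeholders rather than part of any application discussed in the thread, and the Text-based record types are only an assumption about how the chunks might be represented:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ChunkProcessor {

      // Each map() call receives one record of one input split (one "chunk").
      public static class ChunkMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void map(Object key, Text chunk, Context context)
            throws IOException, InterruptedException {
          String partial = processChunk(chunk.toString()); // placeholder per-chunk computation
          // Emit all partial results under one key so they meet in a single reducer.
          context.write(new Text("A"), new Text(partial));
        }

        private String processChunk(String chunk) {
          return chunk; // placeholder: the real application logic would go here
        }
      }

      // The reducer sees every partial result for a key and merges them into the final A.
      public static class MergeReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> partials, Context context)
            throws IOException, InterruptedException {
          StringBuilder merged = new StringBuilder();
          for (Text partial : partials) {
            merged.append(partial.toString());
          }
          context.write(key, new Text(merged.toString()));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "chunk-process-merge");
        job.setJarByClass(ChunkProcessor.class);
        job.setMapperClass(ChunkMapper.class);
        job.setReducerClass(MergeReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Run like any other Hadoop job with hadoop jar, passing the input and output directories; the framework handles splitting the input and routing the partial results to the reducer.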
RE: Any answer ? Candidate application for map reduce
This is encouraging! Thanks a lot.

From: Sandy Ryza [mailto:sandy.r...@cloudera.com]
Sent: 26 March 2013 01:22
To: user@hadoop.apache.org
Subject: Re: Any answer ? Candidate application for map reduce

Hi Bala,

A standard benchmark program for mapreduce is terasort, which is included in the hadoop examples jar. You can generate data for it using teragen, which runs a map-only job:

hadoop jar <path to examples jar> teragen <number of records> <directory to put them in>

and then sort the data using terasort:

hadoop jar <path to examples jar> terasort <input directory> <output directory>

-Sandy

On Mon, Mar 25, 2013 at 4:12 AM, AMARNATH, Balachandar balachandar.amarn...@airbus.com wrote:

Hello Yanbo Liang,

My issue is that I have neither data nor an application! I was looking for an open source benchmark application to test this, preferably one (or simple test code) from the image processing domain.

Regards
Bala
Any answer ? Candidate application for map reduce
Any answers from anyone of you :)

Regards
Bala

From: AMARNATH, Balachandar [mailto:balachandar.amarn...@airbus.com]
Sent: 22 March 2013 10:25
To: user@hadoop.apache.org
Subject: Candidate application for map reduce
Candidate application for map reduce
Hello,

I am looking for a sample application (preferably image processing, feature detection, etc.) that can be a good candidate for the map reduce paradigm. To be very specific, I am looking for a simple open source application that processes data and produces some result (A), such that you can split the data into chunks, feed the chunks to the application, and merge the processed chunks to get A back. Is there any website where I can look at such benchmark applications? Any pointers and thoughts will be helpful here.

With thanks and regards
Balachandar
New user question
Hello,

Can I have a Windows namenode and Linux datanodes? What are the requirements (passwordless ssh, etc.)?

Regards
Bala
Issue: Namenode is in safe mode
Hi,

I have created a hadoop cluster with two nodes (A and B). 'A' acts as both namenode and datanode, and 'B' acts as a datanode only. With this setup, I could store and read files. Now, I have added one more datanode 'C' and relieved 'A' of datanode duty. This means 'A' acts only as namenode, and both B and C act as datanodes. Now, when I try to create a directory, it says:

org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory. Name node is in safe mode

Can someone tell me why the namenode is now in safe mode?

With thanks and regards
Balachandar
store file gives exception
Now I have come out of safe mode through the admin command. I tried to put a file into hdfs and encountered this error:

org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/hosts could only be replicated to 0 nodes, instead of 1

Any hint to fix this? This happens when the namenode is not also a datanode. Am I making sense?

With thanks and regards
Balachandar
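The "could only be replicated to 0 nodes" error generally means the namenode does not currently see any live datanode with usable space. One way to check what the namenode sees is sketched below against the Hadoop 1.x-era HDFS API; the cast to DistributedFileSystem and the getDataNodeStats() call are assumptions about the API version in use, so treat this as illustrative rather than definitive:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

    public class LiveDatanodes {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml / hdfs-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);
        if (fs instanceof DistributedFileSystem) {
          DistributedFileSystem dfs = (DistributedFileSystem) fs;
          // One entry per datanode the namenode currently knows about.
          for (DatanodeInfo node : dfs.getDataNodeStats()) {
            System.out.println(node.getHostName()
                + " remaining=" + node.getRemaining() + " bytes");
          }
        } else {
          System.out.println("Not an HDFS filesystem: " + fs.getUri());
        }
      }
    }

If no datanodes are listed, or the remaining space is near zero, writes will fail in exactly this way.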
RE: Issue: Namenode is in safe mode
The replication factor was 1 when I removed the entry for A from the slaves file. I did not mark it for retirement; I do not yet know how to mark a node for retirement. I waited a few minutes and then I could see the namenode running again.

From: Nitin Pawar [mailto:nitinpawar...@gmail.com]
Sent: 06 March 2013 15:31
To: user@hadoop.apache.org
Subject: Re: Issue: Namenode is in safe mode

What is your replication factor? When you removed node A as a datanode, did you first mark it for retirement? If you just removed it from service, then the blocks from that datanode are missing, and the namenode checks for those blocks when it starts up. Unless it reaches its threshold value, it will not let you write any more data to your HDFS.

I would suggest starting the datanode on A again, then marking it for retirement so the namenode moves its blocks to the new datanode; once that is done the namenode will retire that datanode.

--
Nitin Pawar
RE: store file gives exception
Hi,

I could successfully install a hadoop cluster with three nodes (2 datanodes and 1 namenode). However, when I try to store a file, I get the following error:

13/03/06 16:45:56 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
13/03/06 16:45:56 WARN hdfs.DFSClient: Could not get block locations. Source file /user/bala/kumki/hosts - Aborting...
put: java.io.IOException: File /user/bala/kumki/hosts could only be replicated to 0 nodes, instead of 1
13/03/06 16:45:56 ERROR hdfs.DFSClient: Exception closing file /user/bala/kumki/hosts : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/bala/kumki/hosts could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

Any hint to fix this?

Regards
Bala

From: AMARNATH, Balachandar [mailto:balachandar.amarn...@airbus.com]
Sent: 06 March 2013 15:29
To: user@hadoop.apache.org
Subject: store file gives exception
RE: store file gives exception
Hi all,

I thought the issue below was due to not having enough space available. I therefore replaced the datanodes with other nodes that have more space, and it worked. Now I have a working HDFS cluster.

I am now thinking about my application, where I need to execute 'a set of similar instructions' (a job) over a large number of files. I am planning to do this in parallel on different machines, and I would like to schedule each job to the datanode that already holds its input file. At first, I shall store the files in HDFS. To complete my task, is there a scheduler available in the hadoop framework that, given the input file required for a job, can return the name of the datanode where the file is actually stored? Am I making sense here?

Regards
Bala

From: AMARNATH, Balachandar [mailto:balachandar.amarn...@airbus.com]
Sent: 06 March 2013 16:49
To: user@hadoop.apache.org
Subject: RE: store file gives exception
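On the scheduling question above: MapReduce already tries to run each map task on a node holding its input split, but if you only want to ask HDFS where a file lives, the FileSystem API exposes block locations directly. A minimal sketch, assuming the file path is passed on the command line:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WhereIsMyFile {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        FileStatus status = fs.getFileStatus(new Path(args[0]));
        // One BlockLocation per block of the file; each lists the datanodes holding a replica.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
          System.out.println("offset " + block.getOffset() + ", length " + block.getLength());
          for (String host : block.getHosts()) {
            System.out.println("  replica on " + host);
          }
        }
      }
    }

A custom scheduler could use these hosts to place work near the data; if you use MapReduce itself, the framework already does this placement for you.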
Hadoop cluster setup - could not see second datanode
Thanks for the information. Now I am trying to install hadoop dfs using 2 nodes: a namenode-cum-datanode, and a separate datanode. I use the following configuration in my hdfs-site.xml:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>localhost:9000</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/home/bala/data</value>
      </property>
      <property>
        <name>dfs.name.dir</name>
        <value>/home/bala/name</value>
      </property>
    </configuration>

On the namenode, I have added the datanode hostnames (machine1 and machine2). When I run 'start-all.sh', I see in the logs that the datanode is starting on both machines, but when I go to the browser on the namenode, I see only one live node (the namenode which is also configured as a datanode).

Any hint here will help me.

With regards
Bala

From: Mahesh Balija [mailto:balijamahesh@gmail.com]
Sent: 05 March 2013 14:15
To: user@hadoop.apache.org
Subject: Re: Hadoop file system

You can use HDFS alone in distributed mode to fulfill your requirement. HDFS has the FileSystem java api through which you can interact with HDFS from your client. HDFS is good if you have a small number of files of huge size, rather than many files of small size.

Best,
Mahesh Balija,
Calsoft Labs.
RE: Hadoop cluster setup - could not see second datanode
I fixed the issue below :)

Regards
Bala

From: AMARNATH, Balachandar [mailto:balachandar.amarn...@airbus.com]
Sent: 05 March 2013 17:05
To: user@hadoop.apache.org
Subject: Hadoop cluster setup - could not see second datanode
Map reduce technique
Hi,

I am new to the map reduce paradigm. I read in a tutorial that the 'map' function splits the data into key value pairs. Does this mean the map-reduce framework automatically splits the data into pieces, or do we need to explicitly provide the method to split the data? If it splits automatically, how does it split an image file (by size, etc.)? I would expect that processing an image file as a whole gives different results than processing it in chunks.

With thanks and regards
Balachandar
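In Hadoop, splitting is controlled by the InputFormat rather than by the map function itself: FileInputFormat divides files into byte-range splits, typically along HDFS block boundaries, which is exactly what you would not want for an image that must be processed whole. A common workaround is an input format that refuses to split, so each file arrives at one mapper as a single record. A rough sketch follows, assuming the whole image fits comfortably in a task's memory; the class name WholeImageInputFormat is illustrative, not a standard Hadoop class:

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    // Delivers each file as one record: key = nothing, value = the file's raw bytes.
    public class WholeImageInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

      @Override
      protected boolean isSplitable(JobContext context, Path file) {
        return false; // never split an image across map tasks
      }

      @Override
      public RecordReader<NullWritable, BytesWritable> createRecordReader(
          InputSplit split, TaskAttemptContext context) {
        return new RecordReader<NullWritable, BytesWritable>() {
          private FileSplit fileSplit;
          private TaskAttemptContext ctx;
          private BytesWritable value = new BytesWritable();
          private boolean done = false;

          @Override public void initialize(InputSplit s, TaskAttemptContext c) {
            fileSplit = (FileSplit) s;
            ctx = c;
          }

          @Override public boolean nextKeyValue() throws IOException {
            if (done) return false;
            // Read the entire file into the value (assumes it fits in memory).
            byte[] contents = new byte[(int) fileSplit.getLength()];
            Path path = fileSplit.getPath();
            FileSystem fs = path.getFileSystem(ctx.getConfiguration());
            FSDataInputStream in = fs.open(path);
            try {
              IOUtils.readFully(in, contents, 0, contents.length);
            } finally {
              IOUtils.closeStream(in);
            }
            value.set(contents, 0, contents.length);
            done = true;
            return true;
          }

          @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
          @Override public BytesWritable getCurrentValue() { return value; }
          @Override public float getProgress() { return done ? 1.0f : 0.0f; }
          @Override public void close() { }
        };
      }
    }

A mapper using this format receives the full image bytes in one value and can run a whole-image algorithm unchanged, while chunk-based algorithms can simply use the default splitting behaviour.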
Hadoop file system
Hi,

I am new to hdfs. In my java application, I need to perform a 'similar operation' over a large number of files, and I would like to store those files on distributed machines. I don't think I will need the map reduce paradigm, but I would still like to use HDFS for file storage and access. Is it possible (or a good idea) to use HDFS as a standalone component? And are java APIs available to work with HDFS so that I can read and write in a distributed environment? Any thoughts here will be helpful.

With thanks and regards
Balachandar
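HDFS can indeed be used on its own, and org.apache.hadoop.fs.FileSystem is the usual Java entry point, as Mahesh notes above. A minimal read/write sketch; the namenode address and file path are placeholders for whatever your cluster uses:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadWrite {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder namenode address; normally picked up from core-site.xml on the classpath.
        conf.set("fs.default.name", "hdfs://namenode-host:9000");
        FileSystem fs = FileSystem.get(conf);

        // Write a small file.
        Path file = new Path("/user/bala/example.txt");
        FSDataOutputStream out = fs.create(file, true); // true = overwrite if it exists
        out.writeBytes("hello hdfs\n");
        out.close();

        // Read it back.
        FSDataInputStream in = fs.open(file);
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line = reader.readLine()) != null) {
          System.out.println(line);
        }
        reader.close();
        fs.close();
      }
    }

Since HDFS favours a small number of large files, packing many small files into larger containers (for example SequenceFiles) is usually worth considering for this kind of workload.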