Fwd: XML to TEXT
Please help to convert this XML to text. I have attached the XML; please find the attachment. Some students have two address tags, some have one, and some have no address tag at all. I need to convert the XML into a string. This is my desired output:

100,ranjini,HOME,a street,ad street,ads street,chennai,tn,OFFICE,adsja1 street,adsja2 street,adsja3 street,mumbai,Maharastra
101,nivetha,HOME,a street,ad street,ads street,chennai,tn
102,siva

In plain Java I have written this using recursion, but how do I write it in MapReduce? Please help. Thanks in advance. Regards, Ranjini R

On Fri, Jan 10, 2014 at 12:47 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote: Hi, It's working fine. The problem was in the XML, the spaces I had added. Thanks a lot. Regards, Ranjini.R

On Thu, Jan 9, 2014 at 10:47 PM, Diego Gutierrez diego.gutier...@ucsp.edu.pe wrote: Hi, I'm sending you the Eclipse project with the code. Hope this helps. Regards Diego Gutiérrez

2014/1/9 Ranjini Rathinam ranjinibe...@gmail.com: Hi, I am using Java 1.6 and Hadoop 0.20 on Ubuntu 12.04. If possible, please send the jar and code for review. Thanks for the support, Ranjini

On Wed, Jan 8, 2014 at 11:00 PM, Diego Gutierrez diego.gutier...@ucsp.edu.pe wrote: Hi, I've noticed that your XML file has line breaks. By default Hadoop splits every file into lines and passes them to the map function; in other words, each map() call processes one line of the file. Please remove the line breaks from your XML and try again. I've tested here with your XML file (just changing DTMNodeList list = (DTMNodeList) getNode("/Company/Employee", doc, XPathConstants.NODESET)) and this is the output in result.txt:

id,name
100,ranjini,IT1,123456,nextlevel1,Chennai1Navallur1
1001,ranjinikumar,IT,1234516,nextlevel,ChennaiNavallur

Note: I don't know if the Java version or Hadoop version can be the problem here. I'm using Ubuntu 12.04, Oracle Java 7 and Hadoop 2.2.0. If you want, I can send you the jar file with the code :) Regards Diego Gutiérrez.

2014/1/7 Ranjini Rathinam ranjinibe...@gmail.com: Hi Gutierrez, As suggested I tried with the code, but in result.txt I got only the header; nothing else was printed. After debugging I came to know that nothing is parsed. The problem is in the line given below, which is bold; when I added a SysOut I found no value printed at this line.

String xmlContent = value.toString();
InputStream is = new ByteArrayInputStream(xmlContent.getBytes());
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
    builder = factory.newDocumentBuilder();
    *Document doc = builder.parse(is);*
    String ed = doc.getDocumentElement().getNodeName();
    out.write(ed.getBytes());
    DTMNodeList list = (DTMNodeList) getNode("/Company/Employee", doc, XPathConstants.NODESET);

When I print out.write(xmlContent.getBytes()), the whole XML is printed. Then I added a SysOut for list and nothing printed. out.write(ed.getBytes()) also prints nothing. Please suggest where I am going wrong, and please help to fix this. Thanks in advance. I have attached my code, please review.
Mapper class:-

public class XmlTextMapper extends Mapper<LongWritable, Text, Text, Text> {
    private static final XPathFactory xpathFactory = XPathFactory.newInstance();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String resultFileName = "/user/task/Sales/result.txt";
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(resultFileName), conf);
        FSDataOutputStream out = fs.create(new Path(resultFileName));
        InputStream resultIS = new ByteArrayInputStream(new byte[0]);
        String header = "id,name\n";
        out.write(header.getBytes());
        String xmlContent = value.toString();
        InputStream is = new ByteArrayInputStream(xmlContent.getBytes());
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder;
        try {
            builder = factory.newDocumentBuilder();
            Document doc = builder.parse(is);
            String ed = doc.getDocumentElement().getNodeName();
            out.write(ed.getBytes());
            DTMNodeList list = (DTMNodeList) getNode("/Company/Employee", doc, XPathConstants.NODESET);
            int size = list.getLength();
            for (int i = 0; i < size; i++) {
                Node node = list.item(i);
                String line = "";
                NodeList nodeList = node.getChildNodes();
                int childNumber = nodeList.getLength();
                for (int j = 0; j < childNumber; j++)
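For reference, a minimal sketch of the kind of mapper this thread is converging on, not Ranjini's attached code. It assumes each map() call receives one complete <Student> (or <Employee>) element, for example via an XML-aware input format or one record per line, and that the element names match the sample output above; it emits through context.write() instead of writing result.txt directly from every map() call, so the framework manages the output files:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class StudentXmlToCsvMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(value.toString().getBytes("UTF-8")));
            StringBuilder line = new StringBuilder();
            flatten(doc.getDocumentElement(), line);   // same recursion as in plain Java
            context.write(new Text(line.toString()), NullWritable.get());
        } catch (Exception e) {
            // Malformed record: count and skip it rather than failing the whole task.
            context.getCounter("xml", "parse_errors").increment(1);
        }
    }

    // Depth-first walk: append the text of leaf elements, comma separated.
    // Students with 0, 1 or 2 Address children all work, because we simply
    // visit whatever child elements are present.
    private void flatten(Node node, StringBuilder line) {
        NodeList children = node.getChildNodes();
        boolean hasElementChild = false;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
                flatten(child, line);
            }
        }
        if (!hasElementChild) {
            String text = node.getTextContent().trim();
            if (text.length() > 0) {
                if (line.length() > 0) {
                    line.append(',');
                }
                line.append(text);
            }
        }
    }
}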
Test hadoop code on the cloud
Hi! I need to test my Hadoop code on a cluster. What is the simplest way to do this on the cloud? Is there any way to do it for free? Thanks in advance
Re: Test hadoop code on the cloud
I think you can test it on Amazon EC2 in pseudo-distributed mode; the free tier gives you one tiny instance free for a year. Sent from my MacBook On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com wrote: Hi! I need to test my Hadoop code on a cluster. What is the simplest way to do this on the cloud? Is there any way to do it for free? Thanks in advance
Re: Test hadoop code on the cloud
Thanks for the answer, but what if I want to test my code on a fully distributed installation (for more accurate performance)? 2014-02-12 13:01 GMT+01:00 Zhao Xiaoguang cool...@gmail.com: I think you can test it on Amazon EC2 in pseudo-distributed mode; the free tier gives you one tiny instance free for a year. Sent from my MacBook On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com wrote: Hi! I need to test my Hadoop code on a cluster. What is the simplest way to do this on the cloud? Is there any way to do it for free? Thanks in advance
Re: XML to TEXT
Which input format are you using? Use an XML input format. On 3 Jan 2014 10:47, Ranjini Rathinam ranjinibe...@gmail.com wrote: Hi, I need to convert XML into text using MapReduce. I have used the DOM and SAX parsers. After using SAX Builder in the mapper class, a child node acts as the root element. Looking at the SysOut I found that the root element is picking up the child element and printing it. For example, with this XML: <Comp><Emp><id>100</id><name>RR</name></Emp></Comp> when it is passed to the mapper, the SysOut prints the root element as id name. Please suggest and help to fix this. I need to convert the XML into text using MapReduce code. Please provide an example. The required output is: id,name 100,RR Please help. Thanks in advance, Ranjini R
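To make "use an XML input format" concrete, here is a hedged driver sketch. XmlInputFormat is assumed to be an implementation already on the classpath (for example the one that ships with Mahout, or a custom copy), and the xmlinput.start / xmlinput.end keys are the ones that implementation reads; adjust them to whatever your input format expects:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
// import your XmlInputFormat implementation here (assumed, not part of Hadoop itself)

public class XmlToTextDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Each map() call then receives one complete <Emp>...</Emp> block,
        // so the child elements no longer get split across lines.
        conf.set("xmlinput.start", "<Emp>");
        conf.set("xmlinput.end", "</Emp>");

        Job job = new Job(conf, "xml to text");
        job.setJarByClass(XmlToTextDriver.class);
        job.setInputFormatClass(XmlInputFormat.class);   // assumed to be on the classpath
        job.setMapperClass(StudentXmlToCsvMapper.class); // e.g. the mapper sketched earlier
        job.setNumReduceTasks(0);                        // map-only conversion
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}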
Re: Test hadoop code on the cloud
You can check Amazon Elastic MapReduce, which comes preconfigured on EC2 but you need to pay a little for it, or make your own custom installation on EC2 (beware that EC2 instances come with nothing but really basic shell tools on them, so it may take a while to get it running). Amazon's free tier allows you to instantiate several tiny machines; when you spend your free quota they start charging you, so be careful. Good luck :D On 12 February 2014 13:27, Andrea Barbato and.barb...@gmail.com wrote: Thanks for the answer, but what if I want to test my code on a fully distributed installation (for more accurate performance)? 2014-02-12 13:01 GMT+01:00 Zhao Xiaoguang cool...@gmail.com: I think you can test it on Amazon EC2 in pseudo-distributed mode; the free tier gives you one tiny instance free for a year. Sent from my MacBook On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com wrote: Hi! I need to test my Hadoop code on a cluster. What is the simplest way to do this on the cloud? Is there any way to do it for free? Thanks in advance
Re: Chain Jobs in C++ with Pipes
I've been dealing with a similar situation and I haven't found any solution other than launching two independent jobs (with a script or whatever you like), letting the output of the first be the input of the last. If you find any other option please let me know. Regards On 12 February 2014 12:55, Massimo Simoniello massimo.simonie...@gmail.com wrote: Hi, I'm using Hadoop Pipes and I want to chain two jobs (job1 and job2). Is it possible? I use the FileInputFormat.addInputPath() and FileOutputFormat.setOutputPath() functions to do it in Java, but I want to know if there is some way to do it in C++ with Pipes. Thanks in advance, Massimo
Re: Test hadoop code on the cloud
As a slightly more advanced option for OpenStack people: consider trying Savanna (Hadoop provisioned on top of OpenStack) as well. On Wed, Feb 12, 2014 at 10:23 AM, Silvina Caíno Lores silvi.ca...@gmail.com wrote: You can check Amazon Elastic MapReduce, which comes preconfigured on EC2 but you need to pay a little for it, or make your own custom installation on EC2 (beware that EC2 instances come with nothing but really basic shell tools on them, so it may take a while to get it running). Amazon's free tier allows you to instantiate several tiny machines; when you spend your free quota they start charging you, so be careful. Good luck :D On 12 February 2014 13:27, Andrea Barbato and.barb...@gmail.com wrote: Thanks for the answer, but what if I want to test my code on a fully distributed installation (for more accurate performance)? 2014-02-12 13:01 GMT+01:00 Zhao Xiaoguang cool...@gmail.com: I think you can test it on Amazon EC2 in pseudo-distributed mode; the free tier gives you one tiny instance free for a year. Sent from my MacBook On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com wrote: Hi! I need to test my Hadoop code on a cluster. What is the simplest way to do this on the cloud? Is there any way to do it for free? Thanks in advance -- Jay Vyas http://jayunit100.blogspot.com
Re: Chain Jobs in C++ with Pipes
Yes, of course. It's a solution, but I need all the jobs in a single file, like in Java. Can anyone help me? 2014-02-12 16:34 GMT+01:00 Silvina Caíno Lores silvi.ca...@gmail.com: I've been dealing with a similar situation and I haven't found any solution other than launching two independent jobs (with a script or whatever you like), letting the output of the first be the input of the last. If you find any other option please let me know. Regards On 12 February 2014 12:55, Massimo Simoniello massimo.simonie...@gmail.com wrote: Hi, I'm using Hadoop Pipes and I want to chain two jobs (job1 and job2). Is it possible? I use the FileInputFormat.addInputPath() and FileOutputFormat.setOutputPath() functions to do it in Java, but I want to know if there is some way to do it in C++ with Pipes. Thanks in advance, Massimo
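For the Java side that Massimo mentions, the chaining pattern is just two Job objects run back to back in one driver; a minimal sketch with placeholder paths is below (the real mapper, reducer and output classes of each stage would be set where noted). For C++ Pipes the same sequencing typically has to happen on the submitting side anyway, since each pipes job is still submitted as its own job:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]);   // output of job1 and input of job2
        Path output = new Path(args[2]);
        Configuration conf = new Configuration();

        Job job1 = new Job(conf, "job1");
        job1.setJarByClass(ChainDriver.class);
        // set the real mapper/reducer/output classes of the first stage here
        FileInputFormat.addInputPath(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);
        if (!job1.waitForCompletion(true)) {
            System.exit(1);                      // stop the chain if job1 fails
        }

        Job job2 = new Job(conf, "job2");
        job2.setJarByClass(ChainDriver.class);
        // set the real mapper/reducer/output classes of the second stage here
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}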
RE: very long timeout on failed RM connect
Setting conf.set("yarn.resourcemanager.connect.max-wait.ms", "500"); conf.set("yarn.resourcemanager.connect.retry-interval.ms", "500"); still results in a wait of around 15 seconds. Setting conf.set("ipc.client.connect.max.retries", "2"); also does not help. Is there a retry parameter that can be set? Thanks John From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Monday, February 10, 2014 12:12 PM To: user@hadoop.apache.org Subject: RE: very long timeout on failed RM connect I tried: conf.set("yarn.resourcemanager.connect.max-wait.ms", "1"); conf.set("yarn.resourcemanager.connect.retry-interval.ms", "1000"); but it has no apparent effect; it still hangs for a very long time. john From: Jian He [mailto:j...@hortonworks.com] Sent: Monday, February 10, 2014 11:05 AM To: user@hadoop.apache.org Subject: Re: very long timeout on failed RM connect Setting the following two properties may solve your problem: yarn.resourcemanager.connect.max-wait.ms controls the maximum time to wait to establish a connection to the ResourceManager; yarn.resourcemanager.connect.retry-interval.ms controls how often to try connecting to the ResourceManager. Jian On Mon, Feb 10, 2014 at 6:44 AM, John Lilley john.lil...@redpoint.net wrote: Our application (running outside the Hadoop cluster) connects to the RM through YarnClient. This works fine, except we've found that if the RM address or port is misconfigured in our software, or a firewall blocks access, the first call into the client (in this case getNodeReports) hangs for a very long time. I've tried conf.set("ipc.client.connect.max.retries", "2"); but this doesn't help. Is there a configuration setting I can make on the YarnClient that will reduce this hang time? I understand why this long-winded retry strategy exists: to prevent a highly loaded cluster from failing jobs. But it is not appropriate for an interactive application. Thanks John
Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found
What's the value for the io.compression.codecs config parameter? Thanks

On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote: I am running the wordcount example but I encounter an exception. I googled and found that LZO compression's license is incompatible with Apache's, so it is not built in. The question is: I am using the default configuration of hadoop 1.2.1, so why does it need LZO? Another question: what does "Cleaning up the staging area" mean?

./bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /lili/data.txt /lili/test
14/02/12 14:06:10 INFO input.FileInputFormat: Total input paths to process : 1
14/02/12 14:06:10 INFO mapred.JobClient: Cleaning up the staging area hdfs://172.19.34.24:8020/home/hadoop/dfsdir/hadoop-hadoop/mapred/staging/hadoop/.staging/job_201401080916_0216
java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:116)
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:156)
at org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:47)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:258)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:109)
Re: Can Yarn AppMaster move Container logs to hdfs?
Hi Emmanuel, log aggregation currently only aggregates a finished application's logs onto HDFS; there is no way as of now to do it for running apps. That will be a to-do feature in the future. Jian On Mon, Feb 10, 2014 at 11:53 AM, Emmanuel Espina espinaemman...@gmail.com wrote: Sorry, when I said log running I meant LONG running, that is, a service, not a batch job 2014-02-10 16:49 GMT-03:00 Emmanuel Espina espinaemman...@gmail.com: I'm building a custom YARN app where the App Master is a log running service that can start jobs in containers. The ideal situation for us would be to be able to move all the logs produced by each container to HDFS. The aggregation option that YARN provides does this, but only after the app master finishes (and in our case it never finishes). Is there any way of doing this? Thanks Emmanuel
Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found
For the codecs, you can choose among org.apache.hadoop.io.compress.*Codec. LzoCodec has been moved out of Hadoop (see HADOOP-4874). - Zhijie On Wed, Feb 12, 2014 at 10:54 AM, Ted Yu yuzhih...@gmail.com wrote: What's the value for the io.compression.codecs config parameter? Thanks On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote: I am running the wordcount example but I encounter an exception: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: very long timeout on failed RM connect
ipc.client.connect.retry.interval sets the underlying IPC retry interval; yarn.resourcemanager.connect.retry-interval.ms sets the upper-layer ClientRMProxy retry interval. Each ClientRMProxy retry includes one full round of retries of the underlying IPC, and within each ClientRMProxy retry the maximum number of underlying IPC retries is controlled by ipc.client.connect.max.retries. Did you try setting both? Jian On Wed, Feb 12, 2014 at 8:36 AM, John Lilley john.lil...@redpoint.net wrote: Setting conf.set("yarn.resourcemanager.connect.max-wait.ms", "500"); conf.set("yarn.resourcemanager.connect.retry-interval.ms", "500"); still results in a wait of around 15 seconds. Setting conf.set("ipc.client.connect.max.retries", "2"); also does not help. Is there a retry parameter that can be set? Thanks John From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Monday, February 10, 2014 12:12 PM To: user@hadoop.apache.org Subject: RE: very long timeout on failed RM connect I tried: conf.set("yarn.resourcemanager.connect.max-wait.ms", "1"); conf.set("yarn.resourcemanager.connect.retry-interval.ms", "1000"); but it has no apparent effect; it still hangs for a very long time. john From: Jian He [mailto:j...@hortonworks.com] Sent: Monday, February 10, 2014 11:05 AM To: user@hadoop.apache.org Subject: Re: very long timeout on failed RM connect Setting the following two properties may solve your problem: yarn.resourcemanager.connect.max-wait.ms controls the maximum time to wait to establish a connection to the ResourceManager; yarn.resourcemanager.connect.retry-interval.ms controls how often to try connecting to the ResourceManager. Jian On Mon, Feb 10, 2014 at 6:44 AM, John Lilley john.lil...@redpoint.net wrote: Our application (running outside the Hadoop cluster) connects to the RM through YarnClient. This works fine, except we've found that if the RM address or port is misconfigured in our software, or a firewall blocks access, the first call into the client (in this case getNodeReports) hangs for a very long time. I've tried conf.set("ipc.client.connect.max.retries", "2"); but this doesn't help. Is there a configuration setting I can make on the YarnClient that will reduce this hang time? I understand why this long-winded retry strategy exists: to prevent a highly loaded cluster from failing jobs. But it is not appropriate for an interactive application. Thanks John
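Putting Jian's two layers together, a hedged sketch of a client-side setup follows. The values are illustrative rather than tuned, and whether this fully bounds the hang can also depend on socket-level connect timeouts, so treat it as a starting point:

import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ShortRetryYarnClient {
    public static YarnClient create(String rmAddress) {
        YarnConfiguration conf = new YarnConfiguration();
        conf.set(YarnConfiguration.RM_ADDRESS, rmAddress);

        // Upper layer: how long and how often the RM proxy keeps retrying as a whole.
        conf.setLong("yarn.resourcemanager.connect.max-wait.ms", 5000);
        conf.setLong("yarn.resourcemanager.connect.retry-interval.ms", 1000);

        // Lower layer: how many times each individual IPC connection attempt is retried,
        // and how long to sleep between those attempts.
        conf.setInt("ipc.client.connect.max.retries", 2);
        conf.setInt("ipc.client.connect.retry.interval", 500);

        YarnClient client = YarnClient.createYarnClient();
        client.init(conf);
        client.start();
        return client;
    }
}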
Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
  <description>A list of the compression codec classes that can be used for compression/decompression.</description>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

On Thu, Feb 13, 2014 at 2:54 AM, Ted Yu yuzhih...@gmail.com wrote: What's the value for the io.compression.codecs config parameter? Thanks On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote: I am running the wordcount example but I encounter an exception: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
OPENFORWRITE Files issue
Say I have a text file on HDFS in OPENFORWRITE, HEALTHY status, and some process is appending to it. It has 4 lines in it: hadoop fs -cat /file | wc -l returns 4. However, when I run a wordcount on this file, only the first line is visible to MapReduce. Similarly, in Hive, select count(*) from filetable returns 1. If I do hadoop fs -cp /file /file2, then it works as expected (file2 is closed, file is still open): wordcount sees 5 lines in the input directory (1 from the opened file, 4 from the copied file), and Hive returns 5. I am wondering if this is anything related to TextInputFormat? I am using CDH 4.4.0. Thanks. Xiao Li
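Since the copy to /file2 was already verified to work, here is a small sketch that just automates that workaround before submitting the job; the paths are placeholders, and this sidesteps rather than explains how much of an OPENFORWRITE file a job can see:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class SnapshotOpenFile {
    public static Path snapshot(Configuration conf) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path open = new Path("/file");             // still OPENFORWRITE
        Path closed = new Path("/file.snapshot");  // closed copy for the job to read
        if (fs.exists(closed)) {
            fs.delete(closed, false);
        }
        // copy() writes and then closes the destination, so its full length is
        // visible to FileInputFormat when the job computes its splits.
        FileUtil.copy(fs, open, fs, closed, false, conf);
        return closed;
    }
}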
Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found
Please remove LzoCodec from the config. Cheers On Feb 12, 2014, at 5:12 PM, Li Li fancye...@gmail.com wrote: <property> <name>io.compression.codecs</name> <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value> </property> <property> <name>io.compression.codec.lzo.class</name> <value>com.hadoop.compression.lzo.LzoCodec</value> </property>
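For reference, after dropping the LZO entries the core-site.xml property would look something like this (keeping only codecs that ship with Apache Hadoop); the io.compression.codec.lzo.class property can be removed as well, or hadoop-lzo can be installed separately if LZO is actually needed:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  <description>A list of the compression codec classes that can be used for compression/decompression.</description>
</property>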
Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found
Thanks, it's correct now. On Thu, Feb 13, 2014 at 9:37 AM, Ted Yu yuzhih...@gmail.com wrote: Please remove LzoCodec from the config. Cheers
hadoop 2.2.0 QJM exception : NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage
Hi All, I don't know why the journal node logs has this weird NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage exception. This error occurs each time I switch my namenode from standby to active 2014-02-13 10:34:47,873 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133208 - /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133208-0133318 2014-02-13 10:36:38,492 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_inprogress_281 - /data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_281-282 2014-02-13 10:36:51,118 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133319 - /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133319-0133422 2014-02-13 10:38:38,755 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_inprogress_283 - /data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_283-284 2014-02-13 10:38:54,620 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133423 - /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133423-0133432 2014-02-13 10:40:27,543 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Updating lastPromisedEpoch from 2 to 3 for client /10.18.30.155 2014-02-13 10:40:27,569 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Scanning storage FileJournalManager(root=/data/hadoop/hadoop-data/journal/hadoopdev64) 2014-02-13 10:40:27,570 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.newEpoch from 10.18.30.155:35408 Call#339 Retry#0: error: java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.validateEditLog(FSEditLogLoader.java:814) at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.validateEditLog(EditLogFileInputStream.java:289) at org.apache.hadoop.hdfs.server.namenode.FileJournalManager$EditLogFile.validateLog(FileJournalManager.java:457) at org.apache.hadoop.hdfs.qjournal.server.Journal.scanStorageForLatestEdits(Journal.java:189) at org.apache.hadoop.hdfs.qjournal.server.Journal.newEpoch(Journal.java:301) at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.newEpoch(JournalNodeRpcServer.java:132) at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.newEpoch(QJournalProtocolServerSideTranslatorPB.java:114) at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:17439) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042) 2014-02-13 10:40:58,074 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133433 - /data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133433-0133548 Below is the partial logs from namenode when it try to activate but failed abruptly: 2014-02-13 10:40:27,389 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state 2014-02-13 10:40:27,390 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer interrupted java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:334) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279) at
(Solved) hadoop 2.2.0 QJM exception : NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage
Dear All, Sorry, I found the root cause of this problem: it appears that I overwrote hadoop-hdfs-2.2.0.jar with my own custom jar but forgot to restart the journal node process, so the process could not find the FSImage class, even though it is actually there inside my custom jar. Note to myself: make sure to shut down all processes before replacing the jar(s). Best regards, Henry From: MA11 YTHung1 Sent: Thursday, February 13, 2014 10:49 AM To: user@hadoop.apache.org Subject: hadoop 2.2.0 QJM exception : NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage
Unable to load native-hadoop library for your platform
I am trying to run an example and I get the following error:

HadoopMaster-nh:~# /root/Programs/hadoop/bin/hdfs dfs -count /wiki
OpenJDK 64-Bit Server VM warning: You have loaded library /root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/02/13 05:24:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

I tried to run execstack -c, but the problem stays the same. Any help?

HadoopMaster-nh:~# execstack -c /root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0
HadoopMaster-nh:~# /root/Programs/hadoop/bin/hdfs dfs -count /wiki
OpenJDK 64-Bit Server VM warning: You have loaded library /root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/02/13 05:26:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
RE: Unable to load native-hadoop library for your platform
Funny, I was just trying to add something to the wiki addressing this. These instructions are for 2.2, but I imagine that 2.0.5 is probably very similar. If the formatting doesn't come through for whatever reason, I posted the same thing here: http://answers.splunk.com/answers/118174/hunk-reports-an-error-with-apache-hadoop?page=1&focusedAnswerId=122311#122311

This isn't necessarily a big problem - Hadoop will function without native libraries. You may find it easier to ignore the message or disable/redirect JVM warnings. You can disable the error message or redirect it to stderr, but that only moves the error out of your way and doesn't deal with the root problem. The root problem is that the Hadoop distribution does not include native libraries; they must be compiled from source. You can build your own distribution that includes native libraries using the following steps:

1) Install developer tools and dependencies.

1a) From repositories:
apt-get install gcc g++ make maven cmake zlib zlib1g-dev
For RedHat environments, you can probably use a similar yum line:
yum install gcc g++ make maven cmake zlib zlib-devel
There may be some other dependencies or slightly different package names depending on what you already have installed and what OS you are running. If so, some google-able errors will pop up during the rest of the process.

1b) Protocol Buffers from source:
mkdir /tmp/protobuf
cd /tmp/protobuf
wget http://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
tar -xvzf ./protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure --prefix=/usr
make
sudo make install
cd java
mvn install
mvn package
sudo ldconfig
cd /tmp
rm -rf protobuf

2) Download the Hadoop source:
mkdir /tmp/hadoop-build
cd /tmp/hadoop-build
wget http://apache.petsads.us/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz
tar -xvzf ./hadoop-2.2.0-src.tar.gz
cd hadoop-2.2.0-src

3) Edit the hadoop-auth pom file:
vi hadoop-common-project/hadoop-auth/pom.xml
and add the following dependency:
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <scope>test</scope>
</dependency>
You should see an already existing dependency that looks very similar if you search for org.mortbay.jetty; add this dependency above or below it.

4) Compile it:
export Platform=x64
cd /tmp/hadoop-build/hadoop-2.2.0-src
mvn clean install -DskipTests
cd hadoop-mapreduce-project
mvn package -Pdist,native -DskipTests=true -Dtar
cd /tmp/hadoop-build/hadoop-2.2.0-src
mvn package -Pdist,native -DskipTests=true -Dtar

5) Copy your natively compiled distribution somewhere to be saved:
cp /tmp/hadoop-build/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0.tar.gz /my/distribution/share/hadoop-2.2.0.tar.gz

6) Delete the build files (once you are satisfied that everything is working properly):
cd /tmp
rm -rf hadoop-build

Now any fresh installation based on this build will include native 64-bit libraries. You can set up a new instance of Hadoop locally, or you can simply overwrite the files in the $HADOOP_INSTALL/lib/native directory with those in your hadoop-2.2.0.tar.gz file.

From: xeon Mailinglist [mailto:xeonmailingl...@gmail.com] Sent: Wednesday, February 12, 2014 9:28 PM To: user@hadoop.apache.org Subject: Unable to load native-hadoop library for your platform
Password not found for ApplicationAttempt
My MR job failed due to below error, I'm running YARN 2.2.0 release. Does anybody know what the error means and how to fix it? 2014-02-12 18:25:31,748 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xinx (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Password not found for ApplicationAttempt appattempt_1392258153412_0001_02 2014-02-12 18:25:31,749 WARN [main] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Password not found for ApplicationAttempt appattempt_1392258153412_0001_02 2014-02-12 18:25:31,749 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:xinx (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Password not found for ApplicationAttempt appattempt_1392258153412_0001_02 2014-02-12 18:25:31,752 ERROR [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while registering org.apache.hadoop.security.token.SecretManager$InvalidToken: Password not found for ApplicationAttempt appattempt_1392258153412_0001_02 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy29.registerApplicationMaster(Unknown Source) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:154) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:112) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:213) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:811) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1061) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1445) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1441) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1374) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Password not found for ApplicationAttempt appattempt_1392258153412_0001_02 at org.apache.hadoop.ipc.Client.call(Client.java:1347) at org.apache.hadoop.ipc.Client.call(Client.java:1300) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106) ... 22 more 2014-02-12 18:25:31,754 INFO [main] org.apache.hadoop.service.AbstractService: Service RMCommunicator failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: Password not found for ApplicationAttempt