Fwd: XML to TEXT

2014-02-12 Thread Ranjini Rathinam

 Please help to convert this xml to text.


  I have attached the XML. Please find the attachment.

 Some students have two address tags, some have one address tag, and
 some have no address tag at all.

 I need to convert the XML into a string.

 This is my desired output:

 100,ranjini,HOME,a street,ad street,ads street,chennai,tn,OFFICE,adsja1
 street,adsja2 street,adsja3 street,mumbai,Maharastra
 101,nivetha,HOME,a street,ad street,ads street,chennai,tn
 102,siva


 In plain Java I have written it using recursion, but how do I write it in
 MapReduce?

 How do I write the code in MapReduce? Please help.

 Thanks in advance.
  Regards,
 Ranjini R


 On Fri, Jan 10, 2014 at 12:47 PM, Ranjini Rathinam 
 ranjinibe...@gmail.com wrote:

 Hi,

 It's working fine. The problem was in the XML: the spaces I had put in.

 Thanks a lot.

 Regards,
 Ranjini.R

  On Thu, Jan 9, 2014 at 10:47 PM, Diego Gutierrez 
 diego.gutier...@ucsp.edu.pe wrote:

  Hi,

 I'm sending you the eclipse project with the code. Hope this helps.

 Regards
 Diego Gutiérrez



 2014/1/9 Ranjini Rathinam ranjinibe...@gmail.com

 Hi,

 I am using Java 1.6, Hadoop 0.20 and Ubuntu 12.04 here.

 If possible please send the jar and code for review.

 Thanks for the support,

 Ranjini

  On Wed, Jan 8, 2014 at 11:00 PM, Diego Gutierrez 
 diego.gutier...@ucsp.edu.pe wrote:

   Hi,

   I've noticed that your xml file has line breaks. By default, Hadoop
 splits every file into lines and passes them to the map function; in other
 words, each map call processes one line of the file. Please remove the
 line breaks from your xml and try again. I've tested here with your xml
 file (just changing DTMNodeList list = (DTMNodeList)
 getNode("/Company/Employee", doc,
 XPathConstants.NODESET) ) and this is the output
 in result.txt:


 id,name
 100,ranjini,IT1,123456,nextlevel1,Chennai1Navallur1
 1001,ranjinikumar,IT,1234516,nextlevel,ChennaiNavallur


 Note: I don't know if the Java version or the Hadoop version could be the
 problem here. I'm using Ubuntu 12.04, Oracle Java 7 and Hadoop 2.2.0.


 If you want, I can send you the jar file with the code :)

 Regards
 Diego Gutiérrez.



 2014/1/7 Ranjini Rathinam ranjinibe...@gmail.com

 Hi Gutierrez ,

 As suggested, I tried the code, but in result.txt I got
 only the header. Nothing else was printed.

 After debugging I found that, while parsing, there is no
 value.

 The problem is in the line given below, shown in bold. When I put a
 SysOut there I found no value printed at this line.

  String xmlContent = value.toString();

 InputStream is = new
 ByteArrayInputStream(xmlContent.getBytes());
 DocumentBuilderFactory factory =
 DocumentBuilderFactory.newInstance();
 DocumentBuilder builder;
 try {
 builder = factory.newDocumentBuilder();

 * Document doc = builder.parse(is);*
String ed=doc.getDocumentElement().getNodeName();
out.write(ed.getBytes());
 DTMNodeList list = (DTMNodeList)
 getNode("/Company/Employee", doc, XPathConstants.NODESET);

 When I am printing

 out.write(xmlContent.getBytes()): the whole XML is printed.

 Then I added a SysOut for list: nothing is printed.
  out.write(ed.getBytes()): nothing is printed.

 Please suggest where I am going wrong and please help me fix this.

 Thanks in advance.

 I have attached my code. Please review.


 Mapper class:-

 public class XmlTextMapper extends Mapper<LongWritable, Text, Text,
 Text> {
  private static final XPathFactory xpathFactory =
 XPathFactory.newInstance();
 @Override
 public void map(LongWritable key, Text value, Context context)
 throws IOException, InterruptedException {
 String resultFileName = "/user/task/Sales/result.txt";

 Configuration conf = new Configuration();
 FileSystem fs = FileSystem.get(URI.create(resultFileName),
 conf);
 FSDataOutputStream out = fs.create(new Path(resultFileName));
 InputStream resultIS = new ByteArrayInputStream(new byte[0]);
 String header = "id,name\n";
 out.write(header.getBytes());
  String xmlContent = value.toString();

 InputStream is = new
 ByteArrayInputStream(xmlContent.getBytes());
 DocumentBuilderFactory factory =
 DocumentBuilderFactory.newInstance();
 DocumentBuilder builder;
 try {
 builder = factory.newDocumentBuilder();
 Document doc = builder.parse(is);
String ed = doc.getDocumentElement().getNodeName();
out.write(ed.getBytes());
 DTMNodeList list = (DTMNodeList)
 getNode("/Company/Employee", doc, XPathConstants.NODESET);
  int size = list.getLength();
 for (int i = 0; i < size; i++) {
 Node node = list.item(i);
 String line = "";
 NodeList nodeList = node.getChildNodes();
 int childNumber = nodeList.getLength();
 for (int j = 0; j < childNumber; j++)
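
 For reference, the getNode(...) helper that the code above calls is not shown
 anywhere in the thread. A minimal sketch of what such an XPath helper might
 look like, assuming the standard javax.xml.xpath API (an illustration only,
 not Ranjini's or Diego's actual code):

 import javax.xml.namespace.QName;
 import javax.xml.xpath.XPath;
 import javax.xml.xpath.XPathExpressionException;
 import javax.xml.xpath.XPathFactory;
 import org.w3c.dom.Document;

 public final class XPathHelper {
     private static final XPathFactory xpathFactory = XPathFactory.newInstance();

     // Evaluates an XPath expression against a parsed DOM document and returns
     // the result as the requested type (e.g. XPathConstants.NODESET).
     public static Object getNode(String expression, Document doc, QName returnType)
             throws XPathExpressionException {
         XPath xpath = xpathFactory.newXPath();
         return xpath.evaluate(expression, doc, returnType);
     }
 }

 Called as getNode("/Company/Employee", doc, XPathConstants.NODESET), this
 returns a NodeList; the cast to DTMNodeList in the code above only works with
 the Xalan-based XPath implementation, so casting to the standard
 org.w3c.dom.NodeList is the safer choice.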
   

Test hadoop code on the cloud

2014-02-12 Thread Andrea Barbato
Hi!
I need to test my Hadoop code on a cluster;
what is the simplest way to do this on the cloud?
Is there any way to do it for free?

Thanks in advance


Re: Test hadoop code on the cloud

2014-02-12 Thread Zhao Xiaoguang
I think you can test it on Amazon EC2 in pseudo-distributed mode; it supports 1 
tiny instance free for 1 year. 


Send From My Macbook

On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com wrote:

 Hi! 
 I need to test my hadoop code on a cluster, 
 what is the simplest way to do this on the cloud?
 Is there any way to do it for free?
 
 Thank in advance



Re: Test hadoop code on the cloud

2014-02-12 Thread Andrea Barbato
Thanks for the answer, but what if I want to test my code on a fully distributed
installation (for more accurate performance)?


2014-02-12 13:01 GMT+01:00 Zhao Xiaoguang cool...@gmail.com:

 I think you can test it in Amazon EC2 with pseudo distribute, it support 1
 tiny instance for 1 year free.


 Send From My Macbook

 On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com wrote:

  Hi!
  I need to test my hadoop code on a cluster,
  what is the simplest way to do this on the cloud?
  Is there any way to do it for free?
 
  Thank in advance




Re: XML to TEXT

2014-02-12 Thread Shekhar Sharma
Which input format are you using? Use an XML input format.
On 3 Jan 2014 10:47, Ranjini Rathinam ranjinibe...@gmail.com wrote:

 Hi,

 I need to convert XML into text using MapReduce.

 I have used the DOM and SAX parsers.

 After using SAXBuilder in the mapper class, the child node acts as the root
 element.

 Looking at the SysOut output, I found that the root element is taking the child
 element and printing it.

 For example, given this XML:

 <Comp><Emp><id>100</id><name>RR</name></Emp></Comp>

 When this XML is passed to the mapper and the root element is printed in SysOut,
 I am getting the root element as

 id
 name

 Please suggest and help to fix this.

 I need to convert the XML into text using MapReduce code. Please provide
 an example.

 Required output is

 id,name
 100,RR

 Please help.

 Thanks in advance,
 Ranjini R




Re: Test hadoop code on the cloud

2014-02-12 Thread Silvina Caíno Lores
You can check Amazon Elastic MapReduce, which comes preconfigured on EC2
but requires you to pay a little for it, or make your own custom installation on
EC2 (beware that EC2 instances come with nothing but really basic shell tools
on them, so it may take a while to get everything running).

Amazon's free tier allows you to instantiate several tiny machines; once
you spend your free quota they start charging you, so be careful.

Good luck :D




On 12 February 2014 13:27, Andrea Barbato and.barb...@gmail.com wrote:

 Thanks for the answer, but if i want to test my code on a full
 distributed installation? (for more accurate performance)


 2014-02-12 13:01 GMT+01:00 Zhao Xiaoguang cool...@gmail.com:

 I think you can test it in Amazon EC2 with pseudo distribute, it support 1
 tiny instance for 1 year free.


 Send From My Macbook

 On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com
 wrote:

  Hi!
  I need to test my hadoop code on a cluster,
  what is the simplest way to do this on the cloud?
  Is there any way to do it for free?
 
  Thank in advance





Re: Chain Jobs in C++ with Pipes

2014-02-12 Thread Silvina Caíno Lores
I've been dealing with a similar situation and I haven't found any
solution other than launching two independent jobs (with a script or
whatever you like), letting the output of the first be the input of the
second. If you find any other option please let me know.

Regards


On 12 February 2014 12:55, Massimo Simoniello
massimo.simonie...@gmail.comwrote:

 Hi,

 I'm using Hadoop Pipes and I want to chain two jobs (job1 and job2). Is it
 possible?
 I use the FileInputFormat.addInputPath()
 and FileOutputFormat.setOutputPath() functions to do it in Java, but I want
 to know if there is some way to do it in C++ with Pipes.

 Thanks in advance,

 Massimo
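
For what it's worth, the Java-side pattern Massimo mentions (chaining two jobs
in one driver via FileInputFormat.addInputPath() and
FileOutputFormat.setOutputPath()) might look roughly like the sketch below. It
is only an illustration of the Java approach, not a Pipes solution; the
identity Mapper/Reducer stand in for real job classes, and the paths come from
the command line.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Job 1: reads the raw input, writes an intermediate directory.
        Job job1 = Job.getInstance(conf, "job1");
        job1.setJarByClass(ChainDriver.class);
        job1.setMapperClass(Mapper.class);    // identity mapper; replace with your own
        job1.setReducerClass(Reducer.class);  // identity reducer; replace with your own
        job1.setOutputKeyClass(LongWritable.class);
        job1.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job1, new Path(args[0]));
        FileOutputFormat.setOutputPath(job1, new Path(args[1]));
        if (!job1.waitForCompletion(true)) {
            System.exit(1);
        }

        // Job 2: the output of job1 becomes the input of job2.
        Job job2 = Job.getInstance(conf, "job2");
        job2.setJarByClass(ChainDriver.class);
        job2.setMapperClass(Mapper.class);    // identity mapper; replace with your own
        job2.setReducerClass(Reducer.class);  // identity reducer; replace with your own
        job2.setOutputKeyClass(LongWritable.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job2, new Path(args[1]));
        FileOutputFormat.setOutputPath(job2, new Path(args[2]));
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}

Run as: hadoop jar chain.jar ChainDriver <input> <intermediate> <output>; the
intermediate directory written by job1 is passed straight to job2 as its input.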






Re: Test hadoop code on the cloud

2014-02-12 Thread Jay Vyas
As a slightly more advanced option for OpenStack people: consider trying
Savanna (Hadoop provisioned on top of OpenStack) as well.


On Wed, Feb 12, 2014 at 10:23 AM, Silvina Caíno Lores silvi.ca...@gmail.com
 wrote:

 You can check Amazon Elastic MapReduce, which comes preconfigured on EC2
 but you need to pay a little por it, or make your custom instalation on EC2
 (beware that EC2 instances come with nothing but really basic shell tools
 on it, so it may take a while to get it running).

 Amazon's free tier allows you to instantiate several tiny machines; when
 you spend your free quota they start charging you so be careful.

 Good luck :D




 On 12 February 2014 13:27, Andrea Barbato and.barb...@gmail.com wrote:

 Thanks for the answer, but if i want to test my code on a full
 distributed installation? (for more accurate performance)


 2014-02-12 13:01 GMT+01:00 Zhao Xiaoguang cool...@gmail.com:

 I think you can test it in Amazon EC2 with pseudo distribute, it support
 1 tiny instance for 1 year free.


 Send From My Macbook

 On Feb 12, 2014, at 6:29 PM, Andrea Barbato and.barb...@gmail.com
 wrote:

  Hi!
  I need to test my hadoop code on a cluster,
  what is the simplest way to do this on the cloud?
  Is there any way to do it for free?
 
  Thank in advance






-- 
Jay Vyas
http://jayunit100.blogspot.com


Re: Chain Jobs in C++ with Pipes

2014-02-12 Thread Massimo Simoniello
Yes, of course. It's a solution, but I need all the jobs in a single file, like
in Java. Can anyone help me?


2014-02-12 16:34 GMT+01:00 Silvina Caíno Lores silvi.ca...@gmail.com:

 I've been dealing with a similar situation and I haven't found other
 solution rather than launching two independent jobs (with a script or
 whatever you like), letting the output of the first be the input of the
 last. If you find any other option please let me know.

 Regards


 On 12 February 2014 12:55, Massimo Simoniello 
 massimo.simonie...@gmail.com wrote:

 Hi,

 I'm using Hadoop Pipes and I want to chain two jobs (job1 and job2). Is
 it possible?
 I use the FileInputFormat.addInputPath()
 and FileOutputFormat.setOutputPath() functions to do it in Java, but I want
 to know if there is some way for do it in C++ with pipes.

 Thanks in advance,

 Massimo







RE: very long timeout on failed RM connect

2014-02-12 Thread John Lilley
Setting
conf.set("yarn.resourcemanager.connect.max-wait.ms", "500");
conf.set("yarn.resourcemanager.connect.retry-interval.ms", "500");
still results in a wait of around 15 seconds.  Setting this:
   conf.set("ipc.client.connect.max.retries", "2");
Also does not help.  Is there a retry parameter that can be set?
Thanks
John

From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Monday, February 10, 2014 12:12 PM
To: user@hadoop.apache.org
Subject: RE: very long timeout on failed RM connect

I tried:
conf.set("yarn.resourcemanager.connect.max-wait.ms", "1");
conf.set("yarn.resourcemanager.connect.retry-interval.ms", "1000");

But it has no apparent effect.  Still hangs for a very long time.
john

From: Jian He [mailto:j...@hortonworks.com]
Sent: Monday, February 10, 2014 11:05 AM
To: user@hadoop.apache.org
Subject: Re: very long timeout on failed RM connect


Setting the following two properties may solve your problem.

yarn.resourcemanager.connect.max-wait.ms controls the maximum time to wait to 
establish a connection to the ResourceManager.

yarn.resourcemanager.connect.retry-interval.ms controls how often to try 
connecting to the ResourceManager.



Jian

On Mon, Feb 10, 2014 at 6:44 AM, John Lilley 
john.lil...@redpoint.net wrote:
Our application (running outside the Hadoop cluster) connects to the RM through 
YarnClient.  This works fine, except we've found that if the RM address or port 
is misconfigured in our software, or a firewall blocks access, the first call 
into the client (in this case getNodeReports) hangs for a very long time.  I've 
tried
conf.set("ipc.client.connect.max.retries", "2");
But this doesn't help.  Is there a configuration setting I can make on the 
YarnClient that will reduce this hang time?
I understand why this long-winded retry strategy exists, in order to prevent a 
highly-loaded cluster from failing jobs.  But it is not appropriate for an 
interactive application.
Thanks
John





Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found

2014-02-12 Thread Ted Yu
What's the value for io.compression.codecs config parameter ?

Thanks


On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote:

 I am running the wordcount example but encountered an exception.
 I googled and know that LZO compression's license is incompatible with Apache's,
 so it's not built in.
 The question is: I am using the default configuration of Hadoop 1.2.1, so why
 does it need LZO?
 Another question is, what does "Cleaning up the staging area" mean?


 ./bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /lili/data.txt
 /lili/test

 14/02/12 14:06:10 INFO input.FileInputFormat: Total input paths to process
 : 1
 14/02/12 14:06:10 INFO mapred.JobClient: Cleaning up the staging area
 hdfs://
 172.19.34.24:8020/home/hadoop/dfsdir/hadoop-hadoop/mapred/staging/hadoop/.staging/job_201401080916_0216
 java.lang.IllegalArgumentException: Compression codec
 com.hadoop.compression.lzo.LzoCodec not found.
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:116)
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.init(CompressionCodecFactory.java:156)
 at
 org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:47)
 at
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:258)
 at
 org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
 at
 org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
 at
 org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
 at
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
 at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
 at
 org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
 at
 org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 Caused by: java.lang.ClassNotFoundException:
 com.hadoop.compression.lzo.LzoCodec
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:264)
 at
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:109)



Re: Can Yarn AppMaster move Container logs to hdfs?

2014-02-12 Thread Jian He
Hi Emmanuel, log aggregation currently only aggregates a finished app's logs onto
hdfs. There's no way as of now to support running apps; that will be a to-do
feature in the future.

Jian



On Mon, Feb 10, 2014 at 11:53 AM, Emmanuel Espina
espinaemman...@gmail.comwrote:

 Sorry when I said log running I meant LONG running, that is, a service,
 not a batch job


 2014-02-10 16:49 GMT-03:00 Emmanuel Espina espinaemman...@gmail.com:

 I'm building a custom YARN app where the App Master is a log running
 service that can start jobs in containers.

 The ideal situation for us would be to be able to move all the logs
 produced by each container to hdfs. The aggregation options that yarn
 provides does this but only after the app master finishes (and in our case
 it never finishes).

 Is there any way of doing this?

 Thanks
 Emmanuel






Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found

2014-02-12 Thread Zhijie Shen
For the codecs, you can choose
among org.apache.hadoop.io.compress.*Codec. LzoCodec has been moved out of
Hadoop (see HADOOP-4874).

- Zhijie


On Wed, Feb 12, 2014 at 10:54 AM, Ted Yu yuzhih...@gmail.com wrote:

 What's the value for io.compression.codecs config parameter ?

 Thanks


 On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote:

 I am runing example of wordcout but encount an exception:
 I googled and know lzo compression's license is incompatible with apache's
 so it's not built in.
 the question is I am using default configuration of hadoop 1.2.1, why
 it need lzo?
 anothe question is, what's Cleaning up the staging area mean?


 ./bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /lili/data.txt
 /lili/test

 14/02/12 14:06:10 INFO input.FileInputFormat: Total input paths to
 process : 1
 14/02/12 14:06:10 INFO mapred.JobClient: Cleaning up the staging area
 hdfs://
 172.19.34.24:8020/home/hadoop/dfsdir/hadoop-hadoop/mapred/staging/hadoop/.staging/job_201401080916_0216
 java.lang.IllegalArgumentException: Compression codec
 com.hadoop.compression.lzo.LzoCodec not found.
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:116)
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.init(CompressionCodecFactory.java:156)
 at
 org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:47)
 at
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:258)
 at
 org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
 at
 org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
 at
 org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
 at
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
 at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
 at
 org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
 at
 org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 Caused by: java.lang.ClassNotFoundException:
 com.hadoop.compression.lzo.LzoCodec
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:264)
 at
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:109)





-- 
Zhijie Shen
Hortonworks Inc.
http://hortonworks.com/



Re: very long timeout on failed RM connect

2014-02-12 Thread Jian He
ipc.client.connect.retry.interval sets the underlying IPC retry interval.

yarn.resourcemanager.connect.retry-interval.ms sets the upper-layer
ClientRMProxy retry interval.


Each ClientRMProxy retry includes one full round of retries of the
underlying IPC. In each ClientRMProxy retry, the maximum number of underlying
IPC retries is controlled by ipc.client.connect.max.retries.

Did you try setting both?


Jian
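
For anyone following along, a minimal sketch that sets all of the knobs
mentioned above before the first YarnClient call might look like this (the
class name and the values are illustrative only, not recommendations):

import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ShortRetryClient {
    public static void main(String[] args) throws Exception {
        YarnConfiguration conf = new YarnConfiguration();

        // Upper-layer ClientRMProxy retry policy.
        conf.set("yarn.resourcemanager.connect.max-wait.ms", "5000");
        conf.set("yarn.resourcemanager.connect.retry-interval.ms", "1000");

        // Underlying IPC retry policy (one full round runs per ClientRMProxy retry).
        conf.setInt("ipc.client.connect.max.retries", 2);
        conf.set("ipc.client.connect.retry.interval", "500");

        YarnClient client = YarnClient.createYarnClient();
        client.init(conf);
        client.start();
        try {
            // The first RPC to the RM is where a misconfigured address hangs.
            System.out.println(client.getNodeReports());
        } finally {
            client.stop();
        }
    }
}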




On Wed, Feb 12, 2014 at 8:36 AM, John Lilley john.lil...@redpoint.netwrote:

  Setting

 conf.set(yarn.resourcemanager.connect.max-wait.ms, 500);

 conf.set(yarn.resourcemanager.connect.retry-interval.ms, 500);

 still results in a wait of around 15 seconds.  Setting this:

conf.set(ipc.client.connect.max.retries, 2);

 Also does not help.  Is there a retry parameter that can be set?

 Thanks

 John



 *From:* John Lilley [mailto:john.lil...@redpoint.net]
 *Sent:* Monday, February 10, 2014 12:12 PM
 *To:* user@hadoop.apache.org
 *Subject:* RE: very long timeout on failed RM connect



 I tried:

 conf.set(yarn.resourcemanager.connect.max-wait.ms, 1);

 conf.set(yarn.resourcemanager.connect.retry-interval.ms, 1000);



 But it has no apparent effect.  Still hangs for a very long time.

 john



 *From:* Jian He [mailto:j...@hortonworks.com j...@hortonworks.com]
 *Sent:* Monday, February 10, 2014 11:05 AM
 *To:* user@hadoop.apache.org
 *Subject:* Re: very long timeout on failed RM connect



 Setting the following two properties may solve your problem.

 yarn.resourcemanager.connect.max-wait.ms controls Maximum time to wait to
 establish connection to ResourceManager.

 yarn.resourcemanager.connect.retry-interval.ms controls How often to try
 connecting to the ResourceManager.



 Jian



 On Mon, Feb 10, 2014 at 6:44 AM, John Lilley john.lil...@redpoint.net
 wrote:

 Our application (running outside the Hadoop cluster) connects to the RM
 through YarnClient.  This works fine, except we've found that if the RM
 address or port is misconfigured in our software, or a firewall blocks
 access, the first call into the client (in this case getNodeReports) hangs
 for a very long time.  I've tried

 conf.set(ipc.client.connect.max.retries, 2);

 But this doesn't help.  Is there a configuration setting I can make on the
 YarnClient that will reduce this hang time?

 I understand why this long-winded retry strategy exists, in order to
 prevent a highly-loaded cluster from failing jobs.  But it is not
 appropriate for an interactive application.

 Thanks

 John










Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found

2014-02-12 Thread Li Li
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
  <description>A list of the compression codec classes that can be used
   for compression/decompression.</description>
</property>

<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

On Thu, Feb 13, 2014 at 2:54 AM, Ted Yu yuzhih...@gmail.com wrote:
 What's the value for io.compression.codecs config parameter ?

 Thanks


 On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote:

 I am runing example of wordcout but encount an exception:
 I googled and know lzo compression's license is incompatible with apache's
 so it's not built in.
 the question is I am using default configuration of hadoop 1.2.1, why
 it need lzo?
 anothe question is, what's Cleaning up the staging area mean?


 ./bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /lili/data.txt
 /lili/test

 14/02/12 14:06:10 INFO input.FileInputFormat: Total input paths to process
 : 1
 14/02/12 14:06:10 INFO mapred.JobClient: Cleaning up the staging area

 hdfs://172.19.34.24:8020/home/hadoop/dfsdir/hadoop-hadoop/mapred/staging/hadoop/.staging/job_201401080916_0216
 java.lang.IllegalArgumentException: Compression codec
 com.hadoop.compression.lzo.LzoCodec not found.
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:116)
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.init(CompressionCodecFactory.java:156)
 at
 org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:47)
 at
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:258)
 at
 org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
 at
 org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
 at
 org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
 at
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
 at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
 at
 org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
 at
 org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 Caused by: java.lang.ClassNotFoundException:
 com.hadoop.compression.lzo.LzoCodec
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:264)
 at
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
 at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:109)




OPENFORWRITE Files issue

2014-02-12 Thread Xiao Li



Say I have a text file on HDFS in OPENFORWRITE, HEALTHY status, and some process 
is appending to it. 


It has 4 lines in it.


hadoop fs -cat /file | wc -l 
4


However, when I do a wordcount on this file, only the first line is visible to 
the MapReduce job. Similarly, in Hive, select count(*) from filetable returns 1.


If I do hadoop fs -cp /file /file2, then it works as expected (file2 is closed, 
file is still open).


wordcount then sees 5 lines in the input directory (1 from the open file, 4 from 
the copied file), and Hive returns 5.


I am wondering if this is related to TextInputFormat?


I am using CDH 4.4.0


Thanks.


Xiao Li




Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found

2014-02-12 Thread Ted Yu
Please remove LzoCodec from config. 

Cheers
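
In other words, with the LZO jars not on the classpath, io.compression.codecs
should list only codecs that ship with Apache Hadoop, and the
io.compression.codec.lzo.class property should be removed entirely. A sketch of
the corrected property:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>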

On Feb 12, 2014, at 5:12 PM, Li Li fancye...@gmail.com wrote:

 <property>
   <name>io.compression.codecs</name>
   <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
   <description>A list of the compression codec classes that can be used
    for compression/decompression.</description>
 </property>
 
 <property>
   <name>io.compression.codec.lzo.class</name>
   <value>com.hadoop.compression.lzo.LzoCodec</value>
 </property>
 
 On Thu, Feb 13, 2014 at 2:54 AM, Ted Yu yuzhih...@gmail.com wrote:
 What's the value for io.compression.codecs config parameter ?
 
 Thanks
 
 
 On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote:
 
 I am runing example of wordcout but encount an exception:
 I googled and know lzo compression's license is incompatible with apache's
 so it's not built in.
 the question is I am using default configuration of hadoop 1.2.1, why
 it need lzo?
 anothe question is, what's Cleaning up the staging area mean?
 
 
 ./bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /lili/data.txt
 /lili/test
 
 14/02/12 14:06:10 INFO input.FileInputFormat: Total input paths to process
 : 1
 14/02/12 14:06:10 INFO mapred.JobClient: Cleaning up the staging area
 
 hdfs://172.19.34.24:8020/home/hadoop/dfsdir/hadoop-hadoop/mapred/staging/hadoop/.staging/job_201401080916_0216
 java.lang.IllegalArgumentException: Compression codec
 com.hadoop.compression.lzo.LzoCodec not found.
at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:116)
at
 org.apache.hadoop.io.compress.CompressionCodecFactory.init(CompressionCodecFactory.java:156)
at
 org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:47)
at
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:258)
at
 org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
at
 org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
at
 org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at
 org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at
 org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 Caused by: java.lang.ClassNotFoundException:
 com.hadoop.compression.lzo.LzoCodec
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:109)
 
 


Re: Compression codec com.hadoop.compression.lzo.LzoCodec not found

2014-02-12 Thread Li Li
Thanks, it's correct now.

On Thu, Feb 13, 2014 at 9:37 AM, Ted Yu yuzhih...@gmail.com wrote:
 Please remove LzoCodec from config.

 Cheers

 On Feb 12, 2014, at 5:12 PM, Li Li fancye...@gmail.com wrote:

 <property>
   <name>io.compression.codecs</name>
   <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
   <description>A list of the compression codec classes that can be used
    for compression/decompression.</description>
 </property>

 <property>
   <name>io.compression.codec.lzo.class</name>
   <value>com.hadoop.compression.lzo.LzoCodec</value>
 </property>

 On Thu, Feb 13, 2014 at 2:54 AM, Ted Yu yuzhih...@gmail.com wrote:
 What's the value for io.compression.codecs config parameter ?

 Thanks


 On Tue, Feb 11, 2014 at 10:11 PM, Li Li fancye...@gmail.com wrote:

 I am runing example of wordcout but encount an exception:
 I googled and know lzo compression's license is incompatible with apache's
 so it's not built in.
 the question is I am using default configuration of hadoop 1.2.1, why
 it need lzo?
 anothe question is, what's Cleaning up the staging area mean?


 ./bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /lili/data.txt
 /lili/test

 14/02/12 14:06:10 INFO input.FileInputFormat: Total input paths to process
 : 1
 14/02/12 14:06:10 INFO mapred.JobClient: Cleaning up the staging area

 hdfs://172.19.34.24:8020/home/hadoop/dfsdir/hadoop-hadoop/mapred/staging/hadoop/.staging/job_201401080916_0216
 java.lang.IllegalArgumentException: Compression codec
 com.hadoop.compression.lzo.LzoCodec not found.
at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:116)
at
 org.apache.hadoop.io.compress.CompressionCodecFactory.init(CompressionCodecFactory.java:156)
at
 org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:47)
at
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:258)
at
 org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)
at
 org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1071)
at
 org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
 org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at
 org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at
 org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 Caused by: java.lang.ClassNotFoundException:
 com.hadoop.compression.lzo.LzoCodec
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at
 org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:109)




hadoop 2.2.0 QJM exception : NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage

2014-02-12 Thread Henry Hung
Hi All,

I don't know why the journal node logs have this weird NoClassDefFoundError: 
org/apache/hadoop/hdfs/server/namenode/FSImage exception.
This error occurs each time I switch my namenode from standby to active.

2014-02-13 10:34:47,873 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133208
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133208-0133318
2014-02-13 10:36:38,492 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_inprogress_281
 - 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_281-282
2014-02-13 10:36:51,118 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133319
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133319-0133422
2014-02-13 10:38:38,755 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_inprogress_283
 - 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_283-284
2014-02-13 10:38:54,620 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133423
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133423-0133432
2014-02-13 10:40:27,543 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: 
Updating lastPromisedEpoch from 2 to 3 for client /10.18.30.155
2014-02-13 10:40:27,569 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: 
Scanning storage 
FileJournalManager(root=/data/hadoop/hadoop-data/journal/hadoopdev64)
2014-02-13 10:40:27,570 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 8485, call 
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.newEpoch from 
10.18.30.155:35408 Call#339 Retry#0: error: java.lang.NoClassDefFoundError: 
org/apache/hadoop/hdfs/server/namenode/FSImage
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.validateEditLog(FSEditLogLoader.java:814)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.validateEditLog(EditLogFileInputStream.java:289)
at 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager$EditLogFile.validateLog(FileJournalManager.java:457)
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.scanStorageForLatestEdits(Journal.java:189)
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.newEpoch(Journal.java:301)
at 
org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.newEpoch(JournalNodeRpcServer.java:132)
at 
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.newEpoch(QJournalProtocolServerSideTranslatorPB.java:114)
at 
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:17439)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
2014-02-13 10:40:58,074 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133433
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133433-0133548



Below are the partial logs from the namenode when it tried to activate but failed 
abruptly:

2014-02-13 10:40:27,389 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started 
for standby state
2014-02-13 10:40:27,390 WARN 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
interrupted
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:334)
at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279)
at 

(Solved) hadoop 2.2.0 QJM exception : NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage

2014-02-12 Thread Henry Hung
Dear All,

Sorry, I found the root cause of this problem: it appears that I overwrote 
hadoop-hdfs-2.2.0.jar with my own custom jar but forgot to restart the journal 
node process, so the process could not find the FSImage class, even though it is 
actually there inside my custom jar.

Note to myself: make sure to shut down all processes before replacing the jar(s).

Best regards,
Henry

From: MA11 YTHung1
Sent: Thursday, February 13, 2014 10:49 AM
To: user@hadoop.apache.org
Subject: hadoop 2.2.0 QJM exception : NoClassDefFoundError: 
org/apache/hadoop/hdfs/server/namenode/FSImage

Hi All,

I don't know why the journal node logs has this weird NoClassDefFoundError: 
org/apache/hadoop/hdfs/server/namenode/FSImage exception.
This error occurs each time I switch my namenode from standby to active

2014-02-13 10:34:47,873 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133208
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133208-0133318
2014-02-13 10:36:38,492 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_inprogress_281
 - 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_281-282
2014-02-13 10:36:51,118 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133319
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133319-0133422
2014-02-13 10:38:38,755 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_inprogress_283
 - 
/data/hadoop/hadoop-data/journal/hadoopdev64/current/edits_283-284
2014-02-13 10:38:54,620 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133423
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133423-0133432
2014-02-13 10:40:27,543 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: 
Updating lastPromisedEpoch from 2 to 3 for client /10.18.30.155
2014-02-13 10:40:27,569 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: 
Scanning storage 
FileJournalManager(root=/data/hadoop/hadoop-data/journal/hadoopdev64)
2014-02-13 10:40:27,570 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 8485, call 
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.newEpoch from 
10.18.30.155:35408 Call#339 Retry#0: error: java.lang.NoClassDefFoundError: 
org/apache/hadoop/hdfs/server/namenode/FSImage
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/FSImage
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.validateEditLog(FSEditLogLoader.java:814)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.validateEditLog(EditLogFileInputStream.java:289)
at 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager$EditLogFile.validateLog(FileJournalManager.java:457)
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.scanStorageForLatestEdits(Journal.java:189)
at 
org.apache.hadoop.hdfs.qjournal.server.Journal.newEpoch(Journal.java:301)
at 
org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.newEpoch(JournalNodeRpcServer.java:132)
at 
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.newEpoch(QJournalProtocolServerSideTranslatorPB.java:114)
at 
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:17439)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
2014-02-13 10:40:58,074 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_inprogress_0133433
 - 
/data/hadoop/hadoop-data/journal/hadoopdev/current/edits_0133433-0133548



Below is the partial logs from namenode when it try to activate but failed 
abruptly:

2014-02-13 10:40:27,389 INFO 

Unable to load native-hadoop library for your platform

2014-02-12 Thread xeon Mailinglist
I am trying to run an example and I get the following error:

HadoopMaster-nh:~# /root/Programs/hadoop/bin/hdfs dfs -count /wiki
OpenJDK 64-Bit Server VM warning: You have loaded library
/root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0 which might
have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c
libfile', or link it with '-z noexecstack'.
14/02/13 05:24:48 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable


I tried to run execstack -c, but the problem stays the same. Any help?
HadoopMaster-nh:~# execstack -c
/root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0
HadoopMaster-nh:~# /root/Programs/hadoop/bin/hdfs dfs -count /wiki
OpenJDK 64-Bit Server VM warning: You have loaded library
/root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0 which might
have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c
libfile', or link it with '-z noexecstack'.
14/02/13 05:26:45 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable


RE: Unable to load native-hadoop library for your platform

2014-02-12 Thread Steve Kallestad
Funny, I was just trying to add something to the wiki addressing this.



These instructions are for 2.2, but I imagine that 2.0.5 is probably very
similar.



If the formatting doesn't come through for whatever reason, I posted the
same thing here:



http://answers.splunk.com/answers/118174/hunk-reports-an-error-with-apache-hadoop?page=1&focusedAnswerId=122311#122311



This isn't necessarily a big problem - Hadoop will function without native
libraries.  You may find it easier to ignore the message or
disable/redirect JVM warnings.

You can disable the error message or redirect it to stderr, but that only
moves the error out of your way and doesn't deal with the root problem. The
root problem is that the hadoop distribution does not include native
libraries. They must be compiled from source.
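
As an aside, if you only want to silence the WARN message itself, one option
(assuming the stock conf/log4j.properties) is to raise the log level for the
class that emits it:

log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR

This hides the message but does not address the missing native libraries.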

You can build your own distribution that includes native libraries using
the following steps:

1) Install developer tools and dependencies:

1a) From repositories:

apt-get install gcc g++ make maven cmake zlib zlib1g-dev

for RedHat environments, you can probably use a similar yum line:

yum install gcc g++ make maven cmake zlib zlib-devel

There may be some other dependencies or slightly different package names
depending on what you already have installed and what OS you are running.
If so, some google-able errors will pop up during the rest of the process.

1b) Protocol Buffers From Source:

mkdir /tmp/protobuf

cd /tmp/protobuf

wget http://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz

tar -xvzf ./protobuf-2.5.0.tar.gz

cd protobuf-2.5.0

./configure --prefix=/usr

make

sudo make install

cd java

mvn install

mvn package

sudo ldconfig

cd /tmp

rm -rf protobuf

2) download hadoop source:

mkdir /tmp/hadoop-build

cd /tmp/hadoop-build

wget http://apache.petsads.us/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz

tar -xvzf ./hadoop-2.2.0-src.tar.gz

cd hadoop-2.2.0-src

3) Edit the hadoop-auth pom file.

vi hadoop-common-project/hadoop-auth/pom.xml

add the following dependency:

<dependency>

   <groupId>org.mortbay.jetty</groupId>

   <artifactId>jetty-util</artifactId>

   <scope>test</scope>

</dependency>

You should see an already existing dependency that looks very similar if
you search for org.mortbay.jetty; add this dependency above or below it.

4) Compile it:

export Platform=x64

cd /tmp/hadoop-build/hadoop-2.2.0-src

mvn clean install -DskipTests

cd hadoop-mapreduce-project

mvn package -Pdist,native -DskipTests=true -Dtar

cd /tmp/hadoop-build/hadoop-2.2.0-src

mvn package -Pdist,native -DskipTests=true -Dtar

5) Copy your natively compiled distribution somewhere to be saved:

cp
/tmp/hadoop-build/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0.tar.gz
/my/distribution/share/hadoop-2.2.0.tar.gz

6) Delete the build files (once you are satisfied that everything is
working properly):

cd /tmp

rm -rf hadoop-build

Now any fresh installations based on this build will include native 64-bit
libraries. You can set up a new instance of Hadoop locally, or you can
simply overwrite the files in the $HADOOP_INSTALL/lib/native directory with
those in your hadoop-2.2.0.tar.gz file.





*From:* xeon Mailinglist [mailto:xeonmailingl...@gmail.com]
*Sent:* Wednesday, February 12, 2014 9:28 PM
*To:* user@hadoop.apache.org
*Subject:* Unable to load native-hadoop library for your platform



I am trying to run an example and I get the following error:



HadoopMaster-nh:~# /root/Programs/hadoop/bin/hdfs dfs -count /wiki

OpenJDK 64-Bit Server VM warning: You have loaded library
/root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0 which might
have disabled stack guard. The VM will try to fix the stack guard now.

It's highly recommended that you fix the library with 'execstack -c
libfile', or link it with '-z noexecstack'.

14/02/13 05:24:48 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable





I tried to run execstack -c, but the problem stays the same. Any help?

HadoopMaster-nh:~# execstack -c
/root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0

HadoopMaster-nh:~# /root/Programs/hadoop/bin/hdfs dfs -count /wiki

OpenJDK 64-Bit Server VM warning: You have loaded library
/root/Programs/hadoop-2.0.5-alpha/lib/native/libhadoop.so.1.0.0 which might
have disabled stack guard. The VM will try to fix the stack guard now.

It's highly recommended that you fix the library with 'execstack -c
libfile', or link it with '-z noexecstack'.

14/02/13 05:26:45 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable


Password not found for ApplicationAttempt

2014-02-12 Thread Anfernee Xu
My MR job failed due to the error below; I'm running the YARN 2.2.0 release.

Does anybody know what the error means and how to fix it?

2014-02-12 18:25:31,748 ERROR [main]
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:xinx (auth:SIMPLE)
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Password not found for ApplicationAttempt
appattempt_1392258153412_0001_02
2014-02-12 18:25:31,749 WARN [main] org.apache.hadoop.ipc.Client: Exception
encountered while connecting to the server :
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Password not found for ApplicationAttempt
appattempt_1392258153412_0001_02
2014-02-12 18:25:31,749 ERROR [main]
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:xinx (auth:SIMPLE)
cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Password not found for ApplicationAttempt
appattempt_1392258153412_0001_02
2014-02-12 18:25:31,752 ERROR [main]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while
registering
org.apache.hadoop.security.token.SecretManager$InvalidToken: Password not
found for ApplicationAttempt appattempt_1392258153412_0001_02
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy29.registerApplicationMaster(Unknown Source)
at
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:154)
at
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:112)
at
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:213)
at
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:811)
at
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1061)
at
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1445)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1441)
at
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1374)
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
Password not found for ApplicationAttempt
appattempt_1392258153412_0001_02
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
at
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
... 22 more
2014-02-12 18:25:31,754 INFO [main]
org.apache.hadoop.service.AbstractService: Service RMCommunicator failed in
state STARTED; cause:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
org.apache.hadoop.security.token.SecretManager$InvalidToken: Password not
found for ApplicationAttempt