I need help building Hadoop 2.0 for Windows

2014-01-06 Thread Steve Lewis
I usually run Hadoop on a Linux cluster but do most of my development in single-machine mode under Windows. This was fairly straightforward for 0.2. For 1.0 I needed to copy and fix FileUtils, but for 2.0 I am expected to build two files from source - WinUtils.exe and hadoop.dll. There is really only

accessing hadoop filesystem from Tomcat

2014-01-06 Thread Henry Hung
Hi all, I just want to confirm whether my understanding of the Hadoop FileSystem object is correct. From the source code of org.apache.hadoop.fs.FileSystem (either version 1.0.4 or 2.2.0), the method public static FileSystem get(URI uri, Configuration conf) throws IOException is using
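For context on the question above: FileSystem.get(URI, Configuration) does not construct a fresh client per call; it consults an internal cache keyed roughly by the URI's scheme and authority plus the calling user, which matters in a long-lived container like Tomcat. A minimal stdlib sketch of that keying idea (the class, method names, and key format here are illustrative simplifications, not Hadoop's actual code):

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for Hadoop's FileSystem cache: entries are shared
// by (scheme, authority, user), so repeated get() calls for the same
// cluster and user return the same client object regardless of path.
public class FsCacheSketch {
    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

    // Simplified, hypothetical key format: scheme + authority + user.
    static String cacheKey(URI uri, String user) {
        String scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
        String authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
        return scheme + "://" + authority + ";" + user;
    }

    static Object get(URI uri, String user) {
        return CACHE.computeIfAbsent(cacheKey(uri, user), k -> new Object());
    }

    public static void main(String[] args) {
        URI a = URI.create("hdfs://namenode:8020/user/henry/file1");
        URI b = URI.create("hdfs://namenode:8020/other/path");
        // Same scheme + authority + user -> same cached instance, despite different paths.
        System.out.println(FsCacheSketch.get(a, "henry") == FsCacheSketch.get(b, "henry")); // true
    }
}
```

A practical consequence in a webapp: closing a cached FileSystem affects every caller that was handed the same instance. The real API lets you bypass the cache per scheme via the fs.&lt;scheme&gt;.impl.disable.cache configuration property.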

Re: XML to TEXT

2014-01-06 Thread Ranjini Rathinam
Hi, Thanks a lot. Ranjini On Fri, Jan 3, 2014 at 10:40 PM, Diego Gutierrez diego.gutier...@ucsp.edu.pe wrote: Hi, I suggest using XPath; Java has native support for parsing XML and JSON formats. For the main problem, like the distcp command(
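The XPath suggestion can be sketched with the JDK's built-in javax.xml.xpath API (note it handles XML only; the sample document and expression below are invented for illustration):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class XmlToText {
    // Parse an XML string and evaluate an XPath expression against it,
    // returning the text content of the first matching node.
    static String extract(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<records><rec><id>1</id><name>foo</name></rec></records>";
        System.out.println(extract(xml, "/records/rec/name")); // foo
    }
}
```

A mapper could apply extract() to each XML record it receives and emit the selected fields as plain text.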

Fine tuning

2014-01-06 Thread Ranjini Rathinam
Hi, I have an input file with 16 fields in it. Using MapReduce code I need to load the HBase tables. The first eight fields have to go into one HBase table and the last eight have to go into another HBase table. The data is being loaded into the HBase table in 0.11 sec, but if any lookup is added in the
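The field routing described above can be sketched in plain Java: split each 16-field record and send the first eight fields to one table and the last eight to another. The HBase Put calls themselves are omitted (they would need a cluster); this helper, with illustrative names, shows only the partitioning a mapper would perform:

```java
import java.util.Arrays;

public class FieldSplitter {
    // Split a delimited 16-field record into two 8-field halves,
    // one half per target HBase table (the actual table writes are omitted).
    static String[][] split(String line, String delimiter) {
        String[] fields = line.split(delimiter, -1);
        if (fields.length != 16) {
            throw new IllegalArgumentException("expected 16 fields, got " + fields.length);
        }
        return new String[][] {
            Arrays.copyOfRange(fields, 0, 8),   // destined for the first table
            Arrays.copyOfRange(fields, 8, 16),  // destined for the second table
        };
    }

    public static void main(String[] args) {
        String[][] halves = split("a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p", ",");
        System.out.println(halves[0].length + " " + halves[1].length); // 8 8
    }
}
```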

Re: XML to TEXT

2014-01-06 Thread Rajesh Nagaraju
Hi Ranjini, can you use Hive? Then you can just use XPath in your SELECT clause. Cheers, R+ On Mon, Jan 6, 2014 at 2:44 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote: Hi, Thanks a lot. Ranjini On Fri, Jan 3, 2014 at 10:40 PM, Diego Gutierrez diego.gutier...@ucsp.edu.pe wrote: Hi, I

Hadoop permissions issue

2014-01-06 Thread Manikandan Saravanan
I’m trying to run Nutch 2.2.1 on a Hadoop 1.2.1 cluster. The fetch phase runs fine, but in the next job this error comes up: java.lang.NullPointerException at org.apache.avro.util.Utf8.init(Utf8.java:37) at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)

Understanding MapReduce source code : Flush operations

2014-01-06 Thread nagarjuna kanamarlapudi
Hi, I have been using Hadoop/MapReduce for about 2.5 years. I want to understand the internals of the Hadoop source code. Let me put my requirement very clearly: I want to look at the code for the flush operations that happen after the reduce phase. The reducer writes the output to the OutputFormat

Fwd: Understanding MapReduce source code : Flush operations

2014-01-06 Thread nagarjuna kanamarlapudi
-- Forwarded message -- From: nagarjuna kanamarlapudi nagarjuna.kanamarlap...@gmail.com Date: Mon, Jan 6, 2014 at 8:09 AM Subject: Understanding MapReduce source code : Flush operations To: mapreduce-u...@hadoop.apache.org Hi, I have been using Hadoop/MapReduce for about 2.5 years. I

Re: Hadoop permissions issue

2014-01-06 Thread Devin Suiter RDX
Based on the exception type, it looks like something in your job is looking for a valid value and not finding it. You will probably need to share the job code for people to help with this - to my eyes, this doesn't appear to be a Hadoop configuration issue, or any kind of problem with how the

Re: Hadoop permissions issue

2014-01-06 Thread Manikandan Saravanan
I’m running Nutch 2.2.1 on a Hadoop cluster, crawling 5000 links from the DMOZ Open Directory Project. The reduce job stops at exactly 33% every time and throws this exception. From the Nutch mailing list, it seems that my job is stumbling on a repUrl value that’s null. --  Manikandan

Fwd: Understanding MapReduce source code : Flush operations

2014-01-06 Thread nagarjuna kanamarlapudi
-- Forwarded message -- From: nagarjuna kanamarlapudi nagarjuna.kanamarlap...@gmail.com Date: Mon, Jan 6, 2014 at 6:39 PM Subject: Understanding MapReduce source code : Flush operations To: mapreduce-u...@hadoop.apache.org Hi, I have been using Hadoop/MapReduce for about 2.5 years. I

Re: Understanding MapReduce source code : Flush operations

2014-01-06 Thread Hardik Pandya
Please do not tell me that in the last 2.5 years you have not used a virtual Hadoop environment to debug your MapReduce application before deploying to production. No one can stop you from looking at the code; Hadoop and its ecosystem are open source. On Mon, Jan 6, 2014 at 9:35 AM, nagarjuna

Re: Fine tuning

2014-01-06 Thread Hardik Pandya
Can you please share how you are doing the lookup? On Mon, Jan 6, 2014 at 4:23 AM, Ranjini Rathinam ranjinibe...@gmail.com wrote: Hi, I have an input file with 16 fields in it. Using MapReduce code I need to load the HBase tables. The first eight fields have to go into one table in HBase and the last

Re: Spill Failed Caused by ArrayIndexOutOfBoundsException

2014-01-06 Thread Hardik Pandya
The error is happening during the sort-and-spill phase (org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill). It seems you are trying to compare two int values and it fails during the compare. Caused by: java.lang.ArrayIndexOutOfBoundsException: 99614720 at
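For context on where such an exception can arise: during sort-and-spill, map output keys are compared in their serialized byte form, so a comparator handed a bad offset (or reading a corrupted record boundary) walks past the buffer and throws ArrayIndexOutOfBoundsException. A minimal stdlib illustration of comparing two big-endian ints at given offsets (not Hadoop's actual WritableComparator code):

```java
public class RawIntCompare {
    // Read a 4-byte big-endian int at the given offset. An offset pointing
    // past the end of the buffer throws ArrayIndexOutOfBoundsException,
    // much like the spill-phase stack trace in this thread.
    static int readInt(byte[] buf, int off) {
        return ((buf[off] & 0xff) << 24) | ((buf[off + 1] & 0xff) << 16)
             | ((buf[off + 2] & 0xff) << 8) | (buf[off + 3] & 0xff);
    }

    // Compare two serialized int keys without deserializing into objects,
    // the way a raw byte comparator works during the sort.
    static int compare(byte[] a, int aOff, byte[] b, int bOff) {
        return Integer.compare(readInt(a, aOff), readInt(b, bOff));
    }

    public static void main(String[] args) {
        byte[] one = {0, 0, 0, 1};
        byte[] two = {0, 0, 0, 2};
        System.out.println(compare(one, 0, two, 0)); // -1
    }
}
```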

Re: Understanding MapReduce source code : Flush operations

2014-01-06 Thread nagarjuna kanamarlapudi
This is not in DFSClient. Before the output is written to HDFS, a lot of operations take place - like the reducer output in memory reaching 90% of the HDFS block size, then starting to flush the data, etc. So, my requirement is to look at that code, where I want to change the logic a bit, which

Re: unable to compile hadoop source code

2014-01-06 Thread Diego Gutierrez
On Jan 6, 2014 10:48 PM, nagarjuna kanamarlapudi nagarjuna.kanamarlap...@gmail.com wrote: Hi, I checked out the source code from https://svn.apache.org/repos/asf/hadoop/common/trunk/ I tried to compile the code with mvn. I am compiling this on Mac OS X Mavericks. Any help is

RE: unable to compile hadoop source code

2014-01-06 Thread Rohith Sharma K S
You can read the build instructions for Hadoop: http://svn.apache.org/repos/asf/hadoop/common/trunk/BUILDING.txt For your problem, protobuf is not set in PATH. After setting it, recheck that the protobuf version is 2.5. From: nagarjuna kanamarlapudi [mailto:nagarjuna.kanamarlap...@gmail.com] Sent: 07 January

Re: unable to compile hadoop source code

2014-01-06 Thread nagarjuna kanamarlapudi
Thanks all, the following was required: download from http://code.google.com/p/protobuf/downloads/list then run $ ./configure $ make $ make check $ make install and then compile the source code. On Tue, Jan 7, 2014 at 9:46 AM, Rohith Sharma K S rohithsharm...@huawei.com wrote: You can read Build

Re: Understanding MapReduce source code : Flush operations

2014-01-06 Thread Vinod Kumar Vavilapalli
What OutputFormat are you using? Once it reaches the OutputFormat (specifically the RecordWriter), it all depends on what the RecordWriter does. Are you using some OutputFormat with a RecordWriter that buffers like this? Thanks, +Vinod On Jan 6, 2014, at 7:11 PM, nagarjuna kanamarlapudi
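To illustrate the point that buffering and flushing live in the RecordWriter: here is a toy writer in plain Java (not the actual Hadoop RecordWriter interface) that accumulates records in memory and only writes them out on close - the kind of deferred-I/O behavior a custom OutputFormat could introduce:

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

public class BufferingRecordWriter {
    private final List<String> buffer = new ArrayList<>();
    private final Writer out;

    BufferingRecordWriter(Writer out) { this.out = out; }

    // Records accumulate in memory; nothing reaches 'out' yet.
    void write(String key, String value) {
        buffer.add(key + "\t" + value);
    }

    // All buffered output is written only when the task closes the writer,
    // mirroring how a buffering RecordWriter would defer its I/O.
    void close() throws Exception {
        for (String line : buffer) {
            out.write(line);
            out.write('\n');
        }
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        StringWriter sink = new StringWriter();
        BufferingRecordWriter w = new BufferingRecordWriter(sink);
        w.write("k1", "v1");
        System.out.println(sink.toString().isEmpty()); // true: nothing flushed yet
        w.close();
        System.out.print(sink); // k1	v1
    }
}
```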

Re: What makes a map to fail?

2014-01-06 Thread Saeed Adel Mehraban
When I click on an individual map's logs, it says "Aggregation is not enabled. Try the nodemanager at slave1-machine:60933". How can I enable aggregation? On Sun, Jan 5, 2014 at 1:21 PM, Harsh J ha...@cloudera.com wrote: Every failed task typically carries a diagnostic message and a set of logs
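For reference, YARN log aggregation is controlled by the yarn.log-aggregation-enable property in yarn-site.xml; a minimal fragment (the property must be set cluster-wide and the NodeManagers restarted for it to take effect):

```xml
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
```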