I usually run Hadoop on a Linux cluster but do most of my development in
single-machine mode under Windows.
This was fairly straightforward for 0.2. For 1.0 I needed to copy and fix
FileUtils, but for 2.0 I am expected to build two files from source -
WinUtils.exe and hadoop.dll. There is really only
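For anyone else stuck at this point, a minimal sketch of wiring a local 2.x run to a prebuilt winutils.exe. It assumes the two binaries were placed in C:\hadoop\bin, which is an illustrative path:

// Minimal local-development sketch for Hadoop 2.x on Windows.
// Assumes winutils.exe and hadoop.dll live in C:\hadoop\bin (illustrative path).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalFsCheck {
  public static void main(String[] args) throws Exception {
    // Shell.java resolves winutils.exe via the hadoop.home.dir system
    // property (or the HADOOP_HOME environment variable).
    System.setProperty("hadoop.home.dir", "C:\\hadoop");
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "file:///");   // stay on the local filesystem
    FileSystem fs = FileSystem.get(conf);
    System.out.println(fs.exists(new Path("C:/tmp")));
  }
}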
Hi all,
I just want to confirm whether my understanding of the Hadoop FileSystem object is correct or not.
From the source code of org.apache.hadoop.fs.FileSystem (either from version 1.0.4 or 2.2.0), the method
public static FileSystem get(URI uri, Configuration conf) throws IOException
is using
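If the question is about what get() does with that URI: in both 1.x and 2.x it consults an internal cache unless caching is disabled for that scheme. A small demonstration; the namenode URI is illustrative, and the disable-cache key may not exist in the very oldest 1.x releases:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCacheDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    URI uri = URI.create("hdfs://namenode:8020/");   // illustrative URI
    FileSystem a = FileSystem.get(uri, conf);
    FileSystem b = FileSystem.get(uri, conf);
    System.out.println(a == b);   // true: served from the internal CACHE,
                                  // keyed by scheme + authority + UGI
    conf.setBoolean("fs.hdfs.impl.disable.cache", true);
    FileSystem c = FileSystem.get(uri, conf);
    System.out.println(a == c);   // false: cache bypassed for this conf
  }
}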
Hi,
Thanks a lot.
Ranjini
On Fri, Jan 3, 2014 at 10:40 PM, Diego Gutierrez diego.gutier...@ucsp.edu.pe wrote:
Hi,
I suggest using XPath; Java has native support for parsing XML and JSON formats.
For the main problem, like the distcp command(
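A self-contained example of the XPath approach suggested above, using only the JDK's javax.xml.xpath (the XML snippet is made up):

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathDemo {
  public static void main(String[] args) throws Exception {
    String xml = "<emp><id>1</id><name>Ranjini</name></emp>";
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
    XPath xpath = XPathFactory.newInstance().newXPath();
    // Pull one field out of the record with an XPath expression.
    System.out.println(xpath.evaluate("/emp/name/text()", doc)); // prints: Ranjini
  }
}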
Hi,
I have an input file with 16 fields in it.
Using MapReduce code, I need to load the HBase tables.
The first eight fields have to go into one HBase table and the last eight have to go into another HBase table.
The data is being loaded into the HBase table in 0.11 sec, but if any lookup is added in the
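A hedged sketch of one way to do the two-table load with MultiTableOutputFormat. The table names, the column family "cf", and the comma delimiter are assumptions, not from the original post:

// Mapper that routes the first eight fields to tableA and the last eight to
// tableB via MultiTableOutputFormat (HBase 0.9x-era API).
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TwoTableMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
  private static final ImmutableBytesWritable TABLE_A =
      new ImmutableBytesWritable(Bytes.toBytes("tableA"));   // assumed name
  private static final ImmutableBytesWritable TABLE_B =
      new ImmutableBytesWritable(Bytes.toBytes("tableB"));   // assumed name

  @Override
  protected void map(LongWritable key, Text value, Context ctx)
      throws IOException, InterruptedException {
    String[] f = value.toString().split(",", -1);  // 16 fields expected
    if (f.length < 16) return;                     // skip malformed lines
    Put a = new Put(Bytes.toBytes(f[0]));          // field 1 as row key
    Put b = new Put(Bytes.toBytes(f[8]));          // field 9 as row key
    for (int i = 1; i < 8; i++)
      a.add(Bytes.toBytes("cf"), Bytes.toBytes("c" + i), Bytes.toBytes(f[i]));
    for (int i = 9; i < 16; i++)
      b.add(Bytes.toBytes("cf"), Bytes.toBytes("c" + i), Bytes.toBytes(f[i]));
    ctx.write(TABLE_A, a);   // first eight fields -> tableA
    ctx.write(TABLE_B, b);   // last eight fields  -> tableB
  }
}
// Driver side: job.setOutputFormatClass(MultiTableOutputFormat.class);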
Hi Ranjini,
Can you use Hive? Then you can just use XPath in your SELECT clause.
Cheers,
R+
On Mon, Jan 6, 2014 at 2:44 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote:
Hi,
Thanks a lot.
Ranjini
On Fri, Jan 3, 2014 at 10:40 PM, Diego Gutierrez
diego.gutier...@ucsp.edu.pe wrote:
Hi,
I
I’m trying to run Nutch 2.2.1 on a Hadoop 1.2.1 cluster. The fetch phase runs
fine, but in the next job this error comes up:
java.lang.NullPointerException
at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
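Avro's Utf8 constructor throws a NullPointerException when handed a null String, so the trace means GeneratorReducer.setup passed it a value that was never set. A two-line reproduction; the variable name is illustrative:

import org.apache.avro.util.Utf8;

public class Utf8Npe {
  public static void main(String[] args) {
    String batchId = null;        // e.g. a config value that was never set
    Utf8 u = new Utf8(batchId);   // throws java.lang.NullPointerException
  }
}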
Hi,
I have been using Hadoop/MapReduce for about 2.5 years. I want to understand the internals of the Hadoop source code.
Let me put my requirement very clearly:
I want to have a look at the code for the flush operations that happen after the reduce phase.
The Reducer writes the output to the OutputFormat
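To anchor the question, a minimal reducer: after reduce() there is no map-style spill buffer left in the framework; context.write hands each record to the RecordWriter created by the configured OutputFormat, which for TextOutputFormat writes into an FSDataOutputStream on HDFS.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) sum += v.get();
    ctx.write(key, new IntWritable(sum));  // handed to RecordWriter.write(...)
  }
}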
-- Forwarded message --
From: nagarjuna kanamarlapudi nagarjuna.kanamarlap...@gmail.com
Date: Mon, Jan 6, 2014 at 8:09 AM
Subject: Understanding MapReduce source code : Flush operations
To: mapreduce-u...@hadoop.apache.org
Hi,
I have been using Hadoop/MapReduce for about 2.5 years. I
Based on the Exception type, it looks like something in your job is looking
for a valid value, and not finding it.
You will probably need to share the job code for people to help with this -
to my eyes, this doesn't appear to be a Hadoop configuration issue, or any
kind of problem with how the
I’m running Nutch 2.2.1 on a Hadoop cluster, crawling 5000 links from the
DMOZ Open Directory Project. The reduce job stops at exactly 33% every time
and throws this exception. From the Nutch mailing list, it seems that my job
is stumbling upon a repUrl value that’s null.
--
Manikandan
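A generic defensive pattern for the situation described, not the actual Nutch source: skip and count records whose repUrl is null instead of letting new Utf8(null) kill the reducer at the same record every run. The helper and counter names are hypothetical:

import org.apache.avro.util.Utf8;
import org.apache.hadoop.mapreduce.TaskInputOutputContext;

public final class RepUrlGuard {
  /** Returns null (and bumps a counter) instead of throwing on null input. */
  public static Utf8 toUtf8OrNull(TaskInputOutputContext<?, ?, ?, ?> ctx,
                                  String repUrl) {
    if (repUrl == null) {
      ctx.getCounter("generator", "null-repUrl").increment(1);
      return null;   // caller should skip this record
    }
    return new Utf8(repUrl);
  }
}

The real fix is still to find out why repUrl is null in the crawl data; the guard just keeps one bad record from failing the whole job.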
-- Forwarded message --
From: nagarjuna kanamarlapudi nagarjuna.kanamarlap...@gmail.com
Date: Mon, Jan 6, 2014 at 6:39 PM
Subject: Understanding MapReduce source code : Flush operations
To: mapreduce-u...@hadoop.apache.org
Hi,
I have been using Hadoop/MapReduce for about 2.5 years. I
Please do not tell me that in the last 2.5 years you have not used a virtual Hadoop
environment to debug your MapReduce application before deploying to a
production environment.
No one can stop you from looking at the code; Hadoop and its ecosystem are
open source.
On Mon, Jan 6, 2014 at 9:35 AM, nagarjuna
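For reference, a minimal local-debug setup of the kind being suggested. These are the Hadoop 1.x property names; in 2.x the equivalents are mapreduce.framework.name=local and fs.defaultFS=file:///. The job name is illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class LocalDebugDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapred.job.tracker", "local");   // LocalJobRunner: whole job in one JVM
    conf.set("fs.default.name", "file:///");   // read/write the local filesystem
    Job job = new Job(conf, "debug-run");      // set mapper/reducer/paths as usual;
                                               // IDE breakpoints now work
  }
}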
Can you please share how you are doing the lookup?
On Mon, Jan 6, 2014 at 4:23 AM, Ranjini Rathinam ranjinibe...@gmail.com wrote:
Hi,
I have an input file with 16 fields in it.
Using MapReduce code, I need to load the HBase tables.
The first eight fields have to go into one table in HBase and the last
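A guess at the usual culprit while waiting for the code: if the lookup opens a new HTable per record, every map() call pays connection and metadata costs. Opening it once in setup() and reusing it is the standard fix. A sketch against the 0.9x-era API; the table name "lookup" is an assumption:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LookupMapper extends Mapper<LongWritable, Text, Text, Text> {
  private HTable lookupTable;   // opened once per task, not once per record

  @Override
  protected void setup(Context ctx) throws IOException {
    lookupTable = new HTable(HBaseConfiguration.create(), "lookup");
  }

  @Override
  protected void map(LongWritable key, Text value, Context ctx)
      throws IOException, InterruptedException {
    Result r = lookupTable.get(new Get(Bytes.toBytes(value.toString())));
    if (!r.isEmpty()) ctx.write(value, new Text(r.toString()));
  }

  @Override
  protected void cleanup(Context ctx) throws IOException {
    lookupTable.close();
  }
}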
The error is happening during the sort-and-spill phase:
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill
It seems like you are trying to compare two int values and it fails during the
compare:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 99614720
at
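For what it's worth, 99614720 bytes is exactly 95 MB, which looks like the size of the map-side sort buffer (io.sort.mb), so an index falling outside that buffer during the byte-level compare fits. If a custom RawComparator is involved, a correct byte-level int comparator looks like this. A reference sketch; the IntWritable key type is an assumption:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparator;

public class IntKeyComparator extends WritableComparator {
  public IntKeyComparator() { super(IntWritable.class); }

  @Override
  public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
    // readInt stays inside the slice handed in; reading past s+l is exactly
    // what produces ArrayIndexOutOfBoundsException during sortAndSpill.
    int a = readInt(b1, s1);
    int b = readInt(b2, s2);
    return (a < b) ? -1 : (a == b ? 0 : 1);
  }
}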
This is not in DFSClient.
Before the output is written onto HDFS, a lot of operations take place, like
the reducer output in memory reaching 90% of the HDFS block size and then
starting to flush the data, etc.
So, my requirement is to have a look at that code, where I want to change
the logic a bit, which
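If the 90% flush being described is actually the map-side spill, the thresholds live in MapTask, not DFSClient, and are controlled by these knobs. Hadoop 1.x names shown; 2.x renames them mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent:

import org.apache.hadoop.conf.Configuration;

public class SpillKnobs {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setInt("io.sort.mb", 100);                // in-memory sort buffer size, MB
    conf.setFloat("io.sort.spill.percent", 0.80f); // fraction full that triggers a spill
    // The implementing code is org.apache.hadoop.mapred.MapTask,
    // in MapOutputBuffer.collect() and sortAndSpill().
  }
}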
On Jan 6, 2014 10:48 PM, nagarjuna kanamarlapudi
nagarjuna.kanamarlap...@gmail.com wrote:
Hi,
I checked out the source code from
https://svn.apache.org/repos/asf/hadoop/common/trunk/
I tried to compile the code with mvn.
I am compiling this on Mac OS X Mavericks. Any help is
You can read the build instructions for Hadoop:
http://svn.apache.org/repos/asf/hadoop/common/trunk/BUILDING.txt
For your problem: protoc is not set in your PATH. After setting it, recheck
that the protobuf version is 2.5.
From: nagarjuna kanamarlapudi [mailto:nagarjuna.kanamarlap...@gmail.com]
Sent: 07 January
Thanks all, the following was required.
Download protobuf from http://code.google.com/p/protobuf/downloads/list
$ ./configure
$ make
$ make check        # optional self-test
$ make install
$ protoc --version  # should now report libprotoc 2.5.0
Then compile the Hadoop source code.
On Tue, Jan 7, 2014 at 9:46 AM, Rohith Sharma K S rohithsharm...@huawei.com
wrote:
You can read Build
What OutputFormat are you using?
Once it reaches the OutputFormat (specifically the RecordWriter), it all depends on
what the RecordWriter does. Are you using some OutputFormat with a RecordWriter
that buffers like this?
Thanks,
+Vinod
On Jan 6, 2014, at 7:11 PM, nagarjuna kanamarlapudi
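For context, a minimal sketch of the kind of buffering RecordWriter being asked about. It is hypothetical: it wraps whatever writer the real OutputFormat returns and flushes in batches.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class BufferingRecordWriter<K, V> extends RecordWriter<K, V> {
  private final RecordWriter<K, V> delegate;   // e.g. TextOutputFormat's writer
  private final List<K> keys = new ArrayList<K>();
  private final List<V> vals = new ArrayList<V>();
  private final int flushEvery;

  public BufferingRecordWriter(RecordWriter<K, V> delegate, int flushEvery) {
    this.delegate = delegate;
    this.flushEvery = flushEvery;
  }

  @Override
  public void write(K key, V value) throws IOException, InterruptedException {
    keys.add(key);
    vals.add(value);
    if (keys.size() >= flushEvery) flush();   // buffer, then write in bulk
  }

  private void flush() throws IOException, InterruptedException {
    for (int i = 0; i < keys.size(); i++) delegate.write(keys.get(i), vals.get(i));
    keys.clear();
    vals.clear();
  }

  @Override
  public void close(TaskAttemptContext ctx) throws IOException, InterruptedException {
    flush();                                  // don't lose the buffered tail
    delegate.close(ctx);
  }
}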
When I click on an individual map's logs, it says "Aggregation is not enabled.
Try the nodemanager at slave1-machine:60933".
How can I enable aggregation?
On Sun, Jan 5, 2014 at 1:21 PM, Harsh J ha...@cloudera.com wrote:
Every failed task typically carries a diagnostic message and a set of logs
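For the aggregation question: log aggregation is a cluster-side setting in yarn-site.xml on every nodemanager, and the nodemanagers need a restart after changing it. A typical excerpt:

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

Once enabled, aggregated logs for a finished application can be fetched with:
$ yarn logs -applicationId <applicationId>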