It's your use of the mapred.input.dir property, which is a reserved
name in the framework (it's what FileInputFormat uses).
You have a config you extract the path from:
Path input = new Path(conf.get("mapred.input.dir"));
Then you do:
FileInputFormat.addInputPath(job, input);
Which internally simply
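For what it's worth, a minimal sketch of the workaround, assuming the goal is just to carry the input path through the configuration; the property name my.job.input.dir is made up for illustration:

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class InputSetup {
    public static void addInput(Job job) throws IOException {
        // Read the path from a custom property instead of the reserved
        // mapred.input.dir key, which FileInputFormat manages itself.
        Path input = new Path(job.getConfiguration().get("my.job.input.dir"));
        // addInputPath() appends this path to mapred.input.dir internally,
        // so anything you set under that key yourself gets clobbered.
        FileInputFormat.addInputPath(job, input);
    }
}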
Hi all,
I have experience with Hadoop 0.20.204 on a 3-machine pilot cluster;
now I'm trying to set up a real cluster on 32 Linux machines.
I have some questions:
1. Is Hadoop 1.0 stable? On the Hadoop site this version is indicated as
a beta release.
2. As you know, installing and setting up Hadoop
Sorry for multiple emails. I did find:
2012-03-05 17:26:35,636 INFO
org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call -
Usage threshold init = 715849728(699072K) used = 575921696(562423K)
committed = 715849728(699072K) max = 715849728(699072K)
2012-03-05 17:26:35,719 INFO
All I see in the logs is:
2012-03-05 17:26:36,889 FATAL org.apache.hadoop.mapred.TaskTracker: Task:
attempt_201203051722_0001_m_30_1 - Killed : Java heap space
Looks like the task tracker is killing the tasks, though I'm not sure why. I
increased the heap from 512 MB to 1 GB and it still fails.
On Mon, Mar 5, 201
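One hedged note on a distinction that often bites here, assuming the 512-to-1G increase was applied to the daemon heap: the TaskTracker's heap (HADOOP_HEAPSIZE) and the per-task child heap are separate settings, and "Java heap space" in a task attempt points at the latter. The value below is illustrative:

import org.apache.hadoop.conf.Configuration;

public class ChildHeap {
    public static void raiseTaskHeap(Configuration conf) {
        // Sets -Xmx for the child JVMs that run the map/reduce attempts;
        // raising the TaskTracker's own heap does not affect these.
        conf.set("mapred.child.java.opts", "-Xmx1024m");
    }
}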
Hi Rohini,
I ran into a similar problem just yesterday. In my case, the max process
count (ulimit -u) was set to 1024, which is too small, and when I
increased it to 100 the problem was gone. But you said "Ulimit on the
machine is set to unlimited", so I'm not sure this will h
Unfortunately, "public" didn't change my error ... Any other ideas? Has
anyone run Hadoop in Eclipse with custom sequence inputs?
Thank you,
Mark
On Mon, Mar 5, 2012 at 9:58 AM, Mark question wrote:
> Hi Madhu, it has the following line:
>
> TermDocFreqArrayWritable () {}
>
> but I'll try it w
Streaming is good for simulation: long-running map-only processes, where Pig
doesn't really help and it is simple to fire off a streaming process. You do
have to set some options so the tasks can take a long time to return and can
return counters.
Russell Jurney http://datasyndrome.com
On Mar 5, 2012, at 1
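A minimal sketch of the options in question, assuming they are the task timeout and streaming's stderr reporter protocol; it is written in Java only to keep one language across these examples, since a streaming task can be any executable:

public class LongRunningSimStep {
    public static void main(String[] args) {
        // With mapred.task.timeout raised (or set to 0 to disable it), a
        // streaming task can run for hours as long as it reports progress.
        // Streaming interprets specially formatted lines on stderr:
        System.err.println("reporter:status:step 1000 of 1000000");
        // Counter format: reporter:counter:<group>,<name>,<increment>
        System.err.println("reporter:counter:Simulation,Steps,1000");
    }
}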
I'm really interested in this as well. I have trouble seeing a really
good use case for streaming map-reduce. Is there something I can do in
streaming that I can't do in Pig? If I want to reuse previously written
Python functions from my code base, I can do that in Pig as much as in
Streaming, and f
Hi Madhu, it has the following line:
TermDocFreqArrayWritable () {}
but I'll try it with "public" access in case it's been called outside of my
package.
Thank you,
Mark
On Sun, Mar 4, 2012 at 9:55 PM, madhu phatak wrote:
> Hi,
> Please make sure that your CustomWritable has a default constructor
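A minimal sketch of why the constructor's visibility matters: Hadoop creates Writable instances reflectively, from outside your package, so the no-arg constructor must be public. The class body below is illustrative, not Mark's actual implementation:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class TermDocFreqArrayWritable implements Writable {
    private int[] freqs = new int[0];

    // Must be public: a package-private constructor works inside your own
    // package but fails when the framework instantiates the class reflectively.
    public TermDocFreqArrayWritable() {}

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(freqs.length);
        for (int f : freqs) {
            out.writeInt(f);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        freqs = new int[in.readInt()];
        for (int i = 0; i < freqs.length; i++) {
            freqs[i] = in.readInt();
        }
    }
}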
On Mon, Mar 5, 2012 at 7:40 AM, John Conwell wrote:
> AWS MapReduce (EMR) does not use S3 for its HDFS persistence. If it did,
> your S3 billing would be massive :) EMR reads all input jar files and
> input data from S3, but it copies these files down to its local disk. It
> then starts th
AWS MapReduce (EMR) does not use S3 for its HDFS persistence. If it did,
your S3 billing would be massive :) EMR reads all input jar files and
input data from S3, but it copies these files down to its local disk. It
then starts the MR process, doing all HDFS reads and writes to the
local disk
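A sketch of the layout John describes, with a hypothetical bucket and paths; EMR jobs of this era typically addressed S3 input via s3n:// URIs while intermediate HDFS lived on the instances' local disks:

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EmrPaths {
    public static void configure(Job job) throws IOException {
        // Input is read from S3 (EMR copies it down to local disk as described).
        FileInputFormat.addInputPath(job, new Path("s3n://my-bucket/input/"));
        // Output lands in the cluster-local HDFS on the instance disks.
        FileOutputFormat.setOutputPath(job, new Path("hdfs:///user/hadoop/output/"));
    }
}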
On 02/27/2012 11:53 AM, W.P. McNeill wrote:
> You don't need any virtualization. Mac OS X is Linux and runs Hadoop as is.
Nitpick: OS X is NEXTSTEP-derived and built on Mach, which is a different
POSIX-compliant system from Linux.
Can someone have a look at the patch in MAPREDUCE-2457 and see whether it can
be modified to work with 0.20.205?
I am very new to Java and have no idea what's going on in that patch. If
you have any pointers for me, I will see if I can do it on my own.
Thanks,
Austin
On Fri, Mar 2, 2012 at 7:15 PM, Austin