them? The env program can't find them and that's
>> probably why your scripts with a shebang don't run.
>>
>> On Tue, Sep 13, 2011 at 1:12 PM, Bejoy KS wrote:
>> > Thanks Jeremy. But I didn't follow 'redirect "stdout" to "stderr" a
nd you a pointer.
J
On Mon, Sep 12, 2011 at 8:27 AM, Bejoy KS wrote:
> Thanks Jeremy. I tried with your first suggestion and the mappers ran to
> completion. But then the reducers failed with another exception related to
> pipes. I believe it may be due to permission issues again. I
I would suggest you try putting your mapper/reducer py files in a directory
that is world readable at every level, e.g. /tmp/test. I had similar
problems when I was using streaming and I believe my workaround was to put
the mapper/reducers outside my home directory. The other more involved
alternative
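A rough sketch of the kind of check that suggestion implies, assuming a
hypothetical script at /tmp/test/mapper.py (the path is only a placeholder):

#!/usr/bin/env python
# Sketch: verify that a streaming mapper/reducer script is world readable
# at every directory level. The path used below is a placeholder.
import os
import stat

def world_readable_at_every_level(path):
    path = os.path.abspath(path)
    while True:
        mode = os.stat(path).st_mode
        if os.path.isdir(path):
            ok = mode & stat.S_IROTH and mode & stat.S_IXOTH  # dirs need o+rx
        else:
            ok = mode & stat.S_IROTH                          # files need o+r
        if not ok:
            print("not world readable: %s" % path)
            return False
        parent = os.path.dirname(path)
        if parent == path:
            return True
        path = parent

world_readable_at_every_level("/tmp/test/mapper.py")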
contains all the
files that meet my criteria.
Thanks,
Jeremy
Hassen,
I've been very successful using Hadoop Streaming, Dumbo, and TypedBytes
as a solution for using Python to implement mappers and reducers.
TypedBytes is a Hadoop encoding format that allows binary data
(including lists and maps) to be encoded in a format that permits the
serialized data to
J
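For the curious, here is a hedged sketch of what that typed bytes encoding
looks like on the wire, writing a single (string, int) pair by hand. The
type codes (7 for a UTF-8 string, 3 for a 32-bit int) and big-endian
lengths follow the typed bytes spec; in a real job the typedbytes Python
package that Dumbo builds on does this for you.

#!/usr/bin/env python
# Sketch of the typed bytes wire format for one ("word", 1) pair.
import struct
from io import BytesIO

def write_string(out, s):
    data = s.encode("utf-8")
    out.write(struct.pack(">bi", 7, len(data)))  # type code 7, then byte length
    out.write(data)

def write_int(out, i):
    out.write(struct.pack(">bi", 3, i))          # type code 3, then the value

buf = BytesIO()
write_string(buf, "word")   # the key
write_int(buf, 1)           # the value
print(repr(buf.getvalue()))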
If you want to use complex types in a streaming job I think you need to
encode the values using the typedbytes format within the sequence file;
i.e. the key and value in the sequence file are both TypedBytesWritable.
This is independent of the language the mapper and reducer are written in
because
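As an illustration, here is a minimal sketch of a streaming mapper that
reads and writes typed bytes pairs. It assumes the typedbytes Python
package (the one Dumbo uses) and a job launched with -io typedbytes so
that stdin/stdout carry typed bytes rather than text; treat the exact
class names as assumptions drawn from that package, not gospel.

#!/usr/bin/env python
# Hedged sketch: identity mapper over typed bytes key/value pairs.
import sys
import typedbytes

def mapper(key, value):
    # value can be any typed bytes type, including lists and maps
    yield key, value

if __name__ == "__main__":
    inp = typedbytes.PairedInput(sys.stdin)
    out = typedbytes.PairedOutput(sys.stdout)
    for key, value in inp.reads():
        out.writes(mapper(key, value))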
Thanks Todd.
Unfortunately, I'm using Cascading on Hadoop, so I'm not sure if there's
an easy mechanism to force the LocalJobs it fires off to use a different
configuration. I'll talk to the Cascading folks and find out.
J
Quoting Todd Lipcon:
Hi Jeremy,
That'
next time I
restart the daemons, the task tracker will fail because it can't
rename /var/lib/hadoop-0.20/cache/pseudo/localRunner.
Does anybody have suggestions how to fix this?
Thanks
Jeremy
Mike,
Check out this wiki
http://code.google.com/p/hadoop-clusternet/wiki/DebuggingJobsUsingEclipse
It shows how, if you're running in standalone mode, you can run a job in
debug mode and then start a remote debugging session with Eclipse to step
through your code.
I've found
org.apache.hadoop.mapreduce is the newer API. To avoid breaking backward
compatibility, the older API, org.apache.hadoop.mapred, was preserved, and
the newer API was given a new package name.
Check out
http://www.slideshare.net/sh1mmer/upgrading-to-the-new-map-reduce-api
J
On Tue, 2011-04-26 at 20:17
could have another operator after that which would process
each word.
Jeremy
On Sat, 2011-04-23 at 19:39 +0530, ranjith k wrote:
> Thank you, Harsh.
> I have a map function. The InputFormat is the line input format. I need to
> run another MapReduce task from the map function for each word
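The usual streaming alternative to launching a second job from inside a
map function is to have the first mapper just emit one record per word and
let a chained second job (or the reducer) process each word. A minimal
sketch of such a mapper:

#!/usr/bin/env python
# Sketch: emit one (word, 1) record per word so a later step can process
# each word; tab-separated key/value is the default streaming text format.
import sys

for line in sys.stdin:
    for word in line.split():
        print("%s\t%d" % (word, 1))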
configuration file is overwritten by the call
jobConf_.set("stream.addenvironment", addTaskEnvironment_);
in StreamJob.setJobConf()?
I'm using CDH3B.
Thanks
Jeremy
Wow, sorry, that was just my sad excuse. Thanks again.
On Feb 9, 2011, at 6:53 PM, Andrew Hitchcock wrote:
> Ah, nice catch. I'll go fix that message now :)
>
> On Wed, Feb 9, 2011 at 4:50 PM, Jeremy Hanna
> wrote:
>> Bah - you're right. I don't know
Bah - you're right. I don't know why I thought the real error was obscured,
besides being distracted by "you should of", which should be "you should have".
Thanks and apologies...
Jeremy
On Feb 9, 2011, at 6:10 PM, Andrew Hitchcock wrote:
> "This file sys
Anyone know why I would be getting an error doing a FileSystem.open on a file
with an s3n prefix?
for the input path "s3n://backlog.dev/129664890/" - I get the following
stacktrace:
java.lang.IllegalArgumentException: This file system object
(hdfs://ip-10-114-89-36.ec2.internal:9000) does n
it is file based, it
> will write to the given file output path, else to Cassandra, DB, whatever you
> specify..
>
> Thanks and Regards,
> Sonal
> Connect Hadoop with databases, Salesforce, FTP servers and others
> Nube Technologies
e to hdfs as long as you give a
> path like s3:// and hdfs://.
>
> Koji
>
>
> On 2/1/11 11:13 AM, "Jeremy Hanna" wrote:
>
> I wanted to input from s3 but output to someplace else in aws with
> elastic mapreduce. Their docs seem to suggest that they only
> read from/write to s3. Is that correct?
>
I wanted to input from s3 but output to someplace else in aws with
elastic mapreduce. Their docs seem to suggest that they only
read from/write to s3. Is that correct?