Handling of small files in hadoop

2011-09-14 Thread Naveen Mahale
Hi all, I use the hadoop-0.21.0 distribution. I have a large number of small files (a few KB each). Is there any efficient way of handling them in hadoop? I have heard that solutions for this problem include: 1. HAR (Hadoop archives) 2. cat on files. I would like to know if there are any
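As a sketch of the HAR approach mentioned above: the `hadoop archive` tool packs many small files into a single archive that MapReduce can still read through the `har://` scheme. The HDFS paths and archive name below are hypothetical examples, not from the thread.

```shell
# Pack the small files under /user/naveen/input into one HAR
# (-p gives the parent path that the source is relative to;
# all paths here are made-up examples)
hadoop archive -archiveName small-files.har -p /user/naveen /input /user/naveen/archives

# List the archived files back through the har:// filesystem scheme
hadoop fs -ls har:///user/naveen/archives/small-files.har
```

The archive itself is a MapReduce job, so it needs a running cluster; the resulting `.har` stores files contiguously, which reduces NameNode memory pressure compared with millions of tiny HDFS files.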

Setting permissions on startup, during safe mode

2011-09-14 Thread Ossi
hi, every time after starting our hadoop cluster (using Cloudera's distribution) this message appears: 2011-09-13 04:35:05,207 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode extension entered. The reported blocks 8995 has reached the threshold 0.9990 of total blocks 9005. Safe mode will be turned
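For context on the log message above: the NameNode stays in safe mode until enough DataNodes have reported their blocks to pass the configured threshold, and during that window write operations (including permission changes) are rejected. A minimal sketch of the 0.20-era commands for inspecting and waiting out safe mode:

```shell
# Report whether the NameNode is currently in safe mode
hadoop dfsadmin -safemode get

# Block until safe mode exits on its own (useful in startup scripts
# that need to run write operations right after cluster start)
hadoop dfsadmin -safemode wait

# Force the NameNode out of safe mode -- only do this manually if you
# understand why the block threshold has not been reached
hadoop dfsadmin -safemode leave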

Re: Handling of small files in hadoop

2011-09-14 Thread Joey Echeverria
Hi Naveen, I use hadoop-0.21.0 distribution. I have a large number of small files (KB). Word of warning, 0.21 is not a stable release. The recommended version is in the 0.20.x range. Is there any efficient way of handling it in hadoop? I have heard that solution for that problem is using:  

Re: Hadoop Streaming job Fails - Permission Denied error

2011-09-14 Thread Brock Noland
Hi, This probably belongs on mapreduce-user as opposed to common-user. I have BCC'ed the common-user group. Generally it's a best practice to ship the scripts with the job. Like so: hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar -input
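To sketch the "ship the scripts with the job" practice Brock describes: the streaming jar's `-file` option copies a local script into the job's working directory on every task node, so the `-mapper` and `-reducer` commands can refer to it by bare name. The script names and HDFS paths below are examples, not from the thread; the jar path is the CDH3 one quoted above.

```shell
# -file ships each script with the job so task nodes need no
# pre-installed copy (mapper.py/reducer.py and the HDFS paths
# are hypothetical examples)
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar \
  -input /user/me/input \
  -output /user/me/output \
  -mapper mapper.py \
  -reducer reducer.py \
  -file mapper.py \
  -file reducer.py
```

Shipping the scripts this way also avoids the permission-denied class of failure, since the job runs its own copies rather than depending on a path and mode that may differ across nodes.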

Re: Running example application with capacity scheduler ?

2011-09-14 Thread Thomas Graves
I believe it defaults to submit a job to the default queue if you don't specify it. You don't have the default queue defined in your list of mapred.queue.names. So add -Dmapred.job.queue.name=myqueue1 (or another queue you have defined) to the wordcount command like: bin/hadoop jar
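Assembling Thomas's suggestion into a full command line, a sketch might look like the following. The queue name `myqueue1` comes from the thread; the examples-jar name and the input/output paths are assumptions.

```shell
# Submit the wordcount example to an explicitly named capacity-scheduler
# queue; the -D generic option must appear before the positional
# input/output arguments (jar name and paths are examples)
bin/hadoop jar hadoop-examples.jar wordcount \
  -Dmapred.job.queue.name=myqueue1 \
  /user/me/input /user/me/output
```

This works because the example programs parse `-D` options through the generic options machinery, so the queue property reaches the job configuration before submission.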

Am i crazy? - question about hadoop streaming

2011-09-14 Thread Mark Kerzner
Hi, I am using the latest Cloudera distribution, and with that I am able to use the latest Hadoop API, which I believe is 0.21, for such things as import org.apache.hadoop.mapreduce.Reducer; So I am using mapreduce, not mapred, and everything works fine. However, in a small streaming job,

Re: Am i crazy? - question about hadoop streaming

2011-09-14 Thread Konstantin Boudnik
I am sure if you ask on the provider-specific list you'll get a better answer than from the common Hadoop list ;) Cos On Wed, Sep 14, 2011 at 09:48PM, Mark Kerzner wrote: Hi, I am using the latest Cloudera distribution, and with that I am able to use the latest Hadoop API, which I believe is

Re: Am i crazy? - question about hadoop streaming

2011-09-14 Thread Mark Kerzner
I am sorry, you are right. mark On Wed, Sep 14, 2011 at 9:52 PM, Konstantin Boudnik c...@apache.org wrote: I am sure if you ask at provider's specific list you'll get a better answer than from common Hadoop list ;) Cos On Wed, Sep 14, 2011 at 09:48PM, Mark Kerzner wrote: Hi, I am

Re: Handling of small files in hadoop

2011-09-14 Thread Naveen Mahale
Hey, thanks Joey for that information. I will work on what you said. Regards Naveen Mahale On Wed, Sep 14, 2011 at 5:32 PM, Joey Echeverria j...@cloudera.com wrote: Hi Naveen, I use hadoop-0.21.0 distribution. I have a large number of small files (KB). Word of warning, 0.21 is not a

Re: Am i crazy? - question about hadoop streaming

2011-09-14 Thread Prashant
On 09/15/2011 08:18 AM, Mark Kerzner wrote: Hi, I am using the latest Cloudera distribution, and with that I am able to use the latest Hadoop API, which I believe is 0.21, for such things as import org.apache.hadoop.mapreduce.Reducer; So I am using mapreduce, not mapred, and everything works

Re: Am i crazy? - question about hadoop streaming

2011-09-14 Thread Mark Kerzner
Thank you, Prashant, it seems so. I already verified this by refactoring the code to use 0.20 API as well as 0.21 API in two different packages, and streaming happily works with 0.20. Mark On Wed, Sep 14, 2011 at 11:46 PM, Prashant prashan...@imaginea.com wrote: On 09/15/2011 08:18 AM, Mark