Re: splittable vs seekable compressed formats

2013-05-24 Thread Rahul Bhattacharjee
Yeah, I think John meant seeking to record boundaries. Thanks, Rahul On Fri, May 24, 2013 at 12:22 PM, Harsh J wrote: > SequenceFiles should be seekable provided you know/manage their sync > points during writes I think. With LZO this may be non-trivial. > > On Thu, May 23, 2013 at 11:01 PM,
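For reference, a minimal sketch of what managing sync points by hand can look like (Hadoop 1.x SequenceFile API; the path, key/value types, and record counts are illustrative assumptions, not from the thread):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class SyncPointDemo {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/demo.seq"); // hypothetical path

        // Record the byte offset just before each forced sync marker.
        List<Long> syncPoints = new ArrayList<Long>();
        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, path, LongWritable.class, Text.class);
        try {
          for (long i = 0; i < 10000; i++) {
            if (i % 1000 == 0) {
              syncPoints.add(writer.getLength()); // offset of the marker
              writer.sync();                      // force a sync marker here
            }
            writer.append(new LongWritable(i), new Text("record-" + i));
          }
        } finally {
          writer.close();
        }

        // Later: jump straight to the block that starts at record 5000.
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
          reader.sync(syncPoints.get(5)); // seek to the sync mark at/after this offset
          LongWritable key = new LongWritable();
          Text value = new Text();
          reader.next(key, value);        // first record after the sync point
          System.out.println(key + " -> " + value);
        } finally {
          reader.close();
        }
      }
    }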

Abort a job when a counter reaches a threshold

2013-05-24 Thread abhinav gupta
Hi, While running a map-reduce job that has only mappers, I have a counter that counts the number of failed documents. And after all the mappers are done, I want the job to fail if the total number of failed documents is above a fixed fraction. (I need it in the end because I don't know the

Re: Abort a job when a counter reaches a threshold

2013-05-24 Thread Harsh J
Yes, there is a job-level end-point upon success via OutputCommitter: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/OutputCommitter.html#commitJob(org.apache.hadoop.mapreduce.JobContext) On Fri, May 24, 2013 at 1:13 PM, abhinav gupta wrote: > Hi, > > While running a map-red
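A simpler route (a sketch, not from the thread) is to check the counter in the driver after waitForCompletion() and fail the run there; the "Docs" counter group, its counter names, and the 5% threshold are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class FailOnCounterDriver {
      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "doc-processing");
        // ... mapper class, input/output paths, etc. elided ...
        boolean ok = job.waitForCompletion(true);

        // Counter group/names are hypothetical; the mappers would increment them.
        long failed = job.getCounters().findCounter("Docs", "FAILED").getValue();
        long total = job.getCounters().findCounter("Docs", "TOTAL").getValue();
        if (!ok || (total > 0 && (double) failed / total > 0.05)) {
          System.exit(1); // treat the whole run as failed
        }
      }
    }

This only aborts after the job has finished; the OutputCommitter hook above is the place to veto the commit itself.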

Re: pauses during startup (maybe network related?)

2013-05-24 Thread Ted
Just an FYI in case anyone finds this thread in a web search: I just edited my /etc/hosts file and added a mapping of 127.0.0.1, and everything starts up almost instantly; the difference is night and day :) On 5/24/13, Harsh J wrote: > You are spot on about the DNS lookup slowing thing
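The added line was presumably of this form (the hostname is an assumption; the point is that the machine's own hostname resolves locally without a DNS round-trip):

    127.0.0.1   localhost myhostname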

how to stop the specified client computer OS to connect hadoop using super user

2013-05-24 Thread 麦树荣
hi, all Our hadoop is started by user "hadoop", and the user "hadoop" is the superuser. Therefore, a hadoop client on another computer can manipulate HDFS using the superuser "hadoop" as long as the client computer OS has the user "hadoop". Is there any way to stop the specified client computer OS

Re: how to stop the specified client computer OS to connect hadoop using super user

2013-05-24 Thread lxw
You can use iptables. -- Original Message -- From: "麦树荣"; Sent: Friday, May 24, 2013, 17:27; To: "user@hadoop.apache.org"; Subject: how to stop the specified client computer OS to connect hadoop using super user hi, all Our hadoop is started by user "hadoop", and the u
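A sketch of the kind of rule meant here (the client address and the NameNode RPC port are assumptions; 8020 is a common default):

    # drop traffic from one specific client to the NameNode RPC port
    iptables -A INPUT -s 192.168.1.50 -p tcp --dport 8020 -j DROP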

Re: diff between these 2 dirs

2013-05-24 Thread Sai Sai
Just wondering if someone can explain the difference between these 2 dirs: Contents of directory /home/satish/work/mapred/staging/satish/.staging and this dir: /hadoop/mapred/system Thanks Sai

Re: Hint on EOFException's on datanodes

2013-05-24 Thread Azuryy Yu
Maybe a network issue; the datanode received an incomplete packet. --Sent from my Sony mobile. On May 24, 2013 1:39 PM, "Stephen Boesch" wrote: > > On a smallish (10 node) cluster with only 2 mappers per node after a few > minutes EOFExceptions are cropping up on the datanodes: an example is shown > b

RE: how to stop the specified client computer OS to connect hadoop using super user

2013-05-24 Thread zangxiangyu
Hi. Why not try iptables :), which in fact has no relation to hadoop? Add all hadoop nodes and clients to a whitelist, and add XX to a blacklist. If all you want is to “stop the specified client computer OS to connect hadoop”, simply create one iptables rule. For long-term planning, Kerberos is a must. From:
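For the Kerberos route, these are the core-site.xml switches that enable strong authentication (a sketch only; a real setup also needs principals, keytabs, and per-daemon configuration):

    <property>
      <name>hadoop.security.authentication</name>
      <value>kerberos</value>
    </property>
    <property>
      <name>hadoop.security.authorization</name>
      <value>true</value>
    </property>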

Please help me with heartbeat storm

2013-05-24 Thread Eremikhin Alexey
Hi all, I have a 29-server hadoop cluster in almost default configuration. After installing Hadoop 1.0.4 I've noticed that the JT and some TTs waste CPU. I started stracing their behaviour and found that some TTs send heartbeats at an unlimited rate, meaning hundreds per second. Daemon restart solves

Re: Reducer that outputs no key

2013-05-24 Thread Something Something
You can ignore this for now. I was able to get merging of files to work under Hadoop Streaming by using the following 2 options: -mapper "cut -f2-" -Dmapred.reduce.tasks=0 On Fri, May 24, 2013 at 12:55 AM, Something Something < mailinglist...@gmail.com> wrote: > Hello, > > Trying to use Had
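Put together, the full command would look roughly like this (the jar location and HDFS paths are illustrative; note that generic -D options must precede the streaming-specific options):

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.1.2.jar \
        -D mapred.reduce.tasks=0 \
        -input /path/to/input \
        -output /path/to/merged \
        -mapper "cut -f2-"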

Error while using the Hadoop Streaming

2013-05-24 Thread Adamantios Corais
I tried this nice example: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/ The python scripts work fine from my laptop (through the terminal), but they don't when I execute them on CDH3 (Pseudo-Distributed Mode). Any ideas? hadoop jar /home/yyy/Dropbox

Re: Child Error

2013-05-24 Thread Jim Twensky
Hi again, in addition to my previous post, I was able to get some error logs from the task tracker/data node this morning, and it looks like it might be a Jetty issue: 2013-05-23 19:59:20,595 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201305231647_0007_m_001

Re: Error while using the Hadoop Streaming

2013-05-24 Thread Jitendra Yadav
Hi, I have run Michael's python map reduce example several times without any issue. I think this issue is related to your file path 'mapper.py'. Are you using the python binary? Try this: hadoop jar /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/hadoop-streaming-1.1.2.jar \ -input /user/yyy/

Re: Where to begin from??

2013-05-24 Thread Sanjay Subramanian
Hey guys, is there a way to dynamically change the input dir and output dir? I have the following CONSTANT directories in HDFS: * /path/to/input/-99-99 (empty directory) * /path/to/output/-99-99 (empty directory) A new directory with yesterday's date like /path/to/input/2013-05-23 g
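One common approach (a sketch, not from the thread): compute yesterday's date in the wrapper script and pass the dated paths to the job, assuming GNU date and a hypothetical driver class:

    YESTERDAY=$(date -d yesterday +%Y-%m-%d)
    hadoop jar myjob.jar com.example.MyJob \
        /path/to/input/$YESTERDAY /path/to/output/$YESTERDAY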

Re: Error while using the Hadoop Streaming

2013-05-24 Thread Adamantios Corais
Hi, Thanks a lot for your response. Unfortunately, I ran into the same problem. What do you mean by "python binary"? This is what I have in the very first line of both scripts: #!/usr/bin/python Any ideas? On Fri, May 24, 2013 at 7:41 PM, Jitendra Yadav wrote: > Hi, > > I have run Mic

Re: Error while using the Hadoop Streaming

2013-05-24 Thread Jitendra Yadav
Hi, In your first mail you were using the "/usr/bin/python" binary just after "-mapper"; I don't think we need the python executable to run this example. Make sure that you are using the correct paths of your files "mapper.py" and "reducer.py" while executing. ~Thanks On Fri, May 24, 2013 at 11:31 PM

Re: Error while using the Hadoop Streaming

2013-05-24 Thread Adamantios Corais
That's the point. I think I have chosen them right, but how could I double-check it? As you can see, the files "mapper.py" and "reducer.py" are on my laptop whereas the input file is on HDFS. Does this sound OK to you? On Fri, May 24, 2013 at 8:10 PM, Jitendra Yadav wrote: > Hi, > > In your first mail y
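Local scripts are fine as long as they are shipped with the job via -file; a sketch (local and HDFS paths are illustrative):

    hadoop jar hadoop-streaming-1.1.2.jar \
        -file /home/me/mapper.py  -mapper mapper.py \
        -file /home/me/reducer.py -reducer reducer.py \
        -input /user/me/input -output /user/me/output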

MiniYARNCluster logs

2013-05-24 Thread Prashant Kommireddi
Hey guys, We are using the MiniYARNCluster and trying to see where the NN, RM, and job logs can be found. We see the job logs are present on HDFS but not in any local dirs. Also, none of the master node logs (NN, RM) are available. Digging in a bit further (just looked at this 1 file), I see there is

RE: splittable vs seekable compressed formats

2013-05-24 Thread John Lilley
More specifically, seeking to a known location in the uncompressed data. So not just seeking to “the nearest record boundary”, but seeking to “position 1 in the uncompressed data”. I can see that if the writer kept track of this information on the side it would be available; my questio

Apache Flume Properties File

2013-05-24 Thread Raj Hadoop
Hi, I just installed Apache Flume 1.3.1 and am trying to run a small example to test it. Can anyone suggest how I can do this? I am going through the documentation right now. Thanks, Raj
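The Flume 1.3 user guide's smoke test is a single-agent netcat-to-logger configuration along these lines (the agent name a1 and the port are conventional choices, not requirements):

    # example.conf: single-node Flume agent
    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = localhost
    a1.sources.r1.port = 44444
    a1.sinks.k1.type = logger
    a1.channels.c1.type = memory
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

    # start the agent, then "telnet localhost 44444" and type a line
    bin/flume-ng agent --conf conf --conf-file example.conf --name a1 \
        -Dflume.root.logger=INFO,console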

question of how to debug hadoop code and mahout code

2013-05-24 Thread qiaoresearcher
Hi all, is there a way to debug hadoop code in Eclipse step by step using the HDFS file system? Thanks,
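One way (a sketch using Hadoop 1.x property names): run the job with the local job runner but leave fs.default.name pointing at HDFS, so tasks execute inside the Eclipse JVM where breakpoints fire while input is still read from HDFS. The NameNode address is an assumption:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class DebugInEclipse {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapred.job.tracker", "local");              // tasks run in this JVM
        conf.set("fs.default.name", "hdfs://localhost:8020"); // data stays on HDFS
        Job job = new Job(conf, "debug-me");
        job.setJarByClass(DebugInEclipse.class);
        // set your mapper/reducer classes here, then put breakpoints in them
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }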

Single Output file from STORE command

2013-05-24 Thread Mix Nin
The PIG STORE command produces multiple output files. I want a single output file, so I tried the command below: STORE (foreach (group NoNullData all) generate flatten($1)) into ''; This command produces one single file, but at the same time it forces a single reducer, which kills performanc
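One common workaround (not from the thread): keep the parallel reducers and concatenate the part files afterwards with getmerge; paths are illustrative:

    # merge all part-* files of the job output into one local file
    hadoop fs -getmerge /path/to/pig/output /local/path/single_file.txt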

Re: Apache Flume Properties File

2013-05-24 Thread Hitesh Shah
Hello Raj BCC-ing user@hadoop and user@hive Could you please not cross-post questions to multiple mailing lists? For questions on hadoop, go to user@hadoop. For questions on hive, please send them to the hive mailing list and not the user@hadoop mailing list. Likewise for flume. thanks -- H

Re: Apache Flume Properties File

2013-05-24 Thread Raj Hadoop
Hi, When I read the stuff on the internet about Flume, everything is mostly about the CDH distribution. I am aware that Flume is Cloudera's contribution, but I am using a strict Apache version in my research work. While reading all this, I want to make sure from the forum that Apache flume if ha

Re: Child Error

2013-05-24 Thread Jean-Marc Spaggiari
Hi Jim, Which JVM are you using? I don't think you have any memory issue; else you would have got some OOMEs... JM 2013/5/24 Jim Twensky > Hi again, in addition to my previous post, I was able to get some error > logs from the task tracker/data node this morning and looks like it might > be a j

Issue with data Copy from CDH3 to CDH4

2013-05-24 Thread samir das mohapatra
Hi all, We tried to pull data from an upstream cluster (running cdh3) to a downstream system (running cdh4), using *distcp* to copy the data; it was throwing an exception because of a version issue. I wanted to know: is there any solution to pull the data from CDH3 to CDH4 wi
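The usual cross-version recipe (a sketch; hosts and paths are assumptions) is to run distcp on the CDH4 side and read from CDH3 over the version-independent hftp protocol:

    # run from the CDH4 (destination) cluster; 50070 is the default hftp port
    hadoop distcp hftp://cdh3-namenode:50070/src/path \
        hdfs://cdh4-namenode:8020/dest/path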

Re: Issue with data Copy from CDH3 to CDH4

2013-05-24 Thread Jagat Singh
A bit unrelated but similar: copying sequence file data from a cdh3 to a cdh4 cluster. http://feedly.com/k/14KMYIk Thanks Jagat On May 25, 2013 1:50 PM, "samir das mohapatra" wrote: > Hi all, > > We tried to pull the data from upstream cluster(cdh3) which is > running cdh3 to down stream

Re: Where to begin from??

2013-05-24 Thread schhajed.iet
Did you try using the MultipleOutputs class? Sent from Windows Mail From: Sanjay Subramanian Sent: Friday, 24 May 2013 11:13 PM To: user@hadoop.apache.org Hey guys Is there a way to dynamically change the input dir and outputdir I have the following CONSTANT directories in
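For reference, a minimal sketch of the MultipleOutputs suggestion (new-API reducer; the Text types and the date string are illustrative assumptions): records can be routed to a dated subdirectory under the job's output path:

    import java.io.IOException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class DatedReducer extends Reducer<Text, Text, Text, Text> {
      private MultipleOutputs<Text, Text> mos;

      @Override
      protected void setup(Context context) {
        mos = new MultipleOutputs<Text, Text>(context);
      }

      @Override
      protected void reduce(Text key, Iterable<Text> values, Context context)
          throws IOException, InterruptedException {
        for (Text value : values) {
          // lands under <output dir>/2013-05-23/part-r-* (date is illustrative)
          mos.write(key, value, "2013-05-23/part");
        }
      }

      @Override
      protected void cleanup(Context context)
          throws IOException, InterruptedException {
        mos.close();
      }
    }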