Re: Executing a Python program inside Map Function

2013-01-26 Thread Harsh J
ch is on my local disk and I want to use > the output of that script for further processing in Map Function to produce > Pairs. > Can someone give me some idea how to do it? > > > Regards > Sundeep -- Harsh J

Re: Difference between HDFS and local filesystem

2013-01-26 Thread Harsh J
machines or will it throw errors? will the files be replicated and will they > be partitioned for running MapReduce if i use Localfile system? > > Can someone please explain. > > Regards > Sundeep > > > > -- Harsh J

Re: affected by 0.20: Allow block reports to proceed without holding FSDataset lock

2013-01-25 Thread Harsh J
upports fuse_dfs. Any > advice? > > -- > Thanks & Best Regards > Xibin Liu > -- Harsh J

Re: How to Backup HDFS data ?

2013-01-24 Thread Harsh J
ement wants to backup this data. I >> > am >> > talking about 20 TB of active HDFS data with an incremental of 2 >> > TB/month. >> > We would like to have weekly and monthly backups upto 12 months. >> > >> > Any ideas how to do this ? >> > >> > -- Steve > > -- Harsh J

Re: Filesystem closed exception

2013-01-24 Thread Harsh J
uns a > launcher mapper for a simple java action. Hence, the java action could very > well interact with a file system. I know this is probably better addressed > in Oozie context, but wanted to get the map reduce view of things. > > > Thanks, > Hemanth -- Harsh J

Re: HDFS File/Folder permission control with POSIX standard

2013-01-24 Thread Harsh J
tandard right? Removing other users' permission means > it can't access others, right? > > Please guide me. > > -Dhanasekaran. > > > > Did I learn something today? If not, I wasted it. -- Harsh J

Re: Submitting MapReduce job from remote server using JobClient

2013-01-24 Thread Harsh J
s attempting... > Anyone tried it before ? > I'm just looking for a way to submit MapReduce jobs from Java code and be > able to monitor them. > > Thanks, > > Amit. -- Harsh J

Re: Join Operation Using Hadoop MapReduce

2013-01-24 Thread Harsh J
: > Hi I am working join operation using MapReduce > So if anyone has useful information plz share it. > Example Code or New Technique along with existing one. > > Thank You. > > -- > > > Thanx and Regards > Vikas Jadhav -- Harsh J

Re: Modifying Hadoop For join Operation

2013-01-24 Thread Harsh J
ficient Way. > Thanks. > > -- > > > Thanx and Regards > Vikas Jadhav -- Harsh J

Re: hdfs du periodicity and hdfs not respond at that time

2013-01-24 Thread Harsh J
; Thanks, http://search-hadoop.com/m/LLBgUiH0Bg2 is my issue , but I still > dont't know how to solve this problem, 3 minutes not respond once an hour > is a big problem for me, any clue for this? > > > 2013/1/24 Harsh J >> >> Hi, >> >> HDFS does this to

Re: hdfs du periodicity and hdfs not respond at that time

2013-01-23 Thread Harsh J
, so when hdfs exec du, datanode will not respond for > about 3 minuts because of io loading, this cause a lot of problem, anybody > knows why hdfs doing this and how to disable it? > > -- > Thanks & Regards > Xibin Liu > -- Harsh J

Re: MulitpleOutputs outputs just one line

2013-01-23 Thread Harsh J
e logs that the mos.write() is being invoked, but only one line is > printed to the output file under /tmp. Is there some config I missed? > > Thanks. -- Harsh J

Re: cdh4 HA fencing fails when the other node is down

2013-01-23 Thread Harsh J
will be a solution for this scenario (With zkfc). > > Please guide me/point me to solution. > > > -Sagar -- Harsh J

Re: Error after upgrading from CDH3 to CDH4

2013-01-23 Thread Harsh J
received this communication in error, please resend this > communication to the sender and delete the original message or any copy > of it from your computer system. > > Thank You. > -- Harsh J

Re: cdh4 HA fencing fails when the other node is down

2013-01-23 Thread Harsh J
will be a solution for this scenario (With zkfc). > > Please guide me/point me to solution. > > > -Sagar -- Harsh J

Re: Hadoop-1.1.0 on EC2

2013-01-23 Thread Harsh J
AMI image for hadoop > 1.1.0. Will we have to build our own AMI or is there another we can safely > use? > > Thanks, > Daniel. > -- Harsh J

Re: Trouble starting up Task Tracker

2013-01-23 Thread Harsh J
confidential and proprietary > information. This email and any files transmitted with it are intended > solely for the use of the individual or entity to whom they are addressed. > You are hereby notified that any unauthorized disclosure, copying, or > distribution of this message, or the taking of any unauthorized action based > on information contained herein is strictly prohibited. Unauthorized use of > information contained herein may subject you to civil and criminal > prosecution and penalties. If you are not the intended recipient, you should > delete this message immediately and notify the sender immediately by > telephone or by replying to this transmission. -- Harsh J

Re: NameNode low on available disk space

2013-01-23 Thread Harsh J
if the NN has picked it up. -- Harsh J

Re: ISSUE with Hadoop JobTracker Web UI under CDH4

2013-01-23 Thread Harsh J
, >The Problem is after running any job under CDH4 , it is not showing > any job status > > like in cdh3 >1) running job >2) failed job >3) completed job > > under job tracker log from Web-UI > through http://localhost:50030 > > Note: Is it required any other configuration for that ? > -- Harsh J

Re: Custom Partitioner is not working in CDH4

2013-01-23 Thread Harsh J
if(ageInt >20 && ageInt <=50){ > > return 1 % numReduceTasks; > } > //otherwise assign partition 2 > else > return 2 % numReduceTasks; > > } > } > > > > DRIVER LEVEL CODE > --- > > job.setPartitionerClass(AgePartitioner.class); > job.setNumReduceTasks(3); > > > > > -- Harsh J
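The quoted fragment shows the core of the age-based partitioner under discussion. A minimal, standalone sketch of that bucket-then-modulo logic (in Python, outside Hadoop's `Partitioner` API; the `age <= 20` branch for bucket 0 is assumed, since the start of the quoted code is truncated):

```python
# Illustrative sketch of the age-bucket partitioning logic quoted above.
# Each age range maps to a fixed bucket number; taking the bucket modulo the
# reducer count keeps the result a valid partition index for any job size.
def get_partition(age: int, num_reduce_tasks: int) -> int:
    if num_reduce_tasks == 0:
        return 0  # map-only / single-output case
    if age <= 20:          # assumed first branch (truncated in the snippet)
        return 0 % num_reduce_tasks
    elif 20 < age <= 50:
        return 1 % num_reduce_tasks
    else:
        return 2 % num_reduce_tasks
```

With `setNumReduceTasks(3)` as in the quoted driver code, the three buckets map one-to-one onto the three reducers; with fewer reducers the modulo folds buckets together.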

Re: NameNode low on available disk space

2013-01-23 Thread Harsh J
r MB ? > I am layman on hadoop. The link I followed to install is given below > > > https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode > > Thanks, > > > > > On Wed, Jan 23, 2013 at 10:12 PM, Harsh J wro

Re: Hadoop Nutch Mkdirs failed to create file

2013-01-23 Thread Harsh J
.java:448) > at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:490) > ** at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260) > > > -- Harsh J

Re: NameNode low on available disk space

2013-01-23 Thread Harsh J
en I run command to leave > safemode manually. I never got alerts for low disk space on machine level > and i didn't see the space fluctuates GBs into MBs . > > > > > > On Wed, Jan 23, 2013 at 9:10 PM, Harsh J wrote: > >> Mohit, >> >> When do you specific

Re: NameNode low on available disk space

2013-01-23 Thread Harsh J
hecker: Space >> available on volume '/dev/mapper/vg_operamast1-lv_root' is 10653696, which >> is below the configured reserved amount 104857600 >> >> >> On Wed, Jan 23, 2013 at 11:13 AM, Harsh J wrote: >> >>> Hi again, >>> >>&

Re: EOF when Combiner works

2013-01-23 Thread Harsh J
168 INFO org.apache.hadoop.mapred.MapTask: Spilling > map output: buffer full= true > 2013-01-23 13:28:42,169 INFO org.apache.hadoop.mapred.MapTask: bufstart = > 0; bufend = 790273835; bufvoid = 987842480 > 2013-01-23 13:28:42,169 INFO org.apache.hadoop.mapred.MapTask: kvstart = > 0; kvend = 4072970; length = 5368709 > 2013-01-23 13:28:56,417 INFO org.apache.hadoop.io.compress.CodecPool: Got > brand-new compressor > 2013-01-23 13:29:18,998 INFO org.apache.hadoop.mapred.MapTask: Finished > spill 0 > ... > > > Please help me to understand the reason of task fails. > -- Harsh J

Re: Understanding harpoon - help needed

2013-01-23 Thread Harsh J
of the same file]?" - Yes, for remote client reads. Access order is randomized for these form of clients, leading to possibly different patterns each time. -- Harsh J

Re: NameNode low on available disk space

2013-01-22 Thread Harsh J
c static final longDFS_NAMENODE_DU_RESERVED_DEFAULT = 1024 * 1024 * 100; // 100 MB On Wed, Jan 23, 2013 at 10:12 AM, Harsh J wrote: > Edit your hdfs-site.xml (or whatever place of config your NN uses) to > lower the value of property "dfs.namenode.resource.du.reserved". Crea
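The reply quotes the 100 MB default (`DFS_NAMENODE_DU_RESERVED_DEFAULT = 1024 * 1024 * 100`) and suggests lowering `dfs.namenode.resource.du.reserved`. A sketch of what that looks like in `hdfs-site.xml` (the 50 MB value is purely an example, not a recommendation; lowering it weakens the NameNode's low-disk safeguard):

```xml
<!-- hdfs-site.xml: illustrative value only. The NameNode enters safe mode
     when free space on its storage volumes drops below this threshold. -->
<property>
  <name>dfs.namenode.resource.du.reserved</name>
  <value>52428800</value> <!-- 50 MB in bytes; default is 104857600 (100 MB) -->
</property>
```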

Re: NameNode low on available disk space

2013-01-22 Thread Harsh J
tateChange: STATE* > Safe mode is ON. > > > > On Wed, Jan 23, 2013 at 2:50 AM, Steve Loughran > wrote: > >> ser@hadoop.apache.orglist > > > > -- Harsh J

Re: CDH412/Hadoop 2.0.3 Upgrade instructions

2013-01-22 Thread Harsh J
he versions you're upgrading. Other than a recompile, you may mostly not require to do anything else. May we also know your reason to not use CM when its aimed to make all this much easier to do and manage? We appreciate any form of feedback, thanks! -- Harsh J

Re: Can't find the Job Status in WEB UI

2013-01-21 Thread Harsh J
ly one NIC. > > The datanodes can only access LAN in the cluster. > > -- Harsh J

Re: Prolonged safemode

2013-01-20 Thread Harsh J
about this. Thank you so much. > > Warm Regards, > Tariq > https://mtariq.jux.com/ > cloudfront.blogspot.com > > > On Sun, Jan 20, 2013 at 4:29 PM, Harsh J wrote: > >> If your DN is starting too slow, then you should investigate why. >> >> In any case

Re: Prolonged safemode

2013-01-20 Thread Harsh J
s time. > > Warm Regards, > Tariq > https://mtariq.jux.com/ > cloudfront.blogspot.com > -- Harsh J

Re: how to restrict the concurrent running map tasks?

2013-01-18 Thread Harsh J
> running as the job starts. > > I'm not sure whether I set this parameter in wrong way ? or misunderstand > it. > > After looking through the hadoop document, I can't find another parameter > to limit the concurrent running map tasks. > > Hope someone can help me ,Thanks. > -- Harsh J

Re: How to unit test mappers reading data from DistributedCache?

2013-01-17 Thread Harsh J
) should look like? > > Thanks. > -- Harsh J

Re: Help. Strange thing. It's block me 1 week....

2013-01-17 Thread Harsh J
pred.child.java.opts > -Xmx2000m > > > > mapreduce.reduce.java.opts > -Xmx2000m > > > > > mapred.reduce.tasks > AutoReduce > > > > io.sort.factor > 12 > > > > io.sort.mb > 300 > > > > > io.file.buffer.size > 65536 > > > > dfs.datanode.handler.count > 8 > > > > > > > > > -- Harsh J

Re: run hadoop in standalone mode

2013-01-17 Thread Harsh J
isabled IPv6, firewall on my linux machine. But, I still get this error > message. localhost is bound with 127.0.0.1 . core-site.xml and > mapreduce-site.xml are empty as they are not modified. > > Anybody can give me a hint if I need to do some specific configuration to > run hadoop in standalone mode? > > thanks and regards, > > Yiyu > > -- Harsh J

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Harsh J
, Jan 18, 2013 at 3:14 AM, Brennon Church wrote: > Pretty spiky. I'll throttle it back to 1MB/s and see if it reduces > things as expected. > > Thanks! > > --Brennon > > > On 1/17/13 1:41 PM, Harsh J wrote: > > Not true per the sources, it controls all

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Harsh J
t; > --Brennon > > > On 1/17/13 11:04 AM, Harsh J wrote: > > You can limit the bandwidth in bytes/second values applied > via dfs.balance.bandwidthPerSec in each DN's hdfs-site.xml. Default is 1 > MB/s (1048576). > > Also, unsure if your version
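The reply above names `dfs.balance.bandwidthPerSec` with a 1 MB/s default. A sketch of the corresponding `hdfs-site.xml` entry on each DataNode (value shown is the default mentioned in the thread, not a tuning suggestion):

```xml
<!-- hdfs-site.xml on each DataNode: caps balancer/replication transfer
     bandwidth per DataNode, in bytes per second. -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value> <!-- 1 MB/s, the default cited in the reply -->
</property>
```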

Re: Throttle replication speed in case of datanode failure

2013-01-17 Thread Harsh J
ing Hadoop v1.0.1. I think the > dfs.namenode.replication.work.multiplier.per.iteration option would do the > trick, but that is in v1.1.0 and higher. > > Thanks. > > --Brennon > -- Harsh J

Re: Hadoop NON DFS space

2013-01-17 Thread Harsh J
t;> > https://mtariq.jux.com/ >> > cloudfront.blogspot.com >> > >> > >> > On Wed, Jan 16, 2013 at 6:15 PM, Chris Embree >> wrote: >> > >> >> Ha, you joke, but we're planning on running with no local OS. If it >> >> works >&

Re: Eclipse Plugin for Hadoop

2013-01-17 Thread Harsh J
> HCL is strictly prohibited. If you have received this email in error > please delete it and notify the sender immediately. > Before opening any email and/or attachments, please check them for viruses > and other defects. > > > > -- Harsh J

Re: hadoop namenode recovery

2013-01-17 Thread Harsh J
does that slow down the overall NN > performance? > > Thanks, > randy > > > On 01/15/2013 11:14 PM, Harsh J wrote: > >> The NFS mount is to be soft-mounted; so if the NFS goes down, the NN >> ejects it out and continues with the local disk. If auto-restore is >

Re: test put file to hdfs error

2013-01-16 Thread Harsh J
:452) > at org.apache.hadoop.fs.FileSystem$Cache$Key.(FileSystem.java:1494) > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1395) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254) > -- Harsh J

Re: Configuration object not loading parameters in unit tests

2013-01-16 Thread Harsh J
have is that the default filesystem is being > created, rather than my custom filesystem. > > File system: org.apache.hadoop.fs.LocalFileSystem@4ce2cb55 > > > -- > Jay Vyas > http://jayunit100.blogspot.com > -- Harsh J

Re: does "fs -put " create subdirectories?

2013-01-16 Thread Harsh J
ot;hadoop fs -put mmdd.tsv t1/2012/01/01/mmdd.tsv" create the > > necassary subdirectories in hdfs?thanksJohn > -- Harsh J

Re: Hadoop NON DFS space

2013-01-15 Thread Harsh J
Wipe your OS out. Please read: http://search-hadoop.com/m/9Qwi9UgMOe On Wed, Jan 16, 2013 at 1:16 PM, Vikas Jadhav wrote: > > how to remove non dfs space from hadoop cluster > > -- > * > * > * > > Thanx and Regards* > * Vikas Jadhav* > -- Harsh J

Re: config file loactions in Hadoop 2.0.2

2013-01-15 Thread Harsh J
p >> install directory]/etc/hadoop >> but I still cannot find >> capacity-scheduler.xml >> masters - for listing master nodes >> >> Please help me setup this version. >> >> Thanking You, >> >> -- >> Regards, >> Ouch Whisper >> 010101010101 >> > > -- Harsh J

Re: hadoop namenode recovery

2013-01-15 Thread Harsh J
/or performance if there's a problem with the > NFS server? Or the network? > > Thanks, > randy > > > On 01/14/2013 11:36 PM, Harsh J wrote: > >> Its very rare to observe an NN crash due to a software bug in >> production. Most of the times its a hardware fault you sho

Re: FileSystem.workingDir vs mapred.local.dir

2013-01-15 Thread Harsh J
wrote: >> >>> Hi guys: What is the relationship between the "working directory" in >>> the FileSystem class (filesystem.workingDir), compared with the >>> mapred.local.dir properties ? >>> >>> It seems like these would essentially refer to the same thing? >>> -- >>> Jay Vyas >>> http://jayunit100.blogspot.com >>> >> >> > > > -- > Jay Vyas > http://jayunit100.blogspot.com > -- Harsh J

Re: question about ZKFC daemon

2013-01-15 Thread Harsh J
drives from the HDFS metadata for best performance and isolation. > > Here, ZooKeeper daemons = ZKFC? > > > Thanks > > ESGLinux, > > > > 2013/1/15 Harsh J > >> Hi, >> >> I fail to see your confusion. >> >> ZKFC != ZK >> >> ZK is

Re: question about ZKFC daemon

2013-01-15 Thread Harsh J
;> >>>> ESGLinux, >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> 2012/12/28 Craig Munro >> >>>>> >> >>>>> You need the following: >> >>>>> >> >>&

Re: hadoop namenode recovery

2013-01-14 Thread Harsh J
rt time. > > Thanking You, > > -- > Regards, > Ouch Whisper > 010101010101 > -- Harsh J

Re: Adding new HUG in Sao Paulo to HadoopUsersGroup wiki page

2013-01-14 Thread Harsh J
or making the update will do. > > My id: PauloMagalhaes > What I want to add:" > === South America === > * [http://www.meetup.com/SaoPauloHUG/| Sao Paulo HUG]: Hadoop Users Group > in Sao Paulo, Brazil > " > > Thanks, > Paulo Magalhaes > -- Harsh J

Re: Adding new HUG in Sao Paulo to HadoopUsersGroup

2013-01-14 Thread Harsh J
ers group in the > HadoopUserGroups wiki page. Either giving me the rights or making the > update will do. > > My id: PauloMagalhaes > What I want to add: > === South America === > * [http://www.meetup.com/SaoPauloHUG/| Sao Paulo HUG]: Hadoop Users > Group in Sao Paulo, Brazil -- Harsh J

Re: TupleWritable Format

2013-01-14 Thread Harsh J
epresentative of > HCL is strictly prohibited. If you have received this email in error > please delete it and notify the sender immediately. > Before opening any email and/or attachments, please check them for viruses > and other defects. > > > > -- Harsh J

Re: Scheduling non-MR processes

2013-01-12 Thread Harsh J
So if we program our application to schedule its > tasks directly with YARN we should be able to do what I am describing? Is > there any non-native-Java interop for YARN or should we focus on JNI for that? > > John > > > -Original Message- > From: Harsh J [mailto:ha

Re: Scheduling non-MR processes

2013-01-12 Thread Harsh J
or gone unresponsive. But ideally, you'd want to leverage YARN for this. Libraries such as Kitten [2] help along in this task. [1] - https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1/src/examples/org/apache/hadoop/examples/SleepJob.java [2] - https://github.com/cloudera/kitten/ -- Harsh J

Re: Sub-queues in capacity scheduler

2013-01-10 Thread Harsh J
for. I really wonder if it is possible (and how) to make it work in > cdh3u4. > > -P -- Harsh J

Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.

2013-01-10 Thread Harsh J
>>> >>> 0<http://23.20.27.135:9003/taskstats.jsp?tipid=task_201301090834_0041_r_02> >>> task_201301090834_0041_r_03<http://23.20.27.135:9003/taskdetails.jsp?tipid=task_201301090834_0041_r_03> >>> 0.00% >>> 10-Jan-2013 04:18:57 >>> 10-Jan-2013 06:46:38 (2hrs, 27mins, 41sec) >>> >>> Task attempt_201301090834_0041_r_03_0 failed to report status for 602 >>> seconds. Killing! >>> Task attempt_201301090834_0041_r_03_1 failed to report status for 602 >>> seconds. Killing! >>> Task attempt_201301090834_0041_r_03_2 failed to report status for 602 >>> seconds. Killing! >>> >>> >>> 0<http://23.20.27.135:9003/taskstats.jsp?tipid=task_201301090834_0041_r_03> >>> task_201301090834_0041_r_05<http://23.20.27.135:9003/taskdetails.jsp?tipid=task_201301090834_0041_r_05> >>> 0.00% >>> 10-Jan-2013 06:11:07 >>> 10-Jan-2013 06:46:38 (35mins, 31sec) >>> >>> >>> Task attempt_201301090834_0041_r_05_0 failed to report status for 600 >>> seconds. Killing! >>> >>> >>> 0<http://23.20.27.135:9003/taskstats.jsp?tipid=task_201301090834_0041_r_05> >>> >> >> > -- Harsh J

Re: queues in haddop

2013-01-10 Thread Harsh J
mplementing RabbitMQ and receive data from them using Spring Integration > data pipelines. > > I cannot afford to lose any of the JSON files received. > > Thanking You, > > -- > Regards, > Ouch Whisper > 010101010101 -- Harsh J

Re: How to interpret the progress meter?

2013-01-10 Thread Harsh J
r. For my > reduce steps, they ramp up to 40-60% in just a few minutes, then take hours > to slowly inch their way up the rest of the way to 100%. > > What does the "complete" percentage really mean? > > -- > Roy Smith > r...@panix.com > -- Harsh J
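The "ramp to 40-60% then crawl" pattern the poster describes matches how classic MapReduce reports reduce progress: the percentage averages three equally weighted phases (copy/shuffle, sort/merge, reduce), so the fast network-bound phases finish early while the user reduce code accounts for only the final third. A small sketch of that accounting:

```python
# Illustrative sketch of reduce-task progress reporting in classic MapReduce:
# copy/shuffle, sort/merge, and reduce each contribute one third, so a task
# can show ~66% with the actual reduce function barely started.
def reduce_progress(copy_frac: float, sort_frac: float, reduce_frac: float) -> float:
    """Each argument is that phase's completion fraction in [0.0, 1.0]."""
    return (copy_frac + sort_frac + reduce_frac) / 3.0

# All map output copied and merged, user reduce code not yet run:
assert abs(reduce_progress(1.0, 1.0, 0.0) - 2 / 3) < 1e-9
```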

Re: Suppress Permanently added a host warning

2013-01-09 Thread Harsh J
> /usr/local/hadoop/libexec/../logs/hadoop-root-secondarynamenode-master.out > > Regards > Sundeep -- Harsh J

Re: balancer and under replication

2013-01-07 Thread Harsh J
> is being used ? > or > - we need to bump up setrep to kind of trigger the number of replication > block ? > or > - ?? > > Thanks > -P -- Harsh J

Re: 64-bit libhadoop.so for 2.0.2-alpha?

2013-01-07 Thread Harsh J
t 64-bit libhadoop.so for 2.0.2-alpha? The one under > HADOOP_INSTALL_DIR/lib/native is 32-bit and get wrong ELF when running > Hadoop client on 64-bit platform. > > > > Thanks, > > Jane -- Harsh J

Re: sporadic failure

2013-01-07 Thread Harsh J
Thanks for following up, glad to know it is resolved! On Mon, Jan 7, 2013 at 6:42 AM, Stan Rosenberg wrote: > On Sat, Jan 5, 2013 at 2:44 AM, Harsh J wrote: >> I'd check the NN audit logs for the file >> /user/apache/.staging/job_201211150255_237458/job.xml to see when/wh

Re: Skipping entire task

2013-01-06 Thread Harsh J
p can skip bad records >>> >>> http://devblog.factual.com/practical-hadoop-streaming-dealing-with-brittle-c >>> ode. >>> But it is also possible to skip entire tasks? >>> >>> -Håvard >>> >>> -- >>> Håvard Wahl Kongsgård >>> Faculty of Medicine & >>> Department of Mathematical Sciences >>> NTNU >>> >>> http://havard.security-review.net/ >>> >> > > > > -- > Håvard Wahl Kongsgård > Faculty of Medicine & > Department of Mathematical Sciences > NTNU > > http://havard.security-review.net/ -- Harsh J

Re: Correct way to create a Job instance?

2013-01-06 Thread Harsh J
e correct way to create a Job is: > > Job job = Job.getInstance(Configuration conf) > > is this correct? > > second question: > Job API says: Job.getInstance() Creates a new Job with no particular > Cluster. A Cluster will be created with a generic Configuration. > > What

Re: more reduce tasks

2013-01-04 Thread Harsh J
/03 22:00:03 INFO streaming.StreamJob: map 100% reduce 100% >> 13/01/03 22:00:07 INFO streaming.StreamJob: Job complete: >> job_201301021717_0038 >> 13/01/03 22:00:07 INFO streaming.StreamJob: Output: 1gb.wc >> $ hadoop dfs -cat 1gb.wc/part-* >> 472173052 >> 165736187 >> 201719914 >> 184376668 >> 163872819 >> $ >> >> where /tmp/wcc contains >> #!/bin/bash >> wc -c >> >> Thanks for any answer, >> Pavel Hančar >> >> > -- Harsh J

Re: Possible to run an application jar as a hadoop daemon?

2013-01-04 Thread Harsh J
s automatically setup for you. -- Harsh J

Re: Gridmix version 1.0.4 Error

2013-01-04 Thread Harsh J
he.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) > at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:215) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) > at org.apache.hadoop.mapred.gridmix.Gridmix.main(Gridmix.java:390) > 13/01/04 14:19:26 INFO gridmix.Gridmix: Exiting... > hostname:gridmix seanbarry$ > > -- Harsh J

Re: sporadic failure

2013-01-04 Thread Harsh J
.TaskTracker$TaskLauncher.run(TaskTracker.java:2505) > > This problem doesn't seem relevant to only a specific distribution, > but for completeness we are running CDH3u3. > > Thanks! > > stan > -- Harsh J

Re: Does mapred.local.dir is important factor in reducer side?

2012-12-31 Thread Harsh J
merge at reducer side. In which location does this merge happen? If that >> location does not have enough space, does the reducer fail? What is the solution >> for MapReduce jobs if intermediate results for some keys are more than the local >> disk of the reducer? > > -- Harsh J

Re: Does mapred.local.dir is important factor in reducer side?

2012-12-31 Thread Harsh J
If that > location does not have enough space, does the reducer fail? What is the solution > for MapReduce jobs if intermediate results for some keys are more than the local > disk of the reducer? -- Harsh J

Re: how to start hadoop 1.0.4 backup node?

2012-12-28 Thread Harsh J
c/webapps/hdfs >> ./src/test/org/apache/hadoop/hdfs >> ./src/test/system/aop/org/apache/hadoop/hdfs >> ./src/test/system/java/org/apache/hadoop/hdfs >> ./src/hdfs >> ./src/hdfs/org/apache/hadoop/hdfs >> >> >> thanks! >> Andy > > -- Harsh J

Re: question about ZKFC daemon

2012-12-27 Thread Harsh J
oring - the ZKFC > pings its local NameNode on a periodic basis with a health-check command.) > so what does the third ZKFC? I used the jobtracker node but I could use > another node without any daemon on it... > > Thanks in advance, > > ESGLInux, > > > -- Harsh J

Re: Setting number of mappers in Teragen

2012-12-26 Thread Harsh J
apred.map.tasks = 20 . Can anyone tell me how to force teragen to use 20 > mappers for generating the data? I am using cdh4.1.2 with Mapreducev1(Hadoop > 0.20.2) > -- > Thanks & Regards, > Anil Gupta -- Harsh J

Re: distributed cache

2012-12-26 Thread Harsh J
a > small chunk of a file? > "gradually decreasing performance for long reads" -- you mean parallel > multiple threads long read degrade performance? Or single thread exclusive > long read degrade performance? > > regards, > Lin > > > On Wed, Dec 26, 2012 at 7

Re: Map Shuffle Bytes

2012-12-26 Thread Harsh J
at 6:03 PM, Eduard Skaley >> wrote: >>> >>> Hello guys, >>> >>> I need a counter for shuffled bytes to the mappers. >>> Is there existing one or should I define one myself ? >>> How can I implement such a counter? >>> >>> Thank you and happy Christmas time, >>> Eduard >> >> >> > -- Harsh J

Re: distributed cache

2012-12-26 Thread Harsh J
ks Harsh, multiple concurrent read is generally faster or? > > regards, > Lin > > > On Wed, Dec 26, 2012 at 6:21 PM, Harsh J wrote: >> >> There is no limitation in HDFS that limits reads of a block to a >> single client at a time (no reason to do so) - so d

Re: distributed cache

2012-12-26 Thread Harsh J
m multiple mappers or > reducers which requires the DistributedCache). > > regards, > Lin > > > On Wed, Dec 26, 2012 at 4:51 PM, Harsh J wrote: >> >> Hi Lin, >> >> DistributedCache files are stored onto the HDFS by the client first. >> The TaskTr

Re: distributed cache

2012-12-26 Thread Harsh J
>>> >>> Yes, you are correct. The JobTracker will put files for the distributed >>> cache into HDFS with a higher replication count (10 by default). Whenever a >>> TaskTracker needs those files for a task it is launching locally, it will >>> fetch a copy to its local disk. So it won't need to do this again for future >>> tasks on this node. After a job is done, all local copies and the HDFS >>> copies of files in the distributed cache are cleaned up. >>> >>> Kai >>> >>> -- >>> Kai Voigt >>> k...@123.org >>> >>> >>> >>> >> >> >> -- >> Kai Voigt >> k...@123.org >> >> >> >> > -- Harsh J

Re: Map Shuffle Bytes

2012-12-26 Thread Harsh J
ting one or should I define one myself ? > How can I implement such a counter? > > Thank you and happy Christmas time, > Eduard -- Harsh J

Re: How to estimate hadoop.tmp.dir disk space

2012-12-26 Thread Harsh J
eared > How to estimate hadoop.tmp.dir disk space? > > thx > -- > cente...@gmail.com -- Harsh J

Re: good way to debug map reduce code

2012-12-26 Thread Harsh J
rror.. it use to just throw them off and it > was very fast to debug as you code. > Is there any similar way .. where i dont have to run hadoop jobs to debg and > wait and go thru hadoop logs to see that maybe i miss a semi-colon.. > Thanks > Jamal -- Harsh J

Re: reducer tasks start time issue

2012-12-22 Thread Harsh J
ducers. My question > is, reducer tasks cannot begin until all mapper tasks complete? If so, why > designed in this way? > > thanks in advance, > Lin -- Harsh J

Re: Child processes on datanodes/task trackers

2012-12-22 Thread Harsh J
ava.lang.ref.Reference$Lock) > > "VM Thread" prio=10 tid=0x7f2ae8066800 nid=0x1122 runnable > > "GC task thread#0 (ParallelGC)" prio=10 tid=0x7f2ae801c800 > nid=0x1114 runnable > > "GC task thread#1 (ParallelGC)" prio=10 tid=0x7f2ae801e800 > nid=0x1115 runnable > > "GC task thread#2 (ParallelGC)" prio=10 tid=0x7f2ae802 > nid=0x111c runnable > > "GC task thread#3 (ParallelGC)" prio=10 tid=0x7f2ae8022000 > nid=0x111f runnable > > "VM Periodic Task Thread" prio=10 tid=0x7f2ae809d000 nid=0x1131 > waiting on condition > > JNI global references: 1774 > > > Thanks, Tabatabaei -- Harsh J

Re: Merging files

2012-12-22 Thread Harsh J
rom different locations from HDFS location > into one file into HDFS location? -- Harsh J

Re: Seekable interface and CompressInputStream question

2012-12-21 Thread Harsh J
ble, but in fact it isn't. How do I write a generic > InputFormat to support both splitable/unsplitable compress input stream in > this case? Or my understanding is not correct, that Seekable and Split are > totally different things? > > Thanks > > Yong -- Harsh J

Re: Is it possible to run from localized directory instead of jar?

2012-12-21 Thread Harsh J
ory before the tasks for the job start. The > job.jar location is accessible to the application through the api > JobConf.getJar() . To access the unjarred directory, > JobConf.getJar().getParent() can be called. > > > Thanks. > > -- > -ilya -- Harsh J

Re: Question about HA and Federation

2012-12-21 Thread Harsh J
Node for NS2 + Secondary NameNode for NS3 > > Is this correct? > > thanks, > > ESGLinux > > 2012/12/20 Harsh J >> >> Btw, you can co-locate NameNodes (unique namespace ones) onto the same >> machine if you need to - the configs easily allow this via rpc/http &g

Re: Question about HA and Federation

2012-12-20 Thread Harsh J
ch, > > I didn't understand how could I mix HA and Federation and how many nodes I > need > > Kind Regards, > > ESGLinux, > > 2012/12/20 Harsh J >> >> Yes I think its safe to say that - sorry that I missed out SNNs in my >> first response (I count

Re: Question about HA and Federation

2012-12-20 Thread Harsh J
meNode (it does the work of the old > Secondary NameNode) for NS1 NameSpace (these are 2 diferent machines) > 1 NameNode for NS2 + 1 Secondary NameNode > 1 NameNode for NS3 + 1 Secondary NameNode > > We can say that we need 2 nodes per NameSpace, is that true? > > Thanks, >

Re: Question about HA and Federation

2012-12-20 Thread Harsh J
t; 1 NameNode for NS2 > 1 NameNode for NS3 > > but what about the Secondary Name Nodes for NS2 and NS3? or I don´t need it? > perhaps I´m mixing concepts > > Thanks again, > > Greetings, > > ESGLinux > > > > > 2012/12/20 Harsh J >>

Re: Question about HA and Federation

2012-12-20 Thread Harsh J
amespace? > > I have read this documentation but It´s not clear for me :-( > > https://ccp.cloudera.com/display/CDH4DOC/Introduction+to+Hadoop+High+Availability > http://hadoop.apache.org/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/Federation.html > > Thanks in advance > > ESGLinux > -- Harsh J

Re: why not hadoop backup name node data to local disk daily or hourly?

2012-12-20 Thread Harsh J
ues.apache.org/jira/browse/HDFS-2397. We are perhaps deprecating the CheckpointNode though: https://issues.apache.org/jira/browse/HDFS-4114. -- Harsh J

Re: why not hadoop backup name node data to local disk daily or hourly?

2012-12-20 Thread Harsh J
e. Now I do this just using bash script. I don't think using a bash script to backup the metadata is a better solution than relying on the SecondaryNameNode. Two reasons: It does the same form of a copy-backup (no validation like SNN does), and it does not checkpoint (i.e. merge the edits into the fsimage). -- Harsh J

Re: Development against hadoop v 0.23.5

2012-12-19 Thread Harsh J
terested in writing a distributed application (which is not the same as writing an MR job), you can read [2]. [1] - http://www.cloudera.com/blog/2012/10/mr2-and-yarn-briefly-explained/ [2] - http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html -- Harsh J

Re: hftp can list directories but won't send files

2012-12-18 Thread Harsh J
pred.Child$4.run(Child.java:268) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) > at org.apache.hadoop.mapred.Child.main(Child.java:262) > > Can someone tell me why hftp is failing to serve files, or at least where to > look? > -- Harsh J

Re: datanode write forwarding

2012-12-18 Thread Harsh J
details and pointer to the code. > > Chris (another one) > > On Dec 18, 2012 5:14 PM, "Harsh J" wrote: >> >> Hi, >> >> The received write packet is directly socket-written to the next >> node's receiver (async process - we don't wait for

Re: datanode write forwarding

2012-12-18 Thread Harsh J
> Jay Vyas > http://jayunit100.blogspot.com -- Harsh J
