Re: problem running multiple native mode map reduce processes concurrently

2013-03-21 Thread Harsh J
sses. > > > > I expected to be able to run more than one of these processes at the same > time. It appears I cannot. Does anyone have any suggestions that would > help me do this? > > > > --Derrick H. Karimi > > --Software Developer, SEI Innovation Center > > --Carnegie Mellon University > > -- Harsh J

Re: Webserver and hadoop cluster

2013-03-21 Thread Harsh J
. do I need to install hadoop in the web server so it can communicate > with the hadoop cluster or is there any other way to get the files from > hadoop to the web server similar to the database you need only to connect to > the database using driver? -- Harsh J

Re: eclipse plugin for hadoop 2.0.0 alpha

2013-03-21 Thread Harsh J
>> currently working on a project to devise a solution for small files problem >> and i am using hdfs federation. I want to integrate our web server with >> hdfs. So I need eclipse plugin for this version. Please help me out. > > -- Harsh J

Re: Need your help with Hadoop

2013-03-21 Thread Harsh J
> datanode? > > Is there any specific need for the path we added into the hdfs.xml? > More details you can see in the details > > > BRs > Geelong > > > 2013/3/22 Harsh J > >> You can mount the disks you need on Linux as proper paths, mkdir some >> di

Re: Capacity Scheduler question

2013-03-22 Thread Harsh J
arking up the wrong > tree? If the capacity scheduler won't help me, can you think of anything > that will? > > Thanks! > > --Jeremy -- Harsh J

Re: Setup/Cleanup question

2013-03-22 Thread Harsh J
s will the setup & cleanup run n number of times which > means once for each mapper or for all the mappers they will run only once. > Any help is appreciated. > Thanks > Sai -- Harsh J
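In the org.apache.hadoop.mapreduce API, setup() and cleanup() run once per task attempt (once for each mapper and each reducer), not once for the whole job. A minimal sketch; class name and types are illustrative:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class SetupDemoMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void setup(Context context) {
            // Runs once per map task attempt, before the first map() call.
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Runs once per input record.
        }

        @Override
        protected void cleanup(Context context) {
            // Runs once per map task attempt, after the last map() call.
        }
    }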

Re: Cluster lost IP addresses

2013-03-22 Thread Harsh J
eated). >>> >>> This must have happened to someone before. >>> Nothing else on the machines has been changed. Most importantly the data >>> in HDFS is still sitting there. >>> >>> Is there a way to recover this cluster to a useable state? >>> thanks >>> John >> >> >> >> -- >> http://balajin.net/blog >> http://flic.kr/balajijegan -- Harsh J

Re: how to control (or understand) the memory usage in hdfs

2013-03-22 Thread Harsh J
e test environment with almost no data at all, or is it > suppose to work like OS-disk caches were it always works but just > performs better or worst and I just have something configured wrong?. > Basically my objective isn't performance, it's that the server must > not shut itself down, it can slow down but not shut off. > > -- > Ted. -- Harsh J

Re: Cluster lost IP addresses

2013-03-22 Thread Harsh J
r at least be responsible for fixing this > abomination. Sad that this code was released GA. > > Sorry folks. HDFS/Mapred is really cool tech, I'm just jaded about this > kind of silliness. > > In my Not So Humble Opinion. > Chris > > > On Sat, Mar 23, 2013 at

Re: Dissecting MR output article

2013-03-22 Thread Harsh J
for this. > but only for MRv1. > > On Mar 23, 2013 1:50 PM, "Sai Sai" wrote: >> >> >> Just wondering if there is any step by step explanation/article of MR >> output we get when we run a job either in eclipse or ubuntu. >> Any help is appreciated. >> Thanks >> Sai -- Harsh J

Re: For a new installation: use the BackupNode or the CheckPointNode?

2013-03-23 Thread Harsh J
able build (1.1.2 ), is there any >> reason to use the CheckPointNode over the BackupNode? >> >> >> >> It seems that we need to choose one or the other, and from the docs it >> seems like the BackupNode is more efficient in its processes. > > > > > -- > Regards, > Varun Kumar.P -- Harsh J

Re: how to control (or understand) the memory usage in hdfs

2013-03-23 Thread Harsh J
switched it to > 2048. > > I'm going to run the test again with 1024mb and jconsole running, none > of this makes any sense to me. > > On 3/23/13, Harsh J wrote: >> I run a 128 MB heap size DN for my simple purposes on my Mac and it >> runs well for what load I appl

Re: DistributedCache - why not read directly from HDFS?

2013-03-23 Thread Harsh J
) method? > > Thank you very much, > Alberto > > > -- > Alberto Cordioli -- Harsh J

Re: Delete a hdfs directory if it already exists in a shell script.

2013-03-23 Thread Harsh J
cannot create directory /user/lnindrakrishna/03232013: File exists > > > > Looks like hadoop fs -test -d /user/lnindrakrishna/$DIRECTORY is returning > NULL and that is the reason it throws null value for echo $TestDir and it > goes to else part and displays "Directory does not Exist" > > > What is wrong in the above shell script that I have written > > > - -- Harsh J
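One likely culprit in the script above: hadoop fs -test -d reports its result through the process exit status ($?), not on stdout, so assigning its output to a variable yields an empty string. The same exists-then-delete check can be done in one step via the Java API; a minimal sketch, reusing the path from the thread:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DeleteIfExists {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path dir = new Path("/user/lnindrakrishna/03232013");
            if (fs.exists(dir)) {
                fs.delete(dir, true); // true = recursive delete
            }
            fs.mkdirs(dir);
        }
    }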

Re: empty file

2013-12-11 Thread Harsh J
ck replication: 0.0 > Corrupt blocks: 0 > Missing replicas: 0 > Number of data-nodes: 38 > Number of racks: 6 > FSCK ended at Wed Dec 11 20:35:30 CST 2013 in 1 milliseconds > > > The filesystem under path '/tmp/corrupt_lzo/lc_hadoop16.1386270004881.lzo' > is HEALTHY > > -- > chenchun > > -- Harsh J

Re: conf.set() and conf.get()

2013-12-29 Thread Harsh J
rom Driver to mapper. > Am i able to store an object value using conf.set. In order to access the > same value from the driver? > > > -- > Thanks & Regards > > Unmesha Sreeveni U.B > > -- Harsh J
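Values set on the Configuration in the driver before job submission are shipped with the job and readable from any task. The flow is one way: a task cannot pass a value back to the driver this way, and only strings and primitives fit directly (an object must be serialized to a string first). A minimal sketch; the property name is illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ConfPassing {
        public static class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private int threshold;
            @Override
            protected void setup(Context context) {
                // Read the value the driver stored; 0 is the default.
                threshold = context.getConfiguration().getInt("my.app.threshold", 0);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("my.app.threshold", "42"); // must be set before Job creation
            Job job = Job.getInstance(conf, "conf-passing-demo");
            // ... set mapper, formats and paths, then job.waitForCompletion(true)
        }
    }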

Re: Job fails while re attempting the task in multiple outputs case

2013-12-30 Thread Harsh J
rtFile(FSNamesystem.java:1599) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:732) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:711) > at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown > Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1448) > > > Thanks & Regards, > B Anil Kumar. -- Harsh J

Re: MapReduce MIME Input type?

2013-12-30 Thread Harsh J
Suiter* > Jr. Data Solutions Software Engineer > 100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212 > Google Voice: 412-256-8556 | www.rdx.com > -- Harsh J

Re: LookUp in mapreduce

2013-12-30 Thread Harsh J
),Bytes.toBytes("1")) > } > > > but the column valid_ind is not printing. > > Please help with sample code to fix . > > thanks in advance > > Ranjini.R > -- Harsh J

Re: Reduce task hang[EMERGENCE]

2014-01-02 Thread Harsh J
r: attempt_201312201200_34795_r_00_0 0.0% reduce > copy >> > >> hadoop-hadoop-tasktracker-10-200-91-186.out:14/01/03 06:14:17 INFO >> mapred.TaskTracker: attempt_201312201200_34795_r_00_0 0.0% reduce > copy >> > >> hadoop-hadoop-tasktracker-10-200-91-186.out:14/01/03 06:14:23 INFO >> mapred.TaskTracker: attempt_201312201200_34795_r_00_0 0.0% reduce > copy >> > > > -- Harsh J

Re: Reduce task hang[EMERGENCE]

2014-01-02 Thread Harsh J
t some people's max job running greater than one. and these people's job > never hanged... > > > On Fri, Jan 3, 2014 at 1:13 PM, Harsh J wrote: >> >> Does the Reduce task log (of attempt_201312201200_34795_r_00_0) >> show any errors in trying to communicat

Re: Block size

2014-01-03 Thread Harsh J
3, 2014 at 11:37 AM, Kurt Moesky wrote: > > I see the default block size for HDFS is 64 MB, is this a value that can be > changed easily? > > -- Harsh J
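The cluster-wide default comes from dfs.blocksize (dfs.block.size in older releases) in hdfs-site.xml; it can also be chosen per file at create time, and existing files keep the block size they were written with. A minimal sketch of the per-file form, with illustrative values:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PerFileBlockSize {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            long blockSize = 128L * 1024 * 1024; // 128 MB, for this file only
            FSDataOutputStream out = fs.create(
                new Path("/tmp/bigfile"), true /* overwrite */,
                4096 /* buffer */, (short) 3 /* replication */, blockSize);
            out.close();
        }
    }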

Re: Cutting a line in between while creating blocks

2014-01-04 Thread Harsh J
ines, how does > Hadoop take care of the problem of not Cutting a line in between while > creating blocks? > Is it taken care of by Hadoop? > > Thanks > Shalish. -- Harsh J
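Short answer: yes, Hadoop takes care of it. Blocks split files at raw byte offsets, but TextInputFormat's record reader repairs line boundaries: every split except the first discards its leading partial line, and every split reads past its end to finish the last line it started. A simplified sketch of that rule (not the actual LineRecordReader code):

    import java.io.BufferedReader;
    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;

    public class SplitReadingRule {
        // Records "owned" by the split [start, end) of a '\n'-delimited file.
        static List<String> readSplit(byte[] file, int start, int end) throws IOException {
            BufferedReader r = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(file, start, file.length - start)));
            List<String> records = new ArrayList<>();
            int pos = start;
            if (start != 0) {
                // Skip the partial first line; the previous split's reader
                // already consumed it by reading past its own end.
                String skipped = r.readLine();
                if (skipped != null) pos += skipped.length() + 1;
            }
            String line;
            // Any line that *starts* inside this split is read to completion,
            // even if it crosses 'end' into the next block.
            while (pos < end && (line = r.readLine()) != null) {
                records.add(line);
                pos += line.length() + 1;
            }
            return records;
        }
    }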

Re: What makes a map to fail?

2014-01-05 Thread Harsh J
Adel Mehraban wrote: > Hi all, > My task jobs are failing due to many failed maps. I want to know what makes > a map to fail? Is it something like exceptions or what? -- Harsh J

Re: Need idea for mapreduce

2014-01-05 Thread Harsh J
h column to map() >Find max in Reduce() > > -- > Thanks & Regards > > Unmesha Sreeveni U.B > Junior Developer > > http://www.unmeshasreeveni.blogspot.in/ > > -- Harsh J
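A minimal sketch of that plan: every mapper emits the nth column under one constant key, so a single reduce call sees all values and keeps the maximum. The column index and the comma separator are assumptions; the reducer can also be registered as a combiner to shrink the shuffle:

    import java.io.IOException;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class ColumnMax {
        static final int COLUMN = 3; // assumed: 0-based index of the target column

        public static class MaxMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
            private static final Text KEY = new Text("max");
            @Override
            protected void map(LongWritable k, Text line, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = line.toString().split(","); // assumed CSV input
                ctx.write(KEY, new DoubleWritable(Double.parseDouble(fields[COLUMN])));
            }
        }

        public static class MaxReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
            @Override
            protected void reduce(Text key, Iterable<DoubleWritable> vals, Context ctx)
                    throws IOException, InterruptedException {
                double max = Double.NEGATIVE_INFINITY;
                for (DoubleWritable v : vals) max = Math.max(max, v.get());
                ctx.write(key, new DoubleWritable(max));
            }
        }
    }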

Re: org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeExceptio

2014-01-09 Thread Harsh J
va:623) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:225) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:664) > at java.lang.Thread.run(Thread.java:701) > > I set these properties in core-site.xml: > fs.default.name = hdfs://10.103.0.17:9000 > hadoop.tmp.dir = /tmp/hadoop-temp > hadoop.proxyuser.root.hosts = * > hadoop.proxyuser.root.groups = * > > -- > Best regards, -- Harsh J

Re: Wordcount Hadoop pipes C++ Running issue

2014-01-09 Thread Harsh J
apred.JobClient: map 0% reduce 0% > INFO mapred.JobClient: Job complete: job_local2050700100_0001 > INFO mapred.JobClient: Counters: 0 > INFO mapred.JobClient: Job Failed: NA > Exception in thread "main" java.io.IOException: Job failed! > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357) > at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:248) > at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:479) > at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:494) > > I have tried with the Hadoop version: > > 0.19.2 > 1.2.1 > 2.2.0 -- Harsh J

Re: Error in starting DFS - message says JAVA_HOME not set though the workstation has working java

2014-01-09 Thread Harsh J
ersion : 2.1.1-beta > Java version:java-7-openjdk-amd64 > > Thanks. -- Harsh J

Re: Hadoop C++ HDFS test running Exception

2014-01-13 Thread Harsh J
op/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar > > > Any help is appreciated.. thanks -- Harsh J

Re: How to configure multiple reduce jobs in hadoop 2.2.0

2014-01-15 Thread Harsh J
on it. I have 8 > data blocks spread across all the 3 nodes. While running map reduce job I > could see 8 map tasks running however reduce job is only 1. Is there a way > to configure multiple reduce jobs? > > --Ashish -- Harsh J

Re: How to start historyserver on all nodes automatically?

2014-01-21 Thread Harsh J
nt to track task logs on hdfs. I need > to start historyserver vie mr-jobhistory-daemon.sh start historyserver on > all nodes. Is there any way to run historyserver automatically when yarn > starts? -- Harsh J

Re: How to learn hadoop follow Tom White

2014-01-21 Thread Harsh J
the sample code is > under 0.20, should I learn and exercise it under Hadoop 1.0 version? I have > installed Hadoop 2.2 which is another branch. > > thanks, > > Xiaoguang -- Harsh J

Re: org.apache.hadoop.ipc.StandbyException occurs at the thirty of per hour in standby NN

2014-01-24 Thread Harsh J
s anyone know what the problem is ? > > > > > > > > Thanks, > > Francis.Hu -- Harsh J

Re: hdfs fsck -locations

2014-01-24 Thread Harsh J
Hi Mark, Yes, the locations are shown as IP. On Fri, Jan 24, 2014 at 12:09 AM, Mark Kerzner wrote: > Hi, > > hdfs fsck -locations > > is supposed to show every block with its location? Is location the ip of the > datanode? > > Thank you, > Mark -- Harsh J

Re: hdfs fsck -locations

2014-01-24 Thread Harsh J
f racks: 1 > FSCK ended at Fri Jan 24 07:45:24 CST 2014 in 0 milliseconds > > > > On Fri, Jan 24, 2014 at 4:34 AM, Harsh J wrote: > >> Hi Mark, >> >> Yes, the locations are shown as IP. >> >> On Fri, Jan 24, 2014 at 12:09 AM, Mark Kerzner >> wrote:

Re: hdfs fsck -locations

2014-01-24 Thread Harsh J
>> Total blocks (validated):1 (avg. block size 7217 B) >> >> Minimally replicated blocks: 1 (100.0 %) >> >> Over-replicated blocks: 0 (0.0 %) >> >> Under-replicated blocks:0 (0.0 %) >> >> Mis-replicated blocks:

Re: What is the fix for this error ?

2014-01-24 Thread Harsh J
ymellon.com/eu.htm for certain > disclosures relating to European legal entities. -- Harsh J

Re: Datanode Shutting down automatically

2014-01-24 Thread Harsh J
dfs.server.datanode.DataNode.secureMain(DataNode.java:1795) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812) > > 2014-01-24 17:26:28,860 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down DataNode at user/127.0.1.1 > / > > > Shutdown command is initialised automatically. > Please take a look and respond with a solution. -- Harsh J

Re: HDFS data transfer is faster than SCP based transfer?

2014-01-25 Thread Harsh J
they do lot of file transfer. Now all these transfers are > replaced with HDFS copy. > > Can anyone tell me HDFS transfer is faster as I witnessed? Is it because, it > uses TCP/IP? Can anyone give me reasonable reasons to support the decrease > of time? > > > with thanks and regards > rab -- Harsh J

Re: Memory problems with BytesWritable and huge binary files

2014-01-25 Thread Harsh J
>>> >>> >>> >>> -- >>> Adam Retter >>> >>> skype: adam.retter >>> tweet: adamretter >>> http://www.adamretter.org.uk >> >> >> -- >> CONFIDENTIALITY NOTICE >> NOTICE: This message is intended for the use of the individual or entity to >> which it is addressed and may contain information that is confidential, >> privileged and exempt from disclosure under applicable law. If the reader >> of this message is not the intended recipient, you are hereby notified that >> any printing, copying, dissemination, distribution, disclosure or >> forwarding of this communication is strictly prohibited. If you have >> received this communication in error, please contact the sender immediately >> and delete it from your system. Thank You. > > > > -- > Adam Retter > > skype: adam.retter > tweet: adamretter > http://www.adamretter.org.uk -- Harsh J

Re: HDFS open file limit

2014-01-27 Thread Harsh J
Hi John, There is a concurrent connections limit on the DNs that's set to a default of 4k max parallel threaded connections for reading or writing blocks. This is also expandable via configuration but usually the default value suffices even for pretty large operations given the replicas help sprea

Re: BlockMissingException reading HDFS file, but the block exists and fsck shows OK

2014-01-27 Thread Harsh J
Can you check the log of the DN that is holding the specific block for any errors? On Jan 27, 2014 8:37 PM, "John Lilley" wrote: > I am getting this perplexing error. Our YARN application launches tasks > that attempt to simultaneously open a large number of files for merge. > There seems to be

Re: memory management module of Namenode

2014-01-28 Thread Harsh J
etadata record in main memory. where can i find the source code file > for this namenode memory management? i am using github and have > 'hadoop-common' repository. -- Harsh J

Re: performance of "hadoop fs -put"

2014-01-28 Thread Harsh J
Are you calling one command per file? That's bound to be slow as it invokes a new JVM each time. On Jan 29, 2014 7:15 AM, "Jay Vyas" wrote: > Im finding that "hadoop fs -put" on a cluster is quite slow for me when i > have large amounts of small files... much slower than native file ops. > Note t
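Two cheap fixes follow from that: pass many files to one invocation (a single hadoop fs -put with a directory or glob argument), or keep one JVM and one FileSystem client alive and loop. A minimal sketch of the latter, with illustrative paths:

    import java.io.File;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BulkPut {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path dst = new Path("/data/in");
            // One client in one JVM: no per-file JVM startup cost.
            for (File f : new File("/local/dir").listFiles()) {
                fs.copyFromLocalFile(new Path(f.getAbsolutePath()), dst);
            }
        }
    }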

Re: Force one mapper per machine (not core)?

2014-01-29 Thread Harsh J
t to be > self-contented is to be vile and ignorant, and that to aspire is better than > to > be blindly and impotently happy." >-- Edwin A. Abbott, Flatland > > -- Harsh J

Re: Configuring hadoop 2.2.0

2014-01-29 Thread Harsh J
ine > to be the "job tracker"? Did this job tracker node change its name to > something else in the current docs? > > Thanks, > Ognen -- Harsh J

Re: Capture Directory Context in Hadoop Mapper

2014-01-29 Thread Harsh J
irectory-context-in-hadoop-mapper/ > http://www.idryman.org/blog/2014/01/27/capture-path-info-in-hadoop-inputformat-class/ > > Felix -- Harsh J

Re: Force one mapper per machine (not core)?

2014-01-31 Thread Harsh J
ot absolutely certain. I know that the cluster "behaves" > like the MR1 clusters I've worked with for years (I interact with the job > tracker in a classical way for example). Can I tell whether it's MR1 or > MR2 from the job tracker or namenode web UIs? > > Th

Re: Backup NameNode metadata in HA configuration

2014-02-03 Thread Harsh J
t; www.griddynamics.com >> itretya...@griddynamics.com > > > > > -- > Best Regards > Ivan Tretyakov > > Deployment Engineer > Grid Dynamics > +7 812 640 38 76 > Skype: ivan.v.tretyakov > www.griddynamics.com > itretya...@griddynamics.com -- Harsh J

Re: Backup NameNode metadata in HA configuration

2014-02-03 Thread Harsh J
N data directory (dfs.name.dir, > dfs.namenode.name.dir) is appropriate or not? > > > On Tue, Feb 4, 2014 at 6:00 AM, Harsh J wrote: >> >> Hi, >> >> You'll ideally just need the latest >> fsimage file, which includes the whole check pointed namespace, so &

Re: what's counterpart for org.apache.hadoop.mapred.TaskTrackerMetricsInst in YARN

2014-02-04 Thread Harsh J
t it to run as NodeManager > starts up, so how can I make this happen? > > -- > --Anfernee -- Harsh J

Re: DistCP : Is it gauranteed to work for any two uri schemes?

2014-02-04 Thread Harsh J
t; sytems. > > I've havent found many examples online with different URI schemes. > > With emerging HDFS alternatives, I'd be interested in ways to otimize IO > between different filesystems using distcp. > > -- > Jay Vyas > http://jayunit100.blogspot.com -- Harsh J

Re: Where to find a list of most Hadoop2 config file parameters?

2014-02-05 Thread Harsh J
p-cluster-installation-using-cloudera-manager-and-cloudera-parcels/ > > http://shalishvj.wordpress.com/2014/02/04/a-hadoop-cluster-home-using-centos-6-5-cloudera-manager-4-8-1-and-cloudera-parcels-episode-2/ > > Thanks > Shalish. > > > > > -- Harsh J

Re: HDFS: file is not distributed after upload

2014-02-07 Thread Harsh J
.79 GB) > DFS Used%: 0% > DFS Remaining%: 58.51% > Last contact: Fri Feb 07 12:10:27 MSK 2014 > > > Name: 10.10.1.12:50010 > Decommission Status : Normal > Configured Capacity: 317070499840 (295.29 GB) > DFS Used: 24576 (24 KB) > Non DFS Used: 84642578432 (78.83 GB) > DFS Remaining: 232427896832(216.47 GB) > DFS Used%: 0% > DFS Remaining%: 73.3% > Last contact: Fri Feb 07 12:10:27 MSK 2014 > > > Best, > Alex > -- Harsh J

Re: Can we avoid restarting of AM when it fails?

2014-02-08 Thread Harsh J
en it fails it is again started with _02 . Is > there a way for me to avoid the second instance of the Application Master > getting started? Is it re-started automatically by the RM after the first > one failed? > > Thanks, > Kishore -- Harsh J

Re: Can we avoid restarting of AM when it fails?

2014-02-08 Thread Harsh J
Correction: Set it to 1 (For 1 max attempt), not 0. On Sat, Feb 8, 2014 at 7:31 PM, Harsh J wrote: > You can set > http://hadoop.apache.org/docs/current/api/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.html#setMaxAppAttempts(int) > to 0, at a per-app level, to pr
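A minimal sketch of that call when submitting through the YARN client API (for plain MapReduce jobs the per-job knob is the yarn.app.mapreduce.am.max-attempts property instead):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;

    public class SingleAttemptApp {
        public static void main(String[] args) throws Exception {
            YarnClient yarnClient = YarnClient.createYarnClient();
            yarnClient.init(new Configuration());
            yarnClient.start();
            YarnClientApplication app = yarnClient.createApplication();
            ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
            ctx.setMaxAppAttempts(1); // 1 attempt: the RM will not relaunch a failed AM
            // ... fill in the container launch context, then yarnClient.submitApplication(ctx)
        }
    }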

Re: A hadoop command to determine the replication factor of a hdfs file ?

2014-02-08 Thread Harsh J
ermine the replication factor of a hdfs file > ? Please advise. > > I know that "fs setrep" only changes the replication factor. > > Regards, > Raj -- Harsh J
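Besides setrep, the current factor is visible in the second column of hadoop fs -ls output, in hdfs fsck <path> -files, and programmatically through FileStatus. A minimal sketch of the programmatic form, with an illustrative path:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowReplication {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus st = fs.getFileStatus(new Path("/user/raj/file.txt"));
            System.out.println("replication = " + st.getReplication());
        }
    }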

Re: Facing problem while emitting GenericData$Record

2014-02-08 Thread Harsh J
at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66) > at > org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143) > at > org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114) > at > org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:175) > at > org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66) > at > org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:143) > at > org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58) > at > org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:290) > > > > Thanks & Regards, > B Anil Kumar. -- Harsh J

Re: YARN FSDownload: How did Mr1 do it ?

2014-02-11 Thread Harsh J
saw that in MR1. > > How did MR1 JobTrackers handle resource localization differently than MR2 > App Masters? > > -- > Jay Vyas > http://jayunit100.blogspot.com -- Harsh J

Re: How view Tracking UI under hadoop-2.2.0

2014-02-14 Thread Harsh J
ted, and may be unlawful. If you have received this > communication in error, please > immediately notify the sender by return e-mail, and delete the original > message and all copies from > your system. Thank you. > --- -- Harsh J

Re: How to ascertain why LinuxContainer dies?

2014-02-14 Thread Harsh J
at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >   at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >   at java.lang.Thread.run(Thread.java:662) > > where can i find the root cause of the non-zero exit code ? > > -- > Jay Vyas > http://jayunit100.blogspot.com -- Harsh J

Re: What if file format is dependent upon first few lines?

2014-02-27 Thread Harsh J
ond split would lose the "file format" information. > > How could each mapper get the first few lines in the file? -- Harsh J

Re: CapacityScheduler and FairScheduler

2014-02-27 Thread Harsh J
Carlos wrote: > I'm reading about them and it looks CapacityScheduler as a particular > configuration of Fair Scheduler (setting FIFO as scheduler in each defined > queue). Can I understand Capacity scheduler in that way or I'm missing > something? > > Regards. -- Harsh J

Re: Multiple inputs for different avro inputs

2014-02-27 Thread Harsh J
.avro.AvroTypeException: Found Event, expecting union > > How to fix this issue? > > One more doubt: Why we don't have AvroMultipleInputs just like > AvroMultipleOutputs? Any reason? > > Thanks & Regards, > B Anil Kumar. -- Harsh J

Re: Question on DFS Balancing

2014-03-04 Thread Harsh J
ultiple disks on single datanode? > > Thanks > Divye Sheth -- Harsh J

Re: Question on DFS Balancing

2014-03-05 Thread Harsh J
n Wed, Mar 5, 2014 at 11:28 AM, Harsh J wrote: >> >> You're probably looking for >> https://issues.apache.org/jira/browse/HDFS-1804 >> >> On Tue, Mar 4, 2014 at 5:54 AM, divye sheth wrote: >> > Hi, >> > >> > I am new to the mailing list. >

Re: HDFS java client vs the Command Line

2014-03-06 Thread Harsh J
ieces that this is an issue with the OSX > "feature" of case insensitivity with its file system. Can anyone confirm > this? If so, can anyone advise as to a workaround? > > Such a simple thing to get hung up on, go figure. > > Thanks > > -- > There are ways and there are ways, > > Geoffry Roberts -- Harsh J

Re: HDFS java client vs the Command Line

2014-03-06 Thread Harsh J
_PATH was close but not correct. Fixed it and all is well. > > > Thanks > > > > > On Thu, Mar 6, 2014 at 1:05 PM, Harsh J wrote: >> >> I've never faced an issue trying to run hadoop and related programs on >> my OSX. What is your error exactly? >> &g

Re: MapReduce: How to output multiplt Avro files?

2014-03-06 Thread Harsh J
sController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) > > I have no idea what this means. -- Harsh J

Re: Running a Job in a Local Job Runner:Windows 7 64-bit

2014-03-06 Thread Harsh J
for running > Map- > Reduce jobs in a single JVM. It's designed for testing, and is very > convenient for use in > an IDE, since you can run it in a debugger to step through the code in your > mapper and > reducer. > > Do I also need to install Hadoop locally on Windows for that? > > Thanks, > -RR -- Harsh J
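No local Hadoop installation (daemons, cluster) is required for the local job runner; the Hadoop jars on the classpath suffice, though on Windows a matching winutils.exe is commonly needed as well. A minimal sketch of forcing local mode from driver code, assuming Hadoop 2 property names:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class LocalRun {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("mapreduce.framework.name", "local"); // use the LocalJobRunner
            conf.set("fs.defaultFS", "file:///");          // local FS instead of HDFS
            Job job = Job.getInstance(conf, "debug-in-ide");
            // ... configure mapper/reducer and paths, then job.waitForCompletion(true)
        }
    }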

Re: Hadoop2.x reading data

2014-03-11 Thread Harsh J
the > file in pig and it shows as follows while reading but file has multiple > records and also weird thing is if I dump the variable its shows the pig > tuples, > > Successfully read 0 records from: "/tmp/sample.txt" > > Any reason? > > -- > Regards, > Viswa.J -- Harsh J

Re: question about yarn webapp

2014-03-19 Thread Harsh J
ver Stacks from web ui. But I don't know which code handle > the function, how can the web app get the stacks information from jvm? -- Harsh J

Re: The reduce copier failed

2014-03-19 Thread Harsh J
worried about that > message? > > Regards, > Mahmood -- Harsh J

Re: The reduce copier failed

2014-03-20 Thread Harsh J
ava.lang.reflect.Method.invoke(Method.java:601) > at org.apache.hadoop.util.RunJar.main(RunJar.java:160) > > > Regards, > Mahmood > > > On Thursday, March 20, 2014 3:41 AM, Harsh J wrote: > While it does mean a retry, if the job eventually fails (after finite > retr

Re: Hadoop upgrade, what happend on Datanode side and how long can it takes?

2014-03-21 Thread Harsh J
upgrade the "big" cluster? > > Greats, > > Norbert -- Harsh J

Re: Benchmark Failure

2014-03-22 Thread Harsh J
INFO hdfs.NNBench: Replication >>> factor: 3 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: Successful file >>> operations: 0 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: >>> 14/03/17 23:56:18 INFO hdfs.NNBench: # maps that missed the >>> barrier: 11 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: # >>> exceptions: 1000 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: >>> 14/03/17 23:56:18 INFO hdfs.NNBench:TPS: >>> Create/Write/Close: 0 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: Avg exec time (ms): >>> Create/Write/Close: Infinity >>> 14/03/17 23:56:18 INFO hdfs.NNBench: Avg Lat (ms): >>> Create/Write: NaN >>> 14/03/17 23:56:18 INFO hdfs.NNBench:Avg Lat (ms): >>> Close: NaN >>> 14/03/17 23:56:18 INFO hdfs.NNBench: >>> 14/03/17 23:56:18 INFO hdfs.NNBench: RAW DATA: AL Total >>> #1: 0 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: RAW DATA: AL Total >>> #2: 0 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: RAW DATA: TPS Total >>> (ms): 1131 >>> 14/03/17 23:56:18 INFO hdfs.NNBench:RAW DATA: Longest Map Time >>> (ms): 1.395071776653E12 >>> 14/03/17 23:56:18 INFO hdfs.NNBench:RAW DATA: Late >>> maps: 11 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: RAW DATA: # of >>> exceptions: 1000 >>> 14/03/17 23:56:18 INFO hdfs.NNBench: >>> > -- Harsh J

Re: How to get locations of blocks programmatically?

2014-03-28 Thread Harsh J
for all > blocks in the path. > Is it possible to list all blocks and the block locations for a given path > programmatically? > Thanks, > > Libo -- Harsh J
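Yes, the client API exposes this directly. A minimal sketch with an illustrative path; each BlockLocation carries the block's offset, length, and the hosts of the datanodes storing its replicas:

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListBlockLocations {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus st = fs.getFileStatus(new Path("/user/libo/data.bin"));
            for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
                System.out.println("offset=" + b.getOffset()
                    + " len=" + b.getLength()
                    + " hosts=" + Arrays.toString(b.getHosts()));
            }
        }
    }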

Re: mapred job -list error

2014-03-28 Thread Harsh J
at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1237) > > when i executed the same command yesterday, it was ok. > Thanks for any help -- Harsh J

Re: How to run data node block scanner on data node in a cluster from a remote machine?

2014-03-28 Thread Harsh J
is to configure the property of > dfs.datanode.scan.period in hdfs-site.xml but is there any other way. > Is it possible to run data node block scanner on data node either through > command or programmatically. -- Harsh J

Re: Does hadoop depends on ecc memory to generate checksum for data stored in HDFS

2014-03-28 Thread Harsh J
s on ecc memory to generate checksum > for data stored in HDFS? > > -- Harsh J

Re: Hadoop Serialization mechanisms

2014-03-30 Thread Harsh J
Were there any changes w.r.t. Serialization from Hadoop 1.x to Hadoop >> 2.x? >> Will there be a significant performance gain if the default Serialization >> i.e. Writables is replaced with Avro, Protocol Buffers or Thrift in Map Reduce >> programming? >> >> >> Thanks, >> -RR > > > > > -- > Jay Vyas > http://jayunit100.blogspot.com -- Harsh J

Re: Why block sizes shown by 'fsck' and '-stat' are inconsistent?

2014-04-05 Thread Harsh J
onslusion: > The block size is 134217728 B shown by stat. > > Also, if I browser this file from http://namenode:50070, the file size of > /user/user1/filesize/derby.jar equals to 2.5 MB(2673375 B), however the > block size equals to 128 MB(134217728 B). > > Why block sizes shown by 'fsck' and '-stat' are inconsistent? > > > -- Harsh J

Re: Why block sizes shown by 'fsck' and '-stat' are inconsistent?

2014-04-05 Thread Harsh J
> As I mentioned HDFS use only what it needs on the local file system. For > example, a 16 KB hdfs file only use 16 KB local file system storage, not 64 > MB(its hdfs block size) storage. In this case, what's the use of the block > size(64 MB) of the 16 KB file? > > > 20

Re: customize containter-log4j.properties

2014-04-05 Thread Harsh J
ties() and cannot be modified: > http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-mapreduce-client-common/2.2.0/org/apache/hadoop/mapreduce/v2/util/MRApps.java#457 > > Do you have any idea on how to possibly customize this configuration file? > > Thanks > -- > Federico Baldo -- Harsh J

Re: MapReduce for complex key/value pairs?

2014-04-08 Thread Harsh J
r ngram. In other > words, the key would be the ngram but the value would be an integer (the > count) _and_ an array of document id's. > > Is this something that can be done? Any pointers would be appreciated. > > I am using Java, btw. > >Thank you, > >Natalia Connolly > -- Harsh J
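Yes, this works: a value may be any type implementing Writable (keys additionally need WritableComparable). A minimal sketch of a value carrying a count plus a document-id list; class and field names are illustrative:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.io.Writable;

    public class NgramStats implements Writable {
        private int count;
        private final List<String> docIds = new ArrayList<>();

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(count);
            out.writeInt(docIds.size());
            for (String id : docIds) out.writeUTF(id); // length-prefixed strings
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            count = in.readInt();
            docIds.clear(); // Writable instances are reused across records
            int n = in.readInt();
            for (int i = 0; i < n; i++) docIds.add(in.readUTF());
        }
    }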

Re: File requests to Namenode

2014-04-09 Thread Harsh J
You could look at metrics the NN publishes, or look at/process the HDFS audit log. On Wed, Apr 9, 2014 at 6:36 PM, Diwakar Sharma wrote: > How and where to check how many datanode block address requests a namenode > gets when running a map reduce job. > > - Diwakar -- Harsh J

Re: use setrep change number of file replicas,but not work

2014-04-09 Thread Harsh J
09:41:12 CST 2014 > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 17660 bytes, 1 > block(s): OK > 0. BP-1043055049-192.168.11.11-1382442676609:blk_6517693524032437780_8889786 > len=17660 repl=3 [192.168.11.12:50010, 192.168.11.15:50010, > 192.168.11.13:50010] -- Harsh J

Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on a larger jobs

2014-04-09 Thread Harsh J
FILE: Number of large read operations=0 > > FILE: Number of write operations=0 > > HDFS: Number of bytes read=21753912258 > > HDFS: Number of bytes written=0 > > HDFS: Number of read operations=486 > > HDFS: Number of large read operations=0 > > HDFS: Number of write operations=0 > > Job Counters > > Failed map tasks=4 > > Killed map tasks=10 > > Launched map tasks=176 > > Other local map tasks=3 > > Data-local map tasks=173 > > Total time spent by all maps in occupied slots (ms)=1035708 > > Total time spent by all reduces in occupied slots (ms)=0 > > Map-Reduce Framework > > Map input records=164217466 > > Map output records=0 > > Map output bytes=0 > > Map output materialized bytes=414720 > > Input split bytes=23490 > > Combine input records=0 > > Combine output records=0 > > Spilled Records=0 > > Failed Shuffles=0 > > Merged Map outputs=0 > > GC time elapsed (ms)=4750 > > CPU time spent (ms)=321980 > > Physical memory (bytes) snapshot=91335024640 > > Virtual memory (bytes) snapshot=229819834368 > > Total committed heap usage (bytes)=128240713728 > > File Input Format Counters > > Bytes Read=21753888768 > > 14/04/10 10:28:24 ERROR streaming.StreamJob: Job not Successful! > > Streaming Command Failed! > > > > > > Thanks and Regards, > > Truong Phan -- Harsh J

Re: not able to run map reduce job example on aws machine

2014-04-10 Thread Harsh J
cutor.java:908) >>> at java.lang.Thread.run(Thread.java:662) >>> >>> >>> I have everything configured with hdfs running where i am able to create >>> files and directories. running jps on my machine shows all components >>> running. >>> >>> 10290 NameNode >>> 10416 DataNode >>> 10738 ResourceManager >>> 11634 Jps >>> 10584 SecondaryNameNode >>> 10844 NodeManager >>> >>> >>> Any pointers will be appreciated. >>> >>> Thanks and Regards, >>> -Rahul Singh >> >> > -- Harsh J

Re: InputFormat and InputSplit - Network location name contains /:

2014-04-10 Thread Harsh J
che.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1231) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81) > at java.lang.Thread.run(Thread.java:662) > 2014-04-10 17:09:01,986 INFO [AsyncDispatcher event handler] > org.apache.hadoop. -- Harsh J

Re: Number of map task

2014-04-12 Thread Harsh J
its? > > I think the job will be done quicker if there are more Map tasks? > > Patcharee -- Harsh J

Re: Changing default scheduler in hadoop

2014-04-13 Thread Harsh J
er how can i do this? Remove the configuration override, and it will always go back to the default FIFO based scheduler, the same whose source has been linked above. > I am struggling since 4 months to get help on Apache Hadoop?? Are you unsure about this? -- Harsh J

Re: configure HBase

2014-04-24 Thread Harsh J
memory resident" in the hbase-env.sh, can you explain in > detail? > > Thanks for any inputs. -- Harsh J

Re: hdfs write partially

2014-04-27 Thread Harsh J
e-packet-size is 64K and it can't be > bigger than 16M. > > So if write bigger than 16M a time, how to make sure it doesn't write > partially ? > > > > Does anyone knows how to fix this? > > > > Thanks a lot. > > > > -- > > Ken Huang > -- Harsh J

Re: copyFromLocal: unexpected URISyntaxException

2014-04-28 Thread Harsh J
eption" when I try to copy this file to Hadoop. See below. >> >> [patcharee@compute-1-0 ~]$ hadoop fs -copyFromLocal >> wrfout_d01_2001-01-01_00:00:00 netcdf_data/ >> copyFromLocal: unexpected URISyntaxException >> >> I am using Hadoop 2.2.0. >> >> Any suggestions? >> >> Patcharee >> > > > > -- > Nitin Pawar > > -- Harsh J

Re: Re: hdfs write partially

2014-04-28 Thread Harsh J
> > Thanks a lot for replying. > > > > Regards, > > Ken Huang > > > > From: user-return-15182-tdhkx=126@hadoop.apache.org > [mailto:user-return-15182-tdhkx=126@hadoop.apache.org] On behalf of Harsh J > Sent: 2014-04-28 13:30 > To: > Subject: Re: hdfs wri

Re: For QJM HA solution, after failover, application must update NameNode IP?

2014-04-30 Thread Harsh J
070 >> >> >> >> dfs.client.failover.proxy.provider.gilbert-prod >> >> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider >> >> >> >> On Tue, Apr 29, 2014 at 9:07 AM, sam liu wrote: >>> >>> Hi Experts, >>> >>> For example, at the beginning, the application will access NameNode using >>> IP of active NameNode(IP: 9.123.22.1). However, after failover, the IP of >>> active NameNode is changed to 9.123.22.2 which is the IP of previous standby >>> NameNode. In this case, application must update NameNode IP? >>> >>> Thanks! >> >> > -- Harsh J

Re: Realtime sensor's tcpip data to hadoop

2014-05-11 Thread Harsh J
s. > > Secondly, if the total network traffic from sensors are over the limit of > one lan port, how to share the loads, is there any component in hadoop to > make this done automatically. > > Any suggestions, thanks. -- Harsh J

Re: Running the job history server locally

2014-05-16 Thread Harsh J
You can run it via the script "mr-jobhistory-daemon.sh start historyserver", or in foreground via "mapred historyserver". On Sat, May 17, 2014 at 5:18 AM, Software Dev wrote: > How does one run the job history server? > > I am using the latest Hadoop from Hombrew on OSX -- Harsh J

Re: fuse-dfs on hadoop-2.2.0

2014-05-17 Thread Harsh J
> configured "hadoop.fuse.timer.period" and "hadoop.fuse.connection.timeout" > in $HADOOP_HOME/etc/hadoop/hdfs-site.xml. > > Can anyone share some hints on how to fix this? How can I let fuse-dfs > correctly load the configuration? > > Thanks in advance! > Cheng -- Harsh J
