Re: How can I find out which nodemanagers are unhealthy and which nodemanagers are lost?

2018-10-16 Thread Harsh J
> > What are the possible values for the state in the LiveNodeManagers bean? Will > LOST, ACTIVE, REBOOTED and DECOMMISSIONED show up in the state field? > > ________ > From: Harsh J > Sent: 2018-10-15 12:46:49 > To: ims...@outlook.com > CC: > Subject

Re: How can I find out which nodemanagers are unhealthy and which nodemanagers are lost?

2018-10-14 Thread Harsh J
> NumUnhealthyNMs Current number of unhealthy NodeManagers > NumRebootedNMs Current number of rebooted NodeManagers > > > How can I find out which nodemanagers are unhealthy and which are lost? Better > if it could be achieved by calling the JMX REST API or h
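
A minimal Java sketch of the JMX approach asked about above: it dumps the ResourceManager's RMNMInfo bean, whose LiveNodeManagers attribute is a JSON array carrying each NodeManager's state (e.g. RUNNING, UNHEALTHY, LOST) and health details. The ResourceManager address "rm-host:8088" is a placeholder.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Dumps the ResourceManager's RMNMInfo JMX bean as JSON; its
// "LiveNodeManagers" attribute lists every known NodeManager with its
// State and health report. "rm-host:8088" is a placeholder address.
public class ListNodeManagerStates {
  public static void main(String[] args) throws Exception {
    URL url = new URL(
        "http://rm-host:8088/jmx?qry=Hadoop:service=ResourceManager,name=RMNMInfo");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // inspect the LiveNodeManagers JSON per node
      }
    } finally {
      conn.disconnect();
    }
  }
}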

Re: ZKFC ActiveBreadCrumb Value

2018-09-15 Thread Harsh J
ally so that's > why I gave the zkCli command as an example. I'm using this Go Lib > (github.com/samuel/go-zookeeper/zk) and I get the same result. > > On Fri, 14 Sep 2018 at 02:22 Harsh J wrote: >> >> The value you are looking at directly in ZooKeeper is in

Re: ZKFC ActiveBreadCrumb Value

2018-09-13 Thread Harsh J
r sign): > cluster3active-nn3$active-nn3.example.com (followed by non-printable binary bytes) > > How can I effectively write generic code, deployed on different HDFS > clusters, to find out which is the active NN by querying ZK? > > Or am I doing something wrong? Is the behavior above expected

Re: [EXTERNAL] Yarn : Limiting users to list only his applications

2018-03-19 Thread Harsh J
You are likely looking for the feature provided by YARN-7157. This will work if you have YARN ACLs enabled. On Tue, Mar 20, 2018 at 3:37 AM Benoy Antony wrote: > Thanks Christopher. > > > On Mon, Mar 19, 2018 at 2:23 PM, Christopher Weller < > christopher.wel...@gm.com> wrote: > >> Hi, >> >> >>

Re: How to print values in console while running MapReduce application

2017-10-08 Thread Harsh J
Consider running your job in the local mode (set config ' mapreduce.framework.name' to 'local'). Otherwise, rely on the log viewer from the (Job) History Server to check the console prints in each task (under the stdout or stderr sections). On Thu, 5 Oct 2017 at 05:15 Tanvir Rahman wrote: > Hell
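
A minimal sketch of the local-mode setup mentioned above; the job name is arbitrary and pointing fs.defaultFS at the local filesystem is optional.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Builds a job that runs in local mode, so System.out/System.err from
// map() and reduce() appear directly in the launching console.
public class LocalModeJob {
  public static Job build() throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapreduce.framework.name", "local"); // run tasks in-process
    conf.set("fs.defaultFS", "file:///");          // optional: read/write the local FS
    return Job.getInstance(conf, "local-debug-run");
  }
}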

Re: Is Hadoop validating the checksum when reading only a part of a file?

2017-09-19 Thread Harsh J
Yes, checksum match is checked for every form of read (unless explicitly disabled). By default, a checksum is generated and stored for every 512 bytes of data (io.bytes.per.checksum), so only the relevant parts are checked vs. the whole file when doing a partial read. On Mon, 18 Sep 2017 at 19:23
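
A minimal Java sketch of such a partial read; the path and offsets are hypothetical. Only the 512-byte checksum chunks that overlap the requested range are verified.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Positioned read of 4 KB starting at a 1 MB offset; HDFS verifies the
// checksums only for the chunks covering this range, not the whole file.
public class PartialRead {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    byte[] buf = new byte[4096];
    try (FSDataInputStream in = fs.open(new Path("/data/large.bin"))) {
      in.readFully(1024L * 1024L, buf);
    }
    System.out.println("Read " + buf.length + " bytes with checksum verification");
  }
}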

Re: Forcing a file to update its length

2017-08-09 Thread Harsh J
> > > > *From:* Harsh J [mailto:ha...@cloudera.com] > *Sent:* Wednesday, August 9, 2017 3:01 PM > *To:* David Robison ; user@hadoop.apache.org > *Subject:* Re: Forcing a file to update its length > > > > I don't think it'

Re: Forcing a file to update its length

2017-08-09 Thread Harsh J
I don't think it'd be safe for a reader to force an update of length at the replica locations directly. Only the writer would be perfectly aware of the DNs in use for the replicas and their states, and the precise count of bytes entirely flushed out of the local buffer. Thereby only the writer is i
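
A minimal writer-side sketch in Java, assuming the output stream is an HDFS stream and the path is hypothetical: the writer flushes its buffered bytes and asks the NameNode to publish the new length so readers can see it.

import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream.SyncFlag;

// The writer persists its buffered bytes and updates the file length
// visible at the NameNode; only the writer can do this reliably.
public class WriterSideLengthUpdate {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataOutputStream out = fs.create(new Path("/logs/app.log"))) {
      out.write("a record\n".getBytes("UTF-8"));
      ((HdfsDataOutputStream) out).hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH));
    }
  }
}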

Re: Directed hdfs block reads

2017-07-23 Thread Harsh J
There isn't an API way to hint/select DNs to read from currently - you may need to do manual changes (contribution of such a feature is welcome, please file a JIRA to submit a proposal). You can perhaps hook your control of which replica location for a given block is selected by the reader under t

Re: GARBAGE COLLECTOR

2017-06-19 Thread Harsh J
You can certainly configure it this way without any ill effects, but note that MR job tasks are typically short lived and GC isn't really a big issue for most of what it does. On Mon, 19 Jun 2017 at 14:20 Sidharth Kumar wrote: > Hi Team, > > How feasible will it be, if I configure CMS Garbage co
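
A minimal sketch of wiring CMS into the task JVMs from the job configuration; the heap sizes and flags are illustrative only, not recommendations.

import org.apache.hadoop.conf.Configuration;

// Passes CMS GC flags to the map and reduce task JVMs of a job.
public class CmsTaskOpts {
  public static Configuration withCms(Configuration conf) {
    conf.set("mapreduce.map.java.opts",
        "-Xmx1024m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC");
    conf.set("mapreduce.reduce.java.opts",
        "-Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC");
    return conf;
  }
}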

Re: HDFS - How to delete orphaned blocks

2017-03-24 Thread Harsh J
The rate of deletion of DN blocks is throttled via dfs.namenode.invalidate.work.pct.per.iteration (documented at https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml#dfs.namenode.invalidate.work.pct.per.iteration). If your problem is the rate and your usage is suc

Re: No edits files in dfs.namenode.edits.dir

2016-05-22 Thread Harsh J
Are you absolutely certain you are looking at the right directory? The NameNode is designed to crash if it cannot persist edits (transactions) durably. The "hdfs getconf" utility checks local command classpaths but your service may be running over a different configuration directory. If you have a

Re: How to configure detection of failed NodeManagers sooner?

2016-02-14 Thread Harsh J
You're looking for the property "yarn.nm.liveness-monitor.expiry-interval-ms", whose default is 600000ms (10m). This is to be set in the ResourceManager(s)' yarn-site.xml. (X-Ref: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml#yarn.nm.liveness-monitor.expiry

Re: hdfs : Change supergroup

2016-02-08 Thread Harsh J
Changing the supergroup configuration would not affect the existing group maintained on the file inodes (it's persisted since the beginning, not pulled dynamically from config at every restart). You will need to manually fs -chgrp those. On Mon, Feb 8, 2016 at 10:15 PM Francis Dupin wrote: > Dear
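
A minimal Java sketch of the follow-up chgrp, equivalent to `hadoop fs -chgrp -R`; the root path and group name are placeholders, and it needs to run as an HDFS superuser.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Recursively re-groups existing inodes after changing
// dfs.permissions.superusergroup.
public class ChangeGroupRecursively {
  static void chgrp(FileSystem fs, Path p, String group) throws Exception {
    fs.setOwner(p, null, group);            // null user: change only the group
    FileStatus st = fs.getFileStatus(p);
    if (st.isDirectory()) {
      for (FileStatus child : fs.listStatus(p)) {
        chgrp(fs, child.getPath(), group);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    chgrp(fs, new Path("/"), "newsupergroup");
  }
}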

Re: How does this work

2015-12-24 Thread Harsh J
Hue and Beeline access your warehouse data and metadata via the HiveServer2 APIs. The HiveServer2 service runs as the 'hive' user. On Wed, Dec 23, 2015 at 9:42 PM Kumar Jayapal wrote: > Hi, > > My environment has Kerbros and Senry for authentication and authorisation. > > we have the following

Re: Utilizing snapshotdiff for distcp

2015-12-11 Thread Harsh J
You need to pass the -diff option (works only when -update is active). The newer snapshot name can also be "." to indicate the current view. On Sat, Dec 12, 2015 at 12:53 AM Nicolas Seritti wrote: > Hello all, > > It looks like HDFS-8828 implemented a way to utilize the snapshotdiff > output t

Re: nodemanager listen on 0.0.0.0

2015-12-08 Thread Harsh J
> On Tue, Dec 8, 2015 at 9:56 PM, Harsh J wrote: > >> Hello, >> >> Could you file a JIRA for this please? Currently the ShuffleHandler will >> always bind to wildcard address due to the code being that way (in both >> branch-2 and trunk presently: >> htt

Re: nodemanager listen on 0.0.0.0

2015-12-08 Thread Harsh J
Hello, Could you file a JIRA for this please? Currently the ShuffleHandler will always bind to wildcard address due to the code being that way (in both branch-2 and trunk presently: https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduc

Re: lzo error while running mr job

2015-10-27 Thread Harsh J
dump will carry the source of all properties along with their value. On Tue, Oct 27, 2015 at 8:52 PM Kiru Pakkirisamy wrote: > Harish, > > We don't have lzo in the io.compression.codecs list. > > That is what is puzzling me. > > Regards, > > Kiru >

Re: lzo error while running mr job

2015-10-26 Thread Harsh J
Every codec in the io.compression.codecs list of classes will be initialised, regardless of actual further use. Since the Lzo*Codec classes require the native library to initialise, the failure is therefore expected. On Tue, Oct 27, 2015 at 11:42 AM Kiru Pakkirisamy wrote: > I am seeing a weird
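
A minimal sketch of overriding the codec list on the client side so the Lzo*Codec classes are never initialised; the exact list is illustrative, keep whichever codecs your jobs actually read or write.

import org.apache.hadoop.conf.Configuration;

// Pins io.compression.codecs to codecs that need no native LZO library,
// overriding whatever a cluster-side *-site.xml injects.
public class CodecsWithoutLzo {
  public static Configuration strip(Configuration conf) {
    conf.set("io.compression.codecs",
        "org.apache.hadoop.io.compress.DefaultCodec,"
      + "org.apache.hadoop.io.compress.GzipCodec,"
      + "org.apache.hadoop.io.compress.BZip2Codec,"
      + "org.apache.hadoop.io.compress.SnappyCodec");
    return conf;
  }
}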

Re: Concurrency control

2015-10-01 Thread Harsh J
If all your Apps are MR, then what you are looking for is MAPREDUCE-5583 (it can be set per-job). On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch wrote: > Hi Naga, > > Like most of the app-level configurations, admin can configure the > defaults which user may want override at application level. > > If
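
A minimal sketch, assuming the per-job properties introduced by MAPREDUCE-5583 (mapreduce.job.running.map.limit and mapreduce.job.running.reduce.limit) are available in your Hadoop version.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Caps how many of this job's tasks run concurrently, per job rather than
// via queue settings; the limits below are illustrative.
public class CappedJob {
  public static Job build() throws Exception {
    Configuration conf = new Configuration();
    conf.setInt("mapreduce.job.running.map.limit", 20);    // at most 20 maps at once
    conf.setInt("mapreduce.job.running.reduce.limit", 5);  // at most 5 reduces at once
    return Job.getInstance(conf, "capped-job");
  }
}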

Re: Who is Responsible for Handling DFS Write Pipeline Failure

2015-09-07 Thread Harsh J
These 2-part blog posts from Yongjun should help you understand the HDFS file write recovery process better: http://blog.cloudera.com/blog/2015/02/understanding-hdfs-recovery-processes-part-1/ and http://blog.cloudera.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/ On Mon, Sep 7, 2

Re: Performance impact of curl command on linux server

2015-09-05 Thread Harsh J
That depends - what resource did it consume, concerning your admins? CPU? Were you uploading using the chunked technique? On Fri, Sep 4, 2015 at 9:06 PM Shashi Vishwakarma wrote: > Hi > > I have been using curl command for ingesting data into HDFS using WebHDFS. > Admins are saying that resourc
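
A minimal Java sketch of the chunked upload technique referred to above; the NameNode address, user and paths are placeholders, and error handling is omitted.

import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Two-step WebHDFS CREATE with chunked streaming, so neither curl nor this
// client needs to buffer the whole file in memory before sending it.
public class WebHdfsChunkedPut {
  public static void main(String[] args) throws Exception {
    // Step 1: ask the NameNode where to write; it replies with a 307 redirect.
    URL createUrl = new URL("http://namenode:50070/webhdfs/v1/tmp/data.bin"
        + "?op=CREATE&user.name=hdfs&overwrite=true");
    HttpURLConnection nn = (HttpURLConnection) createUrl.openConnection();
    nn.setRequestMethod("PUT");
    nn.setInstanceFollowRedirects(false);
    nn.getResponseCode();                        // expect 307 Temporary Redirect
    String dataNodeUrl = nn.getHeaderField("Location");
    nn.disconnect();

    // Step 2: stream the data to the DataNode using chunked transfer encoding.
    HttpURLConnection dn = (HttpURLConnection) new URL(dataNodeUrl).openConnection();
    dn.setRequestMethod("PUT");
    dn.setDoOutput(true);
    dn.setChunkedStreamingMode(64 * 1024);       // constant memory footprint
    try (InputStream in = new FileInputStream("/local/path/data.bin");
         OutputStream out = dn.getOutputStream()) {
      byte[] buf = new byte[64 * 1024];
      int n;
      while ((n = in.read(buf)) > 0) {
        out.write(buf, 0, n);
      }
    }
    System.out.println("DataNode response: " + dn.getResponseCode()); // expect 201
  }
}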

Re: MultithreadedMapper - Sharing Data Structure

2015-08-24 Thread Harsh J
f sharing > the same structure across multiple Map Tasks. Multithreaded Map task does > that partially, as within the multiple threads, same copy is used. > > > Depending upon the hardware availability, one can get the same performance. > > Thanks, > > > On Mon, Aug 24, 20

Re: MultithreadedMapper - Sharing Data Structure

2015-08-24 Thread Harsh J
The MultithreadedMapper won't solve your problem, as all it does is run parallel maps within the same map task JVM as a non-MT one. Your data structure won't be shared across the different map task JVMs on the host, but just within the map task's own multiple threads running the map() function ove
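
A minimal sketch of the MultithreadedMapper wiring, for the case where sharing within one task JVM is enough; MyMapper is a hypothetical mapper class whose map() must be thread-safe.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

// Runs map() calls on several threads inside one map task JVM; a structure
// loaded once in the task is then reused by those threads only, not by
// other map task JVMs on the same host.
public class MultithreadedSetup {
  public static class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    // map() must be thread-safe when used under MultithreadedMapper
  }

  public static Job build() throws Exception {
    Job job = Job.getInstance(new Configuration(), "mt-map");
    job.setMapperClass(MultithreadedMapper.class);
    MultithreadedMapper.setMapperClass(job, MyMapper.class);
    MultithreadedMapper.setNumberOfThreads(job, 8);
    return job;
  }
}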

Re: Sorting the inputSplits

2015-07-30 Thread Harsh J
If you meant 'scheduled' first, perhaps that's doable by following (almost) what Gera says. The framework actually explicitly sorts your InputSplits list by its reported lengths, which would serve as the hack point for inducing a reordering. See https://github.com/apache/hadoop-common/blob/trunk/hado

Re: dfs.permissions.superusergroup not working

2015-07-26 Thread Harsh J
> > > > > > Looks like it accepts only one group as value. If that’s not true, please > advise me what would have gone wrong > > > > Thanks, > > Venkat > > > > > > *From:* Harsh J [mailto:ha...@cloudera.com] > *Sent:* Friday, J

Re: dfs.permissions.superusergroup not working

2015-07-24 Thread Harsh J
Is there a typo in your email, or did you set dfs.cluster.administrators instead of intending to set dfs.permissions.superusergroup? Also, are your id outputs from the NameNode machines? Cause by default the group lookups happen local to your NameNode machine. On Sat, Jul 25, 2015 at 1:31 AM Gang

Re: total vcores per node containers in yarn

2015-07-19 Thread Harsh J
oop 2.5.0. > What's the logic of the default using hardware detection? Say my node has 8 > actual cores and 32 virtual cores. It's taking 26 as the value of vcores > available on this node in the RM UI. > > On Sat, Jul 18, 2015 at 7:22 PM, Harsh J wrote: > >> What version of Apache Hadoop

Re: total vcores per node containers in yarn

2015-07-18 Thread Harsh J
What version of Apache Hadoop are you running? Recent changes have made YARN auto-compute this via hardware detection by default (rather than the fixed default of 8). On Fri, Jul 17, 2015 at 11:31 PM Shushant Arora wrote: > In Yarn there is a setting to specify no of vcores that can be allocated > to

Re: issues about hadoop-0.20.0

2015-07-18 Thread Harsh J
Apache Hadoop 0.20 and 0.21 are both very old and unmaintained releases at this point, and may carry some issues unfixed via further releases. Please consider using a newer release. Is there a specific reason you intend to use 0.21.0, which came out of a branch long since abandoned? On Sat, Jul 1

Re: tools.DistCp: Invalid arguments

2015-07-10 Thread Harsh J
No, you don't need to recheck. On Saturday, July 11, 2015, Giri P wrote: > so ,there is no need to explicitly do checksum on source and target after > we migrate. > > I was thinking of using comparechecksum in distcpUtils to compare after > migration > > On Fri, Jul 10

Re: tools.DistCp: Invalid arguments

2015-07-10 Thread Harsh J
ava:202) >>> at >>> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241) >>> at >>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253) >>> at >>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259) >>> at >>> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49) >>> at >>> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:3167) >>> at >>> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1072) >>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966) >>> Invalid arguments: Failed on local exception: >>> com.google.protobuf.InvalidProtocolBufferException: Protocol message >>> end-group tag did not match expected tag.; Host Details : local host is: >>> "hadoop-coc-1/127.0.1.1"; destination host is: "hadoop-coc-2":50070; >>> usage: distcp OPTIONS [source_path...] >>> >>> Thanks, >>> ​ >>> >>> >>> ​ >>> >>> >>> > -- Harsh J

Re: Different outputformats in avro map reduce job

2015-07-08 Thread Harsh J
> textformat as well along with these avro files. Is it possible? > > Thanks, > Nishanth -- Harsh J

Re: query uses WITH blocks and throws exception if run as Oozie hive action (hive-0.13.1)

2015-05-17 Thread Harsh J
t; 9708 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 9708 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 9798 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 9798 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 9798 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 9815 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 9815 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 9815 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 9815 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 9815 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 9827 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 9827 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 9827 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 9827 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 9852 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 9852 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 9852 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 9876 [main] INFO org.apache.hadoop.hive.ql.exec.Utilities - Create >>>> dirs >>>> hdfs://hadev/tmp/hive-svc-yarn/hive_2015-05-15_13-58-05_500_5122268870471366216-1 >>>> with permission rwxrwxrwx recursive false >>>> 9894 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Completed getting MetaData in Semantic Analysis >>>> 10277 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 10289 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 10290 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 10294 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 10294 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 10294 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for source tables >>>> 10320 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for subqueries >>>> 10321 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 10321 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Get metadata for destination tables >>>> 10816 [main] INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer - >>>> Set stats collection dir : >>>> hdfs://hadev/tmp/hive-svc-yarn/hive_2015-05-15_13-58-05_500_5122268870471366216-1/-ext-10002 >>>> >>>> >>> >> > -- Harsh J

Re: namenode question

2015-05-12 Thread Harsh J
wrote: > > > hi all, > I have an hdfs question, my understanding is, namenode sends a list of > datanodes to hadoop client to send data blocks based on replication setting. > The question is, does the list from namenode have IP addresses or hostnames, > or both, of the datanodes? > > NN ---> Client > DN1 --> DN2 --> DN3 > > -- Harsh J

Re: Is there any way to limit the concurrent running mappers per job?

2015-04-22 Thread Harsh J
is to use queue to limit it, but it's not easy to control it > from job submitter. > Is there any way to limit the concurrent running mappers per job? > Any documents or discussions before? > > BTW, any way to search this mailing list before I post a new question? > > Thanks very much. -- Harsh J

Re: CPU utilization in map function

2015-04-07 Thread Harsh J
s well as time spent in sending > messages in each superstep for a Giraph application. > > I am not familiar with hadoop code. Can you suggest the functions I should > look into to get this information ? > > > Thanks > Ravikant -- Harsh J

Re: Hadoop and HttpFs

2015-04-07 Thread Harsh J
uration (the one with the > hdfs://) in order to send the jars to the distributed cache. And at that > point it fails because the client doesn’t have access to the datanodes. > > > > Am I right in my understanding of what happens in that case ? > > Also, anyone meets this issue already? Any solution? Workaround? > > > > Thanks a lot in advance, > > > > Rémy. -- Harsh J

Re: Linux Container Executor (LCE) vs Default Container Executor(DCE)

2015-03-26 Thread Harsh J
nularity to control execution like ban users, min uid > - use cgroups to control resources > > While DCE uses ulimits. > > In both cases the container is executed under the user submitting it. > > Any further insights is appreciated. > > Thanks, > Rajesh > -- Harsh J

Re: Swap requirements

2015-03-25 Thread Harsh J
Yarn.nodemanager.Vmem-pmem-ratio parameter... > > If data nodes does not require swap then what about the above parameter? > What is that used for in yarn? -- Harsh J

Re: namenode recovery

2015-03-25 Thread Harsh J
tart a namenode on one of the other nodes? > Will it recover > the fsimage that was checkpointed by the secondary namenode? > > Thanks > Brian -- Harsh J

Re: Identifying new files on HDFS

2015-03-25 Thread Harsh J
are confidential and for the exclusive use of > the intended recipient. If you receive this e-mail in error please delete > it from your system immediately and notify us either by e-mail or > telephone. You should not copy, forward or otherwise disclose the content > of the e-mail. The views expressed in this communication may not > necessarily be the view held by WHISHWORKS. > -- Harsh J

Re: Can block size for namenode be different from datanode block size?

2015-03-25 Thread Harsh J
> recipient, you should destroy it immediately. Any information in this > message shall not be understood as given or endorsed by Peridale Ltd, its > subsidiaries or their employees, unless expressly so stated. It is the > responsibility of the recipient to ensure that this email is virus free, > therefore neither Peridale Ltd, its subsidiaries nor their employees accept > any responsibility. > > > -- Harsh J

Re: Tell which reduce task run which partition

2015-03-24 Thread Harsh J
sk 0 will read partition 0, reduce task 1 will > read partition 1, etc... > > Thanks, > > -- > -- > > -- Harsh J

Re: how does datanodes know where to send the block report in HA

2015-01-31 Thread Harsh J
> > Will the data node send report directly to Name Node or will it send to > Journal Nodes / ZKFC? > > Thanks > SP > > > -- Harsh J

Re: How to get Hadoop Admin job being fresher in Big Data Field?

2015-01-04 Thread Harsh J
> > Does company consider other experience like i have as a DBA and give job to > a newbie like me? > > What are the skills/tool knowledge i should have to get a job in Big Data > Field ? > > Thanks > Krish > > -- Harsh J

Re: Question about the QJM HA namenode

2014-12-03 Thread Harsh J
>> like below: >>> {code} >>> 2014-12-03 12:13:35,165 INFO org.apache.hadoop.ipc.Client: Retrying connect >>> to server: l-hbase1.dba.dev.cn0/10.86.36.217:8485. Already tried 1 time(s); >>> retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, >>> sleepTime=1000 MILLISECONDS) >>> {code} >>> >>> I have the QJM on l-hbase1.dba.dev.cn0, does it matter? >>> >>> I am a newbie, Any idea will be appreciated!! >> > > -- Harsh J

Re: why does hadoop creates /tmp/hadoop-user/hadoop-unjar-xxxx/ dir and unjar my fat jar?

2014-10-25 Thread Harsh J
> I wonder why do we have to unjar these classes on the **client node** ? >> the jar won't even be accessed until on the compute nodes, right? >> > > -- Harsh J

Re: Load csv files into drill tables

2014-10-25 Thread Harsh J
1 row selected (0.964 seconds) > > I would have great thanks to somebody, who could help me. > Laszlo > -- Harsh J

Re: hadoop 2.4 using Protobuf - How does downgrade back to 2.3 works ?

2014-10-18 Thread Harsh J
So upgrade from > 2.3.0 to 2.4 would work since 2.4 can read old (2.3) binary format and write > the new 2.4 protobuf format. > > After using 2.4, if there is a need to downgrade back to 2.3, how would that > work ? > > Thanks, -- Harsh J

Re: S3 with Hadoop 2.5.0 - Not working

2014-09-10 Thread Harsh J
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml > > There is an s3.impl until 1.2.1 release. So does the 2.5.0 release support > s3 or do i need to do anything else. > > cheers, > Dhiraj > > > -- Harsh J

Re: Hadoop 2.0.0 stopping itself

2014-09-03 Thread Harsh J
error? I'm very new in > Hadoop. > > Thanks in advance. -- Harsh J

Re: cannot start tasktracker because java.lang.NullPointerException

2014-09-01 Thread Harsh J
It appears you have made changes to the source and recompiled it. The actual release source line 247 of the failing class can be seen at https://github.com/apache/hadoop-common/blob/release-1.2.1/src/mapred/org/apache/hadoop/mapred/TaskTracker.java#L247, which can never end in a NPE. You need to f

Re: Hadoop HDFS slow after upgrade from 0.20 to 2.0

2014-08-19 Thread Harsh J
p hdfs, or where can be the bottleneck? > > Norbert -- Harsh J

Re: Don't want to read during namenode is in safemode

2014-08-15 Thread Harsh J
atus. > > Please help how can i fix this problem. > > > > Regards, > Satyam -- Harsh J

RE: hadoop/yarn and task parallelization on non-hdfs filesystems

2014-08-15 Thread Harsh J
t; > >> of memory with each map task allocating 2GB, about 86 application > > >> containers will be created. > > >> > > >> On a filesystem that isn't HDFS (like NFS or in my use case, a > > >> parallel filesystem), a MapReduce job will only allocate a subset of > > >> available tasks (e.g., with the same 3-node cluster, about 25-40 > > >> containers are created). Since I'm using a parallel filesystem, I'm > > >> not as concerned with the bottlenecks one would find if one were to > > >> use NFS. > > >> > > >> Is there a YARN (yarn-site.xml) or MapReduce (mapred-site.xml) > > >> configuration that will allow me to effectively maximize resource > > >> utilization? > > >> > > >> Thanks, > > >> Calvin > > > > > > > > -- > > Harsh J >

Re: hadoop/yarn and task parallelization on non-hdfs filesystems

2014-08-15 Thread Harsh J
one were to >> use NFS. >> >> Is there a YARN (yarn-site.xml) or MapReduce (mapred-site.xml) >> configuration that will allow me to effectively maximize resource >> utilization? >> >> Thanks, >> Calvin -- Harsh J

Re: Hadoop 2.2 Built-in Counters

2014-08-14 Thread Harsh J
t available in resource manager website anymore. I know I can get them > from client output. I was wondering if there is other place in name node or > data node to get the final counter measures regarding job id? > Thanks, > Shaw -- Harsh J

Re: Ideal number of mappers and reducers to increase performance

2014-08-07 Thread Harsh J
map.tasks.maximum and > mapred.tasktracker.reduce.tasks.maximum i use - produces same execution > time . > > Then when the above things failed i also tried mapred.reduce.tasks = 4 > still results are same. No reduction in execution time. > > What other things should i set? Also i made sure hadoop is restarted every > time after changing config. > I have attached my conf folder ..please indicate me what should be added > where ? > I am really stuck ..Your help would be much appreciated. Thank you . > <(singlenodecuda)conf.zip> > > Regards, > Sindhu > > > -- Harsh J

Re: Hadoop 2.4.0 How to change "Configured Capacity"

2014-08-02 Thread Harsh J
out 49.22 >> GB per node, can anyone advise how to set bigger “configured capacity” e.g. >> 2T or more per node? >> >> Name node >> Configured Capacity: 264223436800 (246.08 GB) >> >> Each Datanode >> Configured Capacity: 52844687360 (49.22 GB) >> >> regards >> Arthur > > -- Harsh J

Re: Ideal number of mappers and reducers to increase performance

2014-07-31 Thread Harsh J
ltiple datanodes running on same machine. > > Your help is very much appreciated. > > > Regards, > sindhu > -- Harsh J

Re: Master /slave file configuration for multiple datanodes on same machine

2014-07-30 Thread Harsh J
and slave files of conf and conf2 look like if i want conf > to be master and conf2 to be slave .? > Also how should /etc/hosts file look like ? > Please help me. I am really stuck > > > Regards, > Sindhu -- Harsh J

Re: Performance on singlenode and multinode hadoop

2014-07-29 Thread Harsh J
So now , > How do i make sure load is being distributed on both datanodes or each > datanode uses different cores of the ubuntu machine. > > (Note: i know multiple datanodes on same machine is not that advantageous , > but assuming my machine is powerful ..i set it up..) > > would appreciate any advices on this. > > Regards, > Sindhu -- Harsh J

Re: Question about sqoop command error

2014-07-28 Thread Harsh J
t; at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:190) > at java.lang.ClassLoader.loadClass(ClassLoader.java:307) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) &

Re: Cannot compile a basic PutMerge.java program

2014-07-28 Thread Harsh J
AppClassLoader.loadClass(Launcher.java:301) > at java.lang.ClassLoader.loadClass(ClassLoader.java:248) > Could not find the main class: PutMerge. Program will exit. > > I get the above error. > I tried: > $set CLASSPATH=/usr/lib/hadoop/bin/hadoop > $java PutMerge > > I still get th

Re: Question about sqoop command error

2014-07-27 Thread Harsh J
.apache.sqoop.tool.ImportTool.run(ImportTool.java:502) > at org.apache.sqoop.Sqoop.run(Sqoop.java:145) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) > at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181) > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220) > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229) > at org.apache.sqoop.Sqoop.main(Sqoop.java:238) > > Could you please suggest how I could make the sqoop command work? Thanks a > lot. > > Shu > > > > > > -- > Regards > Gordon Wang > > -- Harsh J

Re: Cannot compile a basic PutMerge.java program

2014-07-27 Thread Harsh J
nputFiles[i].getPath()); > byte buffer[] = new byte[256]; > int bytesRead = 0; > while( (bytesRead = in.read(buffer)) > 0) { > out.write(buffer, 0, bytesRead); > } > in.close(); > } > out.close(); > } catch (IOException e) { > e.printStackTrace(); > } > } > } > = > -- Harsh J

Re: Building custom block placement policy. What is srcPath?

2014-07-24 Thread Harsh J
program chunks the file as it writes. You can look at the DFSOutputStream class for the client implementation. > I'm reading the namenode and fsnamesystem code just to see if I can do what > I want from there. Any suggestions will be appreciated. > > Thank you, > > AB

Re: Building custom block placement policy. What is srcPath?

2014-07-24 Thread Harsh J
ss files or completed files? The latter form of files would result in placement policy calls iff there's an under-replication/losses/etc. to block replicas of the original set. Only for such operations would you have a possibility to determine the actual full length of file (as explained above). > Thank you, > > AB -- Harsh J

Re: Re: HDFS input/output error - fuse mount

2014-07-19 Thread Harsh J
RLClassLoader.findClass(URLClassLoader.java:190) >>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:306) >>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) >>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:247) >>>>>>> Can't construct instance of class >>>>>>> org.apache.hadoop.conf.Configuration >>>>>>> ERROR fuse_init.c:127 Unable to establish test connection to server >>>>>>>INIT: 7.8 >>>>>>>flags=0x0001 >>>>>>>max_readahead=0x0002 >>>>>>>max_write=0x0002 >>>>>>>unique: 1, error: 0 (Success), outsize: 40 >>>>>>> unique: 2, opcode: GETATTR (3), nodeid: 1, insize: 56 >>>>>>> Exception in thread "Thread-0" >>>>>>> java.lang.UnsupportedClassVersionError: >>>>>>> org/apache/hadoop/conf/Configuration >>>>>>> : Unsupported major.minor version 51.0 >>>>>>> at java.lang.ClassLoader.defineClass1(Native Method) >>>>>>> at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631) >>>>>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:615) >>>>>>> at >>>>>>> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) >>>>>>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) >>>>>>> at java.net.URLClassLoader.access$000(URLClassLoader.java:58) >>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:197) >>>>>>> at java.security.AccessController.doPrivileged(Native Method) >>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:190) >>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:306) >>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) >>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:247) >>>>>>> Can't construct instance of class >>>>>>> org.apache.hadoop.conf.Configuration >>>>>>> ERROR fuse_connect.c:83 Unable to instantiate a filesystem for >>>>>>> user027 >>>>>>> ERROR fuse_impls_getattr.c:40 Could not connect to glados:9000 >>>>>>>unique: 2, error: -5 (Input/output error), outsize: 16 >>>>>>> unique: 3, opcode: GETATTR (3), nodeid: 1, insize: 56 >>>>>>> >>>>>>> I adopted this system after this was already setup, so I do not know >>>>>>> which java version was used during install. Currently I'm using: >>>>>>> >>>>>>> $java -version >>>>>>> java version "1.6.0_45" >>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_45-b06) >>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode) >>>>>>> >>>>>>> $java -version >>>>>>> java version "1.6.0_45" >>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_45-b06) >>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode) >>>>>>> >>>>>>> >>>>>>> Is my java version really the cause of this issue? What is the >>>>>>> correct java version to be used for this version of hadoop. I have also >>>>>>> tried 1.6.0_31 but no changes were seen. >>>>>>> >>>>>>> If java isn't my issue, then what is? >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Andrew >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>> >>>> >> > -- Harsh J

Re: OIV Compatibility

2014-07-14 Thread Harsh J
There shouldn't be any - it basically streams over the existing local fsimage file. On Tue, Jul 15, 2014 at 12:21 AM, Ashish Dobhal wrote: > Sir I tried it; it works. Are there any issues in downloading the fsimage > using wget? > > > On Tue, Jul 15, 2014 at 12:17 AM, Harsh

Re: OIV Compatibility

2014-07-14 Thread Harsh J
e using the tool in the hadoop 1.2 or higher > distributions.I guess the structure of fsimage would be same for both the > distributions. > > > On Mon, Jul 14, 2014 at 11:53 PM, Ashish Dobhal > wrote: >> >> Harsh thanks >> >> >> On Mon, Jul 14, 2014 at

Re: OIV Compatibility

2014-07-14 Thread Harsh J
0 as there is no > hdfs.sh file there. > Thanks. -- Harsh J

Re: Where is hdfs result ?

2014-06-23 Thread Harsh J
ail and any > accompanying attachment(s) > is intended only for the use of the intended recipient and may be > confidential and/or privileged of > Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader > of this communication is > not the intended recipient, unauthorized use, forwarding, printing, > storing, disclosure or copying > is strictly prohibited, and may be unlawful.If you have received this > communication in error,please > immediately notify the sender by return e-mail, and delete the original > message and all copies from > your system. Thank you. > --- -- Harsh J

Re: Recover HDFS lease after crash

2014-06-16 Thread Harsh J
nProgress exception (which I >>> guess means the namenode proceeded to close and release the file after >>> inactivity). After about 1 minute it starts to work again. >>> >>> What is the correct way to recover from this? Is there API for recovering >>> the lease and resuming appending faster? DFSClient sets a randomized client >>> name. If it were to send the same client name as before the crash, would it >>> receive a lease on the file faster? >>> >>> Thanks >> >> > -- Harsh J

Re: should i just assign history server address on NN or i have to assign on each node?

2014-06-04 Thread Harsh J
hi,maillist: >>> i installed my job history server on my one of NN(i use NN >>> HA) ,i want to ask if i need set history server address on each node? >>> >> >> > -- Harsh J

Re: Building Mahout Issue

2014-06-03 Thread Harsh J
em and how to fix it? Any advice would > be much appreciated. > > > > Thanks, > > > > Andrew Botelho > > Intern > > EMC Corporation > > Education Services > > Email: andrew.bote...@emc.com -- Harsh J

Re: How to set the max mappers per node on a per-job basis?

2014-05-30 Thread Harsh J
; can we request a different number of mappers per node for each job? From > what I've read, mapred.tasktracker.map.tasks.maximum and > mapred.tasktracker.reduce.tasks.maximum cannot be overridden from the > client. > > --Jeremy -- Harsh J

Re: Problem with simple-yarn-app

2014-05-30 Thread Harsh J
java.lang.reflect.Method.invoke(Method.java:606) > > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > > > Does anyone know what I'm doing wrong? > > > Thanks, > > Lars > > -- Harsh J

Re: listing a 530k files directory

2014-05-30 Thread Harsh J
dfs -count >> folder/“ >> >> -ls goes out of memory, -count with the folder/* goes out of memory … >> I’d like at least at the first 10 file names, see the size, maybe open one >> >> thanks, >> G. > > -- Harsh J

Re: Can not find hadoop packages

2014-05-29 Thread Harsh J
^ > > symbol: class Path > > location: class WordCount > > Note: WordCount.java uses or overrides a deprecated API. > > Note: Recompile with -Xlint:deprecation for details. > > 12 errors > > > I do not why because I checked this post > > http://ac31004.blogspot.com/2013/11/hadoop-2x-jar-file-location-for.html > > and added the jars (hadoop-common-2.2.0, > hadoop-mapreduce-client-core-2.2.0.jar and commons-cli-1.2.jar) into my > classpath but it still does not work. > > Thanks! > > > > Best, > > Isaiah -- Harsh J

Re: fuse-dfs on hadoop-2.2.0

2014-05-27 Thread Harsh J
I forgot to send this earlier, but here's an answer with added links that may help: http://stackoverflow.com/a/21655102/1660002 On Sat, May 17, 2014 at 9:54 AM, Harsh J wrote: > The issue here is that JNI doesn't like wildcards in the classpath > string - it does not evaluate t

Re: question on yarn and fairscheduler

2014-05-20 Thread Harsh J
033851458_4824_m_06 failed 4 times . > is it possible that this is due to preempted too many times? or any other > issue. At the same job, there are also tasks get killed with note: Attmpt > state missing from History : marked as KILLED > > any help would be appreciated. Thanks. -- Harsh J

Re: about hadoop upgrade

2014-05-19 Thread Harsh J
one of step is backup namenode > dfs.namenode.name.dir directory,i have 2 directories defined in > hdfs-site.xml,should i backup them all ,or just one of them? > > > dfs.namenode.name.dir > file:///data/namespace/1,file:///data/namespace/2 > -- Harsh J

Re: hadoop 2.2.0 nodemanager graceful stop

2014-05-19 Thread Harsh J
your computer and network server immediately. Your cooperation is highly >>> appreciated. It is advised that any unauthorized use of confidential >>> information of Winbond is strictly prohibited; and any information in this >>> email irrelevant to the official business of Winbond shall be deemed as >>> neither given nor endorsed by Winbond. >>> >> >> > > > -- > Regards > Shengjun > -- Harsh J

Re: where to put log4j.properties for logging out of MR job?

2014-05-19 Thread Harsh J
rk for me. > > Could you please tell me where I should put the log4j.properties? > > Cheers > Seb. -- Harsh J

Re: fuse-dfs on hadoop-2.2.0

2014-05-17 Thread Harsh J
> configured “hadoop.fuse.timer.period" and “hadoop.fuse.connection.timeout" > in $HADOOP_HOME/etc/hadoop/hdfs-site.xml. > > Can any one share some hints on how to fix this? How can I let fuse-dfs > correctly load the configuration? > > Thanks in advance! > Cheng -- Harsh J

Re: Running the job history server locally

2014-05-16 Thread Harsh J
You can run it via the script "mr-jobhistory-daemon.sh start historyserver", or in foreground via "mapred historyserver". On Sat, May 17, 2014 at 5:18 AM, Software Dev wrote: > How does one run the job history server? > > I am using the latest Hadoop from Hombrew on OSX -- Harsh J

Re: Realtime sensor's tcpip data to hadoop

2014-05-11 Thread Harsh J
s. > > Secondly, if the total network traffic from sensors are over the limit of > one lan port, how to share the loads, is there any component in hadoop to > make this done automatically. > > Any suggestions, thanks. -- Harsh J

Re: For QJM HA solution, after failover, application must update NameNode IP?

2014-04-30 Thread Harsh J
070 >> >> >> >> dfs.client.failover.proxy.provider.gilbert-prod >> >> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider >> >> >> >> On Tue, Apr 29, 2014 at 9:07 AM, sam liu wrote: >>> >>> Hi Experts, >>> >>> For example, at the beginning, the application will access NameNode using >>> IP of active NameNode(IP: 9.123.22.1). However, after failover, the IP of >>> active NameNode is changed to 9.123.22.2 which is the IP of previous standby >>> NameNode. In this case, application must update NameNode IP? >>> >>> Thanks! >> >> > -- Harsh J
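
A minimal client-side sketch of the usual answer to this: configure the logical nameservice and the failover proxy provider, and address HDFS by the nameservice URI so the application never tracks NameNode IPs. The nameservice id "mycluster" and the host names are placeholders.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Connects to the logical nameservice instead of a NameNode IP, so a
// failover needs no change in the application.
public class HaClient {
  public static FileSystem connect() throws Exception {
    Configuration conf = new Configuration();
    conf.set("dfs.nameservices", "mycluster");
    conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
    conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1.example.com:8020");
    conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2.example.com:8020");
    conf.set("dfs.client.failover.proxy.provider.mycluster",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
    conf.set("fs.defaultFS", "hdfs://mycluster");
    return FileSystem.get(URI.create("hdfs://mycluster"), conf);
  }
}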

Re: Reply: hdfs write partially

2014-04-28 Thread Harsh J
> > Thanks a lot for replying. > > > > Regards, > > Ken Huang > > > > From: user-return-15182-tdhkx=126@hadoop.apache.org > [mailto:user-return-15182-tdhkx=126@hadoop.apache.org] on behalf of Harsh J > Sent: 2014-04-28 13:30 > To: > Subject: Re: hdfs wri

Re: copyFromLocal: unexpected URISyntaxException

2014-04-28 Thread Harsh J
eption" when I try to copy this file to Hadoop. See below. >> >> [patcharee@compute-1-0 ~]$ hadoop fs -copyFromLocal >> wrfout_d01_2001-01-01_00:00:00 netcdf_data/ >> copyFromLocal: unexpected URISyntaxException >> >> I am using Hadoop 2.2.0. >> >> Any suggestions? >> >> Patcharee >> > > > > -- > Nitin Pawar > > -- Harsh J

Re: hdfs write partially

2014-04-27 Thread Harsh J
e-packet-size is 64K and it can't be > bigger than 16M. > > So if write bigger than 16M a time, how to make sure it doesn't write > partially ? > > > > Does anyone knows how to fix this? > > > > Thanks a lot. > > > > -- > > Ken Huang > -- Harsh J

Re: configure HBase

2014-04-24 Thread Harsh J
memory resident" in the hbase-env.sh, can you explain in > detail? > > Thanks for any inputs. -- Harsh J

Re: Changing default scheduler in hadoop

2014-04-13 Thread Harsh J
er how can i do this? Remove the configuration override, and it will always go back to the default FIFO based scheduler, the same whose source has been linked above. > I am struggling since 4 months to get help on Apache Hadoop?? Are you unsure about this? -- Harsh J

Re: Number of map task

2014-04-12 Thread Harsh J
its? > > I think the job will be done quicker if there are more Map tasks? > > Patcharee -- Harsh J

Re: InputFormat and InputSplit - Network location name contains /:

2014-04-10 Thread Harsh J
che.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1231) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81) > at java.lang.Thread.run(Thread.java:662) > 2014-04-10 17:09:01,986 INFO [AsyncDispatcher event handler] > org.apache.hadoop. -- Harsh J

Re: not able to run map reduce job example on aws machine

2014-04-10 Thread Harsh J
cutor.java:908) >>> at java.lang.Thread.run(Thread.java:662) >>> >>> >>> I have everything configured with hdfs running where i am able to create >>> files and directories. running jps on my machine shows all components >>> running. >>> >>> 10290 NameNode >>> 10416 DataNode >>> 10738 ResourceManager >>> 11634 Jps >>> 10584 SecondaryNameNode >>> 10844 NodeManager >>> >>> >>> Any pointers will be appreciated. >>> >>> Thanks and Regards, >>> -Rahul Singh >> >> > -- Harsh J
