Re: HDFS Federation address performance issue

2014-01-28 Thread Daryn Sharp
Hi Anfernee, You will achieve improved performance with federation only if you stripe files across the multiple NNs. Federation basically shares DN storage with multiple NNs with the expectation the namespace load will be distributed across the multiple NNs. If everything writes to the exact

Re: Client mapred tries to renew a token with renewer specified as nobody

2013-12-09 Thread Daryn Sharp
We encountered the same issue in yarn's RM so I made the RM recognize its own tokens and renew them regardless of the renewer. In part because older versions of oozie hardcode the renewer as mrtoken. I thought the change made it back into 1.x JT but I guess not. I agree that the conversion

Re: Name node and data node replacement

2013-12-09 Thread Daryn Sharp
To recover a NN, you probably want to use the HA feature. If not, try writing your edits to an NFS volume in addition to the local fs. There is no need to recover a DN. The NN compensates for a lost DN by using the other remaining replicas to create a new replica on another DN. Daryn On

Re: Problem with RPC encryption over wire

2013-11-13 Thread Daryn Sharp
No common protection layer between server and client likely means the host for job submission does not have hadoop.rpc.protection=privacy. In order for QOP to work, all client hosts (DNs and others used to access the cluster) must have an identical setting. A few quick questions: I'm assuming

Re: Why Hadoop force using DNS?

2013-07-29 Thread Daryn Sharp
One reason is that the lists used to accept or reject DNs accept hostnames. If DNS temporarily can't resolve an IP then an unauthorized DN might slip back into the cluster, or a decommissioning node might go back into service. Daryn On Jul 29, 2013, at 8:21 AM, 武泽胜 wrote: I have the same confusion,
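The failure mode described in this reply can be illustrated with a toy model. This is not NameNode code: the function name, the reverse_lookup callable, and the fallback to the raw IP are all assumptions made for illustration only.

```python
def is_excluded(ip, exclude_hosts, reverse_lookup):
    """Toy check: does the node at `ip` match a hostname in the exclude list?

    `reverse_lookup` is a hypothetical callable mapping an IP to a hostname
    and raising LookupError when resolution fails.
    """
    try:
        name = reverse_lookup(ip)
    except LookupError:
        # Resolution failed, so only the raw IP is known (assumed behavior).
        name = ip
    # A hostname-based list cannot match a bare IP, so the node slips through.
    return name in exclude_hosts
```

With a working lookup the node is matched against the list; with a failing lookup the same node is no longer recognized, which is the risk the reply points at.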

Re: ALL HDFS Blocks on the Same Machine if Replication factor = 1

2013-06-10 Thread Daryn Sharp
It's normal. The default placement strategy stores the first replica on the same node as the writer for performance, then chooses a second random node on another rack, then a third node on the same rack as the second node. Using a replication factor of 1 is not advised if you value your data. However, if you
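The placement order described in this reply can be sketched as follows. This is a simplified model, not the HDFS implementation: the topology dictionary, the function name, and the assumption that the writer is itself a DN in the topology are all hypothetical.

```python
import random


def choose_replica_nodes(writer, topology, replication=3, rng=random):
    """Pick replica targets in the order described above.

    `topology` maps a rack name to a list of node names (hypothetical layout).
    `writer` is assumed to be one of those nodes.
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    targets = [writer]  # first replica: the writer's own node

    # Second replica: a random node on a different rack, if one exists.
    remote = [n for n, r in rack_of.items() if r != rack_of[writer]]
    if remote:
        second = rng.choice(remote)
        targets.append(second)
        # Third replica: another node on the same rack as the second.
        peers = [n for n in topology[rack_of[second]] if n != second]
        if peers:
            targets.append(rng.choice(peers))

    return targets[:replication]
```

With replication=1 the result is always just the writer's node, which matches the behavior asked about in this thread.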

Re: hdfs -copyToLocal file permission

2013-06-10 Thread Daryn Sharp
No, permissions are not preserved. FsShell copy commands match *nix's cp behavior by using the current umask (the conf setting you cited) for the permissions of the new file. Perhaps we could add a -p option, like the *nix cp, to preserve permissions. In the meantime you can set the umask if
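The effect of a umask on a newly created file can be worked through as plain arithmetic. The sketch below assumes the usual *nix base mode of 0666 for regular files; the function name is made up for illustration.

```python
def mode_for_new_file(umask, base=0o666):
    """Clear from the base mode every permission bit that is set in the umask."""
    return base & ~umask
```

For example, a umask of 022 yields mode 644 and a umask of 077 yields mode 600, regardless of the permissions the source file had.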

Re: How to connect to hadoop through ssh tunnel and kerberos authentication

2013-04-25 Thread Daryn Sharp
The important part of the error is Cannot get kdc for realm CORP.EBAY.COM. Check if the gateway's /etc/krb5.conf has an entry for CORP.EBAY.COM in the [realms] section. Or if you actually have appropriate DNS service records for kerberos, you can use

Re: Hadoop MapReduce

2013-04-23 Thread Daryn Sharp
MR has a local mode that does what you want. Pig has the ability to use this mode. I did a quick search but didn't immediately find a good link to documentation, but hopefully this gets you going in the right direction. Daryn On Apr 22, 2013, at 6:01 PM, David Gkogkritsiani wrote: Hello,

Re: Problem With NAT ips

2013-04-22 Thread Daryn Sharp
to communicate, so I get the typical msg of could not obtain block. Any ideas? Thanks. Mauro. 2013/4/11 Daryn Sharp da...@yahoo-inc.com Hi Mauro, The registration process has changed quite a bit. I don't think the NN trusts the DN's self-identification anymore

Re: Hadoop fs -getmerge

2013-04-22 Thread Daryn Sharp
If -getmerge is updated, specifically deleting the .crc is not necessary. Adding an option that invokes fs.writeChecksum via a cmdline option should do the trick. On Apr 18, 2013, at 2:41 AM, Fabio Pitzolu wrote: Hi Hemanth, I guess that the only solution is to delete the crc files after the

Re: Problem With NAT ips

2013-04-11 Thread Daryn Sharp
Hi Mauro, The registration process has changed quite a bit. I don't think the NN trusts the DN's self-identification anymore. Otherwise it makes it trivial to spoof another DN, intentionally or not, which can be a security hazard. I suspect the NN can't resolve the DN. Unresolvable hosts

Re: What is the difference between URI, Home Directory, Working Directory in FileSystem.java or HDFS

2013-04-11 Thread Daryn Sharp
On Apr 11, 2013, at 5:33 AM, Ling Kun wrote: Dear all, I am a little confused about the URI, Home Directory and Working Directory in the FileSystem.java or HDFS. I have listed my understanding of these concepts, can someone please point out whether I am correct? Thanks.

Re: RES: I want to call HDFS REST api to upload a file using httplib.

2013-04-09 Thread Daryn Sharp
Try adding -L to your curl and see if that works. Daryn On Apr 8, 2013, at 11:05 PM, ??PHP wrote: Thanks very much. But the returned URL is wrong. And the localhost is the real URL, as I tested successfully with curl using localhost. Can anybody help me translate the curl to Python httplib?
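Python's http.client (the Python 3 name for httplib) does not follow redirects on its own, so the effect of curl's -L has to be written out by hand. The following is a minimal sketch, not a drop-in WebHDFS client: both function names are made up, the same method and body are reused on every hop (a simplification), and error handling is omitted.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(current_url, status, location):
    """Return the URL to retry against, or None if the response is final."""
    if status in REDIRECT_CODES and location:
        # Location may be relative, so resolve it against the current URL.
        return urljoin(current_url, location)
    return None


def request_following_redirects(method, url, body=None, max_hops=5):
    """Send the request and re-send it to each Location header, like curl -L."""
    for _ in range(max_hops + 1):
        parts = urlsplit(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc)
        else:
            conn = http.client.HTTPConnection(parts.netloc)
        try:
            target = parts.path or "/"
            if parts.query:
                target += "?" + parts.query
            conn.request(method, target, body=body)
            resp = conn.getresponse()
            payload = resp.read()
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        next_url = redirect_target(url, status, location)
        if next_url is None:
            return status, url, payload
        url = next_url
    raise RuntimeError("too many redirects")
```

The returned tuple includes the final URL, which makes it easy to see which host the request was redirected to when the returned URL looks wrong.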

Re: one minute delay in running a simple ls command on hadoop (maybe near security groups..): hadoop 0.23.5

2013-04-04 Thread Daryn Sharp
Hi Gopi, Check if you can run groups -F hduser. I think it's causing the delay. Daryn On Mar 31, 2013, at 1:17 AM, Gopi Krishna M wrote: Hi.. I have installed hadoop 0.23.5 and it is working fine on two of my installations. In a new installation on Windows Azure VMs, I am seeing an inordinate

Re: Oozie workflow error - renewing token issue

2013-01-30 Thread Daryn Sharp
The token renewer needs to be the job tracker principal. I think oozie had mr token hardcoded at one point, but later changed it to use a conf setting. The rest of the log looks very odd, i.e. it looks like security is off, but it can't be. It's trying to renew hdfs tokens issued for the hdfs

Re: reduce network problem after using cache dns

2012-01-05 Thread Daryn Sharp
I'm not sure if java is using the system's libc resolver, but assuming it is, you cannot use utilities like nslookup or dig because they use their own resolver. Ping usually uses the libc resolver. If you are on linux, you can use getent hosts $hostname to definitively test the libc
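A similar check can be scripted. In CPython, socket.getaddrinfo calls the C library's getaddrinfo, so it goes through the system resolver rather than a private one as dig and nslookup do. The function name below is made up, and whether Java takes the same path remains the open question raised in the reply.

```python
import socket


def resolve_via_system(hostname):
    """Return the sorted addresses the system resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The system resolver could not resolve the name.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

An empty list for a host that dig resolves fine would suggest the problem lies in the system resolver configuration rather than in the DNS server itself.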

Re: Symbolic Links in HDFS

2011-11-30 Thread Daryn Sharp
in Hadoop? -Stuti -Original Message- From: Daryn Sharp [mailto:da...@yahoo-inc.com] Sent: Wednesday, November 30, 2011 12:11 AM To: hdfs-user@hadoop.apache.org; Stuti Awasthi Subject: Re: Symbolic Links in HDFS No, this will not provide symlink support to FsShell. The shell

Re: Symbolic Links in HDFS

2011-11-29 Thread Daryn Sharp
No, this will not provide symlink support to FsShell. The shell is not yet using FileContext although adding the support is planned. Daryn On Nov 28, 2011, at 10:37 PM, Stuti Awasthi wrote: HI all, Any thoughts on this ?? -Original Message- From: Stuti Awasthi Sent: Monday,