Is it possible to configure HDFS in federation mode and in HA mode at the same time?

2016-08-15 Thread Alexandr Porunov
Hello all, I don't understand whether it is possible to configure HDFS in both modes at the same time. Does it make sense? Can somebody show a simple configuration of HDFS in both modes? (nameNode1, nameNode2, nameNodeStandby1, nameNodeStandby2) Sincerely, Alexandr
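For the record: yes, the two features compose. Federation means multiple independent nameservices, and each nameservice can itself be an HA pair. A minimal hdfs-site.xml sketch, using the host names from the question (nameservice names, ports, and the shared-edits/ZKFC failover settings are illustrative assumptions, not a complete config):

```xml
<!-- Two federated nameservices, each an active/standby HA pair (sketch) -->
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>
<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn1</name>
  <value>nameNode1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn2</name>
  <value>nameNodeStandby1:8020</value>
</property>
<property>
  <name>dfs.ha.namenodes.ns2</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2.nn1</name>
  <value>nameNode2:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2.nn2</name>
  <value>nameNodeStandby2:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.ns1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Each nameservice also needs its own shared edits storage (e.g. a JournalNode quorum) and, for automatic failover, ZKFC; those settings are per-nameservice and omitted here.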

Re: How to distcp data between two clusters which are not in the same local network?

2016-08-15 Thread Shady Xu
Thanks Wei-Chiu and Sunil, I have read the docs you mentioned before starting. The specific problem now is that the DataNodes of the source cluster report their local IPs instead of the public ones, which cannot be accessed from the NodeManagers of the destination cluster. Seems the solution is to
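One remedy commonly used for exactly this symptom (a sketch of where the thread seems to be heading, not a confirmed fix from it) is to have DataNodes advertise hostnames rather than IPs, and have clients resolve those hostnames themselves; the destination cluster can then map the hostnames to the public addresses via DNS or /etc/hosts:

```xml
<!-- hdfs-site.xml sketch on the source cluster -->
<property>
  <!-- DataNodes register with the NameNode by hostname, not IP -->
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <!-- Clients (here: the distcp tasks on the destination side)
       connect to DataNodes by hostname, not the reported IP -->
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```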

Re: Hadoop archives (.har) are really really slow

2016-08-15 Thread Aaron Turner
Oh, I should mention that creating the archive took only a few hours, but copying the files out of the archive back to HDFS ran at 80 MB/min. It would take years to copy back, which seems really surprising. -Aaron > On Aug 15, 2016, at 12:33 PM, Tsz Wo Sze wrote: > > ls over

Re: Hadoop archives (.har) are really really slow

2016-08-15 Thread Aaron Turner
I can list all the files out of HDFS in a few hours, not a day. Listing the files in a single directory in the har takes ~50 min. Honestly I'd be happy with only a 10x performance hit. I'm seeing closer to 100-150x. -Aaron > On Aug 15, 2016, at 12:33 PM, Tsz Wo Sze

Hadoop archives (.har) are really really slow

2016-08-15 Thread Aaron Turner
Basically I want to list all the files in a .har file and compare the file list/sizes to an existing directory in HDFS. The problem is that running commands like: hdfs dfs -ls -R is orders of magnitude slower than running the same command against a live HDFS file system. How much slower? I've
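For context, the comparison being made is presumably between these two forms (paths are hypothetical). Access through the `har://` scheme goes through the HarFileSystem layer, which resolves entries via the archive's index files rather than the NameNode directly, which is where the slowdown is felt:

```
# Create an archive from a directory (sketch)
hadoop archive -archiveName files.har -p /user/data /user/archives

# Recursive listing inside the archive, via the har:// scheme
hdfs dfs -ls -R har:///user/archives/files.har

# Equivalent listing against the live directory, for comparison
hdfs dfs -ls -R /user/data
```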

Re: Yarn web UI shows more memory used than actual

2016-08-15 Thread Ravi Prakash
Hi Suresh! YARN's accounting for memory on each node is completely different from the Linux kernel's accounting of memory used. E.g., I could launch a MapReduce task which in reality allocates just 100 MB, and tell YARN to give it 8 GB. The kernel would show the memory requested by the task, the

Re: How to distcp data between two clusters which are not in the same local network?

2016-08-15 Thread Sunil Govind
Hi I think you can also refer below link too. http://aajisaka.github.io/hadoop-project/hadoop-distcp/DistCp.html Thanks Sunil On Mon, Aug 15, 2016 at 7:26 PM Wei-Chiu Chuang wrote: > Hello, > if I understand your question correctly, you are actually building a > multi-home

Re: How to distcp data between two clusters which are not in the same local network?

2016-08-15 Thread Wei-Chiu Chuang
Hello, if I understand your question correctly, you are actually building a multi-homed Hadoop setup, correct? Multi-homed Hadoop clusters can be tricky to set up, to the extent that Cloudera does not recommend it. I've not set up a multi-homed Hadoop cluster before, but I think you have to make sure the

Re: Yarn web UI shows more memory used than actual

2016-08-15 Thread Sunil Govind
Hi Suresh, "This 'memory used' would be the memory used by all containers running on that node" >> "Memory Used" on the Nodes page indicates how much memory is allocated across all the NodeManagers with respect to the corresponding demand made to the RM. For example, if an application has asked for 4 GB of resources and if its
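The accounting both replies describe (YARN reports what was granted, rounded up to the scheduler's minimum allocation, regardless of what the kernel sees) can be illustrated with a small Python sketch. The function is hypothetical, not YARN code; only the rounding behavior mirrors `yarn.scheduler.minimum-allocation-mb`:

```python
import math

def yarn_reported_memory_mb(container_requests_mb, min_alloc_mb=1024):
    """Hypothetical model of the 'Memory Used' figure on the Nodes page:
    each container request is rounded up to a multiple of the scheduler's
    minimum allocation, independent of the task's actual resident memory."""
    return sum(math.ceil(r / min_alloc_mb) * min_alloc_mb
               for r in container_requests_mb)

# Ravi's example: a task that really allocates ~100 MB but asks YARN for 8 GB
print(yarn_reported_memory_mb([8192]))  # YARN reports the full 8192 MB
```

So a node can show many gigabytes "used" while the kernel's free/top output shows far less actually resident.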

How to distcp data between two clusters which are not in the same local network?

2016-08-15 Thread Shady Xu
Hi all, Recently I tried to use distcp to copy data between two clusters which are not in the same local network. Fortunately, the nodes of the source cluster each have an extra interface and IP which can be accessed from the destination cluster. But during the process of distcp, the map tasks