Hello all,
I don't understand whether it is possible to configure HDFS in both modes at
the same time. Does that make sense? Can somebody show a simple configuration of
HDFS in both modes? (nameNode1, nameNode2, nameNodeStandby1,
nameNodeStandby2)
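If "both modes" means federation plus HA, which the four-NameNode list above suggests, a minimal hdfs-site.xml sketch could look like the fragment below. The nameservice names (ns1, ns2), hostnames, and port are assumptions, and a real deployment also needs shared edits storage (JournalNodes), fencing, and failover configuration, which are omitted here:

```xml
<!-- Two federated nameservices, each an HA pair (assumed names/hosts) -->
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>

<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn1</name>
  <value>nameNode1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn2</name>
  <value>nameNodeStandby1:8020</value>
</property>

<property>
  <name>dfs.ha.namenodes.ns2</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2.nn1</name>
  <value>nameNode2:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2.nn2</name>
  <value>nameNodeStandby2:8020</value>
</property>

<!-- Clients need a failover proxy provider per HA nameservice -->
<property>
  <name>dfs.client.failover.proxy.provider.ns1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>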
Sincerely,
Alexandr
Thanks Wei-Chiu and Sunil, I have read the docs you mentioned before
starting. The specific problem now is that the DataNodes of the source
cluster report their local IP instead of the public one, which cannot be
accessed from the NodeManagers of the destination cluster. Seems the
solution is to se
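One commonly documented way to deal with DataNodes registering an unreachable address is to have clients and DataNodes connect by hostname rather than by the registered IP. A sketch of the relevant hdfs-site.xml settings, assuming the public hostnames resolve correctly from the destination cluster:

```xml
<property>
  <!-- Clients connect to DataNodes using the DataNode's hostname,
       not the (possibly internal) IP it registered with the NameNode -->
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<property>
  <!-- DataNodes also use hostnames when connecting to each other -->
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

Whether this works depends on DNS: every node involved must resolve the DataNode hostnames to an address it can actually reach.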
Oh, I should mention that creating the archive took only a few hours, but
copying the files out of the archive back to HDFS ran at about 80 MB/min. It
would take years to copy everything back, which seems really surprising.
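For scale, a quick back-of-the-envelope calculation; the archive size used here is a made-up example, since the actual size isn't stated in the thread:

```python
def copy_days(total_bytes: float, rate_mb_per_min: float = 80.0) -> float:
    """Days needed to copy total_bytes at the observed extraction rate."""
    mb = total_bytes / 1e6           # decimal megabytes
    minutes = mb / rate_mb_per_min
    return minutes / (60 * 24)

# A hypothetical 100 TB archive at 80 MB/min works out to roughly 2.4 years:
days = copy_days(100e12)             # about 868 days
```

At 80 MB/min the sustained rate is only about 115 GB/day, so "years" is plausible for any archive in the high tens of terabytes.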
-Aaron
> On Aug 15, 2016, at 12:33 PM, Tsz Wo Sze wrote:
>
> ls over files in har:// maybe 10
I can list all the files out of HDFS in a few hours, not a day. Listing the
files in a single directory in the har takes ~50 min. Honestly, I'd be happy
with only a 10x performance hit; I'm seeing closer to 100-150x.
-Aaron
> On Aug 15, 2016, at 12:33 PM, Tsz Wo Sze wrote:
>
> ls over files
Basically I want to list all the files in a .har file and compare the
file list/sizes to an existing directory in HDFS. The problem is that
running commands like: hdfs dfs -ls -R is orders of
magnitude slower than running the same command against a live HDFS
file system.
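The list/size comparison itself is easy to do offline once both recursive listings have been captured. A small sketch that parses `hdfs dfs -ls -R` output and diffs the two sides; it assumes the 8-column listing format and that the paths on both sides have already been normalized to be comparable (har paths carry a different prefix than live HDFS paths):

```python
def parse_ls(output: str) -> dict:
    """Parse `hdfs dfs -ls -R` output into {path: size}, skipping directories
    and header lines such as 'Found N items'."""
    sizes = {}
    for line in output.splitlines():
        parts = line.split(None, 7)   # 8 whitespace-separated columns
        if len(parts) == 8 and not parts[0].startswith("d"):
            sizes[parts[7]] = int(parts[4])
    return sizes

def diff_listings(har_out: str, hdfs_out: str) -> dict:
    """Paths whose size differs or that exist on only one side."""
    a, b = parse_ls(har_out), parse_ls(hdfs_out)
    return {p: (a.get(p), b.get(p))
            for p in set(a) | set(b) if a.get(p) != b.get(p)}
```

Capturing each listing once and diffing locally avoids repeated metadata round-trips against the slow har filesystem.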
How much slower? I've c
Hi Suresh!
YARN's accounting for memory on each node is completely different from the
Linux kernel's accounting of memory used. For example, I could launch a MapReduce
task which in reality allocates just 100 MB, and tell YARN to give it 8 GB.
The kernel would show the memory requested by the task, the re
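The mismatch described above can be reproduced with plain MapReduce settings, since the container size and the JVM heap are configured independently. A sketch (the specific values are illustrative):

```xml
<property>
  <!-- YARN reserves an 8 GB container per map task for scheduling purposes -->
  <name>mapreduce.map.memory.mb</name>
  <value>8192</value>
</property>
<property>
  <!-- ...even though the JVM heap is capped far below that, so the kernel
       will only ever see a few hundred MB actually resident -->
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx200m</value>
</property>
```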
Hi
I think you can also refer to the link below.
http://aajisaka.github.io/hadoop-project/hadoop-distcp/DistCp.html
Thanks
Sunil
On Mon, Aug 15, 2016 at 7:26 PM Wei-Chiu Chuang wrote:
> Hello,
> if I understand your question correctly, you are actually building a
> multi-home Hadoop, correct?
> M
Hello,
if I understand your question correctly, you are actually building a
multi-home Hadoop, correct?
Multi-homed Hadoop clusters can be tricky to set up, to the extent that
Cloudera does not recommend it. I've not set up a multi-homed Hadoop cluster
before, but I think you have to make sure the rev
Hi Suresh
"This 'memory used' would be the memory used by all containers running on
that node"
>> "Memory Used" in Nodes page indicates how memory is used in all the node
managers with respect to the corresponding demand made to the RM. For example, if
application has asked for 4GB resource and if its real
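The "charged what you asked for, not what you use" behavior also involves request normalization: the scheduler rounds each container request up to a multiple of the minimum allocation. A rough sketch of that rounding (the 1024 MB default for `yarn.scheduler.minimum-allocation-mb` is assumed; exact behavior depends on the scheduler in use):

```python
def normalized_allocation(request_mb: int, min_alloc_mb: int = 1024) -> int:
    """Round a container request up to the next multiple of the minimum
    allocation, roughly what YARN's request normalization does."""
    units = -(-request_mb // min_alloc_mb)   # ceiling division
    return max(units, 1) * min_alloc_mb

# An app asking for 4 GB is charged the full 4096 MB against "Memory Used",
# regardless of how much memory the process actually touches:
normalized_allocation(4096)   # -> 4096
normalized_allocation(4000)   # -> 4096 (rounded up)
```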
Hi all,
Recently I tried to use distcp to copy data across two clusters which are
not in the same local network. Fortunately, the nodes of the source cluster
each has an extra interface and IP which can be accessed from the
destination cluster. But during the process of distcp, the map tasks alway
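For reference, the distcp job itself can carry client-side overrides so that it addresses DataNodes via the reachable interface. A command sketch, not runnable outside a live cluster; the paths and NameNode hostnames are hypothetical, and it assumes the public hostnames resolve from the destination side:

```shell
# Run on the destination cluster; the -D override applies only to this job.
hadoop distcp \
  -D dfs.client.use.datanode.hostname=true \
  -update -p \
  hdfs://source-nn:8020/data/src \
  hdfs://dest-nn:8020/data/dst
```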