How to clear EXPIRED routers?

2023-10-18 Thread 杨光
Hi everyone!

I'm using Hadoop 3.3.4 and started 5 HDFS routers on our servers. Now I have
to remove two of them, so I stopped them with: hdfs --daemon stop dfsrouter.
The command executed successfully, but the Router WebUI
(http://url-to-router-webui:50071) still shows 5 routers, 2 of them in
EXPIRED status. How can I clear them?
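
For context, here is roughly what I did; the hostnames are placeholders, and the /jmx check is only my assumption that the membership state is exposed through the standard servlet:

# on each of the two routers being decommissioned (hostnames are placeholders)
ssh hadoop@router4 'hdfs --daemon stop dfsrouter'
ssh hadoop@router5 'hdfs --daemon stop dfsrouter'

# the WebUI still lists all 5 routers, 2 of them EXPIRED; I assume the same
# state is also visible via the router's standard /jmx servlet
curl -s http://url-to-router-webui:50071/jmx | grep -i expired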


Re: DistCP from Hadoop 2.X to 3.X - where to compute

2023-10-18 Thread 杨光
Hi PA,

We did the same work recently, copying data from Hadoop 2 to Hadoop 3. To be
precise, the source was CDH hadoop-2.6 (a federation of 5 HDFS nameservices)
and the destination was hadoop 3.3.4. Both clusters are protected by
Kerberos, and of course the two realms trust each other. We executed the
DistCp on the Hadoop 3 cluster, but also tried it on Hadoop 2; both worked
nicely. I can confirm that copying data with DistCp from 1.x to 2.x needs
webhdfs, which is slow compared to the RPC-based copy. Here is an execution
example:

hadoop --config /home/hadoop/conf distcp \
  -Dmapreduce.job.hdfs-servers.token-renewal.exclude=ns1,ns2,ns3,ns4,ns5 \
  -update -skipcrccheck \
  hdfs://hadoop2-cluster/user/test \
  hdfs://hadoop3-cluster/user/test
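
For the older 1.x to 2.x case I mentioned, the webhdfs-based form would look
roughly like this (the cluster name and the 50070 NameNode HTTP port are just
placeholders from a typical older setup):

hadoop distcp -update -skipcrccheck \
  webhdfs://hadoop1-namenode:50070/user/test \
  hdfs://hadoop2-cluster/user/test

A couple of notes on the options: -skipcrccheck avoids the post-copy checksum
comparison, which tends to fail when the checksum types or block sizes differ
between the two clusters, and the token-renewal.exclude property, as far as I
understand it, keeps the job from trying to renew delegation tokens for
nameservices that the local ResourceManager cannot reach.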


DistCP from Hadoop 2.X to 3.X - where to compute

2023-10-18 Thread Paul-Adrien Cordonnier

Hey team,

We're planning to migrate some of our data from an obsolete Hadoop 2.7 
to a more recent Hadoop 3.


There are approximately 60 DataNodes on the old cluster and approximately 10 
on the new one. The new cluster will grow over the next months, but since 
some of the use cases are migrating out of Hadoop, we will eventually need 
to downsize it.


Anyway, we are planning to use a distributed copy (DistCp) to move the data, 
but I have a couple of questions:


- Can you confirm that DistCp has to be run on the new cluster? Since the 
HDFS client on Hadoop 2.X won't be able to write to Hadoop 3, I assume the 
DistCp job has to run on Hadoop 3. We wanted to launch the DistCp on the 
"old" cluster, which is bigger and should therefore be faster, but I do not 
think that is technically possible. I also have in mind that the network 
should become the bottleneck at some point.


- The documentation confuses me a bit: 
https://hadoop.apache.org/docs/stable/hadoop-distcp/DistCp.html#Copying_Between_Versions_of_HDFS. 
It looks like using webhdfs is required; is that still relevant?


Thanks a lot

PA

