Hi, it was a well-attended session, with more than 40 attendees joining!
Thanks to Fei Hui for giving us such a great talk.

Here's the summary for your reference.

https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit?usp=sharing
01/02/2020: DiDi talked about their large-scale HDFS cluster upgrade
experience.

Slides:
https://drive.google.com/open?id=1iwJ1asalYfgnOCBuE-RfeG-NpSocjIcy

DiDi studied the two upgrade approaches in the community documentation,
express upgrade and rolling upgrade, and selected rolling upgrade (the
upstream command flow is sketched below).
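
For reference, the upstream-documented rolling upgrade command flow for an
HA cluster looks roughly like this (the hostname below is a placeholder):

    # Create a rollback fsimage before touching any daemon.
    hdfs dfsadmin -rollingUpgrade prepare
    # Poll until the rollback image is ready.
    hdfs dfsadmin -rollingUpgrade query
    # Upgrade and restart the standby NameNode, fail over, then the other.
    # Upgrade DataNodes in batches; 50020 is the Hadoop 2 default IPC port:
    hdfs dfsadmin -shutdownDatanode dn1.example.com:50020 upgrade
    # ...restart each DataNode with the new software, repeat per batch.
    # After everything has soaked and no downgrade is needed:
    hdfs dfsadmin -rollingUpgrade finalize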

The upgrade involved the HDFS server side only. Clients are still on Hadoop
2.7 because applications such as Hive and Spark do not support Hadoop 3 yet.

ZooKeeper was not upgraded.

DiDi practiced the upgrade + downgrade cycle more than 10 times before doing
it for real.

DiDi's largest cluster has 5 federated namespaces and 10+ thousand nodes.
The upgrade took a month: JournalNodes took 1 week, NameNodes took 2 weeks,
and DataNodes took 1 week.

During a rolling upgrade, HDFS does not clean up trash. Because the upgrade
window was a month long, the trash became a concern: it could exhaust all
available space. DiDi has something (a script?) that cleans trash daily.

A problem was encountered that may not be upgrade-related: clients were
occasionally unable to close files. Solution: the DataNode logs were
reviewed, which showed that blocks were not reported in time because block
deletions were taking too long.

Two parameters were changed to address the issue (see the hdfs-site.xml
sketch after this list):

Increase dfs.client.block.write.locateFollowingBlock.retries, and

Reduce dfs.block.invalidate.limit (from the default 1000 to 500).
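
In hdfs-site.xml terms (the retry value is illustrative; the talk did not
give the exact number DiDi used):

    <property>
      <name>dfs.client.block.write.locateFollowingBlock.retries</name>
      <!-- Default is 5. Raising it makes close() retry longer while the
           NameNode waits for the previous block to be reported. The
           value 8 here is an assumption. -->
      <value>8</value>
    </property>
    <property>
      <name>dfs.block.invalidate.limit</name>
      <!-- Cap on blocks a DataNode is told to delete per heartbeat;
           reduced from the default 1000 to 500, per the talk. -->
      <value>500</value>
    </property>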

DiDi believes the new upstream change HDFS-14997 can alleviate this issue.

Timeline:

May 2019: verified that the plan was good.

July: trial run with a 100-node cluster, completed rolling upgrade
successfully.

Oct: 300+ node cluster rolling upgrade completed.

Nov: 10,000-node cluster rolling upgrade completed.

Offline testing:

Ran the full Spark, Hive, and Hadoop test suites and verified that the
upgrade/downgrade had no impact.

Reviewed the 4,000+ patches between Hadoop 2.7 and 3.2 to make sure there
were no incompatible changes.

Authored 40+ internal wikis to document the process.

Future:

DiDi is interested in Ozone to address the small-file problem.

DiDi wants to incorporate the Consistent Reads from Standby feature
(HDFS-12943) to increase NameNode RPC performance; a client-side config
sketch follows.
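
For context, enabling observer reads is mostly a client-side proxy-provider
change once Observer NameNodes are running (the nameservice name
"mycluster" is a placeholder):

    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <!-- Routes read RPCs to Observer NameNodes and writes to the
           Active, offloading read load from the Active NameNode. -->
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider</value>
    </property>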

Finally, DataNode upgrades are hard. DiDi will look into HDFS Maintenance
Mode to make this easier in the future (a sketch of the upstream mechanism
follows).
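
For reference, the upstream maintenance state is driven by a JSON hosts
file (hostname and expiry timestamp below are placeholders), which requires
dfs.namenode.hosts.provider.classname to be set to
org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager and
takes effect after hdfs dfsadmin -refreshNodes:

    [
      {
        "hostName": "dn1.example.com",
        "adminState": "IN_MAINTENANCE",
        "maintenanceExpireTimeInMS": 1578000000000
      }
    ]

While a node is in maintenance, the NameNode avoids re-replicating its
blocks, so it can be taken down briefly for an upgrade without triggering
replication storms.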

This is an HDFS-only upgrade; a YARN upgrade is planned for the second half
of 2020. Since the main purpose of the upgrade is to use erasure coding
(EC) to reduce space usage, DiDi ported the EC client-side code to its
Hadoop 2.7 clients, and these clients can read/write EC blocks!
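
As a point of reference, applying EC on Hadoop 3 looks like this (the path
and policy choice are illustrative):

    # List built-in policies and enable one (RS-6-3-1024k is built in).
    hdfs ec -listPolicies
    hdfs ec -enablePolicy -policy RS-6-3-1024k
    # Apply it to a directory; files written there afterwards are erasure coded.
    hdfs ec -setPolicy -path /warehouse/cold -policy RS-6-3-1024k
    hdfs ec -getPolicy -path /warehouse/cold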


On Wed, Jan 1, 2020 at 7:42 PM Wei-Chiu Chuang <weic...@apache.org> wrote:

> Hi,
> This is a gentle reminder for tomorrow's online meetup. Fei Hui from DiDi
> is going to give a presentation about DiDi's Hadoop 2 -> Hadoop 3 upgrade
> experience.
>
> We will extend this session to 1 hour. Fei will speak in Mandarin and I
> will help translate. So non-Mandarin speakers feel free to join!
>
> Time/Date:
> Jan 1 10PM (US west coast PST) / Jan 2 2pm (Beijing, China CST) / Jan 2
> 11:30am (India, IST) / Jan 2 3pm (Tokyo, Japan, JST)
>
> Join Zoom Meeting
>
> https://cloudera.zoom.us/j/880548968
>
> One tap mobile
>
> +16465588656,,880548968# US (New York)
>
> +17207072699,,880548968# US
>
> Dial by your location
>
>         +1 646 558 8656 US (New York)
>
>         +1 720 707 2699 US
>
>         877 853 5257 US Toll-free
>
>         888 475 4499 US Toll-free
>
> Meeting ID: 880 548 968
> Find your local number: https://zoom.us/u/acaGRDfMVl
>
