Notes from Hadoop storage community online sync

Wei-Chiu Chuang Thu, 07 Nov 2019 10:38:47 -0800

Thanks @Xiaoyu Yao <x...@cloudera.com> for giving us a great status update
on Ozone!

We had a pretty large group yesterday. Here are my notes for your reference:
https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit?usp=sharing
11/6/2019
~20 contributors joined the discussion: Weichiu, Xiaoyu, Chen, Haihua,
haiyang, hexiaoqiao, Hui, Jinglun, Li, Lisheng, Oliver, sibyl.lv, Sammi,
Yisheng, aiphago, Dazhuang, haicai and many others.
Xiaoyu led the discussion of Ozone, an object store for big data workloads.
Topics: what and why, the feature set, current development (0.4 features:
security; 0.5 features: HA), and the future roadmap (scale and stability
improvements).

Decommissioning support is in progress.

Questions:

1. Python client implementation: S3 or RPC?
   Sammi: Tencent is preparing to introduce Ozone. Use case 1: Hive. Use
   case 2: data science use cases, small files. Requires a Python client.

2. Ozone GA timeline.

3. How does the client read: is OM involved in reading data? Ans: No. The
   client accesses DataNodes directly.

4. What metadata do OM and SCM maintain?

5. When can Ozone be used in a production environment? Ans: wait for GA,
   and for benchmarks running workloads like TPC-DS.

6. Performance comparison between HDFS and Ozone. Ans: Ozone uses RocksDB
   as the persistent store for metadata, and RocksDB requires optimization
   and tuning.

7. Ozone uses the Raft replication protocol. What if it replicates more
   than 3 copies? Would the leader become the bottleneck? Ans: the
   multi-Raft project, currently under way, addresses this problem.

8. Rename? Ozone's namespace is flat. Does that mean rename is an O(n)
   operation? Ans: Ozone plans to support hierarchy.
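
To make the rename question concrete, here is a toy sketch of why renaming a "directory" tends to be O(n) in a flat key namespace and O(1) in a hierarchical one. This is only an illustration under my own assumptions: the dict-based layout and the function names rename_prefix_flat and rename_dir_hierarchical are hypothetical and are not Ozone's actual metadata schema or API.

```python
# Toy model only: plain dicts stand in for the metadata store.
# Nothing here reflects Ozone's real on-disk layout or client API.

def rename_prefix_flat(keys, old_prefix, new_prefix):
    """Flat namespace: every key under the prefix must be rewritten.

    Returns the number of keys touched, which grows with the number
    of objects under the renamed prefix (hence O(n)).
    """
    touched = 0
    # Snapshot the matching keys first so the dict is not mutated mid-iteration.
    for key in [k for k in keys if k.startswith(old_prefix)]:
        keys[new_prefix + key[len(old_prefix):]] = keys.pop(key)
        touched += 1
    return touched


def rename_dir_hierarchical(children, parent, old_name, new_name):
    """Hierarchical namespace: relink one entry in the parent directory.

    The subtree hangs off a directory id, so its contents are untouched
    no matter how many objects it holds (hence O(1)).
    """
    children[parent][new_name] = children[parent].pop(old_name)
    return 1


if __name__ == "__main__":
    flat = {"logs/2019/a": 1, "logs/2019/b": 2, "data/x": 3}
    print(rename_prefix_flat(flat, "logs/", "archive/"), sorted(flat))

    tree = {"root": {"logs": "dir-1", "data": "dir-2"}}
    print(rename_dir_hierarchical(tree, "root", "logs", "archive"), tree["root"])
```

In the flat case the work scales with the number of keys sharing the prefix; in the hierarchical case only the parent's entry changes, which is presumably the motivation behind the planned hierarchy support.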
