Hi Zihao,

Thanks for driving this. The pluggable ArchiveStorage abstraction looks like a clean fit for the HistoryServer. I have two questions on the ArchiveStorage interface that I'd like to understand better before the vote:

1. Asymmetric value type between read and write.

get / getByPrefix return the generic type T, but put hard-codes the value to String:

```
T get(String key);
void put(String key, String archiveContent);
```

Could you share the rationale? Making the write side symmetric (either also T, or unifying both sides on byte[] / InputStream) would feel more consistent and avoid forcing every backend to materialize the archive as a String. Is there a specific reason String was chosen for put?

2. OOM risk of getByPrefix returning List<T>.

In production a single prefix (e.g. all entries under one job, or under /jobs/) can easily expand to thousands of entries with non-trivial JSON payloads. Returning a fully materialized List<T> means the whole result set is loaded into heap at once, which I'm worried could cause OOM on busy HistoryServers.

Have you considered exposing it as Iterator<T> / CloseableIterator<T> (or a Stream<T>) instead? It maps very naturally to RocksDB's prefix iterator, and FileArchiveStorage can implement it lazily as well. If there's a concrete call site that really needs the full list, it can always do Lists.newArrayList(iter) locally.

Other than these two points, +1 from me on the overall direction.

Best,
Verne

On 2026/05/09 03:37:08 zihao chen wrote:
> Hi all,
>
> I’d like to start a discussion on FLIP-XXX:
>
> *Support Pluggable Storage Backend for HistoryServer*.
>
> This FLIP proposes improving the HistoryServer
> to address excessive *small files* when handling
> large numbers of archived jobs.
>
> [Proposal]
> Optional *RocksDB-based storage* to reduce
> small files
>
> [Compatibility]
> Full backward compatibility (FILE as default)
>
> The detailed design is described in the
> FLIP document:
>
> https://docs.google.com/document/d/1idHu5bq0GOsUuUAEIJSJ2UuekcDjbW0tHLNbsQfugDg/edit?usp=sharing
>
> This FLIP is split from the earlier discussion [1].
>
> Looking forward to your feedback.
>
> [1] https://lists.apache.org/thread/6thlq9c5twyvzmcw7q24nm4q0rcbz5qp
>
>
> Best regards,
>
> Zihao Chen
>
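
P.S. To make the two questions above a bit more concrete, here is a rough sketch of the shape I have in mind. It is only an illustration, not a proposal for the exact API: the symmetric put(String, T) signature, the locally defined CloseableIterator, and the InMemoryArchiveStorage / ArchiveStorageDemo classes are all hypothetical names I made up for this mail. A sorted TreeMap stands in for RocksDB's ordered key space, and tailMap plays the role of a seek to the first key at or after the prefix.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical variant of the interface: read and write share the type T,
// and prefix scans are exposed lazily instead of as a materialized List<T>.
interface ArchiveStorage<T> {
    T get(String key);

    void put(String key, T value);

    CloseableIterator<T> getByPrefix(String prefix);
}

// Defined locally so the sketch is self-contained.
interface CloseableIterator<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close(); // narrowed: no checked exception
}

// Toy backend: the sorted map mimics an ordered key-value store.
final class InMemoryArchiveStorage implements ArchiveStorage<byte[]> {
    private final NavigableMap<String, byte[]> entries = new TreeMap<>();

    @Override
    public byte[] get(String key) {
        return entries.get(key);
    }

    @Override
    public void put(String key, byte[] value) {
        entries.put(key, value);
    }

    @Override
    public CloseableIterator<byte[]> getByPrefix(String prefix) {
        // Position at the first key >= prefix; matching keys are contiguous from there.
        Iterator<Map.Entry<String, byte[]>> it =
                entries.tailMap(prefix, true).entrySet().iterator();
        return new CloseableIterator<byte[]>() {
            private Map.Entry<String, byte[]> pending = advance();

            // Pulls one entry at a time and stops at the first key outside the prefix.
            private Map.Entry<String, byte[]> advance() {
                if (it.hasNext()) {
                    Map.Entry<String, byte[]> e = it.next();
                    if (e.getKey().startsWith(prefix)) {
                        return e;
                    }
                }
                return null;
            }

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public byte[] next() {
                if (pending == null) {
                    throw new NoSuchElementException();
                }
                byte[] value = pending.getValue();
                pending = advance();
                return value;
            }

            @Override
            public void close() {
                // Nothing to release here; a RocksDB-backed version would close its native iterator.
            }
        };
    }
}

final class ArchiveStorageDemo {
    // Example call site: only one entry is held by the iterator at any time.
    static int countEntries(ArchiveStorage<byte[]> storage, String prefix) {
        int count = 0;
        try (CloseableIterator<byte[]> iter = storage.getByPrefix(prefix)) {
            while (iter.hasNext()) {
                iter.next();
                count++;
            }
        }
        return count;
    }
}
```

A call site would wrap the iterator in try-with-resources as countEntries does, so a backend that holds native resources gets a chance to release them even if the caller stops early.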
