ViswaNXplore opened a new issue, #1289:
URL: https://github.com/apache/curator/issues/1289
Hi Curator team,
We are using:
- Curator version: 5.6.0
- ZooKeeper version: 3.8.4
In one of our environments we observe roughly 50k znodes, of which about
45k are empty directories. These numbers are even larger in higher environments.
During CuratorCache initialization we observe that getChildren calls are
issued across the tree, including for these empty directories. Since
getChildren is relatively latency-heavy, this significantly increases cache
initialization time at this scale.
In our use case, strict initial enumeration completeness is not required,
because we rely on persistent recursive watchers and eventually receive create
events for newly added nodes. Our internal cache can converge to the correct
state through those events.
While reviewing the initialization flow, we noticed that Stat is already
retrieved for nodes. Since Stat contains numChildren, we were wondering whether
it would be reasonable to optimize the initialization logic as follows:
`if (stat.getNumChildren() == 0)
skip getChildren()`
This would avoid issuing getChildren() calls for nodes that are already
known to have no children, which could significantly reduce initialization
overhead in environments with a large number of empty directories.
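To make the expected saving concrete, here is a minimal sketch that counts getChildren calls with and without the check. It uses a hypothetical in-memory tree, not Curator or ZooKeeper APIs: `EmptyNodeSkipSketch`, `walk`, `sample`, and `numChildren` are names invented for illustration, and `numChildren` only stands in for `Stat.getNumChildren()`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a znode tree; this is not Curator code.
class EmptyNodeSkipSketch {
    // Maps each existing path to the names of its direct children.
    private final Map<String, List<String>> children = new HashMap<>();
    int getChildrenCalls = 0;

    EmptyNodeSkipSketch() {
        children.put("/", new ArrayList<>());
    }

    private static String join(String parent, String name) {
        return parent.equals("/") ? "/" + name : parent + "/" + name;
    }

    void create(String parent, String name) {
        children.get(parent).add(name);
        children.put(join(parent, name), new ArrayList<>());
    }

    // Stands in for Stat.getNumChildren(): known without listing the node.
    int numChildren(String path) {
        return children.get(path).size();
    }

    // Stands in for the getChildren round trip; every call is counted.
    List<String> getChildren(String path) {
        getChildrenCalls++;
        return new ArrayList<>(children.get(path));
    }

    // Depth-first enumeration; returns how many nodes were visited.
    int walk(String path, boolean skipEmpty) {
        int visited = 1;
        if (skipEmpty && numChildren(path) == 0) {
            return visited; // proposed optimization: nothing to list
        }
        for (String name : getChildren(path)) {
            visited += walk(join(path, name), skipEmpty);
        }
        return visited;
    }

    // Builds "/", "/app", and nine empty children under "/app" (11 nodes).
    static EmptyNodeSkipSketch sample() {
        EmptyNodeSkipSketch tree = new EmptyNodeSkipSketch();
        tree.create("/", "app");
        for (int i = 0; i < 9; i++) {
            tree.create("/app", "empty-" + i);
        }
        return tree;
    }

    public static void main(String[] args) {
        EmptyNodeSkipSketch plain = sample();
        int visitedPlain = plain.walk("/", false);
        EmptyNodeSkipSketch skipping = sample();
        int visitedSkipping = skipping.walk("/", true);
        // prints "11 nodes, 11 getChildren calls without the check"
        System.out.println(visitedPlain + " nodes, " + plain.getChildrenCalls
                + " getChildren calls without the check");
        // prints "11 nodes, 2 getChildren calls with the check"
        System.out.println(visitedSkipping + " nodes, " + skipping.getChildrenCalls
                + " getChildren calls with the check");
    }
}
```

Both walks discover the same 11 nodes; only the number of listing calls differs, and it drops in proportion to the share of empty nodes.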
Our assumption is that correctness would still be preserved because:
- persistent recursive watchers would capture future child creation events
- the cache would eventually converge to the correct state
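
The convergence argument above can be sketched as a small single-threaded simulation. It is illustrative only: `WatchConvergenceSketch` and its members are hypothetical, and the model assumes that a recursive watch delivers a create event for every node created after the watch is registered. Under that assumption, the argument relies on the watch being in place before the Stat is read; whether CuratorCache guarantees that ordering is part of what we are asking.

```java
import java.util.TreeSet;

// Hypothetical model of a server and a client-side cache; not Curator code.
class WatchConvergenceSketch {
    final TreeSet<String> server = new TreeSet<>(); // paths that exist on the fake server
    final TreeSet<String> cache = new TreeSet<>();  // paths the client cache knows about
    boolean watching = false;

    // Creates a node; a registered recursive watch turns this into a cache update.
    void serverCreate(String path) {
        server.add(path);
        if (watching) {
            cache.add(path);
        }
    }

    // Returns the cache contents after a start-up during which one child
    // is created right after the parent's Stat reported numChildren == 0.
    static TreeSet<String> run(boolean watchBeforeStat) {
        WatchConvergenceSketch s = new WatchConvergenceSketch();
        s.server.add("/app");
        s.server.add("/app/empty");
        if (watchBeforeStat) {
            s.watching = true;
        }
        // Initialization sees /app/empty with zero children and skips getChildren.
        TreeSet<String> snapshot = new TreeSet<>(s.server);
        // A child appears after the Stat was read.
        s.serverCreate("/app/empty/late");
        s.cache.addAll(snapshot);
        s.watching = true;
        return s.cache;
    }

    public static void main(String[] args) {
        System.out.println(run(true).contains("/app/empty/late"));  // prints "true"
        System.out.println(run(false).contains("/app/empty/late")); // prints "false"
    }
}
```

With the watch registered first, the late child reaches the cache through its create event even though getChildren was never issued for its parent; with the watch registered afterwards, the child is missed.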
Conceptually the idea is to reduce initialization overhead for environments
with a large number of empty directories, while still allowing the cache to
converge to the correct state through watcher events.
Would this optimization be compatible with the intended semantics of
CuratorCache initialization, or is there a reason getChildren() must always be
executed even when stat.getNumChildren() == 0?
Or is there a recommended way to optimize initialization behavior for this
type of large namespace with many empty nodes?
Thanks in advance for any guidance.
Best regards,
Viswanathan