Hi Mickael

Thanks for starting this. It is a very useful feature.

Some initial thoughts (I am new to Kafka so please excuse if these are
naive suggestions):
1. What is the impact of this change on the latency of the DescribeLogDirs
API? Would calculating the totalSpace of each logdir be a bottleneck
for the API? What if we are talking about a large storage size, on the
order of hundreds (or tens) of GBs?
2. How does this fit in with RemoteStorage (KIP-405)? I think integration
with KIP-405 is worth discussing in the scope of this KIP. My
recommendation would be to add a new API to the RLMM
(RemoteLogMetadataManager) called GetLogSize() and leave it up to the remote
storage to provide a concrete implementation of this
interface. DescribeLogDirs could call this interface internally to provide
the relevant information.
3. Do you think adding the number of files in the directory to the
API response would be useful as well? One use case for this information
is monitoring and alarming when the number of files gets dangerously
close to the maximum number of file descriptors configured at the OS
level.
4. Please add an API latency perf test as part of the release criteria for
this change. We want to avoid a latency regression.
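
To make points 1, 3 and 4 a bit more concrete, here is a rough sketch of
how the cost could be measured on a single logdir. It is not taken from the
KIP: I am assuming the broker would read the sizes through the JDK's
File.getTotalSpace() and File.getUsableSpace(), and the class and method
names (LogDirProbe, countRegularFiles, ...) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Kafka: probes one log directory.
public class LogDirProbe {

    // Size in bytes of the volume that holds logDir (one filesystem query,
    // independent of how much data the directory itself contains).
    static long totalSpace(Path logDir) {
        return logDir.toFile().getTotalSpace();
    }

    // Bytes on that volume currently available to this JVM.
    static long usableSpace(Path logDir) {
        return logDir.toFile().getUsableSpace();
    }

    // Number of regular files under logDir, walking sub-directories too.
    // Unlike the two calls above, this cost grows with the number of files.
    static long countRegularFiles(Path logDir) throws IOException {
        try (Stream<Path> paths = Files.walk(logDir)) {
            return paths.filter(Files::isRegularFile).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path logDir = Paths.get(args.length > 0 ? args[0] : ".");

        long start = System.nanoTime();
        long total = totalSpace(logDir);
        long usable = usableSpace(logDir);
        long spaceMicros = (System.nanoTime() - start) / 1_000;

        start = System.nanoTime();
        long files = countRegularFiles(logDir);
        long walkMicros = (System.nanoTime() - start) / 1_000;

        System.out.printf("total=%d usable=%d (%d us)%n", total, usable, spaceMicros);
        System.out.printf("regular files=%d (%d us)%n", files, walkMicros);
    }
}
```

Running it against a logdir on a broker with many partitions should give a
first idea of whether either query is expensive enough to matter for the
API latency. A single run like this is only indicative; a proper perf test
would need warm-up and repeated measurements.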

Regards,
Divij Vaidya



On Thu, Apr 7, 2022 at 11:17 AM Mickael Maison <mickael.mai...@gmail.com>
wrote:

> Hi,
>
> I wrote a small KIP to expose the total and usable space of logdirs
> via the DescribeLogDirs API:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-827%3A+Expose+logdirs+total+and+usable+space+via+Kafka+API
>
> Please take a look and let me know if you have any feedback.
>
> Thanks,
> Mickael
>
