Hi Igor,

Thanks for the KIP. It looks like it's on a good track. I have a few
suggestions to throw into the mix:

1. (nit): Instead of "storage id," maybe we could call it "directory id"?
It seems a little clear since each log dir gets a unique id.
2. Rather than introducing a new RPC to communicate offline directories,
would it be reasonable to add it to BrokerHeartbeat? My thinking is that we
can let broker registration include the complete list of log dirs. The
heartbeat can then be used to communicate online/offline status of each log
dir.
3. I think we have an opportunity to consolidate normal partition
reassignment and log dir reassignment here. Since we are modifying the
`Assignment` to consist of both replica id and storage id, couldn't we make
a similar change to the `AlterPartitionReassignment` API? Basically that
would let us treat all reassignments as moving from one log dir to another.
The case handled by `AlterReplicaLogDir` currently would just become a
special case where the replica id happens to be the same. This might also
let us get rid of all the custom logic surrounding JBOD, such as the
additional fetcher, which has proven difficult to maintain.

Thanks,
Jason

On Tue, Aug 2, 2022 at 2:24 AM Igor Soarez <i...@soarez.me> wrote:

> Hi José,
>
> Thanks for having a look at this KIP and thanks for pointing this out,
> I've had a look at KIP-856.
>
> It's good to see there's some overlap in our proposals. we're both
> proposing:
>
> - Identifying log directories with a UUID
> - Extending the storage tool to ensure each log directory has a UUID
> assigned
> - Expanding the topic partition identity to include the log directory UUID
>
> There were differences in our proposals as to how the UUID is to be
> persisted, but I've changed my proposal to match yours — I think adding
> storage.id to meta.properties makes sense.
>
> --
> Igor
>
>
> On Wed, Jul 27, 2022, at 4:42 PM, José Armando García Sancio wrote:
> > Hi Igor,
> >
> > Thanks for the KIP. Looking forward to this improvement. I'll review
> your KIP.
> >
> > I should mention that I started a discussion thread on KIP-856: KRaft
> > Disk Failure Recovery at
> > https://lists.apache.org/thread/ytv0t18cplwwwqcp77h6vry7on378jzj
> >
> > Both keep introducing similar concepts. For example both KIP introduce
> > a storage uuid that is stored in meta.properties. At first glance
> > there are some minor differences. I suggest that we review each
> > other's KIP so that we can remove these minor differences. What do you
> > think?
> >
> > Thanks!
> > --
> > -José
>

Reply via email to