Hey, no problem, Sagar!

Yes, I think this is a good time for the voting thread, since all the concerns 
have been addressed in the discussion, and it’s been a while. 

You can find the rules on the main KIP page, and examples of vote messages in 
the mailing list. 

Thanks,
John

On Sat, Jul 11, 2020, at 10:30, Sagar wrote:
> Thanks John,
> 
> Sorry I’m new to this process. 😅 does it mean I start a voting email?
> 
> Pardon my ignorance.
> 
> Sagar.
> 
> 
> On Sat, 11 Jul 2020 at 8:06 PM, John Roesler <vvcep...@apache.org> wrote:
> 
> > Hi Sagar,
> >
> > Thanks for the update. As far as I’m concerned, I’m ready to vote now.
> >
> > Thanks,
> > John
> >
> > On Mon, Jul 6, 2020, at 12:58, Sagar wrote:
> > > Hi John,
> > >
> > > Thanks, I have updated the KIP.
> > >
> > > Thanks!
> > > Sagar.
> > >
> > > On Mon, Jul 6, 2020 at 12:00 AM John Roesler <j...@vvcephei.org> wrote:
> > >
> > > > Hi Sagar,
> > > >
> > > > Sorry for the ambiguity. You could just mention it in the Public
> > > > Interfaces section. Or, if you want to be more specific, you can show
> > it in
> > > > the method definition snippet. I don’t think it matters, as long as
> > it’s
> > > > clearly stated, since it affects backward compatibility with existing
> > store
> > > > implementations.
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > > On Sun, Jul 5, 2020, at 11:25, Sagar wrote:
> > > > > Hi John,
> > > > >
> > > > > Thank you! Question on the comment, where should I add the default
> > > > > implementation? I guess that needs to be added in the Proposal
> > Section of
> > > > > the kIP.
> > > > >
> > > > > Thanks!
> > > > > Sagar.
> > > > >
> > > > > On Sat, Jul 4, 2020 at 11:46 PM John Roesler <vvcep...@apache.org>
> > > > wrote:
> > > > >
> > > > > > Thanks Sagar,
> > > > > >
> > > > > > That looks good to me! The only minor comment I’d make is that I
> > think
> > > > the
> > > > > > method declaration should have a default implementation that
> > throws an
> > > > > > UnsupportedOperationException, for source compatibility with
> > existing
> > > > state
> > > > > > stores.
> > > > > >
> > > > > > Otherwise, as far as I’m concerned, I’m ready to vote.
> > > > > >
> > > > > > Thanks,
> > > > > > John
> > > > > >
> > > > > > On Sat, Jul 4, 2020, at 12:19, Sagar wrote:
> > > > > > > Hi John,
> > > > > > >
> > > > > > > I have updated the KIP with all the new changes we discussed in
> > this
> > > > > > > discussion thread.
> > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-614%3A+Add+Prefix+Scan+support+for+State+Stores
> > > > > > >
> > > > > > > Request you to go through the same.
> > > > > > >
> > > > > > > Thanks!
> > > > > > > Sagar.
> > > > > > >
> > > > > > > On Tue, Jun 30, 2020 at 8:09 AM John Roesler <
> > vvcep...@apache.org>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Sagar,
> > > > > > > >
> > > > > > > > That’s a good observation; yes, it should go in the
> > > > > > ReadOnlyKeyValueStore
> > > > > > > > interface.
> > > > > > > >
> > > > > > > > Thanks again for the great work,
> > > > > > > > John
> > > > > > > >
> > > > > > > > On Sun, Jun 28, 2020, at 23:54, Sagar wrote:
> > > > > > > > > Hi John,
> > > > > > > > >
> > > > > > > > > Thank you for the positive feedback! The meaningful
> > discussions
> > > > we
> > > > > > had on
> > > > > > > > > the mailing list helped me understand what needed to be done.
> > > > > > > > >
> > > > > > > > > I am definitely open to any further suggestions on this.
> > > > > > > > >
> > > > > > > > > Before I updated the KIP, I just had one question, is it
> > fine to
> > > > > > have it
> > > > > > > > > for KeyValueStore or should I move it to
> > ReadOnlyKeyValueStore
> > > > where
> > > > > > even
> > > > > > > > > the range query resides?
> > > > > > > > >
> > > > > > > > > Regarding the 2 notes on UnsupportedOperationException and
> > > > changing
> > > > > > the
> > > > > > > > > name to prefixScan, i will incorporate both of them into the
> > KIP.
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > > Sagar.
> > > > > > > > >
> > > > > > > > > On Sun, Jun 28, 2020 at 11:55 PM John Roesler <
> > > > vvcep...@apache.org>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Woah, this is great, Sagar!
> > > > > > > > > >
> > > > > > > > > > I think this API looks really good. I'm curious if anyone
> > else
> > > > has
> > > > > > > > > > any concern. For my part, I think this will work just fine.
> > > > People
> > > > > > > > > > might face tricky bugs getting their key serde and their
> > prefix
> > > > > > > > > > serde "aligned", but I think the API makes it pretty
> > obvious
> > > > what
> > > > > > > > > > has to happen to make this work. As long as the API isn't
> > going
> > > > > > > > > > to "trick" anyone by trying to abstract away things that
> > can't
> > > > be
> > > > > > > > > > abstracted, this is the best we can do. In other words, I
> > think
> > > > > > > > > > your approach is ideal here.
> > > > > > > > > >
> > > > > > > > > > I also really appreciate that you took the time to do a
> > full
> > > > POC
> > > > > > > > > > with end-to-end tests to show that the proposal is actually
> > > > > > > > > > going to work.
> > > > > > > > > >
> > > > > > > > > > A couple of notes as you update the KIP:
> > > > > > > > > >
> > > > > > > > > > 1. I think that for "optional" state store features like
> > this,
> > > > we
> > > > > > > > > > should add a default implementation to the interface that
> > > > > > > > > > throws UnsupportedOperationException. That way,
> > > > > > > > > > any existing store implementations won't fail to compile
> > > > > > > > > > on the new version. And any store that just can't support
> > > > > > > > > > a prefix scan would simply not override the method.
> > > > > > > > > >
> > > > > > > > > > 2. I think you meant "prefixScan", not "prefixSeek", since
> > > > > > > > > > we're actually getting an iterator that only returns
> > prefix-
> > > > > > > > > > matching keys, as opposed to just seeking to that prefix.
> > > > > > > > > >
> > > > > > > > > > Thanks again for the work you've put into this. I look
> > > > > > > > > > forward to reviewing the updated KIP.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > -John
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sun, Jun 28, 2020, at 12:17, Sagar wrote:
> > > > > > > > > > > Hi John,
> > > > > > > > > > >
> > > > > > > > > > > I took some time out and as we discussed, looked to
> > implement
> > > > > > these
> > > > > > > > > > > changes. Most of these changes are for demonstrative
> > purposes
> > > > > > but I
> > > > > > > > > > thought
> > > > > > > > > > > I will share.
> > > > > > > > > > >
> > > > > > > > > > > I added the new prefixSeek method at the KeyValueStore
> > > > interface
> > > > > > > > level:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/confluentinc/kafka/pull/242/files#diff-5e92747b506c868db3948323478e1b07R74-R83
> > > > > > > > > > >
> > > > > > > > > > > As you had pointed out, the prefix type can be different
> > from
> > > > > > the key
> > > > > > > > > > type.
> > > > > > > > > > > That's why this method takes 2 parameters. the key type
> > and
> > > > it's
> > > > > > > > > > serializer.
> > > > > > > > > > >
> > > > > > > > > > > Then I added the implementation of this method in a
> > couple of
> > > > > > Stores.
> > > > > > > > > > > RocksDBStore:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/confluentinc/kafka/pull/242/commits#diff-046ca566243518c88e007b7499ec9f51R308-R320
> > > > > > > > > > > and
> > > > > > > > > > > InMemoryKVStore:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/confluentinc/kafka/pull/242/commits#diff-4c685a32e765eab60bcb60097768104eR108-R120
> > > > > > > > > > >
> > > > > > > > > > > I modified the older test cases for RocksDBStore. You can
> > > > find
> > > > > > them
> > > > > > > > here:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/confluentinc/kafka/pull/242/commits#diff-051439f56f0d6a12334d7e8cc4f66bf8R304-R415
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I have added a test case where the keys are of type UUID
> > > > while
> > > > > > the
> > > > > > > > prefix
> > > > > > > > > > > is of type string. This seems to be working because the
> > code
> > > > is
> > > > > > able
> > > > > > > > to
> > > > > > > > > > > pull in UUIDs with the provided prefix, even though their
> > > > types
> > > > > > are
> > > > > > > > > > > different.
> > > > > > > > > > >
> > > > > > > > > > > To address one of the gaps from my previous
> > implementation, I
> > > > > > have
> > > > > > > > also
> > > > > > > > > > > added a test case for the end to end flow using the state
> > > > store
> > > > > > > > supplier.
> > > > > > > > > > > you can find it here:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/confluentinc/kafka/pull/242/commits#diff-a94de5b2ec72d09ebac7183c31d7c906R269-R305
> > > > > > > > > > >
> > > > > > > > > > > Note that for this to work, i needed to update
> > > > MeteredKVstore and
> > > > > > > > > > > ChangeLoggingKVStore.
> > > > > > > > > > >
> > > > > > > > > > > Lastly, barring the 4 stores mentioned above, rest of the
> > > > > > > > implementers of
> > > > > > > > > > > KVStore have the prefixSeek override as null. As I said,
> > > > this is
> > > > > > > > mainly
> > > > > > > > > > for
> > > > > > > > > > > demonstrative purposes and hence done this way.
> > > > > > > > > > > If you get the chance, it would be great if you can
> > provide
> > > > some
> > > > > > > > feedback
> > > > > > > > > > > on this.
> > > > > > > > > > >
> > > > > > > > > > > Thanks!
> > > > > > > > > > > Sagar.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jun 10, 2020 at 9:21 AM Sagar <
> > > > sagarmeansoc...@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi John,
> > > > > > > > > > > >
> > > > > > > > > > > > You rightly pointed out, the devil is in the detail
> > :). I
> > > > will
> > > > > > > > start
> > > > > > > > > > with
> > > > > > > > > > > > the implementation to get a sense.
> > > > > > > > > > > >
> > > > > > > > > > > > Here are my thoughts on the core challenge that you
> > pointed
> > > > > > out.
> > > > > > > > The
> > > > > > > > > > key
> > > > > > > > > > > > value store abstractions that have been exposed via the
> > > > state
> > > > > > > > store DSL
> > > > > > > > > > > > APIs, make it possible for the end user to define
> > generic
> > > > key
> > > > > > > > types.
> > > > > > > > > > > > However, the Serdes are the one which convert those
> > generic
> > > > > > > > keys/values
> > > > > > > > > > > > into the format in which the state store stores them-
> > which
> > > > > > for all
> > > > > > > > > > > > practical purposes are byte-arrays. I think with the
> > prefix
> > > > > > type
> > > > > > > > > > serde, if
> > > > > > > > > > > > it converts the prefix to the same internal storage
> > type
> > > > (byte
> > > > > > > > array)
> > > > > > > > > > as
> > > > > > > > > > > > that of the Keys, then we should be able to do a prefix
> > > > scan.
> > > > > > > > > > > >
> > > > > > > > > > > > Regarding other databases, I have worked a bit with
> > Redis
> > > > which
> > > > > > > > also
> > > > > > > > > > > > provides a scan operator using the glob style pattern
> > > > > > match(it's
> > > > > > > > more
> > > > > > > > > > > > evolved than prefix scan but can be converted easily):
> > > > > > > > > > > >
> > > > > > > > > > > > https://redis.io/commands/scan#the-match-option
> > > > > > > > > > > >
> > > > > > > > > > > > Typically Redis works with Binary Safe strings so the
> > > > prefix
> > > > > > key
> > > > > > > > type
> > > > > > > > > > and
> > > > > > > > > > > > the actual keys are of the same type.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks!
> > > > > > > > > > > > Sagar.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Jun 10, 2020 at 1:41 AM John Roesler <
> > > > > > vvcep...@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >> Hi Sagar,
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks for the reply. I agree that your UUID example
> > > > > > illustrates
> > > > > > > > the
> > > > > > > > > > > >> problem I was pointing out.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Yes, I think that experimenting with the API in the
> > PR is
> > > > > > > > probably the
> > > > > > > > > > > >> best way to make progress (as opposed to just
> > thinking in
> > > > > > terms of
> > > > > > > > > > > >> design on the wiki) because with this kind of thing,
> > the
> > > > > > devil is
> > > > > > > > > > often
> > > > > > > > > > > >> in the details.
> > > > > > > > > > > >>
> > > > > > > > > > > >> To clarify what I meant by that last statement, I see
> > the
> > > > core
> > > > > > > > > > challenge
> > > > > > > > > > > >> here as deriving from the fact that we have a
> > key/value
> > > > store
> > > > > > with
> > > > > > > > > > > >> generically typed keys, with a separate component (the
> > > > serde)
> > > > > > that
> > > > > > > > > > > >> turns those typed keys into bytes for storage. In
> > > > contrast,
> > > > > > > > RocksDB
> > > > > > > > > > > >> can easily offer a "prefix scan" operation because
> > they
> > > > key
> > > > > > type
> > > > > > > > is
> > > > > > > > > > > >> always just a byte array, so "prefix" is a very
> > natural
> > > > > > concept to
> > > > > > > > > > offer
> > > > > > > > > > > >> in the API. Other key/value stores force the keys to
> > > > always be
> > > > > > > > > > strings,
> > > > > > > > > > > >> which also makes it easy to define a prefix scan.
> > > > > > > > > > > >>
> > > > > > > > > > > >> My question is whether there are other databases that
> > > > offer
> > > > > > both:
> > > > > > > > > > > >> 1. generically typed keys (as opposed to just bytes,
> > just
> > > > > > strings,
> > > > > > > > > > etc)
> > > > > > > > > > > >> 2. prefix scans
> > > > > > > > > > > >> And, if so, what the API looks like.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Thanks,
> > > > > > > > > > > >> -John
> > > > > > > > > > > >>
> > > > > > > > > > > >> On Tue, Jun 9, 2020, at 11:51, Sagar wrote:
> > > > > > > > > > > >> > Hi John,
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Thanks for the response. For starters, for our use
> > > > case, we
> > > > > > > > tweaked
> > > > > > > > > > our
> > > > > > > > > > > >> > keys etc to avoid prefix scans. So, we are good
> > there.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Regarding the KIP, I see what you mean when you say
> > > > that the
> > > > > > > > same
> > > > > > > > > > key
> > > > > > > > > > > >> type
> > > > > > > > > > > >> > for prefix won't work. For example, continuing with
> > the
> > > > UUID
> > > > > > > > example
> > > > > > > > > > > >> that
> > > > > > > > > > > >> > you gave, let's say one of the UUIDs
> > > > > > > > > > > >> > is 123e4567-e89b-12d3-a456-426614174000, and with a
> > > > prefix
> > > > > > scan
> > > > > > > > we
> > > > > > > > > > want
> > > > > > > > > > > >> to
> > > > > > > > > > > >> > fetch all keys starting with 123. There's already a
> > > > > > UUIDSerde
> > > > > > > > so if
> > > > > > > > > > the
> > > > > > > > > > > >> > keys have been stored with that, then using
> > UUIDSerde
> > > > for
> > > > > > > > prefixes
> > > > > > > > > > won't
> > > > > > > > > > > >> > help- I am not sure if the UUIDSerializer would even
> > > > work
> > > > > > for
> > > > > > > > 123.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > So, that indicates that we will need to provide a
> > new
> > > > > > prefix key
> > > > > > > > > > type
> > > > > > > > > > > >> > serializer. Having said that, how it will be
> > stitched
> > > > > > together
> > > > > > > > and
> > > > > > > > > > > >> finally
> > > > > > > > > > > >> > exposed using the APIs is something that is up for
> > > > > > questioning.
> > > > > > > > > > This is
> > > > > > > > > > > >> > something you have also brought up in the earlier
> > > > emails.
> > > > > > If it
> > > > > > > > > > > >> > makes sense, I can modify my PR to go along these
> > lines.
> > > > > > Please
> > > > > > > > let
> > > > > > > > > > me
> > > > > > > > > > > >> know
> > > > > > > > > > > >> > what you think.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Lastly, I didn't understand this line of yours: *It
> > > > might
> > > > > > help
> > > > > > > > if
> > > > > > > > > > there
> > > > > > > > > > > >> are
> > > > > > > > > > > >> > other typed key/value stores to compare APIs with.*
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Thanks!
> > > > > > > > > > > >> > Sagar.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > On Thu, Jun 4, 2020 at 6:03 AM John Roesler <
> > > > > > > > vvcep...@apache.org>
> > > > > > > > > > > >> wrote:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > > Hi Sagar,
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > Thanks for the question, and sorry for muddying
> > the
> > > > water.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > I meant the Bytes/byte[] thing as advice for how
> > you
> > > > all
> > > > > > can
> > > > > > > > solve
> > > > > > > > > > > >> your
> > > > > > > > > > > >> > > problem in the mean time, while we work through
> > this
> > > > KIP.
> > > > > > I
> > > > > > > > don’t
> > > > > > > > > > > >> think
> > > > > > > > > > > >> > > it’s relevant for the KIP itself.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > I think the big issue here is what the type of the
> > > > prefix
> > > > > > > > should
> > > > > > > > > > be
> > > > > > > > > > > >> in the
> > > > > > > > > > > >> > > method signature. Using the same type as the key
> > makes
> > > > > > sense
> > > > > > > > some
> > > > > > > > > > > >> times,
> > > > > > > > > > > >> > > but not other times. I’m not sure what the best
> > way
> > > > around
> > > > > > > > this
> > > > > > > > > > might
> > > > > > > > > > > >> be.
> > > > > > > > > > > >> > > It might help if there are other typed key/value
> > > > stores to
> > > > > > > > compare
> > > > > > > > > > > >> APIs
> > > > > > > > > > > >> > > with.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > Thanks,
> > > > > > > > > > > >> > > John
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > On Mon, Jun 1, 2020, at 09:58, Sagar wrote:
> > > > > > > > > > > >> > > > Hi John,
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > Just to add to my previous email(and sorry for
> > the
> > > > > > spam),
> > > > > > > > if we
> > > > > > > > > > > >> consider
> > > > > > > > > > > >> > > > using Bytes/byte[] and manually invoke the
> > serdes,
> > > > if
> > > > > > you
> > > > > > > > could
> > > > > > > > > > > >> provide
> > > > > > > > > > > >> > > > examples where the same Serde for keys won't
> > work
> > > > for
> > > > > > the
> > > > > > > > prefix
> > > > > > > > > > > >> types.
> > > > > > > > > > > >> > > As
> > > > > > > > > > > >> > > > far as my understanding goes, the prefix seek
> > would
> > > > > > depend
> > > > > > > > upon
> > > > > > > > > > > >> ordering
> > > > > > > > > > > >> > > of
> > > > > > > > > > > >> > > > the keys like lexicographic. As long as the
> > binary
> > > > > > format is
> > > > > > > > > > > >> consistent
> > > > > > > > > > > >> > > for
> > > > > > > > > > > >> > > > both the keys and the prefixes would it not
> > ensure
> > > > the
> > > > > > > > ability
> > > > > > > > > > to
> > > > > > > > > > > >> search
> > > > > > > > > > > >> > > in
> > > > > > > > > > > >> > > > that same ordering space? This is from my
> > limited
> > > > > > > > understanding
> > > > > > > > > > so
> > > > > > > > > > > >> any
> > > > > > > > > > > >> > > > concrete examples would be helpful...
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > Also, you mentioned about the creation of dummy
> > > > values
> > > > > > to
> > > > > > > > > > indicate
> > > > > > > > > > > >> prefix
> > > > > > > > > > > >> > > > values, do you mean this line:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > >
> > > > > > > > > > > >>
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/foreignkeyjoin/ForeignJoinSubscriptionProcessorSupplier.java#L91
> > > > > > > > > > > >> > > > This
> > > > > > > > > > > >> > > > is where the prefix key is built and used for
> > > > searching
> > > > > > .
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > Thanks!
> > > > > > > > > > > >> > > > Sagar.
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > On Mon, Jun 1, 2020 at 11:42 AM Sagar <
> > > > > > > > > > sagarmeansoc...@gmail.com>
> > > > > > > > > > > >> wrote:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > > Hi John,
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > Thank you. I think it makes sense to modify
> > the
> > > > KIP to
> > > > > > > > add the
> > > > > > > > > > > >> > > > > prefixScan() as part of the existing
> > interfaces
> > > > and
> > > > > > add
> > > > > > > > the
> > > > > > > > > > new
> > > > > > > > > > > >> mixin
> > > > > > > > > > > >> > > > > behaviour as Rejected alternatives. I am not
> > very
> > > > > > aware of
> > > > > > > > > > other
> > > > > > > > > > > >> stores
> > > > > > > > > > > >> > > > > apart from keyValueStore so is it fine if I
> > keep
> > > > it
> > > > > > there
> > > > > > > > for
> > > > > > > > > > now?
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > Regarding the type definition of types I will
> > try
> > > > and
> > > > > > > > think
> > > > > > > > > > about
> > > > > > > > > > > >> some
> > > > > > > > > > > >> > > > > alternatives and share if I get any.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > Thanks!
> > > > > > > > > > > >> > > > > Sagar.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > On Sun, May 31, 2020 at 1:55 AM John Roesler <
> > > > > > > > > > vvcep...@apache.org
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > > wrote:
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > >> Hi Sagar,
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> Thanks for the response. Your use case makes
> > > > sense to
> > > > > > > > me; I
> > > > > > > > > > > >> figured it
> > > > > > > > > > > >> > > > >> must be something like that.
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> On a pragmatic level, in the near term, you
> > might
> > > > > > > > consider
> > > > > > > > > > > >> basically
> > > > > > > > > > > >> > > > >> doing the same thing we did in KIP-213. If
> > you
> > > > swap
> > > > > > out
> > > > > > > > the
> > > > > > > > > > store
> > > > > > > > > > > >> > > types for
> > > > > > > > > > > >> > > > >> Byte/byte[] and “manually” invoke the serdes
> > in
> > > > your
> > > > > > own
> > > > > > > > > > logic,
> > > > > > > > > > > >> then
> > > > > > > > > > > >> > > you
> > > > > > > > > > > >> > > > >> can use the same algorithm we did to derive
> > the
> > > > range
> > > > > > > > scan
> > > > > > > > > > > >> boundaries
> > > > > > > > > > > >> > > from
> > > > > > > > > > > >> > > > >> your desired prefix.
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> For the actual KIP, it seems like we would
> > need
> > > > > > > > significant
> > > > > > > > > > > >> design
> > > > > > > > > > > >> > > > >> improvements to be able to do any mixins, so
> > I
> > > > think
> > > > > > we
> > > > > > > > > > should
> > > > > > > > > > > >> favor
> > > > > > > > > > > >> > > > >> proposing either to just add to the existing
> > > > > > interfaces
> > > > > > > > or to
> > > > > > > > > > > >> create
> > > > > > > > > > > >> > > brand
> > > > > > > > > > > >> > > > >> new interfaces, as appropriate, for now.
> > Given
> > > > that
> > > > > > > > prefix
> > > > > > > > > > can be
> > > > > > > > > > > >> > > converted
> > > > > > > > > > > >> > > > >> to a range query at a low level, I think we
> > can
> > > > > > probably
> > > > > > > > > > explore
> > > > > > > > > > > >> > > adding
> > > > > > > > > > > >> > > > >> prefix to the existing interfaces with a
> > default
> > > > > > > > > > implementation.
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> It seems like that just leaves the question
> > of
> > > > how to
> > > > > > > > define
> > > > > > > > > > the
> > > > > > > > > > > >> type
> > > > > > > > > > > >> > > of
> > > > > > > > > > > >> > > > >> the prefix. To be honest, I don’t have any
> > great
> > > > > > ideas
> > > > > > > > here.
> > > > > > > > > > Are
> > > > > > > > > > > >> you
> > > > > > > > > > > >> > > able
> > > > > > > > > > > >> > > > >> to generate some creative solutions, Sagar?
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> Thanks,
> > > > > > > > > > > >> > > > >> John
> > > > > > > > > > > >> > > > >>
> > > > > > > > > > > >> > > > >> On Tue, May 26, 2020, at 06:42, Sagar wrote:
> > > > > > > > > > > >> > > > >> > Hi John,
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > Thanks for the detailed reply. I was a bit
> > > > crammed
> > > > > > with
> > > > > > > > > > work
> > > > > > > > > > > >> last
> > > > > > > > > > > >> > > week
> > > > > > > > > > > >> > > > >> so
> > > > > > > > > > > >> > > > >> > couldn't respond earlier so apologies for
> > that.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > First of all, thanks for the context that
> > both
> > > > you
> > > > > > and
> > > > > > > > Adam
> > > > > > > > > > > >> have
> > > > > > > > > > > >> > > > >> > provided me on the issues faced
> > previously. As
> > > > I
> > > > > > can
> > > > > > > > > > clearly
> > > > > > > > > > > >> see,
> > > > > > > > > > > >> > > while
> > > > > > > > > > > >> > > > >> I
> > > > > > > > > > > >> > > > >> > was able to cut some corners while writing
> > some
> > > > > > test
> > > > > > > > cases
> > > > > > > > > > or
> > > > > > > > > > > >> > > > >> benchmarks,
> > > > > > > > > > > >> > > > >> > to be able to stitch together a store with
> > > > prefix
> > > > > > scan
> > > > > > > > > > into an
> > > > > > > > > > > >> > > actual
> > > > > > > > > > > >> > > > >> > topology needs more work. I am sorry for
> > the
> > > > half
> > > > > > baked
> > > > > > > > > > tests
> > > > > > > > > > > >> that I
> > > > > > > > > > > >> > > > >> wrote
> > > > > > > > > > > >> > > > >> > without realising and you have rightly put
> > it
> > > > when
> > > > > > you
> > > > > > > > said
> > > > > > > > > > > >> these
> > > > > > > > > > > >> > > > >> > challenges aren't obvious up front.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > Now, coming back to the other points, I
> > spent
> > > > some
> > > > > > time
> > > > > > > > > > going
> > > > > > > > > > > >> > > through
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > KIP-213 and also some of the code snippets
> > > > that are
> > > > > > > > talked
> > > > > > > > > > > >> about in
> > > > > > > > > > > >> > > that
> > > > > > > > > > > >> > > > >> > KIP. With the detailed explanation that you
> > > > > > provided,
> > > > > > > > it
> > > > > > > > > > is now
> > > > > > > > > > > >> > > obvious
> > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > >> > > > >> > me that keeping a generic type for keys
> > like K
> > > > > > won't
> > > > > > > > work
> > > > > > > > > > oob
> > > > > > > > > > > >> and
> > > > > > > > > > > >> > > hence
> > > > > > > > > > > >> > > > >> a
> > > > > > > > > > > >> > > > >> > decision was made to use Bytes as the key
> > type.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > I just had another thought on this though.
> > I
> > > > > > looked at
> > > > > > > > the
> > > > > > > > > > > >> range
> > > > > > > > > > > >> > > > >> function
> > > > > > > > > > > >> > > > >> > that was added in the
> > ReadOnlyKeyValueStore.
> > > > While
> > > > > > the
> > > > > > > > Key
> > > > > > > > > > and
> > > > > > > > > > > >> the
> > > > > > > > > > > >> > > Value
> > > > > > > > > > > >> > > > >> > mentioned in that method is generic,
> > internally
> > > > > > almost
> > > > > > > > all
> > > > > > > > > > > >> queries
> > > > > > > > > > > >> > > end
> > > > > > > > > > > >> > > > >> up
> > > > > > > > > > > >> > > > >> > querying using Bytes in some or the other
> > > > form. I
> > > > > > > > looked at
> > > > > > > > > > > >> not just
> > > > > > > > > > > >> > > > >> > RocksDb Store but other stores like
> > InMemory
> > > > store
> > > > > > or
> > > > > > > > > > > >> MemoryLRU and
> > > > > > > > > > > >> > > this
> > > > > > > > > > > >> > > > >> > seems to be the pattern. I think this stems
> > > > from
> > > > > > the
> > > > > > > > fact
> > > > > > > > > > that
> > > > > > > > > > > >> these
> > > > > > > > > > > >> > > > >> stores
> > > > > > > > > > > >> > > > >> > while implementing KeyValueStore pass
> > Bytes,
> > > > > > byte[] as
> > > > > > > > the
> > > > > > > > > > K
> > > > > > > > > > > >> and V
> > > > > > > > > > > >> > > > >> values.
> > > > > > > > > > > >> > > > >> > Classes like MeteredKeyValueStore which
> > don't
> > > > do
> > > > > > this,
> > > > > > > > > > still
> > > > > > > > > > > >> use
> > > > > > > > > > > >> > > > >> Bytes.wrap
> > > > > > > > > > > >> > > > >> > to wrap the passed keys and values and
> > invoke
> > > > the
> > > > > > range
> > > > > > > > > > method.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > So, the point I am trying to make is, with
> > the
> > > > same
> > > > > > > > > > behaviour
> > > > > > > > > > > >> - and
> > > > > > > > > > > >> > > > >> > ignoring for a moment that it's a separate
> > > > > > interface
> > > > > > > > which
> > > > > > > > > > I am
> > > > > > > > > > > >> > > trying
> > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > >> > > > >> > "mix-in"- the issues with the key types
> > could
> > > > be
> > > > > > > > resolved.
> > > > > > > > > > I
> > > > > > > > > > > >> may be
> > > > > > > > > > > >> > > > >> wrong
> > > > > > > > > > > >> > > > >> > though so would like to know your thoughts
> > on
> > > > this.
> > > > > > > > Infact
> > > > > > > > > > > >> > > unknowingly
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > interface implementation of
> > PrefixSeekableType
> > > > in
> > > > > > > > > > > >> RockDBStateStore
> > > > > > > > > > > >> > > also
> > > > > > > > > > > >> > > > >> > passes Bytes and bytes[] as K and V.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > The second part of exposing it via the
> > > > publically
> > > > > > > > > > accessible
> > > > > > > > > > > >> > > interfaces
> > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > >> > > > >> > which we downcast while building the
> > topology
> > > > (like
> > > > > > > > > > > >> KeyValueStore),
> > > > > > > > > > > >> > > I
> > > > > > > > > > > >> > > > >> can
> > > > > > > > > > > >> > > > >> > clearly see now that mixing-in the way I
> > tried
> > > > to
> > > > > > won't
> > > > > > > > > > work.
> > > > > > > > > > > >> My
> > > > > > > > > > > >> > > > >> intention
> > > > > > > > > > > >> > > > >> > all along was not to hamper the flow of
> > those
> > > > > > stores
> > > > > > > > which
> > > > > > > > > > > >> don't
> > > > > > > > > > > >> > > support
> > > > > > > > > > > >> > > > >> > prefix scan as yet and hence the separate
> > > > > > interface.
> > > > > > > > But, I
> > > > > > > > > > > >> agree
> > > > > > > > > > > >> > > that
> > > > > > > > > > > >> > > > >> for
> > > > > > > > > > > >> > > > >> > this to work, it needs to be part of some
> > > > > > pre-defined
> > > > > > > > store
> > > > > > > > > > > >> types
> > > > > > > > > > > >> > > like
> > > > > > > > > > > >> > > > >> > KVStore etc. Right now, I don't have an
> > answer
> > > > to
> > > > > > this
> > > > > > > > but
> > > > > > > > > > > >> mostly it
> > > > > > > > > > > >> > > > >> would
> > > > > > > > > > > >> > > > >> > have to be moved there and implemented
> > across
> > > > all
> > > > > > > > > > stores(if we
> > > > > > > > > > > >> see
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> > worth in prefix scans :) )
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > Regarding the motivation, I am sorry if I
> > > > wasn't
> > > > > > clear.
> > > > > > > > > > This
> > > > > > > > > > > >> > > originated
> > > > > > > > > > > >> > > > >> > from one of my own use cases with kafka
> > streams
> > > > > > where i
> > > > > > > > > > needed
> > > > > > > > > > > >> to
> > > > > > > > > > > >> > > find
> > > > > > > > > > > >> > > > >> some
> > > > > > > > > > > >> > > > >> > keys based upon certain prefix. Infact it's
> > > > > > similar to
> > > > > > > > the
> > > > > > > > > > > >> > > > >> > RangeScanCombinedKeyUsage diagram in
> > KIP-213
> > > > where
> > > > > > the
> > > > > > > > > > > >> otherTable
> > > > > > > > > > > >> > > tries
> > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > >> > > > >> > find entries in the state store based upon
> > the
> > > > FK.
> > > > > > I
> > > > > > > > was
> > > > > > > > > > using
> > > > > > > > > > > >> > > > >> > KevValueStore to be precise. I also
> > remember
> > > > > > having a
> > > > > > > > slack
> > > > > > > > > > > >> > > > >> conversation on
> > > > > > > > > > > >> > > > >> > this, and I was told that this isn't
> > supported
> > > > > > right
> > > > > > > > now,
> > > > > > > > > > but
> > > > > > > > > > > >> some
> > > > > > > > > > > >> > > other
> > > > > > > > > > > >> > > > >> > users shared their experiences on how with
> > some
> > > > > > hacks
> > > > > > > > they
> > > > > > > > > > are
> > > > > > > > > > > >> able
> > > > > > > > > > > >> > > to
> > > > > > > > > > > >> > > > >> > perform prefix scans even though their use
> > case
> > > > > > fits
> > > > > > > > the
> > > > > > > > > > bill
> > > > > > > > > > > >> for a
> > > > > > > > > > > >> > > > >> prefix
> > > > > > > > > > > >> > > > >> > scan. That kind of motivated me to take a
> > stab
> > > > at
> > > > > > it.
> > > > > > > > > > > >> > > Unfortunately, I
> > > > > > > > > > > >> > > > >> have
> > > > > > > > > > > >> > > > >> > lost the slack chat because of some
> > cleanup at
> > > > the
> > > > > > > > slack
> > > > > > > > > > > >> channel
> > > > > > > > > > > >> > > level.
> > > > > > > > > > > >> > > > >> I
> > > > > > > > > > > >> > > > >> > will try and update the ambiguous
> > motivation
> > > > > > statement
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > >> near
> > > > > > > > > > > >> > > > >> future.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > Lastly, I would like to point out, that
> > your
> > > > > > response
> > > > > > > > was
> > > > > > > > > > not
> > > > > > > > > > > >> at all
> > > > > > > > > > > >> > > > >> > discouraging. On the contrary it was really
> > > > > > insightful
> > > > > > > > and
> > > > > > > > > > it's
> > > > > > > > > > > >> > > always
> > > > > > > > > > > >> > > > >> good
> > > > > > > > > > > >> > > > >> > to learn/discover new things :)
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > Thanks!
> > > > > > > > > > > >> > > > >> > Sagar.
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > On Fri, May 15, 2020 at 7:37 AM John
> > Roesler <
> > > > > > > > > > > >> vvcep...@apache.org>
> > > > > > > > > > > >> > > > >> wrote:
> > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > >> > > > >> > > Hi, Sagar!
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > Thanks for this KIP. I'm sorry it took
> > me so
> > > > > > long to
> > > > > > > > > > reply.
> > > > > > > > > > > >> I'll
> > > > > > > > > > > >> > > > >> number my
> > > > > > > > > > > >> > > > >> > > points differently to avoid confusion.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > I can provide some additional context on
> > the
> > > > > > > > > > difficulties we
> > > > > > > > > > > >> > > > >> previously
> > > > > > > > > > > >> > > > >> > > faced in KIP-213 (which you and Adam have
> > > > already
> > > > > > > > > > discussed).
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > J1) In your KIP, you propose the
> > following
> > > > > > interface:
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > public interface PrefixSeekableStore<K,
> > V> {
> > > > > > > > > > > >> > > > >> > >     KeyValueIterator<K, V> prefixSeek(K
> > > > prefix);
> > > > > > > > > > > >> > > > >> > > }
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > This is roughly the same thing that Adam
> > and
> > > > I
> > > > > > were
> > > > > > > > > > > >> considering
> > > > > > > > > > > >> > > > >> > > before. It has a hidden problem, that it
> > > > assumes
> > > > > > that
> > > > > > > > > > > >> prefixes of
> > > > > > > > > > > >> > > > >> > > keys in the key space are also in the key
> > > > space.
> > > > > > In
> > > > > > > > other
> > > > > > > > > > > >> words,
> > > > > > > > > > > >> > > this
> > > > > > > > > > > >> > > > >> > > is a store with key type K, and the API
> > > > assumes
> > > > > > that
> > > > > > > > > > > >> prefixes are
> > > > > > > > > > > >> > > also
> > > > > > > > > > > >> > > > >> > > of type K. This is true for some key
> > types,
> > > > like
> > > > > > > > String
> > > > > > > > > > or
> > > > > > > > > > > >> Bytes,
> > > > > > > > > > > >> > > but
> > > > > > > > > > > >> > > > >> not
> > > > > > > > > > > >> > > > >> > > for others.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > For example, if the keys are UUIDs, then
> > no
> > > > > > prefix is
> > > > > > > > > > also a
> > > > > > > > > > > >> > > UUID. If
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > key is a complex data type, like
> > Windowed<K>
> > > > in
> > > > > > our
> > > > > > > > own
> > > > > > > > > > DSL,
> > > > > > > > > > > >> then
> > > > > > > > > > > >> > > > >> > > we would absolutely want to query all
> > keys
> > > > with
> > > > > > the
> > > > > > > > same
> > > > > > > > > > > >> record
> > > > > > > > > > > >> > > key
> > > > > > > > > > > >> > > > >> > > (the K part), or the same window start
> > time,
> > > > but
> > > > > > in
> > > > > > > > > > neither
> > > > > > > > > > > >> case
> > > > > > > > > > > >> > > is
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > prefix actually a Windowed<K>.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > You can skirt the issue by defining a
> > third
> > > > type
> > > > > > > > > > parameter,
> > > > > > > > > > > >> maybe
> > > > > > > > > > > >> > > KP,
> > > > > > > > > > > >> > > > >> that
> > > > > > > > > > > >> > > > >> > > is the "prefix" type, but this would
> > also be
> > > > > > awkward
> > > > > > > > for
> > > > > > > > > > many
> > > > > > > > > > > >> > > usages.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > J2) There is a related problem with
> > > > > > serialization.
> > > > > > > > > > Whether
> > > > > > > > > > > >> > > something
> > > > > > > > > > > >> > > > >> > > is a prefix or not depends not on the
> > Java
> > > > key
> > > > > > (K),
> > > > > > > > but
> > > > > > > > > > on
> > > > > > > > > > > >> the
> > > > > > > > > > > >> > > binary
> > > > > > > > > > > >> > > > >> > > format that is produced when you use a
> > serde
> > > > on
> > > > > > the
> > > > > > > > key.
> > > > > > > > > > > >> Whether
> > > > > > > > > > > >> > > > >> > > we say that the prefix must also be a K
> > or
> > > > > > whether it
> > > > > > > > > > gets
> > > > > > > > > > > >> its own
> > > > > > > > > > > >> > > > >> type,
> > > > > > > > > > > >> > > > >> > > KP, there are problems.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > In the latter case, we must additionally
> > > > require
> > > > > > a
> > > > > > > > second
> > > > > > > > > > > >> set of
> > > > > > > > > > > >> > > > >> serdes
> > > > > > > > > > > >> > > > >> > > for the prefixes, but there's no obvious
> > way
> > > > to
> > > > > > > > > > incorporate
> > > > > > > > > > > >> this
> > > > > > > > > > > >> > > in
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > API, especially not in the DSL.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > In either case, for the API to actually
> > > > work, we
> > > > > > > > need to
> > > > > > > > > > know
> > > > > > > > > > > >> > > ahead
> > > > > > > > > > > >> > > > >> > > of time that the Serde will produce a
> > binary
> > > > key
> > > > > > that
> > > > > > > > > > starts
> > > > > > > > > > > >> with
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> > > part that we wish to use as a prefix. For
> > > > > > example,
> > > > > > > > what
> > > > > > > > > > we
> > > > > > > > > > > >> were
> > > > > > > > > > > >> > > doing
> > > > > > > > > > > >> > > > >> > > briefly in KIP-213 (where we had complex
> > > > keys,
> > > > > > > > similar to
> > > > > > > > > > > >> > > Windowed<K>)
> > > > > > > > > > > >> > > > >> > > was to define "dummy" values that
> > indicate
> > > > that a
> > > > > > > > > > > >> Windowed<K> is
> > > > > > > > > > > >> > > > >> actually
> > > > > > > > > > > >> > > > >> > > just a prefix key, not a real key. Maybe
> > the
> > > > > > window
> > > > > > > > start
> > > > > > > > > > > >> time
> > > > > > > > > > > >> > > would
> > > > > > > > > > > >> > > > >> be
> > > > > > > > > > > >> > > > >> > > null or the key part would be null. But
> > we
> > > > also
> > > > > > had
> > > > > > > > to
> > > > > > > > > > > >> define a
> > > > > > > > > > > >> > > serde
> > > > > > > > > > > >> > > > >> > > that would very specifically anticipate
> > which
> > > > > > > > component
> > > > > > > > > > of
> > > > > > > > > > > >> the
> > > > > > > > > > > >> > > complex
> > > > > > > > > > > >> > > > >> > > key would need to be used in a prefix
> > key.
> > > > > > Having to
> > > > > > > > > > bring
> > > > > > > > > > > >> all
> > > > > > > > > > > >> > > these
> > > > > > > > > > > >> > > > >> > > parts together in a reliable,
> > easy-to-debug,
> > > > > > fashion
> > > > > > > > > > gives
> > > > > > > > > > > >> me some
> > > > > > > > > > > >> > > > >> doubt
> > > > > > > > > > > >> > > > >> > > that people would actually be able to use
> > > > this
> > > > > > > > feature in
> > > > > > > > > > > >> > > complicated
> > > > > > > > > > > >> > > > >> > > programs without driving themselves
> > crazy.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > J3) Thanks so much for including
> > benchmarks
> > > > and
> > > > > > > > tests!
> > > > > > > > > > > >> > > Unfortunately,
> > > > > > > > > > > >> > > > >> > > these don't include everything you need
> > to
> > > > really
> > > > > > > > plug
> > > > > > > > > > into
> > > > > > > > > > > >> the
> > > > > > > > > > > >> > > > >> Streams
> > > > > > > > > > > >> > > > >> > > API. I think when you push it a little
> > > > farther,
> > > > > > > > you'll
> > > > > > > > > > > >> realize
> > > > > > > > > > > >> > > what
> > > > > > > > > > > >> > > > >> Adam
> > > > > > > > > > > >> > > > >> > > was talking about wrt the interface
> > > > difficulties.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > In your benchmark and tests, you directly
> > > > > > construct
> > > > > > > > the
> > > > > > > > > > > >> store and
> > > > > > > > > > > >> > > then
> > > > > > > > > > > >> > > > >> > > use it, but in a real Streams
> > application,
> > > > you
> > > > > > can
> > > > > > > > only
> > > > > > > > > > > >> provide
> > > > > > > > > > > >> > > your
> > > > > > > > > > > >> > > > >> > > implementation in a StoreSupplier, for
> > > > example
> > > > > > via
> > > > > > > > the
> > > > > > > > > > > >> > > Materialized
> > > > > > > > > > > >> > > > >> > > parameter. Then, to use the store from
> > > > inside a
> > > > > > > > > > Processor,
> > > > > > > > > > > >> you'd
> > > > > > > > > > > >> > > have
> > > > > > > > > > > >> > > > >> > > to get it by name from the
> > ProcessorContext,
> > > > and
> > > > > > then
> > > > > > > > > > cast
> > > > > > > > > > > >> it to
> > > > > > > > > > > >> > > one
> > > > > > > > > > > >> > > > >> of
> > > > > > > > > > > >> > > > >> > > the pre-defined store types,
> > KeyValueStore,
> > > > > > > > > > WindowedStore, or
> > > > > > > > > > > >> > > > >> > > SessionStore. It won't work to "mix in"
> > your
> > > > > > > > interface
> > > > > > > > > > > >> because the
> > > > > > > > > > > >> > > > >> > > processor gets a store that's wrapped in
> > > > layers
> > > > > > that
> > > > > > > > > > handle
> > > > > > > > > > > >> > > > >> serialization,
> > > > > > > > > > > >> > > > >> > > change-logging, recording metrics, and
> > > > caching.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > To use the store through IQ, you have to
> > > > provide
> > > > > > a
> > > > > > > > > > > >> > > QueriableStoreType
> > > > > > > > > > > >> > > > >> > > to KafkaStreams#store, and you get back a
> > > > > > similarly
> > > > > > > > > > wrapped
> > > > > > > > > > > >> store.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > I think our only choices to add an
> > interface
> > > > like
> > > > > > > > yours
> > > > > > > > > > is
> > > > > > > > > > > >> either
> > > > > > > > > > > >> > > to
> > > > > > > > > > > >> > > > >> add
> > > > > > > > > > > >> > > > >> > > it to one of the existing store types,
> > like
> > > > > > > > > > KeyValueStore or
> > > > > > > > > > > >> > > > >> > > WindowedStore, or to define a completely
> > new
> > > > > > store
> > > > > > > > > > hierarchy,
> > > > > > > > > > > >> > > meaning
> > > > > > > > > > > >> > > > >> > > you have to duplicate all the "wrapper"
> > > > layers in
> > > > > > > > > > Streams.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > I think if you write an "end-to-end"
> > test,
> > > > where
> > > > > > you
> > > > > > > > > > write a
> > > > > > > > > > > >> > > Streams
> > > > > > > > > > > >> > > > >> app,
> > > > > > > > > > > >> > > > >> > > provide your store, and then use it in a
> > > > > > Processor
> > > > > > > > and
> > > > > > > > > > > >> through IQ,
> > > > > > > > > > > >> > > > >> > > you'll see what I'm talking about.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > IIRC, those three points were the ones
> > that
> > > > > > > > ultimately
> > > > > > > > > > led
> > > > > > > > > > > >> us to
> > > > > > > > > > > >> > > > >> abandon
> > > > > > > > > > > >> > > > >> > > the whole idea last time and just
> > register
> > > > the
> > > > > > stores
> > > > > > > > > > with
> > > > > > > > > > > >> key
> > > > > > > > > > > >> > > type
> > > > > > > > > > > >> > > > >> Bytes.
> > > > > > > > > > > >> > > > >> > > I think some creative solutions may yet
> > be
> > > > > > possible,
> > > > > > > > but
> > > > > > > > > > > >> it'll
> > > > > > > > > > > >> > > take
> > > > > > > > > > > >> > > > >> some
> > > > > > > > > > > >> > > > >> > > more design work to get there.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > Can I ask what your motivation is,
> > exactly,
> > > > for
> > > > > > > > proposing
> > > > > > > > > > > >> this
> > > > > > > > > > > >> > > > >> feature?
> > > > > > > > > > > >> > > > >> > > The motivation just says "some users may
> > > > want to
> > > > > > do
> > > > > > > > it",
> > > > > > > > > > > >> which has
> > > > > > > > > > > >> > > > >> > > the advantage that it's impossible to
> > > > disagree
> > > > > > with,
> > > > > > > > but
> > > > > > > > > > > >> doesn't
> > > > > > > > > > > >> > > > >> provide
> > > > > > > > > > > >> > > > >> > > a lot of concrete detail ;)
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > Specifically, what I'm wondering is
> > whether
> > > > you
> > > > > > > > wanted
> > > > > > > > > > to use
> > > > > > > > > > > >> > > this as
> > > > > > > > > > > >> > > > >> > > part of a KayValue store, which might be
> > a
> > > > > > > > challenge, or
> > > > > > > > > > > >> whether
> > > > > > > > > > > >> > > you
> > > > > > > > > > > >> > > > >> > > wanted to use it for more efficient
> > scans in
> > > > a
> > > > > > > > > > > >> WindowedStore, like
> > > > > > > > > > > >> > > > >> > > Guozhang.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > Thanks again for the KIP! I hope my
> > response
> > > > > > isn't
> > > > > > > > too
> > > > > > > > > > > >> > > discouraging;
> > > > > > > > > > > >> > > > >> > > I just wanted to convey the challenges we
> > > > faced
> > > > > > last
> > > > > > > > > > time,
> > > > > > > > > > > >> since
> > > > > > > > > > > >> > > they
> > > > > > > > > > > >> > > > >> > > are all not obvious up front.
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > Best regards,
> > > > > > > > > > > >> > > > >> > > -John
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > >
> > > > > > > > > > > >> > > > >> > > On Thu, May 14, 2020, at 16:17, Sophie
> > > > > > Blee-Goldman
> > > > > > > > > > wrote:
> > > > > > > > > > > >> > > > >> > > > Whoops, I guess I didn't finish reading
> > > > the KIP
> > > > > > > > all the
> > > > > > > > > > > >> way to
> > > > > > > > > > > >> > > the
> > > > > > > > > > > >> > > > >> end
> > > > > > > > > > > >> > > > >> > > > earlier. Thanks
> > > > > > > > > > > >> > > > >> > > > for including the link to the RocksDB
> > PR
> > > > in the
> > > > > > > > KIP!
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > I have one additional question about
> > the
> > > > > > proposal:
> > > > > > > > do
> > > > > > > > > > you
> > > > > > > > > > > >> plan
> > > > > > > > > > > >> > > to
> > > > > > > > > > > >> > > > >> also
> > > > > > > > > > > >> > > > >> > > add
> > > > > > > > > > > >> > > > >> > > > this
> > > > > > > > > > > >> > > > >> > > > prefix seek API to the dual column
> > family
> > > > > > > > iterators?
> > > > > > > > > > These
> > > > > > > > > > > >> are
> > > > > > > > > > > >> > > used
> > > > > > > > > > > >> > > > >> by
> > > > > > > > > > > >> > > > >> > > > RocksDBTimestampedStore (which extends
> > > > > > > > RocksDBStore),
> > > > > > > > > > for
> > > > > > > > > > > >> > > example
> > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > >> > > > >> > > > *RocksDBDualCFRangeIterator*
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > Thanks for the KIP!
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > On Thu, May 14, 2020 at 10:50 AM Sagar
> > <
> > > > > > > > > > > >> > > sagarmeansoc...@gmail.com>
> > > > > > > > > > > >> > > > >> > > wrote:
> > > > > > > > > > > >> > > > >> > > >
> > > > > > > > > > > >> > > > >> > > > > Hey @Adam,
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > Thanks for sharing your experience
> > with
> > > > using
> > > > > > > > prefix
> > > > > > > > > > > >> seek. I
> > > > > > > > > > > >> > > did
> > > > > > > > > > > >> > > > >> look
> > > > > > > > > > > >> > > > >> > > at
> > > > > > > > > > > >> > > > >> > > > > your code for RocksDBPrefixIterator,
> > > > infact I
> > > > > > > > have
> > > > > > > > > > > >> repurposed
> > > > > > > > > > > >> > > that
> > > > > > > > > > > >> > > > >> > > class
> > > > > > > > > > > >> > > > >> > > > > itself since it wasn't being used
> > else
> > > > where.
> > > > > > > > > > Regarding
> > > > > > > > > > > >> how I
> > > > > > > > > > > >> > > > >> plan to
> > > > > > > > > > > >> > > > >> > > > > expose them through-out the state
> > stores,
> > > > > > what I
> > > > > > > > have
> > > > > > > > > > > >> tried
> > > > > > > > > > > >> > > to do
> > > > > > > > > > > >> > > > >> is
> > > > > > > > > > > >> > > > >> > > add it
> > > > > > > > > > > >> > > > >> > > > > as a separate interface. So,
> > basically,
> > > > it is
> > > > > > > > not at
> > > > > > > > > > the
> > > > > > > > > > > >> same
> > > > > > > > > > > >> > > > >> level as
> > > > > > > > > > > >> > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > > *range function so to speak. The
> > reason
> > > > I did
> > > > > > > > that is
> > > > > > > > > > > >> > > currently I
> > > > > > > > > > > >> > > > >> feel
> > > > > > > > > > > >> > > > >> > > not
> > > > > > > > > > > >> > > > >> > > > > all state stores are a natural fit
> > for
> > > > prefix
> > > > > > > > seek.
> > > > > > > > > > As I
> > > > > > > > > > > >> > > > >> mentioned in
> > > > > > > > > > > >> > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > > KIP as well, the current equivalent
> > to it
> > > > > > could
> > > > > > > > be
> > > > > > > > > > > >> > > > >> > > BulkLoadingStore(not in
> > > > > > > > > > > >> > > > >> > > > > terms of functionality but in terms
> > of
> > > > how
> > > > > > it is
> > > > > > > > > > also not
> > > > > > > > > > > >> > > > >> implemented
> > > > > > > > > > > >> > > > >> > > by
> > > > > > > > > > > >> > > > >> > > > > all of them). So, that ways I am not
> > > > needing
> > > > > > to
> > > > > > > > stub
> > > > > > > > > > them
> > > > > > > > > > > >> > > across
> > > > > > > > > > > >> > > > >> all
> > > > > > > > > > > >> > > > >> > > the
> > > > > > > > > > > >> > > > >> > > > > state-stores and we can implement it
> > only
> > > > > > where
> > > > > > > > > > needed.
> > > > > > > > > > > >> For
> > > > > > > > > > > >> > > > >> example,
> > > > > > > > > > > >> > > > >> > > in the
> > > > > > > > > > > >> > > > >> > > > > PR that I have put for reference in
> > the
> > > > KIP,
> > > > > > you
> > > > > > > > can
> > > > > > > > > > see
> > > > > > > > > > > >> that
> > > > > > > > > > > >> > > I
> > > > > > > > > > > >> > > > >> have it
> > > > > > > > > > > >> > > > >> > > > > implemented only for RocksDB.
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > @Guozhang,
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > Thanks for the feedback. Those are
> > very
> > > > > > > > interesting
> > > > > > > > > > > >> questions
> > > > > > > > > > > >> > > and
> > > > > > > > > > > >> > > > >> I
> > > > > > > > > > > >> > > > >> > > will
> > > > > > > > > > > >> > > > >> > > > > try my best to answer based upon
> > whatever
> > > > > > limited
> > > > > > > > > > > >> > > understanding I
> > > > > > > > > > > >> > > > >> have
> > > > > > > > > > > >> > > > >> > > > > developed so far :)
> > > > > > > > > > > >> > > > >> > > > >
> > > > > > > > > > > >> > > > >> > > > > 1) Regarding the usage of
> > > > > > > > > > useFixedLengthPrefixExtractor,
> > > > > > > > > > > >> > > > >> honestly, I
> > > > > > > > > > > >> > > > >> > > hadn't
> > > > > > > > > > > >> > > > >> > > > > looked at that config. I did look it
> > up
> > > > > > after you
> > > > > > > > > > > >> pointed it
> > > > > > > > > > > >> > > out
> > > > > > > > > > > >> > > > >> and
> > > > > > > > > > > >> > > > >> > > seems
> > > > > > > > > > > >> > > > >> > > > > it's more for hash-based memtables? I
> > > > may be
> > > > > > > > wrong
> > > > > > > > > > > >> though. But
> > > > > > > > > > > >> > > > >> what I
> > > > > > > > > > > >> > > > >> > > would
> > > > > > > > > > > >> > > > >> > > > > say is that, the changes I had made
> > were
> > > > not
> > > > > > > > exactly
> > > > > > > > > > > >> from a
> > > > > > > > > > > >> > > > >> correc
>

Reply via email to