Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-21 Thread Matthias J. Sax

Thanks! SGTM.

Seems all open questions are resolved. Thanks for pushing this through!

-Matthias

On 11/21/23 2:29 AM, Alieh Saeedi wrote:

Yes Matthias,
Based on the discussion we had, it has now been changed to Optional and the
default is empty (for the latest). Also, the `validTo()` method returns an
Optional.

Cheers,
Alieh

On Mon, Nov 20, 2023 at 7:38 PM Matthias J. Sax  wrote:


I think we should also discuss a little more about `validTo()` method?

Given that "latest" version does not have a valid-to TS, should we
change the return type to `Optional` and return `empty()` for "latest"?

ATM the KIP uses `MAX_VALUE` for "latest" what seems to be less clean?
We could also use `-1` (unknown), but both might be less expressive than
`Optional`?


-Matthias

On 11/20/23 1:59 AM, Bruno Cadonna wrote:

Hi Alieh,

Although, I've already voted, I found a minor miss. You should also add
a method isDescending() since the results could also be unordered now
that we agreed that the results are unordered by default. If both --
isDescending() and isAscending -- are false neither
withDescendingTimestamps() nor withAscendingTimestamps() was called.

Best,
Bruno

On 11/17/23 11:25 AM, Alieh Saeedi wrote:

Hi all,
Thank you for the feedback.

So we agreed on no default ordering for keys and TSs. So I must provide
both withAscendingXx() and with DescendingXx() for the class.
Apart from that, I think we can either remove the existing constructor
for
the `VersionedRecord` class or follow the `Optional` thing.

Since many hidden aspects of the KIP are quite clear now and we have

come

to a consensus about them, I think it 's time to vote ;-)
I look forward to your votes. Thanks a lot.

Cheers,
Alieh

On Fri, Nov 17, 2023 at 2:27 AM Matthias J. Sax 

wrote:



Thanks, Alieh.

Overall SGTM. About `validTo` -- wondering if we should make it an
`Optional` and set to `empty()` by default?

I am totally ok with going with the 3-way option about ordering using
default "undefined". For this KIP (as it's all net new) nothing really
changes. -- However, we should amend `RangeQuery`/KIP-985 to align it.

Btw: so far we focused on key-ordering, but I believe the same

"ordering

undefined by default" would apply to time-ordering, too? This might
affect KIP-997, too.


-Matthias

On 11/16/23 12:51 AM, Bruno Cadonna wrote:

Hi,

80)
We do not keep backwards compatibility with IQv1, right? I would even
say that currently we do not need to keep backwards compatibility

among

IQv2 versions since we marked the API "Evolving" (do we only mean code
compatibility here or also behavioral compatibility?). I propose to

try

to not limit ourselves for backwards compatibility that we explicitly
marked as evolving.
I re-read the discussion on KIP-985. In that discussion, we were quite
focused on what the state store provides. I see that for range

queries,

we have methods on the state store interface that specify the order,
but
that should be kind of orthogonal to the IQv2 query type. Let's assume
somebody in the future adds a state store implementation that is not
order based. To account for use cases where the order does not matter,
this person might also add a method to the state store interface that
does not guarantee any order. However, our range query type is
specified
to guarantee order by default. So we need to add something like
withNoOrder() to the query type to allow the use cases that does not
need order and has the better performance in IQ. That does not look
very
nice to me. Having the no-order-guaranteed option does not cost us
anything and it keeps the IQv2 interface flexible. I assume we want to
drop the Evolving annotation at some point.
Sorry for not having brought this up in the discussion about KIP-985.

Best,
Bruno





On 11/15/23 6:56 AM, Matthias J. Sax wrote:

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for
single-key single-ts lookup; it seems better to return the proper
timestamp. The timestamp is already in the store and it's cheap to
extract it and add to the result, and it might be valuable

information

for the user. Not sure though if we should deprecate the existing
constructor though, because for "latest" it's convenient to have?


60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it
Bruno). It's kinda tricky.

Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces
provide a clear contract that `range()` is ascending and
`reverseRange()` is descending.

For `RangeQuery`, the question is, if we did implicitly inherit this
contract? Our conclusion on KIP-985 discussion was, that we did
inherit it. If this holds true, changing the contract would be a
breaking change (what might still be acceptable, given that the
interface is annotated as unstable, and that IQv2 is not widely
adopted yet). I am happy to go with the 3-option contract, but just
want to ensure we all agree it's the r

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-21 Thread Alieh Saeedi
Yes Matthias,
Based on the discussion we had, it has now been changed to Optional and the
default is empty (for the latest). Also, the `validTo()` method returns an
Optional.

Cheers,
Alieh

On Mon, Nov 20, 2023 at 7:38 PM Matthias J. Sax  wrote:

> I think we should also discuss a little more about `validTo()` method?
>
> Given that "latest" version does not have a valid-to TS, should we
> change the return type to `Optional` and return `empty()` for "latest"?
>
> ATM the KIP uses `MAX_VALUE` for "latest" what seems to be less clean?
> We could also use `-1` (unknown), but both might be less expressive than
> `Optional`?
>
>
> -Matthias
>
> On 11/20/23 1:59 AM, Bruno Cadonna wrote:
> > Hi Alieh,
> >
> > Although, I've already voted, I found a minor miss. You should also add
> > a method isDescending() since the results could also be unordered now
> > that we agreed that the results are unordered by default. If both --
> > isDescending() and isAscending -- are false neither
> > withDescendingTimestamps() nor withAscendingTimestamps() was called.
> >
> > Best,
> > Bruno
> >
> > On 11/17/23 11:25 AM, Alieh Saeedi wrote:
> >> Hi all,
> >> Thank you for the feedback.
> >>
> >> So we agreed on no default ordering for keys and TSs. So I must provide
> >> both withAscendingXx() and with DescendingXx() for the class.
> >> Apart from that, I think we can either remove the existing constructor
> >> for
> >> the `VersionedRecord` class or follow the `Optional` thing.
> >>
> >> Since many hidden aspects of the KIP are quite clear now and we have
> come
> >> to a consensus about them, I think it 's time to vote ;-)
> >> I look forward to your votes. Thanks a lot.
> >>
> >> Cheers,
> >> Alieh
> >>
> >> On Fri, Nov 17, 2023 at 2:27 AM Matthias J. Sax 
> wrote:
> >>
> >>> Thanks, Alieh.
> >>>
> >>> Overall SGTM. About `validTo` -- wondering if we should make it an
> >>> `Optional` and set to `empty()` by default?
> >>>
> >>> I am totally ok with going with the 3-way option about ordering using
> >>> default "undefined". For this KIP (as it's all net new) nothing really
> >>> changes. -- However, we should amend `RangeQuery`/KIP-985 to align it.
> >>>
> >>> Btw: so far we focused on key-ordering, but I believe the same
> "ordering
> >>> undefined by default" would apply to time-ordering, too? This might
> >>> affect KIP-997, too.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 11/16/23 12:51 AM, Bruno Cadonna wrote:
>  Hi,
> 
>  80)
>  We do not keep backwards compatibility with IQv1, right? I would even
>  say that currently we do not need to keep backwards compatibility
> among
>  IQv2 versions since we marked the API "Evolving" (do we only mean code
>  compatibility here or also behavioral compatibility?). I propose to
> try
>  to not limit ourselves for backwards compatibility that we explicitly
>  marked as evolving.
>  I re-read the discussion on KIP-985. In that discussion, we were quite
>  focused on what the state store provides. I see that for range
> queries,
>  we have methods on the state store interface that specify the order,
>  but
>  that should be kind of orthogonal to the IQv2 query type. Let's assume
>  somebody in the future adds a state store implementation that is not
>  order based. To account for use cases where the order does not matter,
>  this person might also add a method to the state store interface that
>  does not guarantee any order. However, our range query type is
>  specified
>  to guarantee order by default. So we need to add something like
>  withNoOrder() to the query type to allow the use cases that does not
>  need order and has the better performance in IQ. That does not look
>  very
>  nice to me. Having the no-order-guaranteed option does not cost us
>  anything and it keeps the IQv2 interface flexible. I assume we want to
>  drop the Evolving annotation at some point.
>  Sorry for not having brought this up in the discussion about KIP-985.
> 
>  Best,
>  Bruno
> 
> 
> 
> 
> 
>  On 11/15/23 6:56 AM, Matthias J. Sax wrote:
> > Just catching up on this one.
> >
> >
> > 50) I am also in favor of setting `validTo` in VersionedRecord for
> > single-key single-ts lookup; it seems better to return the proper
> > timestamp. The timestamp is already in the store and it's cheap to
> > extract it and add to the result, and it might be valuable
> information
> > for the user. Not sure though if we should deprecate the existing
> > constructor though, because for "latest" it's convenient to have?
> >
> >
> > 60) Yes, I meant `VersionedRecord`. Sorry for the mixup.
> >
> >
> > 80) We did discuss this question on KIP-985 (maybe you missed it
> > Bruno). It's kinda tricky.
> >
> > Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces
> > provide a clear contract that `ra

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-20 Thread Matthias J. Sax

I think we should also discuss a little more about `validTo()` method?

Given that "latest" version does not have a valid-to TS, should we 
change the return type to `Optional` and return `empty()` for "latest"?


ATM the KIP uses `MAX_VALUE` for "latest" what seems to be less clean? 
We could also use `-1` (unknown), but both might be less expressive than 
`Optional`?



-Matthias

On 11/20/23 1:59 AM, Bruno Cadonna wrote:

Hi Alieh,

Although, I've already voted, I found a minor miss. You should also add 
a method isDescending() since the results could also be unordered now 
that we agreed that the results are unordered by default. If both -- 
isDescending() and isAscending -- are false neither 
withDescendingTimestamps() nor withAscendingTimestamps() was called.


Best,
Bruno

On 11/17/23 11:25 AM, Alieh Saeedi wrote:

Hi all,
Thank you for the feedback.

So we agreed on no default ordering for keys and TSs. So I must provide
both withAscendingXx() and with DescendingXx() for the class.
Apart from that, I think we can either remove the existing constructor 
for

the `VersionedRecord` class or follow the `Optional` thing.

Since many hidden aspects of the KIP are quite clear now and we have come
to a consensus about them, I think it 's time to vote ;-)
I look forward to your votes. Thanks a lot.

Cheers,
Alieh

On Fri, Nov 17, 2023 at 2:27 AM Matthias J. Sax  wrote:


Thanks, Alieh.

Overall SGTM. About `validTo` -- wondering if we should make it an
`Optional` and set to `empty()` by default?

I am totally ok with going with the 3-way option about ordering using
default "undefined". For this KIP (as it's all net new) nothing really
changes. -- However, we should amend `RangeQuery`/KIP-985 to align it.

Btw: so far we focused on key-ordering, but I believe the same "ordering
undefined by default" would apply to time-ordering, too? This might
affect KIP-997, too.


-Matthias

On 11/16/23 12:51 AM, Bruno Cadonna wrote:

Hi,

80)
We do not keep backwards compatibility with IQv1, right? I would even
say that currently we do not need to keep backwards compatibility among
IQv2 versions since we marked the API "Evolving" (do we only mean code
compatibility here or also behavioral compatibility?). I propose to try
to not limit ourselves for backwards compatibility that we explicitly
marked as evolving.
I re-read the discussion on KIP-985. In that discussion, we were quite
focused on what the state store provides. I see that for range queries,
we have methods on the state store interface that specify the order, 
but

that should be kind of orthogonal to the IQv2 query type. Let's assume
somebody in the future adds a state store implementation that is not
order based. To account for use cases where the order does not matter,
this person might also add a method to the state store interface that
does not guarantee any order. However, our range query type is 
specified

to guarantee order by default. So we need to add something like
withNoOrder() to the query type to allow the use cases that does not
need order and has the better performance in IQ. That does not look 
very

nice to me. Having the no-order-guaranteed option does not cost us
anything and it keeps the IQv2 interface flexible. I assume we want to
drop the Evolving annotation at some point.
Sorry for not having brought this up in the discussion about KIP-985.

Best,
Bruno





On 11/15/23 6:56 AM, Matthias J. Sax wrote:

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for
single-key single-ts lookup; it seems better to return the proper
timestamp. The timestamp is already in the store and it's cheap to
extract it and add to the result, and it might be valuable information
for the user. Not sure though if we should deprecate the existing
constructor though, because for "latest" it's convenient to have?


60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it
Bruno). It's kinda tricky.

Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces
provide a clear contract that `range()` is ascending and
`reverseRange()` is descending.

For `RangeQuery`, the question is, if we did implicitly inherit this
contract? Our conclusion on KIP-985 discussion was, that we did
inherit it. If this holds true, changing the contract would be a
breaking change (what might still be acceptable, given that the
interface is annotated as unstable, and that IQv2 is not widely
adopted yet). I am happy to go with the 3-option contract, but just
want to ensure we all agree it's the right way to go, and we are
potentially willing to pay the price of backward incompatibility.




Do we need a snapshot semantic or can we specify a weaker but still
useful semantic?


I don't think we necessarily need it, but as pointed out by Lucas, all
existing queries provide it. Overall, my main point is really about
not implementing something "random", but defining a proper bind

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-20 Thread Bruno Cadonna

Hi Alieh,

Although, I've already voted, I found a minor miss. You should also add 
a method isDescending() since the results could also be unordered now 
that we agreed that the results are unordered by default. If both -- 
isDescending() and isAscending -- are false neither 
withDescendingTimestamps() nor withAscendingTimestamps() was called.


Best,
Bruno

On 11/17/23 11:25 AM, Alieh Saeedi wrote:

Hi all,
Thank you for the feedback.

So we agreed on no default ordering for keys and TSs. So I must provide
both withAscendingXx() and with DescendingXx() for the class.
Apart from that, I think we can either remove the existing constructor for
the `VersionedRecord` class or follow the `Optional` thing.

Since many hidden aspects of the KIP are quite clear now and we have come
to a consensus about them, I think it 's time to vote ;-)
I look forward to your votes. Thanks a lot.

Cheers,
Alieh

On Fri, Nov 17, 2023 at 2:27 AM Matthias J. Sax  wrote:


Thanks, Alieh.

Overall SGTM. About `validTo` -- wondering if we should make it an
`Optional` and set to `empty()` by default?

I am totally ok with going with the 3-way option about ordering using
default "undefined". For this KIP (as it's all net new) nothing really
changes. -- However, we should amend `RangeQuery`/KIP-985 to align it.

Btw: so far we focused on key-ordering, but I believe the same "ordering
undefined by default" would apply to time-ordering, too? This might
affect KIP-997, too.


-Matthias

On 11/16/23 12:51 AM, Bruno Cadonna wrote:

Hi,

80)
We do not keep backwards compatibility with IQv1, right? I would even
say that currently we do not need to keep backwards compatibility among
IQv2 versions since we marked the API "Evolving" (do we only mean code
compatibility here or also behavioral compatibility?). I propose to try
to not limit ourselves for backwards compatibility that we explicitly
marked as evolving.
I re-read the discussion on KIP-985. In that discussion, we were quite
focused on what the state store provides. I see that for range queries,
we have methods on the state store interface that specify the order, but
that should be kind of orthogonal to the IQv2 query type. Let's assume
somebody in the future adds a state store implementation that is not
order based. To account for use cases where the order does not matter,
this person might also add a method to the state store interface that
does not guarantee any order. However, our range query type is specified
to guarantee order by default. So we need to add something like
withNoOrder() to the query type to allow the use cases that does not
need order and has the better performance in IQ. That does not look very
nice to me. Having the no-order-guaranteed option does not cost us
anything and it keeps the IQv2 interface flexible. I assume we want to
drop the Evolving annotation at some point.
Sorry for not having brought this up in the discussion about KIP-985.

Best,
Bruno





On 11/15/23 6:56 AM, Matthias J. Sax wrote:

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for
single-key single-ts lookup; it seems better to return the proper
timestamp. The timestamp is already in the store and it's cheap to
extract it and add to the result, and it might be valuable information
for the user. Not sure though if we should deprecate the existing
constructor though, because for "latest" it's convenient to have?


60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it
Bruno). It's kinda tricky.

Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces
provide a clear contract that `range()` is ascending and
`reverseRange()` is descending.

For `RangeQuery`, the question is, if we did implicitly inherit this
contract? Our conclusion on KIP-985 discussion was, that we did
inherit it. If this holds true, changing the contract would be a
breaking change (what might still be acceptable, given that the
interface is annotated as unstable, and that IQv2 is not widely
adopted yet). I am happy to go with the 3-option contract, but just
want to ensure we all agree it's the right way to go, and we are
potentially willing to pay the price of backward incompatibility.




Do we need a snapshot semantic or can we specify a weaker but still
useful semantic?


I don't think we necessarily need it, but as pointed out by Lucas, all
existing queries provide it. Overall, my main point is really about
not implementing something "random", but defining a proper binding
contract that allows users to reason about it.

I general, I agree that weaker semantics might be sufficient, but I am
not sure if we can implement anything weaker in a reasonable way?
Happy to be convinced otherwise. (I have some example, that I will
omit for now, as I hope we can actually go with snapshot semantics.)

The RocksDB Snaptshot idea from Lucas sounds very promising. I was not
aware that we could do this with RocksDB

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-17 Thread Alieh Saeedi
Hi all,
Thank you for the feedback.

So we agreed on no default ordering for keys and TSs. So I must provide
both withAscendingXx() and with DescendingXx() for the class.
Apart from that, I think we can either remove the existing constructor for
the `VersionedRecord` class or follow the `Optional` thing.

Since many hidden aspects of the KIP are quite clear now and we have come
to a consensus about them, I think it 's time to vote ;-)
I look forward to your votes. Thanks a lot.

Cheers,
Alieh

On Fri, Nov 17, 2023 at 2:27 AM Matthias J. Sax  wrote:

> Thanks, Alieh.
>
> Overall SGTM. About `validTo` -- wondering if we should make it an
> `Optional` and set to `empty()` by default?
>
> I am totally ok with going with the 3-way option about ordering using
> default "undefined". For this KIP (as it's all net new) nothing really
> changes. -- However, we should amend `RangeQuery`/KIP-985 to align it.
>
> Btw: so far we focused on key-ordering, but I believe the same "ordering
> undefined by default" would apply to time-ordering, too? This might
> affect KIP-997, too.
>
>
> -Matthias
>
> On 11/16/23 12:51 AM, Bruno Cadonna wrote:
> > Hi,
> >
> > 80)
> > We do not keep backwards compatibility with IQv1, right? I would even
> > say that currently we do not need to keep backwards compatibility among
> > IQv2 versions since we marked the API "Evolving" (do we only mean code
> > compatibility here or also behavioral compatibility?). I propose to try
> > to not limit ourselves for backwards compatibility that we explicitly
> > marked as evolving.
> > I re-read the discussion on KIP-985. In that discussion, we were quite
> > focused on what the state store provides. I see that for range queries,
> > we have methods on the state store interface that specify the order, but
> > that should be kind of orthogonal to the IQv2 query type. Let's assume
> > somebody in the future adds a state store implementation that is not
> > order based. To account for use cases where the order does not matter,
> > this person might also add a method to the state store interface that
> > does not guarantee any order. However, our range query type is specified
> > to guarantee order by default. So we need to add something like
> > withNoOrder() to the query type to allow the use cases that does not
> > need order and has the better performance in IQ. That does not look very
> > nice to me. Having the no-order-guaranteed option does not cost us
> > anything and it keeps the IQv2 interface flexible. I assume we want to
> > drop the Evolving annotation at some point.
> > Sorry for not having brought this up in the discussion about KIP-985.
> >
> > Best,
> > Bruno
> >
> >
> >
> >
> >
> > On 11/15/23 6:56 AM, Matthias J. Sax wrote:
> >> Just catching up on this one.
> >>
> >>
> >> 50) I am also in favor of setting `validTo` in VersionedRecord for
> >> single-key single-ts lookup; it seems better to return the proper
> >> timestamp. The timestamp is already in the store and it's cheap to
> >> extract it and add to the result, and it might be valuable information
> >> for the user. Not sure though if we should deprecate the existing
> >> constructor though, because for "latest" it's convenient to have?
> >>
> >>
> >> 60) Yes, I meant `VersionedRecord`. Sorry for the mixup.
> >>
> >>
> >> 80) We did discuss this question on KIP-985 (maybe you missed it
> >> Bruno). It's kinda tricky.
> >>
> >> Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces
> >> provide a clear contract that `range()` is ascending and
> >> `reverseRange()` is descending.
> >>
> >> For `RangeQuery`, the question is, if we did implicitly inherit this
> >> contract? Our conclusion on KIP-985 discussion was, that we did
> >> inherit it. If this holds true, changing the contract would be a
> >> breaking change (what might still be acceptable, given that the
> >> interface is annotated as unstable, and that IQv2 is not widely
> >> adopted yet). I am happy to go with the 3-option contract, but just
> >> want to ensure we all agree it's the right way to go, and we are
> >> potentially willing to pay the price of backward incompatibility.
> >>
> >>
> >>
> >>> Do we need a snapshot semantic or can we specify a weaker but still
> >>> useful semantic?
> >>
> >> I don't think we necessarily need it, but as pointed out by Lucas, all
> >> existing queries provide it. Overall, my main point is really about
> >> not implementing something "random", but defining a proper binding
> >> contract that allows users to reason about it.
> >>
> >> I general, I agree that weaker semantics might be sufficient, but I am
> >> not sure if we can implement anything weaker in a reasonable way?
> >> Happy to be convinced otherwise. (I have some example, that I will
> >> omit for now, as I hope we can actually go with snapshot semantics.)
> >>
> >> The RocksDB Snaptshot idea from Lucas sounds very promising. I was not
> >> aware that we could do this with RocksDB (otherwise I might have
> >> sugge

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-16 Thread Matthias J. Sax

Thanks, Alieh.

Overall SGTM. About `validTo` -- wondering if we should make it an 
`Optional` and set to `empty()` by default?


I am totally ok with going with the 3-way option about ordering using 
default "undefined". For this KIP (as it's all net new) nothing really 
changes. -- However, we should amend `RangeQuery`/KIP-985 to align it.


Btw: so far we focused on key-ordering, but I believe the same "ordering 
undefined by default" would apply to time-ordering, too? This might 
affect KIP-997, too.



-Matthias

On 11/16/23 12:51 AM, Bruno Cadonna wrote:

Hi,

80)
We do not keep backwards compatibility with IQv1, right? I would even 
say that currently we do not need to keep backwards compatibility among 
IQv2 versions since we marked the API "Evolving" (do we only mean code 
compatibility here or also behavioral compatibility?). I propose to try 
to not limit ourselves for backwards compatibility that we explicitly 
marked as evolving.
I re-read the discussion on KIP-985. In that discussion, we were quite 
focused on what the state store provides. I see that for range queries, 
we have methods on the state store interface that specify the order, but 
that should be kind of orthogonal to the IQv2 query type. Let's assume 
somebody in the future adds a state store implementation that is not 
order based. To account for use cases where the order does not matter, 
this person might also add a method to the state store interface that 
does not guarantee any order. However, our range query type is specified 
to guarantee order by default. So we need to add something like 
withNoOrder() to the query type to allow the use cases that does not 
need order and has the better performance in IQ. That does not look very 
nice to me. Having the no-order-guaranteed option does not cost us 
anything and it keeps the IQv2 interface flexible. I assume we want to 
drop the Evolving annotation at some point.

Sorry for not having brought this up in the discussion about KIP-985.

Best,
Bruno





On 11/15/23 6:56 AM, Matthias J. Sax wrote:

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for 
single-key single-ts lookup; it seems better to return the proper 
timestamp. The timestamp is already in the store and it's cheap to 
extract it and add to the result, and it might be valuable information 
for the user. Not sure though if we should deprecate the existing 
constructor though, because for "latest" it's convenient to have?



60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it 
Bruno). It's kinda tricky.


Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces 
provide a clear contract that `range()` is ascending and 
`reverseRange()` is descending.


For `RangeQuery`, the question is, if we did implicitly inherit this 
contract? Our conclusion on KIP-985 discussion was, that we did 
inherit it. If this holds true, changing the contract would be a 
breaking change (what might still be acceptable, given that the 
interface is annotated as unstable, and that IQv2 is not widely 
adopted yet). I am happy to go with the 3-option contract, but just 
want to ensure we all agree it's the right way to go, and we are 
potentially willing to pay the price of backward incompatibility.




Do we need a snapshot semantic or can we specify a weaker but still 
useful semantic? 


I don't think we necessarily need it, but as pointed out by Lucas, all 
existing queries provide it. Overall, my main point is really about 
not implementing something "random", but defining a proper binding 
contract that allows users to reason about it.


I general, I agree that weaker semantics might be sufficient, but I am 
not sure if we can implement anything weaker in a reasonable way? 
Happy to be convinced otherwise. (I have some example, that I will 
omit for now, as I hope we can actually go with snapshot semantics.)


The RocksDB Snaptshot idea from Lucas sounds very promising. I was not 
aware that we could do this with RocksDB (otherwise I might have 
suggested it on the PR right away). I guess the only open question 
remaining would be, if we can provide the same guarantees for a future 
in-memory implementation for VersionedStores? It sounds possible to 
do, but we should have some level of confidence about it?



-Matthias

On 11/14/23 6:33 AM, Lucas Brutschy wrote:

Hi Alieh,

I agree with Bruno that a weaker guarantee could be useful already,
and it's anyway possible to strengthen the guarantees later on. On the
other hand, it would be nice to provide a consistent view across all
segments if it doesn't come with major downsides, because until now IQ
does provide a consistent view (via iterators), and this would be the
first feature that diverges from that guarantee.

I think a consistent could be achieved relatively easily by creating a
snapshot (https://github.com/facebook/rocksdb/wiki/Snapshot) that will
ens

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-16 Thread Alieh Saeedi
Hi all,
Thank you for the feedback and the interesting solutions you suggested.

The KIP and the corresponding PR are updated as follows:

   - Snapshot semantics is implemented in the `get(key, fromTime, toTime)`
   method to guarantee consistency.
   - The `ValueIterator` interface is replaced with
   `VersionedRecordIterator implements Iterator>`.
   Consequently, the `MultiVersionedKeyQuery` class is modified to `
   MultiVersionedKeyQuery implements Query>`.
   - The former `get(key,ts)` method is updated so that it returns records
   with their `validTo` timestamps as well. I still kept the old constructor
   of the `VersionedRecord` class with only two parameters. The `validTo`
   field is initialized by `MAX` as default, but I think the default value for
   the `validTo` field must be `*PUT_RETURN_CODE_VALID_TO_UNDEFINED*` which
   is defined in the `VersionedKeyValueStore` interface and is actually
   `-1`. Am I right?


Open question:

   - It seems like we do not have a conclusion about the default ordering
   issue.


Cheers,
Alieh

On Thu, Nov 16, 2023 at 9:56 AM Bruno Cadonna  wrote:

> Hi,
>
> 80)
> We do not keep backwards compatibility with IQv1, right? I would even
> say that currently we do not need to keep backwards compatibility among
> IQv2 versions since we marked the API "Evolving" (do we only mean code
> compatibility here or also behavioral compatibility?). I propose to try
> to not limit ourselves for backwards compatibility that we explicitly
> marked as evolving.
> I re-read the discussion on KIP-985. In that discussion, we were quite
> focused on what the state store provides. I see that for range queries,
> we have methods on the state store interface that specify the order, but
> that should be kind of orthogonal to the IQv2 query type. Let's assume
> somebody in the future adds a state store implementation that is not
> order based. To account for use cases where the order does not matter,
> this person might also add a method to the state store interface that
> does not guarantee any order. However, our range query type is specified
> to guarantee order by default. So we need to add something like
> withNoOrder() to the query type to allow the use cases that does not
> need order and has the better performance in IQ. That does not look very
> nice to me. Having the no-order-guaranteed option does not cost us
> anything and it keeps the IQv2 interface flexible. I assume we want to
> drop the Evolving annotation at some point.
> Sorry for not having brought this up in the discussion about KIP-985.
>
> Best,
> Bruno
>
>
>
>
>
> On 11/15/23 6:56 AM, Matthias J. Sax wrote:
> > Just catching up on this one.
> >
> >
> > 50) I am also in favor of setting `validTo` in VersionedRecord for
> > single-key single-ts lookup; it seems better to return the proper
> > timestamp. The timestamp is already in the store and it's cheap to
> > extract it and add to the result, and it might be valuable information
> > for the user. Not sure though if we should deprecate the existing
> > constructor though, because for "latest" it's convenient to have?
> >
> >
> > 60) Yes, I meant `VersionedRecord`. Sorry for the mixup.
> >
> >
> > 80) We did discuss this question on KIP-985 (maybe you missed it Bruno).
> > It's kinda tricky.
> >
> > Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces
> > provide a clear contract that `range()` is ascending and
> > `reverseRange()` is descending.
> >
> > For `RangeQuery`, the question is, if we did implicitly inherit this
> > contract? Our conclusion on KIP-985 discussion was, that we did inherit
> > it. If this holds true, changing the contract would be a breaking change
> > (what might still be acceptable, given that the interface is annotated
> > as unstable, and that IQv2 is not widely adopted yet). I am happy to go
> > with the 3-option contract, but just want to ensure we all agree it's
> > the right way to go, and we are potentially willing to pay the price of
> > backward incompatibility.
> >
> >
> >
> >> Do we need a snapshot semantic or can we specify a weaker but still
> >> useful semantic?
> >
> > I don't think we necessarily need it, but as pointed out by Lucas, all
> > existing queries provide it. Overall, my main point is really about not
> > implementing something "random", but defining a proper binding contract
> > that allows users to reason about it.
> >
> > I general, I agree that weaker semantics might be sufficient, but I am
> > not sure if we can implement anything weaker in a reasonable way? Happy
> > to be convinced otherwise. (I have some example, that I will omit for
> > now, as I hope we can actually go with snapshot semantics.)
> >
> > The RocksDB Snaptshot idea from Lucas sounds very promising. I was not
> > aware that we could do this with RocksDB (otherwise I might have
> > suggested it on the PR right away). I guess the only open question
> > remaining would be, if we can provide the same guarantees for a future
> > 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-16 Thread Bruno Cadonna

Hi,

80)
We do not keep backwards compatibility with IQv1, right? I would even 
say that currently we do not need to keep backwards compatibility among 
IQv2 versions since we marked the API "Evolving" (do we only mean code 
compatibility here or also behavioral compatibility?). I propose to try 
to not limit ourselves for backwards compatibility that we explicitly 
marked as evolving.
I re-read the discussion on KIP-985. In that discussion, we were quite 
focused on what the state store provides. I see that for range queries, 
we have methods on the state store interface that specify the order, but 
that should be kind of orthogonal to the IQv2 query type. Let's assume 
somebody in the future adds a state store implementation that is not 
order based. To account for use cases where the order does not matter, 
this person might also add a method to the state store interface that 
does not guarantee any order. However, our range query type is specified 
to guarantee order by default. So we need to add something like 
withNoOrder() to the query type to allow the use cases that does not 
need order and has the better performance in IQ. That does not look very 
nice to me. Having the no-order-guaranteed option does not cost us 
anything and it keeps the IQv2 interface flexible. I assume we want to 
drop the Evolving annotation at some point.

Sorry for not having brought this up in the discussion about KIP-985.

Best,
Bruno





On 11/15/23 6:56 AM, Matthias J. Sax wrote:

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for 
single-key single-ts lookup; it seems better to return the proper 
timestamp. The timestamp is already in the store and it's cheap to 
extract it and add to the result, and it might be valuable information 
for the user. Not sure though if we should deprecate the existing 
constructor though, because for "latest" it's convenient to have?



60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it Bruno). 
It's kinda tricky.


Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces 
provide a clear contract that `range()` is ascending and 
`reverseRange()` is descending.


For `RangeQuery`, the question is, if we did implicitly inherit this 
contract? Our conclusion on KIP-985 discussion was, that we did inherit 
it. If this holds true, changing the contract would be a breaking change 
(what might still be acceptable, given that the interface is annotated 
as unstable, and that IQv2 is not widely adopted yet). I am happy to go 
with the 3-option contract, but just want to ensure we all agree it's 
the right way to go, and we are potentially willing to pay the price of 
backward incompatibility.




Do we need a snapshot semantic or can we specify a weaker but still 
useful semantic? 


I don't think we necessarily need it, but as pointed out by Lucas, all 
existing queries provide it. Overall, my main point is really about not 
implementing something "random", but defining a proper binding contract 
that allows users to reason about it.


I general, I agree that weaker semantics might be sufficient, but I am 
not sure if we can implement anything weaker in a reasonable way? Happy 
to be convinced otherwise. (I have some example, that I will omit for 
now, as I hope we can actually go with snapshot semantics.)


The RocksDB Snaptshot idea from Lucas sounds very promising. I was not 
aware that we could do this with RocksDB (otherwise I might have 
suggested it on the PR right away). I guess the only open question 
remaining would be, if we can provide the same guarantees for a future 
in-memory implementation for VersionedStores? It sounds possible to do, 
but we should have some level of confidence about it?



-Matthias

On 11/14/23 6:33 AM, Lucas Brutschy wrote:

Hi Alieh,

I agree with Bruno that a weaker guarantee could be useful already,
and it's anyway possible to strengthen the guarantees later on. On the
other hand, it would be nice to provide a consistent view across all
segments if it doesn't come with major downsides, because until now IQ
does provide a consistent view (via iterators), and this would be the
first feature that diverges from that guarantee.

I think a consistent could be achieved relatively easily by creating a
snapshot (https://github.com/facebook/rocksdb/wiki/Snapshot) that will
ensure a consistent view on a single DB and using
`ReadOptions.setSnapshot` for the gets. Since versioned state stores
segments seem to be backed by a single RocksDB instance (that was
unclear to me during our earlier discussion), a single snapshot should
be enough to guarantee a consistent view - we just need to make sure
to clean up the snapshots after use, similar to iterators. If we
instead need a consistent view across multiple RocksDB instances, we
may have to acquire a write lock on all of those (could use the
current object monitors of our `RocksDB` interfa

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-14 Thread Matthias J. Sax

Just catching up on this one.


50) I am also in favor of setting `validTo` in VersionedRecord for 
single-key single-ts lookup; it seems better to return the proper 
timestamp. The timestamp is already in the store and it's cheap to 
extract it and add to the result, and it might be valuable information 
for the user. Not sure though if we should deprecate the existing 
constructor though, because for "latest" it's convenient to have?



60) Yes, I meant `VersionedRecord`. Sorry for the mixup.


80) We did discuss this question on KIP-985 (maybe you missed it Bruno). 
It's kinda tricky.


Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces 
provide a clear contract that `range()` is ascending and 
`reverseRange()` is descending.


For `RangeQuery`, the question is, if we did implicitly inherit this 
contract? Our conclusion on KIP-985 discussion was, that we did inherit 
it. If this holds true, changing the contract would be a breaking change 
(what might still be acceptable, given that the interface is annotated 
as unstable, and that IQv2 is not widely adopted yet). I am happy to go 
with the 3-option contract, but just want to ensure we all agree it's 
the right way to go, and we are potentially willing to pay the price of 
backward incompatibility.




Do we need a snapshot semantic or can we specify a weaker but still useful semantic? 


I don't think we necessarily need it, but as pointed out by Lucas, all 
existing queries provide it. Overall, my main point is really about not 
implementing something "random", but defining a proper binding contract 
that allows users to reason about it.


I general, I agree that weaker semantics might be sufficient, but I am 
not sure if we can implement anything weaker in a reasonable way? Happy 
to be convinced otherwise. (I have some example, that I will omit for 
now, as I hope we can actually go with snapshot semantics.)


The RocksDB Snaptshot idea from Lucas sounds very promising. I was not 
aware that we could do this with RocksDB (otherwise I might have 
suggested it on the PR right away). I guess the only open question 
remaining would be, if we can provide the same guarantees for a future 
in-memory implementation for VersionedStores? It sounds possible to do, 
but we should have some level of confidence about it?



-Matthias

On 11/14/23 6:33 AM, Lucas Brutschy wrote:

Hi Alieh,

I agree with Bruno that a weaker guarantee could be useful already,
and it's anyway possible to strengthen the guarantees later on. On the
other hand, it would be nice to provide a consistent view across all
segments if it doesn't come with major downsides, because until now IQ
does provide a consistent view (via iterators), and this would be the
first feature that diverges from that guarantee.

I think a consistent could be achieved relatively easily by creating a
snapshot (https://github.com/facebook/rocksdb/wiki/Snapshot) that will
ensure a consistent view on a single DB and using
`ReadOptions.setSnapshot` for the gets. Since versioned state stores
segments seem to be backed by a single RocksDB instance (that was
unclear to me during our earlier discussion), a single snapshot should
be enough to guarantee a consistent view - we just need to make sure
to clean up the snapshots after use, similar to iterators. If we
instead need a consistent view across multiple RocksDB instances, we
may have to acquire a write lock on all of those (could use the
current object monitors of our `RocksDB` interface) for the duration
of creating snapshots across all instances - which I think would also
be permissible performance-wise. Snapshots are just a sequence number
and should be pretty lightweight to create (they have, however,
downside when it comes to compaction just like iterators).

With that in mind, I would be in favor of at least exploring the
option of using snapshots for a consistent view here, before dropping
this useful guarantee.

Cheers,
Lucas

On Tue, Nov 14, 2023 at 2:20 PM Bruno Cadonna  wrote:


Hi Alieh,

Regarding the semantics/guarantees of the query type:

Do we need a snapshot semantic or can we specify a weaker but still
useful semantic?

An option could be to guarantee that:
1. the query returns the latest version when the query arrived at the
state store
2. the query returns a valid history, i.e., versions with adjacent and
non-overlapping validity intervals.

I think guarantee 1 is quite obvious. Guarantee 2 maybe needs some
explanation. If we do not avoid writes to the state store during the
processing of interactive queries, it might for example happen that the
latest version in the state store moves to data structures that are
responsible for older versions. In our RocksDB implementation that are
the segments. Thus, it could be that during query processing Kafka
Streams reads the latest value x and encounters again x in a segment
because a processor put a newer version of x in the versioned state
store. A similar scenario might also happen to ea

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-14 Thread Lucas Brutschy
Hi Alieh,

I agree with Bruno that a weaker guarantee could be useful already,
and it's anyway possible to strengthen the guarantees later on. On the
other hand, it would be nice to provide a consistent view across all
segments if it doesn't come with major downsides, because until now IQ
does provide a consistent view (via iterators), and this would be the
first feature that diverges from that guarantee.

I think a consistent could be achieved relatively easily by creating a
snapshot (https://github.com/facebook/rocksdb/wiki/Snapshot) that will
ensure a consistent view on a single DB and using
`ReadOptions.setSnapshot` for the gets. Since versioned state stores
segments seem to be backed by a single RocksDB instance (that was
unclear to me during our earlier discussion), a single snapshot should
be enough to guarantee a consistent view - we just need to make sure
to clean up the snapshots after use, similar to iterators. If we
instead need a consistent view across multiple RocksDB instances, we
may have to acquire a write lock on all of those (could use the
current object monitors of our `RocksDB` interface) for the duration
of creating snapshots across all instances - which I think would also
be permissible performance-wise. Snapshots are just a sequence number
and should be pretty lightweight to create (they have, however,
downside when it comes to compaction just like iterators).

With that in mind, I would be in favor of at least exploring the
option of using snapshots for a consistent view here, before dropping
this useful guarantee.

Cheers,
Lucas

On Tue, Nov 14, 2023 at 2:20 PM Bruno Cadonna  wrote:
>
> Hi Alieh,
>
> Regarding the semantics/guarantees of the query type:
>
> Do we need a snapshot semantic or can we specify a weaker but still
> useful semantic?
>
> An option could be to guarantee that:
> 1. the query returns the latest version when the query arrived at the
> state store
> 2. the query returns a valid history, i.e., versions with adjacent and
> non-overlapping validity intervals.
>
> I think guarantee 1 is quite obvious. Guarantee 2 maybe needs some
> explanation. If we do not avoid writes to the state store during the
> processing of interactive queries, it might for example happen that the
> latest version in the state store moves to data structures that are
> responsible for older versions. In our RocksDB implementation that are
> the segments. Thus, it could be that during query processing Kafka
> Streams reads the latest value x and encounters again x in a segment
> because a processor put a newer version of x in the versioned state
> store. A similar scenario might also happen to earlier versions. If
> Streams does not account for such cases it could return invalid histories.
>
> Maybe such weaker guarantees are enough for most use cases.
>
> You could consider implementing weaker guarantees as I described and if
> there is demand propose stricter guarantees in a follow-up KIP.
>
> Maybe there are also other simpler guarantees that make sense.
>
> Best,
> Bruno
>
>
> On 11/9/23 12:30 PM, Bruno Cadonna wrote:
> > Hi,
> >
> > Thanks for the updates!
> >
> > First my take on previous comments:
> >
> >
> > 50)
> > I am in favor of deprecating the constructor that does not take the
> > validTo parameter. That implies that I am in favor of modifying get(key,
> > asOf) to set the correct validTo.
> >
> >
> > 60)
> > I am in favor of renaming ValueIterator to VersionedRecordIterator and
> > define it as:
> >
> > public interface VersionedRecordIterator extends
> > Iterator>
> >
> > (Matthias, you mixed up ValueAndTimestamp with VersionedRecord in your
> > last e-mail, didn't you? Just double-checking if I understood what you
> > are proposing.)
> >
> >
> > 70)
> > I agree with Matthias that adding a new method on the
> > VersionedKeyValueStore interface defeats the purpose of one of the goals
> > of IQv2, i.e., not to need to extend the state store interface for IQ. I
> > would say if we do not need the method in normal processing, we should
> > not extend the public state store interface. BTW, nobody forces you to
> > StoreQueryUtils. I think that internal utils class was introduced for
> > convenience to leverage existing methods on the state store interface.
> >
> >
> > 80)
> > Why do we limit ourselves by specifying a default order on the result?
> > Different state store implementation might have different strategies to
> > store the records which affects the order in which the records are
> > returned if they are not sorted before they are returned to the user.
> > Some users might not be interested in an order of the result and so
> > there is no reason those users pay the cost for sorting. I propose to
> > not specify a default order but sort the results (if needed) when
> > withDescendingX() and withAscendingX() is specified on the query object.
> >
> >
> > Regarding the snapshot guarantees for the iterators, I need to think a
> > bit more about it. I will come back to this th

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-14 Thread Bruno Cadonna

Hi Alieh,

Regarding the semantics/guarantees of the query type:

Do we need a snapshot semantic or can we specify a weaker but still 
useful semantic?


An option could be to guarantee that:
1. the query returns the latest version when the query arrived at the 
state store
2. the query returns a valid history, i.e., versions with adjacent and 
non-overlapping validity intervals.


I think guarantee 1 is quite obvious. Guarantee 2 maybe needs some 
explanation. If we do not avoid writes to the state store during the 
processing of interactive queries, it might for example happen that the 
latest version in the state store moves to data structures that are 
responsible for older versions. In our RocksDB implementation that are 
the segments. Thus, it could be that during query processing Kafka 
Streams reads the latest value x and encounters again x in a segment 
because a processor put a newer version of x in the versioned state 
store. A similar scenario might also happen to earlier versions. If 
Streams does not account for such cases it could return invalid histories.


Maybe such weaker guarantees are enough for most use cases.

You could consider implementing weaker guarantees as I described and if 
there is demand propose stricter guarantees in a follow-up KIP.


Maybe there are also other simpler guarantees that make sense.

Best,
Bruno


On 11/9/23 12:30 PM, Bruno Cadonna wrote:

Hi,

Thanks for the updates!

First my take on previous comments:


50)
I am in favor of deprecating the constructor that does not take the 
validTo parameter. That implies that I am in favor of modifying get(key, 
asOf) to set the correct validTo.



60)
I am in favor of renaming ValueIterator to VersionedRecordIterator and 
define it as:


public interface VersionedRecordIterator extends 
Iterator>


(Matthias, you mixed up ValueAndTimestamp with VersionedRecord in your 
last e-mail, didn't you? Just double-checking if I understood what you 
are proposing.)



70)
I agree with Matthias that adding a new method on the 
VersionedKeyValueStore interface defeats the purpose of one of the goals 
of IQv2, i.e., not to need to extend the state store interface for IQ. I 
would say if we do not need the method in normal processing, we should 
not extend the public state store interface. BTW, nobody forces you to 
StoreQueryUtils. I think that internal utils class was introduced for 
convenience to leverage existing methods on the state store interface.



80)
Why do we limit ourselves by specifying a default order on the result? 
Different state store implementation might have different strategies to 
store the records which affects the order in which the records are 
returned if they are not sorted before they are returned to the user. 
Some users might not be interested in an order of the result and so 
there is no reason those users pay the cost for sorting. I propose to 
not specify a default order but sort the results (if needed) when 
withDescendingX() and withAscendingX() is specified on the query object.



Regarding the snapshot guarantees for the iterators, I need to think a 
bit more about it. I will come back to this thread in the next days.



Best,
Bruno


On 11/8/23 5:30 PM, Alieh Saeedi wrote:

Thank you, Bruno and Matthias, for keeping the discussion going and for
reviewing the PR.

Here are the KIP updates:

    - I removed the `peek()` from the `ValueIterator` interface since 
we do

    not need it.
    - Yes, Bruno, the `validTo` field in the `VersionedRecod` class is
    exclusive. I updated the javadocs for that.


Very important critical open questions. I list them here based on 
priority

(descendingly).

    - I implemented the `get(key, fromtime, totime)` method here

:
    the problem is that this implementation does not guarantee 
consistency
    because processing might continue interleaved (no snapshot 
semantics is

    implemented). More over, it materializes all results in memory.
   - Solution 1: Use a lock and release it after retrieving all 
desired

   records from all segments.
  - positive point: snapshot semantics is implemented
  - negative points: 1) It is expensive since iterating over all
  segments may take a long time. 2) It still requires
materializing results
  on memory
   - Solution 2: use `RocksDbIterator`.
  - positive points: 1) It guarantees snapshot segments. 2) It 
does

  not require materializing results in memory.
  - negative points: it is expensive because, anyway, we need to
  iterate over all (many) segments.

    Do you have any thoughts on this issue? (ref: Matthias's 
comment

)

    - I added the field `validTo` in `VersionedRe

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-09 Thread Bruno Cadonna

Hi,

Thanks for the updates!

First my take on previous comments:


50)
I am in favor of deprecating the constructor that does not take the 
validTo parameter. That implies that I am in favor of modifying get(key, 
asOf) to set the correct validTo.



60)
I am in favor of renaming ValueIterator to VersionedRecordIterator and 
define it as:


public interface VersionedRecordIterator extends 
Iterator>


(Matthias, you mixed up ValueAndTimestamp with VersionedRecord in your 
last e-mail, didn't you? Just double-checking if I understood what you 
are proposing.)



70)
I agree with Matthias that adding a new method on the 
VersionedKeyValueStore interface defeats the purpose of one of the goals 
of IQv2, i.e., not to need to extend the state store interface for IQ. I 
would say if we do not need the method in normal processing, we should 
not extend the public state store interface. BTW, nobody forces you to 
StoreQueryUtils. I think that internal utils class was introduced for 
convenience to leverage existing methods on the state store interface.



80)
Why do we limit ourselves by specifying a default order on the result? 
Different state store implementation might have different strategies to 
store the records which affects the order in which the records are 
returned if they are not sorted before they are returned to the user. 
Some users might not be interested in an order of the result and so 
there is no reason those users pay the cost for sorting. I propose to 
not specify a default order but sort the results (if needed) when 
withDescendingX() and withAscendingX() is specified on the query object.



Regarding the snapshot guarantees for the iterators, I need to think a 
bit more about it. I will come back to this thread in the next days.



Best,
Bruno


On 11/8/23 5:30 PM, Alieh Saeedi wrote:

Thank you, Bruno and Matthias, for keeping the discussion going and for
reviewing the PR.

Here are the KIP updates:

- I removed the `peek()` from the `ValueIterator` interface since we do
not need it.
- Yes, Bruno, the `validTo` field in the `VersionedRecod` class is
exclusive. I updated the javadocs for that.


Very important critical open questions. I list them here based on priority
(descendingly).

- I implemented the `get(key, fromtime, totime)` method here

:
the problem is that this implementation does not guarantee consistency
because processing might continue interleaved (no snapshot semantics is
implemented). More over, it materializes all results in memory.
   - Solution 1: Use a lock and release it after retrieving all desired
   records from all segments.
  - positive point: snapshot semantics is implemented
  - negative points: 1) It is expensive since iterating over all
  segments may take a long time. 2) It still requires
materializing results
  on memory
   - Solution 2: use `RocksDbIterator`.
  - positive points: 1) It guarantees snapshot segments. 2) It does
  not require materializing results in memory.
  - negative points: it is expensive because, anyway, we need to
  iterate over all (many) segments.

Do you have any thoughts on this issue? (ref: Matthias's comment
)

- I added the field `validTo` in `VersionedRecord`. Its default value is
MAX. But as Matthias mentioned, for the single-key single-ts
(`VersionedKeyQuery` in KIP-960), it may not always be true. If the
returned record belongs to an old segment, maybe it is not valid any more.
So MAX is not the correct value for `ValidTo`. Two solutions come to mind:
   - Solution 1: make the `ValidTo` as an `Optional` and set it to
   `empty` for the retuned result of `get(key, asOfTimestamp)`.
   - Solution 2: change the implementation of `get(key, asOfTimestamp)`
   so that it finds the correct `validTo` for the returned versionedRecord.

   - In this KIP and the next one, even though the default ordering is
with ascending ts, I added the method `withAscendingTimestamps()` to have
more user readibility (as Bruno suggested), while Hanyu did only add
`withDescending...` methods (he did not need ascending because that's the
default anyway). Matthias believes that we should not have
inconsistencies (he actually hates them:D). Shall I change my KIP or Hanyu?
Thoughts?


That would be maybe helpful to look into the PR
 for more clarity and even
review that ;-)

Cheers,
Alieh

On Thu, Nov 2, 2023 at 7:13 PM Bruno Cadonna  wrote:


Hi Alieh,

First of all, I like the examples.

Is validTo in VersionedRecord exclusive or inclusive?
In the javadocs you write:

"@param

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-08 Thread Alieh Saeedi
Thank you, Bruno and Matthias, for keeping the discussion going and for
reviewing the PR.

Here are the KIP updates:

   - I removed the `peek()` from the `ValueIterator` interface since we do
   not need it.
   - Yes, Bruno, the `validTo` field in the `VersionedRecod` class is
   exclusive. I updated the javadocs for that.


Very important critical open questions. I list them here based on priority
(descendingly).

   - I implemented the `get(key, fromtime, totime)` method here
   
:
   the problem is that this implementation does not guarantee consistency
   because processing might continue interleaved (no snapshot semantics is
   implemented). More over, it materializes all results in memory.
  - Solution 1: Use a lock and release it after retrieving all desired
  records from all segments.
 - positive point: snapshot semantics is implemented
 - negative points: 1) It is expensive since iterating over all
 segments may take a long time. 2) It still requires
materializing results
 on memory
  - Solution 2: use `RocksDbIterator`.
 - positive points: 1) It guarantees snapshot segments. 2) It does
 not require materializing results in memory.
 - negative points: it is expensive because, anyway, we need to
 iterate over all (many) segments.

   Do you have any thoughts on this issue? (ref: Matthias's comment
)

   - I added the field `validTo` in `VersionedRecord`. Its default value is
   MAX. But as Matthias mentioned, for the single-key single-ts
   (`VersionedKeyQuery` in KIP-960), it may not always be true. If the
   returned record belongs to an old segment, maybe it is not valid any more.
   So MAX is not the correct value for `ValidTo`. Two solutions come to mind:
  - Solution 1: make the `ValidTo` as an `Optional` and set it to
  `empty` for the retuned result of `get(key, asOfTimestamp)`.
  - Solution 2: change the implementation of `get(key, asOfTimestamp)`
  so that it finds the correct `validTo` for the returned versionedRecord.

  - In this KIP and the next one, even though the default ordering is
   with ascending ts, I added the method `withAscendingTimestamps()` to have
   more user readibility (as Bruno suggested), while Hanyu did only add
   `withDescending...` methods (he did not need ascending because that's the
   default anyway). Matthias believes that we should not have
   inconsistencies (he actually hates them:D). Shall I change my KIP or Hanyu?
   Thoughts?


That would be maybe helpful to look into the PR
 for more clarity and even
review that ;-)

Cheers,
Alieh

On Thu, Nov 2, 2023 at 7:13 PM Bruno Cadonna  wrote:

> Hi Alieh,
>
> First of all, I like the examples.
>
> Is validTo in VersionedRecord exclusive or inclusive?
> In the javadocs you write:
>
> "@param validTothe latest timestamp that value is valid"
>
> I think that is not true because the validity is defined by the start
> time of the new version. The new and the old version cannot both be
> valid at that same time.
>
> Theoretically, you could set validTo to the start time of the new
> version - 1. However, what is the unit of the 1? Is it nanoseconds?
> Milliseconds? Seconds? Sure we could agree on one, but I think it is
> more elegant to just make the validTo exclusive. Actually, you used it
> as exclusive in your examples.
>
>
> Thanks for the KIP!
>
> Best,
> Bruno
>
> On 11/1/23 9:01 PM, Alieh Saeedi wrote:
> > Hi all,
> > @Matthias: I think Victoria was right. I must add the method `get(key,
> > fromTime, toTime)` to the interface `VersionedKeyValueStore`. Right now,
> > having the method only in `RocksDBVersionedStore`, made me to have an
> > instance of `RocksDBVersionedStore` (instead of `VersionedKeyValueStore`)
> > in `StoreQueryUtils.runMultiVersionedKeyQuery()` method. In future, we
> are
> > going to use the same method for In-memory/SPDB/blaBla versioned stores.
> > Then either this method won't work any more, or we have to add code (if
> > clauses) for each type of versioned stores. What do you think about that?
> >
> > Bests,
> > Alieh
> >
> > On Tue, Oct 24, 2023 at 10:01 PM Alieh Saeedi 
> wrote:
> >
> >> Thank you, Matthias, Bruno, and Guozhang for keeping the discussion
> going.
> >>
> >> Here is the list of changes I made:
> >>
> >> 1. I enriched the "Example" section as Bruno suggested. Do you
> please
> >> have a look at that section? I think I devised an interesting one
> ;-)
> >> 2. As Matthias and Guozhang suggested, I renamed variables and
> methods
> >> as follows:
> >>- "fromTimestamp" -> "fromTime"
> >>- "asOfTimestamp" -> "toTime"
> >>- "from(Instan

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-02 Thread Bruno Cadonna

Hi Alieh,

First of all, I like the examples.

Is validTo in VersionedRecord exclusive or inclusive?
In the javadocs you write:

"@param validTothe latest timestamp that value is valid"

I think that is not true because the validity is defined by the start 
time of the new version. The new and the old version cannot both be 
valid at that same time.


Theoretically, you could set validTo to the start time of the new 
version - 1. However, what is the unit of the 1? Is it nanoseconds? 
Milliseconds? Seconds? Sure we could agree on one, but I think it is 
more elegant to just make the validTo exclusive. Actually, you used it 
as exclusive in your examples.



Thanks for the KIP!

Best,
Bruno

On 11/1/23 9:01 PM, Alieh Saeedi wrote:

Hi all,
@Matthias: I think Victoria was right. I must add the method `get(key,
fromTime, toTime)` to the interface `VersionedKeyValueStore`. Right now,
having the method only in `RocksDBVersionedStore`, made me to have an
instance of `RocksDBVersionedStore` (instead of `VersionedKeyValueStore`)
in `StoreQueryUtils.runMultiVersionedKeyQuery()` method. In future, we are
going to use the same method for In-memory/SPDB/blaBla versioned stores.
Then either this method won't work any more, or we have to add code (if
clauses) for each type of versioned stores. What do you think about that?

Bests,
Alieh

On Tue, Oct 24, 2023 at 10:01 PM Alieh Saeedi  wrote:


Thank you, Matthias, Bruno, and Guozhang for keeping the discussion going.

Here is the list of changes I made:

1. I enriched the "Example" section as Bruno suggested. Do you please
have a look at that section? I think I devised an interesting one ;-)
2. As Matthias and Guozhang suggested, I renamed variables and methods
as follows:
   - "fromTimestamp" -> "fromTime"
   - "asOfTimestamp" -> "toTime"
   - "from(Instant fromTime)" -> "fromTime(Instant fromTime)"
   - "asOf(Instant toTime)" -> "toTime(Instant toTime)"
3.

About throwing an NPE when time bounds are `null`: Ok, Matthias, I am
convinced by your idea. I consider a null timestamp as "no bound".  But I
had to change KIP-960 and the corresponding PR to be consistent as well.

Answering questions and some more discussion points:

1.

For now, I keep the class names as they are.
2.

About the new field "validTo" in VersionedRecord. Yes Matthias I keep
the old constructor, which does not have `validTo` as an input parameter.
But in the body of the old constructor, I consider the default value MAX
for the validTo field. I think MAX must be the default value for `validTo`
since before inserting a tombstone or a new value for the same key, the
value must be valid till iternity.
3.

Regarding changing the ValueIterator interface from `public interface
ValueIterator extends Iterator` to `public interface
ValueIterator extends Iterator>`: Matthias, I do not
know how it improves type safety because the MultiVersionedKeyQuery class
returns a ValueIterator of VersionedRecord any way. But if we want to be
consistent with KeyValueIterator, we must apply the changes you suggested.
4.

Regarding adding the new `get(key, fromTime, toTime)` method to the
public interface `VersionedKeyValueStore` or only adding it to the
class `RocksDBVersionedStore`: In the KIP, I changed the interface as
Victoria suggested. But still, I am not convinced why we do that. @Victoria:
Do you please clarify why we have to define it in the interface? More
specifically, why do we need to use generic VersionedKeyValueStore
instead of simply using the implementing classes?

Cheers,
Alieh

On Sat, Oct 14, 2023 at 3:35 AM Guozhang Wang 
wrote:


Thanks Alieh for the KIP, as well as a nice summary of all the
discussions! Just my 2c regarding your open questions:

1. I just checked KIP-889
(
https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores
)
and we used "VersionedRecord get(K key, long asOfTimestamp);", so I
feel to be consistent with this API is better compared with being
consistent with "WindowKeyQuery"?

3. I agree with Matthias that naming is always tricky, and I also tend
to be consistent with what we already have (even if in retro it may
not be the best idea :P and if that was really becoming a complaint,
we would change it across the board in one shot as well later).

Guozhang

On Wed, Oct 11, 2023 at 9:12 PM Matthias J. Sax  wrote:


Thanks for the update!



Some thoughts about changes you made and open questions you raised:


10) About asOf vs until: I was just looking into `WindowKeyQuery`,
`WindowRangeQuery` and also `ReadOnlyWindowStore` interfaces. For those,
we use "timeFrom" and "timeTo", so it seems best to actually use
`to(Instant toTime)` to keep the naming consistent across the board?

If yes, we should also do `from (Instant fromTime)` and use getters
`fromTime()` and `toTime()` -- given that it's range bounds it seems

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-11-01 Thread Alieh Saeedi
Hi all,
@Matthias: I think Victoria was right. I must add the method `get(key,
fromTime, toTime)` to the interface `VersionedKeyValueStore`. Right now,
having the method only in `RocksDBVersionedStore`, made me to have an
instance of `RocksDBVersionedStore` (instead of `VersionedKeyValueStore`)
in `StoreQueryUtils.runMultiVersionedKeyQuery()` method. In future, we are
going to use the same method for In-memory/SPDB/blaBla versioned stores.
Then either this method won't work any more, or we have to add code (if
clauses) for each type of versioned stores. What do you think about that?

Bests,
Alieh

On Tue, Oct 24, 2023 at 10:01 PM Alieh Saeedi  wrote:

> Thank you, Matthias, Bruno, and Guozhang for keeping the discussion going.
>
> Here is the list of changes I made:
>
>1. I enriched the "Example" section as Bruno suggested. Do you please
>have a look at that section? I think I devised an interesting one ;-)
>2. As Matthias and Guozhang suggested, I renamed variables and methods
>as follows:
>   - "fromTimestamp" -> "fromTime"
>   - "asOfTimestamp" -> "toTime"
>   - "from(Instant fromTime)" -> "fromTime(Instant fromTime)"
>   - "asOf(Instant toTime)" -> "toTime(Instant toTime)"
>3.
>
>About throwing an NPE when time bounds are `null`: Ok, Matthias, I am
>convinced by your idea. I consider a null timestamp as "no bound".  But I
>had to change KIP-960 and the corresponding PR to be consistent as well.
>
> Answering questions and some more discussion points:
>
>1.
>
>For now, I keep the class names as they are.
>2.
>
>About the new field "validTo" in VersionedRecord. Yes Matthias I keep
>the old constructor, which does not have `validTo` as an input parameter.
>But in the body of the old constructor, I consider the default value MAX
>for the validTo field. I think MAX must be the default value for `validTo`
>since before inserting a tombstone or a new value for the same key, the
>value must be valid till iternity.
>3.
>
>Regarding changing the ValueIterator interface from `public interface
>ValueIterator extends Iterator` to `public interface
>ValueIterator extends Iterator>`: Matthias, I do not
>know how it improves type safety because the MultiVersionedKeyQuery class
>returns a ValueIterator of VersionedRecord any way. But if we want to be
>consistent with KeyValueIterator, we must apply the changes you suggested.
>4.
>
>Regarding adding the new `get(key, fromTime, toTime)` method to the
>public interface `VersionedKeyValueStore` or only adding it to the
>class `RocksDBVersionedStore`: In the KIP, I changed the interface as
>Victoria suggested. But still, I am not convinced why we do that. 
> @Victoria:
>Do you please clarify why we have to define it in the interface? More
>specifically, why do we need to use generic VersionedKeyValueStore
>instead of simply using the implementing classes?
>
> Cheers,
> Alieh
>
> On Sat, Oct 14, 2023 at 3:35 AM Guozhang Wang 
> wrote:
>
>> Thanks Alieh for the KIP, as well as a nice summary of all the
>> discussions! Just my 2c regarding your open questions:
>>
>> 1. I just checked KIP-889
>> (
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores
>> )
>> and we used "VersionedRecord get(K key, long asOfTimestamp);", so I
>> feel to be consistent with this API is better compared with being
>> consistent with "WindowKeyQuery"?
>>
>> 3. I agree with Matthias that naming is always tricky, and I also tend
>> to be consistent with what we already have (even if in retro it may
>> not be the best idea :P and if that was really becoming a complaint,
>> we would change it across the board in one shot as well later).
>>
>> Guozhang
>>
>> On Wed, Oct 11, 2023 at 9:12 PM Matthias J. Sax  wrote:
>> >
>> > Thanks for the update!
>> >
>> >
>> >
>> > Some thoughts about changes you made and open questions you raised:
>> >
>> >
>> > 10) About asOf vs until: I was just looking into `WindowKeyQuery`,
>> > `WindowRangeQuery` and also `ReadOnlyWindowStore` interfaces. For those,
>> > we use "timeFrom" and "timeTo", so it seems best to actually use
>> > `to(Instant toTime)` to keep the naming consistent across the board?
>> >
>> > If yes, we should also do `from (Instant fromTime)` and use getters
>> > `fromTime()` and `toTime()` -- given that it's range bounds it seems
>> > acceptable to me, to diverge a little bit from KIP-960 `asOfTimestamp()`
>> > -- but we could also rename it to `asOfTime()`? -- Given that we
>> > strongly type with `Instant` I am not worried about semantic ambiguity.
>> >
>> >
>> >
>> > 20) About throwing a NPE when time bounds are `null` -- why? (For the
>> > key it makes sense as is mandatory to have a key.) Could we not
>> > interpret `null` as "no bound". We did KIP-941 to add `null` for
>> > open-ended `RangeQueries`, so I am wondering if we should just stick to
>> > the same semantics?
>> >
>>

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-24 Thread Alieh Saeedi
Thank you, Matthias, Bruno, and Guozhang for keeping the discussion going.

Here is the list of changes I made:

   1. I enriched the "Example" section as Bruno suggested. Do you please
   have a look at that section? I think I devised an interesting one ;-)
   2. As Matthias and Guozhang suggested, I renamed variables and methods
   as follows:
  - "fromTimestamp" -> "fromTime"
  - "asOfTimestamp" -> "toTime"
  - "from(Instant fromTime)" -> "fromTime(Instant fromTime)"
  - "asOf(Instant toTime)" -> "toTime(Instant toTime)"
   3.

   About throwing an NPE when time bounds are `null`: Ok, Matthias, I am
   convinced by your idea. I consider a null timestamp as "no bound".  But I
   had to change KIP-960 and the corresponding PR to be consistent as well.

Answering questions and some more discussion points:

   1.

   For now, I keep the class names as they are.
   2.

   About the new field "validTo" in VersionedRecord. Yes Matthias I keep
   the old constructor, which does not have `validTo` as an input parameter.
   But in the body of the old constructor, I consider the default value MAX
   for the validTo field. I think MAX must be the default value for `validTo`
   since before inserting a tombstone or a new value for the same key, the
   value must be valid till iternity.
   3.

   Regarding changing the ValueIterator interface from `public interface
   ValueIterator extends Iterator` to `public interface
   ValueIterator extends Iterator>`: Matthias, I do not
   know how it improves type safety because the MultiVersionedKeyQuery class
   returns a ValueIterator of VersionedRecord any way. But if we want to be
   consistent with KeyValueIterator, we must apply the changes you suggested.
   4.

   Regarding adding the new `get(key, fromTime, toTime)` method to the
   public interface `VersionedKeyValueStore` or only adding it to the class
   `RocksDBVersionedStore`: In the KIP, I changed the interface as Victoria
   suggested. But still, I am not convinced why we do that. @Victoria: Do
   you please clarify why we have to define it in the interface? More
   specifically, why do we need to use generic VersionedKeyValueStore
   instead of simply using the implementing classes?

Cheers,
Alieh

On Sat, Oct 14, 2023 at 3:35 AM Guozhang Wang 
wrote:

> Thanks Alieh for the KIP, as well as a nice summary of all the
> discussions! Just my 2c regarding your open questions:
>
> 1. I just checked KIP-889
> (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores
> )
> and we used "VersionedRecord get(K key, long asOfTimestamp);", so I
> feel to be consistent with this API is better compared with being
> consistent with "WindowKeyQuery"?
>
> 3. I agree with Matthias that naming is always tricky, and I also tend
> to be consistent with what we already have (even if in retro it may
> not be the best idea :P and if that was really becoming a complaint,
> we would change it across the board in one shot as well later).
>
> Guozhang
>
> On Wed, Oct 11, 2023 at 9:12 PM Matthias J. Sax  wrote:
> >
> > Thanks for the update!
> >
> >
> >
> > Some thoughts about changes you made and open questions you raised:
> >
> >
> > 10) About asOf vs until: I was just looking into `WindowKeyQuery`,
> > `WindowRangeQuery` and also `ReadOnlyWindowStore` interfaces. For those,
> > we use "timeFrom" and "timeTo", so it seems best to actually use
> > `to(Instant toTime)` to keep the naming consistent across the board?
> >
> > If yes, we should also do `from (Instant fromTime)` and use getters
> > `fromTime()` and `toTime()` -- given that it's range bounds it seems
> > acceptable to me, to diverge a little bit from KIP-960 `asOfTimestamp()`
> > -- but we could also rename it to `asOfTime()`? -- Given that we
> > strongly type with `Instant` I am not worried about semantic ambiguity.
> >
> >
> >
> > 20) About throwing a NPE when time bounds are `null` -- why? (For the
> > key it makes sense as is mandatory to have a key.) Could we not
> > interpret `null` as "no bound". We did KIP-941 to add `null` for
> > open-ended `RangeQueries`, so I am wondering if we should just stick to
> > the same semantics?
> >
> > Cf
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-941%3A+Range+queries+to+accept+null+lower+and+upper+bounds
> >
> >
> >
> > 30) About the class naming. That's always tricky, and I am not married
> > to my proposal. I agree with Bruno that the other suggested names are
> > not really better. -- The underlying idea was, to get some "consistent"
> > naming across the board.
> >
> > Existing `KeyQuery`
> > New `VersionedKeyQuery` (KIP-960; we add a prefix)
> > New `MultiVersionKeyQuery` (this KIP; extend the prefix with a
> pre-prefix)
> >
> > Existing `RangeQuery`
> > New `MultiVersionRangeQuery` (KIP-969; add same prefix as above)
> >
> >
> >
> > 40) I am fine with not adding `range(from, to)` -- it was just an idea.
> >
> >
> >
> >
> >
> > Some more follow up question:
> >
> > 50) 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-13 Thread Guozhang Wang
Thanks Alieh for the KIP, as well as a nice summary of all the
discussions! Just my 2c regarding your open questions:

1. I just checked KIP-889
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores)
and we used "VersionedRecord get(K key, long asOfTimestamp);", so I
feel to be consistent with this API is better compared with being
consistent with "WindowKeyQuery"?

3. I agree with Matthias that naming is always tricky, and I also tend
to be consistent with what we already have (even if in retro it may
not be the best idea :P and if that was really becoming a complaint,
we would change it across the board in one shot as well later).

Guozhang

On Wed, Oct 11, 2023 at 9:12 PM Matthias J. Sax  wrote:
>
> Thanks for the update!
>
>
>
> Some thoughts about changes you made and open questions you raised:
>
>
> 10) About asOf vs until: I was just looking into `WindowKeyQuery`,
> `WindowRangeQuery` and also `ReadOnlyWindowStore` interfaces. For those,
> we use "timeFrom" and "timeTo", so it seems best to actually use
> `to(Instant toTime)` to keep the naming consistent across the board?
>
> If yes, we should also do `from (Instant fromTime)` and use getters
> `fromTime()` and `toTime()` -- given that it's range bounds it seems
> acceptable to me, to diverge a little bit from KIP-960 `asOfTimestamp()`
> -- but we could also rename it to `asOfTime()`? -- Given that we
> strongly type with `Instant` I am not worried about semantic ambiguity.
>
>
>
> 20) About throwing a NPE when time bounds are `null` -- why? (For the
> key it makes sense as is mandatory to have a key.) Could we not
> interpret `null` as "no bound". We did KIP-941 to add `null` for
> open-ended `RangeQueries`, so I am wondering if we should just stick to
> the same semantics?
>
> Cf
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-941%3A+Range+queries+to+accept+null+lower+and+upper+bounds
>
>
>
> 30) About the class naming. That's always tricky, and I am not married
> to my proposal. I agree with Bruno that the other suggested names are
> not really better. -- The underlying idea was, to get some "consistent"
> naming across the board.
>
> Existing `KeyQuery`
> New `VersionedKeyQuery` (KIP-960; we add a prefix)
> New `MultiVersionKeyQuery` (this KIP; extend the prefix with a pre-prefix)
>
> Existing `RangeQuery`
> New `MultiVersionRangeQuery` (KIP-969; add same prefix as above)
>
>
>
> 40) I am fine with not adding `range(from, to)` -- it was just an idea.
>
>
>
>
>
> Some more follow up question:
>
> 50) You propose to add a new constructor and getter to `VersionedRecord`
> -- I am wondering if this implies that `validTo` is optional because the
> existing constructor is not deprecated? -- Also, what happens if
> `validTo` is not set and `valueTo()` is called? Or do we intent to make
> `validTo` mandatory?
>
> Maybe this question can only be answered when working on the code, but I
> am wondering if we should make `validTo` mandatory or not... And what
> the "blast radius" of changing `VersionedRecord` will be in general. Do
> you have already some POC PR that we could look at to get some signals
> about this?
>
>
>
> 60) The new query class is defined to return
> `ValueIterator>` -- while I like the idea to add
> `ValueIterator` in a generic way on the one hand, I am wondering if
> it might be better to change it, and enforce its usage (ie, return type)
> of `VersionedRecord` to improve type safety (type erasure is often a
> pain, and we could mitigate it this way).
>
> Btw: We actually do a similar thing for `KeyValueIterator`.
>
> Ie,
>
> public interface ValueIterator extends Iterator>
>
> and
>
> ValueAndTimestamp peek();
>
> This would imply that the return type of the new query is
> `ValueIterator` on the interface what seems simpler and more elegant?
>
> If we go with the change, I am also wondering if we need to find a
> better name for the new iterator class? Maybe `VersionIterator` or
> something like this?
>
> Of course it might limit the use of `ValueIterator` for other value
> types -- not sure if this a limitation that is prohibitive? My gut
> feeling is, that is should not be too limiting.
>
>
>
>
> 70) Do we really need the change in `VersionedKeyValueStore` and add a
> new method? In the end, the idea of IQv2 is to avoid exactly this... It
> was the main issue for IQv1, that the base interface of the store needed
> an update and thus all classed implementing the base interface, making
> it very cumbersome to add new query types. -- Of course, we need this
> new method on the actually implementation (as private method) that can
> be called from `query()` method, but adding it to the interface seems to
> defeat the purpose of IQv2.
>
> Note, for existing IQv2 queries types that go against others stores, the
> public methods already existed when IQv2 was introduces, and thus the
> implementation of these query types just pragmatically re-used existing
> methods -- but it does not imply that new p

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-11 Thread Matthias J. Sax

Thanks for the update!



Some thoughts about changes you made and open questions you raised:


10) About asOf vs until: I was just looking into `WindowKeyQuery`, 
`WindowRangeQuery` and also `ReadOnlyWindowStore` interfaces. For those, 
we use "timeFrom" and "timeTo", so it seems best to actually use 
`to(Instant toTime)` to keep the naming consistent across the board?


If yes, we should also do `from (Instant fromTime)` and use getters 
`fromTime()` and `toTime()` -- given that it's range bounds it seems 
acceptable to me, to diverge a little bit from KIP-960 `asOfTimestamp()` 
-- but we could also rename it to `asOfTime()`? -- Given that we 
strongly type with `Instant` I am not worried about semantic ambiguity.




20) About throwing a NPE when time bounds are `null` -- why? (For the 
key it makes sense as is mandatory to have a key.) Could we not 
interpret `null` as "no bound". We did KIP-941 to add `null` for 
open-ended `RangeQueries`, so I am wondering if we should just stick to 
the same semantics?


Cf 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-941%3A+Range+queries+to+accept+null+lower+and+upper+bounds




30) About the class naming. That's always tricky, and I am not married 
to my proposal. I agree with Bruno that the other suggested names are 
not really better. -- The underlying idea was, to get some "consistent" 
naming across the board.


Existing `KeyQuery`
New `VersionedKeyQuery` (KIP-960; we add a prefix)
New `MultiVersionKeyQuery` (this KIP; extend the prefix with a pre-prefix)

Existing `RangeQuery`
New `MultiVersionRangeQuery` (KIP-969; add same prefix as above)



40) I am fine with not adding `range(from, to)` -- it was just an idea.





Some more follow up question:

50) You propose to add a new constructor and getter to `VersionedRecord` 
-- I am wondering if this implies that `validTo` is optional because the 
existing constructor is not deprecated? -- Also, what happens if 
`validTo` is not set and `valueTo()` is called? Or do we intent to make 
`validTo` mandatory?


Maybe this question can only be answered when working on the code, but I 
am wondering if we should make `validTo` mandatory or not... And what 
the "blast radius" of changing `VersionedRecord` will be in general. Do 
you have already some POC PR that we could look at to get some signals 
about this?




60) The new query class is defined to return 
`ValueIterator>` -- while I like the idea to add 
`ValueIterator` in a generic way on the one hand, I am wondering if 
it might be better to change it, and enforce its usage (ie, return type) 
of `VersionedRecord` to improve type safety (type erasure is often a 
pain, and we could mitigate it this way).


Btw: We actually do a similar thing for `KeyValueIterator`.

Ie,

public interface ValueIterator extends Iterator>

and

ValueAndTimestamp peek();

This would imply that the return type of the new query is 
`ValueIterator` on the interface what seems simpler and more elegant?


If we go with the change, I am also wondering if we need to find a 
better name for the new iterator class? Maybe `VersionIterator` or 
something like this?


Of course it might limit the use of `ValueIterator` for other value 
types -- not sure if this a limitation that is prohibitive? My gut 
feeling is, that is should not be too limiting.





70) Do we really need the change in `VersionedKeyValueStore` and add a 
new method? In the end, the idea of IQv2 is to avoid exactly this... It 
was the main issue for IQv1, that the base interface of the store needed 
an update and thus all classed implementing the base interface, making 
it very cumbersome to add new query types. -- Of course, we need this 
new method on the actually implementation (as private method) that can 
be called from `query()` method, but adding it to the interface seems to 
defeat the purpose of IQv2.


Note, for existing IQv2 queries types that go against others stores, the 
public methods already existed when IQv2 was introduces, and thus the 
implementation of these query types just pragmatically re-used existing 
methods -- but it does not imply that new public method should be added.





-Matthias


On 10/11/23 5:11 AM, Bruno Cadonna wrote:

Thanks for the updates, Alieh!

The example in the KIP uses the allVersions() method which we agreed to 
remove.


Regarding your questions:
1. asOf vs. until: I am fine with both but slightly prefer until.
2. What about KeyMultiVersionsQuery, KeyVersionsQuery (KIP-960 would 
then be KeyVersionQuery). However, I am also fine with 
MultiVersionedKeyQuery since none of the names sounds better or worse to 
me.
3. I agree with you not to introduce the method with the two bounds to 
keep things simple.

4. Forget about fromTime() an asOfTime(), from() and asOf() is fine.
5. The main purpose is to show how to use the API. Maybe make an example 
with just the key to distinguish this query from the single value query 
of KIP-960 and then one with a key and a time range. Whe

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-11 Thread Bruno Cadonna

Thanks for the updates, Alieh!

The example in the KIP uses the allVersions() method which we agreed to 
remove.


Regarding your questions:
1. asOf vs. until: I am fine with both but slightly prefer until.
2. What about KeyMultiVersionsQuery, KeyVersionsQuery (KIP-960 would 
then be KeyVersionQuery). However, I am also fine with 
MultiVersionedKeyQuery since none of the names sounds better or worse to 
me.
3. I agree with you not to introduce the method with the two bounds to 
keep things simple.

4. Forget about fromTime() an asOfTime(), from() and asOf() is fine.
5. The main purpose is to show how to use the API. Maybe make an example 
with just the key to distinguish this query from the single value query 
of KIP-960 and then one with a key and a time range. When you iterate 
over the results you could also call validTo(). Maybe add some actual 
records in the comments to show what the result might look like.


Regarding the test plan, I hope you also plan to add unit tests in all 
of your KIPs. Maybe you could also explain why system tests are not 
needed here.


Best,
Bruno

On 10/10/23 5:36 PM, Alieh Saeedi wrote:

Thank you all for the very exact and constructive comments. I really
enjoyed reading your ideas and all the important points you made me aware
of. I updated KIP-968 as follows:

1. If the key or time bounds are null, the method returns NPE.
2. The "valid" word: I removed the sentence "all the records that are
valid..." and replaced it with an exact explanation. More over, I explained
it with an example in the KIP but not in the javadocs. Do I need to add the
example to the javadocs as well?
3. Since I followed Bruno's suggestion and removed the allVersions()
method, the problem of meaningless combinations is solved, and I do not
need any IllegalArgumentException or something like that. Therefore, the
change is that if no time bound is specified, the query returns the records
with the specified key for all timestamps (all versions).
4. As Victoria suggested, adding a method to the *VersionedKeyValueStore
*interface is essential. So I did that. I had this method only in the
RocksDBVersionedStore class, which was not enough.
5. I added the *validTo* field to the VersionedRecord class to be able
to represent the tombstones. As you suggested, we postpone solving the
problem of retrieving consecutive tombstones for later.
6. I added the "Test Plan" section to all KIPs. I hope what I wrote is
convincing.
7. I added the *withAscendingTimestamp()* method to provide more
code readability
for the user.
8. I removed the evil word "get" from all getter methods.

There have also been some more suggestions which I am still not convinced
or clear about them:

1. Regarding asOf vs until: reading all comments, my conclusion was that
I keep it as "asOf" (following Walker's idea as the native speaker as well
as Bruno's suggestion to be consistent with single-key_single_ts queries).
But I do not have a personal preference. If you insist on "until", I change
it.
2. Bruno suggested renaming the class "MultiVersionedKeyQuery" to sth
else. We already had a long discussion about the name with Matthias. I am
open to renaming it to something else, but do you have any ideas?
3. Matthias suggested having a method with two input parameters that
enables the user to specify both time bounds in the same method. Isn't it
introducing redundancy? It is somehow disrespectful to the idea of having
composable methods.
4. Bruno suggested renaming the methods "asOf" and "from" to "asOfTime"
and "fromTime". If I do that, then it is not consistent with KIP-960.
Moreover, the input parameter is clearly a timestamp, which explains
enough. What do you think about that?
5. I was asked to add more examples to the example section. My question
is, what is the main purpose of that? If I know it clearly, then I can add
what you mean.



Cheers,
Alieh

On Tue, Oct 10, 2023 at 1:13 AM Matthias J. Sax  wrote:


Bruno and I had some background conversation about the `get` prefix
question including a few other committers.

The official policy was never changed, and we should not add the
`get`-prefix. It's a slip on our side in previous KIPs to add the
`get`-prefix and we should actually clean it up doing a follow up KIP.


-Matthias


On 10/5/23 5:26 AM, Bruno Cadonna wrote:

Hi Matthias,

Given all the IQv2 KIPs that use getX and given recent PRs (internal
interfaces mainly) that got merged, I was under the impression that we
moved away from the strict no-getX policy.

I do not think it was an accident using getX in the IQv2 KIPs since
somebody would have brought it up, otherwise.

I am fine with both types of getters.

If we think, we need to discuss this in a broader context, let's start a
separate thread.


Best,
Bruno





On 10/5/23 7:44 AM, Matthias J. Sax wrote:

I agree to (almo

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-10 Thread Alieh Saeedi
Thank you all for the very exact and constructive comments. I really
enjoyed reading your ideas and all the important points you made me aware
of. I updated KIP-968 as follows:

   1. If the key or time bounds are null, the method returns NPE.
   2. The "valid" word: I removed the sentence "all the records that are
   valid..." and replaced it with an exact explanation. More over, I explained
   it with an example in the KIP but not in the javadocs. Do I need to add the
   example to the javadocs as well?
   3. Since I followed Bruno's suggestion and removed the allVersions()
   method, the problem of meaningless combinations is solved, and I do not
   need any IllegalArgumentException or something like that. Therefore, the
   change is that if no time bound is specified, the query returns the records
   with the specified key for all timestamps (all versions).
   4. As Victoria suggested, adding a method to the *VersionedKeyValueStore
   *interface is essential. So I did that. I had this method only in the
   RocksDBVersionedStore class, which was not enough.
   5. I added the *validTo* field to the VersionedRecord class to be able
   to represent the tombstones. As you suggested, we postpone solving the
   problem of retrieving consecutive tombstones for later.
   6. I added the "Test Plan" section to all KIPs. I hope what I wrote is
   convincing.
   7. I added the *withAscendingTimestamp()* method to provide more
code readability
   for the user.
   8. I removed the evil word "get" from all getter methods.

There have also been some more suggestions which I am still not convinced
or clear about them:

   1. Regarding asOf vs until: reading all comments, my conclusion was that
   I keep it as "asOf" (following Walker's idea as the native speaker as well
   as Bruno's suggestion to be consistent with single-key_single_ts queries).
   But I do not have a personal preference. If you insist on "until", I change
   it.
   2. Bruno suggested renaming the class "MultiVersionedKeyQuery" to sth
   else. We already had a long discussion about the name with Matthias. I am
   open to renaming it to something else, but do you have any ideas?
   3. Matthias suggested having a method with two input parameters that
   enables the user to specify both time bounds in the same method. Isn't it
   introducing redundancy? It is somehow disrespectful to the idea of having
   composable methods.
   4. Bruno suggested renaming the methods "asOf" and "from" to "asOfTime"
   and "fromTime". If I do that, then it is not consistent with KIP-960.
   Moreover, the input parameter is clearly a timestamp, which explains
   enough. What do you think about that?
   5. I was asked to add more examples to the example section. My question
   is, what is the main purpose of that? If I know it clearly, then I can add
   what you mean.



Cheers,
Alieh

On Tue, Oct 10, 2023 at 1:13 AM Matthias J. Sax  wrote:

> Bruno and I had some background conversation about the `get` prefix
> question including a few other committers.
>
> The official policy was never changed, and we should not add the
> `get`-prefix. It's a slip on our side in previous KIPs to add the
> `get`-prefix and we should actually clean it up doing a follow up KIP.
>
>
> -Matthias
>
>
> On 10/5/23 5:26 AM, Bruno Cadonna wrote:
> > Hi Matthias,
> >
> > Given all the IQv2 KIPs that use getX and given recent PRs (internal
> > interfaces mainly) that got merged, I was under the impression that we
> > moved away from the strict no-getX policy.
> >
> > I do not think it was an accident using getX in the IQv2 KIPs since
> > somebody would have brought it up, otherwise.
> >
> > I am fine with both types of getters.
> >
> > If we think, we need to discuss this in a broader context, let's start a
> > separate thread.
> >
> >
> > Best,
> > Bruno
> >
> >
> >
> >
> >
> > On 10/5/23 7:44 AM, Matthias J. Sax wrote:
> >> I agree to (almost) everything what Bruno said.
> >>
> >>
> >>> In general, we tend to move away from using getters without "get",
> >>> recently. So I would keep the "get".
> >>
> >> This is new to me? Can you elaborate on this point? Why do you think
> >> that's the case?
> >>
> >> I actually did realize (after Walker mentioned it) that existing query
> >> types use `get` prefix, but to me it seems that it was by accident and
> >> we should consider correcting it? Thus, I would actually prefer to not
> >> add the `get` prefix for new methods query types.
> >>
> >> IMHO, we should do a follow up KIP to deprecate all methods with `get`
> >> prefix and replace them with new ones without `get` -- it's of course
> >> always kinda "unnecessary" noise, but if we don't do it, we might get
> >> into more and more inconsistent naming what would result in a "bad" API.
> >>
> >> If we indeed want to change the convention and use the `get` prefix, I
> >> would strongly advocate to bit the bullet and do KIP to pro-actively
> >> add the `get` "everywhere" it's missing... But overall, it seem

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-09 Thread Matthias J. Sax
Bruno and I had some background conversation about the `get` prefix 
question including a few other committers.


The official policy was never changed, and we should not add the 
`get`-prefix. It's a slip on our side in previous KIPs to add the 
`get`-prefix and we should actually clean it up doing a follow up KIP.



-Matthias


On 10/5/23 5:26 AM, Bruno Cadonna wrote:

Hi Matthias,

Given all the IQv2 KIPs that use getX and given recent PRs (internal 
interfaces mainly) that got merged, I was under the impression that we 
moved away from the strict no-getX policy.


I do not think it was an accident using getX in the IQv2 KIPs since 
somebody would have brought it up, otherwise.


I am fine with both types of getters.

If we think, we need to discuss this in a broader context, let's start a 
separate thread.



Best,
Bruno





On 10/5/23 7:44 AM, Matthias J. Sax wrote:

I agree to (almost) everything what Bruno said.


In general, we tend to move away from using getters without "get", 
recently. So I would keep the "get".


This is new to me? Can you elaborate on this point? Why do you think 
that's the case?


I actually did realize (after Walker mentioned it) that existing query 
types use `get` prefix, but to me it seems that it was by accident and 
we should consider correcting it? Thus, I would actually prefer to not 
add the `get` prefix for new methods query types.


IMHO, we should do a follow up KIP to deprecate all methods with `get` 
prefix and replace them with new ones without `get` -- it's of course 
always kinda "unnecessary" noise, but if we don't do it, we might get 
into more and more inconsistent naming what would result in a "bad" API.


If we indeed want to change the convention and use the `get` prefix, I 
would strongly advocate to bit the bullet and do KIP to pro-actively 
add the `get` "everywhere" it's missing... But overall, it seems to be 
a much broader decision, and we should get buy in from many committers 
about it -- as long as there is no broad consensus to add `get` 
everywhere, I would strongly prefer not to diverge from the current 
agreement to omit `get`.




-Matthias




On 10/4/23 2:36 AM, Bruno Cadonna wrote:

Hi,

Regarding tombstones:
As far as I understand, we need to add either a validTo field to 
VersionedRecord or we need to return tombstones, otherwise the result 
is not complete, because users could never know a record was deleted 
at some point before the second non-null value was put.
I like more adding the validTo field since it makes the result more 
concise and easier interpretable.


Extending on Victoria's example, with the following puts

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)

the result with tombstones would be

value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)

instead of

value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)

The benefit of conciseness would already apply to one single tombstone.

On the other hand, why would somebody write consecutive tombstones 
into a versioned state store? I guess if somebody does that on 
purpose, then there should be a way to retrieve each of those 
tombstones, right?
So maybe we need both -- validTo field and the option to return 
tombstones. The latter might be moved to a future KIP in case we see 
the need.



Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or 
.until()). If we go with .within(), I would opt for 
.withinTimeRange(fromTs, toTs), because the query becomes more readable:


MultiVersionedKeyQuery
   .withKey(1)
   .withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z), 
Instant.parse(2023-08-04T10:37:30.00Z))


If we stay with .from() and .until(), we should consider .fromTime() 
and .untilTime() (or .toTime()):


MultiVersionedKeyQuery
  .withKey(1)
  .fromTime(Instant.parse(2023-08-03T10:37:30.00Z))
  .untilTime(Instant.parse(2023-08-04T10:37:30.00Z))



Regarding asOf vs. until:
I think asOf() is more used in point in time queries as Walker 
mentioned where this KIP specifies a time range. IMO asOf() fits very 
well with KIP-960 where one version is queried, but here I think 
.until() fits better. That might just be a matter of taste and in the 
end I am fine with both as long as it is well documented.



Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we 
tend to move away from using getters without "get", recently. So I 
would keep the "get".



Best,
Bruno

On 10/3/23 7:49 PM, Walker Carlson wrote:

Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me 
where
until gives me the idea that the query is making a change. It's 
totally a

connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-05 Thread Bruno Cadonna

Hi Matthias,

Given all the IQv2 KIPs that use getX and given recent PRs (internal 
interfaces mainly) that got merged, I was under the impression that we 
moved away from the strict no-getX policy.


I do not think it was an accident using getX in the IQv2 KIPs since 
somebody would have brought it up, otherwise.


I am fine with both types of getters.

If we think, we need to discuss this in a broader context, let's start a 
separate thread.



Best,
Bruno





On 10/5/23 7:44 AM, Matthias J. Sax wrote:

I agree to (almost) everything what Bruno said.


In general, we tend to move away from using getters without "get", 
recently. So I would keep the "get".


This is new to me? Can you elaborate on this point? Why do you think 
that's the case?


I actually did realize (after Walker mentioned it) that existing query 
types use `get` prefix, but to me it seems that it was by accident and 
we should consider correcting it? Thus, I would actually prefer to not 
add the `get` prefix for new methods query types.


IMHO, we should do a follow up KIP to deprecate all methods with `get` 
prefix and replace them with new ones without `get` -- it's of course 
always kinda "unnecessary" noise, but if we don't do it, we might get 
into more and more inconsistent naming what would result in a "bad" API.


If we indeed want to change the convention and use the `get` prefix, I 
would strongly advocate to bit the bullet and do KIP to pro-actively add 
the `get` "everywhere" it's missing... But overall, it seems to be a 
much broader decision, and we should get buy in from many committers 
about it -- as long as there is no broad consensus to add `get` 
everywhere, I would strongly prefer not to diverge from the current 
agreement to omit `get`.




-Matthias




On 10/4/23 2:36 AM, Bruno Cadonna wrote:

Hi,

Regarding tombstones:
As far as I understand, we need to add either a validTo field to 
VersionedRecord or we need to return tombstones, otherwise the result 
is not complete, because users could never know a record was deleted 
at some point before the second non-null value was put.
I like more adding the validTo field since it makes the result more 
concise and easier interpretable.


Extending on Victoria's example, with the following puts

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)

the result with tombstones would be

value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)

instead of

value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)

The benefit of conciseness would already apply to one single tombstone.

On the other hand, why would somebody write consecutive tombstones 
into a versioned state store? I guess if somebody does that on 
purpose, then there should be a way to retrieve each of those 
tombstones, right?
So maybe we need both -- validTo field and the option to return 
tombstones. The latter might be moved to a future KIP in case we see 
the need.



Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or 
.until()). If we go with .within(), I would opt for 
.withinTimeRange(fromTs, toTs), because the query becomes more readable:


MultiVersionedKeyQuery
   .withKey(1)
   .withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z), 
Instant.parse(2023-08-04T10:37:30.00Z))


If we stay with .from() and .until(), we should consider .fromTime() 
and .untilTime() (or .toTime()):


MultiVersionedKeyQuery
  .withKey(1)
  .fromTime(Instant.parse(2023-08-03T10:37:30.00Z))
  .untilTime(Instant.parse(2023-08-04T10:37:30.00Z))



Regarding asOf vs. until:
I think asOf() is more used in point in time queries as Walker 
mentioned where this KIP specifies a time range. IMO asOf() fits very 
well with KIP-960 where one version is queried, but here I think 
.until() fits better. That might just be a matter of taste and in the 
end I am fine with both as long as it is well documented.



Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we 
tend to move away from using getters without "get", recently. So I 
would keep the "get".



Best,
Bruno

On 10/3/23 7:49 PM, Walker Carlson wrote:

Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me 
where
until gives me the idea that the query is making a change. It's 
totally a

connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the "get" We don't
normally use that in getters

    * The key that was specified for this query.
    */
   public K getKey();

   /**
    * The starting time point of the query, if specified
    */
   public Optional getFromTimestamp();

   /**
    * The ending time point of the query, if specified
    */
   public Optional getAsOfTimestamp();

Other than that I didn't have 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-04 Thread Matthias J. Sax

I agree to (almost) everything what Bruno said.



In general, we tend to move away from using getters without "get", recently. So I would 
keep the "get".


This is new to me? Can you elaborate on this point? Why do you think 
that's the case?


I actually did realize (after Walker mentioned it) that existing query 
types use `get` prefix, but to me it seems that it was by accident and 
we should consider correcting it? Thus, I would actually prefer to not 
add the `get` prefix for new methods query types.


IMHO, we should do a follow up KIP to deprecate all methods with `get` 
prefix and replace them with new ones without `get` -- it's of course 
always kinda "unnecessary" noise, but if we don't do it, we might get 
into more and more inconsistent naming what would result in a "bad" API.


If we indeed want to change the convention and use the `get` prefix, I 
would strongly advocate to bit the bullet and do KIP to pro-actively add 
the `get` "everywhere" it's missing... But overall, it seems to be a 
much broader decision, and we should get buy in from many committers 
about it -- as long as there is no broad consensus to add `get` 
everywhere, I would strongly prefer not to diverge from the current 
agreement to omit `get`.




-Matthias




On 10/4/23 2:36 AM, Bruno Cadonna wrote:

Hi,

Regarding tombstones:
As far as I understand, we need to add either a validTo field to 
VersionedRecord or we need to return tombstones, otherwise the result is 
not complete, because users could never know a record was deleted at 
some point before the second non-null value was put.
I like more adding the validTo field since it makes the result more 
concise and easier interpretable.


Extending on Victoria's example, with the following puts

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)

the result with tombstones would be

value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)

instead of

value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)

The benefit of conciseness would already apply to one single tombstone.

On the other hand, why would somebody write consecutive tombstones into 
a versioned state store? I guess if somebody does that on purpose, then 
there should be a way to retrieve each of those tombstones, right?
So maybe we need both -- validTo field and the option to return 
tombstones. The latter might be moved to a future KIP in case we see the 
need.



Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or .until()). 
If we go with .within(), I would opt for .withinTimeRange(fromTs, toTs), 
because the query becomes more readable:


MultiVersionedKeyQuery
   .withKey(1)
   .withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z), 
Instant.parse(2023-08-04T10:37:30.00Z))


If we stay with .from() and .until(), we should consider .fromTime() and 
.untilTime() (or .toTime()):


MultiVersionedKeyQuery
  .withKey(1)
  .fromTime(Instant.parse(2023-08-03T10:37:30.00Z))
  .untilTime(Instant.parse(2023-08-04T10:37:30.00Z))



Regarding asOf vs. until:
I think asOf() is more used in point in time queries as Walker mentioned 
where this KIP specifies a time range. IMO asOf() fits very well with 
KIP-960 where one version is queried, but here I think .until() fits 
better. That might just be a matter of taste and in the end I am fine 
with both as long as it is well documented.



Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we 
tend to move away from using getters without "get", recently. So I would 
keep the "get".



Best,
Bruno

On 10/3/23 7:49 PM, Walker Carlson wrote:

Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me 
where

until gives me the idea that the query is making a change. It's totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the "get" We don't
normally use that in getters

    * The key that was specified for this query.
    */
   public K getKey();

   /**
    * The starting time point of the query, if specified
    */
   public Optional getFromTimestamp();

   /**
    * The ending time point of the query, if specified
    */
   public Optional getAsOfTimestamp();

Other than that I didn't have too much to add. Overall I like the 
direction

of the KIP and think the funcatinlyt is all there!
best,
Walker



On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax  wrote:


Thanks for the updated KIP. Overall I like it.

Victoria raises a very good point, and I personally tend to prefer (I
believe so does Victoria, but it's not totally clear from her email) if
a range query would not return any tombstones, ie, only two records in
Victoria's example. Thus, it seems best to include a `validTo` ts-field
to `Ver

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-04 Thread Bruno Cadonna

Hi,

Regarding tombstones:
As far as I understand, we need to add either a validTo field to 
VersionedRecord or we need to return tombstones, otherwise the result is 
not complete, because users could never know a record was deleted at 
some point before the second non-null value was put.
I like more adding the validTo field since it makes the result more 
concise and easier interpretable.


Extending on Victoria's example, with the following puts

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)

the result with tombstones would be

value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)

instead of

value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)

The benefit of conciseness would already apply to one single tombstone.

On the other hand, why would somebody write consecutive tombstones into 
a versioned state store? I guess if somebody does that on purpose, then 
there should be a way to retrieve each of those tombstones, right?
So maybe we need both -- validTo field and the option to return 
tombstones. The latter might be moved to a future KIP in case we see the 
need.



Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or .until()). 
If we go with .within(), I would opt for .withinTimeRange(fromTs, toTs), 
because the query becomes more readable:


MultiVersionedKeyQuery
  .withKey(1)
  .withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z), 
Instant.parse(2023-08-04T10:37:30.00Z))


If we stay with .from() and .until(), we should consider .fromTime() and 
.untilTime() (or .toTime()):


MultiVersionedKeyQuery
 .withKey(1)
 .fromTime(Instant.parse(2023-08-03T10:37:30.00Z))
 .untilTime(Instant.parse(2023-08-04T10:37:30.00Z))



Regarding asOf vs. until:
I think asOf() is more used in point in time queries as Walker mentioned 
where this KIP specifies a time range. IMO asOf() fits very well with 
KIP-960 where one version is queried, but here I think .until() fits 
better. That might just be a matter of taste and in the end I am fine 
with both as long as it is well documented.



Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we 
tend to move away from using getters without "get", recently. So I would 
keep the "get".



Best,
Bruno

On 10/3/23 7:49 PM, Walker Carlson wrote:

Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me where
until gives me the idea that the query is making a change. It's totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the "get" We don't
normally use that in getters

* The key that was specified for this query.
*/
   public K getKey();

   /**
* The starting time point of the query, if specified
*/
   public Optional getFromTimestamp();

   /**
* The ending time point of the query, if specified
*/
   public Optional getAsOfTimestamp();

Other than that I didn't have too much to add. Overall I like the direction
of the KIP and think the funcatinlyt is all there!
best,
Walker



On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax  wrote:


Thanks for the updated KIP. Overall I like it.

Victoria raises a very good point, and I personally tend to prefer (I
believe so does Victoria, but it's not totally clear from her email) if
a range query would not return any tombstones, ie, only two records in
Victoria's example. Thus, it seems best to include a `validTo` ts-field
to `VersionedRecord` -- otherwise, the retrieved result cannot be
interpreted correctly.

Not sure what others think about it.

I would also be open to actually add a `includeDeletes()` (or
`includeTombstones()`) method/flag (disabled by default) to allow users
to get all tombstone: this would only be helpful if there are two
consecutive tombstone though (if I got it right), so not sure if we want
to add it or not -- it seems also possible to add it later if there is
user demand for it, so it might be a premature addition as this point?


Nit:


the public interface ValueIterator is used


"is used" -> "is added" (otherwise it sounds like as if `ValueIterator`
exist already)



Should we also add a `.within(fromTs, toTs)` (or maybe some better
name?) to allow specifying both bounds at once? The existing
`RangeQuery` does the same for specifying the key-range, so might be
good to add for time-range too?



-Matthias


On 9/6/23 5:01 AM, Bruno Cadonna wrote:

In my last e-mail I missed to finish a sentence.

"I think from a KIP"

should be

"I think the KIP looks good!"


On 9/6/23 1:59 PM, Bruno Cadonna wrote:

Hi Alieh,

Thanks for the KIP!

I think from a KIP

1.
I propose to throw an IllegalArgumentException or an
IllegalStateException for meaningless combinations. In any case, the
KIP should specify w

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-03 Thread Walker Carlson
Hey Alieh thanks for the KIP,

Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me where
until gives me the idea that the query is making a change. It's totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.

Also for these methods it makes sense to drop the "get" We don't
normally use that in getters

   * The key that was specified for this query.
   */
  public K getKey();

  /**
   * The starting time point of the query, if specified
   */
  public Optional getFromTimestamp();

  /**
   * The ending time point of the query, if specified
   */
  public Optional getAsOfTimestamp();

Other than that I didn't have too much to add. Overall I like the direction
of the KIP and think the funcatinlyt is all there!
best,
Walker



On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax  wrote:

> Thanks for the updated KIP. Overall I like it.
>
> Victoria raises a very good point, and I personally tend to prefer (I
> believe so does Victoria, but it's not totally clear from her email) if
> a range query would not return any tombstones, ie, only two records in
> Victoria's example. Thus, it seems best to include a `validTo` ts-field
> to `VersionedRecord` -- otherwise, the retrieved result cannot be
> interpreted correctly.
>
> Not sure what others think about it.
>
> I would also be open to actually add a `includeDeletes()` (or
> `includeTombstones()`) method/flag (disabled by default) to allow users
> to get all tombstone: this would only be helpful if there are two
> consecutive tombstone though (if I got it right), so not sure if we want
> to add it or not -- it seems also possible to add it later if there is
> user demand for it, so it might be a premature addition as this point?
>
>
> Nit:
>
> > the public interface ValueIterator is used
>
> "is used" -> "is added" (otherwise it sounds like as if `ValueIterator`
> exist already)
>
>
>
> Should we also add a `.within(fromTs, toTs)` (or maybe some better
> name?) to allow specifying both bounds at once? The existing
> `RangeQuery` does the same for specifying the key-range, so might be
> good to add for time-range too?
>
>
>
> -Matthias
>
>
> On 9/6/23 5:01 AM, Bruno Cadonna wrote:
> > In my last e-mail I missed to finish a sentence.
> >
> > "I think from a KIP"
> >
> > should be
> >
> > "I think the KIP looks good!"
> >
> >
> > On 9/6/23 1:59 PM, Bruno Cadonna wrote:
> >> Hi Alieh,
> >>
> >> Thanks for the KIP!
> >>
> >> I think from a KIP
> >>
> >> 1.
> >> I propose to throw an IllegalArgumentException or an
> >> IllegalStateException for meaningless combinations. In any case, the
> >> KIP should specify what exception is thrown.
> >>
> >> 2.
> >> Why does not specifying a range return the latest version? I would
> >> expect that it returns all versions since an empty lower or upper
> >> limit is interpreted as no limit.
> >>
> >> 3.
> >> I second Matthias comment about replacing "asOf" with "until" or "to".
> >>
> >> 4.
> >> Do we need "allVersions()"? As I said above I would return all
> >> versions if no limits are specified. I think if we get rid of
> >> allVersions() there might not be any meaningless combinations anymore.
> >> If a user applies twice the same limit like for example
> >> MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last one wins.
> >>
> >> 5.
> >> Could you add some more examples with time ranges to the example
> section?
> >>
> >> 6.
> >> The KIP misses the test plan section.
> >>
> >> 7.
> >> I propose to rename the class to "MultiVersionKeyQuery" since we are
> >> querying multiple versions of the same key.
> >>
> >> 8.
> >> Could you also add withAscendingTimestamps()? IMO it gives users the
> >> possibility to make their code more readable instead of only relying
> >> on the default.
> >>
> >> Best,
> >> Bruno
> >>
> >>
> >> On 8/17/23 4:13 AM, Matthias J. Sax wrote:
> >>> Thanks for splitting this part into a separate KIP!
> >>>
> >>> For `withKey()` we should be explicit that `null` is not allowed.
> >>>
> >>> (Looking into existing `KeyQuery` it seems the JavaDocs don't cover
> >>> this either -- would you like to do a tiny cleanup PR for this, or
> >>> fix on-the-side in one of your PRs?)
> >>>
> >>>
> >>>
>  The key query returns all the records that are valid in the time
>  range starting from the timestamp {@code fromTimestamp}.
> >>>
> >>> In the JavaDocs you use the phrase `are valid` -- I think we need to
> >>> explain what "valid" means? It might even be worth to add some
> >>> examples. It's annoying, but being precise if kinda important.
> >>>
> >>> With regard to KIP-962, should we allow `null` for time bounds ? The
> >>> JavaDocs should also be explicit if `null` is allowed or not and what
> >>> the semantics are if allowed.
> >>>
> >>>
> >>>
> >>> You are using `asOf()` however, because we are doing time-range
> >>> queries, to me using `until()` to des

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-10-02 Thread Matthias J. Sax

Thanks for the updated KIP. Overall I like it.

Victoria raises a very good point, and I personally tend to prefer (I 
believe so does Victoria, but it's not totally clear from her email) if 
a range query would not return any tombstones, ie, only two records in 
Victoria's example. Thus, it seems best to include a `validTo` ts-field 
to `VersionedRecord` -- otherwise, the retrieved result cannot be 
interpreted correctly.


Not sure what others think about it.

I would also be open to actually add a `includeDeletes()` (or 
`includeTombstones()`) method/flag (disabled by default) to allow users 
to get all tombstone: this would only be helpful if there are two 
consecutive tombstone though (if I got it right), so not sure if we want 
to add it or not -- it seems also possible to add it later if there is 
user demand for it, so it might be a premature addition as this point?



Nit:

the public interface ValueIterator is used 


"is used" -> "is added" (otherwise it sounds like as if `ValueIterator` 
exist already)




Should we also add a `.within(fromTs, toTs)` (or maybe some better 
name?) to allow specifying both bounds at once? The existing 
`RangeQuery` does the same for specifying the key-range, so might be 
good to add for time-range too?




-Matthias


On 9/6/23 5:01 AM, Bruno Cadonna wrote:

In my last e-mail I missed to finish a sentence.

"I think from a KIP"

should be

"I think the KIP looks good!"


On 9/6/23 1:59 PM, Bruno Cadonna wrote:

Hi Alieh,

Thanks for the KIP!

I think from a KIP

1.
I propose to throw an IllegalArgumentException or an 
IllegalStateException for meaningless combinations. In any case, the 
KIP should specify what exception is thrown.


2.
Why does not specifying a range return the latest version? I would 
expect that it returns all versions since an empty lower or upper 
limit is interpreted as no limit.


3.
I second Matthias comment about replacing "asOf" with "until" or "to".

4.
Do we need "allVersions()"? As I said above I would return all 
versions if no limits are specified. I think if we get rid of 
allVersions() there might not be any meaningless combinations anymore.
If a user applies twice the same limit like for example 
MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last one wins.


5.
Could you add some more examples with time ranges to the example section?

6.
The KIP misses the test plan section.

7.
I propose to rename the class to "MultiVersionKeyQuery" since we are 
querying multiple versions of the same key.


8.
Could you also add withAscendingTimestamps()? IMO it gives users the 
possibility to make their code more readable instead of only relying 
on the default.


Best,
Bruno


On 8/17/23 4:13 AM, Matthias J. Sax wrote:

Thanks for splitting this part into a separate KIP!

For `withKey()` we should be explicit that `null` is not allowed.

(Looking into existing `KeyQuery` it seems the JavaDocs don't cover 
this either -- would you like to do a tiny cleanup PR for this, or 
fix on-the-side in one of your PRs?)




The key query returns all the records that are valid in the time 
range starting from the timestamp {@code fromTimestamp}.


In the JavaDocs you use the phrase `are valid` -- I think we need to 
explain what "valid" means? It might even be worth to add some 
examples. It's annoying, but being precise if kinda important.


With regard to KIP-962, should we allow `null` for time bounds ? The 
JavaDocs should also be explicit if `null` is allowed or not and what 
the semantics are if allowed.




You are using `asOf()` however, because we are doing time-range 
queries, to me using `until()` to describe the upper bound would 
sound better (I am not a native speaker though, so maybe I am off?)



The key query returns all the records that have timestamp <= {@code 
asOfTimestamp}.


This is only correct if not lower-bound is set, right?


In your reply to KIP-960 you mentioned:


the meaningless combinations are prevented by throwing exceptions.


We should add corresponding JavaDocs like:

    @throws IllegalArgumentException if {@code fromTimestamp} is 
equal or

 larger than {@code untilTimestamp}

Or something similar.


With regard to KIP-960: if we need to introduce a `VersionedKeyQuery` 
class for single-key-single-ts lookup, would we need to find a new 
name for the query class of this KIP, given that the return type is 
different?



-Matthias



On 8/16/23 10:57 AM, Alieh Saeedi wrote:

Hi all,

I splitted KIP-960

into three separate KIPs. Therefore, please continue discussions
about single-key, multi-timestamp interactive queries here. You can 
see all

the addressed reviews on the following page. Thanks in advance.

KIP-968: Support single-key_multi-timestamp interactive queries 
(IQv2) for

versioned state stores


Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-09-06 Thread Bruno Cadonna

Hi Alieh,

Thanks for the KIP!

I think from a KIP

1.
I propose to throw an IllegalArgumentException or an 
IllegalStateException for meaningless combinations. In any case, the KIP 
should specify what exception is thrown.


2.
Why does not specifying a range return the latest version? I would 
expect that it returns all versions since an empty lower or upper limit 
is interpreted as no limit.


3.
I second Matthias comment about replacing "asOf" with "until" or "to".

4.
Do we need "allVersions()"? As I said above I would return all versions 
if no limits are specified. I think if we get rid of allVersions() there 
might not be any meaningless combinations anymore.
If a user applies twice the same limit like for example 
MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last one wins.


5.
Could you add some more examples with time ranges to the example section?

6.
The KIP misses the test plan section.

7.
I propose to rename the class to "MultiVersionKeyQuery" since we are 
querying multiple versions of the same key.


8.
Could you also add withAscendingTimestamps()? IMO it gives users the 
possibility to make their code more readable instead of only relying on 
the default.


Best,
Bruno


On 8/17/23 4:13 AM, Matthias J. Sax wrote:

Thanks for splitting this part into a separate KIP!

For `withKey()` we should be explicit that `null` is not allowed.

(Looking into existing `KeyQuery` it seems the JavaDocs don't cover this 
either -- would you like to do a tiny cleanup PR for this, or fix 
on-the-side in one of your PRs?)




The key query returns all the records that are valid in the time range 
starting from the timestamp {@code fromTimestamp}.


In the JavaDocs you use the phrase `are valid` -- I think we need to 
explain what "valid" means? It might even be worth to add some examples. 
It's annoying, but being precise if kinda important.


With regard to KIP-962, should we allow `null` for time bounds ? The 
JavaDocs should also be explicit if `null` is allowed or not and what 
the semantics are if allowed.




You are using `asOf()` however, because we are doing time-range queries, 
to me using `until()` to describe the upper bound would sound better (I 
am not a native speaker though, so maybe I am off?)



The key query returns all the records that have timestamp <= {@code 
asOfTimestamp}.


This is only correct if not lower-bound is set, right?


In your reply to KIP-960 you mentioned:


the meaningless combinations are prevented by throwing exceptions.


We should add corresponding JavaDocs like:

    @throws IllegalArgumentException if {@code fromTimestamp} is equal or
     larger than {@code untilTimestamp}

Or something similar.


With regard to KIP-960: if we need to introduce a `VersionedKeyQuery` 
class for single-key-single-ts lookup, would we need to find a new name 
for the query class of this KIP, given that the return type is different?



-Matthias



On 8/16/23 10:57 AM, Alieh Saeedi wrote:

Hi all,

I splitted KIP-960

into three separate KIPs. Therefore, please continue discussions
about single-key, multi-timestamp interactive queries here. You can 
see all

the addressed reviews on the following page. Thanks in advance.

KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) 
for

versioned state stores


I look forward to your feedback!

Cheers,
Alieh



Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-09-06 Thread Bruno Cadonna

In my last e-mail I missed to finish a sentence.

"I think from a KIP"

should be

"I think the KIP looks good!"


On 9/6/23 1:59 PM, Bruno Cadonna wrote:

Hi Alieh,

Thanks for the KIP!

I think from a KIP

1.
I propose to throw an IllegalArgumentException or an 
IllegalStateException for meaningless combinations. In any case, the KIP 
should specify what exception is thrown.


2.
Why does not specifying a range return the latest version? I would 
expect that it returns all versions since an empty lower or upper limit 
is interpreted as no limit.


3.
I second Matthias comment about replacing "asOf" with "until" or "to".

4.
Do we need "allVersions()"? As I said above I would return all versions 
if no limits are specified. I think if we get rid of allVersions() there 
might not be any meaningless combinations anymore.
If a user applies twice the same limit like for example 
MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last one wins.


5.
Could you add some more examples with time ranges to the example section?

6.
The KIP misses the test plan section.

7.
I propose to rename the class to "MultiVersionKeyQuery" since we are 
querying multiple versions of the same key.


8.
Could you also add withAscendingTimestamps()? IMO it gives users the 
possibility to make their code more readable instead of only relying on 
the default.


Best,
Bruno


On 8/17/23 4:13 AM, Matthias J. Sax wrote:

Thanks for splitting this part into a separate KIP!

For `withKey()` we should be explicit that `null` is not allowed.

(Looking into existing `KeyQuery` it seems the JavaDocs don't cover 
this either -- would you like to do a tiny cleanup PR for this, or fix 
on-the-side in one of your PRs?)




The key query returns all the records that are valid in the time 
range starting from the timestamp {@code fromTimestamp}.


In the JavaDocs you use the phrase `are valid` -- I think we need to 
explain what "valid" means? It might even be worth to add some 
examples. It's annoying, but being precise if kinda important.


With regard to KIP-962, should we allow `null` for time bounds ? The 
JavaDocs should also be explicit if `null` is allowed or not and what 
the semantics are if allowed.




You are using `asOf()` however, because we are doing time-range 
queries, to me using `until()` to describe the upper bound would sound 
better (I am not a native speaker though, so maybe I am off?)



The key query returns all the records that have timestamp <= {@code 
asOfTimestamp}.


This is only correct if not lower-bound is set, right?


In your reply to KIP-960 you mentioned:


the meaningless combinations are prevented by throwing exceptions.


We should add corresponding JavaDocs like:

    @throws IllegalArgumentException if {@code fromTimestamp} is equal or
 larger than {@code untilTimestamp}

Or something similar.


With regard to KIP-960: if we need to introduce a `VersionedKeyQuery` 
class for single-key-single-ts lookup, would we need to find a new 
name for the query class of this KIP, given that the return type is 
different?



-Matthias



On 8/16/23 10:57 AM, Alieh Saeedi wrote:

Hi all,

I splitted KIP-960

into three separate KIPs. Therefore, please continue discussions
about single-key, multi-timestamp interactive queries here. You can 
see all

the addressed reviews on the following page. Thanks in advance.

KIP-968: Support single-key_multi-timestamp interactive queries 
(IQv2) for

versioned state stores


I look forward to your feedback!

Cheers,
Alieh



RE: Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-08-24 Thread Victoria Xia
Hi Alieh,

Thanks for the KIP!

1. As mentioned in the discussion thread for KIP-960, I think it'd be good to 
add a new method to the VersionedKeyValueStore interface for supporting 
single-key, multi-timestamp lookups, as part of implementing these new 
interactive queries so that the IQ implementation can simply call the relevant 
underlying method.

2. Also, I'd like to discuss whether it'd be better to update the 
VersionedRecord class to allow null values, or to add a validTo timestamp field 
to VersionedRecord instead. Here's an example scenario to consider: suppose a 
user puts two values and two tombstones into a versioned store, all for the 
same key:

put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, v2, time=20)

And then issues an interactive query for all records for this key, in order to 
get back a `ValueIterator>`. How many records do you expect 
to be in ValueIterator?

By proposing to allow null values in VersionedRecord, I think you are proposing 
either four records ((v1, t=0), (null, t=5), (null, t=10), (v2, t=20)) or three 
records ((v1, t=0), (null, t=5), (v2, t=20)) but there's a case to be made that 
actually only two records should be returned ((v1, t=0, validTo=5), (v2, t=20, 
validTo=null/infinity)). The last interpretation is most consistent with the 
fact that `get(key)` and `get(key, asOfTimestamp)` do not distinguish between 
tombstones and no records having been inserted. As in, get(key=k, 
asOfTimestamp=7) returns null instead of a VersionedRecord with null value and 
timestamp 5. Curious to hear what others think about this.

Best,
Victoria

On 2023/08/17 02:13:24 "Matthias J. Sax" wrote:
> Thanks for splitting this part into a separate KIP!
> 
> For `withKey()` we should be explicit that `null` is not allowed.
> 
> (Looking into existing `KeyQuery` it seems the JavaDocs don't cover this 
> either -- would you like to do a tiny cleanup PR for this, or fix 
> on-the-side in one of your PRs?)
> 
> 
> 
> > The key query returns all the records that are valid in the time range 
> > starting from the timestamp {@code fromTimestamp}.
> 
> In the JavaDocs you use the phrase `are valid` -- I think we need to 
> explain what "valid" means? It might even be worth to add some examples. 
> It's annoying, but being precise if kinda important.
> 
> With regard to KIP-962, should we allow `null` for time bounds ? The 
> JavaDocs should also be explicit if `null` is allowed or not and what 
> the semantics are if allowed.
> 
> 
> 
> You are using `asOf()` however, because we are doing time-range queries, 
> to me using `until()` to describe the upper bound would sound better (I 
> am not a native speaker though, so maybe I am off?)
> 
> 
> > The key query returns all the records that have timestamp <= {@code 
> > asOfTimestamp}.
> 
> This is only correct if not lower-bound is set, right?
> 
> 
> In your reply to KIP-960 you mentioned:
> 
> > the meaningless combinations are prevented by throwing exceptions.
> 
> We should add corresponding JavaDocs like:
> 
> @throws IllegalArgumentException if {@code fromTimestamp} is equal or
>  larger than {@code untilTimestamp}
> 
> Or something similar.
> 
> 
> With regard to KIP-960: if we need to introduce a `VersionedKeyQuery` 
> class for single-key-single-ts lookup, would we need to find a new name 
> for the query class of this KIP, given that the return type is different?
> 
> 
> -Matthias
> 
> 
> 
> On 8/16/23 10:57 AM, Alieh Saeedi wrote:
> > Hi all,
> > 
> > I splitted KIP-960
> > 
> > into three separate KIPs. Therefore, please continue discussions
> > about single-key, multi-timestamp interactive queries here. You can see all
> > the addressed reviews on the following page. Thanks in advance.
> > 
> > KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for
> > versioned state stores
> > 
> > 
> > I look forward to your feedback!
> > 
> > Cheers,
> > Alieh
> > 
> 

Re: [DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-08-16 Thread Matthias J. Sax

Thanks for splitting this part into a separate KIP!

For `withKey()` we should be explicit that `null` is not allowed.

(Looking into existing `KeyQuery` it seems the JavaDocs don't cover this 
either -- would you like to do a tiny cleanup PR for this, or fix 
on-the-side in one of your PRs?)





The key query returns all the records that are valid in the time range starting 
from the timestamp {@code fromTimestamp}.


In the JavaDocs you use the phrase `are valid` -- I think we need to 
explain what "valid" means? It might even be worth to add some examples. 
It's annoying, but being precise if kinda important.


With regard to KIP-962, should we allow `null` for time bounds ? The 
JavaDocs should also be explicit if `null` is allowed or not and what 
the semantics are if allowed.




You are using `asOf()` however, because we are doing time-range queries, 
to me using `until()` to describe the upper bound would sound better (I 
am not a native speaker though, so maybe I am off?)




The key query returns all the records that have timestamp <= {@code 
asOfTimestamp}.


This is only correct if not lower-bound is set, right?


In your reply to KIP-960 you mentioned:


the meaningless combinations are prevented by throwing exceptions.


We should add corresponding JavaDocs like:

   @throws IllegalArgumentException if {@code fromTimestamp} is equal or
larger than {@code untilTimestamp}

Or something similar.


With regard to KIP-960: if we need to introduce a `VersionedKeyQuery` 
class for single-key-single-ts lookup, would we need to find a new name 
for the query class of this KIP, given that the return type is different?



-Matthias



On 8/16/23 10:57 AM, Alieh Saeedi wrote:

Hi all,

I splitted KIP-960

into three separate KIPs. Therefore, please continue discussions
about single-key, multi-timestamp interactive queries here. You can see all
the addressed reviews on the following page. Thanks in advance.

KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for
versioned state stores


I look forward to your feedback!

Cheers,
Alieh



[DISCUSS] KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for versioned state stores

2023-08-16 Thread Alieh Saeedi
Hi all,

I splitted KIP-960

into three separate KIPs. Therefore, please continue discussions
about single-key, multi-timestamp interactive queries here. You can see all
the addressed reviews on the following page. Thanks in advance.

KIP-968: Support single-key_multi-timestamp interactive queries (IQv2) for
versioned state stores


I look forward to your feedback!

Cheers,
Alieh