[ 
https://issues.apache.org/jira/browse/PHOENIX-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Fernando updated PHOENIX-1954:
----------------------------------
    Attachment: PHOENIX-1954-wip6.patch

[~jamestaylor] Following our discussions, attached is a patch that 
addresses issues (1) and (2).

1) We now support allocating slots in bulk directly from the client cache if 
the cache has sufficient slots available to service the request. I wasn't able 
to get rid of all the special casing, as we ran into issues trying to 
completely share the code for determining whether the cache is exhausted across 
both the NEXT VALUE FOR and NEXT <n> VALUES FOR flows. The issues were related 
to cycles, overflows, and underflows, which created subtle differences in 
behavior between the two flows that I couldn't generalize without making the 
code very brittle and risking hard-to-detect regressions. Therefore the 
existing check below remains, so that the NEXT VALUE FOR flow stays the same, 
which significantly reduces the risk of strange regressions like the ones I 
found.

{code}
value.currentValue == value.nextValue
{code}
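
To illustrate the difference between the two checks, here is a minimal sketch. The class and method names (CachedSequence, canServe, allocate) are hypothetical and do not mirror the patch; it assumes a positive incrementBy, no cycle, and no overflow, which are exactly the cases where the two flows diverge in the real code.

```java
// Hypothetical sketch only; not the code in the attached patch.
final class CachedSequence {
    private long currentValue;      // next value to hand out from the client cache
    private final long nextValue;   // first value NOT covered by the cache
    private final long incrementBy; // assumed positive in this sketch

    CachedSequence(long currentValue, long nextValue, long incrementBy) {
        this.currentValue = currentValue;
        this.nextValue = nextValue;
        this.incrementBy = incrementBy;
    }

    // Single-value flow: the cache is exhausted once currentValue reaches nextValue.
    boolean isExhausted() {
        return currentValue == nextValue;
    }

    // Bulk flow: the cache can serve the request only if enough slots remain.
    boolean canServe(long numToAllocate) {
        long available = (nextValue - currentValue) / incrementBy;
        return numToAllocate <= available;
    }

    // Reserves numToAllocate slots from the cache and returns the first value.
    long allocate(long numToAllocate) {
        if (!canServe(numToAllocate)) {
            throw new IllegalStateException("cache cannot serve request; a server round trip is needed");
        }
        long first = currentValue;
        currentValue += numToAllocate * incrementBy;
        return first;
    }
}
```

With a cache covering 1 through 100, a request for 40 slots returns 1 and a following request for 60 slots returns 41, after which the single-value check reports the cache as exhausted.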

2) Added logic to ensure that if NEXT <n> VALUES FOR performs an illegal 
operation, like requesting slots on a sequence with a cycle or exceeding the 
max or min value, we consistently throw a SQLException during expression 
evaluation in all cases. This involved adding client-side validation, in lieu 
of forcing an RPC call each time. I added a few static methods on SequenceUtil 
to consolidate validation logic between server and client.
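
A rough sketch of what such a shared check could look like follows. The class name, method name, parameter list, and error messages are all assumptions made for illustration; they are not the actual SequenceUtil methods in the patch.

```java
import java.sql.SQLException;

// Hypothetical sketch only; not the SequenceUtil API from the patch.
final class BulkAllocationCheck {
    private BulkAllocationCheck() {}

    // Throws SQLException if reserving numToAllocate values starting at
    // currentValue would be illegal for the given sequence settings.
    static void validate(long currentValue, long incrementBy, long minValue,
                         long maxValue, boolean cycle, long numToAllocate)
            throws SQLException {
        if (numToAllocate < 1) {
            throw new SQLException("number of values to allocate must be positive");
        }
        if (cycle) {
            throw new SQLException("bulk allocation is not allowed on a sequence with a cycle");
        }
        long last;
        try {
            // Exact arithmetic so that long overflow/underflow is detected, not wrapped.
            last = Math.addExact(currentValue,
                    Math.multiplyExact(incrementBy, numToAllocate - 1));
        } catch (ArithmeticException e) {
            throw new SQLException("bulk allocation overflows the sequence range", e);
        }
        if (last > maxValue || last < minValue) {
            throw new SQLException("bulk allocation exceeds the sequence min/max value");
        }
    }
}
```

Because the method is static and takes only plain values, the same check could run on the client before any RPC and again on the server.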

The attached patch isn't rebased so it's easier for you to see changes. Let me 
know if you'd like a rebased version with all the changes and I'll do that once 
you are ready.

> Reserve chunks of numbers for a sequence
> ----------------------------------------
>
>                 Key: PHOENIX-1954
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1954
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>            Assignee: Jan Fernando
>         Attachments: PHOENIX-1954-rebased.patch, PHOENIX-1954-wip.patch, 
> PHOENIX-1954-wip2.patch.txt, PHOENIX-1954-wip3.patch, 
> PHOENIX-1954-wip4.patch, PHOENIX-1954-wip5-rebased.patch, 
> PHOENIX-1954-wip6.patch
>
>
> In order to be able to generate many ids in bulk (for example in map reduce 
> jobs) we need a way to generate or reserve large sets of ids. We also need to 
> mix ids reserved with incrementally generated ids from other clients. 
> For this we need to atomically increment the sequence and return the value it 
> had when the increment happened.
> If we're OK to throw the current cached set of values away we can do
> {{NEXT VALUE FOR <seq>(,<N>)}}, which needs to increment the value and return 
> the value it incremented from (i.e. it has to throw the current cache away, 
> and return the next value it found at the server).
> Or we can invent a new syntax {{RESERVE VALUES FOR <seq>, <N>}} that does the 
> same, but does not invalidate the cache.
> Note that in either case we won't retrieve the reserved set of values via 
> {{NEXT VALUE FOR}} because we'd need to be idempotent in our case. All we 
> need to guarantee is that after a call to {{RESERVE VALUES FOR <seq>, <N>}}, 
> which returns a value <M>, the range [M, M+N) won't be used by any 
> other user of the sequence. We might need to reserve 1bn ids this way ahead 
> of a map reduce run.
> Any better ideas?
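
The guarantee described in the issue, that a reservation returning M leaves [M, M+N) untouched by any other user of the sequence, amounts to an atomic fetch-and-add. The sketch below shows that semantic with an in-memory counter standing in for the server-side sequence state; the class and method names are hypothetical and this is not how Phoenix stores or updates sequences.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for the server-side sequence state.
final class InMemorySequence {
    private final AtomicLong next;

    InMemorySequence(long startWith) {
        this.next = new AtomicLong(startWith);
    }

    // Incremental use: hands out one value at a time.
    long nextValue() {
        return next.getAndIncrement();
    }

    // Bulk use: atomically advances the sequence by n and returns the value M
    // it had before the increment, so the caller owns the range [M, M+n).
    long reserve(long n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive");
        }
        return next.getAndAdd(n);
    }
}
```

Incremental and bulk callers can be mixed freely: a value handed out after a reservation always falls outside the reserved range.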



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
