[
https://issues.apache.org/jira/browse/PHOENIX-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jan Fernando updated PHOENIX-1954:
----------------------------------
Attachment: PHOENIX-1954-wip2.patch
[~jamestaylor] [~tdsilva] I have uploaded a second WIP patch that I think
implements a working version of NEXT <n> VALUES FOR <seq> based on the initial
patch James provided. I've added a new integration test class,
SequenceBulkAllocationIT.java, that covers interaction with various sequence
features including MIN, MAX, CYCLE, different values of INCREMENT BY (both
negative and positive), and interplay with NEXT VALUE FOR and CURRENT VALUE. So
far it seems good :) I have also been running SequenceIT.java to look for
regressions.
Some specific behaviors implemented in this patch I want to call out:
1) When performing a bulk allocation, if we hit the MIN, MAX or
Overflow/Underflow conditions we throw an exception. We don't support partial
allocations. You either get all the values you request or none.
2) We currently don't support the new syntax on sequences that have the CYCLE
flag set. If you try to execute NEXT <n> VALUES FOR <seq> on such a sequence,
an exception is thrown. [~tdsilva] and I chatted about this offline and felt
bulk allocation and cycles together raise a whole host of weird behaviors that
we would have to handle. For example, we probably wouldn't want to cycle in the
middle of a bulk allocation, as the semantics of which slots were allocated
become very complex. I propose deferring this work to a separate JIRA.
3) When a statement includes multiple NEXT <n> VALUES FOR and NEXT VALUE FOR
expressions for the same sequence, we honor the expression which allocates the
highest number of values, and all expressions (including any CURRENT VALUE FOR
expressions) return the value for that expression.
4) [~jamestaylor] One specific change I want to call out, which I had to make
to support the parallel numAllocations array in the SequenceManager: in
SequenceManager.validateSequences(), at line 206 in the patch, we were invoking
Collections.sort(nextSequences). This caused problems when a statement had
multiple expressions for different sequences. The number of slots to allocate
would get out of sync with the sequences and would be applied to the wrong
sequences. I commented this sort out and it looks to me like it's not needed.
What do you think? Was there a reason we were doing this that the tests in
SequenceIT.java don't cover?
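To make the all-or-nothing behavior in 1) concrete, here is a rough Java
sketch. It is not the code in the patch: BulkAllocationSketch, Seq and reserve
are hypothetical names, and the sequence is an in-memory stand-in rather than
the real server-side sequence row. It only tries to show the bounds check
happening before any state is changed.

```java
// Illustrative only: hypothetical names, not Phoenix classes or the patch code.
final class BulkAllocationSketch {

    /** Hypothetical in-memory stand-in for a sequence (no CYCLE support). */
    static final class Seq {
        long nextValue;          // value the next allocation will start at
        final long incrementBy;  // may be negative
        final long minValue;
        final long maxValue;

        Seq(long startWith, long incrementBy, long minValue, long maxValue) {
            this.nextValue = startWith;
            this.incrementBy = incrementBy;
            this.minValue = minValue;
            this.maxValue = maxValue;
        }
    }

    /**
     * Reserves n consecutive values and returns the first one.
     * If any value in the block would fall outside [minValue, maxValue], or
     * the arithmetic overflows a long, an exception is thrown and the
     * sequence is left untouched (no partial allocation).
     */
    static long reserve(Seq seq, long n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        long first = seq.nextValue;
        long last;
        long newNext;
        try {
            // Last value in the block, then the value following the block.
            last = Math.addExact(first, Math.multiplyExact(seq.incrementBy, n - 1));
            // Simplification: a block ending exactly at Long.MAX_VALUE or
            // Long.MIN_VALUE is treated as overflow by this sketch.
            newNext = Math.addExact(last, seq.incrementBy);
        } catch (ArithmeticException e) {
            throw new IllegalStateException("overflow/underflow during bulk allocation", e);
        }
        if (last > seq.maxValue || last < seq.minValue) {
            throw new IllegalStateException("bulk allocation would cross MIN/MAX");
        }
        // Only mutate state once the whole block is known to fit.
        seq.nextValue = newNext;
        return first;
    }
}
```

With a sequence starting at 1, INCREMENT BY 1 and MAXVALUE 10, reserving 5
values returns 1. A following request for 6 values fails and leaves the next
value at 6, so a later request for 5 values still succeeds and returns 6.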
I still need to do some more testing in addition to the IT tests I added in
SequenceBulkAllocationIT.java. Specifically, I still need to validate:
1) Backwards compatibility between older clients and an upgraded server
2) Test Sequences with Multi-Tenant views to make sure there are no regressions
there and the new semantics work
3) Look at the sequence reclaiming logic we have and make sure there are no
additional changes needed there to support the new semantics.
4) Run more existing IT tests.
Please take a look when you get a chance.
> Reserve chunks of numbers for a sequence
> ----------------------------------------
>
> Key: PHOENIX-1954
> URL: https://issues.apache.org/jira/browse/PHOENIX-1954
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Lars Hofhansl
> Assignee: Jan Fernando
> Attachments: PHOENIX-1954-wip.patch, PHOENIX-1954-wip2.patch
>
>
> In order to be able to generate many ids in bulk (for example in map reduce
> jobs) we need a way to generate or reserve large sets of ids. We also need to
> mix ids reserved with incrementally generated ids from other clients.
> For this we need to atomically increment the sequence and return the value it
> had when the increment happened.
> If we're OK to throw the current cached set of values away we can do
> {{NEXT VALUE FOR <seq>(,<N>)}}, that needs to increment value and return the
> value it incremented from (i.e. it has to throw the current cache away, and
> return the next value it found at the server).
> Or we can invent a new syntax {{RESERVE VALUES FOR <seq>, <N>}} that does the
> same, but does not invalidate the cache.
> Note that in either case we won't retrieve the reserved set of values via
> {{NEXT VALUE FOR}}, because we'd need to be idempotent in our case. All we
> need to guarantee is that after a call to {{RESERVE VALUES FOR <seq>, <N>}},
> which returns a value <M>, the range [M, M+N) won't be used by any other
> user of the sequence. We might need to reserve 1bn ids this way ahead of a
> map reduce run.
> Any better ideas?