[ 
https://issues.apache.org/jira/browse/PHOENIX-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Fernando updated PHOENIX-1954:
----------------------------------
    Attachment: PHOENIX-1954-wip2.patch

[~jamestaylor] [~tdsilva] I have uploaded a second WIP patch that I think 
implements a working version of NEXT <n> VALUES FOR <seq> based on the initial 
patch James provided. I've added a new IT test class 
SequenceBulkAllocationIT.java that covers interaction with various sequence 
features including MIN, MAX, CYCLE, different values of INCREMENT BY both 
negative and positive, interplay with NEXT VALUE FOR and CURRENT VALUE. So far 
seems good :) I have been running SequenceIT.java to look for regressions.

Some specific behaviors implemented in this patch I want to call out:

1) When performing a bulk allocation, if we hit the MIN, MAX, or 
Overflow/Underflow conditions we throw an exception. We don't support partial 
allocations. You either get all the values you request or none.

2) We currently don't support the new syntax on sequences that have the CYCLE 
flag set. If you try to execute NEXT <n> VALUES FOR <seq> on such a sequence, 
an exception is thrown. [~tdsilva] and I chatted about this offline and felt 
that bulk allocation and cycles together raise a whole host of weird behaviors 
that we would have to handle. For example, we probably wouldn't want to cycle 
across a bulk allocation, as the semantics of which slots were allocated 
become very complex. I propose deferring this work to a separate JIRA.

3) When including multiple NEXT <n> VALUES FOR and NEXT VALUE FOR expressions 
in a statement, we honor the expression which allocates the highest number of 
values, and all expressions (including any CURRENT VALUE FOR expressions) 
return the value for that expression.

4) [~jamestaylor] One specific change I want to call out, which I had to make 
to support the parallel numAllocations array in the SequenceManager: in 
SequenceManager.validateSequences(), at line 206 in the patch, we were invoking 
Collections.sort(nextSequences). This caused problems when we had multiple 
expressions for different sequences in a statement. The number of slots to 
allocate would get out of sync with the sequences and would be applied to the 
wrong sequences. I commented this sort out and it looks to me like it's not 
needed. What do you think? Was there a reason we were doing this that the tests 
in SequenceIT.java don't cover?
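
To illustrate the problem described in 4), here is a minimal Java sketch. It is 
not the actual SequenceManager code: the class name, the pair() helper, and the 
sequence names "zeta" and "alpha" are made up for the example, and sequences 
are modeled as plain strings. It only shows how sorting one list while leaving 
a parallel array untouched re-pairs each sequence with the wrong slot count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of Phoenix.
class ParallelSortDemo {
    // Pairs each sequence name with the slot count stored at the same index.
    static String pair(List<String> nextSequences, long[] numAllocations) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nextSequences.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(nextSequences.get(i)).append('=').append(numAllocations[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a statement such as:
        //   NEXT 100 VALUES FOR zeta, NEXT 5 VALUES FOR alpha
        List<String> nextSequences = new ArrayList<>(Arrays.asList("zeta", "alpha"));
        long[] numAllocations = {100, 5};

        // Before sorting, index i of both structures refers to the same expression.
        System.out.println(pair(nextSequences, numAllocations));

        // Sorting reorders only the list; the array keeps its original order,
        // so the counts are now attached to the wrong sequences.
        Collections.sort(nextSequences);
        System.out.println(pair(nextSequences, numAllocations));
    }
}
```

In this sketch the pairing goes from "zeta=100, alpha=5" to "alpha=100, zeta=5" 
after the sort. Dropping the sort, or carrying the count inside the same object 
that gets sorted, keeps the two in step.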

I still need to do some more testing in addition to the IT tests I added in 
SequenceBulkAllocationIT.java. Specifically I still need to validate:
1) Backwards compatibility with older clients and upgraded server
2) Test Sequences with Multi-Tenant views to make sure there are no regressions 
there and the new semantics work
3) Look at the sequence reclaiming logic we have and make sure there are no 
additional changes needed there to support the new semantics.
4) Run more existing IT tests.

Please take a look when you get a chance.

> Reserve chunks of numbers for a sequence
> ----------------------------------------
>
>                 Key: PHOENIX-1954
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1954
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>            Assignee: Jan Fernando
>         Attachments: PHOENIX-1954-wip.patch, PHOENIX-1954-wip2.patch
>
>
> In order to be able to generate many ids in bulk (for example in map reduce 
> jobs) we need a way to generate or reserve large sets of ids. We also need to 
> mix ids reserved with incrementally generated ids from other clients. 
> For this we need to atomically increment the sequence and return the value it 
> had when the increment happened.
> If we're OK to throw the current cached set of values away we can do
> {{NEXT VALUE FOR <seq>(,<N>)}}, that needs to increment value and return the 
> value it incremented from (i.e. it has to throw the current cache away, and 
> return the next value it found at the server).
> Or we can invent a new syntax {{RESERVE VALUES FOR <seq>, <N>}} that does the 
> same, but does not invalidate the cache.
> Note that in either case we won't retrieve the reserved set of values via 
> {{NEXT VALUE FOR}} because we'd need to be idempotent in our case. All we 
> need to guarantee is that after a call to {{RESERVE VALUES FOR <seq>, <N>}}, 
> which returns a value <M>, the range [M, M+N) won't be used by any other 
> user of the sequence. We might need to reserve 1bn ids this way ahead of a 
> map reduce run.
> Any better ideas?
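
The guarantee in the description above (a reservation that returns M leaves 
the range [M, M+N) unused by anyone else), combined with the all-or-nothing 
rule from 1) in my comment, can be sketched with a plain in-memory counter. 
This is only an illustration under simplifying assumptions, not how the patch 
or the server-side sequence code works: the RangeReserver class and its 
reserve() method are hypothetical, the sequence is ascending with an increment 
of 1, and CYCLE is not modeled.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for a sequence; not part of Phoenix.
class RangeReserver {
    private final AtomicLong next;  // next value that has not been handed out
    private final long maxValue;    // inclusive upper bound, stand-in for MAX

    RangeReserver(long start, long maxValue) {
        if (maxValue == Long.MAX_VALUE) {
            // Keeps "last + 1" below from overflowing in this simplified sketch.
            throw new IllegalArgumentException("maxValue must be below Long.MAX_VALUE");
        }
        this.next = new AtomicLong(start);
        this.maxValue = maxValue;
    }

    // Reserves [M, M+n) and returns M. If the whole range does not fit,
    // throws and leaves the counter unchanged (no partial allocation).
    long reserve(long n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        while (true) {
            long m = next.get();
            long last;
            try {
                last = Math.addExact(m, n - 1); // last value of the requested range
            } catch (ArithmeticException e) {
                throw new IllegalStateException("request overflows long", e);
            }
            if (last > maxValue) {
                throw new IllegalStateException("request would exceed the maximum");
            }
            // Publish the new position atomically; retry if another caller won.
            if (next.compareAndSet(m, last + 1)) {
                return m;
            }
        }
    }
}
```

For example, with new RangeReserver(1, 10), reserve(4) returns 1 and owns 1 
through 4, a following reserve(1) returns 5, reserve(6) then throws because 6 
through 11 would pass the maximum, and reserve(5) still succeeds afterwards 
because the failed call changed nothing.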



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
