[ 
https://issues.apache.org/jira/browse/PHOENIX-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131452#comment-15131452
 ] 

James Taylor commented on PHOENIX-2169:
---------------------------------------

Good catch, [~ankit.singhal]. +1 on the patch. Please check it in to the 4.x 
and master branches. One minor fix to make on commit: please make the bitSet 
member variable in ProjectedColumnExpression private.

If an expression is not stateless, it should clone itself in the 
CloneExpressionVisitor (as you've done). This ends up cloning the expression 
tree for each parallel thread that evaluates a select expression using a 
ProjectedColumnExpression, so the threads no longer share one bitSet. The 
alternative would be to remove bitSet as a member variable and instantiate a 
new one on each call to evaluate, which would likely hurt performance more 
than the fix you've done.
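
To make the clone-if-stateful idea concrete, here is a minimal, self-contained 
sketch. It is not the Phoenix code: Expr, Literal, ProjectedColumn, and 
forThread are hypothetical stand-ins, and the nullMask argument is an assumed 
simplification of how null flags reach the expression. The only point is that 
an expression holding a mutable BitSet scratch field is copied per thread, 
while a stateless one is shared.

```java
import java.util.BitSet;

public class StatefulExpressionSketch {

    /** Hypothetical expression interface; not Phoenix's actual API. */
    interface Expr {
        boolean isStateless();
        Expr copy();
        Integer evaluate(int[] values, long nullMask);
    }

    /** Stateless: holds only immutable data, so sharing across threads is safe. */
    static final class Literal implements Expr {
        private final int value;
        Literal(int value) { this.value = value; }
        public boolean isStateless() { return true; }
        public Expr copy() { return this; }
        public Integer evaluate(int[] values, long nullMask) { return value; }
    }

    /** Stateful: reuses a private BitSet that is overwritten on every evaluate. */
    static final class ProjectedColumn implements Expr {
        private final BitSet bitSet = new BitSet(); // private scratch state
        private final int position;
        ProjectedColumn(int position) { this.position = position; }
        public boolean isStateless() { return false; }
        // A fresh instance gets its own BitSet, so threads never share it.
        public Expr copy() { return new ProjectedColumn(position); }
        public Integer evaluate(int[] values, long nullMask) {
            // Reload the scratch BitSet from this row's null flags.
            bitSet.clear();
            bitSet.or(BitSet.valueOf(new long[] { nullMask }));
            return bitSet.get(position) ? null : values[position];
        }
    }

    /** Stand-in for the visitor step: clone only when the expression has state. */
    static Expr forThread(Expr shared) {
        return shared.isStateless() ? shared : shared.copy();
    }

    public static void main(String[] args) {
        Expr shared = new ProjectedColumn(1);
        Expr perThread = forThread(shared);
        int[] values = {10, 20, 30};
        System.out.println(perThread != shared);           // a distinct copy
        System.out.println(perThread.evaluate(values, 0L)); // column 1, no nulls
    }
}
```

The alternative mentioned above would move the BitSet allocation inside 
evaluate, trading the one-time clone per thread for an allocation per row.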



> Illegal data error on UPSERT SELECT and JOIN with salted tables
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-2169
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2169
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.5.0
>            Reporter: Josh Mahonin
>            Assignee: Ankit Singhal
>              Labels: verify
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-2169-bug.patch, PHOENIX-2169.patch
>
>
> I have an issue where I get periodic failures (~50%) for an UPSERT SELECT 
> query involving a JOIN on salted tables. Unfortunately I haven't been able to 
> create a reproducible test case yet, though I'll keep trying. I believe this 
> same behaviour existed in 4.3.1 as well, so I don't think it's a regression.
> The upsert query itself looks something like this:
> {code}
> UPSERT INTO a(tid, ds, etp, eid, ts, atp, rel, tp, tpid, dt, pro) 
> SELECT c.tid, 
>        c.ds, 
>        c.etp, 
>        c.eid, 
>        c.dh, 
>        0, 
>        c.rel, 
>        c.tp, 
>        c.tpid, 
>        current_time(), 
>        1.0 / s.th 
> FROM   e_c c 
> join   e_s s 
> ON     s.tid = c.tid 
> AND    s.ds = c.ds 
> AND    s.etp = c.etp 
> AND    s.eid = c.eid 
> WHERE  c.tid = 'FOO';
> {code}
> Without the upsert, the query always returns the right data, but with the 
> upsert, it ends up with failures like:
> Error: ERROR 201 (22000): Illegal data. ERROR 201 (22000): Illegal data. 
> Expected length of at least 109 bytes, but had 19 (state=22000,code=201)
> The explain plan looks like:
> {code}
> UPSERT SELECT
> CLIENT 16-CHUNK PARALLEL 16-WAY RANGE SCAN OVER E_C [0,'FOO']
>       SERVER FILTER BY FIRST KEY ONLY
>       PARALLEL INNER-JOIN TABLE 0
>           CLIENT 16-CHUNK PARALLEL 16-WAY FULL SCAN OVER E_S
>       DYNAMIC SERVER FILTER BY (C.TID, C.DS, C.ETP, C.EID) IN ((S.TID, S.DS, S.ETP, S.EID))
> {code}
> I'm using SALT_BUCKETS=16 for both tables in the join, and this is a dev 
> environment, so only 1 region server. Note that without salted tables, I have 
> no issue with this query.
> The number of rows in E_C is around 23K, and the number of rows in E_S is 62.



