[ https://issues.apache.org/jira/browse/PHOENIX-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas D'Silva reassigned PHOENIX-3709:
---------------------------------------

         Assignee:     (was: Thomas D'Silva)
    Fix Version/s:     (was: 4.10.0)
      Description:
Based on some testing (see patch), I noticed a mysterious DeleteFamily marker when a covered column is set to null. This could potentially delete an actual row with that row key, so it's bad.

Here's a raw scan dump taken after the MutableIndexIT.testCoveredColumns() test:
{code}
************ dumping IDX_T000002;hconnection-0x211e75ea **************
\x00a/0:/1487356752097/DeleteFamily/vlen=0/seqid=0 value =
x\x00a/0:0:V2/1487356752231/Put/vlen=1/seqid=0 value = 4
x\x00a/0:0:V2/1487356752225/Put/vlen=1/seqid=0 value = 4
x\x00a/0:0:V2/1487356752202/Put/vlen=1/seqid=0 value = 3
x\x00a/0:0:V2/1487356752149/DeleteColumn/vlen=0/seqid=0 value =
x\x00a/0:0:V2/1487356752097/Put/vlen=1/seqid=0 value = 1
x\x00a/0:_0/1487356752231/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752225/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752202/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752149/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752097/Put/vlen=2/seqid=0 value = _0
-----------------------------------------------
{code}

An optimization would be not to issue the first Put since the value behind it is the same.

  was:
Based on some testing (see patch), I noticed a mysterious DeleteFamily marker when a covered column is set to null. This could potentially delete an actual row with that row key, so it's bad.

Here's a raw scan dump taken after the MutableIndexIT.testCoveredColumns() test:
{code}
************ dumping IDX_T000002;hconnection-0x211e75ea **************
\x00a/0:/1487356752097/DeleteFamily/vlen=0/seqid=0 value =
x\x00a/0:0:V2/1487356752231/Put/vlen=1/seqid=0 value = 4
x\x00a/0:0:V2/1487356752225/Put/vlen=1/seqid=0 value = 4
x\x00a/0:0:V2/1487356752202/Put/vlen=1/seqid=0 value = 3
x\x00a/0:0:V2/1487356752149/DeleteColumn/vlen=0/seqid=0 value =
x\x00a/0:0:V2/1487356752097/Put/vlen=1/seqid=0 value = 1
x\x00a/0:_0/1487356752231/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752225/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752202/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752149/Put/vlen=2/seqid=0 value = _0
x\x00a/0:_0/1487356752097/Put/vlen=2/seqid=0 value = _0
-----------------------------------------------
{code}

That first DeleteFamily marker shouldn't be there. This occurs for both global and local indexes, but not for transactional tables.

A further optimization would be not to issue the first Put since the value behind it is the same.

On the plus side, we're not issuing DeleteFamily markers when only the covered column is being set which is good.

       Issue Type: Improvement  (was: Bug)

> For a mutable index do not issue a put if the new value is the same as the
> previous value
> ------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3709
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3709
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Priority: Blocker
>
> Based on some testing (see patch), I noticed a mysterious DeleteFamily marker
> when a covered column is set to null. This could potentially delete an actual
> row with that row key, so it's bad.
> Here's a raw scan dump taken after the MutableIndexIT.testCoveredColumns()
> test:
> {code}
> ************ dumping IDX_T000002;hconnection-0x211e75ea **************
> \x00a/0:/1487356752097/DeleteFamily/vlen=0/seqid=0 value =
> x\x00a/0:0:V2/1487356752231/Put/vlen=1/seqid=0 value = 4
> x\x00a/0:0:V2/1487356752225/Put/vlen=1/seqid=0 value = 4
> x\x00a/0:0:V2/1487356752202/Put/vlen=1/seqid=0 value = 3
> x\x00a/0:0:V2/1487356752149/DeleteColumn/vlen=0/seqid=0 value =
> x\x00a/0:0:V2/1487356752097/Put/vlen=1/seqid=0 value = 1
> x\x00a/0:_0/1487356752231/Put/vlen=2/seqid=0 value = _0
> x\x00a/0:_0/1487356752225/Put/vlen=2/seqid=0 value = _0
> x\x00a/0:_0/1487356752202/Put/vlen=2/seqid=0 value = _0
> x\x00a/0:_0/1487356752149/Put/vlen=2/seqid=0 value = _0
> x\x00a/0:_0/1487356752097/Put/vlen=2/seqid=0 value = _0
> -----------------------------------------------
> {code}
> An optimization would be not to issue the first Put since the value behind it
> is the same.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
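
For illustration only, the proposed optimization (skip the Put when the new covered-column value equals the value already behind it) could be sketched roughly as below. This is not Phoenix code and not the attached patch: the class IndexPutFilter and the methods shouldIssuePut and countIssuedPuts are hypothetical names, and the sketch assumes a null value stands for "column set to null" (a delete rather than a put). It replays the V2 values from the dump in timestamp order (1, null, 3, 4, 4).

```java
import java.util.Arrays;

// Hypothetical sketch; names are illustrative and not part of Phoenix or HBase.
class IndexPutFilter {

    // Returns true when a Put for the covered column should be written.
    // previousValue == null means there is no live value (absent or deleted).
    static boolean shouldIssuePut(byte[] previousValue, byte[] newValue) {
        if (newValue == null) {
            return false; // assumption: setting to null is handled as a delete
        }
        // Arrays.equals is null-safe: a null previous value never equals a non-null one.
        return !Arrays.equals(previousValue, newValue);
    }

    // Replays a sequence of column values and counts the Puts that would be issued.
    static int countIssuedPuts(byte[][] writes) {
        byte[] previous = null;
        int issued = 0;
        for (byte[] value : writes) {
            if (shouldIssuePut(previous, value)) {
                issued++;
            }
            previous = value; // a null write clears the live value
        }
        return issued;
    }

    public static void main(String[] args) {
        // V2 values from the dump, oldest first: 1, null, 3, 4, 4
        byte[][] writes = { {'1'}, null, {'3'}, {'4'}, {'4'} };
        System.out.println("Puts issued: " + countIssuedPuts(writes));
    }
}
```

With these inputs the second write of 4 is skipped, which corresponds to the newest Put (timestamp 1487356752231) in the dump. A real implementation would have to compare against the current index row state on the server side, which this sketch does not attempt.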