Alfonso Nishikawa created GORA-391:
--------------------------------------

             Summary: Arrays persisted in HBase don't shrink automatically
                 Key: GORA-391
                 URL: https://issues.apache.org/jira/browse/GORA-391
             Project: Apache Gora
          Issue Type: Bug
          Components: gora-hbase
    Affects Versions: 0.5, 0.4
            Reporter: Alfonso Nishikawa
            Assignee: Alfonso Nishikawa
            Priority: Minor


Fields defined as arrays can grow and be updated, but don't shrink when an 
element is deleted.

See the code involved: 
[https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L312]

The workaround is:
# Define the field as a nullable array: ['null', ...array...]
# Set the field to null and persist  -> the array will be deleted
# Set the field to the new array and persist -> the array will be persisted 
with the new size

Comment from Renato:
bq. You are right, the array cannot be shrunk at the moment, and yes, it is 
wrong to have to write the whole array back when you only want to change a 
single element. The column qualifier used for each item is its original index, 
so if your original array had 10 elements, there are 10 column qualifiers 
storing those 10 items. If you then delete the third element, Gora ends up 
with 9 actual elements (without the third), but there is still a 10th element 
inside HBase :( And when modifying a specific element, we end up rewriting all 
of the elements :( Maybe we should do the same thing we do with maps and 
rewrite them all into HBase. At least it will work correctly.
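
A minimal sketch of the behaviour described in the comment above, using a 
plain sorted map as a stand-in for an HBase column family. This is not Gora or 
HBase code: the class StaleArrayColumnDemo and the methods putByIndex and 
readBack are hypothetical names used only for illustration, and the sketch 
assumes that each element is written under its index and that no column is 
ever deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class StaleArrayColumnDemo {

    // Hypothetical stand-in for a persist step: each element is written under
    // its index as the "column qualifier"; existing columns are never removed.
    static void putByIndex(Map<Integer, String> family, List<String> array) {
        for (int i = 0; i < array.size(); i++) {
            family.put(i, array.get(i));
        }
    }

    // Hypothetical stand-in for a read: every stored column becomes an element,
    // in qualifier order.
    static List<String> readBack(Map<Integer, String> family) {
        return new ArrayList<>(family.values());
    }

    public static void main(String[] args) {
        Map<Integer, String> family = new TreeMap<>();

        putByIndex(family, List.of("a", "b", "c", "d")); // original array, 4 elements
        putByIndex(family, List.of("a", "b", "d"));      // "c" deleted, 3 elements

        // The column for index 3 was never deleted, so 4 elements come back.
        System.out.println(readBack(family)); // prints [a, b, d, d]
    }
}
```

Under these assumptions the in-memory array has 3 elements after the 
deletion, but the read returns 4, because the column for the last original 
index is still stored.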


Maybe the best solution would be adaptive persistence: if a large percentage 
of the field has to be persisted, overwrite everything; if only a small 
percentage has to be persisted, update incrementally (additions, deletions, 
updates). This approach seems too complex, so the solution to implement is the 
one already used for maps: delete all elements and write them again.
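
A minimal sketch of the delete-and-rewrite approach, with a plain sorted map 
standing in for an HBase column family. This is an illustration under stated 
assumptions, not the HBaseStore implementation: the class RewriteArrayDemo and 
the method deleteAndRewrite are hypothetical names, and clearing the map 
stands in for deleting every stored column of the field before writing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class RewriteArrayDemo {

    // Hypothetical helper: remove every stored column of the field, then write
    // the current array back, one column per index.
    static void deleteAndRewrite(Map<Integer, String> family, List<String> array) {
        family.clear(); // stand-in for deleting all existing columns
        for (int i = 0; i < array.size(); i++) {
            family.put(i, array.get(i));
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> family = new TreeMap<>();

        deleteAndRewrite(family, List.of("a", "b", "c", "d")); // original array
        deleteAndRewrite(family, List.of("a", "b", "d"));      // "c" deleted

        // No stale column survives, so the stored array has shrunk to 3 elements.
        System.out.println(new ArrayList<>(family.values())); // prints [a, b, d]
    }
}
```

The cost is that every persist rewrites all elements, even when only one of 
them changed.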

{panel:bgColor=#FFFFCE} (!) This will perform badly for arrays with large 
elements when only one element is updated, but it is the same as what is done 
now. The same applies to maps. {panel}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
