[ https://issues.apache.org/jira/browse/GORA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alfonso Nishikawa updated GORA-391:
-----------------------------------
    Reporter: Alfonso Nishikawa Muñumer  (was: Alfonso Nishikawa)

> Arrays persisted in HBase don't shrink automatically
> ----------------------------------------------------
>
>                 Key: GORA-391
>                 URL: https://issues.apache.org/jira/browse/GORA-391
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: gora-hbase
>    Affects Versions: 0.4, 0.5
>            Reporter: Alfonso Nishikawa Muñumer
>            Priority: Minor
>              Labels: arrays, maps
>             Fix For: 1.0
>
>
> Fields defined as arrays can grow and be updated, but don't shrink when an
> element is deleted.
> See the code involved:
> [https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L312]
> The workaround is:
> # Define the field as a nullable array: ['null', ...array...]
> # Set the field to null and persist -> the array will be deleted
> # Set the field to the new array and persist -> the array will be persisted
> with the new size
> Comment from Renato:
> bq.You are right, the array cannot be shrunk at the moment, and yes, it is
> wrong to have to write the whole array back if you just want to change a
> single element. The column qualifier used for each item is its original
> index. That means that if your original array had 10 elements, you'd have 10
> column qualifiers to store those 10 items. But if you then delete the third
> element, Gora will end up with 9 actual elements (without the third), but
> there will still be a 10th element inside HBase :( and when modifying a
> specific element, we will end up rewriting all of the elements :( Maybe we
> should do the same thing we do with the maps and rewrite them all into
> HBase. At least it will work correctly.
> Maybe the best solution would be adaptive persistence: if a large
> percentage of the field has changed, overwrite everything; if only a small
> percentage has changed, persist it as a diff (additions, deletions,
> updates). This proposed approach seems too complex, so the solution to
> implement is the one already used for maps: delete all elements and write
> them again.
> {panel:bgColor=#FFFFCE}
> (!) This will be horrible for arrays with big elements and only one update,
> but it is the same as what is being done now. Same for maps.
> {panel}
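
The stale-qualifier behaviour and the delete-then-rewrite fix described above can be illustrated with a minimal sketch. It does not use Gora's or HBase's API: a plain `TreeMap` stands in for the column qualifiers of one row, and `ArrayShrinkSketch`, `putByIndex`, `deleteThenPut` and `read` are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Simulation only: a TreeMap stands in for the HBase column qualifiers of one row.
// None of these names come from Gora or HBase.
class ArrayShrinkSketch {

    // Index-keyed write, as described in the issue: element i goes to qualifier i.
    // Qualifiers at or beyond the new array length are never touched.
    static void putByIndex(TreeMap<Integer, String> cells, List<String> array) {
        for (int i = 0; i < array.size(); i++) {
            cells.put(i, array.get(i));
        }
    }

    // The approach the issue settles on (same as for maps):
    // delete every stored element, then write the whole array again.
    static void deleteThenPut(TreeMap<Integer, String> cells, List<String> array) {
        cells.clear();
        putByIndex(cells, array);
    }

    // Reads the stored array back in qualifier order.
    static List<String> read(TreeMap<Integer, String> cells) {
        return new ArrayList<>(cells.values());
    }

    public static void main(String[] args) {
        List<String> array = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            array.add("e" + i);
        }

        TreeMap<Integer, String> cells = new TreeMap<>();
        putByIndex(cells, array);   // 10 qualifiers: 0..9

        array.remove(2);            // delete the third element: 9 remain in memory

        putByIndex(cells, array);   // rewrites qualifiers 0..8, qualifier 9 is left behind
        System.out.println("index-keyed write: " + read(cells).size() + " elements stored");

        deleteThenPut(cells, array); // clears everything first, so no stale qualifier survives
        System.out.println("delete then write: " + read(cells).size() + " elements stored");
    }
}
```

In this model the first write after the deletion still reads back 10 elements, because qualifier 9 keeps its old value, while the delete-then-write path reads back 9. The null-then-persist workaround has the same effect as `deleteThenPut`, split across two persist calls.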