Github user renato2099 commented on the pull request:
https://github.com/apache/gora/pull/23#issuecomment-94309903
Thanks a lot for the explanation, @gerhardgossen! And yes, this is a problem
we have seen in other data stores as well: managing complex data types is
hard because not all data stores provide the same functionality. For example,
in gora-cassandra, depending on your mapping file, you could create subcolumns
inside a super column or even separate columns. Then, when updating maps, you
could end up rewriting a whole column even though only a single value inside
an array or map was modified. This behaviour is of course wrong. I guess the
same thing is happening in Accumulo, per your test.
I think there is a trade-off here between generating a column for each
specific value of a map/array, which leads to a more complex scan operation,
and using a single column to store them all, which leads to the current
behaviour.
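To make the trade-off concrete, here is a minimal sketch in plain Java. It is
not Gora, Cassandra, or Accumulo code: the class and method names are
hypothetical, a TreeMap stands in for one row of a column store, and
toString() stands in for real serialization. Each write method returns how
many values it had to write.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; a TreeMap<qualifier, value> stands in for one row.
class MapColumnLayouts {

    // Layout A: one column per map entry, qualifier = field + ":" + key.
    // Only the modified entries are written.
    static int writePerEntry(Map<String, String> row, String field,
                             Map<String, String> dirtyEntries) {
        for (Map.Entry<String, String> e : dirtyEntries.entrySet()) {
            row.put(field + ":" + e.getKey(), e.getValue());
        }
        return dirtyEntries.size();
    }

    // Reading layout A back needs a prefix scan over the sorted qualifiers.
    static Map<String, String> readPerEntry(TreeMap<String, String> row, String field) {
        String prefix = field + ":";
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : row.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the range of qualifiers belonging to this field
            }
            result.put(e.getKey().substring(prefix.length()), e.getValue());
        }
        return result;
    }

    // Layout B: the whole map lives in a single column, so changing one
    // entry means rewriting every entry.
    static int writeSingleColumn(Map<String, String> row, String field,
                                 Map<String, String> fullMap) {
        row.put(field, fullMap.toString()); // stand-in for real serialization
        return fullMap.size();
    }

    public static void main(String[] args) {
        Map<String, String> links = new LinkedHashMap<>();
        links.put("a", "1");
        links.put("b", "2");
        links.put("c", "3");

        TreeMap<String, String> rowA = new TreeMap<>();
        writePerEntry(rowA, "outlinks", links);

        // Update a single entry in layout A.
        Map<String, String> dirty = new LinkedHashMap<>();
        dirty.put("b", "20");
        System.out.println("per-entry values written: "
                + writePerEntry(rowA, "outlinks", dirty));
        System.out.println("per-entry read back: " + readPerEntry(rowA, "outlinks"));

        // The same update in layout B rewrites the whole map.
        TreeMap<String, String> rowB = new TreeMap<>();
        links.put("b", "20");
        System.out.println("single-column values written: "
                + writeSingleColumn(rowB, "outlinks", links));
    }
}
```

The per-entry layout touches only the changed value but pays for it with the
prefix scan on read; the single-column layout reads with one lookup but
rewrites everything on any change.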
In Cassandra, arrays and maps can now be stored natively, so I guess we
will be using them soon instead of adding this extra "mapping" complexity. Do
you know if Accumulo stores complex data types natively, or if it plans to?