[ https://issues.apache.org/jira/browse/OAK-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julian Reschke updated OAK-1941:
--------------------------------
    Attachment: with-modified-index.diff

A change that creates an index on _modified; right now it's not clear whether this actually improves performance.

> RDB: decide on table layout
> ---------------------------
>
>                 Key: OAK-1941
>                 URL: https://issues.apache.org/jira/browse/OAK-1941
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: rdbmk
>            Reporter: Julian Reschke
>            Assignee: Julian Reschke
>             Fix For: 1.2
>
>         Attachments: OAK-1941-cmodcount.diff, utf8measure.diff, with-modified-index.diff, with-modified-index.diff
>
>
> The current approach is to serialize the Document using JSON, and then to
> store either (a) the full JSON in a VARCHAR column, or, if that column isn't
> wide enough, (b) to store it in a BLOB (optionally gzipped).
> For debugging purposes, the inline VARCHAR always gets populated with the
> start of the JSON serialization.
> However, with Oracle we are limited to 4000 bytes (which may be far fewer
> characters due to non-ASCII overhead), so many document instances will use
> what was initially thought to be the exception case.
> Questions:
> 1) Do we stick with JSON or do we attempt a different serialization? It might
> make sense both wrt length and performance. There might also be some code
> to borrow from the off-heap serialization code.
> 2) Do we get rid of the "dual" strategy, and just always use the BLOB? The
> indirection might make things more expensive, but then the total column width
> would drop considerably. -- How can we do good benchmarks on this?
> (This all assumes that we stick with a model where all code is the same
> between database types, except for the DDL statements; of course it's also
> conceivable to add more vendor-specific special cases into the Java code)
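
For illustration only, here is a minimal sketch of the "dual" decision described in the issue: measure the serialized JSON in bytes, keep it inline when it fits, and otherwise gzip it for the BLOB. This is not the actual rdbmk code. The class name, method names and the constant are hypothetical, and the sketch assumes the column limit is counted in UTF-8 bytes, which depends on the database character set.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch of the inline-vs-BLOB decision; not Oak's real code. */
final class DualStorageSketch {

    /** Assumed inline limit in bytes (the 4000-byte Oracle limit from the issue). */
    static final int INLINE_LIMIT_BYTES = 4000;

    /** Number of bytes the JSON occupies when encoded as UTF-8. */
    static int utf8Length(String json) {
        return json.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the complete JSON fits into the inline VARCHAR column. */
    static boolean fitsInline(String json) {
        return utf8Length(json) <= INLINE_LIMIT_BYTES;
    }

    /** Gzips the UTF-8 encoding of the JSON, as one might for the BLOB column. */
    static byte[] gzip(String json) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Helper for building test strings: repeats s exactly n times. */
    static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder(s.length() * n);
        for (int i = 0; i < n; i++) {
            sb.append(s);
        }
        return sb.toString();
    }
}
```

Under these assumptions, a document of 4000 ASCII characters still fits inline, while 4000 characters that each need two bytes in UTF-8 (for example U+00E9) take 8000 bytes and fall through to the BLOB path. This is the non-ASCII overhead the issue refers to.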