Hi, I'm looking at the possibility of creating a new kind of data store, let's call it a federated data store, and wanted to see what everyone thinks about this.
The basic idea is that the federated data store would allow more than one data store to be configured for an Oak instance. Oak would then choose which data store to use for a given binary based on criteria such as file size, JCR path, node type, the existence of a node property, a node property value, or a combination of these. In my thinking these criteria are defined in configuration, so the federated data store knows which data store should hold which binary.

I think this is a step towards UC14 - Hierarchical BlobStore in [0]. Once the federated data store is implemented, we should be able to support UC14 with little additional work. I can also foresee other capabilities it could offer, such as storing blobs for different node types in different data stores, or choosing among several data stores based on geographic location (UC2 in [0]).

For the implementation, one option is to add this capability to DataStoreBlobStore.writeStream(), which is where the decision is currently made whether to write a stream to the data store delegate or keep it in memory. Instead, we could defer the decision to the delegate itself by adding a method to the appropriate interface (BlobStore or GarbageCollectableBlobStore), with a default implementation in AbstractBlobStore based on the record size. That matches the current behavior, except that the decision is currently made in DataStoreBlobStore, IIUC. All existing data stores should then behave as they do today. In the federated data store the decision would be more involved: it would select the right data store based on configuration.

The federated data store would need to exist independently of the other data stores, so one challenge is figuring out how to create those data stores without introducing a code dependency on them.

Please let me know what you think. Is my idea about the implementation flawed? Is there a better way to accomplish this? What concerns do you have about it?
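
To make the selection idea above a bit more concrete, here is a minimal sketch in plain Java of how configured rules might map a binary to a data store. Everything in it is an assumption for illustration only: BinaryContext, SelectionRule, FederatedSelector, and the store names are hypothetical and are not existing Oak types or APIs. It also leaves open how the rules would be read from configuration and how the delegate data stores would be created.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical description of a binary about to be written (not an Oak type).
final class BinaryContext {
    final String path;      // JCR path of the node holding the binary
    final String nodeType;  // primary node type, e.g. "nt:file"
    final long size;        // size of the binary in bytes

    BinaryContext(String path, String nodeType, long size) {
        this.path = path;
        this.nodeType = nodeType;
        this.size = size;
    }
}

// Hypothetical rule: when the condition matches, use the named data store.
final class SelectionRule {
    final Predicate<BinaryContext> condition;
    final String dataStoreName;

    SelectionRule(Predicate<BinaryContext> condition, String dataStoreName) {
        this.condition = condition;
        this.dataStoreName = dataStoreName;
    }
}

// Evaluates rules in the order they were added; the first match wins.
// Falls back to a default data store when no rule matches.
final class FederatedSelector {
    private final List<SelectionRule> rules = new ArrayList<>();
    private final String defaultDataStore;

    FederatedSelector(String defaultDataStore) {
        this.defaultDataStore = defaultDataStore;
    }

    FederatedSelector addRule(Predicate<BinaryContext> condition, String dataStoreName) {
        rules.add(new SelectionRule(condition, dataStoreName));
        return this;
    }

    String select(BinaryContext ctx) {
        for (SelectionRule rule : rules) {
            if (rule.condition.test(ctx)) {
                return rule.dataStoreName;
            }
        }
        return defaultDataStore;
    }
}
```

With this sketch, a rule such as addRule(c -> c.path.startsWith("/content/archive/"), "archive") would send binaries under that path to a store named "archive", and criteria can be combined with Predicate.and(). First-match-wins ordering is just one possible design; a real implementation might need explicit priorities or conflict detection.
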
I'd like to brainstorm with the list to find something that can work in this area, and then I'll create a ticket for it. Alternatively, I can create the ticket first and we can have the discussion there. Let me know which is best.

[0] - https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase

- Matt Ryan