Hi Frank,

I put my initial HighlevelStorage proof of concept up on GitHub, in my forked fcrepo repo:

https://github.com/birkland/fcrepo.git
branch: hlstore_hbase_poc
(you can see it as part of the fcrepo network view: https://github.com/fcrepo/fcrepo/network)

Everything of interest is in the fcrepo-hlstore module, including:

- HighlevelStorage interface, and an HBaseHighlevelStorage implementation
- DistributedDOManager - alternate "drop-in" DOManager implementation that uses HighlevelStorage
- Example Spring config files in src/main/resources/config/spring

This is a proof of concept created in early 2010, originally against Fedora 3.3. It is now updated for the Fedora 3.5 snapshot. It was created prior to several design/discussion meetings, so it does not reflect the entirety of current thinking (such as splitting the HighlevelStorage interface into 'Readable' and 'Writable'). Nevertheless, it may be a good starting point to work from. There are some ugly hacks and workarounds in order to require minimal changes to existing code.

To try it out:

- Check out the hlstore_hbase_poc branch, and build it
- Run the resulting fedora installer as usual
- Edit fedora.fcfg, and remove the following modules:
    org.fcrepo.oai.OAIProvider
    org.fcrepo.server.management.PIDGenerator
    org.fcrepo.server.storage.DOManager
    org.fcrepo.server.search.FieldSearch
- Remove akubra-llstore.xml from server/config/spring
- Copy the contents of fcrepo-hlstore/src/main/resources/config/spring/highlevel_hbase/ into server/config/spring
- Edit server/config/spring/HighLevelStorage_hbase.xml and either (a) modify the value of the HBaseRoot parameter to point to your standalone HBase table location, or (b) if you are using HBase in distributed mode, remove the HBaseRoot parameter, and instead put the hostname and port in a property called 'HBaseMaster'
- Start tomcat. It will create an HBase table called 'fedora' if one does not exist.

The table has three column families:

- 'object', containing the serialized fedora object
- 'meta', containing the last modified datestamp
- 'datastream', containing all managed datastreams.
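As a rough illustration of that layout, the following plain-Java sketch models a single 'fedora' row in memory. It does not use the HBase client API and is not code from the branch. The names RowSketch, putDatastream, latest, versions and MAX_VERSIONS are hypothetical, and both the use of the object PID as the row key and the cap of three versions per cell are assumptions of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of one row of the 'fedora' table. */
class RowSketch {
    // Assumed cap on the number of versions kept per cell.
    static final int MAX_VERSIONS = 3;

    final String pid;      // row key (assumed here to be the object PID)
    byte[] object;         // 'object' family: serialized fedora object
    long lastModified;     // 'meta' family: last modified datestamp

    // 'datastream' family: one column per datastream ID, newest version first.
    final Map<String, Deque<byte[]>> datastreams = new LinkedHashMap<>();

    RowSketch(String pid) {
        this.pid = pid;
    }

    /** Stores the serialized object and records when it was written. */
    void putObject(byte[] serialized, long timestamp) {
        this.object = serialized;
        this.lastModified = timestamp;
    }

    /** Adds a new version of a datastream, dropping the oldest beyond the cap. */
    void putDatastream(String dsId, byte[] content) {
        Deque<byte[]> cells = datastreams.computeIfAbsent(dsId, k -> new ArrayDeque<>());
        cells.addFirst(content);
        if (cells.size() > MAX_VERSIONS) {
            cells.removeLast();
        }
    }

    /** Returns the newest version of a datastream, or null if there is none. */
    byte[] latest(String dsId) {
        Deque<byte[]> cells = datastreams.get(dsId);
        return cells == null ? null : cells.peekFirst();
    }

    /** Returns all retained versions of a datastream, newest first. */
    List<byte[]> versions(String dsId) {
        Deque<byte[]> cells = datastreams.get(dsId);
        return cells == null ? new ArrayList<>() : new ArrayList<>(cells);
    }

    public static void main(String[] args) {
        RowSketch row = new RowSketch("demo:1");
        for (int i = 1; i <= 4; i++) {
            row.putDatastream("DS1", ("content v" + i).getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(row.versions("DS1").size() + " versions kept, latest = "
                + new String(row.latest("DS1"), StandardCharsets.UTF_8));
    }
}
```

Running main writes four versions of one datastream and prints "3 versions kept, latest = content v4", since the oldest version is dropped once the cap is exceeded.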
Each managed datastream gets a column named after its datastream ID.

I think HBase defaults to three versions for each cell. Cell versions are used for implementing datastream versioning, so the table will need to be tweaked if you want to keep more versions.

Note: as this was just a proof of concept, there is much that is not implemented. FieldSearch has been replaced with an empty stub that always returns zero results, as has DeploymentManager, so searches and disseminations won't work until real implementations have been developed.

Take a look, browse the code, read the javadocs (where present) - I'd be happy to answer questions.

-Aaron

_______________________________________________
Fedora-commons-developers mailing list
https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
