We're using the approach you describe here at UW Madison. Basically, we
have a number of mounted file systems, with a structure that looks like
this:
/fedora/datamp/data01
/fedora/datamp/data02
/fedora/datamp/data03...
Then we have directories with symbolic links into the mounted file
systems:
/fedora/data/objects/a -> /fedora/datamp/data01/objects/a
/fedora/data/datastreams/a -> /fedora/datamp/data01/datastreams/a
/fedora/data/objects/b -> /fedora/datamp/data02/objects/b
/fedora/data/datastreams/b -> /fedora/datamp/data02/datastreams/b
/fedora/data/objects/c -> /fedora/datamp/data03/objects/c
/fedora/data/datastreams/c -> /fedora/datamp/data03/datastreams/c
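
A layout like the one above could be scripted rather than linked by hand. The following is only a minimal sketch: the function name `link_hash_dirs` is hypothetical, it assumes a single-`#` top level (16 hex directories, 0 through f), and it assigns contiguous blocks of digits to each mount point, which is an arbitrary choice and not necessarily the assignment used on the system described here.

```python
from pathlib import Path

HEX_DIGITS = "0123456789abcdef"


def link_hash_dirs(data_root, mount_points, stores=("objects", "datastreams")):
    """Create <data_root>/<store>/<digit> -> <mount>/<store>/<digit> symlinks.

    Hypothetical helper: digits are split into contiguous blocks, one block
    per mount point (an assumption made for illustration only).
    Returns the list of links that were newly created.
    """
    created = []
    for i, digit in enumerate(HEX_DIGITS):
        # Pick the mount point that owns this digit's block.
        mount = Path(mount_points[i * len(mount_points) // len(HEX_DIGITS)])
        for store in stores:
            target = mount / store / digit
            target.mkdir(parents=True, exist_ok=True)
            link = Path(data_root) / store / digit
            link.parent.mkdir(parents=True, exist_ok=True)
            # Leave anything that already exists alone, so reruns are safe.
            if not link.is_symlink() and not link.exists():
                link.symlink_to(target, target_is_directory=True)
                created.append(link)
    return created
```

With four mount points, a call such as `link_hash_dirs("/fedora/data", ["/fedora/datamp/data01", "/fedora/datamp/data02", "/fedora/datamp/data03", "/fedora/datamp/data04"])` would give each mount point four top-level directories per store.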
And in our akubra.xml file, we have the object and datastream roots and
hash paths configured:
<bean name="fsObjectStore"
      class="org.akubraproject.fs.FSBlobStore"
      singleton="true">
  <constructor-arg value="urn:example.org:fsObjectStore"/>
  <constructor-arg value="/fedora/data/objects"/>
</bean>
<bean name="fsObjectStoreMapper"
      class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper"
      singleton="true">
  <constructor-arg value="#/##/##"/>
</bean>
<bean name="fsDatastreamStore"
      class="org.akubraproject.fs.FSBlobStore"
      singleton="true">
  <constructor-arg value="urn:example.org:fsDatastreamStore"/>
  <constructor-arg value="/fedora/data/datastreams"/>
</bean>
<bean name="fsDatastreamStoreMapper"
      class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper"
      singleton="true">
  <constructor-arg value="#/##/##"/>
</bean>
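
To make the `#/##/##` pattern concrete, here is a rough sketch of how a hash-path mapping of this kind can work. It assumes the digits come from an MD5 hex digest of the id and that the id itself is percent-encoded as the file name; both are assumptions for illustration, and the function `hash_path` is hypothetical, not the actual HashPathIdMapper code.

```python
import hashlib
from urllib.parse import quote


def hash_path(blob_id, pattern="#/##/##"):
    """Return a relative path for blob_id following a '#'-style pattern.

    Each '#' in the pattern is replaced by the next hex digit of the MD5
    digest of the id (assumed hash); other characters are copied as-is.
    """
    digest = hashlib.md5(blob_id.encode("utf-8")).hexdigest()
    digits = iter(digest)
    prefix = "".join(next(digits) if ch == "#" else ch for ch in pattern)
    # Percent-encode the id so it is safe to use as a single file name.
    return prefix + "/" + quote(blob_id, safe="")
```

Under these assumptions the first path component is a single hex digit, so there are 16 possible top-level directories, and those are the ones the symbolic links redirect to the mount points.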
We have four mount points on our production machine, and four top-level
directories allocated per mount point.
One of the beauties of hashed directory and file paths is that all the
file systems should fill up evenly: the hash spreads objects roughly
uniformly across the top-level directories, and therefore across all
the file systems. Disk usage on our mount points:
/fedora/datamp/data01: 68%
/fedora/datamp/data02: 68%
/fedora/datamp/data03: 68%
/fedora/datamp/data04: 69%
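
The even fill can be checked with a small simulation. This is a sketch under stated assumptions: the ids are synthetic (`info:fedora/demo:N` is made up for the test), MD5 is assumed as the hash, and each mount point is assumed to own a contiguous block of the 16 top-level hex directories. It counts objects only, not bytes, so it says nothing about uneven object sizes.

```python
import hashlib
from collections import Counter


def simulate_mount_usage(num_ids=20000, num_mounts=4):
    """Count how many synthetic ids land on each mount point.

    The first hex digit of the MD5 digest picks one of 16 top-level
    directories; directories are grouped into num_mounts contiguous blocks.
    """
    counts = Counter()
    for n in range(num_ids):
        blob_id = "info:fedora/demo:%d" % n  # synthetic id, illustration only
        first_digit = hashlib.md5(blob_id.encode("utf-8")).hexdigest()[0]
        mount_index = int(first_digit, 16) * num_mounts // 16
        counts[mount_index] += 1
    return [counts[i] for i in range(num_mounts)]
```

Each entry of the returned list should come out close to `num_ids / num_mounts`, which matches the near-identical percentages above.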
-- Scott
On 04/18/2013 10:42 AM, Gary Phillips wrote:
> Hello,
>
> I've spent some time looking at akubra and the HashPathIdMapper to get a
> feel for how we would distribute our datastreamStore over multiple file
> systems. The default configuration (##) creates 256 potential
> directories. Changing that to # (or something like #/##) would give us
> 16 top level sub-directories, which we could work with. However, there
> is another issue, in that I don't see how we can easily predict how each
> of those directories would grow, if I am understanding how the files are
> distributed across the directories.
>
> I assume one solution might involve sym linking the top level
> directories over to a few directories that each correspond to a mount point.
>
> Have other people tackled this particular problem (and what solutions
> did you come up with) or found another way around distributing large
> amounts of Fedora data over multiple mount points? Thanks in advance.
>
> _______________________________________________
> Fedora-commons-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>
--
Scott Prater
Shared Development Group
General Library System
University of Wisconsin - Madison
[email protected]
5-5415