Hi guys!
On 03/22/12 16:42, Chris Wilper wrote:
> Forgetting transactions for a second, I was kind of wondering if some
> sort of LockProvider service would be useful (one implementation of
> which would be a hazelcast/cluster-capable one). Higher level code
> that works with fcrepo-store would do something like:
>
> Lock lock = lockProvider.getLock(pid);
> lock.lock();
> try {
> store.addObject(fedoraObject);
> } finally {
> lock.unlock();
> }
I think this would be very nice to have in a possibly distributed
environment, since a service that manipulates several objects for
preservation (as planned in the SCAPE project) could lock the objects
it might touch. Since preservation tasks can run for a long time
(e.g. a couple of days), the chance of concurrent manipulation of
those objects grows. If the task could lock the objects beforehand,
such conflicts would be avoided.
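
To make the idea a bit more concrete, here is a minimal single-JVM sketch
of such a LockProvider. The interface shape, InMemoryLockProvider and
BatchLocker are assumptions made up for illustration, not existing
fcrepo-store classes; a Hazelcast-backed variant would implement the same
interface.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical interface, named after the LockProvider idea quoted above.
interface LockProvider {
    Lock getLock(String pid);
}

// Single-JVM implementation: one ReentrantLock per PID.
// Locks are never evicted here; a real implementation would need cleanup.
class InMemoryLockProvider implements LockProvider {
    private final ConcurrentMap<String, Lock> locks = new ConcurrentHashMap<>();

    @Override
    public Lock getLock(String pid) {
        return locks.computeIfAbsent(pid, key -> new ReentrantLock());
    }
}

// Hypothetical helper for a long-running task that locks its objects up front.
class BatchLocker {
    // Acquires the locks in sorted PID order, so two tasks that lock
    // overlapping sets of objects cannot deadlock each other.
    static List<Lock> lockAll(LockProvider provider, Collection<String> pids) {
        List<Lock> held = new ArrayList<>();
        try {
            for (String pid : new TreeSet<>(pids)) {
                Lock lock = provider.getLock(pid);
                lock.lock();
                held.add(lock);
            }
            return held;
        } catch (RuntimeException e) {
            unlockAll(held);
            throw e;
        }
    }

    // Releases the locks in reverse acquisition order.
    static void unlockAll(List<Lock> held) {
        for (int i = held.size() - 1; i >= 0; i--) {
            held.get(i).unlock();
        }
    }
}
```

A preservation task would call BatchLocker.lockAll(...) before it starts
and BatchLocker.unlockAll(...) in a finally block when it is done.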
> I agree that size and other info (mime type, etc.) are important for
> implementations to have access to at storage time. Interestingly, the
> way the API currently works, the associated FedoraObject *must* be
> provided to the impl prior to the call to setContent().
Oh ok. I did not know that.
> Are there other hints, not
> necessarily present in the FedoraObject, that we can envision being
> important to making content storage decisions?
Hmm, not at the moment. I'd just like to have information about the size
in setContent().
> Note that I'm not convinced that having stream-oriented (for managed
> content) and object-oriented (for FedoraObjects) methods at the same
> level in the API is the right move necessarily -- it just seemed more
> practical to implement in the short term because embedding
> stream-getting/setting functionality directly inside the FedoraObject
> interface would tie instances to a particular FedoraStore impl...which
> makes them harder to move around, if that makes sense.
Since these would be purely Fedora-internal APIs, not exposed to the
user but only used by developers, I don't think having stream-based API
methods alongside "normal" methods is a bad thing: developers will
understand the need to handle big data in a stream-based fashion rather
than filling up memory.
This also gives you the possibility to access datastreams
independently of their objects. If getting/setting a stream involved
calling a method on the FedoraObject, the system would have to fetch the
object beforehand for every request. With a stream-based API like the
one you proposed, datastreams can be fetched/updated using only their ID.
In short: I'm all for a stream-based API as proposed :)
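
As a rough sketch of what I have in mind: datastream content is addressed
by IDs alone, and the size is passed along with the stream. ContentStore,
InMemoryContentStore and these method signatures are assumptions for
illustration only; the actual fcrepo-store API differs (there the
FedoraObject has to be provided before setContent() is called).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: content is addressed by PID and datastream ID only,
// so no FedoraObject has to be fetched first.
interface ContentStore {
    // The size is passed explicitly so an implementation can make
    // storage decisions before it reads the stream.
    void setContent(String pid, String datastreamId, InputStream content, long size)
            throws IOException;

    InputStream getContent(String pid, String datastreamId) throws IOException;
}

// Toy implementation that keeps everything in memory. A real store would
// copy the stream to disk or blob storage instead of buffering it.
class InMemoryContentStore implements ContentStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    private static String key(String pid, String datastreamId) {
        return pid + "/" + datastreamId;
    }

    @Override
    public void setContent(String pid, String datastreamId, InputStream content, long size)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = content.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        // Reject the write if the declared size does not match what was read.
        if (out.size() != size) {
            throw new IOException("declared size " + size + " but read " + out.size());
        }
        blobs.put(key(pid, datastreamId), out.toByteArray());
    }

    @Override
    public InputStream getContent(String pid, String datastreamId) throws IOException {
        byte[] data = blobs.get(key(pid, datastreamId));
        if (data == null) {
            throw new IOException("no content for " + key(pid, datastreamId));
        }
        return new ByteArrayInputStream(data);
    }
}
```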
> In your experience, were you working with already-transaction
> resources (via JTA?) As mentioned on the call, I think if we attempt
> to implement transactions ourselves, there's all kinds of opportunity
> for failure. But if we can "wrap" already-transactional resources
> while still keeping the ability to integrate non-transactional blob
> storage, that seems more palatable to me.
In this test I had a service write to a datasource connected via
Hibernate and to a filesystem within one transaction. The way I did this
was quite straightforward. I defined an Action class which held all the
necessary information about one atomic operation (e.g. one Action can
be: write object to datasource, or: write XML to file system) and kept
the Actions in a LinkedList. I kept an index in the Transaction telling
the system which Action is the current one, and if an error occurs, the
system iterates backwards through the LinkedList, undoing every Action
it encounters.
So the system used Hibernate's org.hibernate.Session and Transaction for
handling transactions on the datasource level, and simple handwritten
logic for handling transactions on the filesystem.
This logic has all been wrapped into a PlatformTransactionManager from
Spring, which I wove into the service using @Transactional annotations
and a Spring bean configuration for the datasource, filesystem and the
transaction manager.
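
The handwritten part boils down to something like the following sketch.
It is a reconstruction from memory rather than the original code: the
Action interface and ActionTransaction class shown here are illustrative
names, and the Hibernate and Spring wiring is left out entirely.

```java
import java.util.LinkedList;
import java.util.ListIterator;

// Hypothetical interface: one atomic step plus the operation that reverts it.
interface Action {
    void execute() throws Exception;

    void undo();
}

// Runs a list of Actions in order and rolls back on failure.
class ActionTransaction {
    private final LinkedList<Action> actions = new LinkedList<>();

    void add(Action action) {
        actions.add(action);
    }

    void run() throws Exception {
        int current = 0; // index of the Action currently being executed
        try {
            for (Action action : actions) {
                action.execute();
                current++;
            }
        } catch (Exception e) {
            // Walk backwards from the failed Action and undo every Action
            // that completed. The failed Action itself is not undone; it is
            // assumed to leave nothing behind when it throws.
            ListIterator<Action> it = actions.listIterator(current);
            while (it.hasPrevious()) {
                it.previous().undo();
            }
            throw e;
        }
    }
}
```

For the filesystem, an Action's undo() would typically delete the file
it wrote or restore a backup copy.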
> The original point made in the paper was that there was a way to not
> *force* locking to occur (via optimistic concurrency control) if the
> storage interface provided a way to declare the previously-seen state
> with each request.
But this would mean fetching the existing objects from the storage layer
before applying any update in order to be able to compare the versions,
which also introduces quite an overhead, depending on the object's
complexity. And when dealing with large datastreams it seems quite
inefficient to compare the currently stored version with the
previously seen version supplied with a put request, only to overwrite
it with yet another version given in the same put request.
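
For reference, the check under discussion could look roughly like the
sketch below. VersionedStore and its numeric version counter are
assumptions for illustration; the paper's "previously-seen state" might
just as well be a timestamp or a checksum. The read of the stored state
inside put() is the extra step I mean, and how expensive it is depends
on what has to be compared.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store that accepts a write only if the caller has seen
// the currently stored version.
class VersionedStore {
    static final class Entry {
        final long version;
        final String content;

        Entry(long version, String content) {
            this.version = version;
            this.content = content;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    Entry get(String pid) {
        return entries.get(pid);
    }

    // expectedVersion is the version the caller last saw (0 = object does
    // not exist yet). Returns false, leaving the store unchanged, if the
    // stored version differs.
    boolean put(String pid, long expectedVersion, String content) {
        boolean[] written = {false};
        entries.compute(pid, (key, current) -> {
            // This lookup of the stored state happens on every write.
            long currentVersion = (current == null) ? 0 : current.version;
            if (currentVersion != expectedVersion) {
                return current; // conflict: keep what is stored
            }
            written[0] = true;
            return new Entry(currentVersion + 1, content);
        });
        return written[0];
    }
}
```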
Have Fun!
Frank
--
*frank asseg*
softwareentwicklung
feichtmayrstr. 37
76646 bruchsal
tel.: ++49-7251-322-6073
fax.: ++49-7251-322-6078
mail: [email protected]
web: http://www.congrace.de/