Re: (ab)using STM for longish calculations, network I/O

Alyssa Kwan Fri, 15 Oct 2010 19:58:51 -0700


On Oct 15, 7:57 pm, peter veentjer <[email protected]> wrote:
> On Oct 15, 2:51 pm, hobnob <[email protected]> wrote:
>
> > Hi,
>
> > I'm just starting to get my head wrapped around STM just by reading
> > about it so I can make a decision on whether to port a Java project to
> > Clojure.
>
> > a) Can STM transactions contain calculations that take a 'long' time,
> > let's say computing the cryptographic hash of a plaintext? I'd
> > 'ensure' the input parameters such as plaintext, hash algorithm and
> > bit length, compute the hash (can be slow) and store the hash in a ref.
> > What I'd need here is for the calculation to be interrupted if the
> > transaction is aborted for a retry. There is no need to complete the
> > long calculation if we aren't going to store the result anyway. The
> > existing convention in Java is to set the interrupt flag on a thread,
> > which long-running calculations query every now and then. This
> > is a convention that many Java libraries adhere to. So how do I do
> > this interrupting?
>
> Also a tough question. If you have a non-transactional flag, it is not
> guaranteed to hold the value you expect it to have. And using a
> transactional flag could also lead to strange problems, like
> transactions always conflicting whenever that flag changes.
>
>
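
For reference, the Java interrupt convention hobnob describes can be sketched roughly as follows. This is only an illustration under stated assumptions: `InterruptibleHash`, `iteratedHash`, and the iterated-digest workload are hypothetical stand-ins for a slow calculation, and nothing here is wired into Clojure's STM. Some other party would still have to call `interrupt()` on the thread when the transaction is aborted.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical example class; not part of Clojure or any library.
final class InterruptibleHash {

    // Hashes the input repeatedly to simulate a slow calculation, polling the
    // thread's interrupt flag once per round as the Java convention suggests.
    static byte[] iteratedHash(byte[] input, String algorithm, int rounds)
            throws NoSuchAlgorithmException, InterruptedException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] current = input;
        for (int i = 0; i < rounds; i++) {
            // Thread.interrupted() reads and clears the flag; throwing
            // InterruptedException tells the caller the work was cancelled.
            if (Thread.interrupted()) {
                throw new InterruptedException("hash cancelled at round " + i);
            }
            current = md.digest(current);
        }
        return current;
    }

    public static void main(String[] args) throws Exception {
        byte[] digest = iteratedHash(
                "plaintext".getBytes(StandardCharsets.UTF_8), "SHA-256", 1000);
        System.out.println("digest length: " + digest.length);
    }
}
```

The calculation only stops at the next poll, so the granularity of the check decides how quickly an abort takes effect.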


hobnob, I'm confused.  What does the former (retry leading to
immediate abort) have to do with the latter (interrupt by a monitoring
process)?  The former is actually really easy to implement, but it
requires an enhancement to core:  there's a RETRY_LIMIT of 10,000 for
all transactions; let that be configurable per transaction and set the
limit to 1 for the ones you want immediate abort for.
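
To make the idea of a per-transaction limit concrete, here is a minimal Java sketch of a bounded optimistic retry loop. It is not Clojure's LockingTransaction and not the proposed change to core: `RetryLimited`, `update`, and `RetryLimitExceeded` are hypothetical names, and a single `AtomicReference` stands in for a ref. With a limit of 1, the first conflict aborts instead of retrying.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical illustration of a configurable retry limit; not Clojure's STM.
final class RetryLimited {

    // Thrown when the update could not commit within the allowed attempts.
    static final class RetryLimitExceeded extends RuntimeException {
        RetryLimitExceeded(int limit) {
            super("gave up after " + limit + " attempt(s)");
        }
    }

    // Computes f(current) and tries to commit it with compare-and-set.
    // A failed compare-and-set means another writer got in first, which
    // counts as one used attempt.
    static <T> T update(AtomicReference<T> ref, UnaryOperator<T> f, int retryLimit) {
        for (int attempt = 0; attempt < retryLimit; attempt++) {
            T seen = ref.get();
            T next = f.apply(seen);
            if (ref.compareAndSet(seen, next)) {
                return next;
            }
        }
        throw new RetryLimitExceeded(retryLimit);
    }

    public static void main(String[] args) {
        AtomicReference<Integer> counter = new AtomicReference<>(0);
        System.out.println(update(counter, n -> n + 1, 1));
        try {
            // The function overwrites the reference itself to simulate a
            // concurrent writer, so the single allowed attempt conflicts.
            update(counter, n -> { counter.set(n + 100); return n + 1; }, 1);
        } catch (RetryLimitExceeded e) {
            System.out.println(e.getMessage());
        }
    }
}
```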

>
> > b) With long-running calculations, nested dosyncs should really result
> > in nested transactions. This matters when a transaction has several
> > long-running calculations, e.g. first compute a hash, then encrypt the
> > hash with a public key. In that case, if the 'ensured' parameters that
> > the second (encrypt) part depends on are changed concurrently, but not
> > the parameters/result of the first part, then only the second
> > calculation has to be repeated, not both. The second part could be
> > enclosed in an inner dosync, but currently Clojure will unnecessarily
> > redo the whole thing.
>
> I guess you need a propagation level: RequiresNew. And at the end you
> need to be able to commit all the transactions as one. AFAIK Clojure
> has no support for it, but I don't think it is very hard to add if
> Clojure also exposes some kind of prepare method to make sure that the
> transaction is able to commit.
>
> For the Multiverse STM I have introduced CommitBarriers for this
> purpose, but I don't think they would be hard to add to the Clojure STM.
>
>

peter, I'm confused.  What does 2PC have to do with nested
transactions?  I'm new to transactional/locking programming, so I'm
probably missing something blindingly obvious.

hobnob, this seems to be a static analysis problem:  given the body of
a dosync, figure out what writes depend on ensured reads/previous
writes, and only recalculate those on a retry.  Again, this requires
an enhancement to core, but I can certainly envision building out a
dependency graph as the body of a dosync is walked down by
LockingTransaction.  This approach makes performance (even more)
unpredictable, since not all locks would be released/reacquired on
retry, but that should be more than compensated by the shorter
transaction windows leading to less contention in the first place.  Of
course, building that graph out for small transactions is overkill, so
it should be a setting passed into dosync.  One key benefit of this
approach is that it doesn't require the user to specify nested
transactions.  Transaction boundaries should be about correctness
alone; performance should be kept orthogonal.  This approach takes the
load off of the user completely and lets the VM do it.
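
A much simpler runtime cousin of that idea can be sketched in Java: remember each step's input and result, and on a retry recompute only the steps whose input changed. This is an assumption-laden illustration, not the static analysis described above and not anything LockingTransaction does today. `StepMemo`, `step`, and the string-based "hash" and "encrypt" stand-ins are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch: per-step memoization across retries of one transaction.
final class StepMemo {

    // The input a step last ran with, and the result it produced.
    private static final class Entry {
        final Object input;
        final Object result;
        Entry(Object input, Object result) {
            this.input = input;
            this.result = result;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    int recomputations = 0; // counts how often a step body actually ran

    // Reuses the stored result when the step's input is unchanged;
    // otherwise runs the body and stores the new result.
    @SuppressWarnings("unchecked")
    <I, O> O step(String name, I input, Function<I, O> body) {
        Entry e = entries.get(name);
        if (e != null && Objects.equals(e.input, input)) {
            return (O) e.result;
        }
        recomputations++;
        O result = body.apply(input);
        entries.put(name, new Entry(input, result));
        return result;
    }

    public static void main(String[] args) {
        StepMemo memo = new StepMemo();
        // First attempt: both steps run.
        String hash = memo.step("hash", "plaintext", s -> "H(" + s + ")");
        memo.step("encrypt", hash + "|key1", s -> "E(" + s + ")");
        // Retry after only the key changed: the hash step is reused.
        hash = memo.step("hash", "plaintext", s -> "H(" + s + ")");
        memo.step("encrypt", hash + "|key2", s -> "E(" + s + ")");
        System.out.println("step bodies run: " + memo.recomputations);
    }
}
```

The user still has to name the steps here, which is exactly the burden the dependency-graph approach would remove.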

>
> > c) Somewhat different: I'm not supposed to do I/O in a transaction
> > because the transaction might be repeated and that will repeat the
> > I/O. But maybe that's what I want. The I/O could be a network output
> > sending the computed value over the net to be stored on a remote
> > machine instead of being stored in a local ref. I *do* want the
> > computed value of each retry to be sent over the net. I guess my setup
> > here could be considered an ad-hoc distributed TM; this touches the
> > other discussion of "STM with external transactions" in this group.
>
> All communication with non-transactional data sources from within a
> transaction is hard. You can do some stuff with deferred and
> compensating actions that run if the transaction commits or aborts. But
> I haven't found a solution for this problem either. This was one of the
> main reasons for MS to drop their STM research project.
>
>
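
The deferred and compensating actions peter mentions can be sketched as two queues attached to a transaction attempt. This is a minimal Java illustration with hypothetical names (`TxActions`, `onCommit`, `onAbort`); it is not the API of Multiverse or of Clojure's STM, and it assumes the surrounding transaction machinery calls `commit()` or `abort()` exactly once per attempt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of deferred (on commit) and compensating (on abort) actions.
final class TxActions {
    private final List<Runnable> deferred = new ArrayList<>();
    private final List<Runnable> compensating = new ArrayList<>();

    // Registers a side effect that should happen only if the attempt commits.
    void onCommit(Runnable action) {
        deferred.add(action);
    }

    // Registers an undo step for a side effect already performed in the attempt.
    void onAbort(Runnable action) {
        compensating.add(action);
    }

    // Runs deferred actions in registration order, then resets both queues.
    void commit() {
        for (Runnable action : deferred) {
            action.run();
        }
        clear();
    }

    // Runs compensating actions in reverse order, then resets both queues.
    void abort() {
        for (int i = compensating.size() - 1; i >= 0; i--) {
            compensating.get(i).run();
        }
        clear();
    }

    private void clear() {
        deferred.clear();
        compensating.clear();
    }

    public static void main(String[] args) {
        TxActions tx = new TxActions();
        tx.onCommit(() -> System.out.println("send value over the network"));
        tx.onAbort(() -> System.out.println("undo partial work"));
        tx.commit(); // only the deferred action runs
    }
}
```

This covers I/O that should happen once per committed transaction. It does not cover hobnob's case of sending the value computed by every retry.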

hobnob, I'm genuinely curious what your use case is.  Why do you want
to keep tries?  Is there some purpose other than logging or diagnosing
system performance?  If it's either of those, there should be ONE
RIGHT WAY to look at Clojure internals with a good, standard tool
chain.

Have you taken a look at my modifications to core?  
http://github.com/kwanalyssa/clojure
It adds durability, though it uses BDB JE, with one DB instance tied
to one Clojure VM (no transactional sharing across VMs, which would be
awesome), and there's no out-of-the-box querying.  But it should be
fully ACID...  (Help me test it, pretty please?)

>
> > Thanks

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en