Hi Tobias, first of all: I am really happy that this discussion is now open, enabling us to benefit from the entire expertise available in the ImageJ universe.
I would like to use the opportunity to provide a bit of background for those readers who did not benefit from the extensive in-person discussions at the hackathon, in particular because there is no public summary available yet: At the hackathon [*1*] the main goal was to get ImgLib2 out of beta (and there was a *lot* of progress in that regard), and in the process a couple of last-minute changes were introduced, amongst others a memory optimization of the bit type containers. In particular, it changed from: https://github.com/imglib/imglib2/blob/ffeee80%5E/core/src/main/java/net/imglib2/type/logic/BitType.java to https://github.com/imglib/imglib2/blob/fc0d3ebcd/src/main/java/net/imglib2/type/logic/BitType.java That is, the BitAccess was replaced by a LongAccess. The BitAccess was implemented by https://github.com/imglib/imglib2/blob/ffeee80%5E/core/src/main/java/net/imglib2/img/basictypeaccess/array/BitArray.java Unfortunately, after upgrading imagej-ops to the ImgLib2 release, the regression tests started failing, and Tobias offered an explanation: On Thu, 30 Oct 2014, Tobias Pietzsch wrote: > The test uses > final Img<BitType> out = bitmap(); > > I bet that for the new BitType from the Fractions branch nobody > considered the possibility that two cursors might simultaneously write > to bits of the same underlying long value. As the problem is intermittent and changes between test runs even on the same computer, this is quite likely, especially given that the original BitArray used locking explicitly: https://github.com/imglib/imglib2/blob/ffeee80%5E/core/src/main/java/net/imglib2/img/basictypeaccess/array/BitArray.java#L85-L91 That the BitType is now thread-unsafe is therefore a regression introduced just recently. > One solution would be to use locks to synchronize all accesses to the > underlying long[] array (this is for BitType, Unsigned12BitType, etc). > However, I fear that this will slow down things considerably. I agree that this would slow down operations as compared to the current code (at the price of correctness *grins*), but it would not slow down operations as compared to the BitArray which was used previously. > Is anyone familiar enough with the Java Memory Model to provide an > educated guess as to whether a lock-free approach would be feasible? The best online resource on this issue I am aware of is http://www.vogella.com/tutorials/JavaConcurrency/article.html#nonblocking (while the best offline resource is "Java Concurrency In Practice"). It talks about the most common lock-free primitive, available in Java since version 5.0: compare-and-swap (CAS). Unfortunately, this technique would require us to switch to a different data type, as the operation is not available on primitive types, let alone primitive type arrays. Theoretically, we could paper over this issue by using the Unsafe class. However, this class is marked as internal API for a good reason, and it would not be advisable to make use of it to work around a fundamental design issue. > By lock-free I mean setting the value and then checking whether the > value is actually what was expected (and if not, retry). A naïve implementation of this technique could easily result in a very nasty ping-pong effect: if one thread tries to clear a bit and the next thread tries to set it, it is very to run into a trap when not leaving a way for one thread to win. The safest way to resolve this issue is to reinstate the lock-on-write method that was in place earlier, i.e. surround the lines https://github.com/imglib/imglib2/blob/fc0d3ebcd9256e60961180952c0383c47e63d111/src/main/java/net/imglib2/type/logic/BitType.java#L133-L137 with a `synchronized (dataAccess) { ... }` guard. An alternative might be to give up lock-freedom in favor of wait-freedom [*2*]. In this regard, a more performant version might be to change // Clear the bits first, then or the masked value if ( value ) dataAccess.setValue(i1, (dataAccess.getValue(i1) | (1l << shift) ) ); else dataAccess.setValue(i1, (dataAccess.getValue(i1) & ~(1l << shift)) ); to use Optimistic Concurrency Control [*3*]: final long original = dataAccess.getValue(i1); if ( value ) { final long newValue = original | (1l << shift); dataAccess.setValue(i1, newValue); if ( newValue != dataAccess.getValue( i1 ) ) { synchronized (dataAccess) { dataAccess.setValue( i1, dataAccess.getValue(i1) | (1l << shift) ); } } } else { final long newValue = original & ~(1l << shift); dataAccess.setValue(i1, newValue); if ( newValue != dataAccess.getValue( i1 ) ) { synchronized (dataAccess) { dataAccess.setValue( i1, dataAccess.getValue(i1) & ~(1l << shift) ); } } } However, apart from being ugly, it is a little bit too complicated to be verified as correct easily even by myself. As ImgLib2 has yet to attract any concurrency expert, I would be *really* reluctant to destabilize ImgLib2 at this criticial time, and would rather leave this for a future improvement at a time when we have concurrency experts in our ranks. Correctness trumps speed. Ciao, Johannes Footnote *1*: The best public information so far is: http://imagej.net/pipermail/imagej-devel/2014-October/002280.html Footnote *2*: https://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms Footnote *3*: https://en.wikipedia.org/wiki/Optimistic_concurrency_control
_______________________________________________ ImageJ-devel mailing list [email protected] http://imagej.net/mailman/listinfo/imagej-devel
