I am sorry, but I must be missing something here. The code uses synchronized blocks and mutexes to explicitly make sure threads don't interfere with each other. That restriction exists because of resource sharing and the Java Memory Model. Instrumenting wait/notify cannot magically remove it.
Going by queueing theory, the most you can do is move your bottleneck to your mutexes, but that's about it.
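
To make the point concrete, here is a minimal sketch (not the code under discussion; the class name `LockSerialization` and the method `maxThreadsInsideCriticalSection` are made up for illustration). It starts several threads that all pass through the same synchronized block and records how many were ever inside it at the same moment. However many threads you start, the lock lets them through one at a time, so that is where they queue up.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LockSerialization {
    // Hypothetical shared lock standing in for whatever resource the real code protects.
    private static final Object LOCK = new Object();

    // Returns the highest number of threads ever observed inside the critical section at once.
    static int maxThreadsInsideCriticalSection(int threadCount, int iterations) {
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    synchronized (LOCK) {
                        // Count ourselves in, remember the peak, count ourselves out.
                        int now = inside.incrementAndGet();
                        maxSeen.accumulateAndGet(now, Math::max);
                        inside.decrementAndGet();
                    }
                }
            });
            workers[i].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        // Eight threads, but the monitor only ever admits one of them at a time.
        System.out.println(maxThreadsInsideCriticalSection(8, 10_000)); // prints 1
    }
}
```

Adding more threads here only lengthens the queue in front of `LOCK`; it does not raise the number that get through at once.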
