Re: TxTDB and JENA-91, JENA-96 and JENA-97

Paolo Castagna Fri, 19 Aug 2011 08:01:27 -0700

Hi Simon

Simon Helsen wrote:
> Thanks Paolo,
>
> However, I don't think this request will help much.


(short answer)

I disagree.

Someone else looking at it might help.



(longer answer)

Someone else looking at it might help. They might experience issues simply
running on a different system (different OS, 32 bits vs 64 bits, different
file system, different JVM, etc).

> The test cases
> currently in the trunk are just not replicating the scenario that our
> (IBM) test cases go through. The key is to make the jena test cases
> reflect our internal test cases better which is tricky (I am still looking
> for what we do differently).

Yes. I agree.

But...

The test is a minimal (and very simple) example to run read and write
transactions (committing and aborting those) concurrently using multiple
threads. By changing the parameters, you can easily increase the number of
threads and make the tests as aggressive as you need them to be. Moreover,
people, inspired by this simple example, can change the logic of read
and write threads and make it as close as possible to their usage scenarios.
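
To make the shape of such a test concrete, here is a minimal sketch in plain
Java. It is not the TestTransSystem code and it does not touch TDB at all:
MiniTxStress, its counter "store" and the run(...) parameters are made up for
illustration, and a java.util.concurrent read/write lock stands in for real
transactions. It needs Java 8 or later for the lambdas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class MiniTxStress {
    // Hypothetical stand-in for a dataset: one counter guarded by a read/write lock.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int value = 0;

    // "Read transaction": the value must not change while the read lock is held.
    int read() {
        lock.readLock().lock();
        try {
            int before = value;
            Thread.yield();                    // give writers a chance to interfere
            return value == before ? 0 : 1;    // 1 = inconsistency observed
        } finally {
            lock.readLock().unlock();
        }
    }

    // "Write transaction": apply a change, then keep it (commit) or undo it (abort).
    void write(boolean commit) {
        lock.writeLock().lock();
        try {
            int saved = value;
            value = saved + 1;
            if (!commit) value = saved;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int current() {
        lock.readLock().lock();
        try { return value; } finally { lock.readLock().unlock(); }
    }

    private static Callable<Integer> writer(MiniTxStress store, boolean commit, int repeats) {
        return () -> {
            for (int r = 0; r < repeats; r++) store.write(commit);
            return 0;
        };
    }

    // Runs all tasks concurrently; returns { inconsistencies seen by readers, final value }.
    static int[] run(int readers, int committers, int aborters, int repeats) {
        final MiniTxStress store = new MiniTxStress();
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < readers; i++) {
            tasks.add(() -> {
                int seen = 0;
                for (int r = 0; r < repeats; r++) seen += store.read();
                return seen;
            });
        }
        for (int i = 0; i < committers; i++) tasks.add(writer(store, true, repeats));
        for (int i = 0; i < aborters; i++) tasks.add(writer(store, false, repeats));

        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            int errors = 0;
            for (Future<Integer> f : pool.invokeAll(tasks)) errors += f.get();
            return new int[] { errors, store.current() };
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] result = run(10, 10, 10, 4);
        System.out.println("reader errors=" + result[0] + ", committed=" + result[1]);
    }
}
```

The check at the end is the useful part: readers should never see a change
in mid-read, and only the committed writes (committers x repeats) should be
visible once everything has finished. Swapping the counter for your own read
and write logic is where the sketch would start to resemble a real scenario.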

The value in having a simple test like that in SVN is that others can run
it and change it (if not directly, by submitting patches). So far, even if
it's extremely simple, that small program has proven very useful to me in
finding concurrency problems in TxTDB.

One of the differences between TestTransSystem and the ...MultipleDatasets
version is that the latter is using multiple datasets as you suggested on
JIRA. I really hoped this was the way to replicate the issues you are
experiencing... but, no problems on my machines. I cannot rule out that I am
doing something silly in TestTransSystemMultipleDatasets. So, once again,
having others look at it can help.

Another, maybe relevant, difference might be the operating system. I use
Linux, 64 bit machines. So, someone could run into problems running on
Windows. This is another reason why asking others to run the test might
help.

Another difference could be the JDK used to run. I am using Oracle JDK 1.6
and I can test on different versions of Oracle JDK. But others might have
a different set of bugs in other JVMs. Once again, diversity helps in
identifying this sort of problem.
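
Since the OS, architecture and JVM are exactly the variables in question, it
helps if a report states them. A small sketch that prints them using standard
Java system properties (the EnvReport class name is made up for illustration;
it is not part of TxTDB):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EnvReport {
    // Standard system property keys describing the OS and the JVM.
    static final String[] KEYS = {
        "os.name", "os.version", "os.arch",
        "java.vendor", "java.version", "java.vm.name"
    };

    // Collects the properties in a stable order; "unknown" if a key is missing.
    static Map<String, String> collect() {
        Map<String, String> env = new LinkedHashMap<>();
        for (String key : KEYS) {
            env.put(key, System.getProperty(key, "unknown"));
        }
        return env;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Pasting that output next to the test result (pass or fail) makes it much
easier to spot a pattern across machines.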

Last but not least, little test programs or test cases which stress test
the code are more than welcome. Any help on this is welcome. It's the
Apache way, isn't it?

> As for whether something is critical or major
> versus normal, I don't know what conventions you use, but the criteria you
> indicate in (*) don't make sense to me because your ability to reproduce a
> bug has nothing to do with the criticalness of the bug for your clients.
> And as for it to affect other users, they have to adopt it first. Instead
> of running test cases which are known not to reproduce the problem, it
> would be better if others tried to replace TDB 0.8.x with the latest
> snapshot of TDB 0.9.0 and see if their code still runs.

This is a good suggestion, guess what I was doing this morning @ Talis? :-)

> You don't even have to use the transaction API to run into corruptions.

Yep.

You can corrupt TDB indexes in various ways (e.g. not using MRSW, i.e.
multiple-reader/single-writer, locking, not calling .close() when you must,
deleting/replacing files while TDB is running, pointing at the same location
from different JVMs, etc.)

> But I don't know how many people have tried using TDB 0.9.0 so far.

No one other than us, probably.

This is also the reason why discussing on jena-dev and asking people to
give it a try is, in my opinion, a good thing.

> We (IBM) may well be
> the first to stress test it to this scale. Anyone?

Having a micro-benchmark for (Tx)TDB would be very useful.

We don't have one at the moment.
Maybe someone can contribute something here. :-)

I have been using BSBM mainly (and SP2B sometimes), but BSBM focuses
on query performance and I am not convinced its update use case really
stresses the system in terms of read/write concurrency.

> Note also that our
> framework and test cases run glitch-free in a commercial product on TDB
> 0.8.7 and I did not observe any issues when I integrated the latest SDB
> (even using transactions). In other words, from my point of view, the
> current TDB-TX has a problem and we just have to find out what makes it
> happen.

I've never claimed there are no problems. I've never seen software
without bugs. I am here to find those problems and, once we have found
them, to fix them (if I know how to do so).

> One more note: it is always possible that I am doing something
> wrong in how I integrated TDB-TX,

This is a possibility.

See above on how many different ways you can corrupt TDB indexes. Hopefully,
TxTDB will drastically reduce this, instead of increasing it. :-)

> so I am not saying that JENA-91, JENA-96
> and JENA-97 are definitely TDB-TX bugs, but for now, I have no other
> choice than to think that they are

Yes.

> I'll keep on it, but I will comment in issue JENA-91 (perhaps for
> documentation purposes, it would be good if others who run the tests also
> comment in there)

Yes.

Paolo

>
> Simon
>
>
>
> From:
> Paolo Castagna <[email protected]>
> To:
> [email protected]
> Date:
> 08/19/2011 07:38 AM
> Subject:
> TxTDB and JENA-91, JENA-96 and JENA-97
>
>
>
> Hi,
> in relation to (Tx)TDB we have 3 (still open) bug reports: JENA-91,
> JENA-96 and JENA-97. They are all flagged as "Critical" (*) and I am
> not able to replicate any of them.
>
> I am using Linux, Oracle JDK 1.6, 64-bit OS+JVM and I have been using
> TestTransSystem and TestTransSystemMultiDatasets programs included in
> the test package of TxTDB.
>
> I would appreciate it if you could check out TxTDB from here:
> https://svn.apache.org/repos/asf/incubator/jena/Experimental/TxTDB/trunk/
> and run TestTransSystemMultiDatasets.java
>
> If everything is fine, you should see something like this:
>
> ----
> START (disk, 100 iterations)
> 000: ..........
> 010: ..........
> 020: ..........
> 030: ..........
> 040: ..........
> 050: ..........
> 060: ..........
> 070: ..........
> 080: ..........
> 090: ..........
>
> DONE (100)
> FINISH
> ----
>
>
> You can change the number of reader/writer concurrent threads:
>
>     static final int numReaderTasks         = 10 ;
>     static final int numWriterTasksA        = 10 ;
>     static final int numWriterTasksC        = 10 ;
>
> Or number of reads/writes each thread will perform and the pause
> in ms between each read/write operation:
>
>     static final int readerSeqRepeats       = 8 ;
>     static final int readerMaxPause         = 50 ;
>
>     static final int writerAbortSeqRepeats  = 4 ;
>     static final int writerCommitSeqRepeats = 4 ;
>     static final int writerMaxPause         = 25 ;
>
> You can switch between direct and mapped mode changing:
>
>     static { SystemTDB.setFileMode(FileMode.mapped) ; }
>
> If you see an error or an exception, please, let us know.
> If you run successfully with no errors or exceptions, please, let us know.
>
> Thank you,
> Paolo
>
>
>
> (*) I tend to mark JIRA issues as "Major" or "Critical" when they are
> confirmed, reproducible and directly affect users. None
> of these conditions applies to the issues above.
>
>
>
