David, Dain and I tested this last night, and here's the final conclusion we
came to.

The initial problem was that retrieving a list of the entities from the
entity manager, iterating over it, and deleting them one at a time is
incredibly slow.  One of my test cases took 4 minutes and 23 seconds to run
through ten test methods for fewer than 100 entity instances, and the tests
slowed to a crawl.  The persistence provider (in this case OpenJPA) isn't
built for this sort of usage, so we decided to try a second technique:
executing native SQL queries directly against the database by way of the
EntityManager.

The queries we determined to run were a sequence of "DELETE FROM foo"
queries, where "foo" is replaced with the name of the actual table / class.
Thus, for Person objects, we'd execute a "DELETE FROM Person" query.
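
As a rough sketch of that technique (the FastDelete class and its method
names are hypothetical, not part of OpenJPA or OpenEJB), the statements can
be built from a list of table names and handed to an executor one at a time.
The executor is left abstract here so the sketch runs without a container;
with JPA it would presumably wrap
entityManager.createNativeQuery(sql).executeUpdate() inside a transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical helper for illustration; not part of OpenJPA or OpenEJB. */
final class FastDelete {

    /** Builds one "DELETE FROM <table>" statement per table, preserving order. */
    static List<String> deleteStatements(List<String> tables) {
        List<String> statements = new ArrayList<>();
        for (String table : tables) {
            statements.add("DELETE FROM " + table);
        }
        return statements;
    }

    /**
     * Runs each statement through the supplied executor, in order, and
     * returns the sum of the row counts the executor reports.
     *
     * Assumption: with JPA the executor would be something like
     *   sql -> entityManager.createNativeQuery(sql).executeUpdate()
     * called while a transaction is active.
     */
    static int runAll(List<String> statements, ToIntFunction<String> executor) {
        int affected = 0;
        for (String sql : statements) {
            affected += executor.applyAsInt(sql);
        }
        return affected;
    }
}
```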

However, this is a potentially dangerous practice as well, for reasons which
are now clear.  I was emptying tables in an order that did not take the
referential integrity of foreign keys into account.  Because many of my
objects are tightly coupled, it is important to empty the tables in a
particular order, such that the referencing objects (the ones holding the
foreign keys) are wiped clean first.

I was able to isolate this behavior by experimentally altering the order of
table deletions.  Certain orderings would always throw errors, others would
throw none.  Each error had a characteristic stack trace with many
repetitions of the following lines:

WARN - Unexpected exception from afterCompletion; continuing
<openjpa-1.0.1-r420667:592145 fatal general error>
org.apache.openjpa.persistence.PersistenceException: no-saved-fields
        at org.apache.openjpa.kernel.StateManagerImpl.dirtyCheck(StateManagerImpl.java:799)

So, anywhere an entity table has a foreign key column, you need to either
perform a cascading delete with the EntityManager, nullify the reference
(assuming the foreign key columns are optional), or make certain to delete
the referencing bean first.
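
One way to work out a safe order, sketched here under my own assumptions (the
DeleteOrder class and the table names in the usage note are hypothetical), is
a topological sort over the foreign key references: a table is only emptied
once every table that references it has been emptied.  If the references
form a cycle, no such order exists and the sketch throws, which is the case
where a foreign key column has to be nullified first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of OpenJPA or OpenEJB. */
final class DeleteOrder {

    /**
     * @param references maps each table to the tables its foreign keys point at
     * @return all tables, ordered so that every referencing table comes
     *         before the tables it references
     * @throws IllegalStateException if the references contain a cycle
     */
    static List<String> order(Map<String, Set<String>> references) {
        // For every table, count how many references still point at it.
        Map<String, Integer> incoming = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            incoming.putIfAbsent(entry.getKey(), 0);
            for (String target : entry.getValue()) {
                incoming.merge(target, 1, Integer::sum);
            }
        }

        // Tables that nothing references can be emptied right away.
        List<String> ready = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : incoming.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        // Kahn's algorithm: emptying a table releases the tables it points at.
        List<String> ordered = new ArrayList<>();
        for (int i = 0; i < ready.size(); i++) {
            String table = ready.get(i);
            ordered.add(table);
            Set<String> targets =
                    references.getOrDefault(table, Collections.<String>emptySet());
            for (String target : targets) {
                if (incoming.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        if (ordered.size() != incoming.size()) {
            throw new IllegalStateException(
                    "Circular foreign key references; nullify a column first");
        }
        return ordered;
    }
}
```

For example, if a hypothetical InvoiceLine table references Invoice and
Product, and Invoice references Customer, the result places InvoiceLine
before Invoice and Product, and Invoice before Customer.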

The recommended method to clear the foreign key columns is to run update
queries.  This is a good mechanism for breaking circular references on
teardown.  An example looks like this:

"UPDATE Customer c SET c.myfk = null"

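Putting the two steps together, a teardown could first clear the foreign
keys that take part in a cycle and only then issue the deletes.  The
following is a minimal sketch under that assumption; the TeardownPlan class
is hypothetical, and the generated strings simply follow the pattern of the
update example in this message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of OpenJPA or OpenEJB. */
final class TeardownPlan {

    /** A foreign key field to clear, identified by entity and field name. */
    static final class ForeignKey {
        final String entity;
        final String field;

        ForeignKey(String entity, String field) {
            this.entity = entity;
            this.field = field;
        }
    }

    /**
     * Returns the statements in execution order: every UPDATE that nulls a
     * foreign key first, followed by one DELETE per entity in the order given.
     */
    static List<String> statements(List<ForeignKey> toNullify,
                                   List<String> entitiesToDelete) {
        List<String> plan = new ArrayList<>();
        for (ForeignKey fk : toNullify) {
            plan.add("UPDATE " + fk.entity + " e SET e." + fk.field + " = null");
        }
        for (String entity : entitiesToDelete) {
            plan.add("DELETE FROM " + entity);
        }
        return plan;
    }
}
```
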
So, to sum things up, direct deletion against the database is much faster
than working through the EntityManager interface in some cases, but you need
to determine the precise order in which you tear items down, or design your
tests so that they are resilient to cruft in the database, if you intend to
run tests with an embedded OpenEJB container.

For the record, we also tried setting up the POM for the project to fork the
testing JVM, but this mechanism does not create a new JVM for each test
METHOD.  It only creates a new JVM for each test CASE.

I'm now running direct deletes, and paying attention to foreign key
references, deletion order, and circular references.  My final solution will
initially include a combination of direct deletes and EntityManager remove
calls via OpenJPA, and I will iteratively tune it until I've got a reliable
scraping method for my library.

One suggestion / idea for the future would be to provide a tool to analyze a
persistence unit and generate this sort of fast delete mechanism as an array
of query strings that could be serialized for any given persistence unit and
marshalled when needed for rapid clearing of the database during entity
tests.

A great big thanks to everyone on both teams who helped me isolate and
resolve this issue.
--
Alexander R. Saint Croix.



On Jan 8, 2008 8:21 PM, David Blevins <[EMAIL PROTECTED]> wrote:

> I swore I thought I mentioned using fork mode, but just in case.   If
> you use fork mode with the in-memory db, there's nothing to clean.
> Have you experimented with that route?
>
> We're still using jaxb to unmarshall the persistence.xml, but I've
> been thinking of cutting that out which might save you 1 second each
> test.  JAXB takes a while to initialize the first time it's used in a
> vm.
>
> -David
