Thanks for taking a close look, Allen.  Here are some answers...

On 2/7/07, Allen Bierbaum <[EMAIL PROTECTED]> wrote:
> Overall it looks very interesting.  I was expecting something much
> more along the vein of DbUnit where you do something like this:
>
> setUp():
>    dataset = XmlDataset("dataset.xml")
>    db.refresh(dataset)
>
> tearDown():
>    db.tearDown()
>
> testMethod():
>    # Do normal db queries here relying upon the data to be in there
>    # some custom comparisons against loaded datasets are supported

Given the first example datasets, the equivalent with fixture is:

class TestMyCodeWithData(unittest.TestCase):
    def setUp(self):
        self.data = db.data(events_data)
        self.data.setup()

    def tearDown(self):
        self.data.teardown()

    def testSomething(self):
        joe = Affiliate.get(self.data.affiliates_data.joe.id)
        click = Event.get(self.data.events_data.joes_click.id)
        assert click.affiliate is joe
        assert click.type == self.data.events_data.joes_click.type


And you're right, I should add an example of this since people coming
from DbUnit or Rails, etc., will be more familiar with that approach.
I started with the decorator example since I believe most of the time
it is easier and faster to write small test functions than classes.
But classes scale better, and it's often cumbersome to convert test
functions into classes when it comes to that -- hence, like the print
problem [1], there is a good argument for always using test classes.

[1] http://www.python.org/dev/peps/pep-3105/

Having said that, this is a good idea.  I've committed
fixture.DataTestCase, a mixin for use with unittest.TestCase
derivatives.
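To illustrate the mixin idea in general terms, here is a self-contained sketch (the class names FakeData and DataTestCaseMixin are hypothetical stand-ins, not the actual fixture.DataTestCase API): subclasses declare which datasets to load, and setUp/tearDown wire in the lifecycle automatically.

```python
import unittest

# Hypothetical stand-in for a fixture's data handle; the real one
# would insert and delete rows instead of flipping a flag.
class FakeData(object):
    def __init__(self, *datasets):
        self.datasets = datasets
        self.loaded = False

    def setup(self):
        self.loaded = True

    def teardown(self):
        self.loaded = False


class DataTestCaseMixin(object):
    datasets = ()  # subclasses override with the DataSets to load

    def setUp(self):
        self.data = FakeData(*self.datasets)
        self.data.setup()

    def tearDown(self):
        self.data.teardown()


class TestSomething(DataTestCaseMixin, unittest.TestCase):
    datasets = ('events_data',)

    def test_loaded(self):
        # by the time the test body runs, setUp has loaded the data
        self.assertTrue(self.data.loaded)
```

The mixin comes first in the bases so its setUp/tearDown win in the MRO over unittest.TestCase's no-op defaults.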

I should also point out here that in Python 2.5 you can write tests like:

with db.data(events_data) as data:
     joe = Affiliate.get(data.affiliates_data.joe.id)
     click = Event.get(data.events_data.joes_click.id)
     assert click.affiliate is joe
     assert click.type == data.events_data.joes_click.type
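For anyone new to the with statement (in Python 2.5 it needs `from __future__ import with_statement`), the protocol behind this is simple.  Here is a minimal sketch, with a hypothetical DataHandle standing in for the object db.data() returns, showing how setup and teardown get called automatically:

```python
# Minimal sketch of the context-manager protocol used above.
# DataHandle is a hypothetical stand-in; the real fixture object
# does the same __enter__/__exit__ dance around its datasets.
class DataHandle(object):
    def __init__(self):
        self.loaded = False

    def setup(self):
        self.loaded = True   # the real thing inserts rows here

    def teardown(self):
        self.loaded = False  # ... and deletes the loaded rows here

    def __enter__(self):
        self.setup()
        return self          # bound to `data` in `with ... as data:`

    def __exit__(self, exc_type, exc_value, traceback):
        # teardown runs even if the test body raised
        self.teardown()


with DataHandle() as data:
    assert data.loaded
```

The nice part is that teardown is guaranteed to run whether or not an assertion in the block fails.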

As far as the XML approach -- I dislike it because I think fixtures
are meant to be hand-coded and editable (in most cases).  XML is hard
to edit; the Rails approach of using YAML is good, but Python code
is as good as YAML if you ask me.  Someone who feels strongly can
submit a patch to me for loading fixtures in XML or YAML.

> So far the fixture code looks a little magic to me but that may be
> because I do not use SessionContext or assign_mapper in my code.  I
> think it would be helpful to see a full example of a simple test case
> class (using unittest) that only uses the standard SA features found
> in the SA tutorial.  This may help separate the SA plugin magic from
> the fixture magic.

I admit the "discovery" of table objects is magical so yes I should
come up with a better example.  Also, good point about not using
extensions.  I just committed an update so that you can create a
fixture purely from a session:

db = SQLAlchemyFixture(session=sqlalchemy.create_session())

keep in mind that if you need to init the session elsewhere, you can ...

db = SQLAlchemyFixture()
# anytime before db.data().setup() ...
db.session = my_session

> This command looks interesting and I need to try it on some of my
> "real" data.  One question though, it looks like this generates python
> code correct?  Have you thought about supporting the option of
> generating some metadata xml file that can be loaded directly.  This
> may make maintenance a little easier and the datasets less verbose.
> It could also allow for hand editing of the xml datasources when that
> makes sense.

I think I addressed this above.  If you are talking about tons and
tons of data, then yes, I can see how Python code might not make
sense.  However, in that case I would suggest using a SQL dump since
it will be more efficient (I use this for testing triggers and stored
procedures).  Then again, you lose the ability to make assertions
with the data object ... so XML might not be such a bad idea for that
use case.  I'll think more about it.
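As an illustration of the SQL-dump approach (a sketch using the stdlib sqlite3 module; the table name, columns, and data are made up for the example):

```python
import sqlite3

# Sketch: load a hand-written SQL dump in one shot, useful when the
# test exercises triggers/stored procedures and per-row python
# objects add no value.
dump = """
CREATE TABLE affiliates (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO affiliates (id, name) VALUES (1, 'joe');
"""

conn = sqlite3.connect(":memory:")
conn.executescript(dump)  # runs the whole dump as a script
row = conn.execute(
    "SELECT name FROM affiliates WHERE id = 1").fetchone()
assert row == ('joe',)
```

With a real database you would feed the dump to the server's own client (psql, mysql, etc.) instead, which is where the efficiency win comes from.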

> > Next you build the DataSet objects that you want to load, in this case they
> > inherit from SequencedSet, an optional DataSet enhancement that simulates
> > auto-incrementing IDs.  The IDs values can be overridden for any row and the
> > column name is configurable, but defaults to 'id'::
>
> Why is this needed?  If the id is already a sequence shouldn't the
> standard inserts into the database handle this automatically?  How can
> it be disabled?

You can disable it by inheriting from fixture.DataSet instead of
SequencedSet.  It's necessary because a DataSet doesn't (and
shouldn't) know about the database; i.e., if you did let the database
create the IDs for you, you wouldn't be able to reference the id
attributes in other data sets.
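To make that concrete, here is a toy sketch of why pre-assigned sequential ids let one dataset reference another before anything touches the database (SequencedRows and the dict literals are made up for illustration, not the fixture API):

```python
import itertools

# Toy sketch: a class-level counter hands out ids at definition time,
# so other datasets can reference an id before any row is inserted.
class SequencedRows(object):
    _counter = itertools.count(1)

    @classmethod
    def next_id(cls):
        return next(cls._counter)


joe_id = SequencedRows.next_id()  # affiliates gets id 1 up front

# an events dataset can now point at joe with no database round-trip:
joes_click = {'affiliate_id': joe_id, 'type': 'click'}
assert joes_click['affiliate_id'] == 1
```

If the database assigned the ids instead, joe's id simply wouldn't exist yet at the time the events dataset is defined.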

However, I'm using a deferred method of resolving the reference
already so I might experiment with resolving references *after* a
dataset has been loaded.  That might work, since you probably only
reference the data set instances after loading anyway.


> This structure was very interesting and IMHO innovative.  Basically it
> represents the source "data" as a set of first class python objects
> that can be referenced later.  I can see how this could be very
> helpful for small test cases.  I am a little less sure of how useful
> this would be for a larger set of data.  For example if I need to test
> on a complex dataset where I have 20-30 tables and 10-20 rows in each
> table then things get a little more tricky.  But maybe I should be
> focusing on simplifying my data in this case. :)

I'd be curious to learn about your use cases for loading so much data
for a single test.  I can see stress testing as one.  Generally, I
have never needed to load more than a couple dozen rows, probably
because most of my tests are isolated in what database-related
functionality they test.

I think the most complex thing I've run into is getting long chains of
foreign key references represented in data sets.  However I am ok with
a long python module for that scenario.


> Using the name "db" here confused me at first since I normal name
> engines something with "db" in the name.  But I think I see the point
> where this is creating a fixture and letting the fixture know about
> the datasets it can load.  What I don't understand though is if it is
> using the TrimmedNameStyle to pick up the data sets from above, then
> why do they have to be passed in explicitly to the db.with_data
> decorator below?

The TrimmedNameStyle is used to find the table objects that will be
loaded for a given DataSet; i.e., "events" would be the name of the
Table instance used to load events_data.  There is a way to specify an
object directly, like:

class anything_i_want(DataSet):
    class Meta:
        storable = sqlalchemy.Table('events', ...)
    class click:
        name = "click"

I just checked in another update that added more tests for this (also
fixed a few things for it).


> Is it possible to capture all of the meta create/drop and data loading
> behind the scenes with some standard calls.  For example something
> like:
>
> def setUp():
>    db.setupDatabase(method=CLEAN_ALL)
> def tearDown():
>    db.teardownDatabase(method=DELETE_ALL)
>
> CLEAN_ALL and DELETE_ALL are ideas from DbUnit.  They let the fixture
> infrastructure know that it should clear everything and load the
> everything it knows about the database from scratch (CLEAN_ALL) and
> delete everything from the database that the fixture knows about
> (DELETE_ALL).

I had originally gone with a "clear all" approach, but it seemed to me
too easy for a new user to accidentally send the fixture a DSN for a
live database and then poof, whoops, new user not happy.  Although I
could make giant, red warnings about this (the loaders already will
not operate on any kind of "default" connection), I decided to take a
different route, which is deleting only the objects that were loaded.
Thanks to SQLAlchemy's unit of work this is very efficient, but I
might be persuaded to add an explicit "clear all" option since that
might be slightly faster.
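The delete-only-what-was-loaded strategy can be sketched like this (stdlib sqlite3, and the load/teardown helpers are illustrative, not the fixture internals):

```python
import sqlite3

# Sketch: remember each (table, id) as it is loaded, then delete only
# those rows on teardown -- never a blanket DELETE against a database
# that might be live.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, type TEXT)")
conn.execute("INSERT INTO events VALUES (1, 'pre-existing')")

loaded = []

def load(table, id_, type_):
    conn.execute("INSERT INTO %s VALUES (?, ?)" % table, (id_, type_))
    loaded.append((table, id_))

def teardown():
    # delete in reverse load order so FK children go before parents
    for table, id_ in reversed(loaded):
        conn.execute("DELETE FROM %s WHERE id = ?" % table, (id_,))

load("events", 2, "click")
teardown()
rows = conn.execute("SELECT id FROM events").fetchall()
assert rows == [(1,)]  # only the fixture's own row was removed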

>
> This makes sense to me now.  The one point of confusion I have is how
> a session would fit into all this.  In your case I think it is being
> hidden by the plugins you are using but since I don't have those I
> think I may have to handle something more explicitly.  (or maybe I
> should be using the plugins. :)

This is a good point -- I added the session parameter to
SQLAlchemyFixture(); see the example above.

>
> Thanks for the great code.  It looks very helpful and I can't wait to
> use it.  Please understand that my comments are not meant to be harsh
> or critical, just some feedback from someone looking at the code for a
> first time.

of course!  I'm very curious to see what other use cases my fellow
test-infected folks may have for the module ;)

Kumar
