I am going to try to integrate this into my testing framework this
afternoon so I am sure I will have more questions after that.  In the
meantime see below...

On 2/7/07, Kumar McMillan <[EMAIL PROTECTED]> wrote:
>
> Thanks for taking a close look Allen.  Here are some answers...
>
> On 2/7/07, Allen Bierbaum <[EMAIL PROTECTED]> wrote:
> > Overall it looks very interesting.  I was expecting something much
> > more along the vein of DbUnit where you do something like this:
> >
> > setUp():
> >    dataset = XmlDataset("dataset.xml")
> >    db.refresh(dataset)
> >
> > tearDown():
> >    db.tearDown()
> >
> > testMethod():
> >    # Do normal db queries here relying upon the data to be in there
> >    # some custom comparisons against loaded datasets are supported
>
> given the first example datasets, the equivalent with fixture is:
>
> class TestMyCodeWithData(unittest.TestCase):
>     def setUp(self):
>         self.data = db.data(events_data)
>         self.data.setup()
>
>     def tearDown(self):
>         self.data.teardown()
>
>     def testSomething(self):
>         joe = Affiliate.get(self.data.affiliates_data.joe.id)
>         click = Event.get(self.data.events_data.joes_click.id)
>         assert click.affiliate is joe
>         assert click.type == self.data.events_data.joes_click.type
>
>
> And you're right, I should add an example to this since people coming
> from DbUnit or rails, etc, will be more familiar with this approach.
> I started with the decorator example since I believe most of the time
> it is easier and faster to write small test functions over classes.
> But classes scale better, and it is often cumbersome to convert test
> functions into classes when it comes to that; hence, like the print
> problem [1], there is a good argument to always use test classes.

Agreed.  This is probably the simplest example that people may want to
start with.  You could make it even simpler by using only a single
table but that prevents you from showing some of the advanced
features.

> [1] http://www.python.org/dev/peps/pep-3105/
>
> Having said that, this is a good idea.  I've committed
> fixture.DataTestCase, a mixin for use with unittest.TestCase
> derivatives.

I will give this a try and see how it works.

> I should also point out here that in python 2.5 you can write tests like:
>
> with db.data(events_data) as data:
>      joe = Affiliate.get(data.affiliates_data.joe.id)
>      click = Event.get(data.events_data.joes_click.id)
>      assert click.affiliate is joe
>      assert click.type == data.events_data.joes_click.type

I have no idea how this works. I guess that means I need to learn a
bit of python 2.5 soon. :)

Until then, it is nice to know that it can be even easier once I
understand python 2.5.
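In the meantime I did a little homework: the 2.5 `with` statement just
calls `__enter__` and `__exit__` hooks on the object, so presumably
`db.data(...)` returns something along these lines (the names and
internals below are my guesses, not fixture's actual code; in 2.5 you
also need `from __future__ import with_statement`):

```python
class DataContext(object):
    """Illustrative stand-in for whatever db.data() returns."""

    def __init__(self, *datasets):
        self.datasets = datasets
        self.loaded = False

    def setup(self):
        self.loaded = True   # the real fixture would insert rows here

    def teardown(self):
        self.loaded = False  # ...and delete them here

    # the two hooks the 'with' statement calls:
    def __enter__(self):
        self.setup()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.teardown()
        return False  # don't swallow exceptions from the block

with DataContext('events_data') as data:
    assert data.loaded

# teardown ran automatically on exit, even if the block had raised
```

So the attraction is that teardown cannot be forgotten, the same way
a try/finally guarantees it.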

> As far as the XML approach -- I dislike this because I think fixtures
> are meant to be hand-coded and editable (in most cases).  XML is hard
> to edit and the rails approach to use YAML is good, but, python code
> is as good as YAML if you ask me.  Someone who feels strongly can
> submit a patch to me for loading fixtures in XML or YAML.

I understand your hesitation.  Maybe I could suggest that you just
think about adding some sort of "loader" plugin support similar to the
IDataSet interfaces in DbUnit
(http://dbunit.sourceforge.net/components.html).  This could provide a
point of extension for people who may want to load XML, YAML, or even
Excel files in the future. :)
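To sketch what I mean, the plugin point could be as small as a single
load() method keyed on file extension (everything below is hypothetical
and uses modern Python spelling, not fixture's current API):

```python
import csv
import os

class DataSetLoader:
    """Hypothetical plugin interface: parse a file into a list of row dicts."""
    def load(self, path):
        raise NotImplementedError

class CSVLoader(DataSetLoader):
    """One possible implementation, for comma-separated files."""
    def load(self, path):
        with open(path, newline='') as f:
            return list(csv.DictReader(f))

# a registry keyed on extension; XML, YAML, or Excel loaders
# could be registered here later without touching the core
LOADERS = {'.csv': CSVLoader()}

def load_dataset(path):
    ext = os.path.splitext(path)[1]
    return LOADERS[ext].load(path)
```

The fixture core would only ever see the list of row dicts, so the
file format stays entirely the plugin's business.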

> > So far the fixture code looks a little magic to me but that may be
> > because I do not use SessionContext or assign_mapper in my code.  I
> > think it would be helpful to see a full example of a simple test case
> > class (using unittest) that only uses the standard SA features found
> > in the SA tutorial.  This may help separate the SA plugin magic from
> > the fixture magic.
>
> I admit the "discovery" of table objects is magical so yes I should
> come up with a better example.  Also, good point about not using
> extensions.  I just committed an update so that you can create a
> fixture purely from a session:
>
> db = SQLAlchemyFixture(session=sqlalchemy.create_session())
>
> keep in mind that if you need to init the session elsewhere, you can ...
>
> db = SQLAlchemyFixture()
> # anytime before db.data().setup() ...
> db.session = my_session

In my current system I have a single global session that is used for
everything.  Is there any reason you can see that I could not just
reuse this session in all the test cases, or should I be creating a
new one each time?

> > This command looks interesting and I need to try it on some of my
> > "real" data.  One question though, it looks like this generates python
> > code correct?  Have you thought about supporting the option of
> > generating some metadata xml file that can be loaded directly.  This
> > may make maintenance a little easier and the datasets less verbose.
> > It could also allow for hand editing of the xml datasources when that
> > makes sense.
>
> I think I addressed this above.  If you are talking about tons and
> tons of data, then yes I can see how python code might not make sense.
>  However, in that case I would suggest using an SQL dump since it will
> be more efficient (I use this for testing triggers and stored
> procedures).  Then again, you lose the functionality of making
> assertions with the data object ... so XML might not be such a bad
> idea for such a use case.  I'll think more about that.

The other thing you lose with an SQL dump is that the output may not
work across different database backends.  That is why I would really
like the loading of the table to be routed back through SA, so we can
have some support for moving the testing data to whatever databases you
end up needing.  (In my particular case this isn't really going to work
because I need GIS support, which is not portable, but it sounds like a
nice capability to me.)

> > > Next you build the DataSet objects that you want to load, in this case
> > > they inherit from SequencedSet, an optional DataSet enhancement that
> > > simulates auto-incrementing IDs.  The ID values can be overridden for
> > > any row and the column name is configurable, but defaults to 'id'::
> >
> > Why is this needed?  If the id is already a sequence shouldn't the
> > standard inserts into the database handle this automatically?  How can
> > it be disabled?
>
> You can disable it by inheriting from fixture.DataSet instead of
> SequencedSet.  It's necessary because a DataSet doesn't [and
> shouldn't] know about the database.  That is, if you did let the database
> create the IDs for you, you wouldn't be able to reference the id
> attributes in other data sets.
>
> However, I'm using a deferred method of resolving the reference
> already so I might experiment with resolving references *after* a
> dataset has been loaded.  That might work, since you probably only
> reference the data set instances after loading anyway.

I would be interested to see your idea here.

I guess in my test cases I wasn't thinking I would be so reliant upon
using this type of information from the DataSet.  I just wanted
DataSet as a way to encapsulate the data and process for loading known
data into the database.  I can see how this could be very useful
though.
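Thinking about it more, I see why the IDs have to exist before anything
touches the database: the cross-references are resolved in Python at
class-definition time.  A stripped-down illustration (plain classes
standing in for DataSet subclasses, data values invented):

```python
# Because the ids are assigned in Python rather than generated by the
# database, events_data can refer to affiliates_data.joe.id while the
# classes are still being defined, before any rows are inserted.

class affiliates_data:
    class joe:
        id = 1
        name = 'joe'

class events_data:
    class joes_click:
        id = 1
        affiliate_id = affiliates_data.joe.id  # cross-dataset reference
        type = 'click'

assert events_data.joes_click.affiliate_id == affiliates_data.joe.id
```

If the ids instead came back from database sequences, `joe.id` simply
wouldn't exist yet at the point `events_data` is defined, which is the
problem the deferred-resolution idea would have to solve.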

> > This structure was very interesting and IMHO innovative.  Basically it
> > represents the source "data" as a set of first class python objects
> > that can be referenced later.  I can see how this could be very
> > helpful for small test cases.  I am a little less sure of how useful
> > this would be for a larger set of data.  For example if I need to test
> > on a complex dataset where I have 20-30 tables and 10-20 rows in each
> > table then things get a little more tricky.  But maybe I should be
> > focusing on simplifying my data in this case. :)
>
> I'd be curious to learn about your use cases for loading so much data
> for a single test.  I can see stress testing as one.  Generally, I
> have never needed to load more than a couple dozen rows, probably
> because most of my tests are isolated in what database-related
> functionality they test.

Right now I am thinking about the future more than the current set of
data I am using.  In the future the project I am working on will have
a tremendous amount of data and I *think* I may need to use some
rather large datasets for testing.  I will try everything I can to
avoid that though.

> I think the most complex thing I've run into is getting long chains of
> foreign key references represented in data sets.  However I am ok with
> a long python module for that scenario.
>
>
> > Using the name "db" here confused me at first since I normally name
> > engines something with "db" in the name.  But I think I see the point
> > where this is creating a fixture and letting the fixture know about
> > the datasets it can load.  What I don't understand though is if it is
> > using the TrimmedNameStyle to pick up the data sets from above, then
> > why do they have to be passed in explicitly to the db.with_data
> > decorator below?
>
> the TrimmedNameStyle is used to find the table objects that will be
> loaded for a given DataSet.  That is, "events" would be the name of the
> Table instance used to load events_data.  There is a way to specify an
> object directly, like:
>
> class anything_i_want(DataSet):
>     class Meta:
>         storable=sqlalchemy.Table('events', ...)
>     class click:
>         name="click"
>
> I just checked in another update that added more tests for this (also
> fixed a few things for it).

Ok, this makes more sense now and points out a problem I am going to
have using it.  In my current code I am not keeping the tables or
mappers around.  Instead I have a database manager class that sets
everything up and simply holds onto the session and engine that should
be used to query the database.  I rely upon the mapped classes to keep
track of the table and metadata references internally.

So... is there any way to associate a dataset with the class type
that is associated with the data in the dataset?

for example maybe something like:

class anything_I_want(DataSet):
    mappedType = MyDataClass
    class click:
        name = "click"

or something else along these lines.  This seems like it would work
well to tie the data back to the class type that is actually being
mapped.  Then the "anything_I_want" class is really just a list of
MyDataClass objects that need to be populated into the database.
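A rough sketch of how a loader might consume such a declaration,
instantiating the mapped class once per inner row class (all names here
are hypothetical, and FakeSession is just a stand-in for a real SA
session):

```python
def load_mapped_dataset(dataset_cls, session):
    """Instantiate dataset_cls.mappedType for each inner row class,
    copy the row's attributes over, and save each instance."""
    mapped = dataset_cls.mappedType
    instances = {}
    for name, row in vars(dataset_cls).items():
        if name.startswith('_') or name == 'mappedType':
            continue
        if not isinstance(row, type):
            continue  # only inner classes are rows
        obj = mapped()
        for attr, value in vars(row).items():
            if not attr.startswith('_'):
                setattr(obj, attr, value)
        session.save(obj)   # spelled session.add() in later SQLAlchemy
        instances[name] = obj
    return instances

class FakeSession:
    """Stand-in for a real session; just records what was saved."""
    def __init__(self):
        self.saved = []
    def save(self, obj):
        self.saved.append(obj)

class MyDataClass:
    pass

class anything_I_want:
    mappedType = MyDataClass
    class click:
        name = "click"

session = FakeSession()
rows = load_mapped_dataset(anything_I_want, session)
assert rows["click"].name == "click"
```

The nice property is that the mapped class already knows its table and
metadata internally, so the dataset never has to.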

>
> > Is it possible to capture all of the meta create/drop and data loading
> > behind the scenes with some standard calls.  For example something
> > like:
> >
> > def setUp():
> >    db.setupDatabase(method=CLEAN_ALL)
> > def tearDown():
> >    db.teardownDatabase(method=DELETE_ALL)
> >
> > CLEAN_ALL and DELETE_ALL are ideas from DbUnit.  They let the fixture
> > infrastructure know that it should clear everything and reload
> > everything it knows about into the database from scratch (CLEAN_ALL),
> > or delete everything from the database that the fixture knows about
> > (DELETE_ALL).
>
> I had originally gone with a "clear all" approach but it seems to me
> too easy for a new user to accidentally send the fixture a dsn to a
> live database and then poof, whoops, new user not happy.  Although I
> could make giant, red warnings about this (the loaders already will
> not operate on any kind of "default" connection) I decided to take a
> different route, which is deleting only the objects loaded.  Thanks to
> sqlalchemy's unit of work this is very efficient, but I might be
> persuaded to add an option "clear all" that can be explicitly turned
> on since that might be slightly faster.

Using the existing meta.drop_all() gives quite a bit of the
functionality needed, so it may not be that bad in a real use case.
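For reference, the create/drop pattern I have in mind looks roughly
like this (modern SQLAlchemy spelling with an in-memory SQLite engine
just for illustration; the 2007-era API differs in detail):

```python
import sqlalchemy as sa

engine = sa.create_engine("sqlite:///:memory:")
meta = sa.MetaData()
events = sa.Table(
    "events", meta,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("type", sa.String(50)),
)

# setUp(): build the whole schema from scratch
meta.create_all(engine)

with engine.begin() as conn:
    conn.execute(events.insert().values(type="click"))
    count = conn.execute(
        sa.select(sa.func.count()).select_from(events)
    ).scalar()
assert count == 1

# tearDown(): drop every table the metadata knows about
meta.drop_all(engine)
assert not sa.inspect(engine).has_table("events")
```

Since drop_all only touches tables registered on the metadata, it is
already a reasonably safe approximation of DELETE_ALL.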

> >
> > This makes sense to me now.  The one point of confusion I have is how
> > a session would fit into all this.  In your case I think it is being
> > hidden by the plugins you are using but since I don't have those I
> > think I may have to handle something more explicitly.  (or maybe I
> > should be using the plugins. :)
>
> this is a good point, I added the session parameter to
> SQLAlchemyFixture(), example above.
>
> >
> > Thanks for the great code.  It looks very helpful and I can't wait to
> > use it.  Please understand that my comments are not meant to be harsh
> > or critical, just some feedback from someone looking at the code for a
> > first time.
>
> of course!  I'm very curious to see what other use cases my fellow
> test-infected folks may have for the module ;)

I am very curious to see where this leads as well.  Hopefully I can
get some of this code integrated this afternoon and it will work out.

Thanks,
Allen

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---
