Re: [GSoC 2012] Schema Alteration API proposal

Russell Keith-Magee Mon, 02 Apr 2012 18:10:14 -0700

On 03/04/2012, at 5:06 AM, j4nu5 wrote:

> Hi Russell,
> 
> Thanks for the prompt reply.
> 
>  * You aren't ever going to eat your own dogfood. You're spending the GSoC 
> building an API that is intended for use with schema migration, but you're 
> explicitly not looking at any part of the migration process that would 
> actually use that API. How will we know that the API you build is actually 
> fit for the purpose it is intended? How do we know that the requirements of 
> "step 2" of schema migration will be met by your API? I'd almost prefer to 
> see more depth, and less breadth -- i.e., show me a fully functioning schema 
> migration stack on just one database, rather than a fully functioning API on 
> all databases that hasn't actually been shown to work in practice.
> 
> 'Eating my own dogfood' to check whether my low level migration primitives 
> are actually *usable*, I believe can be done by:
> 1. Developing a working fork of South to use these primitives as I mentioned 
> in my project goals, or
> 2. Aiming for less 'breadth' and more 'depth', as you suggested.
> 
> I did not opt for 2, since creating the '2nd level' of the migration 
> framework (the caller of the lower level API) is a huge beast by itself. Any 
> reasonable solution will have to take care of 'Pythonic' as well as 
> 'pseudo-SQL' migrations as discussed above. Not to mention taking care of 
> versioning + dependency management + backwards migrations. I am against the 
> development of a half baked and/or inconsistent 2nd level API layer. Trying 
> to fully develop such a solution even for one database will exceed the GSoC 
> timeline, in my humble opinion.


Ok - there are two problems with what you've said here:

 1) You don't make any reference in your schedule to implementing a "working 
fork of South". This isn't a trivial activity, so if you're planning on doing 
this, you should tell us how this is factored into your schedule.

 2) You're making the assumption that you need to "fully develop" a solution. A 
proof of concept would be more than adequate. For example, in the 2010 GSoC, 
Alex Gaynor's project was split into two bits; a bunch of modifications to the 
core query engine, and a completely separate project, not intended for merging 
to trunk, that demonstrated that his core query changes would do what was 
necessary. You could take exactly the same approach here; don't try to deliver 
a fully functioning schema migration tool, just enough of a tool to demonstrate 
that your API is sufficient. 

>  * It feels like there's a lot of padding in your schedule.
> 
>    - A week of discussion at the start
>    - 2 weeks for a "base" migration API
>    - 2.5 weeks to write documentation
>    - 2 "buffer" weeks
> 
> Your project is proposing the development of a low level database API. While 
> this should certainly be documented, if it's not going to be "user facing", 
> the documentation requirements aren't as high. Also, because it's a low level 
> database API, I'm not sure what common tools will exist -- yet your schedule 
> estimates 1/6 of your overall time, and 1/3 of your active coding time, will 
> be spent building these common tools. Having 1/6 of your project schedule as 
> contingency is very generous; and you don't mention what you plan to look at 
> if you don't have to use that contingency.
> 
> I think the problem is that the 1st part - development of a lower level 
> migrations API - is a little bit small for the GSoC timeline but the 2nd part 
> - the caller of the API - is way big for GSoC. As I said, I did not want to 
> create a half baked solution. That's why the explicit skipping of 2nd level 
> and thus the *padding*. I am still open for discussion and suggestions 
> regarding this matter though.

So, to summarize: What you're telling us is that you know, a priori, that your 
project isn't 12 weeks of work. This doesn't give us a lot of incentive to pick 
up your proposal for the GSoC. We have an opportunity to get Google to pay for 
12 weeks of development. Given that we have that opportunity, why would we select 
a project that will only yield 6 weeks of output?

The goal here isn't to pick a project, and then make it fit 12 weeks by any 
means necessary. It's to pick something that will actually be 12 weeks of work. 
A little contingency is fine, but if you start padding too much, your proposal 
isn't going to be taken seriously.

My suggestion -- work out some small aspect of part 2 that you *can* deliver. 
Not necessarily the whole thing, but a skeleton, and try to deliver a fully 
fleshed out part on that skeleton. If you're smart about it, this can also 
double as your dogfood requirement.

>  * Your references to testing are a bit casual for my taste. From my 
> experience, testing schema migration code is hard. Normal view code and 
> utilities are easy to test -- you set up a test database, insert some data, 
> and check functionality. However, schema migration code is explicitly about 
> making database changes, so the thing that Django normally considers "static" 
> -- the database models -- are subject to change, and that isn't always an 
> easy thing to accommodate. I'd be interested to see your thoughts on how you 
> plan to test your API.
> 
> On a high level, the testing code will have to check:
> 1. Whether the migration has been applied correctly.
> 2. Whether models are behaving the way they are supposed to after the 
> migration.
> 
> The 1st part will involve checking non-related fields and related fields 
> (ManyToMany, ForeignKey etc.). Checking non-related fields is relatively easy 
> while checking related fields will involve checking for changes to 
> appropriate constraints and as in the case with ManyToMany, whether the 
> changes have *cascaded* properly to all the tables.
> 
> The 2nd part involves checking whether the models/fields affected by the 
> migration, either directly or indirectly, are working/throwing errors the way 
> they are supposed to.

I think you're missing my point. 

If you're planning on running actual Django code to test the functionality of 
models, you're going to need two models -- the initial model before migration, 
and the end model after migration. How do you plan to accommodate the existence 
of both in Django's app cache?
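
For the first check in the quoted list (whether the migration has been applied correctly), one option is to test at the schema level, which avoids loading either model class. The following is only a minimal sketch, assuming SQLite via Python's standard sqlite3 module; the table name, the column names and the add_column helper are hypothetical and are not taken from the proposal, Django or South. It does not address the second check, which still needs both model definitions.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of `table` using SQLite's table_info pragma."""
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]


def add_column(conn, table, column_sql):
    """Hypothetical migration primitive: add one column to an existing table."""
    conn.execute("ALTER TABLE %s ADD COLUMN %s" % (table, column_sql))


# Hypothetical starting schema with one existing row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE library_book (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO library_book (title) VALUES ('Dune')")

before = column_names(conn, "library_book")
add_column(conn, "library_book", "pages INTEGER DEFAULT 0")
after = column_names(conn, "library_book")

# Existing rows should survive and pick up the default value.
rows = conn.execute("SELECT title, pages FROM library_book").fetchall()
```

Comparing `before` and `after` covers the structural change, and `rows` confirms that existing data was preserved.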

>  * Your proposal doesn't make any reference to the existing "migration-like" 
> tasks in Django's codebase. For example, we already have code for creating 
> tables and adding indices. How will your migration code use, modify or 
> augment these existing capabilities?
> 
> The current code, e.g. sql_create_model in django.db.backends.creation, is 
> a mix of an *inspection* part and a *sql generation* part. Our new layers of API 
> will basically divide these tasks. The *sql generation* part will be handled 
> by the dry-run mode of the new low level API while the inspection part is the 
> responsibility of the higher levels. The refactored version of creation.py 
> will basically make use of the higher level inspection part of the new API 
> for inspection which in turn will call the lower level API for final sql 
> generation.

None of this is mentioned in your proposal, or accounted for in your schedule. 
Seems like a large omission to me :-)
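
To make the "dry-run mode" idea from the quoted paragraph concrete, here is a rough sketch of a low-level layer that always records the SQL it generates and only executes it when not in dry-run mode. Everything in it (the SchemaEditor class, its methods, the table name) is hypothetical and written against the standard sqlite3 module; it is not the proposal's design and not an existing Django or South API. The column list stands in for whatever a higher-level inspection layer would produce.

```python
import sqlite3


class SchemaEditor:
    """Hypothetical low-level API; not part of Django or South."""

    def __init__(self, conn, dry_run=False):
        self.conn = conn
        self.dry_run = dry_run
        self.collected_sql = []

    def _run(self, sql):
        # Always record the statement; only execute it outside dry-run mode.
        self.collected_sql.append(sql)
        if not self.dry_run:
            self.conn.execute(sql)

    def create_table(self, name, columns):
        # `columns` is a list of (column_name, column_type_sql) pairs,
        # produced by whatever higher layer inspects the models.
        body = ", ".join("%s %s" % pair for pair in columns)
        self._run("CREATE TABLE %s (%s)" % (name, body))

    def add_index(self, table, column):
        self._run(
            "CREATE INDEX %s_%s_idx ON %s (%s)" % (table, column, table, column)
        )


def table_exists(conn, name):
    """Check SQLite's catalogue for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
columns = [("id", "INTEGER PRIMARY KEY"), ("title", "TEXT")]

# Dry run: SQL is generated and collected, but the database is untouched.
preview = SchemaEditor(conn, dry_run=True)
preview.create_table("library_book", columns)
preview.add_index("library_book", "title")
exists_after_dry_run = table_exists(conn, "library_book")

# Real run: the same calls now execute against the database.
real = SchemaEditor(conn)
real.create_table("library_book", columns)
real.add_index("library_book", "title")
exists_after_real_run = table_exists(conn, "library_book")
```

In this sketch the dry run and the real run produce the same statement list, so the dry-run output could serve as the "sql generation" half while the caller keeps the inspection half.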

Yours
Russ Magee %-)
