Hi Russell,
Thanks for your immense patience :-)

These are some additions to my proposal above, based on your feedback:
Status of the current 'creation' code in Django:
The current code, e.g. sql_create_model in django.db.backends.creation,
is a mix of an *inspection* part and an *SQL generation* part. Since the
SQL generation part will (should) now be handled by our new CRUD API, I
will refactor django.db.backends.creation (and the other backends'
creation modules) to keep their inspection code but delegate SQL
generation to the new CRUD API. The approach will be to get the fields
from model._meta.local_fields and feed them to the new CRUD API, roughly
as sketched below. This will serve as a proof of concept for my API.
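
As a very rough illustration only (the CrudApi and create_table names
below are hypothetical placeholders for the new low-level API, whose
actual names and signatures are still to be settled during the design
discussion):

def sql_create_model(self, model, style, known_models=set()):
    # Inspection part: kept from the existing creation code
    fields = model._meta.local_fields
    # SQL generation part: delegated to the new CRUD API in dry-run mode,
    # i.e. return the SQL statements without executing them
    api = CrudApi(self.connection)
    return api.create_table(model._meta.db_table, fields, dry_run=True)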

As for testing using Django code, my models will look something like this:

from django.db import models

class UnchangedModel(models.Model):
    eg = models.TextField()

if BEFORE_MIGRATION:
    class MyModel(models.Model):
        f1 = models.TextField()
        f2 = models.TextField()
else:
    # The migration deletes field f2
    class MyModel(models.Model):
        f1 = models.TextField()

The value of BEFORE_MIGRATION will be controlled by the testing code.
A temporary environment variable can be used for this purpose.
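For example (a minimal sketch; the TEST_BEFORE_MIGRATION variable name is
just an illustration, not part of any settled design):

import os

# Set by the test runner before the test app's models.py is imported;
# defaults to the pre-migration state.
BEFORE_MIGRATION = os.environ.get('TEST_BEFORE_MIGRATION', '1') == '1'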

Also, a revised schedule:
Bonding period before GSoC: Discussion on API design
Week 1     : Writing tests (using the two-part checks discussed above:
             checking the actual database and exercising the Django models)
Week 2     : Developing the base migration API
Week 3     : Developing extensions and overrides for PostgreSQL
Weeks 4-5  : Developing extensions and overrides for MySQL
Weeks 6-7  : Developing extensions and overrides for SQLite (may be shorter
             or longer by about half a week, depending on how much of
             xtrqt's code is considered acceptable)
Weeks 8-10 : Refactoring django.db.backends.creation (and the PostgreSQL,
             MySQL and SQLite creation modules) to use the new API for SQL
             generation (approach discussed above)
Week 11    : Writing documentation and any leftover tests
Week 12    : Buffer week for the unexpected

On Tuesday, 3 April 2012 06:39:37 UTC+5:30, Russell Keith-Magee wrote:
>
>
> On 03/04/2012, at 5:06 AM, j4nu5 wrote:
>
> > Hi Russell,
> > 
> > Thanks for the prompt reply.
> > 
> >  * You aren't ever going to eat your own dogfood. You're spending the 
> GSoC building an API that is intended for use with schema migration, but 
> you're explicitly not looking at any part of the migration process that 
> would actually use that API. How will we know that the API you build is 
> actually fit for the purpose it is intended? How do we know that the 
> requirements of "step 2" of schema migration will be met by your API? I'd 
> almost prefer to see more depth, and less breadth -- i.e., show me a fully 
> functioning schema migration stack on just one database, rather than a 
> fully functioning API on all databases that hasn't actually been shown to 
> work in practice.
> > 
> > 'Eating my own dogfood' to check whether my low level migration 
> primitives are actually *usable*, I believe can be done by:
> > 1. Developing a working fork of South to use these primitives as I 
> mentioned in my project goals, or
> > 2. Aiming for less 'breadth' and more 'depth', as you suggested.
> > 
> > I did not opt for 2, since creating the '2nd level' of the migration 
> framework (the caller of the lower level API) is a huge beast by itself. 
> Any reasonable solution will have to take care of 'Pythonic' as well as 
> 'pseudo-SQL' migrations as discussed above. Not to mention taking care of 
> versioning + dependency management + backwards migrations. I am against the 
> development of a half baked and/or inconsistent 2nd level API layer. Trying 
> to fully develop such a solution even for one database will exceed the GSoC 
> timeline, in my humble opinion.
>
> Ok - there's two problems with what you've said here:
>
>  1) You don't make any reference in your schedule to implementing a 
> "working fork of South". This isn't a trivial activity, so if you're 
> planning on doing this, you should tell us how this is factored into your 
> schedule.
>
>  2) You're making the assumption that you need to "fully develop" a 
> solution. A proof of concept would be more than adequate. For example, in 
> the 2010 GSoC, Alex Gaynor's project was split into two bits; a bunch of 
> modifications to the core query engine, and a completely separate project, 
> not intended for merging to trunk, that demonstrated that his core query 
> changes would do what was necessary. You could take exactly the same 
> approach here; don't try to deliver a fully functioning schema migration 
> tool, just enough of a tool to demonstrate that your API is sufficient. 
>
> >  * It feels like there's a lot of padding in your schedule.
> > 
> >    - A week of discussion at the start
> >    - 2 weeks for a "base" migration API
> >    - 2.5 weeks to write documentation
> >    - 2 "buffer" weeks
> > 
> > Your project is proposing the development of a low level database API. 
> While this should certainly be documented, if it's not going to be "user 
> facing", the documentation requirements aren't as high. Also, because it's 
> a low level database API, I'm not sure what common tools will exist -- yet 
> your schedule estimates 1/6 of your overall time, and 1/3 of your active 
> coding time, will be spent building these common tools. Having 1/6 of your 
> project schedule as contingency is very generous; and you don't mention 
> what you plan to look at if you don't have to use that contingency.
> > 
> > I think the problem is that the 1st part - development of a lower level 
> migrations API - is a little bit small for the GSoC timeline but the 2nd 
> part - the caller of the API - is way big for GSoC. As I said, I did not 
> want to create a half baked solution. That's why the explicit skipping of 
> 2nd level and thus the *padding*. I am still open for discussion and 
> suggestions regarding this matter though.
>
> So, to summarize: What you're telling us is that you know, a-priori, that 
> your project isn't 12 weeks of work. This doesn't give us a lot of 
> incentive to pick up your proposal for the GSoC. We have an opportunity to 
> get Google to pay for 12 weeks development. Given that we have that 
> opportunity, why would we select a project that will only yield 6 weeks of 
> output?
>
> The goal here isn't to pick a project, and then make it fit 12 weeks by 
> any means necessary. It's to pick something that will actually be 12 weeks 
> of work. A little contingency is fine, but if you start padding too much, 
> your proposal isn't going to be taken seriously.
>
> My suggestion -- work out some small aspect of part 2 that you *can* 
> deliver. Not necessarily the whole thing, but a skeleton, and try to 
> deliver a fully fleshed out part on that skeleton. If you're smart about 
> it, this can also double as your dogfood requirement.
>
> >  * Your references to testing are a bit casual for my taste. From my 
> experience, testing schema migration code is hard. Normal view code and 
> utilities are easy to test -- you set up a test database, insert some data, 
> and check functionality. However, schema migration code is explicitly about 
> making database changes, so the thing that Django normally considers 
> "static" -- the database models -- are subject to change, and that isn't 
> always an easy thing to accommodate. I'd be interested to see your thoughts 
> on how you plan to test your API.
> > 
> > On a high level, the testing code will have to check:
> > 1. Whether the migration has been applied correctly.
> > 2. Whether models are behaving the way they are supposed to after the 
> migration.
> > 
> > The 1st part will involve checking non-related fields and related fields 
> (ManyToMany, ForeignKey etc.). Checking non-related fields is relatively 
> easy while checking related fields will involve checking for changes to 
> appropriate constraints and as in the case with ManyToMany, whether the 
> changes have *cascaded* properly to all the tables.
> > 
> > The 2nd part involves checking whether the models/fields affected by the 
> migration, either directly or indirectly, are working/throwing errors the 
> way they are supposed to.
>
> I think you're missing my point. 
>
> If you're planning on running actual Django code to test the functionality 
> of models, you're going to need two models -- the initial model before 
> migration, and the end model after migration. How do you plan to accommodate 
> the existence of both in Django's app cache?
>
> >  * Your proposal doesn't make any reference to the existing 
> "migration-like" tasks in Django's codebase. For example, we already have 
> code for creating tables and adding indices. How will your migration code 
> use, modify or augment these existing capabilities?
> > 
> > The current code, for e.g. sql_create_model in 
> django.db.backends.creation is a mix of *inspection* part and *sql 
> generation* part. Our new layers of API will basically divide these tasks. 
> The *sql generation* part will be handled by the dry-run mode of the new 
> low level API while the inspection part is the responsibility of the higher 
> levels. The refactored version of creation.py will basically make use of 
> the higher level inspection part of the new API for inspection which in 
> turn will call the lower level API for final sql generation.
>
> None of this is mentioned in your proposal, or accounted for in your 
> schedule. Seems like a large omission to me :-)
>
> Yours
> Russ Magee %-)
>
>
