For the impatient:

http://pypi.python.org/pypi/dse/3.0.0.Beta-1
Source at https://bitbucket.org/weholt/dse2/src
Modified BSD license.

New in the 3.x version of DSE are the bulk_update method, a more
intuitive syntax and code cleanup.
NB! The new syntax is not backwards compatible, so existing code using
DSE must be updated to work.

New syntax:

    with Person.delayed as d:
        d.insert({'name': 'Thomas', 'age': 36, 'sex': 'M'})
        d.update({'id': 1, 'name': 'John'})
        d.delete(10) # Deletes record with id 10

I hope the syntax is more intuitive and easy to read. Comments wanted.

Bulk update: it takes a dictionary of values to update and requires a
value for the primary key/id of the record, but it uses the Django ORM's
own update method instead of plain SQL to reduce the number of
statements to execute. This is helpful when your fields can have a
limited set of values, like EXIF data from photos or metadata from MP3s.

An example::

    with Photo.delayed as d:
        d.update({'id': 1, 'camera_model': 'Nikon', 'fnumber': 2.8,
                  'iso_speed': 200})
        d.update({'id': 2, 'camera_model': 'Nikon', 'fnumber': 11,
                  'iso_speed': 400})
        d.update({'id': 3, 'camera_model': 'Nikon', 'fnumber': 2.8,
                  'iso_speed': 400})
        d.update({'id': 4, 'camera_model': 'Canon', 'fnumber': 3.5,
                  'iso_speed': 200})
        d.update({'id': 5, 'camera_model': 'Canon', 'fnumber': 11,
                  'iso_speed': 800})
        d.update({'id': 6, 'camera_model': 'Pentax', 'fnumber': 11,
                  'iso_speed': 800})
        d.update({'id': 7, 'camera_model': 'Sony', 'fnumber': 3.5,
                  'iso_speed': 1600})
        # and then some thousand more lines like that

Internally DSE will construct a structure like this::

    bulk_updates = {
        'camera_model': {
            'Nikon': [1, 2, 3],
            'Canon': [4, 5],
            'Pentax': [6],
            'Sony': [7],
        },
        'fnumber': {
            2.8: [1, 3],
            11: [2, 5, 6],
            3.5: [4, 7],
        },
        'iso_speed': {
            200: [1, 4],
            400: [2, 3],
            800: [5, 6],
            1600: [7],
        },
    }

And then execute those statements using::

    # pk = the primary key field for the model, in most cases id
    for field, values in bulk_updates.items():
        for value, ids in values.items():
            # one UPDATE statement per distinct (field, value) pair
            lookup = {"%s__in" % pk: ids}
            model.objects.filter(**lookup).update(**{field: value})
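
The grouping step itself can be illustrated without Django. The sketch
below is not DSE's actual code; group_updates is a hypothetical helper
name, and it assumes every value is hashable so it can serve as a
dictionary key. It rebuilds the structure shown above from the seven
example rows:

```python
from collections import defaultdict


def group_updates(rows, pk="id"):
    """Group row dicts into {field: {value: [pk, ...]}}.

    Hypothetical helper for illustration only, not part of DSE.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        row_id = row[pk]
        for field, value in row.items():
            if field == pk:
                continue  # the primary key is the lookup, not a value to set
            grouped[field][value].append(row_id)
    # Convert to plain dicts so the result compares and prints cleanly
    return {field: dict(values) for field, values in grouped.items()}


rows = [
    {'id': 1, 'camera_model': 'Nikon', 'fnumber': 2.8, 'iso_speed': 200},
    {'id': 2, 'camera_model': 'Nikon', 'fnumber': 11, 'iso_speed': 400},
    {'id': 3, 'camera_model': 'Nikon', 'fnumber': 2.8, 'iso_speed': 400},
    {'id': 4, 'camera_model': 'Canon', 'fnumber': 3.5, 'iso_speed': 200},
    {'id': 5, 'camera_model': 'Canon', 'fnumber': 11, 'iso_speed': 800},
    {'id': 6, 'camera_model': 'Pentax', 'fnumber': 11, 'iso_speed': 800},
    {'id': 7, 'camera_model': 'Sony', 'fnumber': 3.5, 'iso_speed': 1600},
]

bulk_updates = group_updates(rows)
```

Each inner list then becomes the argument of a single `__in` filter, so
the number of statements depends on how many distinct values a field
takes, not on how many rows there are.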

For huge datasets where the fields can have only a limited set of
values, this has a big impact on performance. So whether to use update
or bulk_update depends on the data you want to process. For instance,
importing a contact list where most of the fields have almost unique
values would benefit from the update method, but importing data from
photos, ID3 tags from your music collection etc. would process much
faster using bulk_update.
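
To get a feeling for this trade-off before picking a method, one can
count statements up front. This is a rough sketch, not something DSE
provides: statement_counts is a hypothetical helper, and it assumes the
update method issues one statement per row while bulk_update issues one
per distinct (field, value) pair.

```python
def statement_counts(rows, pk="id"):
    """Return (per_row, grouped) statement counts for a list of row dicts.

    Hypothetical helper; assumes one statement per row for plain updates
    and one per distinct (field, value) pair for grouped updates.
    """
    per_row = len(rows)
    distinct_pairs = {
        (field, value)
        for row in rows
        for field, value in row.items()
        if field != pk
    }
    return per_row, len(distinct_pairs)


# Photo-like data: many rows, few distinct values per field
models = ['Nikon', 'Canon', 'Pentax', 'Sony']
fnumbers = [2.8, 3.5, 11]
isos = [200, 400, 800, 1600]
photo_rows = [
    {'id': i,
     'camera_model': models[i % len(models)],
     'fnumber': fnumbers[i % len(fnumbers)],
     'iso_speed': isos[i % len(isos)]}
    for i in range(1, 3001)
]

# Contact-like data: nearly every value is unique
contact_rows = [{'id': i, 'name': 'person-%d' % i} for i in range(1, 3001)]

photo_counts = statement_counts(photo_rows)      # (3000, 11)
contact_counts = statement_counts(contact_rows)  # (3000, 3000)
tiny_counts = statement_counts(photo_rows[:7])   # (7, 11)
```

Note the last case: with only a handful of rows, grouping can need more
statements than updating row by row. The gain only shows up once the
row count clearly exceeds the number of distinct values.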

Thanks to Cal Leeming [Simplicity Media Ltd] for inspiration on this one :-)

-- 
Mvh/Best regards,
Thomas Weholt
http://www.weholt.org
