+1 on this new feature.

For anyone unsure how this affects performance, I'm dedicating a chunk of
the (soon to come) webcast to explaining how it works, along with future
ideas for monkey-patching the ORM with deferred execution (which would
basically make this a 99% drop-in replacement).

For those testing this new DSE feature, please remember to use manual
transactions, perform dry runs first, and keep the chunk size low (all of
these need tuning based on what your data is and what you are doing with
it).
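
To make the chunking and dry-run advice concrete, here is a minimal sketch in plain Python. It does not use DSE's or Django's API: `chunked`, `run_in_chunks` and `apply_chunk` are hypothetical names made up for illustration, and the place where a manual transaction would be opened and committed is only marked with a comment.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_in_chunks(records, apply_chunk, chunk_size=100, dry_run=True):
    """Feed `records` to `apply_chunk` in small batches.

    `apply_chunk` is a hypothetical callable supplied by the caller. In a
    real job it would open a manual transaction, write the chunk and commit.
    With dry_run=True the records are only counted, nothing is written.
    Returns the number of records seen.
    """
    seen = 0
    for chunk in chunked(records, chunk_size):
        if not dry_run:
            apply_chunk(chunk)  # one transaction per chunk would go here
        seen += len(chunk)
    return seen
```

Starting with `dry_run=True` and a small `chunk_size` lets you check the record count before anything touches the database; the right chunk size still depends on your data.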

Cal

On Fri, Jun 24, 2011 at 12:32 AM, Thomas Weholt <thomas.weh...@gmail.com> wrote:

> For the impatient:
>
> http://pypi.python.org/pypi/dse/3.0.0.Beta-1
> Source at https://bitbucket.org/weholt/dse2/src
> Modified BSD license.
>
> New in the 3.x version of DSE are the bulk_update method, a more
> intuitive syntax and code cleanup.
> NB! The new syntax is not backwards compatible, so existing code using
> DSE must be updated to work.
>
> New syntax:
>
>    with Person.delayed as d:
>        d.insert({'name': 'Thomas', 'age': 36, 'sex': 'M'})
>        d.update({'id': 1, 'name': 'John'})
>        d.delete(10) # Deletes record with id 10
>
> I hope the syntax is more intuitive and easy to read. Comments wanted.
>
> Bulk update: it takes a dictionary of values to update and requires a
> value for the primary key/id of the record, but uses the Django ORM's
> own update method instead of plain SQL to reduce the number of
> statements to execute. This is helpful when your fields can have a
> limited set of values, like EXIF data from photos or metadata from MP3s.
>
> An example::
>
>    with Photo.delayed as d:
>        d.update({'id': 1, 'camera_model': 'Nikon', 'fnumber': 2.8,
>                  'iso_speed': 200})
>        d.update({'id': 2, 'camera_model': 'Nikon', 'fnumber': 11,
>                  'iso_speed': 400})
>        d.update({'id': 3, 'camera_model': 'Nikon', 'fnumber': 2.8,
>                  'iso_speed': 400})
>        d.update({'id': 4, 'camera_model': 'Canon', 'fnumber': 3.5,
>                  'iso_speed': 200})
>        d.update({'id': 5, 'camera_model': 'Canon', 'fnumber': 11,
>                  'iso_speed': 800})
>        d.update({'id': 6, 'camera_model': 'Pentax', 'fnumber': 11,
>                  'iso_speed': 800})
>        d.update({'id': 7, 'camera_model': 'Sony', 'fnumber': 3.5,
>                  'iso_speed': 1600})
>        # and then some thousand more lines like that
>
> Internally DSE will construct a structure like this::
>
>    bulk_updates = {
>        'camera_model': {
>                'Nikon': [1,2,3],
>                'Canon': [4,5],
>                'Pentax': [6],
>                'Sony': [7],
>            },
>        'fnumber': {
>                2.8: [1,3],
>                11: [2,5,6],
>                3.5: [4,7],
>            },
>        'iso_speed': {
>                200: [1,4],
>                400: [2,3],
>                800: [5,6],
>                1600: [7]
>        }
>    }
>
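
As a rough illustration of how a structure like the one quoted above could be built from the update dictionaries, here is a plain-Python sketch. This is not DSE's actual code: `group_updates` is a hypothetical helper name, and it assumes every field value is hashable so it can serve as a dictionary key.

```python
from collections import defaultdict


def group_updates(rows, pk="id"):
    """Group primary keys by (field, value) for a list of update dicts.

    Hypothetical helper, not part of DSE. Returns a nested dict of the form
    {field: {value: [pk, pk, ...]}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        key = row[pk]
        for field, value in row.items():
            if field != pk:
                grouped[field][value].append(key)
    # Convert to plain dicts so the result compares and prints cleanly.
    return {field: dict(values) for field, values in grouped.items()}


rows = [
    {"id": 1, "camera_model": "Nikon", "fnumber": 2.8, "iso_speed": 200},
    {"id": 2, "camera_model": "Nikon", "fnumber": 11, "iso_speed": 400},
    {"id": 3, "camera_model": "Nikon", "fnumber": 2.8, "iso_speed": 400},
    {"id": 4, "camera_model": "Canon", "fnumber": 3.5, "iso_speed": 200},
    {"id": 5, "camera_model": "Canon", "fnumber": 11, "iso_speed": 800},
    {"id": 6, "camera_model": "Pentax", "fnumber": 11, "iso_speed": 800},
    {"id": 7, "camera_model": "Sony", "fnumber": 3.5, "iso_speed": 1600},
]
bulk_updates = group_updates(rows)
```

Each inner list then becomes the id list for one filter-and-update call.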
> And then execute those statements using::
>
>    # pk = the primary key field for the model, in most cases id
>    for field, values in bulk_updates.items():
>        for value, ids in values.items():
>            lookup = {"%s__in" % pk: ids}
>            model.objects.filter(**lookup).update(**{field: value})
>
> For huge datasets where the fields can take only a limited set of
> values, this has a big impact on performance. Whether to use update or
> bulk_update therefore depends on the data you want to process. For
> instance, importing a contact list where most of the fields have almost
> unique values would benefit from the update method, but importing data
> from photos, ID3 tags from your music collection etc. would process much
> faster using bulk_update.
>
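
To put rough numbers on that trade-off, the sketch below counts statements for both approaches. It assumes the plain update method issues about one statement per record and the grouped approach issues one per distinct (field, value) pair; `statement_counts` and the sample data are made up for illustration and are not part of DSE.

```python
def statement_counts(rows, pk="id"):
    """Return (per_row, grouped) statement counts for a list of update dicts.

    per_row: one statement per record (assumed cost of plain update).
    grouped: one statement per distinct (field, value) pair.
    """
    per_row = len(rows)
    distinct_pairs = {
        (field, value)
        for row in rows
        for field, value in row.items()
        if field != pk
    }
    return per_row, len(distinct_pairs)


models = ["Nikon", "Canon", "Pentax", "Sony"]
fnumbers = [2.8, 3.5, 11]
isos = [200, 400, 800, 1600]

# Photo-like data: 3000 records, but only a handful of distinct values.
photos = [
    {"id": i, "camera_model": models[i % 4], "fnumber": fnumbers[i % 3],
     "iso_speed": isos[i % 4]}
    for i in range(1, 3001)
]

# Contact-like data: nearly every value is unique.
contacts = [
    {"id": i, "name": "person-%d" % i, "email": "p%d@example.com" % i}
    for i in range(1, 3001)
]

photo_counts = statement_counts(photos)
contact_counts = statement_counts(contacts)
small_counts = statement_counts(photos[:7])
```

Under these assumptions the photo data drops from 3000 statements to 11, the contact data goes from 3000 up to 6000, and a tiny batch of seven photos actually needs more grouped statements (11) than per-record ones (7), so grouping only pays off once the dataset is large relative to the number of distinct values.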
> Thanks to Cal Leeming [Simplicity Media Ltd] for inspiration on this one
> :-)
>
> --
> Mvh/Best regards,
> Thomas Weholt
> http://www.weholt.org
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To post to this group, send email to django-users@googlegroups.com.
> To unsubscribe from this group, send email to
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.