Re: [Django] #14697: Speeding up queryset model instance creation

2010-11-17 Thread Django
#14697: Speeding up queryset model instance creation
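---------------------------------------------------+------------------------
          Reporter:  akaariai                      |        Owner:  nobody
            Status:  new                           |    Milestone:  1.3
         Component:  Database layer (models, ORM)  |      Version:  SVN
        Resolution:                                |     Keywords:  performance, queryset, iterator
             Stage:  Unreviewed                    |    Has_patch:  1
        Needs_docs:  0                             |  Needs_tests:  0
Needs_better_patch:  0                             |
---------------------------------------------------+------------------------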
Comment (by akaariai):

 Ok, I now have django-bench benchmarks and results. I am not too familiar
 with django-bench, so there might be some stupid choices made in
 constructing the benchmarks...

 Results:

 {{{
 query_all: fetch 1000 objects with just 2 fields
 Running 'query_all' benchmark ...
 Min: 0.01 -> 0.00: incomparable (one result was zero)
 Avg: 0.015140 -> 0.009020: 1.6785x faster
 Significant (t=27.972377)
 Stddev: 0.00525 -> 0.00450: 1.1673x smaller (N = 1000)

 query_all_multifield: fetch 1000 objects with 11 fields
 Running 'query_all_multifield' benchmark ...
 Min: 0.01 -> 0.01: no change
 Avg: 0.023940 -> 0.017660: 1.3556x faster
 Significant (t=27.959131)
 Stddev: 0.00517 -> 0.00487: 1.0604x smaller (N = 1000)
 }}}
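
The "faster" factors in the output above are the ratio of the two averages. A minimal sketch that re-derives them from the reported numbers, and also expresses them as the share of run time saved (the helper names are made up for this illustration and are not part of django-bench):

```python
# Average times in seconds copied from the benchmark output above:
# (without patch, with patch)
averages = {
    "query_all": (0.015140, 0.009020),
    "query_all_multifield": (0.023940, 0.017660),
}


def speedup(before, after):
    """Return how many times faster `after` is than `before`."""
    return before / after


def time_saved_percent(before, after):
    """Return the share of the original run time that was eliminated, in percent."""
    return (before - after) / before * 100.0


for name, (before, after) in averages.items():
    print("%s: %.4fx faster, %.1f%% less time"
          % (name, speedup(before, after), time_saved_percent(before, after)))
```

A factor of about 1.68x therefore corresponds to roughly 40% less time per run, and about 1.36x to roughly 26% less.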

 I can't run the whole suite (I get some errors), but I tried some of the
 most promisingly named benchmarks and didn't see any change in them, except
 for query_iterator, which gave me:

 {{{
 Running 'query_iterator' benchmark ...
 Min: 0.00 -> 0.00: incomparable (one result was zero)
 Avg: 0.000300 -> 0.000250: 1.2000x faster
 Not significant
 Stddev: 0.00171 -> 0.00156: 1.0926x smaller (N = 1000)
 }}}

 I am attaching a tar.gz that contains four benchmarks: query_all and
 query_all_multifield for this ticket, and query_raw and query_raw_deferred
 for #14700. Extract it into the benchmarks directory.

-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to django-upda...@googlegroups.com.
To unsubscribe from this group, send email to 
django-updates+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en.



Re: [Django] #14697: Speeding up queryset model instance creation

2010-11-16 Thread Django
#14697: Speeding up queryset model instance creation
Comment (by akaariai):

 Yes, I will look into converting the test to django-bench. However, I
 can't promise when. I have already spent too much work time on this, but
 if nobody tells my boss, I might be able to turn that benchmark into a
 django-bench test tomorrow...




Re: [Django] #14697: Speeding up queryset model instance creation

2010-11-16 Thread Django
#14697: Speeding up queryset model instance creation
Comment (by lukeplant):

 Just as a pointer, if you wanted to create that benchmark - I think you
 could call the benchmark 'query_all', use the 'query_get' benchmark as an
 example, and use the 'setup' keyword argument to the 'run_benchmark'
 utility to create your objects rather than using a fixture. You should
 find it very straightforward.




Re: [Django] #14697: Speeding up queryset model instance creation

2010-11-16 Thread Django
#14697: Speeding up queryset model instance creation
Changes (by lukeplant):

  * needs_better_patch:  => 0
  * needs_tests:  => 0
  * needs_docs:  => 0

Comment:

 It's not a requirement for a patch being accepted, but it would really
 help me and possibly others to have your example as part of
 [https://github.com/jacobian/djangobench django-bench].  Jacob usually
 accepts pull requests for that fairly quickly. It should help you do more
 optimisations as well.

 Or, if there is an existing benchmark which shows similar stats for your
 patch, just point us to that one.

 Thanks!




[Django] #14697: Speeding up queryset model instance creation

2010-11-16 Thread Django
#14697: Speeding up queryset model instance creation
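---------------------------------------------+------------------------
 Reporter:  akaariai                         |       Owner:  nobody
   Status:  new                              |   Milestone:  1.3
Component:  Database layer (models, ORM)     |     Version:  SVN
 Keywords:  performance, queryset, iterator  |       Stage:  Unreviewed
Has_patch:  1                                |
---------------------------------------------+------------------------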
 The attached patch does some easy optimizations to speed up iteration and
 thus model instance creation when using querysets.

 The tests are run using the following code:

 {{{
 models.py:

 from django.db import models

 class Test1(models.Model):
     pass

 # Create your models here.
 class Test2(models.Model):
     field1 = models.CharField(max_length=20)
     field2 = models.ForeignKey(Test1)
     field3 = models.CharField(max_length=20)
     field4 = models.CharField(max_length=20)
     field5 = models.CharField(max_length=20)
     field6 = models.CharField(max_length=20)
     field7 = models.CharField(max_length=20)
     field8 = models.CharField(max_length=20)
     field9 = models.CharField(max_length=20)
     field10 = models.CharField(max_length=20)
     field11 = models.CharField(max_length=20)
     field12 = models.CharField(max_length=20)
     field13 = models.CharField(max_length=20)

 test.py:
 from test_.models import *
 """
 Uncomment for first run to create objects...
 t2 = Test1(pk=1)
 t2.save()
 for i in range(0, 1000):
     t = Test2(pk=i, field1='value', field2=t2)
     t.save()
 for i in range(0, 1000):
     t = Test1(pk=i)
     t.save()
 """
 from datetime import datetime
 from django.conf import settings
 # dummy read of settings to avoid weird results in timing:
 # first read of settings changes timezone...
 t = settings.INSTALLED_APPS

 def fetch_objs():
     for i in range(0, 10):
         #for obj in Test1.objects.all():
         for obj in Test2.objects.all():
             pass

 import hotshot, hotshot.stats
 prof = hotshot.Profile("test.prof")
 prof.runcall(fetch_objs)
 prof.close()
 stats = hotshot.stats.load("test.prof")
 # stats.strip_dirs()
 stats.sort_stats('time', 'calls')
 stats.print_stats(50)
 start = datetime.now()
 fetch_objs()
 print '%s' % (datetime.now() - start)
 # What is the absolute maximum that can be achieved?
 from django.db import connection
 cursor = connection.cursor()
 start = datetime.now()
 for i in range(0, 10):
     cursor.execute('select * from test__test2')
     for obj in cursor.fetchall():
         pass
 print '%s' % (datetime.now() - start)
 }}}
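
The script above uses hotshot and the print statement, so it only runs on Python 2; hotshot was removed in Python 3. A minimal sketch of the same profile-then-time pattern using cProfile and pstats follows. The `fetch_objs` body here is a stand-in workload chosen so the example runs without a Django project; it is an assumption, not the queryset loop from the ticket.

```python
import cProfile
import io
import pstats
import time


def fetch_objs():
    """Stand-in workload: build 10 x 1000 small objects, mimicking the loop shape."""
    created = 0
    for _ in range(10):
        for row in range(1000):
            obj = {"id": row, "field1": "value"}
            created += 1
    return created


# Profile one call and collect the most expensive functions, sorted as in the ticket.
profiler = cProfile.Profile()
profiler.runcall(fetch_objs)
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("time", "calls")
stats.print_stats(50)
print(report.getvalue())

# Time an unprofiled run separately, because profiling inflates wall-clock time.
start = time.perf_counter()
count = fetch_objs()
elapsed = time.perf_counter() - start
print("%d objects in %.6f s" % (count, elapsed))
```

As in the original script, the profiled run and the timed run are kept separate so that profiler overhead does not end up in the reported duration.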

 The results on my computer are as follows:

 When fetching 10000 Test1 objects (10 passes over 1000 rows):
 0.085 seconds with patch, 0.145 seconds without patch

 When fetching 10000 Test2 objects (10 passes over 1000 rows):
 0.200 seconds with patch, 0.270 seconds without patch

 So, this should result in a 20-40% speedup for these simple cases.

 The absolute maximum that can be achieved is somewhere around 0.015
 seconds for the Test1 case (0.007 for fetching from DB, and 0.07 for
 creating a python object and setting attributes for it). Add in signals
 and ModelState creation, and you land somewhere between 0.02 and 0.03
 seconds. So there is still some room for optimization, but going further
 doesn't seem too easy.

 Possible optimizations:

  * Pass Model.`__init__` in base.py a dict of attr_name: val pairs, so
    that it can update the model's `__dict__` directly. This results in
    around 20% speedup, but it is backwards incompatible (either the init
    *args or **kwargs would need to contain that dict, and existing code
    does not expect that), and for that reason it is not included here.
  * Add a separate method (qs.as_list()) that fetches the list without any
    caching (just fetch all the results from the cursor and create a list
    from them). I think that would result in around 20% more speedup, but
    it would require maintaining two different implementations for
    fetching objects.
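
The first idea above, filling the instance `__dict__` in one step instead of setting attributes one by one, can be illustrated with plain Python classes. This is only a sketch under stated assumptions: `PerAttribute` and `DictUpdate` are hypothetical classes invented for the comparison, not Django's model code, and the timings it prints depend on the machine and interpreter.

```python
import timeit

FIELD_NAMES = ["field%d" % i for i in range(1, 14)]


class PerAttribute(object):
    """Sets each attribute individually from positional values."""

    def __init__(self, *args):
        for name, value in zip(FIELD_NAMES, args):
            setattr(self, name, value)


class DictUpdate(object):
    """Accepts a ready-made {attr_name: value} dict and merges it in one step."""

    def __init__(self, values):
        self.__dict__.update(values)


row = tuple("value %d" % i for i in range(13))
row_dict = dict(zip(FIELD_NAMES, row))

a = PerAttribute(*row)
b = DictUpdate(row_dict)
# Both construction styles produce the same instance state.
print(a.__dict__ == b.__dict__)

per_attribute = timeit.timeit(lambda: PerAttribute(*row), number=20000)
dict_update = timeit.timeit(lambda: DictUpdate(row_dict), number=20000)
print("setattr per field: %.4f s, __dict__.update: %.4f s"
      % (per_attribute, dict_update))
```

The comparison leaves out the cost of building `row_dict` from a database row, and the changed constructor signature is exactly the backwards incompatibility mentioned above.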

 Just as a datapoint: doing the same using Test2.objects.raw("select * from
 test`__`test2") results in about 1 second of run time. I am going to look
 into that next, as that is _really_ bad.
