Re: [Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-09-30 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
 Reporter:  Keryn Knight |Owner:  Keryn
 Type:   |  Knight
  Cleanup/optimization   |   Status:  closed
Component:  Database layer   |  Version:  dev
  (models, ORM)  |
 Severity:  Normal   |   Resolution:  wontfix
 Keywords:   | Triage Stage:  Ready for
 |  checkin
Has patch:  1|  Needs documentation:  0
  Needs tests:  0|  Patch needs improvement:  0
Easy pickings:  0|UI/UX:  0
-+-
Changes (by Mariusz Felisiak):

 * resolution:  fixed => wontfix


-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/067.a43f88cff8c3e288e90d18e0b4db5439%40djangoproject.com.


Re: [Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-09-30 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
 Reporter:  Keryn Knight |Owner:  Keryn
 Type:   |  Knight
  Cleanup/optimization   |   Status:  closed
Component:  Database layer   |  Version:  dev
  (models, ORM)  |
 Severity:  Normal   |   Resolution:  fixed
 Keywords:   | Triage Stage:  Ready for
 |  checkin
Has patch:  1|  Needs documentation:  0
  Needs tests:  0|  Patch needs improvement:  0
Easy pickings:  0|UI/UX:  0
-+-

Comment (by Mariusz Felisiak ):

 In [changeset:"93a42d43a6995993b9bbcb743ab3c2a2b8414ebd" 93a42d43]:
 {{{
 #!CommitTicketReference repository=""
 revision="93a42d43a6995993b9bbcb743ab3c2a2b8414ebd"
 [4.0.x] Fixed #33159 -- Reverted "Fixed #32970 -- Changed
 WhereNode.clone() to create a shallow copy of children."

 This reverts commit e441847ecae99dd1ccd0d9ce76dbcff51afa863c.

 A shallow copy is not enough because querysets can be reused and
 evaluated in nested nodes, which shouldn't mutate JOIN aliases.

 Thanks Michal Čihař for the report.
 Backport of 903aaa35e5ceaa33bfc9b19b7f6da65ce5a91dd4 from main
 }}}

-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/067.e0b1b18f4f41df844faa12f61e844341%40djangoproject.com.


Re: [Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-09-30 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
 Reporter:  Keryn Knight |Owner:  Keryn
 Type:   |  Knight
  Cleanup/optimization   |   Status:  closed
Component:  Database layer   |  Version:  dev
  (models, ORM)  |
 Severity:  Normal   |   Resolution:  fixed
 Keywords:   | Triage Stage:  Ready for
 |  checkin
Has patch:  1|  Needs documentation:  0
  Needs tests:  0|  Patch needs improvement:  0
Easy pickings:  0|UI/UX:  0
-+-

Comment (by GitHub ):

 In [changeset:"903aaa35e5ceaa33bfc9b19b7f6da65ce5a91dd4" 903aaa3]:
 {{{
 #!CommitTicketReference repository=""
 revision="903aaa35e5ceaa33bfc9b19b7f6da65ce5a91dd4"
 Fixed #33159 -- Reverted "Fixed #32970 -- Changed WhereNode.clone() to
 create a shallow copy of children."

 This reverts commit e441847ecae99dd1ccd0d9ce76dbcff51afa863c.

 A shallow copy is not enough because querysets can be reused and
 evaluated in nested nodes, which shouldn't mutate JOIN aliases.

 Thanks Michal Čihař for the report.
 }}}

-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/067.8969874084aa386576d5edb9da24309e%40djangoproject.com.


Re: [Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-09-20 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
 Reporter:  Keryn Knight |Owner:  Keryn
 Type:   |  Knight
  Cleanup/optimization   |   Status:  closed
Component:  Database layer   |  Version:  dev
  (models, ORM)  |
 Severity:  Normal   |   Resolution:  fixed
 Keywords:   | Triage Stage:  Ready for
 |  checkin
Has patch:  1|  Needs documentation:  0
  Needs tests:  0|  Patch needs improvement:  0
Easy pickings:  0|UI/UX:  0
-+-
Changes (by Mariusz Felisiak ):

 * status:  assigned => closed
 * resolution:   => fixed


Comment:

 In [changeset:"e441847ecae99dd1ccd0d9ce76dbcff51afa863c" e441847e]:
 {{{
 #!CommitTicketReference repository=""
 revision="e441847ecae99dd1ccd0d9ce76dbcff51afa863c"
 Fixed #32970 -- Changed WhereNode.clone() to create a shallow copy of
 children.
 }}}

-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/067.17a1e7d83b7feab0143232a8987d13b5%40djangoproject.com.


Re: [Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-09-19 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
 Reporter:  Keryn Knight |Owner:  Keryn
 Type:   |  Knight
  Cleanup/optimization   |   Status:  assigned
Component:  Database layer   |  Version:  dev
  (models, ORM)  |
 Severity:  Normal   |   Resolution:
 Keywords:   | Triage Stage:  Ready for
 |  checkin
Has patch:  1|  Needs documentation:  0
  Needs tests:  0|  Patch needs improvement:  0
Easy pickings:  0|UI/UX:  0
-+-
Changes (by Mariusz Felisiak):

 * stage:  Accepted => Ready for checkin


-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/067.be130192468a09826e0543a12e3fb931%40djangoproject.com.


Re: [Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-08-03 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
 Reporter:  Keryn Knight |Owner:  Keryn
 Type:   |  Knight
  Cleanup/optimization   |   Status:  assigned
Component:  Database layer   |  Version:  dev
  (models, ORM)  |
 Severity:  Normal   |   Resolution:
 Keywords:   | Triage Stage:  Accepted
Has patch:  1|  Needs documentation:  0
  Needs tests:  0|  Patch needs improvement:  0
Easy pickings:  0|UI/UX:  0
-+-
Changes (by Carlton Gibson):

 * stage:  Unreviewed => Accepted


Comment:

 OK, we can certainly ''Investigate''. Thanks for the work Keryn. :)

-- 
Ticket URL: 
Django 
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/067.b2ddd6d270e6dd28f778aca3274769ac%40djangoproject.com.


Re: [Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-07-29 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
 Reporter:  Keryn Knight |Owner:  Keryn
 Type:   |  Knight
  Cleanup/optimization   |   Status:  assigned
Component:  Database layer   |  Version:  dev
  (models, ORM)  |
 Severity:  Normal   |   Resolution:
 Keywords:   | Triage Stage:
 |  Unreviewed
Has patch:  1|  Needs documentation:  0
  Needs tests:  0|  Patch needs improvement:  0
Easy pickings:  0|UI/UX:  0
-+-
Description changed by Keryn Knight:

Old description:

> This relates to #32948 insofar as it's about figuring out how shallow or
> deep a copy is necessary for cloning `Node` instances/subclasses. And a
> bit to #28455 which acknowledges the need to avoid cloning sometimes due
> to it being costly.
>
> The PR I made for optimising Q combining/inverting
> (https://github.com/django/django/pull/14673) wants to introduce a
> `__copy__` method onto `Q` objects to indicate what level of operation is
> needed. Unlike that PR, it's decidedly not possible to take the same
> approach with WhereNode and just do `clone.__dict__ =
> self.__dict__.copy()` (things broke horribly) because of attributes like
> `contains_aggregate` and `output_field` which are cached properties on
> the WhereNode instance. Also `resolved` gets set after the fact by
> `resolve_expression`. But, as far as I can tell, the
> looping/testing/child cloning ''may be removable''. Perhaps once upon a
> time they were required and other components changed such that it's now
> safe to consider. I can't readily say, but initial experimentation
> suggests it's OK to look at, at least in theory.
>
> There is a PR already for this, here:
> https://github.com/django/django/pull/14709 which is currently marked
> draft/WIP because I couldn't easily run the whole test suite locally and
> needed to see if the more exotic parts caused problems. They didn't,
> which was unexpected, so here I am. The PR currently replaces the
> `WhereNode.clone` method like so:
> {{{
> clone = self.__class__._new_instance(children=[],
> connector=self.connector, negated=self.negated)
> for child in self.children:
> if hasattr(child, 'clone'):
> clone.children.append(child.clone())
> else:
> clone.children.append(child)
> }}}
> with:
> {{{
> clone = self.__class__._new_instance(children=None,
> connector=self.connector, negated=self.negated)
> clone.children = self.children[:]
> }}}
>

> But I ''think'' that Q and WhereNode ''can'' both just return a shallow
> copy by only aliasing the children property, at which point I think the
> `__copy__` method could be implemented on `Node` directly and mirror the
> existing `__deepcopy__` method. By preferring those stdlib names it draws
> a line in the sand over what level of copy should be expected. The
> existing `clone` and `copy` methods can become aliases to same.
>

> === Before any changes
>
> If we now begin to examine the before and after with as much data as I
> can easily gather. First a bit of prelude:
>
> {{{
> In [1]: from django.db.models.sql.where import WhereNode
> In [2]: from django.db.models import QuerySet
> In [3]: from django.db.models.sql import Query
> In [4]: from django.contrib.auth.models import User
> In [5]: x = User.objects.all()
> In [6]: y = User.objects.filter(username='test',
> email='t...@test.test').exclude(username='test').filter(email='nottest')
> In [7]: %load_ext line_profiler
> }}}
>
> Now let's establish a ballpark for how much time is spent where, I've
> stripped the line profiling down to just the relevant bits for clarity.
>
>  Queryset with no clauses (`x`)
>
> {{{
> In [8]: %lprun -f QuerySet._chain -f QuerySet._clone -f Query.chain -f
> Query.clone -f WhereNode.clone for _ in range(1000): x._chain()
>
> Function: QuerySet._chain
> Line #  Hits Time  Per Hit   % Time  Line Contents
> ==
>   1325   def _chain(self,
> **kwargs):
>   ...
>   1330  1000  88169.0 88.2 97.8  obj =
> self._clone()
>
> Function: QuerySet._clone
> Line #  Hits Time  Per Hit   % Time  Line Contents
> ==
>   1337   def _clone(self):
>   ...
>   1342  1000  80796.0 80.8 86.3  chained =
> self.query.chain()
>   1343  1000   7240.0  7.2  7.7  c =
> self.__class__(model=self.model, query=chained, 

[Django] #32970: Investigate feasibility of improving WhereNode clone performance

2021-07-29 Thread Django
#32970: Investigate feasibility of improving WhereNode clone performance
-+-
   Reporter:  Keryn  |  Owner:  Keryn Knight
  Knight |
   Type: | Status:  assigned
  Cleanup/optimization   |
  Component:  Database   |Version:  dev
  layer (models, ORM)|
   Severity:  Normal |   Keywords:
   Triage Stage: |  Has patch:  1
  Unreviewed |
Needs documentation:  0  |Needs tests:  0
Patch needs improvement:  0  |  Easy pickings:  0
  UI/UX:  0  |
-+-
 This relates to #32948 insofar as it's about figuring out how shallow or
 deep a copy is necessary for cloning `Node` instances/subclasses. And a
 bit to #28455 which acknowledges the need to avoid cloning sometimes due
 to it being costly.

 The PR I made for optimising Q combining/inverting
 (https://github.com/django/django/pull/14673) wants to introduce a
 `__copy__` method onto `Q` objects to indicate what level of operation is
 needed. Unlike that PR, it's decidedly not possible to take the same
 approach with WhereNode and just do `clone.__dict__ =
 self.__dict__.copy()` (things broke horribly) because of attributes like
 `contains_aggregate` and `output_field` which are cached properties on the
 WhereNode instance. Also `resolved` gets set after the fact by
 `resolve_expression`. But, as far as I can tell, the looping/testing/child
 cloning ''may be removable''. Perhaps once upon a time they were required
 and other components changed such that it's now safe to consider. I can't
 readily say, but initial experimentation suggests it's OK to look at, at
 least in theory.

 There is a PR already for this, here:
 https://github.com/django/django/pull/14709 which is currently marked
 draft/WIP because I couldn't easily run the whole test suite locally and
 needed to see if the more exotic parts caused problems. They didn't, which
 was unexpected, so here I am. The PR currently replaces the
 `WhereNode.clone` method like so:
 {{{
 clone = self.__class__._new_instance(children=[],
 connector=self.connector, negated=self.negated)
 for child in self.children:
 if hasattr(child, 'clone'):
 clone.children.append(child.clone())
 else:
 clone.children.append(child)
 }}}
 with:
 {{{
 clone = self.__class__._new_instance(children=None,
 connector=self.connector, negated=self.negated)
 clone.children = self.children[:]
 }}}


 But I ''think'' that Q and WhereNode ''can'' both just return a shallow
 copy by only aliasing the children property, at which point I think the
 `__copy__` method could be implemented on `Node` directly and mirror the
 existing `__deepcopy__` method. By preferring those stdlib names it draws
 a line in the sand over what level of copy should be expected. The
 existing `clone` and `copy` methods can become aliases to same.


 === Before any changes

 If we now begin to examine the before and after with as much data as I can
 easily gather. First a bit of prelude:

 {{{
 In [1]: from django.db.models.sql.where import WhereNode
 In [2]: from django.db.models import QuerySet
 In [3]: from django.db.models.sql import Query
 In [4]: from django.contrib.auth.models import User
 In [5]: x = User.objects.all()
 In [6]: y = User.objects.filter(username='test',
 email='t...@test.test').exclude(username='test').filter(email='nottest')
 In [7]: %load_ext line_profiler
 }}}

 Now let's establish a ballpark for how much time is spent where, I've
 stripped the line profiling down to just the relevant bits for clarity.

  Queryset with no clauses (`x`)

 {{{
 In [8]: %lprun -f QuerySet._chain -f QuerySet._clone -f Query.chain -f
 Query.clone -f WhereNode.clone for _ in range(1000): x._chain()

 Function: QuerySet._chain
 Line #  Hits Time  Per Hit   % Time  Line Contents
 ==
   1325   def _chain(self,
 **kwargs):
   ...
   1330  1000  88169.0 88.2 97.8  obj =
 self._clone()

 Function: QuerySet._clone
 Line #  Hits Time  Per Hit   % Time  Line Contents
 ==
   1337   def _clone(self):
   ...
   1342  1000  80796.0 80.8 86.3  chained =
 self.query.chain()
   1343  1000   7240.0  7.2  7.7  c =
 self.__class__(model=self.model, query=chained, using=self._db,
 hints=self._hints))

 Function: Query.chain
 Line #  Hits Time  Per Hit   % Time  Line Contents
 ==
341