#32840: Micro-optimisation possibility in Field.get_col -------------------------------------+------------------------------------- Reporter: Keryn | Owner: nobody Knight | Type: | Status: new Cleanup/optimization | Component: Database | Version: dev layer (models, ORM) | Severity: Normal | Keywords: Triage Stage: | Has patch: 0 Unreviewed | Needs documentation: 0 | Needs tests: 0 Patch needs improvement: 0 | Easy pickings: 0 UI/UX: 0 | -------------------------------------+------------------------------------- Current implementation is: {{{ def get_col(self, alias, output_field=None): if output_field is None: output_field = self if alias != self.model._meta.db_table or output_field != self: from django.db.models.expressions import Col return Col(alias, self, output_field) else: return self.cached_col }}} If no ''different'' output field is provided, is doing the following comparison needlessly as far as I can tell: `output_field != self` for which the default implementation of `Field.__eq__` is: {{{ if isinstance(other, Field): return ( self.creation_counter == other.creation_counter and getattr(self, 'model', None) == getattr(other, 'model', None) ) return NotImplemented }}} in that scenario, because `self` and `output_field` are literally the same object (down to the `id(...)`) the `isinstance` resolves to True, the creation counters also are the same and the models are as you'd expect ... the same. There's no short-circuiting via falsy condition available.
I think that the method body can be changed to: {{{ has_diff_output_field = True if output_field is None: output_field = self has_diff_output_field = False if alias != self.model._meta.db_table or (has_diff_output_field and output_field != self): from django.db.models.expressions import Col return Col(alias, self, output_field) else: return self.cached_col }}} The introduction of `has_diff_output_field` being the important part. If it's `False` then comparison short-circuiting will prevent the execution of `output_field != self` at all. I'm purposefully avoiding making further investigation/judgement about whether `output_field != self` is itself necessary, because it's ostensibly possible for a custom `output_field` to be provided which has the same `creation_counter` + `model` and I don't know how ''likely'' that is. Across the entire test suite (ignoring those which have skipped), executed with the proposed change didn't seem to break anything (yay) and augmenting the method additionally with: {{{ if has_diff_output_field: print('different') else: print('same') }}} and counting the results across some 14K tests, there were `87021 different` and `178493 same`. Quick example of how to get to the method: {{{ >>> tuple(get_user_model().objects.all()) (Pdb) w /path/django/db/models/query.py(280)__iter__() -> self._fetch_all() /path/django/db/models/query.py(1343)_fetch_all() -> self._result_cache = list(self._iterable_class(self)) /path/django/db/models/query.py(51)__iter__() -> results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size) /path/django/db/models/sql/compiler.py(1175)execute_sql() -> sql, params = self.as_sql() /path/django/db/models/sql/compiler.py(523)as_sql() -> extra_select, order_by, group_by = self.pre_sql_setup() /path/django/db/models/sql/compiler.py(55)pre_sql_setup() -> self.setup_query() /path/django/db/models/sql/compiler.py(46)setup_query() -> self.select, self.klass_info, self.annotation_col_map = self.get_select() /path/django/db/models/sql/compiler.py(228)get_select() -> cols = self.get_default_columns() /path/django/db/models/sql/compiler.py(715)get_default_columns() -> column = field.get_col(alias) > /path/django/db/models/fields/__init__.py(396)get_col() }}} Overall it's: - 1 comparison if they're the same (it was 1 comparison before too, but that was itself 3 comparisons) - 1 additional comparison if they're not the same. The weighting/ratio of the test suite + the fact that the ''simplest'' ORM usage suggests (to me) it might have merit. Addendum: when I say micro, I mean micro. It's not a big time saver, I just happened to notice upon far more calls to `__eq__` than I expected. -- Ticket URL: <https://code.djangoproject.com/ticket/32840> Django <https://code.djangoproject.com/> The Web framework for perfectionists with deadlines. -- You received this message because you are subscribed to the Google Groups "Django updates" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-updates+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/052.4ff52e8b8b4a18aa6fdb392a592e6de3%40djangoproject.com.