Hi all, 

I would like to continue the discussion started in this very old thread:
https://groups.google.com/d/msg/django-developers/H2QFcQYsbo8/RmRb-8FVypwJ

I’m sorry if I should've continued the discussion in that thread but it 
felt a bit wrong to bring a 5 year old thread back to life :-)

Anyway, as previously described in that thread the implicit creation of the 
`*_pattern_ops` index when creating indexes on `CharField` and `TextField` 
with `db_index=True` is not ideal. 

In my experience `LIKE` expressions is not that common that it warrants to 
always create an index to cover this. 

For very large tables this can become a problem where insertion/update 
performance is negatively affected and disc space usage is twice what 
really is needed if no `LIKE` queries are used.

And even if one would like to use `LIKE` expressions it’s not obvious that 
the `*_pattern_ops` is the correct index. For leading wildcard patters as 
`LIKE %foo` one has to use a GIN/GiST index with the `*_trgm_ops` opclass. 
With the current implementation we would end up with 3 indexes when 2 would 
be sufficient for this use case (the regular b-tree and the trigram).

One could also argue that we’re not consistent with these implicit indexes. 
The `iexact`/`icontains` lookups require an expression index on 
`(UPPER(col))`  but that’s not created.

One important detail is that this implicit index is _not_ created when 
using the class based `Index` . In my opinion it’s not very clear that one 
needs to handle the creation of a `*_pattern_ops` index manually when using 
it.  

This is the only documentation that I’ve been able to find about the 
creation of these implicit indexes: 
https://docs.djangoproject.com/en/dev/ref/databases/#indexes-for-varchar-and-text-columns

My proposal is to remove the implicit index created by `db_index=True` and 
add documentation that one should use `Index.opclasses` to utilise indexes 
for `LIKE` queries on PostgresSQL. This would give a more consistent 
behaviour and being more explicit about this would help people tune their 
indexes.

If there’s a consensus on this I would like to give this a shot. If there’s 
not an agreement on removing the implicit index I think we should at least 
make the documentation around this a bit more clear about the gotchas of 
`Index` vs `db_index=True`.

Here’s two tickets which have previously discussed this:
https://code.djangoproject.com/ticket/24507
https://code.djangoproject.com/ticket/24088

Stay safe!

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/49a05b47-3eb1-4f28-a86f-6cd151fd7984%40googlegroups.com.

Reply via email to