Thank you for your detailed explanation. You are right, they have tried to
shape the database suitable for this structure for a long time, but
removing the expression indexes will not cause a crisis. People and
products using the database created by Django will already have to create a
standard index and then an expression index for "UPPER". I have presented
the gin index as an example, but indexes can be produced for full text
search in different structures. But regardless, an expression index must be
added for the "UPPER" function. Creating an expression gin index in
addition to a standard gin index can be costly and even cause performance
problems in large databases. Of course, it is also necessary to consider
its reindex. As a result, my main expectation is not to ignore the current
structure of PostgreSQL. Or if you are using "icontains", be sure to
include an expression string because we should say we prefer not to use
"ilike". :)

Hannes Ljungberg <hannes.ljungb...@gmail.com>, 1 Mar 2021 Pzt, 11:31
tarihinde şunu yazdı:

> I kind of agree that using `UPPER` instead of `ILIKE` for `__icontains` on
> PostgreSQL isn’t optimal. But it is quite easy to create a functional
> trigram GIN-index which use `UPPER` to allow these lookups to use an index.
> This will be even easier in Django 3.2 where you can create functional
> indexes in your model definitions.
>
> The recommended way to handle case insensitive searches in PostgreSQL is
> by using functional indexes (
> https://www.postgresql.org/docs/current/indexes-expressional.html).  But
> with trigram indexes this isn’t needed and they are required to get index
> scans for leading wildcard searches. Note that trigram GIN indexes do have
> drawbacks, for example they don’t support the `=`-operator and will not
> support search strings with less than 3 characters.
>
> I’ve been in a similar position as you and to solve this I created a
> custom lookup which used `ILIKE` and used that instead of `__icontains`.
>
> It might make sense to change the `__icontains`  and `__iendswith` lookups
> to use `ILIKE` instead of `UPPER` but I’m not really sure that it’s
> justified to break the indexed queries people already have in place. That
> is where people have the GIN trigram `UPPER` index opposed from the regular
> GIN trigram index. The performance issue you’re describing doesn’t really
> change if we use `ILIKE` over `UPPER`. You need to install an index anyway.
>
> Regarding your issue with Turkish characters I think that it works because
> `ILIKE` internally uses some form of `LOWER` and `LOWER('İstanbul') =
> LOWER('istanbul')` would’ve worked in your case. As James wrote this
> behaviour depends on your configured locale. I think the one way to do
> these kind of searches without changing the locale is to use `tsvector` and
> `tsquery` with a Turkish configuration, you can even make them unaccented
> to allow matching both “Istanbul” and “İstanbul” with the search string
> “istanbul”.
>
>
> måndag 1 mars 2021 kl. 07:06:59 UTC+1 skrev mesuto...@gmail.com:
>
>> Hi James,
>> Thanks for your explanations. However, I wanted to explain the
>> disadvantage of using "UPPER like"  instead of "ilike" in icontains,
>> istartswith and iendswith. The performance problem should not be ignored.
>>
>> James Bennett <ubern...@gmail.com>, 1 Mar 2021 Pzt, 04:05 tarihinde şunu
>> yazdı:
>>
>>> On Sun, Feb 28, 2021 at 2:39 AM Tom Forbes <t...@tomforb.es> wrote:
>>> >
>>> > Thank you for the clarification! I think the biggest argument for this
>>> change is the fact that uppercasing Unicode can cause incorrect results to
>>> be returned.
>>> >
>>> > Given that we now have much better support for custom index types,
>>> perhaps we should change this? We need a custom expression index anyway, so
>>> it might not be a huge ask to say “now you should use a gin index”?
>>>
>>> It's worth pointing out that case mapping and transformation in
>>> Unicode is difficult and complex. I wrote up an intro to the problem a
>>> while back:
>>>
>>> https://www.b-list.org/weblog/2018/nov/26/case/
>>>
>>> One thing that's important to note is that there is no generic
>>> one-size-fits-all-languages option that Django can just do by default
>>> and get the right results. For example, a case mapping that does the
>>> right thing for Turkish will do the wrong thing for (to pick a random
>>> example) French, and vice-versa. Unicode itself provides a basic "hope
>>> for the best" set of default case mappings that do the right thing for
>>> many cased scripts, but also is clear in saying that you may need to
>>> use a locale-specific mapping to get what you really want.
>>>
>>> Postgres has the ability to configure locale, and when configured it
>>> does the "right thing" -- for example, when the locale is tr_TR or
>>> another Turkish locale variant, the UPPER() function should correctly
>>> handle dotted versus dotless 'i' as required for Turkish. But Postgres
>>> also warns that this will have performance impact, which I think is
>>> what's being noted in the ticket.
>>>
>>> I'm not sure there will be an easy or obvious solution here.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Django developers  (Contributions to Django itself)" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to django-develop...@googlegroups.com.
>>>
>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/django-developers/CAL13Cg9nYMJZwm2XcsCcWG5Fqn8gqqE93FM11Xcfs4TXsmTbZQ%40mail.gmail.com
>>> .
>>>
>>
>>
>> --
>> İyi çalışmalar. Saygılarımla.
>>
>> *Mesut Öncel*
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/124180ad-dc7e-439d-ac4a-166e40e5d921n%40googlegroups.com
> <https://groups.google.com/d/msgid/django-developers/124180ad-dc7e-439d-ac4a-166e40e5d921n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>


-- 
İyi çalışmalar. Saygılarımla.

*Mesut Öncel*

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAACgSnTcofOS9fC9D3_mDBpwtOQDLSK%2BTwAQy-EUPvi3_pBMDg%40mail.gmail.com.

Reply via email to