14.08.2025 01:12, Jeff Davis wrote:
On Mon, 2025-08-11 at 17:21 +0300, Alexander Borisov wrote:
[..]
Comments on the patch itself:
The 0001 patch generalizes the two-step lookup process: first navigate
branches to find the index into a partially-compacted sparse array, and
then use that to
09.08.2025 02:17, Jeff Davis wrote:
On Tue, 2025-07-08 at 22:42 +0300, Alexander Borisov wrote:
Version 3 patches. In version 2 "make -s headerscheck" did not work.
I ran my own performance tests. What I did was get some test data from
ICU v76.1 by doing:
[..]
Results wi
01.08.2025 23:37, Tom Lane wrote:
Alexander Borisov writes:
I'm new here, so please advise me: if a patch wasn't accepted at the
commitfest, does that mean it's not needed (no one was interested in
it), or was there not enough time?
It's kind of hard to tell really --- th
sure what to do.
I looked and saw that patches are often transferred from commitfest to
commitfest. I understand that this is normal practice?
Please understand, it's not very transparent here, the approach is not
obvious.
What is the best course of action for me?
Thanks!
--
Regards,
Alexander Borisov
20.06.2025 20:20, Jeff Davis wrote:
On Fri, 2025-06-20 at 17:51 +0300, Alexander Borisov wrote:
I don't quite see how this compares to the implementation in Rust. In
the link provided, they use a perfect hash, which I got rid of for
a 2x boost.
If you take ICU implementations in C++, I
19.06.2025 20:41, Jeff Davis wrote:
On Tue, 2025-06-03 at 00:51 +0300, Alexander Borisov wrote:
As promised, I continue to improve/speed up Unicode in Postgres.
Last time, we improved the lower(), upper(), and casefold()
functions. [1]
Now it's time for Unicode Normalization Forms, specifi
11.06.2025 10:13, John Naylor wrote:
On Tue, Jun 3, 2025 at 1:51 PM Alexander Borisov wrote:
5. The server part "lost weight" in the binary, but the frontend
"gained weight" a little.
I read the old commits, which say that the size of the frontend is very
important a
e in this area.
But again, I'm new to the Postgres community and I'm getting to know
what's going on here and how it works.
Thank you for paying attention to it!
--
Regards,
Alexander Borisov
me from the commit message nor the
skimming the original thread, whether the perf improvement numbers
listed by Alexander also apply to lower() and upper(), or if they only
apply to casefold():
On Sun, 4 May 2025 at 00:32, Alexander Borisov wrote:
ASCII by ≈10%
Cyrillic by ≈80%
Unicode in general by
u for clarifying!
Users are not interested in performance gains.
Then it's not worth considering. Sorry to interrupt.
--
Regards,
Alexander Borisov
algorithms.
Because of which the functions lower(), upper(), casefold() got a
significant boost.
--
Regards,
Alexander Borisov
d want to understand.
Commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=27bdec06841d1bb004ca7627eac97808b08a7ac7
I am now actively working on a major improvement to Unicode
Normalization Forms.
Thanks!
--
Regards,
Alexander Borisov
15.03.2025 23:07, Jeff Davis wrote:
On Fri, 2025-03-14 at 15:00 +0300, Alexander Borisov wrote:
I tried adding a loop to create tables, and everything looks fine
(v7).
[...]
I prefer to generalize when we have the other code in place. As it was,
it was a bit confusing why the extra
12.03.2025 19:55, Alexander Borisov wrote:
[...]
A couple questions:
* Is there a reason the fast-path for codepoints < 0x80 is in
unicode_case.c rather than unicode_case_func.h?
Yes, this is an important optimization, below are benchmarks that
[...]
I forgot to add the benchm
19.02.2025 01:56, Jeff Davis wrote:
On Wed, 2025-02-19 at 01:54 +0300, Alexander Borisov wrote:
In proposing the patch for v3, I struck a balance between improving
performance and reducing binary size, without sacrificing code
clarity.
Fair enough. I will continue reviewing v3.
Did you have
19.02.2025 01:02, Jeff Davis wrote:
On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
I tried the approach via a range table. The result was worse than
without the table. With branching in a function, the result is
better.
Patch v3 — ranges binary search by branches.
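To illustrate what "binary search by branches" could look like, here is a minimal sketch. It is not taken from the patch: the function name branch_lookup, the two ranges, and the returned offsets are assumptions chosen for the example. The idea shown is that the ranges are encoded directly as nested comparisons in a function, so no range table is read at run time.

```c
#include <stdint.h>

/*
 * Hypothetical generated function (illustration only).
 * Two ranges are hard-coded as branches:
 *   U+0041..U+005A -> offsets 0..25
 *   U+0410..U+042F -> offsets 26..57
 * Returns the offset into a compact table, or -1 if the code point
 * falls outside every range.
 */
static int
branch_lookup(uint32_t cp)
{
	if (cp < 0x0410)
	{
		/* Lower half of the search space: only one range lives here. */
		if (cp >= 0x0041 && cp <= 0x005A)
			return (int) (cp - 0x0041);
	}
	else if (cp <= 0x042F)
	{
		/* Upper half: the second range starts right at the split point. */
		return (int) (26 + (cp - 0x0410));
	}

	return -1;
}
```

With these assumed ranges, branch_lookup(0x0041) yields 0 and branch_lookup(0x0061) yields -1. A real generator would emit many more levels of comparisons, one per split of the range list.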
Patch v4
06.02.2025 22:08, Jeff Davis wrote:
On Thu, 2025-02-06 at 18:39 +0300, Alexander Borisov wrote:
Since I started to improve Unicode Case, I used the same approach,
essentially a binary search, only not by individual values, but by
ranges.
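A binary search over ranges rather than individual values can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the cp_range struct, the ranges array, and range_lookup are hypothetical names, and the two ranges are sample data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range entry: code points first..last map to
 * consecutive slots of a compact table, starting at base. */
typedef struct
{
	uint32_t	first;
	uint32_t	last;
	uint16_t	base;
} cp_range;

/* Sample data (assumption): sorted by first, non-overlapping. */
static const cp_range ranges[] = {
	{0x0041, 0x005A, 0},		/* 26 entries */
	{0x0410, 0x042F, 26},		/* 32 entries */
};

/*
 * Binary search by ranges: each step compares the code point with a
 * whole range instead of a single value. Returns the index into the
 * compact table, or -1 if no range contains cp.
 */
static int
range_lookup(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = sizeof(ranges) / sizeof(ranges[0]);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp < ranges[mid].first)
			hi = mid;
		else if (cp > ranges[mid].last)
			lo = mid + 1;
		else
			return (int) (ranges[mid].base + (cp - ranges[mid].first));
	}

	return -1;
}
```

Because only range boundaries are stored, the search array stays much smaller than a per-code-point table when mapped code points cluster into contiguous runs.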
I considered it a 4th approach because of the generated
Hi Jeff,
06.02.2025 00:46, Jeff Davis wrote:
On Tue, 2025-02-04 at 23:19 +0300, Alexander Borisov wrote:
I've done many different experiments and everywhere the result is
within the margin of the v2 patch result.
Great, thank you for working on this!
There doesn't appear to be
by uint8*n.
Thanks, after the weekend I'll send an updated patch that takes into
account the comments/advice.
--
SberTech
Alexander Borisov
Sorry, I made a mistake in the code. It's not worth looking at this patch yet.
29.01.2025 23:23, Alexander Borisov wrote:
Hi, hackers!
I propose to consider a simple optimization for Unicode case tables.
The main changes affect the generate-unicode_case_table.pl file.
Because of the mod
10.12.2024 13:59, Victor Yegorov wrote:
Thu, 5 Dec 2024 at 17:02, Alexander Borisov <lex.bori...@gmail.com>:
[..]
Hey, I had a look at this patch and found its functionality mature and
performant.
As Peter mentioned pguri, I used it to compare with the proposed
06.12.2024 21:04, Matthias van de Meent:
On Thu, 5 Dec 2024 at 15:02, Alexander Borisov wrote:
[..]
I'd be extremely annoyed if URLs I wrote into the database didn't
return in identical manner when fetched from the database. See also
how numeric has different representations o
Hi Daniel,
06.12.2024 16:46, Daniel Gustafsson wrote:
On 6 Dec 2024, at 13:59, Alexander Borisov wrote:
As I've written before, there is a difference between parsing URLs
according to the RFC 3986 specification and WHATWG URLs. This is
especially true for host. Here are a couple
05.12.2024 17:59, Peter Eisentraut wrote:
On 05.12.24 15:01, Alexander Borisov wrote:
Postgres users often store URLs in the database. As an example, they
provide links to their pages on the web, analyze users' posts and get
links for further storage and analysis. Naturally, there is a need to