Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel
It's not clear from the web site in question that the relevant code is released under the PostgreSQL license. As author I allow to use this code in PostgreSQL and under its license. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Double shared memory allocation for SLRU LWLocks
It seems to me that we're allocating shared memory for SLRU lwlocks twice, unless I'm missing something. I think you are right. Did you check previous versions? i.e. should it be applyed to previous branches?? I suppose yes, to minimize code difference. Also I'd like an idea to add Assert(offset <= SimpleLruShmemSize(nslots, nlsns)) at the end of SimpleLruInit() SimpleLruShmemSize() calculates total SLRU shared memory size including lwlocks size. SimpleLruInit() starts with line shared = (SlruShared) ShmemInitStruct(name, SimpleLruShmemSize(nslots, nlsns), ); which allocates SLRU shared memory (LWLocks size is included because SimpleLruShmemSize() is used for size computation). Following line allocates shared memory for LWLocks again: shared->buffer_locks = (LWLockPadded *) ShmemAlloc(sizeof(LWLockPadded) * nslots); Attached patch fixes that by removing extra ShmemAlloc for SLRU. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Suspicious place in heap_prepare_freeze_tuple()
Hi! Playing around freezing tuple I found suspicious piece of code: heap_prepare_freeze_tuple(): ... frz->t_infomask = tuple->t_infomask; ... frz->t_infomask &= ~HEAP_XMAX_BITS; frz->xmax = newxmax; if (flags & FRM_MARK_COMMITTED) frz->t_infomask &= HEAP_XMAX_COMMITTED; Seems, in last line it should be a bitwise OR instead of AND. Now this line cleans all bits in t_infomask which later will be copied directly in tuple. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 8deb344d09..ec227bac80 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -6639,7 +6639,7 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, TransactionId cutoff_xid, frz->t_infomask &= ~HEAP_XMAX_BITS; frz->xmax = newxmax; if (flags & FRM_MARK_COMMITTED) -frz->t_infomask &= HEAP_XMAX_COMMITTED; +frz->t_infomask |= HEAP_XMAX_COMMITTED; changed = true; totally_frozen = false; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pluggable storage
1. Table AM with a 6-byte TID. 2. Table AM with a custom locator format, which could be TID-like. 3. Table AM with no locators. Currently TID has its own type in system catalog. Seems, it's possible that storage claims type of TID which it uses. Then, AM could claim it too, so the core based on that information could solve the question about AM-storage compatibility. Storage could also claim that it hasn't TID type at all so it couldn't be used with any access method, use case: compressed column oriented storage. As I remeber, only GIN depends on TID format, other indexes use it as opaque type. Except, at least, btree and GiST - they believ that internal pointers are the same as outer (to heap) Another dubious part - BitmapScan. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Perfomance bug in v10
BTW, was the larger query plan that you showed (with a Materialize node) generated by 9.6, or v10 HEAD? Because I would be surprised if 9.6 did v10, commit acbd8375e954774181b673a31b814e9d46f436a5 Author: Magnus Hagander <mag...@hagander.net> Date: Fri Jun 2 11:18:24 2017 +0200 -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Perfomance bug in v10
There were old threads about considering a risk factor when estimating plans, and I'm thinking this issue is the planner failing to do exactly that. I'm afraid it's tool late for v10 -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Perfomance bug in v10
Teodor, could you check if this patch fixes your real-world problem? It works fine with original query, thank you. But some other query slowdowns for ~10% (9 secs vs 10 secs). Look at following part of plans of huge query: without patch: -> Nested Loop (cost=34.82..50.91 rows=1 width=20) (actual time=0.017..0.061 rows=5 loops=24121) -> ... -> Materialize (cost=0.56..15.69 rows=1 width=5) (actual time=0.003..0.004 rows=2 loops=109061) -> Index Scan using ... (cost=0.56..15.68 rows=1 width=5) (actual time=0.013..0.014 rows=2 loops=24121) with patch: -> Nested Loop (cost=34.82..50.91 rows=1 width=20) (actual time=0.018..0.063 rows=5 loops=24121) -> ... -> Index Scan using ... (cost=0.56..15.68 rows=1 width=5) (actual time=0.012..0.013 rows=2 loops=109061) (dots hidden the same parts) As you said, it removes Materialize node, although it's useful here. If you wish, I can do a test suite, its size will be around 10MB and send it by private email. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Perfomance bug in v10
Thank you for the answer! This is all caused by get_variable_numdistinct() deciding that all values are distinct because ntuples < DEFAULT_NUM_DISTINCT. I see that if the example is increased to use 300 tuples instead of 32, then that's enough for the planner to estimate 2 rows instead of clamping to 1, and the bad plan does not look so good anymore since the planner predicts that those nested loops need to be executed more than once. I miss here why could the presence of index influence on that? removing index causes a good plan although it isn't used in both plans . I really think the planner is too inclined to take risks by nesting Nested loops like this, but I'm not all that sure the best solution to fix this, and certainly not for beta1. So, I'm a bit unsure exactly how best to deal with this. It seems like we'd better make some effort, as perhaps this could be a case that might occur when temp tables are used and not ANALYZED, but the only way I can think to deal with it is not to favour unique inner nested loops in the costing model. The unfortunate thing about not doing this is that the planner will no longer swap the join order of a 2-way join to put the unique rel on the inner side. This is evident by the regression test failures caused by patching with the attached, which changes the cost model for nested loops back to what it was before unique joins. The patch, seems, works for this particular case, but loosing swap isn't good thing, I suppose. My other line of thought is just not to bother doing anything about this. There's plenty more queries you could handcraft to trick the planner into generating a plan that'll blow up like this. Is this a realistic enough one to bother accounting for? Did it come from a real world case? else, how did you stumble upon it? Unfortunately, it's taken from real application. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Perfomance bug in v10
Hi! I found an example where v10 chooses extremely non-optimal plan: select i::int as a, i::int + 1 as b, 0 as c into t from generate_series(1,32) as i; create unique index i on t (c, a); explain analyze SELECT t1.a, t1.b, t2.a, t2.b, t3.a, t3.b, t4.a, t4.b, t5.a, t5.b, t6.a, t6.b /* , t7.a, t7.b, t8.a, t8.b, t9.a, t9.b, t10.a, t10.b */ FROM t T1 LEFT OUTER JOIN t T2 ON T1.b = T2.a AND T2.c = 0 LEFT OUTER JOIN t T3 ON T2.b = T3.a AND T3.c = 0 LEFT OUTER JOIN t T4 ON T3.b = T4.a AND T4.c = 0 LEFT OUTER JOIN t T5 ON T4.b = T5.a AND T5.c = 0 LEFT OUTER JOIN t T6 ON T5.b = T6.a AND T6.c = 0 LEFT OUTER JOIN t T7 ON T6.b = T7.a AND T7.c = 0 LEFT OUTER JOIN t T8 ON T7.b = T8.a AND T8.c = 0 LEFT OUTER JOIN t T9 ON T8.b = T9.a AND T9.c = 0 LEFT OUTER JOIN t T10 ON T9.b = T10.a AND T10.c = 0 WHERE T1.c = 0 AND T1.a = 5 ; It takes 4 seconds on my laptop, uncommenting commented lines causes run forever. analyzing table or removing index reduces execution time to milliseconds regardless on commented or uncommented lines. The commit commit 9c7f5229ad68d7e0e4dd149e3f80257893e404d4 Author: Tom Lane <t...@sss.pgh.pa.us> Date: Fri Apr 7 22:20:03 2017 -0400 Optimize joins when the inner relation can be proven unique. seems a root this problem - before it the query takes milliseconds. In attachment there is a output of explain analyze with commented lines, my attention was attracted by a huge number of loops: -> Materialize (cost=0.00..1.40 rows=1 width=8) (actual time=0.000..0.001 rows=17 loops=1048576) -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ Timing is on. DROP TABLE Time: 5,268 ms SELECT 32 Time: 5,515 ms CREATE INDEX Time: 14,971 ms QUERY PLAN --- Nested Loop Left Join (cost=0.00..8.55 rows=1 width=48) (actual time=818.219..4159.317 rows=1 loops=1) Join Filter: (t1.b = t2.a) Rows Removed by Join Filter: 31 -> Seq Scan on t t1 (cost=0.00..1.48 rows=1 width=8) (actual time=0.026..0.032 rows=1 loops=1) Filter: ((c = 0) AND (a = 5)) Rows Removed by Filter: 31 -> Nested Loop Left Join (cost=0.00..7.06 rows=1 width=40) (actual time=10.588..4159.270 rows=32 loops=1) Join Filter: (t2.b = t3.a) Rows Removed by Join Filter: 993 -> Seq Scan on t t2 (cost=0.00..1.40 rows=1 width=8) (actual time=0.008..0.020 rows=32 loops=1) Filter: (c = 0) -> Nested Loop Left Join (cost=0.00..5.65 rows=1 width=32) (actual time=0.142..129.970 rows=32 loops=32) Join Filter: (t3.b = t4.a) Rows Removed by Join Filter: 993 -> Seq Scan on t t3 (cost=0.00..1.40 rows=1 width=8) (actual time=0.002..0.010 rows=32 loops=32) Filter: (c = 0) -> Nested Loop Left Join (cost=0.00..4.23 rows=1 width=24) (actual time=0.007..4.055 rows=32 loops=1024) Join Filter: (t4.b = t5.a) Rows Removed by Join Filter: 993 -> Seq Scan on t t4 (cost=0.00..1.40 rows=1 width=8) (actual time=0.002..0.010 rows=32 loops=1024) Filter: (c = 0) -> Nested Loop Left Join (cost=0.00..2.82 rows=1 width=16) (actual time=0.003..0.121 rows=32 loops=32768) Join Filter: (t5.b = t6.a) Rows Removed by Join Filter: 528 -> Seq Scan on t t5 (cost=0.00..1.40 rows=1 width=8) (actual time=0.002..0.009 rows=32 loops=32768) Filter: (c = 0) -> Materialize (cost=0.00..1.40 rows=1 width=8) (actual time=0.000..0.001 rows=17 loops=1048576) -> Seq Scan on t t6 (cost=0.00..1.40 rows=1 width=8) (actual time=0.008..0.031 rows=32 loops=1) Filter: (c = 0) Planning time: 3.316 ms Execution time: 4159.596 ms (31 rows) Time: 4165,372 ms (00:04,165) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] convert EXSITS to inner join gotcha and bug
Really, the way to fix Teodor's complaint is to recognize that the semijoin inner rel is effectively unique against the whole outer rel, and then strength-reduce the semijoin to a plain join. The infrastructure we built for unique joins is capable of proving that, we just weren't applying it in the right way. Perfect, it works. Thank you! Second patch reduces time of full query (my example was just a small part) from 20 minutes to 20 seconds. I'm kind of strongly tempted to apply the second patch; but it would be fair to complain that reduce_unique_semijoins() is new development and should wait for v11. Opinions? Obviously, I'm voting for second patch applied to version 10. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] convert EXSITS to inner join gotcha and bug
Ah, thanks for the clue about enable_hashjoin, because it wasn't reproducing for me as stated. I missed tweaked config, sorry -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] convert EXSITS to inner join gotcha and bug
Both 9.6 and 10devel are affected to addiction of query result on seqscan variable. Oops, I was too nervious, 9.6 is not affected to enable_seqscan setting. But it doesn't push down condition too. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] convert EXSITS to inner join gotcha and bug
Hi! Seems, there two issues: 1) Sometime conditions which for a first glance could be pushed down to scan are leaved as join quals. And it could be a ~30 times performance loss. 2) Number of query result depend on enabe_seqscan variable. The query explain analyze SELECT * FROM t1 INNER JOIN t2 ON ( EXISTS ( SELECT true FROM t3 WHERE t3.id1 = t1.id AND t3.id2 = t2.id ) ) WHERE t1.name = '5c5fec6a41b8809972870abc154b3ecd' ; produces following plan: Nested Loop (cost=6.42..1928.71 rows=1 width=99) (actual time=71.415..148.922 rows=162 loops=1) Join Filter: (t3.id1 = t1.id) Rows Removed by Join Filter: 70368 -> Index Only Scan using t1i2 on t1 (cost=0.28..8.30 rows=1 width=66) (actual time=0.100..0.103 rows=1 loops=1) Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text) Heap Fetches: 1 -> Hash Join (cost=6.14..1918.37 rows=163 width=66) (actual time=0.370..120.971 rows=70530 loops=1) (1) Hash Cond: (t3.id2 = t2.id) (2) -> Seq Scan on t3 (cost=0.00..1576.30 rows=70530 width=66) (actual time=0.017..27.424 rows=70530 loops=1) -> Hash (cost=3.84..3.84 rows=184 width=33) (actual time=0.273..0.273 rows=184 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 20kB -> Seq Scan on t2 (cost=0.00..3.84 rows=184 width=33) (actual time=0.017..0.105 rows=184 loops=1) Planning time: 7.326 ms Execution time: 149.115 ms Condition (1) is not pushed to scan (2) which seemsly could be safely moved. With seqscan = off condition is not pushed too but query returns only one row instead of 162. Scan on t3 returns ~7 rows but only ~150 rows are really needed. I didn't found a combination of GUCs enable_* to push down that and it seems to me there is reason for that which I don't see or support is somehow missed. If pair of (t3.id1, t3.id2) is unique (see dump, there is a unique index on them) the query could be directly rewrited to inner join and its plan is: Nested Loop (cost=9.70..299.96 rows=25 width=66) (actual time=0.376..5.232 rows=162 loops=1) -> Nested Loop (cost=9.43..292.77 rows=25 width=99) (actual time=0.316..0.645 rows=162 loops=1) -> Index Only Scan using t1i2 on t1 (cost=0.28..8.30 rows=1 width=66) (actual time=0.047..0.050 rows=1 loops=1) Index Cond: (name = '5c5fec6a41b8809972870abc154b3ecd'::text) Heap Fetches: 1 -> Bitmap Heap Scan on t3 (cost=9.15..283.53 rows=94 width=66) (actual time=0.257..0.426 rows=162 loops=1) Recheck Cond: (id1 = t1.id) Heap Blocks: exact=3 -> Bitmap Index Scan on t3i1 (cost=0.00..9.12 rows=94 width=0) (actual time=0.186..0.186 rows=162 loops=1) Index Cond: (id1 = t1.id) -> Index Only Scan using t2i1 on t2 (cost=0.27..0.29 rows=1 width=33) (actual time=0.024..0.024 rows=1 loops=162) Index Cond: (id = t3.id2) Heap Fetches: 162 Planning time: 5.532 ms Execution time: 5.457 ms Second plan is ~30 times faster. But with turned off sequentual scan the first query is not work correctly, which points to some bug in planner, I suppose. Both 9.6 and 10devel are affected to addiction of query result on seqscan variable. Dump to reproduce (subset of real data but obfucated), queries are in attachment http://sigaev.ru/misc/exists_to_nested.sql.gz -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ --query returns 162 rows explain analyze SELECT * FROM t1 INNER JOIN t2 ON ( EXISTS ( SELECT true FROM t3 WHERE t3.id1 = t1.id AND t3.id2 = t2.id ) ) WHERE t1.name = '5c5fec6a41b8809972870abc154b3ecd' ; set enable_seqscan=off; --the same query returns only one row! explain analyze SELECT * FROM t1 INNER JOIN t2 ON ( EXISTS ( SELECT true FROM t3 WHERE t3.id1 = t1.id AND t3.id2 = t2.id ) ) WHERE t1.name = '5c5fec6a41b8809972870abc154b3ecd' ; explain analyze SELECT t1.* FROM t1, t2, t3 WHERE t1.name = '5c5fec6a41b8809972870abc154b3ecd' AND t3.id1 = t1.
Re: [HACKERS] WIP: Covering + unique indexes.
I had a look on patch and played with it, seems, it looks fine. I splitted it to two patches: core changes (+bloom index fix) and btree itself. All docs are left in first patch - I'm too lazy to rewrite documentation which is changed in second patch. Any objection from reviewers to push both patches? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ 0001-covering-core.patch Description: binary/octet-stream 0002-covering-btree.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Covering + unique indexes.
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P + IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE I think your syntax would read no worse, possibly even better, if you just used the existing INCLUDING keyword. It was a discussion in this thread about naming and both databases, which support covering indexes, use INCLUDE keyword. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SortSupport for macaddr type
Thank you, pushed Excellent! I've attached a new (and hopefully final) version of the patch. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] ALTER DEFAULT PRIVILEGES with GRANT/REVOKE ON SCHEMAS
Thank you, pushed Matheus de Oliveira wrote: On Thu, Mar 2, 2017 at 10:27 AM, David Steele <da...@pgmasters.net <mailto:da...@pgmasters.net>> wrote: It looks like this patch is still waiting on an update for tab completion in psql. Hi All, Sorry about the long delay... It was so simple to add it to tab-complete.c that is a shame I didn't do it before, very sorry about that. Attached the new version of the patch that is basically the same as previously with the addition to tab completion for psql and rebased with master. Hope it is enough. Thank you all. -- Matheus de Oliveira -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential data loss of 2PC files
Thank you, pushed Michael Paquier wrote: On Fri, Mar 24, 2017 at 11:36 PM, Teodor Sigaev <teo...@sigaev.ru> wrote: And the renaming of pg_clog to pg_xact is also my fault. Attached is an updated patch. Thank you. One more question: what about symlinks? If DBA moves, for example, pg_xact to another dist and leaves the symlink in data directoty. Suppose, fsync on symlink will do nothing actually. I did not think of this case, but is that really common? There is even no option to configure that at command level. And surely we cannot support any fancy scenario that people figure out using PGDATA. Existing callers of fsync_fname don't bother about this case as well by the way, take the calls related to pg_logical and pg_repslot. If something should be done in this area, that would be surely in fsync_fname directly to centralize all the calls, and I would think of that as a separate patch, and a separate discussion. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential data loss of 2PC files
Thank you. One more question: what about symlinks? If DBA moves, for example, pg_xact to another dist and leaves the symlink in data directoty. Suppose, fsync on symlink will do nothing actually. I did not think of this case, but is that really common? There is even I saw a lot such cases. If something should be done in this area, that would be surely in fsync_fname directly to centralize all the calls, and I would think of that as a separate patch, and a separate discussion. Agree -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] Problem in using pgbench's --connect(-C) and --rate=rate(-R rate) options together.
No, it is really needed so that the lag measure is correct. Thank you, pushed -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions
Sorry, 1) and 4) is my fault, comment in hsearch.h: * ... The hash key * is expected to be at the start of the caller's hash entry data structure. Ops, forgot that. Teodor Sigaev wrote: things in order I'm attaching the previous patch as well. Patches look good, but I have some notices: 1 step1 Why do you need TabStatHashEntry at all? TabStatHashEntry.t_id is never used for read, so entry for hash could be just a pointer to PgStat_TableStatus. 2 step1 In pgstat_report_stat() you remove one by one entries from hash and remove them all. Isn't it better to hash_destroy/hash_create or even let hash lives in separate memory context and just resets it? 3 step1 Again, pgstat_report_stat(), all-zero entries aren't deleted from hash although they will be free from point of view of pgStatTabList. 4 step 2. The same as 1) about SeenRelsEntry->rel_id but it even isn't initialized anywhere. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions
things in order I'm attaching the previous patch as well. Patches look good, but I have some notices: 1 step1 Why do you need TabStatHashEntry at all? TabStatHashEntry.t_id is never used for read, so entry for hash could be just a pointer to PgStat_TableStatus. 2 step1 In pgstat_report_stat() you remove one by one entries from hash and remove them all. Isn't it better to hash_destroy/hash_create or even let hash lives in separate memory context and just resets it? 3 step1 Again, pgstat_report_stat(), all-zero entries aren't deleted from hash although they will be free from point of view of pgStatTabList. 4 step 2. The same as 1) about SeenRelsEntry->rel_id but it even isn't initialized anywhere. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential data loss of 2PC files
And the renaming of pg_clog to pg_xact is also my fault. Attached is an updated patch. Thank you. One more question: what about symlinks? If DBA moves, for example, pg_xact to another dist and leaves the symlink in data directoty. Suppose, fsync on symlink will do nothing actually. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backend crash on non-exclusive backup cancel
Thank you all, pushed Michael Paquier wrote: On Fri, Mar 24, 2017 at 1:45 AM, Teodor Sigaev <teo...@sigaev.ru> wrote: I believe patch looks good and it's ready to commit. Thanks for the review! As I understand, it fixes bug introduced by commit 7117685461af50f50c03f43e6a622284c8d54694 Date: Tue Apr 5 20:03:49 2016 +0200 Implement backup API functions for non-exclusive backups Indeed. And, suppose, it should be backpatched to 9.6? Yes, that's where non-exclusive backups have been introduced. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] Problem in using pgbench's --connect(-C) and --rate=rate(-R rate) options together.
Hi, the patch looks good except why do you remove initialization of is_throttled? Suppose, just a typo? --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -1967,7 +1967,6 @@ top: st->listen = false; st->sleeping = false; st->throttling = false; - st->is_throttled = false; memset(st->prepared, 0, sizeof(st->prepared)); } -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential data loss of 2PC files
Hmm, it doesn't work (but appplies) on current HEAD: % uname -a FreeBSD *** 11.0-RELEASE-p8 FreeBSD 11.0-RELEASE-p8 #0 r315651: Tue Mar 21 02:44:23 MSK 2017 teodor@***:/usr/obj/usr/src/sys/XOR amd64 % pg_config --configure '--enable-depend' '--enable-cassert' '--enable-debug' '--enable-tap-tests' '--with-perl' 'CC=clang' 'CFLAGS=-O0' % rm -rf /spool/pg_data/* % initdb -n -E UTF8 --locale ru_RU.UTF-8 --lc-messages C --lc-monetary C --lc-numeric C --lc-time C -D /spool/pg_data Running in no-clean mode. Mistakes will not be cleaned up. The files belonging to this database system will be owned by user "teodor". This user must also own the server process. The database cluster will be initialized with locales COLLATE: ru_RU.UTF-8 CTYPE:ru_RU.UTF-8 MESSAGES: C MONETARY: C NUMERIC: C TIME: C The default text search configuration will be set to "russian". Data page checksums are disabled. fixing permissions on existing directory /spool/pg_data ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... 2017-03-23 23:07:14.705 MSK [40286] FATAL: could not open file "pg_clog": No such file or directory child process exited with exit code 1 initdb: data directory "/spool/pg_data" not removed at user's request -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backend crash on non-exclusive backup cancel
Hi! I believe patch looks good and it's ready to commit. As I understand, it fixes bug introduced by commit 7117685461af50f50c03f43e6a622284c8d54694 Date: Tue Apr 5 20:03:49 2016 +0200 Implement backup API functions for non-exclusive backups And, suppose, it should be backpatched to 9.6? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree
Thank you, pushed Andrew Borodin wrote: 2017-03-22 22:48 GMT+05:00 Teodor Sigaev <teo...@sigaev.ru>: hasEmptyChild? and hasNonEmptyChild (BTW, isAnyNonempy has missed 't') Yes, I think this naming is good. It's clear what's in common in these flags and what's different. And if the whole posting tree is empty,then we could mark root page as leaf and remove all other pages in tree without any locking. Although, it could be a task for separate patch. From the performance point of view, this is a very good idea. Both, performance of VACUUM and performance of Scans. But doing so we risk to leave some garbage pages in case of a crash. And I do not see how to avoid these without unlinking pages one by one. I agree, that leaving this trick for a separate patch is quite reasonable. Best regards, Andrey Borodin. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree
No, second conditional code will be called for any subtree, which contains totally empty subtree. That check !isRoot covers case when the entire posting tree should be erased: we cannot just quit out of recursive cleanup, we have to make a scan here, starting from root. Oh, I see Probably, variable isChildHasVoid has a bit confusing name. This flag indicates that some subtree: 1. Had empty pages 2. Did not bother deleting them, because there is a chance that it is a part of a bigger empty subtree. May be it'd be better to call the variable "someChildIsVoidSubtree". hasEmptyChild? and hasNonEmptyChild (BTW, isAnyNonempy has missed 't') And if the whole posting tree is empty,then we could mark root page as leaf and remove all other pages in tree without any locking. Although, it could be a task for separate patch. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential data loss of 2PC files
If that can happen, don't we have the same problem in many other places? Like, all the SLRUs? They don't fsync the directory either. Right, pg_commit_ts and pg_clog enter in this category. Implemented as attached. Is unlink() guaranteed to be durable, without fsyncing the directory? If not, then we need to fsync() the directory even if there are no files in it at the moment, because some might've been removed earlier in the checkpoint cycle. What is protection if pg crashes after unlimk() but before fsync()? Right, it's rather small window for such scenario, but isn't better to have another protection? Like WAL-logging of WAL segment removing... -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree
Thank you for your suggestions, do not hesitate to ask any questions, concurrency and GIN both are very interesting topics. I had a look on patch and found some issue. Look at ginvacuum.c around line 387, function ginVacuumPostingTreeLeaves(): /* * All subtree is empty - just return TRUE to indicate that parent must * do a cleanup. Unless we are ROOT an there is way to go upper. */ if(isChildHasVoid && !isAnyNonempy && !isRoot) return TRUE; if(isChildHasVoid) { ... ginScanToDelete(gvs, blkno, TRUE, , InvalidOffsetNumber); } In first 'if' clause I see !isRoot, so second if and corresponding ginScanToDelete() could be called only for root in posting tree. If so, it seems to me, it wasn't a good idea to move ginScanToDelete() from ginVacuumPostingTree() and patch should just remove lock for cleanup over ginVacuumPostingTreeLeaves() and if it returns that tree contains empty page then lock the whole posting tree to do ginScanToDelete() work. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops
Thank you, pushed. I just make test table permanent. Anastasia Lubennikova wrote: The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:not tested As I can see, this bugfix was already discussed and reviewed. All mentioned issues are fixed in the latest version of the patch So I mark it "Ready for committer". Thank you for fixing it! The new status of this patch is: Ready for Committer -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PinBuffer() no longer makes use of strategy
if (buf->usage_count < BM_MAX_USAGE_COUNT) if (BUF_STATE_GET_USAGECOUNT(buf_state) != BM_MAX_USAGE_COUNT) being prone to paranoia, I prefer the first, but I've seen both styles in the code so I don't know if it's worth futzing with. Ok, let's be paranoic and do this same way as before. Revised patch is attached. I see the change was done in 9.6 release cycle in commit 48354581a49c30f5757c203415aa8412d85b0f70 at April, 10. Does it mean the fix should be backpatched too? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Phrase search distance syntax
Sorry to be asking another phrase search syntax question, and so close to final release, but ... Really close... Why does the phrase distance operator assume <1> means adjacent words, and not <0>. (FYI, <-> is the same as <1>.) Because 1 it is a result of subtruction of word's positions 2 <0> could be used as special case like a word with two infinitives: # create text search dictionary xx (template = 'ispell', DictFile='ispell_sample', AffFile='ispell_sample'); # alter text search configuration english ALTER MAPPING FOR asciiword WITH xx, english_stem; # select to_tsvector('english', 'bookings'); to_tsvector -- 'book':1 'booking':1 # select to_tsvector('english', 'bookings') @@ 'book <0> booking'; ?column? -- t (1 row) -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
Please find attached a patch which makes it possible to disallow UPDATEs and DELETEs which lack a WHERE clause. As this changes query behavior, I've made the new GUCs PGC_SUSET. What say? DELETE FROM tbl WHERE true; ? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] One process per session lack of sharing
On 12 July 2016 at 09:57, <amatv...@bitec.ru <mailto:amatv...@bitec.ru>> wrote: We have faced with some lack of sharing resources. So in our test memory usage per session: Oracle: about 5M MSSqlServer: about 4M postgreSql: about 160М Using shared resources also has significant problems, so care must be taken. To see memory allocation by opinion of pgsql it's possible to use https://github.com/postgrespro/memstat This module (9.6+) could collect information about MemoryContexts allocation. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BUG #14245: Segfault on weird to_tsquery
The above-described topic is currently a PostgreSQL 9.6 open item. Teodor, I'm working on it now and believe that fix will be published today. since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a 9.6 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within 72 hours of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping 9.6rc1. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
That didn't cover all the places that needed to be fixed, but I have re-read the docs and believe I've made things good now. I have reviewed this thread and verified that all the cases raised in it now work as desired, so I have marked the open item closed. Thank you very much! -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Precedence of new phrase search tsquery operator
Attached patch changes a precedences of operations to |, &, <->, | in ascending order. BTW, it simplifies a bit a code around printing and parsing of tsquery. |, &, <->, ! of course -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Precedence of new phrase search tsquery operator
On Wed, Jun 8, 2016 at 7:13 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: It appears that the new <-> operator has been made to have exactly the same grammatical precedence as the existing & (AND) operator. Thus, for example, 'a & b <-> c'::tsquery means something different from 'b <-> c & a'::tsquery: I find this surprising. My intuitive feeling is that <-> ought to bind tighter than & (and therefore also tighter than |). What's the reasoning for making it act like this? ah, now we remember :) The idea about equivalence of & and <-> operators appeared in situation when <-> degenerates to & in case of absence of positional information. Looks like we mixed different things, will fix. Attached patch changes a precedences of operations to |, &, <->, | in ascending order. BTW, it simplifies a bit a code around printing and parsing of tsquery. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ phrase_predecence-2.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
We're really quickly running out of time to get this done before beta2. Please don't commit anything that's going to break the tree because we only have about 72 hours before the wrap, but if it's correct then it should go in. Isn't late now? Or wait to beta2 is out? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
Such algorithm finds closest pair of (Lpos, Rpos) but satisfying pair could be not closest, example: to_tsvector('simple', '1 2 1 2') @@ '1 <3> 2'; Oh ... the indexes in the lists don't have much to do with the distances, do they. OK, maybe it's not quite as easy as I was thinking. I'm okay with the patch as presented. Huh, I found that my isn't correct for example which I show :(. Reworked patch is in attach. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ phrase_exact_distance-2.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
Tom Lane wrote: Teodor Sigaev <teo...@sigaev.ru> writes: So I think there's a reasonable case for decreeing that should only match lexemes *exactly* N apart. If we did that, we would no longer have the misbehavior that Jean-Pierre is complaining about, and we'd not need to argue about whether <0> needs to be treated specially. Agree, seems that's easy to change. ... Patch is attached Hmm, couldn't the loop logic be simplified a great deal if this is the definition? Or are you leaving it like that with the idea that we might later introduce another operator with the less-than-or-equal behavior? Do you suggest something like merge join of two sorted lists? ie: while(Rpos < Rdata.pos + Rdata.npos && Lpos < Ldata.pos + Ldata.npos) { if (*Lpos > *Rpos) Rpos++; else if (*Lpos < *Rpos) { if (*Rpos - *Lpos == distance) match! Lpos++; } else { if (distance == 0) match! Lpos++; Rpos++; } } Such algorithm finds closest pair of (Lpos, Rpos) but satisfying pair could be not closest, example: to_tsvector('simple', '1 2 1 2') @@ '1 <3> 2'; -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
We need to reach a consensus here, since there is no way to say "I don't know". I inclined to agree with you, that returning false is better in such a case.That will indicate user to the source of problem. Here is a patch, now phrase operation returns false if there is not postion information. If this behavior looks more reasonable, I'll commit that. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ phrase_no_fallback.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
what's about word with several infinitives select to_tsvector('en', 'leavings'); to_tsvector 'leave':1 'leavings':1 (1 row) select to_tsvector('en', 'leavings') @@ 'leave <0> leavings'::tsquery; ?column? -- t (1 row) Second example is not correct: select phraseto_tsquery('en', 'leavings') will produce 'leave | leavings' and select phraseto_tsquery('en', 'leavings cats') will produce 'leave <-> cat | leavings <-> cat' which seems correct and we don't need special threating of <0>. This brings up something else that I am not very sold on: to wit, do we really want the "less than or equal" distance behavior at all? The documentation gives the example that phraseto_tsquery('cat ate some rats') produces ( 'cat' <-> 'ate' ) <2> 'rat' because "some" is a stopword. However, that pattern will also match "cat ate rats", which seems surprising and unexpected to me; certainly it would surprise a user who did not realize that "some" is a stopword. So I think there's a reasonable case for decreeing that should only match lexemes *exactly* N apart. If we did that, we would no longer have the misbehavior that Jean-Pierre is complaining about, and we'd not need to argue about whether <0> needs to be treated specially. Agree, seems that's easy to change. I thought that I saw an issue with hyphenated word but, fortunately, I forget that hyphenated words don't share a position: # select to_tsvector('foo-bar'); to_tsvector - 'bar':3 'foo':2 'foo-bar':1 # select phraseto_tsquery('foo-bar'); phraseto_tsquery --- ( 'foo-bar' <-> 'foo' ) <-> 'bar' and # select to_tsvector('foo-bar') @@ phraseto_tsquery('foo-bar'); ?column? -- t Patch is attached -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ phrase_exact_distance.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 9.6 open item is long past due for your status update. Please reacquaint yourself with the policy on open item ownership[1] and then reply immediately. If I do not hear from you by 2016-06-16 07:00 UTC, I will transfer this item to release management team ownership without further notice. [1] http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com I'm working on it right now. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COMMENT ON, psql and access methods
As far as I can see, COMMENT ON has no support for access methods. Wouldn't we want to add it as it is created by a command? On top of that, perhaps we could have a backslash command in psql to list the supported access methods, like \dam[S]? The system methods would be in this case all the in-core ones. +1. Are there other opinions? That's not a 9.6 blocker IMO, so I could get patches out for 10.0 if needed. I missed that on review process, but I'm agree it should be implemented. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "Allow usage of huge maintenance_work_mem for GIN build" patch
--- a/src/include/access/gin_private.h +++ b/src/include/access/gin_private.h @@ -903,7 +903,7 @@ typedef struct GinEntryAccumulator typedef struct { GinState *ginstate; - longallocatedMemory; + SizeallocatedMemory; GinEntryAccumulator *entryallocator; uint32 eas_used; RBTree *tree; Are you sure this is safe, Teodor? I don't have time to study the patch in detail, but offhand I think that it might have been better to make allocatedMemory of type int64, just like the tuplesort.c memory accounting variables are post-MaxAllocHuge. It's not obvious to me that this variable isn't allowed to occasionally become negative, just like in tuplesort.c. It looks like that *might* be true -- ginbulk.c may let allocatedMemory go negative for a period, which would now be broken. It could not be negative - subtruction is doing only around repalloc call, in all other places it only grows. If you did make this exact error, you would not be the first. If it isn't actually broken, perhaps you should still make this change, simply on general principle. I'd like to hear other opinions on that, though. It could be an int64 without any regressions or problems. I choose Size type because variable allocatedMemory actually stores a size of piece of memory and it's the same type as returned by GetMemoryChunkSpace() AFAIK, size_t type (type Size is a just typedef alias) is a part of C99 and it should be unsigned and large enough to store size of any chunk of avaliable RAM. And it at least twice larger than allowed all memory-related GUC variables. I don't see a reason to use int64 except, may be, general practice in pgsql. But then why we keep Size type - let us just use int64. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PROPOSAL] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind
This all should me moved behind "access method" abstraction... +1 relopt_kind should be moved in am, at least. Or removed. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] add missing "USING bloom" into bloom extension documentation
you should add "USING bloom" to the insert statement, in order make this example work. Patch in the attachment fixes an example. Please commit it ;-) Thank you, applied -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding an alternate syntax for Phrase Search
to_tsquery(' Berkus & "PostgreSQL Version 10.0" ') ... would be equivalent to: to_tsquery(' Berkus & ( PostgreSQL <-> version <-> 10.0 )') select to_tsquery('Berkus') && phraseto_tsquery('PostgreSQL Version 10.0'); does it as you wish I realize we're already in beta, but pgCon was actually the first time I saw the new syntax. I think if we don't do this now, we'll be doing it for 10.0. Havn't an objections for 10.0 -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Allocate all page images at once in generic wal interface
Allocate all page images at once in generic wal interface That reduces number of allocation. Per gripe from Michael Paquier and Tom Lane suggestion. Branch -- master Details --- http://git.postgresql.org/pg/commitdiff/7c979c95a3700d0bd34c2831f49a9260d505b0f9 Modified Files -- src/backend/access/transam/generic_xlog.c | 19 +-- 1 file changed, 9 insertions(+), 10 deletions(-) Seems, this patch isn't liked by curculio [1] buildfarm member, but I'm confused with diagnostics: 2016-05-17 21:43:19.489 CEST [573b7457.547c:3] LOG: statement: CREATE EXTENSION bloom; 2016-05-17 21:43:19.501 CEST [573b7457.547c:4] ERROR: syntax error in file "/home/pgbf/buildroot/HEAD/inst/share/postgresql/extension/bloom.control" line 1, near token "" Could somebody explain me what's going on? Thank you [1] http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=curculio=2016-05-17+19%3A30%3A09 -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Correctly align page's images in generic wal API
Instead of allocating this memory unconditionally for each buffer, wouldn't it be better to set all the page pointers to NULL in GenericXLogStart and allocate memory only once a buffer is registered in GenericXLogRegisterBuffer when finding a free slot? This patch is wasting many cycles. GenericXLogRegisterBuffer() could be called in another MemoryContext what can be a reason for strange bugs. Right now only a few pages could be involved in one round of GenericWal. I don't believe that such allocation could be a reason of noticable performance degradation. Although I didn't check that. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
Allow Pin/UnpinBuffer to operate in a lockfree manner. I get the errors: ERROR: attempted to delete invisible tuple ERROR: unexpected chunk number 1 (expected 2) for toast value Just reminder, you investigate "unexpected chunk number" problem, but, seems, we have another bug (first ERROR: attempted to delete invisible tuple). IMHO, it's a separate bug, not related to oid. Unfortunately, I've never seen such error on my notebook. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] atomic pin/unpin causing errors
Hm. And you're not seeing the asserts I reported in http://archives.postgresql.org/message-id/20160505185246.2i7qftadwhzewykj%40alap3.anarazel.de ? I see it a lot, but I think that is a result of ereport(FATAL) after FileWrite(BLCKSZ/3) added by Jeff. Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] atomic pin/unpin causing errors
quot;teodor") at postgres.c:4122 #36 0x00839744 in BackendRun (port=0x801d571c0) at postmaster.c:4258 #37 0x00838d54 in BackendStartup (port=0x801d571c0) at postmaster.c:3932 #38 0x00835617 in ServerLoop () at postmaster.c:1690 #39 0x00832c69 in PostmasterMain (argc=4, argv=0x7fffe420) at postmaster.c:1298 #40 0x0075f228 in main (argc=4, argv=0x7fffe420) at main.c:228 -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] atomic pin/unpin causing errors
Any chance you could package up that data directory for me to download? Sent by personal email to Alexander, Andres and Jeff In /var/log/message I found May 4 22:04:07 xor kernel: pid 14010 (postgres), uid 1001: exited on signal 6 (core dumped) May 4 22:04:25 xor kernel: pid 14032 (postgres), uid 1001: exited on signal 11 (core dumped) May 4 22:04:52 xor kernel: pid 14037 (postgres), uid 1001: exited on signal 6 (core dumped) Sometimes postgres is crashed with SIGSEGV signal instead of SIGABRT (which comes form abort() in assert) I'll try to get a coredump after SIGSEGV, but it could take a time. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] atomic pin/unpin causing errors
I get the errors: ERROR: attempted to delete invisible tuple STATEMENT: update foo set count=count+1,text_array=$1 where text_array @> $2 And also: ERROR: unexpected chunk number 1 (expected 2) for toast value 85223889 in pg_toast_16424 STATEMENT: update foo set count=count+1 where text_array @> $1 Hm. I appear to have trouble reproducing this issue (continuing to try) on master as of 8826d8507. Is there any chance you could package up a data directory after the issue hit? I've got ERROR: unexpected chunk number 0 (expected 1) for toast value 10192986 in pg_toast_16424 The test required 10 hours to run on my notebook. postgresql was compiled with -O0 --enable-debug --enable-cassert. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tsvector filter problem
Thank you, pushed. Stas Kelvich wrote: Hi. As discovered by Oleg Bartunov, current filter() function for tsvector can crash backend. Bug was caused erroneous usage of char type in memmove argument. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] amroutine->amsupport from numeric to defined constants
I think this number should be specified only once, in .h file. Yep, that sounds true. It may be a good idea to make something similar for contrib/bloom if > 0 values are defined for amstrategies or amsupport for consistency. Thank you, pushed. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
The more I think about it, the more I think gin is just an innocent bystander, for which I just happen to have a particularly demanding test. I think something about snapshots and wrap-around may be broken. After 10 hours of running I've got 1587 XX000 2016-04-28 05:57:09.964 MSK:ERROR: unexpected chunk number 0 (expected 1) for toast value 10192986 in pg_toast_16424 1587 XX000 2016-04-28 05:57:09.964 MSK:CONTEXT: automatic analyze of table "teodor.public.foo" 1587 0 2016-04-28 05:57:09.974 MSK:LOG: JJ transaction ID wrap limit is 2147484514, limited by database with OID 16384 I've pushed v5 of gin-alone-cleanup patch, many thanks to all participants. Will work on backpatch -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Phrase search ported to 9.6
There are some previously unnoticed errors in docs (master branch), this patch fixes them. Thank you, pushed -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
Check my reasoning: In version 4 I added a remebering of tail of pending list into blknoFinish variable. And when we read page which was a tail on cleanup start then we sets cleanupFinish variable and after cleaning that page we will stop further cleanup. Any insert caused during cleanup will be placed after blknoFinish (corner case: in that page), so, vacuum should not miss tuples marked as deleted. Yes, I agree with the correctness of v4. But I do wonder if we should use that early stopping for vacuum and gin_clean_pending_list, rather Interesting, I've missed this possible option than just using it for user backends. While I think correctness allows it to stop early, since these routines are explicitly about cleaning things up it seems like they should volunteer to clean the whole thing. I believe that autovacuum should not require guaranteed full clean up, only vacuum and gin_clean_pending_list() should do that. In all other cases it should stop early to prevent possible infinite cleanup. Patch attached. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ gin_alone_cleanup-5.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
Added, see attached patch (based on v3.1) With this applied, I am getting a couple errors I have not seen before after extensive crash recovery testing: ERROR: attempted to delete invisible tuple ERROR: unexpected chunk number 1 (expected 2) for toast value 100338365 in pg_toast_16425 Huh, seems, it's not related to GIN at all... Indexes don't play with toast machinery. The single place where this error can occur is a heap_delete() - deleting already deleted tuple. I've restarted the test harness with intentional crashes turned off, to see if the problems are related to crash recovery or are more generic than that. I've never seen these particular problems before, so don't have much insight into what might be going on or how to debug it. Check my reasoning: In version 4 I added a remebering of tail of pending list into blknoFinish variable. And when we read page which was a tail on cleanup start then we sets cleanupFinish variable and after cleaning that page we will stop further cleanup. Any insert caused during cleanup will be placed after blknoFinish (corner case: in that page), so, vacuum should not miss tuples marked as deleted. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
Alvaro's recommendation, to let the cleaner off the hook once it passes the page which was the tail page at the time it started, would prevent any process from getting pinned down indefinitely, but would Added, see attached patch (based on v3.1) If there is no objections I will aplly it at mondeay and backpatch all supported versions. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
There are only 3 fundamental options I see, the cleaner can wait, "help", or move on. "Helping" is what it does now and is dangerous. Moving on gives the above-discussed unthrottling problem. Waiting has two problems. The act of waiting will cause autovacuums to be canceled, unless ugly hacks are deployed to prevent that. If we deploy those ugly hacks, then we have the problem that a user backend will end up waiting on an autovacuum to finish the cleaning, and the autovacuum is taking its sweet time due to autovacuum_vacuum_cost_delay. (The "helping" model avoids this problem because the user backend can just catch up with and pass the io-throttled autovac process) With pending cleanup patch backend will try to get lock on metapage with ConditionalLockPage. Will it interrupt autovacum worker? For completeness sake, a fourth option would to move on, but only after inserting the tuple directly into the main index structure (rather then the pending list) like would be done with fastupdate off, once the pending list is already oversized. This is my favorite, but there is no chance of it going into 9.6, much less being backpatched. Agree, will reimplement for 9.7 Alvaro's recommendation, to let the cleaner off the hook once it passes the page which was the tail page at the time it started, would prevent any process from getting pinned down indefinitely, but would not prevent the size of the list from increasing without bound. I think that would probably be good enough, because the current throttling behavior is purely accidentally and doesn't *guarantee* a limit on the size of the pending list. Added, see attached patch (based on v3.1) -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ gin_alone_cleanup-4.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
This restricts the memory used by ordinary backends when doing the cleanup to be work_mem. Shouldn't we let them use maintenance_work_mem? Only one backend can be doing this clean up of a given index at any given time, so we don't need to worry about many concurrent allocations of maintenance_work_mem. This seems very similar in spirit to index creation, where a backend is allowed to use maintenance_work_mem. Because it could be a several indexes in one pg instance. And each cleaner could eat maintenance_work_mem. Also, do we plan on backpatching this? While there are no known Yes -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Some other things about contrib/bloom and generic_xlog.c
I agree with both of these ideas. Patch is attached. Hmm, you accidentally forget to change flag type of GenericXLogRegister in contrib/bloom signature. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Some other things about contrib/bloom and generic_xlog.c
mainline XLOG replay. Shouldn't generic_xlog.c take the same trouble? Yes, it should. Alexander now works on it. 2. Unless I'm missing something, contrib/bloom is making no effort to ensure that bloom index pages can be distinguished from other pages by pg_filedump. That's fine if your expectation is that bloom will always be a toy with no use in production; but otherwise, not so much. You need to make sure that the last two bytes of the page special space contain a uniquely identifiable code; compare spgist_page_id in SPGiST indexes. Now contrib/bloom is a canonical example how to implement yourown index and how to use generic WAL interface. And it is an useful toy: in some cases it could help in production system, patch attached. I'm a little dissatisfied with patch because I had to add unused field to BloomPageOpaqueData, in opposite case size of BloomPageOpaqueData struct equals 6 bytes but special size is MAXALIGNED, so, last two bytes will be unused. If unused field is not a problem, I will push this patch. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: PL/Pythonu - function ereport
Thank you very much thank you, pushed. Pls, pay attention to buildfarm. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Covering + unique indexes.
On Wed, Apr 6, 2016 at 1:50 PM, Peter Geoghegan <p...@heroku.com> wrote: Personally, I like documenting assertions, and will sometimes write assertions that the compiler could easily optimize away. Maybe going *that* far is more a matter of personal style, but I think an assertion about the new index tuple size being <= the old one is just a good idea. It's not about a problem in your code at all. You should make index_truncate_tuple()/index_reform_tuple() promise to always do this in its comments/contract with caller as part of this, IMV. Some notices: - index_truncate_tuple(Relation idxrel, IndexTuple olditup, int indnatts, int indnkeyatts) Why we need indnatts/indnkeyatts? They are presented in idxrel struct already - follow code where index_truncate_tuple() is called, it should never called in case where indnatts == indnkeyatts. So, indnkeyatts should be strictly less than indnatts, pls, change assertion. If they are equal the this function becomes complicated variant of CopyIndexTuple() -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
I've tested the v2, v3 and v3.1 of the patch, to see if there are any differences. The v2 no longer applies, so I tested it on ee943004. The following table shows the total duration of the data load, and also sizes of the two GIN indexes. duration (sec) subject body --- v2 1290 23 MB 684 MB v3 1360 24 MB 487 MB v3.11360 24 MB 488 MB Thank you very much. Hmm. v3 is a just rebased version of v2, v3.1 hasn't unlock/lock cycle during cleanup, just to become similar to Jeff's pending lock patch. In theory, v2 and v3 should be very close, v3.1 should be close to pending_lock. I'm inclining to push v3.1 as one of two winners by size/performance and, unlike to pending lock patch, it doesn't change an internal logic of lock machinery. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position
Thank you, pushed with Petr's notice Dmitry Dolgov wrote: On 6 April 2016 at 03:29, Andrew Dunstan <and...@dunslane.net <mailto:and...@dunslane.net>> wrote: Yeah, keeping it but rejecting update of an existing key is my preference too. cheers andrew Yes, it sounds quite reasonable. Here is a new version of patch (it will throw an error for an existing key). Is it better now? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position
I've been asked to look at and comment on the SQL API of the feature. I think it's basically sound, although there is one thing that's not clear from the regression tests: what happens if we're inserting into an object and the key already exists? e.g.: select jsonb_insert('{"a": {"b": "value"}}', '{a, b}', '"new_value"'); I think this should be forbidden, i.e. the function shouldn't ever overwrite an existing value. If that's not handled it should be, and either way there should be a regression test for it. I'm agree about covering this case by tests, but I think it should be allowed. In this case it will work exactly as jsbonb_set -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Covering + unique indexes.
It seems to me that the patch is completed. Except, maybe, grammar check of comments and documentation. Looking forward to your review. Are there any objectins on it? I'm planning to look closely today or tommorrow and commit it. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN data corruption bug(s) in 9.6devel
The above-described topic is currently a PostgreSQL 9.6 open item. Teodor, since you committed the patch believed to have created it, you own this open item. If that responsibility lies elsewhere, please let us know whose responsibility it is to fix this. Since new open items may be discovered at any time and I want to plan to have them all fixed well in advance of the ship date, I will appreciate your efforts toward speedy resolution. Please present, within 72 hours, a plan to fix the defect within seven days of this message. Thanks. I'm waiting of Tomas testing. Suppose, it isn't possible now? so, will do myself, after that I will publish results. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Phrase search ported to 9.6
there was a character that was very similar to dots I would suggest that. The closest is * I think, so what do you think of "***"? And join opertator for tsqueries is the same : select 'fat'::tsquery *** 'cat'; ? Single '*' ? That's close to regex, any number of tokens. And it saves rules about duplicating character. select 'fat'::tsquery ** 'cat'; select 'fat * cat'::tsquery; select 'fat * [3] cat'::tsqyery; -- for non-default distance. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Phrase search ported to 9.6
Well, I noticed that the docs talk about an operator that can be used in SQL (outside the tsquery parser), as well as an operator that can be Just to join 2 tsquery with operator FOLLOWED BY used inside tsquery. Inside tsquery anything would be usable, but outside that it would be good to keep in mind the rest of SQL operators; and <> means "different from", so using it for FOLLOWED BY seems odd to me. previous rule was: duplicate operation character to join tsqueries. Ellipsis is forbidden to use as operator name: # create operator ... (procedure = 'int2and'); ERROR: syntax error at or near ".." LINE 1: create operator ... (procedure = 'int2and'); What is your suggestion for joining operator name? My suggestion of ... (ellipsis) is because that's already known as related to text, used for omissions of words, where the surrounding words form a phrase. It seems much more natural to me. Yes, agree for omission. But for reason above its'n a good name although I don't have a strong objections. may be <=>? it isn't used anywhere yet. select 'fat'::tsquery <=> 'cat'; select 'fat <=> cat'::tsquery; select 'fat <3> cat'::tsqyery; -- for non-default distance. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Phrase search ported to 9.6
Looking at patch, I'm inlined to commit it in current commitfest, it looks workable and too many people desire it. I've did some changes, mostly in formatting and comments. Patch makes some user-visible changes: 1 change order for tsquery. Assume, it's acceptable, only a few users store tsquery in table and have a Btree index. They will need to reindex. 2 less number of parenthesis in tsquery output, and tsquery becomes more readable. Pls, remove tsquery_setweight from patch because it's not a part of phrase search and it isn't mentioned in docs. Please, make a separate patch for it. Dmitry Ivanov wrote: Sorry for the delay, I desperately needed some time to finish a bunch of dangling tasks. I've added some new comments and clarified the ones that were obscure. Moreover, I felt an urge to recheck most parts of the code since apparently nobody (besides myself) has gone so far yet. On 25.03.16 18:42 MSK, David Steele wrote: Time is short and it's not encouraging that you say there is "still much work to be done". Indeed, I was inaccurate. I am more than interested in the positive outcome. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ phrase-0.4.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
Thank you, pushed Emre Hasegeli wrote: I'll try to explain with two-dimensional example over points. ASCII-art: Thank you for the explanation. Should we incorporate this with the patch. added I have worked on the comments of the patch. It is attached. I hope it looks more clear than it was before. + cmp_double(const double a, const double b) Does this function necessary? We can just compare the doubles. Yes, it compares with limited precision as it does by geometry operations. Renamed to point actual arguments. I meant that we could use FP macros directly instead of this function. Doing so is also more correct. I haven't tried to produce the problem, but this function is not same as using the macros directly. For differences smaller that the epsilon, it can return different results. I have removed it on the attached version. + boxPointerToRangeBox(BOX *box, RangeBox * rectangle) The "rectangle" variable in here should be renamed. fixed I found a bunch of those too and renamed for clarity. I have also reordered the arguments of the helper functions to keep them consistent. evalRangeBox() is used to initialize fields of RangeBox in evalRectBox(). If evalRangeBox() will palloc its result then we need to copy its result into RangeBox struct and free. Let both fucntion use the same interface. I found it nicer to copy and edit the existing value than creating it in two steps like this. It is also attached. it works until allthesame branch contains only one inner node. Otherwise traversalValue will be freeed several times, see spgWalk(). It just works, when traversalValues is not set. It is also attached. I have also added the missing regression tests. While doing that I noticed that some operators are missing and also added support for them. It is also attached with the updated version of my test script. I hope I haven't changed the patch too much. I have also pushed the changes here: https://github.com/hasegeli/postgres/commits/box-spgist -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
as we discussed recently [1] you should avoid leaving "holes" with uninitialized data in structures. Please fix this or provide a comment that describes why things are done here the way they are done. [1] http://www.postgresql.org/message-id/56eff347.20...@anastigmatix.net That discussion is about SQL-level types which could be stored on disk, not about in-memory structs -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
GenericXLogStart(Relation relation) { ... if (genericXlogStatus != GXLOG_NOT_STARTED) ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("GenericXLogStart: generic xlog is already started"))); Hmm, seems, generic wal whiil be in incorrect state if exception occurs between GenericXLogStart() and GenericXLogFinish() calls because static variable genericXlogStatus will contain GXLOG_LOGGED/GXLOG_UNLOGGED status. Suppose, it could be solved by different ways - remove all static variable, so, GenericXLogStart() will return an struct (object) which incapsulated all data needed to generic wal work. As I can see, in case of exception there isn't ane needing to extra cleanup. Also, it would allow to use generic wal for two or more relations at the same time, although I don't know any useful example for such feature. - add callback via RegisterResourceReleaseCallback() which will cleanup state of genericXlogStatus variable -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
I incorporated your changes and did some additional refinements on top of them still. Attached is delta against v12, that should cause less issues when merging for Teodor. But last version is 13th? BTW, it would be cool to add odcs in VII. Internals chapter, description should be similar to index's interface description, i.e. how to use generic WAL interface. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
And here it is. It's not perfect but it's better (I am not native speaker either). It's same as v12, just changed comments somewhat. Thank you, but I have a problems with applying: % patch -p1 < ~/Downloads/0002-generic-xlog.13.patch Hmm... Looks like a new-style context diff to me... ... |diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c |new file mode 100644 |index ...7ca03bf |*** a/src/backend/access/transam/generic_xlog.c |--- b/src/backend/access/transam/generic_xlog.c -- (Creating file src/backend/access/transam/generic_xlog.c...) Patching file src/backend/access/transam/generic_xlog.c using Plan A... patch: malformed patch at line 634: diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
Does that mean this patch should be closed or is there more remaining to commit? Petr promises to check english in comments/docs in generic-wal patch at this week, bloom patch depends on it. Bloom patch is ready from my point of view. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes for next week's releases
Does include ICU mean that collation handling is identical across platforms? E.g. a query on Linux involving string comparison would yield the same result on MacOS and Windows? Yes, it does and that's the most important issue for us. Yes, exactly. Attached patch adds support for libicu with configure flag --with-icu. Patch rebased to current HEAD, hope, it works. It's based on https://people.freebsd.org/~girgen/postgresql-icu/readme.html work, and it was migrated to 9.5 with abbrevation keys support. Patch in current state is not ready to commit, of course. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ libicu-8.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
+ * boxtype_spgist.c The names on the file header need to be changed, too. Oops. fixed I'll try to explain with two-dimensional example over points. ASCII-art: Thank you for the explanation. Should we incorporate this with the patch. added + cmp_double(const double a, const double b) Does this function necessary? We can just compare the doubles. Yes, it compares with limited precision as it does by geometry operations. Renamed to point actual arguments. + boxPointerToRangeBox(BOX *box, RangeBox * rectangle) The "rectangle" variable in here should be renamed. fixed + Assert(is_infinite(b) == 0); This is failing on my test. You can reproduce with the script I have sent. I didn't know: # select '(1,inf)'::point; point - (1,inf) fixed Wouldn't it be simpler to palloc and return the value on those functions? evalRangeBox() initializes part of RectBox, so, it could not palloc its result. Other methods use the same signature just for consistency. I couldn't understand it. evalRangeBox() can palloc and return the result. evalRectBox() can return the result palloc'ed by evalRangeBox(). The caller wouldn't need to palloc anything. evalRangeBox() is used to initialize fields of RangeBox in evalRectBox(). If evalRangeBox() will palloc its result then we need to copy its result into RangeBox struct and free. Let both fucntion use the same interface. I went through all other variables: + int r = is_infinite(a); + double x = *(double *) a; + double y = *(double *) b; + spgInnerConsistentIn *in = (spgInnerConsistentIn *) PG_GETARG_POINTER(0); + spgLeafConsistentIn *in = (spgLeafConsistentIn *) PG_GETARG_POINTER(0); + BOX*leafBox = DatumGetBoxP(in->leafDatum); Shouldn't they be "const", too? They could. But it doesn't required. To be in consistent state I've removed const modifier where possible +/* + * Begin. This block evaluates the median of coordinates of boxes + */ I would rather explain what the function does on the function header. fixed The "end" part of it is still there. Oops again, fixed Do we really need to copy the traversalValues on allTheSame case. Wouldn't it work if just the same value is passed for all of them. The search shouldn't continue after allTheSame case. Seems, yes. spgist tree could contain a locng branches with allTheSame. Does SP-GiST allows any node under allTheSame to not being allTheSame? No, SP-GiST doesn't allow that Not setting traversalValues for allTheSame worked fine with my test. it works until allthesame branch contains only one inner node. Otherwise traversalValue will be freeed several times, see spgWalk(). # select i as id, '1,2,3,4'::box as b into x from generate_series(1,100) i; # create index ix on i using spgist (b); # select count(*) from x where b && '1,2,3,4'::box; -- coredump gdb: #0 0x00080143564a in thr_kill () from /lib/libc.so.7 #1 0x000801435636 in raise () from /lib/libc.so.7 #2 0x0008014355b9 in abort () from /lib/libc.so.7 #3 0x00a80739 in in_error_recursion_trouble () at elog.c:199 #4 0x00abb748 in pfree (pointer=0x801e90868) at mcxt.c:1016 #5 0x0053330c in freeScanStackEntry (so=0x801e8d358, stackEntry=0x801e935d8) at spgscan.c:47 #6 0x00532cdb in spgWalk (index=0x801f1c588, so=0x801e8d358, scanWholeIndex=1 '\001', storeRes=0x532d10 ... + if (in->allTheSame) Most of the things happening before this check is not used in there. Shouldn't we move this to the top of the function? yep, fixed + out->nodeNumbers = (int *) palloc(sizeof(int) * in->nNodes); This could go before allTheSame block. yep, fixed I've attached all patches again. Thank you very much! -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ q4d-4.patch.gz Description: GNU Zip compressed data traversalValue-2.patch.gz Description: GNU Zip compressed data range-1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
Isn't this basically the same thing that the cube contrib module does? (Which has the added benefit of kNN-capable operators). No, cube module introduces new type - N-dimensional box. And adds an index support for it. Current patch suggests non-traditional indexing technique for 2D boxes by treating them as point in 4D space. With such representation it's became possible to use quad-tree technique which: 1 supports only points to index 2 already supported by SP-GiST Such technique provides some benefits in comparance with traditional R-Tree which implemented in pg with a help GiST. See Oleg's message. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
tend to think of a *point* as having *zero* dimensions. Would it perhaps be more accurate to say we are treating a 2-dimensional box as a point in 4-dimensional space? Exactly, sorry for ambiguity. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
nterToRectangle( +DatumGetBoxP(in->scankeys[i].sk_argument), +p_query_rect +); I don't think this matches the project coding style. fixed + int flag = 1, Wouldn't it be better as "bool"? fixed. The regression tests for the remaining operators are still missing. Ooops, will be soon. Attached all patches to simplify review. Thank you very much! -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ I'll try to explain with two-dimensional example over points. ASCII-art: | | 1|2 | ---+- |P 3|4 | | '+' with 'A' represents a centroid or, other words, point which splits plane for 4 quadrants and it strorend in parent node. 1,2,3,4 are labels of quadrants, each labeling will be the same for all pictures and all centriods, and i will not show them in pictures later to prevent too complicated images. Let we add C - child node (and again, it will split plane for 4 quads): | | +-+--- X |B|C | | ---+-+--- |P|A | | | | A and B are points of intersection of lines. So, box PBCAis a bounding box for points contained in 3-rd (see labeling above). For example X labeled point is not a descendace of child node with centroid C because it must be in branch of 1-st quad of parent node. So, each node (except root) will have a limitation in its quadrant. To transfer that limitation the traversalValue is used. traversalValue-2.patch.gz Description: GNU Zip compressed data q4d-3.patch.gz Description: GNU Zip compressed data range-1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
I don't see any objection, is consensus reached? I'm close to comiit that... I did only cursory review on the bloom contrib module and don't really have complaints there, but I know you can review that one. I'd like the English of the generic_xlog.c description improved but I won't get to it before weekend. So, per patch status: create-am ready generic-xlog need to fix comments/docs bloom-contrib need review. I'm an original author of this module and of course I can do some review of it but, seems, it's needed more eyes. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
I also wonder whether the patch should add explanation of OR-clauses handling into the READMEs in src/backend/access/* Not yet, but will The patch would probably benefit from transforming it into a patch series - one patch for the infrastructure shared by all the indexes, then one patch per index type. That should make it easier to review, and I seriously doubt we'd want to commit this in one huge chunk anyway. I splitted to two: 1 0001-idx_or_core - only planner and executor changes 2 0002-idx_or_indexes - BRIN/GIN/GiST changes with tests I don't think that splitting of second patch adds readability but increase management diffculties, but if your insist I will split. 4) scanGetItem is a prime example of the "badly needs comments" issue, particularly because the previous version of the function actually had quite a lot of them while the new function has none. added -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ 0001-idx_or_core-v4.patch.gz Description: GNU Zip compressed data 0002-idx_or_indexes-v4.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
I also wonder whether the patch should add explanation of OR-clauses handling into the READMEs in src/backend/access/* Oops, will add shortly. The patch would probably benefit from transforming it into a patch series - one patch for the infrastructure shared by all the indexes, then one patch per index type. That should make it easier to review, and I seriously doubt we'd want to commit this in one huge chunk anyway. Ok, will do it. 1) fields in BrinOpaque are not following the naming convention (all the existing fields start with bo_) fixed 2) there's plenty of places violating the usual code style (e.g. for single-command if branches) - not a big deal for WIP patch, but needs to get fixed eventually hope, fixed 3) I wonder whether we really need both SK_OR and SK_AND, considering they are mutually exclusive. Why not to assume SK_AND by default, and only use SK_OR? If we really need them, perhaps an assert making sure they are not set at the same time would be appropriate. In short: possible ambiguity and increasing stack machine complexity. Let we have follow expression in reversed polish notation (letters represent a condtion, | - OR, & - AND logical operation, ANDs are omitted): a b c | Is it ((a & b)| c) or (a & (b | c)) ? Also, using both SK_ makes code more readable. 4) scanGetItem is a prime example of the "badly needs comments" issue, particularly because the previous version of the function actually had quite a lot of them while the new function has none. Will add soon 5) scanGetItem() may end up using uninitialized 'cmp' - it only gets initialized when (!leftFinished && !rightFinished), but then gets used when either part of the condition evaluates to true. Probably should be if (!leftFinished || !rightFinished) cmp = ... fixed 6) the code in nodeIndexscan.c should not include call to abort() { abort(); elog(ERROR, "unsupported indexqual type: %d", (int) nodeTag(clause)); } fixed, just forgot to remove 7) I find it rather ugly that the paths are built by converting BitmapOr paths. Firstly, it means indexes without amgetbitmap can't benefit from this change. Maybe that's reasonable limitation, though? I based on following thoughts: 1 code which tries to find OR-index path will be very similar to existing generate_or_bitmap code. Obviously, it should not be duplicated. 2 all existsing indexes have amgetbitmap method, only a few don't. amgetbitmap interface is simpler. Anyway, I can add an option for generate_or_bitmap to use any index, but, in current state it will just repeat all work. But more importantly, this design already has a bunch of unintended consequences. For example, the current code completely ignores enable_indexscan setting, because it merely copies the costs from the bitmap path. I'd like to add separate enable_indexorscan That's pretty dubious, I guess. So this code probably needs to be made aware of enable_indexscan - right now it entirely ignores startup_cost in convert_bitmap_path_to_index_clause(). But of course if there are multiple IndexPaths, the enable_indexscan=off will be included multiple times. 9) This already breaks estimation for some reason. Consider this ... So the OR-clause is estimated to match 0 rows, less than each of the clauses independently. Needless to say that without the patch this works just fine. fixed 10) Also, this already breaks some regression tests, apparently because it changes how 'width' is computed. fixed too So I think this way of building the index path from a BitmapOr path is pretty much a dead-end. I don't think so because separate code path to support OR-clause in index will significanlty duplicate BitmapOr generator. Will send next version as soon as possible. Thank you for your attention! -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ index_or-3.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Access method extendability
Good catch, thanks! Tests were added. I don't see any objection, is consensus reached? I'm close to comiit that... -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] we have added support for box type in SP-GiST index
Do reconstructedValues is required now? Wouldn't it be cleaner to use the new field on the prefix tree implementation, too? reconstructedValues is needed to reconctruct full indexed value and should match to type info indexed column We haven't had specific memory context for reconstructedValues. Why is it required for the new field? because pg knows type of reconstructedValues (see above why) but traversalValue just a void*, SP-GiST core doesn't knnow how to free memory of allocated for it. + Memory for traversalValues should be allocated in + the default context, but make sure to switch to + traversalMemoryContext before allocating memory + for pointers themselves. This sentence is not understandable. Shouldn't it say "the memory should *not* be allocated in the default context"? What are the "pointers themselves"? Clarified, I hope The box opclass is broken because of the floating point arithmetics of the box type. You can reproduce it with the attached script. We hope, fixed. At least, seems, syncronized with seqscan really need to fix the geometric types, before building more on them. They fail to serve the only purpose they are useful for, demonstrating features. agree, but it's not a deal for this patch It think the opclass should support the cross data type operators like box @> point. Ideally we should do it by using multiple opclasses in an opfamily. The floating problem will bite us again in here, because some of the operators are not using FP macros. Again, agree. But I suggest to do it by separate patch. The tests check not supported operator "@". It should be "<@". It is nice that the opclass doesn't support long deprecated operators. fixed + #define LT -1 + #define GT 1 + #define EQ 0 Using these numbers is a very well established pattern. I don't think macros make the code any more readable. fixed + /* SP-GiST API functions */ + Datum spg_box_quad_config(PG_FUNCTION_ARGS); + Datum spg_box_quad_choose(PG_FUNCTION_ARGS); + Datum spg_box_quad_picksplit(PG_FUNCTION_ARGS); + Datum spg_box_quad_inner_consistent(PG_FUNCTION_ARGS); + Datum spg_box_quad_leaf_consistent(PG_FUNCTION_ARGS); I guess they should go to the header file. Remove them because they are presented in auto-generated file ./src/include/utils/builtins.h range patch is unchanged, but still attached to keep them altogether. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ q4d-2.patch.gz Description: GNU Zip compressed data traversalValue-2.patch.gz Description: GNU Zip compressed data range-1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: function parse_ident
I hope so the messages are ok now. Few more regress tests added. Thank you, committed. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: function parse_ident
I afraid so I cannot to fix this inconsistency (if this is inconsistency - the binary values are same) - the parameter of function is raw string with processed escape codes, and I have not any information about original escape sequences. When you enter octet value, and I show it as hex value, then there should be difference. Buy I have not information about your input (octet or hex). I have the original string of SQL identifier inside parser, executor, but I have not original string of function parameter inside function (not without pretty complex and long code). Ok, agree I am trying describe it in doc (I am sorry for my less level English) in new patch. Fixed duplicated oid too. Edited a bit + fix some typos and remove unneeded headers, patch attached Sorry, I can't find all corner-cases at once, but: SELECT parse_ident(E'"c".X XX'); ERROR: identifier contains disallowed characters: "\"c" Error message wrongly points to the reason of error. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ parse_ident-13.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tsvector editing functions
I saw errors on windows, here is the fix: Thank you, pushed -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tsvector editing functions
Thanks! Fixed and added tests. Thank you! I did some patch cleanup/fix, but I have some doubt with function's names: 1 to_tsvector: # \df to_tsvector List of functions Schema |Name | Result data type | Argument data types | Type +-+--+-+ pg_catalog | to_tsvector | tsvector | regconfig, text | normal pg_catalog | to_tsvector | tsvector | text| normal pg_catalog | to_tsvector | tsvector | text[] | normal First two variants of to_tsvector make a morphological processing, last one doesn't. 2 to_array # \df *to_array List of functions Schema | Name | Result data type | Argument data types | Type +---+--+-+ pg_catalog | regexp_split_to_array | text[] | text, text | normal pg_catalog | regexp_split_to_array | text[] | text, text, text| normal pg_catalog | string_to_array | text[] | text, text | normal pg_catalog | string_to_array | text[] | text, text, text| normal pg_catalog | to_array | text[] | tsvector| normal Seems, to_array is not a right name compared to other *to_array. I would like to suggest rename both functions to array_to_tsvector and tsvector_to_array to have consistent name. Later we could add to_tsvector([regconfig, ], text[]) with morphological processing. Thoughts? -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 4b5ee81..ed0b6be 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -9211,16 +9211,29 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple setweight - setweight(tsvector, "char") + setweight(vector tsvector, weight "char") tsvector -assign weight to each element of tsvector +assign weight to each element of vector setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A') 'cat':3A 'fat':2A,4A 'rat':5A + setweight + setweight by filter + + setweight(vector tsvector, weight "char", lexemes "text"[]) + +tsvector +assign weight to elements of vector that are listed in lexemes array +setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}') +'cat':3A 'fat':2,4 'rat':5A + + + + strip strip(tsvector) @@ -9233,6 +9246,81 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple + delete + delete lemexeme + + delete(vector tsvector, lexeme text) + +tsvector +remove given lexeme from vector +delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat') +'cat':3 'rat':5A + + + + + delete + delete lemexemes array + + delete(vector tsvector, lexemes text[]) + +tsvector +remove any occurrence of lexemes in lexemes array from vector +delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat']) +'cat':3 + + + + + unnest + + unnest(tsvector, OUT lexeme text, OUT positions smallint[], OUT weights text) + +setof record +expand a tsvector to a set of rows +unnest('fat:2,4 cat:3 rat:5A'::tsvector) +(cat,{3},{D}) ... + + + + + to_array + + to_array(tsvector) + +text[] +convert tsvector to array of lexemes +to_array('fat:2,4 cat:3 rat:5A'::tsvector) +{cat,fat,rat} + + + + + to_tsvector + array to tsvector + + to_tsvector(text[]) + +tsvector +convert array of lexemes to tsvector +to_tsvector('{fat,cat,rat}'::text[]) +'fat' 'cat' 'rat' + + + + + filter + + filter(vector tsvector, weights "char"[]) + +tsvector +Select only elements with given weights from vector +filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}') +'cat':3B 'rat':5A + + + + to_tsquery to_tsquery( config regconfig , query text) diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgm
Re: [HACKERS] proposal: function parse_ident
select parse_ident(E'X\rXX'); I am sending updated patch - I used json function for correct escaping - the escaping behave is same. Hmm, it doesn't look so: % select parse_ident(E'_\005'); ERROR: identifier contains disallowed characters: "_\u0005" % select parse_ident(E'\005'); ERROR: missing identifier: "\u0005" but # select parse_ident(E'"\005"'); parse_ident - {\x05} Error messages above point wrong character wrongly. One more inconsistence: # select parse_ident(E'"\005"') as "\005"; \005 {\x05} Display outputs of actual identifier and parse_indent are differ. Actually, I can live with both but are any other opinions? Seems, at least difference of actual identifier and output of parse_indent should be pointed in docs. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers