Re: Shared detoast Datum proposal
Hi, Robert, thank you very much for your response to this thread. I agree with most of the things you've mentioned, but this proposal is a PoC and in any case has a long way to go before being presented (if it ever is) as something to be committed. Andy, glad you've not lost interest in this work, I'm looking forward to your improvements! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Shared detoast Datum proposal
Hi! Here's a slightly improved version of the patch Tomas provided above (v2), with cache invalidations and slice caching added, still as a PoC. The main issue I've encountered during tests is that when the same query retrieves both slices and the full value - slices meaning, for example, substring(t,...) - the order of the values is not predictable: with text fields the substrings are retrieved before the full value, so we cannot benefit from caching the full value first and serving the slices from the cached value. Still, the cache code is very compact and affects only the sources related to detoasting. Tomas, about HASH_ENTER - according to the comments it could throw an OOM error, so I've changed it to HASH_ENTER_NULL to avoid new errors. In this case we would just leave the value uncached, without an error. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ 0001-toast-cache-v3.patch Description: Binary data
Re: Shared detoast Datum proposal
Hi! Tomas, I thought about this issue - >What if you only need the first slice? In that case decoding everything >will be a clear regression. And I completely agree with you; I'm currently working on it and will post a patch a bit later. Another issue we have here - if multiple slices are requested, then we have to update the cached slice from both sides, the beginning and the end. On update, yep, you're right >Surely the update creates a new TOAST value, with a completely new TOAST >pointer, new rows in the TOAST table etc. It's not updated in-place. So >it'd end up with two independent entries in the TOAST cache. >Or are you interested just in evicting the old value simply to free the >memory, because we're not going to need that (now invisible) value? That >seems interesting, but if it's too invasive or complex I'd just leave it >up to the LRU eviction (at least for v1). Again, yes, we do not need the old value after it has been updated, and it is better to take care of it explicitly. It's a simple and non-invasive addition to your patch. I could not agree with you about large values - it makes sense to cache them because the larger the value is, the more benefit we get from not detoasting it again and again. One way is to keep them compressed, but what if we have data with a very low compression rate? I also changed the HASHACTION field value from HASH_ENTER to HASH_ENTER_NULL to deal gracefully with the case when we do not have enough memory, and added checks for whether the value was stored or not. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
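For illustration, here is a rough sketch of what I mean by that change (the type and function names below are placeholders I made up, not the actual code from the patch):

typedef struct ToastCacheEntry		/* illustrative layout only */
{
	char		key[2 * sizeof(Oid)];	/* whatever key the cache uses */
	struct varlena *value;				/* cached detoasted copy */
} ToastCacheEntry;

static struct varlena *
toast_cache_lookup_or_add(HTAB *toast_cache, const void *key,
						  struct varlena *detoasted)
{
	bool		found;
	ToastCacheEntry *entry;

	/*
	 * HASH_ENTER would ereport an out-of-memory error here; HASH_ENTER_NULL
	 * returns NULL instead, so on allocation failure we just hand back the
	 * freshly detoasted value without caching it.
	 */
	entry = (ToastCacheEntry *) hash_search(toast_cache, key,
											HASH_ENTER_NULL, &found);
	if (entry == NULL)
		return detoasted;		/* not cached, but no error raised */

	if (!found)
		entry->value = detoasted;	/* new entry: remember the value */

	return entry->value;
}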
Re: Shared detoast Datum proposal
Hi, In addition to the previous message - for toast_fetch_datum_slice the first (seemingly obvious) way is to detoast the whole value, save it to the cache and serve slices from it on demand. I have another one in mind, but have to play with it first. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
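As a sketch of that first way (illustration only, and assuming the cached copy is already fully detoasted and uncompressed), serving a slice then becomes a plain copy out of the cached value:

static struct varlena *
slice_from_cached_value(struct varlena *cached, int32 sliceoffset,
						int32 slicelength)
{
	int32		total = VARSIZE_ANY_EXHDR(cached);
	struct varlena *result;

	if (sliceoffset > total)
		sliceoffset = total;
	if (slicelength < 0 || sliceoffset + slicelength > total)
		slicelength = total - sliceoffset;

	result = (struct varlena *) palloc(VARHDRSZ + slicelength);
	SET_VARSIZE(result, VARHDRSZ + slicelength);
	memcpy(VARDATA(result), VARDATA_ANY(cached) + sliceoffset, slicelength);
	return result;
}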
Re: Shared detoast Datum proposal
"default" algorithm if we don't have a very > > good idea of the exact life span of the values, etc. Which is why other > > nodes (e.g. Memoize) use LRU too. > > > But I wonder if there's a way to count how many times an attribute is > > accessed (or is likely to be). That might be used to inform a better > > eviction strategy. > > Yes, but in the current issue we can get a better estimation with the help > of the plan shape, and Memoize depends on some planner information as well. > If we bypass the planner information and try to resolve it at the > cache level, the code may become too complex as well, and all the cost is > run-time overhead while the other way is a planning-time overhead. > > > Also, we don't need to evict the whole entry - we could evict just the > > data part (guaranteed to be fairly large), but keep the header, and keep > > the counts, expected number of hits, and other info. And use this to > > e.g. release entries that reached the expected number of hits. But I'd > > probably start with LRU and only do this as an improvement later. > > A great lesson learnt here, thanks for sharing this! > > As for the current use case, what I want to highlight is that in the current > use case we are "caching" "user data" "locally". > > USER DATA indicates it might be very huge; it is not common to have > 1M tables, but it is much more common to have 1M tuples in one scan, so > keeping the header might mean extra memory usage as well, like 10M * 24 bytes > = 240MB. LOCALLY means it is not friendly to multiple active sessions. CACHE > indicates it is hard to evict correctly. My method also has the USER > DATA and LOCALLY attributes, but it would be better at eviction; eviction > then leads to the memory usage issue which is described at the beginning > of this writing. > > >>> Also, this leads me to the need of having some sort of memory limit. If > >>> we may be keeping entries for extended periods of time, and we don't > >>> have any way to limit the amount of memory, that does not seem great. > >>> > >>> AFAIK if we detoast everything into tts_values[] there's no way to > >>> implement and enforce such a limit. What happens if there's a row with > >>> multiple large-ish TOAST values? What happens if those rows are in > >>> different (and distant) parts of the plan? > >> > >> I think this can be done by tracking the memory usage at the EState level > >> or global variable level, disabling it if it exceeds the limits and > >> resuming it when we free the detoast datum when we don't need it. I think > >> no other changes need to be done. > >> > > > > That seems like a fair amount of additional complexity. And what if the > > toasted values are accessed in a context without an EState (I haven't checked > > how common / important that is)? > > > > And I'm not sure about just disabling the detoast as a way to enforce a memory > > limit, as explained earlier. > > > >>> It seems far easier to limit the memory with the toast cache. > >> > >> I think the memory limit and entry eviction are the key issue now. IMO, > >> there are still some differences even when both methods can support a memory > >> limit. The reason is my patch can guarantee the cached memory will be > >> reused, so if we set the limit to 10MB, we know all the 10MB is > >> useful, but the TOAST cache method probably can't guarantee that; then, > >> when we want to make it effective, we have to set a higher limit for > >> this. > >> > > > > Can it actually guarantee that?
It can guarantee the slot may be used, > > but I don't see how it could guarantee the detoasted value will be used. > > We may be keeping the slot for other reasons. And even if it could > > guarantee the detoasted value will be used, does that actually prove > > it's better to keep that value? What if it's only used once, but it's > > blocking detoasting of values that will be used 10x that? > > > > If we detoast a 10MB value in the outer side of the Nest Loop, what if > > the inner path has multiple accesses to another 10MB value that now > > can't be detoasted (as a shared value)? > > Guarantee may be the wrong word. The differences in my mind are: > 1. plan shape has better potential to know the use case of a datum, > since we know the plan tree and the rows passed to a given node. > 2. Planning-time effort is cheaper than run-time effort. > 3. eviction in my method is not as important as it is in the TOAST cache > method since it is reset per slot, so usually it doesn't hit the limit in > fact. But as a cache, it does. > 4. memory is used up to the limit we set in the TOAST cache case. > > >>> In any case, my concern is more about having to do this when creating > >>> the plan at all, the code complexity etc. Not just because it might have > >>> a performance impact. > >> > >> I think the main trade-off is that the TOAST cache method is pretty non-invasive > >> but can't control the eviction well; the impacts include: > >> 1. it may evict the datum we want and keep the datum we don't need. > > > > This applies to any eviction algorithm, not just LRU. Ultimately what > > matters is whether we have in the cache the most often used values, i.e. > > the hit ratio (perhaps in combination with how expensive detoasting that > > particular entry was). > > Correct, just that I am doubtful about designing a LOCAL CACHE for USER > DATA, for the reasons I described above. > > At last, thanks for your attention, really appreciated! > > -- > Best Regards > Andy Fan > > > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Shared detoast Datum proposal
Hi, Andy! Sorry for the delay, I have had long flights this week. I've reviewed the patch set, thank you for your efforts. I have several notes about the patch set code, but first of all I'm not sure the overall approach is the best for the task. As Tomas wrote above, the approach is very invasive and spreads code related to detoasting among many parts of the code. Have you considered another one - altering pg_detoast_datum (actually, it would be the detoast_attr function) and saving detoasted datums in a detoast context derived from the query context? We have just enough information at this step to identify the datum - the toast relation id and value id - and could keep links to these detoasted values in, say, a linked list or hash table. Thus we would avoid altering the executor code, and all detoast-related code would reside within the detoast source files. I'll check this approach in the next several days and report the results here. There are also comments on the code itself; I'll write them up a bit later. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
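To show what I mean, here is a rough sketch (illustration only, not a patch; the cache type and helper names are made up) - the lookup key is right there in the TOAST pointer:

#include "postgres.h"
#include "varatt.h"
#include "access/detoast.h"
#include "utils/hsearch.h"

typedef struct DetoastCacheKey
{
	Oid			toastrelid;		/* TOAST relation the value is stored in */
	Oid			valueid;		/* va_valueid identifying the value */
} DetoastCacheKey;

typedef struct DetoastCacheEntry
{
	DetoastCacheKey key;
	struct varlena *value;		/* detoasted copy, kept in the detoast context */
} DetoastCacheEntry;

/* Look up / fill the cache instead of detoasting the same datum repeatedly. */
static struct varlena *
detoast_attr_cached(HTAB *detoast_cache, struct varlena *attr)
{
	struct varatt_external toast_pointer;
	DetoastCacheKey key;
	DetoastCacheEntry *entry;
	bool		found;

	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
		return detoast_attr(attr);	/* nothing worth caching */

	VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
	key.toastrelid = toast_pointer.va_toastrelid;
	key.valueid = toast_pointer.va_valueid;

	entry = (DetoastCacheEntry *) hash_search(detoast_cache, &key,
											  HASH_ENTER, &found);
	if (!found)
		entry->value = detoast_attr(attr);	/* detoast once, remember the copy */

	return entry->value;
}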
Re: Shared detoast Datum proposal
Hi, Tomas, we already have a working jsonb partial detoast prototype, and currently I'm porting it to the recent master. Due to the size of the changes and their very invasive nature this takes a lot of effort, but it is already done. I'm also trying to make the core patch less invasive. Actually, it is a subject for a separate topic; as soon as I make it work on the current master we'll propose it to the community. Andy, thank you! I'll check the last patch set out and reply in a day or two. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Shared detoast Datum proposal
Hi! I see this as a very promising initiative, but some issues come to mind. When we store and detoast large values, say 1Gb - that's a very likely scenario, we have such cases from prod systems - we would end up using a lot of shared memory to keep these values alive, only to discard them later. Also, toasted values are not always used immediately and as a whole, i.e. jsonb values are fully detoasted (we're working on this right now) just to extract the smallest value from a big json, and such values are not worth keeping in memory. For text values too, we often do not need the whole value to be detoasted and kept in memory. What do you think? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! I agree with Heikki on most topics, especially his recommendation to publish your extension on GitHub - this tool is very promising for highly loaded systems, and you will get a lot of feedback very soon. I'm curious about the SpinLock - it is recommended for very short operations, taking up to several instructions; the docs say that for longer ones it will be too expensive and recommend using an LWLock. Why have you chosen a SpinLock? Does it have some benefit here? Thank you! PS: the process_query_desc function has a trace_context argument that is neither used nor passed anywhere. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! It's a good idea to split a big patch into several smaller ones. But you've already implemented these features - you could provide them as sequential small patches (i.e. v13-0002-guc-context-propagation.patch and so on). Great job! I'm all for committing your patch set. On Fri, Jan 26, 2024 at 1:17 PM Anthonin Bonnefoy < anthonin.bonne...@datadoghq.com> wrote: > Hi, > > The last CFbot failure was on pg_upgrade/002_pg_upgrade and doesn't > seem related to the patch. > > # Failed test 'regression tests pass' > # at C:/cirrus/src/bin/pg_upgrade/t/002_pg_upgrade.pl line 212. > # got: '256' > # expected: '0' > # Looks like you failed 1 test of 18. > > Given that the patch grew a bit in complexity and features, I've tried > to reduce it to a minimum to make reviewing easier. The goal is to > keep the scope focused for a first version while additional features > and changes could be handled afterwards. > > For this version: > - Trace context can be propagated through SQLCommenter > - Sampling decision uses either a global sampling rate or on > SQLCommenter's sampled flag > - We generate spans for Planner, ExecutorRun and ExecutorFinish > > What was put on hold for now: > - Generate spans from planstate. This means we don't need the change > in instrument.h to track the start of a node for the first version. > - Generate spans for parallel processes > - Using a GUC variable to propagate trace context > - Filtering sampling on queryId > > With this, the current patch provides the core to: > - Generate, store and export spans with basic buffer management (drop > old spans when full or drop new spans when full) > - Keep variable strings in a separate file (a la pg_stat_statements) > - Track tracing state and top spans across nested execution levels > > Thoughts on this approach? > > On Mon, Jan 22, 2024 at 7:45 AM Peter Smith wrote: > > > > 2024-01 Commitfest. > > > > Hi, This patch has a CF status of "Needs Review" [1], but it seems > > there were CFbot test failures last time it was run [2]. Please have a > > look and post an updated version if necessary. > > > > == > > [1] https://commitfest.postgresql.org/46/4456/ > > [2] https://cirrus-ci.com/task/5581154296791040 > > > > Kind Regards, > > Peter Smith. > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Compression dictionaries for JSONB
Hi, Aleksander, there was a quite straightforward answer regarding Pluggable TOAST in another thread - the Pluggable TOAST feature is not desired by the community, and advanced TOAST mechanics would be accepted as part of the extended functionality of problematic datatypes, on a par with in and out functions. So what I am actually doing now is re-writing the JSONb TOAST improvements to be called as separate functions, without the Pluggable TOAST API. This takes some time because there is a large and complex code base left by Nikita Glukhov, who has lost interest in this work for various reasons. I personally think that these two features could benefit from each other, but they could be adapted to each other after I introduce the JSONb Toaster into the v17 master. If you don't mind, please check the thread on extending the TOAST pointer - it is important for improving the TOAST mechanics. On Wed, Jan 17, 2024 at 5:27 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi again, > > > Yes it does for a while now. Until we reach any agreement regarding > > questions (1) and (2) personally I don't see the point in submitting > > rebased patches. We can continue the discussion but mark the CF entry > > as RwF for now it will be helpful. > > Sorry, what I actually meant were the following questions: > > """ > Several things occurred to me: > > - Does anyone believe that va_tag should be part of the utf8-like > bitmask in order to save a byte or two? > > - The described approach means that compression dictionaries are not > going to be used when data is compressed in-place (i.e. within a > tuple), since no TOAST pointer is involved in this case. Also we will > be unable to add additional compression algorithms here. Does anyone > have problems with this? Should we use the reserved compression > algorithm id instead as a marker of an extended TOAST? > > - It would be nice to decompose the feature in several independent > patches, e.g. modify TOAST first, then add compression dictionaries > without automatic update of the dictionaries, then add the automatic > update. I find it difficult to imagine however how to modify TOAST > pointers and test the code properly without a dependency on a larger > feature. Could anyone think of a trivial test case for extendable > TOAST? Maybe something we could add to src/test/modules similarly to > how we test SLRU, background workers, etc. > """ > > Since there was not much activity since then (for 3 months) I don't > really see how to process further. > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! I meant exactly the thing you mentioned - > > > By queries you mean particular queries, not transactions? And since > > they have an assigned ID it means that the query is already executing > > and we want to enable the tracking in another session, right? > > I think that was the idea. The query identifier generated for a > specific statement is stable so we could have a GUC variable with a > list of query id and only queries matching the provided query ids > would be sampled. This would make it easier to generate traces for a > specific query within a session. > > This one. When maintaining production systems with a high load rate we often encountered the necessity to trace a number of queries with known IDs (that was the case for Oracle environments, but Postgres is not very different from this point of view), because enabling tracing for the whole session would immediately result in thousands of trace spans in a system with a throughput of hundreds or even thousands of tps, when we need to trace, say, a single problematic query. Thank you! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: introduce dynamic shared memory registry
Hi! This patch looks like a good solution for a pain in the ass, and I too am for committing this patch. I have looked through the code and agree with Andrei - the code looks good. Just a suggestion: maybe it is worth adding a function for detaching the segment, for cases when we unload and/or re-load the extension? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Is a clearer memory lifespan for outerTuple and innerTuple useful?
Hi! Maybe an alternative way is to use a separate kind of context - say, name it 'ToastContext' - for all custom data related to toasted values? What do you think? On Sun, Dec 17, 2023 at 4:52 PM Andy Fan wrote: > > Andy Fan writes: > > > Andy Fan writes: > > > >> ..., I attached the 2 MemoryContext in > >> JoinState rather than MergeJoinState, which is for the "shared detoast > >> value"[0] more or less. > >> > > In order to delimit the scope of this discussion, I attached the 2 > MemoryContext to MergeJoinState. Since the code was written by Tom in > 2005, I added Tom to the cc-list. > > > -- > Best Regards > Andy Fan > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Overall the solution looks good to me, except for SQLCommenter - query instrumentation with SQL comments is often not possible on production systems. Instead, a very often requested feature is to enable tracing for a given single query ID, or for several (a very limited number of) queries by their IDs. It could be done by adding an SQL function to add and remove query IDs in a list (even an array would do) stored in the top tracing context. Great job, thank you! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
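For illustration, a hypothetical sketch of that suggestion (my own invention, not code from the pg_tracing patch): an SQL-callable function remembers query IDs to trace, and the post-parse/planner hooks would compare the statement's queryId against this list before deciding to sample. It assumes the usual extension boilerplate (PG_MODULE_MAGIC etc.) elsewhere in the module.

#include "postgres.h"
#include "fmgr.h"

#define MAX_TRACED_QUERY_IDS 16

static uint64 traced_query_ids[MAX_TRACED_QUERY_IDS];
static int	num_traced_query_ids = 0;

PG_FUNCTION_INFO_V1(pg_tracing_trace_query_id);

Datum
pg_tracing_trace_query_id(PG_FUNCTION_ARGS)
{
	uint64		query_id = (uint64) PG_GETARG_INT64(0);

	if (num_traced_query_ids >= MAX_TRACED_QUERY_IDS)
		PG_RETURN_BOOL(false);	/* list is full, query id not added */

	traced_query_ids[num_traced_query_ids++] = query_id;
	PG_RETURN_BOOL(true);
}

On the SQL level this could be exposed as a function taking a bigint query ID (for example, the one reported by pg_stat_statements) and returning whether it was added; a matching function to remove an ID would work the same way.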
Re: RFI: Extending the TOAST Pointer
Hi, Here's the PoC for a custom TOAST pointer. The main idea is that the custom pointer provides data space that allows storing custom metadata (i.e. TOAST method, relation OIDs, advanced compression information, etc.) and even keeping part of the data inline. Any feedback would be greatly appreciated. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ 0001_custom_toast_pointer_v1.patch Description: Binary data
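To give a feel for the idea, here is a purely hypothetical layout (illustration only - the attached patch may use something different):

/*
 * Hypothetical "custom" external TOAST pointer: a new varattrib_1b_e
 * variant carrying a variable-size area for method-specific metadata,
 * possibly including some inline data.  Not the actual layout from the
 * attached patch.
 */
typedef struct varatt_custom
{
	uint16		va_datalen;		/* size of the custom data area below */
	Oid			va_toasterid;	/* which TOAST method owns this pointer */
	char		va_data[FLEXIBLE_ARRAY_MEMBER]; /* method-specific metadata */
} varatt_custom;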
Re: Avoid detoast overhead when possible
Hi, With your setup (table created with setup.sql):

postgres@postgres=# explain analyze select big->'1', big->'2', big->'3', big->'5', big->'10' from b;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..29.52 rows=1001 width=160) (actual time=0.656..359.964 rows=1001 loops=1)
 Planning Time: 0.042 ms
 Execution Time: 360.177 ms
(3 rows)

Time: 361.054 ms

postgres@postgres=# explain analyze select big->'1' from b;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..19.51 rows=1001 width=32) (actual time=0.170..63.996 rows=1001 loops=1)
 Planning Time: 0.042 ms
 Execution Time: 64.063 ms
(3 rows)

Time: 64.626 ms

Without patch, the same table and queries:

postgres@postgres=# explain analyze select big->'1', big->'2', big->'3', big->'5', big->'10' from b;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..29.52 rows=1001 width=160) (actual time=0.665..326.399 rows=1001 loops=1)
 Planning Time: 0.035 ms
 Execution Time: 326.508 ms
(3 rows)

Time: 327.132 ms

postgres@postgres=# explain analyze select big->'1' from b;
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=0.00..19.51 rows=1001 width=32) (actual time=0.159..62.807 rows=1001 loops=1)
 Planning Time: 0.033 ms
 Execution Time: 62.879 ms
(3 rows)

Time: 63.504 ms

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Avoid detoast overhead when possible
Hi, Hmmm, I've checked this patch and can't see a performance difference on a large (2 key-value pairs) json; using the toasted json column several times makes no difference compared to the current implementation on master (with queries like those mentioned above). Maybe I'm doing something wrong? On Tue, Dec 5, 2023 at 4:16 AM wrote:
>
> Hi,
>
> Matthias van de Meent writes:
>
> > SELECT toastable_col FROM t1
> > WHERE f(t1.toastable_col)
> > ORDER BY nonindexed;
>
> Thanks for this example! It's true that the current design requires more memory to sort since toastable_col is detoasted at the scan stage and it is output to the sort node. It should be avoided.
>
> > SELECT ev_class
> > FROM pg_rewrite
> > WHERE octet_length(ev_action) > 1
> > ORDER BY ev_class;
>
> This one is different I think, since the ev_action (the toastable_col) is *NOT* output to the sort node, so no extra memory is required IIUC.
>
> * CP_SMALL_TLIST specifies that a narrower tlist is preferred. This is
> * passed down by parent nodes such as Sort and Hash, which will have to
> * store the returned tuples.
>
> We can also verify this by
>
> explain (costs off, verbose) SELECT ev_class
> FROM pg_rewrite
> WHERE octet_length(ev_action) > 1
> ORDER BY ev_class;
>                          QUERY PLAN
> --------------------------------------------------------------
>  Sort
>    Output: ev_class
>    Sort Key: pg_rewrite.ev_class
>    ->  Seq Scan on pg_catalog.pg_rewrite
>          Output: ev_class
>          Filter: (octet_length((pg_rewrite.ev_action)::text) > 1)
> (6 rows)
>
> Only ev_class is output to the Sort node.
>
> So if we want to make sure there is no performance regression for all the existing queries in any case, we can add 1 more restriction into the saved-detoast-value logic. It must be (NOT under CP_SMALL_TLIST) OR (the toastable_col is not in the output list). It can be a planner decision.
>
> If we code like this, the result will be that we need to detoast toastable_col N times in the quals for the below query.
>
> SELECT toastable_col FROM t
> WHERE f1(toastable_col)
> AND f2(toastable_col)
> ..
> AND fn(toastable_col)
> ORDER BY any-target-entry;
>
> However
>
> SELECT
> f1(toastable_col),
> f2(toastable_col),
> ..
> fn(toastable_col)
> FROM t
> ORDER BY any-target-entry;
>
> the current path still works for it.
>
> This one is my favorite one so far. Another option is saving the detoast-value in some other memory or in the existing slot in place for different situations; that would require more expression and planner changes. I just checked all the queries at hand; the current design can cover all of them.
>
> --
> Best Regards
> Andy Fan

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Avoid detoast overhead when possible
Hi! There's a view from the other angle - detoast just the attributes that are needed (partial detoast), with optimized storage mechanics for JSONb. I'm preparing a patch for it, so maybe the best results could be achieved by combining these two techniques. What do you think? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [Proposal] global sequence implemented by snowflake ID
Hi! I have reviewed the patch in this topic and have a question concerning the machine ID - INSERT INTO snowflake_sequence.machine_id SELECT round((random() * (0 - 511))::numeric, 0) + 511; This kind of ID generation does not seem to guarantee that two instances in a pool won't end up with the same machine ID, does it? With only 512 possible values, even a few dozen randomly initialized instances have a good chance of a collision. On Thu, Nov 30, 2023 at 4:18 AM Michael Paquier wrote: > On Tue, Nov 28, 2023 at 02:23:44PM +0530, Amit Kapila wrote: > > It is interesting to see you want to work towards globally distributed > > sequences. I think it would be important to discuss how and what we > > want to achieve with sequences w.r.t logical replication and or > > active-active configuration. There is a patch [1] for logical > > replication of sequences which will primarily achieve the failover > > case, i.e. if the publisher goes down and the subscriber takes over > > the role, one can re-direct connections to it. Now, if we have global > > sequences, one can imagine that even after failover the clients can > > still get unique values of sequences. It will be a bit more flexible > > to use global sequences, for example, we can use the sequence on both > > nodes at the same time which won't be possible with the replication of > > sequences as they will become inconsistent. Now, it is also possible > > that both serve different use cases and we need both functionalities > > but it would be better to have some discussion on the same. > > > > Thoughts? > > > > [1] - https://commitfest.postgresql.org/45/3823/ > > Thanks for pointing this out. I've read through the patch proposed by > Tomas and both are independent things IMO. The logical decoding patch > relies on the SEQ_LOG records to find out which last_value/is_called > to transfer, which is something directly depending on the in-core > sequence implementation. Sequence AMs are concepts that cover much > more ground, leaving it up to the implementor to do what they want > while hiding the activity with a RELKIND_SEQUENCE (generated columns > included). > > To put it short, I have the impression that one and the other don't > really conflict, but just cover different ground. However, I agree > that depending on the sequence AM implementation used in a cluster > (snowflake IDs guarantee unicity with their machine ID), replication > may not be necessary because the sequence implementation may be able > to ensure that no replication is required from the start. > -- > Michael > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Table AM Interface Enhancements
Hi, Pavel, as far as I understand Alexander's idea, an assertion - and especially an ereport - here does not make any sense: this method is not supposed to report an error; it silently calls the underlying [free] function if there is one, and simply falls through otherwise. Also, take into account that it could be located in an uninterruptible part of the code. On the whole topic I have to

On Wed, Nov 29, 2023 at 4:56 PM Pavel Borisov wrote:
> Hi, Alexander!
> >> I'm planning to review some of the other patches from the current patchset soon.
>
> I've looked into the patch 0003. The patch looks in good shape and is uncontroversial to me. Making memory structures dynamically allocated is simple enough and it allows storing complex data like lists etc. I consider that places like this, which expect memory structures to be flat and allocated at once, are that way because there was no need for more complex ones previously. If there is a need for them, I think they could be added without much doubt, provided the simplicity of the change.
>
> For the code:
> +static inline void
> +table_free_rd_amcache(Relation rel)
> +{
> +	if (rel->rd_tableam && rel->rd_tableam->free_rd_amcache)
> +	{
> +		rel->rd_tableam->free_rd_amcache(rel);
> +	}
> +	else
> +	{
> +		if (rel->rd_amcache)
> +			pfree(rel->rd_amcache);
> +		rel->rd_amcache = NULL;
> +	}
>
> here I suggest adding Assert(rel->rd_amcache == NULL) (or maybe better an error report) after calling free_rd_amcache to be sure the custom implementation has done what it should do.
>
> Also, I think some brief documentation about writing this custom method is quite relevant, maybe based on already existing comments in the code.
>
> Kind regards,
> Pavel

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Patch: Global Unique Index
Hi! Please advise on the status of this patch set - are there any improvements? Is there any work going on? Thanks! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi! Matthias, regarding your message above, I have a question to ask. On typed TOAST implementations - we thought that the TOAST method used for storing data could depend not only on the data type, but on the flow or workload, like our bytea appendable toaster, which is much (hundreds of times) faster on update compared to the regular procedure. That was one of the ideas behind Pluggable TOAST - we can choose the most suitable TOAST implementation available. If we have a single TOAST entry point per data type, then we should have some means to control it or to choose a TOAST method suitable to our needs. Or should we not? -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi, I've been thinking about Matthias' proposals for some time and have some questions: >So, in short, I don't think there is a need for a specific "Pluggable >toast API" like the one in the patchset at [0] that can be loaded >on-demand, but I think that updating our current TOAST system to a >system for which types can provide support functions would likely be >quite beneficial, for efficient extraction of data from composite >values. As I understand it, one of the reasons against Pluggable TOAST is that differences in plugged-in Toasters could result in incompatibility even between different versions of the same DB. The importance of a correct TOAST update is beyond question; I feel I have to prepare a patch for it. There are some questions though - I'd address them later, along with a patch. >Example support functions: >/* TODO: bikeshedding on names, signatures, further support functions. */ >Datum typsup_roastsliceofbread(Datum ptr, int sizetarget, char cmethod) >Datum typsup_unroastsliceofbread(Datum ptr) >void typsup_releaseroastedsliceofbread(Datump ptr) /* in case of >non-unitary in-memory datums */ Do I understand correctly that you mean extending pg_type and the type cache by adding a new function set for toasting/detoasting a value, in addition to in/out, etc.? I see several issues here: 1) We could benefit from knowledge of the internals of the data being toasted (i.e. in the case of a JSON value with a key-value structure) only when the EXTERNAL storage mode is set; otherwise the value will be compressed before being toasted. So we have to keep both TOAST mechanics, depending on the storage mode being used. It's the same issue as in Pluggable TOAST. Is that OK? 2) The TOAST pointer is very limited in terms of the data it keeps; we'd have to extend it anyway and keep both for backwards compatibility. 3) There is no API, and such an approach would require implementing toast and detoast in every data type we want to be custom toasted, resulting in modifications of multiple files. Maybe we have to consider introducing such an API? 4) 1 toast relation per regular relation. With update mechanics this will be less limiting, but it is still a limiting factor because 1 entry in the base table could have a lot of entries in the toast table. Are we doing something about this? >We would probably want at least 2 more subtypes of varattrib_1b_e - >one for on-disk pointers, and one for in-memory pointers - where the >payload of those pointers is managed by the type's toast mechanism and >considered opaque to the rest of PostgreSQL (and thus not compatible >with the binary transfer protocol). Types are currently already >expected to be able to handle their own binary representation, so >allowing types to manage parts of the toast representation should IMHO >not be too dangerous, though we should make sure that BINARY COERCIBLE >types share this toast support routine, or be returned to their >canonical binary version before they are cast to the coerced type, as >using different detoasting mechanisms could result in corrupted data >and thus crashes. >Lastly, there is the compression part of TOAST. I think it should be >relatively straightforward to expose the compression-related >components of TOAST through functions that can then be used by >type-specific toast support functions. >Note that this would be opt-in for a type, thus all functions that use >that type's internals should be aware of the different on-disk format >for toasted values and should thus be able to handle it gracefully. Thanks a lot for the answers!
-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! I've read Tom Lane's note in the previous discussion (it is quite long, so I had missed it) on the pg_proc column - >I strongly recommend against having a new pg_proc column at all. >I doubt that you really need it, and having one will create >enormous mechanical burdens to making the conversion. (For example, >needing a catversion bump every time we convert one more function, >or an extension version bump to convert extensions.) - so we should figure out another way to do it. Regards, -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! According to the discussion above, I've added the 'proerrsafe' attribute to the PG_PROC relation. The same was done some time ago by Nikita Glukhov but this part was reverted. This is a WIP patch, I am new to this part of Postgres, so please correct me if I'm going the wrong way. -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ 0001_proerrsafe_attr_v1.patch Description: Binary data
Re: RFC: Pluggable TOAST
Hi! Matthias, thank you for your patience and explanation. I wish I had had it much earlier; it would have saved a lot of time. You've asked a lot of good questions, the answers we have for some of them seem not very satisfactory, and you've pointed out some topics that were not mentioned before. I have to rethink our approach to the TOAST enhancements accordingly. Thanks a lot! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, Agreed on the latter - that must not be part of it, for sure. I'll think about how to make this part correct. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi, I meant discussion preceding the patch set - there was none. And the goal of *THIS* topic is to get a picture of how the community sees improvements in the TOAST mechanics, if it doesn't want them the way we proposed before, in order to understand which way to go with advanced JSON storage and other enhancements we already have. The previous topic was not of any help here. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, The main goal was to correctly process invalid queries (as in examples above). I'm not sure this could be done in type input functions. I thought that some coercions could be checked before evaluating expressions for saving reasons. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFC: Pluggable TOAST
Hi, Aleksander, the previous discussion was not a discussion, actually - we proposed a set of big and complex core changes without any discussion preceding them. That was not a very good approach, although the overall idea behind the patch set is very progressive and is ready to solve some old and painful issues in Postgres. Also, the introduction of SQL/JSON will further boost the usage of JSON in databases, so our improvements in JSON storage and performance would be very useful. These improvements depend on Pluggable TOAST: without an API that allows easily plugging in different TOAST implementations, they require heavy core modifications and are very unlikely to be accepted. Not to mention that such kinds of changes require upgrades, restarts and so on. Pluggable TOAST allows using advanced storage techniques on top of the default Postgres database engine, instead of implementing the complex Pluggable Storage API, and allows plugging in these advanced techniques on the fly - without even restarting the server, which is crucial for production systems. The discussion on extending the TOAST pointer showed some interest in this topic, so I hope this feature will draw some attention in the scope of widely used large JSON objects. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! Amit, regarding the previous email, patch #2 - I agree that it is not the best idea to introduce a new kind of logic into the parser, so this logic could be moved to the executor, or removed altogether. What do you think of these options? On Wed, Oct 18, 2023 at 5:19 AM jian he wrote:
> Hi. based on v22.
>
> I added some tests against json_value for the sake of coverage testing.
>
> A previous email thread mentioned needing to check *empty in ExecEvalJsonExpr. Since JSON_VALUE_OP, JSON_QUERY_OP, JSON_EXISTS_OP all need to have *empty cases, I refactored a little bit. Might be helpful. Maybe we can also refactor the *error cases.
>
> The following part is not easy to understand.
> res = ExecPrepareJsonItemCoercion(jbv,
> + jsestate->item_jcstates,
> + &post_eval->jcstate);
> + if (post_eval->jcstate &&
> + post_eval->jcstate->coercion &&
> + (post_eval->jcstate->coercion->via_io ||
> + post_eval->jcstate->coercion->via_populate))

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
RFC: Pluggable TOAST
Hi hackers! We need community feedback on a previously discussed topic [1]. There are some long-lived issues in Postgres related to the TOAST mechanics, like [2]. Some time ago we already proposed a set of patches with an API allowing different TOAST implementations to be plugged into a live database. The patch set introduced a lot of code and was quite crude in some places, so after several implementations we decided to try it in a production environment for further check-up. The main idea behind pluggable TOAST is to make it possible to easily plug in and use different implementations of large-value storage, preserving the existing mechanics to keep backward compatibility, and to provide an easy, Postgres-way means of giving users alternative, more effective mechanics for storing large column values - we already have custom and very effective (up to tens and even hundreds of times faster) TOAST implementations for the bytea and JSONb data types. As we see it, Pluggable TOAST proposes 1) changes in the TOAST pointer itself, extending it to store custom data - the current limitations of the TOAST pointer were discussed in [1] and [4]; 2) an API which allows calling custom TOAST implementations for certain table columns and (a topic for discussion) certain datatypes. Custom TOAST could also be used in not so trivial ways - for example, limited columnar storage could easily be implemented and plugged in without the heavy core modifications of implementing Pluggable Storage (Table Access Methods), preserving existing data and database structure, and it could be upgraded, replicated and so on. Any thoughts and proposals are welcome. [1] Pluggable TOAST https://www.postgresql.org/message-id/flat/224711f9-83b7-a307-b17f-4457ab73aa0a%40sigaev.ru [2] Infinite loop while acquiring new TOAST Oid https://www.postgresql.org/message-id/flat/CAN-LCVPRvRzxeUdYdDCZ7UwZQs1NmZpqBUCd%3D%2BRdMPFTyt-bRQ%40mail.gmail.com [3] JSONB Compression dictionaries https://www.postgresql.org/message-id/flat/CAJ7c6TOtAB0z1UrksvGTStNE-herK-43bj22%3D5xVBg7S4vr5rQ%40mail.gmail.com [4] Extending the TOAST pointer https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
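As a purely illustrative sketch of what point 2) could look like (the names below are hypothetical, modelled on the existing table AM routine struct, and not taken from our patch set):

typedef struct ToastAmRoutine
{
	NodeTag		type;

	/* replace a column value with a TOAST pointer produced by this method */
	Datum		(*toast) (Relation toastrel, Datum value, int options);

	/* turn a TOAST pointer of this method back into a plain datum */
	Datum		(*detoast) (Datum toastptr, int32 sliceoffset,
							int32 slicelength);

	/* release the on-disk data when the owning row goes away */
	void		(*delete_toast) (Relation toastrel, Datum toastptr);
} ToastAmRoutine;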
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi, Thank you very much, I'll check it out. It looks like the getObjectIdentity() used in pg_identify_object() could do. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, Sorry, forgot to mention above - patches from our patch set should be applied onto SQL/JSON part 3 - v22-0003-SQL-JSON-query-functions.patch, thus they are numbered as v23-0003-1 and -2. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi, Also FYI - the following case results in a segmentation fault:

postgres@postgres=# CREATE TABLE test_jsonb_constraints (
	js text,
	i int,
	x jsonb DEFAULT JSON_QUERY(jsonb '[1,2]', '$[*]' WITH WRAPPER)
	CONSTRAINT test_jsonb_constraint1 CHECK (js IS JSON)
	CONSTRAINT test_jsonb_constraint5 CHECK (JSON_QUERY(js::jsonb, '$.mm' RETURNING char(5) OMIT QUOTES EMPTY ARRAY ON EMPTY) > 'a' COLLATE "C")
	CONSTRAINT test_jsonb_constraint6 CHECK (JSON_EXISTS(js::jsonb, 'strict $.a' RETURNING int TRUE ON ERROR) < 2)
);
CREATE TABLE
Time: 13.518 ms
postgres@postgres=# INSERT INTO test_jsonb_constraints VALUES ('[]');
server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
Time: 6.858 ms
@!>

We're currently looking into this case. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: remaining sql/json patches
Hi! With the latest set of patches we encountered a failure with the following query:

postgres@postgres=# SELECT JSON_QUERY(jsonpath '"aaa"', '$' RETURNING text);
server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
Time: 11.165 ms

A colleague of mine, Anton Melnikov, proposed the following changes, which slightly alter the coercion functions to process this kind of error correctly. Please check the attached patch set. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ v23-0003-1-transformJsonExprCommon-fixup.patch Description: Binary data v23-0003-2-json-query-coercion-override.patch Description: Binary data
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi, A textual representation requires a long text field because it could contain the schema and arguments; it is difficult and inefficient to save it as part of the data, and it must be parsed to retrieve the function oid. By using the oid directly (actually, a value of the regprocedure field) we avoid that, and the function can be retrieved by its primary key. Why can't pg_upgrade be used? OID preservation logic is already implemented for several OIDs in catalog tables, like pg_class, type, relfilenode, enum... I've mentioned twice that this logic is already implemented and I haven't encountered any problems with pg_upgrade. Actually, I've asked here because there are several references to pg_proc oids from other tables in the system catalog, so I was worried that this logic could break something I do not know about. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
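For illustration (generic backend calls, not part of any patch): with a stored OID the referenced function is a single syscache probe away, whereas a stored textual signature would have to be parsed first.

#include "postgres.h"
#include "catalog/pg_proc.h"
#include "utils/syscache.h"

/* Check that a function referenced by a stored OID still exists. */
static bool
stored_function_exists(Oid funcoid)
{
	return SearchSysCacheExists1(PROCOID, ObjectIdGetDatum(funcoid));
}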
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi, I've already implemented preserving pg_proc oids during pg_upgrade in a way similar to relfilenodes, etc.; actually, it is quite simple, and at first glance there are no problems. About using a surrogate key - this feature is more for data generated by the DBMS itself, i.e. data processed by some extension and saved and re-processed automatically or at the user's request, but without bothering the user with these internal keys. The main question is: are there pitfalls I am not aware of? Thanks for your replies! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Pro et contra of preserving pg_proc oids during pg_upgrade
Hi! Say we have data processed by some user function and we want to keep a reference to this function in our data. In this case we have two ways: the first is to store the string output of regprocedure, which is not very convenient, and the second is to store its OID, which requires a slight modification of pg_upgrade (pg_dump and the function/procedure creation function). I've read the previous threads about using regproc, and agree that this is not a very good case anyway, but I haven't found any serious obstacles that forbid modifying pg_upgrade this way. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Pro et contra of preserving pg_proc oids during pg_upgrade
Hi hackers! Please advise on the idea of preserving pg_proc oids during pg_upgrade, in a way similar to relfilenodes, type ids and so on. What are the possible downsides of such a solution? Thanks! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Great you continue to work on this patch! I'm checking out the newest changes. On Fri, Sep 15, 2023 at 2:32 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi, > > > Renaming/Refactoring: > > - All spans are now tracked in the palloced current_trace_spans buffer > > compared to top_span and parse_span being kept in a static variable > > before. > > - I've renamed query_spans to top_span. A top_span serves as the > > parent for all spans in a specific nested level. > > - More debugging information and assertions. Spans now track their > > nested level, if they've been ended and if they are a top_span. > > > > Changes: > > - I've added the subxact_count to the span's metadata. This can help > > identify the moment a subtransaction was started. > > - I've reworked nested queries and utility statements. Previously, I > > made the assumptions that we could only have one top_span per nested > > level which is not the case. Some utility statements can execute > > multiple queries in the same nested level. Tracing utility statement > > now works better (see image of tracing a create extension). > > Many thanks for the updated patch. > > If it's not too much trouble, please use `git format-patch`. This > makes applying and testing the patch much easier. Also you can provide > a commit message which 1. will simplify the work for the committer and > 2. can be reviewed as well. > > The tests fail on CI [1]. I tried to run them locally and got the same > results. I'm using full-build.sh from this repository [2] for > Autotools and the following one-liner for Meson: > > ``` > time CPPFLAGS="-DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS" sh -c 'git > clean -dfx && meson setup --buildtype debug -Dcassert=true > -DPG_TEST_EXTRA="kerberos ldap ssl load_balance" -Dldap=disabled > -Dssl=openssl -Dtap_tests=enabled > -Dprefix=/home/eax/projects/pginstall build && ninja -C build && ninja > -C build docs && PG_TEST_EXTRA=1 meson test -C build' > ``` > > The comments for pg_tracing.c don't seem to be formatted properly. > Please make sure to run pg_indent. See src/tools/pgindent/README > > The interface pg_tracing_spans(true | false) doesn't strike me as > particularly readable. Maybe this function should be private and the > interface be like pg_tracing_spans() and pg_trancing_consume_spans(). > Alternatively you could use named arguments in the tests, but I don't > think this is a preferred solution. > > Speaking of the tests I suggest adding a bit more comments before > every (or most) of the queries. Figuring out what they test could be > not particularly straightforward for somebody who will make changes > after the patch will be accepted. > > [1]: http://cfbot.cputube.org/ > [2]: https://github.com/afiskon/pgscripts/ > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Storing spans is a tricky question. Monitoring systems often use a pull model and use probes or an SQL API for pulling data, so from this point of view it is much more convenient to keep spans in a separate table. But in this case we come to another issue - how do we flush this data into the table? Automatic flushing on a full buffer would randomly (from the user's point of view) and significantly affect query performance. I'd rather add a GUC, turned off by default, to store spans into the table instead of the buffer.

>3) I'm testing more complex queries. Most of my previous tests were using simple query protocol but extended protocol introduces
>differences that break some assumptions I did. For example, with multi statement transaction like
>BEGIN;
>SELECT 1;
>SELECT 2;
>The parse of SELECT 2 will happen before the ExecutorEnd (and the end_tracing) of SELECT 1. For now, I'm skipping the post parse
>hook if we still have an ongoing tracing.

I've checked this behavior before and haven't noticed the case you mentioned, but for loops like

for i in 1..2 loop
    StartTime := clock_timestamp();
    insert into t2 values (i, a_data);
    EndTime := clock_timestamp();
    Delta := 1000 * ( extract(epoch from EndTime) - extract(epoch from StartTime) );
end loop;

I've got the following call sequence:

psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook <1>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook <1>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook <2>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook <2>
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorStart
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorRun
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorFinish
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorEnd
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_post_parse_analyze hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_planner_hook
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorStart
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorRun
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorFinish
psql:/home/postgres/tests/trace.sql:52: NOTICE: pg_tracing_ExecutorEnd

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! Thanks for the improvements! >Here's a new patch with changes from the previous discussion: >- I'm now directly storing nanoseconds duration in the span instead of the instr_time. Using the instr_time macros was a bit awkward as the durations I generate don't necessarily have a starting and ending instr_time. >Moving to straight nanoseconds made things clearer from my point of view. Cool. It could easily be cast to ms for the user. >- I've added an additional sample rate pg_tracing.sample_rate (on top of the pg_tracing.caller_sample_rate). This one will allow queries to be sampled even without trace information propagated from the caller. >Setting this sample rate to 1 will basically trace everything. For now, this will only work when going through the post parse hook. I will add support for prepared statements and cached plans for the next patch. Cool, I've just made the same improvement and wanted to send a patch a bit later, after tests. >- I've improved how parse spans are created. It's a bit challenging to get the start of a parse as there's no pre parse hook or instrumentation around parse so it's only an estimation. I've also added a query id field to the span, and made a table and an SQL function that flushes spans to this table instead of returning a set of records - it is more convenient for maintenance to query the table. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! What do you think about using the INSTR_TIME_SET_CURRENT, INSTR_TIME_SUBTRACT and INSTR_TIME_GET_MILLISEC macros for timing calculations? Also, have you thought about a way to trace existing (running) queries without directly instrumenting them? It would be much more usable for maintenance and support personnel, because in production environments you can rarely change the query text directly. For the current state, the simplest solution is to switch tracing on and off by calling an SQL function, and possibly to switch tracing for a prepared statement the same way, but not for any random query. I'll check the patch for race conditions. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
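For reference, a minimal example of how those macros are typically used (generic usage of src/include/portability/instr_time.h, not code from the patch):

#include "postgres.h"
#include "portability/instr_time.h"

static void
report_elapsed_ms(void)
{
	instr_time	start_time;
	instr_time	duration;

	INSTR_TIME_SET_CURRENT(start_time);

	/* ... the work being measured goes here ... */

	INSTR_TIME_SET_CURRENT(duration);
	INSTR_TIME_SUBTRACT(duration, start_time);	/* duration -= start_time */

	elog(DEBUG1, "span took %.3f ms", INSTR_TIME_GET_MILLISEC(duration));
}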
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi, I'd keep the Autotools build up to date, because Meson is very limited on OS versions that are not very up to date. Anyway, it's OK now. I'm currently playing with the patch and reviewing the sources; if you need any cooperation - please let us know. I'm +1 for committing this patch after review. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi, I've fixed the Autotools build, please check the patch below (v2). On Thu, Jul 27, 2023 at 6:39 PM Aleksander Alekseev < aleksan...@timescale.com> wrote:
> Hi,
> > Also FYI, there are build warnings because functions > > const char * get_span_name(const Span * span, const char *qbuffer) > > and > > const char * get_operation_name(const Span * span, const char *qbuffer) > > do not have default inside switch and no return outside of switch.
>
> You are right, there are a few warnings:
>
> ```
> [1566/1887] Compiling C object contrib/pg_tracing/pg_tracing.so.p/span.c.o
> ../contrib/pg_tracing/span.c: In function ‘get_span_name’:
> ../contrib/pg_tracing/span.c:210:1: warning: control reaches end of non-void function [-Wreturn-type]
>   210 | }
>       | ^
> ../contrib/pg_tracing/span.c: In function ‘get_operation_name’:
> ../contrib/pg_tracing/span.c:249:1: warning: control reaches end of non-void function [-Wreturn-type]
>   249 | }
>       | ^
> ```
>
> Here is the patch v2 with a quick fix.
>
> > but got errors calling make check and cannot install the extension
>
> Agree, something goes wrong when using Autotools (but not Meson) on both Linux and MacOS. I didn't investigate the issue though.
>
> --
> Best regards,
> Aleksander Alekseev

-- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ pg-tracing-v2.patch Description: Binary data
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi, Also FYI, there are build warnings because the functions const char * get_span_name(const Span * span, const char *qbuffer) and const char * get_operation_name(const Span * span, const char *qbuffer) have no default case inside the switch and no return statement after it. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
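For what it's worth, the usual way to silence that kind of -Wreturn-type warning looks roughly like the sketch below (illustrative only; the real span kinds and names in pg_tracing are certainly different):

/*
 * Sketch only: adding a fallback return after the switch (or a default
 * case) keeps the compiler from warning that control can reach the end
 * of a non-void function.  The span kinds here are hypothetical.
 */
static const char *
get_span_name_sketch(int span_kind)
{
	switch (span_kind)
	{
		case 0:
			return "Planner";
		case 1:
			return "ExecutorRun";
		default:
			break;
	}

	/* not expected to be reached, but keeps -Wreturn-type quiet */
	return "???";
}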
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! I've tried to test the extension, but got errors calling make check and cannot install the extension. I've applied the patch onto current master and configured it as: ./configure --enable-debug --enable-cassert --enable-depend --enable-tap-tests Could you please advise if I'm doing something wrong? For make check please see the attached log. When installing the extension after setting the shared preload library I got an error in the Postgres logfile: 2023-07-27 13:41:52.324 MSK [12738] LOG: database system is shut down 2023-07-27 13:41:52.404 MSK [14126] FATAL: could not load library "/usr/local/pgsql/lib/pg_tracing.so": /usr/local/pgsql/lib/pg_tracing.so: undefined symbol: get_operation_name Thank you! -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/ mc.log Description: Binary data
Re: POC: Extension for adding distributed tracing - pg_tracing
Hi! This patch looks very interesting; I'm working on the same subject too, but I've used another approach - a C wrapper around the C++ API library from OpenTelemetry, handing span storage and output over to that library. There are some nuances, but it is possible. Have you tried to use the OpenTelemetry APIs instead of implementing all the functionality around spans yourself? On Tue, Jul 25, 2023 at 1:16 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi Anthonin, > > > I have a working prototype of a pg_tracing extension and wanted some > feedback on the design and architecture. > > The patch looks very interesting, thanks for working on it and for > sharing. The facts that the patch doesn't change the core except for > two lines in instrument.{c.h} and that is uses pull-based model: > > > - Collect the spans through a new pg_tracing_spans() function output > > ... IMO were the right design decisions. The patch lacks the > documentation, but this is OK for a PoC. > > I added the patch to the nearest CF [1]. Let's see what the rest of > the community thinks. > > [1] https://commitfest.postgresql.org/44/4456/ > > -- > Best regards, > Aleksander Alekseev > > > -- Regards, -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Protect extension' internal tables - how?
Hi, Aleksander, thank you very much. The tables are already placed into a special schema, but there are some dynamically created tables, and the goal is to protect all of these tables from direct insert, update and delete operations by users. I've read up on SECURITY DEFINER; it will do the trick. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Protect extension' internal tables - how?
Hi hackers! While working on an extension I encountered a quite tricky question: the extension (with functions in C) creates tables during function calls, and these tables must be protected from direct user queries while remaining accessible to all functions of this extension for all users allowed to use it. Could you please advise, or give some hint, on the correct (and secure) way to implement this? Currently I make the extension's owner the owner of such a table when creating it inside the function, but maybe there are pitfalls in this kind of solution? Thanks in advance. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi! Aleksander, thank you for reminding me of this patch; I'll try to get to it in a few days. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
Hi! I like the idea of having a standard function which shows the TOAST value ID for a row; I've used my own to handle TOAST errors. Just, maybe, a more correct name would be "...value_id", because the function actually retrieves the valueid field from the TOAST pointer, while a chunk ID consists of valueid + chunk_seq. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
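To make the naming argument concrete, this is roughly the field such a function ends up reading from the pointer (a sketch using the existing varatt macros; the function wrapper itself is illustrative, not the proposed patch):

#include "postgres.h"
#include "varatt.h"

/*
 * Sketch only: extract the value ID from an on-disk TOAST pointer.
 * This is the field being discussed above; the chunks in the TOAST
 * table are then addressed by (va_valueid, chunk_seq).
 */
static Oid
toast_value_id_sketch(struct varlena *attr)
{
	struct varatt_external toast_pointer;

	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
		return InvalidOid;		/* not an on-disk TOAST pointer */

	VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);

	return toast_pointer.va_valueid;
}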
Re: Pluggable toaster
Hi, It seems that I missed the thread mentioned above. I strongly disagree with such a statement: Pluggable TOAST could not be part of the Compression Dictionaries thread, because the TOAST improvement is a more general subject - it involves much deeper and trickier changes in the core, and it is also much more promising in terms of performance and storage improvements. We already have a lot of changes in Pluggable TOAST that were not committed to the main git branch of this thread, so it seems that I have to merge them and reopen it. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Pluggable toaster
Hi! I have a question for the community - why was this patch silently put into "rejected" status? Should there not be some kind of explanation? During this discussion I got the impression that, for some reason, some members of the community do not want the TOAST functionality - which has drawbacks that make it a real curse for an otherwise very good DBMS - to be touched. We cannot get rid of it because of backwards compatibility, so the best way is to make it more adaptable and extensible; this is what this thread is about. We proposed our vision of how to extend TOAST the Postgres way, like Pluggable Storage some time before. There are some very complex subjects left in this topic that really need the community's attention. I've mentioned them above, but there was no feedback on them. Pavel, we already have an UPDATE implementation for TOAST, but it is part of Pluggable TOAST and would hardly exist without it. I've started another thread on extending the TOAST pointer; maybe you would want to participate there [1]. We would still be grateful for feedback. [1] Extending the TOAST Pointer <https://www.postgresql.org/message-id/flat/CAJ7c6TNAYyeMYKVkiwOZChy7UpE_CkjpYOk73gcWTXMkLkEyzw%40mail.gmail.com#59aacdde27dd61277fe7c46c61c84b2c> -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: RFI: Extending the TOAST Pointer
Hi! I've made a WIP patch that uses a 64-bit TOAST value ID instead of the 32-bit one and sent it as part of a discussion, but there was no feedback on such a solution. There is a link to that discussion at the top of this thread. Also, I have to note that, based on our work on Pluggable TOAST, extending the TOAST pointer with additional structures would require a review of the logical replication engine - currently it is not suitable for any custom TOAST pointers, and we have no final solution yet for the logical replication problems with custom TOAST pointers. -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFI: Extending the TOAST Pointer
Hi, Aleksander, I'm interested in extending the TOAST pointer in various ways. A 64-bit TOAST value ID allows resolving a very complex issue for production systems with large tables and a heavy update rate. I agree with Matthias that there should be no processing of TOAST pointer internals outside the TOAST macros. Currently, a TOASTed value is distinguished via VARATT_IS_EXTERNAL_ONDISK, and it should stay this way. Adding more compression methods requires another implementation (extension) of varatt_external, because the current format has room for very few of them - only two bits are reserved for the compression method - and there is a safe way to do so without affecting the default TOAST mechanics, which we must keep untouched for compatibility and to avoid breaking DB upgrades. Also, I must remind everyone that we should not forget about field alignment inside the TOAST pointer. As was already mentioned, it seems not very reasonable to try to save a byte or two while we are storing out-of-line values of at least 2 kB in size. On Mon, May 22, 2023 at 4:47 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi, > > > I see your point, but I do think we should also think about why we do > > the change. > > Personally at the moment I care only about implementing compression > dictionaries on top of this, as is discussed in the corresponding > thread [1]. I'm going to need new fields in the TOAST pointer's > including (but not necessarily limited to) dictionary id. > > As I understand, Nikita is interested in implementing 64-bit TOAST > pointers [2]. I must admit I didn't follow that thread too closely but > I can imagine the needs are similar. > > Last but not least I remember somebody on the mailing list suggested > adding ZSTD compression support for TOAST, besides LZ4. Assuming I > didn't dream it, the proposal was rejected due to the limited amount > of free bits in ToastCompressionId. It was argued that two possible > combinations that are left should be treated with care and ZSTD will > not bring enough value to the users compared to LZ4. > > These are 3 recent cases I could recall. This being said I think our > solution should be generic enough to cover possible future cases > and/or cases unknown to us yet. > > [1]: > https://postgr.es/m/CAJ7c6TM7%2BsTvwREeL74Y5U91%2B5ymNobRbOmnDRfdTonq9trZyQ%40mail.gmail.com > [2]: https://commitfest.postgresql.org/43/4296/ > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: RFI: Extending the TOAST Pointer
Hi! Matthias, in the Pluggable TOAST thread we proposed additional pointer definition, without modification of the original varatt_external - we have to keep it untouched for compatibility issues. The following extension for the TOAST pointer was proposed: typedef struct varatt_custom { uint16 va_toasterdatalen;/* total size of toast pointer, < BLCKSZ */ uint32 va_rawsize; /* Original data size (includes header) */ uint32 va_toasterid; /* Toaster ID, actually Oid */ char va_toasterdata[FLEXIBLE_ARRAY_MEMBER]; /* Custom toaster data */ } varatt_custom; with the new tag VARTAG_CUSTOM = 127. Rawsize we have to keep because it is used by Executor. And Toaster ID is the OID by which we identify the extension that uses this pointer invariant. On Thu, May 18, 2023 at 5:34 PM Matthias van de Meent < boekewurm+postg...@gmail.com> wrote: > On Thu, 18 May 2023 at 15:12, Aleksander Alekseev > wrote: > > > > Hi, > > > > > I agree that va_tag can have another use. But since we are going to > > > make varatt_external variable in size (otherwise I don't see how it > > > could be really **extendable**) I don't think this is the right > > > approach. > > Why would we modify va_tag=18; data=varatt_external? A different > va_tag option would allow us to keep the current layout around without > much maintenance and a consistent low overhead. > > > On second thought, perhaps we are talking more or less about the same > thing? > > > > It doesn't matter what will be used as a sign of presence of a varint > > bitmask in the pointer. My initial proposal to use ToastCompressionId > > for this is probably redundant since va_tag > 18 will already tell > > that. > > I'm not sure "extendable" would be the right word, but as I see it: > > 1. We need more space to store more metadata; > 2. Essentially all bits in varatt_external are already accounted for; and > 3. There are still many options left in va_tag > > It seems to me that adding a new variant to va_att for marking new > features in the toast infrastructure makes the most sense, as we'd > also be able to do the new things like varints etc without needing to > modify existing toast paths significantly. > > We'd need to stop using the va_tag as length indicator, but I don't > think it's currently assumed to be a length indicator anyway (see > VARSIZE_EXTERNAL(ptr)). By not using the varatt_external struct > currently in use, we could be able to get down to <18B toast pointers > as well, though I'd consider that unlikely. > > Kind regards, > > Matthias van de Meent > Neon, Inc. > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
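To make the dispatch idea a bit more concrete, here is a rough sketch of how detoasting code could branch on the proposed tag (VARTAG_CUSTOM and the varatt_custom struct are taken from the proposal above; everything else, including the function wrapper, is illustrative and not committed code):

/*
 * Sketch only: distinguishing the proposed custom pointer from the
 * regular on-disk one.  VARTAG_CUSTOM (= 127) is the proposed new tag.
 */
static void
dispatch_toast_pointer_sketch(struct varlena *attr)
{
	if (!VARATT_IS_EXTERNAL(attr))
		return;					/* inline value, nothing to dispatch on */

	if (VARTAG_EXTERNAL(attr) == VARTAG_ONDISK)
	{
		/* regular varatt_external pointer - existing detoast path */
	}
	else if (VARTAG_EXTERNAL(attr) == VARTAG_CUSTOM)	/* proposed, = 127 */
	{
		varatt_custom *cptr = (varatt_custom *) VARDATA_EXTERNAL(attr);

		(void) cptr;			/* silence unused-variable warning in this sketch */

		/*
		 * Hand cptr->va_toasterdata over to the toaster identified by
		 * cptr->va_toasterid; cptr->va_rawsize still gives the executor
		 * the original datum size.
		 */
	}
}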
Re: [PATCH] Compression dictionaries for JSONB
Hi hackers! As discussed above, I've created a new thread on the Extension of the TOAST pointer subject - https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com Please check and comment. On Thu, Apr 27, 2023 at 1:43 PM Nikita Malakhov wrote: > Hi, > > I think I should open a new thread related to TOAST pointer refactoring > based on Pluggable TOAST, COPY and looping in retrieving new TOAST > value OID issues. > > On Wed, Apr 26, 2023 at 4:00 PM Aleksander Alekseev < > aleksan...@timescale.com> wrote: > >> Hi Nikita, >> >> > The External TOAST pointer is very limited to the amount of service data >> > it could keep, that's why we introduced the Custom TOAST pointers in the >> > Pluggable TOAST. But keep in mind that changing the TOAST pointer >> > structure requires a lot of quite heavy modifications in the core - >> along with >> > some obvious places like insert/update/delete datum there is very >> serious >> > issue with logical replication. >> > The Pluggable TOAST was rejected, but we have a lot of improvements >> > based on changing the TOAST pointer structure. >> >> Now I see what you meant [1]. I agree that we should focus on >> refactoring TOAST pointers first. So I suggest we continue discussing >> this in a corresponding thread and return to this one later. >> >> [1]: >> https://www.postgresql.org/message-id/CAJ7c6TPSvR2rKpoVX5TSXo_kMxXF%2B-SxLtrpPaMf907tX%3DnVCw%40mail.gmail.com >> >> -- >> Best regards, >> Aleksander Alekseev >> > > > -- > Regards, > > -- > Nikita Malakhov > Postgres Professional > The Russian Postgres Company > https://postgrespro.ru/ > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
RFI: Extending the TOAST Pointer
Hi hackers! There were several discussions where the limitations of the existing TOAST pointers were mentioned ([1], [2] and [3]), and from time to time this topic appears in other places. We proposed a fresh approach to the TOAST mechanics in [2], but unfortunately the patch was received rather unkindly and was rejected after several iterations, although we still have hopes for it and several very promising features based on it. Anyway, the old TOAST pointer is also the cause of problems like [4], and this part of PostgreSQL is screaming to be revised and improved. TOAST begins with the pointer to the externalized value - the TOAST Pointer - which is very limited in terms of the data it can store, and all TOAST improvements require revising this pointer structure. So we decided to ask the community for thoughts and ideas on how to rework this pointer. The TOAST Pointer (the varatt_external structure) stores 4 fields: [varlena header][<4b - original data size><4b - size in TOAST table><4b - TOAST table OID><4b - ID of chunk>] In [2] we proposed a new Custom TOAST pointer structure whose main feature is extensibility: [varlena header][<2b - total size of the TOAST pointer><4b - size of original data><4b - OID of algorithm used for TOASTing>] where the Custom TOAST Pointer is distinguished from the regular one by the va_flag field, which is part of the varlena header, so the new pointer format does not interfere with the old (regular) one. The first field is necessary because the Custom TOAST pointer has variable length due to the tail used for inline storage, and the original value size is used by the Executor. The third field could be a subject for discussion. Thoughts? Objections? [1] [PATCH] Infinite loop while acquiring new TOAST Oid <https://www.postgresql.org/message-id/CAJ7c6TPSvR2rKpoVX5TSXo_kMxXF%2B-SxLtrpPaMf907tX%3DnVCw%40mail.gmail.com> [2] Pluggable Toaster <https://www.postgresql.org/message-id/flat/224711f9-83b7-a307-b17f-4457ab73aa0a%40sigaev.ru> [3] [PATCH] Compression dictionaries for JSONB <https://www.postgresql.org/message-id/flat/CAJ7c6TOtAB0z1UrksvGTStNE-herK-43bj22%3D5xVBg7S4vr5rQ%40mail.gmail.com> [4] BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table. <https://www.postgresql.org/message-id/flat/16722-93043fb459a41073%40postgresql.org> -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
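For readers who want to compare the first layout above with the code, the existing on-disk pointer it describes is the following struct (copied from the current headers as I understand them; if it has changed since, the headers win):

/*
 * Existing on-disk TOAST pointer (the "regular" pointer above).  The
 * top two bits of va_extinfo carry the compression method and the rest
 * is the external size; va_valueid plus the chunk_seq column in the
 * TOAST table addresses the individual chunks.
 */
typedef struct varatt_external
{
	int32		va_rawsize;		/* Original data size (includes header) */
	uint32		va_extinfo;		/* External saved size (without header) and
								 * compression method */
	Oid			va_valueid;		/* Unique ID of value within TOAST table */
	Oid			va_toastrelid;	/* RelID of TOAST table containing it */
}			varatt_external;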
Re: [PATCH] Compression dictionaries for JSONB
Hi, I think I should open a new thread related to TOAST pointer refactoring based on Pluggable TOAST, COPY and looping in retrieving new TOAST value OID issues. On Wed, Apr 26, 2023 at 4:00 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi Nikita, > > > The External TOAST pointer is very limited to the amount of service data > > it could keep, that's why we introduced the Custom TOAST pointers in the > > Pluggable TOAST. But keep in mind that changing the TOAST pointer > > structure requires a lot of quite heavy modifications in the core - > along with > > some obvious places like insert/update/delete datum there is very serious > > issue with logical replication. > > The Pluggable TOAST was rejected, but we have a lot of improvements > > based on changing the TOAST pointer structure. > > Now I see what you meant [1]. I agree that we should focus on > refactoring TOAST pointers first. So I suggest we continue discussing > this in a corresponding thread and return to this one later. > > [1]: > https://www.postgresql.org/message-id/CAJ7c6TPSvR2rKpoVX5TSXo_kMxXF%2B-SxLtrpPaMf907tX%3DnVCw%40mail.gmail.com > > -- > Best regards, > Aleksander Alekseev > -- Regards, -- Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi! Widening the TOAST pointer is possible if we keep backward compatibility with old-fashioned TOAST tables - I mean distinguishing 'long' and 'short' TOAST pointers and processing them accordingly in the insert and delete paths, as well as in vacuum and logical replication. It is not very difficult, but it takes some effort. Recently I found out that I had not covered all compatibility cases, so the provided patch is functional but limited in compatibility. We already have a flag byte in the TOAST pointer which is responsible for the type of the pointer - the va_flag field; it was explained in the Pluggable TOAST patch. One is enough, there is no need to add another one. On Wed, Apr 26, 2023 at 3:55 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi, > > > I agree that we can't simply widen varatt_external to use 8 bytes for > > the toast ID in all cases. > > +1 > > Note that the user may have a table with multiple TOASTable > attributes. If we simply widen the TOAST pointer it may break the > existing tables in the edge case. Also this may be a reason why > certain users may prefer to continue using narrow pointers. IMO wide > TOAST pointers should be a table option. Whether the default for new > tables should be wide or narrow pointers is debatable. > > In another discussion [1] we seem to agree that we also want to have > an ability to include a 32-bit dictionary_id to the TOAST pointers and > perhaps support more compression methods (ZSTD to name one). Besides > that it would be nice to have an ability to extend TOAST pointers in > the future without breaking the existing pointers. One possible > solution would be to add a varint feature bitmask to every pointer. So > we could add flags like TOAST_IS_WIDE, TOAST_HAS_DICTIONARY, > TOAST_UNKNOWN_FEATURE_FROM_2077, etc indefinitely. > > I suggest we address all the current and future needs once and > completely refactor TOAST pointers rather than solving one problem at > a time. I believe this will be more beneficial for the community in > the long term. > > Thoughts? > > [1]: https://commitfest.postgresql.org/43/3626/ > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns
Hi, This is a production case for large databases with high update rates, but it is often mistaken for hitting the table size limit, although the size limit itself is handled correctly. The note on the TOAST limitation does not mention that TOAST values are not actually updated on an UPDATE operation - the old value is marked as dead and a new one is inserted, and dead values have to be vacuumed before a value OID can be reused. The worst part is that the INSERT/UPDATE does not fail when no OID is available - it just spins in an infinite loop searching for an unused OID. On Sat, Apr 22, 2023 at 6:42 PM Gurjeet Singh wrote: > On Fri, Apr 21, 2023 at 12:14 AM Nikita Malakhov > wrote: > > This limitation applies not only to wide tables - it also applies to tables where TOASTed values > > are updated very often. You would soon be out of available TOAST value ID because in case of > > high frequency updates autovacuum cleanup rate won't keep up with the updates. It is discussed > > in [1]. > > > > On Fri, Apr 21, 2023 at 9:39 AM Jakub Wartak < jakub.war...@enterprisedb.com> wrote: > >> I would like to ask if it wouldn't be good idea to copy the > >> https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit > >> discussion (out-of-line OID usage per TOAST-ed columns / potential > >> limitation) to the official "Appendix K. PostgreSQL Limits" with also > >> little bonus mentioning the "still searching for an unused OID in > >> relation" notice. Although it is pretty obvious information for some > >> and from commit 7fbcee1b2d5f1012c67942126881bd492e95077e and the > >> discussion in [1], I wonder if the information shouldn't be a little > >> more well known via the limitation (especially to steer people away > >> from designing very wide non-partitioned tables). > >> > >> [1] - https://www.postgresql.org/message-id/flat/16722-93043fb459a41073%40postgresql.org > > > > [1] https://www.postgresql.org/message-id/CAN-LCVPRvRzxeUdYdDCZ7UwZQs1NmZpqBUCd%3D%2BRdMPFTyt-bRQ%40mail.gmail.com > > These 2 discussions show that it's a painful experience to run into > this problem, and that the hackers have ideas on how to fix it, but > those fixes haven't materialized for years. So I would say that, yes, > this info belongs in the hard-limits section, because who knows how > long it'll take this to be fixed. > > Please submit a patch. > > I anticipate that edits to Appendix K Postgres Limits will prompt > improving the note in there about the maximum column limit, That note > is too wordy, and sometimes confusing, especially for the audience > that it's written for: newcomers to Postgres ecosystem. > > Best regards, > Gurjeet https://Gurje.et > http://aws.amazon.com > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi! No, it wasn't. It was a proposal, I thought I'd get some feedback on it before sending it to commitfest. On Sat, Apr 22, 2023 at 6:17 PM Gurjeet Singh wrote: > On Thu, Dec 22, 2022 at 10:07 AM Nikita Malakhov > wrote: > > Any suggestions on the previous message (64-bit toast value ID with > individual counters)? > > Was this patch ever added to CommitFest? I don't see it in the current > Open Commitfest. > > https://commitfest.postgresql.org/43/ > > Best regards, > Gurjeet http://Gurje.et > http://aws.amazon.com > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns
Hi! This limitation applies not only to wide tables - it also applies to tables where TOASTed values are updated very often. You would soon be out of available TOAST value ID because in case of high frequency updates autovacuum cleanup rate won't keep up with the updates. It is discussed in [1]. On Fri, Apr 21, 2023 at 9:39 AM Jakub Wartak wrote: > Hi -hackers, > > I would like to ask if it wouldn't be good idea to copy the > https://wiki.postgresql.org/wiki/TOAST#Total_table_size_limit > discussion (out-of-line OID usage per TOAST-ed columns / potential > limitation) to the official "Appendix K. PostgreSQL Limits" with also > little bonus mentioning the "still searching for an unused OID in > relation" notice. Although it is pretty obvious information for some > and from commit 7fbcee1b2d5f1012c67942126881bd492e95077e and the > discussion in [1], I wonder if the information shouldn't be a little > more well known via the limitation (especially to steer people away > from designing very wide non-partitioned tables). > > Regards, > -J. > > [1] - > https://www.postgresql.org/message-id/flat/16722-93043fb459a41073%40postgresql.org > > > [1] https://www.postgresql.org/message-id/CAN-LCVPRvRzxeUdYdDCZ7UwZQs1NmZpqBUCd%3D%2BRdMPFTyt-bRQ%40mail.gmail.com -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: [PATCH] Compression dictionaries for JSONB
there is no expensive TOAST lookup for the > attributes of the tuple; but that doesn't seem to be an option with > that design. > > > When user wants a given attribute to be compressed, he/she says: > > > > ALTER TABLE foo ALTER COLUMN bar SET STORAGE DICTIONARY; > > > > And the compression algorithms is chosen as usual: > > > > ALTER TABLE foo ALTER COLUMN bar SET COMPRESSION lz4; > > > > When there are no dictionaries yet, DICTIONARY works the same as > > EXTENDED. When a dictionary is trained the data is compressed using > > the latest version of this dictionary. For visibility we are going to > > need some sort of pg_stat_dictionaries view that shows the existing > > dictionaries, how much space they consume, etc. > > I think "AT_AC SET COMPRESSION lz4 {[WITH | WITHOUT] DICTIONARY}", > "AT_AC SET COMPRESSION lz4-dictionary", or "AT_AC SET > compression_dictionary = on" would be better from a design > perspective. > > > If we choose this approach there are a couple of questions/notes that > > come to mind: > > > > * The default compression algorithm is PGLZ and unlike LZ4 it doesn't > > support training dictionaries yet. This should be straightforward to > > implement though, or alternatively shared dictionaries could work only > > for LZ4; > > Didn't we get zstd support recently as well? > > > * Currently users have control of toast_tuple_target but not > > TOAST_TUPLE_THRESHOLD. Which means for tuples smaller than 1/4 of the > > page size shared dictionaries are not going to be triggered. Which is > > not necessarily a bad thing. Alternatively we could give the users > > toast_tuple_threshold setting. This shouldn't necessarily be part of > > this CF entry discussion however, we can always discuss it separately; > > That makes a lot of sense, but as you said handling that separately > would probably be better and easier to review. > > > * Should we allow setting DICTIONARY storage strategy for a given > > type, i.e. CREATE TYPE baz STORAGE = DICTIONARY? I suggest we forbid > > it in the first implementation, just for the sake of simplicity. > > Can we specify a default compression method for each postgresql type, > just like how we specify the default storage? If not, then the setting > could realistically be in conflict with a default_toast_compression > setting, assuming that dictionary support is not a requirement for > column compression methods. > > > * It looks like we won't be able to share a dictionary between > > multiple columns. Which again is not necessarily a bad thing: data in > > these columns can be completely different (e.g. BYTEA and XML), > > columns can be dropped independently, etc. > > Yes > > > If a user is interested in > > sharing a dictionary between several columns he/she can join these > > columns in a single JSONB column. > > It is unreasonable to expect this to be possible, due to e.g. > partitioning resulting in columns that share compressable patters to > be on different physical tables. > > > * TOAST currently doesn't support ZSTD. IMO this is not a big deal and > > adding the corresponding support can be discussed separately. > > * If memory serves, there were not so many free bits left in TOAST > > pointers. The pointers don't store a storage strategy though so > > hopefully this will not be a problem. We'll see. > > The toast pointer must store enough info about the compression used to > decompress the datum, which implies it needs to store the compression > algorithm used, and a reference to the compression dictionary (if > any). 
I think the idea about introducing a new toast pointer type (in > the custom toast patch) wasn't bad per se, and that change would allow > us to carry more or different info in the header. > > Kind regards, > > Matthias van de Meent > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: SQL/JSON revisited
Hi hackers! The latest SQL standard contains dot notation for JSON. Are there any plans to include it into Pg 16? Or maybe we should start a separate thread for it? Thanks! On Tue, Apr 4, 2023 at 3:36 PM Alvaro Herrera wrote: > On 2023-Apr-04, Amit Langote wrote: > > > On Tue, Apr 4, 2023 at 2:16 AM Alvaro Herrera > wrote: > > > > - the gram.y solution to the "ON ERROR/ON EMPTY" clauses is quite ugly. > > > I think we could make that stuff use something similar to > > > ConstraintAttributeSpec with an accompanying post-processing > function. > > > That would reduce the number of ad-hoc hacks, which seem excessive. > > > > Do you mean the solution involving the JsonBehavior node? > > Right. It has spilled as the separate on_behavior struct in the core > parser %union in addition to the raw jsbehavior, which is something > we've gone 30 years without having, and I don't see why we should start > now. > > This stuff is terrible: > > json_exists_error_clause_opt: > json_exists_error_behavior ON ERROR_P { $$ = $1; } > | /* EMPTY */ { $$ = NULL; } > ; > > json_exists_error_behavior: > ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, > NULL); } > | TRUE_P{ $$ = makeJsonBehavior(JSON_BEHAVIOR_TRUE, > NULL); } > | FALSE_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_FALSE, > NULL); } > | UNKNOWN { $$ = makeJsonBehavior(JSON_BEHAVIOR_UNKNOWN, > NULL); } > ; > > json_value_behavior: > NULL_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL); > } > | ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, > NULL); } > | DEFAULT a_expr{ $$ = > makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); } > ; > > json_value_on_behavior_clause_opt: > json_value_behavior ON EMPTY_P > { $$.on_empty = $1; $$.on_error = > NULL; } > | json_value_behavior ON EMPTY_P json_value_behavior ON ERROR_P > { $$.on_empty = $1; $$.on_error = $4; } > | json_value_behavior ON ERROR_P > { $$.on_empty = NULL; $$.on_error = > $1; } > | /* EMPTY */ > { $$.on_empty = NULL; $$.on_error = > NULL; } > ; > > json_query_behavior: > ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, > NULL); } > | NULL_P{ $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, > NULL); } > | EMPTY_P ARRAY { $$ = > makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); } > /* non-standard, for Oracle compatibility only */ > | EMPTY_P { $$ = > makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL); } > | EMPTY_P OBJECT_P { $$ = > makeJsonBehavior(JSON_BEHAVIOR_EMPTY_OBJECT, NULL); } > | DEFAULT a_expr{ $$ = > makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2); } > ; > > json_query_on_behavior_clause_opt: > json_query_behavior ON EMPTY_P > { $$.on_empty = $1; $$.on_error = > NULL; } > | json_query_behavior ON EMPTY_P json_query_behavior ON ERROR_P > { $$.on_empty = $1; $$.on_error = $4; } > | json_query_behavior ON ERROR_P > { $$.on_empty = NULL; $$.on_error = > $1; } > | /* EMPTY */ > { $$.on_empty = NULL; $$.on_error = > NULL; } > ; > > Surely this can be made cleaner. > > By the way -- that comment about clauses being non-standard, can you > spot exactly *which* clauses that comment applies to? > > -- > Álvaro Herrera PostgreSQL Developer — > https://www.EnterpriseDB.com/ > "El número de instalaciones de UNIX se ha elevado a 10, > y se espera que este número aumente" (UPM, 1972) > > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: SQL JSON path enhanced numeric literals
Hi! Sorry to bother, but there is a question on JsonPath - how many bits in the JsonPath header could be used for the version? The JsonPath header is 4 bytes, and currently the Version part is defined as #define JSONPATH_VERSION (0x01) Thanks! On Sun, Mar 5, 2023 at 6:55 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 03.03.23 21:16, Dean Rasheed wrote: > > I think this new feature ought to be mentioned in the docs somewhere. > > Perhaps a sentence or two in the note below table 9.49 would suffice, > > since it looks like that's where jsonpath numbers are mentioned for > > the first time. > > Done. I actually put it into the data types chapter, where some other > differences between SQL and SQL/JSON syntax were already discussed. > > > In jsonpath_scan.l, I think the hex/oct/bininteger cases could do with > > a comment, such as > > > > /* Non-decimal integers in ECMAScript; must not have underscore after > radix */ > > hexinteger0[xX]{hexdigit}(_?{hexdigit})* > > octinteger0[oO]{octdigit}(_?{octdigit})* > > bininteger0[bB]{bindigit}(_?{bindigit})* > > > > since that's different from the main lexer's syntax. > > done > > > Perhaps it's worth mentioning that difference in the docs. > > done > > > Otherwise, this looks good to me. > > committed > > > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
Re: JsonPath version bits
Hi hackers! Could the 1 byte from the JsonPath header be used to store version? Or how many bits from the header could be used for the version value? On Mon, Mar 27, 2023 at 12:54 PM Nikita Malakhov wrote: > Hi hackers! > > I've got a question on the JsonPath header - currently the header size > is 4 bytes, where there are version and mode bits. Is there somewhere > a defined size of the version part? There are some extensions working > with JsonPath, and we have some too, thus it is important how many > bits is it possible to use to store a version value? > > -- Regards, Nikita Malakhov Postgres Professional The Russian Postgres Company https://postgrespro.ru/
JsonPath version bits
Hi hackers! I've got a question on the JsonPath header - currently the header size is 4 bytes, holding both version and mode bits. Is the size of the version part defined anywhere? There are some extensions working with JsonPath, and we have some too, so it is important to know how many bits can be used to store a version value. Thanks! -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
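For reference, this is roughly how the header is defined and checked today, as far as I can tell from jsonpath.h and the jsonpath executor (please correct me if I misread it) - which would leave everything except the single lax/strict bit available for a version value:

/* From the current headers: the 4-byte header keeps the lax/strict mode
 * in the top bit and the version in the remaining bits. */
#define JSONPATH_VERSION	(0x01)
#define JSONPATH_LAX		(0x80000000)

/* And execution rejects anything but the current version, roughly: */
if ((path->header & ~JSONPATH_LAX) != JSONPATH_VERSION)
	elog(ERROR, "unrecognized jsonpath version: %d",
		 (int) (path->header & ~JSONPATH_LAX));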
Re: [PATCH] Compression dictionaries for JSONB
Hi! If I understand Andres' message correctly, the proposition is to make the use of compression dictionaries automatic, possibly just by setting a parameter when the table is created, something like CREATE TABLE t ( ..., t JSONB USE DICTIONARY); The question is then how to create such dictionaries automatically and extend them while data is being added to the table, because it is not unusual for circumstances to change over time so that a rather small table starts being loaded with huge amounts of data. I prefer extending a dictionary over re-creating it, because while a dictionary is being recreated we leave users two choices - to wait until dictionary creation is over, or to use the old version (say, kept as a snapshot while the new one is created). Keeping many versions simultaneously does not make sense and would inflate the DB size. Also, compressing small data with a large dictionary (the case for a one-dictionary-for-many-tables setup) would, I think, add considerable overhead to INSERT/UPDATE commands, so the most reasonable choice is a per-table dictionary. Am I right? Any ideas on how to create and extend such dictionaries automatically? On Thu, Feb 9, 2023 at 2:01 PM Andres Freund wrote: > Hi, > > On February 9, 2023 2:50:57 AM PST, Aleksander Alekseev < aleksan...@timescale.com> wrote: > >Hi Andres, > > > >> > So to clarify, are we talking about tuple-level compression? Or > >> > perhaps page-level compression? > >> > >> Tuple level. > > > >> although my own patch proposed attribute-level compression, not > >> tuple-level one, it is arguably closer to tuple-level approach than > >> page-level one > > > >Just wanted to make sure that by tuple-level we mean the same thing. > > > >When saying tuple-level do you mean that the entire tuple should be > >compressed as one large binary (i.e. similarly to page-level > >compression but more granularly), or every single attribute should be > >compressed separately (similarly to how TOAST does this)? > > Good point - should have been clearer. I meant attribute wise compression. > Like we do today, except that we would use a dictionary to increase > compression rates. > > Andres > -- > Sent from my Android device with K-9 Mail. Please excuse my brevity. > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: [PATCH] Compression dictionaries for JSONB
Hi, On updating dictionary - >You cannot "just" refresh a dictionary used once to compress an >object, because you need it to decompress the object too. and when you have many - updating an existing dictionary requires going through all objects compressed with it in the whole database. It's a very tricky question how to implement this feature correctly. Also, there are some thoughts on using JSON schema to optimize storage for JSON objects. (That's applied to the TOAST too, so at first glance we've decided to forbid dropping or changing TOAST implementations already registered in a particular database.) In my experience, in modern world, even with fast SSD storage arrays, with large database (about 40-50 Tb) we had disk access as a bottleneck more often than CPU, except for the cases with a lot of parallel execution threads for a single query (Oracle). On Mon, Feb 6, 2023 at 10:33 PM Andres Freund wrote: > Hi, > > On 2023-02-06 16:16:41 +0100, Matthias van de Meent wrote: > > On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev > > wrote: > > > > > > Hi, > > > > > > I see your point regarding the fact that creating dictionaries on a > > > training set is too beneficial to neglect it. Can't argue with this. > > > > > > What puzzles me though is: what prevents us from doing this on a page > > > level as suggested previously? > > > > The complexity of page-level compression is significant, as pages are > > currently a base primitive of our persistency and consistency scheme. > > +many > > It's also not all a panacea performance-wise, datum-level decompression can > often be deferred much longer than page level decompression. For things > like > json[b], you'd hopefully normally have some "pre-filtering" based on proper > columns, before you need to dig into the json datum. > > It's also not necessarily that good, compression ratio wise. Particularly > for > wider datums you're not going to be able to remove much duplication, > because > there's only a handful of tuples. Consider the case of json keys - the > dictionary will often do better than page level compression, because it'll > have the common keys in the dictionary, which means the "full" keys never > will > have to appear on a page, whereas page-level compression will have the > keys on > it, at least once. > > Of course you can use a dictionary for page-level compression too, but the > gains when it works well will often be limited, because in most OLTP usable > page-compression schemes I'm aware of, you can't compress a page all that > far > down, because you need a small number of possible "compressed page sizes". > > > > > More similar data you compress the more space and disk I/O you save. > > > Additionally you don't have to compress/decompress the data every time > > > you access it. Everything that's in shared buffers is uncompressed. > > > Not to mention the fact that you don't care what's in pg_attribute, > > > the fact that schema may change, etc. There is a table and a > > > dictionary for this table that you refresh from time to time. Very > > > simple. > > > > You cannot "just" refresh a dictionary used once to compress an > > object, because you need it to decompress the object too. > > Right. That's what I was trying to refer to when mentioning that we might > need > to add a bit of additional information to the varlena header for datums > compressed with a dictionary. > > Greetings, > > Andres Freund > -- Regards, -- Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi, It'll be great to see more opinions on the approach as a whole. >The problem that I see being raised here, is that there was little >discussion and no observed community consensus about the design of >this complex feature *before* this patch with high complexity was >provided. >The next action that was requested is to take a step back and decide >how we would want to implement type-aware TOASTing (and the associated >patch compression dictionaries) *before* we look into the type-aware >toasting. We decided to present this improvement as a patch because we thought that the most complex and questionable part would be the TOAST implementations (the Toasters) themselves, while Pluggable TOAST is just a tool to make plugging different TOAST implementations in clean and simple. -- Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi! The existing TOAST has several very painful drawbacks - the lack of an UPDATE operation, bloating of TOAST tables, and limits that are implicitly imposed on base tables by their TOAST tables - so it seems unfair to say that Pluggable TOAST does not solve any problems but just introduces new ones. The main reason behind this decision is that keeping the first implementation outside of vanilla (I mean rebasing it) over time is very difficult due to the very invasive nature of this solution. So we decided to reduce changes in the core to the minimum necessary to make it available through hooks, because the hooks part is very lightweight and simple to keep rebasing onto the vanilla core. We plan to keep this extension free under the PostgreSQL license, so any PostgreSQL user could benefit from TOAST on steroids, and sometime in the future it will be a much simpler task to integrate Pluggable TOAST into vanilla, along with our advanced TOAST implementations, which we plan to keep under Open Source licenses too. On Mon, Feb 6, 2023 at 1:49 PM Alvaro Herrera wrote: > On 2023-Feb-06, Nikita Malakhov wrote: > > > Currently we're busy revising the whole Pluggable TOAST API to make it > > available as an extension and based on hooks to minimize changes in > > the core. It will be available soon. > > Hmm, I'm not sure why would PGDG want to accept such a thing. I read > "minimize changes" as "open source Postgres can keep their crap > implementation and companies will offer good ones for a fee". I'd > rather have something that can give users direct benefit -- not hide it > behind proprietary licenses forcing each company to implement their own > performant toaster. > > -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi hackers! In response to an opinion in the thread on Compression dictionaries for JSONB [1]: >The approaches are completely different, >but they seem to be trying to fix the same problem: the fact that the >default TOAST stuff isn't good enough for JSONB. The problem, actually, is that the default TOAST is often not good enough for modern loads and amounts of data. Pluggable TOAST is based not only on pure enthusiasm but on demands and tickets from production databases. The main demand is an effective and transparent storage subsystem for large values of some problematic data types, which we already have, with proven efficiency. So we're really quite surprised that it has gotten so little feedback. We've got some opinions on the approach, but no general one on the approach itself, apart from doubts about whether the TOAST mechanism needs revision at all. Currently we're busy revising the whole Pluggable TOAST API to make it available as an extension, based on hooks, to minimize changes in the core. It will be available soon. > [1] https://postgr.es/m/20230203095540.zutul5vmsbmantbm@alvherre.pgsql > -- Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! Yes, I've checked that. What would be the desirable behavior in the case above? Anyway, waiting for the table to be unlocked seems not quite right. On Sat, Jan 21, 2023 at 4:12 AM Nathan Bossart wrote: > On Mon, Jan 16, 2023 at 11:18:08AM +0300, Alexander Pyhalov wrote: > > Is it intended? Why don't we perform vacuum_is_permitted_for_relation() > > check for inheritors in expand_vacuum_rel()? > > Since no lock is held on the partition, the calls to functions like > object_ownercheck() and pg_class_aclcheck() in > vacuum_is_permitted_for_relation() will produce cache lookup ERRORs if the > relation is concurrently dropped. > > -- > Nathan Bossart > Amazon Web Services: https://aws.amazon.com > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: run pgindent on a regular basis / scripted manner
Hi! I've run pgindent on the whole of Postgres and it changed an awful lot of source files. Won't that create a lot of merge conflicts? On Mon, Jan 23, 2023 at 8:48 PM Alvaro Herrera wrote: > On 2023-Jan-23, Tom Lane wrote: > > > 1. [...] So now I think that we should > > stick to the convention that it's on the user to install > > pg_bsd_indent somewhere in their PATH; all we'll be doing with > > this change is eliminating the step of fetching pg_bsd_indent's > > source files from somewhere else. > > +1 > > > 2. Given #1, it'll be prudent to continue having pgindent > > double-check that pg_bsd_indent reports a specific version > > number. We could imagine starting to use the main Postgres > > version number for that, but I'm inclined to continue with > > its existing numbering series. > > +1 > > > 3. If we do nothing special, the first mass reindentation is > > going to reformat the pg_bsd_indent sources per PG style, > > which is ... er ... not the way they look now. Do we want > > to accept that outcome, or take steps to prevent pgindent > > from processing pg_bsd_indent? I have a feeling that manual > > cleanup would be necessary if we let such reindentation > > happen, but I haven't experimented. > > Hmm, initially it must just be easier to have an exception so that > pg_bsd_indent itself isn't indented. > > -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > #error "Operator lives in the wrong > universe" > ("Use of cookies in real-time system development", M. Gleixner, M. Mc > Guire) > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! For the test Alexander described in the beginning of the discussion - the results are postgres@postgres=# set role regress_vacuum_conflict; SET Time: 0.324 ms postgres@postgres=> vacuum vacuum_tab; WARNING: permission denied to vacuum "vacuum_tab", skipping it WARNING: permission denied to vacuum "vacuum_tab_1", skipping it WARNING: permission denied to vacuum "vacuum_tab_2", skipping it VACUUM Time: 1.782 ms postgres@postgres=> vacuum full; WARNING: permission denied to vacuum "pg_statistic", skipping it WARNING: permission denied to vacuum "vacuum_tab", skipping it ... postgres@postgres=> cluster vacuum_tab; ERROR: must be owner of table vacuum_tab Time: 0.384 ms For the standard role "Postgres" the behavior is the same as before patch. On Thu, Jan 19, 2023 at 10:37 AM Alexander Pyhalov wrote: > Justin Pryzby писал 2023-01-19 04:49: > > On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote: > >> Hi, > >> > >> Currently there is no error in this case, so additional thrown error > >> would > >> require a new test. > >> Besides, throwing an error here does not make sense - it is just a > >> check > >> for a vacuum > >> permission, I think the right way is to just skip a relation that is > >> not > >> suitable for vacuum. > >> Any thoughts or objections? > > > > Could you check if this is consistent between the behavior of VACUUM > > FULL and CLUSTER ? See also Nathan's patches. > > Hi. > > Cluster behaves in a different way - it errors out immediately if > relation is not owned by user. For partitioned rel it would anyway raise > error later. > VACUUM and VACUUM FULL behave consistently after applying Nikita's patch > (for partitioned and regular tables) - issue warning "skipping > TABLE_NAME --- only table or database owner can vacuum it" and return > success status. > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! I've found the discussion you'd mentioned before, checking now. On Thu, Jan 19, 2023 at 4:49 AM Justin Pryzby wrote: > On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote: > > Hi, > > > > Currently there is no error in this case, so additional thrown error > would > > require a new test. > > Besides, throwing an error here does not make sense - it is just a check > > for a vacuum > > permission, I think the right way is to just skip a relation that is not > > suitable for vacuum. > > Any thoughts or objections? > > Could you check if this is consistent between the behavior of VACUUM > FULL and CLUSTER ? See also Nathan's patches. > > -- > Justin > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi hackers! Alexander found a very good issue. Please check the solution above. Any objections? It's a production case, please review, any thoughts and objections are welcome. On Mon, Jan 16, 2023 at 8:15 PM Alexander Pyhalov wrote: > Nikita Malakhov писал 2023-01-16 20:12: > > Hi, > > > > Currently there is no error in this case, so additional thrown error > > would require a new test. > > Besides, throwing an error here does not make sense - it is just a > > check for a vacuum > > permission, I think the right way is to just skip a relation that is > > not suitable for vacuum. > > Any thoughts or objections? > > > > No objections for not throwing an error. > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi, Currently there is no error in this case, so additional thrown error would require a new test. Besides, throwing an error here does not make sense - it is just a check for a vacuum permission, I think the right way is to just skip a relation that is not suitable for vacuum. Any thoughts or objections? On Mon, Jan 16, 2023 at 7:46 PM Alexander Pyhalov wrote: > Nikita Malakhov писал 2023-01-16 17:26: > > Hi! > > > > Here's the patch that fixes this case, please check it out. > > The patch adds vacuum_is_permitted_for_relation() check before adding > > partition relation to the vacuum list, and if permission is denied the > > relation > > is not added, so it is not passed to vacuum_rel() and there are no try > > to > > acquire the lock. > > > > Cheers! > > Hi. > > The patch seems to solve the issue. > Two minor questions I have: > 1) should we error out if HeapTupleIsValid(part_tuple) is false? > 2) comment "Check partition relations for vacuum permit" seems to be > broken in some way. > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Inconsistency in vacuum behavior
Hi! Here's the patch that fixes this case, please check it out. The patch adds vacuum_is_permitted_for_relation() check before adding partition relation to the vacuum list, and if permission is denied the relation is not added, so it is not passed to vacuum_rel() and there are no try to acquire the lock. Cheers! On Mon, Jan 16, 2023 at 4:48 PM Nikita Malakhov wrote: > Hi! > > I've checked this expand_vacuum_rel() and made a quick fix for this.Here's > the result of the test: > > postgres@postgres=# set role regress_vacuum_conflict; > SET > Time: 0.369 ms > postgres@postgres=> vacuum vacuum_tab; > WARNING: permission denied to vacuum "vacuum_tab", skipping it > WARNING: permission denied to vacuum "vacuum_tab_1", skipping it > WARNING: permission denied to vacuum "vacuum_tab_2", skipping it > VACUUM > Time: 0.936 ms > postgres@postgres=> > > Looks like it's a subject for a patch. > > On Mon, Jan 16, 2023 at 11:18 AM Alexander Pyhalov < > a.pyha...@postgrespro.ru> wrote: > >> Hi. >> >> We've run regress isolation tests on partitioned tables and found >> interesting VACUUM behavior. I'm not sure, if it's intended. >> >> In the following example, partitioned tables and regular tables behave >> differently: >> >> CREATE TABLE vacuum_tab (a int) PARTITION BY HASH (a); >> CREATE TABLE vacuum_tab_1 PARTITION OF vacuum_tab FOR VALUES WITH >> (MODULUS 2, REMAINDER 0); >> CREATE TABLE vacuum_tab_2 PARTITION OF vacuum_tab FOR VALUES WITH >> (MODULUS 2, REMAINDER 1); >> CREATE ROLE regress_vacuum_conflict; >> >> In the first session: >> >> begin; >> LOCK vacuum_tab IN SHARE UPDATE EXCLUSIVE MODE; >> >> In the second: >> SET ROLE regress_vacuum_conflict; >> VACUUM vacuum_tab; >> WARNING: permission denied to vacuum "vacuum_tab", skipping it < >> hangs here, trying to lock vacuum_tab_1 >> >> In non-partitioned case second session exits after emitting warning. In >> partitioned case, it hangs, trying to get locks. >> This is due to the fact that in expand_vacuum_rel() we skip parent table >> if vacuum_is_permitted_for_relation(), but don't perform such check for >> its child. >> The check will be performed later in vacuum_rel(), but after >> vacuum_open_relation(), which leads to hang in the second session. >> >> Is it intended? Why don't we perform vacuum_is_permitted_for_relation() >> check for inheritors in expand_vacuum_rel()? >> >> -- >> Best regards, >> Alexander Pyhalov, >> Postgres Professional >> >> >> > > -- > Regards, > Nikita Malakhov > Postgres Professional > https://postgrespro.ru/ > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/ 0001_expand_vac_rel_part_chk_v1.patch Description: Binary data
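Since the attachment is stored as binary data in the archive, the shape of the change is roughly the following (a sketch reconstructed from the description above, not the patch itself; list handling and variable names are illustrative):

/*
 * Sketch only: inside expand_vacuum_rel(), check each partition with
 * vacuum_is_permitted_for_relation() before queueing it, so vacuum_rel()
 * never even tries to lock a partition the role is not allowed to vacuum.
 */
foreach(lc, part_oids)
{
	Oid			part_oid = lfirst_oid(lc);
	HeapTuple	part_tuple;

	part_tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(part_oid));
	if (!HeapTupleIsValid(part_tuple))
		continue;				/* partition dropped concurrently, skip it */

	if (vacuum_is_permitted_for_relation(part_oid,
										 (Form_pg_class) GETSTRUCT(part_tuple),
										 options))
		vacrels = lappend(vacrels,
						  makeVacuumRelation(NULL, part_oid, NIL));

	ReleaseSysCache(part_tuple);
}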
Re: Inconsistency in vacuum behavior
Hi! I've checked this expand_vacuum_rel() and made a quick fix for this.Here's the result of the test: postgres@postgres=# set role regress_vacuum_conflict; SET Time: 0.369 ms postgres@postgres=> vacuum vacuum_tab; WARNING: permission denied to vacuum "vacuum_tab", skipping it WARNING: permission denied to vacuum "vacuum_tab_1", skipping it WARNING: permission denied to vacuum "vacuum_tab_2", skipping it VACUUM Time: 0.936 ms postgres@postgres=> Looks like it's a subject for a patch. On Mon, Jan 16, 2023 at 11:18 AM Alexander Pyhalov wrote: > Hi. > > We've run regress isolation tests on partitioned tables and found > interesting VACUUM behavior. I'm not sure, if it's intended. > > In the following example, partitioned tables and regular tables behave > differently: > > CREATE TABLE vacuum_tab (a int) PARTITION BY HASH (a); > CREATE TABLE vacuum_tab_1 PARTITION OF vacuum_tab FOR VALUES WITH > (MODULUS 2, REMAINDER 0); > CREATE TABLE vacuum_tab_2 PARTITION OF vacuum_tab FOR VALUES WITH > (MODULUS 2, REMAINDER 1); > CREATE ROLE regress_vacuum_conflict; > > In the first session: > > begin; > LOCK vacuum_tab IN SHARE UPDATE EXCLUSIVE MODE; > > In the second: > SET ROLE regress_vacuum_conflict; > VACUUM vacuum_tab; > WARNING: permission denied to vacuum "vacuum_tab", skipping it < > hangs here, trying to lock vacuum_tab_1 > > In non-partitioned case second session exits after emitting warning. In > partitioned case, it hangs, trying to get locks. > This is due to the fact that in expand_vacuum_rel() we skip parent table > if vacuum_is_permitted_for_relation(), but don't perform such check for > its child. > The check will be performed later in vacuum_rel(), but after > vacuum_open_relation(), which leads to hang in the second session. > > Is it intended? Why don't we perform vacuum_is_permitted_for_relation() > check for inheritors in expand_vacuum_rel()? > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Pluggable toaster
Hi! Fails due to recent changes. Working on it. On Sat, Jan 14, 2023 at 9:56 AM vignesh C wrote:
> On Sun, 8 Jan 2023 at 01:40, Nikita Malakhov wrote:
> > Hi!
> > Thank you for your attention.
> > I've rebased the patchset onto the latest master (from 07.01), but the second part is still
> > in pre-patch shape - it is working, but tests are failing due to changes in TOAST relations
> > logic - I haven't adapted 'em yet.
>
> The patch does not apply on top of HEAD as in [1], please post a rebased patch:
> === Applying patches on top of PostgreSQL commit ID c44f6334ca6ff6d242d9eb6742441bc4e1294067 ===
> === expanding ./0002_toaster_default_v25.patch.gz
> === expanding ./0001_toaster_interface_v25.patch.gz
> === expanding ./0004_drop_toaster_v25.patch.gz
> === expanding ./0003_pg_toastrel_control_v25.patch.gz
> === applying patch ./0001_toaster_interface_v25.patch
> patching file src/include/postgres.h
> Hunk #1 succeeded at 80 with fuzz 2 (offset 4 lines).
> Hunk #2 FAILED at 148.
> Hunk #3 FAILED at 315.
> Hunk #4 FAILED at 344.
> Hunk #5 FAILED at 359.
> 4 out of 5 hunks FAILED -- saving rejects to file src/include/postgres.h.rej
>
> [1] - http://cfbot.cputube.org/patch_41_3490.log
>
> Regards,
> Vignesh
-- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: How to define template types in PostgreSQL
Hi! Perhaps an API could be created that defines a general function set with polymorphic input and dispatches to an implementation chosen by the actual input type? On Sat, Jan 7, 2023 at 12:32 PM Esteban Zimanyi wrote: > Dear all > > MobilityDB (https://github.com/MobilityDB/MobilityDB) defines at the C > level four template types: Set, Span, SpanSet, and Temporal. The type Set > is akin to PostgreSQL's ArrayType restricted to one dimension, but enforces > the constraint that sets do not have duplicates, the types Span and SpanSet > are akin to PostgreSQL's RangeType and MultirangeType but enforce the > constraints that span types are of fixed length and that empty spans and > infinite bounds are not allowed, and the type Temporal is used to > manipulate time-varying values. > > These template types need to be instantiated at the SQL level with base > types (int, bigint, float, timestamptz, text, ...) and because of this, > MobilityDB needs to define numerous SQL functions that all call the same > function in C. Taking as example the Set type, we need to define, e.g., > > CREATE FUNCTION intset_eq(intset, intset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > CREATE FUNCTION bigintset_eq(bigintset, bigintset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > CREATE FUNCTION floatset_eq(floatset, floatset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > CREATE FUNCTION textset_eq(textset, textset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_eq' ... > ... > > CREATE FUNCTION intset_ne(intset, intset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > CREATE FUNCTION bigintset_ne(bigintset, bigintset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > CREATE FUNCTION floatset_ne(floatset, floatset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > CREATE FUNCTION textset_ne(textset, textset) RETURNS bool AS > 'MODULE_PATHNAME', 'Set_ne' ... > ... > > In the case of arrays, ranges, and multiranges, PostgreSQL avoids this > redundancy using pseudo-types such as anyarray, anyrange, anymultirange, ... > > Is there a possibility that we can also define pseudo types such as > anyset, anyspan, anyspanset, anytemporal, ...? > > This will considerably reduce the number of SQL functions to define. > Currently, given the high number of functions in MobilityDB, creating the > extension takes a long time. > > Regards > > Esteban > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
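A minimal sketch of that dispatch idea follows (assumptions: a polymorphic SQL-level declaration would be available, Set_eq is the entry-point name taken from the quoted message, and set_eq_int4()/set_eq_float8() are hypothetical placeholder implementations, not MobilityDB code):

#include "postgres.h"
#include "fmgr.h"
#include "catalog/pg_type.h"

PG_MODULE_MAGIC;

/* Hypothetical per-type implementations; real ones would compare the sets. */
static Datum
set_eq_int4(PG_FUNCTION_ARGS)
{
    PG_RETURN_BOOL(false);      /* placeholder body */
}

static Datum
set_eq_float8(PG_FUNCTION_ARGS)
{
    PG_RETURN_BOOL(false);      /* placeholder body */
}

PG_FUNCTION_INFO_V1(Set_eq);

Datum
Set_eq(PG_FUNCTION_ARGS)
{
    /* Resolve the concrete type the polymorphic first argument was called with. */
    Oid         argtype = get_fn_expr_argtype(fcinfo->flinfo, 0);

    switch (argtype)
    {
        case INT4OID:
            return set_eq_int4(fcinfo);
        case FLOAT8OID:
            return set_eq_float8(fcinfo);
        default:
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("set equality not implemented for type %u", argtype)));
    }

    PG_RETURN_NULL();           /* not reached */
}

With a single polymorphic entry point like this, only one SQL-level declaration per operation would be needed instead of one per base type.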
Re: split TOAST support out of postgres.h
Hi, I've thought about this while working on Pluggable TOAST and the 64-bit TOAST value ID myself. I agree that #2 seems the best of the options above. There are not so many places where the new header would need to be included. On Wed, Dec 28, 2022 at 6:08 PM Tom Lane wrote: > Peter Eisentraut writes: > > ... Then we could either > > > 1) Include varatt.h in postgres.h, similar to elog.h and palloc.h. That > > way we clean up the files a bit but don't change any external interfaces. > > > 2) Just let everyone who needs it include the new file. > > > 3) Compromise: You can avoid most "damage" by having fmgr.h include > > varatt.h. That satisfies most data types and extension code. That way, > > there are only a few places that need an explicit include of varatt.h. > > > I went with the last option in my patch. > > I dunno, #3 seems kind of unprincipled. Also, since fmgr.h is included > so widely, I doubt it is buying very much in terms of reducing header > footprint. How bad is it to do #2? > > regards, tom lane > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
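To make the practical impact of option #2 concrete, here is a minimal hypothetical sketch (assuming the varlena access macros end up in the new varatt.h, as discussed) of what extension code touching varlenas would look like; the only change is one extra include:

#include "postgres.h"
#include "fmgr.h"
#include "varatt.h"             /* now needed explicitly for VARSIZE_ANY_EXHDR() etc. */

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(my_text_length);

/* Hypothetical example function: returns the byte length of a text value. */
Datum
my_text_length(PG_FUNCTION_ARGS)
{
    text       *t = PG_GETARG_TEXT_PP(0);

    PG_RETURN_INT32(VARSIZE_ANY_EXHDR(t));
}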
Re: Passing relation metadata to Exec routine
Hi Tom! Thank you for your feedback. I agree that for columns computed with joins, grouping, etc., considering properties of the base table does not make sense at all. But for CREATE TABLE LIKE and simple columns inherited from existing relations it does, if we consider some advanced properties: from the user's perspective we want the new table's columns to behave exactly like the base ones (in some ways, like encryption, storage, and compression methods). LIKE options are a good idea, thank you, but maybe we should take this into account for CREATE TABLE AS too? I agree that passing these parameters in a backdoor fashion is not very transparent or user-friendly, either. On Tue, Dec 27, 2022 at 12:56 AM Tom Lane wrote: > Nikita Malakhov writes: > > While working on Pluggable TOAST [1] we found out that creation > > of new relation with CREATE TABLE AS... or CREATE TABLE LIKE - > > method > > static ObjectAddress create_ctas_internal(List *attrList, IntoClause > *into) > > does not receive any metadata from columns or tables used in query > > (if any). It makes sense to pass not only column type and size, but > > all other metadata - like attoptions, base relation OID (and, maybe, > > reloptions), if the column from existing relation was used. > > I am very, very skeptical of the premise here. > > CREATE TABLE AS creates a table based on the output of a query. > That query could involve arbitrarily complex processing -- joins, > grouping, what-have-you. I don't see how it makes sense to > consider the properties of the base table(s) in deciding how to > create the output table. I certainly do not think it'd be sane for > that to behave differently depending on how complicated the query is. > > As for CREATE TABLE LIKE, the point of that is to create a table > by copying a *specified* set of properties of a reference table. > I don't think letting an access method copy some other properties > behind the user's back is a great idea. If you've got things that > it'd make sense to be able to inherit, let's discuss adding more > LIKE options to support that --- in which case the implementation > would presumably pass the data through in a non-back-door fashion. > > regards, tom lane > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Passing relation metadata to Exec routine
Hi hackers! While working on Pluggable TOAST [1] we found out that when a new relation is created with CREATE TABLE AS... or CREATE TABLE LIKE, the method static ObjectAddress create_ctas_internal(List *attrList, IntoClause *into) does not receive any metadata from the columns or tables used in the query (if any). It makes sense to pass not only column type and size, but all other metadata - like attoptions, base relation OID (and, maybe, reloptions), if a column from an existing relation was used. A good example is the creation of a new relation from a base one where some other Toaster was assigned to a column - it seems reasonable that the same column in the new table should have the same Toaster assigned as the base one. And we already have a couple of other practical uses for the metadata passed along with column definitions. Any thoughts or suggestions? [1] https://commitfest.postgresql.org/41/3490/ -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: ARRNELEMS Out-of-bounds possible errors
Hi, My bad, I was misled by the unconditional return in ereturn_domain. Sorry for the noise. On Sat, Dec 24, 2022 at 7:05 PM Tom Lane wrote: > Nikita Malakhov writes: > > Even with null context it does not turn to ereport, and returns dummy > value > > Read the code. ArrayGetNItems passes NULL for escontext, therefore > if there's a problem the ereturn calls in ArrayGetNItemsSafe will > throw error, *not* return -1. > > Not sure how we could persuade Coverity of that, though, > if it fails to understand that for itself. > > regards, tom lane > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
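A minimal caller-side sketch may make the point concrete (hypothetical helper, assuming the soft-error API added by commit ccff2d2): with a non-NULL ErrorSaveContext the error is stashed and the dummy value is returned, but with a NULL escontext the same ereturn() falls back to ordinary error reporting and throws instead of returning.

#include "postgres.h"
#include "nodes/miscnodes.h"

/* Hypothetical helper using the soft-error API. */
static int
parse_positive_int_safe(const char *str, Node *escontext)
{
    int         val = atoi(str);

    if (val <= 0)
        ereturn(escontext, -1,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("value \"%s\" must be a positive integer", str)));

    return val;                 /* reached only when the value is valid */
}

A caller that passes NULL gets a regular ERROR on bad input, which is exactly why the -1 path flagged by Coverity cannot be taken through ArrayGetNItems().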
Re: ARRNELEMS Out-of-bounds possible errors
Hi, Even with a null context it does not turn into ereport, and returns the dummy value -

#define errsave_domain(context, domain, ...) \
	do { \
		struct Node *context_ = (context); \
		pg_prevent_errno_in_scope(); \
		if (errsave_start(context_, domain)) \
			__VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
	} while(0)

#define errsave(context, ...) \
	errsave_domain(context, TEXTDOMAIN, __VA_ARGS__)

/*
 * "ereturn(context, dummy_value, ...);" is exactly the same as
 * "errsave(context, ...); return dummy_value;".  This saves a bit
 * of typing in the common case where a function has no cleanup
 * actions to take after reporting a soft error.  "dummy_value"
 * can be empty if the function returns void.
 */
#define ereturn_domain(context, dummy_value, domain, ...) \
	do { \
		errsave_domain(context, domain, __VA_ARGS__); \
		return dummy_value; \
	} while(0)

#define ereturn(context, dummy_value, ...) \
	ereturn_domain(context, dummy_value, TEXTDOMAIN, __VA_ARGS__)

On Fri, Dec 23, 2022 at 11:40 AM Kyotaro Horiguchi wrote: > At Fri, 23 Dec 2022 17:37:55 +0900 (JST), Kyotaro Horiguchi < > horikyota@gmail.com> wrote in > > At Thu, 22 Dec 2022 12:35:58 -0300, Ranier Vilela > wrote in > > > Hi. > > > > > > Per Coverity. > > > > > > The commit ccff2d2 > > > < > https://github.com/postgres/postgres/commit/ccff2d20ed9622815df2a7deffce8a7b14830965 > >, > > > changed the behavior function ArrayGetNItems, > > > with the introduction of the function ArrayGetNItemsSafe. > > > > > > Now ArrayGetNItems may return -1, according to the comment. > > > " instead of throwing an exception. -1 is returned after an error." > > > > If I'm reading the code correctly, it's the definition of > > ArrayGetNItems*Safe*. ArrayGetNItems() calls that function with a NULL > > escontext and the NULL turns ereturn() into ereport(). > > > That doesn't seem to be changed by the commit. > > No.. It seems to me that the commit didn't change its behavior in that > regard. > > > Of course teaching Coverity not to issue the false warnings would be > > another actual issue that we should do, maybe. > > -- > Kyotaro Horiguchi > NTT Open Source Software Center > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: ARRNELEMS Out-of-bounds possible errors
Hi, The most obvious solution I see is to check all calls and, for cases like the ones we both mentioned, pass a flag saying whether safe or unsafe behavior is expected, like:

#define ARRNELEMS(x)  ArrayGetNItems(ARR_NDIM(x), ARR_DIMS(x), false)
...
int
ArrayGetNItems(int ndim, const int *dims, bool issafe)
{
	return ArrayGetNItemsSafe(ndim, dims, NULL, issafe);
}

int
ArrayGetNItemsSafe(int ndim, const int *dims, struct Node *escontext, bool issafe)
{
...

Another solution is a revision of the wrapping code for all calls of ArrayGetNItems. Safe functions are a good idea overall, but a lot of code needs to be revised. On Fri, Dec 23, 2022 at 1:20 AM Ranier Vilela wrote: > On Thu, Dec 22, 2022 at 3:45 PM Nikita Malakhov > wrote: > >> Hi, >> >> Actually, many more sources would be affected, like >> nbytes += subbytes[outer_nelems]; >> subnitems[outer_nelems] = ArrayGetNItems(this_ndims, >> ARR_DIMS(array)); >> nitems += subnitems[outer_nelems]; >> havenulls |= ARR_HASNULL(array); >> outer_nelems++; >> } >> >> Maybe it is better for most calls like this to keep old behavior, by >> passing a flag >> that says which behavior is expected by caller? >> > I agreed that it is better to keep old behavior. > Even the value 0 is problematic, with calls like this: > > nel = ARRNELEMS(ent); > memcpy(ptr, ARRPTR(ent), nel * sizeof(int32)); > > regards, > Ranier Vilela > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: ARRNELEMS Out-of-bounds possible errors
Hi, Actually, many more sources would be affected, like:

nbytes += subbytes[outer_nelems];
subnitems[outer_nelems] = ArrayGetNItems(this_ndims, ARR_DIMS(array));
nitems += subnitems[outer_nelems];
havenulls |= ARR_HASNULL(array);
outer_nelems++;
}

Maybe it is better for most calls like this to keep the old behavior, by passing a flag that says which behavior is expected by the caller? On Thu, Dec 22, 2022 at 6:36 PM Ranier Vilela wrote: > Hi. > > Per Coverity. > > The commit ccff2d2 > <https://github.com/postgres/postgres/commit/ccff2d20ed9622815df2a7deffce8a7b14830965>, > changed the behavior of the function ArrayGetNItems, > with the introduction of the function ArrayGetNItemsSafe. > > Now ArrayGetNItems may return -1, according to the comment. > " instead of throwing an exception. -1 is returned after an error." > > So the macro ARRNELEMS can fail entirely with -1 return, > resulting in code using the result without checking the function return. > > Like (contrib/intarray/_int_gist.c): > { > int nel; > > nel = ARRNELEMS(ent); > memcpy(ptr, ARRPTR(ent), nel * sizeof(int32)); > } > > Sources possibly affected: > contrib\cube\cube.c > contrib\intarray\_intbig_gist.c > contrib\intarray\_int_bool.c > contrib\intarray\_int_gin.c > contrib\intarray\_int_gist.c > contrib\intarray\_int_op.c > contrib\intarray\_int_tool.c: > > Thoughts? > > regards, > Ranier Vilela > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Hi hackers! Any suggestions on the previous message (64-bit toast value ID with individual counters)? On Tue, Dec 13, 2022 at 1:41 PM Nikita Malakhov wrote: > Hi hackers! > > I've prepared a patch with a 64-bit TOAST Value ID. It is a kind of > prototype > and needs some further work, but it is already working and ready to play > with. > > I've introduced 64-bit value ID field in varatt_external, but to keep it > compatible > with older bases 64-bit value is stored as a structure of two 32-bit > fields and value > ID moved to be the last in the varatt_external structure. Also, I've > introduced > different ID assignment function - instead of using global OID assignment > function > I use individual counters for each TOAST table, automatically cached and > after > server restart when new value is inserted into TOAST table maximum used > value > is searched and used to assign the next one. > > >Andres Freund: > >I think we'll need to do something about the width of varatt_external to > make > >the conversion to 64bit toast oids viable - and if we do, I don't think > >there's a decent argument for staying with 4 byte toast OIDs. I think the > >varatt_external equivalent would end up being smaller in just about all > cases. > >And as you said earlier, the increased overhead inside the toast table / > index > >is not relevant compared to the size of toasted datums. > > There is some small overhead due to indexing 64-bit values. Also, for > smaller > tables we can use 32-bit ID instead of 64-bit, but then we would have a > problem > when we reach the limit of 2^32. > > Pg_upgrade seems to be not a very difficult case if we go keeping old TOAST > tables using 32-bit values, > > Please have a look. I'd be grateful for some further directions. > > GIT branch with this feature, rebased onto current master: > https://github.com/postgrespro/postgres/tree/toast_64bit > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
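For readers following along, a hypothetical sketch of the TOAST-pointer layout described above (field names and order are guesses based on the description — two 32-bit halves, with the value ID moved to the end — and not the actual patch):

typedef struct varatt_external_64       /* hypothetical name */
{
	int32		va_rawsize;		/* original data size (includes header) */
	uint32		va_extinfo;		/* external saved size and compression method */
	Oid			va_toastrelid;	/* RelID of the TOAST table containing the value */
	uint32		va_valueid_lo;	/* low 32 bits of the 64-bit value ID */
	uint32		va_valueid_hi;	/* high 32 bits of the 64-bit value ID */
} varatt_external_64;

Storing the ID as two 32-bit halves presumably keeps the struct free of any 8-byte alignment requirement (TOAST pointers are copied into unaligned varlena data), in line with the compatibility goal stated above.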
Re: Array initialisation notation in syscache.c
Hi! Wanted to ask this since I encountered a need for a cache with 5 keys - why is the syscache index still limited to 4 keys? Thanks! On Wed, Dec 21, 2022 at 7:36 PM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > On 21.12.22 04:16, Thomas Munro wrote: > > On Wed, Dec 21, 2022 at 1:33 PM Thomas Munro > wrote: > >> KEY(Anum_pg_attribute_attrelid, > >> Anum_pg_attribute_attnum), > > > > I independently rediscovered that our VA_ARGS_NARGS() macro in c.h > > always returns 1 on MSVC via trial-by-CI. Derp. Here is the same > > patch, no change from v2, but this time accompanied by Victor Spirin's > > fix, which I found via one of the tab-completion-is-busted-on-Windows > > discussions. I can't supply a useful commit message, because I > > haven't understood why it works, but it does indeed seem to work and > > this should make cfbot green. > > This looks like a good improvement to me. > > (I have also thought about having this generated from the catalog > definition files somehow, but one step at a time ...) > > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
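For context, the VA_ARGS_NARGS() problem quoted above refers to the classic argument-counting trick with variadic macros. A simplified generic illustration (capped at four arguments; these are not the exact c.h/syscache.c definitions) is shown below; MSVC's traditional preprocessor reportedly forwards __VA_ARGS__ as a single token, which is why such a count can always come out as 1 there:

/* Pick the argument that lands in the N slot to count how many were passed. */
#define MY_NARGS_(_1, _2, _3, _4, N, ...) N
#define MY_NARGS(...) MY_NARGS_(__VA_ARGS__, 4, 3, 2, 1)

/* MY_NARGS(a, b) expands to 2; MY_NARGS(a, b, c, d) expands to 4. */

/* A KEY()-style wrapper then yields both the key count and the key list: */
#define MY_KEY(...)  MY_NARGS(__VA_ARGS__), { __VA_ARGS__ }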
Re: meson and tmp_install
Hi! That's great, thanks! The discussion thread is very long, so I missed this topic. Just a suggestion - I've checked the link above; maybe a small section should be added on where the build files are located and how to add new sources so the build succeeds? On Tue, Dec 20, 2022 at 9:22 PM Andres Freund wrote: > Hi, > > On 2022-12-20 21:11:26 +0300, Nikita Malakhov wrote: > > Didn't know where to ask, so I've chosen this thread - there is no > > documentation on the meson build platform in PostgreSQL docs. > > There is now: > https://www.postgresql.org/docs/devel/install-meson.html > > Needs further work, but it's a start. > > > > Is this okay? For me it was a surprise when the meson platform was > > added > > It's been discussed on the list for a year or so before it was > added. It's a large change, so unfortunately it's not something that I > could get done in a single day, with perfect docs from the get go. > > Greetings, > > Andres Freund > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: meson and tmp_install
Hi! Didn't know where to ask, so I've chosen this thread - there is no documentation on the meson build platform in PostgreSQL docs. Is this okay? For me it was a surprise when the meson platform was added, and I had to spend some time sweeping through the meson docs when I added new source files, to build Postgres successfully. Thanks! On Tue, Dec 20, 2022 at 8:30 PM Andres Freund wrote: > Hi, > > On 2022-12-20 07:12:04 -0500, Andrew Dunstan wrote: > > Yesterday when testing a patch I got annoyed when my test failed. I > > tested it like this: > > > > meson test ldap_password_func/001_mutated_bindpasswd > > > > It turned out that I needed to do this: > > > > meson test tmp_install ldap_password_func/001_mutated_bindpasswd > > > > The Makefile equivalent ensures that you have a tmp_install without > > having to request it explicitly. It would be nice if we could make meson > > do the same thing, and also honor NO_TEMP_INSTALL if set. > > I would like that too, but there's no easy fix that doesn't have > downsides as far as I am aware. We could make the temp install a build > target that the tests depend on, but for historical reasons in meson > that means that the 'all' target depends on temp-install. Which isn't > great. > > My current thinking is that we should get away from needing the > temporary install and instead allow to run the tests against the build > directory itself. The temp-install adds a fair bit of overhead and > failure potential. The only reason we need it is that a) initdb and a > few other programs insist that postgres needs to be in the same > directory b) contrib modules currently need to reside in one single > directory. > > Greetings, > > Andres Freund > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: New strategies for freezing, advancing relfrozenxid early
Hi! I'll try to apply this patch onto my branch with Pluggable TOAST to test these mechanics with the new TOAST, and will reply with the result. It could be difficult though, because both have a lot of changes that affect the same code. >I'm not sure how much this would help with bloat. I suspect that it >could make a big difference with the right workload. If you always >need frequent autovacuums, just to deal with bloat, then there is >never a good time to run an aggressive antiwraparound autovacuum. An >aggressive AV will probably end up taking much longer than the typical >autovacuum that deals with bloat. While the aggressive AV will remove >as much bloat as any other AV, in theory, that might not help much. If >the aggressive AV takes as long as (say) 5 regular autovacuums would >have taken, and if you really needed those 5 separate autovacuums to >run, just to deal with the bloat, then that's a real problem. The >aggressive AV effectively causes bloat with such a workload. On Tue, Dec 20, 2022 at 12:01 PM Jeff Davis wrote: > On Sun, 2022-12-18 at 14:20 -0800, Peter Geoghegan wrote: > > Attached is v10, which fixes this issue, but using a different > > approach to the one I sketched here. > > In 0001, it's fairly straightforward rearrangement and looks like an > improvement to me. I have a few complaints, but they are about pre- > existing code that you moved around, and I like that you didn't > editorialize too much while just moving code around. +1 from me. > > > -- > Jeff Davis > PostgreSQL Contributor Team - AWS > > > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: Patch: Global Unique Index
Hi! Sorry to bother - but is this patch used in IvorySQL? Here: https://www.ivorysql.org/docs/Global%20Unique%20Index/create_global_unique_index According to the syntax it definitely looks like this patch. Thank you! On Sat, Dec 3, 2022 at 3:05 AM David Zhang wrote: > On 2022-11-29 6:16 p.m., Tom Lane wrote: > > Assuming that you are inserting into index X, and you've checked > > index Y to find that it has no conflicts, what prevents another > > backend from inserting a conflict into index Y just after you look? > > AIUI the idea is to prevent that by continuing to hold an exclusive > > lock on the whole index Y until you've completed the insertion. > > Perhaps there's a better way to do that, but it's not what was > > described. > Another main change in patch > `0004-support-global-unique-index-insert-and-update.patch`,
> +search_global:
> +    stack = _bt_search(iRel, insertstate.itup_key,
> +                       &insertstate.buf, BT_READ, NULL);
> +    xwait = _bt_check_unique_gi(iRel, &insertstate,
> +                                hRel, checkUnique, &is_unique,
> +                                &speculativeToken, heapRel);
> +    if (unlikely(TransactionIdIsValid(xwait)))
> +    {
> ... ...
> +        goto search_global;
> +    }
> Here, I am trying to use `BT_READ` to require a LW_SHARED lock on the > buffer block if a match found using `itup_key` search key. The > cross-partition uniqueness checking will wait if the index tuple > insertion on this buffer block has not done yet, otherwise runs the > uniqueness check to see if there is an ongoing transaction which may > insert a conflict value. Once the ongoing insertion is done, it will go > back and check again (I think it can also handle the case that a > potential conflict index tuple was later marked as dead in the same > transaction). Based on this change, my test results are: > > 1) a select-only query will not be blocked by the ongoing insertion on > index X > > 2) insertion happening on index Y may wait for the buffer block lock > when inserting a different value but it does not wait for the > transaction lock held by insertion on index X. > > 3) when an insertion inserting a conflict value on index Y, > 3.1) it waits for buffer block lock if the lock has been held by > the insertion on index X. > 3.2) then, it waits for transaction lock until the insertion on > index X is done. > > > > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: On login trigger: take three
Hi, Mikhail, I've checked the patch and the previous discussion, and the issue mentioned earlier still applies:

>The example trigger from the documentation

+DECLARE
+  hour integer = EXTRACT('hour' FROM current_time);
+  rec boolean;
+BEGIN
+-- 1. Forbid logging in between 2AM and 4AM.
+IF hour BETWEEN 2 AND 4 THEN
+  RAISE EXCEPTION 'Login forbidden';
+END IF;

>can be bypassed with PGOPTIONS='-c timezone=...'. Probably this is
>nothing new and concerns any SECURITY DEFINER function, but still...

along with

+IF hour BETWEEN 8 AND 20 THEN

It seems to be a minor security issue, so I am mentioning it just in case you haven't noticed it. On Fri, Dec 16, 2022 at 9:14 PM Mikhail Gribkov wrote: > Hi hackers, > Attached v33 is a new rebase of the flagless version of the patch. As > there were no objections at first glance, I’ll try to post it to the > upcoming commitfest, thus the brief recap of all the patch details is below. > > v33-On_client_login_event_trigger > The patch introduces a trigger on login event, allowing to fire some > actions right on the user connection. This can be useful for logging or > connection check purposes as well as for some personalization of the > environment. Usage details are described in the documentation included, but > shortly usage is the same as for other triggers: create function returning > event_trigger and then create event trigger on login event. > > The patch is prepared to be applied to the master branch and tested when > applied after e52f8b301ed54aac5162b185b43f5f1e44b6b17e commit by Thomas > Munro (Date: Fri Dec 16 17:36:22 2022 +1300). > Regression tests and documentation included. > > A couple of words about what and why I changed compared to the previous > author's version. > First, the (en/dis)abling GUC was removed from the patch because > ideologically it is a separate feature and nowadays it’s discussed and > supported in a separate thread by Daniel Gustaffson: > > https://www.postgresql.org/message-id/flat/9140106E-F9BF-4D85-8FC8-F2D3C094A6D9%40yesql.se > Second, I have removed the dathasloginevt flag. The flag was initially > added to the patch for performance reasons and it did the job, although it > was quite a clumsy construct causing tons of bugs and could potentially > lead to tons more during further PostgreSQL evolution. I have removed the > flag and found that the performance drop is not that significant. > > And this is an aspect I should describe in more detail. > The possible performance drop is expected as an increased time used to > login a user NOT using a login trigger. > First of all, the method of performance check: > echo 'select 1;' > ./tst.sql > pgbench -n -C -T3 -f tst.sql -U postgres postgres > The output value "average connection time" is the one I use to compare > performance. > Now, what are the results. > * master branch: 0.641 ms > * patched version: 0.644 ms > No significant difference here and it is just what was expected. Based on > the patch design the performance drop can be expected when there are lots > of event triggers created, but not the login one. Thus I have created 1000 > drop triggers (the script for creating triggers is attached too) in the > database and repeated the test: > * master branch: 0.646 ms > * patched version: 0.754 ms > For 2000 triggers the patched version connection time is further increased > to 0.862. Thus we have a login time rise in about 16.5% per 1000 event > triggers in the database. It is a statistically noticeable value, still I > don’t think it’s a critical one we should be afraid of. > N.B.
> The exact values of the login times slightly differ from the ones I > posted in the previous email. Well, that’s the repeatability level we have. > This convinces me even more that the observed level of performance drop is > acceptable. > -- > best regards, > Mikhail A. Gribkov > > e-mail: youzh...@gmail.com > http://www.flickr.com/photos/youzhick/albums > http://www.strava.com/athletes/5085772 > phone: +7(916)604-71-12 > Telegram: @youzhick > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/