> > Well, I guess the question is: if we don't offer some builtin way to
> > render non-standard formats built into company products, will those
> > companies fix their format or just not use PostgreSQL?
>
> Well, there is an advantage that Postgres has that some others don't:
> you can extend
> > As a user I'd really prefer all of this to be much more transparent,
> > and could well imagine the -Fc format to be some kind of TOC + zip of
> > table data + post load instructions (organized per table), or
> > something like this.
> > In fact just what you described, all embedded in a single
> > while we are at it -- one feature would be great for 8.4, an
> > ability to change shared buffers size "on the fly". I expect
> > it is not trivial, but it would help fine-tuning a running database.
> > I think the DBA would need to set a maximum shared buffers size
> > along with the normal setting.
>
Simon wrote:
> My proposal is to provide two additional modes:
> --schema-pre-load corresponding to (1) above
> --schema-post-load corresponding to (3) above
Sounds nice.
For a large schema we might rather want one switch that dumps 2 files, no?
Probably also better from an MVCC perspective.
> > And http://msdn2.microsoft.com/en-us/library/b842y285(VS.71).aspx appears to
> > suggest that the size of the field is fixed.
>
> That would imply that dumpbin fails at 4096 symbols per file. While I
> surely wouldn't put it past M$ to have put in such a limitation, I think
> it's more like
> > The plural seems better to me; there's no such thing as a solitary
> > synchronized scan, no? The whole point of the feature is to affect
> > the behavior of multiple scans.
>
> +1. The plural is important IMHO.
ok, good.
> As I stated earlier, I don't really like this argument (we already
> > +1. If we go with 'enable_sync_seqcans' for 8.3, and in a future
> > release cycle we do test the cases Simon described above and we agree
> > we need to do a fine tune to benefit from this feature, we will need
> > to deprecate 'enable_sync_seqscans' and invent another one
> > (sync_seqscans_t
> It's a good point that we don't want pg_dump to screw up the cluster
> order, but that's the only use case I've seen this far for disabling
> sync scans. Even that wouldn't matter much if our estimate for
> "clusteredness" didn't get screwed up by a table that looks
> like this:
> "5 6 7 8
> >> I liked the "synchronized_sequential_scans" idea myself.
>
> > I think that's a bit too long. How about "synchronized_scans", or
> > "synchronized_seqscans"?
>
> We have enable_seqscan already, so that last choice seems to fit in.
Yes, looks good; how about synchronized_seqscan, without the plural?
> >> This seems like a pretty horrid idea. Bad pages shouldn't be allowed
> >> to get into shared buffers in the first place. Why not have the
> >> checking logic operate outside shared buffers?
>
> > It currently works outside the shared buffers, but I am afraid about
> > collision due to parall
> > Personally I find the automatic partition idea intriguing, where you
> > only have to choose an expression that equates to one value (value
> > group) per partition (and possibly a way to derive a partition name).
>
> IMO, better go right to a fully automated approach. Or why would you
>
> > I don't agree with that at all. I can imagine plenty of situations
> > where a tuple falling outside the range of available partitions
> > *should* be treated as an error. For instance, consider timestamped
> > observations --- data in the future is certainly bogus, and data
> > further back tha
> > Traditionally materialized views exist, so that you do not need to code
> > differently.
> > Your queries still run on the detail table, but are silently answered
> > by a suitable MV. The MV might have count + other aggregated columns
> > grouped by some columns, and thus be able e.g. shortc
> > But you had to modify your queries. I would think that a materialized
> > views implementation worth its salt would put the view to work on the
> > original, unmodified queries.
> >
>
> I might be slow today (everyday? :-) ) - but what do you mean by this?
> The only difference between *
> I've kept a list of requests for improvement that I can share with you;
> I've always been loath to publish a list of bad points.
I think it would help understand the proposal if you also present the
shortcomings.
When you presented the positive and negative points, the negative list
did look
> > Yeah, I think it would be useful to log one message if after (say) 5
> > seconds we still haven't been able to open the file.
>
> Either that, or on the first run.
Imho 1-5s is better, so that would be after the 10-50th try.
> loop. It's supposed to loop 300 times.
Yes.
> > (Are we OK wi
> > Note that even though the processor is 99% in wait state the drive is
> > only handling about 3 MB/s. That translates into a seek time of 2.2ms,
> > which is actually pretty fast... But note that if this were a raid
> > array Postgres wouldn't be getting any better results. A Raid array
> > wou
> Getting partitioning/read-only right will allow 70+TB of that to be on
> tape or similar, which with compression can be reduced to maybe 20TB? I
> don't want to promise any particular compression ratio, but it will
> make a substantial difference, as I'm sure you realise.
Wouldn't one very subst
> There are a number of nasty
> limitations for partitions currently (not the least of which is that
> real uniqueness guarantees are impractical),
Just to add another opinion to this statement, because it imho sounds
overly pessimistic:
Uniqueness is currently perfectly practical, when the un
> > gettext is converting po (EUC_JP) to SJIS.
Yes.
> Are you sure about that? Why would gettext be converting to SJIS, when
> SJIS is nowhere in the environment it can see?
gettext is using GetACP() on Windows, wherever that gets its info from ...
"chcp" did change the GetACP codepage in Hir
> > But that requires a) dealing with the problem of the parent table
> > which has no constraints and ...
Imho we should provide a mechanism that forces the parent to be empty
and let the planner know,
e.g. a false constraint on the parent ONLY.
Andreas
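Spelled out, the idea might look like this (table and constraint names are hypothetical; 8.x-era inheritance-based partitioning syntax):

```sql
-- The ONLY keyword keeps the constraint off the children, and
-- CHECK (false) tells the planner the parent itself can never
-- hold rows, so constraint exclusion can skip it.
CREATE TABLE measurements (ts timestamp, val int);
CREATE TABLE measurements_2007 (
    CHECK (ts >= '2007-01-01' AND ts < '2008-01-01')
) INHERITS (measurements);
ALTER TABLE ONLY measurements
    ADD CONSTRAINT measurements_parent_empty CHECK (false);
```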
> I don't think that should even be a TODO item --- it seems far more
> likely to provide a foot-gun than useful capability.
On further reflection I think that initdb time is probably sufficient.
Do you think that would be a reasonable TODO ?
> Whether 16MB is still a reasonable default segment
> Perhaps we should move the successful archived message to DEBUG1 now,
> except for the first message after the archiver starts or when the
> archive_command changes, plus one message every 255 segments?
> That would reduce the log volume in the normal case without
> endangering our ability to
Kevin Grittner wrote:
> > . . . the abuse of such hints in applications I have seen is so
> > rampant as to make me doubt the utility of adding them anyway. It's
> > true that by adding hints, you give a facility to a good, competent
> > designer who has a really
> I have trouble not seeing the point
> > I think the cool thing that Josh Berkus wants is
> >
> > return query select a, b, c from foo;
> > return query select d, e, f from bar;
> >
>
> maybe better
>
> SELECT a,b FROM foo;
> SELECT d, e FROM bar;
>
> a procedure usually doesn't need a return statement
The background for Quel was, th
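For reference, the two-statement form quoted above could be sketched like this (the function name and the tables foo/bar are hypothetical, and both selects are assumed to yield the same row type):

```sql
CREATE FUNCTION foo_then_bar()
RETURNS SETOF foo AS $$
BEGIN
    -- each RETURN QUERY appends its result set to the output;
    -- no trailing RETURN statement is required
    RETURN QUERY SELECT a, b, c FROM foo;
    RETURN QUERY SELECT d, e, f FROM bar;
END;
$$ LANGUAGE plpgsql;
```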
> A nice improvement on that would be to have a "rearchive_command" to
> allow syncing the new bytes written since a previous archive_command
> (so it needs a new placeholder "start from this byte"). This allows
> writing
> dd seek=%s skip=%s count=%b bs=1
But after a log switch nothing is filling
> > The probably useful next step would be to pass the current length to
> > the archive_command, so it can write the filled part of the file
> > without the need for a filter.
>
> I can see that helping a lot, but not by writing onto the file on
> disk. If the file is nearly empty, that wou
> > * Do we bump the .so major version number for libpq? I think we
> > should because there are two new exported functions since 8.2, and on
> > at least some platforms there's nothing else than major number to
> > disambiguate whether a client needs these or not. Comments?
-1. You don't bump
> > Attached is a modified version to implement both of these. I also
> > bailed out if there was surplus input. I tried an optimization of
> > allocating a separate buffer for outputting the zeros, to avoid
> > repeated memset calls.
> > It didn't seem to make a very big difference; do you think it
> The COLD updated (old) tuple would be pruned to dead line pointer
> once the tuple becomes DEAD. Normally that would let us reuse the
> tuple storage for other purposes. We do the same for DELETEd tuples.
Oh, I thought only pruned tuples from HOT chains can produce a
"redirect dead" line pointer
> CREATE TABLE test (a int, b char(200));
> CREATE UNIQUE INDEX testindx ON test(a);
> INSERT INTO test VALUES (1, 'foo');
>
> Now, if we repeatedly update the tuple so that each update is a
> COLD update, we would bloat the page with redirect-dead line pointers.
Um, sorry for not understanding,
> > I still don't see it as terrifically useful functionality, given
> > that it's just saving you 4 keystrokes ...
>
> Less than that, because the AS is optional. The only argument I've
> heard that carries much weight with me is that it eases porting from
> other DBMS's that allow this. Are
> > how much harder can it be to accept:
> >
> > group by 'foo'
Presumably you meant group by "foo".
Imho pg should accept group by "foo". It could be part of a constant
removal that also takes burden off the sort.
E.g. in "select x, count(*) from bar where x=5 group by x", x could be
removed
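A sketch of what such a constant-removal rewrite would do (bar is a hypothetical table): since the WHERE clause pins x to a single value, the grouping column carries no information, so the planner could evaluate the first query as if it were the second.

```sql
SELECT x, count(*) FROM bar WHERE x = 5 GROUP BY x;
-- same result whenever matching rows exist; note the GROUP BY
-- form returns zero rows when nothing matches, while the
-- ungrouped aggregate returns one row with count 0:
SELECT 5 AS x, count(*) FROM bar WHERE x = 5;
```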
> > If your implementation accepts:
> >
> > group by case when true then 'foo' end
I think he meant: group by case when true then "foo" end
>
> What would that mean? Regardless of whether or not it's accepted, it
> should have *some* meaning.
>
> It's not equivalent to GROUP BY "foo"
Yea,
Sounds reasonable, but why exactly did we spell out "english" instead of "en" ?
Seems the abbrev is much easier to extract from LANG or browser prefs ...
Andreas
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Tom Lane
Sent: Wednesday, 22 A
> Which brings us back to the original issue. If I decide to stick with
> the current implementation and not "improve our existing partitioning
> mechanisms to scale to 100,000 partitions", I could do something like:
There is a point where you can leave the selection of the correct rows
to norm
> >                unpatched   HOT
> > autovacuums    116         43
> > autoanalyzes   139         60
>
> > HOT greatly reduces the number of vacuums needed. That's good, that's
> > where the gains in throughput in longer I/O bound runs comes from.
>
> But surely failing to
> The NUM_BUFFER_PARTITIONS patch is fairly simple. We've
> noticed gains with NUM_BUFFER_PARTITIONS set between 256 and
> 2048, but little to no gain after 2048, although this might
> depend on the benchmark and platform being used. We've
Might this also be a padding issue, because 2048 part
> > Is there any reason we can't just use a check on whether SysLoggerPID
> > is not 0?
>
> (a) that really shouldn't be exported out of postmaster.c,
> and (b) it is not readily available to child backends is it?
Should there be child backends when the logger did not start?
I'd think star
> synchronous_commit
> Idea: Greg Stark
> Supporters: Simon, Josh, Tom, Bruce, Florian
There was one more:
asynchronous_commit
Idea: Florian G. Pflug
Supporters: none
But if you are calling the feature that (which imho is good), the guc
might as well get that name.
Andreas
> user). I can build as the admin user but when I come to run
> initdb it fails, complaining that it can't find the postgres
> executable.
FYI, this happens on my Win 2000 also.
Maybe a problem with mixed / \ path separators after RestrictExec.
Andreas
> > tests    | pgbench | DBT-2 response time (avg/90%/max)
> > ---------+---------+-----------------------------------
> > LDC only | 181 tps | 1.12 / 4.38 / 12.13 s
> > + BM_CHECKPOINT_NEEDED(*
> For some Unicode character sets, element_width can be as much as 4
In UTF8 one char can be up to 6 bytes, so 4 is not correct in general.
Andreas
> That's exactly the point. Consider
> select mytext from mytable ;
>
> How can PostgreSQL possibly know the maximum length of the
> returned values *before* it has scanned the whole table?
I think this focuses too much on those cases where it is not possible.
When it is not feasible like with a
> > Again, *all* other major relational databases do this ...
> even blob fields have a maximum length reported from the database.
>
> So what are you doing with the max length? Not all data types
> and values have a meaningful max length, so you have to be
> able to deal with variable length
> > Wouldn't it be far more logical to decide that if a user has the
> > permissions to do a DELETE FROM table; then they have permission to
> > do a TRUNCATE? Why make an additional permission?
>
> Truncate doesn't fire ON DELETE triggers.
Yes, but it would imho be ok if there aren't any on d
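The difference can be seen with a small sketch (table, function, and trigger names are hypothetical):

```sql
CREATE TABLE t (id int);
CREATE FUNCTION note_delete() RETURNS trigger AS $$
BEGIN
    RAISE NOTICE 'row deleted';
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER t_on_delete AFTER DELETE ON t
    FOR EACH ROW EXECUTE PROCEDURE note_delete();

DELETE FROM t;  -- fires t_on_delete once per deleted row
TRUNCATE t;     -- removes all rows without firing t_on_delete
```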
> > > > > The launcher is set up to wake up in autovacuum_naptime
> > > > > seconds at most.
> >
> > Imho the fix is usually to have a sleep loop.
>
> This is what we have. The sleep time depends on the schedule
> of next vacuum for the closest database in time. If naptime
> is high, the sl
> My next suggestion would be to leave EXTERN_TUPLES_PER_PAGE
> as is, but:
> Split data wider than a page into page sized chunks as long
> as they fill whole pages.
> Split the rest with EXTERN_TUPLES_PER_PAGE (4) as now.
> This would not waste more space than currently, but improve
> performa
> > > The launcher is set up to wake up in autovacuum_naptime seconds at
> > > most.
Imho the fix is usually to have a sleep loop.
Andreas
> > While I agree, that 2 might be a good compromise with low risk for
> > now, I think that toasting all rows down to ~512 bytes is too
> > narrowly targeted at not reading wider columns.
>
> Well, it is summarized here:
>
> http://momjian.us/expire/TOAST/SUMMARY.html
>
> It made non-T
> > No, you misunderstood. Bruce was suggesting changing the target to
> > 512. That means if a row is wider than ~2k, the toaster will try to
> > toast until the base row is ~512 bytes. I would not do that part for
> > 8.3.
>
> OK, what do you suggest for 8.3? Attached is my suggestion
> to use
> For example in 8.2 this is mapped to array_prepend:
>
> regression=# select 'x'::text || array['aa','bb','cc'];
>?column?
> --
> {x,aa,bb,cc}
> (1 row)
>
> but with the experimental code you get textcat:
>
> catany=# select 'x'::text || array['aa','bb','cc'];
> ?column?
> -
> > The big question is do we want to drop the target tuple size down to
> > 512, and increase the chunk size to 8k for 8.3? Dropping the tuple
> > size down to 512 is going to give us some smaller TOAST values to
> > fill in free space created by the 8k chunk size, assuming you have both
>
> >> Assume the following:
> >> index on: (id, adate)
> >> constraint CHECK(adate > '01-01-2007' AND adate < '04-01-2007');
> >
Um, the subject is CE, but the question is about an index? Those are
separate issues.
> >> The planner will not use the index listed above.
> > For what?
>
> select
> > > Given this, I propose we simply #ifdef out the SO_REUSEADDR on
> > > win32.
I agree that this is what we should do.
> > > (A fairly good reference to read up on the options is at
> > > http://msdn2.microsoft.com/en-us/library/ms740621.aspx
> >
> > Hmm ... if accurate, that page says in words
> I tested EXTERN_TUPLES_PER_PAGE for values 4(default), 2, and 1:
>
> 4 15.596
> 2 15.197
> 1 14.6
>
> which is basically a 3% decrease from 4->2 and 2->1. The
> test script and result are here:
>
> http://momjian.us/expire/TOAST2/
>
> shared_buffer
> > However, it suddenly struck me that we could
> > probably make most of the problem go away if we put that same wait
> > into DROP DATABASE itself --- that is, if we see other backends in
> > the target DB, sleep for a second or two and then recheck before
> > erroring out.
Yup, waiting in drop da
> > Whereas if you set toast_tuples_per_page to 8k then the only option
> > for Postgres will be to put each datum in its own page and waste
> > 1-3k on every page.
>
> No, because actually the code is designed to make the toast
> chunk size just enough less than 8K that the tuples fit.
He
> I reran the tests with hashtext(), and created a SUMMARY.HTML chart:
>
> http://momjian.us/expire/TOAST/
>
> What you will see is that pushing TEXT to a TOAST column
> allows quick access to non-TOAST values and single-row TOAST
> values, but accessing all TOAST columns is slower than
> > However, I have just about convinced myself that we don't need
> > IsFirstByte for matching "_" for UTF8, either preceded by "%" or not,
> > as it should always be true. Can anyone come up with a counter example?
>
> You have to be on a first byte before you can meaningfully
> apply NextCh
> > > > 32 buffers = 1MB with 32KB blocksize, which spoils the CPU L2
> > > > cache effect.
I'd say in a scenario where 32k pages are indicated you will also want
larger than average L2 caches.
> > > >
> > > > How about using 256/blocksize?
The reading ahead uses 1/4 ring size. To the best of
> Close. There was an Msys build from the 9th running on port 5432.
> 2) VC++ and Msys builds will both happily start on the same
> port at the same time. The first one to start listens on 5432
> until it shuts down, at which point the second server takes
> over seamlessly! It doesn't matter w
> Sorry, 16x8K page ring is too small indeed. The reason we
> selected 16 is because greenplum db runs on 32K page size, so
> we are indeed reading 128K at a time. The #pages in the ring
> should be made relative to the page size, so you achieve 128K
> per read.
Ah, ok. New disks here also ha
> Also, that patch doesn't address the VACUUM issue at all. And
> using a small fixed size ring with scans that do updates can
> be devastating. I'm experimenting with different ring sizes
> for COPY at the moment. Too small ring leads to a lot of WAL
> flushes, it's basically the same problem
> In reference to the seq scans roadmap, I have just submitted
> a patch that addresses some of the concerns.
>
> The patch does this:
>
> 1. for small relation (smaller than 60% of bufferpool), use
> the current logic 2. for big relation:
> - use a ring buffer in heap scan
> - pin
> >> Are you filling multiple buffers in the buffer cache with a single
> >> read-call?
> >
> > yes, needs vector or ScatterGather IO.
>
> I would expect that to get only moderate improvement.
The vast improvement comes from 256k blocksize.
> To get
> the full benefit I would think you would
> >> What do you mean with using readahead inside the heapscan?
> >> Starting an async read request?
> >
> > Nope - just reading N buffers ahead for seqscans. Subsequent calls
> > use previously read pages. The objective is to issue contiguous reads
> > to the OS in sizes greater than the
> Nope - just reading N buffers ahead for seqscans. Subsequent
> calls use previously read pages. The objective is to issue
> contiguous reads to the OS in sizes greater than the PG page
> size (which is much smaller than what is needed for fast
> sequential I/O).
Problem here is that eight
> > Any suggestions? pgdiagnostics?
>
> Yes, I like "diagnostics", or "internals". I just think
> forensics isn't going to be understood by the average native
> English speaker, let alone non-English speakers.
I think forensics is ok. The world is currently being swamped with
related TV show
> So what happens if a backend is running with full_page_writes
> = off, someone edits postgresql.conf to turns it on and
> forgets to reload/ restart, and then we crash? You'll come up
> in recovery mode thinking that f_p_w was turned on, when in
> fact it wasn't.
>
> ISTM that we need to so
> > Writing to a different area was considered in pg, but there were
> > more negative issues than positive.
> > So imho pg_compresslog is the correct path forward. The current
> > discussion is only about whether we want a more complex
> > pg_compresslog and no change to current WAL, or an incr
> > 1) To deal with partial/inconsistent write to the data file at crash
> > recovery, we need full page writes at the first modification to
> > pages after each checkpoint. It consumes much of WAL space.
>
> We need to find a way around this someday. Other DBs don't
> do this; it may be be
> 3) To maintain crash recovery chance and reduce the amount of
> archive log, removal of unnecessary full page writes from
> archive logs is a good choice.
Definitely, yes. pg_compresslog could even move the full pages written
during backup out of WAL and put them in a different file that nee
> I don't insist on the name and the default of the GUC parameter.
> I'm afraid wal_fullpage_optimization = on (default) makes
> some confusion because the default behavior becomes a bit
> different on WAL itself.
Seems my wal_fullpage_optimization is not a good name if it caused
misinterpretati
> Hmm, I'll give it a go when I'm back in the office, but bear
> in mind this is a Mingw build on which debugging is nigh-on
> impossible.
I use the Snapshot
http://prdownloads.sf.net/mingw/gdb-6.3-2.exe?download from sf.net.
It has some issues, but it is definitely useable.
Andreas
> With DBT-2 benchmark, I've already compared the amount of WAL. The
> result was as follows:
>
> Amount of WAL after 60min. run of DBT-2 benchmark
> wal_add_optimization_info = off (default) 3.13GB
how about wal_fullpage_optimization = on (default)
> wal_add_optimization_info = on (new ca
I agree with Tom's reasoning about the suggested simplifications, sorry.
> > 3. Should the WALWriter also do the wal_buffers half-full write at
the
> > start of XLogInsert() ?
>
> That should go away entirely; to me the main point of the
> separate wal-writer process is to take over responsibi
> > Yup, this is a good summary.
> >
> > You say you need to remove the optimization that avoids the logging
> > of a new tuple because the full page image exists.
> > I think we must already have the info in WAL which tuple inside the
> > full page image is new (the one for which we avoided th
> I have this exact problem a lot. There are actually cases
> where you can eliminate regular joins, not just left joins.
> For example:
>
> CREATE TABLE partner (
>     id   serial,
>     name varchar(40) not null,
>     primary key (id)
> );
>
>
> I don't fully understand what "transaction log" means. If it means
> "archived WAL", the current (8.2) code handle WAL as follows:
Probably we can define "transaction log" to be the part of WAL that is
not full pages.
> 1) If full_page_writes=off, then no full page writes will be
> written
> Maybe odd, but simpler to optimize this way.
>
> Your idea would be also a very good optimization, there was
> already a discussion about that here:
> http://archives.postgresql.org/pgsql-performance/2006-01/msg00151.php,
> but that time Tom refused it because it was too expensive and rare
> > That page is ages out of date. The intended sync is
> apparently broken.
> > The current download area is on sourceforge
> > http://sf.net/project/showfiles.php?group_id=2435
> >
> >
>
> *sigh*
>
> And what is in 3.12, which is apparently the current version?
Sorry that was implied. sys
That page is ages out of date. The intended sync is apparently broken.
The current download area is on sourceforge
http://sf.net/project/showfiles.php?group_id=2435
Andreas
> > mingw-runtime-3.10 introduced a gettimeofday declaration in
> > sys/time.h that is not compatible with port.h.
> >
mingw-runtime-3.10 introduced a gettimeofday declaration in sys/time.h
that is not compatible with port.h.
(current is mingw-runtime-3.12)
int __cdecl gettimeofday(struct timeval *__restrict__,
void *__restrict__ /* tzp (unused) */);
The problem was already reported by
> > > I've been working on the patch to enhance our group commit
> > > behavior.
> > > The patch is a dirty hack at the moment, but I'm settled on the
> > > algorithm I'm going to use and I know the issues involved.
> >
> > One question that just came to mind is whether Simon's
> > no-commit-wait pa
> > (But that sounds rather like pie in the sky, actually. Which other
> > databases can do that, and how do they do it?)
>
> Oracle does it, by building a big index. Few people use it.
And others allow a different partitioning strategy for each index,
but that has the same problem of how to r
> > That lets you enforce unique constraints as long as the partition
> > key is part of the unique constraint.
>
> Is that already sufficient?
Yes.
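A sketch of why the partition key in the constraint is enough (tables hypothetical): equal keys always satisfy the same CHECK constraint, so they land in the same partition, where the local unique constraint catches the duplicate.

```sql
CREATE TABLE events (id int, adate date);
CREATE TABLE events_q1 (
    CHECK (adate >= '2007-01-01' AND adate < '2007-04-01'),
    UNIQUE (id, adate)
) INHERITS (events);
CREATE TABLE events_q2 (
    CHECK (adate >= '2007-04-01' AND adate < '2007-07-01'),
    UNIQUE (id, adate)
) INHERITS (events);
-- A duplicate (id, adate) pair must fall into the same quarter,
-- so one of the per-child UNIQUE constraints rejects it.
```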
> That would alter the ordering of
> the columns in the index, no? I mean:
It produces ordered blocks of append nodes for range queries that spa
> > When the database uses a single byte encoding, the chr function
> > takes the binary byte representation as an integer number between 0
> > and 255 (e.g. ascii code).
> > When the database encoding is one of the unicode encodings it takes
> > a unicode code point.
> > This is also what Orac
> What do others think? Should the argument to CHR() be a
> Unicode code point or the numeric representation of the
> database encoding?
When the database uses a single byte encoding, the chr function takes
the binary byte representation as an integer number between 0 and 255
(e.g. ascii code).
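Under that convention the behavior would look like this (assuming a UTF8 database; 955 is the code point U+03BB):

```sql
SELECT chr(65);   -- 'A' in any encoding (plain ASCII)
SELECT chr(955);  -- Greek lambda in a UTF8 database; in a
                  -- single-byte encoding the argument would be
                  -- out of the 0-255 byte range and rejected
```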
> > ... should we revel
> > in configurability, and allow CREATE TABLE/ALTER TABLE behavior to
> > vary depending on the current threshold setting? We'd have to fix
> > the toaster routines to not try to push stuff out-of-line when there
> > is no out-of-line to push to ... but I think we prob
> Archive recovery needs the
> normal xlog record, which in some cases has been optimised
> away because the backup block is present, since the full
> block already contains the changes.
Aah, I didn't know that optimization exists.
I agree that removing that optimization is good/ok.
Andreas
> Without a switch, because both full page writes and
> corresponding logical log is included in WAL, this will
> increase WAL size slightly
> (maybe about five percent or so). If everybody is happy
> with this, we
> don't need a switch.
Sorry, I still don't understand that. What is the "c
> > My feeling is we should have more regular sync points where the
> > patch queue is emptied and everything committed or rejected.
>
> No doubt, but the real problem here is that
> reviewing/committing other people's patches is not fun, it's
> just work :-(. So it's no surprise that it tend
> > I agree that these values need a second look. I think a
> > TOAST_TUPLE_THRESHOLD well smaller than the current value would
> > still easily pay its way. With a little caution to avoid wasting too
> > much effort on the last few bytes I suspect even as low as
> > 400-500 bytes is probably wo
> > The advantage of sync_scan_offset is that, in some situations, a
> > second scan can actually finish faster than if it were the only
> > query executing, because a previous scan has already caused some
> > blocks to be cached. However, 16 is a small number because that
> > benefit would on
> > Is there a particular reason why CLUSTER isn't MVCC-safe? It seems
> > to me that it would be trivial to fix, by using SnapshotAny instead
> > of SnapshotNow, and not overwriting the xmin/xmax with the xid of
> > the cluster command.
>
> It's trivial to fix now in this way, but it would br
> > Since partition is inheritance-based, a simple DROP or "NO INHERIT"
> > will do the job to deal with the partition. Do we want to reinvent
> > additional syntax when these are around and are documented?
>
> Well, if the syntax for adding a new partition eventually
> ends up as ALTER TA
> > The hard part there is checking that the partition constraints are
> > distinct. If the partition constraints are added one at a time, you
> > can use the predicate testing logic to compare the to-be-added
> > partition's constraint against each of the already added
> > constraints. That be
> > What is really needed is a data structure that allows range
> > partitions to be accessed more efficiently. This could make adding
> > partitions and deciding in which partition a specific value goes an
> > O(logN) operation.
>
> I think we need to re-evaluate the inheritance mechanism for
> p