Re: [HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x

2008-02-28 Thread Zeugswetter Andreas ADI SD
> > Well, I guess the question is: if we don't offer some builtin way to render > > non-standard formats built into company products, will those companies fix > > their format or just not use PostgreSQL? > > Well, there is an advantage that Postgres has that some others don't: you > can extend

Re: [HACKERS] pg_dump additional options for performance

2008-02-26 Thread Zeugswetter Andreas ADI SD
> > As a user I'd really prefer all of this to be much more transparent, and could > > well imagine the -Fc format to be some kind of TOC + zip of table data + post > > load instructions (organized per table), or something like this. > > In fact just what you described, all embedded in a single

Re: [HACKERS] configurability of OOM killer

2008-02-08 Thread Zeugswetter Andreas ADI SD
> > while we are at it -- one feature would be great for 8.4, an > > ability to shange shared buffers size "on the fly". I expect > > it is not trivial, but would help fine-tuning running database. > > I think DBA would need to set maximum shared buffers size > > along the normal setting. >

Re: [HACKERS] pg_dump additional options for performance

2008-02-06 Thread Zeugswetter Andreas ADI SD
Simon wrote: > My proposal is to provide two additional modes: > --schema-pre-load corresponding to (1) above > --schema-post-load corresponding to (3) above Sounds nice. For a large schema we might rather want one switch that dumps 2 files, no ? Probably also better from a mvcc perspective. And

Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-31 Thread Zeugswetter Andreas ADI SD
> http://msdn2.microsoft.com/en-us/library/b842y285(VS.71).aspx > appears to > > suggest that the size of the field is fixed. > > That would imply that dumpbin fails at 4096 symbols per file. While I > surely wouldn't put it past M$ to have put in such a > limitation, I think > it's more like

Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Zeugswetter Andreas ADI SD
> > The plural seems better to me; there's no such thing as a solitary > > synchronized scan, no? The whole point of the feature is to affect > > the behavior of multiple scans. > > +1. The plural is important IMHO. ok, good. > As I stated earlier, I don't really like this argument (we already

Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Zeugswetter Andreas ADI SD
> > +1. If we go with 'enable_sync_seqcans' for 8.3, and in a future release > > cycle we do test the cases Simon described above and we agree we need to > > do a fine tune to benefit from this feature, we will need to deprecate > > 'enable_sync_seqscans' and invent another one (sync_seqscans_t

Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-29 Thread Zeugswetter Andreas ADI SD
> It's a good point that we don't want pg_dump to screw up the cluster > order, but that's the only use case I've seen this far for disabling > sync scans. Even that wouldn't matter much if our estimate for > "clusteredness" didn't get screwed up by a table that looks > like this: > "5 6 7 8

Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Zeugswetter Andreas ADI SD
> >> I liked the "synchronized_sequential_scans" idea myself. > > > I think that's a bit too long. How about "synchronized_scans", or > > "synchronized_seqscans"? > > We have enable_seqscan already, so that last choice seems to fit in. Yes looks good, how about synchronized_seqscan without plur

Re: [HACKERS] Proposal: Integrity check

2008-01-25 Thread Zeugswetter Andreas ADI SD
> >> This seems like a pretty horrid idea. Bad pages shouldn't be allowed to > >> get into shared buffers in the first place. Why not have the checking > >> logic operate outside shared buffers? > > > It currently works outside the shared buffers, but I afraid about > > collision due to parall

Re: [HACKERS] Declarative partitioning grammar

2008-01-16 Thread Zeugswetter Andreas ADI SD
> > Personally I find the automatic partition idea intriguing, where you > > only have to choose an expression that equates to one value (value > > group) per partition (and possibly a way to derive a > partition name). > > IMO, better go right to a fully automated approach. Or why would you >

Re: [HACKERS] Declarative partitioning grammar

2008-01-15 Thread Zeugswetter Andreas ADI SD
> > I don't agree with that at all. I can imagine plenty of situations > > where a tuple falling outside the range of available partitions *should* > > be treated as an error. For instance, consider timestamped observations > > --- data in the future is certainly bogus, and data further back tha

Re: [HACKERS] Postgresql Materialized views

2008-01-14 Thread Zeugswetter Andreas ADI SD
> > Traditionally materialized views exist, so that you do not need to code > > differently. > > Your queries still run on the detail table, but are silently answered > > by a suitable MV. The MV might have count + other aggregated columns > > grouped by some columns, and thus be able e.g. shortc

Re: [HACKERS] Postgresql Materialized views

2008-01-14 Thread Zeugswetter Andreas ADI SD
> > But you had to modify your queries. I would think that a materialized > > views implementation worth its salt would put the view to work on the > > original, unmodified queries. > > > > I might be slow today (everyday? :-) ) - but what do you mean by this? > The only difference between *

Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

2008-01-11 Thread Zeugswetter Andreas ADI SD
> I've kept a list of requests for improvement that I can share with you; > I've always been loathe to publish a list of bad points. I think it would help understand the proposal if you also present the shortcomings. When you presented the positive and negative points, the negative list did look

Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Zeugswetter Andreas ADI SD
> > Yeah, I think it would be useful to log one message if after (say) 5 > > seconds we still haven't been able to open the file. > > Either that, or on the first run. Imho 1-5s is better, so that would be after the 10-50th try. > loop. It's supposed to loop 300 times. Yes. > > (Are we OK wi

Re: [HACKERS] [GENERAL] Slow PITR restore

2007-12-14 Thread Zeugswetter Andreas ADI SD
> > Note that even though the processor is 99% in wait state the drive is > > only handling about 3 MB/s. That translates into a seek time of 2.2ms > > which is actually pretty fast...But note that if this were a raid array > > Postgres's wouldn't be getting any better results. A Raid array wou

Re: [HACKERS] WORM and Read Only Tables (v0.1)

2007-12-12 Thread Zeugswetter Andreas ADI SD
> Getting partitioning/read-only right will allow 70+TB of that to be on > tape or similar, which with compression can be reduced to maybe 20TB? I > don't want to promise any particular compression ratio, but it will make > a substantial difference, as I'm sure you realise. Wouldn't one very subst

Re: [HACKERS] WORM and Read Only Tables (v0.1)

2007-12-12 Thread Zeugswetter Andreas ADI SD
> There are a number of nasty > limitations for partitions currently (not the least of which is that real > uniqueness guarantees are impractical), Just to add an other opinion to this statement, because it imho sounds overly pessimistic: Uniqueness is currently perfectly practical, when the un

Re: [HACKERS] Problem of a server gettext message.

2007-12-11 Thread Zeugswetter Andreas ADI SD
> > GetText is conversion po(EUC_JP) to SJIS. Yes. > Are you sure about that? Why would gettext be converting to SJIS, when > SJIS is nowhere in the environment it can see? gettext is using GetACP () on Windows, wherever that gets it's info from ... "chcp" did change the GetACP codepage in Hir

Re: [HACKERS] Ordered Append Node

2007-11-23 Thread Zeugswetter Andreas ADI SD
> > But that requires a) dealing with the problem of the parent table which has no > > constraints and ... Imho we should provide a mechanism that forces the parent to be empty and let the planner know. e.g. a false constraint on parent ONLY. Andreas ---(end of broadcas

Re: [HACKERS] Postgres 8.3 archive_command

2007-11-22 Thread Zeugswetter Andreas ADI SD
> I don't think that should even be a TODO item --- it seems far more > likely to provide a foot-gun than useful capability. On further reflection I think that initdb time is probably sufficient. Do you think that would be a reasonable TODO ? > Whether 16MB is still a reasonable default segment

Re: [HACKERS] Postgres 8.3 archive_command

2007-11-22 Thread Zeugswetter Andreas ADI SD
> > > Perhaps we should move the successful archived message to DEBUG1 now, > > > except for the first message after the archiver starts or when the > > > archive_command changes, plus one message every 255 segments? > > > That would reduce the log volume in the normal case without endangering >

Re: [HACKERS] Postgres 8.3 archive_command

2007-11-21 Thread Zeugswetter Andreas ADI SD
> Perhaps we should move the successful archived message to DEBUG1 now, > except for the first message after the archiver starts or when the > archive_command changes, plus one message every 255 segments? > That would reduce the log volume in the normal case without endangering > our ability to

Re: [HACKERS] How to keep a table in memory?

2007-11-14 Thread Zeugswetter Andreas ADI SD
Kevin Grittner wrote: > > . . .the abuse of such hints in applications I have seen is so rampant as to > > make me doubt the utility of adding them anyway. It's true that by adding > > hints, you give a facility to a good, competent designer who has a really > I have trouble not seeing the point

Re: [HACKERS] Proposal: real procedures again (8.4)

2007-10-30 Thread Zeugswetter Andreas ADI SD
> > I think the cool thing that Josh Berkus wants is > > > > return query select a, b, c from foo; > > return query select d, e, f from bar; > > > > maybe better > > SELECT a,b FROM foo; > SELECT d, e FROM bar; > > procedure doesn't need return statement usually The background for Quel was, th

Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Zeugswetter Andreas ADI SD
> A nice improvement on that would be to have a "rearchive_command" to > allow to sync the new bytes written since a previous archive_command (so > it needs a new placeholder "start from this byte"). This allows writing > dd seek=%s skip=%s count=%b bs=1 But after a log switch nothing is filling

Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-28 Thread Zeugswetter Andreas ADI SD
> > The probably useful next step would be to pass the current length to the > > archive_command, > > so it can write the filled part of the file without the need for a > > filter. > > I can see that helping a lot, but not by writing onto the file on disk. > If the file is nearly empty, that wou

Re: [HACKERS] Getting to 8.3 beta1

2007-09-28 Thread Zeugswetter Andreas ADI SD
> > * Do we bump the .so major version number for libpq? I think we should > > because there are two new exported functions since 8.2, and on at least > > some platforms there's nothing else than major number to disambiguate > > whether a client needs these or not. Comments? -1. You don't bump

Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (MaybeOFFTOPIC)

2007-09-27 Thread Zeugswetter Andreas ADI SD
> > Attached is a modified version to implement both of these. I also bailed > > out if there was surplus input. I tried an optimization of allocating a > > separate buffer for outputting the zeros, to avoid repeated memset calls. > > It didn't seem to make a very big difference; do you think it

Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation

2007-09-13 Thread Zeugswetter Andreas ADI SD
> The COLD updated (old) tuple would be pruned to dead line pointer > once the tuple becomes DEAD. Normally that would let us reuse the > tuple storage for other purposes. We do the same for DELETEd tuples. Oh, I thought only pruned tuples from HOT chains can produce a "redirect dead" line point

Re: [HACKERS] HOT line pointer bloat and PageRepairFragmentation

2007-09-13 Thread Zeugswetter Andreas ADI SD
> CREATE TABLE test (a int, b char(200)); > CREATE UNIQUE INDEX testindx ON test(a); > INSERT INTO test VALUES (1, 'foo'); > > Now, if we repeatedly update the tuple so that each update is a > COLD update, we would bloat the page with redirect-dead line pointers. Um, sorry for not understanding,

Re: [HACKERS] SQL feature requests

2007-08-25 Thread Zeugswetter Andreas ADI SD
> > I still don't see it as terrifically useful functionality, given that it's > > just saving you 4 keystrokes ... > > Less than that, because the AS is optional. The only argument I've > heard that carries much weight with me is that it eases porting from > other DBMS's that allow this. Are

Re: [HACKERS] SQL feature requests

2007-08-23 Thread Zeugswetter Andreas ADI SD
> > how much harder can it be to accept: > > > > group by 'foo' Presumably you meant group by "foo". Imho pg should accept group by "foo". It could be part of a constant removal, that also takes burden off the sort. e.g. in "select x, count(*) from bar where x=5 group by x", x could be remov

Re: [HACKERS] SQL feature requests

2007-08-23 Thread Zeugswetter Andreas ADI SD
> > If your implementation accepts: > > > > group by case when true then 'foo' end I think he meant: group by case when true then "foo" end > > What would that mean? Regardless of whether or not it's accepted, it > should have *some* meaning. > > It's not equivalent to GROUP BY "foo" Yea,

Re: [HACKERS] Naming of the prefab snowball stemmer dictionaries

2007-08-22 Thread Zeugswetter Andreas ADI SD
Sounds reasonable, but why exactly did we spell out "english" instead of "en" ? Seems the abbrev is much easier to extract from LANG or browser prefs ... Andreas -Ursprüngliche Nachricht- Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Tom Lane Gesendet: Mittwoch, 22. A

Re: [HACKERS] 2D partitioning of VLDB - sane or not?

2007-08-14 Thread Zeugswetter Andreas ADI SD
> Which brings us back to the original issue. If I decide to stick with > the current implementation and not "improve our existing partitioning > mechanisms to scale to 100,000 partitions", I could do something like: There is a point where you can leave the selection of the correct rows to norm

Re: [HACKERS] HOT pgbench results

2007-08-07 Thread Zeugswetter Andreas ADI SD
> > unpatched HOT > > autovacuums 116 43 > > autoanalyzes139 60 > > > HOT greatly reduces the number of vacuums needed. That's > good, that's > > where the gains in throughput in longer I/O bound runs comes from. > > But surely failing to

Re: [HACKERS] Straightforward changes for increased SMP scalability

2007-07-17 Thread Zeugswetter Andreas ADI SD
> The NUM_BUFFER_PARTITIONS patch is fairly simple. We've > noticed gains with NUM_BUFFER_PARTITIONS set between 256 and > 2048, but little to no gain after 2048, although this might > depend on the benchmark and platform being used. We've Might this also be a padding issue, because 2048 part

Re: [HACKERS] write_pipe_chunks patch messes up early error message output

2007-07-16 Thread Zeugswetter Andreas ADI SD
> > Is there any reason we can't just use a check on whether > SysLoggerPID > > is not 0? > > (a) that really shouldn't be exported out of postmaster.c, > and (b) it is not readily available to child backends is it? Should there be child backends when the logger did not start ? I'd think star

Re: [HACKERS] Winner of naming discussions: Synchronous Commit

2007-06-25 Thread Zeugswetter Andreas ADI SD
> synchronous_commit > Idea: Greg Stark > Supporters: Simon, Josh, Tom, Bruce, Florian There was one more: asynchronous_commit Idea: Florian G. Pflug Supporters: none But if you are calling the feature that (which imho is good), the guc might as well get that name. Andreas

Re: [HACKERS] msvc and vista fun

2007-06-25 Thread Zeugswetter Andreas ADI SD
> user) . I can build as the admin user but when I come to run > initdb it fails, complaining that it can't find the postgres > executable. FYI, this happens on my Win 2000 also. Maybe a problem with mixed / \ path separators after RestrictExec. Andreas ---(end of broa

Re: [HACKERS] Sorted writes in checkpoint

2007-06-15 Thread Zeugswetter Andreas ADI SD
> > tests| pgbench | DBT-2 response time > (avg/90%/max) > > > ---+-+ > > ---+-+--- > > LDC only | 181 tps | 1.12 / 4.38 / 12.13 s > > + BM_CHECKPOINT_NEEDED(*

Re: [HACKERS] Selecting a constant question: A summary

2007-06-13 Thread Zeugswetter Andreas ADI SD
> For some Unicode character sets, element_width can be as much as 4 In UTF8 one char can be up to 6 bytes, so 4 is not correct in general. Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate

Re: [HACKERS] Selecting a constant question

2007-06-12 Thread Zeugswetter Andreas ADI SD
> Thats exactly the point. Consider > select mytext from mytable ; > > How can PostgreSQL possibly know the maximum length of the > returned values *before* it has scanned the whole table? I think this focuses too much on those cases where it is not possible. When it is not feasible like with a

Re: [HACKERS] Selecting a constant question

2007-06-12 Thread Zeugswetter Andreas ADI SD
> > Again, *all* other major relational databases do this ... > even blob fields have a maximum length reported from the database. > > So what are you doing with the max length? Not all data types > and values have a meaningful max length, so you have to be > able to deal with variable length

Re: [HACKERS] Truncate Permission

2007-06-11 Thread Zeugswetter Andreas ADI SD
> > Wouldn't it be far more logical to decide that if a user has the > > permissions to do a DELETE FROM table; then they have permission to do > > a TRUNCATE? Why make an additional permission? > > Truncate doesn't fire ON DELETE triggers. Yes, but it would imho be ok if there are'nt any on d

Re: [HACKERS] Autovacuum launcher doesn't notice death of postmaster immediately

2007-06-08 Thread Zeugswetter Andreas ADI SD
> > > > > The launcher is set up to wake up in autovacuum_naptime seconds > > > > > at most. > > > > Imho the fix is usually to have a sleep loop. > > This is what we have. The sleep time depends on the schedule > of next vacuum for the closest database in time. If naptime > is high, the sl

Re: [HACKERS] TOAST usage setting

2007-06-08 Thread Zeugswetter Andreas ADI SD
> My next suggestion would be to leave EXTERN_TUPLES_PER_PAGE > as is, but: > Split data wider than a page into page sized chunks as long > as they fill whole pages. > Split the rest with EXTERN_TUPLES_PER_PAGE (4) as now. > This would not waste more space than currently, but improve > performa

Re: [HACKERS] Autovacuum launcher doesn't notice death of postmaster immediately

2007-06-08 Thread Zeugswetter Andreas ADI SD
> > > The launcher is set up to wake up in autovacuum_naptime seconds at most. Imho the fix is usually to have a sleep loop. Andreas ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] TOAST usage setting

2007-06-06 Thread Zeugswetter Andreas ADI SD
> > While I agree, that 2 might be a good compromise with low risc for > > now, I think that toasting all rows down to ~512 bytes is too narrowly > > targeted at not reading wider columns. > > Well, it is summarized here: > > http://momjian.us/expire/TOAST/SUMMARY.html > > It made non-T

Re: [HACKERS] TOAST usage setting

2007-06-06 Thread Zeugswetter Andreas ADI SD
> > No, you misunderstood. Bruce was suggesting changing the target to 512. > > That means if a row is wider than ~2k, toaster will try to toast until > > the base row is > > ~512 bytes. I would not do that part for 8.3. > > OK, what do you suggest for 8.3? Attached are my suggestion > to use

Re: [HACKERS] Implicit casts with generic arrays

2007-06-06 Thread Zeugswetter Andreas ADI SD
> For example in 8.2 this is mapped to array_prepend: > > regression=# select 'x'::text || array['aa','bb','cc']; >?column? > -- > {x,aa,bb,cc} > (1 row) > > but with the experimental code you get textcat: > > catany=# select 'x'::text || array['aa','bb','cc']; > ?column? > -

Re: [HACKERS] TOAST usage setting

2007-06-05 Thread Zeugswetter Andreas ADI SD
> > The big question is do we want to drop the target tuple size down to > > 512, and increase the chunk size to 8k for 8.3? Dropping the tuple > > size down to 512 is going to give us some smaller TOAST values to fill > > in free space created by the 8k chuck size, assuming you have both >

Re: [HACKERS] Constraint exclusion oddity with composite index

2007-06-04 Thread Zeugswetter Andreas ADI SD
> >> Assume the following: > >> index on: (id, adate) > >> constraint CHECK(adate > '01-01-2007' AND adate < '04-01-2007'); > > Um, the subject is CE, but the question is about an index ? Those are separate issues. > >> The planner will not use the index listed above. > > For what? > > select

Re: [HACKERS] What is happening on buildfarm member baiji?

2007-06-04 Thread Zeugswetter Andreas ADI SD
> > > Given this, I propose we simply #ifdef out the SO_REUSEADDR on win32. I agree, that this is what we should do. > > > (A fairly good reference to read up on the options is at > > > http://msdn2.microsoft.com/en-us/library/ms740621.aspx > > > > Hmm ... if accurate, that page says in words

Re: [HACKERS] TOAST usage setting

2007-05-31 Thread Zeugswetter Andreas ADI SD
> I tested EXTERN_TUPLES_PER_PAGE for values 4(default), 2, and 1: > > 4 15.596 > 2 15.197 > 1 14.6 > > which is basically a 3% decrease from 4->2 and 2->1. The > test script and result are here: > > http://momjian.us/expire/TOAST2/ > > shared_buffer

Re: [HACKERS] Ye olde drop-the-database-you-just-left problem

2007-05-31 Thread Zeugswetter Andreas ADI SD
> > However, it suddenly struck me that we could > >probably make most of the problem go away if we put that same wait into > >DROP DATABASE itself --- that is, if we see other backends in the > >target DB, sleep for a second or two and then recheck before erroring out. Yup, waiting in drop da

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Zeugswetter Andreas ADI SD
> > Whereas if you set toast_tuples_per_page to 8k then the only option > > for Postgres will be to put each datum in its own page and > waste 1-3k > > on every page. > > No, because actually the code is designed to make the toast > chunk size just enough less than 8K that the tuples fit. He

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Zeugswetter Andreas ADI SD
> I reran the tests with hashtext(), and created a SUMMARY.HTML chart: > > http://momjian.us/expire/TOAST/ > > What you will see is that pushing TEXT to a TOAST column > allows quick access to non-TOAST values and single-row TOAST > values, but accessing all TOAST columns is slower than

Re: [HACKERS] like/ilike improvements

2007-05-25 Thread Zeugswetter Andreas ADI SD
> > However, I have just about convinced myself that we don't need > > IsFirstByte for matching "_" for UTF8, either preceded by "%" or not, > > as it should always be true. Can anyone come up with a counter example? > > You have to be on a first byte before you can meaningfully > apply NextCh

Re: [HACKERS] Seq scans roadmap

2007-05-16 Thread Zeugswetter Andreas ADI SD
> > > > 32 buffers = 1MB with 32KB blocksize, which spoils the CPU L2 > > > > cache effect. I'd say in a scenario where 32k pages are indicated you will also want larger than average L2 caches. > > > > > > > > How about using 256/blocksize? The reading ahead uses 1/4 ring size. To the best of

Re: [HACKERS] What is happening on buildfarm member baiji?

2007-05-14 Thread Zeugswetter Andreas ADI SD
> Close. There was an Msys build from the 9th running on port 5432. > 2) VC++ and Msys builds will both happily start on the same > port at the same time. The first one to start listens on 5432 > until it shuts down, at which point the second server takes > over seamlessly! It doesn't matter w

Re: [HACKERS] Seq scans roadmap

2007-05-11 Thread Zeugswetter Andreas ADI SD
> Sorry, 16x8K page ring is too small indeed. The reason we > selected 16 is because greenplum db runs on 32K page size, so > we are indeed reading 128K at a time. The #pages in the ring > should be made relative to the page size, so you achieve 128K > per read. Ah, ok. New disks here also ha

Re: [HACKERS] Seq scans roadmap

2007-05-10 Thread Zeugswetter Andreas ADI SD
> Also, that patch doesn't address the VACUUM issue at all. And > using a small fixed size ring with scans that do updates can > be devastating. I'm experimenting with different ring sizes > for COPY at the moment. Too small ring leads to a lot of WAL > flushes, it's basically the same problem

Re: [HACKERS] Seq scans roadmap

2007-05-10 Thread Zeugswetter Andreas ADI SD
> In reference to the seq scans roadmap, I have just submitted > a patch that addresses some of the concerns. > > The patch does this: > > 1. for small relation (smaller than 60% of bufferpool), use > the current logic 2. for big relation: > - use a ring buffer in heap scan > - pin

Re: [HACKERS] Seq scans roadmap

2007-05-09 Thread Zeugswetter Andreas ADI SD
> >> Are you filling multiple buffers in the buffer cache with a single > >> read-call? > > > > yes, needs vector or ScatterGather IO. > > I would expect that to get only moderate improvement. The vast improvement comes from 256k blocksize. > To get > the full benefit I would think you would

Re: [HACKERS] Seq scans roadmap

2007-05-08 Thread Zeugswetter Andreas ADI SD
> >> What do you mean with using readahead inside the heapscan? > >> Starting an async read request? > > > > Nope - just reading N buffers ahead for seqscans. Subsequent calls > > use previously read pages. The objective is to issue > contiguous reads > > to the OS in sizes greater than the

Re: [HACKERS] Seq scans roadmap

2007-05-08 Thread Zeugswetter Andreas ADI SD
> Nope - just reading N buffers ahead for seqscans. Subsequent > calls use previously read pages. The objective is to issue > contiguous reads to the OS in sizes greater than the PG page > size (which is much smaller than what is needed for fast > sequential I/O). Problem here is that eight

Re: [HACKERS] Heap page diagnostic functions

2007-05-02 Thread Zeugswetter Andreas ADI SD
> > Any suggestions? pgdiagnostics? > > Yes, I like "diagnostics", or "internals". I just think > forensics isn't going to be understood by the average native > English speaker, let alone non-English speakers. I think forensics is ok. The world is currently beeing swamped with related tv show

Re: [HACKERS] Avoiding unnecessary reads in recovery

2007-04-26 Thread Zeugswetter Andreas ADI SD
> So what happens if a backend is running with full_page_writes > = off, someone edits postgresql.conf to turns it on and > forgets to reload/ restart, and then we crash? You'll come up > in recovery mode thinking that f_p_w was turned on, when in > fact it wasn't. > > ISTM that we need to so

Re: [HACKERS] too much WAL volume

2007-04-26 Thread Zeugswetter Andreas ADI SD
> > Writing to a different area was considered in pg, but there were more > > negative issues than positive. > > So imho pg_compresslog is the correct path forward. The current > > discussion is only about whether we want a more complex pg_compresslog > > and no change to current WAL, or an incr

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-25 Thread Zeugswetter Andreas ADI SD
> > 1) To deal with partial/inconsisitent write to the data file at crash > > recovery, we need full page writes at the first modification to pages > > after each checkpoint. It consumes much of WAL space. > > We need to find a way around this someday. Other DBs don't > do this; it may be be

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-24 Thread Zeugswetter Andreas ADI SD
> 3) To maintain crash recovery chance and reduce the amount of > archive log, removal of unnecessary full page writes from > archive logs is a good choice. Definitely, yes. pg_compresslog could even move the full pages written during backup out of WAL and put them in a different file that nee

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-23 Thread Zeugswetter Andreas ADI SD
> I don't insist the name and the default of the GUC parameter. > I'm afraid wal_fullpage_optimization = on (default) makes > some confusion because the default behavior becomes a bit > different on WAL itself. Seems my wal_fullpage_optimization is not a good name if it caused misinterpretati

Re: [HACKERS] [Fwd: PGBuildfarm member narwhal Branch HEAD Statuschanged from OK to InstallCheck failure]

2007-04-23 Thread Zeugswetter Andreas ADI SD
> Hmm, I'll give it a go when I'm back in the office, but bear > in mind this is a Mingw build on which debugging is nigh-on > impossible. I use the Snapshot http://prdownloads.sf.net/mingw/gdb-6.3-2.exe?download from sf.net. It has some issues, but it is definitely useable. Andreas -

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-20 Thread Zeugswetter Andreas ADI SD
> With DBT-2 benchmark, I've already compared the amount of WAL. The > result was as follows: > > Amount of WAL after 60min. run of DBT-2 benchmark > wal_add_optimization_info = off (default) 3.13GB how about wal_fullpage_optimization = on (default) > wal_add_optimization_info = on (new ca

Re: [HACKERS] [PATCHES] Reviewers Guide to Deferred Transactions/Transaction Guarantee

2007-04-13 Thread Zeugswetter Andreas ADI SD
I agree with Tom's reasoning about the suggested simplifications, sorry. > > 3. Should the WALWriter also do the wal_buffers half-full write at the > > start of XLogInsert() ? > > That should go away entirely; to me the main point of the > separate wal-writer process is to take over responsibi

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-13 Thread Zeugswetter Andreas ADI SD
> > Yup, this is a good summary. > > > > You say you need to remove the optimization that avoids the logging of > > a new tuple because the full page image exists. > > I think we must already have the info in WAL which tuple inside the > > full page image is new (the one for which we avoided th

Re: [HACKERS] Eliminating unnecessary left joins

2007-04-13 Thread Zeugswetter Andreas ADI SD
> I have this exact problem a lot. There are actually cases > where you can eliminate regular joins, not just left joins. > For example: > > CREATE TABLE partner ( > id serial, > namevarchar(40) not null, > primary key (id) > ); > >

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-04-12 Thread Zeugswetter Andreas ADI SD
> I don't fully understand what "transaction log" means. If it means > "archived WAL", the current (8.2) code handle WAL as follows: Probably we can define "transaction log" to be the part of WAL that is not full pages. > 1) If full_page_writes=off, then no full page writes will be > written

Re: [HACKERS] Eliminating unnecessary left joins

2007-04-12 Thread Zeugswetter Andreas ADI SD
> Maybe odd, but simpler to optimize this way. > > Your idea would be also a very good optimization, there was > already a discussion about that here: > http://archives.postgresql.org/pgsql-performance/2006-01/msg00 > 151.php, but that time Tom refused it because it was too > expensive and rare

Re: [HACKERS] conflicting gettimeofday with MinGW

2007-04-11 Thread Zeugswetter Andreas ADI SD
> > That page is ages out of date. The intended sync is > apparently broken. > > The current download area is on sourceforge > > http://sf.net/project/showfiles.php?group_id=2435 > > > > > > *sigh* > > And what is in 3.12, which is apparently the current version? Sorry that was implied. sys

Re: [HACKERS] conflicting gettimeofday with MinGW

2007-04-11 Thread Zeugswetter Andreas ADI SD
That page is ages out of date. The intended sync is apparently broken. The current download area is on sourceforge http://sf.net/project/showfiles.php?group_id=2435 Andreas > > mingw-runtime-3.10 introduced a gettimeofday declaration in > sys/time.h > > that is not compatible with port.h. > >

[HACKERS] conflicting gettimeofday with MinGW

2007-04-11 Thread Zeugswetter Andreas ADI SD
mingw-runtime-3.10 introduced a gettimeofday declaration in sys/time.h that is not compatible with port.h. (current is mingw-runtime-3.12) int __cdecl gettimeofday(struct timeval *__restrict__, void *__restrict__ /* tzp (unused) */); The problem was already reported by

Re: [HACKERS] Group Commit

2007-04-10 Thread Zeugswetter Andreas ADI SD
> > > I've been working on the patch to enhance our group commit behavior. > > > The patch is a dirty hack at the moment, but I'm settled on the > > > algorithm I'm going to use and I know the issues involved. > > > > One question that just came to mind is whether Simon's no-commit-wait > > pa

Re: [HACKERS] Auto Partitioning

2007-04-06 Thread Zeugswetter Andreas ADI SD
> > (But that sounds rather like pie in the sky, actually. Which other > > databases can do that, and how do they do it?) > > Oracle does it, by building a big index. Few people use it. And others allow a different partitioning strategy for each index, but that has the same problem of how to r

Re: [HACKERS] Auto Partitioning

2007-04-05 Thread Zeugswetter Andreas ADI SD
> > That lets you enforce unique constraints as long as the partition key > > is part of the unique constraint. > > Is that already sufficient? yes > That would alter the ordering of > the columns in the index, no? I mean: It produces ordered blocks of append nodes for range queries that spa

Re: [HACKERS] Bug in UTF8-Validation Code?

2007-04-04 Thread Zeugswetter Andreas ADI SD
> > When the database uses a single byte encoding, the chr function takes > > the binary byte representation as an integer number between 0 and 255 > > (e.g. ascii code). > > When the database encoding is one of the unicode encodings it takes a > > unicode code point. > > This is also what Orac

Re: [HACKERS] Bug in UTF8-Validation Code?

2007-04-04 Thread Zeugswetter Andreas ADI SD
> What do others think? Should the argument to CHR() be a > Unicode code point or the numeric representation of the > database encoding? When the database uses a single byte encoding, the chr function takes the binary byte representation as an integer number between 0 and 255 (e.g. ascii code).

Re: [HACKERS] Modifying TOAST thresholds

2007-04-03 Thread Zeugswetter Andreas ADI SD
> > ... should we revel > > in configurability, and allow CREATE TABLE/ALTER TABLE behavior to > > vary depending on the current threshold setting? We'd have to fix the > > toaster routines to not try to push stuff out-of-line when there is no > > out-of-line to push to ... but I think we prob

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Zeugswetter Andreas ADI SD
> Archive recovery needs the > normal xlog record, which in some cases has been optimised > away because the backup block is present, since the full > block already contains the changes. Aah, I didn't know that optimization exists. I agree that removing that optimization is good/ok. Andreas

Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-30 Thread Zeugswetter Andreas ADI SD
> Without a switch, because both full page writes and > corresponding logical log is included in WAL, this will > increase WAL size slightly > (maybe about five percent or so). If everybody is happy > with this, we > don't need a switch. Sorry, I still don't understand that. What is the "c

Re: [HACKERS] Patch queue concern

2007-03-29 Thread Zeugswetter Andreas ADI SD
> > My feeling is we should have more regular sync points where the patch > > queue is emptied and everything committed or rejected. > > No doubt, but the real problem here is that > reviewing/committing other people's patches is not fun, it's > just work :-(. So it's no surprise that it tend

Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Zeugswetter Andreas ADI SD
> > I agree that these values need a second look. I think a > > TOAST_TUPLE_THRESHOLD well smaller than the current value would still > > easily pay its way. With a little caution to avoid wasting too much > > effort on the last few bytes I suspect even as low as > 400-500 bytes is probably wo

Re: [HACKERS] Synchronized Scan update

2007-03-14 Thread Zeugswetter Andreas ADI SD
> > The advantage of sync_scan_offset is that, in some situations, a > > second scan can actually finish faster than if it were the only query > > executing, because a previous scan has already caused some blocks to > > be cached. However, 16 is a small number because that benefit would > > on

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Zeugswetter Andreas ADI SD
> > Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to > > me that it would be trivial to fix, by using SnapshotAny instead of > > SnapshotNow, and not overwriting the xmin/xmax with the xid of the > > cluster command. > > It's trivial to fix now in this way, but it would br

Re: [HACKERS] Auto creation of Partitions

2007-03-09 Thread Zeugswetter Andreas ADI SD
> > Since partition is inheritance-based, a simple DROP or "NO > INHERIT" > > will do the job to deal with the partition. Do we want to reinvent > > additional syntax when these are around and are documented? > > Well, if the syntax for adding a new partition eventually > ends up as ALTER TA

Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Zeugswetter Andreas ADI SD
> > The hard part there is checking that the partition constraints are > > distinct. If the partition constraints are added one at a time, you > > can use the predicate testing logic to compare the to-be-added > > partition's constraint against each of the already added constraints. > > That be

Re: [HACKERS] Auto creation of Partitions

2007-03-07 Thread Zeugswetter Andreas ADI SD
> > What is really needed is a data structure that allows range partitions > > to be accessed more efficiently. This could make adding partitions and > > deciding in which partition a specific value goes an O(logN) operation. > > I think we need to re-evaluate the inheritance mechanism for > p

  1   2   >