Re: [HACKERS] [Plperlng-devel] Data Persists Past Scope
This is almost certainly a Perl problem that has nothing to do with Postgres. Please construct a small test case - I at least don't have time to spend wading through huge gobs of code. Note: if the variable is referred to by a live subroutine it will still be alive. See man perlref and search for "closure" - it might help you.

cheers

andrew

David Fetter wrote:
> Folks, while testing DBI-Link I've noticed something very odd. In the trigger code, I have subroutines with 'my' variables in them, which I thought meant that as soon as the subroutine returned, the variables went away. They are not going away :( Please find attached some sample output along with DBI-Link. The database I'm connecting to is MySQL's Sakila, but the same happens in Oracle, so I don't think (this time ;) it's a MySQL problem. If I quit the session or reload the functions, the ghost variables go away, but I can't ask people to do that between queries. Help!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
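The closure behavior Andrew points to is not Perl-specific; a minimal sketch in Python (illustrative, hypothetical `make_counter` function, not DBI-Link code) shows a lexical variable surviving its enclosing scope because a live subroutine still refers to it:

```python
def make_counter():
    count = 0  # local variable, but captured by the closure below

    def bump():
        nonlocal count
        count += 1
        return count

    return bump  # 'count' stays alive as long as 'bump' does

counter = make_counter()
print(counter())  # 1
print(counter())  # 2 -- 'count' outlived make_counter()'s scope
```

The same holds for Perl `my` variables referenced by a named or anonymous sub: the variable persists for the lifetime of the sub, which is exactly the "ghost variable" effect described above.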
Re: [HACKERS] postgres database crashed
Hi all,

I am sorry but I forgot to mention that in the database schema we are maintaining references to the main table xyz(int id, img image, fname varchar(50)). There are around 14 tables referencing this table; the references are made to the column id. The code works well if we don't maintain the references, but when we include them the database crashes somewhere between 2500-3000 transactions. So could this problem be due to the multiple references being made to the same table?

Markus Schaber [EMAIL PROTECTED] wrote:
> Hi, Ashish,
>
> Ashish Goel wrote:
>> But the same code worked when I inserted around 2500 images in the database. After that it started crashing.
>
> Testing can never prove that there are no bugs. It's like the "proof" that all odd numbers above 1 are prime: 3 is prime, 5 is prime, 7 is prime, so I conclude that all odd numbers above 1 are prime.
>
>> So, I don't think it's because of an error in the code. Can you suggest some other possible reasons, and also why is it crashing at the call to memcpy()?
>
> - broken hardware
> - compiler bugs
> - bugs in PostgreSQL
>
> But without having seen your code, I tend to assume that it's something like a wrong length flag in some corner case in your code.
>
> Markus
> --
> Markus Schaber | Logical Tracking & Tracing International AG
> Dipl. Inf.     | Software Development GIS
> Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Re: [HACKERS] postgres database crashed
Hi, Ashish,

Ashish Goel wrote:
> I am sorry but I forgot to mention that in the database schema we are maintaining references to the main table xyz(int id, img image, fname varchar(50)). There are around 14 tables referencing this table; the references are made to the column id. The code works well if we don't maintain the references, but when we include them the database crashes somewhere between 2500-3000 transactions. So could this problem be due to the multiple references being made to the same table?

I doubt it. Foreign key references are among the basics of SQL; they're pretty well tested.

Could you try to replace your image type, e.g. with bytea, for your test purposes, and see whether it crashes, too?

HTH,
Markus
Re: [HACKERS] Is python 2.5 supported?
Ühel kenal päeval, E, 2006-10-16 kell 14:46, kirjutas Jim C. Nasby:
> Since installing python 2.5, tapir has been failing: http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=tapir&dt=2006-10-15%2020:20:16
> Several of the failures appear to be a simple change in error reporting; I haven't investigated why import_succeed() failed. Should python 2.5 work with plpython?

This is about pl_python? Forwarding to Sven to investigate.

--
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Re: [HACKERS] [GENERAL] Anyone using POSIX time zone offset capability?
Hi,

> The POSIX timezone notation as understood by the zic code includes the possibility of zoneabbrev[+-]hh[:mm[:ss]], but the meaning is that hh:mm:ss *is* the offset from GMT, and zoneabbrev is being defined as the abbreviation for that offset. What the datetime.c code is doing is trying to find the zoneabbrev in a built-in timezone table, and then adding the two together. This is simply wacko. I think that if anyone had ever tried to use this notation they would have noticed this misinterpretation of the specs.
>
> Given where the code stands now, I think the best solution is to rip out DecodePosixTimezone and instead pass the syntax off to the zic code (which can handle it via tzparse()). Since the datetime input parser is ultimately only interested in the GMT offset value, this would mean that the zoneabbrev part would become a noise word.

Sounds like a good idea to me.

Sander
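The POSIX interpretation being described - the numeric part *is* the GMT offset, and the abbreviation is merely a label for it - can be checked against the C library's own tzparse()/tzset() behavior. A quick Unix-only sketch via Python's wrapper around tzset() (illustrative; assumes a platform with time.tzset):

```python
import os
import time

# Per POSIX, TZ="EST+5" defines a zone abbreviated "EST" lying
# 5 hours west of GMT. The abbreviation names the offset; it is
# NOT looked up in some table and then added to the offset.
os.environ["TZ"] = "EST+5"
time.tzset()

print(time.timezone)   # 18000 seconds = 5 hours west of GMT
print(time.tzname[0])  # 'EST' -- just a label for that offset
```

If the abbreviation were first resolved against a built-in table and the numeric offset added on top (the datetime.c behavior criticized above), EST+5 would come out as ten hours west rather than five.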
Re: [HACKERS] [GENERAL] Anyone using POSIX time zone offset capability?
Sander Steffann [EMAIL PROTECTED] writes:
>> What the datetime.c code is doing is trying to find the zoneabbrev in a built-in timezone table, and then adding the two together. This is simply wacko. I think that if anyone had ever tried to use this notation they would have noticed this misinterpretation of the specs.

Well, it'd work without surprise for the case of GMT+-n, which is undoubtedly the most common case ...

regards, tom lane
Re: [HACKERS] [GENERAL] Anyone using POSIX time zone offset capability?
Hi,

Sander Steffann [EMAIL PROTECTED] writes:
>> What the datetime.c code is doing is trying to find the zoneabbrev in a built-in timezone table, and then adding the two together. This is simply wacko. I think that if anyone had ever tried to use this notation they would have noticed this misinterpretation of the specs.
> Well, it'd work without surprise for the case of GMT+-n, which is undoubtedly the most common case ...

Hmm. I hadn't thought of that, but then: with the changes you proposed they would still get what they expect, even though that notation would not conform to the POSIX docs. Still seems like a good idea :)

Sander
Re: [HACKERS] Getting the type Oid in a CREATE TYPE output function
On 10/16/06, Weslee Bilodeau [EMAIL PROTECTED] wrote:
> Marko Kreen wrote:
>> The PGP functions happen to do it already - pgp_key_id().
> Actually, Tom helped me realize I made a mistake, and I'm following his suggestion: not tying keys to OIDs, which change when backed up/restored.

Yeah, tying to OIDs is bad; you should link to names, preferably schema-qualified. Anyway, that was just an off-hand suggestion.

[ snip nice description ]

> I'm not sure if anyone else needs something like it, but it allows us to transparently encrypt data directly in the tables. Minimum application changes ('select enc_key' at connection) - the main requirement when working on legacy code that needs to match today's security policies quickly.

Some want row-level access control; then your scheme would not be enough. Maybe it would be better to avoid combining the keys, and instead have a hidden key in the database plus several user keys that grant access to that key; that way you can revoke access from only some users.

But one thing I suggest strongly - use PGP encryption instead of the old encrypt()/decrypt(). PGP hides the data much better, especially in the case of lots of small data encrypted with the same key.

--
marko
Re: [HACKERS] postgres database crashed
Markus Schaber [EMAIL PROTECTED] writes:
> Ashish Goel wrote:
>> The code works well if we don't maintain the references but when we include the references then the database crashes somewhere between 2500-3000 transactions. So could this problem be due to the multiple references being made to the same table?
> I doubt it. Foreign key references are among the basics of SQL; they're pretty well tested.

I'm betting that this is a memory-clobber problem in that custom datatype. The reason the symptoms come and go when varying unrelated stuff is that it might be chancing to clobber momentarily-unused memory rather than live data structures. (In the above example, the queue of pending FK trigger events is likely what got clobbered.)

regards, tom lane
[HACKERS] hacking postgres hashjoin algorithm
Hello there,

I am trying to play around with the hashjoin algorithm in Postgres, using a statement like:

select count(*) from r, s where r.id = s.id;

I looked at the function ExecHashJoin() in nodeHashjoin.c and cannot find where the algorithm compares whether r.id equals s.id. Please advise.

Regards,
hal
[HACKERS] Syntax bug? Group by?
Shouldn't this work?

select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
ERROR: column "y.ycis_id" must appear in the GROUP BY clause or be used in an aggregate function

If I am asking for a specific column value, should I, technically speaking, need to group by that column?
Re: [HACKERS] hacking postgres hashjoin algorithm
On Sun, Oct 15, 2006 at 11:08:18PM -0400, HS wrote:
> I am trying to play around with the hashjoin algorithm in postgres. I am using a statement like Select count(*) from r,s where r.id=s.id; I looked at the function ExecHashJoin() in nodeHashjoin.c and cannot find where the algorithm is comparing if r.id equals s.id

The code doing the work is actually ExecScanHashBucket(), which is in nodeHash.c. The actual check is done by the ExecQual there...

Hope this helps,
--
Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.
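For orientation before reading the C code: the classic two-phase hash join that ExecHashJoin()/ExecScanHashBucket() implement can be sketched in a few lines. This is an illustrative Python model (the `hash_join` function and sample relations are hypothetical, not PostgreSQL code); the qual recheck inside the probe loop is the step HS was looking for:

```python
from collections import defaultdict

def hash_join(outer, inner, outer_key, inner_key):
    """Build a hash table on the inner relation, then probe it
    with each outer tuple and re-check the join qual on every
    bucket candidate (the ExecQual-style step)."""
    # Build phase: bucket inner tuples by hash of the join key.
    buckets = defaultdict(list)
    for s_row in inner:
        buckets[s_row[inner_key]].append(s_row)
    # Probe phase: for each outer tuple, scan the matching bucket.
    for r_row in outer:
        for s_row in buckets.get(r_row[outer_key], []):
            if r_row[outer_key] == s_row[inner_key]:  # the qual check
                yield r_row, s_row

r = [{"id": 1}, {"id": 2}]
s = [{"id": 2}, {"id": 2}, {"id": 3}]
print(len(list(hash_join(r, s, "id", "id"))))  # 2 matches for id=2
```

The explicit equality test looks redundant after hashing on the key, but it is needed in general: different key values can land in the same hash bucket, so each candidate must be rechecked - which is why the comparison lives in the bucket-scan code rather than in the top-level join loop.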
Re: [HACKERS] Syntax bug? Group by?
* Mark Woodward ([EMAIL PROTECTED]) wrote:
> If I am asking for a specific column value, should I, technically speaking, need to group by that column?

Technically speaking, if you're asking for a specific tuple, should you be allowed to request an aggregation?

Thanks,
Stephen
Re: [HACKERS] Additional stats for Relations
On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:
> On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
>> I'm also not sure if this metric is what you actually want, since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the FSM might be more useful...
> Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed, a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it?

This makes sense for indexes, but only makes sense for heaps when we know that the backend will keep re-accessing the block until it is full - so it is only of interest in steady-state workloads. IMHO Jim's proposal makes more sense for general use.

> heap_blks_extend: The number of times file extend was invoked on the relation

Sounds good.

> heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g.

Sounds good.

> As an addendum to the truncate stats above, we can also have the following additional stats:
> heap_blks_maxtruncate: The max block of buffers truncated in one go
> heap_blks_ntruncate: The number of times truncate was called on this relation

Those last two sound too complex for normal use, and ntruncate is most likely the same as the number of vacuums anyway. Hmmm... perhaps nvacuums is a more interesting metric? We've got the last vacuum date, but no indication of how frequently a vacuum has run.

>> Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world.
> The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac GUC variables and doing subsequent (pgbench, e.g.) runs, one can find out the optimum values for these variables using these stats. This should be useful for tuning space allocation/deallocation.

If we get this patch in early it should help get feedback on this area.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Syntax bug? Group by?
On Tue, 17 Oct 2006 12:08:07 -0400 Stephen Frost [EMAIL PROTECTED] wrote:
> * Mark Woodward ([EMAIL PROTECTED]) wrote:
>> If I am asking for a specific column value, should I, technically speaking, need to group by that column?
> Technically speaking, if you're asking for a specific tuple, should you be allowed to request an aggregation?

One column value doesn't necessarily mean one tuple unless it has a unique index on that column.

SELECT COUNT(*) FROM table WHERE field = 'value';

That's perfectly reasonable. You don't need the GROUP BY clause.

However, this doesn't sound like a hackers question. Next time, please ask on another list such as pgsql-sql or even pgsql-novice. You can review the mailing lists and their purpose at http://www.postgresql.org/community/lists/

--
D'Arcy J.M. Cain darcy@druid.net   | Democracy is three wolves
http://www.druid.net/darcy/        | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP)  | what's for dinner.
Re: [HACKERS] constraints in query plans
On Sun, 2006-10-15 at 20:36 -0700, Jeremy Drake wrote:
> I just tried that,
>
> CREATE INDEX test_domain_k_x1_x2_mp ON test_domain (k, x1, x2, mp);
>
> and dropped the others. That actually works properly.
>
> jeremyd=# explain analyze select * from test_domain where k = 1255 and mp;
>                                  QUERY PLAN
> --------------------------------------------------------------------------
>  Bitmap Heap Scan on test_domain  (cost=5.37..237.21 rows=66 width=17) (actual time=0.115..0.707 rows=132 loops=1)
>    Recheck Cond: (k = 1255)
>    Filter: mp
>    ->  Bitmap Index Scan on test_domain_k_x1_x2_mp  (cost=0.00..5.37 rows=66 width=0) (actual time=0.081..0.081 rows=132 loops=1)
>          Index Cond: ((k = 1255) AND (mp = true))
>  Total runtime: 1.137 ms
> (6 rows)
>
> I thought I had to refer to all of the columns in order for this to work, that I could not skip some in the middle, but it seems to work.

As long as k = 1255 is selective enough, the index is useful. That's because k is the first item in the index key.

Regards,
Jeff Davis
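The leading-column rule Jeff describes - a multicolumn index is usable whenever the predicate constrains its first key column, even if later columns are skipped - is common to B-tree implementations generally, not just PostgreSQL. A quick illustration using SQLite via Python's sqlite3 module (illustrative only; SQLite's planner, not PostgreSQL's):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test_domain (k INT, x1 INT, x2 INT, mp BOOL)")
con.execute(
    "CREATE INDEX test_domain_k_x1_x2_mp ON test_domain (k, x1, x2, mp)"
)

# Only the leading column k appears in the predicate, yet the
# composite index is still chosen, because k is the index prefix.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM test_domain WHERE k = 1255"
).fetchall()
for row in plan:
    print(row)
```

The plan output shows a search using test_domain_k_x1_x2_mp with a (k=?) bound; had the predicate been on x2 alone (no index prefix), a full scan would result instead.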
Re: [HACKERS] Syntax bug? Group by?
Stephen Frost wrote:
> * Mark Woodward ([EMAIL PROTECTED]) wrote:
>> If I am asking for a specific column value, should I, technically speaking, need to group by that column?
> Technically speaking, if you're asking for a specific tuple, should you be allowed to request an aggregation?

Only with the assumption that the value in the where clause is for a unique column. If you want min(col2) and avg(col2) where col1 = x, you can get it without a group by, the same as if you put col1 <> x - if you want an aggregate of all records returned, not the aggregate based on each value of col1.

select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

But back to the query: the issue comes in that the ycis_id value is included with the return values requested (a single row value alongside aggregate values that isn't grouped) - if ycis_id is not unique you will get x number of returned tuples with ycis_id = 15 and the same min() and avg() values for each row. Removing the ycis_id after the select will return the aggregate values you want without the group by.

--
Shane Ambler
[EMAIL PROTECTED]
Get Sheeky @ http://Sheeky.Biz
Re: [HACKERS] Asynchronous I/O Support
On 10/15/06, Luke Lonergan [EMAIL PROTECTED] wrote:
> Martijn,
> The killer use-case we've identified is for the scattered I/O associated with index + heap scans in Postgres. If we can issue ~5-15 I/Os in advance when the TIDs are widely separated, it has the potential to increase the I/O speed by the number of disks in the tablespace being scanned. At this point, that pattern will only use one disk.

Did you have a chance to look at posix_fadvise?

merlin
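posix_fadvise() is the readahead-hint interface Merlin is referring to: a process tells the kernel which byte range it will need soon, so the kernel can start fetching it while the process does other work - exactly the kind of advance notice an index scan could give for each heap block it is about to read. A minimal Unix-only sketch via Python's os module (illustrative scratch file, not Postgres code):

```python
import os
import tempfile

# Create a scratch file, then hint that we will soon read it.
# POSIX_FADV_WILLNEED asks the kernel to begin readahead for the
# given (offset, length) range; the call itself does not block
# on the I/O.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"x" * 8192)
    os.posix_fadvise(fd, 0, 8192, os.POSIX_FADV_WILLNEED)
finally:
    os.close(fd)
    os.unlink(path)
```

Unlike full async I/O, fadvise only hints - there is no completion notification - but for the "issue ~5-15 I/Os in advance" pattern above, a hint per upcoming heap block may be all that is needed to keep multiple disks busy.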
Re: [HACKERS] Syntax bug? Group by?
Stephen Frost wrote:
> select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>
> But back to the query: the issue comes in that the ycis_id value is included with the return values requested (a single row value alongside aggregate values that isn't grouped) - if ycis_id is not unique you will get x number of returned tuples with ycis_id = 15 and the same min() and avg() values for each row. Removing the ycis_id after the select will return the aggregate values you want without the group by.

I still assert that there will always only be one row to this query. This is an aggregate query, so all the rows with ycis_id = 15 will be aggregated. Since ycis_id is the identifying part of the query, it should not need to be grouped.

My question: is it a syntactic technicality that PostgreSQL asks for a group by, or a bug in the parser?
Re: [HACKERS] Asynchronous I/O Support
* Neil Conway:
> [1] http://lse.sourceforge.net/io/aio.html

Last Modified Mon, 07 Jun 2004 12:00:09 GMT

But you are right -- it seems that io_submit still blocks without O_DIRECT. *sigh*

--
Florian Weimer            [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47        tel: +49-721-96201-1
D-76131 Karlsruhe         fax: +49-721-96201-99
Re: [HACKERS] Syntax bug? Group by?
Hi, Mark,

Mark Woodward wrote:
> Shouldn't this work?
>
> select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
> ERROR: column y.ycis_id must appear in the GROUP BY clause or be used in an aggregate function
>
> If I am asking for a specific column value, should I, technically speaking, need to group by that column?

Try:

SELECT 15 as ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

HTH,
Markus
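Markus's workaround - select the constant instead of the column, so nothing non-aggregated remains in the target list - works in any SQL engine. A quick check via Python's sqlite3 module (illustrative table `y` and sample data, assumed for the demo):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE y (ycis_id INT, tindex INT)")
con.executemany(
    "INSERT INTO y VALUES (?, ?)",
    [(15, 10), (15, 20), (16, 99)],
)

# Selecting the literal 15 instead of the column needs no GROUP BY:
# the target list is a constant plus aggregates, and the aggregates
# run once over all rows passing the WHERE clause.
row = con.execute(
    "SELECT 15 AS ycis_id, MIN(tindex), AVG(tindex) "
    "FROM y WHERE ycis_id = 15"
).fetchone()
print(row)  # (15, 10, 15.0)
```

The constant is guaranteed single-valued by construction, which is exactly the property the parser cannot in general prove about a bare column reference.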
Re: [HACKERS] Syntax bug? Group by?
Mark Woodward wrote:
> select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>
> I still assert that there will always only be one row to this query. This is an aggregate query, so all the rows with ycis_id = 15 will be aggregated. Since ycis_id is the identifying part of the query, it should not need to be grouped. My question: is it a syntactic technicality that PostgreSQL asks for a group by, or a bug in the parser?

I think your point is that every non-aggregate column in the results of the query also appears in the where clause and is given a single value there, so conceivably an all-knowing, all-powerful postgres could recognize this and do the implied GROUP BY on these columns.

I'm not in a position to give a definitive answer on this, but I suspect that adjusting the query parser/planner to allow an implied GROUP BY either gets prohibitively complicated, or fits too much of a special case to be worth implementing.

select ycis_id, some_other_id, min(tindex), avg(tindex)
from y
where ycis_id = 15
group by some_other_id;

Here, postgres would have to use the group by you specified, and also recognize the single-valued constant assigned to ycis_id. Maybe not too bad, but:

select ycis_id, some_other_id, min(tindex), avg(tindex)
from y
where ycis_id = some_single_valued_constant(foo, bar)
group by some_other_id;

In this case, postgres doesn't know whether some_single_valued_constant() will really return the same single value for every tuple. Ultimately, as more complex queries are introduced, it becomes a lot simpler for the query writer to just specify the group by columns instead of having the engine try to guess them from the where clause.

Final note: I could also see situations where an implied group by would silently allow a poorly written query to execute, instead of throwing an error that suggests to the query writer that they did something wrong.

--
Nolan Cafferky
Software Developer
IT Department
RBS Interactive
[EMAIL PROTECTED]
Re: [HACKERS] Syntax bug? Group by?
Mark Woodward wrote:
> I still assert that there will always only be one row to this query. This is an aggregate query, so all the rows with ycis_id = 15 will be aggregated. Since ycis_id is the identifying part of the query, it should not need to be grouped. My question, is it a syntactic technicality that PostgreSQL asks for a group by, or a bug in the parser?

AFAIK what you want is not per SQL spec. What if you had instead written

select ycis_id, min(tindex), avg(tindex) from y where frobnitz(ycis_id) = 15;

? I think you are expecting too much reasoning from the engine.

cheers
andrew
Re: [HACKERS] Syntax bug? Group by?
Hi, Mark,

Mark Woodward wrote:
> I still assert that there will always only be one row to this query. This is an aggregate query, so all the rows with ycis_id = 15 will be aggregated. Since ycis_id is the identifying part of the query, it should not need to be grouped. My question, is it a syntactic technicality that PostgreSQL asks for a group by, or a bug in the parser?

I think that it's a lack of special-casing of the = operator. Imagine "where ycis_id < 15" or "where ycis_id @| $RECTANGLE" or other (probably user defined) operators on (probably user defined) datatypes. The parser has no real knowledge of what the operators do; it simply requests one that returns a bool.

One could make the parser special-case the = operator, and maybe some others; however, I doubt it's worth the effort.

HTH,
Markus
Re: [HACKERS] Syntax bug? Group by?
Markus Schaber wrote:
> Hi, Mark,
>
> Mark Woodward wrote:
>> I still assert that there will always only be one row to this query. This is an aggregate query, so all the rows with ycis_id = 15 will be aggregated. Since ycis_id is the identifying part of the query, it should not need to be grouped. My question, is it a syntactic technicality that PostgreSQL asks for a group by, or a bug in the parser?
>
> I think that it's a lack of special-casing of the = operator. Imagine "where ycis_id < 15" or "where ycis_id @| $RECTANGLE" or other (probably user defined) operators on (probably user defined) datatypes. The parser has no real knowledge of what the operators do; it simply requests one that returns a bool. One could make the parser special-case the = operator, and maybe some others; however, I doubt it's worth the effort.

I understand the SQL, and this isn't an SQL question - else it would be on a different list. It is a PostgreSQL internals question and IMHO a potential bug. The original query:

select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

should NOT require a group by to get ycis_id in the results.
Re: [HACKERS] Asynchronous I/O Support
Have a look at this: [2] http://www-128.ibm.com/developerworks/linux/library/l-async/

This gives a good description of AIO. I'm doing some testing; I will notify you if I get any positive results. Please let me know if you get any ideas after reading [2].

Regards,
Raja

On 10/17/06, Florian Weimer [EMAIL PROTECTED] wrote:
> * Neil Conway:
>> [1] http://lse.sourceforge.net/io/aio.html
>
> But you are right -- it seems that io_submit still blocks without O_DIRECT. *sigh*
Re: [HACKERS] Syntax bug? Group by?
Markus Schaber wrote:
> Hi, Mark,
>
> Mark Woodward wrote:
>> Shouldn't this work?
>> select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>> ERROR: column y.ycis_id must appear in the GROUP BY clause or be used in an aggregate function
>> If I am asking for a specific column value, should I, technically speaking, need to group by that column?
>
> Try:
> SELECT 15 as ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

This isn't a SQL question!!! This is a question of whether or not PostgreSQL is correct in requiring a group by in the query. I assert that since it is unambiguous as to what ycis_id should be, PostgreSQL should not require a grouping.
Re: [HACKERS] Syntax bug? Group by?
Andrew Dunstan wrote:
> AFAIK what you want is not per SQL spec. What if you had instead written
>
> select ycis_id, min(tindex), avg(tindex) from y where frobnitz(ycis_id) = 15;
>
> ? I think you are expecting too much reasoning from the engine.

Regardless, I can get the results I need and have already worked around this. The reason I posted the question to hackers was that I think it is a bug. The output column ycis_id is unambiguously a single value with regard to the query. Shouldn't PostgreSQL know this? AFAIR, I think I've used this exact type of query before, either on PostgreSQL or another system, maybe Oracle, and it did work.
Re: [HACKERS] Syntax bug? Group by?
Nolan Cafferky wrote:
> I think your point is that every non-aggregate column in the results of the query also appears in the where clause and is given a single value there, so conceivably an all-knowing, all-powerful postgres could recognize this and do the implied GROUP BY on these columns.

Not exactly.

> I'm not in a position to give a definitive answer on this, but I suspect that adjusting the query parser/planner to allow an implied GROUP BY either gets prohibitively complicated, or fits too much of a special case to be worth implementing.
>
> select ycis_id, some_other_id, min(tindex), avg(tindex) from y where ycis_id = 15 group by some_other_id;

This is not, in fact, like the example I gave, and it confuses the point I am trying to make. The original query:

select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

ycis_id is unambiguous and MUST be only one value, so there should be no requirement of grouping. In fact, a group by implies multiple result rows in an aggregate query. As I said in other branches of this thread, this isn't a SQL question; it is a question of whether or not the PostgreSQL parser is correct, and I do not believe that it is working correctly.
Re: [HACKERS] Syntax bug? Group by?
Mark Woodward wrote:
> This isn't a SQL question!!! This is a question of whether or not PostgreSQL is correct in requiring a group by in the query. I assert that since it is unambiguous as to what ycis_id should be, PostgreSQL should not require a grouping.

Of course it's an SQL question. How can you ask about the correctness of a piece of text which purports to be SQL and then say it isn't an SQL question? If you can point to a place in the spec or our docs that sanctions the usage you expect, then please do so. Until then I (and I suspect everyone else) will persist in saying it's not a bug.

cheers
andrew
Re: [HACKERS] Syntax bug? Group by?
Mark Woodward wrote:
> Stephen Frost wrote:
>>   select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>
> But back to the query: the issue comes in that the ycis_id value is included with the return values requested (a single row value with aggregate values that isn't grouped) - if ycis_id is not unique you will get x number of returned tuples with ycis_id=15 and the same min() and avg() values for each row. Removing the ycis_id after the select will return the aggregate values you want without the group by.

> I still assert that there will always only be one row to this query. This is an aggregate query, so all the rows with ycis_id = 15 will be aggregated. Since ycis_id is the identifying part of the query, it should not need to be grouped.

  SELECT ycis_id FROM table WHERE ycis_id=15;

returns a single tuple when ycis_id is unique, otherwise multiple tuples - which means that SELECT ycis_id is technically defined as returning a multiple row tuple even if ycis_id is unique. The data in the tuple returned is data directly from one table row.

  SELECT max(col2) FROM table WHERE ycis_id=15;

returns an aggregate tuple.

  SELECT ycis_id FROM table WHERE ycis_id=15 GROUP BY ycis_id;

returns an aggregate tuple (aggregated with the GROUP BY clause making the ycis_id after the SELECT an aggregate as well).

You can't have both a single tuple and an aggregate tuple returned in the one statement. If you want the column value of ycis_id in the results, you need the group by to unify all returned results as being aggregates.

-- 
Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz
Re: [HACKERS] Syntax bug? Group by?
Mark Woodward wrote:
> Shouldn't this work?
>
>   select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>   ERROR:  column "y.ycis_id" must appear in the GROUP BY clause or be used in an aggregate function

This would require a great deal of special-casing, in particular knowledge of the = operator, and then the restriction to a particular form of the WHERE clause. For overall consistency, I don't think this should be allowed.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] Syntax bug? Group by?
On Oct 17, 2006, at 15:19, Peter Eisentraut wrote:
> Mark Woodward wrote:
>> Shouldn't this work?
>>
>>   select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>>   ERROR:  column "y.ycis_id" must appear in the GROUP BY clause or be used in an aggregate function
>
> This would require a great deal of special-casing, in particular knowledge of the = operator, and then the restriction to a particular form of the WHERE clause. For overall consistency, I don't think this should be allowed.

In this particular case, the client constructing the query *knows* the value of ycis_id (since the client is generating the ycis_id = 15 clause). It's technically just a waste of bandwidth and server resources to recalculate it. If you really want to replicate the output of the query you proposed, you could rewrite it on the client as:

  select 15 as ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

You could argue that the server should do this for you, but it seems ugly to do in the general case. And, like Peter points out, it would need a lot of special-casing. I guess the parser could do it for expressions in the SELECT clause that exactly match expressions in the WHERE clause.

Thanks!
- Chris
Re: [HACKERS] Syntax bug? Group by?
On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:
>> The output column ycis_id is unambiguously a single value with regards to the query. Shouldn't PostgreSQL know this? AFAIR, I think I've used this exact type of query before, either on PostgreSQL or another system, maybe Oracle, and it did work.
>
> Doesn't work in Oracle 10g:
>
>   SELECT ycis_id, tindex from x where ycis_id = 15;
>
>      YCIS_ID     TINDEX
>   ---------- ----------
>           15         10
>           15         20
>
>   SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
>   ORA-00937: not a single-group group function
>
>   SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP BY ycis_id;
>
>      YCIS_ID MIN(TINDEX) AVG(TINDEX)
>   ---------- ----------- -----------
>           15          10          15

That's interesting. I am digging through the SQL99 spec, and am trying to find a definitive answer.
Re: [HACKERS] Syntax bug? Group by?
> Mark Woodward wrote:
>> Shouldn't this work?
>>
>>   select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>>   ERROR:  column "y.ycis_id" must appear in the GROUP BY clause or be used in an aggregate function
>
> This would require a great deal of special-casing, in particular knowledge of the = operator, and then the restriction to a particular form of the WHERE clause. For overall consistency, I don't think this should be allowed.

Well, this started out as a "huh, that's funny, that should work, is that a bug?" and is turning into a search through the SQL99 spec for a clear answer. I've already worked around it, but to me, at least, it seems it should work.
Re: [HACKERS] Syntax bug? Group by?
> On Oct 17, 2006, at 15:19, Peter Eisentraut wrote:
>> Mark Woodward wrote:
>>> Shouldn't this work?
>>>
>>>   select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>>>   ERROR:  column "y.ycis_id" must appear in the GROUP BY clause or be used in an aggregate function
>>
>> This would require a great deal of special-casing, in particular knowledge of the = operator, and then the restriction to a particular form of the WHERE clause. For overall consistency, I don't think this should be allowed.
>
> In this particular case, the client constructing the query *knows* the value of ycis_id (since the client is generating the ycis_id = 15 clause). It's technically just a waste of bandwidth and server resources to recalculate it. If you really want to replicate the output of the query you proposed, you could rewrite it on the client as:
>
>   select 15 as ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>
> You could argue that the server should do this for you, but it seems ugly to do in the general case. And, like Peter points out, it would need a lot of special-casing. I guess the parser could do it for expressions in the SELECT clause that exactly match expressions in the WHERE clause.

But, and here's the rub: which is the correct way to handle it? I'm looking through the SQL99 spec to see if I can find an answer.
Re: [HACKERS] Syntax bug? Group by?
On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:
> The output column ycis_id is unambiguously a single value with regards to the query. Shouldn't PostgreSQL know this? AFAIR, I think I've used this exact type of query before, either on PostgreSQL or another system, maybe Oracle, and it did work.

Doesn't work in Oracle 10g:

  SELECT ycis_id, tindex from x where ycis_id = 15;

     YCIS_ID     TINDEX
  ---------- ----------
          15         10
          15         20

  SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
  ORA-00937: not a single-group group function

  SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP BY ycis_id;

     YCIS_ID MIN(TINDEX) AVG(TINDEX)
  ---------- ----------- -----------
          15          10          15

--Joe

-- 
Joe Sunday [EMAIL PROTECTED]  http://www.csh.rit.edu/~sunday/
Computer Science House, Rochester Inst. Of Technology
Re: [HACKERS] [GENERAL] query log corrupted-looking entries
George Pavlov [EMAIL PROTECTED] writes:
>> Hmm. If the messages are less than PIPE_BUF bytes long (4096 bytes on Linux) then the writes are supposed to be atomic.
>
> Some of them involve long messages (>4K), but there are many that do not (like the ones I had posted at the start of this thread).

I checked around with some kernel/glibc gurus in Red Hat, and the consensus seemed to be that we'd be better off to bypass fprintf() and just send message strings to stderr using write() --- ie, instead of elog.c doing

  fprintf(stderr, "%s", buf.data);

do

  write(fileno(stderr), buf.data, strlen(buf.data));

Anyone have any comments on possible portability risks? In particular, will this work on Windows?

			regards, tom lane
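The property Tom is relying on can be sketched outside of C. Below is a minimal Python illustration (the `log_line` helper and the pipe are hypothetical, not PostgreSQL code) of emitting a log line with a single unbuffered write(), which POSIX makes atomic on a pipe for payloads up to PIPE_BUF bytes:

```python
import os

def log_line(fd, msg):
    # One write() syscall per message. POSIX guarantees writes of
    # <= PIPE_BUF bytes (at least 512) to a pipe are atomic, so
    # concurrent writers cannot interleave within a single message.
    data = msg.encode()
    assert len(data) <= 512  # stay under the portable PIPE_BUF floor
    os.write(fd, data)       # bypasses stdio buffering entirely

r, w = os.pipe()
log_line(w, "LOG:  statement: SELECT 1\n")
os.close(w)
print(os.read(r, 4096).decode(), end="")
```

The same cannot be promised for fprintf(), which may split one message across several underlying writes depending on stdio buffering.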
Re: [HACKERS] Syntax bug? Group by?
On Tue, Oct 17, 2006 at 04:45:49PM -0400, Mark Woodward wrote:
> Well, this started out as a "huh, that's funny, that should work, is that a bug?" and is turning into a search through the SQL99 spec for a clear answer. I've already worked around it, but to me, at least, it seems it should work.

What you're asking for is difficult, not done by anyone else (so far demonstrated) and not mandated by the spec, so I don't see how it could be construed a bug.

As for the spec, this is what I have from SQL2003:

  7.12.15) If T is a grouped table, then let G be the set of grouping columns of T. In each value expression contained in select list, each column reference that references a column of T shall reference some column C that is functionally dependent on G or shall be contained in an aggregated argument of a set function specification whose aggregation query is QS.

Which to me says that everything in the output is either grouped by or part of an aggregate. That, together with a statement elsewhere saying that if no GROUP BY clause is present, "GROUP BY ()" is implied, seems to seal it for me. (BTW, the "functionally dependent" is new; postgresql only supports the older SQL standards where C has to actually be a grouping column.)

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
Re: [HACKERS] Syntax bug? Group by?
Andrew Dunstan [EMAIL PROTECTED] writes:
> Mark Woodward wrote:
>> My question, is it a syntactic technicality that PostgreSQL asks for a group by, or a bug in the parser?
> AFAIK what you want is not per sql spec.

It would in fact be a violation of spec. Consider the case where there are no rows matching 15. In this case

  select min(tindex), avg(tindex) from y where ycis_id = 15;

will yield one row containing NULLs, whereas

  select min(tindex), avg(tindex) from y where ycis_id = 15 group by ycis_id;

will yield no rows (because there are no groups). Therefore, if Postgres were to implicitly insert a GROUP BY to make it legal to reference ycis_id directly, we'd be changing the query behavior and breaking spec.

			regards, tom lane
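Tom's empty-table distinction is easy to check empirically in an engine that accepts both forms. A small sketch using Python's bundled sqlite3 module (SQLite tolerates the ungrouped query; the table and column names are taken from the thread) shows the one-row-of-NULLs versus zero-rows difference:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE y (ycis_id INTEGER, tindex INTEGER)")  # no rows

# Plain aggregate: always exactly one result row, NULLs when nothing matches.
no_group = con.execute(
    "SELECT min(tindex), avg(tindex) FROM y WHERE ycis_id = 15").fetchall()

# Grouped aggregate: no matching rows means no groups, hence no rows at all.
grouped = con.execute(
    "SELECT min(tindex), avg(tindex) FROM y WHERE ycis_id = 15 "
    "GROUP BY ycis_id").fetchall()

print(no_group)  # [(None, None)]
print(grouped)   # []
```

An implicit GROUP BY would silently turn the first behavior into the second, which is exactly the spec breakage Tom describes.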
Re: [HACKERS] Syntax bug? Group by?
Joe Sunday wrote:
> On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:
>> The output column ycis_id is unambiguously a single value with regards to the query. Shouldn't PostgreSQL know this? AFAIR, I think I've used this exact type of query before, either on PostgreSQL or another system, maybe Oracle, and it did work.
>
> Doesn't work in Oracle 10g:
>
>   SELECT ycis_id, tindex from x where ycis_id = 15;
>
>      YCIS_ID     TINDEX
>   ---------- ----------
>           15         10
>           15         20
>
>   SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
>   ORA-00937: not a single-group group function
>
>   SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP BY ycis_id;
>
>      YCIS_ID MIN(TINDEX) AVG(TINDEX)
>   ---------- ----------- -----------
>           15          10          15
>
> --Joe

MySQL reports:

  Mixing of GROUP columns (MIN(),MAX(),COUNT()...) with no GROUP columns is illegal if there is no GROUP BY clause

I found one that actually returns the desired result - SQLite3.

  sqlite> select * from test;
  15|20
  15|10
  sqlite> select ycis_id,min(tindex),avg(tindex) from test where ycis_id=15;
  15|10|15
  sqlite>

-- 
Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz
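Shane's SQLite session is reproducible from Python's bundled sqlite3 module; this sketch just replays it, showing that SQLite simply picks a value for the ungrouped column (here every matching row has ycis_id = 15, so the choice happens to be harmless):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (ycis_id INTEGER, tindex INTEGER)")
con.executemany("INSERT INTO test VALUES (?, ?)", [(15, 20), (15, 10)])

# SQLite permits the bare (ungrouped) ycis_id alongside the aggregates.
row = con.execute(
    "SELECT ycis_id, min(tindex), avg(tindex) FROM test WHERE ycis_id = 15"
).fetchone()
print(row)  # (15, 10, 15.0)
```

Note that this is a documented SQLite leniency, not standard SQL: when ycis_id is not constrained to one value, the bare column comes from an arbitrary row.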
Re: [HACKERS] Postgresql Caching
Hi,

I've loved reading all of your thoughts and comments. Yet, I'm left with the question: can we brainstorm a caching solution that is workable?

I've seen some posts talking about some of the challenges.

1.) Only good for static data.

As it was proposed, that is largely true. This doesn't mean, however, that the idea as stated isn't worth pursuing, because I think you'll find most applications have a great deal of "static enough" data to benefit greatly from this type of caching. However, I think some simple changes to the idea may make it useful for busy tables. These changes would probably require direct communication between the caching controller and the postmaster.

a.) Rather than table locking, track changes at the row level.
b.) Rather than requiring a complete reseeding of a table after an update, just invalidate, or repopulate, the affected rows.
c.) Rather than destroying popular query results, try to update them if possible. For example, it's easy to remove one entry from the cache if that row was just deleted. (It's probably cheaper to rerun the query on just the few changed rows than to rerun the whole query.)
d.) Any other ideas?

2.) If any caching were possible, we'd already be doing it.

I don't think this statement will stand the test of time! Agreed, caching quickly becomes a head-hurting, complex topic, but the benefits are well worth the effort!

3.) Consistency and memcached *are* mutually exclusive. Memcached provides no mechanisms for consistency.

  "You can never have perfect consistency across different systems (memcache / postgresql) and especially not when their visibility rules differ. What is visible to something via memcache is always latest uncommitted. What is visible in PostgreSQL is something less than that. Consistency is not possible. Correct caching is therefore also not possible unless you define correct as 'latest', and even then, you have problems if memcache expires the record before the real record has been committed into PostgreSQL."

I completely agree. I'm not talking about synchronizing memcached data to be consistent; I'm saying let's create a caching layer that works something like memcache and preserves consistency!

And, very related to this,

4.) Memcached caching is exactly opposite to Postgres consistency. Specifically: memcache is serialized; memcache can lose data at any time; memcache has only 2 fields; memcache has no synchronization; Postgres needs consistency.

Memcache doesn't do any synchronization, and that means consistency is impossible. However, a special version of memcache that is embedded into the postgresql system or API, and that does talk with the postmaster, could be able to provide guaranteed consistency.

5.) This idea won't save any time with SQL parsing.

I believe it can. Because, as memcache has illustrated, you can avoid any SQL parsing by using the SQL text and the user's table permissions (if different users are enabled) as the key to the cached data.

6.) Postgresql is consistency. If an application needs speed, let the application figure out how to cache the data.

I appreciate that Postgres is all about data consistency. Actually, that's why I'm here and not on the Mysql board... However, I believe that we can provide caching without losing consistency, and developers will love the extra speed. If we implement the caching once, everyone will be able to use it without complicating their end application! (Read: it will help the world and make PostgreSQL very popular!)

---

So, would it work to create a caching pre-processor for Postgresql that would work serially on every request, and pass all uncached queries to the database?

- If it's a cache hit, and the data is currently available and active, pass the data back.
- If it's a miss, pass the query along to the database, and populate the cache with the results.
- If the query changes data, invalidate the cached queries that touch any table rows that could be affected. After the update, re-enable the unaffected table rows, and repopulate the cache with the updates.
- Cached queries using an affected table would normally be deleted, except in special simple cases that could be updated.

A related interesting thought: it might help if the cache stored the data in separate table rows, rather than in whatever format the query requested, the way memcached does.

- Each cached table row could be stored as a separate entity.
- Complicated joins, rather than caching all of the data in whatever organization the user specified, would instead store a matrix of pointers to the exact table fields in most cases.

Will it work? Am I missing anything?

Thanks,
Daniel
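The hit/miss/invalidate cycle in the bullets above can be sketched as a toy cache keyed by SQL text, where each entry remembers which tables it read so a write can evict exactly the affected entries. All names here are hypothetical; this is a design sketch of the proposal, not memcached's or pgpool's actual API:

```python
class QueryCache:
    """Toy result cache: SQL text -> rows, with per-table invalidation."""

    def __init__(self):
        self._results = {}   # sql text -> cached rows
        self._tables = {}    # sql text -> set of tables the query read

    def get(self, sql):
        # Cache hit returns the stored rows; miss returns None.
        return self._results.get(sql)

    def put(self, sql, tables, rows):
        self._results[sql] = rows
        self._tables[sql] = set(tables)

    def invalidate(self, table):
        # A write touched `table`: drop every cached query that read it.
        stale = [s for s, ts in self._tables.items() if table in ts]
        for s in stale:
            del self._results[s]
            del self._tables[s]

cache = QueryCache()
cache.put("SELECT * FROM film", ["film"], [("Alien",), ("Brazil",)])
print(cache.get("SELECT * FROM film"))  # hit: the cached rows come back
cache.invalidate("film")                # an UPDATE on film ran
print(cache.get("SELECT * FROM film"))  # None: the query must be re-run
```

The row-level and partial-update refinements from points a.) through c.) would replace the blunt `invalidate` here with per-row bookkeeping, which is where the real complexity lives.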
Re: [HACKERS] Postgresql Caching
On Tue, 2006-10-17 at 16:51 -0700, Anon Mous wrote:
> Hi
>
> I've loved reading all of your thoughts and comments. Yet, I'm left with the question: Can we brainstorm a caching solution that is workable...

I think you're making this a little complicated. A lot of these problems can be solved with something like materialized views (triggers updating another smaller relation), or they are already solved by the lower layers of caching (like PostgreSQL's shared buffers, or the OS buffer cache).

If the application is executing many queries that are exactly the same, it would be a good idea to look at something like pgpool-II's query cache. The only way to gain a benefit on querying results that are already in memory is to avoid the query processing.

Regards,
	Jeff Davis
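The "triggers updating another smaller relation" idea Jeff mentions fits in a few lines of SQL. Here is a toy version using Python's sqlite3 module (the table names are made up for illustration), where an insert trigger keeps a one-row summary table current so readers never re-aggregate the base table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales (amount INTEGER);
CREATE TABLE sales_summary (total INTEGER);
INSERT INTO sales_summary VALUES (0);

-- Maintain the "materialized view" on every insert into the base table.
CREATE TRIGGER sales_ins AFTER INSERT ON sales
BEGIN
    UPDATE sales_summary SET total = total + NEW.amount;
END;
""")

con.executemany("INSERT INTO sales VALUES (?)", [(10,), (5,)])
print(con.execute("SELECT total FROM sales_summary").fetchone())  # (15,)
```

A full solution would also need UPDATE and DELETE triggers, but the principle is the same: pay a little on every write to make the hot read query trivial.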
[HACKERS] 8.1.4 verified on Intel Mac OS 10.4.8
The subject line says it all. I just completed a build, test, and install of PostgreSQL 8.1.4 on an Intel Mac with OS 10.4.8. Every single step executed flawlessly, and all 98 tests passed, running the regression tests.

I've got a suggestion for the documentation (INSTALL file). I may be the only person in the world who does everything backwards, but my first attempt at regression tests failed because I had already started the postmaster. I know that's pretty stupid, but I had to scratch my head for a few minutes when it said initdb failed, and left me no clue as to why. You might add a note in with the instructions for regression testing to shut down the server before running the tests. It does say, after all, that you can run them any time!

Douglas Toltzman
[EMAIL PROTECTED]
(910) 526-5938

p.s. I noticed I am a build behind, but I would think that 8.1.5 would give similar results.
[HACKERS] pg_internal.init is hazardous to your health
Dirk Lutzebaeck and I just spent a tense couple of hours trying to figure out why a large database Down Under wasn't coming up after being reloaded from a base backup plus PITR recovery. The symptoms were that the recovery went fine, but backend processes would fail at startup, or soon after, with "could not open relation XX/XX/XX: No such file"-type errors.

The answer that ultimately emerged was that they'd been running a nightly maintenance script that did REINDEX SYSTEM (among other things, I suppose). The PITR base backup included pg_internal.init files that were appropriate when it was taken, and the PITR recovery process did nothing whatsoever to update 'em :-(. So incoming backends picked up init files with obsolete relfilenode values.

We don't actually need to *update* the file, per se; we only need to remove it if no longer valid --- the next incoming backend will rebuild it. I could see fixing this by making WAL recovery run around and zap all the .init files (only problem is to find 'em), or we could add a new kind of WAL record saying "remove the .init file for database XYZ", to be emitted whenever someone removes the active one.

Thoughts?

Meanwhile, if you're trying to recover from a PITR backup and it's not working, try removing any pg_internal.init files you can find.

			regards, tom lane
Re: [HACKERS] pg_internal.init is hazardous to your health
On Tue, 17 Oct 2006, Tom Lane wrote:

> Dirk Lutzebaeck and I just spent a tense couple of hours trying to figure out why a large database Down Under wasn't coming up after being reloaded from a base backup plus PITR recovery. The symptoms were that the recovery went fine, but backend processes would fail at startup or soon after with "could not open relation XX/XX/XX: No such file"-type errors. The answer that ultimately emerged was that they'd been running a nightly maintenance script that did REINDEX SYSTEM (among other things I suppose). The PITR base backup included pg_internal.init files that were appropriate when it was taken, and the PITR recovery process did nothing whatsoever to update 'em :-(. So incoming backends picked up init files with obsolete relfilenode values.

Ouch.

> We don't actually need to *update* the file, per se, we only need to remove it if no longer valid --- the next incoming backend will rebuild it. I could see fixing this by making WAL recovery run around and zap all the .init files (only problem is to find 'em), or we could add a new kind of WAL record saying "remove the .init file for database XYZ" to be emitted whenever someone removes the active one. Thoughts?

The latter seems the Right Way except, I guess, that the decision to remove the file is buried deep inside inval.c.

Thanks,

Gavin
Re: [HACKERS] [PERFORM] Hints proposal
On Friday 13 October 2006 12:46, Gregory Stark wrote:
> Josh Berkus josh@agliodbs.com writes:
>> I actually think the way to attack this issue is to discuss the kinds of errors the planner makes, and what tweaks we could do to correct them. Here's the ones I'm aware of:
>>
>> -- Incorrect selectivity of WHERE clause
>> -- Incorrect selectivity of JOIN
>> -- Wrong estimate of rows returned from SRF
>> -- Incorrect cost estimate for index use
>>
>> Can you think of any others?
>
> -- Incorrect estimate for result of DISTINCT or GROUP BY.

Yeah, that one is bad. I also ran into one the other day where the planner did not seem to understand the distinctness of a column's values across table partitions...

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] [GENERAL] Anyone using POSIX time zone offset capability?
"Tom" == Tom Lane [EMAIL PROTECTED] writes:

Tom> The weird thing about this allegedly-POSIX notation is the combination of a symbolic name and a further offset from it.

AIUI, it is not a further offset but rather (mostly-)redundant data specifying the exact offset from UTC[1] the text tz specifies. Having both provides easy parsing both for humans (the text) and for code (the number).

-JimC

[1] Of course POSIX time is not really offset from UTC, since POSIX pretends there have been no leap seconds since 1970. As such, the timestamps are technically ambiguous as to whether they specify real UTC-based time or POSIX time. (Currently there is a 23-second difference between the two.)

-- 
James Cloos [EMAIL PROTECTED]  OpenPGP: 0xED7DAEA6
Re: [HACKERS] Additional stats for Relations
Hi,

So: heap_blks_reused (with Jim's semantics), heap_blks_extend, heap_blks_truncate are the interesting stats? Will try to work up a patch for this.

Regards,
Nikhils
EnterpriseDB  http://www.enterprisedb.com

On 10/15/06, Simon Riggs [EMAIL PROTECTED] wrote:
> On Sat, 2006-10-14 at 11:32 +0530, NikhilS wrote:
>> On 10/13/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
>>> I'm also not sure if this metric is what you actually want, since a single page can be returned many times from the FSM even between vacuums. Tracking how many pages for a relation have been put into the FSM might be more useful...
>>
>> Nikhils: Pages might be put into the FSM, but by this metric don't we get the actual usage of the pages from the FSM? Agreed a single page can be returned multiple times, but since it serves a new tuple, shouldn't we track it?
>
> This makes sense for indexes, but only makes sense for heaps when we know that the backend will keep re-accessing the block until it is full - so only of interest in steady-state workloads. IMHO Jim's proposal makes more sense for general use.
>
>> heap_blks_extend: The number of times file extend was invoked on the relation
>
> Sounds good.
>
>> heap_blks_truncate: The total number of blocks that have been truncated due to vacuum activity e.g.
>
> Sounds good.
>
>> As an addendum to the truncate stats above, we can also have the additional following stats:
>> heap_blks_maxtruncate: The max block of buffers truncated in one go
>> heap_blks_ntruncate: The number of times truncate was called on this relation
>
> Those last 2 sound too complex for normal use, and ntruncate is most likely the same as number of vacuums anyway. Hmmm... Perhaps nvacuums is a more interesting metric? We've got last vacuum date, but no indication of how frequently a vacuum has run.
>
>>> Do you have a use-case for this info? I can see where it might be neat to know, but I'm not sure how you'd actually use it in the real world.
>>
>> Nikhils: The use-case according to me is that these stats help prove the effectiveness of autovacuum/vacuum operations. By varying some autovac guc variables, and doing subsequent (pgbench e.g.) runs, one can find out the optimum values for these variables using these stats.
>
> This should be useful for tuning space allocation/deallocation. If we get this patch in early it should help get feedback on this area.
>
> --
> Simon Riggs
> EnterpriseDB  http://www.enterprisedb.com

--
All the world's a stage, and most of us are desperately unrehearsed.