[HACKERS] Failing semctl
I have an intermittent problem with PostgreSQL 7.3.2 on Solaris 8. The backend process crashes once in a while, with this in the database log:

...
2004-07-05 22:43:54 LOG: database system is ready
IpcSemaphoreInitialize: semctl(id=65615, 14, SETVAL, 0) failed: Invalid argument
2004-07-06 09:19:03 LOG: server process (pid 23755) exited with exit code 1
2004-07-06 09:19:03 LOG: terminating any other active server processes
2004-07-06 09:19:03 WARNING: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your database system connection and exit. Please reconnect to the database system and repeat your query.
2004-07-06 09:19:03 WARNING: Message from PostgreSQL backend: ...
...

This problem may randomly pop up on any one of the few Pg servers running on this host. I'm not sure whether this could be a kernel problem. Do I need to look at the kernel configuration, installed patches, etc.? Any ideas on why this could happen would be appreciated.

Mike.

---(end of broadcast)---
TIP 8: explain analyze is your friend
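For the archives, a first diagnostic step would be checking whether the semaphore set still exists when the error fires. A hypothetical sketch using the standard SysV IPC tools (the Solaris-specific command is only in a comment):

```shell
# List the SysV semaphore sets currently allocated. The id from the log
# (65615) should appear here while the postmaster is running; if the set
# was removed out from under a backend, semctl() fails with EINVAL.
ipcs -s
# On Solaris 8 the kernel semaphore limits (semsys:seminfo_semmni etc.)
# can be inspected with: sysdef -i | grep -i sem
```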
Re: [HACKERS] Adding VERSION to startup log output
> Would anybody object to adding an extra line at startup that shows the
> version number of the database system?
>
> Message suggestion:
> Starting PostgreSQL database system
>
> (Nothing cheesy, just informative and neutrally welcoming...)
>
> Comments?
>
> Best Regards, Simon Riggs

If you do that, does it make sense to display the full version, instead of what's stored in PG_VERSION? (e.g. 7.3.4 vs. 7.3)

Mike
Re: [HACKERS] Syntax question
For what it's worth, I like the second form better.

Mike

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Christopher Kings-Lynne
> Sent: Sunday, May 23, 2004 6:08 AM
> To: PostgreSQL-development
> Subject: [HACKERS] Syntax question
>
> Here are the two syntaxes we can use for turning off clustering:
>
> 1) ALTER TABLE / SET WITHOUT CLUSTER
>
> This will turn off clustering on any index on the table that has it
> enabled. It won't recurse so as to match the CLUSTER ON syntax.
> However, this form makes the non-standardy SET WITHOUT form more
> emphasised...
>
> 2) ALTER TABLE / DROP CLUSTER ON idx
>
> I like this form, however to make it work, we need to bump CLUSTER to
> being a reserved keyword. This form looks more like SQL standard, and
> is related to the CLUSTER ON form.
>
> Which one do we want?
>
> Chris
Re: [HACKERS] database errors
> -----Original Message-----
> From: [EMAIL PROTECTED]
>
> > > pg_dump: ERROR: XLogFlush: request 0/A971020 is not satisfied ---
> > > flushed only to 0/550 ... lost synchronization with server, resetting
> > > connection
> >
> > Okay, you have a page with an LSN of A971020 which is past end of XLOG
> > (550). You may have created this problem for yourself by doing
> > pg_resetxlog with poorly chosen parameters.
>
> Michael,
>
> From reading these error logs, it would appear that this system has been
> very strangely configured indeed.
>
> The recommendations for usage are fairly clear:
> - don't use it on NFS... not because we hate NFS, it's just unsuited to
>   the task of serving files to a database system
> - don't delete the transaction logs manually... they get recycled soon
>   enough anyhow
>
> [ Is there a connection between the fact that it is on NFS and the logs
> have been manually deleted?

From what I know this was an attempt to make things better after they ran into bad problems. There's no direct indication these problems were in any way related to NFS, but I can't exclude this chance either. They ran pg_resetxlog without any arguments, then ran it with -f. (Perhaps this was done more than once.) At some point they deleted the logs. And the errors I posted above were generated after I got the copy of this database and started experimenting with it.

> We know that SQLServer allows a "truncate transaction log" facility -
> is that something that you were expecting to see and trying to emulate
> with PostgreSQL? Were you trying to stop NFS writes taking place?

No, I don't think this was the idea.

> Your logs are rated very low. Is the transaction rate very low on this
> system or has the system recently been set up?

This was a very fresh database indeed.

> ... what is the benefit of using NFS?
> PostgreSQL offers client/server access - so why not use that instead?

We don't have full control over this.
The database is a relatively small piece of a larger system, which includes a customized Apache server and a number of other modules as well. Setting up the system involves some rules and restrictions; one of them is that we don't yet support installing the database server on a different host. (If this is what you meant.) We may actually support it soon; this is not a problem. But NFS is an entirely different issue - our customers often install the database on NFS. I am not sure if we can ever prevent it...

Thank you,
Mike

P.S. This is not the first time I'm bringing my problems to this list, and I sincerely want to thank you folks for responsiveness and help...

> > Best Regards,
> > Simon Riggs
Re: [HACKERS] database errors
It looks like "No such file or directory" followed by the abort signal resulted from manually removing the logs. pg_resetxlog took care of this, but other problems persisted.

I got a copy of the database and installed it on a local partition. It does seem badly corrupted; these are some hard errors.

pg_dump fails and dumps core:

pg_dump: ERROR: XLogFlush: request 0/A971020 is not satisfied --- flushed only to 0/550
... lost synchronization with server, resetting connection

Looking at the core file:

(dbx) where 15
=>[1] _libc_kill(0x0, 0x6, 0x0, 0x, 0x2eaf00, 0xff135888), at 0xff19f938
  [2] abort(0xff1bc004, 0xff1c3a4c, 0x0, 0x7efefeff, 0x21c08, 0x2404c4), at 0xff13596c
  [3] elog(0x14, 0x267818, 0x0, 0xa971020, 0x0, 0x5006260), at 0x2407dc
  [4] XLogFlush(0xffbee908, 0xffbee908, 0x827e0, 0x0, 0x0, 0x0), at 0x78530
  [5] BufferSync(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x18df2c
  [6] FlushBufferPool(0x2, 0x1e554, 0x0, 0x3, 0x0, 0xffbeea79), at 0x18e5c4
  [7] CreateCheckPoint(0x0, 0x0, 0x82c00, 0xff1bc004, 0x2212c, 0x83534), at 0x7d93c
  [8] BootstrapMain(0x5, 0xffbeec50, 0x10, 0xffbeec50, 0xffbeebc8, 0xffbeebc8), at 0x836bc
  [9] SSDataBase(0x3, 0x40a24a8a, 0x2e3800, 0x4, 0x2212c, 0x16f504), at 0x172590
  [10] ServerLoop(0x5091, 0x2e398c, 0x2e3800, 0xff1c2940, 0xff1bc004, 0xff1c2940), at 0x16f3a0
  [11] PostmasterMain(0x1, 0x323ad0, 0x2af000, 0x0, 0x6572, 0x6572), at 0x16ef88
  [12] main(0x1, 0xffbef68c, 0xffbef694, 0x2eaf08, 0x0, 0x0), at 0x12864c

(I don't have a debug build at the moment to get more details.)

This query fails:

LOG: query: select count (1) from note_links_aux;
ERROR: XLogFlush: request 0/A971020 is not satisfied --- flushed only to 0/5006260

DROP TABLE fails:

drop table note_links_aux;
ERROR: getObjectDescription: Rule 17019 does not exist

Are there any pointers as to why this could happen, aside from potential memory and disk problems?

As for NFS...
I know how strongly the PostgreSQL community advises against it, but we have to face it: our customers ARE running on NFS and they WILL be running on NFS. Is there such a thing as "better" and "worse" NFS versions? (I made a note of what was said about hard mount vs. soft mount, etc.)

Tom, you recommended an upgrade from 7.3.2 to 7.3.6. Our next release is using v7.3.4 (maybe it's not too late to upgrade). Would v7.3.6 provide more protection against problems like this?

Thank you,
Mike

> -----Original Message-----
...
> The messages you quote certainly read like a badly corrupted database to
> me. In the case of a local filesystem I'd be counseling you to start
> running memory and disk diagnostics. That may still be appropriate
> here, but you had better also reconsider the decision to use NFS.
>
> If you're absolutely set on using NFS, one possibly useful tip is to
> make sure it's a hard mount not a soft mount. If your systems support
> NFS-over-TCP instead of UDP, that might be worth trying too.
>
> Also I would strongly advise an update to PG 7.3.6. 7.3.2 has serious
> known bugs.
>
> regards, tom lane
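For the archives, the mount options Tom mentions would look something like this (option names vary between Solaris and Linux; the server name and paths are hypothetical, so treat this as a sketch rather than a tested recipe):

```shell
# Hard mount over TCP: on server trouble, writes block and retry instead
# of failing silently, which is what a database needs.
mount -o hard,proto=tcp,vers=3 nfsserver:/export/pgdata /mnt_c1t2d0s0
```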
[HACKERS] database errors
Our customer has problems with Pg 7.3.2 on Solaris. There are numerous errors in the app server log and in the database log, including these:

LOG: open of /mnt_c1t2d0s0//postgresql/pg_xlog/0001 (log file 0, segment 1) failed: No such file or directory
LOG: invalid primary checkpoint record
LOG: open of /mnt_c1t2d0s0//postgresql/pg_xlog/0001 (log file 0, segment 1) failed: No such file or directory
LOG: invalid secondary checkpoint record
PANIC: unable to locate a valid checkpoint record
LOG: startup process (pid 16527) was terminated by signal 6
LOG: aborting startup due to startup process failure
...
ERROR: Cannot insert a duplicate key into unique index cr_pk
PANIC: RecordTransactionAbort: xact 55143 already committed
LOG: server process (pid 22185) was terminated by signal 6
LOG: terminating any other active server processes
WARNING: Message from PostgreSQL backend: The Postmaster has informed me that some other backend died abnormally and possibly corrupted shared memory. I have rolled back the current transaction and am going to terminate your database system connection and exit. Please reconnect to the database system and repeat your query.
LOG: all server processes terminated; reinitializing shared memory and semaphores
LOG: database system was interrupted at 2004-05-10 10:51:01 CDT
LOG: checkpoint record is at 0/30005E0
LOG: redo record is at 0/30005E0; undo record is at 0/0; shutdown TRUE
LOG: next transaction id: 53340; next oid: 57982
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/3000620
LOG: ReadRecord: record with zero length at 0/3000930
LOG: redo done at 0/3000908
WARNING: XLogFlush: request 0/A970F68 is not satisfied --- flushed only to 0/3000930
WARNING: XLogFlush: request 0/A970FA8 is not satisfied --- flushed only to 0/3000930
WARNING: XLogFlush: request 0/A970E00 is not satisfied --- flushed only to 0/3000930
WARNING: XLogFlush: request 0/A970E40 is not satisfied --- flushed only to 0/3000930
FATAL: The database system is starting up
...

We've had "Cannot insert a duplicate key into unique index" in the past. We ran pg_resetxlog and reloaded the database - this helped.

I wonder if the message "open of /mnt_c1t2d0s0/... (log file 0, segment 1) failed: No such file or directory" may indicate some kind of NFS problem. Anything else I need to look at?

Thanks in advance,
Mike.
Re: [HACKERS] negative pid?
Sorry, I should have checked the source first. Just found this in miscinit.c:

 * By convention, the owning process' PID is negated if it is a standalone
 * backend rather than a postmaster. This is just for informational purposes.

> -----Original Message-----
> I see this code in pg_ctl:
>
>    PID=`sed -n 1p $PIDFILE`
>    if [ "$PID" -lt 0 ];then
>        PID=`expr 0 - $PID`
>
> When is it possible (and why) to have a negative number in postmaster.pid?
[HACKERS] negative pid?
I see this code in pg_ctl:

   PID=`sed -n 1p $PIDFILE`
   if [ "$PID" -lt 0 ];then
       PID=`expr 0 - $PID`

When is it possible (and why) to have a negative number in postmaster.pid?

Thanks, Mike
[HACKERS] database is shutting down
We have a customer who reports a weird problem. Too often the app server fails to connect to the database. Sometimes the scheduled vacuum fails as well. The error message is always the same:

FATAL: The database system is shutting down

But from what I see, no one is trying to shut down the database at this time. I am still waiting for the database log to see if I can find a clue there, but I wonder if someone knows what can make the database respond this way.

This is Pg 7.3.2, on HP 11.0, using the Unix domain socket.

Thank you, Mike
Re: [HACKERS] [ADMIN] Schema comparisons
Interestingly, I tried to address the same problem a few days ago. I used pg_dump, grep, etc. - in the end I got what I needed, but it was a cumbersome ordeal.

I think ideally it would be great to have a utility that would give me a clean diff between the schemas. Perhaps pg_dump could have a new argument to produce the output most suitable for this utility.

Mike.

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Tom Lane
> Sent: Saturday, February 28, 2004 10:40 AM
> To: Mark Lubratt
> Cc: [EMAIL PROTECTED]
> Subject: Re: [HACKERS] [ADMIN] Schema comparisons
>
> Mark Lubratt <[EMAIL PROTECTED]> writes:
> > On Feb 27, 2004, at 10:28 PM, Tom Lane wrote:
> >> Mark Lubratt <[EMAIL PROTECTED]> writes:
> >>> I've been trying to be careful, but I've gotten out of synch with
> >>> whether or not I've applied the changes I've made to the development
> >>> system to the production system. Is there a utility that will compare
> >>> the tables, functions, triggers, views, etc. between two systems and
> >>> flag the schema elements that aren't in synch between the two?
> >>
> >> Have you tried diffing pg_dump output? It's not the greatest tool but
> >> it's helpful.
>
> > Yes, I did. It was quite cumbersome. Especially since the OIDs and
> > TOC entry numbers didn't match up; and, since those didn't always match,
> > the order of objects wasn't quite the same either. So, diff was
> > throwing a lot of false positives at me.
>
> Yeah. CVS-tip pg_dump doesn't show OIDs by default, to make it easier
> to use for purposes like this. The ordering issue is the bigger problem
> though. I presume that the object creation history is different in the
> two databases and so pg_dump's habit of sorting by OID isn't helpful.
>
> It occurs to me that this could be solved now that we have
> dependency-driven ordering in pg_dump. The ordering algorithm is
> presently
> * Order by object type, and by OID within types;
> * Move objects as needed to honor dependencies.
> Ordering by OID should no longer be needed for correctness, because
> the second phase will take care of any dependency problems. We
> could instead make the initial sort be by object name (within types).
> This should ensure that the schema output is identical for logically
> equivalent databases, even if their history is different.
>
> (When dumping from a pre-7.3 database, we'd have to stick to the OID
> algorithm for lack of dependency info, but of course that case is
> getting less interesting as time wears on.)
>
> Comments? Anyone see a reason not to do this?
>
> regards, tom lane
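The workflow under discussion can be sketched like this (database names are hypothetical; with OIDs suppressed and name-ordered output, the diff should get much cleaner):

```shell
# Dump only the schema of each database, then compare the two text files.
# Assumes pg_dump can reach both databases with default credentials.
pg_dump --schema-only dev_db  > dev_schema.sql
pg_dump --schema-only prod_db > prod_schema.sql
diff -u dev_schema.sql prod_schema.sql
```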
Re: [HACKERS] Question on pg_dump
Sorry, I forgot to mention that I also considered bypassing createlang and using direct SQL:

... RETURNS LANGUAGE_HANDLER AS '${pglib}/plpgsql' ...

but I'm not sure if this is much better than updating pg_proc.

> I first tried to take care of the problem by removing "-L $libpath"
> from the arg list passed to createlang. This worked in a way that
> probin in pg_proc had the value "$libdir/plpgsql".
>
> Later it turned out the embedded library path was used, and install
> failed when there was no access to the build environment.
>
> Now I put the "-L $libpath" argument back in place, then I update
> pg_proc with the dynamic value. This works fine, but such an approach
> looks too intrusive. I think I also tried to set env LD_LIBRARY_PATH,
> which for some reason did not help.
>
> Is there a better way to handle this and avoid updating pg_proc?
>
> Thank you.
>
> > Michael Brusser <[EMAIL PROTECTED]> writes:
> > > I'm running Postgres v.7.3.4.
> > > In my database dump file I see this:
> >
> > > CREATE FUNCTION plpgsql_call_handler () RETURNS language_handler
> > > AS '/home/tmichael/build/relipg21/syncinc/lib.sol2/plpgsql',
> > > 'plpgsql_call_handler'
> > > LANGUAGE c;
> >
> > > The hardcoded library path may become an obstacle when loading
> > > data into a different server. Is there a way to avoid this?
> >
> > The preferred way to write it nowadays is '$libdir/plpgsql', but
> > you evidently have a legacy value embedded in your pg_proc table.
> > pg_dump will not second-guess this, and so the old full-path
> > approach will persist over dump/reloads until you do something about it.
> >
> > I'd suggest editing the dump file before you reload, or even manually
> > updating pg_proc.probin for this function entry so that future dumps
> > are right.
> >
> > regards, tom lane
Re: [HACKERS] Question on pg_dump
I first tried to take care of the problem by removing "-L $libpath" from the arg list passed to createlang. This worked in a way that probin in pg_proc had the value "$libdir/plpgsql".

Later it turned out the embedded library path was used, and install failed when there was no access to the build environment.

Now I put the "-L $libpath" argument back in place, then I update pg_proc with the dynamic value. This works fine, but such an approach looks too intrusive. I think I also tried to set env LD_LIBRARY_PATH, which for some reason did not help.

Is there a better way to handle this and avoid updating pg_proc?

Thank you.

> Michael Brusser <[EMAIL PROTECTED]> writes:
> > I'm running Postgres v.7.3.4.
> > In my database dump file I see this:
>
> > CREATE FUNCTION plpgsql_call_handler () RETURNS language_handler
> > AS '/home/tmichael/build/relipg21/syncinc/lib.sol2/plpgsql',
> > 'plpgsql_call_handler'
> > LANGUAGE c;
>
> > The hardcoded library path may become an obstacle when loading
> > data into a different server. Is there a way to avoid this?
>
> The preferred way to write it nowadays is '$libdir/plpgsql', but
> you evidently have a legacy value embedded in your pg_proc table.
> pg_dump will not second-guess this, and so the old full-path
> approach will persist over dump/reloads until you do something about it.
>
> I'd suggest editing the dump file before you reload, or even manually
> updating pg_proc.probin for this function entry so that future dumps
> are right.
>
> regards, tom lane
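Tom's "edit the dump file before you reload" suggestion can be automated with sed. A minimal sketch (the sample line is the one from the original message; the sed pattern is an assumption about how the path appears in the dump):

```shell
# A sample dump line with the hardcoded path, as in the original message:
printf "AS '/home/tmichael/build/relipg21/syncinc/lib.sol2/plpgsql',\n" > dump.sql
# Rewrite any absolute plpgsql path to the relocatable $libdir form:
sed "s|AS '/[^']*/plpgsql'|AS '\$libdir/plpgsql'|" dump.sql > dump_fixed.sql
cat dump_fixed.sql    # AS '$libdir/plpgsql',
```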
[HACKERS] Question on pg_dump
I'm running Postgres v.7.3.4. In my database dump file I see this:

CREATE FUNCTION plpgsql_call_handler () RETURNS language_handler
    AS '/home/tmichael/build/relipg21/syncinc/lib.sol2/plpgsql', 'plpgsql_call_handler'
    LANGUAGE c;

The hardcoded library path may become an obstacle when loading data into a different server. Is there a way to avoid this?

Thank you, Mike.
[HACKERS] Question on database backup
We have customers who prefer to use their own backup facilities instead of what we provide in the app (we use pg_dump). I hear speed is at least one consideration. The questions I need to answer are these:

1) Is it absolutely safe to do a file copy (cpio, or something else, whatever the robust backup app would use) on the Postgres db when it's completely shut down?

2) Same question, but the database is up and running in read-only mode. We're making sure that no updates are taking place.

If it matters - this is on Solaris, HP, Linux. We're using v.7.3.2 running on UDS and v.7.3.4 on TCP/IP. We provide no explicit settings for WAL, fsync and the like. And (yes, I know) they often install it on NFS.

Thank you much.
Mike.
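For what it's worth, my understanding of question (1) is: a file-level copy is safe only when taken on a cleanly stopped cluster, and it must include the entire data directory (pg_xlog included). A procedure sketch with hypothetical paths:

```shell
PGDATA=/var/lib/pgsql/data          # hypothetical data directory
pg_ctl -D "$PGDATA" stop -m smart   # wait for clients to disconnect first
# Archive the whole data directory, including pg_xlog and config files:
tar cf /backup/pgdata.tar -C "$(dirname "$PGDATA")" "$(basename "$PGDATA")"
pg_ctl -D "$PGDATA" start
```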
[HACKERS] Problem with pgtcl on HP
Just want to share in case you want to make a change in the code, or perhaps tell us that we didn't do the right thing.

At some point we needed to add -DTCL_THREADS and -D_REENTRANT to the makefiles, and then we ran into a problem with the pgtcl library. This seems to be HP-specific (HP 11) and happened on 7.3.2 and 7.3.4. Depending on the environment, an attempt to either load or unload this library results in a core dump (signal 11).

We tracked the problem down to this line in pgtcl.c:

putenv ("PGCLIENTENCODING=UNICODE");

Changing this to

Tcl_PutEnv ("PGCLIENTENCODING=UNICODE");

seems to be a good fix; so far we are not seeing any problems.

Mike.
[HACKERS] Corrupted db?
We got a problem ticket from the customer. Messages in the error_log indicated a problem trying to insert a duplicate value into a table. Looking at the database (v.7.3.2 on Solaris) I'm puzzled by this:

syncdb=# \d
ERROR: Cache lookup failed for relation 17075
syncdb=# \di
ERROR: Cache lookup failed for relation 17081
syncdb=# \d property_types;
ERROR: Cache lookup failed for relation 17075
syncdb=# \d public.property_types
        Table "public.property_types"
   Column    |          Type           | Modifiers
-------------+-------------------------+-----------
 id          | integer                 | not null
 name        | character varying(256)  | not null
 stamp       | date                    | not null
 description | character varying(1024) |
Indexes: pt_pk primary key btree (id),
         pt_name_uid_uk unique btree (name)

syncdb=# SET search_path = public, pg_catalog;
SET
syncdb=# \d property_types;
ERROR: Cache lookup failed for relation 17075
syncdb=# select * from pg_namespace;
  nspname   | nspowner | nspacl
------------+----------+--------
 pg_catalog |        1 | {=U}
 pg_toast   |        1 | {=}
 public     |        1 | {=UC}
 pg_temp_1  |        1 |
(4 rows)

syncdb=# select * from property_types;
 id | name | stamp | description
----+------+-------+-------------
(0 rows)

INSERT INTO property_types (id, name, description, stamp)
VALUES (nextval('pt_seq'), 'FROM_RELATIONSHIP', 'Describes the relationship of ...', date('today'));
ERROR: Cannot insert a duplicate key into unique index pt_name_uid_uk

I can only guess that something is corrupted. What could lead to that? Can this be prevented or repaired?

Thank you, Mike.
Re: [HACKERS] Slow query - index not used
> What's the actual distribution of values in these columns?
> Are you searching for values that are particularly common
> or uncommon?

This column always has a predefined set of values. Usually the app would search for one of the existing values.

--- Total records: 74654 ---

    nt_note_name    | count
--------------------+-------
 Collection         | 10068
 Component          |  1200
 CustTicket         | 15009
 Deliver            |  1201
 Download           |  1999
 GroupProv          |   464
 IP_News            |  5950
 IP_Publish_Request |  4000
 IP_Usage           |  2000
 KnowledgeBase      | 15002
 LevelInfo          |    10
 OtherParam         |  4000
 Request            |  4501
 TestMethod         |  4050
 VerTech            |  4000
 Version            |  1200

I started from scratch: took the param DEFAULT_STATISTICS_TARGET out of the config file, restarted the db, ran vacuum analyze, then got these statistics:

# explain select count (1) from note_links_aux where nt_note_name = 'KnowledgeBase';
Aggregate (cost=1982.68..1982.68 rows=1 width=0)
  -> Seq Scan on note_links_aux (cost=0.00..1970.18 rows=5002 width=0)
       Filter: (nt_note_name = 'KnowledgeBase'::character varying)

# explain select count (1) from note_links_aux where nt_note_name = 'OtherParam';
Aggregate (cost=1984.78..1984.78 rows=1 width=0)
  -> Seq Scan on note_links_aux (cost=0.00..1970.18 rows=5840 width=0)
       Filter: (nt_note_name = 'OtherParam'::character varying)

# explain select count (1) from note_links_aux where nt_note_name = 'LevelInfo';
Aggregate (cost=58.91..58.91 rows=1 width=0)
  -> Index Scan using nla_nt_note_name_fk_i on note_links_aux (cost=0.00..58.87 rows=15 width=0)
       Index Cond: (nt_note_name = 'LevelInfo'::character varying)

# explain select count (1) from note_links_aux where nt_note_name = 'NoSuchThing';
Aggregate (cost=5.83..5.83 rows=1 width=0)
  -> Index Scan using nla_nt_note_name_fk_i on note_links_aux (cost=0.00..5.83 rows=1 width=0)
       Index Cond: (nt_note_name = 'NoSuchThing'::character varying)

So the 'rows' values are incorrect. Also, looking at the queries with 'KnowledgeBase' and 'OtherParam' - does a seq scan make sense?
I mean 'rows' has a value of about 5000 records out of the total of 75000 records in the table. This ratio does not seem high enough to assume that an index scan won't be beneficial. And even if we consider the real number of records - 15000 - this is still only 20% of the total. Would an index scan still be faster?

Sorry if I put here more info than you need.

Thanks, Mike.
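A back-of-envelope way to see the planner's side of this: a seq scan reads each heap page once, sequentially, while an index scan may pay one random heap-page fetch per matching row if the matches are scattered across the table. Plugging in the numbers from the EXPLAIN output above (the "one fetch per match" worst case is an assumption about row ordering):

```shell
pages=1970     # roughly the sequential page reads behind the seq-scan cost
matches=5002   # rows the planner expects the filter to match
echo "seq scan:   ~$pages sequential page reads"
echo "index scan: up to $matches random heap page reads, plus index pages"
# With a random read costing several times a sequential one, even ~7%
# selectivity can already favor the seq scan.
```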
[HACKERS] cache control?
Is there a way to force the database to load a frequently-accessed table into cache and keep it there?

Thanks, Mike.
[HACKERS] Postgres v.7.3.4 - Performance tuning
I need to provide recommendations for an optimal value for shared_buffers. This has to do with an optional feature we have in our application; if it is turned on, a number of extra tables are involved in the operations.

I don't have experience with PG performance testing, and I guess I may need to account for the size of the tables, and perhaps the size of the indexes as well..?

Any hints, or pointers to related reading, would be very appreciated.

Mike.
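One rough starting point, assuming 7.3's 8 kB pages: estimate the working set (the hot tables plus their indexes; per-relation page counts can be read from pg_class.relpages) and express shared_buffers in pages. The 200 MB figure below is purely illustrative:

```shell
working_set_mb=200                 # illustrative working-set estimate
page_kb=8                          # PostgreSQL block size
buffers=$(( working_set_mb * 1024 / page_kb ))
echo "shared_buffers = $buffers"   # 25600 (counted in 8 kB buffers)
```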
[HACKERS] conversion_create.sql
After upgrading to v7.3.4 I noticed this error message in the database setup log:

grep: can't open /conversion_create.sql

It turned out initdb is looking for conversion_create.sql. We're not building this script and I may need to look into the build process, but for now can someone tell me what it does and why we'd need it?

I guess it has to do with locale/data conversion, but there's so much stuff in src/backend/utils/mb/conversion_procs that I wonder - do we need to know which conversions we need to support, or should we build all of them to be on the safe side?

Thanks, Mike.
Re: Defaults for GUC variables (was Re: [HACKERS] pg_ctl reports
> Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> > Shouldn't it revert to the default value?
>
> No, not unless you think the postmaster should react to comments in the
> postgresql.conf file, which is rather against my idea of a comment.
>
> However, you're not the first to get burnt by this mis-assumption,
> so maybe we should do something about it.
>
> The low-tech solution to this would be to stop listing the default
> values as commented-out entries, but just make them ordinary uncommented
> entries. That way people who think "undoing my edit will revert the
> change" would be right.
>
> Or we could try to make it actually work the way you seem to be
> expecting. The only implementation I can think of is to reset GUC
> variables to defaults just before scanning the .conf file ...

I have to say we never had any problems or misconceptions with how it currently works, but if this has to be changed I'd rather vote for the low-tech solution.

Mike.
Re: [HACKERS] round() function wrong?
But this seems to work correctly on 7.3.2 and 7.3.4:

psql -c "select round (2.5)"
Password:
 round
-------
     3
(1 row)

> I just tried that on my 7.2.4 and 7.4 beta 4 machines and I get 2 for
> round(2.5)
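The discrepancy is consistent with the C library rounding floating-point values half-to-even, while numeric rounding goes half away from zero. The C-library behavior can be seen directly with printf (the output shown is glibc's; other platforms may differ):

```shell
# %.0f formatting uses the C library's round-half-to-even rule:
printf '%.0f %.0f %.0f\n' 2.5 3.5 4.5
# prints "2 4 4" on glibc: 2.5 and 4.5 round to the even neighbor
```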
Re: [HACKERS] 7.4 compatibility question
> -----Original Message-----
> From: Bruce Momjian
...
> The big question is whether the current release notes hit the right
> balance. Do they for you?

The last time I read the notes was when we upgraded to 7.3.4. I'll pick a couple of entries from the Release Notes and the HISTORY file (which we always read) as examples.

"PostgreSQL now supports the ALTER TABLE ... DROP COLUMN functionality."
=> This is entirely sufficient. Detailed info can be found in the docs.

"Optimizer improvements"
=> This tells me nothing. I suppose this could be a minor internal code tweak, which does not affect me. On the other hand this could be a major breakthrough, so now I can run some stupid query which would take a week to complete in the previous release. How do I know?

"Fix to_ascii() buffer overruns"
=> I don't think I need any more details here.

"Work around buggy strxfrm() present in some Solaris releases"
=> If we did not suffer from this (big thanks for fixing!) I would've never guessed how it may manifest itself and affect the database, even though this alone could be a strong reason for an upgrade.

If you think this would take too much space and bloat the document, then maybe the best solution is to have a reference number:

Bug# 123: Work around ...

Then I could go to some http://postgres../bugtrack, enter this number and learn more.

Mike.
Re: [HACKERS] 7.4 compatibility question
We integrate PostgreSQL with our product, which we ship to the end user. We do read the release notes; they are important to us. They don't have to be excruciatingly long, and they can't be ridiculously short and cryptic. You have to find the golden spot in the middle. Just treat us the way you want to be treated, plus some extra allowance for ignorance.

Mike.

> > Part of the reason the release notes are read is
> > because they are _readable_
[HACKERS] Automatic conversion from Unicode
With Pg 7.3.x we initialize the database with the -E UNICODE option. I'd like to provide support for automatic conversion to a Chinese charset by putting "client_encoding = big5" into postgresql.conf. But when I try "\encoding big5" in a psql session I get this:

big5: invalid encoding name or conversion procedure not found

I looked into the table pg_conversion - it's empty. Is this something I missed with the compile options?

Thanks in advance,
Mike
Re: [HACKERS] pg_ctl reload - is it safe?
Yes, we use NFS. Many of our customers use it as well.

Mike.

> -----Original Message-----
> From: Tom Lane [mailto:[EMAIL PROTECTED]]
...
> Michael Brusser <[EMAIL PROTECTED]> writes:
> > 2003-10-10 22:37:05 ERROR: cannot read block 0 of s_noteimportlinks:
> > Interrupted system call
>
> Hmm. I found this hard to believe at first, but indeed my local man
> pages for read() and write() say they can return EINTR if interrupted
> by a signal. This may depend on the filesystem in use (are you using
> NFS?)
>
> We had probably better fix fd.c to retry on EINTR.
>
> regards, tom lane
[HACKERS] feature request
I wonder if it is feasible to enhance CREATE TRIGGER so I could say CREATE OR REPLACE TRIGGER. Thanks, Mike. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
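Until something like that exists, the closest equivalent I know of is dropping and re-creating inside one transaction (the table, trigger, and function names below are made up for illustration):

```sql
BEGIN;
DROP TRIGGER my_trigger ON my_table;
CREATE TRIGGER my_trigger
    AFTER UPDATE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE my_trigger_func();
COMMIT;
```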
[HACKERS] Question on "current_user"
Is there something special about the function current_user? In v.7.3.4 it only works when I do "select current_user;" and breaks when called with an empty argument list: syncdb=# select current_user(); ERROR: parser: parse error at or near "(" at character 20 This behavior is just the opposite of that of some other functions that take no arguments (e.g. version, timeofday). Mike. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
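If I understand the docs correctly, CURRENT_USER is one of the SQL-standard "special functions" that the grammar recognizes as a bare keyword rather than an ordinary function call, which would explain the contrast (a sketch of the three cases):

```sql
SELECT current_user;     -- works: handled as a grammar keyword, no parentheses
SELECT version();        -- ordinary function: parentheses required
SELECT current_user();   -- parse error: the grammar does not accept ()
```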
[HACKERS] pg_ctl reload - is it safe?
Env: Postgres 7.3.4 on Unix I have a script to create a new database user and update pg_hba.conf. I was wondering whether I need to restart the server, or can simply run pg_ctl reload. The question is what happens to all active clients when all backends get the signal. I ran a client process which was very busy querying and updating the database. At the same time I kept executing "pg_ctl reload". Soon after, the client reported a database error. Here's the excerpt from the database log: ... ... 2003-10-10 22:33:12 LOG: Received SIGHUP, reloading configuration files <25 successful SIGHUPs, about 2 seconds apart from each other> ... ... 2003-10-10 22:37:05 ERROR: cannot read block 0 of s_noteimportlinks: Interrupted system call ... From the client log I see that the problem occurred while trying to SELECT nextval from the sequence s_noteimportlinks. We are going to pass this script to the customers and we have to know what to recommend: reload or shut down/restart. I hope they won't do reload 25 times... but they may have 25 or more active client processes at any time. Thanks in advance, Mike. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Using backslash in query
I'm afraid I'm confused about something very simple... but anyway, I need to run a query on a varchar field containing a backslash. My first attempt looked like this: SELECT smth. FROM tbl WHERE situation LIKE '%\\%'; This did not return any rows. I looked up a reference, which confirmed that "... double-backslash is required to represent a literal backslash." http://www.postgresql.org/docs/aw_pgsql_book/node139.html#copy_backslash_handling But when I doubled the number of backslashes: SELECT smth. FROM tbl WHERE situation LIKE '%\\\\%'; - it actually worked fine. Same thing happens with a regex: situation ~ '\\\\'; Could someone shed some light on this, please? Mike. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] FATAL message on Linux
We're testing an application with Postgres 7.3.2 on the backend. The entire test involves running a series of individual tests. A few times throughout this procedure the database server is shut down and started up again. Normally, this is what I see in the database log. 2003-07-29 10:14:33 [14513] LOG: smart shutdown request 2003-07-29 10:14:33 [14594] LOG: shutting down 2003-07-29 10:14:35 [14594] LOG: database system is shut down 2003-07-29 10:14:37 [14676] LOG: database system was shut down at 2003-07-29 10:14:35 EDT 2003-07-29 10:14:37 [14676] LOG: checkpoint record is at 0/9383DC 2003-07-29 10:14:37 [14676] LOG: redo record is at 0/9383DC; undo record is at 0/0; shutdown TRUE 2003-07-29 10:14:37 [14676] LOG: next transaction id: 1899; next oid: 17344 2003-07-29 10:14:37 [14676] LOG: database system is ready Now, (on Linux only) we sometimes run into this: 2003-07-29 11:31:15 [26665] LOG: smart shutdown request 2003-07-29 11:31:15 [26728] LOG: shutting down 2003-07-29 11:31:17 [26728] LOG: database system is shut down 2003-07-29 11:31:19 [26767] LOG: database system was shut down at 2003-07-29 11:31:17 EDT 2003-07-29 11:31:19 [26767] LOG: checkpoint record is at 0/C01520 2003-07-29 11:31:19 [26767] LOG: redo record is at 0/C01520; undo record is at 0/0; shutdown TRUE 2003-07-29 11:31:19 [26767] LOG: next transaction id: 15648; next oid: 18523 2003-07-29 11:31:24 [26769] FATAL: The database system is starting up 2003-07-29 11:31:26 [26767] LOG: database system is ready The database was running with LOG_MIN_ERROR_STATEMENT = DEBUG2 in postgresql.conf, but there was no additional information in the database log. I am wondering if on Linux it takes more time to shut down the database server, and we get this because we try to start the server while some of the postgres processes are still around? Any ideas are appreciated. Thanks, Mike. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
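If the race theory is right, the only mitigation I can think of is making the test harness wait for each transition to finish instead of sleeping blindly; as far as I can tell pg_ctl's -w flag does exactly that (a sketch under that assumption, not our actual harness):

```shell
# Wait for the old postmaster to exit completely, then wait for the new
# one to finish recovery before the tests start connecting:
pg_ctl stop  -D "$PGDATA" -m smart -w
pg_ctl start -D "$PGDATA" -w -l server.log
```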
[HACKERS] SHMSEG definition
Can somebody clarify for me the requirement for the SHMSEG kernel parameter, please? The Postgres doc says: SHMSEG - Maximum number of shared memory segments per process; only 1 segment is needed, but the default is much higher. I was trying to calculate the kernel parameters for running a number of Postgres servers on Solaris, and something did not work. With Postgres running, I executed ipcs. It shows that while Postgres is taking only one segment, the number of attachments is 7. This, I think, refers to the number of processes spawned by the postmaster at that time. Solaris docs describe SHMSEG as "Limit on the number of shared memory segments that any one process can create" Now run this: sysdef | grep SHMSEG -> 50 max attached shm segments per process (SHMSEG) I suppose there's a distinction between "can create" and "can attach to". It looks like the manual and the output of sysdef are not very consistent. I am worried that maybe (on Solaris) SHMSEG should be set with respect to the number of connections, rather than the number of servers. Thank you, Mike. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Hitting the nfile limit
> > I wonder how Postgres handles this situation. > > (Or power outage, or any hard system fault, at this point) > > Theoretically we should be able to recover from this without loss of > committed data (assuming you were running with fsync on). Is your QA > person certain that the record in question had been written by a > successfully-committed transaction? > He's saying that his test script did not write any new records, only updated existing ones. My uneducated guess on how update may work: - create a clone of the record to be updated and update some field(s) with the given values. - write the new record to the database and delete the original. If this is the case, could it be that somewhere along these lines postgres ran into a problem and lost the record completely? But all this should be done in a transaction, so... I don't know... As for fsync, we currently go with whatever the default value is; same for wal_sync_method. Does anyone have an estimate of the performance penalty related to turning fsync on? Michael. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] Hitting the nfile limit
We ran into a problem while load-testing a 7.3.2 server. From the database log: FATAL: cannot open /home//postgresql/PG_VERSION: File table overflow The QA engineer who ran the test claims that after the server was restarted one record in the database was missing. We are not sure what exactly happened. He was running about 10 servers on HP-11, hitting them with AstraLoad. Most requests would try to update some record in the database; most ran with the Serializable isolation level. Apparently we managed to run out of open file descriptors on the host machine. I wonder how Postgres handles this situation. (Or a power outage, or any hard system fault, at this point.) Is it possible that we really lost a record because of that? Should we consider changing the default WAL_SYNC_METHOD? Thanks in advance, Michael. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] Core dump on HP
Hi, folks; We're running Postgres 7.3.2 and we have a core dump on HP-11. This does not seem reproducible on Solaris or Linux. Working with the debugger we get this stack: #0 0xc0185a20 in mallinfo+0x2144 () from /usr/lib/libc.2 (gdb) where #0 0xc0185a20 in mallinfo+0x2144 () from /usr/lib/libc.2 #1 0xc01821c0 in __thread_callback_np+0x894 () from /usr/lib/libc.2 #2 0xc0187fa4 in realloc+0x1c4 () from /usr/lib/libc.2 #3 0xc018f420 in putenv+0xe8 () from /usr/lib/libc.2 It looks like the problem occurs when we release the TCL interpreter; also, the problem can be avoided if this code in pgtcl.c is commented out: if (tclversion >= 8.1) putenv("PGCLIENTENCODING=UNICODE"); I found another report of a putenv problem: http://archives.postgresql.org/pgsql-bugs/1999-11/msg7.php Are we dealing with essentially the same root cause? Is there a recommended solution? Thank you, Michael. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Practical sets of SQLSTATE values?
>> Anyone have lists of implementation-defined SQLSTATEs for >> the big commercial DBs? This points to the Oracle docs. http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a87540/ch2.htm Table 2-2 SQLSTATE Status Codes ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] vacuum analyze corrupts database
> Well, first thing I'd ask is whether the other installation is using > the same locale settings. Oh, yes, I am very sure of that. Mike ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] vacuum analyze corrupts database
> Hmm. Are you running the database in a non-C locale? > (pg_controldata would tell you.) --- Here's the output of pg_controldata: pg_control version number: 72 Catalog version number: 200211021 Database cluster state: in production pg_control last modified: Sun May 25 18:38:06 2003 Current log file ID: 0 Next log file segment: 2 Latest checkpoint location: 0/15EF7A8 Prior checkpoint location: 0/15ED2D8 Latest checkpoint's REDO location: 0/15EF7A8 Latest checkpoint's UNDO location: 0/0 Latest checkpoint's StartUpID: 47 Latest checkpoint's NextXID: 3563 Latest checkpoint's NextOID: 118086 Time of latest checkpoint: Sun May 25 18:38:02 2003 Database block size: 8192 Blocks per segment of large relation: 131072 Maximum length of identifiers: 64 Maximum number of function arguments: 32 Date/time type storage: Floating point Maximum length of locale name: 128 LC_COLLATE: en_US.ISO8859-1 LC_CTYPE: en_US.ISO8859-1 - locale settings on the host: tomkins% locale LANG= LC_CTYPE="C" LC_NUMERIC="C" LC_TIME="C" LC_COLLATE="C" LC_MONETARY="C" LC_MESSAGES="C" LC_ALL= - > I am wondering if strxfrm() on your platform sometimes writes more bytes > than it is supposed to. I have seen vsnprintf() overrun its output > buffer on some flavors of Solaris (according to FAQ_Solaris, the 64-bit > libc in Solaris 7 had such a problem). Could there be a similar bug in > their strxfrm? We're on Solaris 8. I'll try to find information on strxfrm bugs, but do you rule out any problems in the Postgres code? Is there a good explanation for why the same table loaded into another Postgres installation on the same host can be analyzed without any problems? Also, in my database I can drop/create the database, load the table, and reproduce the error. Not sure what to make of this. Mike. 
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Tom Lane Sent: Sunday, May 25, 2003 9:03 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: [HACKERS] vacuum analyze corrupts database Michael Brusser <[EMAIL PROTECTED]> writes: > (dbx) where 30 > =>[1] pfree(0x489420, 0xffbee890, 0x489420, 0xffbee880, 0x489628, > 0xffbee888), at 0x2535e4 > [2] convert_to_scalar(0x489078, 0x19, 0xffbee890, 0x489008, 0x488fc0, > 0x413), at 0x1fc6b4 > [3] scalarineqsel(0x484608, 0x42a, 0x0, 0x488a88, 0x489078, 0x19), at > 0x1f94e4 Hmm. Are you running the database in a non-C locale? (pg_controldata would tell you.) If so, this pfree is trying to pfree one of three strings that were filled with strxfrm(). I am wondering if strxfrm() on your platform sometimes writes more bytes than it is supposed to. I have seen vsnprintf() overrun its output buffer on some flavors of Solaris (according to FAQ_Solaris, the 64-bit libc in Solaris 7 had such a problem). Could there be a similar bug in their strxfrm? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Do we always need the socket file?
We're trying to avoid creating any unnecessary files, especially outside of the product installation tree. Look at this as a policy. Experience shows that sooner or later some of your customers will ask you: what is this /tmp/.s.PGSQL.xxx file? What do I need it for? Also, some admins are known to periodically clean out /tmp, /var/run, etc. Mike. > -Original Message- > From: Bruce Momjian [mailto:[EMAIL PROTECTED]] > Sent: Thursday, February 13, 2003 8:38 PM > To: [EMAIL PROTECTED] > Cc: [EMAIL PROTECTED] > Subject: Re: [HACKERS] Do we always need the socket file? > > > Michael Brusser wrote: > > I have Postgres 7.2.1 configured to listen on TCP/IP port. > > When the server starts up it still creates the socket file in /tmp. > > Removing this file manually does not seem to cause any problem for the > > application. > > > > Is there a way to prevent postmaster from creating this file? > > Is this really safe to remove the socket file, or would it create > > some problem that I won't necessarily see? > > I guess the big question is why you don't want the file created? If you > have 'local' disabled in pg_hba.conf, it doesn't allow connections. > > -- > Bruce Momjian| http://candle.pha.pa.us > [EMAIL PROTECTED] | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup.| Newtown Square, > Pennsylvania 19073 > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
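If suppressing the file turns out to be impossible, the fallback we're considering is at least moving it out of /tmp with the unix_socket_directory setting (assuming that GUC exists in our release; the directory name below is made up):

```
# postgresql.conf
unix_socket_directory = '/opt/ourproduct/var'
```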
[HACKERS] Do we always need the socket file?
I have Postgres 7.2.1 configured to listen on a TCP/IP port. When the server starts up it still creates the socket file in /tmp. Removing this file manually does not seem to cause any problem for the application. Is there a way to prevent the postmaster from creating this file? Is it really safe to remove the socket file, or would it create some problem that I won't necessarily see? Thanks, Mike. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html