Re: [GENERAL] PostgreSQL Performance issue
I am curious to know how much of your delay is due to PostgreSQL and how much to your Java batch program. If you comment out the call to the database function, so that you are reading your input file but not doing anything with the data, how long does your batch program take to run? RobR -- The Java program hardly takes a minute to process all the flat files, but when inserting the records into the db, the entire process takes more than 4 hours. Many thanks
Re: [GENERAL] Is the wiki down?
Chris Velevitch wrote: I was just checking http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy and I get a connection time out. Is the site down? I'm not sure if it's down or not, but I haven't been able to reach it from here in some time either (home Verizon FIOS connection) so it's not just you. It looks like a routing issue for me though, might not actually be the site itself having the problem. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgresql on EC2/EBS in production?
Bryan Murphy wrote: We had to fail over to one of our spares twice in the last 1.5 years. Not fun. Both times were due to instance failure. What do you mean by an instance failure here? The actual EC2 image getting corrupted so that it won't boot anymore, or just the instance going down badly? -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [GENERAL] VACUUM FULL vs backup/restore
David Wall wrote: It seems my options are 1) VACUUM FULL with a reindex_db; or 2) backup, then restore. Is there anything one does better than the other? Our impression is that the backup and restore will run faster (perhaps 2 hours), whereas we have a currently running VACUUM FULL that's been running for 4.5 hours already. VACUUM FULL can easily run for 4.5 days. See http://wiki.postgresql.org/wiki/VACUUM_FULL for a discussion of the issues here and comments about what you should do instead (probably CLUSTER if you're running 8.3 or later). The wiki seems to be having issues right now so you might need to grab it from a cache somewhere else instead, i.e. http://webcache.googleusercontent.com/search?q=cache:fCJXjixyulMJ:wiki.postgresql.org/wiki/VACUUM_FULL&hl=en&strip=1 If you're using 8.2 or earlier, dump and reload is the way to go for you. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [GENERAL] Errors starting postgres after initializing new database
Chris Barnes wrote: 2010-04-27 16:19:17 EDTFATAL: requested shared memory size overflows size_t This is what happens if you try to use a postgresql.conf with parameters set for a 64-bit installation on a 32-bit operating system. You're trying to do this: shared_buffers = 7680MB But the 32-bit CentOS 5.2 you're using doesn't allow you to allocate more than 2GB of RAM for that. If there really is around 32GB of RAM in there, like I'm assuming the 5.4 system has for pgtune to make this suggestion, you certainly will need to install a 64-bit OS on it to take advantage of that usefully. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
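The overflow is easy to confirm with back-of-envelope arithmetic. This is only a sketch of the calculation: it counts the shared_buffers allocation alone and ignores the additional shared memory PostgreSQL requests on top of it.

```shell
# Why 7680MB of shared_buffers cannot work on a 32-bit build: the total
# shared memory request must fit in size_t, which is 32 bits there.
requested=$((7680 * 1024 * 1024))     # shared_buffers alone, in bytes
max32=$(( (1 << 32) - 1 ))            # largest value a 32-bit size_t can hold
echo "requested: $requested bytes"
echo "32-bit size_t max: $max32 bytes"
if [ "$requested" -gt "$max32" ]; then
  echo "overflows 32-bit size_t"
fi
```

The request is roughly 8GB against a ceiling of about 4GB, and in practice a 32-bit process is limited to around 2GB anyway, which is why Greg suggests the 64-bit OS.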
[GENERAL] Is the wiki down?
I was just checking http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy and I get a connection time out. Is the site down? Chris -- Chris Velevitch Manager - Adobe Platform Users Group, Sydney m: 0415 469 095 www.apugs.org.au Adobe Platform Users Group, Sydney May 2010: TBD Date: 31st May 6pm for 6:30 start Details and RSVP coming soon
Re: [GENERAL] Storing many big files in database- should I do it?
On Tuesday 27 April 2010 5:45:43 pm Anthony wrote: > On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain < > > cedric.villemain.deb...@gmail.com> wrote: > > store your files in a filesystem, and keep the path to the file (plus > > metadata, acl, etc...) in database. > > What type of filesystem is good for this? A filesystem with support for > storing tens of thousands of files in a single directory, or should one > play the 41/56/34/41563489.ext game? > > Are there any open source systems which handle keeping a filesystem and > database in sync for this purpose, or is it a wheel that keeps getting > reinvented? > > I know "store your files in a filesystem" is the best long-term solution. > But it's just so much easier to just throw everything in the database. In the for what it is worth department check out this Wiki: http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems -- Adrian Klaver adrian.kla...@gmail.com
Re: [GENERAL] Postgresql on EC2/EBS in production?
On Tue, Apr 27, 2010 at 11:31 AM, Greg Smith wrote: > Nikhil G. Daddikar wrote: >> I was wondering if any of you are using (or tried to use) PG+EC2/EBS on a >> production system. Are any best-practices. Googling didn't help much. A few >> articles I came across scared me a bit. > > There have been a couple of reports of happy users: Been running a very busy 170+ gb OLTP postgres database on Amazon for 1.5 years now. I can't say I'm "happy" but I've made it work and still prefer it to running downtown to a colo at 3am when something goes wrong. > There are two main things to be wary of: > > 1) Physical I/O is not very good, thus how that first system used a RAID0. Let's be clear here, physical I/O is at times *terrible*. :) If you have a larger database, the EBS volumes are going to become a real bottleneck. Our primary database needs 8 EBS volumes in a RAID drive and we use slony to offload requests to two slave machines and it still can't really keep up. There's no way we could run this database on a single EBS volume. I also recommend you use RAID10, not RAID0. EBS volumes fail. More frequently, single volumes will experience *very long* periods of poor performance. The more drives you have in your raid, the more you'll smooth things out. However, there have been occasions where we've had to swap out a poor performing volume for a new one and rebuild the RAID to get things back up to speed. You can't do that with a RAID0 array. > 2) Reliability of EBS is terrible by database standards; I commented on this > a bit already at > http://archives.postgresql.org/pgsql-general/2009-06/msg00762.php The end > result is that you must be careful about how you back your data up, with a > continuous streaming backup via WAL shipping being the recommended approach. 
> I wouldn't deploy into this environment in a situation where losing a > minute or two of transactions in the case of an EC2/EBS failure would be > unacceptable, because that's something that's a bit more likely to happen > here than on most database hardware. Agreed. We have three WAL-shipped spares. One streams our WAL files to a single EBS volume which we use for worst case scenario snapshot backups. The other two are exact replicas of our primary database (one in the west coast data center, and the other in an east coast data center) which we have for failover. If we ever have to worst-case-scenario restore from one of our EBS snapshots, we're down for six hours because we'll have to stream the data from our EBS snapshot back over to an EBS raid array. 170gb at 20mb/sec (if you're lucky) takes a LONG time. It takes 30 to 60 minutes for one of those snapshots to become "usable" once we create a drive from it, and then we still have to bring up the database and wait an agonizingly long time for hot data to stream back into memory. We had to fail over to one of our spares twice in the last 1.5 years. Not fun. Both times were due to instance failure. It's possible to run a larger database on EC2, but it takes a lot of work, careful planning and a thick skin. Bryan
Re: [GENERAL] Storing many big files in database- should I do it?
On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain < cedric.villemain.deb...@gmail.com> wrote: > store your files in a filesystem, and keep the path to the file (plus > metadata, acl, etc...) in database. > What type of filesystem is good for this? A filesystem with support for storing tens of thousands of files in a single directory, or should one play the 41/56/34/41563489.ext game? Are there any open source systems which handle keeping a filesystem and database in sync for this purpose, or is it a wheel that keeps getting reinvented? I know "store your files in a filesystem" is the best long-term solution. But it's just so much easier to just throw everything in the database.
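The "41/56/34/41563489.ext game" mentioned above can be sketched with a tiny helper that fans files out across nested directories so no single directory holds tens of thousands of entries. The function name and the .ext suffix are made up for illustration:

```shell
# Hypothetical helper: derive a fan-out directory path from a numeric file id.
# The first three 2-character chunks of the id become the directory levels.
fanout_path() {
  local id="$1"
  printf '%s/%s/%s/%s.ext\n' "${id:0:2}" "${id:2:2}" "${id:4:2}" "$id"
}

fanout_path 41563489   # -> 41/56/34/41563489.ext
```

With two hex or decimal digits per level, each directory holds at most 100 (or 256) subdirectories, which keeps directory lookups fast even on filesystems that degrade with huge directories.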
Re: [GENERAL] invalid abstime to timestamp error only on specific date range
On Tue, Apr 27, 2010 at 11:02 AM, Tom Lane wrote: > zhong ming wu writes: > > The closest thing I can find to that in the code is > > cannot convert abstime "invalid" to timestamp Yes this is the message. There were invalid values in that table at first. Then I deleted such entries and tried the same query and got the same message. The query is four table joins but in order to pin-point the problem I pruned it down to as simple as what I posted originally. In any case I reloaded the data and on this second pass I stopped receiving the error. The reloading scripts were different and I don't know what that did to the reloaded data. Thanks
Re: [GENERAL] VACUUM FULL vs backup/restore
On Tue, Apr 27, 2010 at 2:50 PM, David Wall wrote: > We are copying a production database and then paring it down dramatically > (perhaps removing 99% of records, most of which are large objects). > > It seems my options are 1) VACUUM FULL with a reindex_db; or 2) backup, then > restore. > > Is there anything one does better than the other? Our impression is that > the backup and restore will run faster (perhaps 2 hours), whereas we have a > currently running VACUUM FULL that's been running for 4.5 hours already. Vacuum Full was invented back in the days when drive space was not as cheap as it is today. It can shrink a table without having to have enough room free on the drive for a complete copy to be made. In pgsql 9.0 that behaviour is changing, making vacuum full much faster than it once was. So, the advantage of vacuum full is that it can (for now) operate in a space restricted environment if needed. Given how cheap drives are nowadays, the preferred method is either to cluster in place each table (which needs 2x file size drive space) or backup / restore the db.
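For reference, the backup/restore route discussed above looks roughly like the following. This is only a sketch: the database name "mydb" and the default connection settings are assumptions, and the script prints the commands rather than executing them unless DRYRUN=0 is set, since dropping a database is destructive.

```shell
# Dump-and-reload sketch: produces a minimally sized database with fresh
# indexes, which is what a VACUUM FULL + REINDEX is trying to achieve.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run pg_dump -Fc -f mydb.dump mydb   # compressed custom-format dump
run dropdb mydb
run createdb mydb
run pg_restore -d mydb mydb.dump    # reloads tables, rebuilds indexes fresh
```

On 8.3+, CLUSTER per table (as Greg suggests) achieves a similar compaction in place, at the cost of roughly 2x the table's size in free disk space while it runs.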
Re: [GENERAL] VACUUM FULL vs backup/restore
On Tue, Apr 27, 2010 at 4:50 PM, David Wall wrote: > It seems my options are 1) VACUUM FULL with a reindex_db; or 2) backup, then > restore. > I'd go with the backup+restore. The other option is to do a trivial ALTER to one of your fields which causes the table to be rewritten. Basically, anything is faster than a VACUUM FULL in my experience.
Re: [GENERAL] Server Panic when trying to stop point in time recovery
On Tue, Apr 27, 2010 at 4:02 PM, Joshua D. Drake wrote: >> > Well the problem is that the files are missing. How are you copying > them? > > Joshua D. Drake > I am using rsync to copy the files from the master into a directory /opt/wal/archwalremote on the standby. My restore script copies the requested file from /opt/wal/archwalremote to wherever it is asked to (pg_xlog/RECOVERYXLOG). Here is the end of the log from my restore script. First, 000100A1 is requested and copied. Then 000100A2 is requested, but it has not been synced yet so the script is sleeping and polling for that file to exist. Before the file exists, the trigger is detected and the restore script starts returning 1 to all requests.

Tue Apr 27 15:06:00 CDT 2010 > cp /opt/wal/archwalremote/000100A1 pg_xlog/RECOVERYXLOG
Tue Apr 27 15:06:00 CDT 2010 > cp returned 0, script returning 0
Tue Apr 27 15:06:00 CDT 2010 Starting up... From /opt/wal/archwalremote/000100A2 TO pg_xlog/RECOVERYXLOG
Tue Apr 27 15:06:00 CDT 2010 /opt/wal/archwalremote/000100A2 not found
Tue Apr 27 15:06:00 CDT 2010 Sleeping
Tue Apr 27 15:09:11 CDT 2010 /opt/wal/archwalremote/000100A2 not found
Tue Apr 27 15:09:11 CDT 2010 Sleeping
Tue Apr 27 15:09:21 CDT 2010 Triggered
Tue Apr 27 15:09:21 CDT 2010 Starting up... From /opt/wal/archwalremote/000100A1 TO pg_xlog/RECOVERYXLOG
Tue Apr 27 15:09:21 CDT 2010 Triggered

000100A1 is found and restored (see original log file). I guess the question is: why, after the server should be out of recovery mode, is it looking for 000100A1 in the pg_xlog directory?
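One way to avoid the failure mode in that log is to check for the requested file before checking for the trigger, so the script keeps serving WAL segments that already exist in the archive and only refuses requests for files that are genuinely missing. Below is a minimal, hypothetical sketch of such a restore_command helper; the directory, trigger path, and function name are illustrative, not taken from the scripts in this thread.

```shell
# Sketch of a polling restore script: restore_wal <archived-file-name> <target>
# Copies the requested WAL file when present, polls while it is missing,
# and ends recovery (nonzero exit) only when the trigger exists AND the
# requested file is unavailable.
ARCHIVE_DIR=${ARCHIVE_DIR:-/opt/wal/archwalremote}
TRIGGER=${TRIGGER:-/tmp/pg_failover_trigger}

restore_wal() {
  local src="$ARCHIVE_DIR/$1" dst="$2"
  while :; do
    if [ -f "$src" ]; then
      cp "$src" "$dst"          # serve the file even after the trigger fires,
      return $?                 # so recovery can replay everything it has
    fi
    if [ -f "$TRIGGER" ]; then
      echo "triggered, $1 unavailable"
      return 1                  # tell the server recovery is over
    fi
    sleep 5                     # file not shipped yet; poll again
  done
}
```

Returning 1 for a file the server later re-requests (as happened with 000100A1 above) is exactly what this ordering prevents.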
Re: [GENERAL] Server Panic when trying to stop point in time recovery
On Tue, 2010-04-27 at 15:39 -0500, Chris Copeland wrote: > Any help is greatly appreciated. Please let me know if I can provide > any more information that will be helpful. Well the problem is that the files are missing. How are you copying them? Joshua D. Drake > > -Chris > -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering
[GENERAL] VACUUM FULL vs backup/restore
We are copying a production database and then paring it down dramatically (perhaps removing 99% of records, most of which are large objects). It seems my options are 1) VACUUM FULL with a reindex_db; or 2) backup, then restore. Is there anything one does better than the other? Our impression is that the backup and restore will run faster (perhaps 2 hours), whereas we have a currently running VACUUM FULL that's been running for 4.5 hours already. Anybody have any experience on this? Would a backup/restore essentially create a minimally sized database with all fresh indexes? Thanks, David
Re: [GENERAL] Open Source BI Tool
On Tue, 2010-04-27 at 13:40 -0700, John R Pierce wrote: > akp geek wrote: > > Business Intelligence > > isn't that an oxymoron? Depends, you want your paycheck? > > > -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering
[GENERAL] Server Panic when trying to stop point in time recovery
I am running 8.2.4 on Solaris 10 x86. I have set up WAL file shipping from a primary server to a warm standby. I am able to start the standby server using a backup from the primary and get it to apply the log files as they arrive. My problem comes when I want to trigger the standby server to come out of recovery mode. Here is the log file starting from when the server comes up from the backup. Just prior to the error I have "triggered" the server to exit recovery mode by making my restore_command return 1 instead of 0.

2010-04-27 15:00:58 CDT :LOG: database system was interrupted at 2010-04-27 10:10:08 CDT
2010-04-27 15:00:58 CDT :LOG: starting archive recovery
2010-04-27 15:00:58 CDT :LOG: restore_command = "/opt/data/restore.sh /opt/wal/archwalremote/%f %p"
2010-04-27 15:00:58 CDT :LOG: restored log file "0001009F00BA.0278.backup" from archive
2010-04-27 15:00:59 CDT :LOG: restored log file "0001009F00BA" from archive
2010-04-27 15:00:59 CDT :LOG: checkpoint record is at 9F/BA000278
2010-04-27 15:00:59 CDT :LOG: redo record is at 9F/BA000278; undo record is at 0/0; shutdown FALSE
2010-04-27 15:00:59 CDT :LOG: next transaction ID: 0/325985316; next OID: 823081
2010-04-27 15:00:59 CDT :LOG: next MultiXactId: 2127; next MultiXactOffset: 4278
2010-04-27 15:00:59 CDT :LOG: automatic recovery in progress
2010-04-27 15:00:59 CDT :LOG: redo starts at 9F/BA0002C0
2010-04-27 15:01:00 CDT :LOG: restored log file "0001009F00BB" from archive
2010-04-27 15:01:02 CDT :LOG: restored log file "0001009F00BC" from archive
2010-04-27 15:03:19 CDT :LOG: restored log file "0001009F00FE" from archive
2010-04-27 15:03:20 CDT :LOG: restored log file "000100A0" from archive
2010-04-27 15:06:00 CDT :LOG: restored log file "000100A1" from archive
2010-04-27 15:09:21 CDT :LOG: could not open file "pg_xlog/000100A2" (log file 160, segment 2): No such file or directory
2010-04-27 15:09:21 CDT :LOG: redo done at A0/168
2010-04-27 15:09:21 CDT :PANIC: could not open file "pg_xlog/000100A1" (log file 160, segment 1): No such file or directory
2010-04-27 15:09:26 CDT :LOG: startup process (PID 22490) was terminated by signal 6
2010-04-27 15:09:26 CDT :LOG: aborting startup due to startup process failure
2010-04-27 15:09:26 CDT :LOG: logger shutting down

At this point the server will now enter a restart loop and constantly generate log files like this:

2010-04-27 15:09:26 CDT :LOG: database system was interrupted while in recovery at log time 2010-04-27 15:05:08 CDT
2010-04-27 15:09:26 CDT :HINT: If this has occurred more than once some data may be corrupted and you may need to choose an earlier recovery target.
2010-04-27 15:09:26 CDT :LOG: starting archive recovery
2010-04-27 15:09:26 CDT :LOG: restore_command = "/opt/data/restore.sh /opt/wal/archwalremote/%f %p"
2010-04-27 15:09:26 CDT :LOG: could not open file "pg_xlog/000100A1" (log file 160, segment 1): No such file or directory
2010-04-27 15:09:26 CDT :LOG: invalid primary checkpoint record
2010-04-27 15:09:26 CDT :LOG: could not open file "pg_xlog/0001009F00BA" (log file 159, segment 186): No such file or directory
2010-04-27 15:09:26 CDT :LOG: invalid secondary checkpoint record
2010-04-27 15:09:26 CDT :PANIC: could not locate a valid checkpoint record
2010-04-27 15:09:30 CDT :LOG: startup process (PID 24191) was terminated by signal 6
2010-04-27 15:09:30 CDT :LOG: aborting startup due to startup process failure
2010-04-27 15:09:30 CDT :LOG: logger shutting down

Any help is greatly appreciated. Please let me know if I can provide any more information that will be helpful. -Chris
Re: [GENERAL] Open Source BI Tool
akp geek wrote: Business Intelligence isn't that an oxymoron?
Re: [GENERAL] Errors starting postgres after initializing new database
On 27 April 2010 21:30, Chris Barnes wrote:
> I have just initialized a database with no errors, used the postgresql.conf file from another system running 8.4.2.
> Attempted to start and got the fatal error below.
> I then ran pgtune and got the same error.
> I am not sure what the problem is? Is there more detailed logging than below, or can it be turned on?
> CentOS release 5.4 (Final) (installed and working)
> CentOS release 5.2 (Final) (failing)
> after pgtune
> olap.rownum_name = 'default'
> maintenance_work_mem = 1GB # pg_generate_conf wizard 2010-04-27
> effective_cache_size = 22GB # pg_generate_conf wizard 2010-04-27
> work_mem = 96MB # pg_generate_conf wizard 2010-04-27
> shared_buffers = 7680MB # pg_generate_conf wizard 2010-04-27
> [r...@rc-rec-five pgtune-0.9.0]# /etc/init.d/postgresql stop
> Stopping postgresql service: [FAILED]
> [r...@rc-rec-five pgtune-0.9.0]# /etc/init.d/postgresql start
> Starting postgresql service: [FAILED]
> [r...@rc-rec-five pgtune-0.9.0]# cat pgstartup.log
> cat: pgstartup.log: No such file or directory
> [r...@rc-rec-five pgtune-0.9.0]# cat /data/pgsql/data/pgstartup.log
> 2010-04-27 16:19:17 EDTFATAL: requested shared memory size overflows size_t
> 2010-04-27 16:22:27 EDTFATAL: requested shared memory size overflows size_t
> /var/log/messages/
> Apr 27 13:57:56 rc-rec5 ntpd[2990]: synchronized to 206.248.171.198, stratum 1
> Apr 27 16:13:45 rc-rec5 postgres[5717]: [1-1] 2010-04-27 16:13:45 EDTFATAL: requested shared memory size overflows size_t
> Apr 27 16:19:17 rc-rec5 postgres[7736]: [1-1] 2010-04-27 16:19:17 EDTFATAL: requested shared memory size overflows size_t
> Apr 27 16:22:27 rc-rec5 postgres[9648]: [1-1] 2010-04-27 16:22:27 EDTFATAL: requested shared memory size overflows size_t

This might sound like a strange question, but are the architectures of the servers different? Specifically, is the failing server 32-bit and the working server 64-bit?
And what architecture is PostgreSQL built against on the failing server? Thom
[GENERAL] Errors starting postgres after initializing new database
I have just initialized a database with no errors, used the postgresql.conf file from another system running 8.4.2. Attempted to start and got the fatal error below. I then ran pgtune and got the same error. I am not sure what the problem is? Is there more detailed logging than below, or can it be turned on?

CentOS release 5.4 (Final) (installed and working)
CentOS release 5.2 (Final) (failing)

after pgtune
olap.rownum_name = 'default'
maintenance_work_mem = 1GB # pg_generate_conf wizard 2010-04-27
effective_cache_size = 22GB # pg_generate_conf wizard 2010-04-27
work_mem = 96MB # pg_generate_conf wizard 2010-04-27
shared_buffers = 7680MB # pg_generate_conf wizard 2010-04-27

[r...@rc-rec-five pgtune-0.9.0]# /etc/init.d/postgresql stop
Stopping postgresql service: [FAILED]
[r...@rc-rec-five pgtune-0.9.0]# /etc/init.d/postgresql start
Starting postgresql service: [FAILED]
[r...@rc-rec-five pgtune-0.9.0]# cat pgstartup.log
cat: pgstartup.log: No such file or directory
[r...@rc-rec-five pgtune-0.9.0]# cat /data/pgsql/data/pgstartup.log
2010-04-27 16:19:17 EDTFATAL: requested shared memory size overflows size_t
2010-04-27 16:22:27 EDTFATAL: requested shared memory size overflows size_t

/var/log/messages/
Apr 27 13:57:56 rc-rec5 ntpd[2990]: synchronized to 206.248.171.198, stratum 1
Apr 27 16:13:45 rc-rec5 postgres[5717]: [1-1] 2010-04-27 16:13:45 EDTFATAL: requested shared memory size overflows size_t
Apr 27 16:19:17 rc-rec5 postgres[7736]: [1-1] 2010-04-27 16:19:17 EDTFATAL: requested shared memory size overflows size_t
Apr 27 16:22:27 rc-rec5 postgres[9648]: [1-1] 2010-04-27 16:22:27 EDTFATAL: requested shared memory size overflows size_t

Thanks, Chris
Re: [GENERAL] Open Source BI Tool
Thank you all . regards On Tue, Apr 27, 2010 at 2:42 PM, Adrian von Bidder wrote: > On Tuesday 27 April 2010 19.12:31 Steve Atkins wrote: > > [...] > > BIRT > [...] > > > FWIW, my cow-orkers are quite happy with BIRT (especially with the designer > environment in Eclipse) after having used Crystal and Actuate previously. > I've no idea how BIRT compares with the other OSS tools, but compared to > those two using BIRT was quite a step up in terms of ease of use / > available > documentation & debugging tools. > > cheers > -- vbi > > -- > Most scientists think that the not-fossil theory is a red herring; > indeed, they think that oil is lots of herrings (and other things) > compacted over time into sticky black mud. >-- Prospect Magazine, March 2003, p. 6 >
Re: [GENERAL] Open Source BI Tool
On Tuesday 27 April 2010 19.12:31 Steve Atkins wrote: [...] > BIRT [...] FWIW, my cow-orkers are quite happy with BIRT (especially with the designer environment in Eclipse) after having used Crystal and Actuate previously. I've no idea how BIRT compares with the other OSS tools, but compared to those two using BIRT was quite a step up in terms of ease of use / available documentation & debugging tools. cheers -- vbi -- Most scientists think that the not-fossil theory is a red herring; indeed, they think that oil is lots of herrings (and other things) compacted over time into sticky black mud. -- Prospect Magazine, March 2003, p. 6
Re: [GENERAL] Open Source BI Tool
On Tue, 2010-04-27 at 12:33 -0400, akp geek wrote: > Hi all - > > I would like to know, if there is a open source BI tool > for creating reports against Postgres database ? appreciate your help > Pentaho > Regards -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering
Re: [GENERAL] Open Source BI Tool
On Apr 27, 2010, at 9:33 AM, akp geek wrote: > Hi all - > > I would like to know, if there is a open source BI tool for > creating reports against Postgres database ? appreciate your help Pentaho, BIRT, JasperReports (and associated tools like iReport), JFreeReport, DataVision, OpenReports, FreeReportBuilder, rlib, fyiReporting and xTuple are some to take a look at. There's a lot of overlap and repackaging of much the same underlying engines in there. The first three are the most commonly mentioned. Cheers, Steve
Re: [GENERAL] Open Source BI Tool
On 27/04/2010 17:40, akp geek wrote: > Business Intelligence tool ( crystal reports for example ) > Ah - grand - thanks! Ray. -- Raymond O'Donnell :: Galway :: Ireland r...@iol.ie
Re: [GENERAL] Open Source BI Tool
2010/4/27 akp geek : > Hi all - > > I would like to know, if there is a open source BI tool for > creating reports against Postgres database ? appreciate your help Pentaho have some good tools http://www.pentaho.com/ > > Regards > -- Cédric Villemain
Re: [GENERAL] Open Source BI Tool
Business Intelligence tool ( crystal reports for example ) Regards On Tue, Apr 27, 2010 at 12:38 PM, Raymond O'Donnell wrote: > On 27/04/2010 17:33, akp geek wrote: > > Hi all - > > > > I would like to know, if there is a open source BI tool for > > creating reports against Postgres database ? appreciate your help > > Pardon my ignorance - what's a "BI tool"? > > Thanks. :-) > > Ray. > > -- > Raymond O'Donnell :: Galway :: Ireland > r...@iol.ie >
Re: [GENERAL] Open Source BI Tool
On 27/04/2010 17:33, akp geek wrote: > Hi all - > > I would like to know, if there is a open source BI tool for > creating reports against Postgres database ? appreciate your help Pardon my ignorance - what's a "BI tool"? Thanks. :-) Ray. -- Raymond O'Donnell :: Galway :: Ireland r...@iol.ie
[GENERAL] Open Source BI Tool
Hi all - I would like to know if there is an open source BI tool for creating reports against a Postgres database? I appreciate your help. Regards
Re: [GENERAL] Postgresql on EC2/EBS in production?
Nikhil G. Daddikar wrote: I was wondering if any of you are using (or tried to use) PG+EC2/EBS on a production system. Are there any best practices? Googling didn't help much. A few articles I came across scared me a bit. There have been a couple of reports of happy users: http://blog.endpoint.com/2010/02/postgresql-ec2-ebs-raid0-snapshot.html http://archives.postgresql.org/pgsql-general/2009-06/msg00702.php There are two main things to be wary of: 1) Physical I/O is not very good, thus how that first system used a RAID0. 2) Reliability of EBS is terrible by database standards; I commented on this a bit already at http://archives.postgresql.org/pgsql-general/2009-06/msg00762.php The end result is that you must be careful about how you back your data up, with a continuous streaming backup via WAL shipping being the recommended approach. I wouldn't deploy into this environment in a situation where losing a minute or two of transactions in the case of an EC2/EBS failure would be unacceptable, because that's something that's a bit more likely to happen here than on most database hardware. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
Re: [GENERAL] PostgreSQL Performance issue
2010/4/27 Greg Smith : > a.bhattacha...@sungard.com wrote: >> >> I have *622,000 number of records *but it is taking almost *4 and half >> hours* to load these data into the tables. Without the schema and the queries, all you can get is guessing. -- Vincenzo Romano NotOrAnd Information Technologies NON QVIETIS MARIBVS NAVTA PERITVS
Re: [GENERAL] PostgreSQL Performance issue
a.bhattacha...@sungard.com wrote: I have *622,000 number of records *but it is taking almost *4 and half hours* to load these data into the tables. I have a simple function in db which is being called from Java batch program to populate the records into tables from flat files. Four likely possibilities here, in order of how easy they are to test for and therefore resolve: 1) Your hardware doesn't handle commits very well. You can turn off the synchronous_commit command while doing the data loading to see if that helps. See http://www.postgresql.org/docs/8.3/static/wal-async-commit.html 2) You're doing a commit after every single transaction. See http://java.sun.com/docs/books/tutorial/jdbc/basics/transactions.html for an intro to disabling this. Common practice is to commit every 100 to 1000 transactions instead. 3) The simple function in your database is executing very slowly. 4) The design of the database involves a lot of overhead, such as difficult to process foreign keys or an excess of indexes on some tables. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
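Point 2 above (batching commits) can be sketched outside Java as well. The helper below is hypothetical: it turns a comma-separated flat file into a SQL script that commits once per 1000 rows instead of once per row (the table and column names are made up, and a real loader should prefer COPY or parameter binding rather than string-built INSERTs).

```shell
# Emit SQL that groups INSERTs into transactions of 1000 rows each,
# so the server pays for one commit per batch instead of one per row.
batch_sql() {
  local n=0
  echo "BEGIN;"
  while IFS=, read -r id name; do
    printf "INSERT INTO items (id, name) VALUES (%s, '%s');\n" "$id" "$name"
    n=$((n + 1))
    if [ $((n % 1000)) -eq 0 ]; then
      echo "COMMIT;"
      echo "BEGIN;"      # start the next 1000-row batch
    fi
  done
  echo "COMMIT;"
}

# usage sketch: batch_sql < data.csv | psql mydb
```

The same idea in JDBC is to call setAutoCommit(false) on the connection and commit() every N executions, which is what the tutorial Greg links describes.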
Re: [GENERAL] invalid abstime to timestamp error only on specific date range
zhong ming wu writes: > I have a postgres table with a column type abstime. (all data in this > database are loaded from an ancient proprietary database) > When I filter data from this table with a specific date range on this > column it works. When I filter with a different date range > it gives me an error like > invalid abstime to timestamp > I don't have that server at this moment and I cannot post exact error. The closest thing I can find to that in the code is cannot convert abstime "invalid" to timestamp The abstime type does have a special reserved value "invalid", while timestamp doesn't, so that value can't be converted to timestamp. I'm guessing that you have an "invalid" or two lurking in the table somewhere, but it's hard to be specific with so few details. You've not shown us enough information to tell why your query would be trying to convert any abstime values to timestamp --- let alone that particular one. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
Thanks a lot for your help. However I am new to Postgres database therefore it would be nice if you can let me know how to set autocommit off. I know from psql client issuing “\set Autocommit Off” would set it off but unfortunately it doesn’t set it off. It's a client-side setting, not a server-side one. Hence you need to turn it off in your java program, probably in the database connection settings. P.S. Your message was so mangled that I couldn't find which part of the original message you were referring to, so I deleted the remainder. Top-posting is considered bad form in mailing-lists. Alban Hertroys Actually, top posting is considered bad form in THIS mailing list. All the others I participate in top post regularly. That said though, it is considered good form to post in the manner expected by the list that one is posting to. ;-) Bayless -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
On Tue, Apr 27, 2010 at 5:17 AM, Alban Hertroys wrote: > On 27 Apr 2010, at 10:11, > wrote: > >> Dear All Experts, >> >> I am using in PostgreSQL 8.3.5 database on windows 64 bit OS. >> However, I have a batch program written in Java which processes the data and >> populates them into tables in Postgres database. >> >> I have 622,000 number of records but it is taking almost 4 and half hours to >> load these data into the tables. > > Wow, that's averaging almost 40s per insert! I don't think those inserts not > being in a single transaction explains that much of a delay. Not that > changing that doesn't help, but there's more going on here. You got that backwards: it's 40 inserts/sec. This is about par for the course on windows style fsync on slow hardware iirc. Solution is to use transactions, or play with fsync. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
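Merlin's correction is easy to verify from the thread's own numbers: 622,000 rows in 4.5 hours is roughly 40 inserts per second, not 40 seconds per insert. Plain arithmetic:

```python
rows = 622_000
seconds = 4.5 * 3600                 # 4.5 hours

rate = rows / seconds                # inserts per second
per_insert_ms = 1000 / rate          # milliseconds per insert

print(f"{rate:.1f} inserts/sec, {per_insert_ms:.1f} ms/insert")
# → 38.4 inserts/sec, 26.0 ms/insert
```

Roughly 26 ms per insert is the signature of one synchronous, fsync'd commit per row on a single slow disk, which is why batching commits (or relaxing fsync during the load) is the suggested fix.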
Re: [GENERAL] lc_ctype does not work on windows ?
Filip Rembiałkowski writes: > I have a problem with locale on windows. > postgresql 8.4. > this does not work as expected: > filip=# select 'A' ~ '\w', 'Ą' ~ '\w'; The locale-aware regex operators don't really work in multibyte encodings. There's a (partial) fix for this in 9.0, but we don't trust it yet so it's not been back-patched. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] lc_ctype does not work on windows ?
Hi

I have a problem with locale on windows. postgresql 8.4. this does not work as expected:

filip=# select 'A' ~ '\w', 'Ą' ~ '\w';
 ?column? | ?column?
----------+----------
 t        | f
(1 row)

-- investigating postgres settings:

filip=# select name, setting, context, source, boot_val, reset_val from pg_settings where name ~* '(lc|encoding|locale|char|text|version)';
            name            |      setting      |  context  |       source       |     boot_val      |     reset_val
----------------------------+-------------------+-----------+--------------------+-------------------+-------------------
 client_encoding            | win1250           | user      | session            | SQL_ASCII         | UTF8
 default_text_search_config | pg_catalog.simple | user      | configuration file | pg_catalog.simple | pg_catalog.simple
 lc_collate                 | Polish, Poland    | internal  | override           | C                 | Polish, Poland
 lc_ctype                   | Polish, Poland    | internal  | override           | C                 | Polish, Poland
 lc_messages                | Polish, Poland    | superuser | configuration file |                   | Polish, Poland
 lc_monetary                | Polish, Poland    | user      | configuration file | C                 | Polish, Poland
 lc_numeric                 | Polish, Poland    | user      | configuration file | C                 | Polish, Poland
 lc_time                    | Polish, Poland    | user      | configuration file | C                 | Polish, Poland
 server_encoding            | UTF8              | internal  | override           | SQL_ASCII         | UTF8
 server_version             | 8.4.2             | internal  | default            | 8.4.2             | 8.4.2
 server_version_num         | 80402             | internal  | default            | 80402             | 80402
(11 rows)

-- and database settings:

filip=# select * from pg_database where datname='filip';
-[ RECORD 1 ]-+---------------
datname       | filip
datdba        | 2650623
encoding      | 6
datcollate    | Polish, Poland
datctype      | Polish, Poland
datistemplate | f
datallowconn  | t
datconnlimit  | -1
datlastsysoid | 11563
datfrozenxid  | 649
dattablespace | 1663
datconfig     |
datacl        |

-- Filip Rembiałkowski JID,mailto:filip.rembialkow...@gmail.com http://filip.rembialkowski.net/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
On 27 Apr 2010, at 11:15, wrote: > Thanks a lot for your help. However I am new to Postgres database therefore > it would be nice if you can let me know how to set autocommit off. > I know from psql client issuing “\set Autocommit Off” would set it off but > unfortunately it doesn’t set it off. It's a client-side setting, not a server-side one. Hence you need to turn it off in your java program, probably in the database connection settings. P.S. Your message was so mangled that I couldn't find which part of the original message you were referring to, so I deleted the remainder. Top-posting is considered bad form in mailing-lists. Alban Hertroys -- Screwing up is an excellent way to attach something to the ceiling. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
I am curious to know how much of your delay is due to PostgreSQL and how much to your Java batch program. If you comment out the call to the database function, so that you are reading your input file but not doing anything with the data, how long does your batch program take to run? RobR
Re: [GENERAL] pg_hba.conf
I've had problems before with the listen_addresses and had to set it accordingly. Wouldn't accept connections locally.

listen_addresses = '*'  # what IP address(es) to listen on;
                        # comma-separated list of addresses;
                        # defaults to 'localhost', '*' = all
                        # (change requires restart)
port = 5432

> Date: Tue, 27 Apr 2010 21:08:31 +0900 > From: ketan...@ashisuto.co.jp > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] pg_hba.conf > > Hi > > >Would there be a line earlier in the file that matches and is preventing > >a connection? > > At first, I thought so too. > But if there is a line earlier in the file, we get the following error. > > > psql: could not connect to server: Connection refused > Is the server running on host "192.168.23.132" and accepting > TCP/IP connections on port 1843? > > > ex: my pg_hba.conf > > host all all 192.168.23.132 255.255.255.255 reject > host all all 192.168.23.132 255.255.255.255 trust > > > > Jim's message says pg_hba.conf has no entry. > > > FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", > database "arc" > > > 1) Is pg_hba.conf's location correct? > You can check by executing this command. > > postgres=# show hba_file; > hba_file > ------------------------------- > /home/p843/pgdata/pg_hba.conf > (1 row) > > 2) Did you reload pg_hba.conf? > If we change pg_hba.conf, we must execute "pg_ctl reload" > > 3) pg_hba.conf may contain garbage. > Can you recreate pg_hba.conf? > *Don't copy the old pg_hba.conf. > > > Thank you. > > > On 27/04/2010 11:42, jkun...@laurcat.com wrote: > > >> I am putting up a new server on version 8.4.3. I copied pg_hba.conf > >> from a running 8.3.6 system, changing only the public IP address for the > >> local machine.
> >> > >> I get the error: > >> FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", > >> database "arc" > >> > >> pg_hba.conf contains the line: > >> host all all 209.159.145.248 255.255.255.255 trust > >> > > Would there be a line earlier in the file that matches and is preventing > > a connection? > > > > Ray. > > > > > -- > > Kenichiro Tanaka > K.K.Ashisuto > http://www.ashisuto.co.jp/english/index.html > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_hba.conf
Hi

>Would there be a line earlier in the file that matches and is preventing >a connection?

At first, I thought so too. But if there is a line earlier in the file, we get the following error:

psql: could not connect to server: Connection refused
Is the server running on host "192.168.23.132" and accepting TCP/IP connections on port 1843?

ex: my pg_hba.conf

host all all 192.168.23.132 255.255.255.255 reject
host all all 192.168.23.132 255.255.255.255 trust

Jim's message says pg_hba.conf has no entry:

FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", database "arc"

1) Is pg_hba.conf's location correct? You can check by executing this command:

postgres=# show hba_file;
           hba_file
-------------------------------
 /home/p843/pgdata/pg_hba.conf
(1 row)

2) Did you reload pg_hba.conf? If we change pg_hba.conf, we must execute "pg_ctl reload"

3) pg_hba.conf may contain garbage. Can you recreate pg_hba.conf? *Don't copy the old pg_hba.conf.

Thank you.

On 27/04/2010 11:42, jkun...@laurcat.com wrote: I am putting up a new server on version 8.4.3. I copied pg_hba.conf from a running 8.3.6 system, changing only the public IP address for the local machine. I get the error: FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", database "arc" pg_hba.conf contains the line: host all all 209.159.145.248 255.255.255.255 trust Would there be a line earlier in the file that matches and is preventing a connection? Ray. -- Kenichiro Tanaka K.K.Ashisuto http://www.ashisuto.co.jp/english/index.html -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
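The behaviour Kenichiro's two-line example relies on is that pg_hba.conf is read top to bottom and the first line whose fields match the incoming connection decides the outcome; later lines are never consulted, and if no line matches you get the "no pg_hba.conf entry" FATAL. A toy model of that first-match rule (the field names and exact-string address comparison are my simplification; real pg_hba.conf also matches CIDR masks and the database/user columns):

```python
def first_match(hba_lines, client_addr):
    """Return the auth method of the first pg_hba.conf line matching client_addr."""
    for db, user, addr, method in hba_lines:
        if addr == client_addr:   # real matching uses CIDR masks; simplified here
            return method
    return None                   # no entry -> "no pg_hba.conf entry" FATAL

hba = [
    ("all", "all", "192.168.23.132", "reject"),
    ("all", "all", "192.168.23.132", "trust"),
]
print(first_match(hba, "192.168.23.132"))  # → reject (the later trust line is ignored)
print(first_match(hba, "10.0.0.1"))        # → None   (no entry at all)
```

This is why Jim's "no entry" error means no line matched at all, rather than an earlier line overriding his trust line.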
[GENERAL] invalid abstime to timestamp error only on specific date range
I have a postgres table with a column of type abstime. (all data in this database are loaded from an ancient proprietary database) When I filter data from this table with a specific date range on this column it works. When I filter with a different date range it gives me an error like

invalid abstime to timestamp

I don't have that server at this moment and I cannot post the exact error. I will if I have to. I was pulling my hair out because it happens only with a specific date range, so it can't be an invalid date format in the select statement. Say I do

psql> select * from shipping where ship_date <= '04/22/2008' and ship_date >= '04/21/2008'

it works, but if I do

psql> select * from shipping where ship_date <= '04/22/2010' and ship_date >= '04/21/2010'

it won't even perform the query but emits the error. Both server and client are 8.4.3. I tried other date formats, but the above format should work because it works with the year 2008. Thanks for any suggestion -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_hba.conf
On 27/04/2010 11:42, jkun...@laurcat.com wrote: > I am putting up a new server on version 8.4.3. I copied pg_hba.conf > from a running 8.3.6 system, changing only the public IP address for the > local machine. > > I get the error: > FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", > database "arc" > > pg_hba.conf contains the line: > host all all 209.159.145.248 255.255.255.255 trust Would there be a line earlier in the file that matches and is preventing a connection? Ray. -- Raymond O'Donnell :: Galway :: Ireland r...@iol.ie -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Storing many big files in database- should I do it?
On Tuesday 27 April 2010 11.17:42 Cédric Villemain wrote: > > Anyone had this kind of design problem and how did you solve it? > > store your files in a filesystem, and keep the path to the file (plus > metadata, acl, etc...) in database. ... and be careful that db and file storage do not go out of sync. But if files are ever only added and possibly removed (but never changed), this is not too hard:

* be sure to commit the db transaction only after the file has been written to disk (use fsync or similar to be sure!) (For file deletions: first delete the db metadata, then delete the file.)
* be sure to detect failed writes and abort the db transaction or otherwise properly handle errors while storing the file.
* occasionally run a clean-up to remove files that were written to the filesystem but whose db metadata was not stored. Should be a rare case but it probably will happen.

PostgreSQL supports 2PC (PREPARE and then COMMIT as separate steps); you may want to use this (PREPARE the database transaction, then do the filesystem operations. If the filesystem operation fails, you can ROLLBACK the db transaction, otherwise COMMIT.) That way, you don't lose transactional semantics. Backup requires some more thought. I guess you could use some kind of volume management to get filesystem snapshots, but you have to be sure the fs snapshot reflects the point in time when the database backup was made. Depending on load / availability requirements you may get away with stopping data modification at the application level for a few seconds until the db backup has started and the filesystem snapshot has been created. cheers -- vbi -- featured product: PostgreSQL - http://postgresql.org
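The ordering vbi describes (file durable on disk first, metadata commit second, rollback on a failed write) can be sketched as below. sqlite3 is only a stand-in for PostgreSQL so the sketch runs self-contained, and the `store_file` helper and table layout are my own illustration, not code from the thread:

```python
import os
import sqlite3
import tempfile

def store_file(conn, store_dir, name, data):
    """Write the file and fsync it, then commit the metadata row."""
    path = os.path.join(store_dir, name)
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # file is durable *before* we commit metadata
        conn.execute("INSERT INTO files (name, size) VALUES (?, ?)",
                     (name, len(data)))
        conn.commit()              # commit only after the file write succeeded
    except OSError:
        conn.rollback()            # failed write -> no orphaned metadata row
        raise
    return path

store = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, size INTEGER)")
store_file(conn, store, "report.pdf", b"%PDF-1.4 ...")
print(conn.execute("SELECT name, size FROM files").fetchone())  # → ('report.pdf', 12)
```

The invariant this ordering buys is that every metadata row points at a file that really exists; the only possible inconsistency is an orphaned file with no row, which vbi's periodic clean-up pass handles.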
[GENERAL] pg_hba.conf
I am putting up a new server on version 8.4.3. I copied pg_hba.conf from a running 8.3.6 system, changing only the public IP address for the local machine. I get the error: FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", database "arc" pg_hba.conf contains the line: host all all 209.159.145.248 255.255.255.255 trust Other records work (I can connect from my remote site using pgAdmin just fine), so I know the file is being read by postgres. Any ideas? Thanks in advance, Jim -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Unable to run createlang (or psql for that matter)
Do you know of any guides to ritual suicide? On Apr 27, 2010, at 3:02 AM, Scott Mead wrote: Your path has 'PostgresPlus' Locate shows 'PostgreSQL' -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Efficient intersect operation (Full Text Search)
On Wed, Apr 14, 2010 at 11:39 AM, Allan Kamau wrote: > I have a brief question - I can provide more information if it is not clear. > > I would like to perform pairwise intersect operations between several > pairs of sets (where a set is a list or vector of labels), I have many > such pairs of sets and the counts of their elements may vary greatly. > > Is there a way to "AND" two tsvectors and get the resulting > intersection as another tsvector? > > > Allan. > To refine the above question, is there a way to intersect two tsvectors? In short I would like to intersect two documents. Allan. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
Thanks a lot for your help. However I am new to Postgres database therefore it would be nice if you can let me know how to set autocommit off. I know from psql client issuing "\set Autocommit Off" would set it off but unfortunately it doesn't set it off. Many thanks for your help. From: Magnus Hagander [mailto:mag...@hagander.net] Sent: Tuesday, April 27, 2010 2:19 PM To: Bhattacharya, A Cc: pgsql-general Subject: Re: [GENERAL] PostgreSQL Performance issue On Tue, Apr 27, 2010 at 10:11, wrote: Dear All Experts, I am using in PostgreSQL 8.3.5 database on windows 64 bit OS. You really need to upgrade. At least to 8.3.10. It has many important bugfixes. However, I have a batch program written in Java which processes the data and populates them into tables in Postgres database. I have 622,000 number of records but it is taking almost 4 and half hours to load these data into the tables. I have a simple function in db which is being called from Java batch program to populate the records into tables from flat files. I have the below system configuration for my database server. Database Server PostgreSQL v8.3.5 Operating System Windows 2003 Server 64 bit, Service Pack 2 CPU 2 * Quad Core AMD Opteron Processor 2376 @ 2.29 GHz Memory 16 GB RAM Disk Space total 2.5 TB [C drive - 454 GB & D drive 1.99 TB] The interesting point is not how much disk you have, but what configuration you have it in. Either way, 622k records in 4 1/2 hours is obviously crappy even for a single disk though. and I have set my postgresql.conf parameters as below. shared_buffers = 1GB You might want to try to lower that one drastically, say 128Mb. In some cases, this has been known to give better performance on Windows. Not in all cases though, so you have to try it out. temp_buffers = 256MB max_prepared_transactions = 100 Are you really using 2-phase commit on the system? If not, turn this off. This is prepared transactions, not prepared statements.
Please advise me the best or optimum way setting these parameters to achieve better performance. Also note that, when I am setting my shared_buffer = 2GB or high , Postgres is throwing an error "shared_buffer size cannot be more than size_t" That is normal since your binary is 32-bit. In fact, having shared_buffers at 1Gb may give you some trouble with your fairly high work_mem as well, as the *total* amount of memory in the process is limited. That's another reason to try a lower shared_buffers. (other than that, read the comments from Thom) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [GENERAL] Storing many big files in database- should I do it?
S3 is not primary storage for the files, it's a distribution system. We want to be able to switch from S3 to another CDN if required. So, the "master" copies of the files are kept on a private server. The question is: should it be the database or the filesystem? On Tue, Apr 27, 2010 at 7:03 PM, Massa, Harald Armin wrote: >> No, I'm not storing RDBMS in S3. I didn't write that in my post. >> S3 is used as CDN, only for downloading files. > > > So you are storing your files on S3 ? > > Why should you store those files additionally in a PostgreSQL database? > > If you want to keep track of them / remember metadata, hashes will do the > job with much less memory. > > Harald > > -- > GHUM Harald Massa > persuadere et programmare > Harald Armin Massa > Spielberger Straße 49 > 70435 Stuttgart > 0173/9409607 > no fx, no carrier pigeon > - > %s is too gigantic of an industry to bend to the whims of reality > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Storing many big files in database- should I do it?
2010/4/27 Rod : > Hello, > > I have a web application where users upload/share files. > After file is uploaded it is copied to S3 and all subsequent downloads > are done from there. > So in a file's lifetime it's accessed only twice- when created and > when copied to S3. > > Files are documents, of different size from few kilobytes to 200 > Megabytes. Number of files: thousands to hundreds of thousands. > > My dilemma is - Should I store files in PGSQL database or store in > filesystem and keep only metadata in database? > > I see the possible cons of using PGSQL as storage: > - more network bandwidth required comparing to access NFS-mounted filesystem ? > - if database becomes corrupt you can't recover individual files > - you can't backup live database unless you install complicated > replication add-ons > - more CPU required to store/retrieve files (comparing to filesystem access) > - size overhead, e.g. storing 1000 bytes will take 1000 bytes in > database + 100 bytes for db metadata, index, etc. with lot of files > this will be a lot of overhead. > > Are these concerns valid? yes > Anyone had this kind of design problem and how did you solve it? store your files in a filesystem, and keep the path to the file (plus metadata, acl, etc...) in database. > > Thanks. > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Cédric Villemain -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
On 27 Apr 2010, at 10:11, wrote: > Dear All Experts, > > I am using in PostgreSQL 8.3.5 database on windows 64 bit OS. > However, I have a batch program written in Java which processes the data and > populates them into tables in Postgres database. > > I have 622,000 number of records but it is taking almost 4 and half hours to > load these data into the tables. Wow, that's averaging almost 40s per insert! I don't think those inserts not being in a single transaction explains that much of a delay. Not that changing that doesn't help, but there's more going on here. Maybe you got some foreign key constraints involved that don't have indices on the foreign keys? In that case I expect that you either have a lot of foreign references, only a few but from rather large tables (several million rows at least) or something in between? Any other constraints that could be relevant? > I have a simple function in db which is being called from Java batch program > to populate the records into tables from flat files. Did you verify that most of the time is spent waiting on the database? I'm not entirely sure what you mean by the above. Do you have a batch program that starts the java interpreter for each flat file? Are they running synchronously (one after the other) or parallel? In the latter case you may be waiting on locks. Is the data that your program needs to insert in one line in the flat file or does it need to collect data from multiple lines throughout the file? How much memory does your java program use, could it be that it causes postgres to be swapped out? Did you do any benchmarking on your "simple function in db" or on the queries it performs (assuming it does perform any)? > I have the below system configuration for my database server.
> Database Server > PostgreSQL v8.3.5 > Operating System > Windows 2003 Server 64 bit, Service Pack 2 > CPU > 2 * Quad Core AMD Opteron Processor 2376 @ 2.29 GHz > Memory > 16 GB RAM > Disk Space > total 2.5 TB [C drive – 454 GB & D drive 1.99 TB] A system like that should be able to insert that small a number of records in no time. > Also note that, when I am setting my shared_buffer = 2GB or high , Postgres > is throwing an error “shared_buffer size cannot be more than size_t” That's odd... Is this a 32-bit Postgres build or is a 64-bit Windows incapable of assigning more than a 32-bit number for the amount of shared memory? Are you running in some kind of 32-bit compatibility mode maybe (PAE comes to mind)? That said, I haven't used Windows for anything more serious than gaming since last century - I'm not exactly an expert on its behaviour. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Storing many big files in database- should I do it?
> > No, I'm not storing RDBMS in S3. I didn't write that in my post. > S3 is used as CDN, only for downloading files. > So you are storing your files on S3 ? Why should you store those files additionally in a PostgreSQL database? If you want to keep track of them / remember metadata, hashes will do the job with much less memory. Harald -- GHUM Harald Massa persuadere et programmare Harald Armin Massa Spielberger Straße 49 70435 Stuttgart 0173/9409607 no fx, no carrier pigeon - %s is too gigantic of an industry to bend to the whims of reality
Re: [GENERAL] Storing many big files in database- should I do it?
No, I'm not storing RDBMS in S3. I didn't write that in my post. S3 is used as CDN, only for downloading files. On Tue, Apr 27, 2010 at 6:54 PM, John R Pierce wrote: > Rod wrote: >> >> Hello, >> >> I have a web application where users upload/share files. >> After file is uploaded it is copied to S3 and all subsequent downloads >> are done from there. >> So in a file's lifetime it's accessed only twice- when created and >> when copied to S3. >> >> Files are documents, of different size from few kilobytes to 200 >> Megabytes. Number of files: thousands to hundreds of thousands. >> >> My dilemma is - Should I store files in PGSQL database or store in >> filesystem and keep only metadata in database? >> >> I see the possible cons of using PGSQL as storage: >> - more network bandwidth required comparing to access NFS-mounted >> filesystem ? >> - if database becomes corrupt you can't recover individual files >> - you can't backup live database unless you install complicated >> replication add-ons >> - more CPU required to store/retrieve files (comparing to filesystem >> access) >> - size overhead, e.g. storing 1000 bytes will take 1000 bytes in >> database + 100 bytes for db metadata, index, etc. with lot of files >> this will be a lot of overhead. >> >> Are these concerns valid? >> Anyone had this kind of design problem and how did you solve it? >> > > S3 storage is not suitable for running a RDBMS. > An RDBMS wants fast low latency storage using 8k block random reads and > writes. S3 is high latency and oriented towards streaming > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
a.bhattacha...@sungard.com wrote: Dear All Experts, I am using in PostgreSQL 8.3.5 database on windows 64 bit OS. However, I have a batch program written in Java which processes the data and populates them into tables in Postgres database. I have *622,000 number of records *but it is taking almost *4 and half hours* to load these data into the tables. I have a simple function in db which is being called from Java batch program to populate the records into tables from flat files. I have the below system configuration for my database server. Database Server *PostgreSQL v8.3.5* Operating System *Windows 2003 Server 64 bit, Service Pack 2* CPU *2 * Quad Core AMD Opteron Processor 2376 @ 2.29 GHz* Memory *16 GB RAM* Disk Space *total 2.5 TB [C drive – 454 GB & D drive 1.99 TB]* ... with one thread doing inserts, the other 7 cores will be idle. But you're almost certainly disk IO bound. OTOH, if you're calling a function (is that pl-pgsql, pl-java, pl-perl, or what?) for each insert, you could be compute bound on that single core. It really depends on what that function is doing. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Storing many big files in database- should I do it?
Rod wrote: Hello, I have a web application where users upload/share files. After file is uploaded it is copied to S3 and all subsequent downloads are done from there. So in a file's lifetime it's accessed only twice- when created and when copied to S3. Files are documents, of different size from few kilobytes to 200 Megabytes. Number of files: thousands to hundreds of thousands. My dilemma is - Should I store files in PGSQL database or store in filesystem and keep only metadata in database? I see the possible cons of using PGSQL as storage: - more network bandwidth required comparing to access NFS-mounted filesystem ? - if database becomes corrupt you can't recover individual files - you can't backup live database unless you install complicated replication add-ons - more CPU required to store/retrieve files (comparing to filesystem access) - size overhead, e.g. storing 1000 bytes will take 1000 bytes in database + 100 bytes for db metadata, index, etc. with lot of files this will be a lot of overhead. Are these concerns valid? Anyone had this kind of design problem and how did you solve it? S3 storage is not suitable for running a RDBMS. An RDBMS wants fast low latency storage using 8k block random reads and writes. S3 is high latency and oriented towards streaming -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
On Tue, Apr 27, 2010 at 10:11, wrote: > Dear All Experts, > > > > I am using in PostgreSQL 8.3.5 database on windows 64 bit OS. > You really need to upgrade. At least to 8.3.10. It has many important bugfixes. > However, I have a batch program written in Java which processes the data > and populates them into tables in Postgres database. > > > > I have *622,000 number of records *but it is taking almost *4 and half > hours* to load these data into the tables. > > I have a simple function in db which is being called from Java batch > program to populate the records into tables from flat files. > > > > I have the below system configuration for my database server. > > Database Server > > *PostgreSQL v8.3.5* > > Operating System > > *Windows 2003 Server 64 bit, Service Pack 2* > > CPU > > *2 * Quad Core AMD Opteron Processor 2376 @ 2.29 GHz* > > Memory > > *16 GB RAM* > > Disk Space > > *total 2.5 TB [C drive – 454 GB & D drive 1.99 TB]* > > The interesting point is not how much disk you have, but what configuration you have it in. Either way, 622k records in 4 1/2 hours is obviously crappy even for a single disk though. > and I have set my postgresql.conf parameters as below. > > > > shared_buffers = 1GB > > You might want to try to lower that one drastically, say 128Mb. In some cases, this has been known to give better performance on Windows. Not in all cases though, so you have to try it out. > temp_buffers = 256MB > > max_prepared_transactions = 100 > Are you really using 2-phase commit on the system? If not, turn this off. This is prepared transactions, not prepared statements. > > > > Please advise me the best or optimum way setting these parameters to > achieve better performance. > > Also note that, when I am setting my *shared_buffer = 2GB or high , *Postgres > is throwing an error “*shared_buffer size cannot be more than size_t*” > That is normal since your binary is 32-bit.
In fact, having shared_buffers at 1GB may give you some trouble with your fairly high work_mem as well, as the *total* amount of memory in the process is limited. That's another reason to try a lower shared_buffers. (Other than that, read the comments from Thom.) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
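Pulling Magnus's suggestions together, the relevant postgresql.conf changes might look like this (illustrative values only; the 16MB work_mem figure is an assumption, not from the thread, and the only way to know what helps is to benchmark on the actual machine):

```
shared_buffers = 128MB            # try much lower than 1GB; often faster on Windows
work_mem = 16MB                   # 512MB per sort/hash node is risky; lower it
max_prepared_transactions = 0     # disable unless two-phase commit is actually used
```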
[GENERAL] Storing many big files in database- should I do it?
Hello, I have a web application where users upload/share files. After a file is uploaded it is copied to S3, and all subsequent downloads are done from there. So in a file's lifetime it is accessed only twice: when created and when copied to S3. The files are documents of varying size, from a few kilobytes to 200 megabytes. Number of files: thousands to hundreds of thousands. My dilemma is: should I store the files in the PostgreSQL database, or store them in the filesystem and keep only the metadata in the database? I see the possible cons of using PostgreSQL as storage: - more network bandwidth required compared to accessing an NFS-mounted filesystem? - if the database becomes corrupt you can't recover individual files - you can't back up a live database unless you install complicated replication add-ons - more CPU required to store/retrieve files (compared to filesystem access) - size overhead, e.g. storing 1000 bytes will take 1000 bytes in the database plus ~100 bytes for db metadata, indexes, etc.; with a lot of files this will be a lot of overhead. Are these concerns valid? Has anyone had this kind of design problem, and how did you solve it? Thanks. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL Performance issue
Most likely you are inserting one row per transaction. Set autocommit to false and commit only after all the inserts are done. -n. On 27-04-2010 13:41, a.bhattacha...@sungard.com wrote: Dear All Experts, I am using a PostgreSQL 8.3.5 database on a Windows 64-bit OS. I have a batch program written in Java which processes the data and populates it into tables in the Postgres database. I have 622,000 records but it is taking almost 4 and a half hours to load the data into the tables. I have a simple function in the db which is called from the Java batch program to populate the records into tables from flat files. I have the below system configuration for my database server. Database Server: PostgreSQL v8.3.5 Operating System: Windows 2003 Server 64 bit, Service Pack 2 CPU: 2 * Quad Core AMD Opteron Processor 2376 @ 2.29 GHz Memory: 16 GB RAM Disk Space: total 2.5 TB [C drive - 454 GB & D drive 1.99 TB] and I have set my postgresql.conf parameters as below. == # RESOURCE USAGE (except WAL) # - Memory - shared_buffers = 1GB temp_buffers = 256MB max_prepared_transactions = 100 work_mem = 512MB maintenance_work_mem = 512MB # - Free Space Map - max_fsm_pages = 160 max_fsm_relations = 1 # WRITE AHEAD LOG wal_buffers = 5MB # min 32kB checkpoint_segments = 32 checkpoint_completion_target = 0.9 # QUERY TUNING # - Planner Method Configuration - enable_hashagg = on enable_hashjoin = on enable_indexscan = on enable_mergejoin = on enable_nestloop = on enable_seqscan = on enable_sort = on effective_cache_size = 8GB Please advise me the best or optimum way of setting these parameters to achieve better performance. Also note that when I set shared_buffers = 2GB or higher, Postgres throws the error "shared_buffer size cannot be more than size_t". I would be very grateful if anyone can help me with this. Many thanks
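In SQL terms, the suggestion above is to replace one transaction per row with a single transaction around the whole batch (table and values here are hypothetical):

```sql
-- One transaction for the entire load, instead of one per INSERT
BEGIN;
INSERT INTO records (id, payload) VALUES (1, 'first row');
INSERT INTO records (id, payload) VALUES (2, 'second row');
-- ... the remaining rows ...
COMMIT;
```

From JDBC this corresponds to calling connection.setAutoCommit(false) before the loop and connection.commit() once at the end, so each insert no longer pays the cost of its own commit (an fsync of the WAL).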
Re: [GENERAL] PostgreSQL Performance issue
On 27 April 2010 09:11, wrote: > Dear All Experts, > I am using a PostgreSQL 8.3.5 database on a Windows 64-bit OS. > I have a batch program written in Java which processes the data and populates it into tables in the Postgres database. > I have 622,000 records but it is taking almost 4 and a half hours to load the data into the tables. > I have a simple function in the db which is called from the Java batch program to populate the records into tables from flat files. > I have the below system configuration for my database server: > Database Server: PostgreSQL v8.3.5 > Operating System: Windows 2003 Server 64 bit, Service Pack 2 > CPU: 2 * Quad Core AMD Opteron Processor 2376 @ 2.29 GHz > Memory: 16 GB RAM > Disk Space: total 2.5 TB [C drive - 454 GB & D drive 1.99 TB] > and I have set my postgresql.conf parameters as below: > # RESOURCE USAGE (except WAL) > # - Memory - > shared_buffers = 1GB > temp_buffers = 256MB > max_prepared_transactions = 100 > work_mem = 512MB > maintenance_work_mem = 512MB > # - Free Space Map - > max_fsm_pages = 160 > max_fsm_relations = 1 > # WRITE AHEAD LOG > wal_buffers = 5MB # min 32kB > checkpoint_segments = 32 > checkpoint_completion_target = 0.9 > # QUERY TUNING > # - Planner Method Configuration - > enable_hashagg = on > enable_hashjoin = on > enable_indexscan = on > enable_mergejoin = on > enable_nestloop = on > enable_seqscan = on > enable_sort = on > effective_cache_size = 8GB > Please advise me the best or optimum way of setting these parameters to achieve better performance. > Also note that when I set shared_buffers = 2GB or higher, Postgres throws the error "shared_buffer size cannot be more than size_t" > I would be very grateful if anyone can help me with this. > Many thanks Are these all being sent in one transaction? Can't you use COPY to bulk insert into the database? If not, can you insert in batches (like 1,000 at a time)? Have you got any triggers/constraints/complicated domains on the table you're inserting into? Thom
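The COPY route Thom mentions would look something like this on 8.3 (table name, columns, and file path are hypothetical; the server-side form requires superuser and a path readable by the account the Postgres service runs as):

```sql
-- Server-side bulk load, PostgreSQL 8.3 syntax:
COPY records (id, payload) FROM 'D:/data/records.csv' WITH CSV;

-- Or client-side from psql, reading the file on the client machine:
-- \copy records (id, payload) FROM 'records.csv' WITH CSV
```

COPY loads all rows in a single command and a single transaction, which is usually far faster than row-at-a-time INSERTs from application code.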
[GENERAL] PostgreSQL Performance issue
Dear All Experts, I am using a PostgreSQL 8.3.5 database on a Windows 64-bit OS. I have a batch program written in Java which processes the data and populates it into tables in the Postgres database. I have 622,000 records but it is taking almost 4 and a half hours to load the data into the tables. I have a simple function in the db which is called from the Java batch program to populate the records into tables from flat files. I have the below system configuration for my database server. Database Server: PostgreSQL v8.3.5 Operating System: Windows 2003 Server 64 bit, Service Pack 2 CPU: 2 * Quad Core AMD Opteron Processor 2376 @ 2.29 GHz Memory: 16 GB RAM Disk Space: total 2.5 TB [C drive - 454 GB & D drive 1.99 TB] and I have set my postgresql.conf parameters as below. == # RESOURCE USAGE (except WAL) # - Memory - shared_buffers = 1GB temp_buffers = 256MB max_prepared_transactions = 100 work_mem = 512MB maintenance_work_mem = 512MB # - Free Space Map - max_fsm_pages = 160 max_fsm_relations = 1 # WRITE AHEAD LOG wal_buffers = 5MB # min 32kB checkpoint_segments = 32 checkpoint_completion_target = 0.9 # QUERY TUNING # - Planner Method Configuration - enable_hashagg = on enable_hashjoin = on enable_indexscan = on enable_mergejoin = on enable_nestloop = on enable_seqscan = on enable_sort = on effective_cache_size = 8GB Please advise me the best or optimum way of setting these parameters to achieve better performance. Also note that when I set shared_buffers = 2GB or higher, Postgres throws the error "shared_buffer size cannot be more than size_t". I would be very grateful if anyone can help me with this. Many thanks
Re: [GENERAL] gmake check problem
Hi Jim, To resolve this problem, we have to identify which SQL statement is hanging. The "stats" test executes postgresql-8.4.3/src/test/regress/sql/stats.sql, so we can see what the test does. And we can see the log at postgresql-8.4.3/src/test/regress/results/stats.out. So I suppose we can identify which SQL statement is hanging. What does stats.out say? Thank you. I built 8.4.3 on CentOS 5 with just ./configure and gmake. When I run gmake check, the process stops at "test stats..." There is no residual postgresql or postmaster running (consuming clock cycles in top). When I break the process, I get the following errors: gmake[2]: *** wait: No child processes. Stop. gmake[2]: *** Waiting for unfinished jobs gmake[2]: *** wait: No child processes. Stop. gmake[1]: *** [check] Error 2 gmake: *** [check] Interrupt [postg...@bubachubs postgresql-8.4.3]$ Any help would be appreciated. Thanks, Jim -- Kenichiro Tanaka K.K.Ashisuto http://www.ashisuto.co.jp/english/index.html -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
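Another way to see which statement the stats test is stuck on is to connect to the temporary regression instance from a second session while the test hangs (gmake check starts its own postmaster on a non-default port, which the test driver prints) and query pg_stat_activity. The column names below are the 8.4-era ones; they changed in 9.2:

```sql
-- Show non-idle backends and how long their current query has been running
SELECT procpid, current_query, query_start, waiting
FROM pg_stat_activity
WHERE current_query <> '<IDLE>';
```

A query with waiting = true here would point at a lock problem rather than a crashed backend.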