Re: [GENERAL] v8.3.4 metadata changes while users active
On Thu, 2012-04-05 at 03:17 +, Gauthier, Dave wrote: H... I don't have root access :-( In that case, ask your sysadmin to grant you sudo access to iptables or, if he thinks that is excessive, to write a wrapper script that enables/disables just that port and give you sudo access to that script. Martin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Versioned, chunked documents
On Mon, 2012-04-02 at 00:38 +0200, Ivan Voras wrote:

Hi, I have documents which are divided into chunks, so that the (ordered) concatenation of chunks makes the whole document. Each of the chunks may be edited separately, and past versions of the chunks need to be kept. The structure looks fairly simple. The first goal is to retrieve the latest version of the whole document, made from the latest versions of all chunks, but later the goal will also be to fetch the whole document at some point in time (i.e. with chunks created before a point in time). I did the first goal by creating two helper views:

CREATE VIEW documents_chunks_last_version_chunk_ids AS
    SELECT documents_id, max(id), seq
    FROM documents_chunks
    GROUP BY documents_id, seq;

CREATE VIEW documents_chunks_last_version_content AS
    SELECT documents_chunks.documents_id, content
    FROM documents_chunks
    JOIN documents_chunks_last_version_chunk_ids
        ON documents_chunks.id = documents_chunks_last_version_chunk_ids.max
    ORDER BY documents_chunks_last_version_chunk_ids.seq;

There are indexes on the document_chunks fields seq and documents_id. Everything looked fine until I examined the output of EXPLAIN ANALYZE and saw this:

I'm not surprised. First guess: I'd use id (FK of documents), seq, ctime as the prime key of document_chunk, which would work for your initial requirement but is far too simplistic to deal with the general requirement of retrieving a specific document version. You'd probably need something like:

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    ...
);

CREATE TABLE version (
    version_no SERIAL PRIMARY KEY,
    author VARCHAR(40),
    ctime TIMESTAMP
);

CREATE TABLE document_version (
    id INTEGER REFERENCES documents(id),
    version_no INTEGER REFERENCES version(version_no),
    PRIMARY KEY (id, version_no)
);

CREATE TABLE documents_chunks (
    id INTEGER,
    version_no INTEGER,
    seq INTEGER,
    content TEXT,
    PRIMARY KEY (id, version_no, seq),
    FOREIGN KEY (id, version_no) REFERENCES document_version (id, version_no)
);

Disclaimer: this is not syntax checked or tested. It may or may not match your requirements, but since I haven't seen your ERD or the 3NF you derived from it I can't offer any more applicable advice.

Martin
Re: [GENERAL] Versioned, chunked documents
Correction interpolated - see below

On Mon, 2012-04-02 at 00:22 +0100, Martin Gregorie wrote:

On Mon, 2012-04-02 at 00:38 +0200, Ivan Voras wrote:

Hi, I have documents which are divided into chunks, so that the (ordered) concatenation of chunks makes the whole document. Each of the chunks may be edited separately, and past versions of the chunks need to be kept. The structure looks fairly simple. The first goal is to retrieve the latest version of the whole document, made from the latest versions of all chunks, but later the goal will also be to fetch the whole document at some point in time (i.e. with chunks created before a point in time). I did the first goal by creating two helper views:

CREATE VIEW documents_chunks_last_version_chunk_ids AS
    SELECT documents_id, max(id), seq
    FROM documents_chunks
    GROUP BY documents_id, seq;

CREATE VIEW documents_chunks_last_version_content AS
    SELECT documents_chunks.documents_id, content
    FROM documents_chunks
    JOIN documents_chunks_last_version_chunk_ids
        ON documents_chunks.id = documents_chunks_last_version_chunk_ids.max
    ORDER BY documents_chunks_last_version_chunk_ids.seq;

There are indexes on the document_chunks fields seq and documents_id. Everything looked fine until I examined the output of EXPLAIN ANALYZE and saw this:

I'm not surprised. First guess: I'd use id (FK of documents), seq, ctime as the prime key of document_chunk, which would work for your initial requirement but is far too simplistic to deal with the general requirement of retrieving a specific document version. You'd probably need something like:

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    ...
);

CREATE TABLE version (
    version_no SERIAL PRIMARY KEY,
    author VARCHAR(40),
    ctime TIMESTAMP
);

CREATE TABLE document_version (
    id INTEGER REFERENCES documents(id),
    version_no INTEGER REFERENCES version(version_no),
    PRIMARY KEY (id, version_no)
);

I used a version number in the key because a TIMESTAMP is much too fine grained unless you're planning to obtain its value before committing all the changed document_chunks affected by this editing session.

BTW, why use document_chunks when a text field can hold megabytes, especially if they will be concatenated to form a complete document which is then edited as a whole item before being split into chunks and saved back to the database? If the chunks represent chapters or other logical sections that are always edited separately, why not name them to reflect this? 'document_chunk' just sounds too arbitrary to me.

CREATE TABLE documents_chunks (
    id INTEGER,
    version_no INTEGER,
    seq INTEGER,
    content TEXT,
    PRIMARY KEY (id, version_no, seq),  -- should be PRIMARY KEY (id, seq, version_no), of course!
    FOREIGN KEY (id, version_no) REFERENCES document_version (id, version_no)
);

Disclaimer: this is not syntax checked or tested. It may or may not match your requirements, but since I haven't seen your ERD or the 3NF you derived from it I can't offer any more applicable advice.

Martin
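The latest-version-per-chunk lookup this schema supports can be sketched as follows. This is a hypothetical, much-simplified demo (SQLite is used so it runs self-contained; constraints and the version/document_version tables' details are pared down), not the poster's actual implementation:

```python
import sqlite3

# Simplified stand-in for the versioned-chunk schema sketched above.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE documents_chunks (
        id         INTEGER,   -- document id
        version_no INTEGER,   -- version this chunk revision belongs to
        seq        INTEGER,   -- position of the chunk in the document
        content    TEXT,
        PRIMARY KEY (id, version_no, seq)
    );
""")

# Two versions of document 1; only chunk 2 was edited in version 2.
con.executemany(
    "INSERT INTO documents_chunks VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "Intro v1"), (1, 1, 2, "Body v1"), (1, 2, 2, "Body v2")],
)

# Latest document = highest version_no available for each (id, seq).
rows = con.execute("""
    SELECT seq, content
    FROM documents_chunks AS dc
    WHERE id = 1
      AND version_no = (SELECT max(version_no) FROM documents_chunks
                        WHERE id = dc.id AND seq = dc.seq)
    ORDER BY seq
""").fetchall()
print(rows)  # [(1, 'Intro v1'), (2, 'Body v2')]
```

Fetching the document "as of" a point in time is the same query with an extra `version_no <= ?` bound inside the subquery.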
Re: [GENERAL] Help in Parsing PG log usings CSV format
On Thu, 2012-03-22 at 09:32 +, Arvind Singh wrote:

Help needed in parsing PostgreSQL CSV Log

Hello friends, I am working on a section of an application which needs to parse CSV logs generated by the PostgreSQL server.

- The logs are stored in C:\Program Files\PostgreSQL\9.0\data\pg_log
- The server version is 9.0.4
- The application is developed in C Sharp

* The basic utility, after parsing the log, is to show its contents in a DataGridView.
* There are other filter options, such as viewing log contents for a particular range of time within a day.

**However the main problem is that the log format is not readable**

A sample log data line:

2012-03-21 11:59:20.640 IST,postgres,stock_apals,3276,localhost:1639,4f697540.ccc,10,idle,2012-03-21 11:59:20 IST,2/163,0,LOG,0,statement: SELECT version()exec_simple_query, .\src\backend\tcop\postgres.c:900,

As you can see, the columns in the log are comma separated, but individual values are not quote enclosed. For instance the 1st, 4th, 6th .. columns.

That's fairly standard. A good CSV parser only requires a field to be quoted if it contains commas or quotes. In the latter case the internal quotes should be doubled, i.e. the three fields in the following:

unquoted field,"contains commas, etc.","Fred said ""Cool!""."

should be handled correctly by a decent CSV parser.

Martin
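The quoting rule Martin describes (quote only fields containing commas or quotes, double any embedded quotes) can be checked with any standard CSV library; a minimal sketch using Python's csv module:

```python
import csv
import io

# The three-field example from above: only the second and third fields
# need quoting, and the embedded quotes around Cool! are doubled.
sample = 'unquoted field,"contains commas, etc.","Fred said ""Cool!""."\n'

row = next(csv.reader(io.StringIO(sample)))
print(row)
# ['unquoted field', 'contains commas, etc.', 'Fred said "Cool!".']
```

A C# CSV parser that follows the same convention (e.g. one implementing RFC 4180-style quoting) should split a PostgreSQL CSV log line the same way.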
Re: [GENERAL] copy in date string 00-00-00 00:00:00
On Wed, 2012-03-14 at 21:52 -0700, Mark Phillips wrote:

I am not familiar with sed, except for some trivial bits I nicked off the web. Enough to know it works, and to be dangerous. Nonetheless, using sed may be the way to go as there are two tables that contain a bit over 3,000,000 rows each.

You should also consider using awk/gawk with the field separator (the FS variable) set to match the one in your input (','). The advantage in this case is that it can be made to work on specific fields in the CSV file and not look at the rest, something like:

BEGIN { FS = ","; OFS = "," }              # set input and output field separators to comma
$5 == "00-00-00 00:00:00" { $5 = "" }      # empty field 5 if it matches
{ print }                                  # output all lines

Disclaimer: this is untested example code

Martin
Re: [GENERAL] How to find compiled-in default port number?
On Tue, 2012-03-13 at 11:16 +0200, Виктор Егоров wrote:

Greetings. Is there a way to find out the compiled-in port number?

Two ways, with Postgres running:
- Scan the server's ports with nmap.
- As root on the server, run lsof | less and look at the Postgres process(es).

Both are fast, so do both. If lsof shows the PostgreSQL port but nmap doesn't, then you'll know it's being blocked by a firewall.

Martin
Re: [GENERAL] Error installing postgresq91-python package
On Sat, 2012-03-10 at 14:08 -0800, PgSQL wrote:

If the OP can identify and download the relevant .rpm file

Have you checked the CentOS bugzilla to see if the yum failure has been reported? You should raise a bug report if it isn't there: if nobody reports a bug it's unlikely to get fixed.

I downloaded these packages:

When you download rpms manually, you use the rpm command to install the rpm packages, not yum.

wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/elfutils-0.137-3.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/elfutils-libs-0.137-3.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/elfutils-libelf-0.137-3.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/elfutils-libs-0.137-3.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/expat-1.95.8-8.3.el5_5.3.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/gmp-4.1.4-10.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/libxml2-2.6.26-2.1.12.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/libxml2-python-2.6.26-2.1.12.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/m2crypto-0.16-8.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/python-2.4.3-44.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/python-elementtree-1.2.6-5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/python-iniparse-0.2.3-4.el5.noarch.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/python-sqlite-1.1.7-1.2.1.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/python-urlgrabber-3.1.0-6.el5.noarch.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/readline-5.1-3.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/rpm-4.4.2.3-22.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/rpm-libs-4.4.2.3-22.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/rpm-python-4.4.2.3-22.el5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/sqlite-3.3.6-5.i386.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/yum-3.2.22-37.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/yum-NetworkManager-dispatcher-1.1.16-16.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/yum-fastestmirror-1.1.16-16.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos-5/5.7/os/i386/CentOS/yum-metadata-parser-1.1.2-3.el5.centos.i386.rpm

Why did you download these? Only five of them appear to have anything to do with Python and four to be connected with yum. Which, if any, of them contain mx? Hint: try using rpm -qip filename.rpm or rpm -qilp filename.rpm to see what is in them.

In my Fedora 15 installation the mx module is in the mx and mx-devel packages: running yum info '*mx*' told me that.

Martin
Re: [GENERAL] Matching on keyword or phrases within a field that is delimited with an or operator |
On Mon, 2012-03-12 at 15:57 -0700, Jim Ostler wrote:

I have a table that is around 20 GB, so I need to optimize as best as possible the matching with another table on keywords across multiple fields. I have around 10 fields that have keywords or phrases delimited with the "or" operator |. So it would be in the form of "a | b | and jack | cd". There are around 20 keywords or phrases per field, and these keywords could be any word.

How static is the list of keywords? Is there any significance in their order? (I'm guessing there isn't.) How many rows are there in the table, IOW how big is each row?

I wonder if putting the keywords in a separate table with a many:many relationship with your big table would help. This would retain your ability to add or remove keywords without affecting the schema. The selection rules used to combine keywords would also run fast.

I'm getting good performance from a mail archive where messages are selected using:
- a 1:M relationship with a subject table
- a M:M relationship with an address table
- range comparison with the date sent (a non-indexed field in the message)
- an ILIKE '%phrase%' comparison with the message text

The query is assembled using WHERE clauses for each of the four selection possibilities listed above before being prepared and issued. A WHERE clause is only included if the user has specified a value for it. There is an AND relationship between the clauses.

Currently there are around 130,000 messages in the database. Speed: it took 0.6 seconds to find 209 messages to or from an address in the last 5 years and containing the keyword 'f1a'. The same search without the date range specified ran in 0.86 seconds. It took 16.9 seconds to find the 12331 messages containing 'f1a' when no address or date range were specified. This is on a dual core, 3.2 GHz Athlon with 4GB of RAM using PG 9.1.
The other possibility is to use a star schema with each keyword being a dimension - IOW a traditional data warehouse set-up, but of course this does require a static list of keywords to be defined.

Martin
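The incremental query assembly described above (one WHERE clause per criterion, included only when the user supplied a value, ANDed together) can be sketched like this. Table and column names here are illustrative, not taken from the real mail-archive schema:

```python
# Build a parameterised query from only the criteria the user supplied.
# Placeholders use the %s style common to PostgreSQL client libraries.
def build_query(subject_id=None, address_id=None, sent_after=None, phrase=None):
    clauses, params = [], []
    if subject_id is not None:              # 1:M subject relationship
        clauses.append("subject_id = %s")
        params.append(subject_id)
    if address_id is not None:              # M:M address relationship
        clauses.append("address_id = %s")
        params.append(address_id)
    if sent_after is not None:              # date-sent range comparison
        clauses.append("date_sent >= %s")
        params.append(sent_after)
    if phrase is not None:                  # ILIKE body search
        clauses.append("body ILIKE %s")
        params.append(f"%{phrase}%")
    sql = "SELECT id FROM messages"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)  # AND between all clauses
    return sql, params

sql, params = build_query(sent_after="2007-01-01", phrase="f1a")
print(sql)
# SELECT id FROM messages WHERE date_sent >= %s AND body ILIKE %s
```

Passing the values as parameters rather than splicing them into the SQL string keeps the assembled query safe against injection.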
Re: [GENERAL] Error installing postgresq91-python package
On Sat, 2012-03-10 at 13:28 -0800, Adrian Klaver wrote:

On 03/10/2012 09:59 AM, PgSQL wrote: Thanks. You can see:

root@s1 [/000/yuyuyum/2012/2012]# yum install mx
Traceback (most recent call last):
  File "/usr/bin/yum", line 4, in ?
    import yum
  File "/usr/lib/python2.4/site-packages/yum/__init__.py", line 50, in ?
    import comps
  File "/usr/lib/python2.4/site-packages/yum/comps.py", line 23, in ?
    iterparse = cElementTree.iterparse
AttributeError: 'module' object has no attribute 'iterparse'
root@s1 [/000/yuyuyum/2012/2012]#

A yum problem with Python 2.4. A test on my Ubuntu machine confirmed that the iterparse function is not included in the cElementTree shipped with Python 2.4 even though it is supported from 2.2+. In my previous post I erroneously said you were on Fedora, not CentOS. Might want to file a bug report with the CentOS folks.

Is there another mechanism to install mx, other than yum, in the meantime?

If the OP can identify and download the relevant .rpm file it should be possible to install mx with the rpm utility. The next time yum is used it will complain that the package database was updated outside of it, but that can be ignored: this happens every time I upgrade Opera, which is distributed for Fedora as a manually applied RPM.

Martin
Re: [GENERAL] Complex transactions without using plPgSQL Functions. It is possible?
On Wed, 2012-03-07 at 14:19 +0400, Sergey Konoplev wrote:

On Tue, Mar 6, 2012 at 11:30 PM, Andre Lopes lopes80an...@gmail.com wrote: This is the plPgSQL code that I need to write in Python. It is possible to do this without using PlPgSQL?

Have you looked at pyodbc? ODBC will usually accept statements allowing you to turn autocommit off and to use connection.commit() to group a set of statements into a transaction. Note that there's a documentation comment saying that autocommit settings are not passed to the driver. However, as the documentation is still talking about bytea fields it may be out of date, so I'd suggest running a test. Try writing a Python test program that turns autocommit off and does a couple of inserts followed by a rollback. File a bug if the inserted data is in the table after a successful run.

Martin
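The rollback test Martin suggests follows this shape. The sketch below uses the standard-library sqlite3 module so it runs self-contained; with pyodbc the pattern is the same (pyodbc.connect(..., autocommit=False), then conn.rollback() or conn.commit()):

```python
import sqlite3

# Stand-in for a pyodbc connection with autocommit off.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (v INTEGER)")
con.commit()

# Two inserts inside one transaction, then roll it back.
con.execute("INSERT INTO t VALUES (1)")
con.execute("INSERT INTO t VALUES (2)")
con.rollback()  # discard both inserts

count = con.execute("SELECT count(*) FROM t").fetchone()[0]
print(count)  # 0 -- if the rows survived the rollback, autocommit was on
```

If the equivalent pyodbc program leaves the rows in the table after a successful run, autocommit was not actually disabled and a bug report is warranted.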
[GENERAL] Fixing the loss of 'template1'
In the course of migrating from (I think) Postgres 8.4 under Fedora 12 to Postgres 9.1 under Fedora 16, I managed to lose 'template1' and associated data. The immediate effect is that, although my schema and data seem to be intact and are working correctly, pg_dumpall refuses to run because it can't find 'template1'.

The sequence was something like this:

1) The first step was to bring my backups fully up to date by running pg_dumpall with output to a USB hard drive and running the incremental backup that's part of my application (a mail archive that backs up into a set of mbox files).
2) Installed Fedora 16, including the Postgres 9.1 package, and fully updated all packages.
3) Initialised the Postgres user and basic database structures, started the server.
4) Attempted to use pg_restore to recover my database. This threw lots of errors and was obviously stupid, so I killed it and reloaded the backup with psql. This ran OK except that some data was incorrectly restored to my mail messages table because tabs in some messages confused the COPY command.
5) Started again: I dropped my schema and recreated it before letting my application restore the database from its mbox files.
6) The next scheduled backup using pg_dumpall failed immediately because it couldn't find 'template1'.

Observations:
a) Could the abrupt termination of pg_restore have removed 'template1' from the database structure?
b) I've seen the confusion between tabs in data and the tabs used as delimiters by pg_dump cause this problem in the past. It was fixed then, but seems to have crept back in.

Question: Is it possible to reinstate 'template1' and all its works without reinitialising the database from scratch? It was suggested to me that running initdb might fix it while leaving my data in place, though my source was uncertain about its success. Would this work, or is there a better way to restore 'template1' in a populated database system?

Martin
Re: [GENERAL] Fixing the loss of 'template1'
On Wed, 2012-03-07 at 09:38 -0500, Gary Chambers wrote:

Martin,

6) The next scheduled backup using pg_dumpall failed immediately because it couldn't find 'template1'.

The template1 database is the default database to which pg_dumpall attempts to connect. If you use the -l or --database option, you can change that and pg_dumpall will resume functioning as you expect.

I've just logged in under postgres and run '\l' - and the databases postgres, template0 and template1 are still there:

postgres=# \l
                               List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | postgres=CTc/postgres+
           |          |          |             |             | =c/postgres

So, is this a privilege issue? I don't understand the content of that somewhat cryptic 'privilege' column. Is it set how you'd expect?

Marti:

I got this output:

postgres=# SELECT datname FROM pg_database WHERE datistemplate;
  datname
-----------
 template0
 template1
(2 rows)

so it doesn't look like it's been renamed.

Martin