Re: [GENERAL] Performance PLV8 vs PLPGSQL
On 12/29/2016 10:35 AM, Pavel Stehule wrote:
> 2016-12-29 10:03 GMT+01:00 Tim Uckun <timuc...@gmail.com>:
>> I think it's awesome that postgres allows you to code in different
>> languages like this. It really is a unique development environment
>> and one that is overlooked as a development platform. It would be
>> nice if more languages were delivered in the default package,
>> especially Lua, V8 and mruby.
>
> It is about dependencies and maintenance. There are not too many
> people who have good experience with embedding Lua, V8 and others in
> C. Anyone who can do some work is welcome. Living outside the main
> package has disadvantages - only enthusiasts know about it - but some
> advantages too: you are not tied to the PostgreSQL development cycle,
> and development can be faster.

I'll add my 2 cents.

Postgresql, and SQL in general, are about integrity and coherency. Checking coherency is much easier with strict data types. PL/pgSQL gives you that; JS is far, far away from that.

Postgresql is a very flexible database and you can stretch it to do "MEAN like"[1] stuff, but that's going to increase your "impedance mismatch". If you think there is some space for JS in your application stack, it's nearer to the client than to the DB. Or possibly you need to do "MEAN like" stuff but you don't want to install another "database".

As others said, using stored procedures is a two-edged sword. It can decouple the DB schema from the application, or it can increase the coupling. Choosing JS for performance in the stored-procedure realm is going to encourage coupling, make scalability harder, and become a mess when you'll need to refactor.

[1] https://en.wikipedia.org/wiki/MEAN_(software_bundle)

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
http://www.borgonovo.net

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] CRM where pg is a first class citizen?
On 12/13/2016 11:45 PM, Rich Shepard wrote:
> On Tue, 13 Dec 2016, Adrian Klaver wrote:
>> This killed the community (Open Source) edition going forward:
>> https://community.sugarcrm.com/thread/18434
>
> I'd like to comment regarding this paragraph from the above-referenced
> blog post:
>
> "In the course of the past five years, we have surveyed tens of
> thousands of Sugar Community Edition users and found that we see two
> types of users of Sugar Community Edition: 1) developers that wish to
> build on an open source CRM platform, and 2) users, generally first
> time CRM users, that are looking for a free/inexpensive CRM solution.
> We don't believe that the current Sugar Community Edition serves both
> audiences effectively. We envision an open source solution targeted
> exclusively for developers. And, we also envision a simpler way for
> first-time CRM users to find and use CRM."
>
> This is an interesting perspective, but not surprising for a large
> for-profit corporation like SugarCRM. I'm an environmental consultant
> sole practitioner and have been looking for years for a
> postgres-supporting CRM that I could use. There is none. Every business
> is different and has different needs. This is why a generic CRM like
> Sugar that tries to fit every business regardless of type or size
> forces its customers to fit into their generic model rather than
> supporting a developer _and_ end-user framework that can be customized
> for each business's specific needs and way of working.

This reminds me of Drupal and the companies driving its development again... Drupal was interesting because it was both a packaged product and a framework.

Most SMEs can't afford customization of ERP and accounting programs (invoice formatting aside). SalesForce is not offering custom products and it is still pretty successful.

While at least here in Italy I think most accounting programs are a trap, I've realized that most of the time SMEs should learn from the procedures proposed by CRM/ERP/accounting programs and adapt, rather than customize. Processes are generally not scientifically planned; they are built up as they go. A program that has been built to serve many users over the years may not be optimal, but it at least tends to be more rational.

Still, I'm not looking for something perfect, but for something simple with low maintenance. *Postgres in this case is one of the ingredients of low maintenance, or at least maintenance I'm familiar with.*

> That's why I'm developing my own using PyQt5, Python3, psycopg2, and
> postgres-9.6. I have the postgres schema that works for me and am
> willing to share it with others because of this thread. I had not
> planned on putting it on GitHub, but see no reason not to do so if
> there's interest by others. I'm starting to learn PyQt5 and Python3
> after a decade of wxPython use with Python2 and am just about ready to
> start creating the UI.

Unfortunately I don't want to depend on something I'd have to put development resources into, and I need something that works reasonably quickly. But I admit that, considering the few requirements I have, I spent a couple of seconds considering the idea of writing one myself.

Nothing bad could come of publishing your code on GitHub, and if not to contribute, I'll surely take a look to learn something.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
http://www.borgonovo.net
Re: [GENERAL] CRM where pg is a first class citizen?
On 12/13/2016 10:49 PM, Adrian Klaver wrote:
> On 12/13/2016 12:36 PM, George Weaver wrote:
>> I've never used it but what about:
>> https://developer.sugarcrm.com/2012/08/03/like-postgresql-and-want-to-use-it-with-sugarcrm-check-out-a-new-community-project/
>
> This killed the community (Open Source) edition going forward:
> https://community.sugarcrm.com/thread/18434

There are several forks. It is not even clear whether the forks support pg, and that's not a good start.

Wikipedia lists:
https://www.dolibarr.org/ (ERP) seems a bit messy
http://epe.si/ (CRM) unclear workflow
http://www.tryton.org/ (ERP) very postgres oriented, from the same family as odoo, which supports postgres.

Still looking for some good advice.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
http://www.borgonovo.net
Re: [GENERAL] CRM where pg is a first class citizen?
On 12/13/2016 09:36 PM, George Weaver wrote:
> I've never used it but what about:
> https://developer.sugarcrm.com/2012/08/03/like-postgresql-and-want-to-use-it-with-sugarcrm-check-out-a-new-community-project/

That's one of the first things I've read, and it doesn't look like a good premise for making pg a first class citizen.

BTW Joshua D. Drake cited Drupal 7+.

Drupal was the reason I started using postgres back in the day, and it was one of the reasons I was bothering you all on this list more frequently a few years ago. I spent a pretty huge amount of time locally patching Drupal to make it work with postgresql and trying to push some of the patches upstream. I don't develop on Drupal anymore, but up to at least D7 Postgresql was still not a first class citizen. I've heard the DB abstraction layer in D8 is much better, but I don't have anything critical on Drupal anymore, and life is too short to fight to see your patches rejected by upstream because "supporting postgres is holding us back". Considering that most PHP web applications are not optimized for any DB - and I wonder what features a developer could even exploit to optimize for mysql - that's really a shame. I don't want to repeat the experience, especially with software I'm just going to use, not develop on.

Forgive me, for I have sinned: the last Drupal I installed was 7, I picked mysql, and I still feel awkward when I have to deal with it.

I'm using horde on postgres. Postgres is not a first class citizen in Horde either, but at least they were happy and quick to merge a couple of my patches.

Working with pg was fun: it was a very nice piece of software, it has a great community, and it got even better. I'm not a hardcore user anymore, but I still enjoy using it. There are many more services offering Postgres on the internet now (Amazon, Heroku, OpenStack...). The lack of proper support from "web applications" still drives me nuts. It comes as no surprise that Python and Java "web applications" tend to support postgres better.

Python would be absolutely welcome, but I don't have that much experience managing Java on the web.

https://www.odoo.com/ supports postgres, but it is an ERP and it is far more complicated than I would like.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
http://www.borgonovo.net
[GENERAL] CRM where pg is a first class citizen?
Hi,

I'm looking for an open source CRM, PHP or Python based, with a large community, where Postgresql is a first class citizen. I'd prefer ease of use over features.

Thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
http://www.borgonovo.net
Re: [GENERAL] "Too far out of the mainstream"
On Tue, 4 Sep 2012 19:14:28 -0700 Chris Travers wrote:
> So people are using PostgreSQL in roles that aren't very visible
> anyway, DBA's are usually coming to PostgreSQL from other RDBMS's,
> and few applications are really distributed for PostgreSQL.

I know a bunch of people working for huge sites that love Postgres but use MySQL. The main reason is that they build what Postgres is famous for at a higher level, in a more specialized way, with their own glue. It's easy to get visibility if you're on the internet and you're huge. But not everyone can "rebuild" e.g. transactions at a higher level, or needs such specialized solutions.

On the other hand, for historical reasons MySQL and PHP have nearly monopolized the hosting space, and for many web sites it's hard to appreciate the difference between Postgres and MySQL (unless your DB crashes and burns). That's what most people perceive as "the mainstream" if you don't have a big marketing dept lying.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
[GENERAL] fsync default setting and version
Where can I check in which version the default setting for fsync was changed?

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] how to drop function?
On Wed, 16 Nov 2011 09:17:45 +0100 Thomas Kellerer wrote:
> Ivan Sergio Borgonovo, 16.11.2011 01:01:
>> test=# begin;
>> create or replace function process_table (
>>     action TEXT, v_table_name varchar(100)
>> ) RETURNS BOOLEAN
>> AS $$
>> DECLARE
>>
>> BEGIN
>>     return true;
>> END;
>> $$ LANGUAGE plpgsql;
>>
>> drop function process_table (
>>     action TEXT, v_table_name varchar(100)
>> );
>> commit;
>> BEGIN
>> CREATE FUNCTION
>> DROP FUNCTION
>> COMMIT
>> test=#
>>
>> Repeat just the input parameters.
>
> You don't have to include the parameter names though
>
> drop function process_table (TEXT, varchar(100));
>
> is just as good and a bit less typing ;)

In psql/pgAdmin you have tab completion. It will complete without the parameter names. If you're writing more durable code you can generally just cut&paste the creation code. I admit I haven't spent enough time to see if I can have tab completion inside my IDE/editor.

When you're refactoring a function, most probably you'll have to refactor the drop code too. I tend to refactor the number/type of the parameters much more frequently than their names, so skipping the names is anticipating work that I'll seldom take advantage of. I admit I drop functions much more frequently in psql than in my IDE/editor, but I generally have the creation code handy anyway.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] how to drop function?
On Tue, 15 Nov 2011 18:48:00 -0700 "J.V." wrote:
> the drop function works when running from a pgAdmin III Sql window
>
> but when I try to do it from the command line and script it:
> psql -h $PGHOST -p $PGPORT -d $PGDATABASE -U $PGUSER -c
> "*drop function *"
>
> the above fails.
> It does however work with functions with no params or a single
> param. It seems to get hung up on the comma and the extra set of
> parentheses

It would be nice to know how it fails, and if you posted exactly what you wrote to make it fail and what you wrote to make it succeed. What does "get hung up on the comma and extra set of parentheses" mean?

ivan@dawn:~$ psql -h lan test -c 'drop function process_table (action TEXT, v_table_name varchar(100));'
DROP FUNCTION
ivan@dawn:~$

The psql command is all on the same line.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] how to drop function?
On Tue, 15 Nov 2011 16:38:20 -0700 "J.V." wrote:
> How do I drop a function that was created like so:
>
> create or replace function process_table (action TEXT,
>     v_table_name varchar(100)) RETURNS BOOLEAN
> AS $$
> DECLARE
>
> BEGIN
>    ...
> END;
> $$ LANGUAGE plpgsql;
>
> ---
> I have tried various ways, but it always fails.
>
> J.V.

test=# begin;
create or replace function process_table (
    action TEXT, v_table_name varchar(100)
) RETURNS BOOLEAN
AS $$
DECLARE

BEGIN
    return true;
END;
$$ LANGUAGE plpgsql;

drop function process_table (
    action TEXT, v_table_name varchar(100)
);
commit;
BEGIN
CREATE FUNCTION
DROP FUNCTION
COMMIT
test=#

Repeat just the input parameters.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] function doesn't see change in search_path
On Mon, 7 Nov 2011 19:07:29 +0100 Pavel Stehule wrote:
> 2011/11/7 Ivan Sergio Borgonovo:
>> On Mon, 7 Nov 2011 17:55:11 +0100 Pavel Stehule wrote:
>>
>>> Hello
>>>
>>> this is a known bug/feature based on caching plans
>>
>> What puzzled me is that I'm operating in a similar way on a
>> different system and I'm not experiencing the same problem.
>>
>> Do different users have different caches?
>
> depends on usage - the cache is per session

OK. It is clear it is "per session". As far as I know, users can't be changed inside the same session. What are you referring to with "depends on usage"? Is there anything else that can influence cached plans?

Right now I just need a workaround, and calling the function from different sessions seems cleaner than writing a function for each schema, especially since I can use psql \connect.

It seems that cache management happens completely behind the scenes and there is no way to control it other than playing tricks such as:

sql := 'select * from ' || sometable || ...
execute sql;

I didn't find anything on the cache other than what's written here:
http://developer.postgresql.org/pgdocs/postgres/plpgsql-implementation.html

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
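For the record, here is a sketch of that trick applied to the test_insert() function from my original post: with EXECUTE the statement is planned on every call, so the table reference is resolved against the search_path in effect at call time rather than baked into a cached plan.

```sql
create or replace function test_insert() returns void as $$
begin
    raise notice 'path %', current_schemas(true);
    -- EXECUTE builds a fresh plan each time, so "a" is looked up
    -- in the current search_path instead of the one cached on the
    -- first call in this session.
    execute 'insert into a values(''a'')';
end;
$$ language plpgsql volatile;
```

The downside is, of course, the replanning cost on every call.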
[GENERAL] function doesn't see change in search_path
I have a behaviour similar to this:
http://archives.postgresql.org/pgsql-bugs/2007-09/msg00017.php

create language plpgsql;
create schema test1;
create schema test2;
create table test1.a(a varchar(3) unique);
create table test2.a(a varchar(3) unique);

create or replace function test_insert() returns void as $$
begin
    raise notice 'path %', current_schemas(true);
    insert into a values('a');
end;
$$ language plpgsql volatile;

set search_path to 'test1', 'public';
select * from test_insert();
NOTICE: path {pg_catalog,test1,public}
 test_insert
-------------

(1 row)

set search_path to 'test2', 'public';
select * from test_insert();
NOTICE: path {pg_catalog,test2,public}
ERROR: duplicate key value violates unique constraint "a_a_key"
CONTEXT: SQL statement "insert into a values('a')"
PL/pgSQL function "test_insert" line 3 at SQL statement

PostgreSQL 8.3.14

What's going on?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] access to lexems or access to parsed elements
On Thu, 25 Aug 2011 18:21:21 +0200 "Massa, Harald Armin" wrote:
> I want to access the single words in a text. Better yet: the
> relevant words (i.e. without stop words) in a text.
>
> to_tsvector or casting gets me the lexems as a tsvector:

I wrote this piece of C code more than a year ago. [1] It has been working in a production environment for quite a while. It only works with versions < 8.4, since it doesn't support *.

I'd be willing to maintain the module, or even expand its features and release it under any license that would please the postgresql community, if my effort would actually make it more easily available to other people - but just throwing it on the internet won't be enough. If someone thinks it is worth the effort to help me understand how that could happen, I'd be glad to learn.

[1] http://www.webthatworks.it/d1/content/postgresql-c-module-turn-tsvectors-tsquery-and-return-tsvectors-tables

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
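For the simple case of "relevant words without stop words", something can already be done in plain SQL without a C module - a sketch using ts_stat, which runs a query and returns one row per lexeme found in the tsvectors it produces (stop words are already gone because to_tsvector drops them):

```sql
-- One row per lexeme; ndoc/nentry columns also give frequencies.
SELECT word
FROM ts_stat($$ SELECT to_tsvector('english',
                    'The quick brown fox jumps over the lazy dog') $$);
```

Note that these are stemmed lexemes, not the original surface words, which may or may not be what one wants.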
Re: [GENERAL] Pg multi-tenanted hard? was: FREE hosting platforms with PostgreSQL, Java SDK, Tomcat, ecc.?
On Mon, 08 Aug 2011 05:38:02 +0800 Craig Ringer wrote:
> On 7/08/2011 11:43 PM, Ivan Sergio Borgonovo wrote:
>> On Sun, 07 Aug 2011 20:41:27 +0800 Craig Ringer wrote:
>>
>>> By the way, one of the reasons you're not finding much free
>>> hosting for PostgreSQL is that it takes a fair bit of work to
>>> run Pg multi-tenanted. Your additional requirement for Java and
>>> Tomcat
>>
>> Why should pg be harder than anything else?
>
> There are DBs - mostly $OMFGLOTS DBs - that offer much greater

OK. I forgot to specify "anything else in the same league", or, in short, open source.

Nice reading anyway, thanks.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
[GENERAL] Pg multi-tenanted hard? was: FREE hosting platforms with PostgreSQL, Java SDK, Tomcat, ecc.?
On Sun, 07 Aug 2011 20:41:27 +0800 Craig Ringer wrote:
> By the way, one of the reasons you're not finding much free
> hosting for PostgreSQL is that it takes a fair bit of work to run
> Pg multi-tenanted. Your additional requirement for Java and Tomcat

Why should pg be harder than anything else?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] Read MS-SQL data into Postgres via ODBC link?
On Tue, 5 Jul 2011 19:38:25 -0400 "Jonathan Brinkman" wrote:
> I was really hoping to keep the data-replication (between MSSQL
> --> PG) contained within a PG function.
>
> Instead I could write a small shell script or C service to do this
> using tsql (freetds). I have access to the MSSQL data via unixodbc
> and tdsodbc/freetds in my Ubuntu console.

I wrote some scripts, mainly in PHP and Perl, that use freetds/odbc to migrate a schema from MSSQL to pg and import the data via CSV. Postgresql can run Perl, Python and PHP "internally", and all those languages have ODBC drivers, so you could skip the CSV step. It's not as elegant as accessing the data directly over ODBC from pg, but it works. I can share some code.

> But I want to read from that ODBC stream directly in Postgresql,
> like a SELECT on Linked Servers in MSSQL-world or on a linked
> table in MSACCESS. That would give control over the ODBC
> interaction to the PG function rather than a non-DB entity.
>
> Has anyone tried ODBC-Link
> (http://www.cybertec.at/en/postgresql_products/odbc-link)? I
> couldn't get through the install due to an Ubuntu-related error.

Nice. I'd be interested in some feedback if you succeed in making it work. Does this project have any chance of being included in contrib? It seems alive and kicking.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
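To give an idea of the "skip the CSV step" approach, here is a rough sketch using PL/Python with pyodbc. It assumes plpythonu is installed and pyodbc is importable by the server-side interpreter; the DSN, table and column names are made up for illustration, and error handling and batching are left out.

```sql
-- Pulls rows from an MSSQL source over ODBC and inserts them locally.
-- DSN 'mssql_src' and table/column names are hypothetical.
CREATE OR REPLACE FUNCTION pull_customers() RETURNS integer AS $$
    import pyodbc
    conn = pyodbc.connect('DSN=mssql_src;UID=someuser;PWD=secret')
    cur = conn.cursor()
    cur.execute("SELECT id, name FROM dbo.customers")
    plan = plpy.prepare(
        "INSERT INTO customers(id, name) VALUES ($1, $2)",
        ["integer", "text"])
    n = 0
    for row in cur:
        plpy.execute(plan, [row.id, row.name])
        n += 1
    conn.close()
    return n
$$ LANGUAGE plpythonu;
```

It keeps the replication "contained within a PG function" as you wanted, at the cost of trusting an untrusted language on the server.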
[GENERAL] benchmark comparing different postgresql versions
Is there a place where I can find performance comparisons between different postgresql versions, hopefully related to new features/improvements?

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Re: [GENERAL] find the greatest, pick it up and group by
On Mon, 16 May 2011 20:05:45 -0400 "David Johnston" wrote:
> When asking for help on non-trivial SELECT queries it really helps
> to tell us the version of PG you are using, so that responders know
> what functionality you can and cannot use. In this case
> specifically, whether WINDOW (and maybe WITH) clauses are available.

Unfortunately I'm on 8.3, so no WINDOW. I didn't even think of using them, and I can't think of any way to use WINDOW/WITH here, but if there is a more readable solution that uses them I'd like to see it, even if I won't be able to use it. Of course I'm more interested to know if there is any cleaner solution for 8.3.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
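Just out of curiosity, here is a sketch of what I imagine a post-8.3 version could look like, against the ordiniitem table from my first message. It pairs each quantity with its label via two parallel unnest() calls (8.4+; parallel set-returning functions of equal length pair up row by row), then keeps the label of the largest quantity per group with DISTINCT ON. Only three of the q* columns are shown; the rest would follow the same pattern.

```sql
SELECT DISTINCT ON (idordine, grupposped, idart)
       idordine, grupposped, idart, label
FROM (
    SELECT idordine, grupposped, idart,
           -- the two arrays must stay in the same order so that
           -- quantity and label line up
           unnest(ARRAY[qevasa, qfuoricat, qinris]) AS qty,
           unnest(ARRAY['evaso', 'fuoricatalogo', 'in ristampa']) AS label
    FROM ordiniitem
) AS q
-- DISTINCT ON keeps the first row per group, i.e. the largest qty
ORDER BY idordine, grupposped, idart, qty DESC NULLS LAST;
```

Ties are broken arbitrarily, which matches the "pick one, no matter which" requirement.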
Re: [GENERAL] COPY complaining about unquoted carriage return found in data... in a quoted field
On Tue, 10 May 2011 15:59:07 +0200 Ivan Sergio Borgonovo wrote:

Sorry for the noise.

The csv was automatically generated. The code was right, but during generation there was some problem on the box generating it (php segfaulting) and there were some unclosed quotes on a much earlier line. Postgresql actually behaves as documented and expected... and the documentation is clear.

> On Tue, 10 May 2011 14:38:23 +0200 Pavel Stehule wrote:
>
>> Hello
>>
>> COPY doesn't like '\n' too.
>>
>> Replace '\n' by '\\n'
>
> mmm maybe you were misled by the "semi-transliterated" hexdump.
>
> There is no "slash" in the record; the actual input was the one
> reported in hex. The following line was just to help in reading the
> hexdump.
>
> 09 22 56 | 69 61 20 46 | 61 65 6E 74 | 69 6E 61 2C
> 20 35 33 0D | 0A 22 09 22 | 35 30 30 31 | 34 22 09
>
> As a quick reference:
> 0x09 = \t
> 0x0a = \n
> 0x0d = \r
> 0x20 = space
> 0x22 = "
>
> I thought that
>
> csv
> quote as '"'
>
> were there exactly to avoid escaping of delimiters (other than
> the quoting character itself) inside fields.
>
> From the docs:
> http://www.postgresql.org/docs/8.3/static/sql-copy.html
> «If the value contains the delimiter character, the QUOTE
> character, the NULL string, a carriage return, or line feed
> character, then the whole value is prefixed and suffixed by the
> QUOTE character, and any occurrence within the value of a QUOTE
> character or the ESCAPE character is preceded by the escape
> character.»
>
> So to my understanding, once I've "quoted" a field I have to take
> care of escaping just the quote and the escape characters, and the
> field may contain delimiter characters (field separator and record
> separator) without escaping.
>
> I was wondering if a) the documentation is wrong, b) I didn't write
> a correct \COPY command string, c) there is a bug, d) I can't
> correctly interpret the documentation, or e) I'm more stupid than
> usual.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
[GENERAL] find the greatest, pick it up and group by
I've a table like:

CREATE TABLE ordiniitem (
    idordine numeric(18,0) NOT NULL,
    grupposped smallint,
    idart numeric(18,0) NOT NULL,
    qevasa integer,
    qfuoricat integer,
    qinris integer,
    qnonpub integer,
    qann integer,
    qord integer,
    qpren integer,
    qrichpag integer,
    qinriass integer,
    qinesa integer
);

I'd like to group by idordine, grupposped, idart. For every row grouped that way, I'd like to pick the greatest of the q* columns and insert idordine, grupposped, idart, and the name of the greatest q* column into a new table.

I don't mind if more than one q* column is equal to greatest(q*). It should pick one, just one, no matter which among the ones equal to greatest(q*).

I think this way works, but it hurts my eyes. Any alternative approach?

SELECT ioi.idordine, ioi.grupposped, ioi.idart,
    -- ioi.quantita,
    case
        when ioi.qevasa = greatest(ioi.qevasa, ioi.qfuoricat,
            ioi.qinris, ioi.qnonpub, ioi.qann, ioi.qord, ioi.qpren,
            ioi.qrichpag, ioi.qinriass, ioi.qinesa)
            then 'evaso'
        when ioi.qfuoricat = greatest(ioi.qevasa, ioi.qfuoricat,
            ioi.qinris, ioi.qnonpub, ioi.qann, ioi.qord, ioi.qpren,
            ioi.qrichpag, ioi.qinriass, ioi.qinesa)
            then 'fuoricatalogo'
        when ioi.qinris = greatest(ioi.qevasa, ioi.qfuoricat,
            ioi.qinris, ioi.qnonpub, ioi.qann, ioi.qord, ioi.qpren,
            ioi.qrichpag, ioi.qinriass, ioi.qinesa)
            then 'in ristampa'
        -- ...
    end
FROM ordiniitem ioi
group by ioi.idordine, ioi.grupposped, ioi.idart, ioi.qevasa,
    ioi.qfuoricat, ioi.qinris, ioi.qnonpub, ioi.qann, ioi.qord,
    ioi.qpren, ioi.qrichpag, ioi.qinriass, ioi.qinesa;

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
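One way to make it a bit less painful even on 8.3 might be to compute the greatest value once in a derived table, so each CASE branch only compares a column against it. Same three labels as above; the remaining q* columns would follow the same pattern.

```sql
SELECT idordine, grupposped, idart,
    case
        when qevasa    = gq then 'evaso'
        when qfuoricat = gq then 'fuoricatalogo'
        when qinris    = gq then 'in ristampa'
        -- ... remaining q* columns, same pattern
    end AS label
FROM (
    SELECT idordine, grupposped, idart,
           qevasa, qfuoricat, qinris, qnonpub, qann, qord,
           qpren, qrichpag, qinriass, qinesa,
           -- computed once per row instead of once per CASE branch
           greatest(qevasa, qfuoricat, qinris, qnonpub, qann, qord,
                    qpren, qrichpag, qinriass, qinesa) AS gq
    FROM ordiniitem
) AS ioi
GROUP BY idordine, grupposped, idart, qevasa, qfuoricat, qinris,
         qnonpub, qann, qord, qpren, qrichpag, qinriass, qinesa, gq;
```

The semantics are the same; it just avoids repeating the ten-argument greatest() in every branch.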
Re: [GENERAL] COPY complaining about unquoted carriage return found in data... in a quoted field
On Tue, 10 May 2011 14:38:23 +0200 Pavel Stehule wrote:
> Hello
>
> COPY doesn't like '\n' too.
>
> Replace '\n' by '\\n'

mmm maybe you were misled by the "semi-transliterated" hexdump.

There is no "slash" in the record; the actual input was the one reported in hex. The following line was just to help in reading the hexdump.

09 22 56 | 69 61 20 46 | 61 65 6E 74 | 69 6E 61 2C
20 35 33 0D | 0A 22 09 22 | 35 30 30 31 | 34 22 09

As a quick reference:
0x09 = \t
0x0a = \n
0x0d = \r
0x20 = space
0x22 = "

I thought that

csv
quote as '"'

were there exactly to avoid escaping of delimiters (other than the quoting character itself) inside fields.

From the docs:
http://www.postgresql.org/docs/8.3/static/sql-copy.html
«If the value contains the delimiter character, the QUOTE character, the NULL string, a carriage return, or line feed character, then the whole value is prefixed and suffixed by the QUOTE character, and any occurrence within the value of a QUOTE character or the ESCAPE character is preceded by the escape character.»

So to my understanding, once I've "quoted" a field I have to take care of escaping just the quote and the escape characters, and the field may contain delimiter characters (field separator and record separator) without escaping.

I was wondering if a) the documentation is wrong, b) I didn't write a correct \COPY command string, c) there is a bug, d) I can't correctly interpret the documentation, or e) I'm more stupid than usual.

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
[GENERAL] COPY complaining about unquoted carriage return found in data... in a quoted field
I'm on pg 8.3.14. I'm trying to import a csv with:

\copy anagraficaclienti from 'myfile.csv'
  delimiter as E'	'  -- this is a tab \t
  null as 'NULL'
  csv header
  quote as E'"'
  escape as E'\\'

What I get is:

ERROR: unquoted carriage return found in data
HINT: Use quoted CSV field to represent carriage return.
CONTEXT: COPY anagraficaclienti, line 48656

The record actually contains \r\n in a field, but to my understanding that field is quoted. I think what may be wrong is my understanding of what postgres considers a quoted field.

The content in hex:

09 22 56 | 69 61 20 46 | 61 65 6E 74 | 69 6E 61 2C
20 35 33 0D | 0A 22 09 22 | 35 30 30 31 | 34 22 09

\t"Via Faentina, 53\r\n"\t

What am I missing?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
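As a sanity check that the record itself is legal CSV, here is the same byte sequence fed to Python's csv module instead of COPY (the first field value "first" is made up, the rest mirrors the hexdump): a quoted field containing an embedded CR+LF should come back as a single field, not two records.

```python
import csv
import io

# A tab-delimited record whose middle field is quoted and contains an
# embedded \r\n, as in the hexdump above.
data = 'first\t"Via Faentina, 53\r\n"\t"50014"\r\n'

reader = csv.reader(io.StringIO(data, newline=''),
                    delimiter='\t', quotechar='"')
rows = list(reader)

# The embedded CR+LF stays inside the quoted field: one row, three fields.
print(rows)
```

If COPY still complains, the quoting problem is elsewhere in the file, not in this record.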
Re: [GENERAL] Convert Simple Query into tsvector & tsquery format.
On Fri, 18 Mar 2011 12:30:50 +0530 Adarsh Sharma wrote:
> Dear all,
>
> I have a simple query mentioned below:
>
> select count(*) from page_content where (content like '%Militant%'
> OR content like '%jihad%' OR content like '%Mujahid%' OR
> content like '%fedayeen%' OR content like '%insurgent%' OR
> content like '%terrORist%' OR content like '%cadre%' OR
> content like '%civilians%' OR content like '%police%' OR
> content like '%defence%' OR content like '%cops%' OR
> content like '%crpf%' OR content like '%dsf%' OR
> content like '%ssb%') AND (content like '%kill%' OR
> content like '%injure%');

It's not clear where the problem is.

select to_tsquery('english', '(yellow | blue) & (red | black)');
                 to_tsquery
---------------------------------------------
 ( 'yellow' | 'blue' ) & ( 'red' | 'black' )

select to_tsvector('english', 'yellow red') @@
    to_tsquery('english', '(yellow | blue) & (red | black)');

The only thing I can see that could cause problems is that you may have previously "mangled" the words in the ilike query, while with full text search you'd leave that task to the ts engine, which will find the proper lexeme.

Could you be more specific about the problem you're encountering?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
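If what you're after is just a mechanical translation of the query above, something along these lines might be a starting point. It assumes content is English text; for decent performance you'd also want a GIN index on to_tsvector('english', content).

```sql
-- OR groups become |, the AND between the groups becomes &.
-- Case doesn't matter: to_tsquery normalizes terms to lexemes,
-- so 'terrORist' and 'terrorist' end up the same.
SELECT count(*)
FROM page_content
WHERE to_tsvector('english', content) @@ to_tsquery('english',
    '(militant | jihad | mujahid | fedayeen | insurgent | terrorist
      | cadre | civilians | police | defence | cops | crpf | dsf | ssb)
     & (kill | injure)');
```

Note the semantics are not identical to LIKE '%...%': the tsquery matches whole (stemmed) words, not arbitrary substrings.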
Re: [GENERAL] Set new owner on cloned database
On Wed, 8 Dec 2010 13:40:29 -0500 (EST) "James B. Byrne" wrote:
> I am testing a Rails deployment and wish to copy a database,
> assigning it a new owner. I have tried this:
>
> createdb --owner=hll_theheart_db_devl
>   --template=hll_th_deploytest_prod hll_theheart_devl
>
> While this indeed sets the database owner to hll_theheart_db_devl,
> everything else - schema, tables, whatever - remains owned by the
> original owner. Is there no way to change the owner everywhere in
> the cloned database using createdb? Or am I constrained to do a
> dump all and restore?

http://www.mail-archive.com/pgsql-hack...@postgresql.org/msg51048.html
http://www.mail-archive.com/pgsql-hack...@postgresql.org/msg51047.html

I'm not sure whether there has been any progress in newer postgres to make changing the owner easier; I'm not aware of any better, more current solution. Unfortunately I think the license of the above didn't help to make people willing to improve the code and make it more popular.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
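If all the objects in the clone belong to a single role, a sketch of what might work without a dump/restore (run while connected to the freshly cloned database; "original_owner" stands for whatever role owns the template's objects):

```sql
-- Database-level ownership (createdb --owner already did this part):
ALTER DATABASE hll_theheart_devl OWNER TO hll_theheart_db_devl;

-- Re-own every object in the current database that belongs to the
-- old role: schemas, tables, sequences, functions, ...
REASSIGN OWNED BY original_owner TO hll_theheart_db_devl;
```

REASSIGN OWNED only touches objects in the database you're connected to, so the template's objects are left alone. It needs to be run by a role that is a member of both roles involved (or a superuser).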
Re: [GENERAL] finding the other statement causing a sharelock
On Mon, 8 Nov 2010 14:22:16 -0700 Scott Marlowe wrote: > Don't know how much it helps here, but this page: > http://wiki.postgresql.org/wiki/Lock_Monitoring > is priceless when you're having issues midday with a lock that > won't go away. I was about to reinvent the wheel and write something similar. But I was already wondering: how am I supposed to "intercept" a lock caused by a long transaction that I know about and a process that runs at some unknown time? I have some strong suspects... and I'd like to exit early from a function if that process is running, but I'm not really sure how to add a semaphore... -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
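The semaphore Ivan is after could be approximated with advisory locks (present since 8.2). A sketch with an arbitrary lock key:

```sql
-- Try to take an application-level lock; returns false immediately if
-- another session already holds key 12345, so a function can bail out early.
SELECT pg_try_advisory_lock(12345);

-- ... do the protected work only if the call above returned true ...

-- Release the lock when done (it is also released at session end).
SELECT pg_advisory_unlock(12345);
```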
Re: [GENERAL] finding the other statement causing a sharelock
On Mon, 08 Nov 2010 15:45:12 -0500 Tom Lane wrote: > Ivan Sergio Borgonovo writes: > > I get > > DETAIL: Process 24749 waits for ShareLock on transaction > > 113443492; blocked by process 25199. Process 25199 waits for > > ShareLock on transaction 113442820; blocked by process 24749. > > > I would like to know both statements that caused the sharelock > > problem. > Recent versions of PG record both (or all) statements involved in a > deadlock in the postmaster log. What about not so recent 8.3.9? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] finding the other statement causing a sharelock
I get DETAIL: Process 24749 waits for ShareLock on transaction 113443492; blocked by process 25199. Process 25199 waits for ShareLock on transaction 113442820; blocked by process 24749. I would like to know both statements that caused the sharelock problem. This is a long-running transaction. I know one of the statements; I'd like to know the other. How? -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
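On 8.3 and later, the settings below make the server log the statements involved in lock waits and deadlocks; a configuration sketch (the values are illustrative, not from the thread):

```ini
# postgresql.conf
log_lock_waits = on      # log a message when a lock wait exceeds deadlock_timeout
deadlock_timeout = 1s    # how long to wait before running the deadlock check
```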
Re: [GENERAL] exceptionally large UPDATE
On Fri, 29 Oct 2010 10:21:14 -0400 Vick Khera wrote: > On Thu, Oct 28, 2010 at 1:06 PM, Ivan Sergio Borgonovo > wrote: > > What I'm planning to do is: > > max_connections = 5 > > shared_buffers = 240M > > work_mem = 90MB > > maintenance_work_mem = 1GB > > max_fsm_pages = 437616 > > max_fsm_relations = 1200 > > checkpoint_segments = 70 > > default_statistics_target = 30 > > #log_min_duration_statement = 1000 > default_statistics_target = 100 is the new "default" for newer > postgres, and with good reason... try that. > > if you boost your checkpoint_segments, also twiddle the > checkpoint_timeout (increase it) and checkpoint_completion_target > (something like 0.8 would be good, depending on how fast your disks > are) values to try to smooth out your I/O (ie, keep it from > bursting at checkpoint timeout). Is 5 connections really enough > for you? No. 5 is too few. OK... this is what I end up with:

max_connections = 100
shared_buffers = 240M
work_mem = 90MB
maintenance_work_mem = 1GB
max_fsm_pages = 437616
max_fsm_relations = 1200
default_statistics_target = 100
checkpoint_segments = 70
checkpoint_timeout = 10min
checkpoint_completion_target = 0.6 #(not very fast drives in raid5)
#log_min_duration_statement = 1000
random_page_cost = 3.0

I tested this on a RAID10 SATA, 8Gb RAM and 2x4cores Xeons:
- updating 227985 records over roughly 1.4M took 197744.374 ms
- recreating the gin index took 313962.162 ms
- commit took 7699.595 ms
- vacuum analyse 188261.481 ms

The total update took around 13min. I've just heard that a similar update on a slower box (RAID1 SAS, 4Gb, 2x2Cores Xeon) running MS SQL took over 30min. Considering that MUCH fewer pk/fk constraints and actions were defined on the MS SQL DB, things now look much better for postgres. Furthermore, postgresql full text search beats the MS SQL box even on the slowest box of all (RAID5 SATA, 4Gb, 2x1core HT Xeon, over 6 years old). 
I'll take note of performance on the slower box too as soon as I have more large updates; meanwhile I'm still looking at how to make this faster. -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] share lock when only one user connected?
On Fri, 29 Oct 2010 08:19:27 +0200 Alban Hertroys wrote: I have to amend the "one user connected" assumption. But I'm really sure there is no other process writing to catalog_items. There is a process that reads catalog_items and writes to another table.

create table catalog_items (
  itemid bigint primary key
);

create table catalog_related (
  itemid bigint references catalog_items (itemid) on delete cascade,
  itemid_related bigint references catalog_items (itemid) on delete cascade
);

To add some more details, the update is running inside a larger transaction that updates other tables. There are no "on update cascade" and no other triggers. I'm using 8.3.4 I admit I have a very naïve knowledge of locks and maybe I'm making assumptions that aren't valid. I've been able to find: http://www.postgresql.org/docs/7.2/static/locking-tables.html but no equivalent for 8.3 I assume the same is valid for 8.3, so since there is no explicit LOCK TABLE on catalog_items what's left are the indexes. I've been experiencing the same problem even after dropping the gin index on the FT1IDX column, but there are other btree indexes on that table. How can I get more information in the logs to know which statements were producing the lock? One for sure was the update. > On 28 Oct 2010, at 19:28, Ivan Sergio Borgonovo wrote: > > > I'm running this query when I'm the only user and this should be > > the only thing running. > > > > update catalog_items > ... > > from ( > >select a.id, a.codice, a.codicealt, > ... > >from > > import.Articoli a > > left join import.ArticoliPropTipo_1 ip on a.id=ip.idArt > > where a.action=8 > >) as s > >where s.id=catalog_items.ItemID > > ; > > > > And I get > > > > DETAIL: Process 7188 waits for ShareLock on transaction > > 110562621; blocked by process 7244. Process 7244 waits for > > ShareLock on transaction 110562544; blocked by process 7188. 
> > > > On that table 2 triggers are defined: > > You left out the actual trigger definitions ;) create trigger FT1IDX_catalog_items_update_trigger after insert or update on catalog_items for each row execute procedure FT1IDX_catalog_items_update(); create trigger FT1IDX_catalog_brands_update_trigger after update or delete on catalog_brands for each row execute procedure FT1IDX_catalog_brands_update(); > Could it be possible that you accidentally call the wrong trigger > on update of catalog_items? > Another possibility is that the trigger on catalog_items has a > side-effect of updating catalog_brands - which in turn updates > catalog_items again, causing your situation. Is this the case? My intention was that when catalog_brands.brandid get changed catalog_items.FT1IDX get updated > > create or replace function FT1IDX_catalog_items_update() returns > > trigger as > > $$ > > declare > > _Name varchar(64); > > begin > > select into _Name Name from catalog_brands > > where BrandID=new.BrandID; > > new.FT1IDX:= > > GetFTIDX('pg_catalog.english', new.Code, new.CodeAlt, > > new.ISBN, new.Name, new.Authors, _Name); > > return new; > > end; > > $$ language plpgsql volatile; > > > > create or replace function FT1IDX_catalog_brands_update() returns > > trigger as > > $$ > > begin > > if(TG_OP='DELETE') then > >update catalog_items set > > FT1IDX= > > GetFTIDX('pg_catalog.english', Code, CodeAlt, ISBN, Name, > > Authors, '') where BrandID=old.BrandID; > >return old; > > else > >if(coalesce(new.Name,'')<>coalesce(old.Name,'')) then > > update catalog_items set > >FT1IDX= > > GetFTIDX('pg_catalog.english', Code, CodeAlt, ISBN, > > Name, Authors, new.Name) where BrandID=new.BrandID; end if; > > end if; > > return new; > > end; > > $$ language plpgsql volatile; > > > > What could it be? how can I fix it? 
thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] share lock when only one user connected?
On Thu, 28 Oct 2010 13:57:18 -0400 Tom Lane wrote: > Ivan Sergio Borgonovo writes: > > I'm running this query when I'm the only user and this should be > > the only thing running. > > > And I get > > > DETAIL: Process 7188 waits for ShareLock on transaction > > 110562621; blocked by process 7244. Process 7244 waits for > > ShareLock on transaction 110562544; blocked by process 7188. > > It's pretty hard to believe that you haven't got two processes ... > maybe there's something using dblink down in there somewhere? > > Also, if you're using a recent PG release, the postmaster log > should contain additional information about the deadlock, like the > specific queries involved on both sides. I can't think of any other process running concurrently... but yeah I'm surprised as well, and no there is no dblink running. I may be wrong in assuming there is no other process running and I'll check further, but at least I'm pretty sure there is no query writing to that table other than the update. These deadlocks start to happen when the number of involved rows is above 80K, and the chance of them happening grows as the number of rows involved grows (not a big surprise). The chance decreases if I drop the triggers. This may just come down to the probability that two queries clash, since leaving the triggers in place increases the execution time. There is nothing else in the log other than: 2010-10-26 18:12:09 CEST ERROR: deadlock detected 2010-10-26 18:12:09 CEST DETAIL: Process 9729 waits for ShareLock on transaction 110146905; blocked by process 11082. Process 11082 waits for ShareLock on transaction 110145470; blocked by process 9729. Followed by the update statement... How am I going to find which statements were locking each other? thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
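While a lock is actually being held, the two sides can also be matched up by joining pg_locks against pg_stat_activity; a sketch using the 8.3-era column names (procpid, current_query):

```sql
-- Pair each ungranted transaction-id lock with the session holding it,
-- showing the statement on both sides of the wait.
SELECT w.pid AS blocked_pid,  a1.current_query AS blocked_query,
       h.pid AS blocking_pid, a2.current_query AS blocking_query
FROM pg_locks w
JOIN pg_locks h
  ON h.transactionid = w.transactionid
 AND h.granted AND NOT w.granted
JOIN pg_stat_activity a1 ON a1.procpid = w.pid
JOIN pg_stat_activity a2 ON a2.procpid = h.pid;
```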
[GENERAL] share lock when only one user connected?
I'm running this query when I'm the only user and this should be the only thing running. update catalog_items set Code=s.codice, CodeAlt=s.codicealt, BrandID=s.marca, Name=initcap(s.nome), Description=s.desc_l1, qty=coalesce(s.quantita, 0), ListPrice=coalesce(s.prezzoListino, 0) , valIva=s.valIva, ivaInc=s.ivaInc, StatusID=coalesce(s.stato, 1) , Weight=s.peso, dataInserimento=s.dataInserimento, daPub=s.daPub, BestSeller=s.bestSeller, posInMag=s.posInMag , ISBN=s.ISBN, dataPub=coalesce(s.dataPubblicazione, datapub), updated=now() from ( select a.id, a.codice, a.codicealt, a.marca, a.nome, a.desc_l1, a.quantita, a.prezzoListino, a.valIva, a.ivaInc, a.stato, a.peso, a.dataInserimento, a.daPub, a.bestSeller, a.posInMag, a.ISBN, ip.dataPubblicazione from import.Articoli a left join import.ArticoliPropTipo_1 ip on a.id=ip.idArt where a.action=8 ) as s where s.id=catalog_items.ItemID ; And I get DETAIL: Process 7188 waits for ShareLock on transaction 110562621; blocked by process 7244. Process 7244 waits for ShareLock on transaction 110562544; blocked by process 7188. On that table 2 triggers are defined: create or replace function FT1IDX_catalog_items_update() returns trigger as $$ declare _Name varchar(64); begin select into _Name Name from catalog_brands where BrandID=new.BrandID; new.FT1IDX:= GetFTIDX('pg_catalog.english', new.Code, new.CodeAlt, new.ISBN, new.Name, new.Authors, _Name); return new; end; $$ language plpgsql volatile; create or replace function FT1IDX_catalog_brands_update() returns trigger as $$ begin if(TG_OP='DELETE') then update catalog_items set FT1IDX= GetFTIDX('pg_catalog.english', Code, CodeAlt, ISBN, Name, Authors, '') where BrandID=old.BrandID; return old; else if(coalesce(new.Name,'')<>coalesce(old.Name,'')) then update catalog_items set FT1IDX= GetFTIDX('pg_catalog.english', Code, CodeAlt, ISBN, Name, Authors, new.Name) where BrandID=new.BrandID; end if; end if; return new; end; $$ language plpgsql volatile; What could it be? 
how can I fix it? -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] exceptionally large UPDATE
On Thu, 28 Oct 2010 08:58:34 -0400 Vick Khera wrote: > On Wed, Oct 27, 2010 at 10:26 PM, Ivan Sergio Borgonovo > wrote: > > I'm increasing maintenance_work_mem to 180MB just before > > recreating the gin index. Should it be more? > > > > You can do this on a per-connection basis; no need to alter the > config file. At the psql prompt (or via your script) just execute > the query > > SET maintenance_work_mem="180MB" > If you've got the RAM, just use more of it. I'd suspect your server > has plenty of it, so use it! When I reindex, I often give it 1 or > 2 GB. If you can fit the whole table into that much space, you're > going to go really really fast. > Also, if you are going to update that many rows you may want to > increase your checkpoint_segments. Increasing that helps a *lot* > when you're loading big data, so I would expect updating big data > may also be helped. I suppose it depends on how wide your rows > are. 1.5 Million rows is really not all that big unless you have > lots and lots of text columns. Actually I'm pretty happy with the performance of the DB under normal circumstances. I never investigated too much whether I could squeeze it more. But when I have to deal with such "huge" updates the performance is painful. You made me start to wonder if I could improve performance even under normal load. But right now I have to take care of this huge (well, "huge" only relative to the performance I'm getting right now) update. The things I've touched compared to the stock configuration were:

max_connections = 100
shared_buffers = 240M
work_mem = 42MB
maintenance_work_mem = 180MB #(generally it is 40MB)
# these were touched as of autovacuum suggestion
max_fsm_pages = 437616
max_fsm_relations = 1200
checkpoint_segments = 35
random_page_cost = 3.0
default_statistics_target = 30
log_min_duration_statement = 1000

The box is running apache; the total average length of text for each row should be around 1Kb on the largest table. 
What I'm planning to do is:

max_connections = 5
shared_buffers = 240M
work_mem = 90MB
maintenance_work_mem = 1GB
max_fsm_pages = 437616
max_fsm_relations = 1200
checkpoint_segments = 70
default_statistics_target = 30
#log_min_duration_statement = 1000

Any improvement? -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] exceptionally large UPDATE
I have a large UPDATE to make to a DB. The largest UPDATE involves a table that has triggers and a gin index on a computed tsvector. The table is 1.5M records with about 15 fields of different types. I've roughly 2.5-3Gb of ram dedicated to postgres. The UPDATE queries are simple; few of them use joins, and they mainly consist of updating records from temporary tables that have a structure very similar to the target. These updates are rare, so I can afford to tune postgresql just for this large update and then return to a more balanced configuration. I can even afford to be the only user of the DB, so responsiveness of the application using the DB is not an issue. Duration of the update is. Anything I can tune in postgresql.conf to speed up the UPDATE? I'm increasing maintenance_work_mem to 180MB just before recreating the gin index. Should it be more? The update should be monolithic and it is inside a single transaction. Since I can afford to be the only user of the DB for a while, is there anything I can tune to take advantage of it? What else could I change to speed up the update? The triggers recreate the tsvector. One of the components of the tsvector is taken from a join table. I'll surely drop the gin index and recreate it when everything is over. I'm not sure it's a good idea to drop the triggers, since I'll have to update the tsvector later and I suspect this will cause twice the disk IO. thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] huge difference in performance between MS SQL and pg 8.3 on UPDATE with full text search
On Tue, 28 Sep 2010 20:19:10 +0200 Alban Hertroys wrote: > On 28 Sep 2010, at 12:49, Ivan Sergio Borgonovo wrote: > > The hardware on the 2 machines is a bit different. > > MS SQL 2005 runs on a box that is 2 years old, 2 SATA drives on > > RAID 1 hw, 2 Xeon dual core (I can't check details right now) > > PG runs on a box that has more than 5 years, 3 SCSI drives on > > RAID 5 hw, 2 Xeon single core HT (Intel(R) Xeon(TM) CPU 3.20GHz, > > cpu fam 14, model 4) > > In both tables I've to update price. > > VACUUM FULL was performed just before updating the prices. > VACUUM FULL? Was that really necessary? You did REINDEX after > that, didn't you? If not, your indexes became bloated. If the > table wasn't empty before you probably meant to do a VACUUM > ANALYSE, but if it was, just ANALYSE would have been sufficient. ... > Did you ANALYSE between loading the data and updating? I thought VACUUM FULL was more "magical" and implied a REINDEX. Am I wrong? The index that should be reindexed is the one on the pk, a simple btree; that's not going to be as slow as rebuilding a gin... still, I'd really thought that VACUUM FULL implied a lot of things (including ANALYZE as well). > Also, executing the trigger on each copied line is likely to be a > little slow, due to the overhead of calling a stored procedure > (especially if it's plpgsql). It's probably quite a bit faster to > disable the trigger and create the gin-index after loading the > data. I'll try to drop the trigger. I'm not expecting it to be the biggest factor, but still, adding something here and something there may add up to the huge difference between the 2. Anyway MS SQL seems to overcome all these nuisances auto-magically. > An EXPLAIN ANALYSE of that statement would tell what it's doing > and what's taking so long. A simple EXPLAIN would probably be > sufficient to see what query plan it thinks it needs though. I'll post the EXPLAIN. Before I run EXPLAIN ANALYZE I have to take some precautions so the DB doesn't explode. 
> Did you tune that database? Several options (work_mem for example) > could significantly improve your performance if you can set them > higher (or reduce it if you set them too high). You can do that > per session too. pg is generally faster than the other MS SQL box on what's normally done on a daily basis. Just large updates to the product page seems to be a pain. Other INSERT/UPDATE operations are seldom performed, they involve smaller tables with no gin index. > > If HW can justify such huge difference I'll devote my time to > > other problems. > Partially, yes, but not that much I think. That's my worry... but still in many circumstances pg performs better than the MS SQL box... yeah... on pretty different workload... but while on other workloads pg is a bit faster (20% to 100% faster) even if it is on an older box, on this one is very slow. > > I'd say that a potential culprit could be the gin index. No > > matter if the tsvector is updated or not, if the row is changed > > I think the index is going to be updated anyway. > gin indexes require relatively much RAM. If you didn't assign much > in your settings then it's quite possible that the database can't > keep the index in memory or that things have to spill to disk. > Leave enough room for the OS's disk cache though, Postgres > benefits from that as well. > Is there any particular reason you went with a gin index and not a > gist one? Gin can be faster, but consumes (much) more memory, but > gist is also quite good with text searches and doesn't require > quite as much memory. gin index is doing a very good work and well full text searches are the typical workload of that box and the one that is more important to be fast. I'd say if gin was occupying so much memory performances wouldn't be so good on a daily basis. I'd post excerpt of my postgres.conf (what's important here?) 
and see if anything can be improved for *this* workload, and temporarily tune the DB for this exceptional update. Still, I'm going to remain a bit surprised at such a huge difference even if it turns out that it was actually a .conf not suited to this workload. BTW the box is running Apache and php. thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] huge difference in performance between MS SQL and pg 8.3 on UPDATE with full text search
I know I'm comparing apples and oranges but still the difference in performance was quite astonishing. I've 2 tables that look like:

create table products(
  id bigint,
  price double precision, /* legacy, don't ask */
  sometextfield1 varchar(128),
  sometextfield2 varchar(128),
  ...
);

one on a MS SQL 2005 and another one on pg 8.3. MS SQL has full text search on the text fields (I don't know the details). The pg products table has a tsvector field and a gin index defined on it + a trigger that updates the tsvector field when the text fields change. The trigger is written so that it actually updates the tsvector only if the text fields changed. The hardware on the 2 machines is a bit different. MS SQL 2005 runs on a box that is 2 years old, 2 SATA drives on RAID 1 hw, 2 Xeon dual core (I can't check details right now) PG runs on a box that has more than 5 years, 3 SCSI drives on RAID 5 hw, 2 Xeon single core HT (Intel(R) Xeon(TM) CPU 3.20GHz, cpu fam 14, model 4) Both have 4Gb of ram. shared_buffers is 240Mb. Both share a similar workload. Both boxes were in the same "price class" when they were bought. In both tables I have to update price. VACUUM FULL was performed just before updating the prices. MS SQL receives a large sql file that contains all the UPDATE statements. PG receives a csv file that is loaded into a table with COPY and then does the update as

update products set price=p.price
  from temp_price p
  where products.id=p.id and products.price<>p.price;

MS SQL ingurgitates the whole sql file in around 10sec. pg takes more than 5 min just to run the single update statement. I'd like to know whether such a large difference can be justified by the HW difference, by a difference in the process of how the data are loaded [1], by a difference in performance of the 2 servers on this kind of workload, or by some postgres config, before I decide how to manage my time to redesign the import procedure. If HW can justify such a huge difference I'll devote my time to other problems. 
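One way to keep the gin index from being maintained row by row during such a bulk update is to drop it and rebuild it around the statement. A sketch (the index name, the tsvector column name, and doing the rebuild inside the same transaction are all assumptions, patterned on the description above):

```sql
BEGIN;
DROP INDEX products_tsv_gin;                  -- hypothetical index name
UPDATE products p SET price = t.price
FROM temp_price t
WHERE p.id = t.id AND p.price <> t.price;
CREATE INDEX products_tsv_gin
    ON products USING gin (tsv);              -- hypothetical tsvector column
COMMIT;
ANALYZE products;                             -- refresh planner statistics
```

Raising maintenance_work_mem for the session before the CREATE INDEX step, as discussed elsewhere in these threads, speeds up the rebuild considerably.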
I'd say that a potential culprit could be the gin index. No matter whether the tsvector is updated or not, if the row is changed I think the index is going to be updated anyway. Somehow MS SQL circumvents this problem, possibly by building the equivalent of a tsvector column in a "hidden" table that automatically joins to the "text" table. This adds a join but reduces the cost of table modification, since simpler (btree) indexes are faster to update. Still, huge updates are rare and that table is mostly read and very rarely written. During unusually huge updates I may consider dropping the gin index. [1] I'd expect that, excluding the time it takes to load the csv, a single update should run faster than a huge list of single-statement updates -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] MySQL versus Postgres
On Fri, 13 Aug 2010 14:17:17 +0800 Craig Ringer wrote: > On 13/08/10 08:38, Tatsuo Ishii wrote: > >> It's slower than smaller numbers, and if you actually dirty a > >> significant portion of it you can have a checkpoint that takes > >> hours to sync, completely trashing system responsiveness for a > >> good portion of it. > > > > So how much is the reasonal upper limit of shared_buffers at this > > point? If it's obvious, should we disable or warn to use more > > than that number? > > Trouble is, there won't be a "reasonable upper limit" ... because > it depends so much on the ratio of memory to I/O throughput, the > system's writeback aggressiveness, etc etc etc. > > Personally I've had two Pg machines where one seems to suffer with > shared_buffers > 250MB out of 4GB and the other, which has 8GB of > RAM, wants shared_buffers to be around 4GB! The main difference: > disk subsystems. What about the ratio of R/W? If it is a mostly read system is the memory/IO throughput still a limiting factor for increasing shared_buffers? -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] delete query taking way too long
On Thu, 12 Aug 2010 12:50:49 +0100 Thom Brown wrote: > On 12 August 2010 12:14, Ivan Sergio Borgonovo > wrote: > > I've > > delete from catalog_items where ItemID in (select id from > > import.Articoli_delete); > > > > id and ItemID have an index. > > > > catalog_items is ~1M rows > > Articoli_delete is less than 2K rows. > > > > This query has been running for roughly 50min. > > Right now it is the only query running. > > > > PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) > > 4.1.2 20061115 (prerelease) (Debian 4.1.1-21) > > > > -- > You can try to do deletes in batches of 10,000: ItemID is a PK. Even if there is no constraint on Articoli_delete.id they *should* be unique. There are some other tables with indexes... but I don't expect that more than 4K rows for each table will be deleted. There are a couple of other large (1M rows) tables where an on delete cascade is defined. This is the query plan:

Nested Loop  (cost=30.07..10757.29 rows=1766 width=6)
  ->  HashAggregate  (cost=30.07..47.73 rows=1766 width=8)
        ->  Seq Scan on articoli_delete  (cost=0.00..25.66 rows=1766 width=8)
  ->  Index Scan using catalog_items_pkey on catalog_items  (cost=0.00..6.05 rows=1 width=14)
        Index Cond: (catalog_items.itemid = articoli_delete.id)

BTW it is happening again... after I stopped pg, restarted the whole server and re-ran the query. This query generally goes unnoticed inside a longer process, but I doubt it ever lasted more than a couple of minutes in the past. -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
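Thom's batching suggestion could be sketched like this (the batch size is his suggested 10,000; joining back against catalog_items selects only rows that still exist, so repeated runs make progress — repeat until the command reports DELETE 0):

```sql
-- One batch: delete up to 10000 of the still-present matching rows.
DELETE FROM catalog_items
WHERE itemid IN (
    SELECT ci.itemid
    FROM catalog_items ci
    JOIN import.articoli_delete d ON d.id = ci.itemid
    LIMIT 10000
);
```

Running each batch in its own transaction keeps individual locks short-lived.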
Re: [GENERAL] delete query taking way too long
On Thu, 12 Aug 2010 17:14:17 +0530 Jayadevan M wrote: > > I've > > delete from catalog_items where ItemID in (select id from > > import.Articoli_delete); > Does catalog_items have child tables where the FK columns are not > indexed? Regards, Possibly, but very small ones. What I forgot to say is... that query has always worked reasonably fast in the past. The size of the DB didn't grow substantially recently. I'd say the query shouldn't be the problem... the question should have been: what should I look at when postgresql starts to behave strangely? e.g. missing resources, locks, already-fixed bugs (it is a reasonably old version)... -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
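Jayadevan's question — child tables whose FK columns lack an index — can be checked from the system catalogs. A sketch that lists every foreign key referencing catalog_items (whether each referencing column is actually indexed then needs a look at \d or pg_index):

```sql
-- List constraint name, referencing table and referencing column(s)
-- for every foreign key that points at catalog_items.
SELECT con.conname,
       con.conrelid::regclass AS referencing_table,
       att.attname            AS referencing_column
FROM pg_constraint con
JOIN pg_attribute att
  ON att.attrelid = con.conrelid
 AND att.attnum   = ANY (con.conkey)
WHERE con.confrelid = 'catalog_items'::regclass
  AND con.contype   = 'f';
```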
[GENERAL] delete query taking way too long
I've delete from catalog_items where ItemID in (select id from import.Articoli_delete); id and ItemID have an index. catalog_items is ~1M rows Articoli_delete is less than 2K rows. This query has been running for roughly 50min. Right now it is the only query running. PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21) -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On Thu, 29 Jul 2010 14:57:04 -0400 Tom Lane wrote: > Samantha Atkins writes: > > Why is MySQL so much more popular right now, especially in the > > OpenSource community? > > I think it's strictly historical. The mysql bias you see in so > many web tools was established in the late 90s, a time when mysql > worked reasonably well (at least according to the mysql developers' > notion of "reasonably well") whereas postgres was still pretty slow > and buggy. It took us a long time to get from the original > academically-oriented code to something of real production quality. > We're definitely competitive now, but I don't know if we'll ever > fully overcome that historical disadvantage. How popular is Visual Basic right now? And even if it was more popular than C#... what kind of application would you expect to find that start development right now in VB? -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On Thu, 29 Jul 2010 08:52:46 -0700 "Joshua D. Drake" wrote: > The issue isn't Drupal. It is modules. There are a lot of popular > modules that do not work with PostgreSQL (Lightbox for example). > The google checkout module for Ubercart didn't work either until > relatively recently. I'd say the opposite, but I'll wait to test D7 more. Core takes ages to agree on what should be done to fix bugs for Postgres without hurting even the "feelings" of MySQL developers. Modules may have more problems, but fixing them is generally trivial and upstream is usually quick to integrate the fix. With core, the problem is maintaining your patches until (and if) they fix the bug. I agree that PHP and MySQL are a perverse match. Still, if he plans to deploy stuff as "commodity" software they are a necessary evil. The problem arises when you're in-between custom and RAD. Anyway, more python/django-based CMSes are flourishing... and given that Django's originally supported DB was Postgres... http://www.django-cms.org/ [1] The migration of The Onion from Drupal/Mysql -> Django/Postgresql is emblematic. [1] I think I could make a quick benchmark if possible on postgresql and mysql -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On Wed, 28 Jul 2010 22:24:07 -0700 "Joshua D. Drake" wrote: > On Thu, 2010-07-29 at 07:04 +0200, Ivan Sergio Borgonovo wrote: > > > BTW as far as I remember Django suggests postgres. I haven't seen > > any benchmark of Django with pg vs mysql. > Django was originally developed for Postgres but really, they are > wholly different beasts. You're right. It would be nice to see benchmarks of any CMS developed with Django on postgresql and on mysql. I tried to find benchmarks of Plone on postgres vs mysql. I'd tend to think (and I may be wrong) that as a rule of thumb, everything else being equal, mysql is more suited to "commodity" CMSes just because it is easier to find coupled with php in hosting (and this reflects on communities etc...). Still, it would be nice to also put to rest the myth that mysql is better for CMSes, since they are read-mostly apps. But then... there are no popular [anything but php] CMSes, but there are a lot of [anything but php] web frameworks. You start with a pre-packaged web application that looks like a framework, then you start to write custom code, then you start to have more impedance mismatch... The more you have to code, the more you will prefer a framework and postgres... but if you've coded that much it means you can afford to write your own web application on top of a framework and run your own box (no hosting). BTW which one of the examples you posted uses ubercart? I'd be curious how many concurrent operations on the basket http://www.commandprompt.com/portal/ has... -- Ivan Sergio Borgonovo http://www.webthatworks.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On Wed, 28 Jul 2010 18:56:56 -0400 Greg Smith wrote: > Ivan Sergio Borgonovo wrote: > > Are there companies that offer drupal/postgres tuning? > I am quite sure that Command Prompt would be happy and fully > prepared to sell you Drupal + PostgreSQL tuning services. We also > have some projects around it, and I'm sure other consulting > companies or individuals do too. I'd predict that if you sent a > message to pgsql-jobs saying you're looking to hire someone for > that sort of work, you'd get a stack of responses from qualified > people in the PostgreSQL community. Sure. What I haven't been able to spot are Drupal companies that do Drupal tuning when it is running with postgres. Of course, here on the pg mailing list it is not hard to find companies that won't refuse to tune postgres even if you use it for Drupal ;) BTW, as far as I remember, Django suggests postgres. I haven't seen any benchmark of Django with pg vs mysql.
Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
On Wed, 28 Jul 2010 12:45:47 -0700 "Joshua D. Drake" wrote: > On Thu, 2010-07-29 at 00:36 +0530, Sandeep Srinivasa wrote: > > yup I did. The reason why I wanted examples was to amply > > demonstrate,to clients, that postgresql is viable. > > It is kinda weird if the only examples I have are restricted to > > the postgresql _community_ websites themselves. > > > Well you are kind of asking in the wrong place. You should be > asking in #drupal, #drupal-support, #drupal-ubercart or in the > Drupal forums. Well, he will spend most of his time there filtering out people bashing postgres. > > Plus, it would have been interesting to know which version of > > Drupal, Ubercart, etc was being used for such deployments. > > Again, it is relevant because of certain (older) benchmarks > > which denote significantly worse performance because of the > > suboptimal way that > Latest 6.x release and latest Ubercart release. > > Drupal integrates with Postgresql : > > http://mikkel.hoegh.org/blog/2008/oct/13/drupal-database-performance-mysql-postgresql/ > > There has been _nothing_ to disprove the above numbers, ever > > since - please correct me if I am wrong. > You should read that "whole" blog. PostgreSQL does very well in > consideration of the environment. I would also note that there is > no reference to whether or not he tuned PostgreSQL or not. > I have zero problems running Drupal with PostgreSQL and getting > great performance but then again I know enough to tune both > Drupal, PHP and PostgreSQL. Most people can't say that (I am not > saying you can't). I'm happy with PostgreSQL and Drupal too, and so far I haven't had to worry too much about performance. D7 should support many things that make using Postgres more sensible. I had to tweak D5 and D6 core to make them work with Postgres as I needed... the problem is it takes a lot of time for postgres-related patches to get into core. 
Modules that are worth using generally have responsive maintainers; fixes and releases come much faster. Still, I'd say that if you don't have any specific reason to use postgresql (you have to access data from another app using postgres, you need some special feature (full text, GIS), you have a lot of writes to the DB...), mysql would be a better choice, assuming equal knowledge of both. Are there companies that offer drupal/postgres tuning? > That is a tough one. I mean, prove it to him. Set up Drupal with > MySQL/Innodb and setup Drupal with PostgreSQL and do some tests. > You can also look for things like this: > > http://www.commandprompt.com/blogs/joshua_drake/2010/07/multiple_drupal_installations_single_login_10_steps/ Schemas in postgres with Drupal are great. Using: http://www.webthatworks.it/d1/content/howto-duplicating-schema-postgresql and http://www.webthatworks.it/d1/content/excluding-some-tables-data-backup-including-their-schema makes it a breeze to duplicate sites. And you still keep all triggers, PKs, FKs, on delete cascade...
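The duplication recipe from those links boils down to something like this (a sketch only; the database and schema names are made up, and the exact steps in the linked howtos may differ):

```shell
# Temporarily rename the source schema, dump it under the new name,
# rename it back, then restore the dump: the copy keeps triggers,
# PKs, FKs and cascades because pg_dump recreates them all.
psql drupaldb -c 'ALTER SCHEMA site1 RENAME TO site2'
pg_dump drupaldb -n site2 -f site2.sql
psql drupaldb -c 'ALTER SCHEMA site2 RENAME TO site1'
psql drupaldb -f site2.sql
```

Best done while nothing else is writing to the schema, since the rename is visible to concurrent sessions.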
Re: [GENERAL] resource management, letting user A use no more than X resource (time, CPU, memory...)
On Fri, 16 Jul 2010 19:43:01 +0800 Craig Ringer wrote: > On 16/07/10 19:21, Ivan Sergio Borgonovo wrote: > > If I'd like to learn how to manage resources in postgres and > > grant different users different time slot/memory/CPU? > > > > eg. I'd like to grant to user A to execute queries that last less > > than 1min or that occupy no more than X Mb... etc... > > PostgreSQL doesn't really offer much in the way of features for > per-user resource control, resource impact isolation, etc. > > You can potentially run different PostgreSQL instances > (postmasters, not just databases) in different domains of a > virtualization or resource-control setup, but that's pretty > inefficient, adds a lot of admin work, and doesn't help if your > users need to be able to use the same database(s). > If you need strong user resource limits, user storage limits, etc > PostgreSQL might not be your best option. There are some things > you can do, but there's not much. What about an external process that monitors backends and kills them gracefully when they use too many resources, based on the user tied to each backend? Or gluing together a load-balancing solution that, depending on the user, diverts connections to different slaves with slightly different setups?
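A minimal sketch of both ideas using the knobs that do exist (the role name and thresholds are made up; on 8.x the pg_stat_activity pid column is called procpid):

```sql
-- Per-role limits: crude, but built in.
ALTER ROLE report_user SET statement_timeout = '60s';  -- abort queries over 1 min
ALTER ROLE report_user SET work_mem = '16MB';          -- cap memory per sort/hash

-- External-watchdog flavor: cancel this user's long-running queries.
SELECT pg_cancel_backend(procpid)
FROM pg_stat_activity
WHERE usename = 'report_user'
  AND now() - query_start > interval '1 min';
```

statement_timeout covers the time budget; memory and CPU can only be influenced indirectly (work_mem, connection limits, OS-level tools), as Craig says.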
[GENERAL] resource management, letting user A use no more than X resource (time, CPU, memory...)
I'd like to learn how to manage resources in postgres and grant different users different time slots/memory/CPU. E.g. I'd like to grant user A the right to execute only queries that last less than 1 min or that occupy no more than X MB... etc...
Re: [GENERAL] Need Some Recent Information on the Differences between Postgres and MySql
On Fri, 25 Jun 2010 08:48:11 -0700 Rob Wultsch wrote: > The freedom of the storage engine interface allows for much more > varied backend characteristics. Some examples: This is *really* fascinating, but pg's transactional engine is very mature and solid. Before any of MySQL's should-be-really-transactional engines is as mature as pg's, pg will have a lot more feature/speed/development sugar. If I didn't need a transactional engine I'd be happy to go (well, not really happy[1]) with MySQL and any of its engines. It is nice to reuse and mix and match different tools. [1] I find some inconsistencies of mysql a bit painful to live with... as I find some inconsistencies in PHP equally irritating, but they are there, quite widespread, and still, for certain domains, the best compromise available.
Re: [GENERAL] A thought about other open source projects
On Mon, 21 Jun 2010 11:27:20 -0700 David Fetter wrote: > On Mon, Jun 21, 2010 at 01:55:36PM -0400, Brad Nicholson wrote: > > Scott Marlowe wrote: > > >>As with phrases like, "the quickest way to grill a unicorn > > >>steak," that it can be stated in a few words does not make in > > >>possible. > > > > > >Exactly. The big issue here is that nobody's saying what kind > > >of app they want to write. > > Or what sort of performance requirements are tied to that app. > > It's not performance requirements that tend to tank such projects, > but the amount of maintenance involved. Extending the app gets > quadratically painful. I perfectly agree. But maybe the number of hackers involved in a project could grow faster if the project served more purposes. Reuse lowers maintenance costs too.
Re: [GENERAL] A thought about other open source projects
On Sun, 20 Jun 2010 12:52:22 +0200 Thomas Kellerer wrote: > David Goodenough wrote on 20.06.2010 11:08: > > I don't support anyone has written a "how to write database > > agnostic code" guide? That way its not a matter of porting, > > more a matter of starting off right. > > I don't believe in "database agnostic code". It depends on what you're after. There are applications that really use a DB as a SQL interface to the file system (sort of). Coding non-standard SQL for that kind of application (and there are many out there) is just a different practice of BDSM. Still, even when you deal with applications that can really take advantage of the various special features of some DB and its SQL dialect... 80% of the written SQL could be standard. Sometimes people write abstraction code to make it easier to write in the only SQL dialect they know, for a very small coding advantage, when they could write a better abstraction that at least wouldn't make it a pain to port the application. When thinking about portable code I generally find this document useful: http://sql-info.de/postgresql/postgres-gotchas.html
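As a trivial illustration of the "80% could be standard" point (the table and column names are invented):

```sql
-- The same cast, two spellings: sticking to the standard form
-- keeps that part of the query portable to other engines.
SELECT x::bigint FROM t;          -- PostgreSQL shorthand
SELECT CAST(x AS BIGINT) FROM t;  -- SQL standard
```

The dialect form buys nothing here, so there is no reason not to write the portable one.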
[GENERAL] how to alias a table
I'm refactoring some code and I'd find it helpful to be able to alias tables. What I'd like to do is refer to the same table through an alias in the code and later substitute the alias with an actual VIEW. Of course I could define a view as select * from original_table right from the start, but I'm worried this may incur some overhead I currently can't afford. thanks
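For what it's worth, a pass-through view is just a stored rewrite rule that the planner expands inline, so something like this (names invented) should plan identically to querying the table directly:

```sql
-- Start with a pure alias...
CREATE VIEW items_alias AS
SELECT * FROM original_table;

-- ...and later swap in the real definition without touching callers:
-- CREATE OR REPLACE VIEW items_alias AS SELECT ... ;
```

The overhead of the extra rewrite step is negligible for simple views; it is only when the replacement view adds joins or functions that query plans change.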
[GENERAL] Debian: upgrading from was: What Linux edition we should chose?
On Mon, 31 May 2010 17:23:51 +0200 Szymon Guz wrote: > > Yeah it is. But what is it going to be an upgrade process? On a > > production box? > > Any experience to share on upgrading from source on Debian? > Usually that's pretty easy: for upgrading the minor version (e.g. > from 8.3.1 to 8.3.3) it should be enough to compile the new > sources, stop server, run `make install` and run the server with > new binaries. Upgrading from 8.3 to 8.4 can be easily done using > dump from current version. There is nothing wrong to run the new > and old postgres versions parallel so you can copy data from one > database to another. There is also pgmigrator, but I haven't > checked that yet. That's clear, but there are a bunch of small and possibly very annoying details that make deploying in production a bit more challenging than ./configure, make, make install. I admit I have only compiled postgres in my /home, when I was developing an extension. It is something I do rarely and never on production. If I were thinking of upgrading from source on a debian box that is already running a packaged version, I'd have to understand how to deal with the debian patches (I think most were related to the paths where postgres expects to find its stuff). Once I understand what all the debian patches do, I'll see if I can avoid them all so that upgrading will be easier the next time. I'll have to see how debian ./configures the package, I'll have to replicate the init.d script for the newer version, take care of making the 2 servers run temporarily on different ports... etc... I could even think of making a .deb. The more I think about it, the longer the list of things I should do gets. I bet I'm not the first one who is going to upgrade Debian from source, so someone may share his recipe and caveats. I was actually thinking of testing 9.0 in my /home on some real world DB. That could be a chance to learn how to upgrade from source. 
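The least invasive variant I can think of (paths, version and port are invented, and this sidesteps the Debian patches entirely by not touching the packaged install) is to keep the source build under its own prefix and port:

```shell
# Build and install under a private prefix, leaving the Debian
# package's binaries, paths and init script untouched.
./configure --prefix=/opt/pg-9.0
make && make install

# Separate cluster, separate port: both servers can run in parallel.
/opt/pg-9.0/bin/initdb -D /opt/pg-9.0/data
/opt/pg-9.0/bin/pg_ctl -D /opt/pg-9.0/data -o '-p 5433' start
```

From there, dump from the packaged 8.x on 5432 and restore into 5433 to test; an init.d script for the new instance still has to be written by hand.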
Re: [GENERAL] What Linux edition we should chose?
On Mon, 31 May 2010 08:47:25 -0600 Scott Marlowe wrote: > On Mon, May 31, 2010 at 2:29 AM, Michal Szymanski > wrote: > > Hi, > > Currently we use Debian, but it chosen by our OS admnistrator. > > Now we can change our OS and it is question what Linux edition > > will be the best. We would like have access to new versions of > > Postgres as soon as possible, for Debian sometimes we had to > > wait many weeks for official packages. > > Pgsql is pretty easy to build from source. Yeah, it is. But what is the upgrade process going to look like? On a production box? Any experience to share on upgrading from source on Debian?
Re: [GENERAL] unable to avoid a deadlock at the end of a long transaction
On Fri, 07 May 2010 10:29:20 -0400 Tom Lane wrote: > Ivan Sergio Borgonovo writes: > > I've been having this: > > psql:include/custom/import_update.custom.sql:63: ERROR: deadlock > > detected DETAIL: Process 13349 waits for AccessExclusiveLock on > > relation 250510 of database 248569; blocked by process 14153. > > Process 14153 waits for ShareLock on transaction 59160779; > > blocked by process 13349. CONTEXT: SQL statement "drop trigger > > if exists FT1IDX_catalog_brands_update_trigger on > > catalog_brands" PL/pgSQL function "ft1idx_trigger_drop" line 2 > > at SQL statement > I'd suggest not using DROP TRIGGER in operations that need to run > concurrently with other accesses to the same table. Consider > fixing things so the trigger is always there but knows enough to > not do anything when it doesn't need to. That's nice to know... but even skipping the whole drop/create trigger thing, the lock problem is still there and still happens near the end of a long transaction that does a lot of other work on mainly one table. The statement that causes the lock is not always the same; what is "constant" across several modifications of the overall transaction is: the lock happens near the end of the transaction. I'd say that that *should* be the only transaction *writing* to the few tables involved, some of which are very small (hundreds of records). I expect (which doesn't mean I know) that from a writing point of view the overall transaction doesn't involve any write concurrency. So I thought I wouldn't run into reproducible locking problems on *random* statements whose only fault is being near the end of the whole transaction. I need some help learning how to track down this kind of problem. thanks
[GENERAL] unable to avoid a deadlock at the end of a long transaction
I've been having this: psql:include/custom/import_update.custom.sql:63: ERROR: deadlock detected DETAIL: Process 13349 waits for AccessExclusiveLock on relation 250510 of database 248569; blocked by process 14153. Process 14153 waits for ShareLock on transaction 59160779; blocked by process 13349. CONTEXT: SQL statement "drop trigger if exists FT1IDX_catalog_brands_update_trigger on catalog_brands" PL/pgSQL function "ft1idx_trigger_drop" line 2 at SQL statement I reshuffled the update process and I started to have other lock problems. The only process that I'm aware of that should be *writing* to the tables involved is the update process. I'd expect other processes are reading, but only the update should be writing. The lock problem happens near the end of the overall update process, which is one big transaction. Reshuffling the steps a bit doesn't make it go away... it just locks on other statements, but still at the *end* of the process, after a bunch of updates and inserts have been made on a bunch of other tables, the largest of which is an update of roughly 85834 rows in a table containing 1M rows. The only thing that looks constant is: I get a deadlock at the end of a long process on a random statement. Where should I start to solve this? DETAIL: Process 3662 waits for ShareLock on transaction 59301028; blocked by process 4303. Process 4303 waits for ShareLock on transaction 59299342; blocked by process 3662. 
CONTEXT: SQL statement "update catalog_items set Authors= $1 where ItemID= $2 " PL/pgSQL function "updateauthorsall" line 19 at SQL statement

create or replace function UpdateAuthorsAll()
returns void as
$$
declare
  _row record;
  _ItemID bigint;
  _Authors varchar(1024);
  _AuthorsOLD varchar(1024);
  _Name varchar(50);
begin
  _Authors := '';
  _ItemID := null;
  for _row in
    select a.Name, ia.ItemID
      from catalog_itemauthor ia
      join catalog_author a on a.AuthorID = ia.AuthorID
      order by ia.ItemID
  loop
    if (_row.ItemID <> _ItemID) then
      if (length(_Authors) > 2) then
        _Authors := substring(_Authors from 3);
        select into _AuthorsOLD Authors
          from catalog_items where ItemID = _ItemID;
        if (coalesce(_Authors, '') <> coalesce(_AuthorsOLD, '')) then
          update catalog_items set Authors = _Authors
            where ItemID = _ItemID;
        end if;
      end if;
      _Authors := '';
    end if;
    _ItemID := _row.ItemID;
    _Name := trim(E' \t' from _row.Name);
    if (length(_Name) > 0) then
      _Authors := _Authors || ', ' || _Name;
    end if;
  end loop;
  return;
end;
$$ language plpgsql volatile;
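To see who is waiting on whom while the import runs, a self-join on pg_locks along these lines can help (a sketch for 8.3-era catalogs; column names checked against pg_locks, but treat it as a starting point):

```sql
-- Pair each ungranted lock request with the granted lock on the same object.
SELECT waiter.pid AS waiting_pid,
       holder.pid AS holding_pid,
       waiter.locktype,
       waiter.relation::regclass AS relation
FROM pg_locks waiter
JOIN pg_locks holder
  ON holder.locktype = waiter.locktype
 AND holder.database      IS NOT DISTINCT FROM waiter.database
 AND holder.relation      IS NOT DISTINCT FROM waiter.relation
 AND holder.transactionid IS NOT DISTINCT FROM waiter.transactionid
 AND holder.pid <> waiter.pid
WHERE NOT waiter.granted
  AND holder.granted;
```

Run it from a second session while the long transaction is near its end; joining the pids against pg_stat_activity then shows which statements are involved.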
Re: [GENERAL] restore multiple tables postgres
On Wed, 14 Apr 2010 10:56:36 -0400 akp geek wrote: > Dear all - > > Can you please help me with this? Is there a way to restore > multiple tables using a single command from a > whole database dump that was created using pg_dump Something along the lines of: http://www.webthatworks.it/d1/node/page/excluding_some_tables_data_backup_including_their_schema
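With a custom-format dump, the table-of-contents trick makes this a single restore run (database, table names and the grep pattern are illustrative; if I remember correctly, on 8.x pg_restore's -t switch takes only one table, hence the list file):

```shell
pg_dump -Fc mydb -f mydb.dump                  # custom-format dump
pg_restore -l mydb.dump > toc.list             # full table of contents
# Keep only the entries for the wanted tables (pattern depends on your TOC).
grep -E 'TABLE (DATA )?public (orders|customers) ' toc.list > restore.list
pg_restore -d mydb -L restore.list mydb.dump   # restore just those entries
```

The -L file can also be edited by hand, which is handy when you want a table's data but not its indexes, or vice versa.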
[GENERAL] HW and performances
I'm using apache and postgres to make an ecommerce website work on an old xeon box:

processor  : 0
vendor_id  : GenuineIntel
cpu family : 15
model      : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz

It looks like it has 4 cores, but I think they are 2 cores + HT. 4Gb RAM, 3 disks RAID5 [sic].

free:
             total       used       free     shared    buffers     cached
Mem:       4048024    3984896      63128          0     208304    2645748
-/+ buffers/cache:    1130844    2917180
Swap:      7813112        492    7812620

Swap gets slightly hit after long periods of uptime. Most of the work happens in the items table, ~1M rows. The queries that actually impact usability are: - full text searches on a gin index on a tsvector - updates of the catalogue (and the related gin index) in batches of 20K items updated/inserted. It *seems* (not really sure) that the bottleneck for full text searches is the CPU. I'll have the chance to split the load over a newer box: 2 socket 5120 Xeon box, 4GB RAM, 2 disks RAID0 (single disk IO seems faster than on the box above). I'm planning to move the DB to the newer box and leave apache on the old one. This should free enough RAM on the DB box that swap will be untouched. I'm hoping to halve the average page generation time. I could add RAM or disks to the newer box; changing CPUs doesn't look like a good investment. Did I assign the tasks to the various boxes correctly? Do I have any chance of getting near my target, even adding some more HW to the newer box? If not, I'd think about buying a brand new box. Any suggestion? By comparison, I have a 2 socket x 4 core Xeon E5310 @ 1.60GHz, 8 Gb RAM, 4 SATA disks RAID10, with apache and pg running inside a vserver (2.6.26-2-vserver-amd64), that looks like it can handle both jobs (apache and pg) at a reasonable speed (roughly twice as fast as needed) but is still a bit slower than I'd like on catalogue updates. 
[GENERAL] explicit cast for null::bigint
I've found this a bit strange:

insert into catalog_related (itemid, related_itemid, rank)
  select distinct itemid, null::bigint, 0::float4
  from catalog_categoryitem
  where catid is null or catid<>0
    and itemid not in (select itemid from catalog_related);

create table catalog_related (
  itemid bigint,
  updated timestamp default now(),
  related_itemid bigint,
  rank float4
);

Without the explicit cast for the null I get:

ERROR: column "related_itemid" is of type bigint but expression is of type text
LINE 1: insert into catalog_related (itemid, related_itemid, rank)
                                             ^
HINT: You will need to rewrite or cast the expression.

PostgreSQL 8.3.4 on x86_64-pc-linux-gnu
[GENERAL] make available C extensions to others
I've finished writing an extension to manipulate tsvectors and tsqueries in C. I think it could be useful for someone else, and I think I'd also benefit from someone else looking at the code. What would be the right place to publish the code, and how?
Re: [GENERAL] bug in function arguments "recognition"
On Wed, 03 Mar 2010 10:22:31 -0500 Tom Lane wrote: > Ivan Sergio Borgonovo writes: > > I've a function defined as: > > create or replace function catalog_relateditems(__itemid bigint, > > families int[]... > > If you want any useful comments, you're going to have to show a > complete example, rather than selectively editing out what you > think is irrelevant (and, no doubt, isn't). Golden rule... I thought that psql auto-completion was good enough proof that something wasn't working, and I was wondering whether someone could give me good advice on how to trim down the clutter to build a simpler function that misbehaved the same way. I was not aware that in and out parameters could be intermixed; I was expecting an error, but pg was quite happy with the syntax, and this was enough to lead me astray. Sorry
Re: [GENERAL] bug in function arguments "recognition"
On Wed, 3 Mar 2010 16:05:29 +0100 Ivan Sergio Borgonovo wrote: > I've a function defined as: > > create or replace function catalog_relateditems(__itemid bigint, > families int[]... Forget about it... there was a typo (a missing "out") that mixed up the in/out parameters. Sorry for the noise.
[GENERAL] bug in function arguments "recognition"
I've a function defined as:

create or replace function catalog_relateditems(__itemid bigint, families int[]...

I call it with

select * from catalog_relateditems(6538::bigint, ARRAY[1,2,3,4,5]);

and I get:

HINT: No function matches the given name and argument types. You might need to add explicit type casts.

Then I try to drop the function using psql autocomplete and I get:

DROP FUNCTION catalog_relateditems ( bigint, integer[], character varying)

\df *rela* reports:

public | catalog_relateditems | setof record | __itemid bigint, families integer[], OUT ...

There is no other function named catalog_relateditems. I'm on 8.3.9, debian lenny.
Re: [GENERAL] C function to create tsquery not working
On Thu, 25 Feb 2010 11:41:58 +0100 Pavel Stehule wrote: > use --enable-assert configure flag? > > you can use memory in bad context. So you are alloc good memory, > but when you leave function, then complete memory context is > freeed and you can have a problem. Meanwhile I've run into some new strange behaviour. I created a table in the same DB containing some tsvectors, to test the function on a better known, easier-to-control set of data. The tsvectors it contains aren't that different from the ones in the "real" table; there are just fewer of them. I finally downloaded all the pg sources, compiled them, and compiled my extension inside contrib. I restored the "real" DB and tested on the "real" table... and no problem at all. Nothing really helpful, since the 2 setups aren't really identical: one was hand-compiled on sid, the other is a stock debian lenny install. I'll try to compile the debian lenny version in a new virtual machine. Meanwhile, if someone could give a glimpse at the source it would be really appreciated. http://www.webthatworks.it/d1/files/ts_utilities.tar.bz2
Re: [GENERAL] C function to create tsquery not working
On Thu, 11 Feb 2010 20:11:54 +0100 Ivan Sergio Borgonovo wrote: > I'm still having trouble making this work: > > http://pgsql.privatepaste.com/14a6d3075e I tried to play with item->operator.left to see if reshuffling the expression could make any difference: item->operator.left = 2 * lexeme - 2 (1 + i) or item->operator.left = lexemes. But the result seems pretty indifferent to what I put in operator.left. That makes me think the error is here. But I still get those 2 kinds of error: ERROR: unrecognized operator type: 50 (first run) or ERROR: stack depth limit exceeded. Just at the 3rd returned row, just for certain queries (see previous email). It doesn't look as if I palloced too little memory; I tried allocating 3x the memory I estimated and I still get the errors. The function is actually returning correct results, so it seems the tsquery object is well formed. But it still looks like infix() is trying to read more operators than I thought I'd put in... but just for certain queries, and just at the 3rd row returned. Should I use something different from palloc? Should I return the query differently? Am I supposed to free something?
[GENERAL] C function manipulating tsquery doesn't work with -O2
http://psql.privatepaste.com/53cde5e24a

I've the above function. Output is something like:

'9788876412646':A | ( '8876412646':A | ( 'edizioni':D | ( 'quasi':B | ( 'estat':B | ( 'levi':C | ( 'lia':C | ( 'e/o':D | 'un':B ) ) ) ) ) ) )

It seems it always works with -O0. I can make it work with -O2 in:

select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint)
  from catalog_items limit 2; -- q1
select itemid, tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint)
  from catalog_items; -- q2
select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint)
  from catalog_items where itemid=10 or itemid=15 or itemid=27; -- q3
select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint)
  from catalog_items order by random() limit 3; -- q4

As soon as I run

select tsvector_to_tsquery(ft1idx, '|', 'ABCD', 22::smallint)
  from catalog_items limit 3; -- q5

it commits seppuku in 2 ways:

ERROR: unrecognized operator type: 50 (first run)

or

ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth", after ensuring the platform's stack depth limit is adequate.

This looks to happen in infix() in tsquery.c. This doesn't seem to be related to the length of the tsvector passed. Crazily increasing the only palloc in the function (3x) doesn't solve the problem.

This is what I get back from gdb once the function exits:

(gdb) backtrace
#0  0x0053739a in ?? ()
#1  0x00536fd4 in ExecProject ()
#2  0x0053d150 in ExecScan ()
#3  0x00536470 in ExecProcNode ()
#4  0x00549ea0 in ExecLimit ()
#5  0x00536458 in ExecProcNode ()
#6  0x00534337 in ExecutorRun ()
#7  0x005d6b2b in ?? ()
#8  0x005d8339 in PortalRun ()
#9  0x005d2de9 in ?? ()
#10 0x005d4624 in PostgresMain ()
#11 0x0000005a6c68 in ?? ()
#12 0x005a7b30 in PostmasterMain ()
#13 0x0055aaae in main ()

version is: PostgreSQL 8.3.9 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2
[GENERAL] errmsg and multi-byte strings.
How am I supposed to output multibyte strings in errmsg (and co.), as in errmsg("operator not permitted '%s'", mbstring)? thanks
[GENERAL] [SOLVED] C function to create tsquery not working
On Thu, 11 Feb 2010 20:11:54 +0100 Ivan Sergio Borgonovo wrote: > I'm still having trouble making this work: > http://pgsql.privatepaste.com/14a6d3075e

Finally I got it working, not the above version anyway...

CREATE OR REPLACE FUNCTION tsvector_to_tsquery(
  IN tsv tsvector,
  op IN char(1),
  weights IN varchar(4),
  maxpos IN smallint
) RETURNS tsquery
AS 'MODULE_PATHNAME' LANGUAGE C STRICT;

There were some small errors, but the main one was calling SET_VARSIZE on the pointer to the query instead of on the query itself. I'll need some smaller help to polish everything. It is a small piece of work, but someone on the list showed some interest, and it may be a nice simple helper for tsearch. What would be the right place to advertise it and make it available?

To sum up... I wrote 2 functions:
1. takes a tsvector and returns it as a setof record (text, int[], int[])
2. takes a tsvector, filters it according to weights and maximum position, and returns a | or & tsquery

The first is just for "debugging", or to build more complicated tsqueries in your preferred language. The second can come in handy to look for text similarity without computing tsvectors twice.

create or replace function similar(_id int, out id int, out title text)
returns setof record as
$$
declare
  tsvin tsvector;
  tsq tsquery;
begin
  -- assuming the tsvector column is named tsv
  select into tsvin tsv from table where id = _id;
  tsq := tsvector_to_tsquery(tsvin, '|', 'AB', 100);
  return query
    select t.id, t.title from table t where t.tsv @@ tsq;
  return;
end;
$$ language plpgsql stable;
[GENERAL] C function to create tsquery not working
I'm still having trouble making this work:
http://pgsql.privatepaste.com/14a6d3075e

CREATE OR REPLACE FUNCTION tsvector_to_tsquery(
  IN tsv tsvector,
  op IN char(1),
  weights IN varchar(4),
  maxpos IN smallint
) RETURNS tsquery
AS 'MODULE_PATHNAME' LANGUAGE C STRICT;

What I expect is:

tsvector_to_tsquery('java tano', '&', 'ABCD', 100) -> java & tano
tsvector_to_tsquery('java:1A,2B tano:3C,4D', '&', 'ABC', 100) -> java:A & java:B & tano:C
tsvector_to_tsquery('java:1A,2B tano:3C,4D', '|', 'ABC', 100) -> java:AB | tano:C

I've made some improvements compared to the previous version I posted, but it still returns an empty tsquery.

Things that work:
- tsvector_tsquery_size returns a reasonable total length of strings and total number of (operands + operators)
- curout is actually filled with a lexeme
- the filters (wf, posmax) work

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] LIKE a set of pattern generated from a table
I have a column that contains something formed by concatenating fields from another table:

create table stuff(
  glued varchar(30),
  ...
);

insert into stuff select 'constA,' || field1 || ',' || field2 from origin where ...;
insert into stuff select 'constB,' || field1 || ',' || field2 from origin where ...;

I know this is terrible... but this is what I have, and I can't change it.

Now what I'd like to do is select from stuff using a pattern, as in:

select * from stuff where glued like (
  select '%,' || field1 || ',' || field2 || '%' from origin where ...
);

But this isn't going to work, since LIKE expects a single pattern and the subquery can return more than one row. Is there any other way, other than specifying all the constants one by one in a union and then looking for equality?

select * from stuff where glued in (
  select 'constA,' || field1 || ',' || field2 from origin where ...
  union
  select 'constB,' || field1 || ',' || field2 from origin where ...
);

-- Ivan Sergio Borgonovo http://www.webthatworks.it
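One way to match a row against a whole set of generated patterns is to turn the LIKE into a join condition. A sketch, reusing the placeholder names from the message (stuff, origin, field1, field2); note this cannot use an ordinary index on glued:

```sql
-- Match stuff.glued against every pattern generated from origin.
select distinct s.*
from stuff s
join (
  select '%,' || field1 || ',' || field2 || '%' as pat
  from origin
  -- where ...  (the original conditions on origin)
) p on s.glued like p.pat;
```

An EXISTS subquery with the same LIKE condition would work equally well and avoids the DISTINCT.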
[GENERAL] border case ::tsvector vs. to_tsvector was turning a tsvector without position in a weighted tsvector
This was what I was after:

test=# select version();
 PostgreSQL 8.3.9 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2

test=# select to_tsvector('pino gino');
 'gino':2 'pino':1

test=# select 'pino gino'::tsvector;
 'gino' 'pino'

test=# select to_tsvector('pino gino') @@ 'gino:B'::tsquery;
 f

test=# select to_tsvector('pino gino') @@ 'gino:D'::tsquery;
 t

test=# select ('pino gino'::tsvector) @@ 'gino:B'::tsquery;
 t

test=# select to_tsvector('pino:1B gino') @@ 'pino'::tsquery;
 t

test=# select 'pino gino'::tsvector || to_tsvector('pino gino');
 'gino':2 'pino':1

test=# select 'pino gino'::tsvector || 'pino gino'::tsvector;
 'gino' 'pino'

test=# select to_tsvector('pino gino') || to_tsvector('pino gino');
 'gino':2,4 'pino':1,3

test=# select 'pino gino'::tsvector || to_tsvector('gino tano');
 'gino':1 'pino' 'tano':2

test=# select setweight('pino gino'::tsvector || to_tsvector('gino tano'), 'A');
 'gino':1A 'pino' 'tano':2A

So (even if it may sound obvious to many):
- tsvectors may be a mix of lexemes with and without positions/weights
- a lexeme without a weight (which is not the same as having the default D weight) matches ALL weights
- you can't assign a weight to a lexeme without a position, and it would be hard to assign a position after a document has been parsed into a tsvector; so while in theory it could be reasonable to have lexemes with a weight and no position, in practice you'll have to assign meaningless positions if you'd like to assign a weight to a tsvector without positions.

I still wonder if it would be reasonable to write a function that forcefully assigns a position and a weight to vectors to be used with ts_rank. I have some ideas about possible use cases, but I'm still unsure whether they are reasonable. e.g.
someone might be willing to save storage and CPU cycles by storing parts of documents in precomputed tsvectors with no weights, and then build up a search with merged, weighted tsvectors using ts_rank.

OK... trying to finish up my tsvector_to_tsquery function in a reasonable way first.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] How do I delete duplicate rows in a table?
On Mon, 8 Feb 2010 15:32:51 -0800 "Wang, Mary Y" wrote:
> Hi,
> I have a table that have that duplicate rows. How do I find them and delete them?

http://www.webthatworks.it/d1/node/page/eliminating_duplicates_place_without_oid_postgresql

Most likely I learned it here...

-- Ivan Sergio Borgonovo http://www.webthatworks.it
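Independently of what the linked page describes, a common in-place recipe for tables without a usable key is to delete all but one physical row per duplicate group using the system column ctid. A sketch, assuming a hypothetical table t whose rows are duplicated across col1 and col2:

```sql
-- Keep the row with the smallest ctid in each duplicate group.
DELETE FROM t a
USING t b
WHERE a.ctid > b.ctid
  AND a.col1 = b.col1
  AND a.col2 = b.col2;
```

ctid identifies a row's physical location, so it is unique even when every user-visible column is identical.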
Re: [GENERAL] turning a tsvector without position in a weighted tsvector
On Mon, 8 Feb 2010 23:01:45 +0300 (MSK) Oleg Bartunov wrote:
> Ivan,
> what's wrong with:
> postgres=# select 'abc:1'::tsvector;
>  'abc':1

Yes, you're right. I think I misplaced some quotes. But still, once a vector has no positions, I can't add weights.

test=# select setweight('tano'::tsvector, 'A');
 'tano'

test=# select setweight('tano:1'::tsvector, 'A');
 'tano':1A

Since I'm writing some helpers to manipulate tsvectors, I was wondering whether
a) there is any reasonable use case for adding weights to vectors with no positions
b) I missed any obvious way to add weights to tsvectors that were initially without positions

thanks

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] turning a tsvector without position in a weighted tsvector
If I convert a string to a tsvector by just casting (::tsvector), I obtain a vector without positions. tsvectors without positions don't have weights either. I haven't found a way to turn a vector without weights/positions into a vector with them. Is there a way to apply weights/add positions to tsvectors without positions? Is there any use case?

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] Query to find list of dates between two dates
On Fri, 5 Feb 2010 02:06:12 -0800 (PST) aravind chandu wrote:
> Hello guys,
> can you please help me with the following query
> I need a query that displays all the dates in between two dates
> say i give two dates 12/1/2009 and 12/31/2009 The result should be like this

select date '2008-05-01' + i
from generate_series(0, (date '2009-12-10' - date '2008-05-01')) s(i);

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] reducing result set of tsvector @@ tsquery avoiding to use ts_rank
I've finally made some fruitful steps in writing C functions that manipulate tsvectors.

I'd like to build up a simple system based on ts_rank to find similarities between documents. My documents contain 4 parts. I build a tsvector the "usual way":

setweight(to_tsvector(field1), 'A') || setweight(to_tsvector(field2), 'B') || etc...

then I'd like to build a query similar to:

tsvector @@ to_tsquery(
  'field1_lexeme1':A | 'field1_lexeme2':A | ...
  'field2_lexeme1':B | 'field2_lexeme2':B | ...
)

Anyway, so many ORs are going to return a lot of rows, and filtering on rank comes "too late" for performance. One way to shrink the result set would be to build a query that requires at least 2 lexemes to be present:

('field1_lexeme1':A & ('field1_lexeme2':A | ... 'field2_lexeme1':B | 'field2_lexeme2':B | ...)) |
('field1_lexeme2':A & ('field1_lexeme1':A | ...)) |
...

I don't have very long documents, so this looks feasible, but I'd like to hear any other suggestions to shrink the result set further before filtering on ts_rank, especially suggestions that will exploit the index.

So any suggestion that could reduce the result set before filtering on rank is welcome, and I'll try to put it into practice in some C functions that, given a tsvector, build up a tsquery to be used to find similar documents.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] How to generate unique hash-type id?
On Fri, 29 Jan 2010 13:13:17 +0100 "Wappler, Robert" wrote:
> I'd suggest to use some kind of sequence or something constructed from the primary keys. But you may still see hash collisions although the input is different.

Concatenate /* ::text */ random() with something like:
http://www.webthatworks.it/d1/node/page/pseudo_random_sequences_postgresql

-- Ivan Sergio Borgonovo http://www.webthatworks.it
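The idea of combining a guaranteed-unique part with a random part can be sketched like this (the sequence name is hypothetical, and the linked page describes a different pseudo-random-sequence technique):

```sql
-- The sequence value guarantees uniqueness; random() makes the
-- result hard to guess; md5() gives it a hash-like shape.
CREATE SEQUENCE ids_seq;
SELECT md5(nextval('ids_seq')::text || random()::text) AS unique_id;
```

Because the sequence part is unique on its own, collisions of the random salt don't matter; only the md5 digest itself could collide, which is the residual risk Robert mentions.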
[GENERAL] is this the correct result for ts_rewrite? reducing tsquery to improve performance?
select ts_rewrite(
  to_tsquery('java:A & cola & java:AB'),
  'java:AB'::tsquery,
  'java:AB'::tsquery);

 'cola' & 'java':AB & 'java':AB

Is this the expected (documented) result?

I found this while looking for a way to build up a tsquery directly in its own structure, without passing through its text representation (and maybe reduce it).

The following look equivalent. Are they?

test=# select 'java:ABC'::tsquery;
 'java':ABC

test=# select 'java:A | java:B | java:C'::tsquery;
 ( 'java':A | 'java':B ) | 'java':C

I did try to pass them through nodetree... but the result keeps on being different.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] tsvector minimum match using index
Would it be possible, without writing a very long tsquery, to exploit the index to retrieve the tsvectors that contain at least N lexemes? If the index can't be exploited, any suggestion to improve the performance of such a query? Computing the rank still requires retrieving a lot of tsvectors and computing the rank for each of them.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] FTS uses "tsquery" directly in the query
On Mon, 25 Jan 2010 23:35:12 +0300 (MSK) Oleg Bartunov wrote:
> Do you guys wanted something like:
> arxiv=# select and2or(to_tsquery('1 & 2 & 3'));
>  ( '1' | '2' ) | '3'

Nearly. I'm starting from a weighted tsvector, not from text/tsquery. I would like to:
- keep the weights in the query
- avoid parsing the text to extract lexemes twice (I already have a tsvector)

For me, extending pg in C is a new science, but I'm actually trying to write at least a couple of functions that:
- will return a tsvector as a (weight int, pos int[], lexeme text) record
- will turn a tsvector + operator into a tsquery

'orange':1A,2,3 'banana':4B,5 'tomato':6C,7 -> 'orange':A | 'banana':B | 'tomato':C
or eventually
'orange':A & 'banana':B & 'tomato':C

thanks

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] FTS uses "tsquery" directly in the query
On Mon, 25 Jan 2010 07:19:59 -0800 (PST) xu fei wrote:
> Hi, Oleg Bartunov:
> First, thanks for your quick reply. Could you explain "it's a general limitation/feature" a little more? I'm confused, because to my understanding to_tsquery('item') returns a tsquery which is the same as 'item'::tsquery. Let me explain what I want.
> First step: extract the top K tokens. I have a table with a column of tsvector type. Some records in this column are too big; they contain hundreds of tokens. I just want the top K tokens based on frequency, for example the top 5. I am not sure there is a direct way to get such top K tokens, so I just read them out in Java, count the frequency for each token, and sort them.
> Second step: generate the query. Now I will use these tokens to construct a query to search other vectors in the same table. I cannot directly use to_tsquery() for two reasons: 1) the default logical operator in to_tsquery() is "&" but what I need is "|"; 2) since the tokens come from a tsvector, they are already normalized, and if I use to_tsquery() they will be normalized again! For example, "course" -> "cours" -> "cour". So I just concatenate the top K tokens with "|" and directly use "::tsquery". Unfortunately, as you say, "it's a general limitation/feature", so I cannot do that. I checked your manual "Full-Text Search in PostgreSQL: A Gentle Introduction" but could not figure out how. So is it possible to implement what I want in FTS? If so, how? Thanks! Xu
> --- On Sun, 1/24/10, Oleg Bartunov

You're trying to solve a problem similar to mine. I'd like to build up a naïve similar-text search. I don't have the "length" problem, but I'd also like to avoid tokenizing/lexizing a text twice to build up a tsquery. I have weighted tsvectors stored in a column, and once I pick one up I'd like to look for similar ones in the same column.
There are thousands of ways to measure text similarity (and Oleg pointed me to some), but ts_rank should be "good enough for me". My texts are very short, so I can't use & on the whole tsvector; otherwise the chances of finding just one match are very high. As you suggested, I could pick a subset of "important"[1] lexemes from the tsvector and build up an "&"ed tsquery with them. Still, at least in my case, since I'm dealing with very short texts, this looks too risky (just 1 match). Considering that I'm using weighted tsvectors, it seems that "|" plus picking the results with the best rank could be the way to go. But as you've noted, there is no function that turns a tsvector into a tsquery (possibly including weights) and gives you the choice to use "|".

Well... I'm trying to write a couple of helper functions in C. But I'm pretty new to postgres internals, I miss a reference of functions/macros with some examples... and this is a side project, and I haven't been using C for quite a while.

Once I have those functions, I'll have to solve how to return few rows (since I'll have to use |, I expect a lot of returned rows) to make efficient use of the gin index and avoid computing ts_rank for too many rows.

Don't hold your breath waiting... but let me know if you're interested, so I don't have to be the only one posting newbie questions on pgsql-hackers ;)

[1] ts_stat could give you some hints about which lexemes may be important... but deciding what's important is another can of worms, and as anticipated, ts_rank should be "good enough for me".

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] not officially documented use of setweight??
On Mon, 25 Jan 2010 12:01:04 +0100 Ivan Sergio Borgonovo wrote:
> I think I've learned how to use pg text search from Oleg and Teodor's documentation, since I've found this use of setweight in my code:
> query := query && setweight(configuration, 'banana apple orange', 'B', '&');
> But I can't find any trace of this use in the official postgres docs. The docs just say:
> setweight(vector tsvector, weight "char") returns tsvector
> Am I missing them? Is that use "supported" in future versions?

Forgive me! setweight(cfg, text, weight, op) was actually one of my own creations from 1 or 2 years ago, built on ts_debug. I went mad looking for it in the pg sources... till suddenly I thought to grep my own codebase!!!

Since back then I had not previously processed user input, using ts_debug was fine. Now I already have processed tsvectors... so I'm writing something that basically does the above but starting from preprocessed tsvectors instead of text.

Sorry.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] not officially documented use of setweight??
I think I've learned how to use pg text search from Oleg and Teodor's documentation, since I've found this use of setweight in my code:

query := query && setweight(configuration, 'banana apple orange', 'B', '&');

But I can't find any trace of this use in the official postgres docs. The docs just say:

setweight(vector tsvector, weight "char") returns tsvector

Am I missing them? Is that use "supported" in future versions?

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] more docs on extending postgres in C
On Fri, 22 Jan 2010 11:02:46 -0500 Tom Lane wrote:
> Ivan Sergio Borgonovo writes:
> > If I had to build stuff in the pg source tree I'd just clone a contrib directory and change the makefile [1]. What am I supposed to do if I'd like to create a contrib elsewhere (eg. ~/Documents/nfs/projects/ts_extensions)?
> Use PGXS:
> http://www.postgresql.org/docs/8.4/static/xfunc-c.html#XFUNC-C-PGXS
> If the docs aren't enough for you, all of the contrib modules can be built via pgxs, using "make PGXS=1". So their makefiles are useful examples.

What worked for me on Debian sid was:
- installing postgresql-server-dev-[version]
- apt-get source postgresql-server-dev-[version]
- copying a contrib dir from there into my ~ (or wherever you prefer)
- export USE_PGXS=1; make

I didn't have the postgresql server installed on this box, but I have the bits that got pulled in to satisfy dependencies for the php/python pg drivers and psql.

I wasn't able to understand where PGXS picks up the version from, since I had installed -dev-8.3 but everything else was for 8.4, and it didn't work:

i...@dawn:~/ts_extension$ export USE_PGXS=1; make
Makefile:12: /usr/lib/postgresql/8.4/lib/pgxs/src/makefiles/pgxs.mk: No such file or directory
make: *** No rule to make target `/usr/lib/postgresql/8.4/lib/pgxs/src/makefiles/pgxs.mk'. Stop.

Installing dev-8.4 made everything work. Nothing that really troubled me... but I'd expect that after installing -dev-8.3 it would look in the right place, since pgxs.mk is actually included in both versions.

Thanks... I'll publish a summary as soon as I have clearer ideas, so that the next poor guy will find it easier to write a contrib on Debian. I don't know if the version problem is worth a bug report to Debian (or to the pg people who built the pgxs system).

-- Ivan Sergio Borgonovo http://www.webthatworks.it
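For an out-of-tree module, the PGXS documentation linked above boils the makefile down to a few lines. A minimal sketch (the module name ts_extensions and its source file ts_extensions.c are hypothetical):

```makefile
# Minimal PGXS makefile: builds ts_extensions.so from ts_extensions.c.
MODULES = ts_extensions

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```

Asking pg_config for --pgxs also sidesteps the version puzzle from the message: pointing PG_CONFIG at a specific installation's pg_config (e.g. /usr/lib/postgresql/8.4/bin/pg_config) selects which version to build against.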
Re: [GENERAL] more docs on extending postgres in C
On Wed, 20 Jan 2010 17:43:27 +0100 Adrian von Bidder wrote:
> On Wednesday 20 January 2010 15.42:14 Ivan Sergio Borgonovo wrote:
> > I'd also appreciate some suggestions about the dev environment and best practices on Debian, something that could help me to compile, install, test easily on Debian.
> (Disclaimer: Haven't done any postgres related programming so far, so this might need adjusting.)
> The description for package postgresql-server-dev-8.4 includes
> [[[
> This package also contains the Makefiles necessary for building add-on modules of PostgreSQL, which would otherwise have to be built in the PostgreSQL source-code tree.
> ]]]

I need some babysitting from someone who has actually developed extensions on Debian. The package was already installed, but it seems it provides more than just the header files. Now that I know I can build extensions outside the pg source tree, it would be nice to understand how to do it in a kosher way.

If I had to build stuff in the pg source tree I'd just clone a contrib directory and change the makefile [1]. What am I supposed to do if I'd like to create a contrib elsewhere (eg. ~/Documents/nfs/projects/ts_extensions)?

I've seen some forwards to this list from pgsql-hackers. Would that list be a better place to ask this kind of thing, or is it too elite ;) for newbies?

[1] actually I tried to run make in a contrib module dir in the pg source tree... but unless you run ./configure at the root of the source tree, it fails. So I have to guess that -dev gives you a preconfigured environment.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] PgSQL problem: How to split strings into rows
On Thu, 21 Jan 2010 13:49:45 -0500 Kynn Jones wrote:
> I have a table X with some column K consisting of whitespace-separated words. Is there some SELECT query that will list all these words (for the entire table) so that there's one word per row in the returned table? E.g. if the table X is
>
>    K
>    -
>    foo bar baz
>    quux frobozz
>    eeny meeny
>    miny moe
>
> ...I want the result of this query to be
>
> foo
> bar
> baz
> quux
> frobozz
> eeny
> meeny
> miny
> moe

http://www.postgresql.org/docs/current/static/functions-array.html

string_to_array

select (string_to_array('tano pino gino', ' '))[i] from generate_series(1, 3) s(i);

You get the idea... to get the length of the array you have array_length.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
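Extending the hint to the whole table might look like this (a sketch; array_length appeared in 8.4, so on 8.3 use array_upper(arr, 1) instead):

```sql
-- One word per row, for every row of X.
SELECT (string_to_array(K, ' '))[i] AS word
FROM X,
     generate_series(1, array_length(string_to_array(K, ' '), 1)) AS s(i);
```

From 8.3 onwards, regexp_split_to_table(K, E'\\s+') does the same splitting in a single call, and handles runs of whitespace too.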
Re: [GENERAL] more docs on extending postgres in C
On Wed, 20 Jan 2010 17:38:17 +0100 Pavel Stehule wrote:
> > Is there a reference of all macros and functions?
> no, only source code

It would be nice to have at least a list of the functions that can be used in extension development, to avoid reading all the source. Since I'm a new entry in pg C coding and internals, it will take me ages just to find what's worth knowing. I'll try to take some notes while I grasp things and publish them somewhere.

I'm still trying to digest:

"There are two ways you can build a composite data value (henceforth a "tuple"): you can build it from an array of Datum values, or from an array of C strings that can be passed to the input conversion functions of the tuple's column data types. In either case, you first need to obtain or construct a TupleDesc descriptor for the tuple structure. When working with Datums, you pass the TupleDesc to BlessTupleDesc, and then call heap_form_tuple for each row. When working with C strings, you pass the TupleDesc to TupleDescGetAttInMetadata, and then call BuildTupleFromCStrings for each row. In the case of a function returning a set of tuples, the setup steps can all be done once during the first call of the function."

I grepped through contrib and I wasn't able to find anything that really enlightened me about BlessTupleDesc. I'll see if things look clearer tomorrow.

thanks

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] more docs on extending postgres in C
On Wed, 20 Jan 2010 16:56:04 +0100 Dimitri Fontaine wrote:
> Ivan Sergio Borgonovo writes:
> > I haven't been able to find anything better than the online manual and pg source code to learn how to write extensions.
> Maybe this will help:
> http://wiki.postgresql.org/wiki/Image:Prato_2008_prefix.pdf
> http://github.com/dimitri/prefix

Thanks to all. Is there a reference of all macros and functions?

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] more docs on extending postgres in C
I haven't been able to find anything better than the online manual and pg source code to learn how to write extensions. I couldn't find a reference of all the functions/macros I could use, nor more examples of how to use them.

I'd also appreciate some suggestions about the dev environment and best practices on Debian, something that could help me to compile, install and test easily on Debian.

thanks

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] C: extending text search: from where to start
On Tue, 19 Jan 2010 10:12:21 +0100 Dimitri Fontaine wrote:
> Ivan Sergio Borgonovo writes:
> > I'd appreciate any pointer that will quickly put me on the right track.
> I'd guess you begin here:
> http://wiki.postgresql.org/wiki/Developer_FAQ

With the exception of the formatting style info, I didn't find anything useful there. Right now I'm reading:

http://www.postgresql.org/docs/8.3/static/xfunc-c.html

There is just one example of how to build up the returned data, and it uses C strings. I couldn't find an example that uses TupleDesc/BlessTupleDesc. I'm installing the deb contrib source package. Any other resources that will help me write my own contrib?

thanks

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] C: extending text search: from where to start
I'd like to extend full text search so that I can transform tsvectors into tsqueries and have direct access to a tsvector as a record/array. I'm on Debian. This is my first experience with the pg source code. I'd appreciate any pointer that will quickly put me on the right track.

thanks

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] ranking how "similar" are tsvectors was: OR tsquery
On Sun, 17 Jan 2010 20:19:59 +0300 (MSK) Oleg Bartunov wrote:
> Ivan,
> You can write a function to get lexemes from a tsvector:
> CREATE OR REPLACE FUNCTION ts_stat(tsvector, weights text, OUT word text, OUT ndoc integer, OUT nentry integer)
> RETURNS SETOF record AS
> $$
> SELECT ts_stat('SELECT ' || quote_literal( $1::text ) || '::tsvector', quote_literal( $2::text) );
> $$ LANGUAGE SQL RETURNS NULL ON NULL INPUT IMMUTABLE;

Thanks very much, Oleg. Still, it is not really making the pain go away. I have weights stored in my tsvectors and I need to build the query using them. This means that if I have:

'aubergine':4A 'orange':1B 'banana':5A 'apple':3C
and
'coconut':3B 'bananas':1A 'tomatoes':2C

stored in a column (tsv), I really would like to build up the query:

to_tsquery('aubergine:A | orange:B | bananas:A | apple:C')

then

tsv @@ to_tsquery('aubergine:A | orange:B | bananas:A | apple:C')

and the relative ts_rank(). I'm aware that it is not symmetrical, but it looks like the cheapest and fastest thing I can do right now.

I'm using pg_catalog.english. Am I right in supposing that NO lexeme will contain spaces? If that is the case, I could simply use string manipulation tools. Not nice to see, but it would work.

> Then, you can create an ARRAY like:
> select ARRAY ( select (ts_stat(fts,'*')).word from papers where id=2);
> Then, you will have two arrays and you're free to apply any similarity function (cosine, jaccard, ...) to calculate what you want. If you want to preserve weights, then use the weight label instead of '*'.

What ts_rank does is more than enough right now.

> Another idea is to use array_agg, but I'm not ready to discuss it.
> Please, keep in mind that document similarity is a hot topic in

Not hard to imagine.

> IR, and, yes, I and Teodor have something about this, but the code isn't available for the public. Unfortunately, we had no sponsor for full-text search last year and I see no perspectives this year, so we postponed our text-search development.
Good luck. Do you have anything like http://www.chipin.com/ for small donations?

-- Ivan Sergio Borgonovo http://www.webthatworks.it
[GENERAL] ranking how "similar" are tsvectors was: OR tsquery
My initial request was about a way to build up a tsquery similar to what plainto_tsquery does, but using | instead of & as the glue. But at the end of the day, I'd like to find similar tsvectors and rank them.

I have a table containing several fields that contribute to build up a weighted tsvector. I'd like to pick up one tsvector and find the N most similar ones. I've found this:

http://domas.monkus.lt/document-similarity-postgresql

That's not really too far from what I was trying to do. But I have precomputed tsvectors (turning text into a tsvector should be a more expensive operation than string replacement), and I'd like to preserve the weights.

I'm not really sure, but I think a lexeme can actually contain a ' or a space (depending on the stemmer/parser?), so I'd have to take care of escaping etc... Since there is no direct access to the elements of a tsvector, the only "correct" way I see to build the query would be to manually rebuild the tsvector, getting the result back as a record using ts_debug and ts_lexize... and that looks a bit of a PITA.

I don't even think that having direct access to the elements of a tsvector would completely solve the problem, since tsvectors store positions too, but it would be a step forward in making it easier to compare documents to find similar ones. An operator that checks the intersection of tsvectors would come in handy. Adding a ts_rank(tsvector, tsvector) would surely help too.

thanks

-- Ivan Sergio Borgonovo http://www.webthatworks.it
Re: [GENERAL] OR tsquery
On Sat, 16 Jan 2010 19:10:45 +0300 (MSK) Oleg Bartunov wrote:
> Ivan,
> did you ever read the official documentation?
> http://www.postgresql.org/docs/8.4/static/textsearch-controls.html

Yes, but I still can't find something that works like plainto_tsquery but with |, or any example that wouldn't make obtaining that result convoluted. plainto_tsquery does a couple of things that I find hard to replicate with the available functions: it splits a string into lexemes, and it loops over the lexemes to build up the query with &.

Something like:

select (
  string_to_array(
    strip(to_tsvector('pg_catalog.english', 'orange banana woods booking'))::text,
    ' ')
)[i]
from generate_series(1, 4) s(i);

and then gluing up the pieces with |. And the above example still misses some of the details, like cleaning up the quotes.

Another option would be to return the tsvector to the client, build the tsquery there, and send it back to the server.

I'm on 8.3, but I don't think it makes any real difference here. Sorry if I'm still missing the obvious.

-- Ivan Sergio Borgonovo http://www.webthatworks.it
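If no lexeme contains a space (the assumption raised elsewhere in the thread), the whole loop can collapse into a single string replacement. A sketch:

```sql
-- Build an OR tsquery from plain text:
-- strip() drops positions, ::text yields "'appl' 'banana' 'orang'",
-- and replacing the separators with ' | ' gives a parseable tsquery.
SELECT replace(
         strip(to_tsvector('pg_catalog.english', 'apple banana orange'))::text,
         ' ', ' | '
       )::tsquery;
```

This breaks as soon as a lexeme contains an embedded space or quote, which is exactly the escaping caveat discussed in the thread.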
[GENERAL] OR tsquery
to_tsquery and plainto_tsquery produce an AND tsquery:

'apple banana orange' -> 'apple' & 'banana' & 'orange'

I can't see anything that will produce an OR tsquery:

'apple banana orange' -> 'apple' | 'banana' | 'orange'

The only thing I can think of is looping over ts_lexize, which doesn't look very efficient in plpgsql. Am I missing something?

thanks
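One lightweight workaround, a sketch rather than an official API (it assumes no lexeme contains the literal sequence ' & ' inside its quotes): let plainto_tsquery do the parsing and stemming, then rewrite the operators textually and re-parse:

```sql
-- Reuse plainto_tsquery's normalization, then swap & for |.
SELECT replace(
         plainto_tsquery('english', 'apple banana orange')::text,
         ' & ', ' | ')::tsquery;
-- e.g. 'appl' | 'banana' | 'orang'
```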
Re: [GENERAL] how much left for restore?
On Mon, 11 Jan 2010 18:36:18 + Sam Mason wrote:

> On Fri, Jan 08, 2010 at 11:28:15AM +0100, Ivan Sergio Borgonovo wrote:
> > Is there a way to know/estimate how much is left to complete a restore?
> maybe something like "pv" would help?
> http://www.ivarch.com/programs/pv.shtml

Nice. That starts to look more like what I was looking for... and yeah, I'm aware it could be misleading.

It would be nice to have it integrated into pg_restore/pg_dump. If the file is compressed, pg_* could do a better job of giving an estimate.

thanks
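For the record, a sketch of how pv fits into a restore pipeline (it assumes pv is installed and that mydb.dump / mydb.sql.gz are hypothetical dump files; the progress bar tracks bytes read from the file, which is only a rough proxy for restore progress, since a small table with many indexes can take longer than its byte count suggests):

```
# progress bar while restoring a custom-format dump
pv mydb.dump | pg_restore -d mydb

# for a compressed plain-SQL dump, put pv before the decompression
# so the bar tracks the known-size compressed file
pv mydb.sql.gz | gunzip | psql mydb
```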
Re: [GENERAL] how much left for restore?
On Mon, 11 Jan 2010 12:30:45 -0500 Francisco Reyes wrote:

> Ivan Sergio Borgonovo writes:
> > Is there a way to know/estimate how much is left to complete a restore?
> Not sure about plain ASCII files, but if your pg_dump used -Fc then at restore time you can pass the -v flag.

It gets a bit better, but even knowing which tables are the largest, it is hard to estimate how much is missing before the restore completes. I'm really looking for rough figures... even an "I've read 40% of the file" would be more usable information than "I've already loaded table A".

thanks
Re: [GENERAL] R: aggregate over tables in different schema
On Sun, 10 Jan 2010 10:49:48 +0100 Vincenzo Romano wrote:

> Try using inheritance.

One of the things I didn't mention is: I have to join these tables with other tables that may or may not belong to the same schema (some are in public).

select sum(i.qty)
from s1.list_items li
join public.item i on i.itemid = li.itemid;

Now I'd like to compute the sum over all list_items tables across all the schemas. If I define the summary table as the child of all the sub-tables, I'm going to write dynamic SQL anyway. So I guess I should define a common ancestor for all the list_items tables in the different schemas:

create table public.list_items (
  itemid int primary key, -- trouble
  name varchar(32)
);
create table s1.list_items () inherits (public.list_items);
create table s2.list_items () inherits (public.list_items);

But I can't see how I'm going to write the query. Furthermore, the children should have their own pk and not share it.

thanks
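For what it's worth, with that inheritance layout a query against the parent already scans every child, and since PRIMARY KEY constraints are not inherited, each child can declare its own. A sketch using the table names from the example above (public.item is assumed to exist as in the join):

```sql
-- Scans public.list_items plus s1.list_items, s2.list_items, ...
SELECT sum(i.qty)
FROM public.list_items li
JOIN public.item i ON i.itemid = li.itemid;

-- Indexes and constraints don't propagate to children,
-- so each child gets its own (non-shared) primary key.
ALTER TABLE s1.list_items ADD PRIMARY KEY (itemid);
ALTER TABLE s2.list_items ADD PRIMARY KEY (itemid);
```

Note the "trouble" flagged above remains: nothing stops s1 and s2 from using the same itemid value, since the parent's primary key is not enforced across children.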
[GENERAL] aggregate over tables in different schema
I have tables in different schemas, all with the same name and structure. I'd like to compute an aggregate over the union of those tables. I don't know the schemas in advance: the list of schemas will be built by selecting all the schemas that contain a table with that name.

Other than dynamically building the statement as a list of UNIONs, or building up a view, is there any other way?
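If dynamic SQL turns out to be the way, here is a minimal sketch of the UNION ALL approach (the qty column and the sum() aggregate are hypothetical; quote_ident guards the identifiers, and it avoids format() so it should work on 8.3-era servers):

```sql
-- Sketch: sum a column over every schema that has a table named tbl.
CREATE OR REPLACE FUNCTION sum_qty_everywhere(tbl name)
RETURNS bigint LANGUAGE plpgsql AS $$
DECLARE
  s   text;
  q   text := '';
  tot bigint;
BEGIN
  -- collect every schema containing a table with this name
  FOR s IN SELECT table_schema::text
           FROM information_schema.tables
           WHERE table_name = tbl
  LOOP
    IF q <> '' THEN
      q := q || ' UNION ALL ';
    END IF;
    q := q || 'SELECT qty FROM '
           || quote_ident(s) || '.' || quote_ident(tbl);
  END LOOP;
  EXECUTE 'SELECT sum(qty) FROM (' || q || ') u' INTO tot;
  RETURN tot;
END;
$$;
```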
[GENERAL] how much left for restore?
Is there a way to know/estimate how much is left to complete a restore? It would be enough just to know which part of the file is being restored (without causing too much extra IO, which would definitely put my notebook on its knees).

Next time I try a restore on this box, is there anything I could tweak in the pg config to make it faster? For dev only... could I just stop the dev server, copy the *files* to flash and mount them on the notebook?

thanks
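On the "make it faster" side, a few postgresql.conf settings that are commonly relaxed for a throwaway dev-only restore (a sketch for the 8.x era; fsync = off risks data loss on a crash, so never use it on data you care about):

```
# postgresql.conf -- dev-only bulk-restore settings (unsafe for real data)
fsync = off                    # skip forced WAL flushes
checkpoint_segments = 32       # fewer checkpoints during the load
maintenance_work_mem = 256MB   # faster index builds after the COPYs
autovacuum = off               # don't vacuum mid-restore
```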
Re: [GENERAL] Optimized Select Statement
On Wed, 6 Jan 2010 17:45:31 -0800 (PST) Yan Cheng Cheok wrote:

> situation, I will be only interested in 1 "YanChengCHEOK".
> SELECT measurement_type_id INTO _measurement_type_id FROM measurement_type WHERE measurement_type_name='YanChengCHEOK';

Add a LIMIT 1:

SELECT measurement_type_id INTO _measurement_type_id
FROM measurement_type
WHERE measurement_type_name = 'YanChengCHEOK'
LIMIT 1;

Is that what you were looking for?
Re: [GENERAL] pg_dump excluding tables content but not table schema
On Mon, 28 Dec 2009 21:20:17 +0100 Ivan Sergio Borgonovo wrote:

> pg_dump -Fc -Z9 -s -t *.cache* -d mydb > schema_only.bak
> pg_dump -Fc -Z9 -T *.cache* -d mydb > nearly_full.bak
> cat nearly_full.bak schema_only.bak | pg_restore -1 -d mydb
> It seems it is working... I'll test if everything is there.

Unfortunately it doesn't work as expected: it silently skips restoring the second backup (schema_only.bak). I'm surprised it didn't output any error message, but the cache tables aren't there.

It seems you actually have to restore the two backups separately:

pg_restore -1 -d mydb < nearly_full.bak
pg_restore -1 -d mydb < schema_only.bak

I can't think of any other way to restore both in one transaction unless I dump in plain text, but that has other drawbacks. Any hint?
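One possible middle ground, a sketch I haven't tested here: it relies on pg_restore emitting plain SQL to stdout when no -d is given, and on psql's -1/--single-transaction flag, so both archives end up in one transaction without dumping in plain text:

```
# convert both archives to SQL on the fly and feed them to a single
# psql session wrapped in one transaction
( pg_restore nearly_full.bak ; pg_restore schema_only.bak ) | psql -1 -d mydb
```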