Re: [GENERAL] Inheritance efficiency
2010/4/30 Vincenzo Romano:
> 2010/4/30 David Fetter:
>> On Thu, Apr 29, 2010 at 11:29:36AM +0200, Vincenzo Romano wrote:
>>> > No info about this point (partial indexes)?
>>> > Is also this geared with linear algorithms?
>>>
>>> Should I move to an "enterprise grade" version of PostgreSQL?
>>
>> The enterprise grade version of PostgreSQL is the community version.
>>
>> Proprietary forks exist, but they don't fix this kind of problem. :)
>
> Hmmm ... I think this is the kind of problem that keeps PostgreSQL away
> from the "enterprise grade" world. The ability to cope with thousands of
> DB objects like (child-)tables, indexes, functions and so on with O(1) or
> at least O(log(n)) complexity is among the key points.
>
> For example, the Linux kernel made the big jump with server hardware
> thanks also to the O(1) schedulers.
>
> In this specific case, if you think about "inheritance for partitioning"
> and you stick with the example idea of "one partition per month", then
> the current solution is more than OK. In the real world that is not
> really the general case, especially in the "enterprise grade" world,
> where maybe you partition on both a time stamp and another column, like
> product code ranges and prefixes ...
>
> Is there any planning about this improvement?

Could it be possible to just make some changes (adding indexes) to the information schema to gain this enterprise gradeness?

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS
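For context, a minimal sketch of the "inheritance for partitioning" pattern this thread discusses, partitioning on both a timestamp and a product-code range; table and column names are hypothetical:

-- Parent table (hypothetical names throughout).
CREATE TABLE sales (
    sold_at timestamp NOT NULL,
    product text      NOT NULL,
    amount  numeric
);

-- One child per (month, product range). With constraint_exclusion on,
-- the planner proves from the CHECK constraints which children it can
-- skip; the thread's complaint is that this work grows linearly with
-- the number of children.
CREATE TABLE sales_2010_04_a (
    CHECK (sold_at >= '2010-04-01' AND sold_at < '2010-05-01'),
    CHECK (product >= 'A' AND product < 'B')
) INHERITS (sales);

SET constraint_exclusion = on;
EXPLAIN SELECT * FROM sales
 WHERE sold_at >= '2010-04-10' AND sold_at < '2010-04-11'
   AND product >= 'A' AND product < 'B';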
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
Hi,

Did you eventually figure out what was wrong? Was it just that you were trying to load a full result set and running out of memory with an OutOfMemoryError? Or was the JVM truly crashing rather than just throwing OutOfMemoryError?

--
Craig Ringer
Re: [GENERAL] Inheritance efficiency
2010/4/30 David Fetter:
> On Thu, Apr 29, 2010 at 11:29:36AM +0200, Vincenzo Romano wrote:
>> > No info about this point (partial indexes)?
>> > Is also this geared with linear algorithms?
>>
>> Should I move to an "enterprise grade" version of PostgreSQL?
>
> The enterprise grade version of PostgreSQL is the community version.
>
> Proprietary forks exist, but they don't fix this kind of problem. :)

Hmmm ... I think this is the kind of problem that keeps PostgreSQL away from the "enterprise grade" world. The ability to cope with thousands of DB objects like (child-)tables, indexes, functions and so on with O(1) or at least O(log(n)) complexity is among the key points.

For example, the Linux kernel made the big jump with server hardware thanks also to the O(1) schedulers.

In this specific case, if you think about "inheritance for partitioning" and you stick with the example idea of "one partition per month", then the current solution is more than OK. In the real world that is not really the general case, especially in the "enterprise grade" world, where maybe you partition on both a time stamp and another column, like product code ranges and prefixes ...

Is there any planning about this improvement?

> Cheers,
> David.
> --
> David Fetter http://fetter.org/
> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david.fet...@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Vincenzo Romano
NotOrAnd Information Technologies
cel. +39 339 8083886 | gtalk. vincenzo.rom...@notorand.it
fix. +39 0823 454163 | skype. notorand.it
fax. +39 02 700506964 | msn. notorand.it
NON QVIETIS MARIBVS NAVTA PERITVS
Re: [GENERAL] Inheritance efficiency
On Thu, Apr 29, 2010 at 11:29:36AM +0200, Vincenzo Romano wrote:
> > No info about this point (partial indexes)?
> > Is also this geared with linear algorithms?
>
> Should I move to an "enterprise grade" version of PostgreSQL?

The enterprise grade version of PostgreSQL is the community version.

Proprietary forks exist, but they don't fix this kind of problem. :)

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [GENERAL] Performance and Clustering
Thanks a lot for all your responses. I am impressed, really impressed. I never thought I could get this many responses in such a short time. Wonderful support :) Thanks a lot :) :)

I don't have details; I'll get them really soon. But all your input is really valuable, and I have much more information now. I'll continue my research and spend a lot of time reading the wiki :P :)

I agree, 4k requests seems to be too much and crazy. I hope that my contact was wrong and it's only 400, which looks to be manageable.

Once again, thanks a lot. Really appreciate your valuable time. Thanks :)

On Thu, Apr 29, 2010 at 1:41 PM, Scott Marlowe wrote:
> On Thu, Apr 29, 2010 at 1:41 PM, Scott Marlowe wrote:
> > On Wed, Apr 28, 2010 at 7:08 PM, Jaime Rodriguez wrote:
> >> hi,
> >> Today is my first day looking at PostgreSQL
> >> I am looking to migrate a MS SQL DB to PostgreSQL :) :)
> >> My customer requires that DBMS shall support 4000 simultaneous requests
> >> Also the system to be deploy maybe a cluster, with 12 microprocessors
> >
> > I'm gonna jump in here and say that if you 400 REQUESTS running at the
> > same time, you're gonna want a REALLY big machine.
>
> I hate my keyboard... I meant to say:
>
> .. if you really need 4000 requests running at the same time...

--
Ing. Jaime Rodríguez Quesada, Mag
Liberux S.A.
http://www.liberux.com
Re: [GENERAL] sql help, reusing a column
On 04/29/2010 05:08 PM, Thomas Kellerer wrote:
> SELECT organization, state, lastdate, age(lastdate)
> FROM (
>   SELECT organization, state,
>          (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate
>   FROM customers
> ) t
> order by lastdate desc

Ah, yes, that does work, very nice. Thank you.

-Andy
Re: [GENERAL] PostgreSQL Performance issue
Hello there,

1. Try using the COPY command; you will see a significant decrease in the loading time.
2. Turn off autocommit, and remove foreign key constraints if it is only a one-time load; this will also help decrease the load time.

Try these options and let us know how it went. We load around 6M rows of data into a table using the COPY command; it takes a few minutes to load the data, and the system configuration is not that high either. Also, one more thing: we use a Linux box over here.

Do a small test of how long it takes to do 1000 inserts into a similar table, and send us your timings and the definition of the table. I will compare against mine.

Thanks
Deepak

On Tue, Apr 27, 2010 at 10:09 PM, wrote:
> I am curious to know how much of your delay is due to PostgreSQL and how
> much to your Java batch program. If you comment out the call to the
> database function, so that you are reading your input file but not doing
> anything with the data, how long does your batch program take to run?
>
> RobR
>
> --
>
> The Java program hardly takes a minute to process all the flat files, but
> at the time of inserting the records into the db, the entire process takes
> more than 4 hours.
>
> Many thanks
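A minimal sketch of the COPY approach suggested above, with a hypothetical table and file path (the file must be readable by the server process; from psql, \copy loads from the client side instead):

-- Hypothetical table matching a three-column flat file.
CREATE TABLE staging_load (
    id    integer,
    name  text,
    price numeric
);

-- One COPY replaces thousands of individual INSERTs and their
-- per-statement overhead.
COPY staging_load FROM '/tmp/data.csv' WITH CSV;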
Re: [GENERAL] Problem: concat an array of arrays
Thanks, Merlin! The "restack" function solves the problem! :)

> what are you trying to do w/unfold function exactly?

The recursive query I mentioned was to produce from the argument array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11]] the result array[1,2,3,4,5,6,7,8,9,10,11]. The behaviour of the unnest function confused me, I didn't expect anything like that:

postgres=# select array(select unnest(array[array[1,2,3],array[4,5,6]]));
   ?column?
---------------
 {1,2,3,4,5,6}
(1 row)

postgres=# select array(select unnest(array[array[1,2,3],array[4,5]]));
ERROR:  multidimensional arrays must have array expressions with matching dimensions

But, oh well, at least I can make a {...} from {{...}} in a functional way :)

Regards,
Belka

29.04.10, 08:53, "Merlin Moncure":

> On Thu, Apr 29, 2010 at 8:46 AM, Merlin Moncure wrote:
> > On Wed, Apr 28, 2010 at 8:48 PM, Belka Lambda wrote:
> >> Hi!
> >>
> >> I tried to write a recursive SELECT, that would do the concatination,
> >> but a problem appeared: can't make a {1,2,3} from {{1,2,3}}.
> >> Here are some experiments:
> >> ---
> >> postgres=# select array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]];
> >>                 array
> >> --------------------------------------
> >>  {{1,2,3},{4,5,6},{7,8,9},{10,11,12}}
> >> (1 row)
> >>
> >> postgres=# select (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]])[3];
> >>  array
> >> -------
> >>
> >> (1 row)
> >>
> >> postgres=# select (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]])[3:3];
> >>    array
> >> -----------
> >>  {{7,8,9}}
> >> (1 row)
> >>
> >> postgres=# select (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]])[3][1];
> >>  array
> >> -------
> >>      7
> >> (1 row)
> >> ---
> >>
> >> The original query, that would do the concatenation:
> >> ---
> >> WITH RECURSIVE unfold (rest, accum) AS (
> >>   VALUES ($1 :: int[][], ARRAY[] :: int[])
> >>   UNION ALL
> >>   SELECT u.rest[2:array_length(u.rest, 1)] AS rest,
> >>          array_cat(u.rest[1], u.accum) AS accum
> >>   FROM unfold AS u
> >>   WHERE array_length(u.rest, 1) > 0
> >> )
> >> SELECT u.accum
> >> FROM unfold AS u
> >> WHERE array_length(u.rest, 1) = 0;
> >> ---
> >> Throws an error:
> >> ERROR: function array_cat(integer, integer[]) does not exist
> >
> > array_cat requires two array arguments. you could rewrite your expression to
> > array_cat(array[u.rest[1]], u.accum)
> > (i think, not quite sure what you are trying to do).
> >
> > you can append scalars to arrays with the || operator:
> > select array[1,2,3] || 4;
> >  ?column?
> > -----------
> >  {1,2,3,4}
> >
> > you can kinda sorta slice an array using the slice method:
> > select (array[array[1,2,3], array[2,4,6]])[1:1];
> >    array
> > -----------
> >  {{1,2,3}}
> >
> > what are you trying to do w/unfold function exactly?
>
> hm. the basic problem is that it's difficult to slide arrays up/down
> dimensions. you can move from scalars to arrays and arrays to
> scalars, but not from dimension N to N-1 etc. you can however move
> from dimension 'N' to 1:
>
> create or replace function restack(_array anyarray) returns anyarray as
> $$
>   select array(select unnest($1));
> $$ language sql immutable;
>
> select restack(array[1,2,3]);
>  restack
> ---------
>  {1,2,3}
>
> select restack(array[array[1,2,3]]);
>  restack
> ---------
>  {1,2,3}
>
> merlin
Re: [GENERAL] Recovering Data from a crashed database
On 30/04/2010 5:29 AM, Eric Langheinrich wrote:
> I'm looking for options to recover data from a crashed postgres database
> server. We recently had a solid state storage device blow up taking the
> database server with it. The database is version 8.3, the pg_clog, pg_xlog
> and subdirectories of pg_tblspc were wiped out with the crashed storage
> device. We do have the files under /data/base. pgfsck looked like the
> right tool for the job, but seems to be outdated and lacking support for 8.3

Whatever you do, and before you do anything else, take a full copy of everything you still have and put it on storage you then ensure is read-only. This is important. Any recovery attempt you make may make things worse, and change the situation from "recoverable" to "completely hosed". Once you have a full snapshot, you can supply that to anyone you choose to help with recovery. I strongly suggest making sure the original pg data directory is read-only too.

If you're going to do your own recovery attempts, copy the data to a spare machine and try it there, simply to make sure you've got everything isolated and there's no chance you're going to stomp on the original copy.

--
Craig Ringer
Re: [GENERAL] sql help, reusing a column
Andy Colson wrote on 29.04.2010 23:51:
> Here is my query, which works:
>
> select organization, state,
>   (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate,
>   age( (select max(idate) from times where customers.custid=times.custid and taskid = 27) )
> from customers
> order by lastdate desc nulls last;
>
> I'd love to use age(lastdate) instead of age( (repeat sql) ), but it does
> not seem to work.

This should work:

SELECT organization, state, lastdate, age(lastdate)
FROM (
  SELECT organization, state,
         (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate
  FROM customers
) t
order by lastdate desc
Re: [GENERAL] sql help, reusing a column
On 4/29/2010 4:51 PM, Andy Colson wrote:
> I tried this:
>
> select organization, state, max(idate), age(max(idate))
> from customers inner join times using(custid)
> where taskid = 27
> group by organization, state
> order by idate desc nulls last;
>
> but get error that times.idate must appear in group by or used in agg
> func... except it is used in an agg func. Any hints on what I'm missing?
>
> Thanks,
> -Andy

Ahh, shoot, it was the idate in the order by, not the select list. Both "order by 3" and "order by max(idate)" work just fine. Sorry for the noise... but still... I'm kinda curious, in my first example, how you can re-use a column. Is there a way to:

select organization, state,
  (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate,
  age(lastdate)
from customers

-Andy
[GENERAL] sql help, reusing a column
Here is my query, which works:

select organization, state,
  (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate,
  age( (select max(idate) from times where customers.custid=times.custid and taskid = 27) )
from customers
order by lastdate desc nulls last;

I'd love to use age(lastdate) instead of age( (repeat sql) ), but it does not seem to work. I tried this:

select organization, state, max(idate), age(max(idate))
from customers inner join times using(custid)
where taskid = 27
group by organization, state
order by idate desc nulls last;

but get an error that times.idate must appear in the group by or be used in an agg func... except it is used in an agg func. Any hints on what I'm missing?

Thanks,
-Andy
Re: [GENERAL] Recovering Data from a crashed database
On Thu, 2010-04-29 at 15:29 -0600, Eric Langheinrich wrote: > I'm looking for options to recover data from a crashed postgres > database server. We recently had a solid state storage device blow up > taking the database server with it. > > The database is version 8.3, the pg_clog, pg_xlog and subdirectories > of pg_tblspc were wiped out with the crashed storage device. We do > have the files under /data/base. > > pgfsck looked like the right tool for the job, but seems to be > outdated and lacking support for 8.3 > > I'm open to all options including outsourcing the data recovery. Any > help is appreciated. 2ndQuadrant offers commercial data recovery services for people in your position. We'd be happy to help and regrettably have considerable experience. If you're interested, please contact us direct/off-list. -- Simon Riggs www.2ndQuadrant.com
[GENERAL] Recovering Data from a crashed database
I'm looking for options to recover data from a crashed postgres database server. We recently had a solid state storage device blow up taking the database server with it. The database is version 8.3, the pg_clog, pg_xlog and subdirectories of pg_tblspc were wiped out with the crashed storage device. We do have the files under /data/base. pgfsck looked like the right tool for the job, but seems to be outdated and lacking support for 8.3 I'm open to all options including outsourcing the data recovery. Any help is appreciated. Thank you, Eric
Re: [GENERAL] How to monitor Parallel pg_restore ?
Any suggestions?

On Thu, Apr 29, 2010 at 4:42 PM, raghavendra t wrote:
> Hi All,
>
> I am using Postgres 8.4 and the pg_restore -j option. I have a dump of the
> database taken with -Fc and elected the pg_restore -j option for faster
> restoration. When the restoration process is in progress, I want to
> monitor the threads invoked by pg_restore (suppose I give -j 4). I have
> checked pg_stat_activity, in which I see only one transaction running,
> that is a COPY command.
>
> Even the "top" command hasn't shown anything.
>
> Could you please assist me with this.
>
> Regards
> Raghavendra
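For what it's worth, each pg_restore -j worker opens its own database connection, so several backends can show up while the restore runs; a hedged sketch for watching them from another session, using the 8.4 catalog column names:

-- Each parallel worker is a separate backend (8.4 column names);
-- during the data phase one COPY per active worker may appear.
SELECT procpid, query_start, current_query
  FROM pg_stat_activity
 WHERE current_query <> '<IDLE>'
 ORDER BY query_start;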
Re: [GENERAL] Populate arrays from multiple rows
Thanks Merlin:

I failed to mention that I'm running 8.3 (no array_agg), but you certainly pointed me in the right direction. This worked:

INSERT INTO foo_arrays
SELECT cde, nbr,
       ARRAY_ACCUM(CAST(aaa AS text)),
       ARRAY_ACCUM(CAST(bbb AS text)),
       ARRAY_ACCUM(CAST(ccc AS text))
FROM raw_foo
GROUP BY 1,2;

Cheers,
Rob

-----Original Message-----
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: Wednesday, April 28, 2010 4:33 PM
To: Clift, Robert
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Populate arrays from multiple rows

On Wed, Apr 28, 2010 at 1:39 PM, wrote:
> Good afternoon:
>
> I would like to insert some (1 or more) values from multiple rows of
> one table into an array in another table. Here's the scenario:
>
> --table to house data provided by a third party
> CREATE TABLE raw_foo (
>   rf_id serial PRIMARY KEY,
>   cde character varying(4),
>   nbr integer,
>   aaa character varying(60),
>   bbb character(10),
>   ccc character varying(20)
> );
>
> --table raw_foo populated by copying from a text file
> --columns cde||nbr identify a person while columns aaa||bbb||ccc describe
>   an attribute of a person
> --since each person can have one or more attributes, the cde||nbr
>   identifier is not distinct
> --need data in raw_foo flattened so that there is only one record per person
>
> --second table in which aaa, bbb, and ccc are array fields
> CREATE TABLE foo_arrays (
>   cde character varying(4),
>   nbr integer,
>   aaa text[],
>   bbb text[],
>   ccc text[],
>   PRIMARY KEY (cde, nbr)
> );
>
> --insertion of all distinct cde||nbr combinations from raw_foo
> INSERT INTO foo_arrays (cde, nbr)
>   (SELECT cde, nbr
>    FROM raw_foo
>    GROUP BY cde, nbr
>    HAVING COUNT(*) = 1)
>   UNION
>   (SELECT cde, nbr
>    FROM raw_foo
>    GROUP BY cde, nbr
>    HAVING COUNT(*) > 1);
>
> --hope to update foo_arrays.aaa by selecting every instance of
> raw_foo.aaa where raw_foo.cde||raw_foo.nbr matches the distinct value
> of foo_arrays.cde||foo_arrays.nbr (repeating the process for
> foo_arrays.bbb and foo_arrays.ccc)
>
> UPDATE foo_arrays
> SET aaa = ???
>
> This is where I'm stumped.
> Am I on the right path?
> Thanks in advance.

Hello, fellow Floridian! :-)

how about this:

insert into foo_arrays
select cde, nbr, array_agg(aaa), array_agg(bbb), array_agg(ccc)
from raw_foo
group by 1,2;

merlin
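For anyone else on 8.3: ARRAY_ACCUM is not built in there either; the definition given in the PostgreSQL documentation's user-defined aggregate example is:

-- From the PostgreSQL docs; array_agg (built in from 8.4 on) makes
-- this unnecessary on later versions.
CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);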
Re: [GENERAL] Storing many big files in database- should I do it?
On Apr 29, 2010, at 10:45 AM, Justin Graf wrote:
> Many people encode the binary data in Base64 and store as text data
> type?? Then never have to deal with escaping bytea data type. Which i
> have found can be a pain

Damn. Wish I'd thought of that ;-)

--
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice
Re: [GENERAL] Performance and Clustering
On Thu, Apr 29, 2010 at 1:41 PM, Scott Marlowe wrote:
> On Wed, Apr 28, 2010 at 7:08 PM, Jaime Rodriguez wrote:
>> hi,
>> Today is my first day looking at PostgreSQL
>> I am looking to migrate a MS SQL DB to PostgreSQL :) :)
>> My customer requires that DBMS shall support 4000 simultaneous requests
>> Also the system to be deploy maybe a cluster, with 12 microprocessors
>
> I'm gonna jump in here and say that if you 400 REQUESTS running at the
> same time, you're gonna want a REALLY big machine.

I hate my keyboard... I meant to say:

.. if you really need 4000 requests running at the same time...
Re: [GENERAL] Performance and Clustering
On Wed, Apr 28, 2010 at 7:08 PM, Jaime Rodriguez wrote:
> hi,
> Today is my first day looking at PostgreSQL
> I am looking to migrate a MS SQL DB to PostgreSQL :) :)
> My customer requires that DBMS shall support 4000 simultaneous requests
> Also the system to be deploy maybe a cluster, with 12 microprocessors

I'm gonna jump in here and say that if you 400 REQUESTS running at the same time, you're gonna want a REALLY big machine.

I admin a setup where two db servers handle ~200 simultaneous requests, almost all being very short millisecond long requests, a few being 100 milliseconds, and a very very few running for seconds. With 8 2.1 GHz Opteron cores, 32 Gigs of ram, and 14x15k drives, those machines run with a load factor in the range of 10 to 15. CPUs are maxed at that range of load, and IO is 70 to 80% utilized according to iostat -x. Wait % is generally one core max. Some of that load is fixed on the master, but a lot can be handled by slaves.

Your load, if you really are having 4000 simultaneous connections, is likely going to need 20 times the load handling I need. Given the newer 12 core AMDs are somewhat faster, you could probably get away with two or three of these machines. If you were to use 96 core machines (8Px12core) with as many disks as you could throw at them (40 to 100) then you're in the ballpark for a set of machines to process 4,000 simultaneous requests, assuming a mostly read (80% or so) setup. We're talking a large % of a full sized rack to hold all the drives and cores you'd need.

But this brings up a lot of questions about partitioning your dataset if you can, things like that. Do all of these 4,000 simultaneous requests need to update the same exact data set? Or are they read mostly reporting users? Can you use memcached to handle part of the load? Usage patterns inform a great deal on how to size a system to handle that much load.
Re: [GENERAL] Storing many big files in database- should I do it?
On 4/29/2010 3:18 PM, Tom Lane wrote:
> Alvaro Herrera writes:
>
> However, that toast limit is per-table, whereas the pg_largeobject limit
> is per-database. So for example if you have a partitioned table then
> the toast limit only applies per partition. With large objects you'd
> fall over at 4G objects (probably quite a bit less in practice) no
> matter what.
>
> regards, tom lane

Has there been any thought of doing something similar to MS filestream? http://msdn.microsoft.com/en-us/library/cc949109.aspx

It seems to overcome all the drawbacks of storing files in the DB.
Re: [GENERAL] Performance and Clustering
This whole sockets conversation has wandered way off topic. PostgreSQL runs into high-connection scaling issues due to memory limitations (on Windows in particular, as noted in the FAQ entry I suggested), shared resource contention, and general per-connection overhead long before socket issues matter.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com www.2ndQuadrant.us
Re: [GENERAL] Storing many big files in database- should I do it?
Alvaro Herrera writes:
> Each toasted object also requires an OID, so you cannot have more than 4
> billion toasted attributes in a table.
> I've never seen this to be a problem in real life, but if you're talking
> about having that many large objects, then it will be a problem with
> toast too.

However, that toast limit is per-table, whereas the pg_largeobject limit is per-database. So for example if you have a partitioned table then the toast limit only applies per partition. With large objects you'd fall over at 4G objects (probably quite a bit less in practice) no matter what.

regards, tom lane
Re: [GENERAL] Storing many big files in database- should I do it?
Justin Graf wrote:
> On 4/29/2010 12:07 PM, David Wall wrote:
>> Big downside for the DB is that all large objects appear to be stored
>> together in pg_catalog.pg_largeobject, which seems axiomatically
>> troubling that you know you have lots of big data, so you then store
>> them together, and then worry about running out of 'loids'.
>
> Huh ??? isn't that the point of using bytea or text datatypes.
>
> I could have sworn bytea does not use the large object interface, it uses
> TOAST, or have I gone insane

Each toasted object also requires an OID, so you cannot have more than 4 billion toasted attributes in a table. I've never seen this to be a problem in real life, but if you're talking about having that many large objects, then it will be a problem with toast too.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [GENERAL] Performance and Clustering
> I don't think it's that easy. 50,000 sockets open, sure, but what's the
> performance? The programming model has everything to do with that, and
> Windows select() won't support that many sockets with any sort of
> performance. For Windows you have to convert to using non-blocking
> sockets w/messages. (And I've never seen the PG code, but I'll bet it's
> not using non-blocking sockets & the Windows msg queue, so 50k sockets
> using select() on Windows will not be usable.) That being said, I'm not a
> Windows socket component developer, so it's mostly guessing. But saying
> "it can" and saying "it's usable" are two different things, and that
> depends on the code, not the registry settings.

Actually that is incorrect. You can use synchronous non-blocking sockets. Asynchronous is a nightmare due to the overhead of pushing and handling messages... the busier the kernel, the slower your application. Synchronous non-blocking shows a small degradation in performance every 5,000 sockets (meaning 10,000 streams is minimally slower than 5,000, but enough to denote degradation).

Systems running my product and designs:

AOL's Proxy Server System
Some of the UK's largest ISPs
AT&T Fiber Monitoring Framework
HBO Video Streaming to Satellite
Hart, a front-end for TransUnion, Equifax and Experian
OFAC Query (B-Tree query service, processing over 100,000 requests a second) (*)

* WAN latency plays a running variable in their stats, but they average 100,000+ a second during peak hours. [1 master, 2 fail-over load-balanced servers.]

Most people run into the "2048+/-" thread limitation until they learn how to properly manage stack allocation per thread. I have been designing commercial enterprise socket solutions for over 15 years and sell an SDK that no product has yet to touch, and I compete with ALL the big boys (and they all know who I am). :-) ... The limitations in performance are factors of poor (modern sloppiness) variable allocation, memory management, buffering techniques, etc.

I got out of actively promoting DXSock (my socket suite) when I found I could capitalize more on my time and my knowledge... so since 2000 I sell my knowledge.

Factors which also come into play are the built-in overhead of the operating system when it is a "network client/server" and has active connections. These connections also incur the poor default settings Microsoft picked (the FIN_WAIT/2 issue, which is another registry tweak). Once you learn which services a dedicated Windows server will not need, and rip out all of the excess "network client" junk (this is well documented all over the net), you can produce very robust Windows servers. (Of course there are much better solutions for production servers than Windows - but people still drink the Microsoft "blue" Kool-Aid.)

* People who document the registry tweaks needed: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.dc38421_1500/html/ntconfig/X26667.htm

;-) O.
Re: [GENERAL] Storing many big files in database- should I do it?
On Thu, Apr 29, 2010 at 1:51 PM, David Wall wrote:
> I missed the part that BYTEA was being used, since it's generally not a
> good way for storing large binary data, because you are right that BYTEA
> requires escaping across the wire (client to backend) in both directions,
> which for true binary data (like compressed/encrypted data, images or
> other non-text files) makes for a lot of expansion in size and related
> memory.

what?? postgresql supports binary data in both directions without escaping. here is how i do it with libpqtypes:

PGbytea b;
b.data = some_pointer;
b.len = data_length;

res = PQexecf(conn, "insert into table values (%bytea*);", &b);

merlin
Re: [GENERAL] Performance and Clustering
On 4/29/2010 11:49 AM, Ozz Nixon wrote:
> On 4/29/10 12:42 PM, Greg Smith wrote:
>> Alban Hertroys wrote:
>>> The reason I'm asking is that Postgres doesn't perform at its best on
>>> Windows and I seriously wonder whether the OS would be able to handle a
>>> load like that at all (can Windows handle 4000 open sockets for
>>> example?).
>>
>> You have to go out of your way to even get >125 connections going on
>> Windows; see the very last entry at
>> http://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows
>
> I design socket component suites for developers; on Windows, with a few
> registry tweaks, you are able to have over 50,000 live, hot sockets.

I don't think it's that easy. 50,000 sockets open, sure, but what's the performance? The programming model has everything to do with that, and Windows select() won't support that many sockets with any sort of performance. For Windows you have to convert to using non-blocking sockets w/messages. (And I've never seen the PG code, but I'll bet it's not using non-blocking sockets & the Windows msg queue, so 50k sockets using select() on Windows will not be usable.) That being said, I'm not a Windows socket component developer, so it's mostly guessing. But saying "it can" and saying "it's usable" are two different things, and that depends on the code, not the registry settings.

-Andy
Re: [GENERAL] Writing SRF
It has been years since I've mucked in the C++ swamp, but that means your (near) heap is OK but your stack is hosed... probably specific to compiler (version), operating system (version) and environment settings. Ping back if you are still experiencing those problems with those configuration settings.

Warm regards from the USA!
Martin Gainty

> From: jorgearev...@gis4free.org
> Date: Thu, 29 Apr 2010 19:45:41 +0200
> Subject: Re: [GENERAL] Writing SRF
> To: t...@sss.pgh.pa.us
> CC: pgsql-general@postgresql.org
>
> On Thu, Apr 29, 2010 at 3:56 PM, Tom Lane wrote:
> > Jorge Arevalo writes:
> >> Yes. For example, the function expects 2 arguments, and it's called
> >> with 2 arguments: 1 composite type (following this format
> >> https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat)
> >> and one integer. But PG_NARGS() returns a really big value (16297)
> >> when I first check the number of arguments at the beginning of the
> >> function. Has sense?
> >
> > Given only that data point, I would guess that you forgot to mark the
> > function as being called with V1 protocol (PG_FUNCTION_INFO_V1).
> >
> > regards, tom lane
>
> Many thanks! That was one of my errors. Another one was this:
>
> char szDataPointer[10];
> sprintf(szDataPointer, "%p", a_pointer);
>
> These lines caused a memory error. I changed them for:
>
> char * pszDataPointer;
> pszDataPointer = (char *)allocator(10 * sizeof(char));
> sprintf(pszDataPointer, "%p", a_pointer);
>
> By "allocator" I mean a memory allocator in a valid memory context for
> PostgreSQL.
>
> And it seems to work :-). Is static memory "dangerous" in a PostgreSQL
> memory context?
>
> Thanks again!
> Jorge
Re: [GENERAL] Storing many big files in database- should I do it?
On 4/29/2010 1:51 PM, David Wall wrote:
>> Put it another way: bytea values are not stored in the pg_largeobject
>> catalog.
>
> I missed the part that BYTEA was being used, since it's generally not a
> good way for storing large binary data ...
>
> BYTEA and TEXT both can store up to 1GB of data (max field length),
> which means even less "file size" supported if you use TEXT with base64
> coding. LO supports 2GB of data. ...
>
> Of course, to support larger file storage than 1GB or 2GB, you'll have
> to create your own "toast"-like capability to split them into multiple
> rows.
>
> David

Outside of videos/media streams, what other kind of data is going to be 1 gig in size? That's a lot of data, even today. We all talk about 1 gig and 2 gig limits on this, but really, who has bumped into that on a regular basis??? Every time I hear about that not being big enough, the person is trying to shoehorn media files into the database, which is insane.
Re: [GENERAL] Writing SRF
Jorge Arevalo writes:
> Many thanks! That was one of my errors. Another one was this:
>
> char szDataPointer[10];
> sprintf(szDataPointer, "%p", a_pointer);
>
> These lines caused a memory error.

That looks all right in itself (unless you're on a 64-bit machine, in which case you need a bigger array to hold %p output). However the array would only live as long as the function it's in. What were you doing with the data afterwards, returning it maybe?

regards, tom lane
Re: [GENERAL] Storing many big files in database- should I do it?
>> Huh ??? isn't that the point of using bytea or text datatypes. I could
>> have sworn bytea does not use the large object interface, it uses TOAST,
>> or have I gone insane
>
> You're not insane :)
> Put it another way: bytea values are not stored in the pg_largeobject
> catalog.

I missed the part that BYTEA was being used, since it's generally not a good way for storing large binary data, because you are right that BYTEA requires escaping across the wire (client to backend) in both directions, which for true binary data (like compressed/encrypted data, images or other non-text files) makes for a lot of expansion in size and related memory.

BYTEA and TEXT both can store up to 1GB of data (max field length), which means even less "file size" supported if you use TEXT with base64 coding. LO supports 2GB of data. In JDBC, typically BYTEA is used with byte[] or a binary stream, while LOs are used with BLOB. I think LOs allow for streaming with the backend, too, but I'm not sure about that, whereas I'm pretty sure BYTEA/TEXT move all the data together, so it will be in memory all or nothing.

Of course, to support larger file storage than 1GB or 2GB, you'll have to create your own "toast"-like capability to split them into multiple rows.

David
Re: [SQL] [GENERAL] Tsearch not searching 'Y'
You can avoid stemming by using 'simple' instead of 'english' as the language of the words in to_tsvector (which is a little more awkward than the cast). "There are no stop words for the simple dictionary. It will just convert to lower case, and index every unique word."

SELECT to_tsvector('simple', 'Andy andy The the in out');
             to_tsvector
--------------------------------------
 'in':5 'out':6 'the':3,4 'andy':1,2
(1 row)

John

On Apr 29, 2010, at 4:01 PM, Tom Lane wrote:
> "sandeep prakash dhumale" writes:
>> I am trying to get tsearch working for my application but I am facing a
>> problem when the alphabet 'Y' is in the tsquery.
>
>> # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*');
>>  ?column?
>> ----------
>>  f
>> (1 row)
>
> You can't use to_tsquery for this sort of thing, because it tries to
> normalize the given words:
>
> regression=# select to_tsquery('holly:*');
>  to_tsquery
> ------------
>  'holli':*
> (1 row)
>
> If you do this it works:
>
> regression=# SELECT 'hollywood'::tsvector @@ 'holly:*'::tsquery;
>  ?column?
> ----------
>  t
> (1 row)
>
> So if you want to use prefix matching, don't normalize.
>
> regards, tom lane
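Putting the two halves of this thread together, a small hedged sketch: with the 'simple' configuration on both the vector and the query, nothing is stemmed, so the prefix match also works through to_tsquery:

-- 'simple' skips stemming on both sides, so 'holly:*' stays 'holly':*
-- and prefix-matches 'hollywood'.
SELECT to_tsvector('simple', 'hollywood') @@ to_tsquery('simple', 'holly:*');
-- ?column?
-- ----------
--  t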
Re: [GENERAL] Writing SRF
On Thu, Apr 29, 2010 at 3:56 PM, Tom Lane wrote:
> Jorge Arevalo writes:
>> Yes. For example, the function expects 2 arguments, and it's called
>> with 2 arguments: 1 composite type (following this format
>> https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat)
>> and one integer. But PG_NARGS() returns a really big value (16297)
>> when I first check the number of arguments at the beginning of the
>> function. Has sense?
>
> Given only that data point, I would guess that you forgot to mark the
> function as being called with V1 protocol (PG_FUNCTION_INFO_V1).
>
> regards, tom lane

Many thanks! That was one of my errors. Another one was this:

char szDataPointer[10];
sprintf(szDataPointer, "%p", a_pointer);

These lines caused a memory error. I changed them for:

char * pszDataPointer;
pszDataPointer = (char *)allocator(10 * sizeof(char));
sprintf(pszDataPointer, "%p", a_pointer);

By "allocator" I mean a memory allocator in a valid memory context for PostgreSQL.

And it seems to work :-). Is static memory "dangerous" in a PostgreSQL memory context?

Thanks again!
Jorge
Re: [GENERAL] Cumulative count (running total) window fn
> Curious note - how does the non-subselect version and the subselect
> version compare performance-wise?

Magnus,

On a test table with 12,000 rows there's not much in it; the subselect has a simpler plan but they both take practically the same time. The two plans (note I've been rewriting the field names for readability until now but haven't here):

explain analyze SELECT year, sum(c) over (order by year)
FROM (
  SELECT extract(year from a56b7a8d6de03f67b) AS year,
         count(a10e4ab8863c199f1) AS c
  FROM a2e9a7e9e257153de
  GROUP BY extract(year from a56b7a8d6de03f67b)
) as subq;

                                      QUERY PLAN
-------------------------------------------------------------------------------------
 WindowAgg  (cost=851.49..874.06 rows=1290 width=16) (actual time=43.369..43.394 rows=5 loops=1)
   ->  Sort  (cost=851.49..854.71 rows=1290 width=16) (actual time=43.340..43.342 rows=5 loops=1)
         Sort Key: (date_part('year'::text, a2e9a7e9e257153de.a56b7a8d6de03f67b))
         Sort Method:  quicksort  Memory: 25kB
         ->  HashAggregate  (cost=752.59..771.94 rows=1290 width=26) (actual time=43.300..43.317 rows=5 loops=1)
               ->  Seq Scan on a2e9a7e9e257153de  (cost=0.00..689.56 rows=12605 width=26) (actual time=0.031..26.723 rows=12605 loops=1)
 Total runtime: 43.549 ms

explain analyze SELECT extract(year from a56b7a8d6de03f67b),
       count(a10e4ab8863c199f1),
       sum(count(a10e4ab8863c199f1)) OVER (ORDER BY count(a10e4ab8863c199f1))
FROM a2e9a7e9e257153de
GROUP BY 1 ORDER BY 1;

                                      QUERY PLAN
-------------------------------------------------------------------------------------
 Sort  (cost=1382.39..1388.52 rows=2451 width=32) (actual time=44.229..44.230 rows=5 loops=1)
   Sort Key: (date_part('year'::text, a56b7a8d6de03f67b))
   Sort Method:  quicksort  Memory: 25kB
   ->  WindowAgg  (cost=1195.39..1244.41 rows=2451 width=32) (actual time=44.171..44.208 rows=5 loops=1)
         ->  Sort  (cost=1195.39..1201.52 rows=2451 width=32) (actual time=44.125..44.127 rows=5 loops=1)
               Sort Key: (count(a10e4ab8863c199f1))
               Sort Method:  quicksort  Memory: 25kB
               ->  HashAggregate  (cost=1014.52..1057.41 rows=2451 width=32) (actual time=44.071..44.099 rows=5 loops=1)
                     ->  Seq Scan on a2e9a7e9e257153de  (cost=0.00..833.58 rows=24126 width=32) (actual time=0.032..26.683 rows=12605 loops=1)
 Total runtime: 44.396 ms

Regards
Oliver Kohll

oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll
www.agilebase.co.uk - software
www.gtwm.co.uk - company
[GENERAL] Select with string that has a lone hyphen yields nothing
I have a product names table like this:

datab=# select product_id, name from table.product_synonyms where name ilike '%%olympus e-pl1%%';

 product_id | name
------------+---------------------------------------------------------------
       8736 | Olympus E-PL1
       8736 | Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera – Black (Body Only) (Call for pre-order. Available on: 2010-04-09)
       8736 | Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera – Blue (Body Only)
       8736 | Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera w/ ED 14-42mm f3.5-5.6 (champagne/gold)
(4 rows)

Any select statement prior to the hyphen yields a result, but from the hyphen on yields nothing:

datab=# select product_id, name from table.product_synonyms where name ilike '%%Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera - Blue %%';
 product_id | name
------------+------
(0 rows)

Any ideas how to fix this?
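One thing visible in the listing above, offered as a hedged observation: the stored names contain an en dash (U+2013), while the failing pattern uses an ASCII hyphen, and ILIKE treats those as different characters. A quick side-by-side check (table name as given in the post):

select product_id, name from table.product_synonyms
 where name ilike '%Digital Camera – Blue%';  -- en dash (U+2013): should match

select product_id, name from table.product_synonyms
 where name ilike '%Digital Camera - Blue%';  -- ASCII hyphen: returns no rows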
Re: [GENERAL] Storing many big files in database- should I do it?
On 29/04/2010 18:45, Justin Graf wrote:
> On 4/29/2010 12:07 PM, David Wall wrote:
>> Big downside for the DB is that all large objects appear to be stored
>> together in pg_catalog.pg_largeobject, which seems axiomatically
>> troubling that you know you have lots of big data, so you then store
>> them together, and then worry about running out of 'loids'.
>
> Huh ??? isn't that the point of using bytea or text datatypes.
>
> I could have sworn bytea does not use the large object interface, it uses
> TOAST, or have I gone insane

You're not insane :)
Put it another way: bytea values are not stored in the pg_largeobject catalog.

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
Re: [GENERAL] Performance and Clustering
On 4/29/10 12:42 PM, Greg Smith wrote:
> Alban Hertroys wrote:
>> The reason I'm asking is that Postgres doesn't perform at its best on
>> Windows and I seriously wonder whether the OS would be able to handle a
>> load like that at all (can Windows handle 4000 open sockets for
>> example?).
>
> You have to go out of your way to even get >125 connections going on
> Windows; see the very last entry at
> http://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows

I design socket component suites for developers; on Windows, with a few registry tweaks, you are able to have over 50,000 live, hot sockets. On Linux 2.6 and later, I have yet to hit a serious limit. Performance-wise, your focus will be poor memory paging, so make sure you have too much RAM, and nothing else running.

O.
Re: [GENERAL] Storing many big files in database- should I do it?
On 4/29/2010 12:07 PM, David Wall wrote:
> Big downside for the DB is that all large objects appear to be stored
> together in pg_catalog.pg_largeobject, which seems axiomatically
> troubling that you know you have lots of big data, so you then store
> them together, and then worry about running out of 'loids'.

Huh ??? isn't that the point of using bytea or text datatypes. I could have sworn bytea does not use the large object interface, it uses TOAST, or have I gone insane.

Many people encode the binary data in Base64 and store it as a text data type. Then you never have to deal with escaping the bytea data type, which I have found can be a pain.
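A minimal sketch of that round trip with the built-in encode()/decode() functions (example bytes chosen arbitrarily):

-- Store binary as base64 text on the way in, recover bytea on the way out.
SELECT encode(decode('deadbeef', 'hex'), 'base64');  -- yields '3q2+7w=='
SELECT decode('3q2+7w==', 'base64');                 -- yields the original four bytes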
Re: [GENERAL] Performance and Clustering
Alban Hertroys wrote:
> The reason I'm asking is that Postgres doesn't perform at its best on
> Windows and I seriously wonder whether the OS would be able to handle a
> load like that at all (can Windows handle 4000 open sockets for example?).

You have to go out of your way to even get >125 connections going on Windows; see the very last entry at http://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com www.2ndQuadrant.us
Re: [GENERAL] Storing many big files in database- should I do it?
Things to consider when *not* storing them in the DB:

1) Backups of the DB are incomplete without a corresponding backup of the files.
2) No transactional integrity between filesystem and DB, so you will have to deal with orphans from both INSERT and DELETE (assuming you don't also update the files).
3) No built-in ability for replication, such as WAL shipping.

Big downside for the DB is that all large objects appear to be stored together in pg_catalog.pg_largeobject, which seems axiomatically troubling that you know you have lots of big data, so you then store them together, and then worry about running out of 'loids'.

David

On 4/29/2010 2:10 AM, Cédric Villemain wrote:
> 2010/4/28 Adrian Klaver:
>> On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:
>>> On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain wrote:
>>>> store your files in a filesystem, and keep the path to the file (plus
>>>> metadata, acl, etc...) in database.
>>>
>>> What type of filesystem is good for this? A filesystem with support for
>>> storing tens of thousands of files in a single directory, or should one
>>> play the 41/56/34/41563489.ext game?
>
> I'd prefer to go with XFS or ext{3-4}, in both cases with a path game.
> Your path game will let you handle the scalability of your uploads (so
> the first increment is the first directory), something like:
> 1/2/3/4/foo.file
> 2/2/3/4/bar.file
> etc...
> You might explore a hash function or something that splits a SHA1 (or
> other) sum of the file to get the path.
>
>>> Are there any open source systems which handle keeping a filesystem and
>>> database in sync for this purpose, or is it a wheel that keeps getting
>>> reinvented? I know "store your files in a filesystem" is the best
>>> long-term solution. But it's just so much easier to just throw
>>> everything in the database.
>>
>> In the for what it is worth department check out this Wiki:
>> http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems
>
> and postgres fuse also :-D
>
>> --
>> Adrian Klaver
>> adrian.kla...@gmail.com
Re: [GENERAL] Start-up script for few clusters: just add water?
On 29/04/2010 10:40, Piotr Kublicki wrote:
> Guillaume Lelarge wrote on 28/04/2010 15:04:07:
>>> In such case the newly created start-up script postgresql2 should not be
>>> modified in the following line:
>>>
>>> # Override defaults from /etc/sysconfig/pgsql if file is present
>>> [ -f /etc/sysconfig/pgsql/${NAME} ] && . /etc/sysconfig/pgsql/${NAME}
>>>
>>> export PGDATA
>>> export PGPORT
>>>
>>> Or it will automatically match up names from both directories, i.e.:
>>> /etc/init.d/postgresql will run /etc/sysconfig/pgsql/postgresql?
>>> /etc/init.d/postgresql2 will run /etc/sysconfig/pgsql/postgresql2
>>
>> The latter. It will automatically match the script file name and the
>> config file name.
>
> Thanks again. By the way, if I want to start instances with a few optional
> arguments, such as -S or -B, where can I include these? In the same config
> file, i.e. /etc/sysconfig/pgsql/postgresql (speaking about the RedHat
> directories structure)?

I don't think so. I suppose you will have to change the /etc/init.d/postgresql script (or the one for your other instances).

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
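For reference, a hedged sketch of what such a per-instance file could contain (hypothetical path and port); the init script quoted above sources it and exports exactly these two variables, which is also why extra server switches like -S or -B have no slot here:

# /etc/sysconfig/pgsql/postgresql2 -- sourced by /etc/init.d/postgresql2
# (hypothetical values)
PGDATA=/var/lib/pgsql/data2
PGPORT=5433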
Re: [GENERAL] using between with dates
In response to Geoffrey Myers:
> I'm trying the following:
>
> ship_date between '04/30/2010' AND '04/30/2010' + 14
>
> But this returns:
>
> ERROR: invalid input syntax for integer: "04/30/2010"
>
> Can I use between with dates?

Sure, why not, but you have to CAST your STRING into a DATE, or you have to use to_date().

test=*# select '04/30/2010'::date;
    date
------------
 2010-04-30
(1 row)

And yes, write "+ '14 days'::interval" instead of just +14.

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99
Re: [GENERAL] using between with dates
On 29 April 2010 14:55, Geoffrey Myers wrote: > I'm trying the following: > > ship_date between '04/30/2010' AND '04/30/2010' + 14 > > But this returns: > > ERROR: invalid input syntax for integer: "04/30/2010" > > Can I use between with dates? > > You need to cast that last date, so: ship_date between '04/30/2010' AND '04/30/2010'::date + 14 Thom
Re: [GENERAL] using between with dates
2010/4/29 Geoffrey Myers > I'm trying the following: > > ship_date between '04/30/2010' AND '04/30/2010' + 14 > > But this returns: > > ERROR: invalid input syntax for integer: "04/30/2010" > > Can I use between with dates? > > > This should be fine: ship_date between '04/30/2010'::date AND '04/30/2010'::date + 14 regards Szymon Guz
[GENERAL] using between with dates
I'm trying the following: ship_date between '04/30/2010' AND '04/30/2010' + 14 But this returns: ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? -- Geoffrey Myers Myers Consulting Inc. 770.592.1651 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] using between with dates
Tom Lane wrote: Geoffrey writes: ship_date between '04/30/2010' AND '04/30/2010' + 14 ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? The problem with that is the parser has no reason to treat the strings as dates, at least not till it comes to consider the BETWEEN comparisons, which is too late to help in resolving the addition in the subexpression (data types are determined bottom-up). This'd work: ship_date between '04/30/2010' AND '04/30/2010'::date + 14 Thanks muchly, likely a better solution than my timestamp approach. -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Re: Re: [GENERAL] [SQL] Tsearch not searching 'Y'
On Thu, 29 Apr 2010 19:27:33 +0530 wrote >On Thu, Apr 29, 2010 at 01:13:40PM -, sandeep prakash dhumale wrote: >> Hello All, >> >> I am trying to get tsearch working for my application but I am facing a >> problem when alphabet 'Y' is in the tsquery. >> >> Can anyone please shed some light on it. >> >> >> # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); >> ?column? >> -- >> f >> (1 row) >> >> SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); >> ?column? >> -- >> t >> (1 row) >> >> >> It works when I put <> in y as below but I don't want to do it that way. >> >> SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); >> ?column? >> -- >> t >> >> Thanks in advance > >That is because the to_tsquery() normalizes the tokens. Here is >what I get from the default configuration: > >db=# select to_tsquery('holly:*'); > to_tsquery > > 'holli':* >(1 row) > >db=# select to_tsquery('holl:*'); > to_tsquery > > 'holl':* >(1 row) > >It is pretty easy to see why you see the behavior that you do. >Maybe you need to change your tsearch configuration to match what >you expect to happen. > >Regards, >Ken > > >-- >Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >To make changes to your subscription: >http://www.postgresql.org/mailpref/pgsql-general > First of all, thanks for your replies Tom and Ken. I am a bit of a newbie to Tsearch, so I apologize if I sound a little confused. Tom: if I cast like you wrote, I run into a case sensitivity issue, and it also does not work for other searches; I guess it then looks for exact matches and does not normalize to lexemes. Ken: as you said, I need to change my configuration. It would be great if you could point out where I can change that configuration, and how it works in the query; does that mean explicitly including y in to_tsquery? All your help is highly appreciated. --Sandy
Re: [GENERAL] using between with dates
Geoffrey writes: > ship_date between '04/30/2010' AND '04/30/2010' + 14 > ERROR: invalid input syntax for integer: "04/30/2010" > Can I use between with dates? The problem with that is the parser has no reason to treat the strings as dates, at least not till it comes to consider the BETWEEN comparisons, which is too late to help in resolving the addition in the subexpression (data types are determined bottom-up). This'd work: ship_date between '04/30/2010' AND '04/30/2010'::date + 14 regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
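To see the bottom-up typing in isolation, here is a minimal psql sketch (no table needed; the error is the same one reported above):

SELECT '04/30/2010' + 14;
ERROR:  invalid input syntax for integer: "04/30/2010"
-- the untyped literal is resolved against "+ 14", i.e. as an integer

SELECT '04/30/2010'::date + 14;
  ?column?
------------
 2010-05-14
(1 row)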
Re: [GENERAL] using between with dates
On Thursday 29 April 2010 6:58:26 am Geoffrey wrote: > I'm trying the following: > > ship_date between '04/30/2010' AND '04/30/2010' + 14 > > But this returns: > > ERROR: invalid input syntax for integer: "04/30/2010" > > Can I use between with dates? > > -- > Until later, Geoffrey > > "I predict future happiness for America if they can prevent > the government from wasting the labors of the people under > the pretense of taking care of them." > - Thomas Jefferson ship_date between '04/30/2010' AND '04/30/2010'::date + 14 -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] using between with dates
Geoffrey wrote: I'm trying the following: ship_date between '04/30/2010' AND '04/30/2010' + 14 But this returns: ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? Got it: ship_date between '04/30/2010' and timestamp '04/30/2010' + interval '14 day' -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] using between with dates
I'm trying the following: ship_date between '04/30/2010' AND '04/30/2010' + 14 But this returns: ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Tsearch not searching 'Y'
"sandeep prakash dhumale" writes: > I am trying to get tsearch working for my application but I am facing a > problem when alphabet 'Y' is the in the tsquery. > # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); > ?column? > -- > f > (1 row) You can't use to_tsquery for this sort of thing, because it tries to normalize the given words: regression=# select to_tsquery('holly:*'); to_tsquery 'holli':* (1 row) If you do this it works: regression=# SELECT 'hollywood'::tsvector @@ 'holly:*'::tsquery; ?column? -- t (1 row) So if you want to use prefix matching, don't normalize. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Writing SRF
Jorge Arevalo writes: > Yes. For example, the function expects 2 arguments, and it's called > with 2 arguments: 1 composite type (following this format > https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat) > and one integer. But PG_NARGS() returns a really big value (16297) > when I first check the number of arguments at the beginning of the > function. Does that make sense? Given only that data point, I would guess that you forgot to mark the function as being called with V1 protocol (PG_FUNCTION_INFO_V1). regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
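For reference, the SQL-level declaration that pairs with a V1 C function looks roughly like this (the module, function and type names below are hypothetical):

CREATE OR REPLACE FUNCTION my_srf(my_composite, integer)
    RETURNS SETOF record
    AS 'my_module', 'my_srf'
    LANGUAGE C IMMUTABLE STRICT;
-- The C file must contain, before the function definition:
--   PG_FUNCTION_INFO_V1(my_srf);
-- Without it the backend assumes the old V0 calling convention, and the
-- V1 argument-access macros (including PG_NARGS()) read garbage.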
Re: [GENERAL] [SQL] Tsearch not searching 'Y'
On Thu, Apr 29, 2010 at 01:13:40PM -, sandeep prakash dhumale wrote: > Hello All, > > I am trying to get tsearch working for my application but I am facing a > problem when alphabet 'Y' is in the tsquery. > > Can anyone please shed some light on it. > > > # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); > ?column? > -- > f > (1 row) > > SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); > ?column? > -- > t > (1 row) > > > It works when I put <> in y as below but I don't want to do it that way. > > SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); > ?column? > -- > t > > Thanks in advance That is because the to_tsquery() normalizes the tokens. Here is what I get from the default configuration: db=# select to_tsquery('holly:*'); to_tsquery 'holli':* (1 row) db=# select to_tsquery('holl:*'); to_tsquery 'holl':* (1 row) It is pretty easy to see why you see the behavior that you do. Maybe you need to change your tsearch configuration to match what you expect to happen. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
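As a rough sketch of the kind of configuration change meant here, one can copy the built-in english configuration and map word tokens to the simple dictionary, which does no stemming (the configuration name simple_prefix is made up):

CREATE TEXT SEARCH CONFIGURATION simple_prefix ( COPY = english );
ALTER TEXT SEARCH CONFIGURATION simple_prefix
    ALTER MAPPING FOR asciiword, word WITH simple;

SELECT to_tsquery('simple_prefix', 'holly:*');
 to_tsquery
------------
 'holly':*
(1 row)

Note the tsvector side has to be built without stemming as well (to_tsvector('simple_prefix', ...)), or cast directly as Tom shows elsewhere in the thread, for prefix matches to behave consistently.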
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
On 29/04/2010 9:23 PM, ashish.a...@sungard.com wrote: Hi Craig, Sorry for creating confusion. Let me (I work with Ambarish, the original author of the mail) try to be more specific now. We have a library (written in C) which helps us in doing phonetic based name search. We want to use this library inside a postgres DB function. To achieve this we wrote a small C code (we referred as wrapper class) which uses the library. This C code is an ECPG which is bundled as a dll and placed in postgres's lib dir. OK, that makes sense - though I'm not sure your use of ecpg in that role does. I haven't worked with ecpg much at all, but I didn't realise it was capable of being used as a tool for server backend functions. Are you using EXEC SQL CONNECT TO in the ecpg code? Or is there some other way of using ECPG embeded in a backend that I don't know about? I don't see anything in: http://www.postgresql.org/docs/8.4/static/ecpg.html but I haven't gone through it in detail. The usual way to write a PostgreSQL backend function in C is using the server extension interfaces: http://www.postgresql.org/docs/8.4/static/extend.html http://www.postgresql.org/docs/8.4/static/xfunc-c.html and, if you need to execute SQL from within your C code, the Server Programming Interface: http://www.postgresql.org/docs/8.4/static/spi.html Your code is generally compiled using pgxs. The original postgres function is supposed to be called from a java program using JDBC. And the postgres function should call the C function of the wrapper class. That makes sense. At runtime we observed that when the postgres DB function calls the C function of the wrapper class (in the dll), the java heap memory start increasing and reached to the max level resulted in crashing of JVM. OK, so the earlier statement that made it sound like you were calling a DLL from the Java runtime: "Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java." meant nothing of the sort, and your Java code (that's running out of memory) is really only using JDBC? Then we commented out the call to ECPG C function from postgres DB function and realized that everything went well. ... but the amount of data returned from your function call changed lots too, right? We were surprised why the loading and execution of the ECPG is taking JVM memory. I doubt it is. I suspect you're just seeing memory used by a large result set. Consider using a cursor. See the JDBC manual on handling large result sets. The JVM should *not* crash if it runs out of memory due to JDBC using too much, though. Any crash really should be reported directly to sun. If the JVM crashes it saves some error logs and reports the crash on the text console so you can send them off to Sun for analysis. If you don't actually mean that the JVM crashes at all, and actually mean "my program throws an OutOfMemoryError" ... then yes, that's the expected behaviour when you try to use too much memory loading a big result set. This is NOT a jvm crash, the jvm is doing exactly what it's supposed to do. Describing this as "crashing of JVM" is very misleading if this is actually what's happening. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
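For the large-result-set theory, the server-side cursor pattern referred to above looks like this in plain SQL (the table name is a placeholder):

BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
FETCH FORWARD 1000 FROM big_cur;
-- repeat the FETCH until it returns no rows
CLOSE big_cur;
COMMIT;

With the PostgreSQL JDBC driver the same effect is obtained by turning autocommit off and calling setFetchSize() on the statement, so the driver fetches rows in batches instead of materializing the entire result set in the Java heap.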
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
Hi Craig, Sorry for creating confusion. Let me (I work with Ambarish, the original author of the mail) try to be more specific now. We have a library (written in C) which helps us in doing phonetic based name search. We want to use this library inside a postgres DB function. To achieve this we wrote a small C code (we referred as wrapper class) which uses the library. This C code is an ECPG which is bundled as a dll and placed in postgres's lib dir. The original postgres function is supposed to be called from a java program using JDBC. And the postgres function should call the C function of the wrapper class. At runtime we observed that when the postgres DB function calls the C function of the wrapper class (in the dll), the java heap memory start increasing and reached to the max level resulted in crashing of JVM. Then we commented out the call to ECPG C function from postgres DB function and realized that everything went well. We were surprised why the loading and execution of the ECPG is taking JVM memory. Ideally it should use the postgres.exe memory and CPU utilization. We observed all these using windows task manager. I hope this will help you in understanding our problem. Thanks and Regards, Ashish Arya -Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Craig Ringer Sent: Thursday, April 29, 2010 6:34 PM To: Bhattacharya, A Cc: pgsql-general@postgresql.org Subject: Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library On 29/04/2010 8:48 PM, a.bhattacha...@sungard.com wrote: > Your understanding is slightly incorrect. Actually we required to uses a > special library from postgres. That mystery library being? From "postgres"? Do you mean a library supplied by the PostgreSQL project its self? Libpq? If not, what? C'mon, specifics. Please. Try to read your message after you have written it while pretending you are somebody who does not work with your code and does not know the things you know about it. Try to notice assumptions and clarify them. > For this we write a wrapper around the > library and composed a dll out of it (which uses ECPG to communicate to > postgres). From the application front (Java) we make a JDBC call to > postgres function. This postgres function then makes call to the dll. This does not make sense. Your earlier mail said that the DLL is loaded into the Java runtime. How does a "postgres function" make a call to a DLL running in the JRE? You can't make a call from the postgresql backend to code on a client connected to ecpg. Please be SPECIFIC. What is a "postgres function?" How does it "make a call to the dll?" Etc. Provide links to PostgreSQL and Java documentation for the APIs you use, use the precise terms for those APIs, and generally be SPECIFIC and DETAILED. Post diagrams or code samples if you have to! > But the strange thing we observed that at runtime, somehow the execution > of dll is attached with java.exe. In other words, when we monitor the > performance of the application we realized that the execution of dll is > consuming java heap memory and is shown as a thread in java.exe process. > We expected that it should be linked with postgres.exe. Ah, ok, so you are TRYING to load code a C extension to PostgreSQL into the server backend? If so, why are you using ECPG? Or are you in fact *not* using ECPG, but the PostgreSQL backend extension interface? How is the dll "shown as a thread in [the] java.exe process" ? Using a Java debugger attached to the JRE? 
Using Process Explorer or a C-level debugger examining the loaded DLL list of the JRE? Using the Java monitoring APIs? I really do not understand what you are doing, or what you are trying to do. I doubt anybody can help you without a much more detailed and specific explanation of what you're doing. I think it'd be a really good idea for you to write a minimalist test case for this and post it. Either you'll figure out what's wrong in the process of making the test case, or you'll have something to post that people can play with to see what you are trying to do. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Tsearch not searching 'Y'
Hello All, I am trying to get tsearch working for my application but I am facing a problem when alphabet 'Y' is in the tsquery. Can anyone please shed some light on it. # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); ?column? -- f (1 row) SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); ?column? -- t (1 row) It works when I put <> in y as below but I don't want to do it that way. SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); ?column? -- t Thanks in advance
Re: [GENERAL] Writing SRF
On Wed, Apr 28, 2010 at 10:43 PM, Tom Lane wrote: > Jorge Arevalo writes: >> My doubt is if I'm doing things right getting all the stuff I need (an >> array) in the first call, pointing user_fctx to this array and >> accessing myStructsArray[call_cntr] in each successive call, until >> myStructsArray + call_cntr == NULL (last array element?). > > Sounds reasonable enough. Is it not working for you? Maybe you need > to be careful about which memory context the array is created in. > > regards, tom lane > Yes. For example, the function expects 2 arguments, and it's called with 2 arguments: 1 composite type (following this format https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat) and one integer. But PG_NARGS() returns a really big value (16297) when I first check the number of arguments at the beginning of the function. Does that make sense? The array, if I'm doing things right, is created in the context pointed to by fcinfo->flinfo->fn_mcxt. But I'd like to solve "silly" things like the previous one first, to be sure I'm doing things right in general. Now at least I know it sounds reasonable :-) Many thanks! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Performance and Clustering
On 29 Apr 2010, at 3:08, Jaime Rodriguez wrote: > hi, > Today is my first day looking at PostgreSQL > I am looking to migrate a MS SQL DB to PostgreSQL :) :) > My customer requires that DBMS shall support 4000 simultaneous requests > Also the system to be deploy maybe a cluster, with 12 microprocessors > > From what I have read, PostgreSQL has really good performance and reliability > but I would like to get some numbers, not sure if somewhere in the wiki some > of this data is available. Are you looking at PostgreSQL on Windows or on a UNIX or UNIX-based OS? The reason I'm asking is that Postgres doesn't perform at its best on Windows and I seriously wonder whether the OS would be able to handle a load like that at all (can Windows handle 4000 open sockets for example?). Other database solutions on Windows will probably have similar issues, so this is not a reason to base your choice of database on - it is IMHO something that you should look into. OTOH, changing both the database and the OS is a big change. For example, most UNIX-es by default use a case-sensitive file system, whereas Windows does not. That said, for both you'll certainly have to make lots of changes in your application, so combining the two and doing that only once may be preferable. If you're thinking of going that way I'd suggest FreeBSD or Solaris, but Linux is a popular choice (as is Windows, for that matter). Alban Hertroys -- Screwing up is an excellent way to attach something to the ceiling. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
On 29/04/2010 8:48 PM, a.bhattacha...@sungard.com wrote: Your understanding is slightly incorrect. Actually we required to uses a special library from postgres. That mystery library being? From "postgres"? Do you mean a library supplied by the PostgreSQL project its self? Libpq? If not, what? C'mon, specifics. Please. Try to read your message after you have written it while pretending you are somebody who does not work with your code and does not know the things you know about it. Try to notice assumptions and clarify them. For this we write a wrapper around the library and composed a dll out of it (which uses ECPG to communicate to postgres). From the application front (Java) we make a JDBC call to postgres function. This postgres function then makes call to the dll. This does not make sense. Your earlier mail said that the DLL is loaded into the Java runtime. How does a "postgres function" make a call to a DLL running in the JRE? You can't make a call from the postgresql backend to code on a client connected to ecpg. Please be SPECIFIC. What is a "postgres function?" How does it "make a call to the dll?" Etc. Provide links to PostgreSQL and Java documentation for the APIs you use, use the precise terms for those APIs, and generally be SPECIFIC and DETAILED. Post diagrams or code samples if you have to! But the strange thing we observed that at runtime, somehow the execution of dll is attached with java.exe. In other words, when we monitor the performance of the application we realized that the execution of dll is consuming java heap memory and is shown as a thread in java.exe process. We expected that it should be linked with postgres.exe. Ah, ok, so you are TRYING to load code a C extension to PostgreSQL into the server backend? If so, why are you using ECPG? Or are you in fact *not* using ECPG, but the PostgreSQL backend extension interface? How is the dll "shown as a thread in [the] java.exe process" ? Using a Java debugger attached to the JRE? Using Process Explorer or a C-level debugger examining the loaded DLL list of the JRE? Using the Java monitoring APIs? I really do not understand what you are doing, or what you are trying to do. I doubt anybody can help you without a much more detailed and specific explanation of what you're doing. I think it'd be a really good idea for you to write a minimalist test case for this and post it. Either you'll figure out what's wrong in the process of making the test case, or you'll have something to post that people can play with to see what you are trying to do. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Convert of string to array problem
On Thu, Apr 29, 2010 at 8:52 AM, Thom Brown wrote: > On 29 April 2010 13:35, Bård Grønbech wrote: >> >> Have a string like '0.1;0.2;null;0.3' which I would like to convert >> into a double precision[] array. >> >> Trying: >> >> select cast (string_to_array('0.1;0.2;null;0.3', ';') as float8[]) >> >> gives me an error: invalid input syntax for type double precision: "null". >> >> Can anybody help me? >> >> -Bård >> > > I believe string_to_array will take that null as a string called 'null', > which it can't convert to a float8. select array(select case when n='null' then null else n end from unnest(string_to_array('0.1;0.2;null;0.3', ';')) as n); ?column? {0.1,0.2,NULL,0.3} merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
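To land on the float8[] the original post asked for, the same trick works with a cast added (a sketch of the obvious extension):

SELECT ARRAY(
    SELECT CASE WHEN n = 'null' THEN NULL ELSE n END::float8
    FROM unnest(string_to_array('0.1;0.2;null;0.3', ';')) AS n
);
      ?column?
--------------------
 {0.1,0.2,NULL,0.3}
(1 row)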
Re: [GENERAL] Problem: concat an array of arrays
On Thu, Apr 29, 2010 at 8:46 AM, Merlin Moncure wrote: > On Wed, Apr 28, 2010 at 8:48 PM, Belka Lambda wrote: >> Hi! >> >> I tried to write a recursive SELECT, that would do the concatination, but a >> problem appeared: >> can't make a {1,2,3} from {{1,2,3}}. >> Here are some experiments: >> --- >> postgres=# select >> array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]]; >> >> array >> -- >> {{1,2,3},{4,5,6},{7,8,9},{10,11,12}} >> (1 row) >> >> >> postgres=# select >> (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] >> )[3]; >> array >> --- >> >> (1 row) >> >> >> postgres=# select >> (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] >> )[3:3]; >> array >> --- >> {{7,8,9}} >> (1 row) >> >> >> postgres=# select >> (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] >> )[3][1]; >> array >> --- >> 7 >> (1 row) >> --- >> >> The original query, that would do the concatenation: >> --- >> WITH RECURSIVE unfold (rest, accum) AS ( >> VALUES ($1 :: int[][], ARRAY[] :: int[]) >> UNION ALL >> SELECT u.rest[2:array_length(u.rest, 1)] AS rest, >> array_cat(u.rest[1], u.accum) AS accum >> FROM unfold AS u >> WHERE array_length(u.rest, 1) > 0 >> ) >> SELECT u.accum >> FROM unfold AS u >> WHERE array_length(u.rest, 1) = 0; >> --- >> Throws an error: >> ERROR: function array_cat(integer, integer[]) does not exist > > array_cat requires too array arguments. you could rewrite your expression to > array_cat(array[u.rest[1], u.accum) > (i think, not quite sure what you are trying to do). > > you can append scalars to arrays with the || operator: > select array[1,2,3] || 4; > ?column? > --- > {1,2,3,4} > > > you can kinda sorta slice an array using the slice method: > select (array[array[1,2,3], array[2,4,6]])[1:1]; > array > --- > {{1,2,3}} > > what are you trying to do w/unfold function exactly? hm. the basic problem is that it's difficult to slide arrays up/down dimensions. you can move from scalars to arrays and arrays to scalars, but not from dimension N to N-1 etc. you can however move from dimension 'N' to 1: create or replace function restack(_array anyarray) returns anyarray as $$ select array(select unnest($1)); $$ language sql immutable; select restack(array[1,2,3]); restack - {1,2,3} select restack(array[array[1,2,3]]); restack - {1,2,3} merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Convert of string to array problem
On 29 April 2010 13:35, Bård Grønbech wrote: > Have a string like '0.1;0.2;null;0.3' which I would like to convert > into a double precision[] array. > > Trying: > > select cast (string_to_array('0.1;0.2;null;0.3', ';') as float8[]) > > gives me an error: invalid input syntax for type double precision: "null". > > Can anybody help me? > > -Bård > > I believe string_to_array will take that null as a string called 'null', which it can't convert to a float8. Regards Thom
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
-Original Message- From: Craig Ringer [mailto:cr...@postnewspapers.com.au] Sent: Thursday, April 29, 2010 6:02 PM To: Bhattacharya, A Cc: pgsql-general@postgresql.org Subject: Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library On 29/04/2010 7:34 PM, a.bhattacha...@sungard.com wrote: > We have a java exe making a call to a postgres function. This postgres > function internally makes a call to a dll (which is written using > Postgres extended C). If I understand you correctly, you have a dll that uses ecpg to communicate with postgresql. It is loaded by a JVM via JNI, and that JVM then uses JNI calls to use your ecpg-based DLL to talk to the database. If I've understood you right: Why this strange architecture? Usually Java apps just talk to the database via JDBC. As far as I know there's no particular reason you *can't* do calls to your ecpg-using dll via JNI, though. I'd be surprised if your problems didn't turn out to be misuse/misunderstanding of the notoriously-hard-to-get-right JNI interface, or issues with your DLL and/or its use of ecpg. If I have NOT understood how you are doing things correctly, then please be MUCH MORE SPECIFIC. Making people guess what you're doing doesn't help you get accurate, useful answers. I strongly recommend that you try to produce a self-contained test case. Write a minimalist ecpg dll that does only one simple thing. Write a minimalist, simple JNI-using Java program that loads and uses that DLL. See if it leaks memory. If this minimalist test case demonstrates the leak, post the source code to BOTH the dll and your simple Java program here. Make sure both can be compiled without extra libraries and include any project files etc required to compile them. If your new test case doesn't leak, then you know the basic JNI + ECPG combination isn't the issue. You need to start investigating what in your code is causing the leak. If you just can't figure it out, then start chopping things out and disabling things until the leak stops happening to narrow it down. Without a lot more detail than you have provided, it is unlikely that anybody here can help you. > Now the issue is that, when we make a call to this dll, it consumes a > lot of memory and this memory is getting consumed from the heap space of > the original java process causing an out of memory exception in Java. > > Is this a known issue. I rather doubt that anybody else anywhere in the world is doing what you're doing ;-) Not as far as I know it's not, no. Nor have you provided any evidence it's an issue with PostgreSQL not your own code yet. > Do we have a way to keep these processes disjoint > and not eat up heap space of the original process? http://google.com/search?q=inter+process+communication ... but you should probably fix the problem that's causing the excessive memory use/leakage rather than just working around it by running the code out-of-process. -- Craig Ringer Thanks Craig for your comment. Your understanding is slightly incorrect. Actually we required to uses a special library from postgres. For this we write a wrapper around the library and composed a dll out of it (which uses ECPG to communicate to postgres). From the application front (Java) we make a JDBC call to postgres function. This postgres function then makes call to the dll. But the strange thing we observed that at runtime, somehow the execution of dll is attached with java.exe. 
In other words, when we monitor the performance of the application we realized that the execution of dll is consuming java heap memory and is shown as a thread in java.exe process. We expected that it should be linked with postgres.exe. And this incorrect linking resulted in out of heap memory error in java. But when we run the same postgres function using pgAdmin tool, the memory and CPU utilization of the dll is shown attached with the postgres.exe (corresponding to pgAdmin). I hope I am able to explain the issue. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Problem: concat an array of arrays
On Wed, Apr 28, 2010 at 8:48 PM, Belka Lambda wrote: > Hi! > > I tried to write a recursive SELECT, that would do the concatination, but a > problem appeared: > can't make a {1,2,3} from {{1,2,3}}. > Here are some experiments: > --- > postgres=# select > array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]]; > > array > -- > {{1,2,3},{4,5,6},{7,8,9},{10,11,12}} > (1 row) > > > postgres=# select > (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] > )[3]; > array > --- > > (1 row) > > > postgres=# select > (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] > )[3:3]; > array > --- > {{7,8,9}} > (1 row) > > > postgres=# select > (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] > )[3][1]; > array > --- > 7 > (1 row) > --- > > The original query, that would do the concatenation: > --- > WITH RECURSIVE unfold (rest, accum) AS ( > VALUES ($1 :: int[][], ARRAY[] :: int[]) > UNION ALL > SELECT u.rest[2:array_length(u.rest, 1)] AS rest, array_cat(u.rest[1], > u.accum) AS accum > FROM unfold AS u > WHERE array_length(u.rest, 1) > 0 > ) > SELECT u.accum > FROM unfold AS u > WHERE array_length(u.rest, 1) = 0; > --- > Throws an error: > ERROR: function array_cat(integer, integer[]) does not exist array_cat requires two array arguments. you could rewrite your expression to array_cat(array[u.rest[1]], u.accum) (i think, not quite sure what you are trying to do). you can append scalars to arrays with the || operator: select array[1,2,3] || 4; ?column? --- {1,2,3,4} you can kinda sorta slice an array using the slice method: select (array[array[1,2,3], array[2,4,6]])[1:1]; array --- {{1,2,3}} what are you trying to do w/unfold function exactly? merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Convert of string to array problem
Have a string like '0.1;0.2;null;0.3' which I would like to convert into a double precision[] array. Trying: select cast (string_to_array('0.1;0.2;null;0.3', ';') as float8[]) gives me an error: invalid input syntax for type double precision: "null". Can anybody help me? -Bård -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
On 29/04/2010 7:34 PM, a.bhattacha...@sungard.com wrote: We have a java exe making a call to a postgres function. This postgres function internally makes a call to a dll (which is written using Postgres extended C). If I understand you correctly, you have a dll that uses ecpg to communicate with postgresql. It is loaded by a JVM via JNI, and that JVM then uses JNI calls to use your ecpg-based DLL to talk to the database. If I've understood you right: Why this strange architecture? Usually Java apps just talk to the database via JDBC. As far as I know there's no particular reason you *can't* do calls to your ecpg-using dll via JNI, though. I'd be surprised if your problems didn't turn out to be misuse/misunderstanding of the notoriously-hard-to-get-right JNI interface, or issues with your DLL and/or its use of ecpg. If I have NOT understood how you are doing things correctly, then please be MUCH MORE SPECIFIC. Making people guess what you're doing doesn't help you get accurate, useful answers. I strongly recommend that you try to produce a self-contained test case. Write a minimalist ecpg dll that does only one simple thing. Write a minimalist, simple JNI-using Java program that loads and uses that DLL. See if it leaks memory. If this minimalist test case demonstrates the leak, post the source code to BOTH the dll and your simple Java program here. Make sure both can be compiled without extra libraries and include any project files etc required to compile them. If your new test case doesn't leak, then you know the basic JNI + ECPG combination isn't the issue. You need to start investigating what in your code is causing the leak. If you just can't figure it out, then start chopping things out and disabling things until the leak stops happening to narrow it down. Without a lot more detail than you have provided, it is unlikely that anybody here can help you. Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java. Is this a known issue. I rather doubt that anybody else anywhere in the world is doing what you're doing ;-) Not as far as I know it's not, no. Nor have you provided any evidence it's an issue with PostgreSQL not your own code yet. Do we have a way to keep these processes disjoint and not eat up heap space of the original process? http://google.com/search?q=inter+process+communication ... but you should probably fix the problem that's causing the excessive memory use/leakage rather than just working around it by running the code out-of-process. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cumulative count (running total) window fn
On Thu, Apr 29, 2010 at 13:43, Oliver Kohll - Mailing Lists wrote: >> >> Aren't you looking for something along the line of: >> >> SELECT year, sum(c) over (order by year) >> FROM ( >> SELECT extract(year from signup_date) AS year, count(email_address) AS c >> FROM email_list >> GROUP BY extract(year from signup_date) >> ) >> >> (adjust for typos, I didn't test it) >> >> Yes that does work thanks, if you give the subquery a name. I'd still like >> to know if it's possible to do with a window function rather than a >> subquery. >> Oliver Kohll > > Like this?: > > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM > email_list GROUP BY 1 ORDER BY 1; > > Thom > > Almost, but put me on the right track! This one is exactly what I'm looking > for: > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY extract(year from signup_date)) > FROM email_list GROUP BY 1 ORDER BY 1; > The ORDER BY count(email_address) did give the same results for my data but > only because the count values just happen to give the same ordering as the > years - I tested by changing some dates. > Many thanks all. Curious note - how does the non-subselect version and the subselect version compare performance-wise? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cumulative count (running total) window fn
>> >> Aren't you looking for something along the line of: >> >> SELECT year, sum(c) over (order by year) >> FROM ( >> SELECT extract(year from signup_date) AS year, count(email_address) AS c >> FROM email_list >> GROUP BY extract(year from signup_date) >> ) >> >> (adjust for typos, I didn't test it) > > Yes that does work thanks, if you give the subquery a name. I'd still like to > know if it's possible to do with a window function rather than a subquery. > > Oliver Kohll > > > Like this?: > > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM > email_list GROUP BY 1 ORDER BY 1; > > Thom Almost, but put me on the right track! This one is exactly what I'm looking for: SELECT extract(year from signup_date), count(email_address), sum(count(email_address)) OVER (ORDER BY extract(year from signup_date)) FROM email_list GROUP BY 1 ORDER BY 1; The ORDER BY count(email_address) did give the same results for my data but only because the count values just happen to give the same ordering as the years - I tested by changing some dates. Many thanks all. Oliver
FW: [GENERAL] Java Memory Issue while Loading Postgres library
Dear All Experts, We have a java exe making a call to a postgres function. This postgres function internally makes a call to a dll (which is written using Postgres extended C). Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java. Is this a known issue. Do we have a way to keep these processes disjoint and not eat up heap space of the original process? Please provide your suggestions/advice on the same. Sorry, I missed a few details: I am using PostgreSQL 8.3.5 on Windows XP with 4GB of RAM, a 160GB hard drive, and a Core 2 Duo 6300 CPU @ 1.86 GHz.
[GENERAL] How to monitor Parallel pg_restore ?
Hi All, I am using Postgres 8.4 with the pg_restore -j option. I have a dump of the database taken with -Fc and chose pg_restore -j for faster restoration. When the restoration process is in progress, I want to monitor the threads invoked by pg_restore (say, if I give -j 4). I have checked pg_stat_activity, in which I see only one running transaction, a COPY command. Even the "top" command hasn't shown anything. Could you please assist me with this? Regards Raghavendra
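For reference, the kind of check described above, against 8.4's pg_stat_activity, would be something like:

SELECT procpid, waiting, current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>';

Each -j worker opens its own database connection, so while the parallel data phase is running one would expect to see up to four busy rows here rather than one.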
Re: [GENERAL] Cumulative count (running total) window fn
On 29 April 2010 11:39, Oliver Kohll - Mailing Lists < oliver.li...@gtwm.co.uk> wrote: > > On 29 Apr 2010, at 10:01, Magnus Hagander wrote: > > > select extract(year from signup_date), > > count(email_address), > > sum(count(email_address)) over (partition by 1 order by 1 asc rows > unbounded preceding) > > from email_list group by 1 order by 1; > > > Does anyone have any other ideas? > > > Aren't you looking for something along the line of: > > SELECT year, sum(c) over (order by year) > FROM ( > SELECT extract(year from signup_date) AS year, count(email_address) AS c > FROM email_list > GROUP BY extract(year from signup_date) > ) > > (adjust for typos, I didn't test it) > > > Yes that does work thanks, if you give the subquery a name. I'd still like > to know if it's possible to do with a window function rather than a > subquery. > > Oliver Kohll > Like this?: SELECT extract(year from signup_date), count(email_address), sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM email_list GROUP BY 1 ORDER BY 1; Thom
Re: [GENERAL] Cumulative count (running total) window fn
On 29 Apr 2010, at 10:01, Magnus Hagander wrote: >> >> select extract(year from signup_date), >> count(email_address), >> sum(count(email_address)) over (partition by 1 order by 1 asc rows >> unbounded preceding) >> from email_list group by 1 order by 1; >> >> Does anyone have any other ideas? > > Aren't you looking for something along the line of: > > SELECT year, sum(c) over (order by year) > FROM ( > SELECT extract(year from signup_date) AS year, count(email_address) AS c > FROM email_list > GROUP BY extract(year from signup_date) > ) > > (adjust for typos, I didn't test it) Yes that does work thanks, if you give the subquery a name. I'd still like to know if it's possible to do with a window function rather than a subquery. Oliver Kohll
[GENERAL] Java Memory Issue while Loading Postgres library
Dear All Experts, We have a java exe making a call to a postgres function. This postgres function internally makes a call to a dll (which is written using Postgres extended C). Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java. Is this a known issue. Do we have a way to keep these processes disjoint and not eat up heap space of the original process? Please provide your suggestions/advice on the same.
Re: [GENERAL] Inheritance efficiency
2010/4/26 Vincenzo Romano : > 2010/4/26 Bruce Momjian : >> Vincenzo Romano wrote: >>> Hi all. >>> >>> I'm wondering how efficient the inheritance can be. >>> I'm using the constraint exclusion feature and for each child table >>> (maybe but one) I have a proper CHECK constraint. >>> How efficient can the query planner be in choosing the right child >>> tables in the case of, say, thousands of them? >>> Would the selection process behave linearly, logarithmically or what? >> >> It is fine for dozens of child tables, but not thousands; it does need >> improvement. > > This sounds like "linear" algorithms. Doesn't it? > >>> And now it comes to my mind the same question for partial indexes. >>> That is, if I had a lot (really a lot) of small partial indexes over a >>> very large table, how efficient can the query planner be >>> in selecting the right indexes? > > No info about this point (partial indexes)? > Is also this geared with linear algorithms ? Should I move to an "enterprise grade" version of PostgreSQL? -- Vincenzo Romano NotOrAnd Information Technologies NON QVIETIS MARIBVS NAVTA PERITVS -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [SPAM] Re: [GENERAL] Best way to replicate to large number of nodes
2010/4/22 Brian Peschel : > > On 04/22/2010 10:12 AM, Ben Chobot wrote: >> >> On Apr 21, 2010, at 1:41 PM, Brian Peschel wrote: >> >> >>> >>> I have a replication problem I am hoping someone has come across before >>> and can provide a few ideas. >>> >>> I am looking at a configuration of one 'writable' node and anywhere from >>> 10 to 300 'read-only' nodes. Almost all of these nodes will be across a WAN >>> from the writable node (some over slow VPN links too). I am looking for a >>> way to replicate as quickly as possible from the writable node to all the >>> read-only nodes. I can pretty much guarantee the read-only nodes will never >>> become master nodes. Also, the updates to the writable node are bunched and >>> at known times (ie only updated when I want it updated, not constant >>> updates), but when changes occur, there are a lot of them at once. >>> >> >> Two things you didn't address are the acceptable latency of keeping the >> read-only nodes in sync with the master - can they be different for a day? A >> minute? Do you need things to stay synchronous? Also, how big is your >> dataset? A simple pg_dump and some hot scp action after you batched updates >> might be able to solve your problem. > > Latency is important. I would say 10 to 15 minutes max, but the shorter the > better. I don't have an exact size, but I believe the entire DB is about 10 > gig. > > We had an idea of having our apps write the SQL statements to a file, > rather than using an ODBC driver to directly change the DBs. Then we could > scp/rsync the files to the remote machines and execute them there. This > just seems like a very manual process though. You need to have a look at PgQ. (in short, skytools will do exactly what you want if I understand your requirements correctly, londiste being somewhat like slony) > > - Brian > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Cédric Villemain -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Storing many big files in database- should I do it?
2010/4/28 Adrian Klaver : > On Tuesday 27 April 2010 5:45:43 pm Anthony wrote: >> On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain < >> >> cedric.villemain.deb...@gmail.com> wrote: >> > store your files in a filesystem, and keep the path to the file (plus >> > metadata, acl, etc...) in the database. >> >> What type of filesystem is good for this? A filesystem with support for >> storing tens of thousands of files in a single directory, or should one >> play the 41/56/34/41563489.ext game? I'd prefer to go with XFS or ext{3-4}, in both cases with a path game. Your path game will let you handle the scalability of your uploads (so the first increment is the first directory), something like 1/2/3/4/foo.file 2/2/3/4/bar.file etc... You might explore a hash function or something that splits a SHA1 (or other) sum of the file to get the path. >> >> Are there any open source systems which handle keeping a filesystem and >> database in sync for this purpose, or is it a wheel that keeps getting >> reinvented? >> >> I know "store your files in a filesystem" is the best long-term solution. >> But it's just so much easier to just throw everything in the database. > > In the for-what-it's-worth department, check out this Wiki: > http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems and postgres fuse also :-D > > -- > Adrian Klaver > adrian.kla...@gmail.com -- Cédric Villemain -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
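As a concrete sketch of that hash-split idea in SQL (md5() stands in for SHA1 here, and the file key is a placeholder):

SELECT substring(h FROM 1 FOR 2)
       || '/' || substring(h FROM 3 FOR 2)
       || '/' || h || '.file' AS path
FROM (SELECT md5('some-file-key') AS h) AS t;
-- yields a two-level path of the form ab/cd/abcdef....file,
-- spreading files evenly over 256 * 256 directories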
Re: [GENERAL] Cumulative count (running total) window fn
On Thu, Apr 29, 2010 at 10:52, Oliver Kohll - Mailing Lists wrote: > Hello, > > Many thanks to andreas.kretschmer for this helpful reply about how to set up > a window function to perform a running total: > http://archives.postgresql.org/pgsql-general/2010-03/msg01122.php > > It works perfectly with the simple test data but I've just got back to work, > tried implementing it on my live data and the results are slightly different. > My query is almost exactly the same - I've simplified by grouping by year > only rather than year and month: > > select extract(year from signup_date), > count(email_address), > sum(count(email_address)) over (rows unbounded preceding) > from email_list group by 1 order by 1; > > date_part | count | sum > ---+---+-- > 2007 | 501 | 1374 > 2008 | 491 | 491 > 2009 | 382 | 873 > 2010 | 66 | 1440 > (4 rows) > > What I'm looking for is > date_part | count | sum > ---+---+-- > 2007 | 501 | 501 > 2008 | 491 | 992 > 2009 | 382 | 1374 > 2010 | 66 | 1440 > > It seems to be adding up the counts but not in the right order. > > I've also tried an explicit ORDER BY inside the partition with no difference: > > select extract(year from signup_date), > count(email_address), > sum(count(email_address)) over (partition by 1 order by 1 asc rows unbounded > preceding) > from email_list group by 1 order by 1; > > Does anyone have any other ideas? Aren't you looking for something along the line of: SELECT year, sum(c) over (order by year) FROM ( SELECT extract(year from signup_date) AS year, count(email_address) AS c FROM email_list GROUP BY extract(year from signup_date) ) (adjust for typos, I didn't test it) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Cumulative count (running total) window fn
Hello, Many thanks to andreas.kretschmer for this helpful reply about how to set up a window function to perform a running total: http://archives.postgresql.org/pgsql-general/2010-03/msg01122.php It works perfectly with the simple test data but I've just got back to work, tried implementing it on my live data and the results are slightly different. My query is almost exactly the same - I've simplified by grouping by year only rather than year and month: select extract(year from signup_date), count(email_address), sum(count(email_address)) over (rows unbounded preceding) from email_list group by 1 order by 1; date_part | count | sum ---+---+-- 2007 | 501 | 1374 2008 | 491 | 491 2009 | 382 | 873 2010 |66 | 1440 (4 rows) What I'm looking for is date_part | count | sum ---+---+-- 2007 | 501 | 501 2008 | 491 | 992 2009 | 382 | 1374 2010 |66 | 1440 It seems to be adding up the counts but not in the right order. I've also tried an explicit ORDER BY inside the partition with no difference: select extract(year from signup_date), count(email_address), sum(count(email_address)) over (partition by 1 order by 1 asc rows unbounded preceding) from email_list group by 1 order by 1; Does anyone have any other ideas? Regards Oliver Kohll oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll www.agilebase.co.uk - software www.gtwm.co.uk - company -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Start-up script for few clusters: just add water?
Guillaume Lelarge wrote on 28/04/2010 15:04:07: > > In such case the newly created start-up script postgresql2 should not be > > modified in the following line: > > > > # Override defaults from /etc/sysconfig/pgsql if file is present > > [ -f /etc/sysconfig/pgsql/${NAME} ] && . /etc/sysconfig/pgsql/${NAME} > > > > export PGDATA > > export PGPORT > > > > Or it will automatically match-up names from both directories, i.e.: > > /etc/init.d/postgresql will run /etc/sysconfig/pgsql/postgresql? > > > > /etc/init.d/postgresql2 will run /etc/sysconfig/pgsql/postgresql2 > > > > The latter. It will automatically match the script file name and the > config file name. Thanks again. By the way, if I want to start instances with a few optional arguments, such as -S or -B, where can I include these? In the same config file, i.e. /etc/sysconfig/pgsql/postgresql (speaking of the RedHat directory structure)? Pete
Re: [GENERAL] Need magic that checks if a collumn contains values that CAN NOT be converted to Latin1
On Thu, Apr 29, 2010 at 05:02, Andreas wrote: > Hi, > > I've got an 8.4.3 Unicode DB that accidentally holds a few records with > characters that can't be converted to Latin1 or 9 for output to CSV. > > I'd just need a way to check if a collumn contains values that CAN NOT be > converted from Utf8 to Latin1 to select all those affected records. > > I tried: > Select convert_to (my_column::text, 'LATIN1') from my_table; > > It raises an error that says translated: > ERROR: character 0xe28093 in encoding »UTF8« has no equivalent in »LATIN1« > Regrettably it doesn't explain where it found this sign. > > Select '\xe28093' > complains that this weren't a valid UTF8 code at all. > So how was it accepted and stored in the first place? > > When I know which record has faulty content I can correct it. Wrap your check in a simple function: CREATE OR REPLACE FUNCTION public.is_valid_encoding(vtext text, encname text) RETURNS boolean LANGUAGE plpgsql AS $$ BEGIN BEGIN PERFORM convert_to(vtext, encname); EXCEPTION WHEN untranslatable_character THEN RETURN 'f'; END; RETURN 't'; END; $$ And execute SELECT * FROM my_table WHERE NOT is_valid_encoding(my_column, 'LATIN1') -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_hba.conf
I had a similar problem: older versions of Postgres have IP addressing in one column and the subnet mask in the next one, while 8.4 uses a CIDR expression in a single column; applying CIDR notation solved my problem. I think it's advisable to manually correct the pg_hba.conf file instead of replacing it with the old configuration file from the older version of Postgres. Cheers, Pete From: Scott Mead To: jkun...@laurcat.com Cc: postgres help Date: 28/04/2010 18:41 Subject: Re: [GENERAL] pg_hba.conf Sent by: pgsql-general-ow...@postgresql.org On Tue, Apr 27, 2010 at 6:42 AM, wrote: I am putting up a new server on version 8.4.3. I copied pg_hba.conf from a running 8.3.6 system, changing only the public IP address for the local machine. I get the error: FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", database "arc" pg_hba.conf contains the line: host all all 209.159.145.248 255.255.255.255 trust Hmm, just for giggles, does it work using CIDR syntax: host all all 209.159.145.248/32 trust ? --Scott -- Scott Mead Principal Systems Engineer EnterpriseDB Corporation The Enterprise Postgres Company Other records work (I can connect from my remote site using pgAdmin, just fine), so I know the file is being read by postgres. Any ideas? Thanks in advance, Jim -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general