Re: [GENERAL] Inheritance efficiency
2010/4/30 Vincenzo Romano:
> 2010/4/30 David Fetter:
>> On Thu, Apr 29, 2010 at 11:29:36AM +0200, Vincenzo Romano wrote:
>>> > No info about this point (partial indexes)?
>>> > Is also this geared with linear algorithms?
>>>
>>> Should I move to an "enterprise grade" version of PostgreSQL?
>>
>> The enterprise grade version of PostgreSQL is the community version.
>>
>> Proprietary forks exist, but they don't fix this kind of problem. :)
>
> Hmmm ... I think this is the kind of problem that keeps PostgreSQL away
> from the "enterprise grade" world. The ability to cope with thousands of
> DB objects like (child-)tables, indexes, functions and so on with O(1) or
> at least O(log(n)) complexity is among the key points.
>
> For example, the Linux kernel made the big jump with server hardware
> thanks also to the O(1) schedulers.
>
> In this specific case, if you think about "inheritance for partitioning"
> and you stick with the example idea of "one partition per month", then
> the current solution is more than OK. In the real world that is not
> really the general case, especially in the "enterprise grade" world,
> where maybe you partition on both a time stamp and another column, like
> product code ranges and prefixes ...
>
> Is there any planning about this improvement?

Could it be possible to just make some changes (adding indexes) to the information schema to gain this enterprise gradeness?

--
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS
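For context, a minimal sketch of the "inheritance for partitioning" pattern this thread discusses, partitioning on both a timestamp and a product-code range; table and column names are hypothetical:

-- Parent table (hypothetical names throughout).
CREATE TABLE sales (
    sold_at timestamp NOT NULL,
    product text      NOT NULL,
    amount  numeric
);

-- One child per (month, product range). With constraint_exclusion on,
-- the planner proves from the CHECK constraints which children it can
-- skip; the thread's complaint is that this work grows linearly with
-- the number of children.
CREATE TABLE sales_2010_04_a (
    CHECK (sold_at >= '2010-04-01' AND sold_at < '2010-05-01'),
    CHECK (product >= 'A' AND product < 'B')
) INHERITS (sales);

SET constraint_exclusion = on;
EXPLAIN SELECT * FROM sales
 WHERE sold_at >= '2010-04-10' AND sold_at < '2010-04-11'
   AND product >= 'A' AND product < 'B';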
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
Hi,

Did you eventually figure out what was wrong? Was it just that you were trying to load a full result set and running out of memory with an OutOfMemoryError? Or was the JVM truly crashing rather than just throwing OutOfMemoryError?

--
Craig Ringer
Re: [GENERAL] Inheritance efficiency
2010/4/30 David Fetter:
> On Thu, Apr 29, 2010 at 11:29:36AM +0200, Vincenzo Romano wrote:
>> > No info about this point (partial indexes)?
>> > Is also this geared with linear algorithms?
>>
>> Should I move to an "enterprise grade" version of PostgreSQL?
>
> The enterprise grade version of PostgreSQL is the community version.
>
> Proprietary forks exist, but they don't fix this kind of problem. :)

Hmmm ... I think this is the kind of problem that keeps PostgreSQL away from the "enterprise grade" world. The ability to cope with thousands of DB objects like (child-)tables, indexes, functions and so on with O(1) or at least O(log(n)) complexity is among the key points.

For example, the Linux kernel made the big jump with server hardware thanks also to the O(1) schedulers.

In this specific case, if you think about "inheritance for partitioning" and you stick with the example idea of "one partition per month", then the current solution is more than OK. In the real world that is not really the general case, especially in the "enterprise grade" world, where maybe you partition on both a time stamp and another column, like product code ranges and prefixes ...

Is there any planning about this improvement?

> Cheers,
> David.
> --
> David Fetter http://fetter.org/
> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david.fet...@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Vincenzo Romano
NotOrAnd Information Technologies
cel. +39 339 8083886 | gtalk. vincenzo.rom...@notorand.it
fix. +39 0823 454163 | skype. notorand.it
fax. +39 02 700506964 | msn. notorand.it
NON QVIETIS MARIBVS NAVTA PERITVS
Re: [GENERAL] Inheritance efficiency
On Thu, Apr 29, 2010 at 11:29:36AM +0200, Vincenzo Romano wrote:
> > No info about this point (partial indexes)?
> > Is also this geared with linear algorithms?
>
> Should I move to an "enterprise grade" version of PostgreSQL?

The enterprise grade version of PostgreSQL is the community version.

Proprietary forks exist, but they don't fix this kind of problem. :)

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [GENERAL] Performance and Clustering
Thanks a lot for all your responses. I am impressed, really impressed. I never thought I could get this many responses in such a short time. Wonderful support :) Thanks a lot :) :)

I don't have details; I'll get them really soon. But all your input is really valuable, and I have much more information now. I'll continue my research and spend a lot of time reading the wiki :P :)

I agree, 4k requests seems to be too much and crazy. I hope that my contact was wrong and it's only 400, which looks to be manageable.

Once again, thanks a lot. Really appreciate your valuable time. Thanks :)

On Thu, Apr 29, 2010 at 1:41 PM, Scott Marlowe wrote:
> On Thu, Apr 29, 2010 at 1:41 PM, Scott Marlowe wrote:
> > On Wed, Apr 28, 2010 at 7:08 PM, Jaime Rodriguez wrote:
> >> hi,
> >> Today is my first day looking at PostgreSQL
> >> I am looking to migrate a MS SQL DB to PostgreSQL :) :)
> >> My customer requires that DBMS shall support 4000 simultaneous requests
> >> Also the system to be deploy maybe a cluster, with 12 microprocessors
> >
> > I'm gonna jump in here and say that if you 400 REQUESTS running at the
> > same time, you're gonna want a REALLY big machine.
>
> I hate my keyboard... I meant to say:
>
> .. if you really need 4000 requests running at the same time...

--
Ing. Jaime Rodríguez Quesada, Mag
Liberux S.A.
http://www.liberux.com
Re: [GENERAL] sql help, reusing a column
On 04/29/2010 05:08 PM, Thomas Kellerer wrote:
> SELECT organization, state, lastdate, age(lastdate)
> FROM (
>   SELECT organization, state,
>          (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate
>   FROM customers
> ) t
> order by lastdate desc

Ah, yes, that does work, very nice. Thank you.

-Andy
Re: [GENERAL] PostgreSQL Performance issue
Hello there,

1. Try using the COPY command; you will see a significant decrease in the loading time.
2. Turn off autocommit, and remove foreign key constraints if it is only a one-time load; this will also help decrease the load time.

Try these options and let us know how it went. We load around 6M rows of data into a table using the COPY command; it takes a few minutes to load the data, and the system configuration is not that high either. Also, one more thing: we use a Linux box over here.

Do a small test of how long it takes to do 1000 inserts into a similar table, and send us your timings and the definition of the table. I will compare against mine.

Thanks
Deepak

On Tue, Apr 27, 2010 at 10:09 PM, wrote:
> I am curious to know how much of your delay is due to PostgreSQL and how
> much to your Java batch program. If you comment out the call to the
> database function, so that you are reading your input file but not doing
> anything with the data, how long does your batch program take to run?
>
> RobR
>
> --
>
> The Java program hardly takes a minute to process all the flat files, but
> at the time of inserting the records into the db, the entire process takes
> more than 4 hours.
>
> Many thanks
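A minimal sketch of the COPY approach suggested above, with a hypothetical table and file path (the file must be readable by the server process; from psql, \copy loads from the client side instead):

-- Hypothetical table matching a three-column flat file.
CREATE TABLE staging_load (
    id    integer,
    name  text,
    price numeric
);

-- One COPY replaces thousands of individual INSERTs and their
-- per-statement overhead.
COPY staging_load FROM '/tmp/data.csv' WITH CSV;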
Re: [GENERAL] Problem: concat an array of arrays
Thanks, Merlin! The "restack" function solves the problem! :)

> what are you trying to do w/unfold function exactly?

The recursive query I mentioned was to produce from the argument array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11]] the result array[1,2,3,4,5,6,7,8,9,10,11]. The behaviour of the unnest function confused me, I didn't expect anything like that:

postgres=# select array(select unnest(array[array[1,2,3],array[4,5,6]]));
   ?column?
---------------
 {1,2,3,4,5,6}
(1 row)

postgres=# select array(select unnest(array[array[1,2,3],array[4,5]]));
ERROR:  multidimensional arrays must have array expressions with matching dimensions

But, oh well, at least I can make a {...} from {{...}} in a functional way :)

Regards,
Belka

29.04.10, 08:53, "Merlin Moncure":

> On Thu, Apr 29, 2010 at 8:46 AM, Merlin Moncure wrote:
> > On Wed, Apr 28, 2010 at 8:48 PM, Belka Lambda wrote:
> >> Hi!
> >>
> >> I tried to write a recursive SELECT, that would do the concatination,
> >> but a problem appeared: can't make a {1,2,3} from {{1,2,3}}.
> >> Here are some experiments:
> >> ---
> >> postgres=# select array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]];
> >>                 array
> >> --------------------------------------
> >>  {{1,2,3},{4,5,6},{7,8,9},{10,11,12}}
> >> (1 row)
> >>
> >> postgres=# select (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]])[3];
> >>  array
> >> -------
> >>
> >> (1 row)
> >>
> >> postgres=# select (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]])[3:3];
> >>    array
> >> -----------
> >>  {{7,8,9}}
> >> (1 row)
> >>
> >> postgres=# select (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]])[3][1];
> >>  array
> >> -------
> >>      7
> >> (1 row)
> >> ---
> >>
> >> The original query, that would do the concatenation:
> >> ---
> >> WITH RECURSIVE unfold (rest, accum) AS (
> >>   VALUES ($1 :: int[][], ARRAY[] :: int[])
> >>   UNION ALL
> >>   SELECT u.rest[2:array_length(u.rest, 1)] AS rest,
> >>          array_cat(u.rest[1], u.accum) AS accum
> >>   FROM unfold AS u
> >>   WHERE array_length(u.rest, 1) > 0
> >> )
> >> SELECT u.accum
> >> FROM unfold AS u
> >> WHERE array_length(u.rest, 1) = 0;
> >> ---
> >> Throws an error:
> >> ERROR: function array_cat(integer, integer[]) does not exist
> >
> > array_cat requires two array arguments. you could rewrite your expression to
> > array_cat(array[u.rest[1]], u.accum)
> > (i think, not quite sure what you are trying to do).
> >
> > you can append scalars to arrays with the || operator:
> > select array[1,2,3] || 4;
> >  ?column?
> > -----------
> >  {1,2,3,4}
> >
> > you can kinda sorta slice an array using the slice method:
> > select (array[array[1,2,3], array[2,4,6]])[1:1];
> >    array
> > -----------
> >  {{1,2,3}}
> >
> > what are you trying to do w/unfold function exactly?
>
> hm. the basic problem is that it's difficult to slide arrays up/down
> dimensions. you can move from scalars to arrays and arrays to
> scalars, but not from dimension N to N-1 etc. you can however move
> from dimension 'N' to 1:
>
> create or replace function restack(_array anyarray) returns anyarray as
> $$
>   select array(select unnest($1));
> $$ language sql immutable;
>
> select restack(array[1,2,3]);
>  restack
> ---------
>  {1,2,3}
>
> select restack(array[array[1,2,3]]);
>  restack
> ---------
>  {1,2,3}
>
> merlin
Re: [GENERAL] Recovering Data from a crashed database
On 30/04/2010 5:29 AM, Eric Langheinrich wrote:
> I'm looking for options to recover data from a crashed postgres database
> server. We recently had a solid state storage device blow up taking the
> database server with it. The database is version 8.3, the pg_clog, pg_xlog
> and subdirectories of pg_tblspc were wiped out with the crashed storage
> device. We do have the files under /data/base. pgfsck looked like the
> right tool for the job, but seems to be outdated and lacking support for 8.3

Whatever you do, and before you do anything else, take a full copy of everything you still have and put it on storage you then ensure is read-only. This is important. Any recovery attempt you make may make things worse, and change the situation from "recoverable" to "completely hosed". Once you have a full snapshot, you can supply that to anyone you choose to help with recovery. I strongly suggest making sure the original pg data directory is read-only too.

If you're going to do your own recovery attempts, copy the data to a spare machine and try it there, simply to make sure you've got everything isolated and there's no chance you're going to stomp on the original copy.

--
Craig Ringer
Re: [GENERAL] sql help, reusing a column
Andy Colson wrote on 29.04.2010 23:51:
> Here is my query, which works:
>
> select organization, state,
>   (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate,
>   age( (select max(idate) from times where customers.custid=times.custid and taskid = 27) )
> from customers
> order by lastdate desc nulls last;
>
> I'd love to use age(lastdate) instead of age( (repeat sql) ), but it does
> not seem to work.

This should work:

SELECT organization, state, lastdate, age(lastdate)
FROM (
  SELECT organization, state,
         (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate
  FROM customers
) t
order by lastdate desc
Re: [GENERAL] sql help, reusing a column
On 4/29/2010 4:51 PM, Andy Colson wrote:
> I tried this:
>
> select organization, state, max(idate), age(max(idate))
> from customers inner join times using(custid)
> where taskid = 27
> group by organization, state
> order by idate desc nulls last;
>
> but get error that times.idate must appear in group by or used in agg
> func... except it is used in an agg func. Any hints on what I'm missing?
>
> Thanks,
> -Andy

Ahh, shoot, it was the idate in the order by, not the select list. Both "order by 3" and "order by max(idate)" work just fine. Sorry for the noise... but still... I'm kinda curious, in my first example, how you can re-use a column. Is there a way to:

select organization, state,
  (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate,
  age(lastdate)
from customers

-Andy
[GENERAL] sql help, reusing a column
Here is my query, which works:

select organization, state,
  (select max(idate) from times where customers.custid=times.custid and taskid = 27) as lastdate,
  age( (select max(idate) from times where customers.custid=times.custid and taskid = 27) )
from customers
order by lastdate desc nulls last;

I'd love to use age(lastdate) instead of age( (repeat sql) ), but it does not seem to work. I tried this:

select organization, state, max(idate), age(max(idate))
from customers inner join times using(custid)
where taskid = 27
group by organization, state
order by idate desc nulls last;

but get an error that times.idate must appear in the group by or be used in an agg func... except it is used in an agg func. Any hints on what I'm missing?

Thanks,
-Andy
Re: [GENERAL] Recovering Data from a crashed database
On Thu, 2010-04-29 at 15:29 -0600, Eric Langheinrich wrote: > I'm looking for options to recover data from a crashed postgres > database server. We recently had a solid state storage device blow up > taking the database server with it. > > The database is version 8.3, the pg_clog, pg_xlog and subdirectories > of pg_tblspc were wiped out with the crashed storage device. We do > have the files under /data/base. > > pgfsck looked like the right tool for the job, but seems to be > outdated and lacking support for 8.3 > > I'm open to all options including outsourcing the data recovery. Any > help is appreciated. 2ndQuadrant offers commercial data recovery services for people in your position. We'd be happy to help and regrettably have considerable experience. If you're interested, please contact us direct/off-list. -- Simon Riggs www.2ndQuadrant.com
[GENERAL] Recovering Data from a crashed database
I'm looking for options to recover data from a crashed postgres database server. We recently had a solid state storage device blow up taking the database server with it. The database is version 8.3, the pg_clog, pg_xlog and subdirectories of pg_tblspc were wiped out with the crashed storage device. We do have the files under /data/base. pgfsck looked like the right tool for the job, but seems to be outdated and lacking support for 8.3 I'm open to all options including outsourcing the data recovery. Any help is appreciated. Thank you, Eric
Re: [GENERAL] How to monitor Parallel pg_restore ?
Any suggestions?

On Thu, Apr 29, 2010 at 4:42 PM, raghavendra t wrote:
> Hi All,
>
> I am using Postgres 8.4 and the pg_restore -j option. I have a dump of the
> database taken with -Fc and elected the pg_restore -j option for faster
> restoration. When the restoration process is in progress, I want to
> monitor the threads invoked by pg_restore (suppose I give -j 4). I have
> checked pg_stat_activity, in which I see only one transaction running,
> that is a COPY command.
>
> Even the "top" command hasn't shown anything.
>
> Could you please assist me with this.
>
> Regards
> Raghavendra
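For what it's worth, each pg_restore -j worker opens its own database connection, so several backends can show up while the restore runs; a hedged sketch for watching them from another session, using the 8.4 catalog column names:

-- Each parallel worker is a separate backend (8.4 column names);
-- during the data phase one COPY per active worker may appear.
SELECT procpid, query_start, current_query
  FROM pg_stat_activity
 WHERE current_query <> '<IDLE>'
 ORDER BY query_start;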
Re: [GENERAL] Populate arrays from multiple rows
Thanks Merlin:

I failed to mention that I'm running 8.3 (no array_agg), but you certainly pointed me in the right direction. This worked:

INSERT INTO foo_arrays
SELECT cde, nbr,
       ARRAY_ACCUM(CAST(aaa AS text)),
       ARRAY_ACCUM(CAST(bbb AS text)),
       ARRAY_ACCUM(CAST(ccc AS text))
FROM raw_foo
GROUP BY 1,2;

Cheers,
Rob

-----Original Message-----
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: Wednesday, April 28, 2010 4:33 PM
To: Clift, Robert
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Populate arrays from multiple rows

On Wed, Apr 28, 2010 at 1:39 PM, wrote:
> Good afternoon:
>
> I would like to insert some (1 or more) values from multiple rows of
> one table into an array in another table. Here's the scenario:
>
> --table to house data provided by a third party
> CREATE TABLE raw_foo (
>   rf_id serial PRIMARY KEY,
>   cde character varying(4),
>   nbr integer,
>   aaa character varying(60),
>   bbb character(10),
>   ccc character varying(20)
> );
>
> --table raw_foo populated by copying from a text file
> --columns cde||nbr identify a person while columns aaa||bbb||ccc describe
>   an attribute of a person
> --since each person can have one or more attributes, the cde||nbr
>   identifier is not distinct
> --need data in raw_foo flattened so that there is only one record per person
>
> --second table in which aaa, bbb, and ccc are array fields
> CREATE TABLE foo_arrays (
>   cde character varying(4),
>   nbr integer,
>   aaa text[],
>   bbb text[],
>   ccc text[],
>   PRIMARY KEY (cde, nbr)
> );
>
> --insertion of all distinct cde||nbr combinations from raw_foo
> INSERT INTO foo_arrays (cde, nbr)
>   (SELECT cde, nbr
>    FROM raw_foo
>    GROUP BY cde, nbr
>    HAVING COUNT(*) = 1)
>   UNION
>   (SELECT cde, nbr
>    FROM raw_foo
>    GROUP BY cde, nbr
>    HAVING COUNT(*) > 1);
>
> --hope to update foo_arrays.aaa by selecting every instance of
> raw_foo.aaa where raw_foo.cde||raw_foo.nbr matches the distinct value
> of foo_arrays.cde||foo_arrays.nbr (repeating the process for
> foo_arrays.bbb and foo_arrays.ccc)
>
> UPDATE foo_arrays
> SET aaa = ???
>
> This is where I'm stumped.
> Am I on the right path?
> Thanks in advance.

Hello, fellow Floridian! :-)

how about this:

insert into foo_arrays
select cde, nbr, array_agg(aaa), array_agg(bbb), array_agg(ccc)
from raw_foo
group by 1,2;

merlin
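For anyone else on 8.3: ARRAY_ACCUM is not built in there either; the definition given in the PostgreSQL documentation's user-defined aggregate example is:

-- From the PostgreSQL docs; array_agg (built in from 8.4 on) makes
-- this unnecessary on later versions.
CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);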
Re: [GENERAL] Storing many big files in database- should I do it?
On Apr 29, 2010, at 10:45 AM, Justin Graf wrote:
> Many people encode the binary data in Base64 and store as text data
> type?? Then never have to deal with escaping bytea data type. Which i
> have found can be a pain

Damn. Wish I'd thought of that ;-)

--
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice
Re: [GENERAL] Performance and Clustering
On Thu, Apr 29, 2010 at 1:41 PM, Scott Marlowe wrote:
> On Wed, Apr 28, 2010 at 7:08 PM, Jaime Rodriguez wrote:
>> hi,
>> Today is my first day looking at PostgreSQL
>> I am looking to migrate a MS SQL DB to PostgreSQL :) :)
>> My customer requires that DBMS shall support 4000 simultaneous requests
>> Also the system to be deploy maybe a cluster, with 12 microprocessors
>
> I'm gonna jump in here and say that if you 400 REQUESTS running at the
> same time, you're gonna want a REALLY big machine.

I hate my keyboard... I meant to say:

.. if you really need 4000 requests running at the same time...
Re: [GENERAL] Performance and Clustering
On Wed, Apr 28, 2010 at 7:08 PM, Jaime Rodriguez wrote:
> hi,
> Today is my first day looking at PostgreSQL
> I am looking to migrate a MS SQL DB to PostgreSQL :) :)
> My customer requires that DBMS shall support 4000 simultaneous requests
> Also the system to be deploy maybe a cluster, with 12 microprocessors

I'm gonna jump in here and say that if you 400 REQUESTS running at the same time, you're gonna want a REALLY big machine.

I admin a setup where two db servers handle ~200 simultaneous requests, almost all being very short millisecond long requests, a few being 100 milliseconds, and a very very few running for seconds. With 8 2.1 GHz Opteron cores, 32 Gigs of ram, and 14x15k drives, those machines run with a load factor in the range of 10 to 15. CPUs are maxed at that range of load, and IO is 70 to 80% utilized according to iostat -x. Wait % is generally one core max. Some of that load is fixed on the master, but a lot can be handled by slaves.

Your load, if you really are having 4000 simultaneous connections, is likely going to need 20 times the load handling I need. Given the newer 12 core AMDs are somewhat faster, you could probably get away with two or three of these machines. If you were to use 96 core machines (8Px12core) with as many disks as you could throw at them (40 to 100) then you're in the ballpark for a set of machines to process 4,000 simultaneous requests, assuming a mostly read (80% or so) setup. We're talking a large % of a full sized rack to hold all the drives and cores you'd need.

But this brings up a lot of questions about partitioning your dataset if you can, things like that. Do all of these 4,000 simultaneous requests need to update the same exact data set? Or are they read mostly reporting users? Can you use memcached to handle part of the load? Usage patterns inform a great deal on how to size a system to handle that much load.
Re: [GENERAL] Storing many big files in database- should I do it?
On 4/29/2010 3:18 PM, Tom Lane wrote:
> Alvaro Herrera writes:
>
> However, that toast limit is per-table, whereas the pg_largeobject limit
> is per-database. So for example if you have a partitioned table then
> the toast limit only applies per partition. With large objects you'd
> fall over at 4G objects (probably quite a bit less in practice) no
> matter what.
>
> regards, tom lane

Has there been any thought of doing something similar to MS filestream? http://msdn.microsoft.com/en-us/library/cc949109.aspx

It seems to overcome all the drawbacks of storing files in the DB.
Re: [GENERAL] Performance and Clustering
This whole sockets conversation has wandered way off topic. PostgreSQL runs into high-connection scaling issues due to memory limitations (on Windows in particular, as noted in the FAQ entry I suggested), shared resource contention, and general per-connection overhead long before socket issues matter.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com www.2ndQuadrant.us
Re: [GENERAL] Storing many big files in database- should I do it?
Alvaro Herrera writes:
> Each toasted object also requires an OID, so you cannot have more than 4
> billion toasted attributes in a table.
> I've never seen this to be a problem in real life, but if you're talking
> about having that many large objects, then it will be a problem with
> toast too.

However, that toast limit is per-table, whereas the pg_largeobject limit is per-database. So for example if you have a partitioned table then the toast limit only applies per partition. With large objects you'd fall over at 4G objects (probably quite a bit less in practice) no matter what.

regards, tom lane
Re: [GENERAL] Storing many big files in database- should I do it?
Justin Graf wrote:
> On 4/29/2010 12:07 PM, David Wall wrote:
>> Big downside for the DB is that all large objects appear to be stored
>> together in pg_catalog.pg_largeobject, which seems axiomatically
>> troubling that you know you have lots of big data, so you then store
>> them together, and then worry about running out of 'loids'.
>
> Huh ??? isn't that the point of using bytea or text datatypes.
>
> I could have sworn bytea does not use the large object interface, it uses
> TOAST, or have I gone insane

Each toasted object also requires an OID, so you cannot have more than 4 billion toasted attributes in a table. I've never seen this to be a problem in real life, but if you're talking about having that many large objects, then it will be a problem with toast too.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [GENERAL] Performance and Clustering
> I don't think it's that easy. 50,000 sockets open, sure, but what's the
> performance? The programming model has everything to do with that, and
> Windows select() won't support that many sockets with any sort of
> performance. For Windows you have to convert to using non-blocking
> sockets w/messages. (And I've never seen the PG code, but I'll bet it's
> not using non-blocking sockets & the Windows msg queue, so 50k sockets
> using select() on Windows will not be usable.) That being said, I'm not a
> Windows socket component developer, so it's mostly guessing. But saying
> "it can" and saying "it's usable" are two different things, and that
> depends on the code, not the registry settings.

Actually that is incorrect. You can use synchronous non-blocking sockets. Asynchronous is a nightmare due to the overhead of pushing and handling messages... the busier the kernel, the slower your application. Synchronous non-blocking shows a small degradation in performance every 5,000 sockets (meaning 10,000 streams is minimally slower than 5,000, but enough to denote degradation).

Systems running my product and designs:

AOL's Proxy Server System
Some of the UK's largest ISPs
AT&T Fiber Monitoring Framework
HBO Video Streaming to Satellite
Hart, a front-end for TransUnion, Equifax and Experian
OFAC Query (B-Tree query service, processing over 100,000 requests a second) (*)

* WAN latency plays a running variable in their stats, but they average 100,000+ a second during peak hours. [1 master, 2 fail-over load-balanced servers.]

Most people run into the "2048+/-" thread limitation until they learn how to properly manage stack allocation per thread. I have been designing commercial enterprise socket solutions for over 15 years and sell an SDK that no product has yet to touch, and I compete with ALL the big boys (and they all know who I am). :-) ... The limitations in performance are factors of poor (modern sloppiness) variable allocation, memory management, buffering techniques, etc.

I got out of actively promoting DXSock (my socket suite) when I found I could capitalize more on my time and my knowledge... so since 2000 I sell my knowledge.

Factors which also come into play are the built-in overhead of the operating system when it is a "network client/server" and has active connections. These connections also incur the poor default settings Microsoft picked (the FIN_WAIT/2 issue, which is another registry tweak). Once you learn which services a dedicated Windows server will not need, and rip out all of the excess "network client" junk (this is well documented all over the net), you can produce very robust Windows servers. (Of course there are much better solutions for production servers than Windows - but people still drink the Microsoft "blue" Kool-Aid.)

* People who document the registry tweaks needed: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.dc38421_1500/html/ntconfig/X26667.htm

;-) O.
Re: [GENERAL] Storing many big files in database- should I do it?
On Thu, Apr 29, 2010 at 1:51 PM, David Wall wrote:
> I missed the part that BYTEA was being used, since it's generally not a
> good way for storing large binary data, because you are right that BYTEA
> requires escaping across the wire (client to backend) in both directions,
> which for true binary data (like compressed/encrypted data, images or
> other non-text files) makes for a lot of expansion in size and related
> memory.

what?? postgresql supports binary data in both directions without escaping. here is how i do it with libpqtypes:

PGbytea b;
b.data = some_pointer;
b.len = data_length;

res = PQexecf(conn, "insert into table values (%bytea*);", &b);

merlin
Re: [GENERAL] Performance and Clustering
On 4/29/2010 11:49 AM, Ozz Nixon wrote:
> On 4/29/10 12:42 PM, Greg Smith wrote:
>> Alban Hertroys wrote:
>>> The reason I'm asking is that Postgres doesn't perform at its best on
>>> Windows and I seriously wonder whether the OS would be able to handle a
>>> load like that at all (can Windows handle 4000 open sockets for
>>> example?).
>>
>> You have to go out of your way to even get >125 connections going on
>> Windows; see the very last entry at
>> http://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows
>
> I design socket component suites for developers; on Windows, with a few
> registry tweaks, you are able to have over 50,000 live, hot sockets.

I don't think it's that easy. 50,000 sockets open, sure, but what's the performance? The programming model has everything to do with that, and Windows select() won't support that many sockets with any sort of performance. For Windows you have to convert to using non-blocking sockets w/messages. (And I've never seen the PG code, but I'll bet it's not using non-blocking sockets & the Windows msg queue, so 50k sockets using select() on Windows will not be usable.) That being said, I'm not a Windows socket component developer, so it's mostly guessing. But saying "it can" and saying "it's usable" are two different things, and that depends on the code, not the registry settings.

-Andy
Re: [GENERAL] Writing SRF
It has been years since I've mucked in the C++ swamp, but that means your (near) heap is OK but your stack is hosed... probably specific to compiler (version), operating system (version) and environment settings. Ping back if you are still experiencing those problems with those configuration settings.

Warm regards from the USA!
Martin Gainty

> From: jorgearev...@gis4free.org
> Date: Thu, 29 Apr 2010 19:45:41 +0200
> Subject: Re: [GENERAL] Writing SRF
> To: t...@sss.pgh.pa.us
> CC: pgsql-general@postgresql.org
>
> On Thu, Apr 29, 2010 at 3:56 PM, Tom Lane wrote:
> > Jorge Arevalo writes:
> >> Yes. For example, the function expects 2 arguments, and it's called
> >> with 2 arguments: 1 composite type (following this format
> >> https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat)
> >> and one integer. But PG_NARGS() returns a really big value (16297)
> >> when I first check the number of arguments at the beginning of the
> >> function. Has sense?
> >
> > Given only that data point, I would guess that you forgot to mark the
> > function as being called with V1 protocol (PG_FUNCTION_INFO_V1).
> >
> > regards, tom lane
>
> Many thanks! That was one of my errors. Another one was this:
>
> char szDataPointer[10];
> sprintf(szDataPointer, "%p", a_pointer);
>
> These lines caused a memory error. I changed them for:
>
> char * pszDataPointer;
> pszDataPointer = (char *)allocator(10 * sizeof(char));
> sprintf(pszDataPointer, "%p", a_pointer);
>
> By "allocator" I mean a memory allocator in a valid memory context for
> PostgreSQL.
>
> And it seems to work :-). Is static memory "dangerous" in a PostgreSQL
> memory context?
>
> Thanks again!
> Jorge
Re: [GENERAL] Storing many big files in database- should I do it?
On 4/29/2010 1:51 PM, David Wall wrote:
>> Put it another way: bytea values are not stored in the pg_largeobject
>> catalog.
>
> I missed the part that BYTEA was being used, since it's generally not a
> good way for storing large binary data ...
>
> BYTEA and TEXT both can store up to 1GB of data (max field length),
> which means even less "file size" supported if you use TEXT with base64
> coding. LO supports 2GB of data. ...
>
> Of course, to support larger file storage than 1GB or 2GB, you'll have
> to create your own "toast"-like capability to split them into multiple
> rows.
>
> David

Outside of videos/media streams, what other kind of data is going to be 1 gig in size? That's a lot of data, even today. We all talk about 1 gig and 2 gig limits on this, but really, who has bumped into that on a regular basis??? Every time I hear about that not being big enough, the person is trying to shoehorn media files into the database, which is insane.
Re: [GENERAL] Writing SRF
Jorge Arevalo writes:
> Many thanks! That was one of my errors. Another one was this:
>
> char szDataPointer[10];
> sprintf(szDataPointer, "%p", a_pointer);
>
> These lines caused a memory error.

That looks all right in itself (unless you're on a 64-bit machine, in which case you need a bigger array to hold %p output). However the array would only live as long as the function it's in. What were you doing with the data afterwards, returning it maybe?

regards, tom lane
Re: [GENERAL] Storing many big files in database- should I do it?
>> Huh ??? isn't that the point of using bytea or text datatypes. I could
>> have sworn bytea does not use the large object interface, it uses TOAST,
>> or have I gone insane
>
> You're not insane :)
> Put it another way: bytea values are not stored in the pg_largeobject
> catalog.

I missed the part that BYTEA was being used, since it's generally not a good way for storing large binary data, because you are right that BYTEA requires escaping across the wire (client to backend) in both directions, which for true binary data (like compressed/encrypted data, images or other non-text files) makes for a lot of expansion in size and related memory.

BYTEA and TEXT both can store up to 1GB of data (max field length), which means even less "file size" supported if you use TEXT with base64 coding. LO supports 2GB of data. In JDBC, typically BYTEA is used with byte[] or a binary stream, while LOs are used with BLOB. I think LOs allow for streaming with the backend, too, but I'm not sure about that, whereas I'm pretty sure BYTEA/TEXT move all the data together, so it will be in memory all or nothing.

Of course, to support larger file storage than 1GB or 2GB, you'll have to create your own "toast"-like capability to split them into multiple rows.

David
Re: [SQL] [GENERAL] Tsearch not searching 'Y'
You can avoid stemming by using 'simple' instead of 'english' as the language of the words in to_tsvector (which is a little more awkward than the cast). "There are no stop words for the simple dictionary. It will just convert to lower case, and index every unique word."

SELECT to_tsvector('simple', 'Andy andy The the in out');
             to_tsvector
--------------------------------------
 'in':5 'out':6 'the':3,4 'andy':1,2
(1 row)

John

On Apr 29, 2010, at 4:01 PM, Tom Lane wrote:
> "sandeep prakash dhumale" writes:
>> I am trying to get tsearch working for my application but I am facing a
>> problem when the alphabet 'Y' is in the tsquery.
>
>> # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*');
>>  ?column?
>> ----------
>>  f
>> (1 row)
>
> You can't use to_tsquery for this sort of thing, because it tries to
> normalize the given words:
>
> regression=# select to_tsquery('holly:*');
>  to_tsquery
> ------------
>  'holli':*
> (1 row)
>
> If you do this it works:
>
> regression=# SELECT 'hollywood'::tsvector @@ 'holly:*'::tsquery;
>  ?column?
> ----------
>  t
> (1 row)
>
> So if you want to use prefix matching, don't normalize.
>
> regards, tom lane
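Putting the two halves of this thread together, a small hedged sketch: with the 'simple' configuration on both the vector and the query, nothing is stemmed, so the prefix match also works through to_tsquery:

-- 'simple' skips stemming on both sides, so 'holly:*' stays 'holly':*
-- and prefix-matches 'hollywood'.
SELECT to_tsvector('simple', 'hollywood') @@ to_tsquery('simple', 'holly:*');
-- ?column?
-- ----------
--  t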
Re: [GENERAL] Writing SRF
On Thu, Apr 29, 2010 at 3:56 PM, Tom Lane wrote:
> Jorge Arevalo writes:
>> Yes. For example, the function expects 2 arguments, and it's called
>> with 2 arguments: 1 composite type (following this format
>> https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat)
>> and one integer. But PG_NARGS() returns a really big value (16297)
>> when I first check the number of arguments at the beginning of the
>> function. Has sense?
>
> Given only that data point, I would guess that you forgot to mark the
> function as being called with V1 protocol (PG_FUNCTION_INFO_V1).
>
> regards, tom lane

Many thanks! That was one of my errors. Another one was this:

char szDataPointer[10];
sprintf(szDataPointer, "%p", a_pointer);

These lines caused a memory error. I changed them for:

char * pszDataPointer;
pszDataPointer = (char *)allocator(10 * sizeof(char));
sprintf(pszDataPointer, "%p", a_pointer);

By "allocator" I mean a memory allocator in a valid memory context for PostgreSQL.

And it seems to work :-). Is static memory "dangerous" in a PostgreSQL memory context?

Thanks again!
Jorge
Re: [GENERAL] Cumulative count (running total) window fn
> Curious note - how does the non-subselect version and the subselect
> version compare performance-wise?

Magnus,

On a test table with 12,000 rows there's not much in it; the subselect has a simpler plan but they both take practically the same time. The two plans (note I've been rewriting the field names for readability until now but haven't here):

explain analyze SELECT year, sum(c) over (order by year)
FROM (
  SELECT extract(year from a56b7a8d6de03f67b) AS year,
         count(a10e4ab8863c199f1) AS c
  FROM a2e9a7e9e257153de
  GROUP BY extract(year from a56b7a8d6de03f67b)
) as subq;

                                      QUERY PLAN
-------------------------------------------------------------------------------------
 WindowAgg  (cost=851.49..874.06 rows=1290 width=16) (actual time=43.369..43.394 rows=5 loops=1)
   ->  Sort  (cost=851.49..854.71 rows=1290 width=16) (actual time=43.340..43.342 rows=5 loops=1)
         Sort Key: (date_part('year'::text, a2e9a7e9e257153de.a56b7a8d6de03f67b))
         Sort Method:  quicksort  Memory: 25kB
         ->  HashAggregate  (cost=752.59..771.94 rows=1290 width=26) (actual time=43.300..43.317 rows=5 loops=1)
               ->  Seq Scan on a2e9a7e9e257153de  (cost=0.00..689.56 rows=12605 width=26) (actual time=0.031..26.723 rows=12605 loops=1)
 Total runtime: 43.549 ms

explain analyze SELECT extract(year from a56b7a8d6de03f67b),
       count(a10e4ab8863c199f1),
       sum(count(a10e4ab8863c199f1)) OVER (ORDER BY count(a10e4ab8863c199f1))
FROM a2e9a7e9e257153de
GROUP BY 1 ORDER BY 1;

                                      QUERY PLAN
-------------------------------------------------------------------------------------
 Sort  (cost=1382.39..1388.52 rows=2451 width=32) (actual time=44.229..44.230 rows=5 loops=1)
   Sort Key: (date_part('year'::text, a56b7a8d6de03f67b))
   Sort Method:  quicksort  Memory: 25kB
   ->  WindowAgg  (cost=1195.39..1244.41 rows=2451 width=32) (actual time=44.171..44.208 rows=5 loops=1)
         ->  Sort  (cost=1195.39..1201.52 rows=2451 width=32) (actual time=44.125..44.127 rows=5 loops=1)
               Sort Key: (count(a10e4ab8863c199f1))
               Sort Method:  quicksort  Memory: 25kB
               ->  HashAggregate  (cost=1014.52..1057.41 rows=2451 width=32) (actual time=44.071..44.099 rows=5 loops=1)
                     ->  Seq Scan on a2e9a7e9e257153de  (cost=0.00..833.58 rows=24126 width=32) (actual time=0.032..26.683 rows=12605 loops=1)
 Total runtime: 44.396 ms

Regards
Oliver Kohll

oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll
www.agilebase.co.uk - software
www.gtwm.co.uk - company
[GENERAL] Select with string that has a lone hyphen yields nothing
I have a product names table like this:

datab=# select product_id, name from table.product_synonyms where name ilike '%%olympus e-pl1%%';

 product_id | name
------------+---------------------------------------------------------------
       8736 | Olympus E-PL1
       8736 | Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera – Black (Body Only) (Call for pre-order. Available on: 2010-04-09)
       8736 | Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera – Blue (Body Only)
       8736 | Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera w/ ED 14-42mm f3.5-5.6 (champagne/gold)
(4 rows)

Any select statement prior to the hyphen yields a result, but from the hyphen on yields nothing:

datab=# select product_id, name from table.product_synonyms where name ilike '%%Olympus E-PL1 Interchangeable Lens Type Live View Digital Camera - Blue %%';
 product_id | name
------------+------
(0 rows)

Any ideas how to fix this?
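One thing visible in the listing above, offered as a hedged observation: the stored names contain an en dash (U+2013), while the failing pattern uses an ASCII hyphen, and ILIKE treats those as different characters. A quick side-by-side check (table name as given in the post):

select product_id, name from table.product_synonyms
 where name ilike '%Digital Camera – Blue%';  -- en dash (U+2013): should match

select product_id, name from table.product_synonyms
 where name ilike '%Digital Camera - Blue%';  -- ASCII hyphen: returns no rows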
Re: [GENERAL] Storing many big files in database- should I do it?
On 29/04/2010 18:45, Justin Graf wrote:
> On 4/29/2010 12:07 PM, David Wall wrote:
>> Big downside for the DB is that all large objects appear to be stored
>> together in pg_catalog.pg_largeobject, which seems axiomatically
>> troubling that you know you have lots of big data, so you then store
>> them together, and then worry about running out of 'loids'.
>
> Huh ??? isn't that the point of using bytea or text datatypes.
>
> I could have sworn bytea does not use the large object interface, it uses
> TOAST, or have I gone insane

You're not insane :)
Put it another way: bytea values are not stored in the pg_largeobject catalog.

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
Re: [GENERAL] Performance and Clustering
On 4/29/10 12:42 PM, Greg Smith wrote:
> Alban Hertroys wrote:
>> The reason I'm asking is that Postgres doesn't perform at its best on
>> Windows and I seriously wonder whether the OS would be able to handle a
>> load like that at all (can Windows handle 4000 open sockets for
>> example?).
>
> You have to go out of your way to even get >125 connections going on
> Windows; see the very last entry at
> http://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows

I design socket component suites for developers; on Windows, with a few registry tweaks, you are able to have over 50,000 live, hot sockets. On Linux 2.6 and later, I have yet to hit a serious limit. Performance-wise, your focus will be poor memory paging, so make sure you have too much RAM, and nothing else running.

O.
Re: [GENERAL] Storing many big files in database- should I do it?
On 4/29/2010 12:07 PM, David Wall wrote:
> Big downside for the DB is that all large objects appear to be stored
> together in pg_catalog.pg_largeobject, which seems axiomatically
> troubling that you know you have lots of big data, so you then store
> them together, and then worry about running out of 'loids'.

Huh ??? isn't that the point of using bytea or text datatypes. I could have sworn bytea does not use the large object interface, it uses TOAST, or have I gone insane.

Many people encode the binary data in Base64 and store it as a text data type. Then you never have to deal with escaping the bytea data type, which I have found can be a pain.
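A minimal sketch of that round trip with the built-in encode()/decode() functions (example bytes chosen arbitrarily):

-- Store binary as base64 text on the way in, recover bytea on the way out.
SELECT encode(decode('deadbeef', 'hex'), 'base64');  -- yields '3q2+7w=='
SELECT decode('3q2+7w==', 'base64');                 -- yields the original four bytes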
Re: [GENERAL] Performance and Clustering
Alban Hertroys wrote:
> The reason I'm asking is that Postgres doesn't perform at its best on
> Windows and I seriously wonder whether the OS would be able to handle a
> load like that at all (can Windows handle 4000 open sockets for example?).

You have to go out of your way to even get >125 connections going on Windows; see the very last entry at http://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com www.2ndQuadrant.us
Re: [GENERAL] Storing many big files in database- should I do it?
Things to consider when *not* storing them in the DB:

1) Backups of the DB are incomplete without a corresponding backup of the files.
2) No transactional integrity between filesystem and DB, so you will have to deal with orphans from both INSERT and DELETE (assuming you don't also update the files).
3) No built-in ability for replication, such as WAL shipping.

Big downside for the DB is that all large objects appear to be stored together in pg_catalog.pg_largeobject, which seems axiomatically troubling that you know you have lots of big data, so you then store them together, and then worry about running out of 'loids'.

David

On 4/29/2010 2:10 AM, Cédric Villemain wrote:
> 2010/4/28 Adrian Klaver:
>> On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:
>>> On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain wrote:
>>>> store your files in a filesystem, and keep the path to the file (plus
>>>> metadata, acl, etc...) in database.
>>>
>>> What type of filesystem is good for this? A filesystem with support for
>>> storing tens of thousands of files in a single directory, or should one
>>> play the 41/56/34/41563489.ext game?
>
> I'd prefer to go with XFS or ext{3-4}, in both cases with a path game.
> Your path game will let you handle the scalability of your uploads (so
> the first increment is the first directory), something like:
> 1/2/3/4/foo.file
> 2/2/3/4/bar.file
> etc...
> You might explore a hash function or something that splits a SHA1 (or
> other) sum of the file to get the path.
>
>>> Are there any open source systems which handle keeping a filesystem and
>>> database in sync for this purpose, or is it a wheel that keeps getting
>>> reinvented? I know "store your files in a filesystem" is the best
>>> long-term solution. But it's just so much easier to just throw
>>> everything in the database.
>>
>> In the for what it is worth department check out this Wiki:
>> http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems
>
> and postgres fuse also :-D
>
>> --
>> Adrian Klaver
>> adrian.kla...@gmail.com
Re: [GENERAL] Start-up script for few clusters: just add water?
On 29/04/2010 10:40, Piotr Kublicki wrote:
> Guillaume Lelarge wrote on 28/04/2010 15:04:07:
>>> In such case the newly created start-up script postgresql2 should not be
>>> modified in the following line:
>>>
>>> # Override defaults from /etc/sysconfig/pgsql if file is present
>>> [ -f /etc/sysconfig/pgsql/${NAME} ] && . /etc/sysconfig/pgsql/${NAME}
>>>
>>> export PGDATA
>>> export PGPORT
>>>
>>> Or it will automatically match up names from both directories, i.e.:
>>> /etc/init.d/postgresql will run /etc/sysconfig/pgsql/postgresql?
>>> /etc/init.d/postgresql2 will run /etc/sysconfig/pgsql/postgresql2
>>
>> The latter. It will automatically match the script file name and the
>> config file name.
>
> Thanks again. By the way, if I want to start instances with a few optional
> arguments, such as -S or -B, where can I include these? In the same config
> file, i.e. /etc/sysconfig/pgsql/postgresql (speaking about the RedHat
> directories structure)?

I don't think so. I suppose you will have to change the /etc/init.d/postgresql script (or the one for your other instances).

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
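For reference, a hedged sketch of what such a per-instance file could contain (hypothetical path and port); the init script quoted above sources it and exports exactly these two variables, which is also why extra server switches like -S or -B have no slot here:

# /etc/sysconfig/pgsql/postgresql2 -- sourced by /etc/init.d/postgresql2
# (hypothetical values)
PGDATA=/var/lib/pgsql/data2
PGPORT=5433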
Re: [GENERAL] using between with dates
In response to Geoffrey Myers:
> I'm trying the following:
>
> ship_date between '04/30/2010' AND '04/30/2010' + 14
>
> But this returns:
>
> ERROR: invalid input syntax for integer: "04/30/2010"
>
> Can I use between with dates?

Sure, why not, but you have to CAST your STRING into a DATE, or you have to use to_date().

test=*# select '04/30/2010'::date;
    date
------------
 2010-04-30
(1 row)

And yes, write "+ '14 days'::interval" instead of just +14.

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99
Re: [GENERAL] using between with dates
On 29 April 2010 14:55, Geoffrey Myers wrote: > I'm trying the following: > > ship_date between '04/30/2010' AND '04/30/2010' + 14 > > But this returns: > > ERROR: invalid input syntax for integer: "04/30/2010" > > Can I use between with dates? > > You need to cast that last date, so: ship_date between '04/30/2010' AND '04/30/2010'::date + 14 Thom
Re: [GENERAL] using between with dates
2010/4/29 Geoffrey Myers > I'm trying the following: > > ship_date between '04/30/2010' AND '04/30/2010' + 14 > > But this returns: > > ERROR: invalid input syntax for integer: "04/30/2010" > > Can I use between with dates? > > > This should be fine: ship_date between '04/30/2010'::date AND '04/30/2010'::date + 14 regards Szymon Guz
[GENERAL] using between with dates
I'm trying the following: ship_date between '04/30/2010' AND '04/30/2010' + 14 But this returns: ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? -- Geoffrey Myers Myers Consulting Inc. 770.592.1651 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] using between with dates
Tom Lane wrote: Geoffrey writes: ship_date between '04/30/2010' AND '04/30/2010' + 14 ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? The problem with that is the parser has no reason to treat the strings as dates, at least not till it comes to consider the BETWEEN comparisons, which is too late to help in resolving the addition in the subexpression (data types are determined bottom-up). This'd work: ship_date between '04/30/2010' AND '04/30/2010'::date + 14 Thanks muchly, likely a better solution than my timestamp approach. -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Re: Re: [GENERAL] [SQL] Tsearch not searching 'Y'
On Thu, 29 Apr 2010 19:27:33 +0530 wrote >On Thu, Apr 29, 2010 at 01:13:40PM -, sandeep prakash dhumale wrote: >> Hello All, >> >> I am trying to get tsearch working for my application but I am facing a >> problem when alphabet 'Y' is in the tsquery. >> >> Can anyone please shed some light on it. >> >> >> # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); >> ?column? >> -- >> f >> (1 row) >> >> SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); >> ?column? >> -- >> t >> (1 row) >> >> >> It works when I put <> in y as below but I don't want to do it that way. >> >> SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); >> ?column? >> -- >> t >> >> Thanks in advance > >That is because the to_tsquery() normalizes the tokens. Here is >what I get from the default configuration: > >db=# select to_tsquery('holly:*'); > to_tsquery > > 'holli':* >(1 row) > >db=# select to_tsquery('holl:*'); > to_tsquery > > 'holl':* >(1 row) > >It is pretty easy to see why you see the behavior that you do. >Maybe you need to change your tsearch configuration to match what >you expect to happen. > >Regards, >Ken > > >-- >Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >To make changes to your subscription: >http://www.postgresql.org/mailpref/pgsql-general > First of all, thanks for your replies Tom and Ken. I am a bit of a newbie to Tsearch, so I apologize if I sound a little confused. Tom: if I cast like you wrote, I run into a case sensitivity issue, and it also does not work for other searches; I guess it then looks for exact matches and does not normalize to lexemes. Ken: as you said, I need to change my configuration. It would be great if you could point out where I can change that configuration, and how it works in the query; does that mean explicitly including y in to_tsquery? All your help is highly appreciated. --Sandy
Re: [GENERAL] using between with dates
Geoffrey writes: > ship_date between '04/30/2010' AND '04/30/2010' + 14 > ERROR: invalid input syntax for integer: "04/30/2010" > Can I use between with dates? The problem with that is the parser has no reason to treat the strings as dates, at least not till it comes to consider the BETWEEN comparisons, which is too late to help in resolving the addition in the subexpression (data types are determined bottom-up). This'd work: ship_date between '04/30/2010' AND '04/30/2010'::date + 14 regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
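To see the bottom-up typing in isolation, here is a minimal psql sketch (no table needed; the error is the same one reported above):

SELECT '04/30/2010' + 14;
ERROR:  invalid input syntax for integer: "04/30/2010"
-- the untyped literal is resolved against "+ 14", i.e. as an integer

SELECT '04/30/2010'::date + 14;
  ?column?
------------
 2010-05-14
(1 row)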
Re: [GENERAL] using between with dates
On Thursday 29 April 2010 6:58:26 am Geoffrey wrote: > I'm trying the following: > > ship_date between '04/30/2010' AND '04/30/2010' + 14 > > But this returns: > > ERROR: invalid input syntax for integer: "04/30/2010" > > Can I use between with dates? > > -- > Until later, Geoffrey > > "I predict future happiness for America if they can prevent > the government from wasting the labors of the people under > the pretense of taking care of them." > - Thomas Jefferson ship_date between '04/30/2010' AND '04/30/2010'::date + 14 -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] using between with dates
Geoffrey wrote: I'm trying the following: ship_date between '04/30/2010' AND '04/30/2010' + 14 But this returns: ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? Got it: ship_date between '04/30/2010' and timestamp '04/30/2010' + interval '14 day' -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] using between with dates
I'm trying the following: ship_date between '04/30/2010' AND '04/30/2010' + 14 But this returns: ERROR: invalid input syntax for integer: "04/30/2010" Can I use between with dates? -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Tsearch not searching 'Y'
"sandeep prakash dhumale" writes: > I am trying to get tsearch working for my application but I am facing a > problem when alphabet 'Y' is the in the tsquery. > # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); > ?column? > -- > f > (1 row) You can't use to_tsquery for this sort of thing, because it tries to normalize the given words: regression=# select to_tsquery('holly:*'); to_tsquery 'holli':* (1 row) If you do this it works: regression=# SELECT 'hollywood'::tsvector @@ 'holly:*'::tsquery; ?column? -- t (1 row) So if you want to use prefix matching, don't normalize. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Writing SRF
Jorge Arevalo writes: > Yes. For example, the function expects 2 arguments, and it's called > with 2 arguments: 1 composite type (following this format > https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat) > and one integer. But PG_NARGS() returns a really big value (16297) > when I first check the number of arguments at the beginning of the > function. Does that make sense? Given only that data point, I would guess that you forgot to mark the function as being called with V1 protocol (PG_FUNCTION_INFO_V1). regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
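For reference, the SQL-level declaration that pairs with a V1 C function looks roughly like this (the module, function and type names below are hypothetical):

CREATE OR REPLACE FUNCTION my_srf(my_composite, integer)
    RETURNS SETOF record
    AS 'my_module', 'my_srf'
    LANGUAGE C IMMUTABLE STRICT;
-- The C file must contain, before the function definition:
--   PG_FUNCTION_INFO_V1(my_srf);
-- Without it the backend assumes the old V0 calling convention, and the
-- V1 argument-access macros (including PG_NARGS()) read garbage.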
Re: [GENERAL] [SQL] Tsearch not searching 'Y'
On Thu, Apr 29, 2010 at 01:13:40PM -, sandeep prakash dhumale wrote: > Hello All, > > I am trying to get tsearch working for my application but I am facing a > problem when alphabet 'Y' is in the tsquery. > > Can anyone please shed some light on it. > > > # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); > ?column? > -- > f > (1 row) > > SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); > ?column? > -- > t > (1 row) > > > It works when I put <> in y as below but I don't want to do it that way. > > SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); > ?column? > -- > t > > Thanks in advance That is because the to_tsquery() normalizes the tokens. Here is what I get from the default configuration: db=# select to_tsquery('holly:*'); to_tsquery 'holli':* (1 row) db=# select to_tsquery('holl:*'); to_tsquery 'holl':* (1 row) It is pretty easy to see why you see the behavior that you do. Maybe you need to change your tsearch configuration to match what you expect to happen. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
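As a rough sketch of the kind of configuration change meant here, one can copy the built-in english configuration and map word tokens to the simple dictionary, which does no stemming (the configuration name simple_prefix is made up):

CREATE TEXT SEARCH CONFIGURATION simple_prefix ( COPY = english );
ALTER TEXT SEARCH CONFIGURATION simple_prefix
    ALTER MAPPING FOR asciiword, word WITH simple;

SELECT to_tsquery('simple_prefix', 'holly:*');
 to_tsquery
------------
 'holly':*
(1 row)

Note the tsvector side has to be built without stemming as well (to_tsvector('simple_prefix', ...)), or cast directly as Tom shows elsewhere in the thread, for prefix matches to behave consistently.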
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
On 29/04/2010 9:23 PM, ashish.a...@sungard.com wrote: Hi Craig, Sorry for creating confusion. Let me (I work with Ambarish, the original author of the mail) try to be more specific now. We have a library (written in C) which helps us in doing phonetic based name search. We want to use this library inside a postgres DB function. To achieve this we wrote a small C code (we referred as wrapper class) which uses the library. This C code is an ECPG which is bundled as a dll and placed in postgres's lib dir. OK, that makes sense - though I'm not sure your use of ecpg in that role does. I haven't worked with ecpg much at all, but I didn't realise it was capable of being used as a tool for server backend functions. Are you using EXEC SQL CONNECT TO in the ecpg code? Or is there some other way of using ECPG embeded in a backend that I don't know about? I don't see anything in: http://www.postgresql.org/docs/8.4/static/ecpg.html but I haven't gone through it in detail. The usual way to write a PostgreSQL backend function in C is using the server extension interfaces: http://www.postgresql.org/docs/8.4/static/extend.html http://www.postgresql.org/docs/8.4/static/xfunc-c.html and, if you need to execute SQL from within your C code, the Server Programming Interface: http://www.postgresql.org/docs/8.4/static/spi.html Your code is generally compiled using pgxs. The original postgres function is supposed to be called from a java program using JDBC. And the postgres function should call the C function of the wrapper class. That makes sense. At runtime we observed that when the postgres DB function calls the C function of the wrapper class (in the dll), the java heap memory start increasing and reached to the max level resulted in crashing of JVM. OK, so the earlier statement that made it sound like you were calling a DLL from the Java runtime: "Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java." meant nothing of the sort, and your Java code (that's running out of memory) is really only using JDBC? Then we commented out the call to ECPG C function from postgres DB function and realized that everything went well. ... but the amount of data returned from your function call changed lots too, right? We were surprised why the loading and execution of the ECPG is taking JVM memory. I doubt it is. I suspect you're just seeing memory used by a large result set. Consider using a cursor. See the JDBC manual on handling large result sets. The JVM should *not* crash if it runs out of memory due to JDBC using too much, though. Any crash really should be reported directly to sun. If the JVM crashes it saves some error logs and reports the crash on the text console so you can send them off to Sun for analysis. If you don't actually mean that the JVM crashes at all, and actually mean "my program throws an OutOfMemoryError" ... then yes, that's the expected behaviour when you try to use too much memory loading a big result set. This is NOT a jvm crash, the jvm is doing exactly what it's supposed to do. Describing this as "crashing of JVM" is very misleading if this is actually what's happening. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
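For the large-result-set theory, the server-side cursor pattern referred to above looks like this in plain SQL (the table name is a placeholder):

BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
FETCH FORWARD 1000 FROM big_cur;
-- repeat the FETCH until it returns no rows
CLOSE big_cur;
COMMIT;

With the PostgreSQL JDBC driver the same effect is obtained by turning autocommit off and calling setFetchSize() on the statement, so the driver fetches rows in batches instead of materializing the entire result set in the Java heap.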
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
Hi Craig, Sorry for creating confusion. Let me (I work with Ambarish, the original author of the mail) try to be more specific now. We have a library (written in C) which helps us in doing phonetic based name search. We want to use this library inside a postgres DB function. To achieve this we wrote a small C code (we referred as wrapper class) which uses the library. This C code is an ECPG which is bundled as a dll and placed in postgres's lib dir. The original postgres function is supposed to be called from a java program using JDBC. And the postgres function should call the C function of the wrapper class. At runtime we observed that when the postgres DB function calls the C function of the wrapper class (in the dll), the java heap memory start increasing and reached to the max level resulted in crashing of JVM. Then we commented out the call to ECPG C function from postgres DB function and realized that everything went well. We were surprised why the loading and execution of the ECPG is taking JVM memory. Ideally it should use the postgres.exe memory and CPU utilization. We observed all these using windows task manager. I hope this will help you in understanding our problem. Thanks and Regards, Ashish Arya -Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Craig Ringer Sent: Thursday, April 29, 2010 6:34 PM To: Bhattacharya, A Cc: pgsql-general@postgresql.org Subject: Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library On 29/04/2010 8:48 PM, a.bhattacha...@sungard.com wrote: > Your understanding is slightly incorrect. Actually we required to uses a > special library from postgres. That mystery library being? From "postgres"? Do you mean a library supplied by the PostgreSQL project its self? Libpq? If not, what? C'mon, specifics. Please. Try to read your message after you have written it while pretending you are somebody who does not work with your code and does not know the things you know about it. Try to notice assumptions and clarify them. > For this we write a wrapper around the > library and composed a dll out of it (which uses ECPG to communicate to > postgres). From the application front (Java) we make a JDBC call to > postgres function. This postgres function then makes call to the dll. This does not make sense. Your earlier mail said that the DLL is loaded into the Java runtime. How does a "postgres function" make a call to a DLL running in the JRE? You can't make a call from the postgresql backend to code on a client connected to ecpg. Please be SPECIFIC. What is a "postgres function?" How does it "make a call to the dll?" Etc. Provide links to PostgreSQL and Java documentation for the APIs you use, use the precise terms for those APIs, and generally be SPECIFIC and DETAILED. Post diagrams or code samples if you have to! > But the strange thing we observed that at runtime, somehow the execution > of dll is attached with java.exe. In other words, when we monitor the > performance of the application we realized that the execution of dll is > consuming java heap memory and is shown as a thread in java.exe process. > We expected that it should be linked with postgres.exe. Ah, ok, so you are TRYING to load code a C extension to PostgreSQL into the server backend? If so, why are you using ECPG? Or are you in fact *not* using ECPG, but the PostgreSQL backend extension interface? How is the dll "shown as a thread in [the] java.exe process" ? Using a Java debugger attached to the JRE? 
Using Process Explorer or a C-level debugger examining the loaded DLL list of the JRE? Using the Java monitoring APIs? I really do not understand what you are doing, or what you are trying to do. I doubt anybody can help you without a much more detailed and specific explanation of what you're doing. I think it'd be a really good idea for you to write a minimalist test case for this and post it. Either you'll figure out what's wrong in the process of making the test case, or you'll have something to post that people can play with to see what you are trying to do. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Tsearch not searching 'Y'
Hello All, I am trying to get tsearch working for my application but I am facing a problem when alphabet 'Y' is in the tsquery. Can anyone please shed some light on it. # SELECT 'hollywood'::tsvector @@ to_tsquery('holly:*'); ?column? -- f (1 row) SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); ?column? -- t (1 row) It works when I put <> in y as below but I don't want to do it that way. SELECT 'hollywood'::tsvector @@ to_tsquery('holl:*'); ?column? -- t Thanks in advance
Re: [GENERAL] Writing SRF
On Wed, Apr 28, 2010 at 10:43 PM, Tom Lane wrote: > Jorge Arevalo writes: >> My doubt is if I'm doing things right getting all the stuff I need (an >> array) in the first call, pointing user_fctx to this array and >> accessing myStructsArray[call_cntr] in each successive call, until >> myStructsArray + call_cntr == NULL (last array element?). > > Sounds reasonable enough. Is it not working for you? Maybe you need > to be careful about which memory context the array is created in. > > regards, tom lane > Yes. For example, the function expects 2 arguments, and it's called with 2 arguments: 1 composite type (following this format https://svn.osgeo.org/postgis/spike/wktraster/doc/RFC1-SerializedFormat) and one integer. But PG_NARGS() returns a really big value (16297) when I first check the number of arguments at the beginning of the function. Does that make sense? The array, if I'm doing things right, is created in the context pointed to by fcinfo->flinfo->fn_mcxt. But I'd like to solve "silly" things like the previous one first, to be sure I'm doing things right in general. Now at least I know it sounds reasonable :-) Many thanks! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Performance and Clustering
On 29 Apr 2010, at 3:08, Jaime Rodriguez wrote: > hi, > Today is my first day looking at PostgreSQL > I am looking to migrate a MS SQL DB to PostgreSQL :) :) > My customer requires that DBMS shall support 4000 simultaneous requests > Also the system to be deploy maybe a cluster, with 12 microprocessors > > From what I have read, PostgreSQL has really good performance and reliability > but I would like to get some numbers, not sure if somewhere in the wiki some > of this data is available. Are you looking at PostgreSQL on Windows or on a UNIX or UNIX-based OS? The reason I'm asking is that Postgres doesn't perform at its best on Windows and I seriously wonder whether the OS would be able to handle a load like that at all (can Windows handle 4000 open sockets for example?). Other database solutions on Windows will probably have similar issues, so this is not a reason to base your choice of database on - it is IMHO something that you should look into. OTOH, changing both the database and the OS is a big change. For example, most UNIX-es by default use a case-sensitive file system, whereas Windows does not. That said, for both you'll certainly have to make lots of changes in your application, so combining the two and doing that only once may be preferable. If you're thinking of going that way I'd suggest FreeBSD or Solaris, but Linux is a popular choice (as is Windows, for that matter). Alban Hertroys -- Screwing up is an excellent way to attach something to the ceiling. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
On 29/04/2010 8:48 PM, a.bhattacha...@sungard.com wrote: Your understanding is slightly incorrect. Actually we required to uses a special library from postgres. That mystery library being? From "postgres"? Do you mean a library supplied by the PostgreSQL project its self? Libpq? If not, what? C'mon, specifics. Please. Try to read your message after you have written it while pretending you are somebody who does not work with your code and does not know the things you know about it. Try to notice assumptions and clarify them. For this we write a wrapper around the library and composed a dll out of it (which uses ECPG to communicate to postgres). From the application front (Java) we make a JDBC call to postgres function. This postgres function then makes call to the dll. This does not make sense. Your earlier mail said that the DLL is loaded into the Java runtime. How does a "postgres function" make a call to a DLL running in the JRE? You can't make a call from the postgresql backend to code on a client connected to ecpg. Please be SPECIFIC. What is a "postgres function?" How does it "make a call to the dll?" Etc. Provide links to PostgreSQL and Java documentation for the APIs you use, use the precise terms for those APIs, and generally be SPECIFIC and DETAILED. Post diagrams or code samples if you have to! But the strange thing we observed that at runtime, somehow the execution of dll is attached with java.exe. In other words, when we monitor the performance of the application we realized that the execution of dll is consuming java heap memory and is shown as a thread in java.exe process. We expected that it should be linked with postgres.exe. Ah, ok, so you are TRYING to load code a C extension to PostgreSQL into the server backend? If so, why are you using ECPG? Or are you in fact *not* using ECPG, but the PostgreSQL backend extension interface? How is the dll "shown as a thread in [the] java.exe process" ? Using a Java debugger attached to the JRE? Using Process Explorer or a C-level debugger examining the loaded DLL list of the JRE? Using the Java monitoring APIs? I really do not understand what you are doing, or what you are trying to do. I doubt anybody can help you without a much more detailed and specific explanation of what you're doing. I think it'd be a really good idea for you to write a minimalist test case for this and post it. Either you'll figure out what's wrong in the process of making the test case, or you'll have something to post that people can play with to see what you are trying to do. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Convert of string to array problem
On Thu, Apr 29, 2010 at 8:52 AM, Thom Brown wrote: > On 29 April 2010 13:35, Bård Grønbech wrote: >> >> Have a string like '0.1;0.2;null;0.3' which I would like to convert >> into a double precision[] array. >> >> Trying: >> >> select cast (string_to_array('0.1;0.2;null;0.3', ';') as float8[]) >> >> gives me an error: invalid input syntax for type double precision: "null". >> >> Can anybody help me? >> >> -Bård >> > > I believe string_to_array will take that null as a string called 'null', > which it can't convert to a float8. select array(select case when n='null' then null else n end from unnest(string_to_array('0.1;0.2;null;0.3', ';')) as n); ?column? {0.1,0.2,NULL,0.3} merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
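To land on the float8[] the original post asked for, the same trick works with a cast added (a sketch of the obvious extension):

SELECT ARRAY(
    SELECT CASE WHEN n = 'null' THEN NULL ELSE n END::float8
    FROM unnest(string_to_array('0.1;0.2;null;0.3', ';')) AS n
);
      ?column?
--------------------
 {0.1,0.2,NULL,0.3}
(1 row)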
Re: [GENERAL] Problem: concat an array of arrays
On Thu, Apr 29, 2010 at 8:46 AM, Merlin Moncure wrote: > On Wed, Apr 28, 2010 at 8:48 PM, Belka Lambda wrote: >> Hi! >> >> I tried to write a recursive SELECT, that would do the concatination, but a >> problem appeared: >> can't make a {1,2,3} from {{1,2,3}}. >> Here are some experiments: >> --- >> postgres=# select >> array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]]; >> >> array >> -- >> {{1,2,3},{4,5,6},{7,8,9},{10,11,12}} >> (1 row) >> >> >> postgres=# select >> (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] >> )[3]; >> array >> --- >> >> (1 row) >> >> >> postgres=# select >> (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] >> )[3:3]; >> array >> --- >> {{7,8,9}} >> (1 row) >> >> >> postgres=# select >> (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] >> )[3][1]; >> array >> --- >> 7 >> (1 row) >> --- >> >> The original query, that would do the concatenation: >> --- >> WITH RECURSIVE unfold (rest, accum) AS ( >> VALUES ($1 :: int[][], ARRAY[] :: int[]) >> UNION ALL >> SELECT u.rest[2:array_length(u.rest, 1)] AS rest, >> array_cat(u.rest[1], u.accum) AS accum >> FROM unfold AS u >> WHERE array_length(u.rest, 1) > 0 >> ) >> SELECT u.accum >> FROM unfold AS u >> WHERE array_length(u.rest, 1) = 0; >> --- >> Throws an error: >> ERROR: function array_cat(integer, integer[]) does not exist > > array_cat requires too array arguments. you could rewrite your expression to > array_cat(array[u.rest[1], u.accum) > (i think, not quite sure what you are trying to do). > > you can append scalars to arrays with the || operator: > select array[1,2,3] || 4; > ?column? > --- > {1,2,3,4} > > > you can kinda sorta slice an array using the slice method: > select (array[array[1,2,3], array[2,4,6]])[1:1]; > array > --- > {{1,2,3}} > > what are you trying to do w/unfold function exactly? hm. the basic problem is that it's difficult to slide arrays up/down dimensions. you can move from scalars to arrays and arrays to scalars, but not from dimension N to N-1 etc. you can however move from dimension 'N' to 1: create or replace function restack(_array anyarray) returns anyarray as $$ select array(select unnest($1)); $$ language sql immutable; select restack(array[1,2,3]); restack - {1,2,3} select restack(array[array[1,2,3]]); restack - {1,2,3} merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Convert of string to array problem
On 29 April 2010 13:35, Bård Grønbech wrote: > Have a string like '0.1;0.2;null;0.3' which I would like to convert > into a double precision[] array. > > Trying: > > select cast (string_to_array('0.1;0.2;null;0.3', ';') as float8[]) > > gives me an error: invalid input syntax for type double precision: "null". > > Can anybody help me? > > -Bård > > I believe string_to_array will take that null as a string called 'null', which it can't convert to a float8. Regards Thom
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
-Original Message- From: Craig Ringer [mailto:cr...@postnewspapers.com.au] Sent: Thursday, April 29, 2010 6:02 PM To: Bhattacharya, A Cc: pgsql-general@postgresql.org Subject: Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library On 29/04/2010 7:34 PM, a.bhattacha...@sungard.com wrote: > We have a java exe making a call to a postgres function. This postgres > function internally makes a call to a dll (which is written using > Postgres extended C). If I understand you correctly, you have a dll that uses ecpg to communicate with postgresql. It is loaded by a JVM via JNI, and that JVM then uses JNI calls to use your ecpg-based DLL to talk to the database. If I've understood you right: Why this strange architecture? Usually Java apps just talk to the database via JDBC. As far as I know there's no particular reason you *can't* do calls to your ecpg-using dll via JNI, though. I'd be surprised if your problems didn't turn out to be misuse/misunderstanding of the notoriously-hard-to-get-right JNI interface, or issues with your DLL and/or its use of ecpg. If I have NOT understood how you are doing things correctly, then please be MUCH MORE SPECIFIC. Making people guess what you're doing doesn't help you get accurate, useful answers. I strongly recommend that you try to produce a self-contained test case. Write a minimalist ecpg dll that does only one simple thing. Write a minimalist, simple JNI-using Java program that loads and uses that DLL. See if it leaks memory. If this minimalist test case demonstrates the leak, post the source code to BOTH the dll and your simple Java program here. Make sure both can be compiled without extra libraries and include any project files etc required to compile them. If your new test case doesn't leak, then you know the basic JNI + ECPG combination isn't the issue. You need to start investigating what in your code is causing the leak. If you just can't figure it out, then start chopping things out and disabling things until the leak stops happening to narrow it down. Without a lot more detail than you have provided, it is unlikely that anybody here can help you. > Now the issue is that, when we make a call to this dll, it consumes a > lot of memory and this memory is getting consumed from the heap space of > the original java process causing an out of memory exception in Java. > > Is this a known issue. I rather doubt that anybody else anywhere in the world is doing what you're doing ;-) Not as far as I know it's not, no. Nor have you provided any evidence it's an issue with PostgreSQL not your own code yet. > Do we have a way to keep these processes disjoint > and not eat up heap space of the original process? http://google.com/search?q=inter+process+communication ... but you should probably fix the problem that's causing the excessive memory use/leakage rather than just working around it by running the code out-of-process. -- Craig Ringer Thanks Craig for your comment. Your understanding is slightly incorrect. Actually we required to uses a special library from postgres. For this we write a wrapper around the library and composed a dll out of it (which uses ECPG to communicate to postgres). From the application front (Java) we make a JDBC call to postgres function. This postgres function then makes call to the dll. But the strange thing we observed that at runtime, somehow the execution of dll is attached with java.exe. 
In other words, when we monitor the performance of the application we realized that the execution of dll is consuming java heap memory and is shown as a thread in java.exe process. We expected that it should be linked with postgres.exe. And this incorrect linking resulted in out of heap memory error in java. But when we run the same postgres function using pgAdmin tool, the memory and CPU utilization of the dll is shown attached with the postgres.exe (corresponding to pgAdmin). I hope I am able to explain the issue. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Problem: concat an array of arrays
On Wed, Apr 28, 2010 at 8:48 PM, Belka Lambda wrote: > Hi! > > I tried to write a recursive SELECT, that would do the concatination, but a > problem appeared: > can't make a {1,2,3} from {{1,2,3}}. > Here are some experiments: > --- > postgres=# select > array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]]; > > array > -- > {{1,2,3},{4,5,6},{7,8,9},{10,11,12}} > (1 row) > > > postgres=# select > (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] > )[3]; > array > --- > > (1 row) > > > postgres=# select > (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] > )[3:3]; > array > --- > {{7,8,9}} > (1 row) > > > postgres=# select > (array[array[1,2,3],array[4,5,6],array[7,8,9],array[10,11,12]] > )[3][1]; > array > --- > 7 > (1 row) > --- > > The original query, that would do the concatenation: > --- > WITH RECURSIVE unfold (rest, accum) AS ( > VALUES ($1 :: int[][], ARRAY[] :: int[]) > UNION ALL > SELECT u.rest[2:array_length(u.rest, 1)] AS rest, array_cat(u.rest[1], > u.accum) AS accum > FROM unfold AS u > WHERE array_length(u.rest, 1) > 0 > ) > SELECT u.accum > FROM unfold AS u > WHERE array_length(u.rest, 1) = 0; > --- > Throws an error: > ERROR: function array_cat(integer, integer[]) does not exist array_cat requires two array arguments. you could rewrite your expression to array_cat(array[u.rest[1]], u.accum) (i think, not quite sure what you are trying to do). you can append scalars to arrays with the || operator: select array[1,2,3] || 4; ?column? --- {1,2,3,4} you can kinda sorta slice an array using the slice method: select (array[array[1,2,3], array[2,4,6]])[1:1]; array --- {{1,2,3}} what are you trying to do w/unfold function exactly? merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Convert of string to array problem
Have a string like '0.1;0.2;null;0.3' which I would like to convert into a double precision[] array. Trying: select cast (string_to_array('0.1;0.2;null;0.3', ';') as float8[]) gives me an error: invalid input syntax for type double precision: "null". Can anybody help me? -Bård -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: FW: [GENERAL] Java Memory Issue while Loading Postgres library
On 29/04/2010 7:34 PM, a.bhattacha...@sungard.com wrote: We have a java exe making a call to a postgres function. This postgres function internally makes a call to a dll (which is written using Postgres extended C). If I understand you correctly, you have a dll that uses ecpg to communicate with postgresql. It is loaded by a JVM via JNI, and that JVM then uses JNI calls to use your ecpg-based DLL to talk to the database. If I've understood you right: Why this strange architecture? Usually Java apps just talk to the database via JDBC. As far as I know there's no particular reason you *can't* do calls to your ecpg-using dll via JNI, though. I'd be surprised if your problems didn't turn out to be misuse/misunderstanding of the notoriously-hard-to-get-right JNI interface, or issues with your DLL and/or its use of ecpg. If I have NOT understood how you are doing things correctly, then please be MUCH MORE SPECIFIC. Making people guess what you're doing doesn't help you get accurate, useful answers. I strongly recommend that you try to produce a self-contained test case. Write a minimalist ecpg dll that does only one simple thing. Write a minimalist, simple JNI-using Java program that loads and uses that DLL. See if it leaks memory. If this minimalist test case demonstrates the leak, post the source code to BOTH the dll and your simple Java program here. Make sure both can be compiled without extra libraries and include any project files etc required to compile them. If your new test case doesn't leak, then you know the basic JNI + ECPG combination isn't the issue. You need to start investigating what in your code is causing the leak. If you just can't figure it out, then start chopping things out and disabling things until the leak stops happening to narrow it down. Without a lot more detail than you have provided, it is unlikely that anybody here can help you. Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java. Is this a known issue. I rather doubt that anybody else anywhere in the world is doing what you're doing ;-) Not as far as I know it's not, no. Nor have you provided any evidence it's an issue with PostgreSQL not your own code yet. Do we have a way to keep these processes disjoint and not eat up heap space of the original process? http://google.com/search?q=inter+process+communication ... but you should probably fix the problem that's causing the excessive memory use/leakage rather than just working around it by running the code out-of-process. -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cumulative count (running total) window fn
On Thu, Apr 29, 2010 at 13:43, Oliver Kohll - Mailing Lists wrote: >> >> Aren't you looking for something along the line of: >> >> SELECT year, sum(c) over (order by year) >> FROM ( >> SELECT extract(year from signup_date) AS year, count(email_address) AS c >> FROM email_list >> GROUP BY extract(year from signup_date) >> ) >> >> (adjust for typos, I didn't test it) >> >> Yes that does work thanks, if you give the subquery a name. I'd still like >> to know if it's possible to do with a window function rather than a >> subquery. >> Oliver Kohll > > Like this?: > > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM > email_list GROUP BY 1 ORDER BY 1; > > Thom > > Almost, but put me on the right track! This one is exactly what I'm looking > for: > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY extract(year from signup_date)) > FROM email_list GROUP BY 1 ORDER BY 1; > The ORDER BY count(email_address) did give the same results for my data but > only because the count values just happen to give the same ordering as the > years - I tested by changing some dates. > Many thanks all. Curious note - how does the non-subselect version and the subselect version compare performance-wise? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cumulative count (running total) window fn
>> >> Aren't you looking for something along the line of: >> >> SELECT year, sum(c) over (order by year) >> FROM ( >> SELECT extract(year from signup_date) AS year, count(email_address) AS c >> FROM email_list >> GROUP BY extract(year from signup_date) >> ) >> >> (adjust for typos, I didn't test it) > > Yes that does work thanks, if you give the subquery a name. I'd still like to > know if it's possible to do with a window function rather than a subquery. > > Oliver Kohll > > > Like this?: > > SELECT extract(year from signup_date), count(email_address), > sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM > email_list GROUP BY 1 ORDER BY 1; > > Thom Almost, but put me on the right track! This one is exactly what I'm looking for: SELECT extract(year from signup_date), count(email_address), sum(count(email_address)) OVER (ORDER BY extract(year from signup_date)) FROM email_list GROUP BY 1 ORDER BY 1; The ORDER BY count(email_address) did give the same results for my data but only because the count values just happen to give the same ordering as the years - I tested by changing some dates. Many thanks all. Oliver
FW: [GENERAL] Java Memory Issue while Loading Postgres library
Dear All Experts, We have a java exe making a call to a postgres function. This postgres function internally makes a call to a dll (which is written using Postgres extended C). Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java. Is this a known issue. Do we have a way to keep these processes disjoint and not eat up heap space of the original process? Please provide your suggestions/advice on the same. Sorry, I missed a few details: I am using PostgreSQL 8.3.5 on Windows XP with 4GB of RAM, a 160GB hard drive, and a Core 2 Duo 6300 CPU @ 1.86 GHz.
[GENERAL] How to monitor Parallel pg_restore ?
Hi All, I am using Postgres 8.4 with the pg_restore -j option. I have a dump of the database taken with -Fc and chose pg_restore -j for faster restoration. When the restoration process is in progress, I want to monitor the threads invoked by pg_restore (say, if I give -j 4). I have checked pg_stat_activity, in which I see only one running transaction, a COPY command. Even the "top" command hasn't shown anything. Could you please assist me with this? Regards Raghavendra
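For reference, the kind of check described above, against 8.4's pg_stat_activity, would be something like:

SELECT procpid, waiting, current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>';

Each -j worker opens its own database connection, so while the parallel data phase is running one would expect to see up to four busy rows here rather than one.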
Re: [GENERAL] Cumulative count (running total) window fn
On 29 April 2010 11:39, Oliver Kohll - Mailing Lists < oliver.li...@gtwm.co.uk> wrote: > > On 29 Apr 2010, at 10:01, Magnus Hagander wrote: > > > select extract(year from signup_date), > > count(email_address), > > sum(count(email_address)) over (partition by 1 order by 1 asc rows > unbounded preceding) > > from email_list group by 1 order by 1; > > > Does anyone have any other ideas? > > > Aren't you looking for something along the line of: > > SELECT year, sum(c) over (order by year) > FROM ( > SELECT extract(year from signup_date) AS year, count(email_address) AS c > FROM email_list > GROUP BY extract(year from signup_date) > ) > > (adjust for typos, I didn't test it) > > > Yes that does work thanks, if you give the subquery a name. I'd still like > to know if it's possible to do with a window function rather than a > subquery. > > Oliver Kohll > Like this?: SELECT extract(year from signup_date), count(email_address), sum(count(email_address)) OVER (ORDER BY count(email_address)) FROM email_list GROUP BY 1 ORDER BY 1; Thom
Re: [GENERAL] Cumulative count (running total) window fn
On 29 Apr 2010, at 10:01, Magnus Hagander wrote: >> >> select extract(year from signup_date), >> count(email_address), >> sum(count(email_address)) over (partition by 1 order by 1 asc rows >> unbounded preceding) >> from email_list group by 1 order by 1; >> >> Does anyone have any other ideas? > > Aren't you looking for something along the line of: > > SELECT year, sum(c) over (order by year) > FROM ( > SELECT extract(year from signup_date) AS year, count(email_address) AS c > FROM email_list > GROUP BY extract(year from signup_date) > ) > > (adjust for typos, I didn't test it) Yes that does work thanks, if you give the subquery a name. I'd still like to know if it's possible to do with a window function rather than a subquery. Oliver Kohll
[GENERAL] Java Memory Issue while Loading Postgres library
Dear All Experts, We have a java exe making a call to a postgres function. This postgres function internally makes a call to a dll (which is written using Postgres extended C). Now the issue is that, when we make a call to this dll, it consumes a lot of memory and this memory is getting consumed from the heap space of the original java process causing an out of memory exception in Java. Is this a known issue. Do we have a way to keep these processes disjoint and not eat up heap space of the original process? Please provide your suggestions/advice on the same.
Re: [GENERAL] Inheritance efficiency
2010/4/26 Vincenzo Romano : > 2010/4/26 Bruce Momjian : >> Vincenzo Romano wrote: >>> Hi all. >>> >>> I'm wondering how efficient the inheritance can be. >>> I'm using the constraint exclusion feature and for each child table >>> (maybe but one) I have a proper CHECK constraint. >>> How efficient can the query planner be in choosing the right child >>> tables in the case of, say, thousands of them? >>> Would the selection process behave linearly, logarithmically or what? >> >> It is fine for dozens of child tables, but not thousands; it does need >> improvement. > > This sounds like "linear" algorithms. Doesn't it? > >>> And now it comes to my mind the same question for partial indexes. >>> That is, if I had a lot (really a lot) of small partial indexes over a >>> very large table, how efficient can the query planner be >>> in selecting the right indexes? > > No info about this point (partial indexes)? > Is also this geared with linear algorithms ? Should I move to an "enterprise grade" version of PostgreSQL? -- Vincenzo Romano NotOrAnd Information Technologies NON QVIETIS MARIBVS NAVTA PERITVS -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [SPAM] Re: [GENERAL] Best way to replicate to large number of nodes
2010/4/22 Brian Peschel : > > On 04/22/2010 10:12 AM, Ben Chobot wrote: >> >> On Apr 21, 2010, at 1:41 PM, Brian Peschel wrote: >> >> >>> >>> I have a replication problem I am hoping someone has come across before >>> and can provide a few ideas. >>> >>> I am looking at a configuration of one 'writable' node and anywhere from >>> 10 to 300 'read-only' nodes. Almost all of these nodes will be across a WAN >>> from the writable node (some over slow VPN links too). I am looking for a >>> way to replicate as quickly as possible from the writable node to all the >>> read-only nodes. I can pretty much guarantee the read-only nodes will never >>> become master nodes. Also, the updates to the writable node are bunched and >>> at known times (ie only updated when I want it updated, not constant >>> updates), but when changes occur, there are a lot of them at once. >>> >> >> Two things you didn't address are the acceptable latency of keeping the >> read-only nodes in sync with the master - can they be different for a day? A >> minute? Do you need things to stay synchronous? Also, how big is your >> dataset? A simple pg_dump and some hot scp action after you batched updates >> might be able to solve your problem. > > Latency is important. I would say 10 to 15 minutes max, but the shorter the > better. I don't have an exact size, but I believe the entire DB is about 10 > gig. > > We had an idea of having our apps write the SQL statements to a file, > rather than using an ODBC driver to directly change the DBs. Then we could > scp/rsync the files to the remote machines and execute them there. This > just seems like a very manual process though. You need to have a look at PgQ. (in short, skytools will do exactly what you want if I understand your requirements correctly, londiste being somewhat like slony) > > - Brian > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Cédric Villemain -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Storing many big files in database- should I do it?
2010/4/28 Adrian Klaver : > On Tuesday 27 April 2010 5:45:43 pm Anthony wrote: >> On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain < >> >> cedric.villemain.deb...@gmail.com> wrote: >> > store your files in a filesystem, and keep the path to the file (plus >> > metadata, acl, etc...) in the database. >> >> What type of filesystem is good for this? A filesystem with support for >> storing tens of thousands of files in a single directory, or should one >> play the 41/56/34/41563489.ext game? I'd prefer to go with XFS or ext{3-4}, in both cases with a path game. Your path game will let you handle the scalability of your uploads (so the first increment is the first directory), something like 1/2/3/4/foo.file 2/2/3/4/bar.file etc... You might explore a hash function or something that splits a SHA1 (or other) sum of the file to get the path. >> >> Are there any open source systems which handle keeping a filesystem and >> database in sync for this purpose, or is it a wheel that keeps getting >> reinvented? >> >> I know "store your files in a filesystem" is the best long-term solution. >> But it's just so much easier to just throw everything in the database. > > In the for-what-it's-worth department, check out this Wiki: > http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems and postgres fuse also :-D > > -- > Adrian Klaver > adrian.kla...@gmail.com -- Cédric Villemain -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
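As a concrete sketch of that hash-split idea in SQL (md5() stands in for SHA1 here, and the file key is a placeholder):

SELECT substring(h FROM 1 FOR 2)
       || '/' || substring(h FROM 3 FOR 2)
       || '/' || h || '.file' AS path
FROM (SELECT md5('some-file-key') AS h) AS t;
-- yields a two-level path of the form ab/cd/abcdef....file,
-- spreading files evenly over 256 * 256 directories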
Re: [GENERAL] Cumulative count (running total) window fn
On Thu, Apr 29, 2010 at 10:52, Oliver Kohll - Mailing Lists wrote: > Hello, > > Many thanks to andreas.kretschmer for this helpful reply about how to set up > a window function to perform a running total: > http://archives.postgresql.org/pgsql-general/2010-03/msg01122.php > > It works perfectly with the simple test data but I've just got back to work, > tried implementing it on my live data and the results are slightly different. > My query is almost exactly the same - I've simplified by grouping by year > only rather than year and month: > > select extract(year from signup_date), > count(email_address), > sum(count(email_address)) over (rows unbounded preceding) > from email_list group by 1 order by 1; > > date_part | count | sum > ---+---+-- > 2007 | 501 | 1374 > 2008 | 491 | 491 > 2009 | 382 | 873 > 2010 | 66 | 1440 > (4 rows) > > What I'm looking for is > date_part | count | sum > ---+---+-- > 2007 | 501 | 501 > 2008 | 491 | 992 > 2009 | 382 | 1374 > 2010 | 66 | 1440 > > It seems to be adding up the counts but not in the right order. > > I've also tried an explicit ORDER BY inside the partition with no difference: > > select extract(year from signup_date), > count(email_address), > sum(count(email_address)) over (partition by 1 order by 1 asc rows unbounded > preceding) > from email_list group by 1 order by 1; > > Does anyone have any other ideas? Aren't you looking for something along the line of: SELECT year, sum(c) over (order by year) FROM ( SELECT extract(year from signup_date) AS year, count(email_address) AS c FROM email_list GROUP BY extract(year from signup_date) ) (adjust for typos, I didn't test it) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Cumulative count (running total) window fn
Hello, Many thanks to andreas.kretschmer for this helpful reply about how to set up a window function to perform a running total: http://archives.postgresql.org/pgsql-general/2010-03/msg01122.php It works perfectly with the simple test data but I've just got back to work, tried implementing it on my live data and the results are slightly different. My query is almost exactly the same - I've simplified by grouping by year only rather than year and month: select extract(year from signup_date), count(email_address), sum(count(email_address)) over (rows unbounded preceding) from email_list group by 1 order by 1; date_part | count | sum ---+---+-- 2007 | 501 | 1374 2008 | 491 | 491 2009 | 382 | 873 2010 |66 | 1440 (4 rows) What I'm looking for is date_part | count | sum ---+---+-- 2007 | 501 | 501 2008 | 491 | 992 2009 | 382 | 1374 2010 |66 | 1440 It seems to be adding up the counts but not in the right order. I've also tried an explicit ORDER BY inside the partition with no difference: select extract(year from signup_date), count(email_address), sum(count(email_address)) over (partition by 1 order by 1 asc rows unbounded preceding) from email_list group by 1 order by 1; Does anyone have any other ideas? Regards Oliver Kohll oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll www.agilebase.co.uk - software www.gtwm.co.uk - company -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Start-up script for few clusters: just add water?
Guillaume Lelarge wrote on 28/04/2010 15:04:07: > > In such case the newly created start-up script postgresql2 should not be > > modified in the following line: > > > > # Override defaults from /etc/sysconfig/pgsql if file is present > > [ -f /etc/sysconfig/pgsql/${NAME} ] && . /etc/sysconfig/pgsql/${NAME} > > > > export PGDATA > > export PGPORT > > > > Or it will automatically match-up names from both directories, i.e.: > > /etc/init.d/postgresql will run /etc/sysconfig/pgsql/postgresql? > > > > /etc/init.d/postgresql2 will run /etc/sysconfig/pgsql/postgresql2 > > > > The latter. It will automatically match the script file name and the > config file name. Thanks again. By the way, if I want to start instances with a few optional arguments, such as -S or -B, where can I include these? In the same config file, i.e. /etc/sysconfig/pgsql/postgresql (speaking of the RedHat directory structure)? Pete
Re: [GENERAL] Need magic that checks if a collumn contains values that CAN NOT be converted to Latin1
On Thu, Apr 29, 2010 at 05:02, Andreas wrote: > Hi, > > I've got an 8.4.3 Unicode DB that accidentally holds a few records with > characters that can't be converted to Latin1 or 9 for output to CSV. > > I'd just need a way to check if a collumn contains values that CAN NOT be > converted from Utf8 to Latin1 to select all those affected records. > > I tried: > Select convert_to (my_column::text, 'LATIN1') from my_table; > > It raises an error that says translated: > ERROR: character 0xe28093 in encoding »UTF8« has no equivalent in »LATIN1« > Regrettably it doesn't explain where it found this sign. > > Select '\xe28093' > complains that this weren't a valid UTF8 code at all. > So how was it accepted and stored in the first place? > > When I know which record has faulty content I can correct it. Wrap your check in a simple function: CREATE OR REPLACE FUNCTION public.is_valid_encoding(vtext text, encname text) RETURNS boolean LANGUAGE plpgsql AS $$ BEGIN BEGIN PERFORM convert_to(vtext, encname); EXCEPTION WHEN untranslatable_character THEN RETURN 'f'; END; RETURN 't'; END; $$ And execute SELECT * FROM my_table WHERE NOT is_valid_encoding(my_column, 'LATIN1') -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pg_hba.conf
I had a similar problem: older versions of Postgres have IP addressing in one column and the subnet mask in the next one, while 8.4 uses a CIDR expression in a single column; applying CIDR notation solved my problem. I think it's advisable to manually correct the pg_hba.conf file instead of replacing it with the old configuration file from the older version of Postgres. Cheers, Pete From: Scott Mead To: jkun...@laurcat.com Cc: postgres help Date: 28/04/2010 18:41 Subject: Re: [GENERAL] pg_hba.conf Sent by: pgsql-general-ow...@postgresql.org On Tue, Apr 27, 2010 at 6:42 AM, wrote: I am putting up a new server on version 8.4.3. I copied pg_hba.conf from a running 8.3.6 system, changing only the public IP address for the local machine. I get the error: FATAL: no pg_hba.conf entry for host "209.159.145.248", user "postgres", database "arc" pg_hba.conf contains the line: host all all 209.159.145.248 255.255.255.255 trust Hmm, just for giggles, does it work using CIDR syntax: host all all 209.159.145.248/32 trust ? --Scott -- Scott Mead Principal Systems Engineer EnterpriseDB Corporation The Enterprise Postgres Company Other records work (I can connect from my remote site using pgAdmin, just fine), so I know the file is being read by postgres. Any ideas? Thanks in advance, Jim -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general