Re: [GENERAL] Apache + PHP + Postgres Interaction

2007-08-24 Thread Hannes Dorbath

On 24.08.2007 02:43, Bill Moran wrote:

Hannes Dorbath <[EMAIL PROTECTED]> wrote:

Bill Moran wrote:

I guess I just feel that "broken" is a bit of a harsh term.  If
your expectations are for full-blown connection management from
pconnect(), then you will be disappointed.  If you take it for
what it is: persistent connections, then those limitations would
be expected.

It's broken because persistent connections get randomly garbage
collected where they should not. So broken in the sense of bugged.
Expect connections to die for no reason, especially under load.


It's funny that you should mention that, since I haven't seen that
behaviour in 18 months of load testing over a dozen servers.


Please reply to the list as well.

How did you verify that? It will spawn a new connection silently, if the 
old got dropped. Did you really verify your logs, that you don't get 
more new connections than Apache spawns workers? This might not be 
noticeable for you, if you are running Apache. In a FCGI environment 
where you have a fixed amount of workers, you notice new connections, as 
there should not be any.





--
Regards,
Hannes Dorbath

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Apache + PHP + Postgres Interaction

2007-08-24 Thread Hannes Dorbath

On 24.08.2007 00:43, Hannes Dorbath wrote:

Bill Moran wrote:

I guess I just feel that "broken" is a bit of a harsh term.  If
your expectations are for full-blown connection management from
pconnect(), then you will be disappointed.  If you take it for
what it is: persistent connections, then those limitations would
be expected.


It's broken because persistent connections get randomly garbage
collected where they should not. So broken in the sense of bugged.
Expect connections to die for no reason, especially under load.


How did you verify that? It will spawn a new connection silently, if the 
old got dropped. Did you really verify your logs, that you don't get 
more new connections than Apache spawns workers? This might not be 
noticeable for you, if you are running Apache. In a FCGI environment 
where you have a fixed amount of workers, you notice new connections, as 
there should not be any.



--
Regards,
Hannes Dorbath

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Simple Regex question, hoping for direct answer (no Socratic approach necessary)

2007-08-24 Thread Albe Laurenz
Postgres User wrote:
> I'm new to Regex in Postgres.  Can someone give me a quick pointer on
> how I'd SELECT the substring between   ''and  ''  in
> a field?
> 
> Sample field data:
> address city here Rogers, Jim zip code place
> 
> and I'd like the SELECT to return only:
> Rogers, Jim

SELECT regexp_replace(
  'Sing along with Rogers, Jim',
  '.*(.*?).*',
  E'\\1');

 regexp_replace 

 Rogers, Jim
(1 row)

Yours,
Laurenz Albe

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Undetected corruption of table files

2007-08-24 Thread Albe Laurenz
Tom Lane wrote:
>> - Shouldn't there be an error, some kind of 'missing magic
>>   number' or similar, when a table file consists of only
>>   zeros?
> 
> The particular case of an all-zeroes page is specifically allowed,
> and has to be because it's a valid transient state in various
> scenarios.

I see, that was a pathological case.

>> - Wouldn't it be desirable to have some means to verify the
>>   integrity of a table file or a whole database?
> 
> SELECT * usually does reasonably well at that.

Would it be an option to have a checksum somewhere in each
data block that is verified upon read?

Would this be a lot of work to add? I'd be willing to try it
if it is desirable and feasible.

Yours,
Laurenz Albe

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Alban Hertroys
Joshua D. Drake wrote:
>> I agree with you on the multi-threaded.  I think I will add a note
>> saying the the multi-threaded architecture is only advantageous  on
>> Windows.
> 
> And Solaris.

I'm not entirely sure what makes multi-threading be advantageous on a
specific operating system, but I think FreeBSD should be added to that
list as well... They've been bench marking their threading support using
multi-threading in MySQL (not for the db, mind you - just for load ;),
and it performs really well.

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Dave Page

Alban Hertroys wrote:

Joshua D. Drake wrote:

I agree with you on the multi-threaded.  I think I will add a note
saying the the multi-threaded architecture is only advantageous  on
Windows.

And Solaris.


I'm not entirely sure what makes multi-threading be advantageous on a
specific operating system, but I think FreeBSD should be added to that
list as well... They've been bench marking their threading support using
multi-threading in MySQL (not for the db, mind you - just for load ;),
and it performs really well.



I'm not sure I necessarily agree with those two - we have no real proof 
that a multithreaded architecture would be significantly more efficient 
than a multi process. It certainly wouldn't be as robust as an error in 
one backend thread could bring down the entire server.


Windows is a special case in this regard. The OS has been designed from 
the outset as a threaded environment. The important point is not that 
Windows threads are necessarily any more efficient than their Solaris or 
FreeBSD counterparts, but that the multi-process architecture is alien 
to Windows and is inherently slower. Two of the major bottlenecks we 
have on Windows as a result are backend startup time and shared memory 
access speed - both of which are significantly slower than on *nix.


Regards, Dave

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Local authentication/security

2007-08-24 Thread Richard Huxton

Lange Marcus wrote:

Ok. But apart from the risk with having the physical access to the
computer, would it be a reasonable safe solution to limit the access
with a password then ? So that, assuming a user does not have the
possibility to get any kind of root access, there is no way to read
or copy the database unless you login as the correct user ? And if
only the program knows the password, then the user running the
program will not be able to access the database, except through the
program ?


Make sure that only the postgres user can read/write the database files.
Make sure that only the application has a password to access the database.
Make sure that there aren't any tools that let you trap 
system-calls/dump memory as an unprivileged user.



Is there a way to assure that the database does not get stored on
disk ? If my program, everytime it starts, would create or build up
the database could this database be kept in RAM ? Or does postgres
automatically create a database file or something ? Do you understand
what i mean ?


You could create a ramdisk, then create the database on that. Of course, 
if you lose power then you lose your data. Check the manuals for details 
on "tablespaces" for how to have some DB objects in a different location.



Another question, are md5 passowords supported under windows ?


Any reason why they shouldn't be?

Not that it matters in your case. The password might as well be 
"password" - if they get access to the files/application, it's game over.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Alban Hertroys
Dave Page wrote:
> Alban Hertroys wrote:
>> Joshua D. Drake wrote:
 I agree with you on the multi-threaded.  I think I will add a note
 saying the the multi-threaded architecture is only advantageous  on
 Windows.
>>> And Solaris.
>>
>> I'm not entirely sure what makes multi-threading be advantageous on a
>> specific operating system, but I think FreeBSD should be added to that
>> list as well... They've been bench marking their threading support using
>> multi-threading in MySQL (not for the db, mind you - just for load ;),
>> and it performs really well.
>>
> 
> I'm not sure I necessarily agree with those two - we have no real proof
> that a multithreaded architecture would be significantly more efficient
> than a multi process. It certainly wouldn't be as robust as an error in
> one backend thread could bring down the entire server.
> 
> Windows is a special case in this regard. The OS has been designed from
> the outset as a threaded environment. The important point is not that
> Windows threads are necessarily any more efficient than their Solaris or
> FreeBSD counterparts, but that the multi-process architecture is alien
> to Windows and is inherently slower. Two of the major bottlenecks we
> have on Windows as a result are backend startup time and shared memory
> access speed - both of which are significantly slower than on *nix.
> 
> Regards, Dave

Thanks for explaining (again).

So actually the remark shouldn't be that "the multi-threaded
architecture is only advantageous  on Windows", but more like "the
multi-process architecture is disadvantageous on Windows and hence a
multi-threaded architecture is preferred (on that particular OS)".

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Dave Page

Alban Hertroys wrote:

So actually the remark shouldn't be that "the multi-threaded
architecture is only advantageous  on Windows", but more like "the
multi-process architecture is disadvantageous on Windows and hence a
multi-threaded architecture is preferred (on that particular OS)".


Yeah - but I'm not sure thats necessarily something that should have a 
place on a bullet point comparison.


Regards, Dave

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] "out of memory" error

2007-08-24 Thread Christian Schröder

Martijn van Oosterhout wrote:

You've got it completely wrong.


Hm, you seem to be right. :(
I have now decreased the "shared_buffers" setting to 128 MB. I have also 
found some tuning pages with warnings about not setting the value too 
high. I'm sure that I have read these pages before, but I seem to have 
been blind ...


Ok, many thanks for putting me right!

Regards,
  Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] "out of memory" error

2007-08-24 Thread Mikko Partio
On 8/23/07, Martijn van Oosterhout <[EMAIL PROTECTED]> wrote:
>
>
> You've got it completely wrong. By setting shared_buffers to 2GB it
> means no-one can use it. It's not postgres that's running out of
> memory, it's the rest of your system. Set it to something sane like
> 128MB or maybe smaller.



Isn't 128MB quite low considering the "current standard" of 25% - 50% of
total ram?

Regards

MP


[GENERAL] pgsql functions: get column headers; optional arguments

2007-08-24 Thread Mark




Hi. I hope this is the correct mailing list.

I am writing a function to extract data either from a table or a query
and output it in xml. I have the logic down, but I cannot work out a
few things.

1. How can I read the column headings from the returned data set? I
have resorted to writing the same function in tcl in which I can work
out how to do this, but what about pgsql? I can't use the system tables
for this, since the data may not come from a table.

2. It it possible, either in tcl or pgsql, to have optional function
arguments?

Thanks,

Mark

-- 


Mark Simon
Manngo Net Pty Ltd
Phone/Fax: 1300 726 000
mobile: 0411 246 672
email: [EMAIL PROTECTED]
web: http://www.manngo.net
Resume: http://mark.manngo.net







[GENERAL] cant get spatial data..

2007-08-24 Thread andrew quaresma
try {



Object fs = this.resource.resolve(FeatureSource.class, new
GeoProgressMonitor(new NullProgressMonitor()));




if ((fs != null) && (fs instanceof FeatureSource))
{


FeatureCollection fc = ((FeatureSource) fs).getFeatures();


FeatureIterator fi = fc.features();


while (fi.hasNext())
{

FeatureGeoResource fgr=null;
Feature f = fi.next();



System.out.println("TOTAL " + f.getNumberOfAttributes(
));
//System.out.println( f.getAttribute(2));
System.out.println(f.getAttribute("rota"));
System.out.println(f.getAttribute("ponto"));

members.add(fgr);

}


fc.close(fi);
}

} catch (IOException e) {
e.printStackTrace();
}



why does the sysout of the point returns null?... cant get spatial data...
what shall i do?... :S


Re: [GENERAL] "out of memory" error

2007-08-24 Thread Christian Schröder




Mikko Partio wrote:

  
  Isn't 128MB quite low considering the "current standard" of 25%
- 50% of total ram?
  
  


I had also read a statement about using this amount of memory as shared
buffers. Exactly that was the reason why I set it to such a high value,
but I am now convinced that this is wrong.

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#shbuf
and
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
are quite clear about this.

On the other hand,
http://edoceo.com/liber/db-postgresql-performance.php says:

  
Shared Memory
PostgreSQL uses lots of this, view ipcs to prove
it, the more shared memory the better as more data (tables) can be
loaded.
On a dedicated datbase server it's not uncommon to give half the memory
to the database.
  

and

  
shared_buffers = N
Set anywhere from 1/4 to 1/2 physical memory, must set kernel
shared memory max first.
Will see noticeable difference.
  

Since the first links are also mentioned on the official PostgreSQL
website (http://www.postgresql.org/docs/techdocs.2) I think they should
be trusted more.

Regards,
    Christian
-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] "out of memory" error

2007-08-24 Thread Christian Schröder
Side note: Why does Thunderbird send HTML mails albeit being configured 
for sending plain text mails? Sorry for that! And sorry for being off-topic.


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] "out of memory" error

2007-08-24 Thread Martijn van Oosterhout
On Fri, Aug 24, 2007 at 12:07:37PM +0300, Mikko Partio wrote:
> On 8/23/07, Martijn van Oosterhout <[EMAIL PROTECTED]> wrote:
> > You've got it completely wrong. By setting shared_buffers to 2GB it
> > means no-one can use it. It's not postgres that's running out of
> > memory, it's the rest of your system. Set it to something sane like
> > 128MB or maybe smaller.
> Isn't 128MB quite low considering the "current standard" of 25% - 50% of
> total ram?

Not sure about "current standard" but it depends heavily on your
system. On my laptop where postgres is among the least important
applications, even 16MB might be too much. If you have a server
dedicated to it, maybe a larger percentage would be helpful, but >50%
of memory is definitly counterproductive.

The basic issue is that everything postgres reads into shared_buffers
is going to be in the system-wide disk cache also. That's two copies of
everything -> wasted memory. I think it's probably more to do with your
write-traffic than read-traffic.

At the very least, if you're reaching the point where the kernel is
swapping shared memory to disk, you've lost. More is most definitly not
always better.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.


signature.asc
Description: Digital signature


Re: [GENERAL] Apache + PHP + Postgres Interaction

2007-08-24 Thread Bill Moran
In response to Hannes Dorbath <[EMAIL PROTECTED]>:

> > On 24.08.2007 02:43, Bill Moran wrote:
> >> Hannes Dorbath <[EMAIL PROTECTED]> wrote:
> >>> Bill Moran wrote:
>  I guess I just feel that "broken" is a bit of a harsh term.  If
>  your expectations are for full-blown connection management from
>  pconnect(), then you will be disappointed.  If you take it for
>  what it is: persistent connections, then those limitations would
>  be expected.
> >>> It's broken because persistent connections get randomly garbage
> >>> collected where they should not. So broken in the sense of bugged.
> >>> Expect connections to die for no reason, especially under load.
> >> 
> >> It's funny that you should mention that, since I haven't seen that
> >> behaviour in 18 months of load testing over a dozen servers.
> 
> Please reply to the list as well.

Your reply to me did not have the list in the CC.

> How did you verify that? It will spawn a new connection silently, if the 
> old got dropped. Did you really verify your logs, that you don't get 
> more new connections than Apache spawns workers? This might not be 
> noticeable for you, if you are running Apache. In a FCGI environment 
> where you have a fixed amount of workers, you notice new connections, as 
> there should not be any.

As I stated in the other reply to an email that looked similar to this
one -- I'm not sure I understand the behaviour you're trying to describe.

-- 
Bill Moran
http://www.potentialtech.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Gregory Stark
"Dave Page" <[EMAIL PROTECTED]> writes:

> Alban Hertroys wrote:
>> So actually the remark shouldn't be that "the multi-threaded
>> architecture is only advantageous  on Windows", but more like "the
>> multi-process architecture is disadvantageous on Windows and hence a
>> multi-threaded architecture is preferred (on that particular OS)".
>
> Yeah - but I'm not sure thats necessarily something that should have a place 
> on
> a bullet point comparison.

Note that while we use the OS's "threads" api we're not really any more
multi-threaded on Windows than we are on Unix. We don't use any shared memory
data structures we don't on Unix using SysV shared memory, we don't use any
mutexes or other threaded programming techniques that we don't use on Unix,
and so on.

It's purely a question of which API we use to create the threads of execution.
Not an architectural change in Postgres.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Trevor Talbot
On 8/23/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> Shelby Cain wrote:

> > Wild guess on my part... could that error be the result of an attempt
> > to map shared memory into a process at a fixed location that just
> > happens to already be occupied by a dll that Windows had decided to
> > relocate?
>
> Not that wild a guess, really :-) I'd say it's a very good possibility -
> but I have no idea why it'd do that, since all backends load the same
> DLLs at that stage.

Not a valid assumption; you can't rely on consistent VM space among
multiple [non-cloned] processes without a serious amount of effort.
Anything can use that space, it's not just file views.  Obviously it
happens to work some of the time, but when it doesn't, it doesn't.  I
gather postgres depends on it being at the same address, and fixing
that isn't trivial?

If everything relevant is going through the intriguing
internal_forkexec(), you could probably reserve address space there
before resuming the thread.  You'd want to combine this with picking
address space that's less likely to be used before creating the shared
memory section.  (Actually, if you're doing that, you might as well
just inject the backend variables too instead of going through the
mapped file gymnastics.)

Not a simple change, but would likely make this particular problem go
away (assuming this is the problem).  It's also the first time I've
looked at the source, so perhaps I missed something.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Apache + PHP + Postgres Interaction

2007-08-24 Thread Hannes Dorbath

On 24.08.2007 12:56, Bill Moran wrote:
How did you verify that? It will spawn a new connection silently, if the 
old got dropped. Did you really verify your logs, that you don't get 
more new connections than Apache spawns workers? This might not be 
noticeable for you, if you are running Apache. In a FCGI environment 
where you have a fixed amount of workers, you notice new connections, as 
there should not be any.


As I stated in the other reply to an email that looked similar to this
one -- I'm not sure I understand the behaviour you're trying to describe.


Persistent connections get randomly dropped. Now there is either an 
Apache worker with mod_php or and FCGI child that has lost its 
connection to the database. On the next request that is processed by 
this child / worker it will notice that it has lost its connection and 
will create a new one. This behaviour might depend on 
pgsql.auto_reset_persistent.


When using a classical Apache config (that is a non-threaded MPM) Apache 
does adjust its number of workers dynamically, so it's natural that more 
database connections are created under load and are dropped again after 
some idle time. That is why it might be hard to verify that you are not 
hit by this bug, as in your environment there is a native fluctuation.


In an FCGI environment however, you will notice this immediatelly, as 
the number of childs / workers is usually fixed and something must be 
broken if connection come and go.



--
Regards,
Hannes Dorbath

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] [OT - sorta] How to extract a substring using Regex

2007-08-24 Thread Josh Trutwin
On Fri, 24 Aug 2007 06:31:58 +0100
Mark Cave-Ayland <[EMAIL PROTECTED]> wrote:

> > substring(data_field from '(.+)<\/name>')
> 
> FWIW, I find the following site extremely useful when trying to
> create moderately complex regular expressions: http://www.rexv.org.

Nice site - here's another good one for an installed app that is
excellent for even advanced regexp's:

http://weitz.de/regex-coach/

It used to be cross-platform but now he only develops a Windows
version.

Thanks,

Josh

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Local authentication/security

2007-08-24 Thread Lange Marcus
> > Is there a way to assure that the database does not get 
> stored on disk 
> > ? If my program, everytime it starts, would create or build up the 
> > database could this database be kept in RAM ? Or does postgres 
> > automatically create a database file or something ? Do you 
> understand 
> > what i mean ?
> 
> You could create a ramdisk, then create the database on that. 
> Of course, if you lose power then you lose your data. Check 
> the manuals for details on "tablespaces" for how to have some 
> DB objects in a different location.
> 
> > Another question, are md5 passowords supported under windows ?
> 
> Any reason why they shouldn't be?
> 
> Not that it matters in your case. The password might as well 
> be "password" - if they get access to the files/application, 
> it's game over.

What about having some of the columns encrypted in the database ? 
Will that improve things abit ?
Regards Marcus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] pgsql functions: get column headers; optional arguments

2007-08-24 Thread Mark

Hi. I hope this is the correct mailing list.

I am writing a function to extract data either from a table or a query 
and output it in xml. I have the logic down, but I cannot work out a few 
things.


1. How can I read the column headings from the returned data set? I have 
resorted to writing the same function in tcl in which I can work out how 
to do this, but what about pgsql? I can't use the system tables for 
this, since the data may not come from a table.


2. It it possible, either in tcl or pgsql, to have optional function 
arguments?


Thanks,

Mark
--


 Mark Simon

Manngo Net Pty Ltd

Phone/Fax: 1300 726 000
mobile: 0411 246 672

email: [EMAIL PROTECTED] 
web: http://www.manngo.net

Resume: http://mark.manngo.net



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Local authentication/security

2007-08-24 Thread Richard Huxton

Lange Marcus wrote:
Not that it matters in your case. The password might as well 
be "password" - if they get access to the files/application, 
it's game over.


What about having some of the columns encrypted in the database ? 
Will that improve things abit ?


Not unless you can keep the key protected. If you prevent access to a 
file containing the key then you can do the same to the database and 
application files.



Can you tell us what you're trying to do? What are you going to deploy, 
and what threats do you want to protect against?


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Gregory Stark
"Trevor Talbot" <[EMAIL PROTECTED]> writes:

> I gather postgres depends on it being at the same address, and fixing that
> isn't trivial?

I haven't been following the rest of the thread so I'm not sure if this is
important. But no, fixing that should be relatively trivial as there are
already some configurations where it's not the case (the EXEC_BACKEND case I
believe). The rest of the system uses a shared memory base pointer and
references everything relative to that.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


[GENERAL] tsearch query question

2007-08-24 Thread roy simkes

Hi,

SELECT contentid, title, (rank(to_tsvector(body),q) + 
rank(to_tsvector(title),q) + rank(to_tsvector(subtitle),q)) AS Score

FROM content, to_tsquery('parkyeri') AS q
WHERE statusid = 1
AND ispublished = 1
AND (to_tsvector(body) @@ q
  OR to_tsvector(title) @@ q
  OR to_tsvector(subtitle) @@ q )
ORDER BY Score

I have such a query. I'm not very sure if it will work but that's not 
the part of the question. As you see I'm using a lot to_tsvector() 
function. Which I believe it will not be good for the performance. So I 
thought changing my query to something like this:


SELECT contentid, title, (rank(fts_body, q) + rank(fts_title,q) + 
rank(fts_subtitle,q) ) AS Score

FROM content,
  to_tsquery('search & string') AS q,
  to_tsvector(body) AS fts_body,
  to_tsvector(title) AS fts_title,
  to_tsvector(subtitle) AS fts_subtitle
WHERE statusid = 1
AND ispublished = 1
AND ( fts_body @@ q
  OR fts_title @@ q
  OR fts_subtitle @@ q )
ORDER BY Score

So when I have changed to this, will the to_tsvector part will be 
available for every row? Or will it be just computed once? I mean in the 
first query where part is executed for every row, so I'm sure that it 
will be evaluated for all the rows. But when I put that in the from part 
will it compute the value once and will use the same value for all the 
where clauses? If that's the case what will be the value of fts_body? 
The tsvector of every row's data or just one row's data?


thank you for your time and patience

roy simkes

note: I had posted the same question a week ago too, but I think it's 
missing. I'm sorry if it's already posted and I'm double posting.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] [OT - sorta] How to extract a substring using Regex

2007-08-24 Thread Lee Keel
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:pgsql-general-
> [EMAIL PROTECTED] On Behalf Of Josh Trutwin
> Sent: Friday, August 24, 2007 7:58 AM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] [OT - sorta] How to extract a substring using Regex
> 
> On Fri, 24 Aug 2007 06:31:58 +0100
> Mark Cave-Ayland <[EMAIL PROTECTED]> wrote:
> 
> > > substring(data_field from '(.+)<\/name>')
> >
> > FWIW, I find the following site extremely useful when trying to
> > create moderately complex regular expressions: http://www.rexv.org.
> 
> Nice site - here's another good one for an installed app that is
> excellent for even advanced regexp's:
> 
> http://weitz.de/regex-coach/
> 
> It used to be cross-platform but now he only develops a Windows
> version.
> 
> Thanks,
> 
> Josh
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org/

As long as we are talking about regex tools...  I found this tool a while
back and today I don't know how I ever figured out regular expressions
without it. (http://www.regexbuddy.com/)  Unfortunately it is not free, but
for the time that it saves, it is WELL worth $40 a copy.  In fact, I
personally bought it and was going to charge back to my company, but decided
I would keep the licenses myself.
This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are addressed. If 
you have received this email in error please notify the sender. This message 
contains confidential information and is intended only for the individual 
named. If you are not the named addressee you should not disseminate, 
distribute or copy this e-mail.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[GENERAL] ERD of information_schema / system catalog?

2007-08-24 Thread Josh Trutwin
Hi,

Did a little googling but didn't come up with much - does anyone know
if someone made an ERD of either the information_schema or system
catalog (pg_) tables for postgres 8?

Something similar to this for MySQL 5 would be ideal:

http://www.xcdsql.org/MySQL/information_schema/5.1/MySQL_5_1_INFORMATION_SCHEMA.html

Thanks,

Josh

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Geographic High-Availability/Replication

2007-08-24 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/24/07 08:21, Markus Schiltknecht wrote:
> Hi,
> 
> Matthew wrote:
>> Hey all, new postgres user here. We are trying to setup/research an
>> HA/Replicated solution with Postrgresql between a datacenter in LA and a
>> d.c. in NY.
>>
>> We have a private LAN link between the two D.C.'s with a max round-trip
>> of 150ms.
>>
>> We will have a web server at each d.c. (among other servers) that will
>> write/read to/from the local LAN DB. On writes, that data should be
>> xmited to the other data center so that if, for whatever reason, my
>> website request was sent to LA instead of NY, all my session information
>> etc will still exist.
> 
> This is commonly known as synchronous replication. As that involves
> *at-least* one round-trip *before* committing, it's quite expensive. Can
> you live with a delay of ~150ms before COMMIT confirmation?

Which puts an upper limit on transaction rates at 6TPS.  Blech.

> Another issue is the reliability of your failure detectors. How does
> server B know that server A is really down (and not only the link?).
> Normally, that's solved with a quorum device. So that you have to have
> at least three servers - preferably in different locations.

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGzu/7S9HxQb37XmcRArnEAJ947nvYFT1eQvRzj6YkpVEDLtLUqQCgsbQy
rgyz2ZCrlGbS+RzzXTD1ybY=
=SRQv
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Geographic High-Availability/Replication

2007-08-24 Thread Gregory Stark
"Ron Johnson" <[EMAIL PROTECTED]> writes:

> On 08/24/07 08:21, Markus Schiltknecht wrote:
>
>> This is commonly known as synchronous replication. As that involves
>> *at-least* one round-trip *before* committing, it's quite expensive. Can
>> you live with a delay of ~150ms before COMMIT confirmation?
>
> Which puts an upper limit on transaction rates at 6TPS.  Blech.

Only if your application is single-threaded. By single-threaded I don't refer
to operating system threads but to the architecture. If you're processing a
large batch file handling records one by one and waiting for each commit
before proceeding then it's single threaded. If you have a hundred independent
clients on separate connections doing separate things then each one of them
could get 6tps. Which you have will depend on your application and your needs,
it may not be something you can change.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Bruce Momjian
Trevor Talbot wrote:
> On 8/23/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> > Shelby Cain wrote:
> 
> > > Wild guess on my part... could that error be the result of an attempt
> > > to map shared memory into a process at a fixed location that just
> > > happens to already be occupied by a dll that Windows had decided to
> > > relocate?
> >
> > Not that wild a guess, really :-) I'd say it's a very good possibility -
> > but I have no idea why it'd do that, since all backends load the same
> > DLLs at that stage.
> 
> Not a valid assumption; you can't rely on consistent VM space among
> multiple [non-cloned] processes without a serious amount of effort.
> Anything can use that space, it's not just file views.  Obviously it
> happens to work some of the time, but when it doesn't, it doesn't.  I
> gather postgres depends on it being at the same address, and fixing
> that isn't trivial?
> 
> If everything relevant is going through the intriguing
> internal_forkexec(), you could probably reserve address space there
> before resuming the thread.  You'd want to combine this with picking
> address space that's less likely to be used before creating the shared
> memory section.  (Actually, if you're doing that, you might as well
> just inject the backend variables too instead of going through the
> mapped file gymnastics.)
> 
> Not a simple change, but would likely make this particular problem go
> away (assuming this is the problem).  It's also the first time I've
> looked at the source, so perhaps I missed something.

I think this is accurate.  When we created the Win32 native port there
was a lot of concern about how to handle shared memory in a BACKEND_EXEC
case, namely that postmaster children were not copies which had the same
shared memory mappings, but rather were new processes that had to attach
to shared memory at a fixed address.

The WIN32 solution was to create the shared memory in the parent, and
then pass that address value down to the children to use in attaching to
the existing segment.  We expected all sorts of problems with this but
in fact it seemed to work fine (most of the time).

As you can see it doesn't work 100% of the time, but it worked more
reliabily than we expected.  What we have been waiting for is someone
who can recreate a failure so we can track down how to best make it 100%
reliable, and as you can see, we haven't had a flood of problem reports
to track this down.

If you want to help make it 100% we will work with you to find the
solution.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Bruce Momjian
Gregory Stark wrote:
> "Trevor Talbot" <[EMAIL PROTECTED]> writes:
> 
> > I gather postgres depends on it being at the same address, and fixing that
> > isn't trivial?
> 
> I haven't been following the rest of the thread so I'm not sure if this is
> important. But no, fixing that should be relatively trivial as there are
> already some configurations where it's not the case (the EXEC_BACKEND case I
> believe). The rest of the system uses a shared memory base pointer and
> references everything relative to that.

This is inaccurate, I believe.  The original Berkeley code did exec()
for backends and hence allowed shared memory to be at different
addresses for different backends, but we started using fork() and
eliminated much of that capability for performance and clarify reasons,
so right now all backends have to have shared memory at the same
address, and changing this will not be simple.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] [OT - sorta] How to extract a substring using Regex

2007-08-24 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Lee Keel wrote:
>> -Original Message-

>>
>> ---(end of broadcast)---
>> TIP 4: Have you searched our list archives?
>>
>>http://archives.postgresql.org/
> 
> As long as we are talking about regex tools...  I found this tool a while
> back and today I don't know how I ever figured out regular expressions
> without it. (http://www.regexbuddy.com/)  Unfortunately it is not free, but
> for the time that it saves, it is WELL worth $40 a copy.  In fact, I
> personally bought it and was going to charge back to my company, but decided
> I would keep the licenses myself.
> This email and any files transmitted with it are confidential and intended 
> solely for the use of the individual or entity to whom they are addressed. If 
> you have received this email in error please notify the sender. This message 
> contains confidential information and is intended only for the individual 
> named. If you are not the named addressee you should not disseminate, 
> distribute or copy this e-mail.

Woah! That is great. Now to find a linux version.

Joshua D. Drake

> 
> ---(end of broadcast)---
> TIP 1: if posting/reading through Usenet, please send an appropriate
>subscribe-nomail command to [EMAIL PROTECTED] so that your
>message can get through to the mailing list cleanly
> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGzvScATb/zqfZUUQRAnBRAKCcMesTYri0caOiNzShMwTXXw137ACfcLxu
3Egy0J00iFIaFFfl9KiOlc0=
=rlVZ
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Undetected corruption of table files

2007-08-24 Thread Tom Lane
"Albe Laurenz" <[EMAIL PROTECTED]> writes:
> Would it be an option to have a checksum somewhere in each
> data block that is verified upon read?

That's been proposed before and rejected before.  See the archives ...

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] Out of Memory - 8.2.4

2007-08-24 Thread Jeff Amiel
"PostgreSQL 8.2.4 on i386-pc-solaris2.10, compiled by
GCC gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath)"


Week-old installstill tuning and tweaking this
thing.

Over last 2 days, have spotted 10 "Out of Memory"
errors in postgres logs (never saw before with same
app/usage patterns on tuned hardware/postgres under
FreeBSD)

Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
local0.warning] [6-1] 2007-08-22 18:08:24 CDT   ERROR:
 out of memory.
Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
local0.warning] [6-2] 2007-08-22 18:08:24 CDT  
DETAIL:  Failed on request of size 536870910.

What I found interesting is that It's ALWAYS the same
size536870910

I am running autovacuum and slony.but I see
nothing in the logs anywhere near the "out of memory"
errors related to either (autovacuum used to under
8.0.X log INFO messages every time it vacuumed which
came in handy...I assume it doesn't so this any more?)
 

The events are fairly spread out...and cannot (by
looking at app logs and rest of DB logs) correlate to
any specific query or activity.

Any help would be appreciated

Box is a Sun X4600 with 8 dual-core processors and 32
gig of ram.

# su - pgsql
Sun Microsystems Inc.   SunOS 5.10  Generic
January 2005
-bash-3.00$ ulimit -a
core file size(blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
open files(-n) 256
pipe size  (512 bytes, -p) 10
stack size(kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes(-u) 16357
virtual memory(kbytes, -v) unlimited

shared_buffers = 3GB# min 128kB or
max_connections*16kB
temp_buffers = 1000 # min 800kB  
was 8MB
max_prepared_transactions = 450 # can be 0 or
more
work_mem = 100MB# min
64kB
maintenance_work_mem = 512MB# min 1MB
#max_stack_depth = 2MB  # min 100kB
max_fsm_pages = 208000  # min
max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1   # min 100, ~70
bytes each
#max_files_per_process = 1000   # min 25
#shared_preload_libraries = ''  # (change
requires restart)
fsync = on  # turns forced
synchronization on or off
wal_sync_method = fdatasync # the default
is the first option
full_page_writes = off  # recover from
partial page writes
wal_buffers = 2300  # min 32kB
commit_delay = 10   # range
0-10, in microseconds
#commit_siblings = 5# range 1-1000
checkpoint_segments = 128   # in logfile
segments, min 1, 16MB each
checkpoint_timeout = 5min   # range 30s-1h
checkpoint_warning = 99s# 0 is off




   

Be a better Globetrotter. Get better travel answers from someone who knows. 
Yahoo! Answers - Check it out.
http://answers.yahoo.com/dir/?link=list&sid=396545469

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] [OT - sorta] How to extract a substring using Regex

2007-08-24 Thread Lee Keel
> -Original Message-
> From: Joshua D. Drake [mailto:[EMAIL PROTECTED]
> Sent: Friday, August 24, 2007 10:09 AM
> To: Lee Keel
> Cc: Josh Trutwin; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] [OT - sorta] How to extract a substring using Regex
> 
> 
> Woah! That is great. Now to find a linux version.
> 
> Joshua D. Drake
> 

I am not sure if you saw the information about the installation on wine
(http://www.regexbuddy.com/wine.html).  I am not a linux guru, so I really
couldn't speak to this very much.  Sorry.  But yes, very cool tool and will
help newbies and the more experienced.

-LK
This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are addressed. If 
you have received this email in error please notify the sender. This message 
contains confidential information and is intended only for the individual 
named. If you are not the named addressee you should not disseminate, 
distribute or copy this e-mail.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Tom Lane
"Trevor Talbot" <[EMAIL PROTECTED]> writes:
> On 8/23/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
>> Not that wild a guess, really :-) I'd say it's a very good possibility -
>> but I have no idea why it'd do that, since all backends load the same
>> DLLs at that stage.

> Not a valid assumption; you can't rely on consistent VM space among
> multiple [non-cloned] processes without a serious amount of effort.

I'm not sure if you have a specific technical meaning of "clone" in mind
here, but these processes are all executing the identical executable,
and taking care to map the shmem early in execution *before* they load
any DLLs.  So it should work.  Apparently, it *does* work for awhile for
the OP, and then stops working, which is even odder.

> I gather postgres depends on it being at the same address, and fixing
> that isn't trivial?

That's correct, and not having to change it is not negotiable ---
finding a way to make this work was one of the gating factors that
made it practical to have a Windows port at all.

If you've got a specific suggestion for making it more reliable,
we're all ears.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Out of Memory - 8.2.4

2007-08-24 Thread Erik Jones

On Aug 24, 2007, at 10:09 AM, Jeff Amiel wrote:


"PostgreSQL 8.2.4 on i386-pc-solaris2.10, compiled by
GCC gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath)"


Week-old installstill tuning and tweaking this
thing.

Over last 2 days, have spotted 10 "Out of Memory"
errors in postgres logs (never saw before with same
app/usage patterns on tuned hardware/postgres under
FreeBSD)

Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
local0.warning] [6-1] 2007-08-22 18:08:24 CDT   ERROR:
 out of memory.
Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
local0.warning] [6-2] 2007-08-22 18:08:24 CDT
DETAIL:  Failed on request of size 536870910.

What I found interesting is that It's ALWAYS the same
size536870910

I am running autovacuum and slony.but I see
nothing in the logs anywhere near the "out of memory"
errors related to either (autovacuum used to under
8.0.X log INFO messages every time it vacuumed which
came in handy...I assume it doesn't so this any more?)


The events are fairly spread out...and cannot (by
looking at app logs and rest of DB logs) correlate to
any specific query or activity.

Any help would be appreciated

Box is a Sun X4600 with 8 dual-core processors and 32
gig of ram.

# su - pgsql
Sun Microsystems Inc.   SunOS 5.10  Generic
January 2005
-bash-3.00$ ulimit -a
core file size(blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
open files(-n) 256
pipe size  (512 bytes, -p) 10
stack size(kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes(-u) 16357
virtual memory(kbytes, -v) unlimited

shared_buffers = 3GB# min 128kB or
max_connections*16kB
temp_buffers = 1000 # min 800kB
was 8MB
max_prepared_transactions = 450 # can be 0 or
more
work_mem = 100MB# min
64kB
maintenance_work_mem = 512MB# min 1MB
#max_stack_depth = 2MB  # min 100kB
max_fsm_pages = 208000  # min
max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1   # min 100, ~70
bytes each
#max_files_per_process = 1000   # min 25
#shared_preload_libraries = ''  # (change
requires restart)
fsync = on  # turns forced
synchronization on or off
wal_sync_method = fdatasync # the default
is the first option
full_page_writes = off  # recover from
partial page writes
wal_buffers = 2300  # min 32kB
commit_delay = 10   # range
0-10, in microseconds
#commit_siblings = 5# range 1-1000
checkpoint_segments = 128   # in logfile
segments, min 1, 16MB each
checkpoint_timeout = 5min   # range 30s-1h
checkpoint_warning = 99s# 0 is off


A few weeks ago I got the same error on the same server.  In fact,  
the only difference is our memory where you have 32GB and I have 16GB  
and you have 512MB maintenance_work_mem and I have 256MB.  I point  
out the maintenance work memory setting as that is pretty much  
exactly what the request size that your error pointed out as was mine  
(yours/2).  However, that was the only time I've seen this.  Below is  
the full context of the error report in my log.  I see that there is  
an Autovacuum context as well as references to a toast table so,  
something to do with autovacuum?


TopMemoryContext: 14466424 total in 1758 blocks; 7160792 free (12578  
chunks); 7305632 used
TopTransactionContext: 8192 total in 1 blocks; 7688 free (10 chunks);  
504 used
Type information cache: 8192 total in 1 blocks; 1800 free (0 chunks);  
6392 used
Operator class cache: 8192 total in 1 blocks; 4872 free (0 chunks);  
3320 used
Autovacuum context: 16769024 total in 11 blocks; 6959320 free (11  
chunks); 9809704 used
smgr relation table: 24576 total in 2 blocks; 11952 free (4 chunks);  
12624 used
TransactionAbortContext: 32768 total in 1 blocks; 32752 free (0  
chunks); 16 used

Portal hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
PortalMemory: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 256 free (0 chunks); 7936 used
CacheMemoryContext: 1183288 total in 20 blocks; 889824 free (4094  
chunks); 293464 used
pg_toast_356294_index: 1024 total in 1 blocks; 328 free (0 chunks);  
696 used
pg_attrdef_adrelid_adnum_index: 1024 total in 1 blocks; 288 free (0  
chunks); 736 used
pg_index_indrelid_index: 1024 total in 1 blocks; 352 free (0 chunks);  
672 used
pg_namespace_oid_index: 1024 total in 1 blocks; 352 free (0 chunks);  
672 used
pg_authid_oid_index: 1024 total in 1 blocks; 352 free (0 chunks); 672  
used
pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 288 free (0  
chunks); 736 used
pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 328 free (0  
chunks); 696 used
pg_operator_oid_index: 1024 total 

[GENERAL] Can tsearch do some basic text mining

2007-08-24 Thread Phoenix Kiula
Hi,

We have big blobs of text (average 10,000 characters) in a database,
from which we would like to discover the most often repeated words or
phrases. Can tsearch be used for this kind of pattern search? I
suppose it's Text Mining 101 sort of stuff, nothing complex.

TIA!

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Trevor Talbot" <[EMAIL PROTECTED]> writes:
>> I gather postgres depends on it being at the same address, and fixing that
>> isn't trivial?

> I haven't been following the rest of the thread so I'm not sure if this is
> important. But no, fixing that should be relatively trivial as there are
> already some configurations where it's not the case (the EXEC_BACKEND case I
> believe). The rest of the system uses a shared memory base pointer and
> references everything relative to that.

That hasn't been the case for quite a few years, and we're not going back.
The pointer-to-offset-and-back gymnastics that that required were
utterly destructive to code readability and maintainability, mainly
because if everything stored in shmem data structures is an "offset"
then you can't get any useful error checking from the compiler about how
you are using the fields.  It's like decreeing that every pointer
must be declared "void *" and cast to something else when it's used.

There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET,
but I think it's mostly just that no one's bothered to rewrite the code
for SHM_QUEUE linked lists.  The vast majority of our shmem structures
use regular pointers, and have for years.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] [OT - sorta] How to extract a substring using Regex

2007-08-24 Thread Dimitri Fontaine
Le Friday 24 August 2007 17:09:16 Joshua D. Drake, vous avez écrit :
> > As long as we are talking about regex tools...  I found this tool a while
> > back and today I don't know how I ever figured out regular expressions
> > without it. (http://www.regexbuddy.com/) 
>
> Woah! That is great. Now to find a linux version.

Does this tool looks like the one you'd wanna find?
  http://www.blackie.dk/KDE/KRegExpEditor/
  http://docs.kde.org/stable/en/kdeutils/KRegExpEditor/

Regards,
-- 
dim

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Alvaro Herrera
Tom Lane escribió:

> There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET,
> but I think it's mostly just that no one's bothered to rewrite the code
> for SHM_QUEUE linked lists.  The vast majority of our shmem structures
> use regular pointers, and have for years.

... except that, not knowing that, I wrote part of the new autovac code
using MAKE_PTR/OFFSET, and it needs to be rewritten eventually :-(

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET,
> but I think it's mostly just that no one's bothered to rewrite the code
> for SHM_QUEUE linked lists.  The vast majority of our shmem structures
> use regular pointers, and have for years.

Ah, I happened to be recently in that code so I was mislead.

So even in EXEC_BACKEND we require that we can attach to the shared memory at
a specified location. hm.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Geographic High-Availability/Replication

2007-08-24 Thread Markus Schiltknecht

Hi,

Matthew wrote:

Hey all, new postgres user here. We are trying to setup/research an
HA/Replicated solution with Postrgresql between a datacenter in LA and a
d.c. in NY.

We have a private LAN link between the two D.C.'s with a max round-trip
of 150ms.

We will have a web server at each d.c. (among other servers) that will
write/read to/from the local LAN DB. On writes, that data should be
xmited to the other data center so that if, for whatever reason, my
website request was sent to LA instead of NY, all my session information
etc will still exist.


This is commonly known as synchronous replication. As that involves 
*at-least* one round-trip *before* committing, it's quite expensive. Can 
you live with a delay of ~150ms before COMMIT confirmation?


Another issue is the reliability of your failure detectors. How does 
server B know that server A is really down (and not only the link?). 
Normally, that's solved with a quorum device. So that you have to have 
at least three servers - preferably in different locations.


Regards

Markus

Disclaimer: I'm the developer behind Postgres-R (www.postgres-r.org), 
have a look at it, it's designed to do what you are looking for.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Out of Memory - 8.2.4

2007-08-24 Thread Tom Lane
Jeff Amiel <[EMAIL PROTECTED]> writes:
> Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
> local0.warning] [6-1] 2007-08-22 18:08:24 CDT   ERROR:
>  out of memory.
> Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
> local0.warning] [6-2] 2007-08-22 18:08:24 CDT  
> DETAIL:  Failed on request of size 536870910.

> What I found interesting is that It's ALWAYS the same
> size536870910

> maintenance_work_mem = 512MB# min 1MB

Apparently this maintenance_work_mem setting is higher than your system
can reliably provide.  Knock it back a bit.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Geographic High-Availability/Replication

2007-08-24 Thread Markus Schiltknecht

Hi,

Gregory Stark wrote:

Only if your application is single-threaded. By single-threaded I don't refer
to operating system threads but to the architecture. If you're processing a
large batch file handling records one by one and waiting for each commit
before proceeding then it's single threaded. If you have a hundred independent
clients on separate connections doing separate things then each one of them
could get 6tps. Which you have will depend on your application and your needs,
it may not be something you can change.


Correct.

Plus, as in the implementation of Postgres-R, performance is *not* bound 
to the slowest node. Instead, every node can process transactions at 
it's own speed. Slower nodes might then have to queue transactions from 
those until they catch up again.


Regards

Markus


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] PG Seg Faults Performing a Query

2007-08-24 Thread Tom Lane
Bill Thoen <[EMAIL PROTECTED]> writes:
> (gdb) bt
> #0  0x003054264571 in fputc () from /lib64/libc.so.6
> #1  0x0040dbc2 in print_aligned_text (title=0x0, headers=0x5665d0,
> cells=0x2f8fc010, footers=0x557c90,
> opt_align=0x557ef0 'l' , "rr", 'l'  times>, "rl lll", opt_tuples_only=0 '\0', opt_numeric_locale=0 '\0', 
> opt_border=1,
> encoding=8, fout=0x0) at print.c:448
> #2  0x0040f0eb in printTable (title=0x0, headers=0x5665d0,
> cells=0x2f8fc010, footers=0x557c90,
> align=0x557ef0 'l' , "rr", 'l' , 
> "rl lll", opt=0x7fff3e3be8c0, fout=0x3054442760, flog=0x0) at 
> print.c:1551

OK, so the problem is that print_aligned_text is being passed fout = NULL.
Since that wasn't what was passed to printTable, the conclusion must be
that PageOutput() was called and returned NULL --- that is, that its
popen() call failed.  Obviously we should put in some sort of check for
that.  I can see three reasonable responses: either make psql abort
entirely (akin to its out-of-memory behavior), or have it fall back to
not using the pager, either silently or after printing an error
message.  Any thoughts which way to jump?

Meanwhile, the question Bill needs to look into is why popen() is
failing for him.  I'm guessing it's a fork() failure at bottom, but
why so consistent?  strace'ing the psql run might provide some more
info.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Shelby Cain
>- Original Message 
>From: Magnus Hagander <[EMAIL PROTECTED]>
>To: Shelby Cain <[EMAIL PROTECTED]>
>Cc: Alvaro Herrera <[EMAIL PROTECTED]>; Terry Yapt <[EMAIL PROTECTED]>; 
>pgsql-general@postgresql.org
>Sent: Friday, August 24, 2007 1:08:44 AM
>Subject: Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)
>
>Not that wild a guess, really :-) I'd say it's a very good possibility -
>but I have no idea why it'd do that, since all backends load the same
>DLLs at that stage.
>
>//Magnus
>

Assuming this is an issue with shared libraries, I think it would
have more to do with the way Windows resolves address conflicts on process
startup than anything caused by explicit calls to LoadLibrary().  Looking
at postgres.exe with the dependency viewer from
Visual Studio 6, I see that the following shared library dependencies
embedded in the executable image that having conflicting base
addresses.  If I'm not mistaken, Windows will automatically relocate
these libraries prior to actual code execution so there would be no
opportunity for that particular instance of postgres.exe to map the shared 
memory if the address
space is already in use by a relocated dll.







libeay32.dll - 0x1000



libiconv-2.dll - 0x1000



libintl-2.dll - 0x1000



ssleay32.dll - 0x1000



comerr32.dll - 0x1c00



krb5_32.dll - 0x1c00







I also found a KB article that specifically addresses ERROR_INVALID_MEMORY 
being returned from MapViewOfFileEx().  







http://support.microsoft.com/kb/125713





The article specifically addresses the concern where multiple processes
must use
the same address for mappings and how to accomplish that under
Windows.  Search for "Addresses of Mapped Views".  The only thing that
really gives me any pause is the fact the article hasn't been updated
past the NT 3.51/Windows 9x era but the underlying behavior might not have been 
changed in Windows 2000/XP/etc.







Regards,







Shelby Cain








   

Boardwalk for $500? In 2007? Ha! Play Monopoly Here and Now (it's updated for 
today's economy) at Yahoo! Games.
http://get.games.yahoo.com/proddesc?gamekey=monopolyherenow

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Out of Memory - 8.2.4

2007-08-24 Thread Erik Jones

On Aug 24, 2007, at 11:46 AM, Tom Lane wrote:


Jeff Amiel <[EMAIL PROTECTED]> writes:

Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
local0.warning] [6-1] 2007-08-22 18:08:24 CDT   ERROR:
 out of memory.
Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
local0.warning] [6-2] 2007-08-22 18:08:24 CDT
DETAIL:  Failed on request of size 536870910.



What I found interesting is that It's ALWAYS the same
size536870910



maintenance_work_mem = 512MB# min 1MB


Apparently this maintenance_work_mem setting is higher than your  
system

can reliably provide.  Knock it back a bit.

regards, tom lane


I'm not so sure.  In my case, the request size was only 256MB and we  
maintain about 10 - 11 GB free of our 16 GB of memory (2GB  
shared_buffers, 42MB work_mem, and 256 MB maintenance_work_mem).  The  
toast table that was involved in the error was pretty small and I was  
able to successfully vacuum it myself virtually instantly.  However,  
in my case, this (so far) being a one time error I don't have much  
more data to contribute.  We constantly monitor and graph our  
system's I/O, CPU, and memory usage and scan our logs for errors so  
if anything else comes up I'll be sure to share.



Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Shelby Cain
I apologize for resending this but my editor in combination with Yahoo's web 
mail interface horribly mangled it...

>- Original Message 
>From: Magnus Hagander <[EMAIL PROTECTED]>
>To: Shelby Cain <[EMAIL PROTECTED]>
>Cc: Alvaro Herrera <[EMAIL PROTECTED]>; Terry Yapt <[EMAIL PROTECTED]>; 
>pgsql-general@postgresql.org
>Sent: Friday, August 24, 2007 1:08:44 AM
>Subject: Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)
>
>Not that wild a guess, really :-) I'd say it's a very good possibility -
>but I have no idea why it'd do that, since all backends load the same
>DLLs at that stage.
>
>//Magnus
>

Assuming this is an issue with shared libraries, I think it would have more to 
do with the way Windows resolves address conflicts on process startup than 
anything caused by explicit calls to LoadLibrary().  Looking at postgres.exe 
with the dependency viewer from Visual Studio 6, I see that the following 
shared library dependencies embedded in the executable image that having 
conflicting base addresses.  If I'm not mistaken, Windows will automatically 
relocate these libraries prior to actual code execution so there would be no 
opportunity for that particular instance of postgres.exe to map the shared 
memory if the address space is already in use by a relocated dll.

libeay32.dll - 0x1000
libiconv-2.dll - 0x1000
libintl-2.dll - 0x1000
ssleay32.dll - 0x1000
comerr32.dll - 0x1c00
krb5_32.dll - 0x1c00

I also found a KB article that addresses ERROR_INVALID_MEMORY being returned 
from MapViewOfFileEx().  

http://support.microsoft.com/kb/125713

The article specifically addresses the concern where multiple processes must 
use the same address for mappings and how to accomplish that under Windows.  
Search for "Addresses of Mapped Views".  The only thing that really gives me 
any pause is the fact the article hasn't been updated past the NT 3.51/Windows 
9x era but the underlying behavior might not have been changed in Windows 
2000/XP/etc.

Regards,

Shelby Cain






   

Boardwalk for $500? In 2007? Ha! Play Monopoly Here and Now (it's updated for 
today's economy) at Yahoo! Games.
http://get.games.yahoo.com/proddesc?gamekey=monopolyherenow

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET,
>> but I think it's mostly just that no one's bothered to rewrite the code
>> for SHM_QUEUE linked lists.  The vast majority of our shmem structures
>> use regular pointers, and have for years.

> Ah, I happened to be recently in that code so I was mislead.

IIRC, the reason for not bothering to change the SHM_QUEUE code (other
than inertia) was that it's a generic linked list package, and so if
it wasn't storing SHMEM_OFFSETs it'd be storing "void *"'s, and so there
didn't seem to be any traction to be gained in terms of compiler error
detection capability.  However, if both you and Alvaro were confused
about the liveness of that coding convention, maybe it'd be worth making
a push to eliminate all trace of MAKE_PTR/MAKE_OFFSET.  TODO for 8.4?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Can tsearch do some basic text mining

2007-08-24 Thread Oleg Bartunov

On Fri, 24 Aug 2007, Phoenix Kiula wrote:


Hi,

We have big blobs of text (average 10,000 characters) in a database,
from which we would like to discover the most often repeated words or
phrases. Can tsearch be used for this kind of pattern search? I
suppose it's Text Mining 101 sort of stuff, nothing complex.


there is stat() function, see 
http://www.sai.msu.su/~megera/wiki/Tsearch_V2_Notes

for more details.
It's not fast, so better to save results in a table



TIA!

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] lc_collate issue

2007-08-24 Thread Cody Pisto

Hi All,

I'm looking for any kind of a reason (and potential workarounds), be it 
bug or otherwise, why the following two queries produce different 
results under a database encoding of UTF8 and lc_collate of en_US.UTF-8:


SELECT x FROM (SELECT 'Something else' AS x UNION SELECT '-SOMETHING 
ELSE-' AS x UNION SELECT 'Somethang' AS x) y ORDER BY LOWER(x)


   x
--
Somethang
-SOMETHING ELSE-
Something else
(3 rows)


*AND*

SELECT x FROM (SELECT 'Somethingelse' AS x UNION SELECT 
'-SOMETHINGELSE-' AS x UNION SELECT 'Somethang' AS x) y ORDER BY LOWER(x)


   x
-
Somethang
Somethingelse
-SOMETHINGELSE-


The removal of spaces from the strings gives "more correct" sorting 
results, with the spaces and '-' characters, '-SOMETHING ELSE-' is 
strangely sorted in the middle?
It does not matter if you use LOWER or UPPER, and the "problem" does not 
occur on databases with encoding SQL_ASCII and lc_collate of C


I have tested this on Postgres 8.1.9, 8.2, 8.2.4 with database encoding 
of UTF8 and lc_collate of en_US.UTF8

and on 7.4.16 with database encoding of SQL_ASCII and lc_collate of C

Thank in advance for any consideration!


-Cody


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Out of Memory - 8.2.4

2007-08-24 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Erik Jones wrote:
> On Aug 24, 2007, at 11:46 AM, Tom Lane wrote:
> 
>> Jeff Amiel <[EMAIL PROTECTED]> writes:
>>> Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
>>> local0.warning] [6-1] 2007-08-22 18:08:24 CDT   ERROR:
>>>  out of memory.
>>> Aug 22 18:08:24 db-1 postgres[16452]: [ID 748848
>>> local0.warning] [6-2] 2007-08-22 18:08:24 CDT
>>> DETAIL:  Failed on request of size 536870910.
>>
>>> What I found interesting is that It's ALWAYS the same
>>> size536870910
>>
>>> maintenance_work_mem = 512MB# min 1MB
>>
>> Apparently this maintenance_work_mem setting is higher than your system
>> can reliably provide.  Knock it back a bit.
>>
>> regards, tom lane
> 
> I'm not so sure.  In my case, the request size was only 256MB and we
> maintain about 10 - 11 GB free of our 16 GB of memory (2GB
> shared_buffers, 42MB work_mem, and 256 MB maintenance_work_mem).  The
> toast table that was involved in the error was pretty small and I was
> able to successfully vacuum it myself virtually instantly.  However, in
> my case, this (so far) being a one time error I don't have much more
> data to contribute.  We constantly monitor and graph our system's I/O,
> CPU, and memory usage and scan our logs for errors so if anything else
> comes up I'll be sure to share.

We are actually diagnosing a similar problem on this end, where we get a
failure at 1920... I am currently trying to get some DEBUG output.

Sincerely,

Joshua D. Drake


> 
> 
> Erik Jones
> 
> Software Developer | Emma®
> [EMAIL PROTECTED]
> 800.595.4401 or 615.292.5888
> 615.292.0777 (fax)
> 
> Emma helps organizations everywhere communicate & market in style.
> Visit us online at http://www.myemma.com
> 
> 
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>   http://archives.postgresql.org/
> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGzx/gATb/zqfZUUQRAg10AJ9bmIUZ8V99vVCDZfWH05PWckf49QCfa4ta
G1daeagQY2CMUR1QDMtuXTQ=
=HxG3
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Out of Memory - 8.2.4

2007-08-24 Thread Jeff Amiel

--- "Joshua D. Drake" <[EMAIL PROTECTED]> wrote:


> We are actually diagnosing a similar problem on this
> end, where we get a
> failure at 1920... I am currently trying to get some
> DEBUG output.

We are actually getting it semi-regularly today (3
times already)I would be happy to provide some
more info if somebody  guides me (just set
log_min_messages to one of the debug settings?)





  

Park yourself in front of a world of choices in alternative vehicles. Visit the 
Yahoo! Auto Green Center.
http://autos.yahoo.com/green_center/ 

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Out of Memory - 8.2.4

2007-08-24 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Jeff Amiel wrote:
> --- "Joshua D. Drake" <[EMAIL PROTECTED]> wrote:
> 
> 
>> We are actually diagnosing a similar problem on this
>> end, where we get a
>> failure at 1920... I am currently trying to get some
>> DEBUG output.
> 
> We are actually getting it semi-regularly today (3
> times already)I would be happy to provide some
> more info if somebody  guides me (just set
> log_min_messages to one of the debug settings?)

Having log_line_prefix with at least %p and %m (or %t) plus a
log_min_messages of DEBUG2 would be great.

Joshua D. Drake

> 
> 
> 
> 
> 
>   
> 
> Park yourself in front of a world of choices in alternative vehicles. Visit 
> the Yahoo! Auto Green Center.
> http://autos.yahoo.com/green_center/ 
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq
> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGzySaATb/zqfZUUQRAqxJAJwL8VcEjDJ1dwQYuvEPh4pORCRUQQCeIwAO
ajfjr7m1bTy9r5DFuNmUP6Y=
=zq4y
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Terry Yapt

Tom Lane escribió:

I'm not sure if you have a specific technical meaning of "clone" in mind
here, but these processes are all executing the identical executable,
and taking care to map the shmem early in execution *before* they load
any DLLs.  So it should work.  Apparently, it *does* work for awhile for
the OP, and then stops working, which is even odder.

  
Yes, the windows system log (application log section) doesn't show any 
error in several days.  Suddenly errors bring back to life and syslog 
errors repeats every few time.  But again errors disappears and return 
in a few hours.  After few hours the system goes out.


Curiosity:
==
On the log lines I have and I sent to the list:   * FATAL:  could not 
reattach to shared memory (key=5432001, addr=01D8): Invalid argument
, this one: "addr=01D8" is always the same in spite of  the system 
have been shutting down and restarted or the error was out for a days.


Greetings.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Tom Lane
Shelby Cain <[EMAIL PROTECTED]> writes:
> Assuming this is an issue with shared libraries, I think it would have more=
>  to do with the way Windows resolves address conflicts on process startup t=
> han anything caused by explicit calls to LoadLibrary().  Looking at postgre=
> s.exe with the dependency viewer from Visual Studio 6, I see that the follo=
> wing shared library dependencies embedded in the executable image that havi=
> ng conflicting base addresses.  If I'm not mistaken, Windows will automatic=
> ally relocate these libraries prior to actual code execution so there would=
>  be no opportunity for that particular instance of postgres.exe to map the =
> shared memory if the address space is already in use by a relocated dll.

But the shmem was originally allocated in the postmaster process, which
is the identical executable with the identical set of linked-in DLLs.
So it's really unclear why the child processes would be unable to
reattach at the same address.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] PG Seg Faults Performing a Query

2007-08-24 Thread Bill Thoen
I'm a bit out of my depth with using these debugging tools and 
interpreting their results, but I think the problem is due to the output 
being just too big for interactive display. Using the same query with 
tighter limits in the WHERE clause works perfectly. When I changed the 
SQL script to write output into a table it worked with the same query 
using even looser limits in the WHERE clause. So sending output to a 
table instead of to the monitor when the queries produce a large amount 
of output is reliable, faster and doesn't tie up the machine.


I tried using strace, but it produced so much telemetry and 
unfortunately I couldn't understand it anyway that I don't think this 
would do me any good. I don't want to bug the PostgreSQL list with a 
problem that's probably not a PostgreSQL one, but if someone here would 
be willing to help me track down this apparent popen or fork problem I'd 
appreciate it. However, I managed to get the results I needed, so we 
could also call this "fixed via workaround."


Thanks for the help, Tom and others!
- Bill Thoen

Tom Lane wrote:

Bill Thoen <[EMAIL PROTECTED]> writes:
  

(gdb) bt
#0  0x003054264571 in fputc () from /lib64/libc.so.6
#1  0x0040dbc2 in print_aligned_text (title=0x0, headers=0x5665d0,
cells=0x2f8fc010, footers=0x557c90,
opt_align=0x557ef0 'l' , "rr", 'l' times>, "rl lll", opt_tuples_only=0 '\0', opt_numeric_locale=0 '\0', 
opt_border=1,

encoding=8, fout=0x0) at print.c:448
#2  0x0040f0eb in printTable (title=0x0, headers=0x5665d0,
cells=0x2f8fc010, footers=0x557c90,
align=0x557ef0 'l' , "rr", 'l' , 
"rl lll", opt=0x7fff3e3be8c0, fout=0x3054442760, flog=0x0) at 
print.c:1551



OK, so the problem is that print_aligned_text is being passed fout = NULL.
Since that wasn't what was passed to printTable, the conclusion must be
that PageOutput() was called and returned NULL --- that is, that its
popen() call failed.  Obviously we should put in some sort of check for
that.  I can see three reasonable responses: either make psql abort
entirely (akin to its out-of-memory behavior), or have it fall back to
not using the pager, either silently or after printing an error
message.  Any thoughts which way to jump?

Meanwhile, the question Bill needs to look into is why popen() is
failing for him.  I'm guessing it's a fork() failure at bottom, but
why so consistent?  strace'ing the psql run might provide some more
info.

regards, tom lane

  



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] PG Seg Faults Performing a Query

2007-08-24 Thread Tom Lane
Bill Thoen <[EMAIL PROTECTED]> writes:
> I'm a bit out of my depth with using these debugging tools and 
> interpreting their results, but I think the problem is due to the output 
> being just too big for interactive display.

Well, I can certainly believe it's related to the amount of data
involved, but the exact relationship is far from clear.  popen()
doesn't do any actual data-pushing, it just sets up a pipe and forks
a child process --- so even if the child fails immediately after being
forked, that wouldn't lead to the problem seen here.  The rarity of
a failure here explains why we hadn't noticed the lack of error checking
long ago.

What I suppose is that you are running into some system-wide resource
constraint.  Exactly which one, and whether it's easy to fix, remain to
be seen.

> I tried using strace, but it produced so much telemetry and 
> unfortunately I couldn't understand it anyway that I don't think this 
> would do me any good.

Sorry, I should have said: the last few dozen lines before the crash are
all that will be interesting.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] FATAL: could not reattach to shared memory (Win32)

2007-08-24 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> "Tom Lane" <[EMAIL PROTECTED]> writes:
>>> There are a few old bits of code that still use MAKE_PTR/MAKE_OFFSET,
>>> but I think it's mostly just that no one's bothered to rewrite the code
>>> for SHM_QUEUE linked lists.  The vast majority of our shmem structures
>>> use regular pointers, and have for years.
>
>> Ah, I happened to be recently in that code so I was mislead.
>
> IIRC, the reason for not bothering to change the SHM_QUEUE code (other
> than inertia) was that it's a generic linked list package, and so if
> it wasn't storing SHMEM_OFFSETs it'd be storing "void *"'s, and so there
> didn't seem to be any traction to be gained in terms of compiler error
> detection capability.  However, if both you and Alvaro were confused
> about the liveness of that coding convention, maybe it'd be worth making
> a push to eliminate all trace of MAKE_PTR/MAKE_OFFSET.  TODO for 8.4?

It would also make using gdb to look at the lock queues a bit less of a pain.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Vivek Khera


On Aug 24, 2007, at 4:09 AM, Alban Hertroys wrote:


I'm not entirely sure what makes multi-threading be advantageous on a
specific operating system, but I think FreeBSD should be added to that
list as well... They've been bench marking their threading support  
using

multi-threading in MySQL (not for the db, mind you - just for load ;),
and it performs really well.


Maybe only for FreeBSD >= 6.0.  Prior to that, the threading was  
rather lackluster.   I still think the separate process model is  
superior, in that you get private data spaces with them.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] lc_collate issue

2007-08-24 Thread Tom Lane
Cody Pisto <[EMAIL PROTECTED]> writes:
> I'm looking for any kind of a reason (and potential workarounds), be it 
> bug or otherwise, why the following two queries produce different 
> results under a database encoding of UTF8 and lc_collate of en_US.UTF-8:

That's just how it is in most non-C locales --- they use some weird
algorithm that's alleged to approximate what dictionary makers
traditionally do with phrases.  I don't recall the details but there's
something about multiple passes with spaces being ignored in earlier
passes.  You'll find that sort(1) sorts these lines the same way.

If you don't like it, use C locale, or put together your own locale
definition.  (No, I'm not sure how hard that is ...)

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] lc_collate issue

2007-08-24 Thread Cody Pisto

Hi Tom,

Thanks for answering,

I pretty much assumed that was the case (whatever library postgres is 
using for encoding is causing the "issue")


The[my] problem is,  it just seems like completely incorrect behavior..

The quickest and dirtiest examples I can do are that both python and 
mysql sort these 3 example strings (in utf8 encoding) the way I would 
expect (like a C locale)



python:

>>> x = [unicode("Somethang", "utf8"), unicode("-SOMETHING ELSE-", 
"utf8"), unicode("Something else", "utf8")]

>>> x
[u'Somethang', u'-SOMETHING ELSE-', u'Something else']
>>> x.sort()
>>> x
[u'-SOMETHING ELSE-', u'Somethang', u'Something else']


mysql:

mysql> SET NAMES 'utf8';
Query OK, 0 rows affected (0.02 sec)

mysql> SELECT x FROM (SELECT 'Something else' AS x UNION SELECT 
'-SOMETHING ELSE-' AS x UNION SELECT 'Somethang' AS x) y ORDER BY LOWER(x);

+--+
| x|
+--+
| -SOMETHING ELSE- |
| Somethang|
| Something else   |
+--+
3 rows in set (0.00 sec)


postgres:

SELECT x FROM (SELECT 'Something else' AS x UNION SELECT '-SOMETHING 
ELSE-' AS x UNION SELECT 'Somethang' AS x) y ORDER BY LOWER(x);

   x
--
Somethang
-SOMETHING ELSE-
Something else
(3 rows)


And I bet oracle, firebird, sqlite, mssql, and everything else out there 
that does utf8 would return it in the "right" order (I'm willing to test 
that too if needed..)



If this is potentially a problem in postgres somewhere, point me in the 
general direction and I'm more than willing to fix it myself..


Thanks for your consideration..


-Cody





Tom Lane wrote:

Cody Pisto <[EMAIL PROTECTED]> writes:
  
I'm looking for any kind of a reason (and potential workarounds), be it 
bug or otherwise, why the following two queries produce different 
results under a database encoding of UTF8 and lc_collate of en_US.UTF-8:



That's just how it is in most non-C locales --- they use some weird
algorithm that's alleged to approximate what dictionary makers
traditionally do with phrases.  I don't recall the details but there's
something about multiple passes with spaces being ignored in earlier
passes.  You'll find that sort(1) sorts these lines the same way.

If you don't like it, use C locale, or put together your own locale
definition.  (No, I'm not sure how hard that is ...)

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings



  


--

*Cody Pisto*
Redzia RVs
10555 Montgomery NE
Suite 80
Albuquerque, NM 87111
Phone: (866) 844-1986
Fax: (866) 204-4403



[GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Matthew
Hi gang,
 In MySQL it is possible to add a column before/after another column. I
have not been able to find such syntax in the Postgres manual. Does this
 ability exist?

Is my only solution to create a new table with the new column, copy the
data, delete the old table and rename the new one?

TIA,
Matthew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Bill Moran
In response to Matthew <[EMAIL PROTECTED]>:

> Hi gang,
>  In MySQL it is possible to add a column before/after another column. I
> have not been able to find such syntax in the Postgres manual. Does this
>  ability exist?

It does not.

> Is my only solution to create a new table with the new column, copy the
> data, delete the old table and rename the new one?

The fact that you're asking this question is a hint that you're using
unsafe coding practices that may bite you at some point.  The order
of columns is not guaranteed to be the same as when you do CREATE
TABLE, it's just coincidence that they usually are.

To get your columns in a specific order, specify the column names in
that order in your SELECT statement.  The SQL standard doesn't provide
for any other way to guarantee column order, and neither does Postgres.

-- 
Bill Moran
http://www.potentialtech.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Matthew
Hey Bill,

> It does not.

Bummer.

> To get your columns in a specific order, specify the column names in
> that order in your SELECT statement.  The SQL standard doesn't provide
> for any other way to guarantee column order, and neither does Postgres.

Yes, I realize this and we do identify our columns during select
statements, but when you look at a table using a tool like phpPGAdmin or
pgAdmin3, the columns are displayed in some defined order.  It's much
easier to see your data/columns in some logical order (ie: all the cost
columns next to each other).

Thanks,
Matthew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Steve Atkins


On Aug 24, 2007, at 2:18 PM, Matthew wrote:


Hey Bill,


It does not.


Bummer.


To get your columns in a specific order, specify the column names in
that order in your SELECT statement.  The SQL standard doesn't  
provide
for any other way to guarantee column order, and neither does  
Postgres.


Yes, I realize this and we do identify our columns during select
statements, but when you look at a table using a tool like  
phpPGAdmin or

pgAdmin3, the columns are displayed in some defined order.  It's much
easier to see your data/columns in some logical order (ie: all the  
cost

columns next to each other).


Using a view might give you what you're looking for:

  abacus=# select * from access_role;
   id | name
  +---
1 | user
2 | admin
  (2 rows)

  abacus=# create view bass_ackwards as select name, id from  
access_role;

  CREATE VIEW
  abacus=# select * from bass_ackwards;
   name  | id
  ---+
   user  |  1
   admin |  2
  (2 rows)

Cheers,
  Steve


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Alexander Staubo
On 8/24/07, Dave Page <[EMAIL PROTECTED]> wrote:
> Alban Hertroys wrote:
> > So actually the remark shouldn't be that "the multi-threaded
> > architecture is only advantageous  on Windows", but more like "the
> > multi-process architecture is disadvantageous on Windows and hence a
> > multi-threaded architecture is preferred (on that particular OS)".
>
> Yeah - but I'm not sure thats necessarily something that should have a
> place on a bullet point comparison.

You solve it by letting the category be labeled "Process/thread
architecture", and letting the columns say "Thread-based" for Firebird
and "Process-based" for PostgreSQL respectively.

Then add a footnote about this row that explains the difference and
how they behave on the different operating systems.

PostgreSQL clearly is at a slight disadvantage on Windows, so this is
relevant, at least to Tony's Delphi-user demographic.

Alexander.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] lc_collate issue

2007-08-24 Thread Tom Lane
Cody Pisto <[EMAIL PROTECTED]> writes:
> If this is potentially a problem in postgres somewhere, point me in the 
> general direction and I'm more than willing to fix it myself..

You seem not to have absorbed what I said.  This *is* the correct result
according to that locale's definition of sorting.  You can demonstrate
that without any use of Postgres:

[EMAIL PROTECTED] ~]$ cat fooey
Somethang
-SOMETHING ELSE-
Something else
[EMAIL PROTECTED] ~]$ LANG=C sort fooey
-SOMETHING ELSE-
Somethang
Something else
[EMAIL PROTECTED] ~]$ LANG=en_US sort fooey
Somethang
Something else
-SOMETHING ELSE-
[EMAIL PROTECTED] ~]$ 

If you prefer C sort ordering, run Postgres in C locale.  It's as
simple as that.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] PostgreSQL vs Firebird feature comparison finished

2007-08-24 Thread Thomas Kellerer

Alexander Staubo wrote on 24.08.2007 23:49:

So actually the remark shouldn't be that "the multi-threaded
architecture is only advantageous  on Windows", but more like "the
multi-process architecture is disadvantageous on Windows and hence a
multi-threaded architecture is preferred (on that particular OS)".

Yeah - but I'm not sure thats necessarily something that should have a
place on a bullet point comparison.


You solve it by letting the category be labeled "Process/thread
architecture", and letting the columns say "Thread-based" for Firebird
and "Process-based" for PostgreSQL respectively.


To my understanding Firebird offers a choice between thread-based (Superserver) 
and process-based (Classic) at least on the Windows platform.


Thomas


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Robert Treat
On Friday 24 August 2007 17:18, Matthew wrote:
> Hey Bill,
>
> > It does not.
>
>   Bummer.
>
> > To get your columns in a specific order, specify the column names in
> > that order in your SELECT statement.  The SQL standard doesn't provide
> > for any other way to guarantee column order, and neither does Postgres.
>
>   Yes, I realize this and we do identify our columns during select
> statements, but when you look at a table using a tool like phpPGAdmin or
> pgAdmin3, the columns are displayed in some defined order.  It's much
> easier to see your data/columns in some logical order (ie: all the cost
> columns next to each other).
>

FWIW (and it isnt much) we had the semblence of a patch and an actual plan for 
implementing this, but no one got interested enough to finish it for 8.3.  
Should you happen to know an ambituous C hacker, there is a good chance it 
could be included in 8.4. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] lc_collate issue

2007-08-24 Thread Cody Pisto

Hi Tom,

I did understand what you said, I apologize that it came out otherwise.

I'm just looking for the correct workaround.

If initdb was done with a C locale, and thus lc_collate and friends 
where all C, but the database and client encoding was set to UTF-8, 
would postgres convert data on the fly from UTF-8(storage) to ASCII for 
sorting or would things just blow up when a >1 byte character hit the mix?


The docs say bad things would happen:

http://www.postgresql.org/docs/8.2/static/multibyte.html
   *Important: * Although you can specify any encoding you want for a 
database, it is unwise to choose an encoding that is not what is 
expected by the locale you have selected. The LC_COLLATE and LC_CTYPE 
settings imply a particular encoding, and locale-dependent operations 
(such as sorting) are likely to misinterpret data that is in an 
incompatible encoding.



Right now for me ORDER BY LOWER(ASCII(column)), LOWER(column) (or some 
variation there of) works, but is there a better workaround?



Thanks,


-Cody

Tom Lane wrote:

Cody Pisto <[EMAIL PROTECTED]> writes:
  
If this is potentially a problem in postgres somewhere, point me in the 
general direction and I'm more than willing to fix it myself..



You seem not to have absorbed what I said.  This *is* the correct result
according to that locale's definition of sorting.  You can demonstrate
that without any use of Postgres:

[EMAIL PROTECTED] ~]$ cat fooey
Somethang
-SOMETHING ELSE-
Something else
[EMAIL PROTECTED] ~]$ LANG=C sort fooey
-SOMETHING ELSE-
Somethang
Something else
[EMAIL PROTECTED] ~]$ LANG=en_US sort fooey
Somethang
Something else
-SOMETHING ELSE-
[EMAIL PROTECTED] ~]$ 


If you prefer C sort ordering, run Postgres in C locale.  It's as
simple as that.

regards, tom lane



  


--

*Cody Pisto*
Redzia RVs
10555 Montgomery NE
Suite 80
Albuquerque, NM 87111
Phone: (866) 844-1986
Fax: (866) 204-4403



[GENERAL] connect by service name in psql

2007-08-24 Thread Stuart
I can't find an option in psql to connect to a database 
by "service" name?  Should there be one?




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[GENERAL] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Benjamin Arai

Hi,

I have an application which loads millions of NEW documents each month
into a PostgreSQL tsearch2 table.  I have the initial version completed
and searching performance is great but my problem is that each time a  
new

month rolls around I have to drop all the indexes do a COPY and re-index
the entire table. This is problematic considering that each month takes
longer than the previous to rebuild the indexes and the application in
unavailable during the rebuilding process.

In order to avoid the re-indexing I was thinking of instead creating  
a new

table each month (building its indexes and etc) and accessing them all
through a view. This way I only have to index the new data each month.

Does this work?  Does a view with N tables make it N times slower for
tsearch2 queries? Is there a better solution?

Benjamin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Brandon Shalton

Benjamin,




In order to avoid the re-indexing I was thinking of instead creating  a 
new

table each month (building its indexes and etc) and accessing them all
through a view. This way I only have to index the new data each month.



Take a look at bizgres.org (based on postgres).

They have a parent-child structure.

The way i use it, is I have about 30M records a day that are inserted into 
the database.


Each day is a "child" table to the "parent".

so example:

the parent table is called  "logfile"

each day, is a child, with the structure like  "logfile_YYMMDD"

the "child" inherits the table structure of the parent, such that you could 
query the child table name directly, or you run the query against the parent 
(ie. logfile table) and get all the data.


the indexes are done on a per table basis, so new data that comes in, is a 
lesser amount, and doesn't require re-indexing.



example:

select * from logfile_070825 where datafield = 'foo'

if i knew i wanted to specifically go into that child, or:

select * from logfile where datafield = 'foo'

and all child tables are searched and results merged.  You can perform any 
kind of sql query and field structures are you normally do.


the downside is that the queries are run sequentially.

so if you had 100 child tables, each table is queried via indexes, then 
results are merged.


but, this approach does allow me to dump alot of data in, without having the 
re-indexing issues you are facing.


at some point, you could roll up the days, in to weekly child tables, then 
monthly tables, etc.


I believe Bizgres has a new version of their system that does parallel 
queries which would certainly speed things up.


For your documents, you can do it by the day it was checked in, or maybe you 
have some other way of logically grouping, but the parent/child table 
structure really helped to solve my problem of adding in millions of records 
each day.


The closest thing in mysql is using merge tables, which is not really 
practical when it comes time to do the joins to the tables.


-brandon

http://www.t3report.com - marketing intelligence for online marketing and 
affiliate programs











---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Can tsearch do some basic text mining

2007-08-24 Thread Phoenix Kiula
On 25/08/07, Oleg Bartunov <[EMAIL PROTECTED]> wrote:
> On Fri, 24 Aug 2007, Phoenix Kiula wrote:
>
> > Hi,
> >
> > We have big blobs of text (average 10,000 characters) in a database,
> > from which we would like to discover the most often repeated words or
> > phrases. Can tsearch be used for this kind of pattern search? I
> > suppose it's Text Mining 101 sort of stuff, nothing complex.
>
> there is stat() function, see
> http://www.sai.msu.su/~megera/wiki/Tsearch_V2_Notes
> for more details.
> It's not fast, so better to save results in a table



Thanks. This seems to give words only. How about phrases? If words are
so slow, I shudder to think how long phrase analysis would take -- it
that is possible at all?

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] lc_collate issue

2007-08-24 Thread Tom Lane
Cody Pisto <[EMAIL PROTECTED]> writes:
> If initdb was done with a C locale, and thus lc_collate and friends 
> where all C, but the database and client encoding was set to UTF-8, 
> would postgres convert data on the fly from UTF-8(storage) to ASCII for 
> sorting or would things just blow up when a >1 byte character hit the mix?

No, C locale just sorts the bytes.  It won't "blow up".  Whether it will
give you a sort ordering you like for multibyte characters is a
different question.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Brandon Shalton wrote:
> Benjamin,
> 
> 
>>
>> In order to avoid the re-indexing I was thinking of instead creating 
>> a new
>> table each month (building its indexes and etc) and accessing them all
>> through a view. This way I only have to index the new data each month.
>>
> 
> Take a look at bizgres.org (based on postgres).
> 
> They have a parent-child structure.
> 
> The way i use it, is I have about 30M records a day that are inserted
> into the database.
> 
> Each day is a "child" table to the "parent".
> 
> so example:
> 
> the parent table is called  "logfile"
> 
> each day, is a child, with the structure like  "logfile_YYMMDD"
> 
> the "child" inherits the table structure of the parent, such that you
> could query the child table name directly, or you run the query against
> the parent (ie. logfile table) and get all the data.
> 
> the indexes are done on a per table basis, so new data that comes in, is
> a lesser amount, and doesn't require re-indexing.


PostgreSQL can do all of this too.

Sincerely,

Joshua D. Drake

> 
> 
> example:
> 
> select * from logfile_070825 where datafield = 'foo'
> 
> if i knew i wanted to specifically go into that child, or:
> 
> select * from logfile where datafield = 'foo'
> 
> and all child tables are searched and results merged.  You can perform
> any kind of sql query and field structures are you normally do.
> 
> the downside is that the queries are run sequentially.
> 
> so if you had 100 child tables, each table is queried via indexes, then
> results are merged.
> 
> but, this approach does allow me to dump alot of data in, without having
> the re-indexing issues you are facing.
> 
> at some point, you could roll up the days, in to weekly child tables,
> then monthly tables, etc.
> 
> I believe Bizgres has a new version of their system that does parallel
> queries which would certainly speed things up.
> 
> For your documents, you can do it by the day it was checked in, or maybe
> you have some other way of logically grouping, but the parent/child
> table structure really helped to solve my problem of adding in millions
> of records each day.
> 
> The closest thing in mysql is using merge tables, which is not really
> practical when it comes time to do the joins to the tables.
> 
> -brandon
> 
> http://www.t3report.com - marketing intelligence for online marketing
> and affiliate programs
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 6: explain analyze is your friend
> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGz4XuATb/zqfZUUQRAukhAJ9b2x4PLPZsoPmtm3O/Ze4AobDXngCgq+rl
X2j2ePDyjYxRajfGCVmjnYU=
=pIjb
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] connect by service name in psql

2007-08-24 Thread Stuart

"Stuart" wrote in message news:[EMAIL PROTECTED]:
> I can't find an option in psql to connect to a database 
> by "service" name?  Should there be one?

Sorry for folowing up my own post but since I got asked in 
email, and since I wrote code to do essentially the same thing
before noticing it in the PG docs...

By "service" I am refering to the pg_service.conf file described in
http://www.postgresql.org/docs/8.2/interactive/libpq-connect.html 
(see the "service" parameter) and 
http://www.postgresql.org/docs/8.2/interactive/libpq-pgservice.html







---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] lc_collate issue

2007-08-24 Thread Tatsuo Ishii
> Cody Pisto <[EMAIL PROTECTED]> writes:
> > If initdb was done with a C locale, and thus lc_collate and friends 
> > where all C, but the database and client encoding was set to UTF-8, 
> > would postgres convert data on the fly from UTF-8(storage) to ASCII for 
> > sorting or would things just blow up when a >1 byte character hit the mix?
> 
> No, C locale just sorts the bytes.  It won't "blow up".  Whether it will
> give you a sort ordering you like for multibyte characters is a
> different question.

Yup.

For example, LATIN1 part of UTF-8 (UNICODE) is physicaly ordered same
as ISO 8859-1. So if you see the order of ISO 8859-1 is "natural",
then the sort order of UTF-8 is ok as well. However the order of CJK
part of UTF-8 is totally different from the original charcater sets
(almost random), you need to use convert() for converting UTF-8 to
original encoding to get "natural" sort order. I don't think you are
interested in CJK part, though.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] connect by service name in psql

2007-08-24 Thread Tom Lane
"Stuart" <[EMAIL PROTECTED]> writes:
>> I can't find an option in psql to connect to a database 
>> by "service" name?  Should there be one?

In 8.3 libpq will have code that interprets what's usually the database
name parameter as a "conninfo" string if there's an "=" in it, which
means you can get at the service name as well as a lot of other arcane
options even if the calling application (such as psql) doesn't know they
exist.

For instance,

$ psql "service = foo"
psql: ERROR: service file "/home/postgres/version83/etc/pg_service.conf" not 
found

(I didn't actually bother to make a service file, but you can see that
the code knew what it was supposed to do.)

I suppose this is more awkward than something like "psql -S foo", but
we're unlikely to want to keep putting connection-related switches into
psql and all the other client apps ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] connect by service name in psql

2007-08-24 Thread Stuart
"Tom Lane" wrote in message news:[EMAIL PROTECTED]:
> "Stuart" <[EMAIL PROTECTED]> writes:
> >> I can't find an option in psql to connect to a database 
> >> by "service" name?  Should there be one?
> 
> In 8.3 libpq will have code that interprets what's usually the database
> name parameter as a "conninfo" string if there's an "=" in it, [...]

Thank you, that's good news!



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Erik Jones


On Aug 24, 2007, at 4:18 PM, Matthew wrote:


Hey Bill,


It does not.


Bummer.


To get your columns in a specific order, specify the column names in
that order in your SELECT statement.  The SQL standard doesn't  
provide
for any other way to guarantee column order, and neither does  
Postgres.


Yes, I realize this and we do identify our columns during select
statements, but when you look at a table using a tool like  
phpPGAdmin or

pgAdmin3, the columns are displayed in some defined order.  It's much
easier to see your data/columns in some logical order (ie: all the  
cost

columns next to each other).


PHP, eh?  Do you know it?  Given that Postgres doesn't provide that  
convenience, perhaps you could add the functionality to phpPGAdmin.   
It is open source.  If you're not up on PHP or, well, just don't feel  
like it, make a feature request there.  Believe me, it's already been  
made here.  The usual answer, or argument against, is because the  
standard dictates that the order of attributes in rows returned by  
queries is undefined in the absence a specified (in the query) ordering.


Although it occurs to me that, while that answer is correct and  
justified, that's not the answer to what most people actually want.   
What they want is that when they look at a table they see a  
particular order.  Has anybody suggested or discussed adding an  
optional view, or description, ordering table meta-attribute?



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

2007-08-24 Thread Benjamin Arai

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

This kind of disappointing, I was hoping there was more that could be  
done.


There has to be another way to do incremental indexing without  
loosing that much performance.


Benjamin

On Aug 24, 2007, at 6:29 PM, Joshua D. Drake wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Brandon Shalton wrote:

Benjamin,




In order to avoid the re-indexing I was thinking of instead creating
a new
table each month (building its indexes and etc) and accessing  
them all
through a view. This way I only have to index the new data each  
month.




Take a look at bizgres.org (based on postgres).

They have a parent-child structure.

The way i use it, is I have about 30M records a day that are inserted
into the database.

Each day is a "child" table to the "parent".

so example:

the parent table is called  "logfile"

each day, is a child, with the structure like  "logfile_YYMMDD"

the "child" inherits the table structure of the parent, such that you
could query the child table name directly, or you run the query  
against

the parent (ie. logfile table) and get all the data.

the indexes are done on a per table basis, so new data that comes  
in, is

a lesser amount, and doesn't require re-indexing.



PostgreSQL can do all of this too.

Sincerely,

Joshua D. Drake




example:

select * from logfile_070825 where datafield = 'foo'

if i knew i wanted to specifically go into that child, or:

select * from logfile where datafield = 'foo'

and all child tables are searched and results merged.  You can  
perform

any kind of sql query and field structures are you normally do.

the downside is that the queries are run sequentially.

so if you had 100 child tables, each table is queried via indexes,  
then

results are merged.

but, this approach does allow me to dump alot of data in, without  
having

the re-indexing issues you are facing.

at some point, you could roll up the days, in to weekly child tables,
then monthly tables, etc.

I believe Bizgres has a new version of their system that does  
parallel

queries which would certainly speed things up.

For your documents, you can do it by the day it was checked in, or  
maybe

you have some other way of logically grouping, but the parent/child
table structure really helped to solve my problem of adding in  
millions

of records each day.

The closest thing in mysql is using merge tables, which is not really
practical when it comes time to do the joins to the tables.

-brandon

http://www.t3report.com - marketing intelligence for online marketing
and affiliate programs










---(end of  
broadcast)---

TIP 6: explain analyze is your friend




- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/ 
donate

PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGz4XuATb/zqfZUUQRAukhAJ9b2x4PLPZsoPmtm3O/Ze4AobDXngCgq+rl
X2j2ePDyjYxRajfGCVmjnYU=
=pIjb
-END PGP SIGNATURE-



-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (Darwin)

iQIVAwUBRs+/UfyqRf6YpodNAQL6Xg//eEqR0UQ4I/snn7Dtmkru40jCuECGeG8g
XoxLWEa+bumVgwrEYbjKTBp3KP6OEKz9VV4xHQROTtqxh+rg0hdoc0kWxSyquCm8
VljL24ykvBmRmjhacwi8FKp092zwRcLrbkzTxIr90q8u008aVPWxQCBtmfL6QVTv
I9AyN0kb00ypx+B9I2ySugYzBerVCMUiKUeXplHWn1loSSm1w+5CzXY8gtvivFEV
YspS1Fk2rxjnjlPE/FTGUiwJrdWZTJrd3BuSVbH5DWBoCjz9gzq0NyNZAtESWX2H
oGwlWBEJNFTtoHnK4iTMS+CzKHQQQZ9ZuQcHy84SlXYUo9n0/NCIeabu2xaj44Fs
LFq8jBCH3ebAkD/hQOgk1H05ljbfX8A/u2zz75W1NbD0xTB/sAljWqhypz2x7pOo
sUJF9MQ7DwVG8JitUAAc5fuGpLLR4WxF68YdkgycaCNknP7IATeD2ecqJkC26Av+
GHHci2ct5ypVq9Qq8OuesYSox7XpO2+E+Y5DtgBo+/R7eOJRLA3Z0FDXFLGsdFxy
0OKoew1MN79jP+KMZFJwvddH/TrkZBdIKlkacXYwUHU3c1ATwne6WteKTnEmr2aP
99oQgfmNDyQgTeEL20jokF4YZOdm1UO3Cc7wTi2QlwyqUDbUmYtWzgbS9QbnaGGA
58XdVacGznw=
=Hst4
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Jaime Casanova
On 8/24/07, Robert Treat <[EMAIL PROTECTED]> wrote:
> On Friday 24 August 2007 17:18, Matthew wrote:
> > Hey Bill,
> >
> > > It does not.
> >
> >   Bummer.
> >
> > > To get your columns in a specific order, specify the column names in
> > > that order in your SELECT statement.  The SQL standard doesn't provide
> > > for any other way to guarantee column order, and neither does Postgres.
> >
> >   Yes, I realize this and we do identify our columns during select
> > statements, but when you look at a table using a tool like phpPGAdmin or
> > pgAdmin3, the columns are displayed in some defined order.  It's much
> > easier to see your data/columns in some logical order (ie: all the cost
> > columns next to each other).
> >
>
> FWIW (and it isnt much) we had the semblence of a patch and an actual plan for
> implementing this, but no one got interested enough to finish it for 8.3.
> Should you happen to know an ambituous C hacker, there is a good chance it
> could be included in 8.4.
>

really? i don't remember it? can you point me where that patch is?, i
can't find it, either

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
   Richard Cook

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Add Column BEFORE/AFTER another column

2007-08-24 Thread Tom Lane
Erik Jones <[EMAIL PROTECTED]> writes:
> ... The usual answer, or argument against, is because the  
> standard dictates that the order of attributes in rows returned by  
> queries is undefined in the absence a specified (in the query) ordering.

I don't think this is true.  The spec is explicit that *row* order is
indeterminate without an ORDER BY clause, eg SQL92 section 20.2 GR4:

 4) If an  is not specified, then the ordering of
the rows of Q is implementation-dependent.

However, *column* order seems always respected, eg 7.9 rule 3 saith
about SELECT * :

b) Otherwise, the  "*" is equivalent to a  sequence in which each  is a
   that references a column of T and each
  column of T is referenced exactly once. The columns are ref-
  erenced in the ascending sequence of their ordinal position
  within T.

Another example is that section 11.11 is crystal clear that ALTER TABLE
ADD COLUMN puts the new column at the end:

 4) In all other respects, the specification of a  in an  has the same effect
as specification of the  in the  for T would have had. In particular, the degree of T
is increased by 1 (one) and the ordinal position of that column
is equal to the new degree of T as specified in the General
Rules of Subclause 11.4, "".


AFAIK the convention that "using SELECT * in code is a bad idea" is
folklore born from years of maintaining real applications; it's surely
good advice, but it's not grounded in any vagueness of the spec.

As far as real PG plans go, I've lost track of the number of requests
we've had for the ability to adjust column order.  To do this in a
reasonably sane fashion, we have to separate logical from physical
column order; and the reason that's not been tackled is fear of bugs
from using the wrong kind of column number in the wrong place.  It'll
probably get done someday, but my advice would be to stay away from
the first major release after it happens ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly