Re: [GENERAL] pgAdmin error

2008-11-22 Thread Dave Page
On Sat, Nov 22, 2008 at 6:35 AM, David [EMAIL PROTECTED] wrote:
 I am trying to use pgAdmin 1.8.4 to edit the pg_hba.conf file on a
 PostgreSQL 8.3 database running on  Ubuntu 8.10.  I get the following
 error message:

 An error has occurred:

 ERROR: absolute path not allowed
 CONTEXT: SQL function pg_file_length statement 1

 then...

 Backend Access Configuration Editor -
 /etc/postresql/8.3/main/pg_hba.conf on Kepler...

 with a blank screen.

Hmm - PostgreSQL won't allow us to access files outside of the data
directory. IIRC, it does that by insisting that all paths are
relative and don't contain '..'.

I assume your data is not in /etc/postgresql/8.3/main - just your config?


-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com



Re: [GENERAL] Postgres mail list traffic over time

2008-11-22 Thread Peter Eisentraut
On Friday 21 November 2008 19:10:45 Tom Lane wrote:
 Yeah, I think this is most probably explained by repeat postings
 of successive versions of large patches.  Still, Ron might be on to
 something.  I had not considered message lengths in my previous
 numbers ...

Also consider that since we started using the wiki for tracking patches, a lot
of trivial emails like "your patch has been added to the queue" and "where
are we on this" have disappeared.



Re: [GENERAL] pgAdmin error

2008-11-22 Thread Magnus Hagander
Dave Page wrote:
 On Sat, Nov 22, 2008 at 6:35 AM, David [EMAIL PROTECTED] wrote:
 I am trying to use pgAdmin 1.8.4 to edit the pg_hba.conf file on a
 PostgreSQL 8.3 database running on  Ubuntu 8.10.  I get the following
 error message:

 An error has occurred:

 ERROR: absolute path not allowed
 CONTEXT: SQL function pg_file_length statement 1

 then...

 Backend Access Configuration Editor -
 /etc/postresql/8.3/main/pg_hba.conf on Kepler...

 with a blank screen.
 
 Hmm - PostgreSQL won't allow us to access files outside of the data
 directory. iirc, it does that by insisting that all paths are
 relative, and don't contain ..
 
 I assume your data is not in /etc/postgresql/8.3/main - just your config?

That's the default location on Debian. IIRC it also means that the
database server doesn't have write permission on those files, so
even if it allowed opening them it would not be possible to make
modifications from pgAdmin.

Bottom line - you can't edit your PostgreSQL configuration remotely on
Debian.

//Magnus




Re: [GENERAL] Postgres mail list traffic over time

2008-11-22 Thread Stefan Kaltenbrunner

Alvaro Herrera wrote:

Sam Mason wrote:


the following has links to more:

  http://markmail.org/search/?q=list:org.postgresql


Wow, the spanish list is the 3rd in traffic after hackers and general!


Yeah, and that Tom Lane guy has sent over 77000(!!!) mails to the lists up to
now ...



Stefan



Re: [GENERAL] Postgres mail list traffic over time

2008-11-22 Thread Bruce Momjian
Ron Mayer wrote:
 Joshua D. Drake wrote:
  On Fri, 2008-11-21 at 08:18 -0800, Ron Mayer wrote:
  Bruce Momjian wrote:
  Tom Lane wrote:
  ... harder to keep
  up with the list traffic; so something is happening that a simple
  volume count doesn't capture.
  If measured in bytes of the gzipped mbox it ...
  
  Its because we eliminated the -patches mailing list.
 
 That's part of it.  I've added -patches to the graph at
 http://0ape.com/postgres_mailinglist_size/ as well as
 a graph of hackers+patches combined; and it still looks
 like hackers+patches is quite high in the past 3 months.
 
 With hackers+patches it looks like 2002-08 was the biggest
 month; but the past 3 months still look roughly twice
 late 2007's numbers.

Can someone graph CVS traffic, showing the historical number of commits
and the number of changed lines?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [GENERAL] Postgres mail list traffic over time

2008-11-22 Thread Magnus Hagander
Bruce Momjian wrote:
 Ron Mayer wrote:
 Joshua D. Drake wrote:
 On Fri, 2008-11-21 at 08:18 -0800, Ron Mayer wrote:
 Bruce Momjian wrote:
 Tom Lane wrote:
 ... harder to keep
 up with the list traffic; so something is happening that a simple
 volume count doesn't capture.
 If measured in bytes of the gzipped mbox it ...
 Its because we eliminated the -patches mailing list.
 That's part of it.  I've added -patches to the graph at
 http://0ape.com/postgres_mailinglist_size/ as well as
 a graph of hackers+patches combined; and it still looks
 like hackers+patches is quite high in the past 3 months.

 With hackers+patches it looks like 2002-08 was the biggest
 month; but the past 3 months still look roughly twice
 late 2007's numbers.
 
 Can someoone graph CVS traffic, showing the historical number of commits
 and number of changed lines?

Ohloh has some graphs, are they detailed enough?
http://www.ohloh.net/projects/postgres/analyses/latest

//Magnus



Re: [GENERAL] Database access over the Internet...

2008-11-22 Thread Bayless Kirtley
Michelle, I don't think the list is going to change its operations for 
one disgruntled user. Since you seem unwilling or unable to employ 
the advice already given, maybe your only acceptable option is to 
unsubscribe from the list. At least that would eliminate much of the 
noise that currently is cluttering everyone's inboxes.



- Original Message - 
From: Michelle Konzack [EMAIL PROTECTED]

To: pgsql-general@postgresql.org
Sent: Sunday, November 16, 2008 4:41 AM
Subject: Re: [GENERAL] Database access over the Internet...





Re: [GENERAL] Database access over the Internet...

2008-11-22 Thread Scott Ribe
 You did not understand the problem

1) You did not explain the problem.

2) It's your problem, not ours. Your demand that thousands of other people
adapt to your unusual problem is absurdly self-absorbed. Get decent email
service, then subscribe from there. (Or go away. Either works for the
members of this list, so it's really your choice.)


-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 722-0567 voice





[GENERAL] SQL query

2008-11-22 Thread Michael Thorsen
I am running Postgres (8.2.11) on Windows.

I have 2 tables, one with users and one with locations.

user_table
----------
user_id  user_code  price  value
1        2          45.23  -97.82
2        3          42.67  -98.32
3        4          35.56  -76.32

locations
---------
id  code  price  value
1   2     45.23  -97.82
2   3     42.67  -98.32
3   4     43.26  -98.65

I have a query that takes every user and looks into locations to see if the
code, price and value match up. If they do then count it.

select count(*)
  from user_table u, locations l
 where u.user_code = l.code
and u.price = l.price
and u.value = l.value;

The answer to this should be 2, but when I run my query I get 4 (in fact,
more rows than are in user_table), which is incorrect. What am I doing
wrong? I have been racking my brain over this for a while. Is there
some other query to get the required results? Any help would be highly
appreciated. I gave a simple example above, but the real query runs over 2 tables
with about a million entries in each, so I am unable to verify what is
wrong; I just know the count is incorrect, as I should not have more rows than
are in the user_table.

Thanks,
Michael


[GENERAL] how to cite postgresql?

2008-11-22 Thread Tomas Lanczos
I am using Postgresql to store all my research related data. At the 
moment I am just finishing my PhD thesis and I want to cite postgresql 
correctly but can't find how to do it. Could somebody give me an advice?


Many thanks

tomas



[GENERAL] PgAdminIII - RAISE WARNING prints in one line

2008-11-22 Thread Michal Szymanski
Hi,
When I call a pl/pgsql procedure in Postgres 8.3 from pgAdmin III,
everything is printed on one line:

WARNING:  Odczytany token'+48'WARNING:  v_processed_strb-
zWARNING:  Odczytany tokenb-zWARNING:  v_processed_strWARNING:
Odczytany tokenWARNING:  v_processed_str

Total query runtime: 16 ms.
1 rows returned.

It happens on all our computers and it started after the migration to
8.3. It is a small thing, but sometimes it makes our procedures hard to
debug.

Michal Szymanski
http://blog.szymanskich.net



[GENERAL] Best way to import a plpythonu module

2008-11-22 Thread Chris McDonald
I am using plpythonu on linux version postgresql-plpython-8.2.9-1.fc7.

Consider a python class called Wibble which is written into a python
module wibble.py.
I wish to use Wibble from within a plpythonu procedure.

If I simply do:

from wibble import Wibble

then I am told the object was not found (presumably because Postgres
does not know where the module is).

So as far as I can see, my available options to use class Wibble are:

1. Add code to modify the search path before my import statement. This
allows me to find wibble for the duration of my session I think

2. Set the environment variable PYTHONPATH in the postgres user's
.bash_profile to add a directory where I can put my module - then restart
PostgreSQL. This does not seem particularly safe to me, though. I also
wonder whether I would need to restart PostgreSQL to reload a modified
module, or whether the module is freshly read whenever a new reference
is made to it.

Is there a best-practice approach here?
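
For what it's worth, a minimal sketch of option 1, adjusting sys.path inside
the function itself. The directory /usr/local/lib/pgpython and the describe()
method are made-up placeholders, not anything Postgres provides:

CREATE OR REPLACE FUNCTION wibble_demo() RETURNS text AS $$
    import sys
    # hypothetical directory holding wibble.py -- adjust to your layout
    mod_dir = '/usr/local/lib/pgpython'
    if mod_dir not in sys.path:
        sys.path.append(mod_dir)
    from wibble import Wibble
    w = Wibble()
    # describe() is a made-up method; call whatever Wibble actually provides
    return w.describe()
$$ LANGUAGE plpythonu;

As far as I know, the embedded interpreter caches imported modules for the
life of the backend, so reconnecting (rather than restarting the whole
server) should be enough to pick up a modified module.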




[GENERAL] Connecting to old 7.1 Database

2008-11-22 Thread Andy Greensted

Hi All,

I need to connect to a version 7.1 PostgreSQL database. Unfortunately, I 
cannot get the 7.1.3 source to compile. configure gives this error:


checking types of arguments for accept()... configure: error: could not 
determine argument types


I'm not posting to pgsql-bugs because I think this is due to my
having a much newer environment (Gentoo Linux) than was around when
7.1.3 was released.


So, two questions:

- Is there any way to run a newer version (8.3.5) of psql in some sort of
'backwards-compatible' mode?


- Do you have any tips on making 7.1.3 compile on a newer system?

Many thanks
Andy



[GENERAL] strange commit behavior

2008-11-22 Thread Flavio Palumbo
Hi all,

I'm developing a little tool in Java that manages database updates through
external text files.

In this tool there is an option that allows the user to accept a defined number
of errors during the update and still save the well-formed data.

To do this I start commitment control when the process begins and, at the end
of the process, if there are no errors or the threshold has not been reached I
commit the data, otherwise I roll back.

I tested this tool under MySQL and Oracle and everything went as expected.

Unfortunately Postgres seems to work in a different way, because if there is an
error in the transaction (between starting commitment control and the commit) the
whole set of updates is discarded; there is no way to commit the well-formed
records inserted (earlier or later).

Is this right?

Is there a way or a workaround to achieve my goal?

Any hint would be appreciated.

Flavio
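
A possible workaround: PostgreSQL supports savepoints inside a transaction,
and the JDBC driver exposes them via Connection.setSavepoint() and
Connection.rollback(Savepoint). Rolling back to a savepoint discards only the
failed statement, so the rest of the batch can still be committed. A minimal
psql-level sketch of the idea (table and values are made up):

BEGIN;
INSERT INTO target VALUES (1, 'good row');

SAVEPOINT before_row2;
INSERT INTO target VALUES (1, 'bad row');      -- fails (e.g. duplicate key)
ROLLBACK TO SAVEPOINT before_row2;             -- discards only this row

INSERT INTO target VALUES (2, 'another good row');
COMMIT;                                        -- rows 1 and 2 are kept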



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Michal Szymanski
On 21 Lis, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun)
wrote:
     Hello all!

     I would like to ask some advice about the following problem
 (related to the Dehems project:http://www.dehems.eu/):
     * there are some clients; (the clients are in fact house holds;)
     * each device has a number of sensors (about 10), and not all the
 clients have the same sensor; also sensors might appear and disappear
 dynamicaly; (the sensors are appliances;)
     * for each device and each sensor a reading is produced (at about
 6 seconds); (the values could be power consumptions;)
     * I would like to store the following data: (client, sensor,
 timestamp, value);
     * the usual queries are:
         * for a given client (and sensor), and time interval, I need
 the min, max, and avg of the values;
         * for a given time interval (and sensor), I need min, max, and
 avg of the values;
         * other statistics;

How many devices do you expect?
As I understand it, the expected number of inserts is more or less:
no. of devices * no. of sensors (about 10)
every 6 seconds. Assume you have 100 devices: that means 1000
inserts per 6 s, i.e. about 166 inserts per second.

     * inserts are done like this:
         * generated 100 million readings by using the following rule:
             * client is randomly chosen between 0 and 10 thousand;
             * sensor is randomly chosen between 0 and 10;
             * the timestamp is always increasing by one;
         * the insert is done in batches of 500 thousand inserts (I've
 also tried 5, 25, 50 and 100 thousand without big impact);
         * the banch inserts are done through COPY sds_benchmark_data
 FROM STDIN through libpq (by using UNIX (local) sockets);

     What have I observed / tried:
     * I've tested without the primary key and the index, and the
 results were the best for inserts (600k inserts / s), but the
 readings, worked extremly slow (due to the lack of indexing);
     * with only the index (or only the primary key) the insert rate is
 good at start (for the first 2 million readings), but then drops to
 about 200 inserts / s;


Try periodically executing REINDEX on your index, and ANALYZE on
your table. To be honest it should not influence inserts, but it will
influence selects.
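
Concretely, using the benchmark table name mentioned elsewhere in the thread,
that would be something like:

REINDEX TABLE sds_benchmark_data;   -- or REINDEX INDEX <index_name> for one index
ANALYZE sds_benchmark_data;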


Michal Szymanski
http://blog.szymanskich.net



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
On Fri, Nov 21, 2008 at 3:12 PM, Michal Szymanski [EMAIL PROTECTED] wrote:
 On 21 Lis, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun)
 wrote:
 Hello all!

 I would like to ask some advice about the following problem
 (related to the Dehems project:http://www.dehems.eu/):
 * there are some clients; (the clients are in fact house holds;)
 * each device has a number of sensors (about 10), and not all the
 clients have the same sensor; also sensors might appear and disappear
 dynamicaly; (the sensors are appliances;)
 * for each device and each sensor a reading is produced (at about
 6 seconds); (the values could be power consumptions;)
 * I would like to store the following data: (client, sensor,
 timestamp, value);
 * the usual queries are:
 * for a given client (and sensor), and time interval, I need
 the min, max, and avg of the values;
 * for a given time interval (and sensor), I need min, max, and
 avg of the values;
 * other statistics;

 How many devices you expect ?
 As I understand number of expected is more or less:
 no.of devices * no.sensors (about 10)
 every 6second. Let assume that you have 100 devices it means 1000
 inserts per 6s = 166 insert for 1 seconds.

Yes, the figures are like this:
* average number of raw inserts / second (without any optimization
or prior aggregation): #clients (~ 100 thousand) * #sensors (~ 10)
/ 6 seconds = 166 thousand inserts / second...
* if I use sharding this number would drop linearly with the
number of Postgres instances... so let's say I use about 10 thousand
users / Postgres instance = 16 thousand inserts / second... (a figure
which I wasn't able to reach in my Postgres benchmarks...)

Either way, I would expect at least 2-3 thousand inserts per second...


 * inserts are done like this:
 * generated 100 million readings by using the following rule:
 * client is randomly chosen between 0 and 10 thousand;
 * sensor is randomly chosen between 0 and 10;
 * the timestamp is always increasing by one;
 * the insert is done in batches of 500 thousand inserts (I've
 also tried 5, 25, 50 and 100 thousand without big impact);
 * the banch inserts are done through COPY sds_benchmark_data
 FROM STDIN through libpq (by using UNIX (local) sockets);

 What have I observed / tried:
 * I've tested without the primary key and the index, and the
 results were the best for inserts (600k inserts / s), but the
 readings, worked extremly slow (due to the lack of indexing);
 * with only the index (or only the primary key) the insert rate is
 good at start (for the first 2 million readings), but then drops to
 about 200 inserts / s;


 Try periodicaly execute REINDEX your index, and execute ANALYZE for
 your table . To be honest should not influance on inserts but will
 influance on select.

I'll keep this in mind when I reach the select part... For the
moment I'm struggling with inserts... (Actually I've kind of given
up...)


 Michal Szymanski
 http://blog.szymanskich.net



Thanks,
Ciprian Craciun.



Fwd: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
   (I'm adding the discussion also to the Postgres list.)

On Fri, Nov 21, 2008 at 11:19 PM, Dann Corbit [EMAIL PROTECTED] wrote:
 What is the schema for your table?
 If you are using copy rather than insert, 1K rows/sec for PostgreSQL seems 
 very bad unless the table is extremely wide.

   The schema is posted at the beginning of the thread. But in short
it is a table with 4 columns: client, sensor, timestamp and value, all
being int4 (integer). There is only one (compound) index, on
client and sensor...

   I guess the problem comes from the index...


 Memory mapped database systems may be the answer to your need for speed.
 If you have a single inserting process, you can try FastDB, but unless you 
 use a 64 bit operating system and compiler, you will be limited to 2 GB file 
 size.  FastDB is single writer, multiple reader model.  See:
 http://www.garret.ru/databases.html

 Here is output from the fastdb test program testperf, when compiled in 64 bit 
 mode (the table is ultra-simple with only a string key and a string value, 
 with also a btree and a hashed index on key):
 Elapsed time for inserting 100 record: 8 seconds
 Commit time: 1
 Elapsed time for 100 hash searches: 1 seconds
 Elapsed time for 100 index searches: 4 seconds
 Elapsed time for 10 sequential search through 100 records: 2 seconds
 Elapsed time for search with sorting 100 records: 3 seconds
 Elapsed time for deleting all 100 records: 0 seconds

 Here is a bigger set so you can get an idea about scaling:

 Elapsed time for inserting 1000 record: 123 seconds
 Commit time: 13
 Elapsed time for 1000 hash searches: 10 seconds
 Elapsed time for 1000 index searches: 82 seconds
 Elapsed time for 10 sequential search through 1000 records: 8 seconds
 Elapsed time for search with sorting 1000 records: 41 seconds
 Elapsed time for deleting all 1000 records: 4 seconds

 If you have a huge database, then FastDB may be problematic because you need 
 free memory equal to the size of your database.
 E.g. a 100 GB database needs 100 GB memory to operate at full speed.  In 4GB 
 allotments, at $10-$50/GB 100 GB costs between $1000 and $5000.

   Unfortunately the database will (eventually) be too large to store
all of it in memory...

   For the moment, I don't think I'll be able to try FastDB... I'll put
it on my reminder list...


 MonetDB is worth a try, but I had trouble getting it to work properly on 64 
 bit Windows:
 http://monetdb.cwi.nl/

   I've heard of MonetDB -- it's from the same family as
Hypertable... Maybe I'll give it a try after I finish with SQLite...

   Ciprian Craciun.



Re: [GENERAL] SQL query

2008-11-22 Thread Raymond O'Donnell
On 22/11/2008 04:33, Michael Thorsen wrote:

 select count(*)
   from user_table u, locations l
  where u.user_code = l.code
 and u.price = l.price
 and u.value = l.value;
 
 The answer to this should be 2, but when I run my query I get 4 (in fact

Are you sure that's the query that's being run? I just tried it here,
and got 2 - this was using your data above.

What do your table definitions look like? - here's what I did:

CREATE TABLE user_table
(
  user_id integer NOT NULL,
  user_code integer NOT NULL,
  price numeric(6,2) NOT NULL,
  value numeric(6,2) NOT NULL,
  CONSTRAINT user_pk PRIMARY KEY (user_id)
)
WITH (OIDS=FALSE);

CREATE TABLE locations
(
  id integer NOT NULL,
  code integer NOT NULL,
  price numeric(6,2) NOT NULL,
  value numeric(6,2) NOT NULL,
  CONSTRAINT location_pk PRIMARY KEY (id)
)
WITH (OIDS=FALSE);

Does this correspond to what you have?

Ray.

--
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
[EMAIL PROTECTED]
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
--



Re: [GENERAL] SQL query

2008-11-22 Thread Raymond O'Donnell
On 22/11/2008 16:07, Michael Thorsen wrote:
 For the most part yes. The price and value were real columns,
 otherwise the rest of it is the same. On a small data set I seem to get

That's almost certainly the problem - rounding errors are causing
the equality test in the join to fail. You should use NUMERIC for those
floating-point values.

Have a look at what the docs say on REAL and family:

http://www.postgresql.org/docs/8.3/static/datatype-numeric.html#DATATYPE-FLOAT

Ray.


PS - please don't top-post, as it makes the thread difficult to follow.


--
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
[EMAIL PROTECTED]
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
--
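
For illustration, two possible fixes, assuming the column definitions posted
earlier in the thread (the 0.005 tolerance is an arbitrary example, not a
recommendation):

-- preferred: convert the float columns to NUMERIC
ALTER TABLE user_table
    ALTER COLUMN price TYPE numeric(6,2),
    ALTER COLUMN value TYPE numeric(6,2);
ALTER TABLE locations
    ALTER COLUMN price TYPE numeric(6,2),
    ALTER COLUMN value TYPE numeric(6,2);

-- stop-gap: keep REAL but compare within a tolerance instead of with =
SELECT count(*)
  FROM user_table u
  JOIN locations l ON u.user_code = l.code
 WHERE abs(u.price - l.price) < 0.005
   AND abs(u.value - l.value) < 0.005;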



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Shane Ambler

Ciprian Dorin Craciun wrote:



I would try it if I would know that it could handle the load... Do
you have some info about this? Any pointers about the configuration
issues?

Ciprian.




Apart from the configure options at build time you should read -
http://www.sqlite.org/pragma.html

It was a few versions ago so it may have changed by now, but I reckon it
was the temp_store setting - which is described as covering temp tables and
indexes but is (or was) also used for large query and sorting needs.
Setting this to memory did make a difference for some queries.


--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [GENERAL] SQL query

2008-11-22 Thread Tom Lane
Michael Thorsen [EMAIL PROTECTED] writes:
 ... I gave a simple example above, but the query runs over 2 tables
 with about a million entries in each. So I am unable to verify what is
 wrong, but I know the count is incorrect as I should not have more than what
 is in the user_table.

You could easily get a count larger than the number of rows in
user_table, if there are rows in user_table that join to multiple rows
in the locations table.  So look for duplicated data in locations ...

regards, tom lane
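
One quick way to look for such duplicates, using the column names from the
original post:

SELECT code, price, value, count(*) AS copies
  FROM locations
 GROUP BY code, price, value
HAVING count(*) > 1
 ORDER BY copies DESC;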



[GENERAL] delete commands fails silently to delete primary key

2008-11-22 Thread Andrus

I have a table in 8.1.4 which tracks users logged into the db

CREATE TABLE session
(
 workplace character(16) NOT NULL,
 ipaddress character(20),
 logintime character(28),
 loggeduser character(10),
 CONSTRAINT session_pkey PRIMARY KEY (workplace)
);

Commands executed at logon in same transaction are:

delete from session where workplace=E'LIIVA' ;
insert into session (workplace,ipaddress,logintime,loggeduser) values ( 
E'LIIVA' , inet_client_addr()::CHAR(14), 
current_timestamp::CHAR(28),CURRENT_USER)


Sometimes (during locking contention or during heavy load) those commands 
cause an error:


2008-11-22 11:24:26 EET INSERT 1 4745ERROR:  duplicate key violates 
unique constraint session_pkey
2008-11-22 11:24:26 EET INSERT 2 4745STATEMENT:  delete from session 
where workplace=E'LIIVA' ;insert into session 
(workplace,ipaddress,logintime,loggeduser) values (  E'LIIVA' , 
inet_client_addr()::CHAR(14), current_timestamp::CHAR(28),CURRENT_USER)


No other client can add the 'LIIVA' primary key value.
Any idea why this error occurs and how to fix it?

Andrus.
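
For what it's worth, one commonly used alternative to the delete-then-insert
dance is to UPDATE first and INSERT only when no row exists. This is only a
sketch of that pattern (it still needs retry logic under heavy concurrency),
not a guaranteed fix for the error above:

UPDATE session
   SET ipaddress  = inet_client_addr()::char(14),
       logintime  = current_timestamp::char(28),
       loggeduser = current_user
 WHERE workplace = E'LIIVA';

INSERT INTO session (workplace, ipaddress, logintime, loggeduser)
SELECT E'LIIVA', inet_client_addr()::char(14),
       current_timestamp::char(28), current_user
 WHERE NOT EXISTS (SELECT 1 FROM session WHERE workplace = E'LIIVA');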




Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
On Sat, Nov 22, 2008 at 8:04 PM, Shane Ambler [EMAIL PROTECTED] wrote:
 Ciprian Dorin Craciun wrote:


I would try it if I would know that it could handle the load... Do
 you have some info about this? Any pointers about the configuration
 issues?

Ciprian.



 Apart from the configure options at build time you should read -
 http://www.sqlite.org/pragma.html

 It was a few versions ago so may be changed by now, but I reckon it
 was the temp_store setting - which is described as temp tables and
 indexes but is (or was) also used for large query and sorting needs.
 Setting this to memory did make a difference for some queries.


 --

 Shane Ambler
 pgSQL (at) Sheeky (dot) Biz

 Get Sheeky @ http://Sheeky.Biz

Hello all!

    (This email is now about SQLite3, but it also relates to Postgres
as a comparison.)

    I've also tested SQLite3 and it has the same behavior as
Postgres... Meaning at the beginning it goes really nicely at 20k inserts/s,
drops to about 10k inserts/s, but after a few million records the HDD
LED starts to blink non-stop, and then it drops to under 1k...

I've used exactly the same schema as for Postgres, and the
following pragmas:
* page_size = 8192;
* fullsync = 0;
* synchronous = off;
* journal_mode = off; (this has a 10-fold impact... from 1k
inserts at the beginning to 10 or 20k...)
* cache_size = 65536; (this is in pages, and it works out to 512MB,
but I don't see the memory being used during inserts...)
* auto_vacuum = none;
* analyze at the end of the inserts;

So I would conclude that relational stores will not make it for
this use case...

I'll rerun the tests tomorrow and post a comparison between SQLite
and Postgres.

Ciprian Craciun.



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Scott Marlowe
On Sat, Nov 22, 2008 at 2:37 PM, Ciprian Dorin Craciun
[EMAIL PROTECTED] wrote:

Hello all!
SNIP
So I would conclude that relational stores will not make it for
 this use case...

I was wondering whether you guys are having to do all individual inserts or if
you can batch some number together into a transaction.  Being able to
put more than one statement into a single transaction is a huge win for pgsql.



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
On Sat, Nov 22, 2008 at 11:51 PM, Scott Marlowe [EMAIL PROTECTED] wrote:
 On Sat, Nov 22, 2008 at 2:37 PM, Ciprian Dorin Craciun
 [EMAIL PROTECTED] wrote:

Hello all!
 SNIP
So I would conclude that relational stores will not make it for
 this use case...

 I was wondering you guys are having to do all individual inserts or if
 you can batch some number together into a transaction.  Being able to
 put  1 into a single transaction is a huge win for pgsql.

I'm aware of the performance issues between 1 insert vs x batched
inserts in one operation / transaction. That is why in the case of
Postgres I am using COPY table FROM STDIN, and using 5k batches...
(I've tried even 10k, 15k, 25k, 50k, 500k, 1m inserts / batch and no
improvement...)

Ciprian Craciun.



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread David Wilson
On Sat, Nov 22, 2008 at 4:54 PM, Ciprian Dorin Craciun
[EMAIL PROTECTED] wrote:
 On Sat, Nov 22, 2008 at 11:51 PM, Scott Marlowe [EMAIL PROTECTED] wrote:
 On Sat, Nov 22, 2008 at 2:37 PM, Ciprian Dorin Craciun
 [EMAIL PROTECTED] wrote:

Hello all!
 SNIP
So I would conclude that relational stores will not make it for
 this use case...

 I was wondering you guys are having to do all individual inserts or if
 you can batch some number together into a transaction.  Being able to
 put  1 into a single transaction is a huge win for pgsql.

I'm aware of the performance issues between 1 insert vs x batched
 inserts in one operation / transaction. That is why in the case of
 Postgres I am using COPY table FROM STDIN, and using 5k batches...
 (I've tried even 10k, 15k, 25k, 50k, 500k, 1m inserts / batch and no
 improvement...)

I've had exactly the same experience with Postgres during an attempt
to use it as a store for large-scale incoming streams of data at a
rate very comparable to what you're looking at (~100k/sec). We
eventually just ended up rolling our own solution.

-- 
- David T. Wilson
[EMAIL PROTECTED]



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Alvaro Herrera
Ciprian Dorin Craciun wrote:

 I've tested also Sqlite3 and it has the same behavior as
 Postgres... Meaning at beginning it goes really nice 20k inserts,
 drops to about 10k inserts, but after a few million records, the HDD
 led starts to blink non-stop, and then it drops to unde 1k

The problem is, most likely, on updating the indexes.  Heap inserts
should always take more or less the same time, but index insertion
requires walking down the index struct for each insert, and the path to
walk gets larger the more data you have.

Postgres does not have bulk index insert, which could perhaps get you a
huge performance improvement.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Alvaro Herrera

 On 21 Lis, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun)
 wrote:

      What have I observed / tried:
      * I've tested without the primary key and the index, and the
  results were the best for inserts (600k inserts / s), but the
  readings, worked extremly slow (due to the lack of indexing);
      * with only the index (or only the primary key) the insert rate is
  good at start (for the first 2 million readings), but then drops to
  about 200 inserts / s;

I didn't read the thread so I don't know if this was suggested already:
bulk index creation is a lot faster than retail index inserts.  Maybe
one thing you could try is to have an unindexed table to do the inserts,
and a separate table that you periodically truncate, refill with the
contents from the other table, then create index.  Two main problems: 1.
querying during the truncate/refill/reindex process (you can solve it by
having a second table that you rename in place); 2. the query table is
almost always out of date.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[GENERAL] date stamp on update?

2008-11-22 Thread blackwater dev
Is there a datatype in postgres that will automatically update the date when
the row is updated?  I know I can do a timestamp and set the default to
now() but once the row is inserted, and then edited, I want the column
updated without editing my application code or adding a trigger.  Is this
possible with Postgres?

Thanks!


Re: [GENERAL] Postgres mail list traffic over time

2008-11-22 Thread Gregory Stark
Alvaro Herrera [EMAIL PROTECTED] writes:

 Richard Huxton wrote:

 Some of the EXPLAINs on the performance list are practically impossible
 to read unless you've got the time to cut+paste and fix line-endings.

 Maybe we should start recommending people to post those via
 http://explain-analyze.info/

What would be really neat would be having the mailing list do something
automatically. Either fix the message inline or generate a link to something
like this.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [GENERAL] Postgres mail list traffic over time

2008-11-22 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 So, to a first approximation, the PG list traffic has been constant
 since 2000.  Not the result I expected.

 I also was confused by its flatness.  I am finding the email traffic
 almost impossible to continue tracking, so something different is
 happening, but it seems it is not volume-related.

 Yes, my perception also is that it's getting harder and harder to keep
 up with the list traffic; so something is happening that a simple
 volume count doesn't capture.

I've noticed recently that the mailing list traffic seems very bursty. We
have days with hundreds of messages on lots of different in-depth topics and
other days with hardly any messages at all. I wonder if it's hard to follow
because we've been picking up more simultaneous threads instead of all being
on one thread together before moving on to the next one.

Another idea, I wonder if the project has gone more international and
therefore has more traffic at odd hours of the day for everyone. It would also
mean more long-lived threads with large latencies between messages and replies.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 The problem is, most likely, on updating the indexes.  Heap inserts
 should always take more or less the same time, but index insertion
 requires walking down the index struct for each insert, and the path to
 walk gets larger the more data you have.

It's worse than that: his test case inserts randomly ordered keys, which
means that there's no locality of access during the index updates.  Once
the indexes get bigger than RAM, update speed goes into the toilet,
because the working set of index pages that need to be touched also
is bigger than RAM.  That effect is going to be present in *any*
standard-design database, not just Postgres.

It's possible that performance in a real-world situation would be
better, if the incoming data stream isn't so random; but it's
hard to tell about that with the given facts.

One possibly useful trick is to partition the data by timestamp with
partition sizes chosen so that the indexes don't get out of hand.
But the partition management might be enough of a PITA to negate
any win.

regards, tom lane
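
For reference, a rough 8.3-style sketch of what Tom describes, using
hypothetical table/partition names and the four int4 columns from the thread
(the one-day slice and the range bounds are arbitrary):

CREATE TABLE readings (
    client  integer NOT NULL,
    sensor  integer NOT NULL,
    ts      integer NOT NULL,   -- the benchmark stores timestamps as int4
    value   integer NOT NULL
);

-- one child table per time slice; only the current slice receives inserts,
-- so its index stays small enough to fit in RAM
CREATE TABLE readings_day_0 (
    CHECK (ts >= 0 AND ts < 86400)
) INHERITS (readings);

CREATE INDEX readings_day_0_idx ON readings_day_0 (client, sensor, ts);

-- with constraint_exclusion enabled, range queries on ts only visit the
-- partitions whose CHECK constraints can possibly match
SET constraint_exclusion = on;
SELECT min(value), max(value), avg(value)
  FROM readings
 WHERE client = 42 AND sensor = 3
   AND ts >= 0 AND ts < 3600;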



Re: [GENERAL] date stamp on update?

2008-11-22 Thread Tom Lane
blackwater dev [EMAIL PROTECTED] writes:
 Is there a datatype in postgres that will automatically update the date when
 the row is updated?

No, and it's conceptually impossible to make that happen at the datatype
level.  Use a trigger.

regards, tom lane
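
Since it comes up often, a minimal sketch of the trigger approach (the table
and column names here are made up):

CREATE OR REPLACE FUNCTION set_updated_at() RETURNS trigger AS $$
BEGIN
    NEW.updated_at := now();   -- assumes an "updated_at timestamptz" column
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER touch_updated_at
    BEFORE UPDATE ON my_table
    FOR EACH ROW
    EXECUTE PROCEDURE set_updated_at();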



Re: [GENERAL] date stamp on update?

2008-11-22 Thread Martin Gainty



CREATE TABLE TableName (
    ColumnName date DEFAULT current_date
);
ALTER TABLE TableName
  ALTER COLUMN ColumnName { SET DEFAULT new_default_value | DROP DEFAULT };

HTH
Martin 




Date: Sat, 22 Nov 2008 17:34:26 -0500
From: [EMAIL PROTECTED]
To: pgsql-general@postgresql.org
Subject: [GENERAL] date stamp on update?

Is there a datatype in postgres that will automatically update the date when 
the row is updated?  I know I can do a timestamp and set the default to now() 
but once the row is inserted, and then edited, I want the column updated 
without editing my application code or adding a trigger.  Is this possible with 
Postgres?


Thanks!

_
Proud to be a PC? Show the world. Download the “I’m a PC” Messenger themepack 
now.
hthttp://clk.atdmt.com/MRT/go/119642558/direct/01/

Re: [GENERAL] delete commands fails silently to delete primary key

2008-11-22 Thread Tom Lane
Andrus [EMAIL PROTECTED] writes:
 I have table in 8.1.4 which tracks users logged into db

There have been a number of index-corruption bugs fixed since 8.1.4 ...

In particular, if it's possible that any of these clients abort before
committing these insertions, the vacuum race condition bug fixed in
8.1.10 is a pretty likely candidate for your problem.

regards, tom lane



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Scara Maccai
Since you always need the timestamp in your selects, have you tried indexing 
only the timestamp field?
Your selects would be slower, but since client and sensor don't have that many 
distinct values compared to the number of rows you are inserting maybe the 
difference in selects would not be that huge.








Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Scott Marlowe
On Sat, Nov 22, 2008 at 5:54 PM, Scara Maccai [EMAIL PROTECTED] wrote:
 Since you always need the timestamp in your selects, have you tried indexing 
 only the timestamp field?
 Your selects would be slower, but since client and sensor don't have that 
 many distinct values compared to the number of rows you are inserting maybe 
 the difference in selects would not be that huge.

Even better might be partitioning on the timestamp.  If all access is
in a certain timestamp range it's usually a big win, especially
because he can move to a new table every hour / day / week or whatever
and merge the old one into a big old data table.



[GENERAL] [Q]updating multiple rows with Different values

2008-11-22 Thread V S P
Hello,
I searched the documentation, FAQ and mailing list archives
(the mailing list archive search is voluminous :-) )

but could not find an answer:

I would like to be able to update
several rows to different values at the same time.

In Oracle this used to be called an array update or
'collect' update or 'bulk' update -- but those
keywords did not turn up anything for PostgreSQL.

for example tbl_1 has two columns id and col1


update tbl_1  set
   col1=3  where id=25,
   col1=5  where id=26


I am using PHP PDO (and hoping that if there is a mechanism
within postgresql to do that PDO will support it as well).


Thank you in advance,
VSP

-- 
  V S P
  [EMAIL PROTECTED]

-- 
http://www.fastmail.fm - Access all of your messages and folders
  wherever you are
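
For what it's worth, on 8.2 and later this can be done in one statement by
joining against a VALUES list; a sketch using the example table from the
question (PDO can send it as a single statement):

UPDATE tbl_1 AS t
   SET col1 = v.col1
  FROM (VALUES (25, 3),
               (26, 5)) AS v(id, col1)
 WHERE t.id = v.id;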




Re: [GENERAL] Postgres mail list traffic over time

2008-11-22 Thread Bruce Momjian
Magnus Hagander wrote:
 Bruce Momjian wrote:
  Ron Mayer wrote:
  Joshua D. Drake wrote:
  On Fri, 2008-11-21 at 08:18 -0800, Ron Mayer wrote:
  Bruce Momjian wrote:
  Tom Lane wrote:
  ... harder to keep
  up with the list traffic; so something is happening that a simple
  volume count doesn't capture.
  If measured in bytes of the gzipped mbox it ...
  Its because we eliminated the -patches mailing list.
  That's part of it.  I've added -patches to the graph at
  http://0ape.com/postgres_mailinglist_size/ as well as
  a graph of hackers+patches combined; and it still looks
  like hackers+patches is quite high in the past 3 months.
 
  With hackers+patches it looks like 2002-08 was the biggest
  month; but the past 3 months still look roughly twice
  late 2007's numbers.
  
  Can someoone graph CVS traffic, showing the historical number of commits
  and number of changed lines?
 
 Ohloh has some graphs, are they detailed enough?
 http://www.ohloh.net/projects/postgres/analyses/latest

I saw that but that only shows total lines, not the number of lines
changed, or commits per hour, etc.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
On Sun, Nov 23, 2008 at 1:02 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
 The problem is, most likely, on updating the indexes.  Heap inserts
 should always take more or less the same time, but index insertion
 requires walking down the index struct for each insert, and the path to
 walk gets larger the more data you have.

 It's worse than that: his test case inserts randomly ordered keys, which
 means that there's no locality of access during the index updates.  Once
 the indexes get bigger than RAM, update speed goes into the toilet,
 because the working set of index pages that need to be touched also
 is bigger than RAM.  That effect is going to be present in *any*
 standard-design database, not just Postgres.

 It's possible that performance in a real-world situation would be
 better, if the incoming data stream isn't so random; but it's
 hard to tell about that with the given facts.

 One possibly useful trick is to partition the data by timestamp with
 partition sizes chosen so that the indexes don't get out of hand.
 But the partition management might be enough of a PITA to negate
 any win.

regards, tom lane

Thanks for your feedback! This is just as I supposed, but I didn't
have the Postgres experience to be certain.
I'll include your conclusion in my report.

Ciprian Craciun.



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
On Sun, Nov 23, 2008 at 12:26 AM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
 Ciprian Dorin Craciun escribió:

 I've tested also Sqlite3 and it has the same behavior as
 Postgres... Meaning at beginning it goes really nice 20k inserts,
 drops to about 10k inserts, but after a few million records, the HDD
 led starts to blink non-stop, and then it drops to unde 1k

 The problem is, most likely, on updating the indexes.  Heap inserts
 should always take more or less the same time, but index insertion
 requires walking down the index struct for each insert, and the path to
 walk gets larger the more data you have.

 Postgres does not have bulk index insert, which could perhaps get you a
 huge performance improvement.

 --
 Alvaro Herrerahttp://www.CommandPrompt.com/
 The PostgreSQL Company - Command Prompt, Inc.

I don't think the index depth is the problem. For example, in the
case of BerkeleyDB with BTree storage, the tree height is 3 after 100m
inserts... So this is not the problem.

I think the problem is that after a certain amount of data,
periodically the entire index is touched, and in this case the HDD
becomes the bottleneck... (Demonstrated by the vmstat 1 output I've put
in a previous email.)

Ciprian.



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
On Sun, Nov 23, 2008 at 3:09 AM, Scott Marlowe [EMAIL PROTECTED] wrote:
 On Sat, Nov 22, 2008 at 5:54 PM, Scara Maccai [EMAIL PROTECTED] wrote:
 Since you always need the timestamp in your selects, have you tried indexing 
 only the timestamp field?
 Your selects would be slower, but since client and sensor don't have that 
 many distinct values compared to the number of rows you are inserting maybe 
 the difference in selects would not be that huge.

 Even better might be partitioning on the timestamp.  IF all access is
 in a certain timestamp range it's usually a big win, especially
 because he can move to a new table every hour / day / week or whatever
 and merge the old one into a big old data table.


Yes, it would speed up the inserts tremendously... I've tested it
and the insert speed is somewhere around 100-200k.

But unfortunately the query speed is not good at all, because most
queries are for a specific client (and sensor) in a given time
range...

Ciprian Craciun.



Re: [GENERAL] Using Postgres to store high volume streams of sensor readings

2008-11-22 Thread Ciprian Dorin Craciun
On Sun, Nov 23, 2008 at 12:32 AM, Alvaro Herrera
[EMAIL PROTECTED] wrote:

 On 21 Lis, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun)
 wrote:

  What have I observed / tried:
  * I've tested without the primary key and the index, and the
  results were the best for inserts (600k inserts / s), but the
  readings, worked extremly slow (due to the lack of indexing);
  * with only the index (or only the primary key) the insert rate is
  good at start (for the first 2 million readings), but then drops to
  about 200 inserts / s;

 I didn't read the thread so I don't know if this was suggested already:
 bulk index creation is a lot faster than retail index inserts.  Maybe
 one thing you could try is to have an unindexed table to do the inserts,
 and a separate table that you periodically truncate, refill with the
 contents from the other table, then create index.  Two main problems: 1.
 querying during the truncate/refill/reindex process (you can solve it by
 having a second table that you rename in place); 2. the query table is
 almost always out of date.

 --
 Alvaro Herrerahttp://www.CommandPrompt.com/
 The PostgreSQL Company - Command Prompt, Inc.


The concerns you have listed are very important to me... I will
use the database not only for archival and offline analysis, but also
for realtime queries (like "what is the power consumption in the last
minute")...

Of course I could use Postgres only for archival, as you've said,
and some other solution for the realtime queries, but this adds complexity
to the application...

Thanks,
Ciprian Craciun.
