Re: [GENERAL] pgAdmin error
On Sat, Nov 22, 2008 at 6:35 AM, David [EMAIL PROTECTED] wrote: I am trying to use pgAdmin 1.8.4 to edit the pg_hba.conf file on a PostgreSQL 8.3 database running on Ubuntu 8.10. I get the following error message: An error has occurred: ERROR: absolute path not allowed CONTEXT: SQL function pg_file_length statement 1 then... Backend Access Configuration Editor - /etc/postgresql/8.3/main/pg_hba.conf on Kepler... with a blank screen. Hmm - PostgreSQL won't allow us to access files outside of the data directory. IIRC, it does that by insisting that all paths are relative and don't contain "..". I assume your data is not in /etc/postgresql/8.3/main - just your config? -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
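As a rough illustration of the restriction described above (a sketch using the built-in pg_read_file() function, which is superuser-only; the pg_file_length() call in the pgAdmin error runs into the same path checks):

  -- Relative paths are resolved against the data directory and are allowed.
  SELECT pg_read_file('postgresql.conf', 0, 200);

  -- An absolute path fails with "absolute path not allowed", as in the error above.
  SELECT pg_read_file('/etc/postgresql/8.3/main/pg_hba.conf', 0, 200);

  -- Paths containing ".." are rejected as well.
  SELECT pg_read_file('../other_cluster/pg_hba.conf', 0, 200);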
Re: [GENERAL] Postgres mail list traffic over time
On Friday 21 November 2008 19:10:45 Tom Lane wrote: Yeah, I think this is most probably explained by repeat postings of successive versions of large patches. Still, Ron might be on to something. I had not considered message lengths in my previous numbers ... Also consider that since we started using the wiki for tracking patches, a lot of trivial emails like "your patch has been added to the queue" and "where are we on this" have disappeared. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pgAdmin error
Dave Page wrote: On Sat, Nov 22, 2008 at 6:35 AM, David [EMAIL PROTECTED] wrote: I am trying to use pgAdmin 1.8.4 to edit the pg_hba.conf file on a PostgreSQL 8.3 database running on Ubuntu 8.10. I get the following error message: An error has occurred: ERROR: absolute path not allowed CONTEXT: SQL function pg_file_length statement 1 then... Backend Access Configuration Editor - /etc/postgresql/8.3/main/pg_hba.conf on Kepler... with a blank screen. Hmm - PostgreSQL won't allow us to access files outside of the data directory. IIRC, it does that by insisting that all paths are relative and don't contain "..". I assume your data is not in /etc/postgresql/8.3/main - just your config? That's the default location on Debian. IIRC it also means that the database server doesn't have write permissions on those files, so even if it allowed opening them it would not be possible to make modifications from pgAdmin anyway. Bottom line - you can't edit your postgresql configuration remotely on Debian. //Magnus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres mail list traffic over time
Alvaro Herrera wrote: Sam Mason wrote: the following has links to more: http://markmail.org/search/?q=list:org.postgresql Wow, the spanish list is the 3rd in traffic after hackers and general! yeah and that tom lane guy sent over 77000(!!!) mails to the lists up to now ... Stefan -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres mail list traffic over time
Ron Mayer wrote: Joshua D. Drake wrote: On Fri, 2008-11-21 at 08:18 -0800, Ron Mayer wrote: Bruce Momjian wrote: Tom Lane wrote: ... harder to keep up with the list traffic; so something is happening that a simple volume count doesn't capture. If measured in bytes of the gzipped mbox it ... It's because we eliminated the -patches mailing list. That's part of it. I've added -patches to the graph at http://0ape.com/postgres_mailinglist_size/ as well as a graph of hackers+patches combined; and it still looks like hackers+patches is quite high in the past 3 months. With hackers+patches it looks like 2002-08 was the biggest month; but the past 3 months still look roughly twice late 2007's numbers. Can someone graph CVS traffic, showing the historical number of commits and number of changed lines? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres mail list traffic over time
Bruce Momjian wrote: Ron Mayer wrote: Joshua D. Drake wrote: On Fri, 2008-11-21 at 08:18 -0800, Ron Mayer wrote: Bruce Momjian wrote: Tom Lane wrote: ... harder to keep up with the list traffic; so something is happening that a simple volume count doesn't capture. If measured in bytes of the gzipped mbox it ... It's because we eliminated the -patches mailing list. That's part of it. I've added -patches to the graph at http://0ape.com/postgres_mailinglist_size/ as well as a graph of hackers+patches combined; and it still looks like hackers+patches is quite high in the past 3 months. With hackers+patches it looks like 2002-08 was the biggest month; but the past 3 months still look roughly twice late 2007's numbers. Can someone graph CVS traffic, showing the historical number of commits and number of changed lines? Ohloh has some graphs - are they detailed enough? http://www.ohloh.net/projects/postgres/analyses/latest //Magnus -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Database access over the Internet...
Michelle, I don't think the list is going to change its operations for one disgruntled user. Since you seem unwilling or unable to employ the advice already given, maybe your only acceptable option is to unsubscribe from the list. At least that would eliminate much of the noise that currently is cluttering everyone's inboxes. - Original Message - From: Michelle Konzack [EMAIL PROTECTED] To: pgsql-general@postgresql.org Sent: Sunday, November 16, 2008 4:41 AM Subject: Re: [GENERAL] Database access over the Internet... -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Database access over the Internet...
You did not understand the problem 1) You did not explain the problem. 2) It's your problem, not ours. Your demand that thousands of other people adapt to your unusual problem is absurdly self-absorbed. Get decent email service, then subscribe from there. (Or go away. Either works for the members of this list, so it's really your choice.) -- Scott Ribe [EMAIL PROTECTED] http://www.killerbytes.com/ (303) 722-0567 voice -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] SQL query
I am running Postgres (8.2.11) on Windows. I have 2 tables, one with users and one with locations.

user_table
----------
 user_id  user_code  price  value
 1        2          45.23  -97.82
 2        3          42.67  -98.32
 3        4          35.56  -76.32

locations
---------
 id  code  price  value
 1   2     45.23  -97.82
 2   3     42.67  -98.32
 3   4     43.26  -98.65

I have a query that takes every user and looks into locations to see if the code, price and value match up. If they do, then count it.

select count(*) from user_table u, locations l where u.user_code = l.code and u.price = l.price and u.value = l.value;

The answer to this should be 2, but when I run my query I get 4 (in fact more entries than user_table has), which is incorrect. What am I doing incorrectly? I have been breaking my head over this for a while. Is there some other query to get the required results? Any help would be highly appreciated. I gave a simple example above, but the query runs over 2 tables with about a million entries in each. So I am unable to verify what is wrong, but I know the count is incorrect, as I should not have more than what is in the user_table. Thanks, Michael
[GENERAL] how to cite postgresql?
I am using PostgreSQL to store all my research-related data. At the moment I am just finishing my PhD thesis and I want to cite PostgreSQL correctly, but I can't find how to do it. Could somebody give me some advice? Many thanks tomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] PgAdminIII - RAISE WARNING prints in one line
Hi, When I call a pgsql procedure in Postgres 8.3 from pgAdmin III, everything is printed on one line: WARNING: Odczytany token'+48'WARNING: v_processed_strb- zWARNING: Odczytany tokenb-zWARNING: v_processed_strWARNING: Odczytany tokenWARNING: v_processed_str Całkowity czas wykonania zapytania: 16 ms. 1 wierszy zwróconych. (Total query execution time: 16 ms. 1 rows returned.) It happens on all our computers and started after migrating to 8.3. It is a small thing, but it sometimes makes it hard to debug our procedures. Michal Szymanski http://blog.szymanskich.net -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Best way to import a plpythonu module
I am using plpythonu on Linux (postgresql-plpython-8.2.9-1.fc7). Consider a Python class called Wibble which is written into a Python module wibble.py. I wish to use Wibble from within a plpythonu procedure. If I simply do: from wibble import Wibble then I am told the object was not found (presumably because postgres does not know where the module is). So as far as I can see, my available options to use class Wibble are: 1. Add code to modify the search path before my import statement. This allows me to find wibble for the duration of my session, I think (see the sketch below). 2. Set the environment variable PYTHONPATH in the postgres user's .bash_profile to add a directory where I can put my module - then restart postgresql. This does not seem particularly safe to me though. I also wonder if I would need to restart postgresql to reload a modified module, or whether the module is freshly read whenever a new reference is made to it? Is there a best-practice approach here? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
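A minimal sketch of option 1, assuming a hypothetical directory /usr/local/lib/pgpython that holds wibble.py (the directory, the function name and the str(Wibble()) call are illustrative only):

  CREATE OR REPLACE FUNCTION use_wibble() RETURNS text AS $$
      import sys
      # Hypothetical location of wibble.py; adjust to the real layout.
      module_dir = '/usr/local/lib/pgpython'
      if module_dir not in sys.path:
          sys.path.insert(0, module_dir)
      from wibble import Wibble
      return str(Wibble())
  $$ LANGUAGE plpythonu;

As far as I know, Python caches imported modules per backend, so a modified wibble.py is not re-read until a new connection is opened (or the module is explicitly reloaded from within the function); a full PostgreSQL restart should not be necessary.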
[GENERAL] Connecting to old 7.1 Database
Hi All, I need to connect to a version 7.1 PostgreSQL database. Unfortunately, I cannot get the 7.1.3 source to compile. configure gives this error: checking types of arguments for accept()... configure: error: could not determine argument types I'm not posting to pgsql-bugs because I think this is to do with me having a much newer environment (Gentoo Linux) than was around when 7.1.3 was released. So, two questions: - Is there any way to run a newer version (8.3.5) of psql in some sort of 'backwards compatible' mode? - Do you have any tips on making 7.1.3 compile on a newer system? Many thanks Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] strange commit behavior
Hi all, I'm developing a little tool in Java that manages database updates through external text files. In this tool there is an option that allows the user to accept a defined number of errors during the update and still save the well-formed data. To do this I start commitment control when the process begins and, at the end of the process itself, if there are no errors or the threshold has not been reached I commit the data, else I roll back. I tested this tool under MySQL and Oracle and everything went as expected. Unfortunately Postgres seems to work in a different way, because if there is an error in the transaction (between the start of commitment control and the commit) the whole set of updates is discarded; there is no way to commit the well-formed records inserted (earlier or later). Is this right? Is there a way or a workaround to achieve my goal? Any hint would be appreciated. Flavio -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
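For what it's worth, the behaviour described above is PostgreSQL's intended semantics and can be reproduced directly in psql (a sketch assuming a table t with a single integer column, not the Java tool itself):

  BEGIN;
  INSERT INTO t VALUES (1);        -- succeeds
  INSERT INTO t VALUES ('oops');   -- fails and marks the whole transaction as aborted
  INSERT INTO t VALUES (2);        -- rejected: current transaction is aborted
  COMMIT;                          -- behaves like ROLLBACK; nothing is kept

One possible workaround, sketched only, is to issue a SAVEPOINT before each statement and ROLLBACK TO it when that statement fails, so the well-formed rows inserted so far survive until the final COMMIT.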
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On 21 Nov, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun) wrote: Hello all! I would like to ask some advice about the following problem (related to the Dehems project: http://www.dehems.eu/): * there are some clients; (the clients are in fact households;) * each device has a number of sensors (about 10), and not all the clients have the same sensor; also sensors might appear and disappear dynamically; (the sensors are appliances;) * for each device and each sensor a reading is produced (about every 6 seconds); (the values could be power consumptions;) * I would like to store the following data: (client, sensor, timestamp, value); * the usual queries are: * for a given client (and sensor), and time interval, I need the min, max, and avg of the values; * for a given time interval (and sensor), I need min, max, and avg of the values; * other statistics; How many devices do you expect? As I understand it, the expected number of inserts is more or less: no. of devices * no. of sensors (about 10) every 6 seconds. Let's assume that you have 100 devices; it means 1000 inserts per 6 s = 166 inserts per second. * inserts are done like this: * generated 100 million readings by using the following rule: * client is randomly chosen between 0 and 10 thousand; * sensor is randomly chosen between 0 and 10; * the timestamp is always increasing by one; * the insert is done in batches of 500 thousand inserts (I've also tried 5, 25, 50 and 100 thousand without big impact); * the batch inserts are done through COPY sds_benchmark_data FROM STDIN through libpq (by using UNIX (local) sockets); What have I observed / tried: * I've tested without the primary key and the index, and the results were the best for inserts (600k inserts / s), but the readings worked extremely slowly (due to the lack of indexing); * with only the index (or only the primary key) the insert rate is good at start (for the first 2 million readings), but then drops to about 200 inserts / s; Try periodically executing REINDEX on your index and ANALYZE on your table. To be honest it should not influence inserts, but it will influence selects. Michal Szymanski http://blog.szymanskich.net -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Fri, Nov 21, 2008 at 3:12 PM, Michal Szymanski [EMAIL PROTECTED] wrote: On 21 Nov, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun) wrote: Hello all! I would like to ask some advice about the following problem (related to the Dehems project: http://www.dehems.eu/): * there are some clients; (the clients are in fact households;) * each device has a number of sensors (about 10), and not all the clients have the same sensor; also sensors might appear and disappear dynamically; (the sensors are appliances;) * for each device and each sensor a reading is produced (about every 6 seconds); (the values could be power consumptions;) * I would like to store the following data: (client, sensor, timestamp, value); * the usual queries are: * for a given client (and sensor), and time interval, I need the min, max, and avg of the values; * for a given time interval (and sensor), I need min, max, and avg of the values; * other statistics; How many devices do you expect? As I understand it, the expected number of inserts is more or less: no. of devices * no. of sensors (about 10) every 6 seconds. Let's assume that you have 100 devices; it means 1000 inserts per 6 s = 166 inserts per second. Yes, the figures are like this: * average number of raw inserts / second (without any optimization or previous aggregation): #clients (~ 100 thousand) * #sensors (~ 10) / 6 seconds = 166 thousand inserts / second... * if I use sharding this number would drop linearly with the number of Postgres instances... so let's say I use about 10 thousand users / Postgres instance = 16 thousand inserts / second... (a figure which I wasn't able to reach in my Postgres benchmarks...) Either way, I would expect at least 2-3 thousand inserts per second... * inserts are done like this: * generated 100 million readings by using the following rule: * client is randomly chosen between 0 and 10 thousand; * sensor is randomly chosen between 0 and 10; * the timestamp is always increasing by one; * the insert is done in batches of 500 thousand inserts (I've also tried 5, 25, 50 and 100 thousand without big impact); * the batch inserts are done through COPY sds_benchmark_data FROM STDIN through libpq (by using UNIX (local) sockets); What have I observed / tried: * I've tested without the primary key and the index, and the results were the best for inserts (600k inserts / s), but the readings worked extremely slowly (due to the lack of indexing); * with only the index (or only the primary key) the insert rate is good at start (for the first 2 million readings), but then drops to about 200 inserts / s; Try periodically executing REINDEX on your index and ANALYZE on your table. To be honest it should not influence inserts, but it will influence selects. I'll keep this in mind when I reach the select part... For the moment I'm struggling with inserts... (Actually I've kind of given up...) Michal Szymanski http://blog.szymanskich.net -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general Thanks, Ciprian Craciun. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Fwd: [GENERAL] Using Postgres to store high volume streams of sensor readings
(I'm adding the discussion also to the Postgres list.) On Fri, Nov 21, 2008 at 11:19 PM, Dann Corbit [EMAIL PROTECTED] wrote: What is the schema for your table? If you are using copy rather than insert, 1K rows/sec for PostgreSQL seems very bad unless the table is extremely wide. The schema is posted at the beginning of the thread. But in short it is a table with 4 columns: client, sensor, timestamp and value, all being int4 (integer). There is only one (compound) index on the client and sensor... I guess the problem is from the index... Memory mapped database systems may be the answer to your need for speed. If you have a single inserting process, you can try FastDB, but unless you use a 64 bit operating system and compiler, you will be limited to 2 GB file size. FastDB is single writer, multiple reader model. See: http://www.garret.ru/databases.html Here is output from the fastdb test program testperf, when compiled in 64 bit mode (the table is ultra-simple with only a string key and a string value, with also a btree and a hashed index on key): Elapsed time for inserting 100 record: 8 seconds Commit time: 1 Elapsed time for 100 hash searches: 1 seconds Elapsed time for 100 index searches: 4 seconds Elapsed time for 10 sequential search through 100 records: 2 seconds Elapsed time for search with sorting 100 records: 3 seconds Elapsed time for deleting all 100 records: 0 seconds Here is a bigger set so you can get an idea about scaling: Elapsed time for inserting 1000 record: 123 seconds Commit time: 13 Elapsed time for 1000 hash searches: 10 seconds Elapsed time for 1000 index searches: 82 seconds Elapsed time for 10 sequential search through 1000 records: 8 seconds Elapsed time for search with sorting 1000 records: 41 seconds Elapsed time for deleting all 1000 records: 4 seconds If you have a huge database, then FastDB may be problematic because you need free memory equal to the size of your database. E.g. a 100 GB database needs 100 GB memory to operate at full speed. In 4GB allotments, at $10-$50/GB 100 GB costs between $1000 and $5000. Unfortunately the database will be too large (eventually) to store all of it inside the memory... For the moment, I don't think I'll be able to try FastDB... I'll put it on my reminder list... MonetDB is worth a try, but I had trouble getting it to work properly on 64 bit Windows: http://monetdb.cwi.nl/ I've heard of MonetDB -- it's from the same family as Hypertable... Maybe I'll give it a try after I finish with SQLite... Ciprian Craciun. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
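For reference, a sketch of the table as described above (the real DDL was posted earlier in the thread and only the table name, taken from the COPY command, is from the source; column layout and index name below are illustrative):

  CREATE TABLE sds_benchmark_data (
      client      integer NOT NULL,
      sensor      integer NOT NULL,
      "timestamp" integer NOT NULL,   -- the thread stores the timestamp as int4 as well
      value       integer NOT NULL
  );

  -- The single compound index suspected of slowing down the inserts.
  CREATE INDEX sds_benchmark_data_client_sensor
      ON sds_benchmark_data (client, sensor);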
Re: [GENERAL] SQL query
On 22/11/2008 04:33, Michael Thorsen wrote: select count(*) from user_table u, locations l where u.user_code = l.code and u.price = l.price and u.value = l.value; The answer to this should be 2, but when I run my query I get 4 (in fact Are you sure that's the query that's being run? I just tried it here, and got 2 - this was using your data above. What do your table definitions look like? - here's what I did:

CREATE TABLE user_table (
  user_id integer NOT NULL,
  user_code integer NOT NULL,
  price numeric(6,2) NOT NULL,
  value numeric(6,2) NOT NULL,
  CONSTRAINT user_pk PRIMARY KEY (user_id)
) WITH (OIDS=FALSE);

CREATE TABLE locations (
  id integer NOT NULL,
  code integer NOT NULL,
  price numeric(6,2) NOT NULL,
  value numeric(6,2) NOT NULL,
  CONSTRAINT location_pk PRIMARY KEY (id)
) WITH (OIDS=FALSE);

Does this correspond to what you have? Ray. -- Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland [EMAIL PROTECTED] Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals -- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SQL query
On 22/11/2008 16:07, Michael Thorsen wrote: For the most part yes. The price and value were real columns, otherwise the rest of it is the same. On a small data set I seem to get That's almost certainly the problem, so - rounding errors are causing the equality test in the join to fail. You should use NUMERIC for those floating-point values. Have a look at what the docs say on REAL and family: http://www.postgresql.org/docs/8.3/static/datatype-numeric.html#DATATYPE-FLOAT Ray. PS - please don't top-post, as it makes the thread difficult to follow. -- Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland [EMAIL PROTECTED] Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals -- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
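A sketch of that fix, using the table and column names from the example earlier in the thread; note that converting real columns to numeric rounds the stored values, so the data should be checked afterwards:

  ALTER TABLE user_table ALTER COLUMN price TYPE numeric(6,2);
  ALTER TABLE user_table ALTER COLUMN value TYPE numeric(6,2);
  ALTER TABLE locations  ALTER COLUMN price TYPE numeric(6,2);
  ALTER TABLE locations  ALTER COLUMN value TYPE numeric(6,2);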
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
Ciprian Dorin Craciun wrote: I would try it if I would know that it could handle the load... Do you have some info about this? Any pointers about the configuration issues? Ciprian. Apart from the configure options at build time you should read - http://www.sqlite.org/pragma.html It was a few versions ago so may be changed by now, but I reckon it was the temp_store setting - which is described as temp tables and indexes but is (or was) also used for large query and sorting needs. Setting this to memory did make a difference for some queries. -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SQL query
Michael Thorsen [EMAIL PROTECTED] writes: ... I gave a simple example above, but the query runs over 2 tables with about a million entries in each. So I am unable to verify what is wrong, but I know the count is incorrect as I should not have more than what is in the user_table. You could easily get a count larger than the number of rows in user_table, if there are rows in user_table that join to multiple rows in the locations table. So look for duplicated data in locations ... regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
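A query along these lines (a sketch using the column names from the example earlier in the thread) should expose such duplicates:

  -- Rows in locations that share the same (code, price, value) triple;
  -- each extra copy multiplies the matching rows in the join.
  SELECT code, price, value, count(*) AS copies
  FROM locations
  GROUP BY code, price, value
  HAVING count(*) > 1;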
[GENERAL] delete commands fails silently to delete primary key
I have a table in 8.1.4 which tracks users logged into the db:

CREATE TABLE session (
  workplace character(16) NOT NULL,
  ipaddress character(20),
  logintime character(28),
  loggeduser character(10),
  CONSTRAINT session_pkey PRIMARY KEY (workplace)
);

Commands executed at logon in the same transaction are:

delete from session where workplace=E'LIIVA' ;
insert into session (workplace,ipaddress,logintime,loggeduser) values ( E'LIIVA' , inet_client_addr()::CHAR(14), current_timestamp::CHAR(28),CURRENT_USER)

Sometimes (during locking contention or during heavy load) those commands cause this error:

2008-11-22 11:24:26 EET INSERT 1 4745ERROR: duplicate key violates unique constraint session_pkey
2008-11-22 11:24:26 EET INSERT 2 4745STATEMENT: delete from session where workplace=E'LIIVA' ;insert into session (workplace,ipaddress,logintime,loggeduser) values ( E'LIIVA' , inet_client_addr()::CHAR(14), current_timestamp::CHAR(28),CURRENT_USER)

No other client can add the 'LIIVA' primary key. Any idea why this error occurs and how to fix it? Andrus. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sat, Nov 22, 2008 at 8:04 PM, Shane Ambler [EMAIL PROTECTED] wrote: Ciprian Dorin Craciun wrote: I would try it if I would know that it could handle the load... Do you have some info about this? Any pointers about the configuration issues? Ciprian. Apart from the configure options at build time you should read - http://www.sqlite.org/pragma.html It was a few versions ago so may be changed by now, but I reckon it was the temp_store setting - which is described as temp tables and indexes but is (or was) also used for large query and sorting needs. Setting this to memory did make a difference for some queries. -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz Hello all! (This email is now about SQLite3, but it also relates to Postgres as a comparison.) I've also tested SQLite3 and it has the same behavior as Postgres... Meaning at the beginning it goes really nicely at 20k inserts, drops to about 10k inserts, but after a few million records, the HDD LED starts to blink non-stop, and then it drops to under 1k... I've used exactly the same schema as for Postgres, and the following pragmas: * page_size = 8192; * fullsync = 0; * synchronous = off; * journal_mode = off; (this has a 10 fold impact... from 1k inserts at the beginning to 10 or 20k...) * cache_size = 65536; (this is in pages, and it comes to 512MB, but I don't see the memory being used during inserts...) * auto_vacuum = none; * analyze at the end of the inserts; So I would conclude that relational stores will not make it for this use case... I'll rerun the tests tomorrow and post a comparison between SQLite and Postgres. Ciprian Craciun. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sat, Nov 22, 2008 at 2:37 PM, Ciprian Dorin Craciun [EMAIL PROTECTED] wrote: Hello all! SNIP So I would conclude that relational stores will not make it for this use case... I was wondering if you guys are having to do all individual inserts or if you can batch some number together into a transaction. Being able to put 1 into a single transaction is a huge win for pgsql. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sat, Nov 22, 2008 at 11:51 PM, Scott Marlowe [EMAIL PROTECTED] wrote: On Sat, Nov 22, 2008 at 2:37 PM, Ciprian Dorin Craciun [EMAIL PROTECTED] wrote: Hello all! SNIP So I would conclude that relational stores will not make it for this use case... I was wondering you guys are having to do all individual inserts or if you can batch some number together into a transaction. Being able to put 1 into a single transaction is a huge win for pgsql. I'm aware of the performance issues between 1 insert vs x batched inserts in one operation / transaction. That is why in the case of Postgres I am using COPY table FROM STDIN, and using 5k batches... (I've tried even 10k, 15k, 25k, 50k, 500k, 1m inserts / batch and no improvement...) Ciprian Craciun. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sat, Nov 22, 2008 at 4:54 PM, Ciprian Dorin Craciun [EMAIL PROTECTED] wrote: On Sat, Nov 22, 2008 at 11:51 PM, Scott Marlowe [EMAIL PROTECTED] wrote: On Sat, Nov 22, 2008 at 2:37 PM, Ciprian Dorin Craciun [EMAIL PROTECTED] wrote: Hello all! SNIP So I would conclude that relational stores will not make it for this use case... I was wondering you guys are having to do all individual inserts or if you can batch some number together into a transaction. Being able to put 1 into a single transaction is a huge win for pgsql. I'm aware of the performance issues between 1 insert vs x batched inserts in one operation / transaction. That is why in the case of Postgres I am using COPY table FROM STDIN, and using 5k batches... (I've tried even 10k, 15k, 25k, 50k, 500k, 1m inserts / batch and no improvement...) I've had exactly the same experience with Postgres during an attempt to use it as a store for large-scale incoming streams of data at a rate very comparable to what you're looking at (~100k/sec). We eventually just ended up rolling our own solution. -- - David T. Wilson [EMAIL PROTECTED] -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
Ciprian Dorin Craciun escribió: I've tested also Sqlite3 and it has the same behavior as Postgres... Meaning at beginning it goes really nice 20k inserts, drops to about 10k inserts, but after a few million records, the HDD led starts to blink non-stop, and then it drops to unde 1k The problem is, most likely, on updating the indexes. Heap inserts should always take more or less the same time, but index insertion requires walking down the index struct for each insert, and the path to walk gets larger the more data you have. Postgres does not have bulk index insert, which could perhaps get you a huge performance improvement. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On 21 Nov, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun) wrote: What have I observed / tried: * I've tested without the primary key and the index, and the results were the best for inserts (600k inserts / s), but the readings worked extremely slowly (due to the lack of indexing); * with only the index (or only the primary key) the insert rate is good at start (for the first 2 million readings), but then drops to about 200 inserts / s; I didn't read the thread so I don't know if this was suggested already: bulk index creation is a lot faster than retail index inserts. Maybe one thing you could try is to have an unindexed table to do the inserts, and a separate table that you periodically truncate, refill with the contents from the other table, then create index. Two main problems: 1. querying during the truncate/refill/reindex process (you can solve it by having a second table that you rename in place); 2. the query table is almost always out of date. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
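A rough sketch of the refill-and-rename variant (all table and index names are illustrative, and this ignores the two problems listed above):

  -- Incoming data lands here, with no indexes to maintain.
  CREATE TABLE readings_incoming (client int, sensor int, ts int, value int);

  -- Periodically rebuild the indexed query table and swap it in.
  BEGIN;
  CREATE TABLE readings_query_new AS
      SELECT * FROM readings_query
      UNION ALL
      SELECT * FROM readings_incoming;
  CREATE INDEX readings_query_new_idx
      ON readings_query_new (client, sensor, ts);
  DROP TABLE readings_query;
  ALTER TABLE readings_query_new RENAME TO readings_query;
  -- Note: rows that arrive in readings_incoming while this transaction
  -- runs need separate handling before the table is emptied.
  TRUNCATE readings_incoming;
  COMMIT;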
[GENERAL] date stamp on update?
Is there a datatype in postgres that will automatically update the date when the row is updated? I know I can do a timestamp and set the default to now() but once the row is inserted, and then edited, I want the column updated without editing my application code or adding a trigger. Is this possible with Postgres? Thanks!
Re: [GENERAL] Postgres mail list traffic over time
Alvaro Herrera [EMAIL PROTECTED] writes: Richard Huxton wrote: Some of the EXPLAINs on the performance list are practically impossible to read unless you've got the time to cut+paste and fix line-endings. Maybe we should start recommending people to post those via http://explain-analyze.info/ What would be really neat would be having the mailing list do something automatically. Either fix the message inline or generate a link to something like this. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres mail list traffic over time
Tom Lane [EMAIL PROTECTED] writes: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: So, to a first approximation, the PG list traffic has been constant since 2000. Not the result I expected. I also was confused by its flatness. I am finding the email traffic almost impossible to continue tracking, so something different is happening, but it seems it is not volume-related. Yes, my perception also is that it's getting harder and harder to keep up with the list traffic; so something is happening that a simple volume count doesn't capture. I've noticed recently that the mailing list traffic seems very bursty. We have days with hundreds of messages on lots of different in-depth topics and other days with hardly any messages at all. I wonder if it's hard to follow because we've been picking up more simultaneous threads instead of all being on one thread together before moving on to the next one. Another idea, I wonder if the project has gone more international and therefore has more traffic at odd hours of the day for everyone. It would also mean more long-lived threads with large latencies between messages and replies. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
Alvaro Herrera [EMAIL PROTECTED] writes: The problem is, most likely, on updating the indexes. Heap inserts should always take more or less the same time, but index insertion requires walking down the index struct for each insert, and the path to walk gets larger the more data you have. It's worse than that: his test case inserts randomly ordered keys, which means that there's no locality of access during the index updates. Once the indexes get bigger than RAM, update speed goes into the toilet, because the working set of index pages that need to be touched also is bigger than RAM. That effect is going to be present in *any* standard-design database, not just Postgres. It's possible that performance in a real-world situation would be better, if the incoming data stream isn't so random; but it's hard to tell about that with the given facts. One possibly useful trick is to partition the data by timestamp with partition sizes chosen so that the indexes don't get out of hand. But the partition management might be enough of a PITA to negate any win. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] date stamp on update?
blackwater dev [EMAIL PROTECTED] writes: Is there a datatype in postgres that will automatically update the date when the row is updated? No, and it's conceptually impossible to make that happen at the datatype level. Use a trigger. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
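A minimal sketch of such a trigger, assuming a table mytable with a timestamp column named updated_at (all names are illustrative):

  CREATE OR REPLACE FUNCTION set_updated_at() RETURNS trigger AS $$
  BEGIN
      NEW.updated_at := now();
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER mytable_set_updated_at
      BEFORE UPDATE ON mytable
      FOR EACH ROW
      EXECUTE PROCEDURE set_updated_at();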
Re: [GENERAL] date stamp on update?
Create Table TableName (
  ColumnName DATE DEFAULT Sysdate
);

ALTER TABLE TableName ALTER COLUMN ColumnName { SET DEFAULT newdefaultvalue | DROP DEFAULT }

HTH Martin

Date: Sat, 22 Nov 2008 17:34:26 -0500 From: [EMAIL PROTECTED] To: pgsql-general@postgresql.org Subject: [GENERAL] date stamp on update? Is there a datatype in postgres that will automatically update the date when the row is updated? I know I can do a timestamp and set the default to now() but once the row is inserted, and then edited, I want the column updated without editing my application code or adding a trigger. Is this possible with Postgres? Thanks!
Re: [GENERAL] delete commands fails silently to delete primary key
Andrus [EMAIL PROTECTED] writes: I have table in 8.1.4 which tracks users logged into db There have been a number of index-corruption bugs fixed since 8.1.4 ... In particular, if it's possible that any of these clients abort before committing these insertions, the vacuum race condition bug fixed in 8.1.10 is a pretty likely candidate for your problem. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
Since you always need the timestamp in your selects, have you tried indexing only the timestamp field? Your selects would be slower, but since client and sensor don't have that many distinct values compared to the number of rows you are inserting maybe the difference in selects would not be that huge. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sat, Nov 22, 2008 at 5:54 PM, Scara Maccai [EMAIL PROTECTED] wrote: Since you always need the timestamp in your selects, have you tried indexing only the timestamp field? Your selects would be slower, but since client and sensor don't have that many distinct values compared to the number of rows you are inserting maybe the difference in selects would not be that huge. Even better might be partitioning on the timestamp. If all access is in a certain timestamp range it's usually a big win, especially because he can move to a new table every hour / day / week or whatever and merge the old one into a big old data table. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
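A rough sketch of that kind of timestamp partitioning using table inheritance, as available in the 8.x releases (table names and the range boundaries are illustrative):

  -- The parent stays empty; each child holds one day (or week) of readings.
  CREATE TABLE readings (client int NOT NULL, sensor int NOT NULL, ts int NOT NULL, value int NOT NULL);

  CREATE TABLE readings_20081122 (
      CHECK (ts >= 1227312000 AND ts < 1227398400)   -- illustrative one-day range
  ) INHERITS (readings);

  CREATE INDEX readings_20081122_idx ON readings_20081122 (client, sensor, ts);

  -- COPY into the current child only; with constraint_exclusion = on,
  -- time-range queries skip partitions whose CHECK range cannot match.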
[GENERAL] [Q]updating multiple rows with Different values
Hello, I searched the documentation, FAQ and mailing list archives (the mailing list archive search is voluminous :-) ) but could not find an answer: I would like to be able to update several rows to different values at the same time. In Oracle this used to be called an 'array' update or 'collect' update or 'bulk' update -- but those keywords did not turn up anything for PostgreSQL. For example, tbl_1 has two columns, id and col1: update tbl_1 set col1=3 where id=25, col1=5 where id=26 I am using PHP PDO (and hoping that if there is a mechanism within postgresql to do that, PDO will support it as well). Thank you in advance, VSP -- V S P [EMAIL PROTECTED] -- http://www.fastmail.fm - Access all of your messages and folders wherever you are -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
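One way to express this in a single statement (a sketch; PostgreSQL 8.2 and later accept a VALUES list in FROM, and it can be sent through PDO like any other query):

  UPDATE tbl_1 AS t
  SET col1 = v.col1
  FROM (VALUES (25, 3),
               (26, 5)) AS v (id, col1)
  WHERE t.id = v.id;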
Re: [GENERAL] Postgres mail list traffic over time
Magnus Hagander wrote: Bruce Momjian wrote: Ron Mayer wrote: Joshua D. Drake wrote: On Fri, 2008-11-21 at 08:18 -0800, Ron Mayer wrote: Bruce Momjian wrote: Tom Lane wrote: ... harder to keep up with the list traffic; so something is happening that a simple volume count doesn't capture. If measured in bytes of the gzipped mbox it ... Its because we eliminated the -patches mailing list. That's part of it. I've added -patches to the graph at http://0ape.com/postgres_mailinglist_size/ as well as a graph of hackers+patches combined; and it still looks like hackers+patches is quite high in the past 3 months. With hackers+patches it looks like 2002-08 was the biggest month; but the past 3 months still look roughly twice late 2007's numbers. Can someoone graph CVS traffic, showing the historical number of commits and number of changed lines? Ohloh has some graphs, are they detailed enough? http://www.ohloh.net/projects/postgres/analyses/latest I saw that but that only shows total lines, not the number of lines changed, or commits per hour, etc. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sun, Nov 23, 2008 at 1:02 AM, Tom Lane [EMAIL PROTECTED] wrote: Alvaro Herrera [EMAIL PROTECTED] writes: The problem is, most likely, on updating the indexes. Heap inserts should always take more or less the same time, but index insertion requires walking down the index struct for each insert, and the path to walk gets larger the more data you have. It's worse than that: his test case inserts randomly ordered keys, which means that there's no locality of access during the index updates. Once the indexes get bigger than RAM, update speed goes into the toilet, because the working set of index pages that need to be touched also is bigger than RAM. That effect is going to be present in *any* standard-design database, not just Postgres. It's possible that performance in a real-world situation would be better, if the incoming data stream isn't so random; but it's hard to tell about that with the given facts. One possibly useful trick is to partition the data by timestamp with partition sizes chosen so that the indexes don't get out of hand. But the partition management might be enough of a PITA to negate any win. regards, tom lane Thanks for your feedback! This is just as I supposed, but I didn't have the Postgres experience to be certain. I'll include your conclusion in my report. Ciprian Craciun. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sun, Nov 23, 2008 at 12:26 AM, Alvaro Herrera [EMAIL PROTECTED] wrote: Ciprian Dorin Craciun escribió: I've also tested SQLite3 and it has the same behavior as Postgres... Meaning at the beginning it goes really nicely at 20k inserts, drops to about 10k inserts, but after a few million records, the HDD LED starts to blink non-stop, and then it drops to under 1k... The problem is, most likely, on updating the indexes. Heap inserts should always take more or less the same time, but index insertion requires walking down the index struct for each insert, and the path to walk gets larger the more data you have. Postgres does not have bulk index insert, which could perhaps get you a huge performance improvement. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. I don't think the index depth is a problem. For example, in the case of BerkeleyDB with BTree storage, the tree height is 3 after 100m inserts... So this is not the problem. I think the problem is that after a certain amount of data, periodically the entire index is touched, and in this case the HDD becomes a bottleneck... (Demonstrated by the vmstat 1 output I've put in a previous email.) Ciprian. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sun, Nov 23, 2008 at 3:09 AM, Scott Marlowe [EMAIL PROTECTED] wrote: On Sat, Nov 22, 2008 at 5:54 PM, Scara Maccai [EMAIL PROTECTED] wrote: Since you always need the timestamp in your selects, have you tried indexing only the timestamp field? Your selects would be slower, but since client and sensor don't have that many distinct values compared to the number of rows you are inserting maybe the difference in selects would not be that huge. Even better might be partitioning on the timestamp. If all access is in a certain timestamp range it's usually a big win, especially because he can move to a new table every hour / day / week or whatever and merge the old one into a big old data table. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general Yes, it would speed up the inserts tremendously... I've tested it and the insert speed is somewhere at 200k-100k. But unfortunately the query speed is not good at all, because most queries are for a specific client (and sensor) in a given time range... Ciprian Craciun. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store high volume streams of sensor readings
On Sun, Nov 23, 2008 at 12:32 AM, Alvaro Herrera [EMAIL PROTECTED] wrote: On 21 Nov, 13:50, [EMAIL PROTECTED] (Ciprian Dorin Craciun) wrote: What have I observed / tried: * I've tested without the primary key and the index, and the results were the best for inserts (600k inserts / s), but the readings worked extremely slowly (due to the lack of indexing); * with only the index (or only the primary key) the insert rate is good at start (for the first 2 million readings), but then drops to about 200 inserts / s; I didn't read the thread so I don't know if this was suggested already: bulk index creation is a lot faster than retail index inserts. Maybe one thing you could try is to have an unindexed table to do the inserts, and a separate table that you periodically truncate, refill with the contents from the other table, then create index. Two main problems: 1. querying during the truncate/refill/reindex process (you can solve it by having a second table that you rename in place); 2. the query table is almost always out of date. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general The concerns you have listed are very important to me... I will use the database not only for archival and offline analysis, but also for realtime queries (like what is the power consumption in the last minute)... Of course I could use Postgres only for archival like you've said, and some other solution for realtime queries, but this adds complexity to the application... Thanks, Ciprian Craciun. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general