[GENERAL] REINDEX requirement?
How do I know whether an index requires a REINDEX?
Re: [GENERAL] Run postgresql\9.0\data on a RAID-1 disk on my Local Area Network (LAN)
Hello, From the logs it seems the initdb was successful; however, the database server failed to start afterwards. Can you manually start the database service? If yes, try installing PostGIS from StackBuilder after that. If not, what's the error message you are getting?

2010/11/9 Anders Söderman anders.soder...@gisassistans.se

Hello. I've tried many times to install in different ways. Hopefully I removed all files from previous installations before trying again. I don't know if the logs are useful; I've tried to understand what the log says, without result. I think, right now, that I should do a PostgreSQL/PostGIS installation using all default values and then try to change the postgresql\9.0\data folder to the RAID-1 LAN disk. Thanks, Anders Söderman, Stockholm, Sweden

From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Sachin Srivastava
Sent: 8 November 2010 08:56
To: Anders Söderman
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Run postgresql\9.0\data on a RAID-1 disk on my Local Area Network (LAN)

Hello, On Nov 7, 2010, at 1:22 PM, Anders Söderman wrote:

Hello, I'm trying to install PostgreSQL and PostGIS, "asking" PostgreSQL to store data on my new Netgear NAS Duo LAN disk in a Windows XP SP2 environment. In the Setup Wizard I accept C:\Program\PostgreSQL\9.0 and I change C:\Program\PostgreSQL\9.0\data to either //R1/aa_db/PostgreSQL\9.0\data or P:\PostgreSQL\9.0\data, where P:\ = //R1/aa_db/ on the computer from which I'm installing PostgreSQL. R1 is the name of the Netgear disk and aa_db the share/folder I want to store the PostgreSQL data in. I have tried every possible combination I can think of. Sometimes the installation seems to work, but in pgAdmin "Connected?" says "No". If I try to install PostGIS afterwards, it crashes during the EnterpriseDB StackBuilder installation. After several hours I can't find any useful information in the manual or on the forums.
If anybody could give me a hint where I should look, I would be so happy. Regards, Anders Söderman, Stockholm, Sweden

What exactly is the error message you get while installing PostgreSQL? Can you attach the installation log (%TEMP%\install-postgresql.log or %TEMP%\bitrock_installer_some number.log)?

-- Regards, Sachin Srivastava, EnterpriseDB (http://www.enterprisedb.com), the Enterprise PostgreSQL company.

-- Regards, Sachin Srivastava, EnterpriseDB, India
Re: [GENERAL] Problem with frequent crashes related to semctl
On 11/05/2010 05:02 PM, Tom Lane wrote:

Adrian Maier adrian.ma...@thalesgroup.com writes: I am running PostgreSQL 8.3.5 on a Linux machine (Ubuntu 10.04). Sometimes it happens that connecting to the database fails with the error: FATAL: semctl(360458, 3, SETVAL, 0) failed: Invalid argument (PGError) If I restart Postgres the problem gets fixed. It doesn't matter how I connect to the database: I saw this happening from psql, from JDBC, and from Ruby.

The most likely theory is that something deleted Postgres' semaphores out from under it. You could check this by noting the output of ipcs -s while the database is running normally, and then comparing to the output after it starts to fail. If that does seem to be what's happening, look around for root-executed scripts doing ipcrm calls.

Tom, thanks for the tip. The semaphores are indeed deleted with ipcrm from a script. The script is (re)starting another application and it simply erases all the semaphores without taking into account the possibility that some semaphores actually belong to another process... I'll simply move the Postgres installation to be started by another user. Doing this should protect the Postgres semaphores against the script (which is executed as a regular user, not root). Thanks, Adrian Maier -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] temporary table as a subset of an existing table and indexes
Hello Merlin,

So far the improvement in responsiveness has been very noticeable, even without indexing the temporary tables. Of course, this is just trading accuracy for speed, as I simply narrow the search space arbitrarily... The schema I am working on is close to the one I am referencing in this thread: http://archives.postgresql.org/pgsql-general/2010-11/msg00272.php

Since we want to implement full text search and tags querying, it can lead to rather complex autogenerated queries such as this one (find log lines with the word 'root' in them, dated from 11/04 to 11/06, where the 'program' tag is sshd and the 'severity_code' tag is less than 3):

EXPLAIN ANALYZE SELECT r.logid FROM (((
  SELECT lcond84.logid, lcond84.date FROM (
    SELECT tmp84.logid, tmp84.date FROM logs tmp84
    WHERE TRUE
      AND ( to_tsvector('simple',tmp84.body) @@ plainto_tsquery('simple','root') )
      AND tmp84.date > '2010-11-04 10:22:06.26'
      AND tmp84.date < '2010-11-06 10:22:06.26'
  ) AS lcond84
  NATURAL JOIN (
    SELECT tmp85.logid FROM tags tmp85
    WHERE FALSE
      OR (tmp85.name='severity_code' AND num_lt(tmp85.value, 3) )
      OR (tmp85.name='program' AND tmp85.value = CAST(ROW('sshd') AS tagvalue) )
    GROUP BY tmp85.logid
    HAVING COUNT(tmp85.logid) = 2
  ) AS tcond84
))) AS r
ORDER BY r.date DESC LIMIT 1000;

Giving the following query plan:

Limit (cost=765445.54..765445.56 rows=9 width=16) (actual time=34744.257..34744.257 rows=0 loops=1)
  -> Sort (cost=765445.54..765445.56 rows=9 width=16) (actual time=34744.255..34744.255 rows=0 loops=1)
       Sort Key: tmp84.date
       Sort Method: quicksort Memory: 17kB
       -> Hash Join (cost=765005.46..765445.40 rows=9 width=16) (actual time=34744.202..34744.202 rows=0 loops=1)
            Hash Cond: (tmp85.logid = tmp84.logid)
            -> HashAggregate (cost=758440.29..758669.77 rows=15299 width=8) (actual time=33343.816..33343.816 rows=0 loops=1)
                 Filter: (count(tmp85.logid) = 2)
                 -> Bitmap Heap Scan on tags tmp85 (cost=92363.26..757225.45 rows=242968 width=8) (actual time=20676.354..33294.252 rows=32864 loops=1)
                      Recheck Cond: ((name = 'severity_code'::text) OR (name = 'program'::text))
                      Filter: (((name = 'severity_code'::text) AND num_lt(value, 3::double precision)) OR ((name = 'program'::text) AND ((value).storedvalue = 'sshd'::text)))
                      -> BitmapOr (cost=92363.26..92363.26 rows=2148625 width=0) (actual time=19688.915..19688.915 rows=0 loops=1)
                           -> Bitmap Index Scan on nameval_idx (cost=0.00..30388.35 rows=707841 width=0) (actual time=19337.358..19337.358 rows=708719 loops=1)
                                Index Cond: (name = 'severity_code'::text)
                           -> Bitmap Index Scan on nameval_idx (cost=0.00..61853.42 rows=1440784 width=0) (actual time=351.551..351.551 rows=1484703 loops=1)
                                Index Cond: (name = 'program'::text)
            -> Hash (cost=6553.06..6553.06 rows=969 width=16) (actual time=1400.378..1400.378 rows=32516 loops=1)
                 -> Bitmap Heap Scan on logs tmp84 (cost=215.51..6553.06 rows=969 width=16) (actual time=1104.226..1383.745 rows=32516 loops=1)
                      Recheck Cond: (to_tsvector('simple'::regconfig, body) @@ '''root'''::tsquery)
                      Filter: ((date > '2010-11-04 10:22:06.26+01'::timestamp with time zone) AND (date < '2010-11-06 10:22:06.26+01'::timestamp with time zone))
                      -> Bitmap Index Scan on fulltext_body_idx (cost=0.00..215.27 rows=1740 width=0) (actual time=1097.874..1097.874 rows=64340 loops=1)
                           Index Cond: (to_tsvector('simple'::regconfig, body) @@ '''root'''::tsquery)
Total runtime: 34756.938 ms

This one isn't too bad, but the runtime seems to increase exponentially with table size. Therefore, using a temporary table based on the date condition can cut the query time by a factor of up to ten (table creation included, and provided the resulting table isn't too big; I make a COUNT check prior to creation so that I can limit the table size manually if needed). But of course, I'd rather have speed AND accuracy... To make things worse, the tables tend to grow very quickly since, as you might have guessed, I am working on the database part of a logs collector; the current implementation doesn't scale well as the data grows.
I hope this makes things clearer. Feel free to ask if you need more clarification, and thanks for your time. Matthieu

On 08/11/2010 22:26, Merlin Moncure wrote: On Mon, Nov 8, 2010 at 12:15 PM, Matthieu Huin matthieu.h...@wallix.com wrote: Greetings all, I am trying to optimize SELECT queries on a large table (10M rows and more) by using temporary tables that are subsets of my main table, thus narrowing the search space
[GENERAL] Looking for PostgreSQL Folks in New Orleans area
Howdy folks, We're looking for some PostgreSQL users / advocates in the New Orleans area for some community outreach activities, like PGDays and User Groups. If you are in that area and interested in helping, or know who to talk to, please drop me a line, thanks! Robert Treat play: http://www.xzilla.net work: http://www.omniti.com
Re: [GENERAL] temporary table as a subset of an existing table and indexes
On Tue, Nov 9, 2010 at 4:47 AM, Matthieu Huin matthieu.h...@wallix.com wrote: [quoted message and EXPLAIN ANALYZE output snipped; see previous message]

How are you partitioning the tags? Is the partitioned query doing the same job as the non-partitioned query? Is date a forced criteria? (and if it is, have you considered date partition/brute force?) merlin
Re: [GENERAL] Why facebook used mysql ?
On Mon, Nov 8, 2010 at 11:24 PM, Sandeep Srinivasa s...@clearsenses.com wrote: There was an interesting post today on highscalability - http://highscalability.com/blog/2010/11/4/facebook-at-13-million-queries-per-second-recommends-minimiz.html The discussion/comments touched upon why mysql is a better idea for Facebook than Postgres.

Here's an interesting one: postgresql might not be a good fit for this type of application, but the reasoning given in the article is really suspicious. The true answer was hinted at in the comments: we chose it first, and there was never a reason to change it. It really comes down to this: they probably don't need much from the database other than a distributed key-value store, and they built a big software layer on top of that to manage it. Hm, I use facebook and I've seen tons of inconsistent answers, missing notifications and such. I wonder if there's a connection there... merlin
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 3:50 PM, Merlin Moncure mmonc...@gmail.com wrote: [quoted message snipped]

I agree with Merlin. There is a surprisingly big number of good technology companies (including Google) out there using MySQL. For some time I have been wondering why, and have come up with a few (possibly wrong) theories, such as: these companies are started by application developers, not database experts; the cost (effort) of changing to another database engine is substantial, given that there are probably already so many inconsistencies in their current data setup, coupled with a considerable amount of inconsistency cover-up code in the application programs; and maybe the IT team is doubling as a fire-fighting department, constantly putting out data-driven fires. This is then compounded by the rapid increase in data. Allan.
Re: [GENERAL] Why facebook used mysql ?
On Nov 9, 2010, at 7:04 AM, Allan Kamau wrote: have come up with a few (possibly wrong) theories.

They all sound reasonable. I think you missed an important one though: aggressive (and sometimes even outright false) promotion and sales by the company MySQL AB. When I started looking at databases, you didn't have to look very hard to find PostgreSQL, but you did have to at least make a minimal effort. Also, my understanding is that if you go way back on the PostgreSQL timeline to versions 6 and the earliest 7.x, it was a little shaky. (I started with 7.3 or 7.4, and it has been rock solid.) -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
Re: [GENERAL] REINDEX requirement?
On Tue, Nov 9, 2010 at 4:26 AM, AI Rumman rumman...@gmail.com wrote: How do I know whether an index requires REINDEX?

Well, the REINDEX page: http://www.postgresql.org/docs/current/static/sql-reindex.html gives a few examples of why you might need to reindex. I think the most common reason would probably be index bloat. See http://wiki.postgresql.org/wiki/Index_Maintenance under Index Bloat for links to some handy queries to see whether your tables and/or indexes suffer from bloat. Josh
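As an illustrative sketch of the bloat check (the index name here is a placeholder, not from the thread), comparing an index's on-disk size before and after a rebuild is a quick way to confirm bloat:

```sql
-- Placeholder index name; pg_relation_size reports the on-disk size.
SELECT pg_size_pretty(pg_relation_size('orders_pkey'));  -- size before

REINDEX INDEX orders_pkey;  -- note: blocks writes to the table while it runs

SELECT pg_size_pretty(pg_relation_size('orders_pkey'));  -- size after
```

If the rebuilt index is dramatically smaller, the old one was bloated. On a busy table, building a replacement with CREATE INDEX CONCURRENTLY and swapping it in avoids the long lock.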
Re: [GENERAL] Failed archive_command copy - number of attempts configurable?
On Tue, Nov 9, 2010 at 4:01 AM, dan.m.harris daniel.har...@metaswitch.com wrote: But then the primary retries this another 49 times! So 150 attempts in all. What I need to know is whether these numbers are configurable?

No.

Can they be timed? How long before the primary stops retrying altogether?

It retries forever, until the archive becomes available again. BTW, since the primary cannot remove an unarchived WAL file from the pg_xlog directory, unless you fix the archive soon the primary might run out of disk space and hit a PANIC error. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
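Relatedly, the archive_command itself can be written so that a failure is reported immediately and an existing file is never silently overwritten; a typical postgresql.conf sketch (the archive path is a placeholder, not the poster's setup):

```
# postgresql.conf -- illustrative fragment; /mnt/archive is a placeholder path
archive_mode = on
archive_command = 'test ! -f /mnt/archive/%f && cp %p /mnt/archive/%f'
```

The command must exit with status zero only on success; any nonzero exit makes the server keep the WAL segment in pg_xlog and retry, which is exactly why disk usage there needs monitoring while the archive location is down.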
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 10:26 AM, Scott Ribe scott_r...@killerbytes.com wrote: Also, my understanding is that if you go way back on the PostgreSQL timeline to versions 6 and earliest 7.x, it was a little shaky. (I started with 7.3 or 7.4, and it has been rock solid.)

In those same times, mysql was also, um, other than rock solid. I have somewhere a personal email from Monty describing how to crash-recover corrupted MyISAM data files (I was customer number 13, I believe... I wish I still had that support contract certificate as an artifact).
Re: [GENERAL] Why facebook used mysql ?
Vick Khera vi...@khera.org writes: On Tue, Nov 9, 2010 at 10:26 AM, Scott Ribe scott_r...@killerbytes.com wrote: Also, my understanding is that if you go way back on the PostgreSQL timeline to versions 6 and earliest 7.x, it was a little shaky. (I started with 7.3 or 7.4, and it has been rock solid.) In those same times, mysql was also, um, other than rock solid.

I don't have enough operational experience with mysql to speak to how reliable it was back in the day. What it *did* have over postgres back then was speed. It was a whole lot faster, particularly on the sort of single-stream-of-simple-queries cases that people who don't know databases are likely to set up as benchmarks. (mysql still beats us on cases like that, though not by as much.) I think that drove quite a few early adoption decisions, and now folks are locked in; the cost of conversion outweighs the (perceived) benefits. regards, tom lane
Re: [GENERAL] Why facebook used mysql ?
Hey all, IMO they chose MySQL because they had no knowledge of PostgreSQL or of valid database design. Just a garbage dump of data for SELECTing, with minimal effort on data integrity and database server programming (à la a typical PHP project). Sorry :-)

2010/11/9 Tom Lane t...@sss.pgh.pa.us [quoted message snipped]

-- // Dmitriy.
Re: [GENERAL] Why facebook used mysql ?
2010/11/9 Tom Lane t...@sss.pgh.pa.us: [quoted message snipped]

Facebook has written Flashcache, which "is built primarily as a block cache for InnoDB but is general purpose and can be used by other applications as well." https://github.com/facebook/flashcache/ A good tool, by the way. It is the only place where I like to see an SSD disk. (Not at Facebook specifically, but with 'volatile' data.)

-- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
Re: [GENERAL] I guess I'm missing something here WRT FOUND
On 11/08/2010 09:11 PM, Ralph Smith wrote: How is COLLEEN not there and there at the same time?

NOTICE: did not = 11 K = 42
CONTEXT: PL/pgSQL function get_word line 37 at perform
NOTICE: value = COLLEEN
CONTEXT: PL/pgSQL function get_word line 29 at perform
ERROR: duplicate key violates unique constraint uniq_tokens
CONTEXT: PL/pgSQL function get_word line 30 at SQL statement

/* Generate a list of up to 7 tokens from the business table's conformedname
   field. Strip off leading and trailing commas and quotes, etc.
   Results are inserted into table zbus_tokens, not sorted. */
CREATE OR REPLACE FUNCTION get_word() RETURNS VOID AS '
DECLARE
  business business%ROWTYPE ;
  bname    varchar(100) ;   --business.conformedname%TYPE ;
  Word     varchar(100) ;
  Word2    varchar(100) ;
  Wcount   INTEGER ;
  I        BIGINT DEFAULT 0 ;
  J        BIGINT DEFAULT 0 ;
  K        BIGINT DEFAULT 0 ;
  IsThere  INT ;
BEGIN
  FOR business IN SELECT * FROM business LIMIT 500 LOOP
    bname = business.conformedname ;
    I = I + 1 ;
    FOR Wcount IN 1..7 LOOP
      Word  = split_part(bname, '' '', Wcount) ;
      Word2 = ltrim(Word, ''!?.%()+$*/0123456789'') ;
      Word  = rtrim(Word2, ''!?.()+$*/0123456789'') ;
      Word2 = rtrim(ltrim(Word, '',''), '','') ;
      Word  = rtrim(ltrim(Word2, ), ) ;
      IF LENGTH(Word) > 0 THEN
        Word2 = substring(Word from 1 for 50) ;
        -- PERFORM SELECT COUNT(*) FROM zbus_tokens WHERE token = Word2 ;
        -- IF FOUND THEN
        PERFORM RNotice1(1, ''value'', Word2) ;            -- line 29
        INSERT INTO zbus_tokens (token) VALUES (Word2) ;
        J = J + 1 ;
        IF J % 100 = 0 THEN
          PERFORM Rnotice2(1, ''Row'', I, ''Inserts'', J) ;
        END IF ;
      ELSE
        K = K + 1 ;
        PERFORM RNotice2(1, ''did not'', I, ''K'', K) ;    -- line 37
        -- END IF ;
      END IF ;
    END LOOP ;
  END LOOP ;
  RETURN ;
END ;
' LANGUAGE plpgsql;

==========================================================
SELECT get_word();
SELECT token, count(token) FROM zbus_tokens GROUP BY 1 ORDER BY 2 DESC ;
SELECT count(*) FROM zbus_tokens WHERE token = 'COLLEEN' ;
drop function get_word() ;
truncate zbus_tokens ;
drop table zbus_tokens ;
create table zbus_tokens (id bigserial, token varchar(50), CONSTRAINT uniq_tokens UNIQUE (token)) ;
==========================================================

Sample business names being tokenized:
DOCTOR FINN'S CARD COMPANY
SPECIALTY MAINTENANCE
RIVERS LANDING RESTAURANT
SEATTLE FUSION FC
PROFESSIONAL PRACTICE ENVIRONMENTS INC
CELEBRATE YOURSELF
NEW ACTIVITEA BEVERAGE CO
KARY ADAM HORWITZ
JOHN CASTRO MAGICIAN
RELIABLE AUTO RENTAL PARKING
COLLEEN CASEY, LMP
COLLEEN CASEY, LMP

THANKS! Again, 7.4 BITES! -- Ralph

I'm wondering if count(*) isn't ALWAYS found?
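The likely answer to the question above lies in the existence check: PL/pgSQL's PERFORM cannot be followed by SELECT (`PERFORM SELECT ...` is a syntax error), and even a working `SELECT count(*)` always returns exactly one row, so FOUND is always true whether or not the token exists. A minimal sketch of an existence test using the function's own variables (shown outside the single-quoted function string, so quotes are not doubled):

```sql
-- Sketch only: test existence rather than counting.
-- count(*) always returns one row, so FOUND would always be true after it;
-- PERFORM with a bare row expression sets FOUND only when a row matched.
PERFORM 1 FROM zbus_tokens WHERE token = Word2 ;
IF FOUND THEN
    -- token already present: count it as a duplicate
    K = K + 1 ;
ELSE
    INSERT INTO zbus_tokens (token) VALUES (Word2) ;
    J = J + 1 ;
END IF ;
```

This also avoids the duplicate-key error, since the INSERT now runs only when the token is genuinely absent.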
Re: [GENERAL] temporary table as a subset of an existing table and indexes
Basically, I take the same query as above and replace all occurrences of the tables logs and tags with temp_logs and temp_tags, created as follows:

CREATE TEMPORARY TABLE temp_logs ON COMMIT DROP AS
  SELECT * FROM logs WHERE condition ORDER BY date DESC LIMIT max_size;

CREATE TEMPORARY TABLE temp_tags ON COMMIT DROP AS
  SELECT * FROM tags WHERE logid IN (SELECT logid FROM temp_logs);

with condition usually defining a date window. As we are experimenting with this approach, date has become a forced criterion. I have experimented with partitioning, but it led to the logid primary key no longer being unique, which was a problem when joining data with the tags table. So the queries are pretty much the same; the boost in speed is simply due to the limitation of the search space.

How are you partitioning the tags? Is the partitioned query doing the same job as the non-partitioned query? Is date a forced criteria? (and if it is, have you considered date partition/brute force?) merlin
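Since each temporary table is scanned several times before the transaction commits, it may also pay to index and analyze it right after creation; a sketch along those lines (the date predicate and limit are placeholders standing in for the poster's condition and max_size):

```sql
CREATE TEMPORARY TABLE temp_logs ON COMMIT DROP AS
    SELECT * FROM logs
    WHERE date > now() - interval '2 days'   -- placeholder condition
    ORDER BY date DESC
    LIMIT 100000;                            -- placeholder max_size

CREATE INDEX temp_logs_date_idx ON temp_logs (date);
ANALYZE temp_logs;  -- give the planner real statistics for the subset
```

The explicit ANALYZE matters because autovacuum does not see temporary tables, so without it the planner guesses at their row counts.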
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: [quoted message snipped]

Postgres 7.2 brought non-blocking vacuum. Before that, you could pretty much write off any 24x7-duty applications; dealing with dead tuples was just too much of a headache. The mysql of the time, 3.23, was fast but locky and utterly unsafe. It has been easier to run, though, until recently (8.4 really changed things). Postgres has been relatively disadvantaged in terms of administrative overhead, which is a bigger deal than sql features, replication, performance, etc. for high-load website type cases. The heap FSM, tunable autovacuum, checkpoint management, a smarter/faster statistics collector, and more backup options may not be as sexy as replication etc., but they are very appealing features if you are running 50 database servers backing a monster web site. Dumping sys v ipc for mmap would be a hypothetical improvement in that vein :-) (aiui, it is not possible though). merlin
Re: [GENERAL] Why facebook used mysql ?
--- On Tue, 11/9/10, Gauthier, Dave dave.gauth...@intel.com wrote: A different slant on this has to do with licensing and $$. Might Oracle decide some day to start charging for their new-found DB? They are a for-profit company that's beholden to their shareholders LONG before an open software community. Consumers like Facebook and Google have deep pockets, something corporate executives really don't dismiss lightly.

This is just FUD. MySQL is GPL'd, just like Linux is. To say you should avoid MySQL because Oracle may someday start charging for it is like saying you should avoid Linux because Red Hat may someday start charging for it. That makes no sense, especially since both Oracle and Red Hat are already charging for their products. That doesn't mean you can't keep using free Linux and MySQL.
Re: [GENERAL] finding the other statement causing a sharelock
Ivan Sergio Borgonovo m...@webthatworks.it writes: I have some strong suspicions... and I'd like to exit earlier from a function if a process is running, but I'm not really sure how to add a semaphore...

Maybe pg_try_advisory_lock() would help you there? http://www.postgresql.org/docs/8.3/static/explicit-locking.html#ADVISORY-LOCKS http://www.postgresql.org/docs/8.3/static/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
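A minimal sketch of that suggestion (the key 12345 is an arbitrary application-chosen constant, not anything from the thread): pg_try_advisory_lock returns immediately with true or false instead of blocking, so a function can bail out early when another session already holds the lock:

```sql
-- Try to acquire an application-level lock without blocking.
SELECT pg_try_advisory_lock(12345);  -- true if acquired, false otherwise

-- ... perform the protected work only if it returned true ...

SELECT pg_advisory_unlock(12345);    -- release when finished
```

Advisory locks are also released automatically at the end of the session, so a crashed client cannot leave the lock stuck forever.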
Re: [GENERAL] Why facebook used mysql ?
Think upgrades.

-----Original Message----- From: Andy [mailto:angelf...@yahoo.com] Sent: Tuesday, November 09, 2010 12:02 PM To: pgsql-general@postgresql.org; Gauthier, Dave Subject: Re: [GENERAL] Why facebook used mysql ?

[quoted message snipped]
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 10:00 AM, Merlin Moncure mmonc...@gmail.com wrote: Postgres 7.2 brought non-blocking vacuum. Before that, you could pretty much write off any 24x7 duty applications -- dealing with dead tuples was just too much of a headache. Amen! I remember watching vacuum run alongside other queries and getting all school-girl giggly over it. Seriously, it was a big big change for pgsql. The mysql of the time, 3.23, was fast but locky and utterly unsafe. True, it was common to see mysql back then just stop, dead. Go to bring it back up and have to repair tables. Postgres has been relatively disadvantaged in terms of administrative overhead, which is a bigger deal than sql features, replication, performance, etc. for high-load, website-type cases. I would say it's a bigger problem for adoption than for high-load sites. If Joe User spends an hour a day keeping his database on his workstation happy, he's probably not happy. If Joe Admin spends an hour a day keeping his 100-machine db farm happy, he's probably REALLY happy that it only takes so long. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
kamaual...@gmail.com (Allan Kamau) writes: I agree with Merlin. There is a surprisingly big number of good technology companies (including Google) out there using MySQL. For some time I have been wondering why and have come up with a few (possibly wrong) theories. Such as: these companies are started by application developers, not database experts; the cost (effort) of changing to another database engine is substantial given that there is probably already so much inconsistency in their current data setup, coupled with a considerable amount of inconsistency cover-up code in the application programs; and maybe the IT team is doubling up as a fire-fighting department constantly putting out the data-driven fires. This is then compounded by the rapid increase in data. This wasn't a good explanation for what happened when Sabre announced they were using MySQL: http://www.mysql.com/news-and-events/generate-article.php?id=2003_33 I used to work at Sabre, and what I saw was *mostly* an Oracle shop, but with significant bastions of IMS, DB2, Teradata, and Informix. Your theory might fit with dumb startups, but certainly not with Sabre, which still has significant deployments of IMS! :-) I actually am inclined to go with less rational explanations; a lot of decisions get made for reasons that do not connect materially (if at all) with the technical issues. One such would be that the lawyers and marketing folk that tend to be at the executive layer do *their* thing of making deals, and when they're busy making deals, the only people interfacing with them are: - Salescritters from the Big O buying them lunch - Other Political Animals that Made The Decision to go with MySQL (or such) and are happy to explain, over golf, that it went fine for us (even if it didn't go entirely so fine; they didn't hear about it) Lunch and golf can have material effects. -- cbbrowne,@,acm.org Rules of the Evil Overlord #67.
No matter how many shorts we have in the system, my guards will be instructed to treat every surveillance camera malfunction as a full-scale emergency. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On 09 Nov 2010, at 7:16 PM, Gauthier, Dave wrote: Think upgrades This is covered by the GPL license. Once you have released code under the GPL, all derivative code - ie upgrades - have to also be released in source form, under the GPL license. Regards, Graham -- -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
Any upgrades that are based on the MySQL source code will be legally required to be released under GPL too. That's the beauty of GPL. Software under MIT or BSD license could be hijacked by private companies. Software under GPL license could not. --- On Tue, 11/9/10, Gauthier, Dave dave.gauth...@intel.com wrote: From: Gauthier, Dave dave.gauth...@intel.com Subject: RE: [GENERAL] Why facebook used mysql ? To: Andy angelf...@yahoo.com, pgsql-general@postgresql.org pgsql-general@postgresql.org Date: Tuesday, November 9, 2010, 12:16 PM Think upgrades -Original Message- From: Andy [mailto:angelf...@yahoo.com] Sent: Tuesday, November 09, 2010 12:02 PM To: pgsql-general@postgresql.org; Gauthier, Dave Subject: Re: [GENERAL] Why facebook used mysql ? --- On Tue, 11/9/10, Gauthier, Dave dave.gauth...@intel.com wrote: A different slant on this has to do with licensing and $$. Might Oracle decide some day to start charging for their new found DB? They are a for-profit company that's beholding to their shareholders LONG before an open software community. Consumers like Facebook and Google have deep pockets, something corporate executives really don't dismiss lightly. This is just FUD. MySQL is GPL'd, just like Linux is. To say you should avoid MySQL because Oracle may someday start charging for it is like saying you should avoid Linux because Red Hat may someday start charging for it. That makes no sense, especially since both Oracle and Red Hat are already charging for their products. Doesn't mean you can't keep using free Linux and MySQL. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On 11/9/2010 10:27 AM, Graham Leggett wrote: This is covered by the GPL license. Once you have released code under the GPL, all derivative code - ie upgrades - have to also be released in source form, under the GPL license. Sorry but this is 100% not true. It may be true for a 3rd party (you release something under the GPL, I enhance it, therefore I am required to release my enhancement under the GPL). But Oracle owns the copyright to the MySql code and therefore they can decide to do whatever they want with it. The only thing they can't do is to 'un-release' existing code released under the GPL. Everything else is possible. Ownership of the copyright trumps the GPL. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 9:28 AM, Andy angelf...@yahoo.com wrote: Any upgrades that are based on the MySQL source code will be legally required to be released under GPL too. That's the beauty of GPL. Upgrades released by Oracle *do not* have to be under the GPL. They own all the IP, and can release future versions under whatever terms they see fit. Other entities do have to use the GPL if they release their own updates. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
In addition to the license a product is currently available under, you need to also consider who owns its copyright; who owns its test suite (which may not be open source at all); who employs all the people who understand the code and who owns the trademarks that identify the product. Red Hat owns none of these things with respect to Linux (although they do for various other products such as their Directory Server and JBoss). -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 10:28 AM, Andy angelf...@yahoo.com wrote: Any upgrades that are based on the MySQL source code will be legally required to be released under GPL too. That's the beauty of GPL. This isn't entirely true. Oracle owns all copyrights to the MySQL source code. They can release a binary-only, commercially licensed version with features that they choose NOT to release under the GPL. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
Not true. As a condition of getting European Commission's approval of its acquisition of Sun/MySQL, Oracle had to agree to continue the GPL release. And there are non-Oracle upgrades from Google, facebook, Percona, etc. So no one is beholden to Oracle. --- On Tue, 11/9/10, Dave Page dp...@pgadmin.org wrote: From: Dave Page dp...@pgadmin.org Subject: Re: [GENERAL] Why facebook used mysql ? To: Andy angelf...@yahoo.com Cc: pgsql-general@postgresql.org pgsql-general@postgresql.org, DaveGauthier dave.gauth...@intel.com Date: Tuesday, November 9, 2010, 12:31 PM On Tue, Nov 9, 2010 at 9:28 AM, Andy angelf...@yahoo.com wrote: Any upgrades that are based on the MySQL source code will be legally required to be released under GPL too. That's the beauty of GPL. Upgrades released by Oracle *do not* have be under GPL. They own all the IP, and can release future versions under whatever terms they see fit. Other entities, do have to use the GPL if they release their own updates. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
At 12:24 PM 11/9/2010, Sandeep Srinivasa wrote: There was an interesting post today on highscalability - http://highscalability.com/blog/2010/11/4/facebook-at-13-million-queries-per-second-recommends-minimiz.html I wonder if anyone can comment on this - especially the part that PG doesn't scale as well as MySQL on multiple cores? The multiple cores part is unlikely to be very relevant in the Facebook context. Scaling over four or eight cores is not a biggie nowadays, right? From what I've seen, the Facebook and Google type companies tend to use LOTS of cheap servers (dual core, quad core, whatever Intel or AMD can churn out cheaply). I doubt they use >=32 core machines for their public-facing apps. What's more important to such companies is the ability to scale over multiple machines. There is no way a single server is going to handle 1 billion users. So they have to design and build their apps accordingly, to not need massive serialization/locking. When you post something on Facebook, nobody else has to wait for your post first. Nobody cares if their FB friend/likes counter is somewhat wrong for a while (as long as it eventually shows the correct figure). So scaling out over multiple machines and sharding isn't as hard. Whereas if you require all posts to have a globally unique ID taken from an integer sequence that increases without any gaps, it becomes a rather difficult problem to scale out over many machines. No matter how many machines you have and wherever they are in the world, every post has to wait for the sequence. As for why they used mysql - probably the same reason why they used php: they're what the original developers used. It doesn't matter so much as long as the app is not that slow and can scale out without too much pain. Regards, Link.
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
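The shard-friendly alternative described above (IDs that are globally unique but not gapless) can be sketched in plain SQL with offset sequences; the names and the two-shard layout are my own illustration, not anything from the thread:

```sql
-- Each shard gets a sequence with a distinct start and a stride equal to
-- the number of shards, so IDs never collide and need no coordination.
-- On shard 0:
CREATE SEQUENCE post_id_seq START 1 INCREMENT BY 2;  -- yields 1, 3, 5, ...
-- On shard 1:
CREATE SEQUENCE post_id_seq START 2 INCREMENT BY 2;  -- yields 2, 4, 6, ...

CREATE TABLE posts (
    id   bigint PRIMARY KEY DEFAULT nextval('post_id_seq'),
    body text
);
```

The resulting IDs interleave with gaps and are not globally ordered - exactly the property being traded away for easy scale-out.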
Re: [GENERAL] Why facebook used mysql ?
On 11/9/2010 10:45 AM, Andy wrote: As a condition of getting European Commission's approval of its acquisition of Sun/MySQL, Oracle had to agree to continue the GPL release. In case anyone is interested in what specifically Oracle agreed to do, this is the text from the decision (they agreed to do the following for 5 years post-deal-closing) : Commitment to enhance MySQL in the future under the GPL. Oracle shall continue to enhance MySQL and make subsequent versions of MySQL, including Version 6, available under the GPL. Oracle will not release any new, enhanced version of MySQL Enterprise Edition without contemporaneously releasing a new, also enhanced version of MySQL Community Edition licensed under the GPL. Oracle shall continue to make the source code of all versions of MySQL Community Edition publicly available at no charge. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Full Vacuum/Reindex vs autovacuum
On Mon, Nov 8, 2010 at 2:06 PM, Jason Long mailing.li...@octgsoftware.com wrote: Every night when there is no activity I do a full vacuum, a reindex, and then dump a nightly backup. Is this optimal with regards to performance? autovacuum is set to the default. In the general case this seems way overkill. Do you suffer from a lot of churn daily? That is, are there bunches of updates? One thing you lose when running vacuum full is the space in the file that is pre-allocated but empty. If you do lots of updates and inserts, you'll be allocating pages and growing the underlying files to hold your data. If you leave the unused space there, it is much faster for postgres just to fill it. The key is keeping that unused space from growing beyond reason... the trick is defining what "within reason" means for your own use case. As for re-index, don't bother. Unless you have some degenerate case (something like a queue, where you always insert values at the tail end of the index and delete from the front end), if you let autovacuum do its work you should remain in a fairly steady state. There are queries you can run against the database to detect how bloated your indexes are after a while, and then reindex if necessary. I find that some of my data needs a reindex about every 4 to 6 months, while others never benefit. I *never* run a vacuum full. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 5:45 PM, Andy angelf...@yahoo.com wrote: Not true. As a condition of getting European Commission's approval of its acquisition of Sun/MySQL, Oracle had to agree to continue the GPL release. And there are non-Oracle upgrades from Google, facebook, Percona, etc. So no one is beholden to Oracle. It is true. The EU commitments are entirely independent of the licencing. Also note that the commitments (http://www.oracle.com/us/corporate/press/042364) are pretty loosely worded. For example, Oracle would be fulfilling them if they released a new Enterprise version with 1000 new features, and a corresponding community version with just one of those features. And after 5 years (nearer to 4 now I guess), those commitments get thrown away entirely. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
David Boreham david_l...@boreham.org writes: In addition to the license a product is currently available under, you need to also consider who owns its copyright; who owns its test suite (which may not be open source at all); who employs all the people who understand the code and who owns the trademarks that identify the product. Indeed. One thing that's particularly worth noting is that the mysql documentation is not, and never has been, freely redistributable. The *real* license risk for mysql in the past was that all the people who were qualified to do serious development worked for the same company. If that company chose to stop releasing updates for free --- as they absolutely had the legal right to do --- the fact that you had the source code for previous releases wasn't really going to do you a whole lot of good. (Now it's possible that users could band together to start their own fork from the last GPL release, and eventually get to the point of doing useful development. We've seen that movie before, in fact: it's called postgres, circa 1996 right after Berkeley abandoned it. So one could hope that after several years you might have a viable development community, but there's gonna be a lot of pain first.) The recent fragmentation of development talent over in the mysql world might change things, but it's still very unclear what the long-term result will be. If I were about to choose a database to bet my company on, I'd be afraid of picking mysql simply because its future development path isn't clear. Oracle may own the copyright, but all the key developers left, so it's definitely not clear that they'll be able to do much with it for some time to come (even assuming that they want to). And who knows which of the forks will succeed? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Porting from MS Access 2007 to PostgreSQL
On Monday 08 November 2010 8:22:51 pm Victor Hooi wrote: Hi, Disclaimer: Not a DBA, nor am I a DB guy, so please excuse any ignorance in the below. *4. MS Access to Postgres* Anyhow, somebody else suggested it might be better to just go straight from the original MS Access database to PostgreSQL. My first question is, what is the current recommended procedure for this? One way I have done this is to use the Make Table Query in Access to create a table in Postgres from one in Access. This assumes you have an ODBC connection set up to Postgres. Cheers, Victor -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] postgresql scalability issue
On 11/08/10 16:33, umut orhan wrote: Hi all, I've collected some interesting results during my experiments whose cause I couldn't figure out, and I need your assistance. I'm running PostgreSQL 9.0 on a quad-core machine with a two-level on-chip cache hierarchy. Based on this information, you are most likely running on an Intel Xeon 5000-5400 series CPU, right? It probably doesn't matter much since you apparently have only a single socket populated, but it is a somewhat old architecture known for its FSB bottleneck to the memory. You should write some details about your hardware: at least CPU model/speed and memory speed, and software (which OS? 32-bit or 64-bit?) PostgreSQL has a large and warmed-up buffer cache; thus, no disk I/O is observed during experiments (i.e. for each query the buffer cache hit rate is 100%). I'm pinning each query/process to an individual core. Queries are simple read-only queries (only selects). Nested loop (without materialize) is used for the join operator. When I pin a single query to an individual core, its execution time is observed as 111 seconds. This result is my base case. Then, I fire two instances of the same query concurrently and pin them to two different cores separately. However, each execution time becomes 132 seconds in this case. In a similar trend, execution times are increasing for the three-instance (164 seconds) and four-instance (201 seconds) cases too. What I was expecting is a linear improvement in throughput (at least). I tried several different queries and got the same trend each time. Are you measuring wall-clock execution time for queries in parallel? I.e. start measuring when the first query is started (asynchronously?) and stop when the last one is finished? Did you try the same measurement without pinning? I wonder why execution times of individual queries are increasing when I increase the number of their instances.
Btw, I don't think on-chip cache hit/miss rates make a difference since L2 cache misses are decreasing as expected. I'm not an expert in PostgreSQL internals. Maybe there is a lock-contention (spinlocks?) occurring even if the queries are read-only. Anyways, all ideas are welcome. As others said, memory bandwidth is the most likely suspect here. CPUs are unfortunately so much faster than memory and memory buses that they frequently have to wait. Unless PostgreSQL uses the exclusive lock model instead of shared-exclusive, there shouldn't be much contention for the shared buffers. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
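A hedged sketch of how this kind of scaling measurement can be reproduced with stock tools (the database name, scale factor, and durations are illustrative; pgbench ships with PostgreSQL):

```shell
# Initialize a test database, then run the same select-only workload at
# increasing client counts and compare reported throughput (tps).
pgbench -i -s 10 testdb
pgbench -S -c 1 -j 1 -T 30 testdb
pgbench -S -c 2 -j 2 -T 30 testdb
pgbench -S -c 4 -j 4 -T 30 testdb
```

If tps fails to scale while cores sit partly idle, memory bandwidth or lock contention are the usual suspects, as discussed above.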
Re: [GENERAL] Porting from MS Access 2007 to PostgreSQL
On Mon, Nov 8, 2010 at 11:39 PM, Arnaud Lesauvage arnaud.lis...@codata.eu wrote: I can't really give you any advice about this, but we kept using Access as our Frontend and everything runs fine. You'll just have to make sure that you have read about the 'boolean - integer' problem. This article is a nice start I think : http://www.postgresonline.com/journal/archives/24-Using-MS-Access-with-PostgreSQL.html I've played with mapping Access's [yes/no] datatype to PostgreSQL BOOLEANs for a while. And after all of that *fun*, I've since started mapping PostgreSQL's INTEGER to Access's [yes/no] datatype. Life is so much better now. -- Regards, Richard Broersma Jr. Visit the Los Angeles PostgreSQL Users Group (LAPUG) http://pugs.postgresql.org/lapug -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
hi, I am the OP. With due respect to everyone (and sincere apologies to Richard Broersma), my intention was not to create a thread about MySQL/Oracle's business practices. It was about the technical discussion on Highscalability - I have been trying to wrap my head around the concept of multiple core scaling for Postgres, especially beyond 8 cores (like Scott's Magny-Cours example). My doubt arises from whether Postgres depends on the kernel scheduler for multiple CPU/core utilization. If that is the case, then does using FreeBSD vs Linux give rise to any differences in scaling? Taking the question one step further, do different Linux kernels (and schedulers) impact Postgres scalability? The Phoronix Test Suite already tests Linux kernel releases for regressions in performance w.r.t. postgres DB performance (e.g. http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=1), but doesn't particularly focus on multiple cores. Is it something that should be benchmarked? thanks -Sandeep P.S. on the topic of scalability, here is another article - http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html , where people have asked if a similar thing can be done using Postgres UDF or a marshalling ODBA http://scm.ywesee.com/?p=odba/.git;a=summary
Re: [GENERAL] Why facebook used mysql ?
On 11/9/2010 11:10 AM, Sandeep Srinivasa wrote: It was about the technical discussion on Highscalability - I have been trying to wrap my head around the concept of multiple core scaling for Postgres, especially beyond 8 core (like Scott's Magny Coeurs example). My doubt arises from whether Postgres depends on the kernel scheduler for multiple CPU/core utilization. If that is the case, then does using FreeBSD vs Linux give rise to any differences in scaling? Hmm...typically multi-core scaling issues are in the area of memory contention and cache coherence (and therefore are for the most part not dependent on the OS and its scheduler). -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] [JDBC] psqlexception syntax error at or near $
[Please keep the list copied and please don't top-post.] [Also, this thread really belongs on pgsql-general or pgsql-novice; it doesn't seem to have anything to do with JDBC, so I'm blind copying the JDBC list and posting to general to move it.] Steven Dahlin pgdb.sldah...@gmail.com wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: CREATE OR REPLACE FUNCTION system_info_fnc01() RETURNS trigger AS $$ raise exception ''System Info record already present''; $$ When you use dollar quoting you should not double your apostrophes. Try: raise exception 'System Info record already present'; That makes no difference. It seems to make a difference on my machine: test=# CREATE OR REPLACE FUNCTION system_info_fnc01() RETURNS trigger AS test-# $$ test$# declare test$# iCnt integer; test$# begin test$# select count(*) into iCnt from system_info; test$# if ( iCnt > 0 ) then test$# raise exception ''System Info record already present''; test$# end if; test$# return new; test$# end; test$# $$ test-# LANGUAGE 'plpgsql' IMMUTABLE; ERROR: syntax error at or near "System" LINE 8: raise exception ''System Info record already present''; ^ test=# CREATE OR REPLACE FUNCTION system_info_fnc01() RETURNS trigger AS test-# $$ test$# declare test$# iCnt integer; test$# begin test$# select count(*) into iCnt from system_info; test$# if ( iCnt > 0 ) then test$# raise exception 'System Info record already present'; test$# end if; test$# return new; test$# end; test$# $$ test-# LANGUAGE 'plpgsql' IMMUTABLE; CREATE FUNCTION If you're getting different results, please copy/paste actual code. What version of PostgreSQL are you running? -Kevin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 11:46 PM, David Boreham david_l...@boreham.org wrote: Hmm...typically multi-core scaling issues are in the area of memory contention and cache coherence (and therefore are for the most part not dependent on the OS and its scheduler). If it is independent of the OS, then how does one go about tuning it? Consider this - I get a 12-core server on which I want multiple webserver instances + DB. Can one create CPU pools (say cores 1,2,3 for webservers, 4,5,6,7 for the DB, etc.)? I know about taskset, but should one be using it?
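For the CPU-pool idea, taskset can do exactly this at process-launch time. A hedged sketch (the core ranges and data directory are my assumptions for a hypothetical 12-core box):

```shell
# Pin the postmaster (children inherit the affinity mask) to cores 4-7,
# leaving 0-3 for the web servers. Path and core ranges are illustrative:
#   taskset -c 4-7 pg_ctl -D /var/lib/pgsql/data start
# Change the affinity of an already-running process by PID:
#   taskset -cp 4-7 <postmaster-pid>

# Trivial demonstration of the wrapper form:
taskset -c 0 echo "pinned to core 0"
```

Whether pinning helps is workload-dependent; the kernel scheduler usually does a reasonable job on its own, so measure before and after.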
Re: [GENERAL] REINDEX requirement?
-Original Message- From: AI Rumman [mailto:rumman...@gmail.com] Sent: Tuesday, November 09, 2010 3:26 AM To: pgsql-general General Subject: REINDEX requirement? How do I know that an index requires REINDEX? Look at the results of the pgstatindex(...) function for the specific index. It's part of the pgstattuple contrib module - read up on it in the docs. Regards, Igor Neyman -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
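Igor's suggestion in concrete form (the index name is a placeholder; pgstattuple must be installed in the database first):

```sql
-- Hedged example: inspect a specific index for bloat.
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('public.my_index');
-- A low avg_leaf_density (lots of wasted space per leaf page) or a high
-- leaf_fragmentation suggests the index is a candidate for REINDEX.
```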
Re: [GENERAL] I guess I'm missing something here WRT FOUND
Yeah, you're right, Alban, that looks bad, but it was an artifact of 'try-this, try-this, no, try-this'. The table is empty, and unfortunately remains that way; nothing gets inserted. I tried other variations, however FOUND just isn't behaving as I would think. --- OUTPUT SNIPPET: NOTICE: SQL cmd is = SELECT COUNT(*) FROM zbus_tokens WHERE token = PARKING NOTICE: Row = 10, Skipped INSERT Count = 32, Word2 = PARING NOTICE: SQL cmd is = SELECT COUNT(*) FROM zbus_tokens WHERE token = COLLEEN NOTICE: Row = 11, Skipped INSERT Count = 33, Word2 = COLLEEN --- Alban Hertroys wrote: On 9 Nov 2010, at 5:11, Ralph Smith wrote: Why is FOUND 'finding' and hence avoiding an INSERT? Not really sure what your point is (don't have time to look closely), but... PERFORM Rnotice1(1,''SQL cmd is'',''SELECT COUNT(*) FROM zbus_tokens WHERE token = ''||Word2::varchar) ; PERFORM (SELECT COUNT(*) FROM zbus_tokens WHERE token = Word2) ; IF NOT FOUND THEN PERFORM RNotice1(1,''value'',Word2) ; INSERT INTO zbus_tokens (token) values(Word2); J=J+1 ; IF J % 100 = 0 THEN PERFORM Rnotice2(1,''Row'',I,''Insert Count'',J) ; END IF ; ELSE K=K+1 ; PERFORM RNotice2(1,''Row'',I,''Skipped INSERT Count'',K) ; END IF ; You just connected this ELSE block to the IF statement it was nested inside. You probably need to comment out the rest of this ELSE block as well. Alban Hertroys -- Screwing up is an excellent way to attach something to the ceiling. (Assuming you're not turning the screw driver the wrong way.) -- Ralph _
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 11:10 AM, Sandeep Srinivasa s...@clearsenses.com wrote: hi, I am the OP. With due respect to everyone (and sincere apologies to Richard Broersma), my intention was not to create a thread about MySQL/Oracle's business practices. Hehe, we head off on tangents. It's common, don't worry. It was about the technical discussion on Highscalability - I have been trying to wrap my head around the concept of multiple core scaling for Postgres, especially beyond 8 cores (like Scott's Magny-Cours example). My doubt arises from whether Postgres depends on the kernel scheduler for multiple CPU/core utilization. If that is the case, then does using FreeBSD vs Linux give rise to any differences in scaling? All multi-process applications like pgsql have to depend on the OS kernel scheduler to get their processes run. But in terms of scaling, that's usually not the biggest issue; it's getting rid of choke points in the kernel, like Linux's much earlier versions having one big spin lock on huge chunks of the kernel. That's been gone a long time, but as the number of cores keeps going up, new chokepoints are found and fixed in both Linux and BSD. Taking the question one step further, do different Linux kernels (and schedulers) impact Postgres scalability? The Phoronix Test Suite already tests Linux kernel releases for regressions in performance w.r.t. postgres DB performance The IO scheduler mostly just gets in the way on bigger machines with battery-backed caching controllers and/or SAN arrays. (e.g. http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=1), but doesn't particularly focus on multiple cores. Is it something that should be benchmarked? Yes. Sadly, to do so you really need a $7500 machine -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] I guess I'm missing something here WRT FOUND
Ralph Smith rsm...@10kinfo.com writes: Yeah, you're right Alban, that looks bad, but it was an artifact of 'try-this, try-this, no, try-this'. The table is empty, and unfortunately remains that way; nothing gets inserted. I tried other variations, however FOUND just isn't behaving as I would think. (Please avoid html-encoded email.) The original mail looked like you were trying to do perform count(*) from something where something; if found then ... This will in fact *always* set FOUND, because the query always yields exactly one row: that's the nature of aggregate functions. FOUND doesn't respond to whether the result of count(*) was zero or nonzero, but just to the fact that it did deliver a result row. You probably wanted something like perform 1 from something where something; if found then ... which will set FOUND depending on whether there are any rows matching the where-clause. Or you could avoid FOUND altogether: if exists(select 1 from something where something) then ... regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
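[Tom's three variants, spelled out as a minimal plpgsql sketch against the zbus_tokens table from Ralph's snippet; the literal token value is a placeholder:]

```sql
-- Assumed table for illustration:
-- CREATE TABLE zbus_tokens (token varchar PRIMARY KEY);

-- Inside a plpgsql function body:

-- (1) This ALWAYS sets FOUND, even when the count is 0,
--     because count(*) always returns exactly one row:
PERFORM count(*) FROM zbus_tokens WHERE token = 'PARKING';

-- (2) This sets FOUND only when a matching row actually exists:
PERFORM 1 FROM zbus_tokens WHERE token = 'PARKING';
IF NOT FOUND THEN
    INSERT INTO zbus_tokens (token) VALUES ('PARKING');
END IF;

-- (3) Or avoid FOUND altogether:
IF NOT EXISTS (SELECT 1 FROM zbus_tokens WHERE token = 'PARKING') THEN
    INSERT INTO zbus_tokens (token) VALUES ('PARKING');
END IF;
```

Variant (1) is why Ralph's ELSE branch always ran: FOUND reports that the aggregate delivered a row, not that the count was nonzero.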
Re: [GENERAL] Why facebook used mysql ?
-Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Tom Lane Sent: Tuesday, November 09, 2010 10:55 AM To: Vick Khera Cc: Scott Ribe; Allan Kamau; pgsql-general@postgresql.org Subject: Re: [GENERAL] Why facebook used mysql ? Vick Khera vi...@khera.org writes: On Tue, Nov 9, 2010 at 10:26 AM, Scott Ribe scott_r...@killerbytes.com wrote: Also, my understanding is that if you go way back on the PostgreSQL timeline to versions 6 and the earliest 7.x, it was a little shaky. (I started with 7.3 or 7.4, and it has been rock-solid.) In those same times, mysql was also, um, other than rock solid. I don't have enough operational experience with mysql to speak to how reliable it was back in the day. What it *did* have over postgres back then was speed. It was a whole lot faster, particularly on the sort of single-stream-of-simple-queries cases that people who don't know databases are likely to set up as benchmarks. (mysql still beats us on cases like that, though not by as much.) I think that drove quite a few early adoption decisions, and now folks are locked in; the cost of conversion outweighs the (perceived) benefits. A different slant on this has to do with licensing and $$. Might Oracle decide some day to start charging for their newfound DB? They are a for-profit company that's beholden to their shareholders LONG before an open software community. Consumers like Facebook and Google have deep pockets, something corporate executives really don't dismiss lightly. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
Also there's the strange and mysterious valley group-think syndrome. I've seen this with several products/technologies over the years. I suspect it comes from the VCs, but I'm not sure. The latest example is you should be using EC2. There always follows a discussion where I can present 50 concrete reasons based on hard experience why the suggestion is a bad idea and the other person presents nothing besides everyone's doing it. I saw exactly the same thing with MySQL a few years ago. Before that it was Oracle. It's often easier to go along with the flow and get some work done vs. trying to argue. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On 09 Nov 2010, at 7:30 PM, David Boreham wrote: Sorry but this is 100% not true. It may be true for a 3rd party (you release something under the GPL, I enhance it, therefore I am required to release my enhancement under the GPL). But Oracle owns the copyright to the MySql code and therefore they can decide to do whatever they want with it. The only thing they can't do is to 'un-release' existing code released under the GPL. Everything else is possible. Ownership of the copyright trumps the GPL. The copyright is owned by whoever made the contribution, and any competent version control system will give you the list of contributions (and therefore contributors). If a contribution was made in terms of the GPL, then permission would need to be sought from everyone who has made a contribution before it could be released under a different license. Regards, Graham -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 1:04 PM, Graham Leggett minf...@sharp.fm wrote: On 09 Nov 2010, at 7:30 PM, David Boreham wrote: Sorry but this is 100% not true. It may be true for a 3rd party (you release something under the GPL, I enhance it, therefore I am required to release my enhancement under the GPL). But Oracle owns the copyright to the MySql code and therefore they can decide to do whatever they want with it. The only thing they can't do is to 'un-release' existing code released under the GPL. Everything else is possible. Ownership of the copyright trumps the GPL. Ownership of the copyright is owned by whoever made the contribution, and any competent version control system will give you the list of contributions (and therefore contributors). If a contribution was made in terms of the GPL, then permission would need to be sought from everyone who has made a contribution before it could be released under a different license. Contributed code to MySQL AB MUST be assigned copyright to MySQL AB. If it's been incorporated into MySQL proper, it's owned by MySQL AB ne Oracle. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
Scott Marlowe scott.marl...@gmail.com writes: On Tue, Nov 9, 2010 at 1:04 PM, Graham Leggett minf...@sharp.fm wrote: Ownership of the copyright is owned by whoever made the contribution, and any competent version control system will give you the list of contributions (and therefore contributors). If a contribution was made in terms of the GPL, then permission would need to be sought from everyone who has made a contribution before it could be released under a different license. Contributed code to MySQL AB MUST be assigned copyright to MySQL AB. Yeah, MySQL AB and successors have been very careful to ensure that they have air-tight ownership of that code. I've been asked for copyright assignments for four-line patches :-( regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Sandeep Srinivasa Sent: Tuesday, November 09, 2010 10:10 AM To: Lincoln Yeoh Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Why facebook used mysql ? hi, I am the OP. With due respect to everyone (and sincere apologies to Richard Broersma), my intention was not to create a thread about MySQL/Oracle's business practices. It was about the technical discussion on Highscalability - I have been trying to wrap my head around the concept of multiple core scaling for Postgres, especially beyond 8 cores (like Scott's Magny-Cours example). My doubt arises from whether Postgres depends on the kernel scheduler for multiple CPU/core utilization. If that is the case, then does using FreeBSD vs Linux give rise to any differences in scaling? Taking the question one step further, do different Linux kernels (and schedulers) impact Postgres scalability? The Phoronix Test Suite already tests Linux kernel releases for regressions in performance w.r.t. Postgres DB performance (e.g. http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=1), but doesn't particularly focus on multiple cores. Is it something that should be benchmarked? thanks -Sandeep P.S. on the topic of scalability, here is another article - http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html , where people have asked if a similar thing can be done using Postgres UDFs or a marshalling ODBA http://scm.ywesee.com/?p=odba/.git;a=summary Regarding scaling, there is an interesting NoSQL engine called Kyoto Cabinet that has some testing in high volume transactions under different loads and different conditions. The Kyoto Cabinet data engine is written by the Tokyo Cabinet author Mr. Hirabayashi of Fallabs.com. In this document: http://fallabs.com/kyotocabinet/spex.html We find something interesting.
In the section called Transaction, we have this:

default
  risk on process crash: Some records may be missing.
  risk on system crash: Some records may be missing.
  performance penalty: none
  remark: Auto recovery after crash will take time in proportion to the database size.

transaction
  implicit usage: open(..., BasicDB::OAUTOTRAN);
  explicit usage: begin_transaction(false); ...; end_transaction(true);
  risk on process crash: none
  risk on system crash: Some records may be missing.
  performance penalty: Throughput will be down to about 30% or less.

transaction + synchronize
  implicit usage: open(..., BasicDB::OAUTOTRAN | BasicDB::OAUTOSYNC);
  explicit usage: begin_transaction(true); ...; end_transaction(true);
  risk on process crash: none
  risk on system crash: none
  performance penalty: Throughput will be down to about 1% or less.

Notice that there is a roughly 3:1 penalty for flushing requests from the program to the operating system, and a roughly 100:1 penalty for hard flushing from the operating system to the disk. So the simple way to scale to huge volumes is simply to allow data loss. That is a major way in which NoSQL data systems can achieve absurd transaction rates. There are also distributed hash tables (these are also called NoSQL engines, but they are an entirely different technology). With a distributed hash table, you can get enormous scaling and huge transaction volumes. http://en.wikipedia.org/wiki/Distributed_hash_table Distributed hash tables are another kind of key/value store {but an entirely different technology compared to traditional key/value stores like DBM}. When we design a data system, we should examine the project requirements and choose appropriate tools to solve the problems facing the project. For something like Facebook, a key/value store is a very good solution. You do not have a big collection of related tables, and there are no billion dollar bank transactions taking place where someone will get a bit bent out of shape if it goes missing.
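For comparison, PostgreSQL exposes a similar durability-versus-throughput dial through its synchronous_commit setting (a sketch; the Kyoto Cabinet percentages above do not transfer to PostgreSQL, this just shows the corresponding knob):

```sql
-- Durable default: COMMIT waits until the WAL record is flushed to disk.
SET synchronous_commit = on;

-- Faster: COMMIT returns before the WAL flush. A server crash can lose
-- the most recent commits, but (unlike fsync = off) it cannot corrupt
-- the database, since the WAL is still written in order.
SET synchronous_commit = off;
```

Because it can be set per session or even per transaction, an application can keep the durable setting for the billion-dollar transactions and relax it for the Facebook-style ones.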
For an analytic project where we plan to do cube operations, a column store like MonetDB is a good idea. For a transactional system like Point Of Sale or Accounting, a traditional RDBMS like PostgreSQL is the best solution. I think that an interesting path for growth would be to expand PostgreSQL to allow different table types. For instance, all leaf tables (those tables without any children) could easily be Key/Value stores. For analytics, create column store tables. For ultra-high access, have a distributed hash table. But right now, most RDBMS systems do not have these extra, special table types. So if you want tools that do those things, then use those tools. IMO-YMMV
Re: [GENERAL] Why facebook used mysql ?
On 11/9/2010 11:36 AM, Sandeep Srinivasa wrote: If it is independent of the OS, then how does one go about tuning it. Consider this - I get a 12 core server on which I want multiple webserver instances + DB. Can one create CPU pools (say core 1,2,3 for webservers, 4,5,6,7 for DB, etc.) ? I know about taskset, but should one be using it ? There are plenty of things you might do, but first you need to figure out what problem you're solving. I'd suggest deploying a relatively simple configuration then evaluate its capacity under your workload. Does it run fast enough? If so, then job done. If not then why not, and so on... The simplest configuration would be one web server instance and one DB instance. I don't think you should be looking at process partitioning and core affinity unless you have already proved that you have processes that don't scale over the cores you have, to deliver the throughput you need. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Thoughts on a surrogate key lookup function?
On Sat, Nov 6, 2010 at 1:01 PM, Nick nboutel...@gmail.com wrote: Merlin, thanks for the reply. Yes, using email_addresses was a very silly example. Maybe the following is a better example... CREATE TABLE first_names (id INT, first_name VARCHAR); ALTER TABLE first_names ADD CONSTRAINT first_names_pkey PRIMARY KEY (id,first_name); ALTER TABLE first_names ADD CONSTRAINT first_names_unique_first_name UNIQUE (first_name); CREATE TABLE last_names (id INT, last_name VARCHAR); ALTER TABLE last_names ADD CONSTRAINT last_names_pkey PRIMARY KEY (id,last_name); ALTER TABLE last_names ADD CONSTRAINT last_names_unique_last_name UNIQUE (last_name); CREATE TABLE referrals (id INT, ref_code VARCHAR); ALTER TABLE referrals ADD CONSTRAINT referrals_pkey PRIMARY KEY (id,ref_code); ALTER TABLE referrals ADD CONSTRAINT referrals_unique_ref_code UNIQUE (ref_code); CREATE TABLE users (id INT, first_name_id INT, first_name VARCHAR, last_name_id INT, last_name VARCHAR, ref_code_id INT, ref_code VARCHAR); ALTER TABLE users ADD CONSTRAINT users_fkey_first_names FOREIGN KEY (first_name_id,first_name) REFERENCES first_names(id,first_name) ON UPDATE CASCADE ON DELETE SET NULL; ALTER TABLE users ADD CONSTRAINT users_fkey_last_names FOREIGN KEY (last_name_id,last_name) REFERENCES last_names(id,last_name) ON UPDATE CASCADE ON DELETE SET NULL; ALTER TABLE users ADD CONSTRAINT users_fkey_referrals FOREIGN KEY (ref_code_id,ref_code) REFERENCES referrals(id,ref_code) ON UPDATE CASCADE ON DELETE SET NULL; CREATE TRIGGER auto_first_name_id BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE PROCEDURE _auto_id('first_name_id'); CREATE TRIGGER auto_last_name_id BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE PROCEDURE _auto_id('last_name_id'); CREATE TRIGGER auto_ref_code_id BEFORE INSERT OR UPDATE ON users FOR EACH ROW EXECUTE PROCEDURE _auto_id('ref_code_id'); If I would like to insert a new user, first name, and last name, I would give the surrogates a value of NULL or -1.
Their referral code must exist so I'll give that surrogate a value of 0. INSERT INTO users (id, first_name_id, first_name, last_name_id, last_name, ref_code_id, ref_code) VALUES (1,-1,'John',-1,'Doe', 0,'xyz') which would... SELECT id FROM first_names WHERE first_name = 'John' INTO NEW.first_name_id IF NOT FOUND INSERT INTO first_names (first_name) VALUES ('John') RETURNING id INTO NEW.first_name_id SELECT id FROM last_names WHERE last_name = 'Doe' INTO NEW.last_name_id IF NOT FOUND INSERT INTO last_names (last_name) VALUES ('Doe') RETURNING id INTO NEW.last_name_id SELECT id FROM referral_codes WHERE ref_code = 'xyz' INTO NEW.ref_code_id IF NOT FOUND raise exception If I want to insert the new user John Smith, and I already know the surrogate value for John and I don't want to add a ref_code, then I can do... INSERT INTO users (id, first_name_id, first_name, last_name_id, last_name, ref_code_id, ref_code) VALUES (2,1,NULL,-1,'Smith',NULL,NULL) which would... SELECT first_name FROM first_names WHERE id = 1 INTO NEW.first_name IF NOT FOUND raise exception SELECT id FROM last_names WHERE last_name = 'Smith' INTO NEW.last_name_id IF NOT FOUND INSERT INTO last_names (last_name) VALUES ('Smith') RETURNING id INTO NEW.last_name_id So by adding both the surrogate and natural keys to the users table and toggling the surrogate on insert with 0 (must exist) or -1 (select or insert), I can bypass a much more complex insert statement. Is this frowned upon? I haven't had many issues (but some I've been able to work around) with this as a plperl trigger and am pleased with how much easier it makes my inserts (besides the execution speed). It's a neat idea, but all things considered, you are better off using one of the two approaches I outlined above: *) Your idea needs an extra composite index on two fields (each unique) that serves no integrity purpose *) The referring table has extra fields; pick natural or surrogate, but not both... *) It's not faster.
Any way you slice this, you need a lookup on the master table, even if the system does it internally through RI in the purely natural case. insert where not exists ... will do exactly what you are doing above, and does it in one statement, not two. If you want to do this inside the database, it's more common to do it in a regular function, not a trigger function. Just make a function insert_user() that handles the logic checking, and dispense with all the extra fields... merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
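Merlin's one-statement pattern can be sketched like this, using the last_names table from Nick's example (the literal id and name are placeholders):

```sql
-- Insert 'Smith' only if it is not already present -- the lookup and the
-- insert happen in one statement:
INSERT INTO last_names (id, last_name)
SELECT 3, 'Smith'
WHERE NOT EXISTS (SELECT 1 FROM last_names WHERE last_name = 'Smith');

-- Either way, the id can then be fetched with a plain lookup:
SELECT id FROM last_names WHERE last_name = 'Smith';
```

Note that two concurrent sessions can still race on the same name; the UNIQUE constraint on last_name turns that race into an error for one of them rather than a duplicate row.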
Re: [GENERAL] Why facebook used mysql ?
On Tue, Nov 9, 2010 at 4:12 PM, David Boreham david_l...@boreham.org wrote: I don't think you should be looking at process partitioning and core affinity unless you have already proved that you have processes that don't scale over the cores you have, to deliver the throughput you need. Note that you're likely to get FAR more out of processor affinity with multiple NICs, each assigned to its own core / set of cores that share L3 cache and such. Having the NICs and maybe RAID controllers and / or fibre channel cards etc. on their own set of cores in one group can make a big difference. Processor affinity doesn't seem to make much difference for me with pgsql. Modern Linux schedulers are pretty good at keeping things on the same core for a while without predefined affinity. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why facebook used mysql ?
On 11/9/2010 5:05 PM, Scott Marlowe wrote: Note that you're likely to get FAR more out of processor affinity with multiple NICs, each assigned to its own core / set of cores that share L3 cache and such. Having the NICs and maybe RAID controllers and / or fibre channel cards etc. on their own set of cores in one group can make a big difference. Be careful though: this phenomenon typically only comes into play at very high I/O rates. I wouldn't want to send the OP down this path without first verifying that he has a problem. Most folks are not trying to push 100k requests/s out of their web servers. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Using EXCLUDE in 9.0 with operator ...
Hello all, I have a question about exclusion constraints in 9.0: * Let's say I have a table with two columns X and Y (of type varchar or int) * Is there any way I can add an EXCLUDE constraint that says, for each value of X, the values of Y must be the same? * So the following rows are ok: X | Y A , 1 A , 1 B , 1 B , 1 C , 2 C , 2 But adding A , 3 should fail, as everything with value X = A should have the same value as existing data (where Y = 1). Theoretically the following would be nice if it worked EXCLUDE (X WITH =, Y WITH <>) but it complains that ERROR: operator <>(text,text) is not a member of operator family text_ops because the btree index method only allows = in an exclude constraint. Or am I missing a simpler way of doing this? (without having to create, and then foreign key to, another table, which is obviously a possibility) des. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using EXCLUDE in 9.0 with operator ...
On Wed, 2010-11-10 at 13:45 +1300, Derek Harland wrote: Theoretically the following would be nice if it worked EXCLUDE (X WITH =, Y WITH <>) but it complains that ERROR: operator <>(text,text) is not a member of operator family text_ops because the btree index method only allows = in an exclude constraint. Or am I missing a simpler way of doing this? (without having to create, and then foreign key to, another table, which is obviously a possibility) Try using 9.1alpha ( http://www.postgresql.org/developer/alpha ) and installing btree_gist. Then, use: EXCLUDE USING gist (X WITH =, Y WITH <>) In 9.0 this particular constraint won't work because there is an additional sanity check in the code that won't pass if the operator is <>. The sanity check was deemed worthwhile for the first release of the feature, but will be lifted in version 9.1. Also, <> doesn't work (yet) with btree, but in principle there is no reason why not. Perhaps for 9.1 as well. Can you please share your use case for this particular constraint? I'd like to hear it. Regards, Jeff Davis -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
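Jeff's suggestion, spelled out as a sketch (assumes a 9.1alpha server with the btree_gist module installed; the table and column names are placeholders matching the original question):

```sql
CREATE EXTENSION btree_gist;  -- on 9.1; earlier releases load the contrib SQL script instead

CREATE TABLE t (
    x text,
    y int,
    -- Reject any row whose x matches an existing row but whose y differs:
    EXCLUDE USING gist (x WITH =, y WITH <>)
);

INSERT INTO t VALUES ('A', 1);  -- ok
INSERT INTO t VALUES ('A', 1);  -- ok: same y for the same x
INSERT INTO t VALUES ('A', 3);  -- rejected by the exclusion constraint
```

btree_gist is needed because the constraint builds a GiST index over plain scalar columns, which GiST cannot index on its own.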
Re: [GENERAL] Libpq is very slow on windows but fast on linux.
Further testing shows it is windows networking causing the issue. Copying files to and from the server is 5 to 6 times slower on a Windows client compared to the Linux client. The issue is not specific to libpq. -- Rob -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Libpq is very slow on windows but fast on linux.
-Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- ow...@postgresql.org] On Behalf Of Rob Brown-Bayliss Sent: Tuesday, November 09, 2010 8:20 PM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Libpq is very slow on windows but fast on linux. Further testing shows it is windows networking causing the issue. Copying files to and from the server is 5 to 6 times slower on a Windows client compared to the Linux client. The issue is not specific to libpq. Check your Nagle setting. http://support.microsoft.com/kb/138831 See also: http://apocryph.org/2006/07/08/achieving_maximum_tcp_performance_on_windows_xp_2k3/ P.S. Iperf is really useful to see how well things are going as a diagnostic tool: http://sourceforge.net/projects/iperf/files/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Libpq is very slow on windows but fast on linux.
On 11/09/10 9:53 PM, Dann Corbit wrote: Check your Nagle setting. http://support.microsoft.com/kb/138831 That link is for SNA Server, which is some real obscure stuff, but not the general case. Try this, http://www.speedguide.net/articles/windows-2kxp-more-tweaks-158 where it says _Gaming Tweak - Disable Nagle's algorithm_ This page is talking about games, but this is the exact same thing. And I still bet the OP needs to increase his TCP receive window to like 320k or something, with the kind of latency mentioned. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Libpq is very slow on windows but fast on linux.
On 10 Nov 2010, at 5:19, Rob Brown-Bayliss wrote: Further testing shows it is windows networking causing the issue. Copying files to and from the server is 5 to 6 times slower on a Windows client compared to the Linux client. The issue is not specific to libpq. Do both machines have similar hardware? If the Linux machine has a proper NIC (intel, for example) while the Windows machine has a poor NIC (Realtek!), then of course you would get differences in performance. I'm not saying Microsoft's network stack is particularly good, mind you, just that it isn't necessarily the main cause. One of the problems Windows is facing is that, while supporting a lot of hardware is a good thing in general, it also supports all the crap hardware, crap drivers and crap ACPI implementations. Alban Hertroys -- Screwing up is an excellent way to attach something to the ceiling. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Libpq is very slow on windows but fast on linux.
On 10-11-10 08:22, Alban Hertroys wrote: On 10 Nov 2010, at 5:19, Rob Brown-Bayliss wrote: Further testing shows it is windows networking causing the issue. Copying files to and from the server is 5 to 6 times slower on a Windows client compared to the Linux client. The issue is not specific to libpq. Do both machines have similar hardware? If the Linux machine has a proper NIC (intel, for example) while the Windows machine has a poor NIC (Realtek!), then of course you would get differences in performance. Not that much: in the past I have seen differences in performance between Intel and Realtek, 20-30% yes, but never by a factor of 5 or 6. Antonio I'm not saying Microsoft's network stack is particularly good, mind you, just that it isn't necessarily the main cause. One of the problems Windows is facing is that, while supporting a lot of hardware is a good thing in general, it also supports all the crap hardware, crap drivers and crap ACPI implementations. Alban Hertroys -- Screwing up is an excellent way to attach something to the ceiling. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Libpq is very slow on windows but fast on linux.
On 11/09/10 11:22 PM, Alban Hertroys wrote: Do both machines have similar hardware? If the Linux machine has a proper NIC (intel, for example) while the Windows machine has a poor NIC (Realtek!), then of course you would get differences in performance. I'm not saying Microsoft's network stack is particularly good, mind you, just that it isn't necessarily the main cause. One of the problems Windows is facing is that, while supporting a lot of hardware is a good thing in general, it also supports all the crap hardware, crap drivers and crap ACPI implementations. He has high speed connections with 350ms ping times to a server that's halfway around the world. Windows XP, at least, did not default to an RWIN over 64k. He needs about 300K for that link, give or take, depending on how fast the wires are. He can go into the registry (of the XP client) and bump his RWIN to something larger than pingtime (in seconds) * wirespeed (in bytes/sec) ... example: 0.350 ping * 700K bytes/sec = ~250K, so use 300k or 400k for the TCP Receive Window size. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general