Re: [PERFORM] not exists clause
S Golly wrote:
> I cannot get the "not exists" clause of ANSI SQL to execute correctly.
>
> select t.col11, t.col1... from table1 t
> where not exists (select 1 from table2 where col2 = t.col1);
>
> table1 has 40M+ rows, if that matters.
>
> OS is FreeBSD 6.2, PostgreSQL version 8.2.6. Is it not supported, or is it a bug?
> Thank you for your support.

This is really not a performance question, but a general SQL question. NOT EXISTS works as expected:

select * from t1;
 f1
----
  1
  2
  3

select * from t2;
 f1
----
  1
  2

select * from t1 where not exists (select 1 from t2 where t2.f1 = t1.f1);
 f1
----
  3

--
Guy Rouillier
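For reference, a minimal self-contained setup for the example above (the values are chosen to reproduce the output Guy shows; only the table names t1/t2 come from the original reply):

create table t1 (f1 integer);
create table t2 (f1 integer);
insert into t1 values (1), (2), (3);
insert into t2 values (1), (2);

select * from t1
where not exists (select 1 from t2 where t2.f1 = t1.f1);
-- returns a single row: f1 = 3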
Re: [PERFORM] Best way to index IP data?
On Jan 10, 2008 6:25 PM, Steve Atkins <[EMAIL PROTECTED]> wrote:
> http://pgfoundry.org/projects/ip4r/
>
> That has the advantage over using integers, or the built-in inet type,
> of being indexable for range and overlap queries.

Agreed. ip4r is da bomb.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/
Re: [PERFORM] Best way to index IP data?
On Jan 10, 2008, at 3:14 PM, Kevin Kempter wrote:
> Hi List;
>
> We'll be loading a table with beginning & ending IPs - the table will
> likely have upwards of 30 million rows. Any thoughts on how to get the
> best performance out of queries that want to look for IP ranges or the
> use of between queries? Should these be modeled as integers?

http://pgfoundry.org/projects/ip4r/

That has the advantage over using integers, or the built-in inet type, of being indexable for range and overlap queries.

Cheers,
Steve
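For illustration, a minimal sketch of the ip4r approach (table and column names are made up; assumes the ip4r module is installed - via its SQL install script on 8.2-era servers, or CREATE EXTENSION ip4r on newer ones):

create table ip_blocks (
    block_id serial primary key,
    iprange  ip4r not null   -- beginning/ending addresses stored as one range value
);

-- a GiST index makes containment and overlap searches indexable
create index ip_blocks_iprange_idx on ip_blocks using gist (iprange);

-- which block contains this address?
select * from ip_blocks where iprange >>= '192.0.2.17'::ip4;

-- which blocks overlap this range?
select * from ip_blocks where iprange && '192.0.2.0/24'::ip4r;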
Re: [PERFORM] Best way to index IP data?
On Thu, 10 Jan 2008 16:14:54 -0700 Kevin Kempter <[EMAIL PROTECTED]> wrote:
> Hi List;
>
> We'll be loading a table with beginning & ending IPs - the table
> will likely have upwards of 30 million rows. Any thoughts on how to
> get the best performance out of queries that want to look for IP
> ranges or the use of between queries? Should these be modeled as
> integers?

http://www.postgresql.org/docs/current/static/datatype-net-types.html

> Thanks in advance
>
> /Kevin

--
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'
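For the built-in network types that link describes, a minimal sketch (table and column names are hypothetical):

create table hosts (addr inet not null);
create index hosts_addr_idx on hosts (addr);

-- inet sorts like an address, so ordinary btree range queries work
select * from hosts
where addr between '10.0.0.0'::inet and '10.0.0.255'::inet;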
[PERFORM] Best way to index IP data?
Hi List;

We'll be loading a table with beginning & ending IPs - the table will likely have upwards of 30 million rows. Any thoughts on how to get the best performance out of queries that want to look for IP ranges or the use of between queries? Should these be modeled as integers?

Thanks in advance

/Kevin
Re: [PERFORM] not exists clause
Golly,

> I cannot get the "not exists" clause of ANSI SQL to execute correctly.
>
> select t.col11, t.col1... from table1 t
> where not exists (select 1 from table2 where col2 = t.col1);
>
> table1 has 40M+ rows, if that matters.
>
> OS is FreeBSD 6.2, postgresql version 8.2.6

You'll have to post the actual query and error message. WHERE NOT EXISTS has been supported since version 7.1.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
[PERFORM] not exists clause
I cannot get the "not exists" clause of ANSI SQL to execute correctly.

select t.col11, t.col1... from table1 t
where not exists (select 1 from table2 where col2 = t.col1);

table1 has 40M+ rows, if that matters.

OS is FreeBSD 6.2, PostgreSQL version 8.2.6.

Is it not supported, or is it a bug? Thank you for your support.
Re: [PERFORM] big database performance
On Thu, Jan 10, 2008 at 12:08:39PM +0100, Stephane Bailliez wrote:
> Jared Mauch wrote:
>> I do large databases in Pg, like 300GB/day of new data.
>
> That's impressive. Would it be possible to have details on your hardware,
> schema and configuration and type of usage?
>
> I'm sure there's something to learn in there for a lot of people (or at
> least for me).

http://archives.postgresql.org/pgsql-performance/2007-12/msg00372.php
http://archives.postgresql.org/pgsql-performance/2006-05/msg00444.php

The hardware specs are kinda boring since it's not I/O bound, so you could get the same disk performance out of some EIDE 7200 rpm disks (which I have done for testing). The current setup is a 4x Opteron 8218 (dual core) with 16G of RAM. I have roughly 12TB of usable disk space on the system, connected via some SATA <-> FC thing our systems folks got us.

The problem I have is that linear CPU speed isn't enough, and there would be challenges splitting the workload across multiple CPUs. All my major reporting is done via pg_dump, and I'm pondering what would happen if I just removed Pg from the equation for the major reporting tasks entirely. I may see much better performance without the database [in my way]. I've not done that as some types of data access would need to be significantly redone and I don't want to spend the time on that...

- Jared

--
Jared Mauch  | pgp key available via finger from [EMAIL PROTECTED]
clue++;      | http://puck.nether.net/~jared/  My statements are only mine.
Re: [PERFORM] big database performance
On Thu, Jan 10, 2008 at 10:57:46AM +0200, Adrian Moisey wrote:
> What sort of information do you need from me?

Ratio of read vs write operations (select vs insert/copy), average number of indices per table, and average table size (analyze verbose if you want to get into more detail). What is the process doing (e.g. in top, is it just on the CPU or is it blocking for I/O?). I/O information, from iostat -d. (You may need to build an iostat binary for Linux; the source is out there, I can give you a pointer if you need it.)

>> Is your problem with performance database reads? writes? (insert/copy?)
>> How many indices do you have?
>
> I think the problem is related to load. Everything is slow because there
> are way too many connections. So everything is making everything else
> slow. Not much detail, is it?
>
> We have 345 indices on the db.

If the tables are heavily indexed this could easily slow down insert performance. Taking a large dataset and adding a second index, Postgres doesn't use threads to create the two indices on different CPUs/cores in parallel. This could represent some of your performance difference. If you're doing a lot of write operations and fewer reads, perhaps the cost of an index isn't worth it in the CPU time spent creating it vs the amount of time for a seq scan.

- Jared

--
Jared Mauch  | pgp key available via finger from [EMAIL PROTECTED]
clue++;      | http://puck.nether.net/~jared/  My statements are only mine.
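A rough sketch of how to pull some of those numbers out of the standard statistics views (nothing here is specific to Adrian's setup):

-- indexes per table
select schemaname, tablename, count(*) as index_count
from pg_indexes
group by schemaname, tablename
order by index_count desc;

-- how often each index is actually used; idx_scan = 0 suggests an
-- index that costs write/CPU time without helping reads
select relname, indexrelname, idx_scan
from pg_stat_user_indexes
order by idx_scan;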
Re: [PERFORM] Search for fixed set of keywords
On Thu, 10 Jan 2008, Jörg Kiegeland wrote:
>> Did you try integer arrays with GIN (inverted index)?
>
> I now tried this, and GIN turned out to be linear time, compared with GIST
> which was acceptable time. However, I tested this only for Z=infinity; for
> Z=1000, GIST/GIN are both not acceptable.

Sorry, I didn't follow your problem, but GIN should certainly be logarithmic in the number of unique words. Also, it'd be much clearer if you show us your queries and EXPLAIN ANALYZE output.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
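To make the discussion concrete, a minimal sketch of the kind of array/GIN setup being talked about (table and column names are invented; assumes 8.2 or later, where GIN supports integer arrays):

create table docs (
    id       serial primary key,
    keywords integer[] not null
);

create index docs_keywords_gin on docs using gin (keywords);

-- "find rows whose keyword list contains all of these ids"
explain analyze
select * from docs where keywords @> array[11, 42, 99];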
Re: [PERFORM] big database performance
Jared Mauch wrote:
> I do large databases in Pg, like 300GB/day of new data.

That's impressive. Would it be possible to have details on your hardware, schema, configuration and type of usage?

I'm sure there's something to learn in there for a lot of people (or at least for me).

Cheers,

--
stephane
Re: [PERFORM] Search for fixed set of keywords
> Did you try integer arrays with GIN (inverted index)?

I now tried this, and GIN turned out to be linear time, compared with GIST which was acceptable time. However, I tested this only for Z=infinity; for Z=1000, GIST/GIN are both not acceptable.
Re: [PERFORM] big database performance
Hi

>> I do large databases in Pg, like 300GB/day of new data.
>
> Need a lot more data on what you're having issues with.

That is big! What sort of information do you need from me?

> Is your problem with performance database reads? writes? (insert/copy?)
> How many indices do you have?

I think the problem is related to load. Everything is slow because there are way too many connections. So everything is making everything else slow. Not much detail, is it?

We have 345 indices on the db.

--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842
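If it really is connection load, two quick checks from psql (standard views/settings, nothing installation-specific) can help confirm it:

-- how many backends are connected right now
select count(*) from pg_stat_activity;

-- compare against the configured ceiling
show max_connections;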
Re: [PERFORM] big database performance
Hi

>> Also, we're running the db on ext3 with noatime. Should I look at
>> changing or getting rid of journaling?
>
> No (unless you like really long fsck times). data=writeback is safe with
> PostgreSQL, though.

I tested that on a dev box, and I didn't notice a difference when using pgbench.

--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842
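For anyone trying data=writeback, a hedged example of how it is typically set (device name and mount point below are hypothetical; the option is applied at mount time, or made the filesystem default with tune2fs -o journal_data_writeback):

# /etc/fstab
/dev/sdb1  /var/lib/pgsql  ext3  noatime,data=writeback  0  2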