Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2
On Thu, 2007-07-12 at 11:19 -0700, Benjamin Arai wrote: > I am running the latest 8.2.4. I am using GIN. The data is static. I do > a batch upload every week of about 500GB and the ata is never touched > again, it is always add and never delete or update. Partitioning will certainly help the index build times, even if it doesn't specifically help with the queries. > >From your slides you state: > > GIN_FUZZY_SEARCH_LIMIT - maximum number of > returned rows > – GIN_FUZZY_SEARCH_LIMIT=0, disabled on > default > > When I do a search with say LIMIT 100 isn't this essentially the same thing? Both restrict the number of matches found, but they're not the same thing. One is for the query as a whole, the other is for one index scan on a GIN index. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2
Joshua D. Drake wrote: > Benjamin Arai wrote: > RAID 5 with 16 spindles? RAID 10 will give you better performance I > would think. I'd say RAID 5 is not that bad for this kind of query, at least if the controller is worth anything. RAID 10 is the best choice for OLTP, but if the main job for this query is to read an index as fast as possible, RAID 5 might be well suited. I have good experience with TSearch queries on a 8 drive RAID 6 setup. Even if those 16 drives are cheap desktop SATAs, he should be able to read with around 800MB/sec. Benjamin, could you try if the following does change your query performance in any way? blockdev --setra 16384 /dev/sdX -- Best regards, Hannes Dorbath ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2
Oleg, I am running the latest 8.2.4. I am using GIN. The data is static. I do a batch upload every week of about 500GB and the ata is never touched again, it is always add and never delete or update. >From your slides you state: GIN_FUZZY_SEARCH_LIMIT - maximum number of returned rows GIN_FUZZY_SEARCH_LIMIT=0, disabled on default When I do a search with say LIMIT 100 isn't this essentially the same thing? Benjamin > Benjamin, > > what version of postgres and what type of index you used ? > The best setup is to use partitioning with rather small table for > new data and GiST index and big archive table with static data and > GIN index. I have some slides from PGDay > http://www.sai.msu.su/~megera/postgres/talks/fts-pgday-2007.pdf > Also, did you consider using dblink/dbilink to scale your search ? > > Oleg > On Thu, 12 Jul 2007, Benjamin Arai wrote: > >> Hi, >> >> I have a really big Tsearch2 table (100s GB) that takes a while to >> perform >> queries and takes days to index. Is there any way to fix these issues >> using UNIONs or partitioning? I was thinking that I could partition the >> data by date but since I am always performing queries on the Tsearch2 >> field I do not know if this will help performance. I think paritioning >> will help the indexing problem since I can incrementally re-index the >> data >> but again I figured it would be better to ask. >> >> Any suggestions will be greatly appreciated. Thanks in advance. >> >> System I am running on: >> >> -Raid 5 with 16x drives >> -Quad core XEON >> 16 GB of memory (Any suggestion on the postgresql.conf setup would also >> be >> great! Currently I am just setting shared mem to 8192MB) >> -x86_64 but Redhat 5 Ent >> >> Benjamin >> >> >> >> ---(end of broadcast)--- >> TIP 1: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to [EMAIL PROTECTED] so that your >> message can get through to the mailing list cleanly >> > > Regards, > Oleg > _ > Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), > Sternberg Astronomical Institute, Moscow University, Russia > Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ > phone: +007(495)939-16-83, +007(495)939-23-83 > > ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2
Benjamin, what version of postgres and what type of index you used ? The best setup is to use partitioning with rather small table for new data and GiST index and big archive table with static data and GIN index. I have some slides from PGDay http://www.sai.msu.su/~megera/postgres/talks/fts-pgday-2007.pdf Also, did you consider using dblink/dbilink to scale your search ? Oleg On Thu, 12 Jul 2007, Benjamin Arai wrote: Hi, I have a really big Tsearch2 table (100s GB) that takes a while to perform queries and takes days to index. Is there any way to fix these issues using UNIONs or partitioning? I was thinking that I could partition the data by date but since I am always performing queries on the Tsearch2 field I do not know if this will help performance. I think paritioning will help the indexing problem since I can incrementally re-index the data but again I figured it would be better to ask. Any suggestions will be greatly appreciated. Thanks in advance. System I am running on: -Raid 5 with 16x drives -Quad core XEON 16 GB of memory (Any suggestion on the postgresql.conf setup would also be great! Currently I am just setting shared mem to 8192MB) -x86_64 but Redhat 5 Ent Benjamin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2
> Benjamin Arai wrote: >> Hi, >> >> I have a really big Tsearch2 table (100s GB) that takes a while to >> perform >> queries and takes days to index. Is there any way to fix these issues >> using UNIONs or partitioning? I was thinking that I could partition the >> data by date but since I am always performing queries on the Tsearch2 >> field I do not know if this will help performance. I think paritioning >> will help the indexing problem since I can incrementally re-index the >> data >> but again I figured it would be better to ask. >> >> Any suggestions will be greatly appreciated. Thanks in advance. >> >> System I am running on: >> >> -Raid 5 with 16x drives > > RAID 5 with 16 spindles? RAID 10 will give you better performance I > would think. > > >> -Quad core XEON >> 16 GB of memory (Any suggestion on the postgresql.conf setup would also >> be >> great! Currently I am just setting shared mem to 8192MB) > > Assuming 8.1+ I would try something much more aggressive, like 4GB. > Dont' forget your effective_cache_size. How is 4GB more aggressive? How large should the effective_cache_size be? > > Joshua D. Drake > > >> -x86_64 but Redhat 5 Ent >> >> Benjamin >> >> >> >> ---(end of broadcast)--- >> TIP 1: if posting/reading through Usenet, please send an appropriate >>subscribe-nomail command to [EMAIL PROTECTED] so that your >>message can get through to the mailing list cleanly >> > > > -- > >=== The PostgreSQL Company: Command Prompt, Inc. === > Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 > Providing the most comprehensive PostgreSQL solutions since 1997 > http://www.commandprompt.com/ > > Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate > PostgreSQL Replication: http://www.commandprompt.com/products/ > > ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2
Benjamin Arai wrote: Hi, I have a really big Tsearch2 table (100s GB) that takes a while to perform queries and takes days to index. Is there any way to fix these issues using UNIONs or partitioning? I was thinking that I could partition the data by date but since I am always performing queries on the Tsearch2 field I do not know if this will help performance. I think paritioning will help the indexing problem since I can incrementally re-index the data but again I figured it would be better to ask. Any suggestions will be greatly appreciated. Thanks in advance. System I am running on: -Raid 5 with 16x drives RAID 5 with 16 spindles? RAID 10 will give you better performance I would think. -Quad core XEON 16 GB of memory (Any suggestion on the postgresql.conf setup would also be great! Currently I am just setting shared mem to 8192MB) Assuming 8.1+ I would try something much more aggressive, like 4GB. Dont' forget your effective_cache_size. Joshua D. Drake -x86_64 but Redhat 5 Ent Benjamin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[GENERAL] Big table with UNION ALL or partitioning with Tsearch2
Hi, I have a really big Tsearch2 table (100s GB) that takes a while to perform queries and takes days to index. Is there any way to fix these issues using UNIONs or partitioning? I was thinking that I could partition the data by date but since I am always performing queries on the Tsearch2 field I do not know if this will help performance. I think paritioning will help the indexing problem since I can incrementally re-index the data but again I figured it would be better to ask. Any suggestions will be greatly appreciated. Thanks in advance. System I am running on: -Raid 5 with 16x drives -Quad core XEON 16 GB of memory (Any suggestion on the postgresql.conf setup would also be great! Currently I am just setting shared mem to 8192MB) -x86_64 but Redhat 5 Ent Benjamin ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Big table with UNION ALL or partitioning with Tsearch2
Hi, I have a really big Tsearch2 table (100s GB) that takes a while to perform queries and takes days to index. Is there any way to fix these issues using UNIONs or partitioning? I was thinking that I could partition the data by date but since I am always performing queries on the Tsearch2 field I do not know if this will help performance. I think paritioning will help the indexing problem since I can incrementally re-index the data but again I figured it would be better to ask. Any suggestions will be greatly appreciated. Thanks in advance. System I am running on: -Raid 5 with 16x drives -Quad core XEON 16 GB of memory (Any suggestion on the postgresql.conf setup would also be great! Currently I am just setting shared mem to 8192MB) -x86_64 but Redhat 5 Ent Benjamin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly