Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-13 Thread Simon Riggs
On Thu, 2007-07-12 at 11:19 -0700, Benjamin Arai wrote:

> I am running the latest 8.2.4.  I am using GIN.  The data is static.  I do
> a batch upload every week of about 500GB and the ata is never touched
> again, it is always add and never delete or update.

Partitioning will certainly help the index build times, even if it
doesn't specifically help with the queries.

> >From your slides you state:
> 
> GIN_FUZZY_SEARCH_LIMIT - maximum number of
> returned rows
> – GIN_FUZZY_SEARCH_LIMIT=0, disabled on
> default
> 
> When I do a search with say LIMIT 100 isn't this essentially the same thing?

Both restrict the number of matches found, but they're not the same
thing. One is for the query as a whole, the other is for one index scan
on a GIN index.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-12 Thread Hannes Dorbath
Joshua D. Drake wrote:
> Benjamin Arai wrote:
> RAID 5 with 16 spindles? RAID 10 will give you better performance I
> would think.

I'd say RAID 5 is not that bad for this kind of query, at least if the
controller is worth anything. RAID 10 is the best choice for OLTP, but
if the main job for this query is to read an index as fast as possible,
RAID 5 might be well suited. I have good experience with TSearch queries
on a 8 drive RAID 6 setup.

Even if those 16 drives are cheap desktop SATAs, he should be able to
read with around 800MB/sec.

Benjamin, could you try if the following does change your query
performance in any way?

blockdev --setra 16384 /dev/sdX


-- 
Best regards,
Hannes Dorbath

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-12 Thread Benjamin Arai
Oleg,

I am running the latest 8.2.4.  I am using GIN.  The data is static.  I do
a batch upload every week of about 500GB and the ata is never touched
again, it is always add and never delete or update.

>From your slides you state:

GIN_FUZZY_SEARCH_LIMIT - maximum number of
returned rows
– GIN_FUZZY_SEARCH_LIMIT=0, disabled on
default

When I do a search with say LIMIT 100 isn't this essentially the same thing?

Benjamin

> Benjamin,
>
> what version of postgres and what type of index you used ?
> The best setup is to use partitioning with rather small table for
> new data and GiST index and big archive table with static data and
> GIN index. I have some slides from PGDay
> http://www.sai.msu.su/~megera/postgres/talks/fts-pgday-2007.pdf
> Also, did you consider using dblink/dbilink to scale your search ?
>
> Oleg
> On Thu, 12 Jul 2007, Benjamin Arai wrote:
>
>> Hi,
>>
>> I have a really big Tsearch2 table (100s GB) that takes a while to
>> perform
>> queries and takes days to index.  Is there any way to fix these issues
>> using UNIONs or partitioning?  I was thinking that I could partition the
>> data by date but since I am always performing queries on the Tsearch2
>> field I do not know if this will help performance.  I think paritioning
>> will help the indexing problem since I can incrementally re-index the
>> data
>> but again I figured it would be better to ask.
>>
>> Any suggestions will be greatly appreciated. Thanks in advance.
>>
>> System I am running on:
>>
>> -Raid 5 with 16x drives
>> -Quad core XEON
>> 16 GB of memory (Any suggestion on the postgresql.conf setup would also
>> be
>> great! Currently I am just setting shared mem to 8192MB)
>> -x86_64 but Redhat 5 Ent
>>
>> Benjamin
>>
>>
>>
>> ---(end of broadcast)---
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>   subscribe-nomail command to [EMAIL PROTECTED] so that your
>>   message can get through to the mailing list cleanly
>>
>
>   Regards,
>   Oleg
> _
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
>



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-12 Thread Oleg Bartunov

Benjamin,

what version of postgres and what type of index you used ?
The best setup is to use partitioning with rather small table for 
new data and GiST index and big archive table with static data and 
GIN index. I have some slides from PGDay

http://www.sai.msu.su/~megera/postgres/talks/fts-pgday-2007.pdf
Also, did you consider using dblink/dbilink to scale your search ?

Oleg
On Thu, 12 Jul 2007, Benjamin Arai wrote:


Hi,

I have a really big Tsearch2 table (100s GB) that takes a while to perform
queries and takes days to index.  Is there any way to fix these issues
using UNIONs or partitioning?  I was thinking that I could partition the
data by date but since I am always performing queries on the Tsearch2
field I do not know if this will help performance.  I think paritioning
will help the indexing problem since I can incrementally re-index the data
but again I figured it would be better to ask.

Any suggestions will be greatly appreciated. Thanks in advance.

System I am running on:

-Raid 5 with 16x drives
-Quad core XEON
16 GB of memory (Any suggestion on the postgresql.conf setup would also be
great! Currently I am just setting shared mem to 8192MB)
-x86_64 but Redhat 5 Ent

Benjamin



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-12 Thread Benjamin Arai
> Benjamin Arai wrote:
>> Hi,
>>
>> I have a really big Tsearch2 table (100s GB) that takes a while to
>> perform
>> queries and takes days to index.  Is there any way to fix these issues
>> using UNIONs or partitioning?  I was thinking that I could partition the
>> data by date but since I am always performing queries on the Tsearch2
>> field I do not know if this will help performance.  I think paritioning
>> will help the indexing problem since I can incrementally re-index the
>> data
>> but again I figured it would be better to ask.
>>
>> Any suggestions will be greatly appreciated. Thanks in advance.
>>
>> System I am running on:
>>
>> -Raid 5 with 16x drives
>
> RAID 5 with 16 spindles? RAID 10 will give you better performance I
> would think.
>
>
>> -Quad core XEON
>> 16 GB of memory (Any suggestion on the postgresql.conf setup would also
>> be
>> great! Currently I am just setting shared mem to 8192MB)
>
> Assuming 8.1+ I would try something much more aggressive, like 4GB.
> Dont' forget your effective_cache_size.
How is 4GB more aggressive?  How large should the effective_cache_size be?
>
> Joshua D. Drake
>
>
>> -x86_64 but Redhat 5 Ent
>>
>> Benjamin
>>
>>
>>
>> ---(end of broadcast)---
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>subscribe-nomail command to [EMAIL PROTECTED] so that your
>>message can get through to the mailing list cleanly
>>
>
>
> --
>
>=== The PostgreSQL Company: Command Prompt, Inc. ===
> Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
> Providing the most comprehensive  PostgreSQL solutions since 1997
>   http://www.commandprompt.com/
>
> Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> PostgreSQL Replication: http://www.commandprompt.com/products/
>
>



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-12 Thread Joshua D. Drake

Benjamin Arai wrote:

Hi,

I have a really big Tsearch2 table (100s GB) that takes a while to perform
queries and takes days to index.  Is there any way to fix these issues
using UNIONs or partitioning?  I was thinking that I could partition the
data by date but since I am always performing queries on the Tsearch2
field I do not know if this will help performance.  I think paritioning
will help the indexing problem since I can incrementally re-index the data
but again I figured it would be better to ask.

Any suggestions will be greatly appreciated. Thanks in advance.

System I am running on:

-Raid 5 with 16x drives


RAID 5 with 16 spindles? RAID 10 will give you better performance I 
would think.




-Quad core XEON
16 GB of memory (Any suggestion on the postgresql.conf setup would also be
great! Currently I am just setting shared mem to 8192MB)


Assuming 8.1+ I would try something much more aggressive, like 4GB. 
Dont' forget your effective_cache_size.


Joshua D. Drake



-x86_64 but Redhat 5 Ent

Benjamin



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-12 Thread Benjamin Arai
Hi,

I have a really big Tsearch2 table (100s GB) that takes a while to perform
queries and takes days to index.  Is there any way to fix these issues
using UNIONs or partitioning?  I was thinking that I could partition the
data by date but since I am always performing queries on the Tsearch2
field I do not know if this will help performance.  I think paritioning
will help the indexing problem since I can incrementally re-index the data
but again I figured it would be better to ask.

Any suggestions will be greatly appreciated. Thanks in advance.

System I am running on:

-Raid 5 with 16x drives
-Quad core XEON
16 GB of memory (Any suggestion on the postgresql.conf setup would also be
great! Currently I am just setting shared mem to 8192MB)
-x86_64 but Redhat 5 Ent

Benjamin



---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] Big table with UNION ALL or partitioning with Tsearch2

2007-07-12 Thread Benjamin Arai
Hi,

I have a really big Tsearch2 table (100s GB) that takes a while to perform
queries and takes days to index.  Is there any way to fix these issues
using UNIONs or partitioning?  I was thinking that I could partition the
data by date but since I am always performing queries on the Tsearch2
field I do not know if this will help performance.  I think paritioning
will help the indexing problem since I can incrementally re-index the data
but again I figured it would be better to ask.

Any suggestions will be greatly appreciated. Thanks in advance.

System I am running on:

-Raid 5 with 16x drives
-Quad core XEON
16 GB of memory (Any suggestion on the postgresql.conf setup would also be
great! Currently I am just setting shared mem to 8192MB)
-x86_64 but Redhat 5 Ent

Benjamin



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly