Re: [PERFORM] SSD + RAID

2009-11-14 Thread Lists

Laszlo Nagy wrote:

Hello,

I'm about to buy SSD drive(s) for a database. For decision making, I 
used this tech report:


http://techreport.com/articles.x/16255/9
http://techreport.com/articles.x/16255/10

Here are my concerns:

   * I need at least 32GB of disk space, so a DRAM-based SSD is not a
 real option. I would have to buy 8x4GB of memory, which costs a
 fortune. And then it would still not have redundancy.
   * I could buy two X25-E drives and have 32GB disk space, and some
 redundancy. This would cost about $1600, not counting the RAID
 controller. It is on the edge.
This was the solution I went with (four drives in a RAID 10, actually). Not 
a cheap solution, but the performance is amazing.



   * I could also buy many cheaper MLC SSD drives. They cost about
 $140. So even with 10 drives, I'm at $1400. I could put them in
 RAID6, have much more disk space (256GB), high redundancy and
 POSSIBLY good read/write speed. Of course then I need to buy a
 good RAID controller.

My question is about the last option. Are there any good RAID cards 
that are optimized (or can be optimized) for SSD drives? Do any of you 
have experience in using many cheaper SSD drives? Is it a bad idea?


Thank you,

  Laszlo







Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Lists

Josh Berkus wrote:

Tom,


Right, because they do.  If you think otherwise, demonstrate it.
(bonnie tests approximating a reverse seqscan are not relevant
to the performance of indexscans.)


Working on it.  I *think* I've seen this issue in the field, which is 
why I brought it up in the first place, but getting a good test case 
is, of course, difficult.




I think I may be experiencing this situation now.

The query

   select comment_date
   from user_comments
   where user_comments.uid=1
   order by comment_date desc limit 1

   Explain:
   Limit  (cost=0.00..2699.07 rows=1 width=8) (actual
   time=52848.785..52848.787 rows=1 loops=1)
  ->  Index Scan Backward using idx_user_comments_comment_date on
   user_comments  (cost=0.00..5789515.40 rows=2145 width=8) (actual
   time=52848.781..52848.781 rows=1 loops=1)
   Filter: (uid = 1)
   Total runtime: 52848.840 ms

takes 10's of seconds to complete (52 sec last run). However

   select comment_date
   from user_comments
   where user_comments.uid=1
   order by comment_date limit 1

   Explain:
   Limit  (cost=0.00..2699.07 rows=1 width=8) (actual
   time=70.402..70.403 rows=1 loops=1)
  ->  Index Scan using idx_user_comments_comment_date on
   user_comments  (cost=0.00..5789515.40 rows=2145 width=8) (actual
   time=70.398..70.398 rows=1 loops=1)
   Filter: (uid = 1)
   Total runtime: 70.453 ms

takes well under 1 sec.


comment_date is a timestamp with time zone and has the index

   CREATE INDEX idx_user_comments_comment_date
 ON user_comments
 USING btree
 (comment_date);


I don't understand why it is so much slower to scan it in reverse.

It's a fairly big table. About 4.4 million rows, 888MB. That index is 
96MB. I tried dropping and recreating the index, but it doesn't seem to 
have helped any.



Can I create a reverse index on the dates so it can do a forward scan of 
the reverse index?
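
(Answering the literal question: on PostgreSQL 8.3 and later the sort 
direction can be declared in the index definition itself, so the backward 
scan becomes a forward one -- a minimal sketch, assuming adding another 
index is acceptable; note this alone does not change how many rows get 
skipped before uid = 1 is found:

   CREATE INDEX idx_user_comments_comment_date_desc
 ON user_comments
 USING btree
 (comment_date DESC);   -- DESC requires 8.3+; index name is illustrative
)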


Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Lists

Tom Lane wrote:

Lists li...@on-track.ca writes:

The query

   select comment_date
   from user_comments
   where user_comments.uid=1
   order by comment_date desc limit 1

   Explain:
   Limit  (cost=0.00..2699.07 rows=1 width=8) (actual
   time=52848.785..52848.787 rows=1 loops=1)
  ->  Index Scan Backward using idx_user_comments_comment_date on
   user_comments  (cost=0.00..5789515.40 rows=2145 width=8) (actual
   time=52848.781..52848.781 rows=1 loops=1)
   Filter: (uid = 1)
   Total runtime: 52848.840 ms

takes 10's of seconds to complete (52 sec last run). However

   select comment_date
   from user_comments
   where user_comments.uid=1
   order by comment_date limit 1

   Explain:
   Limit  (cost=0.00..2699.07 rows=1 width=8) (actual
   time=70.402..70.403 rows=1 loops=1)
  ->  Index Scan using idx_user_comments_comment_date on
   user_comments  (cost=0.00..5789515.40 rows=2145 width=8) (actual
   time=70.398..70.398 rows=1 loops=1)
   Filter: (uid = 1)
   Total runtime: 70.453 ms

takes well under 1 sec.

AFAICS this is pure chance --- it is based on when we happen to hit the
first row with uid = 1 while scanning in forward or reverse comment_date
order.  Unless you have evidence that the number of rows skipped over
is similar in both cases, there is no reason to suppose that this
example bears on Josh's concern.

As noted by Merlin, if you're willing to create another index to help
this type of query, then a two-column index on (uid, comment_date) would
be ideal.

regards, tom lane


Thank you Tom and Merlin (and Grzegorz, for the answer to my other 
question, which I no longer need). The composite index seems to do the 
trick; the reverse index scan now takes about the same time as the 
forward scan.
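
For reference, the new composite index is along these lines (the index 
name here is illustrative):

   CREATE INDEX idx_user_comments_uid_comment_date
 ON user_comments
 USING btree
 (uid, comment_date);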


Rows with uid=1 should be spread throughout the table, but a larger share 
of them should be earlier in the table (based on insert order).


I already had a separate index on uid

   CREATE INDEX idx_user_comments_uid
 ON user_comments
 USING btree
 (uid);

Under the circumstances, shouldn't a bitmap scan combining those two 
indexes be far faster than using just the date index (compared to the old 
plan, not the new composite index)? Why would the planner not choose that 
plan?
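
One way to see how the planner costs that alternative is to take the 
plain index scan off the table and re-run the EXPLAIN in a test session 
(a sketch only; the enable_* setting is session-local):

   SET enable_indexscan = off;   -- forces the planner to consider other plans
   EXPLAIN ANALYZE
   select comment_date
   from user_comments
   where user_comments.uid=1
   order by comment_date desc limit 1;
   RESET enable_indexscan;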


Re: [PERFORM] Best replication solution?

2009-04-06 Thread Lists
I'm currently running 32-bit FreeBSD, so I can't really add more RAM (PAE 
doesn't work well under FreeBSD from what I've read), and there are 
enough writes that more RAM won't solve the problem completely.


However I will add plenty more ram next time I rebuild it.


Heikki Linnakangas wrote:

Lists wrote:
The server is a dual-core Xeon with 3GB of RAM and two mirrors of 15k SAS 
drives (one for most data, one for WAL and a few tables and indexes).


In total all databases on the server are about 10G on disk (about 2GB 
in pgdump format).


I'd suggest buying as much RAM as you can fit into the server. RAM is 
cheap, and with a database of that size more cache could have a 
dramatic effect.







Re: [PERFORM] Best replication solution?

2009-04-06 Thread Lists

Andrew Sullivan wrote:

On Sun, Apr 05, 2009 at 11:36:33AM -0700, Lists wrote:

  
*Slony-I* - I've used this in the past, but it's a huge pain to work  
with, caused serious performance issues under heavy load due to long  
running transactions (may not be the case anymore, it's been a while  
since I used it on a large database with many writes), and doesn't seem  
very reliable (I've had replication break on me multiple times).



It is indeed a pain to work with, but I find it hard to believe that
it is the actual source of performance issues.  What's more likely
true is that it wasn't tuned to your write load -- that _will_ cause
performance issues.  
Can you point me in the direction of the documentation for tuning it? I 
don't see anything in the documentation for tuning for write load.



Of course, tuning it is a major pain, as
mentioned.  I'm also somewhat puzzled by the claim of unreliability:
most of the actual replication failures I've ever seen under Slony are
due to operator error (these are trivial to induce, alas --
aforementioned pain to work with again).  
Recently I had a problem with duplicate key errors on the slave, which 
shouldn't be possible since the keys are the same.

I've just noticed in the documentation that

   The Duplicate Key Violation
   http://www.slony.info/documentation/faq.html#DUPKEY bug has helped
   track down a number of rather obscure PostgreSQL race conditions, so
   that in modern versions of Slony-I and PostgreSQL, there should be
   little to worry about.

so that may no longer be an issue. However, I experienced this with the 
latest Slony (as of late last year) and PostgreSQL 8.3.


Also, the dupe key error linked there appears to be a duplicate key in 
Slony's metadata, whereas this was a duplicate key on one of my tables' 
primary keys.

Slony is baroque and
confusing, but it's specifically designed to fail in safe ways (which
is not true of some of the other systems: several of them have modes
in which it's possible to have systems out of sync with each other,
but with no way to detect as much.  IMO, that's much worse, so we
designed Slony to fail noisily if it was going to fail at all).  
  

An error is better than silently failing, but of course neither is optimal.

The Slony project could really benefit from a simpler user interface and 
simpler documentation. Its integration into pgAdmin III is a good step, 
but even with that it is still a bit of a pain, so I hope it continues to 
improve in ease of use.


Being powerful and flexible is good, but ease of use, with sensible 
defaults for complex items that can be easily overridden, is even better.


  
*Mammoth Replicator* - This is open source now; is it any good? It 
sounds like it's trigger-based like Slony. Is it based on Slony, or 
does it simply use a similar approach?



It's completely unrelated, and it doesn't use triggers.  I think the
people programming it are first-rate.  Last I looked at it, I felt a
little uncomfortable with certain design choices, which seemed to me
to be a little hacky.  They were all on the TODO list, though.

  

*SkyTools/Londiste* - Don't know anything special about it.



I've been quite impressed by the usability.  It's not quite as
flexible as Slony, but it has the same theory of operation.  The
documentation is not as voluminous, although it's also much handier as
reference material than Slony's (which is, in my experience, a little
hard to navigate if you don't already know the system pretty well).

A

  

Thanks, I'll look into both of those as well.



[PERFORM] Best replication solution?

2009-04-05 Thread Lists
I am looking to set up replication of my PostgreSQL database, primarily 
for performance reasons.


The searching I've done shows a lot of different options, can anyone 
give suggestions about which one(s) are best? I've read the archives, 
but there seems to be more replication solutions since the last thread 
on this subject and it seems to change frequently.


I'd really like a solution that replicates DDL, but very few do, so I 
think I'm out of luck there. I can live without it.
Multi-master support would be nice too, but it also seems to cause too 
many problems, so it looks like I'll have to do without that as well.



*Slony-I* - I've used this in the past, but it's a huge pain to work 
with, caused serious performance issues under heavy load due to long 
running transactions (may not be the case anymore, it's been a while 
since I used it on a large database with many writes), and doesn't seem 
very reliable (I've had replication break on me multiple times).


*Mammoth Replicator* - This is open source now; is it any good? It 
sounds like it's trigger-based like Slony. Is it based on Slony, or 
does it simply use a similar approach?


*pgpool* - Won't work for us reliably for replication because we have 
some triggers and stored procedures that write data.


*PGCluster* - Sounds cool, but based on the mailing list traffic and the 
last news post on the site being from 2005, development seems to be 
nearly dead. Also, no release seems to make it beyond the RC stage -- and 
for multi-master, stability is particularly important for data integrity.


*PGReplicator* - Don't know anything special about it.

*Bucardo* - Don't know anything special about it.

*Postgres-R* - Don't know anything special about it.

*SkyTools/Londiste* - Don't know anything special about it.


Re: [PERFORM] Best replication solution?

2009-04-05 Thread Lists
I have a high-traffic database with a high volume of reads and a moderate 
volume of writes: millions of queries a day.


It's running the latest version of PostgreSQL 8.2.x (I want to upgrade to 
8.3, but the dump/reload requires an unacceptable amount of downtime).
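
(For what it's worth, one way to shorten that window is to stream the 
dump straight into the new cluster instead of staging it on disk -- a 
sketch only, assuming the 8.3 cluster is already initdb'd and listening 
on port 5433 on the same host; database name and ports are illustrative:

   # create the target database, then pipe the dump across
   createdb -p 5433 mydb
   pg_dump -p 5432 mydb | psql -p 5433 -d mydb
)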


The server is a dual-core Xeon with 3GB of RAM and two mirrors of 15k SAS 
drives (one for most data, one for WAL and a few tables and indexes).


In total all databases on the server are about 10G on disk (about 2GB in 
pgdump format).



The I/O on the disks is maxed out, and I don't have the budget to add 
more disks at this time. The web server has a RAID 10 of SATA drives 
with some I/O bandwidth to spare, so I would like to replicate all the 
data over and send some read queries to that server -- in particular the 
very I/O-intensive FTI-based search queries.



ries van Twisk wrote:

Dear Mr. No Name,

Which replication solution is best depends on your requirements.
Maybe you can tell us a bit more about your situation?
Since you didn't give us much information about your requirements, 
it's hard to give you any advice.


Ries

On Apr 5, 2009, at 1:36 PM, Lists wrote:

I am looking to set up replication of my PostgreSQL database, 
primarily for performance reasons.


The searching I've done shows a lot of different options, can anyone 
give suggestions about which one(s) are best? I've read the archives, 
but there seems to be more replication solutions since the last 
thread on this subject and it seems to change frequently.


I'd really like a solution that replicates DDL, but very few do, so I 
think I'm out of luck there. I can live without it.
Multi-master support would be nice too, but it also seems to cause too 
many problems, so it looks like I'll have to do without that as well.



*Slony-I* - I've used this in the past, but it's a huge pain to work 
with, caused serious performance issues under heavy load due to long 
running transactions (may not be the case anymore, it's been a while 
since I used it on a large database with many writes), and doesn't 
seem very reliable (I've had replication break on me multiple times).


*Mammoth Replicator* - This is open source now; is it any good? It 
sounds like it's trigger-based like Slony. Is it based on Slony, or 
does it simply use a similar approach?


*pgpool* - Won't work for us reliably for replication because we have 
some triggers and stored procedures that write data.


*PGCluster* - Sounds cool, but based on the mailing list traffic and 
the last news post on the site being from 2005, development seems to 
be nearly dead. Also, no release seems to make it beyond the RC stage 
-- and for multi-master, stability is particularly important for data 
integrity.


*PGReplicator* - Don't know anything special about it.

*Bucardo* - Don't know anything special about it.

*Postgres-R* - Don't know anything special about it.

*SkyTools/Londiste* - Don't know anything special about it.











[PERFORM] Optimal tags design?

2007-07-18 Thread lists
I am planning to add a tags feature (as in the Web 2.0 thing) to my web-based
application. I would like some feedback from the experts here on
what the best database design for that would be.

The possibilities I have come up with are:
* A tags table containing the tag and id number of what it links to.
select pid from tags where tag='bla'
select tag from tags where pid=xxx.

* A tags table where each tag exists only once, plus a table with the tag
ID and picture ID to link them together (a minimal schema sketch of this
option follows after the list).

select pid from tags inner join picture_tags using(tag_id) where tag='bla'
select tag from tags inner join picture_tags using(tag_id) where pid='xxx'

* A full text index in the picture table containing the tags

select pid from pictures where tags @@ to_tsquery('bla')
(or the non-fti version)
select pid from pictures where tags ~* '.*bla.*'

select tags from pictures where pid=xxx;
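
A minimal sketch of the second (normalized) layout, for reference. Table 
and column names here are only illustrative, and it assumes an existing 
pictures table keyed by pid:

   -- illustrative names; assumes pictures(pid) already exists
   create table tags (
   tag_id   serial primary key,
   tag  text not null unique
   );

   create table picture_tags (
   tag_id   integer not null references tags (tag_id),
   pid  integer not null references pictures (pid),
   primary key (tag_id, pid)
   );

   -- index the reverse lookup (all tags for a given picture)
   create index idx_picture_tags_pid on picture_tags (pid);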



[PERFORM] Adding disks/xlog index

2007-05-25 Thread lists
I have a busy PostgreSQL server running on a RAID 1 of two 15k RPM
SCSI drives.

I have been running into the problem of maxed-out I/O bandwidth. I would
like to convert my RAID 1 into a RAID 10, but that would require a full
rebuild, which is more downtime than I want, so I am looking into other
alternatives.

The best one I have come up with is moving the xlog/WAL (can someone
confirm whether these are the same thing?) to another physical drive. I
also think it may be beneficial to move some indexes to that drive as
well (the same one as the xlog).

Some questions on this:
1. Can the database survive loss/corruption of the xlog and indexes in a
recoverable way? To save money (and because I won't need the throughput as
much), I am thinking of making this index/WAL/xlog drive a single cheap
SATA drive (or maybe a non-RAIDed 15k SCSI for 60% more money). However,
without the redundancy of a mirror I am concerned about drive failure.
Loss of several minutes of recent transactions in a serious crash is
acceptable to me, but full/serious database corruption (the likes of fsync
off) is not.

2. Is there any point in using a high-performance (i.e. SCSI) disk for this,
or would the mirror containing the majority of the data still be the major
bottleneck, keeping disk usage from exceeding SATA performance anyway?

3. Is there any easy way to move ALL indexes to another drive? Is this a
good performance idea or would they just bottleneck each other seriously?
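
(Tablespaces can at least do this per index -- a sketch, assuming the new
drive is mounted at /mnt/fastdisk and the directory is writable by the
postgres user; the tablespace, path, and index names are only examples:

   -- path and names are examples only
   CREATE TABLESPACE indexspace LOCATION '/mnt/fastdisk/pg_indexes';
   ALTER INDEX some_index SET TABLESPACE indexspace;

As far as I know there is no single command in 8.2 to move every index at
once, so each index would need its own ALTER INDEX.)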


Other info for reference:
Running PostgreSQL 8.2 on FreeBSD 6.1.
The server is a Core 2 with 4GB of RAM. CPU usage is moderate.


Also, can anyone recommend a good shared_buffers size? The server is
dedicated to Postgres except for half a gig used by memcached. Right now I
have it set at 51200, which may be too high (I've read varying suggestions
on this, and I'm not sure how aggressive FreeBSD 6's I/O cache is).

And any suggestions on what effective_cache_size I should use on this
hardware and OS? I've been using 384MB but I don't know if this is optimal
or not.
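
(Just to make the units concrete, a postgresql.conf sketch of the current
values, assuming the default 8kB block size -- these restate the existing
settings, they are not a recommendation:

   shared_buffers = 51200        # 8kB pages, i.e. 400MB
   effective_cache_size = 49152  # 8kB pages, i.e. 384MB
)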



Re: [PERFORM] Trigger performance problem

2005-05-17 Thread lists




Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql

2005-01-18 Thread Nicolai Petri (lists)
This must be a Linuxism, because to my knowledge FreeBSD does not keep the 
OS cache mapped into the kernel address space unless it has active objects 
associated with the data.

FreeBSD also has a default split of 3GB userspace and 1GB kernelspace 
when running with a default configuration. Linux people might want to try 
other OSes to compare the performance.

Best regards,
Nicolai Petri
P.S. Sorry for my lame MS mailer - quoting is not something it knows how to 
do. :)
- Original Message - 
From: William Yu [EMAIL PROTECTED]


I inferred this from reading up on the compressed VM project. It can be 
higher or lower depending on what devices you have in your system -- 
however, I've read messages from kernel hackers saying Linux is very 
aggressive in reserving memory space for devices because it must be 
allocated at boot time.


Josh Berkus wrote:
William,

The threshold for using PAE is actually far lower than 4GB. 4GB is the
total memory address space -- split that in half for 2GB of userspace and
2GB for the kernel. The OS cache resides in kernel space -- after you take
away the memory allocation for devices, you're left with a window of
roughly 900MB.

I'm curious, how do you get 1.1GB for memory allocation for devices?
