Re: [PERFORM] hardware performance and some more

2003-07-25 Thread Kasim Oztoprak
On 24 Jul 2003 23:25 EEST you wrote:

 On Thu, 2003-07-24 at 13:25, Kasim Oztoprak wrote:
  On 24 Jul 2003 17:08 EEST you wrote:
  
   On 24 Jul 2003 at 15:54, Kasim Oztoprak wrote:
 [snip]
  
  we do not have memory problems or disk problems. as I have seen on the list,
  the best way to use disks is raid 10 for data and raid 1 for the os. we can
  put in as much memory as we require.
  
  now the question: if we have 100 searches per second, and each search needs
  30 sql statements, what will the performance of the system be in terms of
  time? let us say we have two of the machines described above in a cluster.
 
 That's 3000 sql statements per second, 180 thousand per minute.
 What the heck is this database doing!
 
 A quad-CPU Opteron sure is looking useful right about now...  Or
 a quad-CPU AlphaServer ES45 running Linux, if 4x Opterons aren't
 available.
 
 How complicated is each of these SELECT statements?

This is a kind of directory assistance application. The select statements are
actually not very complex. The database contains 25 million subscriber records,
and the operators search for subscriber numbers or addresses. There are not
many update operations; the update ratio is approximately 0.1%.

I will use at least 4 machines, each with 4 Xeon processors running at 2.8 GHz
and a suitable amount of memory.

I hope this will handle the load. Has anyone done a similar implementation?
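
The load figures in this exchange can be worked through as a quick sketch.
The search rate, statements per search, and machine counts come from the
thread; the even spread of load across CPUs and the one-statement-per-CPU
model are simplifying assumptions, not measurements.

```python
# Back-of-the-envelope load figures; inputs are the numbers quoted in the thread.
SEARCHES_PER_SECOND = 100     # "100 searches per second"
STATEMENTS_PER_SEARCH = 30    # "30 sql instruction" per search
MACHINES = 4                  # "at least 4 machines"
CPUS_PER_MACHINE = 4          # "each having 4 cpu"

statements_per_second = SEARCHES_PER_SECOND * STATEMENTS_PER_SEARCH
statements_per_minute = statements_per_second * 60
total_cpus = MACHINES * CPUS_PER_MACHINE

# Assumption: load spreads evenly and each CPU runs one statement at a time.
per_cpu_rate = statements_per_second / total_cpus
cpu_budget_ms = 1000.0 / per_cpu_rate  # CPU time available per statement

print(statements_per_second)    # 3000
print(statements_per_minute)    # 180000
print(per_cpu_rate)             # 187.5
print(round(cpu_budget_ms, 2))  # 5.33
```

Under these assumptions each statement has roughly 5 ms of CPU time available,
which is only a rough bound; disk waits and lock contention are not modelled.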

 
 -- 
  - 
 | Ron Johnson, Jr.Home: [EMAIL PROTECTED] |
 | Jefferson, LA  USA  |
 | |
 | I'm not a vegetarian because I love animals, I'm a vegetarian  |
 |  because I hate vegetables!|
 |unknown  |
  - 
 
 
 
 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] hardware performance and some more

2003-07-25 Thread Shridhar Daithankar
On 25 Jul 2003 at 16:38, Kasim Oztoprak wrote:
 This is a kind of directory assistance application. The select statements are
 actually not very complex. The database contains 25 million subscriber records,
 and the operators search for subscriber numbers or addresses. There are not
 many update operations; the update ratio is approximately 0.1%.
 
 I will use at least 4 machines, each with 4 Xeon processors running at 2.8 GHz
 and a suitable amount of memory.

Are you going to duplicate the data?

If you are going to have 3000 sql statements per second, I would suggest:

1. Get a quad-CPU machine. You probably need that horsepower.
2. Use prepared statements and stored procedures to avoid parsing overhead.

I doubt you would need a cluster of machines, though. If you run it through a 
pilot program, that would give you an idea whether or not you need a cluster.
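
The prepared-statement suggestion can be illustrated with a minimal sketch.
Python's built-in sqlite3 module is used here only as a stand-in so the example
runs without a server; it is not PostgreSQL. In PostgreSQL the corresponding
SQL commands are PREPARE and EXECUTE. The subscriber table, its columns, and
the sample rows are hypothetical.

```python
import sqlite3

# In-memory stand-in for the subscriber table; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriber (phone TEXT PRIMARY KEY, name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO subscriber VALUES (?, ?, ?)",
    [
        ("5550001", "A. Yilmaz", "1 Example St"),
        ("5550002", "B. Demir", "2 Example St"),
        ("5550003", "C. Kaya", "3 Example St"),
    ],
)

# One fixed SQL text with a placeholder. Because the text never changes,
# the driver can compile it once and reuse the compiled statement.
LOOKUP_BY_PHONE = "SELECT name, address FROM subscriber WHERE phone = ?"


def lookup(phone):
    """Return (name, address) for a phone number, or None if it is absent."""
    return conn.execute(LOOKUP_BY_PHONE, (phone,)).fetchone()


results = [lookup(p) for p in ("5550002", "5550003", "5559999")]
print(results)
# [('B. Demir', '2 Example St'), ('C. Kaya', '3 Example St'), None]
```

The point of the design is that only the parameter value changes between
calls, so the parse and plan work is not repeated for every search.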

Bye
 Shridhar

--
Default, n.:The hardware's, of course.




Re: [PERFORM] hardware performance and some more

2003-07-25 Thread Shridhar Daithankar
On 24 Jul 2003 at 9:42, William Yu wrote:

 As far as I can tell, the performance impact seems to be minimal. 
 There's a periodic storm of replication updates in cases where there are 
 mass updates since the last resync. But if you have mostly reads and few 
 writes, you shouldn't see this situation. The biggest performance impact 
 seems to be the CPU power needed to zip/unzip/encrypt/decrypt files.

Can you use WAL-based replication? I don't have a URL handy, but there are 
replication projects which transmit WAL files to another server as they fill 
up.

OTOH, I was thinking of a simple replication scheme. If postgresql provided a 
hook where it calls an external library routine for each heapinsert in WAL, 
there could be a simple multi-slave replication system. One wouldn't have to 
wait till the WAL file fills up.

Of course, it's up to the library to make sure that it does not hold up 
postgresql commits for so long that it hampers performance.

There would also need to be a receiving hook which would directly heapinsert 
the data on another node.

But if the external library is threaded, will that work well with postgresql?

Just a thought. If it works, load balancing could be a lot easier and 
near-realtime.
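
The hook idea above can be sketched as a toy model. Nothing here is a
PostgreSQL API: ToyMaster, ToyReplicator, and the hook signature are all
hypothetical names made up for illustration. The sketch only shows the shape
of the design, where the hook does nothing but enqueue the row so the insert
path is not held up, and a background thread ships rows to the slaves.

```python
import queue
import threading


class ToyMaster:
    """Toy stand-in for a server that calls a hook on every insert."""

    def __init__(self, on_insert):
        self.rows = []
        self.on_insert = on_insert

    def insert(self, row):
        self.rows.append(row)
        self.on_insert(row)  # must return quickly so inserts are not delayed


class ToyReplicator:
    """Hypothetical external library: queues rows, ships them in a thread."""

    def __init__(self, slaves):
        self.slaves = slaves          # each slave is modelled as a plain list
        self.pending = queue.Queue()
        self.worker = threading.Thread(target=self._ship, daemon=True)
        self.worker.start()

    def hook(self, row):
        self.pending.put(row)         # enqueue only; no I/O on the insert path

    def _ship(self):
        while True:
            row = self.pending.get()
            if row is None:           # sentinel: stop the worker
                break
            for slave in self.slaves:
                slave.append(row)     # the "receiving hook" on each node

    def close(self):
        self.pending.put(None)
        self.worker.join()


slave_a, slave_b = [], []
replicator = ToyReplicator([slave_a, slave_b])
master = ToyMaster(on_insert=replicator.hook)
for i in range(5):
    master.insert({"id": i})
replicator.close()
print(slave_a == master.rows, slave_b == master.rows)  # True True
```

Whether a threaded library like this could safely live inside the server
process is exactly the open question raised above; the sketch does not
answer it.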


Bye
 Shridhar

--
We fight only when there is no other choice.  We prefer the ways of peaceful contact.
 -- Kirk, Spectre of the Gun, stardate 4385.3




[PERFORM] hardware performance and some more

2003-07-24 Thread Kasim Oztoprak
hello,

some of my questions may not be related to this group; however, I know that 
some of them are directly related to this list. 

first of all, I would like to know whether any of you use postgresql in a 
clustered environment. Or, to put the question differently: can we use 
postgresql in a clustered environment? If we can, how does postgresql 
support clusters?

I would like to ask about two main clustering methods (let us assume we use 2 
machines in the cluster). In the first case, we have two machines in a cluster, 
but the second one does not run the database server until the failure of the 
first machine is detected. The oracle guys call this an active-passive 
configuration. Only one machine runs the database server at any given time. 
Hence, in the case of a failure, there is some waiting time until the second 
machine comes up.

In the second option, both machines run the database server at the same time. 
Oracle supports this method with an add-on called Real Application Clusters 
(RAC). The oracle guys call this an active-active configuration.

The questions arising from this are:
  1 - Can we use postgresql within a clustered environment?
  2 - If the answer is yes, in which mode can we use postgresql within a cluster:
  active-passive or active-active?
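
The active-passive behaviour described above, including the waiting time
before the second machine takes over, can be modelled with a small sketch.
The Node and ActivePassivePair classes, the node names, and the threshold of
three missed heartbeats are all hypothetical; this is not how any particular
cluster manager is implemented.

```python
class Node:
    """A machine that may or may not be running the database server."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.serving = False


class ActivePassivePair:
    """Standby starts serving only after the primary misses enough heartbeats."""

    def __init__(self, primary, standby, max_missed=3):
        self.primary = primary
        self.standby = standby
        self.max_missed = max_missed
        self.missed = 0
        primary.serving = True

    def heartbeat(self):
        """Run one monitoring tick; return the name of the serving node."""
        if self.primary.alive:
            self.missed = 0
        else:
            self.missed += 1
            self.primary.serving = False
            if self.missed >= self.max_missed:
                self.standby.serving = True  # failover: standby takes over
        for node in (self.primary, self.standby):
            if node.serving:
                return node.name
        return None  # nobody is serving: the failover gap


pair = ActivePassivePair(Node("db1"), Node("db2"))
timeline = [pair.heartbeat()]
pair.primary.alive = False  # simulate failure of the first machine
timeline += [pair.heartbeat() for _ in range(4)]
print(timeline)  # ['db1', None, None, 'db2', 'db2']
```

The None entries in the timeline are the interval during which no machine
runs the database server. An active-active setup has no such gap, because
both nodes are already serving.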

Now, the second question is related to the performance of the database. Assume 
we have a Dell PowerEdge 6650 with 4 x 2.8 GHz Xeon processors, each with 2 MB 
of cache, and, let's say, 32 GB of main memory. We can either use a small SAN 
from EMC or put all the disks into the machines with the required raid 
configuration.

We will install RedHat Advanced Server 2.1 on the machine as the operating 
system and postgresql as the database server. We have a database of 25 million 
records with an average length of 250 bytes per record, and there are 1000 
operators accessing the database concurrently. The main operation on the 
database (about 95%) is select rather than insert, so do you have any idea 
about the performance of the system?
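
As a rough sizing check, the raw data volume implied by these figures can be
computed directly. The record count, record length, and memory size are the
ones stated above; the result covers raw record bytes only and ignores
indexes and per-row storage overhead, which are not given in the post.

```python
# Raw data volume from the figures in the post (indexes and overhead excluded).
RECORDS = 25_000_000      # "25 millions records"
AVG_RECORD_BYTES = 250    # "250 bytes on average for each record"
RAM_GB = 32               # "main memory of lets say 32 GB"

raw_bytes = RECORDS * AVG_RECORD_BYTES
raw_gb = raw_bytes / 10**9    # decimal gigabytes
raw_gib = raw_bytes / 2**30   # binary gibibytes

print(raw_bytes)              # 6250000000
print(raw_gb)                 # 6.25
print(round(raw_gib, 2))      # 5.82
print(raw_gb < RAM_GB)        # True
```

So the raw records come to about 6.25 GB, well under the 32 GB of memory
mentioned, although the real on-disk size with indexes will be larger.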

best regards,

-kasım


