I've been working on the same project and figured I would chip in.
A compromise that avoids synchronous replication would be to
determine which functions in our code need "live" or recently
modified data, and make sure the queries those functions issue go to
the master database (where all INSERT and UPDATE operations happen).
Functions where a few seconds of delay doesn't matter would have
their queries directed to a replicated slave database instead.
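Conceptually, the split would look something like this if we managed two connections ourselves. This is just a sketch of the idea, with stand-in strings instead of real psycopg connections, and the Router class and function names are purely illustrative:

```python
import functools

# Stand-ins for real connections to the master and a replicated slave.
MASTER = "master"
REPLICA = "replica"

class Router:
    """Picks a connection based on whether a function needs fresh data."""

    def __init__(self, master_conn, replica_conn):
        self.master_conn = master_conn
        self.replica_conn = replica_conn

    def uses_master(self, func):
        """Decorator: call func with the master connection."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(self.master_conn, *args, **kwargs)
        return wrapper

    def uses_replica(self, func):
        """Decorator: call func with the replica connection."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(self.replica_conn, *args, **kwargs)
        return wrapper

router = Router(master_conn=MASTER, replica_conn=REPLICA)

@router.uses_master
def latest_comments(conn):
    # Would run a SELECT that needs up-to-the-second data.
    return conn

@router.uses_replica
def archive_page(conn):
    # A few seconds of replication lag is fine here.
    return conn

print(latest_comments())  # master
print(archive_page())     # replica
```

The catch, of course, is that with pgpool2 in the middle there is only one connection to route through.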
It isn't clear how to achieve this, though. We have pgpool2 working in
master/slave mode, but it doesn't offer fine-grained control over how
queries are routed. The only hook I could find at the application
level is the transaction isolation level: pgpool2 directs all queries
to the master when the level is "read committed" and only
load-balances when it is "read uncommitted." According to the
PostgreSQL documentation, switching between the two makes no
difference to actual behavior, because PostgreSQL treats "read
uncommitted" as "read committed", its minimum level of transaction
isolation. pgpool2, however, isn't aware of this and routes on the
requested level anyway.
I was able to direct queries from individual functions to the master
database by wrapping them in a decorator that sets the connection's
isolation level to 1 and then back to 0. However, this seems way too
sketchy for us to be comfortable with it. I wonder if there is a
better way.
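For reference, the decorator looks roughly like this. It's a sketch: the connection here is a fake psycopg-style object with a set_isolation_level method, and get_connection is an assumed callable for however your code hands out the live connection. The 1/0 values are the levels pgpool2 keys its routing off of:

```python
import functools

def force_master(get_connection):
    """Decorator factory: bump the connection's isolation level so
    pgpool2 routes the wrapped function's queries to the master,
    then drop it back so later queries load-balance again."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            conn = get_connection()
            conn.set_isolation_level(1)  # pgpool2 sends these to the master
            try:
                return func(*args, **kwargs)
            finally:
                conn.set_isolation_level(0)  # back to load balancing
        return wrapper
    return decorator

# Fake connection standing in for a psycopg connection, so the
# isolation-level calls can be observed.
class FakeConnection:
    def __init__(self):
        self.levels = []
    def set_isolation_level(self, level):
        self.levels.append(level)

conn = FakeConnection()

@force_master(lambda: conn)
def fresh_read():
    # Would run a SELECT that must see just-written data.
    return "rows"

fresh_read()
print(conn.levels)  # [1, 0]
```

The finally clause matters: without it, an exception inside the wrapped function would leave the connection pinned to the master for the rest of its life.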
Michael
On Nov 3, 10:19 pm, Adam Seering wrote:
> Hi,
> We're running a website that usually runs just fine on our server; but
> every now and then we get a big load burst (thousands of simultaneous
> users in an interactive Web 1.5-ish app), and our database server
> (PostgreSQL) just gets completely swamped.
>
> We'd like to set up some form of load-balancing. The workload is very
> SELECT-heavy, so this seems plausible. It looks like Slony is the
> recommended package for doing this. However, if we set up a Slony
> cluster and use pgpool to divide up queries among the nodes, the default
> isolation level requested by psycopg forces all the queries to go to the
> master database, which defeats the purpose of the cluster. If we force
> the system to a lower isolation level, all kinds of things start
> breaking, because data doesn't appear quickly enough in the slave
> databases, and various chunks of Django code (and our code) seem to rely
> on writing data and immediately reading it back.
>
> Does anyone else do this type of load-balancing? Any tips? In
> general, what (if anything) do folks here do for load-balancing?
>
> Thanks,
> Adam
You received this message because you are subscribed to the Google Groups
"Django users" group.