Re: 1 Million users.. I can't Scale!!

2005-10-27 Thread maxoutrage
How are you actually sending messages to the SMSC?

If you are directly connected - i.e. using SMPP or UCP - then I would
imagine that there is a bottleneck at the SMSC. Large SMSC systems
in the US typically deliver up to 1000 sm/s, with larger systems
delivering 2000+ sm/s. From the throughput you require I can see this
could be a problem. I can imagine the carrier would not be happy for
your application to dominate the SMSC and may impose throttling on
your application.

What have you agreed with the carrier regarding your connection? With
a high-volume application it is important to verify the network
bandwidth available and the round-trip delays. Also, verify the
windowing parameters for the SMPP/UCP connection and the maximum
number of connections you are allowed to make.

Are you connecting to a multi-node SMSC? If so, make separate connections
to each node.
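For a sense of scale, here is the arithmetic behind that concern, using the quoted SMSC rates (a rough illustration, not measured data):

```python
subscribers = 1_000_000
smsc_rate = 2000                     # sm/s, the larger systems mentioned above
seconds_for_full_push = subscribers / smsc_rate
minutes = seconds_for_full_push / 60
# 500 seconds, i.e. about 8.3 minutes even at 2000 sm/s -
# still over a 5-minute window without multiple nodes or connections.
```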

Peter.

phil wrote:
> > Quite true and this lack of clarity was a mistake on my part.  Requests
> > from users do not really become a significant part of this equation
> > because, as described above, once a user subscribes the onus is upon us
> > to generate messages throughout a given period determined by the number
> > of updates a user has subscribed to receive.
> >
>
>
> So you are trying to SEND a million times several packets every
> 5 minutes?
> No way Python is the bottleneck in that volume.
> I have a POS app in Python that handles 10,000 packets
> per SECOND including a MySQL lookup.
> You have a bottleneck, but it's not Python.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 1 Million users.. I can't Scale!!

2005-09-30 Thread Christos Georgiou
On Wed, 28 Sep 2005 21:58:15 -0400, rumours say that Jeff Schwab
<[EMAIL PROTECTED]> might have written:

>For many (most?) applications in need of 
>serious scalability, multi-processor servers are preferable.  IBM has 
>eServers available with up to 64 processors each, and Sun sells E25Ks 
>with 72 processors apiece.

SGI offers modular single-image Itanium2 servers of up to 512 CPU at the
moment:

http://www.sgi.com/products/servers/altix/configs.html

And NASA have clustered 20 of these machines to create a 10240 CPU
cluster...

>I like to work on those sorts of machine 
>when possible.  Of course, they're not right for every application, 
>especially since they're so expensive.

And expensive they are :)
-- 
TZOTZIOY, I speak England very best.
"Dear Paul,
please stop spamming us."
The Corinthians


Re: 1 Million users.. I can't Scale!!

2005-09-29 Thread phil

> Quite true and this lack of clarity was a mistake on my part.  Requests
> from users do not really become a significant part of this equation
> because, as described above, once a user subscribes the onus is upon us
> to generate messages throughout a given period determined by the number
> of updates a user has subscribed to receive.
> 


So you are trying to SEND a million times several packets every
5 minutes?
No way Python is the bottleneck in that volume.
I have a POS app in Python that handles 10,000 packets
per SECOND including a MySQL lookup.
You have a bottleneck, but it's not Python.






Re: 1 Million users.. I can't Scale!!

2005-09-29 Thread Jeff Schwab
Aahz wrote:
> In article <[EMAIL PROTECTED]>,
> Jeff Schwab  <[EMAIL PROTECTED]> wrote:
> 
>>Sure, multiple machines are probably the right approach for the OP; I 
>>didn't mean to disagree with that.  I just don't think they are "the 
>>only practical way for a multi-process application to scale beyond a few 
>>processors," like you said.  For many (most?) applications in need of 
>>serious scalability, multi-processor servers are preferable.  IBM has 
>>eServers available with up to 64 processors each, and Sun sells E25Ks 
>>with 72 processors apiece.  I like to work on those sorts of machine 
>>when possible.  Of course, they're not right for every application, 
>>especially since they're so expensive.
> 
> 
> Do these use shared memory?

Yes.


Re: 1 Million users.. I can't Scale!!

2005-09-29 Thread Aahz
In article <[EMAIL PROTECTED]>,
Jeff Schwab  <[EMAIL PROTECTED]> wrote:
>
>Sure, multiple machines are probably the right approach for the OP; I 
>didn't mean to disagree with that.  I just don't think they are "the 
>only practical way for a multi-process application to scale beyond a few 
>processors," like you said.  For many (most?) applications in need of 
>serious scalability, multi-processor servers are preferable.  IBM has 
>eServers available with up to 64 processors each, and Sun sells E25Ks 
>with 72 processors apiece.  I like to work on those sorts of machine 
>when possible.  Of course, they're not right for every application, 
>especially since they're so expensive.

Do these use shared memory?
-- 
Aahz ([EMAIL PROTECTED])   <*> http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread yoda
Thanks for the whitepapers and incredibly useful advice.  I'm beginning
to get a picture of what I should be thinking about and implementing to
achieve this kind of scalability.  Before I go down any particular
route here's a synopsis of the application.

1)User requests are received only during subscription.  We currently
don't have any problems with this because subscription rates increase
along a sigmoid curve.

2)Once a user subscribes we begin to send them content as it becomes
available.

3)The content is sports data. Content generation is dependent on the
day. On days when there's a lot of action, we can generate up to 20
separate items in a second every 10 minutes.

4)The content is event driven e.g. a goal is scored. It is therefore
imperative that we send the content to the subscribers within a period
of 5 minutes or less.

>There is a difference between one million users each who make one request
>once a month, and one million users who are each hammering the system with
>ten requests a second. Number of users on its own is a meaningless
>indicator of requirements.
Quite true and this lack of clarity was a mistake on my part.  Requests
from users do not really become a significant part of this equation
because, as described above, once a user subscribes the onus is upon us
to generate messages throughout a given period determined by the number
of updates a user has subscribed to receive.

5)Currently, hardware is a constraint (we're a startup and can't afford
high-end servers). I would prefer a solution that doesn't require any
changes to the hardware stack. For now, let's assume that hardware is
not part of the equation and every optimization has to be software-based
(except for the beautiful network optimizations suggested).



Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Paul Rubin
"yoda" <[EMAIL PROTECTED]> writes:
> Currently, the content is generated and a number of SMS per user are
> generated. I'll have to measure this more accurately but a cursory
> glance indicated that we're generating approximately 1000 sms per
> second. (I'm sure this can't be right.. the parser\generator should be
> faster than that:)

Don't be sure.  Try some profiling, and maybe Psyco, C extensions,
etc.  Python is many things but it's not terribly fast at
interpreting.  It sounds like you have more bottlenecks than just the
python app though, starting with the SMS gateway issue that you just
mentioned.  If you get more gateway bandwidth and your application can
split across multiple boxes, then the simple short-term fix is buy
more boxes.  Longer term you need to re-examine the sw architecture
and possibly make changes.
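As a concrete starting point for that profiling, here is a minimal sketch using the `cProfile` module from current Python (in 2005 the equivalent was `profile`/`hotshot`); `handle_subscriber` and `generate_batch` are invented stand-ins for the OP's real message-generation code:

```python
import cProfile
import io
import pstats

def handle_subscriber(n):
    # Hypothetical stand-in for the real per-subscriber message work.
    return "".join(str(i) for i in range(n))

def generate_batch():
    # Hypothetical stand-in for one content-generation pass.
    for _ in range(1000):
        handle_subscriber(100)

# Profile one batch and report where the time actually goes.
profiler = cProfile.Profile()
profiler.enable()
generate_batch()
profiler.disable()

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

Reading the report tells you whether the generator, the serialization, or the network call dominates before you reach for Psyco or C extensions.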


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread yoda
>1. How are you transmitting your SMSs?
Currently, a number of different gateways are being used: 2 provide a
SOAP web service interface, 1 other provides a REST based web service.

A transaction using the SOAP web services takes 3-5 seconds to complete
(from the point of calling the method to receive an error\success
confirmation)
The REST web service transaction takes 1 second or less to complete.

> 2. If you disable the actual transmission, how many SMSs can your
>application generate per second?
Currently, the content is generated and a number of SMS per user are
generated. I'll have to measure this more accurately but a cursory
glance indicated that we're generating approximately 1000 sms per
second. (I'm sure this can't be right.. the parser\generator should be
faster than that:)

Additionally, I've just confirmed that the gateways we use can pump
out 20-100 SMSs per second. This is currently too slow, so we'll
probably get direct access to the mobile operator's SMSC, which
provides larger throughput.
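For scale, a quick calculation of what those gateway rates mean for a million subscribers:

```python
subscribers = 1_000_000
gateway_rate = 100           # best case quoted above, in SMS per second
seconds_needed = subscribers / gateway_rate
hours_needed = seconds_needed / 3600
# 10,000 seconds, i.e. roughly 2.8 hours to push one message to everyone,
# so the 20-100 sms/s gateways clearly cannot meet a 5-minute deadline.
```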



Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread simonwittber
yoda wrote:
> I'm considering moving to stackless python so that I can make use of
> continuations so that I can open a massive number of connections to the
> gateway and pump the messages out to each user simultaneously.(I'm
> thinking of 1 connection per user).

This won't help if your gateway works synchronously. You need to
determine what your gateway can do. If it works asynchronously,
determine the max bandwidth it can handle, then determine how many
messages you can fit into 4 seconds of that bandwidth. That should
provide you with a number of connections you can safely open and still
receive acceptable response times.

> My questions therefore are:
> 1)Should I switch to stackless python or should I carry out experiments
> with multithreading the application?

You will build a more scalable solution if you create a multi-process
system. This will let you deploy across multiple servers, rather than
just multiple CPUs. Multithreading and multiprocessing will only help
you if your application is IO-bound.

If your application is CPU bound, multiprocessing and multithreading
will likely hurt your performance. You will have to build a parallel
processing application which will work across different machines. This
is easier than it sounds, as Python has a great selection of IPC
mechanisms to choose from.

> 2)What architectural suggestions can you give me?

Multithreading will introduce extra complexity and overhead. I've
always ended up regretting any use of multithreading which I have
tried. Avoid it if you can.

> 3)Has anyone encountered such a situation before? How did you deal with
> it?

Profile each section or stage of the operation. Find the bottlenecks,
and reduce them whichever way you can. Check your ping times. Use gigabit
or better. Remove as many switches and other hops as possible between
machines which talk to each other.

Cache content, reuse it if you can. Pregenerate content, and stick it
in a cache. Cache cache cache! :-)

> 4)Lastly, and probably most controversial: Is python the right language
> for this? I really don't want to switch to Lisp, Icon or Erlang as yet.

Absolutely. Python will let you easily implement higher level
algorithms to cope with larger problems.

Sw.



Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Jeff Schwab
[EMAIL PROTECTED] wrote:
> Jeff> How many are more than "a few?"
> 
> I don't know.  What can you do today in commercial stuff, 16 processors?
> How many cores per die, two? Four?  We're still talking < 100 processors
> with access to the same chunk of memory.  For the OP's problem that's still
> 10,000 users per processor.  Maybe that's small enough, but if not, he'll
> need multiple processes across machines that don't share memory.

Sure, multiple machines are probably the right approach for the OP; I 
didn't mean to disagree with that.  I just don't think they are "the 
only practical way for a multi-process application to scale beyond a few 
processors," like you said.  For many (most?) applications in need of 
serious scalability, multi-processor servers are preferable.  IBM has 
eServers available with up to 64 processors each, and Sun sells E25Ks 
with 72 processors apiece.  I like to work on those sorts of machine 
when possible.  Of course, they're not right for every application, 
especially since they're so expensive.


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Michael Schneider
I would need to get a better picture of your app.

I use a package called twisted to handle large scale computing
on multicore, and multi-computer problems


http://twistedmatrix.com/

Hope this is useful,
Mike

yoda wrote:
> Hi guys,
> My situation is as follows:
> 
> 1)I've developed a service that generates content for a mobile service.
> 2)The content is sent through an SMS gateway (currently we only send
> text messages).
> 3)I've got a million users (and climbing).
> 4)The users need to get the data a minimum of 5 seconds after it's
> generated. (not considering any bottlenecks external to my code).
> 5)Generating the content takes 1 second.
> 
> I'm considering moving to stackless python so that I can make use of
> continuations so that I can open a massive number of connections to the
> gateway and pump the messages out to each user simultaneously.(I'm
> thinking of 1 connection per user).
> 
> My questions therefore are:
> 1)Should I switch to stackless python or should I carry out experiments
> with multithreading the application?
> 2)What architectural suggestions can you give me?
> 3)Has anyone encountered such a situation before? How did you deal with
> it?
> 4)Lastly, and probably most controversial: Is python the right language
> for this? I really don't want to switch to Lisp, Icon or Erlang as yet.
> 
> I really need help because my application currently can't scale. Some
> users end up getting their data 30 seconds after generation (best case)
> and up to 5 minutes after content generation.  This is simply
> unacceptable.  The subscribers deserve much better service if my
> startup is to survive in the market.
> 


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Steven D'Aprano
On Wed, 28 Sep 2005 09:36:54 -0700, ncf wrote:

> If you have that many users, I don't know if Python really is suited
> well for such a large scale application. Perhaps it'd be better suited
> to do the CPU-intensive tasks in a compiled language so you can max out
> performance and then possibly use a UNIX-style socket to send/execute
> instructions to the Python interface, if necessary.

Given that the original post contains no data indicating that the issue is
Python's execution speed, why assume that's where the problem lies, and
that the problem will be solved by throwing extra layers of software at it?

There is a difference between one million users each who make one request
once a month, and one million users who are each hammering the system with
ten requests a second. Number of users on its own is a meaningless
indicator of requirements.



-- 
Steven.



Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Jeremy Jones
[EMAIL PROTECTED] wrote:

>Damjan> Is there some python module that provides a multi process Queue?
>
>Skip> Not as cleanly encapsulated as Queue, but writing a class that
>Skip> does that shouldn't be all that difficult using a socket and the
>Skip> pickle module.
>
>Jeremy> What about bsddb?  The example code below creates a multiprocess
>Jeremy> queue.
>
>I tend to think "multiple computers" when someone says "multi-process".  I
>realize that's not always the case, but I think you need to consider that
>case (it's the only practical way for a multi-process application to scale
>beyond a few processors).
>
>Skip
>  
>
Doh!  I'll buy that.  When I hear "multi-process", I tend to think of 
folks overcoming the scaling issues that accompany the GIL.  This, of 
course, won't scale across computers without a networking interface.

- JMJ


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread skip

Jeff> How many are more than "a few?"

I don't know.  What can you do today in commercial stuff, 16 processors?
How many cores per die, two? Four?  We're still talking < 100 processors
with access to the same chunk of memory.  For the OP's problem that's still
10,000 users per processor.  Maybe that's small enough, but if not, he'll
need multiple processes across machines that don't share memory.

Skip


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread skip

Damjan> Is there some python module that provides a multi process Queue?

Skip> Not as cleanly encapsulated as Queue, but writing a class that
Skip> does that shouldn't be all that difficult using a socket and the
Skip> pickle module.

Here's a trivial implementation of a pair of blocking queue classes:

http://orca.mojam.com/~skip/python/SocketQueue.py
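The linked script is Skip's own implementation. Purely to illustrate the socket-plus-pickle idea, a length-prefixed exchange between two endpoints might look like this sketch (the helper names and the demo message are invented; a socket pair stands in for two processes):

```python
import pickle
import socket
import struct

def send_item(sock, item):
    """Pickle an item and send it length-prefixed over a socket."""
    payload = pickle.dumps(item)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def _recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer disconnects."""
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_item(sock):
    """Receive one length-prefixed pickled item from a socket."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return pickle.loads(_recv_exact(sock, length))

# Demonstrate with a connected socket pair standing in for two processes.
producer, consumer = socket.socketpair()
send_item(producer, {"to": "subscriber-1", "text": "GOAL! 1-0"})
message = recv_item(consumer)
producer.close()
consumer.close()
```

The length prefix is what makes this safe over TCP, where message boundaries are not preserved.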

Skip


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Jeff Schwab
[EMAIL PROTECTED] wrote:
> Damjan> Is there some python module that provides a multi process Queue?
> 
> Skip> Not as cleanly encapsulated as Queue, but writing a class that
> Skip> does that shouldn't be all that difficult using a socket and the
> Skip> pickle module.
> 
> Jeremy> What about bsddb?  The example code below creates a multiprocess
> Jeremy> queue.
> 
> I tend to think "multiple computers" when someone says "multi-process".  I
> realize that's not always the case, but I think you need to consider that
> case (it's the only practical way for a multi-process application to scale
> beyond a few processors).

How many are more than "a few?"

I think processors with multiple cores per die are going to be far more 
mainstream within the next few years, so I still don't think of multiple 
computers for most of my multi-processing.


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread skip

Damjan> Is there some python module that provides a multi process Queue?

Skip> Not as cleanly encapsulated as Queue, but writing a class that
Skip> does that shouldn't be all that difficult using a socket and the
Skip> pickle module.

Jeremy> What about bsddb?  The example code below creates a multiprocess
Jeremy> queue.

I tend to think "multiple computers" when someone says "multi-process".  I
realize that's not always the case, but I think you need to consider that
case (it's the only practical way for a multi-process application to scale
beyond a few processors).

Skip


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Jeremy Jones


[EMAIL PROTECTED] wrote:

>Damjan> Is there some python module that provides a multi process Queue?
>
>Not as cleanly encapsulated as Queue, but writing a class that does that
>shouldn't be all that difficult using a socket and the pickle module.
>
>Skip
>
>  
>
What about bsddb?  The example code below creates a multiprocess queue.
Kick off two instances of it, one in each of two terminal windows.  Do a
mp_db.consume_wait() in one first, then do a mp_db.append("foo or some
other text here") in the other and you'll see the consumer get the
data.  This keeps the stuff on disk, which is not what the OP wants,
but I *think* that by flipping the flags on the dbenv you can keep
everything in memory:

#!/usr/bin/env python

import bsddb
import os

db_base_dir = "/home/jmjones/svn/home/source/misc/python/standard_lib/bsddb"

# Create a shared environment so separate processes can open the same queue.
dbenv = bsddb.db.DBEnv(0)
dbenv.set_shm_key(40)
dbenv.open(os.path.join(db_base_dir, "db_env_dir"),
           #bsddb.db.DB_JOINENV |
           bsddb.db.DB_INIT_LOCK |
           bsddb.db.DB_INIT_LOG |
           bsddb.db.DB_INIT_MPOOL |
           bsddb.db.DB_INIT_TXN |
           #bsddb.db.DB_RECOVER |
           bsddb.db.DB_CREATE |
           #bsddb.db.DB_SYSTEM_MEM |
           bsddb.db.DB_THREAD,
           )

db_flags = bsddb.db.DB_CREATE | bsddb.db.DB_THREAD

# DB_QUEUE databases require fixed-length records.
mp_db = bsddb.db.DB(dbenv)
mp_db.set_re_len(1024)
mp_db.set_re_pad(0)
mp_db_id = mp_db.open(os.path.join(db_base_dir, "mp_db.db"),
                      dbtype=bsddb.db.DB_QUEUE, flags=db_flags)



- JMJ


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Tim Daneliuk
yoda wrote:

> Hi guys,
> My situation is as follows:
> 
> 1)I've developed a service that generates content for a mobile service.
> 2)The content is sent through an SMS gateway (currently we only send
> text messages).
> 3)I've got a million users (and climbing).
> 4)The users need to get the data a minimum of 5 seconds after it's
> generated. (not considering any bottlenecks external to my code).
> 5)Generating the content takes 1 second.
> 

We need more information on just where the bottleneck might be.
There are any number of places that things could be getting
choked up and you need to get a profile of where things are
falling down before trying to fix it.

However, that said:

A possible culprit is session setup/teardown - I'm assuming the connection
to the SMS gateway is connection-oriented/reliable, not datagram-based.
I suggest this because this is quite often the culprit in
connection-oriented performance problems.

If this is the case, you need to preestablish sessions and pool them for
reuse somehow so that each and every message transmission does not incur
the overhead of session setup and teardown to the gateway. It is a good
idea to make that session pooling logic adaptive. Have it start with a
minimum number of preestablished sessions to the gateway and then
monitor the message 'highwater' mark. As the system becomes starved for
sessions, allocate more to the pool. As system utilization declines,
remove spare sessions from the pool until the count falls back to the
initial minimum.

Write the pooling manager so that you can configure both the initial
session count and the interval for adjusting that count up and down
(i.e. over what interval you will 'integrate' the function that figures
out just how many sessions the pool needs). Too short an interval and
the system will throw itself into feedback hysteresis trying to figure
out just how many sessions you need. Too long an interval, and the
system will exhibit poor response to changing load.
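As a rough sketch of that idea (without the adjustment-interval smoothing described above; the session class is a fake standing in for a real gateway connection, and all names are invented):

```python
import queue

class SessionPool:
    """Illustrative adaptive pool of pre-established gateway sessions."""

    def __init__(self, connect, minimum=2, maximum=10):
        self._connect = connect        # factory that opens one gateway session
        self._minimum = minimum
        self._maximum = maximum
        self._idle = queue.Queue()
        self._total = 0
        for _ in range(minimum):       # pre-establish the minimum up front
            self._idle.put(self._connect())
            self._total += 1

    def acquire(self):
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            if self._total < self._maximum:
                self._total += 1       # starved: grow the pool
                return self._connect()
            return self._idle.get()    # at the cap: block for a free session

    def release(self, session):
        if self._idle.qsize() >= self._minimum and self._total > self._minimum:
            self._total -= 1           # spare capacity: shrink back down
            session.close()
        else:
            self._idle.put(session)

class FakeSession:
    """Stand-in for a real SMPP/gateway connection."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

pool = SessionPool(FakeSession, minimum=2, maximum=4)
a = pool.acquire()
b = pool.acquire()
c = pool.acquire()   # pool grows beyond the initial minimum
pool.release(a)
pool.release(b)
pool.release(c)      # spare session gets closed again
```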

P.S. My firm does consultancy for these kinds of problems. We're always
  looking for a great new customer.

Always-Developing-New-Business-ly Yours,


Tim Daneliuk [EMAIL PROTECTED]
PGP Key: http://www.tundraware.com/PGP/


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread skip
Damjan> Is there some python module that provides a multi process Queue?

Not as cleanly encapsulated as Queue, but writing a class that does that
shouldn't be all that difficult using a socket and the pickle module.

Skip



Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Damjan
> If you want to use a multithreaded design, then simply use a python
> Queue.Queue for each delivery channel. If you want to use a
> multi-process design, devise a simple protocol for communicating those
> messages from your generating database/process to your delivery channel
> over TCP sockets.

Is there some python module that provides a multi process Queue?

-- 
damjan


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Alan Kennedy
[yoda]
> I really need help because my application currently can't scale. Some
> users end up getting their data 30 seconds after generation (best case)
> and up to 5 minutes after content generation.  This is simply
> unacceptable.  The subscribers deserve much better service if my
> startup is to survive in the market.

> My questions therefore are:
> 1)Should I switch to stackless python or should I carry out experiments
> with multithreading the application?
> 2)What architectural suggestions can you give me?
> 3)Has anyone encountered such a situation before? How did you deal with
> it?
> 4)Lastly, and probably most controversial: Is python the right language
> for this? I really don't want to switch to Lisp, Icon or Erlang as yet.

I highly recommend reading the following paper on the architecture of 
highly concurrent systems.

A Design Framework for Highly Concurrent Systems, Welsh et al.
http://www.eecs.harvard.edu/~mdw/papers/events.pdf

The key principle that I see being applicable to your scenario is to 
have a fixed number of delivery processes/threads. Welsh terms this the 
"width" of your delivery channel. The number should match the number of 
"delivery channels" that your infrastructure can support. If you are 
delivering your SMSs by SMPP, then there is probably a limit to the 
number of messages/second that your outgoing SMPP server can handle. If 
you go above that limit, then you might cause thrashing or overload in 
that server. If you're delivering by an actual GSM mobile serially
connected to your server/PC, then you should have a single
delivery process/thread for each connected mobile. These delivery 
processes/threads would be fed by queues of outgoing SMSs.

If you want to use a multithreaded design, then simply use a python 
Queue.Queue for each delivery channel. If you want to use a 
multi-process design, devise a simple protocol for communicating those 
messages from your generating database/process to your delivery channel 
over TCP sockets.
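A minimal multithreaded sketch of that fixed-width design might look like this, using the standard-library queue (its modern name is `queue`; in 2005 it was `Queue`) and an invented stand-in for the gateway call:

```python
import queue
import threading

NUM_WORKERS = 4          # the "width" of the delivery channel
OUTBOX = queue.Queue()   # the generator feeds this; workers drain it

sent = []
sent_lock = threading.Lock()

def fake_gateway_send(msg):
    # Stand-in for the real SMPP/HTTP gateway call.
    with sent_lock:
        sent.append(msg)

def delivery_worker():
    while True:
        msg = OUTBOX.get()
        if msg is None:          # sentinel: shut this worker down
            OUTBOX.task_done()
            break
        fake_gateway_send(msg)
        OUTBOX.task_done()

workers = [threading.Thread(target=delivery_worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

# The content generator just enqueues; the fixed worker pool paces delivery
# so the gateway never sees more concurrency than it can handle.
for i in range(100):
    OUTBOX.put("sms #%d" % i)
for _ in workers:
    OUTBOX.put(None)
for w in workers:
    w.join()
```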

As explained in Welsh's paper, you will get the highest stability 
ensuring that your delivery channels only receive as many messages as 
the outgoing transmission mechanism can actually handle.

If you devise a multi-process solution, using TCP sockets to distribute 
messages from your generating application to your delivery channels, 
then it would be very straightforward to scale that up to multiple 
processes running on a either a multiple-core-cpu, a 
multiple-cpu-server, or a multiple-server-network.

All of this should be achievable with python.

Some questions:

1. How are you transmitting your SMSs?
2. If you disable the actual transmission, how many SMSs can your 
application generate per second?

HTH,

-- 
alan kennedy
--
email alan:  http://xhaus.com/contact/alan


Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Paul Boddie
yoda wrote:
> 2)The content is sent through an SMS gateway (currently we only send
> text messages).

[...]

> 4)The users need to get the data a minimum of 5 seconds after it's
> generated. (not considering any bottlenecks external to my code).

You surely mean a "maximum of 5 seconds"! Unfortunately, I only have a
passing familiarity with SMS-related messaging, but I imagine you'd
have to switch on any and all quality-of-service features to get that
kind of guarantee (if it's even possible).

Paul



Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Irmen de Jong
Chris Curvey wrote:

> Multi-threading may help if your python program is spending all its
> time waiting for the network (quite possible).  If you're CPU-bound and
> not waiting on network, then multi-threading probably isn't the answer.

Unless you are on a multi cpu/ multi core machine.
(but mind Python's GIL)

--Irmen




Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread Chris Curvey
I guess I'd look at each part of the system independently to be sure
I'm finding the real bottleneck.  (It may be Python, it may not).

Under your current system, is your python program still trying to send
messages after 5 seconds?  30 seconds, 300 seconds?  (Or have the
messages been delivered to SMS and they're waiting in queue there?)

If your python program is still streaming out the messages, what is it
spending time on?  At a gross level, is your machine CPU-bound?  If
you time each step in your program after the content is generated,
where is all the time going (message assembly, sending over the
network, waiting for a response)?

Just by some back-of-the-envelope calculations, 1 million messages at
100 bytes each is 100 MB.  That's a bunch of data to push over a network
in 2-3 seconds, especially in small chunks.  (It's possible, but I'd
look at that.)  Can the SMS gateway handle that kind of traffic
(incoming and outgoing)?
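Redoing that envelope calculation explicitly (1 million messages at 100 bytes each is 100 megabytes of payload, before any protocol overhead):

```python
messages = 1_000_000
bytes_per_message = 100
window_seconds = 3                              # rough delivery window

total_bytes = messages * bytes_per_message      # 100 MB of raw payload
required_mbit_per_s = total_bytes * 8 / window_seconds / 1e6
# About 267 Mbit/s sustained - before HTTP/SOAP framing, retries,
# or per-message round trips, which make the real figure much worse.
```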

Multi-threading may help if your python program is spending all its
time waiting for the network (quite possible).  If you're CPU-bound and
not waiting on network, then multi-threading probably isn't the answer.



Re: 1 Million users.. I can't Scale!!

2005-09-28 Thread ncf
If you have that many users, I don't know if Python really is suited
well for such a large scale application. Perhaps it'd be better suited
to do the CPU-intensive tasks in a compiled language so you can max out
performance and then possibly use a UNIX-style socket to send/execute
instructions to the Python interface, if necessary.


Sorry I really couldn't be of much help
-Wes



1 Million users.. I can't Scale!!

2005-09-28 Thread yoda
Hi guys,
My situation is as follows:

1)I've developed a service that generates content for a mobile service.
2)The content is sent through an SMS gateway (currently we only send
text messages).
3)I've got a million users (and climbing).
4)The users need to get the data a minimum of 5 seconds after it's
generated. (not considering any bottlenecks external to my code).
5)Generating the content takes 1 second.

I'm considering moving to stackless python so that I can make use of
continuations so that I can open a massive number of connections to the
gateway and pump the messages out to each user simultaneously.(I'm
thinking of 1 connection per user).

My questions therefore are:
1)Should I switch to stackless python or should I carry out experiments
with multithreading the application?
2)What architectural suggestions can you give me?
3)Has anyone encountered such a situation before? How did you deal with
it?
4)Lastly, and probably most controversial: Is python the right language
for this? I really don't want to switch to Lisp, Icon or Erlang as yet.

I really need help because my application currently can't scale. Some
users end up getting their data 30 seconds after generation (best case)
and up to 5 minutes after content generation.  This is simply
unacceptable.  The subscribers deserve much better service if my
startup is to survive in the market.
