Re: [Libevent-users] http: libevent vs many threads

2008-04-18 Thread William Ahern
On Fri, Apr 18, 2008 at 10:36:53PM +0200, Cezary Rzewuski wrote:
> Thank you for your suggestions. I've just finished the implementation.
> I used the approach of libevent as the HTTP server and threads working
> on the downloaded content (they perform some statistical computation on
> downloaded JavaScript files). It seems to work efficiently.
>
> This is probably not the right group, but you say that switching between
> threads is expensive. However, I've read somewhere (it was probably
> "Advanced Linux Programming" by Alex Samuel) that creating a new thread is
> nearly as fast as calling a function. Does that mean that switching between
> threads is slower than creating a new thread?

Creating a new thread is almost certainly not as fast as calling a function.
Maybe, I suppose, as fast as calling into the kernel; the author's point being
that Linux can quickly allocate and set up the task structures.

But this sounds like one of those things you should test yourself. I'm not
quite sure what the benchmark would look like. Try Usenet:
comp.unix.programmer, or comp.programming.threads.
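
One possible shape for such a benchmark, as a minimal sketch only: time a
loop of plain function calls against a loop of pthread_create/pthread_join
pairs. The names bench_call, bench_thread and bump are made up for
illustration. The thread figure includes the join, so it measures creation
plus teardown rather than creation alone, and neither figure says anything
about context-switch cost.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

static volatile unsigned long counter;

/* The "work" done by both variants: a single increment. */
static void *bump(void *arg)
{
    (void)arg;
    counter++;
    return NULL;
}

/* Monotonic clock reading in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average nanoseconds per direct call of bump(). */
double bench_call(unsigned long iters)
{
    /* Call through a volatile pointer so the call is not optimized away. */
    void *(*volatile fn)(void *) = bump;
    double start = now_ns();
    for (unsigned long i = 0; i < iters; i++)
        fn(NULL);
    return (now_ns() - start) / (double)iters;
}

/* Average nanoseconds to create a thread running bump() and join it.
 * Returns a negative value if a thread could not be created or joined. */
double bench_thread(unsigned long iters)
{
    double start = now_ns();
    for (unsigned long i = 0; i < iters; i++) {
        pthread_t t;
        if (pthread_create(&t, NULL, bump, NULL) != 0)
            return -1.0;
        if (pthread_join(t, NULL) != 0)
            return -1.0;
    }
    return (now_ns() - start) / (double)iters;
}
```

Call both functions from a main of your own and build with something like
`cc -O2 -pthread`. Expect the numbers to vary a lot between kernels, C
libraries and machines, so treat them as rough orders of magnitude.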

___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkeymail.org/mailman/listinfo/libevent-users


Re: [Libevent-users] http: libevent vs many threads

2008-04-18 Thread William Ahern
On Sat, Apr 19, 2008 at 02:49:20AM +0200, Springande Ulv wrote:
> 
> On 19. april. 2008, at 01.57, William Ahern wrote:
> >In some sense--like code complexity--using both an event-oriented design
> >*and* threads is the worst of both worlds. Your processing logic is turned
> >inside-out (state machines, etc, can be confusing to some people), plus you
> >have mutexes and barriers and all that crap littered throughout your code.
> 
> He he, the crack you just heard is the thin ice you walked out on.

I must be deaf.

> >Premature optimization is the root of all evil
> 
> Oh that is an old misunderstood meme. What Hoare actually said was "We
> should forget about small efficiencies, say about 97% of the time:
> premature optimization is the root of all evil." He was talking about
> micro-optimizing code. Not exactly what I would call thinking about a
> scalable program design, which one should think about from the start
> and at all times, unless you work on the Microsoft Office team.

I really couldn't care less what Hoare or anybody else said. I'm not
appealing to authority. It's an apposite phrase.

If you fancy all your first or even second iterations of your projects
running a gigantic data center, *and* you actually meet your targets out the
gate, then good for you. (Normal people would write a proof-of-concept
first, and maybe second, and third.)

But in my experience it's best to aim for correctness first. If you must,
you can fix almost any well-designed single-process event-oriented daemon to
scale [more] by tweaking it to use multiple processes, or multiple servers.

The same more-or-less goes for threaded servers.

But you usually don't need to, because as long as you're not needlessly
copying data or doing other wasteful things, then you'll best any
second-rate application, period.



Re: [Libevent-users] http: libevent vs many threads

2008-04-18 Thread Springande Ulv


On 19. april. 2008, at 01.57, William Ahern wrote:
> In some sense--like code complexity--using both an event-oriented design
> *and* threads is the worst of both worlds. Your processing logic is turned
> inside-out (state machines, etc, can be confusing to some people), plus you
> have mutexes and barriers and all that crap littered throughout your code.

He he, the crack you just heard is the thin ice you walked out on.

> Premature optimization is the root of all evil

Oh that is an old misunderstood meme. What Hoare actually said was "We
should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil." He was talking about
micro-optimizing code. Not exactly what I would call thinking about a
scalable program design, which one should think about from the start
and at all times, unless you work on the Microsoft Office team.



Re: [Libevent-users] http: libevent vs many threads

2008-04-18 Thread William Ahern
On Sat, Apr 19, 2008 at 12:23:16AM +0200, Springande Ulv wrote:
> 
> On 18. april. 2008, at 22.36, Cezary Rzewuski wrote:
> >>your best bet is to use a single thread using
> >>libevent, or go totally multi-threaded without libevent. In 90% of the
> >>circumstances one of those options (though not both)
>
> Why not both?! Using both threads _and_ libevent is not only possible
> but often feasible. Use libevent in an I/O-bound thread and a thread
> pool for CPU-bound operations. In a network application the network is
> the bottleneck, and a few context switches are negligible in this context.

Because _unless_ you can justify the additional complexity, why bother?

In some sense--like code complexity--using both an event-oriented design
*and* threads is the worst of both worlds. Your processing logic is turned
inside-out (state machines, etc, can be confusing to some people), plus you
have mutexes and barriers and all that crap littered throughout your code.

I've used libevent with threads. I've also used libevent with multi-process
configurations. But usually I stick with one or the other. Premature
optimization is the root of all evil.



Re: [Libevent-users] http: libevent vs many threads

2008-04-18 Thread Springande Ulv


On 18. april. 2008, at 22.36, Cezary Rzewuski wrote:
>> your best bet is to use a single thread using
>> libevent, or go totally multi-threaded without libevent. In 90% of the
>> circumstances one of those options (though not both)

Why not both?! Using both threads _and_ libevent is not only possible
but often feasible. Use libevent in an I/O-bound thread and a thread
pool for CPU-bound operations. In a network application the network is
the bottleneck, and a few context switches are negligible in this context.



Re: [Libevent-users] http: libevent vs many threads

2008-04-18 Thread Cezary Rzewuski

Thank you for your suggestions. I've just finished the implementation.
I used the approach of libevent as the HTTP server and threads working
on the downloaded content (they perform some statistical computation on
downloaded JavaScript files). It seems to work efficiently.

This is probably not the right group, but you say that switching between
threads is expensive. However, I've read somewhere (it was probably
"Advanced Linux Programming" by Alex Samuel) that creating a new thread is
nearly as fast as calling a function. Does that mean that switching between
threads is slower than creating a new thread?

Once more, thanks for the comprehensive answer.



Re: [Libevent-users] http: libevent vs many threads

2008-04-02 Thread William Ahern
On Wed, Apr 02, 2008 at 10:15:59PM +0200, Cezary Rzewuski wrote:
> Hi,
> I'd like to ask whether sending HTTP requests with libevent is carried out
> in a separate thread, or is the library single-threaded? I want to use the
> library in a program which will visit many URLs and download their content.
> Is it a good idea to use libevent, or will the classic solution of creating
> a separate thread per URL request be much more efficient?

It depends. What you describe is not nearly enough information to even give a
suggestion.

One thread per URL normally is a very poor choice (just as a matter of
runtime efficiency), unless each URL causes you to do a lot of disk I/O, or
if each URL causes you to do CPU intensive operations, like decode
compressed audio/video. In each of those two situations, the process context
switching costs are diminished relative to the type of work being done.

Basically, the idea is that if your thread will block on an operation--CPU
or I/O--but another thread running in parallel (not merely concurrently)
could utilize additional resources, you want to multi-thread.

If your application is merely moving bytes (say, as a proxy), usually a
single thread is enough; you can multiplex non-blocking network operations
on a single thread. In that sense, you're "switching contexts" in the
application, and not the kernel. This reduces the workload, because context
switching in the kernel is usually more expensive.
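
A rough sketch of that kind of multiplexing, assuming plain POSIX poll()
rather than libevent (libevent wraps mechanisms such as poll, epoll and
kqueue behind its own API). Pipes stand in for network connections, and
multiplex_read and demo_multiplex are hypothetical names used only here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define MAX_CONNS 16

/* Read from every descriptor in fds[] on one thread until all reach EOF.
 * Returns the total number of bytes read, or -1 on a setup error. */
long multiplex_read(const int *fds, int nfds)
{
    struct pollfd pfds[MAX_CONNS];
    char buf[4096];
    long total = 0;
    int open_count = nfds;

    assert(nfds == 0 || fds != NULL);
    if (nfds < 0 || nfds > MAX_CONNS)
        return -1;

    for (int i = 0; i < nfds; i++) {
        int flags = fcntl(fds[i], F_GETFL, 0);
        if (flags < 0 || fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return -1;
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    while (open_count > 0) {
        /* The only place this thread sleeps: wait until something is ready. */
        if (poll(pfds, (nfds_t)nfds, -1) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (int i = 0; i < nfds; i++) {
            if (pfds[i].fd < 0 || pfds[i].revents == 0)
                continue;
            /* "Switch context" to this connection inside the application. */
            ssize_t n = read(pfds[i].fd, buf, sizeof buf);
            if (n > 0) {
                total += n;
            } else if (n == 0 || (errno != EAGAIN && errno != EWOULDBLOCK &&
                                  errno != EINTR)) {
                pfds[i].fd = -1;   /* EOF or hard error; poll() skips fd < 0 */
                open_count--;
            }
        }
    }
    return total;
}

/* Demo: three pipes stand in for three network connections. */
long demo_multiplex(void)
{
    static const char *msgs[3] = { "GET /a", "GET /bb", "GET /ccc" };
    int rd[3];

    for (int i = 0; i < 3; i++) {
        int p[2];
        if (pipe(p) < 0)
            return -1;
        if (write(p[1], msgs[i], strlen(msgs[i])) < 0)
            return -1;
        close(p[1]);               /* closing the writer gives the reader EOF */
        rd[i] = p[0];
    }
    long total = multiplex_read(rd, 3);
    for (int i = 0; i < 3; i++)
        close(rd[i]);
    return total;
}
```

demo_multiplex() reads 6 + 7 + 8 = 21 bytes from the three pipes without
ever creating a second thread; the kernel is only asked which descriptors
are ready.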

OTOH, copying data in itself can be CPU intensive. If you read into a buffer
from one socket, you might evict previous data you read in earlier. If you
then try to re-read and/or copy that previous data over to another buffer
later, the process will block as the data is fetched from RAM. If your proxy
is even on a 100Mb connection, depending on how you process the data, you
most definitely will need multiple threads. That's because 100Mb of network
data could balloon to 5x or 10x that amount of byte shuffling. Of course,
depending on how the L1, L2 and L3 caches are shared, it might not actually
make much of a difference. It all depends!

Of course, you can always use an event-oriented model within each particular
thread. Or spread event delivery and processing across multiple threads.

Given that you seem new to this (or at least new to the particular problem
you're trying to solve), your best bet is to use a single thread using
libevent, or go totally multi-threaded without libevent. In 90% of the
circumstances one of those options (though not both) is as near to optimal
as you'll get, and you don't need the headaches of any additional
complexity.

> I saw that libevent was used in spybye, which is kind of similar to what I
> want to do. I was wondering whether spybye would be more efficient with
> requests served in separate threads instead of using libevent (I'm not
> saying that it's not efficient, just asking theoretically).

I'm not sure, maybe it's most efficient using _both_. But I suspect it
probably just uses libevent in a single thread.

Note, there are other ways to use threads. You could use one thread using
libevent to handle all your queries and network I/O. Then you could use a
separate thread worker pool to, for instance, run ClamAV on the data. This
works well if you can isolate your CPU intensive work outside the mundane
network I/O parts. If your application is overall CPU bound, and latency of
particular requests isn't of primary concern, then it doesn't matter that
libevent is running in a single thread. All your CPUs are doing work, just
not the same types of work.
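
A minimal sketch of that split, under stated assumptions: the calling thread
plays the part of the single libevent I/O thread (no libevent calls appear
here), and a fixed pool of pthreads takes jobs from a mutex-protected ring
buffer. The struct and function names, the pool size and the crunch()
placeholder are all hypothetical; a real scanner such as ClamAV would take
the place of crunch().

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP 128
#define NWORKERS 4

struct pool {
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    int jobs[QUEUE_CAP];         /* ring buffer of pending jobs */
    unsigned head, count;
    int shutting_down;
    long result;                 /* combined output of all jobs */
    pthread_t workers[NWORKERS];
};

/* Placeholder for CPU-heavy work, e.g. scanning a downloaded body. */
static long crunch(int job) { return (long)job * job; }

static void *worker_main(void *arg)
{
    struct pool *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->shutting_down)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {             /* shutting down and drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        int job = p->jobs[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_mutex_unlock(&p->lock);
        long r = crunch(job);            /* heavy work runs outside the lock */
        pthread_mutex_lock(&p->lock);
        p->result += r;
        pthread_mutex_unlock(&p->lock);
    }
}

/* Start the workers. Minimal error handling: this is only a sketch. */
int pool_start(struct pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    p->head = p->count = 0;
    p->shutting_down = 0;
    p->result = 0;
    for (int i = 0; i < NWORKERS; i++)
        if (pthread_create(&p->workers[i], NULL, worker_main, p) != 0)
            return -1;
    return 0;
}

/* Called from the I/O thread. Returns -1 instead of waiting when the queue
 * is full, so the event loop can retry later rather than stall. */
int pool_submit(struct pool *p, int job)
{
    int rc = -1;
    pthread_mutex_lock(&p->lock);
    assert(!p->shutting_down);
    if (p->count < QUEUE_CAP) {
        p->jobs[(p->head + p->count) % QUEUE_CAP] = job;
        p->count++;
        pthread_cond_signal(&p->not_empty);
        rc = 0;
    }
    pthread_mutex_unlock(&p->lock);
    return rc;
}

/* Let the workers drain the queue, join them, return the combined result. */
long pool_finish(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->shutting_down = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(p->workers[i], NULL);
    return p->result;
}

/* Demo: the caller plays the single I/O thread; needs njobs <= QUEUE_CAP. */
long demo_pool(int njobs)
{
    struct pool p;
    if (pool_start(&p) != 0)
        return -1;
    for (int job = 1; job <= njobs; job++)
        if (pool_submit(&p, job) != 0)
            break;
    return pool_finish(&p);
}
```

demo_pool(100) returns 338350, the sum of the squares from 1 to 100. The
point of the design is that the only shared state is the queue and the
result field, so the locking stays in one place instead of being scattered
through the network code.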



Re: [Libevent-users] http: libevent vs many threads

2008-04-02 Thread Robert Iakobashvili
Hi Cezary,

On Wed, Apr 2, 2008 at 11:15 PM, Cezary Rzewuski <[EMAIL PROTECTED]>
wrote:

> Hi,
> I'd like to ask whether sending HTTP requests with libevent is carried out
> in a separate thread, or is the library single-threaded? I want to use the
> library in a program which will visit many URLs and download their content.
> Is it a good idea to use libevent, or will the classic solution of creating
> a separate thread per URL request be much more efficient?
>
> I saw that libevent was used in spybye, which is kind of similar to what I
> want to do. I was wondering whether spybye would be more efficient with
> requests served in separate threads instead of using libevent (I'm not
> saying that it's not efficient, just asking theoretically).
>
> Kind regards,
> Cezary
>


If you are building a heavily loaded server on hardware with multiple cores
and/or multiple CPUs, you do indeed have a case for threading.

You may wish to look at the ACE library's implementation of the
Leader/Followers design pattern. There are also great folks on this list who
have implemented the pattern using libevent; look in the mailing list
archives.

-- 
Sincerely,
Robert Iakobashvili
"Light will come from Jerusalem"
...
http://curl-loader.sourceforge.net
An open-source web testing and traffic generation.