Re: how to really bang on a script?
Perrin Harkins wrote: On Sat, 28 Oct 2000, Matthew Byng-Maddick wrote: On Sat, 28 Oct 2000, Matt Sergeant wrote: exactly the same thing (changing server logs into a benchmark tool) at ApacheCon, only I can't for the life of me remember who it was. Theo, during the mod_backhand talk, or at lunch just before, I can't remember. It was during the talk. The tool is called Daquiri, and he said it was available in the mod_backhand CVS tree. I have also found httperf and http_load pretty useful for this stuff, although they don't support logfile playback. Hey, I went by http://www.mod_backhand.org/, and found no mention of anything called Daquiri, nor any description of how to access the mod_backhand CVS tree, not even an email address to ask for information. I downloaded the mod_backhand source, and I found nothing there either. Does anyone have a clue about where to start getting my hands on this neat and nifty tool? --Christopher
Re: how to really bang on a script?
Hi all, On Mon, 30 Oct 2000, Ask Bjoern Hansen wrote: On Sat, 28 Oct 2000, Christopher L. Everett wrote: OK, I confess: I've written (probably yet another) mod_perl banner exchange. I need to know that when we serve 100K banners to 40K if anyone doubts that perl and mod_perl is a good solution for that, you can tell them that at ValueClick we can serve thousands and thousands of banners per second on our technology that is close to 100% Pure Perl. And if anyone wants to know how to write an ad-blocking proxy using the same technology it's on pages 374-381 of the first edition of the Eagle Book, "Writing Apache Modules with Perl and C" ISBN 1-56592-567-X. Works a treat. 73, Ged.
Re: how to really bang on a script?
On Mon, 30 Oct 2000, Perrin Harkins wrote: On Sat, 28 Oct 2000, Matthew Byng-Maddick wrote: Theo, during the mod_backhand talk, or at lunch just before, I can't remember. It was during the talk. The tool is called Daquiri, and he said it was available in the mod_backhand CVS tree. Ah yes, I remember someone joking about what flavour it was next to me. :) MBM -- Matthew Byng-Maddick Home: [EMAIL PROTECTED] +44 20 8981 8633 (Home) http://colondot.net/ Work: [EMAIL PROTECTED] +44 7956 613942 (Mobile) perl -e '$_="Oyvv bsswjfw Thtm mefmfw2\n";while(m([^\n])){$_=$'"'"';$a=$; $a=($a=~m(^\s)?$a:pack "c",unpack("c",$a)-5+($i++%5));print $a}print"\n";'
Re: how to really bang on a script?
"G.W. Haywood" [EMAIL PROTECTED] writes: Hi all, On Mon, 30 Oct 2000, Ask Bjoern Hansen wrote: On Sat, 28 Oct 2000, Christopher L. Everett wrote: OK, I confess: I've written (probably yet another) mod_perl banner exchange. I need to know that when we serve 100K banners to 40K if anyone doubts that perl and mod_perl is a good solution for that, you can tell them that at ValueClick we can serve thousands and thousands of banners per second on our technology that is close to 100% Pure Perl. And if anyone wants to know how to write an ad-blocking proxy using the same technology it's on pages 374-381 of the first edition of the Eagle Book, "Writing Apache Modules with Perl and C" ISBN 1-56592-567-X. Hey, you leave ValueClick alone! They're the least worst out there! (Is that a compliment?) -- Dave Hodgkinson, http://www.hodgkinson.org Editor-in-chief, The Highway Star http://www.deep-purple.com Apache, mod_perl, MySQL, Sybase hired gun for, well, hire -
Re: how to really bang on a script?
Hi Dave, On 31 Oct 2000, David Hodgkinson wrote: Hey, you leave ValueClick alone! They're the least worst out there! (Is that a compliment?) Ok, truce. There's room in the world for all sorts (even you and me:) At the moment. 73, Ged.
Re: how to really bang on a script?
"Christopher L. Everett" wrote: Adi wrote: martin langhoff wrote: Chris, i'd bet my head a few months ago someone announced an apache::bench module, that would take a log and run it as a benchmarking secuence of HTTP requests. just get to the list archives and start searching with benchmarks and logs. CPAN is your friend, also. It was HTTPD::Bench::ApacheBench. It is a Perl API to ab. It doesn't take a log per se, it simply sends sequences of HTTP requests and benchmarks the results. I'm sure you could very easily write a script to parse a log and then make a benchmarking run out of it. Yes, I considered ab and I did find HTTPD::Bench::ApacheBench, while excellently done and copiously documented, isn't quite what I need: 1) I want to spoof the IP addresses of the browsers (I just realized that since I'm using mod_proxy_add_forward anyway, I can make the requester script behave as a proxy; the rest is cookbook). I can't find provision for that in the interface for HTTPD::Bench::ApacheBench. Yeah, that would require adding arbitrary HTTP headers to each benchmark request... that's currently on our to-do list (I think). 2) Record the query parameters as well as the response's MD5 checksum directly in a database table on the fly. This is all possible with ApacheBench. It wouldn't be stored in real-time (i.e. as the request is sent) but you set up the benchmarking runs, so you know all the query parameters (which are all either in the URI or the postdata for each request), and ApacheBench returns all the response data, which you can then pass thru an MD5 hash and store in the database. 3) The interface is more suited to setting up, then executing a batch run programmatically, rather than replaying a log. Right.. it was designed to be generally useful. To replay a log you simply need to set up a batch that exactly duplicates your log. Having examined the ApacheBench.pm source, I don't see how I can make it do what I want by subclassing it. 
Also the code is a little bit mystifying

I didn't really intend it to be subclassed, because basically all it is is a single XS function with a little Perl to set it up and store the results. You should be able to inherit its methods if you want, though I don't see what it would get you over just instantiating an ApacheBench object.

    package MyModule;
    @ISA = qw(HTTPD::Bench::ApacheBench);
    MyModule->add({urls => ["http://url.one/", "http://url.two/"]});
    my $r = MyModule->execute;

to me in that the last line in the execute method, "return $self->ab;" is the only mention of the class method "ab" in the entire file. Obviously I

That's because "ab" is the XS function that sends the HTTP requests and builds up a hash with all the response data and times. All the looping is done in C for speed. Take a look at ApacheBench.xs. (especially if you feel like adding the arbitrary HTTP request header functionality, hint hint :)

have _much, much_ more to learn ... :)

No, actually you pointed out some good feature additions that we should think about making to ApacheBench. Thanks.

-Adi
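The "parse a log and then make a benchmarking run out of it" step could start out roughly as below. This is only a sketch under assumptions: the log is taken to be in Common Log Format, only GET entries are replayed, and parse_log_line, build_batch and the http://localhost:8080 base are made-up names for illustration, not part of any module mentioned in this thread.

```perl
use strict;
use warnings;

# Parse one Common Log Format line. Returns a hash ref with the client
# address, method, path and status, or nothing if the line does not match.
sub parse_log_line {
    my ($line) = @_;
    return unless $line =~ /^(\S+) \S+ \S+ \[[^\]]*\] "([A-Z]+) (\S+)[^"]*" (\d{3}) /;
    return +{ client => $1, method => $2, path => $3, status => $4 };
}

# Turn log lines into a replay batch of absolute URLs (GET requests only).
# $base is a hypothetical test-server prefix.
sub build_batch {
    my ($base, @lines) = @_;
    my @urls;
    for my $line (@lines) {
        my $entry = parse_log_line($line) or next;
        next unless $entry->{method} eq 'GET';
        push @urls, $base . $entry->{path};
    }
    return \@urls;
}

# Two invented log entries; the POST is skipped.
my @sample = (
    '10.0.0.1 - - [28/Oct/2000:10:00:00 -0500] "GET /banner?site=12 HTTP/1.0" 200 1234',
    '10.0.0.2 - - [28/Oct/2000:10:00:01 -0500] "POST /admin HTTP/1.0" 302 0',
);
my $batch = build_batch('http://localhost:8080', @sample);
print "$_\n" for @$batch;
```

The resulting list of URLs is what one would then hand to whichever client actually sends the requests.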
Re: how to really bang on a script?
On Sat, 28 Oct 2000, Matthew Byng-Maddick wrote: On Sat, 28 Oct 2000, Matt Sergeant wrote: exactly the same thing (changing server logs into a benchmark tool) at ApacheCon, only I can't for the life of me remember who it was. Theo, during the mod_backhand talk, or at lunch just before, I can't remember. It was during the talk. The tool is called Daquiri, and he said it was available in the mod_backhand CVS tree. I have also found httperf and http_load pretty useful for this stuff, although they don't support logfile playback. - Perrin
Re: how to really bang on a script?
Adi wrote: "Christopher L. Everett" wrote: Adi wrote: martin langhoff wrote: Chris, i'd bet my head a few months ago someone announced an apache::bench module, that would take a log and run it as a benchmarking secuence of HTTP requests. just get to the list archives and start searching with benchmarks and logs. CPAN is your friend, also. It was HTTPD::Bench::ApacheBench. It is a Perl API to ab. It doesn't take a log per se, it simply sends sequences of HTTP requests and benchmarks the results. I'm sure you could very easily write a script to parse a log and then make a benchmarking run out of it. Yes, I considered ab and I did find HTTPD::Bench::ApacheBench, while excellently done and copiously documented, isn't quite what I need: 1) I want to spoof the IP addresses of the browsers (I just realized that since I'm using mod_proxy_add_forward anyway, I can make the requester script like a proxy; the rest is cookbook). I can't find provision for that in the interface for HTTPD::Bench::ApacheBench. Yeah, that would require adding arbitrary HTTP headers to each benchmark request... that's currently on our to-do list (I think). 2) Record the query parameters as well as the response's MD5 checksum directly in a database table on the fly. This is all possible with ApacheBench. It wouldn't be stored in real-time (i.e. as the request is sent) but you set up the benchmarking runs, so you know all the query parameters (which are all either in the URI or the postdata for each request), and ApacheBench returns all the response data, which you can then pass thru an MD5 hash and store in the database. 3) The interface is more suited to setting up, then executing a batch run programmatically, rather than replaying a log. Right.. it was designed to be generally useful. To replay a log you simply need to set up a batch that exactly duplicates your log. 
As I understand ApacheBench, to set up a really large (1M) run with, say, 200 or 300 URIs distributed randomly across 400K unique IP addresses, you would just about end up doing 1 run per access log entry, which turns it into a really massive data structure, at which point a file starts looking like a better place to put all that. Or, I could try forking off ApacheBench objects one by one, putting runs into the next object to fork while the most recently forked object beats on the server. Kind of a roundabout way of trading off complexity against RAM.

[snippage]

to me in that the last line in the execute method, "return $self->ab;" is the only mention of the class method "ab" in the entire file. Obviously I

That's because "ab" is the XS function that sends the HTTP requests and builds up a hash with all the response data and times. All the looping is done in C for speed. Take a look at ApacheBench.xs. (especially if you feel like adding the arbitrary HTTP request header functionality, hint hint :)

What would be nice from the self-documenting code point of view for a newbie like me would be a clue _in_the_code_itself_ that ApacheBench.xs was the place to look for the definition of ab. That's how I would expect a programming language to behave :), but that's something more for the Perl 6 list.

have _much, much_ more to learn ... :)

No, actually you pointed out some good feature additions that we should think about making to ApacheBench. Thanks.

You're welcome.

-Adi

--Christopher
Christopher L. Everett [EMAIL PROTECTED]
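The "a file starts looking like a better place" idea can be sketched as reading the replay list in small batches, so that the whole million-entry run never has to sit in memory at once. This assumes one request path per line; batch_reader and the sample paths are invented for illustration.

```perl
use strict;
use warnings;

# Return a closure that hands back up to $size chomped lines per call from
# the filehandle $fh, and an empty list once the input is exhausted.
sub batch_reader {
    my ($fh, $size) = @_;
    return sub {
        my @batch;
        while (@batch < $size and defined(my $line = <$fh>)) {
            chomp $line;
            push @batch, $line;
        }
        return @batch;
    };
}

# Demo on an in-memory file; a real run would open the replay file instead.
my $data = join '', map { "/banner?site=$_\n" } 1 .. 5;
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $next = batch_reader($fh, 2);
while (my @batch = $next->()) {
    print scalar(@batch), " requests: @batch\n";
}
```

Each batch could be set up as one run, executed, and discarded before the next is read, which trades a little bookkeeping for a bounded memory footprint.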
Re: how to really bang on a script?
On Fri, 27 Oct 2000, G.W. Haywood wrote:

I need to prove to myself and my marketing guy that my script has certain statistical properties, not the least of which is the question of whether my activity logs match what actually happened.

You've been spending too much time with your marketing guy. "Certain statistical properties" is gobbledygook. What properties?

Able to handle N hits in M seconds with a maximum of K concurrent sessions in any given second, L percent of the hits on these files, J percent of the hits on this handler, U percent of these with these parameters and W percent with those. You add more to the list. It's a perfectly sensible thing to do, also from a technical standpoint. Maybe you've spent too little time on larger sites where those things matter. ;-)

Activity logs don't match statistically. Either they match or they don't. If they don't then either the logging is turned off or it isn't working.

Ha. Ask [ insert name of large internet advertising company ] about that. Actually, just ask any given internet advertising company about that. (In that industry the logs are real money and not just input data to get pretty colored graphs.) More to the point: What do you know about how Christopher is logging? Maybe he is using mod_log_spread or something fancy like that and is unsure if it's working correctly.

Also, there's concurrency issues to make sure I've got right.

Get yourself a bunch of users.

It's a real hassle to be dependent on thousands of users just to exercise your test environment.

Christopher, depending on what you need to test and what kind of site you have I'd look at (some combination of):

  http_load - http://www.acme.com/software/http_load/ - to test how your site can handle large amounts of hits to random combinations of a predefined list of URLs.
  ab - comes with Apache, see src/support/ab(\.[ch])? - good for testing performance parameters of a certain URL.
  LWP - roll your own tests with the LWP modules to have an automated test of the functionality of the site.

 - ask

--
ask bjoern hansen - http://www.netcetera.dk/~ask/
more than 70M impressions per day, http://valueclick.com
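A workload described as "L percent of the hits on these files, J percent on this handler" can be turned into a request plan with a weighted pick. What follows is a sketch only: the paths and shares in @mix are invented, and pick_path is a made-up helper, not part of http_load, ab or LWP.

```perl
use strict;
use warnings;

# Hypothetical traffic mix: share of hits per URL path; shares sum to 1.
my @mix = (
    [ '/banner'     => 0.70 ],
    [ '/click'      => 0.25 ],
    [ '/stats.html' => 0.05 ],
);

# Map a number $r in [0, 1) onto a path according to the cumulative shares.
sub pick_path {
    my ($r, @table) = @_;
    my $cumulative = 0;
    for my $row (@table) {
        $cumulative += $row->[1];
        return $row->[0] if $r < $cumulative;
    }
    return $table[-1][0];    # guard against floating-point rounding
}

# Draw a small sample plan and count how often each path was chosen.
my %count;
$count{ pick_path(rand(), @mix) }++ for 1 .. 1000;
printf "%-12s %d\n", $_, $count{$_} for sort keys %count;
```

The chosen paths could be written one per line to a file of URLs, or fed straight to a home-grown LWP script.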
Re: how to really bang on a script?
On Sat, 28 Oct 2000, Christopher L. Everett wrote:

[...] OK, I confess: I've written (probably yet another) mod_perl banner exchange. I need to know that when we serve 100K banners to 40K

Hi Christopher, if anyone doubts that perl and mod_perl is a good solution for that, you can tell them that at ValueClick we can serve thousands and thousands of banners per second on our technology that is close to 100% Pure Perl. :-)

[...] place. And again, my questions are: How would I go about proving to myself that my script does what I designed it to do? Has anyone else dealt with a similar problem, and how did they go about doing it?

Look at your system, make up a lot of ways you can think it could go wrong, and then write some utilities that do those things automatically and check in your output / logs / databases whether what happened was what you expected.

If I solve it for myself, would anyone else find the solution useful, and how would I make it more useful to them?

That depends on how you solve it, but chances are pretty good that you'll solve it very specifically to your problem (which might be a good thing).

Usually, I would test by running through the script a few times with some variations, but we are so freaked out by our experience with the 2 other banner exchange scripts we tried, we find a lot of value in being certain.

Sounds scary; I don't want to know more. :-)

 - ask

--
ask bjoern hansen - http://www.netcetera.dk/~ask/
more than 70M impressions per day, http://valueclick.com
Re: how to really bang on a script?
Adi wrote:
martin langhoff wrote:

Chris, i'd bet my head a few months ago someone announced an apache::bench module, that would take a log and run it as a benchmarking sequence of HTTP requests. just get to the list archives and start searching with benchmarks and logs. CPAN is your friend, also.

It was HTTPD::Bench::ApacheBench. It is a Perl API to ab. It doesn't take a log per se, it simply sends sequences of HTTP requests and benchmarks the results. I'm sure you could very easily write a script to parse a log and then make a benchmarking run out of it.

Yes, I considered ab, and I did find HTTPD::Bench::ApacheBench, which, while excellently done and copiously documented, isn't quite what I need:

1) I want to spoof the IP addresses of the browsers (I just realized that since I'm using mod_proxy_add_forward anyway, I can make the requester script behave as a proxy; the rest is cookbook). I can't find provision for that in the interface for HTTPD::Bench::ApacheBench.
2) Record the query parameters as well as the response's MD5 checksum directly in a database table on the fly.
3) The interface is more suited to setting up, then executing a batch run programmatically, rather than replaying a log.

Having examined the ApacheBench.pm source, I don't see how I can make it do what I want by subclassing it. Also the code is a little bit mystifying to me in that the last line in the execute method, "return $self->ab;" is the only mention of the class method "ab" in the entire file. Obviously I have _much, much_ more to learn ... :)

--Christopher
Christopher L. Everett [EMAIL PROTECTED]
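Requirement 2) above, recording the query parameters and the response's MD5 checksum per request, might be sketched like this. It only builds the row; the database insert and the forwarded-address header from requirement 1) are left out. query_params, make_record and the sample values are hypothetical, and the query parsing does no %-decoding.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Split the query string of a URI into name/value pairs (no %-decoding here).
sub query_params {
    my ($uri) = @_;
    my ($query) = $uri =~ /\?([^#]*)/ or return +{};
    my %param;
    for my $pair (split /[&;]/, $query) {
        next unless length $pair;
        my ($name, $value) = split /=/, $pair, 2;
        $param{$name} = defined $value ? $value : '';
    }
    return \%param;
}

# Build the row one might insert into a database table for one request.
# $client_ip is the address the replay script pretends to come from.
sub make_record {
    my ($client_ip, $uri, $body) = @_;
    return +{
        client_ip => $client_ip,
        uri       => $uri,
        params    => query_params($uri),
        md5       => md5_hex($body),
    };
}

# Invented request and response body, just to show the shape of a record.
my $rec = make_record('10.0.0.1', '/banner?site=12&size=468x60', 'hello');
print "$rec->{client_ip} $rec->{uri} $rec->{md5}\n";
```

Storing the digest rather than the body keeps the table small while still allowing a later comparison against what the server logged.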
Re: how to really bang on a script?
On Sat, 28 Oct 2000, Christopher L. Everett wrote:

So, I apologize for not describing my problem clearly in the first place. And again, my questions are: How would I go about proving to myself that my script does what I designed it to do? Has anyone else dealt with a similar problem, and how did they go about doing it? If I solve it for myself, would anyone else find the solution useful, and how would I make it more useful to them?

Basically you've got exactly the right idea, and it *has* been done before - I recall vaguely in the back of my head someone mentioning doing almost exactly the same thing (changing server logs into a benchmark tool) at ApacheCon, only I can't for the life of me remember who it was. On the plus side though, provided you've got no POST parameters, it's a pretty trivial script. The hard part is getting a scalable engine to execute all those requests. Really you have to do that in C I think - and you should probably just look towards the source of ab.

FWIW, I think it was Theo Schlossnagel (mod_backhand guy) talking about the utility, in fact I'm almost certain. Why don't you drop him a line.

--
Matt/
Director and CTO, AxKit.com Ltd
XML Application Serving: XSLT, XPathScript, XSP
http://axkit.org
Personal Web Site: http://sergeant.org/
Re: how to really bang on a script?
On Sat, 28 Oct 2000, Matt Sergeant wrote: exactly the same thing (changing server logs into a benchmark tool) at ApacheCon, only I can't for the life of me remember who it was. Theo, during the mod_backhand talk, or at lunch just before, I can't remember. MBM -- Matthew Byng-Maddick Home: [EMAIL PROTECTED] +44 20 8981 8633 (Home) http://colondot.net/ Work: [EMAIL PROTECTED] +44 7956 613942 (Mobile) perl -e '$_="Oyvv bsswjfw Thtm mefmfw2\n";while(m([^\n])){$_=$'"'"';$a=$; $a=($a=~m(^\s)?$a:pack "c",unpack("c",$a)-5+($i++%5));print $a}print"\n";'
Re: how to really bang on a script?
Chris, i'd bet my head a few months ago someone announced an apache::bench module, that would take a log and run it as a benchmarking sequence of HTTP requests. just get to the list archives and start searching with benchmarks and logs. CPAN is your friend, also. there are at least 2 or 3 benching perl scripts available. I bet at least one does what you need. but I may still lose my bet ... m
Re: how to really bang on a script?
Hi again,

On Sat, 28 Oct 2000, Christopher L. Everett wrote:

OK, I confess: I've written (probably yet another) mod_perl banner exchange.

Argh.

[snipped impressive numbers] That's what I meant by "certain statistical properties".

Interesting. Nothing to do with statistics, but interesting.

So, I apologize for not describing my problem clearly

No need for that.

I need to know that the 500 sites that hosted our banner ads are getting the 50K banners that we promised them

The only way to *know* that is to get information from the sites. If I wanted to be sure, I'd connect to them and look in their logs. You probably already know that people cheat a lot with this kind of thing. Can't understand why they'd do that.

and the 30K banners that we sold, we really did serve.

Is there a reason you're only selling - no, don't answer that.

Also, I want to know that the banners my logs say the script sent are really the ones people saw on their browsers.

There you go again. The only way to know is to go and look. But you might try something along the lines of tcpdump on your firewall box, so that you could at least remove a whole slew of uncertainties from the equation. You could pull your magic numbers out from the packets and see from the packet headers if they were addressed to where they were supposed to be addressed to. But without going further afield I think that's about the best you will be able to do.

And don't write off ab - remember, you saw it here first...

73, Ged.
Re: how to really bang on a script?
"Christopher L. Everett" wrote:

Hello All: I've written some mod_perl scripts that need testing over a million hits or so before I deploy it. I need to prove to myself and my marketing guy that my script has certain statistical properties, not the least of which is the question of whether my activity logs match what actually happened. Also, there's concurrency issues to make sure I've got right.

snip

sorry, but i fail to see why all the trickery is needed. i assume that you want to check the content against what is expected, but the banners are rotating based on some formula. if you know the formula beforehand, then you know the expected distribution for the banners served. why not use something from the libwww package to make the requests, md5 the relevant data of the returned banner, and return a report which gives the counts for each unique md5.

so in perl pseudocode (untested, no error checking, steps skipped):

    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request;
    use Digest::MD5 qw(md5_hex);

    my @bannerurls;            # fill in: the URLs to request
    my $number_to_test = 0;    # fill in: how many requests to make
    my $ua = LWP::UserAgent->new;
    my %md5;

    for my $testnum (1 .. $number_to_test) {
        my $request_url = $bannerurls[rand @bannerurls];
        my $req = HTTP::Request->new(GET => $request_url);
        my $res = $ua->request($req);
        if ($res->is_success) {
            $md5{ md5_hex($res->content) }++;    # tally by digest of the banner data
        } else {
            $md5{__ERROR__}++;                   # failed requests get their own bucket
        }
    }

    for my $dig (sort keys %md5) {
        printf "the banner with digest=%s returned %d hits for %.2f%% of total\n",
            $dig, $md5{$dig}, 100 * $md5{$dig} / $number_to_test;
    }

the report should print the distribution requested. not sure if this can be done using ab or the bench i believe Stas was working on, but libwww is easy to use. you can even distribute this test to a few of your friends and have them bang on your system from a variety of different places to test your system in a more realistic environment.

then write a program to scour the logs for the test period, and produce the same report from the logs. they should match almost exactly. the only differences should be incomplete log requests etc..

1) Is there a more elegant way of solving my problem?
2) Has this been done before?
2a) If so, is the source for that available?
2b) If not, is a tool like this useful for anyone else, so that I should build it better than I would a once-off? What would make it more useful?

Thanks in advance for your help.

--Christopher Everett [EMAIL PROTECTED] 641-472-4178

--
___cliff [EMAIL PROTECTED] http://www.genwax.com/
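The last step suggested above, producing the same report from the logs and checking that the two "match almost exactly", could look roughly like this. It is a sketch with made-up numbers; diff_counts and the tolerance of 1 are assumptions, and how the two tallies get filled in is up to the test client and the log scanner.

```perl
use strict;
use warnings;

# Compare two tallies (key => count) and return the keys whose counts differ
# by more than $tolerance, with both values, sorted by key.
sub diff_counts {
    my ($client, $logged, $tolerance) = @_;
    my %seen = map { $_ => 1 } keys %$client, keys %$logged;
    my @diffs;
    for my $key (sort keys %seen) {
        my $c = $client->{$key} || 0;
        my $l = $logged->{$key} || 0;
        push @diffs, [ $key, $c, $l ] if abs($c - $l) > $tolerance;
    }
    return @diffs;
}

# Made-up numbers: what the test client tallied vs. what the logs report.
my %client = (banner_a => 500, banner_b => 300, banner_c => 200);
my %logged = (banner_a => 500, banner_b => 297, banner_c => 200, banner_d => 3);

for my $d (diff_counts(\%client, \%logged, 1)) {
    printf "%s: client saw %d, log says %d\n", @$d;
}
```

A small tolerance allows for the incomplete requests mentioned above; anything beyond it is worth looking at by hand.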
Re: how to really bang on a script?
On Sat, 28 Oct 2000, martin langhoff wrote:

Chris, i'd bet my head a few months ago someone announced an apache::bench module, that would take a log and run it as a benchmarking sequence of HTTP requests. just get to the list archives and start searching with

I wrote a simple perl script (that forks multiple children and uses IPCs to get multiple threads banging on your box) that runs from a parsed log... but it was more to test functionality than as a benchmarking tool. It _should_ still be floating around here...

benchmarks and logs. CPAN is your friend, also. there are at least 2 or 3 benching perl scripts available. I bet at least one does what you need. but I may still lose my bet ... m

--
[EMAIL PROTECTED]     | Don't go around saying the world owes you a living;
http://BareMetal.com/ | the world owes you nothing; it was here first.
web hosting since '95 | - Mark Twain
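The script described above is not shown in the thread, so what follows is not it, just a guess at the general shape: fork several children, give each a share of the parsed-log entries, and have each report back over a pipe. run_forked, the round-robin split and the do-nothing handler are all assumptions for illustration.

```perl
use strict;
use warnings;

$| = 1;    # unbuffered STDOUT so children do not re-flush parent output

# Split @$jobs round-robin across $workers child processes. Each child calls
# $handler on its jobs and reports how many it completed through a pipe.
sub run_forked {
    my ($workers, $jobs, $handler) = @_;
    my @readers;
    for my $w (0 .. $workers - 1) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                      # child
            close $reader;
            my $done = 0;
            for my $i (grep { $_ % $workers == $w } 0 .. $#$jobs) {
                $handler->($jobs->[$i]);
                $done++;
            }
            print {$writer} "$done\n";
            close $writer;
            exit 0;
        }
        close $writer;                        # parent keeps the read end
        push @readers, [ $pid, $reader ];
    }
    my $total = 0;
    for my $pair (@readers) {
        my ($pid, $reader) = @$pair;
        my $line = <$reader>;
        if (defined $line) {
            chomp $line;
            $total += $line;
        }
        close $reader;
        waitpid $pid, 0;
    }
    return $total;
}

# Demo handler does nothing; a real one would issue the HTTP request.
my @jobs  = map { "/banner?site=$_" } 1 .. 10;
my $total = run_forked(3, \@jobs, sub { });
print "children completed $total requests\n";
```

Processes rather than threads keep the children independent, at the cost of passing results back through pipes.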
Re: how to really bang on a script?
Hi there, On Fri, 27 Oct 2000, Christopher L. Everett wrote: I've written some mod_perl scripts that need testing over a million hits or so before I deploy it. ab (distributed with Apache, 'man ab' for help) can give you a million hits with one command. I don't know if you're going to get a real-life test other than by going live. Scripts tend to test what you think will go wrong. Invariably something else goes wrong in real life. Putting a million records in a logfile will test your disc capacity and little else. I need to prove to myself and my marketing guy that my script has certain statistical properties, not the least of which is the question of whether my activity logs match what actually happened. You've been spending too much time with your marketing guy. "Certain statistical properties" is gobbledygook. What properties? Activity logs don't match statistically. Either they match or they don't. If they don't then either the logging is turned off or it isn't working. Also, there's concurrency issues to make sure I've got right. Get yourself a bunch of users. Probably the easiest way to really bang on your script would be to advertise a new sex site (sorry:). Anyway I'd really hate you to do all that work only to find that you hadn't tested the test script well enough... 73, Ged.
Re: how to really bang on a script?
"G.W. Haywood" wrote: Hi there, On Fri, 27 Oct 2000, Christopher L. Everett wrote: snipped helpful advice I need to prove to myself and my marketing guy that my script has certain statistical properties, not the least of which is the question of whether my activity logs match what actually happened. You've been spending too much time with your marketing guy. "Certain statistical properties" is gobbledygook. What properties? Activity logs don't match statistically. Either they match or they don't. If they don't then either the logging is turned off or it isn't working. OK, I confess: I've written (probably yet another) mod_perl banner exchange. I need to know that when we serve 100K banners to 40K different IP addresses a day, and we are selling 30K banners/day, the 500 sites that hosted our banner ads are getting the 50K banners that we promised them, and the 30K banners that we sold, we really did serve. Also, I want to know that the banners my logs say the script sent are really the ones people saw on their browsers. That's what I meant by "certain statistical properties". So, I apologize for not describing my problem clearly in the first place. And again, my questions are: How would I go about proving to myself that my script does what I designed it to do? Has anyone else dealt with a similar problem, and how did they go about doing it? If I solve it for myself, would anyone else find the solution useful, and how would I make it more useful to them? Usually, I would test by running through the script a few times with some variations, but we are so freaked out by our experience with the 2 other banner exchange scripts we tried, we find a lot of value in being certain. Thanks again for your kind help. --Christopher Everett