Re: [OT] benchmarking typical programs
On Thu, Sep 20, 2012 at 12:35:18PM +0100, Nicholas Clark said:
> Lots of one trick pony type benchmarks exist, but very few that
> actually try to look like they are doing typical things typical
> programs do, at the typical scales real programs work at.

As a search engineer (recovering) I'm inclined to say - get a corpus of
docs, build an inverted index out of it and then do some searches. This
will test:

1) File/IO performance (reading in the corpus)
2) Text manipulation (tokenizing, stop word removal, stemming)
3) Data structure performance (building the index)
4) Maths calculation (performing TF-IDF searches)

All in pretty good, discrete steps. Plus by tweaking the size of the
corpus you can stress memory as well.

Simon
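[Editor's note: the four steps above can be sketched in a few lines. This is a minimal illustration in Python rather than Perl; the toy corpus and document names are made up, and stemming/stop-word removal are omitted for brevity.]

```python
import math
from collections import defaultdict

# Hypothetical toy corpus standing in for the real document set (step 1).
corpus = {
    "doc1": "the cat sat on the mat",
    "doc2": "the dog sat on the log",
    "doc3": "cats and dogs",
}

# Steps 2 and 3: tokenize and build the inverted index
# (term -> doc -> raw term frequency).
index = defaultdict(lambda: defaultdict(int))
for doc, text in corpus.items():
    for token in text.split():
        index[token][doc] += 1

# Step 4: score a query with tf-idf.
def search(query):
    D = len(corpus)
    scores = defaultdict(float)
    for token in query.split():
        postings = index.get(token, {})
        if not postings:
            continue
        idf = math.log(D / len(postings))  # rarer terms weigh more
        for doc, tf in postings.items():
            scores[doc] += tf * idf
    return sorted(scores.items(), key=lambda kv: -kv[1])

results = search("cat")  # "cat" appears only in doc1
```

Growing the corpus dict is the knob Simon mentions for stressing memory.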
Re: [OT] benchmarking typical programs
+1

And as a bonus, you cover pretty much the whole data munging market as a
side effect with this one.

On 21/09/2012, at 17:56, Simon Wistow si...@thegestalt.org wrote:
> As a search engineer (recovering) I'm inclined to say - get a corpus of
> docs, build an inverted index out of it and then do some searches.
> [...]
Re: [OT] benchmarking typical programs
On Fri, Sep 21, 2012 at 08:56:34AM +0100, Simon Wistow wrote:
> As a search engineer (recovering) I'm inclined to say - get a corpus of
> docs, build an inverted index out of it and then do some searches. This
> will test:
>
> 1) File/IO performance (reading in the corpus)
> 2) Text manipulation (tokenizing, stop word removal, stemming)
> 3) Data structure performance (building the index)
> 4) Maths calculation (performing TF-IDF searches)
>
> All in pretty good, discrete steps. Plus by tweaking the size of the
> corpus you can stress memory as well.

Thanks, this is a useful suggestion, but... I'm not a search engineer
(recovering or otherwise), so this represents rather more work than I
wanted to do. In that I first have to learn enough of how to *be* a
search engineer to figure out how to write the above code to do something
useful, *then* how to turn such code into a reasonably performant
production version, and then to turn working code into something
sufficiently stand-alone to be a benchmark.

I don't want to be spending my time figuring out the right way to do all
the above algorithms in Perl. I want to get as fast as possible to the
point of figuring out how the perl interpreter (mis)behaves when
presented with extant decent code to do the above.

Unless there's a CPAN-in-a-box for doing most of the four steps. (Which
doesn't depend on external C libraries. That was one of my "preferably"
criteria.)

So, next question - if I wanted to be as lazy as possible and write a
search engine (as described above) using as much of CPAN as possible,
which modules are recommended? :-)

Nicholas Clark
Re: [OT] benchmarking typical programs
On 21 September 2012 10:22, Nicholas Clark n...@ccl4.org wrote:
> Unless there's a CPAN-in-a-box for doing most of the four steps. (Which
> doesn't depend on external C libraries. That was one of my "preferably"
> criteria.)

Alas, the best ones are indeed C (or C++) libraries: Search::Xapian (my
preference); KinoSearch; yada yada.

> So, next question - if I wanted to be as lazy as possible and write a
> search engine (as described above) using as much of CPAN as possible,
> which modules are recommended? :-)

Probably look at all the Lucene-related modules and steal some code. You
wouldn't need to do a full-blown engine with spelling correction, fancy
query parsing etc.
Re: [OT] benchmarking typical programs
On Fri, 21 Sep 2012, Nicholas Clark wrote:
> So, next question - if I wanted to be as lazy as possible and write a
> search engine (as described above) using as much of CPAN as possible,
> which modules are recommended? :-)

The Plucene test suite may be the answer. I know it certainly does the
indexing bit.

--
bob walker
everything should be purple and bendy
http://randomness.org.uk
Re: [OT] benchmarking typical programs
On 21/09/2012, at 19:22, Nicholas Clark n...@ccl4.org wrote:
> So, next question - if I wanted to be as lazy as possible and write a
> search engine (as described above) using as much of CPAN as possible,
> which modules are recommended? :-)

I think you want Plucene. But please let someone else correct me if I'm
wrong.
Re: [OT] benchmarking typical programs
On 19 Sep 2012, at 12:09, Nicholas Clark n...@ccl4.org wrote:
> Does the mighty hive mind of london.pm have any suggestions (preferably
> useful) of what to use for benchmarking typical Perl programs?

Does benchmarking the test suites for a representative subsection of the
CPAN world count?

And what precisely are you attempting to benchmark? The core behaviour
of the perl interpreter itself, or the edge cases of domain-specific
work such as parsing XML in pure perl?
Re: [OT] benchmarking typical programs
On 21 Sep 2012, at 10:57, David Hodgkinson daveh...@gmail.com wrote:
> Does benchmarking the test suites for a representative subsection of
> the CPAN world count?

I doubt it. Each test suite is very repetitive, so you certainly won't
be doing a realistic benchmark re CPU caches and possibly not re the MMU
or I/O system.

--
David Cantrell
Re: [OT] benchmarking typical programs
On 21 Sep 2012, at 11:09, David Cantrell da...@cantrell.org.uk wrote:
> I doubt it. Each test suite is very repetitive, so you certainly won't
> be doing a realistic benchmark re CPU caches and possibly not re the
> MMU or I/O system.

-j
Re: [OT] benchmarking typical programs
On Fri, Sep 21, 2012 at 10:22:44AM +0100, Nicholas Clark said:
> I'm not a search engineer (recovering or otherwise), so this represents
> rather more work than I wanted to do.

I'll try and knock something together, but really it's a fairly simple
algorithm. Warning: untested.

    my %index;
    foreach my $doc (@corpus) {
        my $text   = slurp($doc);
        my @tokens = tokenize($text);
        foreach my $token (@tokens) {
            $index{$token}->{$doc}++;
        }
    }

    my $D = scalar(@corpus);

    foreach my $query (@queries) {
        my %results;
        my @tokens = tokenize($query);
        foreach my $token (@tokens) {
            my $docs = $index{$token};
            my $d    = scalar keys %$docs;
            foreach my $doc (keys %$docs) {
                # http://en.wikipedia.org/wiki/Tf*idf
                my $tf  = $docs->{$doc};
                my $idf = log($D / $d);
                $results{$doc} += $tf * $idf;
            }
        }
        my $count = 1;
        foreach my $doc (sort { $results{$b} <=> $results{$a} } keys %results) {
            print "$count) $doc (score " . $results{$doc} . ")\n";
            $count++;
        }
    }

    sub tokenize {
        my $text  = shift;
        my @words = split ' ', $text;
        return map { stem($_) } grep { !$STOP_WORDS{$_} } @words;
    }

    # world's most useless stemmer
    # here for munging performance checking only
    sub stem {
        my $word = shift;
        $word =~ s!(ing|s|ed|ly)$!!;
        return $word;
    }
Re: [OT] benchmarking typical programs
How open to craziness are you?

A program, any program in any language, can only do so many things,
right?

- Write/read to memory
- Write/read to disk
- Write/read to the network
- Write/read to a given port (serial / parallel / USB)
- Write/read to another program

(I might be missing something, though my current in-cerebrum
visualisation of the von Neumann architecture doesn't leave room for
more.)

This should be relatively simple to write or scavenge from somewhere
amongst the many test suites there are. If you REALLY REALLY REALLY want
to make sure your cache, wherever it might be located, doesn't play
tricks on you, you can always go berserk mode and use the most used
feature in the Windows world: reboot.

These tests, whether you already have them or have to write them, could
read random-ish patterns of data, big/small enough to fit your needs,
from specific test data files. And if they're all random between tests I
would say that that's even better, but that's up to the architect to
decide ;)

So the recipe could be:

- Get a cheap box you don't mind thrashing with reboots (there's nothing
  wrong with reboots, btw)
- Make it boot into single mode with network on (so you can upload
  things or trigger another Perl build)
- Make sure cron/anacron and friends (whoever might want to wreak havoc
  whilst the test is running) are disabled
- Get a trigger in place (ssh my_super_evil_box ./build_perl 5.42
  --with-evil-test-suite, or trigger it via post-commit or similar)
- Make sure that in the end the results are kept on the box and/or are
  mailed to you or posted to some other interface
- reboot
- if (end_of_test) SYSTEM READY 4 ANOTHER RIDE ...

I suppose that with the exact same build you should expect some
fluctuation between test runs, much like the fluctuations in the
BogoMips calculation, but never to a degree of magnitude that might
indicate a regression.

Regards,
PECastro
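[Editor's note: the recipe above could be driven by a small wrapper script; here is a sketch in Python. Every host name, script name, and flag is a hypothetical placeholder for whatever the real rig uses, and the dry-run default means nothing is actually executed.]

```python
import subprocess

# Placeholder commands: box name, build script, and flags are all
# hypothetical stand-ins, echoing the recipe's trigger step.
STEPS = [
    "ssh my_super_evil_box ./build_perl 5.42 --with-evil-test-suite",
    "ssh my_super_evil_box ./run_benchmarks --mail-results me@example.org",
    "ssh my_super_evil_box sudo reboot",
]

def run(steps, dry_run=True):
    """Run each step in order; in dry-run mode just report what would run."""
    executed = []
    for cmd in steps:
        if dry_run:
            print("would run:", cmd)
        else:
            # check=True aborts the sequence if any step fails
            subprocess.run(cmd, shell=True, check=True)
        executed.append(cmd)
    return executed

run(STEPS)
```

A post-commit hook could call the same function with dry_run=False, so the box rebuilds, benchmarks, mails results, and reboots itself ready for another ride.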