Re: [slim] Re: Performance measurements ?

2005-10-19 Thread Niek Jongerius
> I'm sure what they all really mean to say, Niek, is thank you very much
> for your contribution. :)

I know. I was just replying "you're welcome, grab a cold one and
put your feet up".

Cheers, Niek.



Re: [slim] Re: Performance measurements ?

2005-10-19 Thread kdf
I'm sure what they all really mean to say, Niek, is thank you very much 
for your contribution. :)


-k



Re: [slim] Re: Performance measurements ?

2005-10-18 Thread Niek Jongerius
>> And as I said in an earlier post, the measurements taken here show the
>> _real_ time it takes for the CLI to perform some database query.

> Sorry, your test tool does not quite do what you say... it does not
> measure _the_ time it takes; rather, it measures only _a_ single run,
> which by itself, *plus* all the perturbing effects of an uncontrolled
> system, is terribly inaccurate.  The numbers reported earlier demonstrate
> and support this.  A single run of the tool is competing with the rest
> of the processes running on the system - there are dozens or hundreds
> of threads also running and competing.
>
> Your test tool does not isolate the various effects that perturb
> measurements, so it's the benchmarker's job to defeat such effects.
> With the earlier results reported being 2-3x out of agreement, it is
> clear that what is being benchmarked is not in fact what you believe
> is being measured.  Therefore, drawing any conclusions is not very
> meaningful, or useful.
>
> Numerous background processes, virus scanners, network activity, disk
> spinup time, low-power to max-power CPU speedup time, swapping, disk
> cache, hardware interrupts, are factors which need to be eliminated and
> reduced before conclusions can be drawn.

All true, but please bear in mind what this tool actually tries to do.
There are quite a few complaints about performance of the server.
Performance in this context is something that is perceived; it is not
a measurement of "top speed". When people complain, they have probably
just tried to use their SB while all sorts of other processes were
running, just as you explained. It is that very experience of performance
that prompts them to act and send out a call for help.

This tool tries to capture exactly that perceived performance. It is
_intended_ to run on a system that is polluted by all sorts of junk. The
measurement would not be realistic without the real-life interference of
whatever is slowing the server down. All we have now is some vague
indication of performance. If someone complains "the server stalls when I
navigate to that menu, then click right, and then press play", it could be
very handy to have the database queries that correspond to those actions,
and have his server (running all the junk that is messing up the machine)
spit out a more tangible value than "it is sooo slow".

I don't expect the tool to be very accurate in light of all that has been
said, but the bottom line is that if someone wants their toy to play a
piece of music, and it takes, say, a minute to start playing whereas a
"normal" server should be able to start it in about a second, this tool
could give a more concrete indication of what the user experiences. If
the stats are very poor, maybe people could do some digging into what is
making the server so slow. Turn off whatever service they suspect, run a
couple more tests (using the same tool with the same queries on the tuned
server), and if these new tests show a significant and consistent drop in
response time (say, a factor of two or three), then I guess they are on
to something.

These are just ballpark figures (and very probably a huge ballpark at
that), but still the tool could be used to quantify what people see on
their messed-up server. It is no different from the server stats that
the nightlies can spit out: they too have to be scrutinized with care,
and cannot readily be compared across installs.

Niek.



Re: [slim] Re: Performance measurements ?

2005-10-18 Thread Marc Sherman

Michaelwagner wrote:
> Michael Herger wrote:
>> [20.891] titles 0 10
>> [16.235] titles 0 100
>
> Why would 10 titles take more time than 100?


Ramp-up anomalies (due to pre-fetching, caching, lazy code loading, etc.)
are very common in performance testing.  The usual methodology to
eliminate those effects is to run the entire series of tests a few times
first and throw those results away, before you start recording
reportable results.
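
In C-ish terms, the idea is just this (an illustrative sketch only, not
taken from the tool; WARMUP, RECORDED and run_series() are made-up
placeholders):

    /* Illustration only: run the whole query series a few times as a
     * warm-up and throw those numbers away, then record the rest. */
    #include <stdio.h>

    #define WARMUP   2    /* passes whose results are discarded */
    #define RECORDED 5    /* passes whose results are kept */

    /* stand-in for "run the whole CLI query series once and return
     * the elapsed time in seconds" */
    static double run_series(void)
    {
        return 0.0;
    }

    int main(void)
    {
        for (int pass = 0; pass < WARMUP + RECORDED; pass++) {
            double secs = run_series();
            if (pass < WARMUP)
                continue;                 /* warm-up: caches fill, code gets loaded */
            printf("pass %d: %.3f s\n", pass - WARMUP + 1, secs);
        }
        return 0;
    }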


- Marc


Re: [slim] Re: Performance measurements ?

2005-10-18 Thread Michael Herger

> The only thing that makes sense to me here is a caching artifact. But it
> didn't happen with Mike's other test on his other computer ...

I can confirm it and offer a possible explanation: slimserver had been
idle for about four days before I did that test over an ssh connection. As
the mail and web servers on that machine run 24/7, it's pretty probable
that slimserver had been swapped out.


--

Michael

---
Help translate SlimServer by using the
StringEditor Plugin (http://www.herger.net/slim/)


Re: [slim] Re: Performance measurements ?

2005-10-17 Thread Niek Jongerius
>> Why would 10 titles take more time than 100?

Your guess is as good as (or possibly better than) mine. But there is
no artifact introduced by the test program that I can see (the very
simple source is here: http://media.qwertyboy.org/files/sstime.c).
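
For those who don't feel like downloading it, the core of the idea is
roughly this (a trimmed-down sketch, not the actual sstime.c; the host,
port and query are hard-coded here purely for illustration, 9090 being
the default CLI port):

    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>

    int main(void)
    {
        const char *cmd = "titles 0 10\n";   /* one CLI query, newline-terminated */
        char buf[65536];
        struct timeval t0, t1;

        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(9090);                   /* default CLI port */
        addr.sin_addr.s_addr = inet_addr("127.0.0.1"); /* server assumed local */
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }

        gettimeofday(&t0, NULL);                       /* start the clock */
        write(fd, cmd, strlen(cmd));
        ssize_t n;
        do {                                           /* read until the reply's newline */
            n = read(fd, buf, sizeof(buf));
        } while (n > 0 && memchr(buf, '\n', n) == NULL);
        gettimeofday(&t1, NULL);                       /* stop the clock */

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("[%.3f] %s", secs, cmd);                /* same style as the posted results */
        close(fd);
        return 0;
    }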

> I indicated in an earlier post that the testing methodologies being
> used here are uncontrolled, and the margin of error is too high to
> have any meaning.  Without proper controls put into place, and
> reduction of all extraneous variables, such "tests" should be for
> amusement only.

And as I said in an earlier post, the measurements taken here show the
_real_ time it takes for the CLI to perform some database query. Yes,
the numbers that various installs yield are probably hard to compare
amongst each other, but the fact remains that this _is_ the time some
defined query takes to return results using the CLI. If the CLI gives
similar performance to a Slimpy/SB/SB2 when executing queries (and I'm
not knowledgeable enough to say it does, but I can't see why not), then
these numbers give a good indication of how long our beloved hardware
has to wait for the server to respond.

Again, this is _NOT_ meant to show how the server performs in an ideal,
controlled environment; this is a down-to-earth measurement of real-life
installs. Can someone tell me how these performance measurements differ
conceptually from the graphs Triode made that are in the nightlies? Are
they also "for amusement only"? They too give some idea of a real
install, and are not meant for an ideal, controlled environment.

Now if we could come up with a set of CLI commands that give a good
representation of what a real scenario would fire at the database, we
would have an objective indication of performance instead of vague
statements like "it is too slow" or whatever. _That_ is what I'm trying
to get to.

Unless I am totally off base here...

Niek.



Re: [slim] Re: Performance measurements ?

2005-10-13 Thread Michael Herger

> Agreed. There are too many variables in the way machines are set up to
> readily compare output numbers. CPU and RAM are by no means the only
> variables here. OS, procs running, procs priority, intermediate network
> (if test prog is run over a network) etc.


I was still surprised how well my tests reflected each machine's category:
times always differed by about a factor of two from one class to the next
when testing a C3/600 (Linux), a C3/1000 (Linux) and a P4/2.66 (Windows),
even though their software configurations are _very_ different.


--

Michael

---
Help translate SlimServer by using the
SlimString Translation Helper (http://www.herger.net/slim/)



Re: [slim] Re: Performance measurements ?

2005-10-13 Thread Niek Jongerius
>
> Yeah, you really need to be careful about methodology here.
>
> If you want to select typical things and get typical response times,
> you probably need to carefully think out the typical things people do
> at the user interface and mimic them in the CLI and run them on many
> different configurations.
>
> I doubt many people list the top thousand songs when they're at the
> remote.
>
> If you want to benchmark the code to do before-and-after studies of
> code improvements, you need to have one typical machine, benchmark it
> accurately AND THEN FREEZE IT AND DON'T CHANGE IT. That almost means
> dedicating it to the benchmarking task and making it a reference system,
> because you never know when installing Microsoft Office 2007 (heaven
> help us) or IE 17.2 will change the way I/O works or how much
> background activity there is cluttering up the disk.

Note that I did not intend this to be a benchmark tool. In a lot of posts
on this list people said the performance of their install was poor, which
is a very subjective indication. This program simply times a request to
the CLI. It should give some idea of what a user would see when using a
real SqueezeBox (assuming we use a reasonable set of CLI queries, which I
probably don't).

There are even some of us desperately switching OSes and tweaking stuff
on the same machine and discussing whether ActiveState is faster than
the compiled Windows version or Cygwin or whatever. This tool is at least
able to give _some_ numbers. Again, you cannot compare them one-to-one
with other installs, but IMHO if one install does something in 20 seconds
and another one in just 2, that difference is going to show up in the
user experience when connecting and using a SqueezeBox.

Niek.



Re: [slim] Re: Performance measurements ?

2005-10-13 Thread Niek Jongerius
>> Second pass:
>> [55.391] titles 0 1
>>
>> Third pass:
>> [66.390] titles 0 1
>>
> With a 20% variance, we can see that the testing methodology and
> environment are very rough.
>
> And, with Niek running a P4/2.8 getting
>
> [24.201] titles 0 1
>
> and Bill running a P4/3 getting an average of
>
> [60.114] titles 0 1
>
> There's an almost 2.5x difference in times running on hardware where
> hardware specs alone would account roughly for only 10% difference.
>
> Hopefully nobody will look at these data points and attempt to draw
> unwarranted conclusions.

Agreed. There are too many variables in the way machines are set up to
readily compare output numbers. CPU and RAM are by no means the only
variables here. OS, procs running, procs priority, intermediate network
(if test prog is run over a network) etc.

But the bottom line still is that it takes the reported amount of time for
the SlimServer to cough up the requested data (assuming the CLI uses
comparable ways of getting the data). If Bill is getting 2.5 times worse
performance than I do in the same tests, I would assume his setup performs
about that much worse than mine when serving a SqueezeBox. The proggy
does nothing fancy (I'll post the source in a minute on my site); it
just times the start and end of the CLI command.

I have not been very inventive in the queries I posed in my sample input
file. It could be that my example commands are somehow not representative
for gauging performance. Someone with a better understanding of what
reasonable queries actually look like could maybe suggest a few. It's just
a matter of editing the input file to test other CLI commands.
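
Just to make the idea concrete, a browse-like session might contain
something like this (purely my guess at representative queries, by no
means a vetted set):

    artists 0 50
    albums 0 50
    genres 0 20
    titles 0 100
    playlists 0 20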

Niek.
