On Tue, Aug 25, 2009 at 11:30:01PM -0500, Mike Gerdts wrote:
> Ugh.  If we accept the fallacy that performance scales linearly with
> clock speed, that means that if my T5120 were running at 4.8 GHz I
> would see similar performance.

Hold on, now.  I'm not making any such assertion.  My point was that on
x86, where most of this project has been developed, the speeds are fast
enough that they haven't been a hindrance to the current development
work.

> The specs I have access to say that your CPUs (with significantly
> more sophisticated execution units) are running somewhere between 2.6
> and 3.3 GHz.  I'm not sure that we're seeing a lot more here than the
> speed of individual execution units within the cores.

That's probably part of it.  I'd give some credit to the better branch
prediction algorithms and speculation in the processor.  In 64-bit mode,
the x86 processors have a lot of registers to pass arguments through,
and no windowing issues, either.

> I can't help but think that parsing text every time is the wrong
> approach.  A binary format that can be mmap'd and used would probably
> take over 17 of the 17.9 seconds off of this function.

Keep in mind that with ZFS, mmap doesn't provide the performance gains
that you would see with other filesystems.  Since it has its own cache,
having to double-check its mappings against the page cache actually
makes mmap slower.  Though, I do take your point about a binary format.
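
To make the binary-format idea concrete, here is a minimal Python sketch
of a fixed-width record file that is mapped once and read by index with
no text parsing.  The record layout (a 32-byte name plus a 64-bit
offset), the helper names, and the sample entries are all hypothetical;
this is not the format pkg uses, and whether it wins on ZFS would still
need to be measured.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: 32-byte NUL-padded name, unsigned 64-bit offset.
# "<" selects standard sizes, so the file is identical on every platform.
RECORD = struct.Struct("<32sQ")

def write_index(path, entries):
    """Write (name, offset) pairs as fixed-width binary records."""
    with open(path, "wb") as f:
        for name, offset in entries:
            f.write(RECORD.pack(name.encode("utf-8"), offset))

def read_entry(mapped, i):
    """Decode record i straight from the mapping, without parsing text."""
    raw_name, offset = RECORD.unpack_from(mapped, i * RECORD.size)
    return raw_name.rstrip(b"\0").decode("utf-8"), offset

def demo():
    """Round-trip a few sample entries through a temporary index file."""
    entries = [("SUNWcs", 0), ("SUNWipkg", 4096), ("SUNWzfs", 8192)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_index(path, entries)
        with open(path, "rb") as f:
            mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                return [read_entry(mapped, i) for i in range(len(entries))]
            finally:
                mapped.close()
    finally:
        os.remove(path)
```

Because every record has the same size, looking up entry i is a single
multiplication and one unpack, independent of how large the file is.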

> Trapstat showed:
> 
> # trapstat 10 1
> vct name                |     cpu0     cpu1     cpu2     cpu3
> ------------------------+------------------------------------
>   9 immu-miss           |        0       48        5       72
>  20 fp-disabled         |        0        0        0        0
>  24 cleanwin            |   116540      393    65420    13737
>  31 dmmu-miss           |        1      136       92      245
>  34 unalign             |    87318        0    48793     9499
> ...
>  ac spill-asuser-32-cln |    87416      264    48956     9993
>  b0 spill-asuser-64-cln |       40       39        0        0
> ...
> 
> Is this indicative of the problem you mention, or another one?

Seeing cleanwin and spill traps certainly indicates that whatever is
going on is causing you to exhaust your available register windows.
This could be due to recursion.
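
If recursion does turn out to be the cause, a common remedy is to
replace the recursive walk with an explicit stack, so the call depth
stays flat no matter how deep the data is.  A minimal Python sketch,
assuming a hypothetical dependency tree stored as nested dicts; it is
only an illustration, since nothing in the trapstat output identifies
which code path is recursing.

```python
def depth_recursive(node):
    """Depth of a nested-dict tree; call depth grows with tree depth."""
    if not node:
        return 1
    return 1 + max(depth_recursive(child) for child in node.values())

def depth_iterative(node):
    """Same result using an explicit stack; call depth stays constant."""
    deepest = 0
    stack = [(node, 1)]
    while stack:
        current, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in current.values():
            stack.append((child, depth + 1))
    return deepest

def make_chain(levels):
    """Build a tree that is a single chain `levels` nodes deep."""
    root = {}
    node = root
    for _ in range(levels - 1):
        node["dep"] = {}
        node = node["dep"]
    return root
```

The iterative version trades stack frames for heap memory, which is the
point when deep call chains are what hurts.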

Besides finding the cases where we're running into excessive recursion,
we also need to parallelize our CPU-intensive algorithms, since any
single core on modern SPARC systems is slow, but used in parallel, we
can get a lot done.
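
As a rough sketch of what that parallelization could look like in
Python, the example below splits a list of payloads into chunks and
hashes each chunk in a separate worker process.  Hashing stands in for
whatever the CPU-intensive step really is, and the function names and
the default of four workers are assumptions made for illustration.
Processes are used rather than threads because CPython threads do not
run Python bytecode in parallel.

```python
import hashlib
from multiprocessing import Pool

def split(items, parts):
    """Split items into at most `parts` contiguous chunks."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def digest_chunk(chunk):
    """CPU-bound work for one chunk: hash every payload."""
    return [hashlib.sha1(data).hexdigest() for data in chunk]

def digest_all(payloads, workers=4):
    """Hash payloads in worker processes, preserving input order."""
    with Pool(workers) as pool:
        results = pool.map(digest_chunk, split(payloads, workers))
    return [digest for chunk in results for digest in chunk]
```

On platforms that start workers by spawning a fresh interpreter,
digest_all() has to be called from under an
`if __name__ == "__main__":` guard.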

It's interesting to see unalign show up in this output.  This implies
that something isn't performing aligned memory accesses, which is
generally a preventable problem.
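
Unaligned accesses typically come from native code reading a multi-byte
value at an address that is not a multiple of its size, often out of a
packed buffer.  Python's struct module can show the difference between
a packed layout and a naturally aligned one.  The two-field record
below is made up purely to illustrate the concept; the trapstat output
doesn't say which code is responsible.

```python
import struct

def int_offset(prefix):
    """Byte offset of the int in a (1-byte flag, int) record.

    prefix "=" means packed with standard sizes and no padding;
    prefix "@" means native sizes with the C compiler's padding.
    The int is the last field and struct adds no trailing padding,
    so its offset is the record size minus the int size.
    """
    return struct.calcsize(prefix + "bi") - struct.calcsize(prefix + "i")

packed_offset = int_offset("=")   # int starts right after the flag byte
native_offset = int_offset("@")   # int is pushed to an aligned offset
```

Reading the integer in place from the packed layout is the kind of
access that traps on SPARC; the native layout pads the record so the
load is aligned.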

> Are any of the improvements possibly aimed at mmap'able database?  I
> believe this is the key to rpm's speed.

I'm not the search expert, so I can't say for certain.  My understanding
was that we looked at existing database formats, but found that they
didn't provide the performance gains that we were expecting.

-j
_______________________________________________
pkg-discuss mailing list
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
