Re: Reducing NSS's allocation rate
On Tue, Nov 11, 2014 at 11:10 PM, Brian Smith br...@briansmith.org wrote: I've attached to the bug a profile that uses a stack trace depth of 10. Unfortunately, 10 isn't enough to see the non-NSS entry (one that doesn't start with security/nss/) for every case. I've now attached to the bug a profile that uses a stack trace depth of 24. That's the maximum I have in this particular profile, and definitely gets out of security/nss/ :) Nick -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
On Mon, Nov 10, 2014 at 7:06 PM, Ryan Sleevi ryan-mozdevtechcry...@sleevi.com wrote: Not to be a pain and discourage someone from hacking on NSS My patches are in the following bugs: https://bugzilla.mozilla.org/show_bug.cgi?id=1094650 https://bugzilla.mozilla.org/show_bug.cgi?id=1095307 https://bugzilla.mozilla.org/show_bug.cgi?id=1096741 I'm happy to hear specific criticisms. Nick -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
On Tue, November 11, 2014 10:26 am, Nicholas Nethercote wrote: On Mon, Nov 10, 2014 at 7:06 PM, Ryan Sleevi ryan-mozdevtechcry...@sleevi.com wrote: Not to be a pain and discourage someone from hacking on NSS My patches are in the following bugs: https://bugzilla.mozilla.org/show_bug.cgi?id=1094650 https://bugzilla.mozilla.org/show_bug.cgi?id=1095307 https://bugzilla.mozilla.org/show_bug.cgi?id=1096741 I'm happy to hear specific criticisms. Nick -- Not trying to be a pain, but I don't think that's fair to position like that. I'd rather the first question be answered - each one of your patches adds more branches and more complexity. That part is indisputable, even if it may be seen as small. However, what measures do we have to ensure that this is meaningfully improving any objective measure of performance (other than allocation churn, which allocators are exceptionally capable of handling)? How do we ensure this doesn't regress? Otherwise, we're adding complexity without any benefits. And I don't think there are - or at least, I haven't seen, other than reduces allocations. -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
On 11/11/2014 12:32 PM, Ryan Sleevi wrote: On Tue, November 11, 2014 10:26 am, Nicholas Nethercote wrote: On Mon, Nov 10, 2014 at 7:06 PM, Ryan Sleevi ryan-mozdevtechcry...@sleevi.com wrote: Not to be a pain and discourage someone from hacking on NSS My patches are in the following bugs: https://bugzilla.mozilla.org/show_bug.cgi?id=1094650 https://bugzilla.mozilla.org/show_bug.cgi?id=1095307 https://bugzilla.mozilla.org/show_bug.cgi?id=1096741 I'm happy to hear specific criticisms. Nick -- Not trying to be a pain, but I don't think that's fair to position like that. I'd rather the first question be answered - each one of your patches adds more branches and more complexity. OK, looking at that patches, I would challege that assertion. The patch in https://bugzilla.mozilla.org/show_bug.cgi?id=1094650 actually simplifies the code. I do agree we need to make complexity versus reward calculations, but most of these changes don't raise the complexity bar very high, if at all. That part is indisputable, even if it may be seen as small. Sorry, I just disputed it;). However, what measures do we have to ensure that this is meaningfully improving any objective measure of performance (other than allocation churn, which allocators are exceptionally capable of handling)? I agree that these changes should be targeted based on instrumentation. I believe Nicholas has provided this in the bug. How do we ensure this doesn't regress? I think that depends on the change. Certainly https://bugzilla.mozilla.org/show_bug.cgi?id=1095307 is simple enough, that the combination of review and normal testing should give us confidence in the patch's correctness. I'd say the same forhttps://bugzilla.mozilla.org/show_bug.cgi?id=1094650 . The patch in bug https://bugzilla.mozilla.org/show_bug.cgi?id=1096741 seems more risky, as it delays allocation. So it would require more testing and and a more dillegent review, and may actually cross the complexity/benefit bar for me. (also, the code in question is code meant to avoid cache attacks, we want to make sure we don't reintroduce a new back channel attack). Otherwise, we're adding complexity without any benefits. And I don't think there are - or at least, I haven't seen, other than reduces allocations. For me the bar is pretty low, particularly since I've seen patches that actually simplify the code. The 'use static until some threshold, the malloc' is a pretty well established meme in NSS already, and is particularly safe if we can see it's entire use over a single function. The bigger issue is are we getting proper bang for the buck. Here we have an NSS user who is admittedly tuning for his particular workload, but doing performance tuning. It's targeted by hitting the largest offenders using instrumented tools to figure that out given the client's workload, then knocking out the low hanging fruit. I think that's pretty classic optimization strategy. bob smime.p7s Description: S/MIME Cryptographic Signature -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
On Mon, Nov 10, 2014 at 9:04 PM, Nicholas Nethercote n.netherc...@gmail.com wrote: In your analysis, it would be better to use a call stack trace depth larger than 5 that allows us to see what non-NSS function is calling into NSS. I've attached to the bug a profile that uses a stack trace depth of 10. Unfortunately, 10 isn't enough to see the non-NSS entry (one that doesn't start with security/nss/) for every case. However, it looks like the data supports the types of changes that you are making and also my suggestions for coalescing and caching results, as well as my suggestion to avoid constructing CERTCertificate objects. Depending on how much effort you're willing to invest in this, many (probably most) of those allocations can be avoided. David Keeler is very familiar with the code in security/certverifier and security/manager/ssl/src that would be changed to implement the additional things I suggested, so I suggest you talk to him about it. Cheers, Brian -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Reducing NSS's allocation rate
Hi, I've been doing some heap allocation profiling and found that during basic usage NSS accounts for 1/3 of all of Firefox's cumulative (*not* live) heap allocations. We're talking gigabytes of allocations in short browsing sessions. That is *insane*. I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1095272 about this. I've written several patches that fix problems, one of which has r+ and is awaiting checkin; check the dependent bugs. This is making Ryan Sleevi is nervous and he wanted me to post something here about my plans. So here they are: I want to reduce unnecessary allocations. I want to do so in a very non-intrusive fashion: I'm aware that NSS is security-sensitive code, and TBH it's not that enjoyable to read or modify. I will plug away at this as long as there is low-hanging fruit to be found, which may not be that much longer. Nick -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
On Mon, November 10, 2014 6:51 pm, Nicholas Nethercote wrote: Hi, I've been doing some heap allocation profiling and found that during basic usage NSS accounts for 1/3 of all of Firefox's cumulative (*not* live) heap allocations. We're talking gigabytes of allocations in short browsing sessions. That is *insane*. Could you explain why it's insane? I guess that's sort of why I poked you on the bug. Plenty of allocators are rather smart under such churn, and NSS itself uses an Arena allocator designed to re-use some (but understandably not all) allocations. Is there a set of performance criteria you're measuring? For example, we're spending X% of CPU in the allocator, and we believe we can reduce it to Y%, which will improve test Z. Not to be a pain and discourage someone from hacking on NSS (we always need more NSS hackers), but I guess I'm just trying to understand the complexity (and locally caching / lazy instantiating always adds some degree of complexity, though hopefully minor) vs performance (which is hopefully being measured) tradeoffs. Further, if we aren't continuously monitoring this, meaning it's not a metric integrated as something we watch, then it seems very easy that any improvements you make can quickly regress, which would be unfortunate for your work. -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
Personally, I would like to encourage your efforts. If you are able to move many of these allocations from heap-based with locks, to something stack-based instead, this will improve NSS server performance tremendously. I would be surprised if it was a significant boost to client apps like Firefox, though. On the other hand, I'm not sure that there is that much low-hanging fruit, based on the stacks you list in the bug. Many are dictated by the design of NSS, the PKCS#11 API, and the current softoken implementation. Working within these constraints is not so simple. Keep in mind that many things you might want to change cannot be, in order to preserve NSS API binary compatibility. Julien On 11/10/2014 18:51, Nicholas Nethercote wrote: Hi, I've been doing some heap allocation profiling and found that during basic usage NSS accounts for 1/3 of all of Firefox's cumulative (*not* live) heap allocations. We're talking gigabytes of allocations in short browsing sessions. That is *insane*. I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1095272 about this. I've written several patches that fix problems, one of which has r+ and is awaiting checkin; check the dependent bugs. This is making Ryan Sleevi is nervous and he wanted me to post something here about my plans. So here they are: I want to reduce unnecessary allocations. I want to do so in a very non-intrusive fashion: I'm aware that NSS is security-sensitive code, and TBH it's not that enjoyable to read or modify. I will plug away at this as long as there is low-hanging fruit to be found, which may not be that much longer. Nick -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
On Mon, Nov 10, 2014 at 6:51 PM, Nicholas Nethercote n.netherc...@gmail.com wrote: I've been doing some heap allocation profiling and found that during basic usage NSS accounts for 1/3 of all of Firefox's cumulative (*not* live) heap allocations. We're talking gigabytes of allocations in short browsing sessions. That is *insane*. I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1095272 about this. I've written several patches that fix problems, one of which has r+ and is awaiting checkin; check the dependent bugs. In your analysis, it would be better to use a call stack trace depth larger than 5 that allows us to see what non-NSS function is calling into NSS. The checks done in mozilla::pkix's CheckPublicKeySize can easily be optimized. But, first check how often the call stack contains CheckPublicKey vs VerifySignedData; CheckPublicKey can be optimized even more than VerifySignedData. My original plans for VerifySignedData was for it to have a cache added to it, if/when performance testing showed that there was a performance problem. It is likely that such a cache is important, even without the heap thrashing that you are concerned about. Also, there is already a bug on file about caching and coalescing SSL server cert verification results in SSLServerCertVerification. This is trickier than the type of caching you can do in VerifySignedData but it is potentially a bigger win. Also, I think recent changes to Gecko's connection management (the parallelism to a new host restricted to 1 bug being fixed) made it more important to do at least the coalescing part. Note that when bug 1036103 is fixed (which will be basically whenever I get around to posting one more patch), it will be possible to avoid any of the NSS CERT_* API during certificate verification, if people are willing to do a little (probably quite a bit, actiually) refactoring. That that, except for the calls to SECKEY_DecodeDERSubjectPublicKeyInfo and SECKEY_ExtractPublicKey in CheckPublicKeySize, mozilla::pkix allocates no memory at all, ever (once CheckNameConstraints is replaced, which is the thing that is one patch away from happening). Cheers, Brian -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto
Re: Reducing NSS's allocation rate
On Mon, Nov 10, 2014 at 8:53 PM, Brian Smith br...@briansmith.org wrote: I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1095272 about this. I've written several patches that fix problems, one of which has r+ and is awaiting checkin; check the dependent bugs. In your analysis, it would be better to use a call stack trace depth larger than 5 that allows us to see what non-NSS function is calling into NSS. I've attached to the bug a profile that uses a stack trace depth of 10. Nick -- dev-tech-crypto mailing list dev-tech-crypto@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-crypto