On 07/10/2016 02:02 AM, Amos Jeffries wrote: > On 10/07/2016 2:38 p.m., Alex Rousskov wrote: >> On 07/09/2016 07:10 AM, Amos Jeffries wrote: >>> On 9/07/2016 11:18 a.m., Alex Rousskov wrote: >>>> On 07/08/2016 12:44 AM, Amos Jeffries wrote: >>>>> On 8/07/2016 11:24 a.m., Alex Rousskov wrote: >>>>>>> >>>>>>> + void resetWithoutLocking(T *t) { >>>>>>> + unlock(); >>>>>>> + raw = t; >>>>>>> } >>>>>>> >>>>>>> void resetAndLock(T *t) { >>>>>>> + if (t != get()) { >>>>>>> + resetWithoutLocking(t); >>>>>>> + lock(t); >>>>>>> + } >>>>>>> + } >> >>>>>> We should probably add self-reset protection to both cases or to none. >> >>>> Self-assignment and self-reset is a gray area. Each >>>> assignment/reset-handling class can define what should happen under >>>> those circumstances. The possible sane definitions are: >>>> >>>> 1. Do nothing (side effects lead to leaks/crashes in non-benign cases) >>>> 2. assertion/exception (explicit error in all cases) >>>> 3. Do the usual (side effects lead to leaks/crashes in most cases) >>>> >>>> Which definition to pick is our choice, but we should be _consistent_ in >>>> that choice. I recommend #1, but it is a weak recommendation. >>>> >>>> In the latest patch I have seen, resetAndLock() was using #1 but >>>> resetWithoutLocking() was using #3. >> >> >>> resetAndLock() is doing #1 >> >> Agreed. >> >> >>> and there are no side effects. Because the >>> new lock() is performed by us, not the caller. So we can confidently do >>> nothing with no side effects.
> * For the LockingPointer(T*) ctor and resetWithoutLocking() method > the caller is responsible for: > - doing a +1 lock itself, and > - exactly once for *each time either is used*, and > - doing it before the call/use. > > * For release() the caller is taking on responsibility for exactly 1 > un-lock operation. > > * For all other methods and operators the recieving Pointer is > responsible for the locking. > Please reconsider your code examples in light of those requirements and > you should see where the bugs in the examples are confusing you. I know the requirements. My examples are purposefully built to contain problematic self-reset conditions and illustrate the differences between self-reset treatment approaches. Real callers may think they follow all the rules, but code complexities can easily hide self-assignment/reset from them. My examples are simple enough to clearly expose those normally hidden problems. It is impossible to study self-reset treatment approaches using perfectly correct caller code. Perfectly correct caller code does not have self-resets. >> We cannot be confident that doing nothing will not cause problems. For >> example, the following well-meaning caller code will assert due to those >> unexpected "side effects" of approach #1: >> >> void importCertificate(LockingPointer &p, X509 *x) >> { >> const int referenceCount = x->references; >> p.resetAndLock(x); >> // check that we actually locked the new certificate >> assert(x->references > referenceCount); >> } >> >> >> ... if importCertificate() is fed p that already manages certificate x. > The p.resetAndLock() semantically increments the x references and > decrements the existing pointers references before it returns. Making > the assert '>' a bug. Yes, the example contains code that is intentionally buggy. The example illustrates why the do-nothing approach #3 may cause problems. The example function code "works" all the time except when dealing with a self-reset. When dealing with a self-reset, it asserts. Note that the function itself does not cause that self-reset -- its callers do. And, I repeat, these are just simple examples meant to illustrate the problem. Similar problems in real code are often well-hidden and may result in far more obscure side effects than an immediate crash or assertion. > That is true for any proper reference counting. The self-check is an > optimization of that case. Not a requirement. In isolation, using a specific form of a self-check is not a requirement -- we have a choice among the three options I itemized, one of which is do-nothing (#3). However, #2 becomes a requirement if, like you claimed, you want to detect self-resets reliably. And #1 becomes a requirement if we decide to promise a no-op to the callers in self-reset cases. >>> resetWithoutLocking() cannot do #1 without adding the listed side >>> effects for all 'benign' cases. >> >> In what I call benign cases, approach #1 always works correctly: >> >> p.resetWithoutLocking(p->get()); // never a problem with #1! >> > > Bug: caller of resetWithoutLocking() is not doing a +1 lock increment > itself before that method is called. Yes, but that is not important here. The point here is that with approach #1 the code works correctly in benign cases where the reset call simply was not needed at all. Again, in real code, the above self-reset will be obscured with something like: // NP: getNewCertificate() locks the certificate for us Root.resetWithoutLocking(getNewCertificate()); but the underlying code will have the same self-reset pattern if/when getNewCertificate() returns a raw pointer that is already managed by Root (e.g., because somebody changed getNewCertificate() to register with the global Root pointer _without_ fixing this caller code). For the purpose of this discussion, there is no difference between clearly self-resetting "p.resetWithoutLocking(p.get())" and the well-hidden Root example above. > Yes the bug location failing to do +1 may be somewhere in the caller of > the above importCertificate(). But in essence the whole chain of code > passing 'x' variable down to resetWithoutLocking() is responsible. The question is how LockingPointer should deal with these self-reset bugs. I suggest that we pick one of the three approaches itemized and use that approach consistently. You committed inconsistent LockingPointer code. That may be an even better solution for some unknown to me reason, but no valid defense of that inconsistency has been offered so far IMO. >>> The words "most cases" is incorrect on #3, because most cases will be >>> caller acting correctly and doing its +1 on the locks before our >>> unlock() is used. Only buggy callers will lead to the listed side effects. >> >> Unfortunately, #3 actually leads to leaks or crashes in benign cases >> where caller _is_ doing its +1 (as well as in some non-benign cases, of >> course). For example: >> >> x = newX509(); // creates and locks certificate x >> assert(x->references == 1); // OK, newX509() +1ed as expected >> LockingPointer p(x); >> assert(x->references == 1); // OK, our constructor did nothing >> p.resetWithoutLocking(x); // Opos, x is deleted here due to #3 > > Bug: using resetWithoutLocking() instead of resetAndLock(), when 'x' has > no caller lock associated. Yes, you are consistently pointing out bugs in the purposefully buggy code. You are not wrong -- the code in examples _is_ problematic -- but to illustrate bug handling problems we need those bugs in the examples. The problem we are trying to address is how LockingPointer should deal with problematic caller code that is likely to be present in Squid, now or in the future. You went as far as saying something like "I want us to expose those bugs, not hide them" but the code you committed does not do that. My attempts to explain why it does not do that have failed so far, possibly because you focus on the purposefully introduced bugs in the examples rather than on the problem we are trying to solve. All my recent examples use problematic code because we are (or, rather, I am and we should be) discussing how to deal with problematic code. >> std::cout << *p->get(); // crash >> >> The same is true for resetAndLock(), of course > > No. If it was, then Reference Counting (the universal concept) would not > work anywhere. What I said is a fact: resetAndLock() would lead to problems similar to those illustrated in the above resetWithoutLocking() example if resetAndLock() were to be implemented using the do-nothing approach #3 without self-reset protection. I do not know how you got from that fact to your "then a universal concept would not work" statement. Also, a universal concept of reference counting does not define how self-assignment and self-reset events are to be handled. For example, std::shared_ptr::reset() specifically says that "If the object pointed to by ptr is already owned, the function results in undefined behavior". > resetAndLock() is the method by which regular normal reference counting > is done. The recipient smart-pointer taking care of any +1/-1 that needs > to happen and ordering them to ensure the pointer received stays valid. Yes. And any assignment-like method in C++, including our resetAndLock() needs to deal with self-assignment/reset. Nothing in that method specs can avoid the problem because the problem comes from the language itself. The only choice we have is how to deal with that self-assignment. > self-reset is perfectly fine IFF the caller meets the requirements I > outlined at the top. You can't make an assert that tests that reliably. I agree that treating benign self-reset cases differently from non-benign ones is impossible because the reset method cannot know which self-reset is benign -- the code will be the same for both categories. The definition of a benign case can be something like "if the caller knew that this is a self-reset, it would have skipped calling the reset() method and everything would have worked OK". We either declare all self-reset cases as bugs (and then we may use the asserting approach #2) or we cannot assert at all (i.e., we can only use either approach #1 or approach #3). >> By definition, self-assignment is "p=p" or equivalent. >> Similarly, self-reset is, by definition: >> p.reset*(p->get()) >> >> or equivalent. That is, you are resetting a smart pointer using the >> smart pointer itSELF (or equivalent). That is what resetAndLock() >> if-statement checks, and that is the check that resetWithoutLocking() >> currently lacks. > For resetAndLock() that above code is fine. p ensures that pointer stays > valid through the locking/unlocking. For resetAndLock(), the above code is "fine" only because we have added self-reset protection to resetAndLock()! Without that self-reset protection of some kind, the above self-reset code will crash resetAndLock(). It is not something in the LockingPointer _specs_ that makes the above self-reset code safe in resetAndLock() case. It is our decision to treat all resetAndLock() self-resets as benign that makes it safe. Nothing in the LockingPointer specs requires us to add self-reset protection to resetAndLock(). It is our experience that self-reset happens in real code that prompts us to decide how to handle it, not the specs. Here is what we currently do: * We know that some self-reset cases in resetAndLock() may be serious bugs and some may be benign. We treat all self-reset cases in resetAndLock() as benign bugs. That is, we assume that if the caller knew that it is dealing with a self-reset, the caller would have skipped the reset, and everything would have worked correctly after that. If there are actually non-benign cases, then our resetAndLock() will lead to undefined behavior in those cases (we know how resetAndLock() is going to behave, but not what will happen to the self-reseting code using resetAndLock()). * We know that some self-reset cases in resetWithoutLocking() may be serious bugs and some may be benign. Our resetWithoutLocking() results in undefined behavior on any self-reset. Some of those self-reset cases could correspond to benign bugs: If the caller knew that it is dealing with a self-reset, the caller would have skipped the reset, and everything would work correctly after that, but we may crash or otherwise mishandle that benign caller anyway. Why we treat self-reset bugs in resetAndLock() callers differently from self-reset bugs in resetWithoutLocking() callers, I do not know. Nothing in your answers explain that asymmetry. > It is nasty that LockingPointer has to deal with external lock fiddling. > But we can't get away from that. Agreed. > All we can do is be very careful that > the alternative API is used correctly. We are certainly not very careful about that. We do not even assert that the resetWithoutLocking() pointer has been locked by the caller. However, this thread is currently not about that. It is about treating self-reset differently in two reset*() methods. >>> A buggy caller code is one that is not adding that lock and still using >>> resetWithoutLocking() >> >> That would be indeed a bug, but it has nothing to do with self-reset. > All the presented examples of self-reset are demonstrations of callers > failing that one requirement. Let's assume that is true. Why is that relevant? > I am completely certain that if the callers using the non-locking API > meet that requirement the locking math works out okay with the current > trunk code. That assertion is not being disputed here. Indeed, if everything is correct, then everything is correct. Unfortunately, we are discussing a different (and more realistic) case -- there are buggy callers out there. They think they do the right thing, but they do not. Should we treat buggy resetWithoutLocking() callers differently from buggy resetAndLock() callers? A "no" is a natural answer, for symmetry sake. A "yes" answer may be correct/better, but it requires some backing/explanation (and a source code comment!) that we can agree on. > I am also completely certain that if we add the if-statement to prevent > self-reset/assign unlock() happening *inside* the resetWithoutLocking() > that callers will become unable to use it reliably in those cases. Here is a trivial counter-example of a caller that reliably uses resetWithoutLocking() with do-nothing protection: ... some perfect/rules-obeying code here ... p.resetWithoutLocking(p.get()); // do-nothing makes this a no-op ... some perfect/rules-obeying code here ... > How can I be so certain? > > 1) The same behaviour can be implemented by ordering the new values +1 > before the old values -1. With no self-protecting if-statement. > --> always both operations happen even on self-reset/assign. > > 2) The normal pattern we are all familiar with wraps BOTH the lock and > unlock operations inside the if-statemet. > --> either both happen or neither. > When we come to the resetWithoutLocking() it is behaviourally only an > unlock for the old value. Yes, the method unlocks the old value and assigns the new value, under the assumption that somebody has locked the new value for us. In a self-reset case, the "new" part of the assumption fails (i.e., the new value is actually the old one in a self-reset case, by definition), resulting in crashes and such (unless protection is added). In a self-reset case, we could say, OK, the caller probably locked the value once, called resetWithoutLocking() once, and now mistakenly called resetWithoutLocking() again for the same value. To keep lock/unlock parity, we are going to do nothing (i.e., use approach #1). > Critically it does not lock on the new value. That means we CANNOT be > doing the 'BOTH' behaviour of (2) inside the method. Yes, of course, the two reset*() methods perform different operations (or we would not have two methods). resetWithoutLocking() never does two operations. Why is that important here? > Therefore (1) has to be the implementation for use of this part of the > API. Not really. Just because resetAndLock()'s locking/unlocking pattern #2 is not applicable to resetWithoutLocking() does not mean resetAndLock()'s locking/unlocking pattern #1 is applicable. resetWithoutLocking() has its own locking/unlocking pattern. > With the +1 step happening in caller, and the -1 inside > resetWithoutLocking(). Well, yes, that is true. > --> this is where the requirements list I wrote at the top come from. Nobody is disputing the requirements. > Is that clear enough now? What you say is mostly clear and correct, but mostly irrelevant to the problem I am discussing. You are not answering my questions. You are attacking bugs specifically planted in examples to illustrate bug handling. You are defending basic requirements that nobody is attacking. Etc. In other words, a squid-dev discussion as usual. I do not think we should waste more time on this. If we are lucky, no Squid code will trigger self-reset on resetWithoutLocking() and we will not waste days or weeks tracking those bugs. If we are not that lucky, I know that I did what I reasonably could to prevent that. Good luck, Alex. _______________________________________________ squid-dev mailing list squid-dev@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-dev