On 09/19/2010 05:35 AM, Amos Jeffries wrote:
Round 2 of only src/comm/ changes.

I believe this accounts for all the bits requested to date. Including
adding Subscriptions for ConnAcceptor.

* The patch is missing the new Subscription class. Based on your subject line, I am guessing you will post that separately.


* Comm::ConnOpener::start() should call connect() directly. Start() is already async, there is no need to pay the async price again. Moreover, by doing an async call, you are more likely to encounter timeouts even before we call connect() because the clock starts ticking before the call is fired.

* Comm::ConnOpener::timeout() should call connect() directly. Timeout() is already async, there is no need to pay the async price again.

* I do not like how Comm::ConnOpener::connect() calls itself. I assume you did not invent this, but the code will fail to produce the intended(?) delay if there are no other async calls already scheduled, which is something the class does not really control. Recall that async calls are AsyncCallQueue::fire()d until there are none left...

At the very least, please add a comment explaining why we use such a hack instead of either using an explicit loop or waiting a little bit via eventAdd.


* calls_.connect_ should not be reused in Comm::ConnOpener::connect() because AsyncCall objects may not support being fired twice -- they may have state. The connect_ object is not needed at all if you address the above comments. If you do not, set it to NULL at the start of Comm::ConnOpener::connect() and recreate when needed.


* I may be wrong, but the interesting "Is opened conn fully open?" comment below may reveal a couple of bugs:

+    // recover what we can from the job
+    if (conn_ != NULL && conn_->isOpen()) {
+        // it never reached fully open, so abort the FD
+        conn_->close();
+        ...

Clearly, conn_ is open inside that if-statement (there is no such thing as half-open conn, I hope; we have enough headaches from half-closed ones!). There is nothing wrong with the connection. Instead, it is the ConnOpener job itself that got into a half-finished state here, most likely due to an exception after the connection got opened and before it was sent to the caller. To clean up, two separate actions are needed:

a) Close the opened connection. The beginning of your if-statement will do the job, but do not call doneConnecting(). Here is a polished version:

  if (conn_ != NULL && conn_->isOpen()) {
      Must(callback_ != NULL); // or there should be no conn_
      // we never sent this open connection to the caller, so close
      conn_->close();
      fd_table[conn_->fd].flags.open = 0;
      conn_ = NULL;
  }

b) Notify the caller we are dying:

  if (callback_ != NULL) {
      // do not let the caller wait forever
      doneConnecting(COMM_ERR_CONNECT, 0);
  }

I am not sure about the order. The above order means "caller can handle and should get nil params.conn". You can remove "conn_ = NULL" if "caller should get a non-nil but closed params.conn", but there will still be two if-statements.

Please check that I did not miss some other inter-state dependencies in the above sketch. You know this code much better than I am.


* Consider moving case COMM_OK code, including lookupLocalAddress() to something like Connection::opened(). The code seems to be manipulating conn_ without access to Comm::ConnOpener data except perhaps host_. I could have missed some other dependencies, of course, but it feels like a Connection business to set things up after the connection has been established. The ConnOpener job is done at that stage.

* Can we leave resetting fd_table[conn_->fd].flags.open to comm_close(), which will be called from via conn_->close()? It feels wrong to force it to zero way before the connection is actually closed but I understand that you may be fighting some hidden dependencies.


* Should we test _peer cbdata-validity before dereferencing it below?
+        if (_peer)
+            _peer->stats.conn_open--;

* The API below seems like a bad idea:
+    /** retrieve the peer pointer for use.
+     * The caller is responsible for all CBDATA operations regarding the
+     * used of the pointer returned.
+     */
+    peer * const getPeer() const { return _peer; }

because it leads to bugs like that (missing cbdata check):
+        if (conn_->getPeer())
+            conn_->getPeer()->stats.conn_open++;

I suggest checking cbdata validity of _peer in getPeer(), returning NULL if the pointer is invalid, and using getPeer() even internally.



* Our Vector::shift is both slow and buggy. This is not your problem, I guess, but perhaps you can add an XXX comment to Comm::AcceptLimiter::kick() to replace Vector with an efficient queue type.

* Can you explain how "syncWithComm() side effects hit theCallSub->callback()"? Is not theCallSub->callback() itself constant and its result is stored in a local variable? I do not understand how syncWithComm() can affect theCallSub itself. Thank you.


* Why check calls_.earlyAbort_, calls_.timeout_, and connStart_ in ConnOpener::start(). Start is called only once and these checks seem to imply that those variables can be set somewhere else, before the start, even though they are private.

* Consider s/connStart_/connectStart_/ for clarity. Too many conn prefixes...

* Please use "const TextException &e" in
+   } catch(TextException &e) {


* The code below does not need to declare temp outside its scope:
+        ConnAcceptor *temp = deferred.shift();
+        if (temp != NULL) {


* cbdataReferenceDone() already sets its parameter to NULL. No need to do that twice:
+    /* clear any previous ptr */
+    if (_peer) {
+        cbdataReferenceDone(_peer);
+        _peer = NULL;
+    }

* s/COMM flags set on this connection/COMM flags; see COMM_ defines above/ or something of that kind.


* Consider s/_peer/peer_/ in Connection for consistency with most other code.

* s/unused//g. C++ does not require naming unused parameters.

* s/As soon as comm IO accepts/As soon as commSetSelect accepts/
just to be more precise, because many comm IO calls already accept AsyncCalls.


HTH,

Alex.

Reply via email to