Comments on the Target Discovery design, v 0.6: meta-comment: the orchestrator is using mixed-case function names, whereas here we're using lower-case with '_' as a word separator. I'd like to see some consistency of style in this respect.
3.3 OS module: I'd feel better if we were a little more precise in the spec on what constitutes a "critical filesystem" 4.1.1 td_discover: the description says "Repeat calls will discard all existing discovery information and rediscover everything"; is this meaning to say that a td_discover(TD_OT_SLICE,...) will cause prior results from, say, a td_discover(TD_OT_DISK,...) to be discarded? If so, why would that be necessary? The text in the last paragraph of this section discussing caching indicates a less drastic result, and then there's section 8 which says there's no caching at all, so please clarify. td_discovery_release: why not a less drastic form of release to free only specific types? 4.1.2.1: "All enumeration functions return ... ENOENT=end of list". I'm not a big fan of overloading the system errno codes with alternate universe meanings within a sub-system, because inevitably you end up twisting them in ways which are hard to understand. You've gone to the trouble to define td_errno_t, how about making that an enum with sub-system-specific values? Additionally, I'm not seeing that the usage pattern of these functions works out that well; the example on page 9 under 4.1.5 doesn't match up with what's here, as it's missing the call to td_get_next() to traverse the list. But why can't it just be something like: nv_list_t *td_get_next(td_object_type_t, &td_errno_t) I just don't see what the extra call in the td_get_next/td_attributes_get pair buys the caller. Can you elaborate on the requirement which drives creating td_get_slice_next()? Why isn't there a similar call for the other types? 4.1.3: the table here lists td_discover_free, which isn't mentioned elsewhere, and omits td_list_free. Also, in terms of doc structure, it might make sense to put this table at the start of the section to help provide an overview before diving into the details. 4.1.6: do we have any advice from the I/O groups on whether a 10000 ms timeout is an appropriate default? Also, some additional discussion of the threading strategy here seems appropriate; is it correct to say that we are attempting to do the discovery in a single-threaded fashion and only switch to multi-threaded once we encounter a timeout? section 9: I think we have verified conclusively that libdiskmgt is presently consolidation private and this needs upgrading to meet our needs (in fact, 6.1 said essentially this). Please clean up here.
