Re: [6tisch] draft-ietf-6tisch-6top-protocol-09 comments

Xavi Vilajosana Guillen Fri, 02 Mar 2018 08:16:26 -0800

Dear Lotte,

Thanks so much for your detailed review. Coming from an implementer, this is
really useful!


Our responses are inline, marked "XV:". The proposed corrections will be
incorporated in the next version of the draft, which will be published as
soon as possible (before the cut-off date).

kind regards,
Xavi
------------
Dear 6TiSCH Working Group,

somehow I missed the WGLC announcement for the 6top protocol draft. I'm not
quite sure if I'm too late with my review/questions now, but in case I'm
not, I'd like to share what I've got so far.

As for the context of my E-Mail: I'm currently implementing the 6top
Protocol as part of my master's thesis. It's not a full implementation,
just the parts that I currently need: 3-Step transactions are missing and
DELETE Requests are still Work In Progress, for example. I'm new-ish to the
ideas of 6TiSCH and TSCH in general, so my comments come from an outsider's
point of view.

While implementing 6P, I've stumbled across some parts of the document
where I'm not quite sure if there's an inconsistency or if I'm just missing
something. In any case, I think it might be helpful to clear these parts up
(in the draft or on the WG Mailing List) before publishing 6P as Proposed
Standard. (Any feedback to my questions would be greatly appreciated, and
all statements proposing a change come with an implicit "I'd be happy to
write/suggest text for that", of course.)

Overall, I've found the draft to be nicely written and easy to read, but
lacking clear instructions in places. The idea of how 6P works is quick to
grasp, also thanks to the illustrations in Fig. 4 and 5. However, when it
comes to implementing the protocol, I've found myself skipping all over the
document to gather information on what's what. Especially the message
format and handling feels incomplete; not all message types are illustrated
or documented in full and one often has to infer what to check and send
when.
In the following, I will detail what exactly was unclear to me.

6P ADD Response where NumCells == 0
---------------------------------------------------------
Section 3.3.1. says:

"[...] The returned list can contain NumCells elements (succeeded) or
between 0 and NumCells elements (partially succeeded).
In the case that none of the cells could be allocated node B MUST send a 6P
Response with return code set to NOALLOC,
indicating that cells could not be allocated in the schedule, for example
because they are already used or reserved.
The returned list in this case MUST contain 0 elements."

If the returned list contains 0 elements, it satisfies both the
requirements to send a SUCCESS as well as a NOALLOC response. Should I send
a) both a SUCCESS as well as a NOALLOC response
b) a NOALLOC response
c) a SUCCESS response
in this case?

I'd assume the answer is b), but since the wording around partially
succeeded allocations is explicitly mentioning 0 cells as an option, that
assumption may very well be wrong. Depending on what the correct answer is,
I'd propose to state it more explicitly in the draft.

XV: This is now clarified in the text. The only possible response is
RC_SUCCESS; the size of the returned list determines whether the allocation
is full, partial, or empty (no cells).
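
As an illustration only (not text from the draft), a requester could
interpret an ADD response under this rule roughly as follows. The function
name, the string return codes and the outcome labels are assumptions made
for this sketch.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (slotOffset, channelOffset); assumed representation


def classify_add_response(return_code: str, num_cells: int,
                          cell_list: List[Cell]) -> str:
    """Classify a 6P ADD outcome from the size of the returned CellList."""
    if return_code != "RC_SUCCESS":
        return "error"
    if len(cell_list) > num_cells:
        # More cells than requested is not discussed above; treating it as
        # an error is an assumption of this sketch.
        return "error"
    if len(cell_list) == num_cells:
        return "full"
    if len(cell_list) == 0:
        return "none"
    return "partial"
```

With NumCells = 2, a returned list of two cells is "full", one cell is
"partial", and an empty list is "none", all under RC_SUCCESS.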


Also, NOALLOC doesn't appear in Figure 36: 6P Return Codes, is this on
purpose?

XV: This return code has been removed; it is no longer used.

Response Format Specification and Illustrations
---------------------------------------------------------------------
I would've found it very helpful to have Figures & subsections describing
the format of SUCCESS, RESET and NOALLOC responses (and how to handle them)
just like the requests are illustrated in fig. 9, 11, 13 (and 14).

As an example, I would've assumed that the NOALLOC response looks like this:

      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |Version| T | R |     Code      |     SFID      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

(since the cellList is always empty anyways, there's no need to explicitly
transmit it), but the text says "The returned list in this case MUST
contain 0 elements", and while the text says that a message with code
RC_RESET marks a "critical error, reset"– what exactly should be reset?
The transaction state? How is it any different to a NOALLOC message then?
Or are NOALLOC and RESET messages the same thing by a different name
and NOALLOC is just a leftover from previous renamings?

XV: The response format is detailed in Figure 13. When the CellList has
length 0, the element is not present in the frame.
The response code is RC_SUCCESS, as described in the current text. The size
of the list tells us how many of the cells have actually been allocated.
The RC_RESET return code is returned upon a critical error, such as an
inconsistency that cannot be resolved. In this case the transaction is
aborted and all cells scheduled between the two peer nodes in the
transaction are cleared.
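
A minimal sketch of the RC_RESET handling described here, assuming the
schedule and the open transactions are kept in plain dictionaries keyed by
neighbor. These data structures and the function name are hypothetical and
only serve to illustrate the two effects (abort, then clear).

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (slotOffset, channelOffset); assumed representation


def on_rc_reset(schedule: Dict[str, Set[Cell]],
                open_transactions: Dict[str, dict],
                neighbor: str) -> None:
    """Abort the open transaction with `neighbor` and clear its cells."""
    # The transaction is aborted: forget any state kept for it.
    open_transactions.pop(neighbor, None)
    # All cells scheduled with this neighbor are cleared; cells scheduled
    # with other neighbors are left untouched.
    schedule[neighbor] = set()
```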

-----
On a side note, fig. 14 is missing the descriptions for the T and Code
fields (i.e. to which values they should be set) that the other figures
have, and while it's possible to infer these values from the draft, I
personally think it'd be handy to have them near the figure as well. :)

XV: The description is given right before the figure:
The Type field (T) is set to REQUEST.
The Code field is set to ADD.

Erroneous CellOptions in ADD request
-------------------------------------------------------
What happens if a node receives an ADD request where more than one link
type (RX, TX or SHARED) is set in the cellOptions field? I'd assume it
would send an error response, but of which type? RC_CELLLIST?

XV: A table indicating the meaning of the CellOptions for
ADD/DELETE/RELOCATE requests has been added.
Basically, the cells from the CellList are installed with the type specified
by the CellOptions. Some CellOptions values have no meaning, and RC_ERR is
returned in that case.

Handling CandidateCellList.size() < numCells in Relocate responses
------------------------------------------------------------------
Section 3.3.3 states that in a 6P RELOCATE request, "NumCandidate MUST be
larger or equal to NumCells", but doesn't specify how Node B should
react if it receives a Relocate request that violates this requirement.
Does it respond with an RC_CELLLIST error message?
If so, I'd propose to change the first sentences of the paragraph starting
with "Upon receiving the request" to something along the lines of:

"Upon receiving the request, Node B checks if the length of
candidateCellList is larger than or equal to NumCells. Node B's SF verifies
that all the cells in the Relocation CellList are indeed scheduled with
node A and are of the link type specified in the CellOptions field. If any
of these checks fail, node B MUST send a 6P Response to node A with return
code RC_CELLLIST. [...]"

XV: good point. We added that check.
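
The checks proposed above could be sketched as follows. This is only an
illustration: the function name, the argument layout and the representation
of link types as strings are assumptions, and the return code is spelled
RC_ERR_CELLLIST as in the clarified text quoted further down.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (slotOffset, channelOffset); assumed representation


def check_relocate_request(num_cells: int,
                           cell_options: str,
                           relocation_cells: List[Cell],
                           candidate_cells: List[Cell],
                           scheduled_with_a: Dict[Cell, str]) -> Optional[str]:
    """Return "RC_ERR_CELLLIST" if a check fails, otherwise None.

    `scheduled_with_a` maps each cell scheduled with node A to its link
    type (e.g. "TX"); this layout is hypothetical.
    """
    # NumCandidate MUST be larger than or equal to NumCells.
    if len(candidate_cells) < num_cells:
        return "RC_ERR_CELLLIST"
    # Every cell to relocate must be scheduled with A with the given type.
    for cell in relocation_cells:
        if scheduled_with_a.get(cell) != cell_options:
            return "RC_ERR_CELLLIST"
    return None
```

The sketch covers only the two checks named in the proposed text; how other
malformed requests are answered is not modelled.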

6P CLEAR Response format
----------------------------------------
1. If the value of SeqNum doesn't matter for CLEAR messages, why do they
have a SeqNum field nonetheless?
2. Section 3.2.2 says "The Code field contains a 6P Return Code when the 6P
message is of Type RESPONSE or CONFIRMATION." However, the 6P Return Codes
listed in Section 6.2.4. don't list an RC_CLEAR code. Should an RC_CLEAR
code be added to section 6.2.4. or should CLEAR response messages in
fact have CMD_CLEAR set in their code field? (For the record, the former
seems more intuitive to me personally)
Or am I misunderstanding something entirely?

XV: The SeqNum matters for all messages. CLEAR is the only command that
does not check it, because a CLEAR can be issued after an inconsistency has
been detected.
CLEAR is a command, not a return code. It is issued as a separate action,
for example after some inconsistency is detected.
We added a clarification to the CLEAR Request about the possible return
codes:
"The Response Code to a 6P CLEAR command SHOULD be RC_SUCCESS unless the
operation cannot be executed.
When the CLEAR operation cannot be executed the Response Code MUST be set
to RC_RESET."
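
A small sketch of a responder applying this rule, assuming the actual
clearing is done by a callback that raises an exception when it cannot be
executed. The exception type, the callback and the function name are
hypothetical.

```python
from typing import Callable


class ClearFailed(Exception):
    """Hypothetical error raised when the schedule cannot be cleared."""


def respond_to_clear(clear_cells: Callable[[str], None],
                     neighbor: str,
                     request_seqnum: int) -> str:
    """Return the Response Code for a 6P CLEAR request."""
    # request_seqnum is deliberately not compared with any stored value:
    # CLEAR is the one command that does not check the SeqNum.
    try:
        clear_cells(neighbor)
    except ClearFailed:
        return "RC_RESET"
    return "RC_SUCCESS"
```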

Handling RC_CELLLIST Response messages
----------------------------------------------------------------
The Draft states that a faulty RELOCATE message should be responded to with
a 6P response with RC_CELLLIST set.
However, the draft does not specify how to handle a RC_CELLLIST response.
I'm assuming that it means the same as RC_RESET or RC_NOALLOC: abort
transaction, don't add/delete anything. Is this correct? In any case,
I think this should be made clear in the draft.

XV: We clarified this in the draft. Thanks.
"In case the received Response Code is RC_ERR_CELLLIST, the transaction is
aborted and no cell is relocated."

Handling RC_RESET Response messages
------------------------------------------------------------
Section 3.4.3 says:
"If a node receives a 6P Request from a given neighbor before having sent
the 6P Response to the previous 6P Request from that neighbor, it MUST send
back a 6P Response with a return code of RC_RESET (as per Figure 36).  A
node receiving RC_RESET code MUST abort the transaction and consider it
never happened."

I'm assuming that the node sending the RC_RESET response discards all data
on this transaction after it has sent the Response, is this correct? If so,
I'd propose to state this explicitly.

XV: This text has been clarified to point out that the sender also
discards the second transaction.
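
One way to picture this on the responder side is sketched below. The class
and method names are made up for illustration, and leaving the first
(still unanswered) request untouched is an assumption of this sketch, not
something stated above.

```python
from typing import Dict, Optional


class Responder:
    """Tracks, per neighbor, a Request that has not been answered yet."""

    def __init__(self) -> None:
        self.pending: Dict[str, dict] = {}

    def on_request(self, neighbor: str, request: dict) -> Optional[str]:
        """Return "RC_RESET" for an overlapping Request, else None."""
        if neighbor in self.pending:
            # A second Request arrived before our Response was sent:
            # answer RC_RESET and keep no state for this second Request.
            return "RC_RESET"
        self.pending[neighbor] = request
        return None

    def on_response_sent(self, neighbor: str) -> None:
        """Forget the Request once its Response has been sent."""
        self.pending.pop(neighbor, None)
```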

Timeout management
-------------------------------
Section 3.4.4. explains how a timeout works and when it occurs, but not
what is supposed to happen when it occurs. From section 3.1.1.
I gathered that open transactions are aborted on timeout. It might seem
trivial, but I think it'd be handy to explicitly mention this in section
3.4.4.

XV: We added a clarification in section 3.4.4.
"When a timeout occurs the transaction MUST be cancelled at the node where
the timeout occurred."
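
As a rough sketch of this rule, a node could store a deadline for each open
transaction and cancel locally whatever has expired. The dictionary layout
(neighbor mapped to an absolute deadline) and the function name are
assumptions for illustration.

```python
from typing import Dict, List


def cancel_timed_out(transactions: Dict[str, float], now: float) -> List[str]:
    """Cancel every open transaction whose deadline has passed.

    `transactions` maps a neighbor to the deadline of the transaction open
    with it. Returns the neighbors whose transaction was cancelled.
    """
    expired = [n for n, deadline in transactions.items() if now >= deadline]
    for neighbor in expired:
        # Cancellation happens only at this node; the peer runs its own timer.
        del transactions[neighbor]
    return expired
```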

SeqNum maintenance
--------------------------------
Am I correct in assuming that the SeqNum is maintained as a shared SeqNum
on a per-link basis (as opposed to A and B maintaining an internal
SeqNum each, and keeping track of the others' seqNum as well)?
i.e. if Node A and Node B share a link which was created by an ADD request
from A to B, they commit to maintaining (i.e. increasing with every
transaction) the sequence number that A included in its initial ADD
request. This sequence number is valid only for the link between Nodes A
and B.
If Node A also has a link to a Node C, their (shared) Sequence Number
might be completely different.

XV: Yes, this is how it works. The SeqNum represents the transactions
between two particular nodes (e.g. A and B). A and C will have a different
one.
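
In code, this per-pair bookkeeping might look like the sketch below. The
class is hypothetical; starting at 0 follows the "freshly booted" case
discussed in this thread, while the exact moment of the increment and any
wrap-around of the counter are deliberately not modelled.

```python
from typing import Dict


class SeqNumTable:
    """One SeqNum per neighbor, independent of all other neighbors."""

    def __init__(self) -> None:
        self._seqnum: Dict[str, int] = {}

    def current(self, neighbor: str) -> int:
        """SeqNum for `neighbor`; 0 if no transaction has happened yet."""
        return self._seqnum.get(neighbor, 0)

    def transaction_done(self, neighbor: str) -> None:
        """Advance the SeqNum shared with `neighbor` by one."""
        self._seqnum[neighbor] = self.current(neighbor) + 1
```

After two transactions with B and one with C, the table holds 2 for B and
1 for C, and still reports 0 for a neighbor it has never talked to.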

Handling a Request with SeqNum == 0
----------------------------------------------------------------------------
I couldn't find any language explicitly specifying what to do when a SeqNum
of 0 is received (except for Fig. 30).
I'm assuming it is the following:

- assume the node has reset: cancel any half-open transactions, let SF
decide how to handle the situation
- if 3-step transactions are used and there's an active transaction: send
response with 6p, send response with return code RC_SEQNUM and SeqNum = 0
- if there's no active transaction: we might just be hearing from this new
node for the first time because it was just freshly booted & added to the
network (and thus its SeqNum is 0)

is this correct? If so, would it make sense to state something like this in
section 3.4.6?

XV: Thanks. Seeing how you understood it, we think the current description
is clear.


SeqNum == 0 (again)
------------------------------
What happens when
1. Node A sends a Request to Node B
2. Node A reboots (i.e. the sequence number is reset to 0)
3. Node A sends another Request to B

Does Node B recognize that A has reset and cancel the ongoing transaction
(as well as the whole "stale" link)?
Does it trigger the inconsistency handling of the SF?
Could this never occur because A should wait with sending any Request for
$timeout time so that all half-open transactions can expire?

XV: If this is a 2-way transaction, B sends the ack to A and commits the
transaction. Then A resets. In the next transaction, B expects A to have a
SeqNum different from 0, but it is 0. This is an inconsistency. The SF MUST
decide what to do: either CLEAR or remove the last installed cells from A.
If this is a 3-way transaction and B receives the Confirmation, the same as
before applies. If B does not receive the Confirmation message, B times out
and the transaction is cancelled. Upon receiving a SeqNum from A, B will
again detect a possible inconsistency.
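
The detection step on B's side can be sketched as below. The function name,
the action labels and the idea of passing the SF's decision as a callback
are assumptions made for illustration; the sketch only covers the case
described here, where B holds a non-zero SeqNum and A sends 0.

```python
from typing import Callable, Optional

ALLOWED_ACTIONS = ("CLEAR", "REMOVE_LAST_CELLS")  # hypothetical labels


def check_peer_reset(expected: int, received: int,
                     sf_policy: Callable[[], str]) -> Optional[str]:
    """Return the SF's chosen action if the peer appears to have reset."""
    if expected != 0 and received == 0:
        # B expected a non-zero SeqNum but got 0: inconsistency.
        action = sf_policy()
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unexpected SF action: {action}")
        return action
    # Not the reset case; other SeqNum mismatches are outside this sketch.
    return None
```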


Cell Relocation
----------------------
Section 3.3.3. says that NumCells MUST be >= 1. What happens if (for
example) the Relocation CellList contains 5 cells, but NumCells == 1?
Shouldn't it be that NumCells MUST be == length(Relocation CellList)?

XV: The draft states: "The Relocation CellList MUST contain exactly NumCells
entries."

Regarding the Bitbucket links in Appendix C
----------------------------------------------------------------
They don't seem to work for me, is this on purpose?

XV: This is a temporary changelog. Most of the issues have been closed or
even removed. It will be deleted at the end, so no worries.

With best regards,
Lotte Steenbrink


2018-02-28 13:56 GMT+01:00 Lotte Steenbrink <[email protected]>:



-- 
Dr. Xavier Vilajosana
Wireless Networks Lab

*Internet Interdisciplinary Institute (IN3)Professor*
(+34) 646 633 681
[email protected] <[email protected]>
http://xvilajosana.org
http://wine.rdi.uoc.edu
Parc Mediterrani de la Tecnologia
Av Carl Friedrich Gauss 5, B3 Building
08860 Castelldefels (Barcelona). Catalonia. Spain

_______________________________________________
6tisch mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/6tisch
