Re: Proposed changes to Nimble host
0 - 63: Core event types (TIMER, MQUEUE_DATA, etc.)
64+: Per-task event types.

So, the options for the host package are:

1. Reserve new core event IDs. This avoids conflicts, but permanently uses
   up a limited resource.
2. Use arbitrary per-task event IDs. This has the potential for conflicts,
   and doesn't strike me as a particularly good solution.
3. Use a separate host task. This allows the host to use IDs in the
   per-task ID space without the risk of conflict.
4. Leverage existing core events. This is what I proposed. It avoids
   conflicts and doesn't require any new event IDs, but it does feel a bit
   hacky to use the TIMER event ID for something that isn't a timer.

What should use these core events? I think reserving events is fine. The
other option is to reserve a generic "trampoline" event, which is basically
like a callout in that you have a pointer to a function and an argument,
and any package can repost one to a task. I think we should have this
regardless, but I'm not opposed to burning events for something as critical
as the networking stack, for example.

Sterling
Re: Proposed changes to Nimble host
Yeah, I can see why you chose OS_EVENT_TIMER. It is almost like we should
rename that event type :-) But I agree with everything you say below;
creating a new event type for this seems wasteful.

I am not quite sure what you mean by "My concern there is that applications
may want to add special handling for certain event types…". Are you
referring to the events that a package may require of an application?

Anyway, solving this generically is definitely what we need to do.

Will

> On Apr 18, 2016, at 10:06 AM, Christopher Collins wrote:
>
> On Mon, Apr 18, 2016 at 09:43:35AM -0700, Christopher Collins wrote:
>> On Mon, Apr 18, 2016 at 09:18:16AM -0700, will sanfilippo wrote:
>>> For #2, my only “concerns” (if you could call them such) are:
>>> * Using OS_EVENT_TIMER as opposed to some other event. Should all
>>>   OS_EVENT_TIMER events be caused by a timer? Probably no big deal…
>>>   What events are going to be processed here? Do you envision many
>>>   host events?
>>
>> Yes, I agree. I think a more appropriate event type would be
>> OS_EVENT_CALLBACK or similar. I am a bit leery about adding a new OS
>> event type for this case, because it would require all applications to
>> handle an extra event type without any practical benefit. Perhaps
>> mynewt could relieve this burden with an "os_handle_event()" function
>> which processes these generic events. My concern there is that
>> applications may want to add special handling for certain event types,
>> so they wouldn't want to call the helper function anyway.
>>
>> The OS events that the host would generate are:
>> * Incoming ACL data packets.
>> * Incoming HCI events.
>> * Expired timers.
>
> (I meant "process", not "generate"!)
>
> Oops... I went down a rabbit hole and forgot to address the main point
> :). What we would *really* want here is something like:
> * BLE_HS_EVENT_ACL_DATA_IN
> * BLE_HS_EVENT_HCI_EVENT_IN
>
> However, the issue here is that the event type IDs are defined in a
> single "number-space".
> If the host package reserves IDs for its own events, then no other
> packages can use those IDs for their own events without a conflict.
> The 8-bit ID space is divided into two parts:
>
> 0 - 63: Core event types (TIMER, MQUEUE_DATA, etc.)
> 64+: Per-task event types.
>
> So, the options for the host package are:
> 1. Reserve new core event IDs. This avoids conflicts, but permanently
>    uses up a limited resource.
> 2. Use arbitrary per-task event IDs. This has the potential for
>    conflicts, and doesn't strike me as a particularly good solution.
> 3. Use a separate host task. This allows the host to use IDs in the
>    per-task ID space without the risk of conflict.
> 4. Leverage existing core events. This is what I proposed. It avoids
>    conflicts and doesn't require any new event IDs, but it does feel a
>    bit hacky to use the TIMER event ID for something that isn't a timer.
>
> I think this might be a common problem for other packages in the future.
> I don't think it is that unusual for a package to not create its own
> task, but still have the need to generate OS events. So perhaps we
> should think about how to solve this general problem.
>
> Chris
Re: Proposed changes to Nimble host
On Mon, Apr 18, 2016 at 09:18:16AM -0700, will sanfilippo wrote:
> For #2, my only “concerns” (if you could call them such) are:
> * Using OS_EVENT_TIMER as opposed to some other event. Should all
>   OS_EVENT_TIMER events be caused by a timer? Probably no big deal… What
>   events are going to be processed here? Do you envision many host
>   events?

Yes, I agree. I think a more appropriate event type would be
OS_EVENT_CALLBACK or similar. I am a bit leery about adding a new OS event
type for this case, because it would require all applications to handle an
extra event type without any practical benefit. Perhaps mynewt could
relieve this burden with an "os_handle_event()" function which processes
these generic events. My concern there is that applications may want to
add special handling for certain event types, so they wouldn't want to
call the helper function anyway.

The OS events that the host would generate are:
* Incoming ACL data packets.
* Incoming HCI events.
* Expired timers.

> * I wonder about the complexity of this from an application developer's
>   standpoint. Not saying that what you propose would be more or less
>   complex; just something we should consider when making these changes.

I think the taskless design reduces complexity for the application
developer. If there is no host task, the developer can worry less about
task priorities and stack sizes.

> On a side note (I guess it is related), we should consider how
> applications are going to initialize the host and/or the controller in
> regards to system memory requirements (i.e. mbufs). While our current
> methodology to create a BLE app is not rocket science, I think we could
> make it a bit simpler.

Yes, definitely. As you say, the setup is not terribly complicated, but it
does involve a fair number of steps, so it will seem complicated to
someone not familiar with Mynewt.

Chris
Re: Proposed changes to Nimble host
All sounds excellent!

+1 for #1. That only seems like a good thing.

For #2, my only “concerns” (if you could call them such) are:
* Using OS_EVENT_TIMER as opposed to some other event. Should all
  OS_EVENT_TIMER events be caused by a timer? Probably no big deal… What
  events are going to be processed here? Do you envision many host events?
* I wonder about the complexity of this from an application developer's
  standpoint. Not saying that what you propose would be more or less
  complex; just something we should consider when making these changes.

On a side note (I guess it is related), we should consider how
applications are going to initialize the host and/or the controller in
regards to system memory requirements (i.e. mbufs). While our current
methodology to create a BLE app is not rocket science, I think we could
make it a bit simpler.

> On Apr 17, 2016, at 3:57 PM, Christopher Collins wrote:
>
> Hello all,
>
> The Mynewt BLE stack is called Nimble. Nimble consists of two packages:
> * Controller (link-layer) [net/nimble/controller]
> * Host (upper layers) [net/nimble/host]
>
> This email concerns the Nimble host.
>
> As I indicated in an email a few weeks ago, the code size of the Nimble
> host had increased beyond what I considered a reasonable level. When
> built for the ARM cortex-M4, with security enabled and the log level set
> to INFO, the host code size was about 48 kB. In recent days, I came up
> with a few ideas for reducing the host code size. As I explored these
> ideas, I realized that they open the door for some major improvements in
> the fundamental design of the host. Making these changes would
> introduce some backwards-compatibility issues, but I believe it is
> absolutely the right thing to do. If we do this, it needs to be done
> now while Mynewt is still in its beta phase. I have convinced myself
> that this is the right way forward; now I would like to see what the
> community thinks. As always, all feedback is greatly appreciated.
>
> There are two major changes that I am proposing:
>
> 1. All HCI command/acknowledgement exchanges are blocking.
>
> Background: The host and controller communicate with one another via the
> host-controller-interface (HCI) protocol. The host sends _commands_ to
> the controller; the controller sends _events_ to the host. Whenever the
> controller receives a command from the host, it immediately responds
> with an acknowledgement event. In addition, the controller also sends
> unsolicited events to the host to indicate state changes or to request
> information in a subsequent command.
>
> In the current host, all HCI commands are sent asynchronously
> (non-blocking). When the host wants to send an HCI command, it
> schedules a transmit operation by putting an OS event on its own event
> queue. The event points to a callback which does the actual HCI
> transmission. The callback also configures a second callback to be
> executed when the expected acknowledgement is received from the
> controller. Each time the host receives an HCI event from the
> controller, an OS event is put on the host's event queue. Processing of
> this OS event ultimately calls the configured callback (if it is an
> acknowledgement), or a hardcoded callback (if it is an unsolicited HCI
> event).
>
> This design works, but it introduces a number of problems. First, it
> requires the host code to maintain some quite complex state machines for
> what seem like simple HCI exchanges. This FSM machinery translates into
> a lot of extra code. There is also a lot of ugliness involved in
> canceling scheduled HCI transmits.
>
> Another complication with non-blocking HCI commands is that they require
> the host to jump through a lot of hoops to provide feedback to the
> application. Since all the work is done in parallel by the host task,
> the host has to notify the application of failures by executing
> callbacks configured by the application.
> I did not want to place any restrictions on what the application is
> allowed to do during these callbacks, which means the host has to ensure
> that it is in a valid state whenever a callback gets executed (no
> mutexes are locked, for example). This requires the code to use a large
> number of mutexes and temporary copies of host data structures,
> resulting in a lot of complicated code.
>
> Finally, non-blocking HCI operations complicate the API presented to
> the application. A single return code from a blocking operation is
> easier to manage than a return code plus the possibility of a callback
> being executed sometime in the future from a different task. A blocking
> operation collapses several failure scenarios into a single function
> return.
>
> Making HCI command/acknowledgement exchanges blocking addresses all of
> the above issues:
> * FSM machinery goes away; controller response is indicated in the
>   return code of the HCI send function.
> *
Proposed changes to Nimble host
Hello all,

The Mynewt BLE stack is called Nimble. Nimble consists of two packages:
* Controller (link-layer) [net/nimble/controller]
* Host (upper layers) [net/nimble/host]

This email concerns the Nimble host.

As I indicated in an email a few weeks ago, the code size of the Nimble
host had increased beyond what I considered a reasonable level. When built
for the ARM cortex-M4, with security enabled and the log level set to
INFO, the host code size was about 48 kB. In recent days, I came up with a
few ideas for reducing the host code size. As I explored these ideas, I
realized that they open the door for some major improvements in the
fundamental design of the host. Making these changes would introduce some
backwards-compatibility issues, but I believe it is absolutely the right
thing to do. If we do this, it needs to be done now while Mynewt is still
in its beta phase. I have convinced myself that this is the right way
forward; now I would like to see what the community thinks. As always, all
feedback is greatly appreciated.

There are two major changes that I am proposing:

1. All HCI command/acknowledgement exchanges are blocking.

Background: The host and controller communicate with one another via the
host-controller-interface (HCI) protocol. The host sends _commands_ to the
controller; the controller sends _events_ to the host. Whenever the
controller receives a command from the host, it immediately responds with
an acknowledgement event. In addition, the controller also sends
unsolicited events to the host to indicate state changes or to request
information in a subsequent command.

In the current host, all HCI commands are sent asynchronously
(non-blocking). When the host wants to send an HCI command, it schedules a
transmit operation by putting an OS event on its own event queue. The
event points to a callback which does the actual HCI transmission.
The callback also configures a second callback to be executed when the
expected acknowledgement is received from the controller. Each time the
host receives an HCI event from the controller, an OS event is put on the
host's event queue. Processing of this OS event ultimately calls the
configured callback (if it is an acknowledgement), or a hardcoded callback
(if it is an unsolicited HCI event).

This design works, but it introduces a number of problems. First, it
requires the host code to maintain some quite complex state machines for
what seem like simple HCI exchanges. This FSM machinery translates into a
lot of extra code. There is also a lot of ugliness involved in canceling
scheduled HCI transmits.

Another complication with non-blocking HCI commands is that they require
the host to jump through a lot of hoops to provide feedback to the
application. Since all the work is done in parallel by the host task, the
host has to notify the application of failures by executing callbacks
configured by the application. I did not want to place any restrictions on
what the application is allowed to do during these callbacks, which means
the host has to ensure that it is in a valid state whenever a callback
gets executed (no mutexes are locked, for example). This requires the code
to use a large number of mutexes and temporary copies of host data
structures, resulting in a lot of complicated code.

Finally, non-blocking HCI operations complicate the API presented to the
application. A single return code from a blocking operation is easier to
manage than a return code plus the possibility of a callback being
executed sometime in the future from a different task. A blocking
operation collapses several failure scenarios into a single function
return.

Making HCI command/acknowledgement exchanges blocking addresses all of the
above issues:
* FSM machinery goes away; controller response is indicated in the return
  code of the HCI send function.
* Nearly all HCI failures are indicated to the application immediately, so
  there is no need for lots of mutexes and temporary copies of data
  structures.
* API is simplified; operation results are indicated via a simple function
  return code.

2. The Nimble host is "taskless"

Currently the Nimble host runs in its own OS task. This is not necessarily
a bad thing, but in the case of the host, I think the costs outweigh the
benefits. I can think of three benefits to running a library in its own
task:
* Guarantee that timing requirements are met; just configure the task with
  an appropriate priority.
* (related to the above point) The library task can continue to work while
  the application task is blocked.
* Facilitates stack sizing. Since the library performs its operations in
  its own stack, it is easier to predict stack usage of both the library
  task and the application task.

I don't think any of these benefits are very compelling in the case of the
Nimble host for the following reasons:
* The host has nothing