Re: [PATCH v2] ubusd: convert tx_queue to linked list

2021-03-29 Thread Arnout Vandecappelle
 Hi Felix,

On 25/03/2021 23:56, Felix Fietkau wrote:
[snip]

> The data structure changes that I pointed out are not a blocker for me.
> I do want to see at least a per-client message limit (in order to not
> regress on accidental DoS).

 I posted a v3 with exactly that at around the time you sent this mail.

> A global one would be nice too, but is not a must-have.

 That one wouldn't be too hard to add; it's just that I'm not entirely sure
about global variable access in the ubusd code. Is multithreading possible
there?


> I do agree with your proposal regarding lookup method rework,
> and wire format changes are acceptable too, as long as the libubus API
> doesn't change.

 I also took a look at that, but it was just a bit too daunting for me. Code-wise
it can't be too hard, but I don't understand enough about the wire format
and which parts in the code it affects to have confidence in not breaking
something somewhere else.

 Regards,
 Arnout

> 
> The wire format change should probably be something generic like adding
> support for having a single ubus message contain an array of data blobs
> (and having libubus call data_cb() for each element of that array).
> That way its use is not limited to lookups; it could also be used by
> other ubus clients for sending multiple messages with less overhead.
> The changes for this could probably be very simple without any major
> refactoring.
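
(Sketch, for illustration: how libubus might fan such a batched message out
to the existing data_cb. The batch framing itself is hypothetical; struct
ubus_request, data_cb and blob_for_each_attr() are the existing
libubus/libubox pieces.)

#include <libubox/blob.h>
#include <libubus.h>

/* Hypothetical dispatch step: when one message carries an array of data
 * blobs, call the regular data_cb once per array element. */
static void ubus_req_process_batch(struct ubus_request *req,
                                   struct blob_attr *batch)
{
	struct blob_attr *cur;
	size_t rem;

	blob_for_each_attr(cur, batch, rem) {
		if (req->data_cb)
			req->data_cb(req, UBUS_MSG_DATA, cur);
	}
}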


Re: [PATCH v2] ubusd: convert tx_queue to linked list

2021-03-25 Thread Felix Fietkau


On 2021-03-25 21:43, Arnout Vandecappelle wrote:
> 
> 
> On 25/03/2021 10:48, Felix Fietkau wrote:
>> 
>> On 2021-03-24 13:27, Arnout Vandecappelle (Essensium/Mind) wrote:
>>> ubusd maintains a per-client tx_queue containing references to message
>>> buffers that have not been sent yet (due to the socket blocking). This
>>> is a fixed-size, 64-element queue.
>>>
>>> When more than 64 elements are queued, subsequent elements are simply
>>> dropped. Thus, a client that is waiting for those messages will block
>>> indefinitely. In particular, this happens when more than roughly 250 objects
>>> are registered on the bus and either "ubus list" or "ubus wait_for" is
>>> called. The responses to these requests consist of a message buffer per
>>> object. Since in practice, ubusd will not yield between the sends of
>>> these message buffers, the client has no time to process them and
>>> eventually the output socket blocks. After 64 more objects, the rest is
>>> dropped, including the final message that indicates termination. Thus,
>>> the client waits indefinitely for the termination message.
>>>
>>> To solve this, turn the tx_queue into a variable-sized linked list
>>> instead of a fixed-size queue.
>> In order to reduce memory allocation overhead, I would propose the
>> following:
>> 
>> struct ubus_msg_backlog {
>>  struct ubus_msg_backlog *next;
>>  struct ubus_msg_buf *msg[UBUSD_CLIENT_BACKLOG];
>>  int tail;
>> };
> 
>  This saves space by open-coding a singly linked list rather than using the
> normal doubly linked list. This comes at the cost of iterating over the entire
> list in order to append to it.
> 
>  It also saves space by dropping the offset member. But for that, we don't 
> need
> to go to this extra structure.
> 
>  Applying those two "optimisations" to struct ubus_msg_buf_list would make that
> one 8 bytes (compared to 16 now). So that's 8 bytes per queued buffer, in
> addition to the 24 bytes of struct ubus_msg_buf and the actual buffer itself,
> which you can expect to be hundreds of bytes.
> 
>  struct ubus_msg_backlog is 24 bytes (assuming UBUSD_CLIENT_BACKLOG is 4),
> i.e. 6 bytes per element. So not much gained here. Increasing
> UBUSD_CLIENT_BACKLOG will decrease the overhead for a large backlog, but
> gives a much bigger overhead for a small backlog. So not a clear win.
UBUSD_CLIENT_BACKLOG is currently 32. I'm not so much counting bytes for
the optimization; I'm more interested in reducing the number of memory
allocations. This helps reduce memory allocator overhead and fragmentation.

>  Finally, the backlog should normally be empty. It's pretty unusual for a ubus
> message to take more than one write to transfer.
> 
>  In conclusion:
> - I think your proposal doesn't save much;
> - and it complicates things quite a bit;
> - in addition, the singly linked list potentially poses significant time
> overhead.
With a reasonable upper limit for the maximum number of messages and the
current UBUSD_CLIENT_BACKLOG of 32, the time overhead for walking the
pointers is actually insignificant. If it ever becomes significant, we
can simply add a tail pointer to struct ubus_client and still save space.
I also expect it to be less than the malloc overhead for lots of
single-message list entries.

>  If you want to save a few bytes, it does make sense to move the offset back 
> to
> struct ubus_client.
> 
>  If you *really* want to save space, and time as well, it would be better to
> optimise ubusd_handle_lookup. That currently sends a separate, relatively 
> small,
> message for every object. The overhead of the struct ubus_msg_buf dwarfs the
> overhead of struct ubus_msg_buf_list by quite a lot, and in addition there's
> overhead on the wire as well. It shouldn't be too hard to change 
> ubus_lookup_cb
> to handle a list rather than a single object.
> 
>  Maybe I should have gone down that path. I hadn't thought of it at the time.
I think that's a good idea.
>>> The txq_off variable was used to keep track of which part of the current
>>> message was already written, in case only a partial write was possible.
>>> We take this opportunity to also move that variable under the
>>> ubus_msg_buf_list structure (and rename it to "offset", and change its
>>> type to size_t). This makes it clearer that it is related to that
>>> particular queued message and not to the queue as a whole.
>>>
>>> Note that this infinite tx_queue opens the door to a DoS attack. You can
>>> open a client and a server connection, then send messages from the
>>> client to the server without ever reading anything on the server side.
>>> This will eventually lead to an out-of-memory. However, such a DoS
>>> already existed anyway, it just requires opening multiple server
>>> connections and filling up the fixed-size queue on each one. To protect
>>> against such DoS attacks, we'd need to:
>> I don't fully agree with your reasoning regarding the potential for DoS
>> attacks. It's true that the potential for *intentional* DoS attacks
>> exists in the current code already. What I'm worried about with this
>> patch is that you're adding extra potential for *accidental* DoS attacks
>> (i.e. excessive ubusd memory use for hanging processes receiving events).

Re: [PATCH v2] ubusd: convert tx_queue to linked list

2021-03-25 Thread Arnout Vandecappelle



On 25/03/2021 10:48, Felix Fietkau wrote:
> 
> On 2021-03-24 13:27, Arnout Vandecappelle (Essensium/Mind) wrote:
>> ubusd maintains a per-client tx_queue containing references to message
>> buffers that have not been sent yet (due to the socket blocking). This
>> is a fixed-size, 64-element queue.
>>
>> When more than 64 elements are queued, subsequent elements are simply
>> dropped. Thus, a client that is waiting for those messages will block
>> indefinitely. In particular, this happens when more than roughly 250 objects
>> are registered on the bus and either "ubus list" or "ubus wait_for" is
>> called. The responses to these requests consist of a message buffer per
>> object. Since in practice, ubusd will not yield between the sends of
>> these message buffers, the client has no time to process them and
>> eventually the output socket blocks. After 64 more objects, the rest is
>> dropped, including the final message that indicates termination. Thus,
>> the client waits indefinitely for the termination message.
>>
>> To solve this, turn the tx_queue into a variable-sized linked list
>> instead of a fixed-size queue.
> In order to reduce memory allocation overhead, I would propose the
> following:
> 
> struct ubus_msg_backlog {
>   struct ubus_msg_backlog *next;
>   struct ubus_msg_buf *msg[UBUSD_CLIENT_BACKLOG];
>   int tail;
> };

 This saves space by open-coding a singly linked list rather than using the
normal doubly linked list. This comes at the cost of iterating over the entire
list in order to append to it.

 It also saves space by dropping the offset member. But for that, we don't need
to go to this extra structure.

 Applying those two "optimisations" to struct ubus_msg_buf_list would make that
one 8 bytes (compared to 16 now). So that's 8 bytes per queued buffer, in
addition to the 24 bytes of struct ubus_msg_buf and the actual buffer itself,
which you can expect to be hundreds of bytes.

 struct ubus_msg_backlog is 24 bytes (assuming UBUSD_CLIENT_BACKLOG is 4), i.e.
6 bytes per element. So not much gained here. Increasing
UBUSD_CLIENT_BACKLOG will decrease the overhead for a large backlog, but gives a
much bigger overhead for a small backlog. So not a clear win.
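
(For reference, the byte counts above assume a 32-bit target with 4-byte
pointers, as is typical for OpenWrt devices; an annotated sketch of the two
layouts being compared:)

#include <libubox/list.h>

struct ubus_msg_buf;                    /* refcounted buffer, 24 bytes  */

struct ubus_msg_buf_list {              /* as in patch v2: 16 bytes     */
	struct list_head list;          /*  8: prev + next pointers     */
	struct ubus_msg_buf *msg;       /*  4                           */
	size_t offset;                  /*  4                           */
};

struct ubus_msg_backlog {               /* proposal: 24 bytes           */
	struct ubus_msg_backlog *next;  /*  4: singly linked            */
	struct ubus_msg_buf *msg[4];    /* 16: UBUSD_CLIENT_BACKLOG = 4 */
	int tail;                       /*  4                           */
};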

 Finally, the backlog should normally be empty. It's pretty unusual for a ubus
message to take more than one write to transfer.

 In conclusion:
- I think your proposal doesn't save much;
- and it complicates things quite a bit;
- in addition, the singly linked list potentially poses significant time
overhead.

 If you want to save a few bytes, it does make sense to move the offset back to
struct ubus_client.

 If you *really* want to save space, and time as well, it would be better to
optimise ubusd_handle_lookup. That currently sends a separate, relatively small,
message for every object. The overhead of the struct ubus_msg_buf dwarfs the
overhead of struct ubus_msg_buf_list by quite a lot, and in addition there's
overhead on the wire as well. It shouldn't be too hard to change ubus_lookup_cb
to handle a list rather than a single object.
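
(A rough sketch of that direction using stock libubox blobmsg calls; the
function name, the "objects" attribute and the batch size are made up for
illustration, not taken from ubusd:)

#include <libubox/blobmsg.h>

#define LOOKUP_BATCH 32	/* hypothetical: object records per reply message */

/* Build one reply blob holding up to LOOKUP_BATCH object records, so that
 * ubusd emits one ubus_msg_buf per batch instead of one per object. The
 * caller would blob_buf_init() buf beforehand and wrap/send it once after. */
static void lookup_fill_batch(struct blob_buf *buf,
                              const char *const *paths, size_t n)
{
	void *list = blobmsg_open_array(buf, "objects");
	size_t i;

	for (i = 0; i < n && i < LOOKUP_BATCH; i++) {
		void *tbl = blobmsg_open_table(buf, NULL);
		blobmsg_add_string(buf, "path", paths[i]);
		blobmsg_close_table(buf, tbl);
	}
	blobmsg_close_array(buf, list);
}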

 Maybe I should have gone down that path. I hadn't thought of it at the time.


> 
> To struct ubus_client add these:
> 
>   struct ubus_msg_backlog *backlog;
>   int backlog_head;
> 
> After sending messages from tx_queue, you pull in messages from
> cl->backlog, incrementing backlog_head.
> Once cl->backlog_head == backlog->tail, you set cl->backlog to
> backlog->next and free backlog.
> 
>> To maintain the linked list, an additional structure ubus_msg_buf_list
>> is created. We could also have added the linked list to the ubus_msg_buf
>> struct itself, but it is not relevant in the many other uses of the
>> ubus_msg_buf struct so it would just complicate things.
> Adding the linked list to ubus_msg_buf doesn't work, because a single
> message can be queued for multiple receivers. This mistake was already
> made by previous attempts at introducing a linked list for messages.

 Right, ubus_msg_free doesn't actually free it; it decreases the refcount. I
forgot that. Good thing I didn't implement it that way then :-)

>> The txq_off variable was used to keep track of which part of the current
>> message was already written, in case only a partial write was possible.
>> We take this opportunity to also move that variable under the
>> ubus_msg_buf_list structure (and rename it to "offset", and change its
>> type to size_t). This makes it clearer that it is related to that
>> particular queued message and not to the queue as a whole.
>>
>> Note that this infinite tx_queue opens the door to a DoS attack. You can
>> open a client and a server connection, then send messages from the
>> client to the server without ever reading anything on the server side.
>> This will eventually lead to an out-of-memory. However, such a DoS
>> already existed anyway, it just requires opening multiple server
>> connections and filling up the fixed-size queue on each one. To protect
>> against such DoS attacks, we'd need to:
>> - keep a global maximum queue size that applies to all rx and tx queues
>>   together;
>> - stop reading from any connection when the maximum is reached;
>> - close any connection when it hasn't become writeable after some
>>   timeout.

Re: [PATCH v2] ubusd: convert tx_queue to linked list

2021-03-25 Thread Felix Fietkau


On 2021-03-24 13:27, Arnout Vandecappelle (Essensium/Mind) wrote:
> ubusd maintains a per-client tx_queue containing references to message
> buffers that have not been sent yet (due to the socket blocking). This
> is a fixed-size, 64-element queue.
> 
> When more than 64 elements are queued, subsequent elements are simply
> dropped. Thus, a client that is waiting for those messages will block
> indefinitely. In particular, this happens when more than roughly 250 objects
> are registered on the bus and either "ubus list" or "ubus wait_for" is
> called. The responses to these requests consist of a message buffer per
> object. Since in practice, ubusd will not yield between the sends of
> these message buffers, the client has no time to process them and
> eventually the output socket blocks. After 64 more objects, the rest is
> dropped, including the final message that indicates termination. Thus,
> the client waits indefinitely for the termination message.
> 
> To solve this, turn the tx_queue into a variable-sized linked list
> instead of a fixed-size queue.
In order to reduce memory allocation overhead, I would propose the
following:

struct ubus_msg_backlog {
	struct ubus_msg_backlog *next;
	struct ubus_msg_buf *msg[UBUSD_CLIENT_BACKLOG];
	int tail;
};

To struct ubus_client add these:

	struct ubus_msg_backlog *backlog;
	int backlog_head;

After sending messages from tx_queue, you pull in messages from
cl->backlog, incrementing backlog_head.
Once cl->backlog_head == backlog->tail, you set cl->backlog to
backlog->next and free backlog.
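
(Sketch of the consumption step described above. The structures follow the
proposal; backlog_pull() itself is made up for illustration:)

#include <stdlib.h>

#define UBUSD_CLIENT_BACKLOG 32

struct ubus_msg_buf;			/* refcounted, as in ubusd.h */

struct ubus_msg_backlog {
	struct ubus_msg_backlog *next;
	struct ubus_msg_buf *msg[UBUSD_CLIENT_BACKLOG];
	int tail;
};

struct ubus_client {			/* only the proposed fields */
	struct ubus_msg_backlog *backlog;
	int backlog_head;
};

/* Pop the oldest queued message; once cl->backlog_head reaches the chunk's
 * tail, advance to the next chunk and free the drained one. */
static struct ubus_msg_buf *backlog_pull(struct ubus_client *cl)
{
	struct ubus_msg_backlog *bl = cl->backlog;
	struct ubus_msg_buf *ub;

	if (!bl)
		return NULL;

	ub = bl->msg[cl->backlog_head++];
	if (cl->backlog_head == bl->tail) {
		cl->backlog = bl->next;
		cl->backlog_head = 0;
		free(bl);
	}
	return ub;
}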

> To maintain the linked list, an additional structure ubus_msg_buf_list
> is created. We could also have added the linked list to the ubus_msg_buf
> struct itself, but it is not relevant in the many other uses of the
> ubus_msg_buf struct so it would just complicate things.
Adding the linked list to ubus_msg_buf doesn't work, because a single
message can be queued for multiple receivers. This mistake was already
made by previous attempts at introducing a linked list for messages.

> The txq_off variable was used to keep track of which part of the current
> message was already written, in case only a partial write was possible.
> We take this opportunity to also move that variable under the
> ubus_msg_buf_list structure (and rename it to "offset", and change its
> type to size_t). This makes it clearer that it is related to that
> particular queued message and not to the queue as a whole.
> 
> Note that this infinite tx_queue opens the door to a DoS attack. You can
> open a client and a server connection, then send messages from the
> client to the server without ever reading anything on the server side.
> This will eventually lead to an out-of-memory. However, such a DoS
> already existed anyway, it just requires opening multiple server
> connections and filling up the fixed-size queue on each one. To protect
> against such DoS attacks, we'd need to:
I don't fully agree with your reasoning regarding the potential for DoS
attacks. It's true that the potential for *intentional* DoS attacks
exists in the current code already. What I'm worried about with this
patch is that you're adding extra potential for *accidental* DoS attacks
(i.e. excessive ubusd memory use for hanging processes receiving events).

> - keep a global maximum queue size that applies to all rx and tx queues
>   together;
> - stop reading from any connection when the maximum is reached;
> - close any connection when it hasn't become writeable after some
>   timeout.
I think we should have both a local and a global maximum queue size.
Other than that, I agree with the above.

- Felix
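
(Illustrative sketch of what enqueue-time checks for both limits could look
like; the names and values are hypothetical, not from any posted patch:)

#include <stdbool.h>

#define UBUSD_MAX_QUEUE_LEN		64	/* per-client cap */
#define UBUSD_MAX_TOTAL_QUEUE_LEN	1024	/* daemon-wide cap */

/* ubusd is a single-threaded uloop process, so a plain counter, bumped on
 * enqueue and decremented on dequeue, is enough for the global limit. */
static unsigned int total_queued;

static bool ubus_msg_may_enqueue(unsigned int client_queued)
{
	if (client_queued >= UBUSD_MAX_QUEUE_LEN)
		return false;	/* per-client limit: drop or disconnect */
	if (total_queued >= UBUSD_MAX_TOTAL_QUEUE_LEN)
		return false;	/* global limit: stop reading, back-pressure */
	return true;
}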



[PATCH v2] ubusd: convert tx_queue to linked list

2021-03-24 Thread Arnout Vandecappelle (Essensium/Mind)
ubusd maintains a per-client tx_queue containing references to message
buffers that have not been sent yet (due to the socket blocking). This
is a fixed-size, 64-element queue.

When more than 64 elements are queued, subsequent elements are simply
dropped. Thus, a client that is waiting for those messages will block
indefinitely. In particular, this happens when more than roughly 250 objects
are registered on the bus and either "ubus list" or "ubus wait_for" is
called. The responses to these requests consist of a message buffer per
object. Since in practice, ubusd will not yield between the sends of
these message buffers, the client has no time to process them and
eventually the output socket blocks. After 64 more objects, the rest is
dropped, including the final message that indicates termination. Thus,
the client waits indefinitely for the termination message.

To solve this, turn the tx_queue into a variable-sized linked list
instead of a fixed-size queue.

To maintain the linked list, an additional structure ubus_msg_buf_list
is created. We could also have added the linked list to the ubus_msg_buf
struct itself, but it is not relevant in the many other uses of the
ubus_msg_buf struct so it would just complicate things.

The txq_off variable was used to keep track of which part of the current
message was already written, in case only a partial write was possible.
We take this opportunity to also move that variable under the
ubus_msg_buf_list structure (and rename it to "offset", and change its
type to size_t). This makes it clearer that it is related to that
particular queued message and not to the queue as a whole.

Note that this infinite tx_queue opens the door to a DoS attack. You can
open a client and a server connection, then send messages from the
client to the server without ever reading anything on the server side.
This will eventually lead to an out-of-memory. However, such a DoS
already existed anyway, it just requires opening multiple server
connections and filling up the fixed-size queue on each one. To protect
against such DoS attacks, we'd need to:
- keep a global maximum queue size that applies to all rx and tx queues
  together;
- stop reading from any connection when the maximum is reached;
- close any connection when it hasn't become writeable after some
  timeout.

Fixes: https://bugs.openwrt.org/index.php?do=details&task_id=1525

Signed-off-by: Arnout Vandecappelle (Essensium/Mind) 
---
v2: workarounds for clang static analysis false positives:
 - use list_for_each_safe instead of a while loop;
 - hide the free() by moving it to ubusd.c.
---
 ubusd.c       | 30 +-
 ubusd.h       | 11 ---
 ubusd_main.c  | 38 +++---
 ubusd_proto.c |  1 +
 4 files changed, 41 insertions(+), 39 deletions(-)

diff --git a/ubusd.c b/ubusd.c
index 5993653..f8e33f8 100644
--- a/ubusd.c
+++ b/ubusd.c
@@ -133,24 +133,38 @@ ssize_t ubus_msg_writev(int fd, struct ubus_msg_buf *ub, size_t offset)
return ret;
 }
 
-static void ubus_msg_enqueue(struct ubus_client *cl, struct ubus_msg_buf *ub)
+void ubus_msg_list_free(struct ubus_msg_buf_list *ubl)
 {
-   if (cl->tx_queue[cl->txq_tail])
+   list_del_init(&ubl->list);
+   ubus_msg_free(ubl->msg);
+   free(ubl);
+}
+
+static void ubus_msg_enqueue(struct ubus_client *cl, struct ubus_msg_buf *ub,
+                             size_t offset)
+{
+   struct ubus_msg_buf_list *ubl;
+
+   ubl = calloc(1, sizeof(struct ubus_msg_buf_list));
+   if (!ubl)
return;
 
-   cl->tx_queue[cl->txq_tail] = ubus_msg_ref(ub);
-   cl->txq_tail = (cl->txq_tail + 1) % ARRAY_SIZE(cl->tx_queue);
+   INIT_LIST_HEAD(&ubl->list);
+   ubl->msg = ubus_msg_ref(ub);
+   ubl->offset = offset;
+
+   list_add_tail(&ubl->list, &cl->tx_queue);
 }
 
 /* takes the msgbuf reference */
 void ubus_msg_send(struct ubus_client *cl, struct ubus_msg_buf *ub)
 {
-   ssize_t written;
+   ssize_t written = 0;
 
if (ub->hdr.type != UBUS_MSG_MONITOR)
ubusd_monitor_message(cl, ub, true);
 
-   if (!cl->tx_queue[cl->txq_cur]) {
+   if (list_empty(&cl->tx_queue)) {
written = ubus_msg_writev(cl->sock.fd, ub, 0);
 
if (written < 0)
@@ -159,10 +173,8 @@ void ubus_msg_send(struct ubus_client *cl, struct ubus_msg_buf *ub)
if (written >= (ssize_t) (ub->len + sizeof(ub->hdr)))
return;
 
-   cl->txq_ofs = written;
-
/* get an event once we can write to the socket again */
uloop_fd_add(&cl->sock, ULOOP_READ | ULOOP_WRITE | ULOOP_EDGE_TRIGGER);
}
-   ubus_msg_enqueue(cl, ub);
+   ubus_msg_enqueue(cl, ub, written);
 }
diff --git a/ubusd.h b/ubusd.h
index 923e43d..e20e55a 100644
--- a/ubusd.h
+++ b/ubusd.h
@@ -23,7 +23,6 @@
 #include "ubusmsg.h"
 #include "ubusd_acl.h"
 
-#define UBUSD_CLIENT_BACKLOG   32
 #define UBUS_OBJ_HASH_BITS 4
 
 extern struct