[hackers] [quark][PATCH] Fix overflow when calling strtonum in parse_range

2020-10-31 Thread José Miguel Sánchez García
The value passed as maxval, SIZE_MAX, doesn't fit on a long long int due
to signedness. It was causing legitimate range request to be discarded
as bad.

I tested it serving an mp4 and opening it with Firefox. A "range=0-" was
requested, and it triggered the bug.
---
 http.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/http.c b/http.c
index 1862dc4..d7b04e9 100644
--- a/http.c
+++ b/http.c
@@ -478,10 +478,10 @@ parse_range(const char *str, size_t size, size_t *lower, 
size_t *upper)
 * last byte if 'last' is not given),
 * inclusively, and byte-numbering beginning at 0
 */
-   *lower = strtonum(first, 0, SIZE_MAX, );
+   *lower = strtonum(first, 0, LLONG_MAX, );
if (!err) {
if (last[0] != '\0') {
-   *upper = strtonum(last, 0, SIZE_MAX, );
+   *upper = strtonum(last, 0, LLONG_MAX, );
} else {
*upper = size - 1;
}
@@ -513,7 +513,7 @@ parse_range(const char *str, size_t size, size_t *lower, 
size_t *upper)
 * use upper as a temporary storage for 'num',
 * as we know 'upper' is size - 1
 */
-   *upper = strtonum(last, 0, SIZE_MAX, );
+   *upper = strtonum(last, 0, LLONG_MAX, );
if (err) {
return S_BAD_REQUEST;
}
-- 
2.29.2




Re: [hackers] [quark][PATCH] Fix overflow when calling strtonum in parse_range

2020-10-31 Thread Laslo Hunhold
On Sat, 31 Oct 2020 21:58:26 +
José Miguel Sánchez García  wrote:

Dear José,

> The value passed as maxval, SIZE_MAX, doesn't fit on a long long int
> due to signedness. It was causing legitimate range request to be
> discarded as bad.
> 
> I tested it serving an mp4 and opening it with Firefox. A "range=0-"
> was requested, and it triggered the bug.

this is a great catch, thanks! But wouldn't it be better to use
MIN(SIZE_MAX, LLONG_MAX)?

I haven't found anything in the standard that puts "long long" and
"size_t" into any relation, which means, for me, that any case is
possible where either value could be larger, but please correct me if
I'm wrong.

With best regards

Laslo



[hackers] [quark] Use epoll/kqueue and worker threads to handle connections || Laslo Hunhold

2020-10-31 Thread git
commit dff98c0bcaef7be220c563ebaebd66f8c6704197
Author: Laslo Hunhold 
AuthorDate: Sun Nov 1 00:27:46 2020 +0100
Commit: Laslo Hunhold 
CommitDate: Sun Nov 1 00:27:46 2020 +0100

Use epoll/kqueue and worker threads to handle connections

This adds quite a bit of code, but is the culmination of the previous
restructurizations. Each worker thread has a connection pool and the
interesting part is that it's 100% nonblocking. If reading or writing
blocks at any point, the worker thread can just drop it and continue
with something else. This is especially powerful against attacks like
slow loris, which cannot be caught with a forking-model and could easily
be used in a DoS against a quark instance.

There are no memory allocations at runtime, unless you use dirlistings,
whose libc-allocations you can't work around.

In case the connection pool is exhausted due to a lot of slow lorises,
we still hit a DoS, but at least it can now be possible to assess the
connection pool and just drop another connection that can be
heuristically assessed as a "malicious" one (e.g. many connections from
one client, long time in one state or something using a monotonic
clock).

Given we still sadly don't have kqueue in linux, which is 1000x times
better than epoll, which is deeply flawed, I wrote a very thin wrapper
in queue.{c,h} which exposes the necessary functions in a common
interface.

Signed-off-by: Laslo Hunhold 

diff --git a/Makefile b/Makefile
index 548e6aa..da0e458 100644
--- a/Makefile
+++ b/Makefile
@@ -4,13 +4,13 @@
 
 include config.mk
 
-COMPONENTS = data http sock util
+COMPONENTS = data http queue sock util
 
 all: quark
 
 data.o: data.c data.h http.h util.h config.mk
 http.o: http.c config.h http.h util.h config.mk
-main.o: main.c arg.h data.h http.h sock.h util.h config.mk
+main.o: main.c arg.h data.h http.h queue.h sock.h util.h config.mk
 sock.o: sock.c sock.h util.h config.mk
 util.o: util.c util.h config.mk
 
diff --git a/config.mk b/config.mk
index 7056241..cedb3e7 100644
--- a/config.mk
+++ b/config.mk
@@ -10,7 +10,7 @@ MANPREFIX = $(PREFIX)/share/man
 # flags
 CPPFLAGS = -DVERSION=\"$(VERSION)\" -D_DEFAULT_SOURCE -D_XOPEN_SOURCE=700 
-D_BSD_SOURCE
 CFLAGS   = -std=c99 -pedantic -Wall -Wextra -Os
-LDFLAGS  = -s
+LDFLAGS  = -lpthread -s
 
 # compiler and linker
 CC = cc
diff --git a/main.c b/main.c
index d64774b..e26cc77 100644
--- a/main.c
+++ b/main.c
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -19,6 +20,7 @@
 #include "arg.h"
 #include "data.h"
 #include "http.h"
+#include "queue.h"
 #include "sock.h"
 #include "util.h"
 
@@ -48,16 +50,20 @@ logmsg(const struct connection *c)
 }
 
 static void
-serve(struct connection *c, const struct server *srv)
+close_connection(struct connection *c)
+{
+   if (c != NULL) {
+   close(c->fd);
+   memset(c, 0, sizeof(*c));
+   }
+}
+
+static void
+serve_connection(struct connection *c, const struct server *srv)
 {
enum status s;
int done;
 
-   /* set connection timeout */
-   if (sock_set_timeout(c->fd, 30)) {
-   warn("sock_set_timeout: Failed");
-   }
-
switch (c->state) {
case C_VACANT:
/*
@@ -145,12 +151,212 @@ response:
}
 err:
logmsg(c);
+   close_connection(c);
+}
+
+struct connection *
+accept_connection(int insock, struct connection *connection,
+  size_t nslots)
+{
+   struct connection *c = NULL;
+   size_t j;
+
+   /* find vacant connection (i.e. one with no fd assigned to it) */
+   for (j = 0; j < nslots; j++) {
+   if (connection[j].fd == 0) {
+   c = [j];
+   break;
+   }
+   }
+   if (j == nslots) {
+   /* nothing available right now, return without accepting */
+
+   /* 
+* NOTE: This is currently still not the best option, but
+* at least we now have control over it and can reap a
+* connection from our pool instead of previously when
+* we were forking and were more or less on our own in
+* each process
+*/
+   return NULL;
+   }
+
+   /* accept connection */
+   if ((c->fd = accept(insock, (struct sockaddr *)>ia,
+   &(socklen_t){sizeof(c->ia)})) < 0) {
+   if (errno != EAGAIN && errno != EWOULDBLOCK) {
+   /* not much we can do here */
+   warn("accept:");
+   }
+   return NULL;
+   }
+
+   /* set socket to non-blocking mode */
+   if (sock_set_nonblocking(c->fd)) {
+   /* we can't allow blocking sockets */
+   return NULL;
+   }
+
+   return c;
+}
+
+struct worker_data {
+ 

Re: [hackers] [quark][PATCH] Fix overflow when calling strtonum in parse_range

2020-10-31 Thread José Miguel Sánchez García

On 10/31/2020 11:27 PM, Laslo Hunhold wrote:


this is a great catch, thanks! But wouldn't it be better to use
MIN(SIZE_MAX, LLONG_MAX)?

I haven't found anything in the standard that puts "long long" and
"size_t" into any relation, which means, for me, that any case is
possible where either value could be larger, but please correct me if
I'm wrong.


Good point! It could be the case that SIZE_MAX is smaller than 
LLONG_MAX. Honestly I don't know, but I would do what you are proposing 
just to be sure: it is the safest option, and maybe the compiler will 
take care of replacing the correct value at compile time. Way better 
than leaving another bug lingering until someone else finds it again.


Best regards,
José Miguel




[hackers] [quark] Prevent overflow in strtonum()-parameters || Laslo Hunhold

2020-10-31 Thread git
commit 7d26fc695d548b5a73305a97dce274a313e0f602
Author: Laslo Hunhold 
AuthorDate: Sun Nov 1 01:47:11 2020 +0100
Commit: Laslo Hunhold 
CommitDate: Sun Nov 1 01:49:27 2020 +0100

Prevent overflow in strtonum()-parameters

Make sure not to overflow the long long value. Given the standard
doesn't bring any tangible guarantees for the upper limits of size_t,
we just determine which (long long or size_t) is larger at compile time.

Thanks José Miguel Sánchez García for reporting this!

Signed-off-by: Laslo Hunhold 

diff --git a/http.c b/http.c
index d43ceaf..dc32290 100644
--- a/http.c
+++ b/http.c
@@ -491,10 +491,13 @@ parse_range(const char *str, size_t size, size_t *lower, 
size_t *upper)
 * last byte if 'last' is not given),
 * inclusively, and byte-numbering beginning at 0
 */
-   *lower = strtonum(first, 0, SIZE_MAX, );
+   *lower = strtonum(first, 0, MIN(SIZE_MAX, LLONG_MAX),
+ );
if (!err) {
if (last[0] != '\0') {
-   *upper = strtonum(last, 0, SIZE_MAX, );
+   *upper = strtonum(last, 0,
+ MIN(SIZE_MAX, LLONG_MAX),
+ );
} else {
*upper = size - 1;
}
@@ -526,7 +529,7 @@ parse_range(const char *str, size_t size, size_t *lower, 
size_t *upper)
 * use upper as a temporary storage for 'num',
 * as we know 'upper' is size - 1
 */
-   *upper = strtonum(last, 0, SIZE_MAX, );
+   *upper = strtonum(last, 0, MIN(SIZE_MAX, LLONG_MAX), );
if (err) {
return S_BAD_REQUEST;
}



[hackers] [quark] Prepare http_send_buf() http_recv_header() for blocking I/O || Laslo Hunhold

2020-10-31 Thread git
commit 4d3a6c5297015285f88a56cc47f6a53c372b1506
Author: Laslo Hunhold 
AuthorDate: Sun Nov 1 00:10:54 2020 +0100
Commit: Laslo Hunhold 
CommitDate: Sun Nov 1 00:10:54 2020 +0100

Prepare http_send_buf() http_recv_header() for blocking I/O

Signed-off-by: Laslo Hunhold 

diff --git a/http.c b/http.c
index f1e15a4..d43ceaf 100644
--- a/http.c
+++ b/http.c
@@ -109,7 +109,17 @@ http_send_buf(int fd, struct buffer *buf)
 
while (buf->len > 0) {
if ((r = write(fd, buf->data, buf->len)) <= 0) {
-   return S_REQUEST_TIMEOUT;
+   if (errno == EAGAIN || errno == EWOULDBLOCK) {
+   /*
+* socket is blocking, return normally.
+* given the buffer still contains data,
+* this indicates to the caller that we
+* have been interrupted.
+*/
+   return 0;
+   } else {
+   return S_REQUEST_TIMEOUT;
+   }
}
memmove(buf->data, buf->data + r, buf->len - r);
buf->len -= r;
@@ -145,8 +155,17 @@ http_recv_header(int fd, struct buffer *buf, int *done)
while (1) {
if ((r = read(fd, buf->data + buf->len,
  sizeof(buf->data) - buf->len)) <= 0) {
-   s = S_REQUEST_TIMEOUT;
-   goto err;
+   if (errno == EAGAIN || errno == EWOULDBLOCK) {
+   /*
+* socket is drained, return normally,
+* but set done to zero
+*/
+   *done = 0;
+   return 0;
+   } else {
+   s = S_REQUEST_TIMEOUT;
+   goto err;
+   }
}
buf->len += r;
 
@@ -181,7 +200,7 @@ http_parse_header(const char *h, struct request *req)
const char *p, *q;
char *m, *n;
 
-   /* empty all fields */
+   /* empty the request struct */
memset(req, 0, sizeof(*req));
 
/*
diff --git a/queue.c b/queue.c
new file mode 100644
index 000..a3b8e45
--- /dev/null
+++ b/queue.c
@@ -0,0 +1,177 @@
+/* See LICENSE file for copyright and license details. */
+#include 
+
+#ifdef __linux__
+   #include 
+#else
+   #include 
+   #include 
+   #include 
+#endif
+
+#include "queue.h"
+#include "util.h"
+
+int
+queue_create(void)
+{
+   int qfd;
+
+   #ifdef __linux__
+   if ((qfd = epoll_create1(0)) < 0) {
+   warn("epoll_create1:");
+   }
+   #else
+
+   #endif
+
+   return qfd;
+}
+
+int
+queue_add_fd(int qfd, int fd, enum queue_event_type t, int shared,
+ const void *data)
+{
+   #ifdef __linux__
+   struct epoll_event e;
+
+   /* set event flag */
+   if (shared) {
+   /*
+* if the fd is shared, "exclusive" is the only
+* way to avoid spurious wakeups and "blocking"
+* accept()'s.
+*/
+   e.events = EPOLLEXCLUSIVE;
+   } else {
+   /*
+* if we have the fd for ourselves (i.e. only
+* within the thread), we want to be
+* edge-triggered, as our logic makes sure
+* that the buffers are drained when we return
+* to epoll_wait()
+*/
+   e.events = EPOLLET;
+   }
+
+   switch (t) {
+   case QUEUE_EVENT_IN:
+   e.events |= EPOLLIN;
+   break;
+   case QUEUE_EVENT_OUT:
+   e.events |= EPOLLOUT;
+   break;
+   }
+
+   /* set data */
+   if (data == NULL) {
+   /* the data is the fd itself */
+   e.data.fd = fd;
+   } else {
+   /* set data pointer */
+   e.data.ptr = (void *)data;
+   }
+
+   /* register fd in the interest list */
+   if (epoll_ctl(qfd, EPOLL_CTL_ADD, fd, ) < 0) {
+   warn("epoll_ctl:");
+   return 1;
+   }
+   #else
+
+   #endif
+
+   return 0;
+}
+
+int
+queue_mod_fd(int qfd, int fd, enum queue_event_type t, const void *data)
+{
+   #ifdef __linux__
+   struct epoll_event e;
+
+   /* set event flag */
+ 

Re: [hackers] [quark] Thoughts on CGI and authentication?

2020-10-31 Thread Laslo Hunhold
On Mon, 26 Oct 2020 11:49:33 +0100
José Miguel Sánchez García  wrote:

Dear José,

> Funny, that's my current use case. All my CGI is through forms, so
> I'm currently running a separate server for the form handlers,
> regenerating the HTML and then redirecting to the recently updated
> page through a "303 See Other" code.
> 
> My motivation behind integrating CGI into quark was leveraging the 
> quality of its implementation to avoid the security pitfalls of 
> badly-written HTTP servers out there. I would only have to worry
> about writing a simple script to handle the form data.
> 
> Also, if CGI was integrated into the web server itself, I could use
> the same domain/port/endpoint to serve the static page (via a GET
> request) and to handle the form (via a POST request). Moot point but
> it goes a long way towards usability.

another approach would be to have a very small interposer that splits
GET and POST requests and forwards them to quark and the CGI-handler
respectively.

> Finally, CGI is often used to customize the content of a page for a 
> given user. Imagine a logged in user in a forum: they must see a link 
> that points to their profile. Anonymous users would see a login/signup
> bar instead.
> 
> I must say that, even with these advantages in mind, I've come to
> think that CGI would not be appropriate for quark. Its goals are at
> odds with the needs of a CGI implementation, and that's fine (there
> are alternatives for those who want CGI). Feel free to prove me wrong
> :)

Software gets really complex if you try covering the last 5% of
use-cases. Given the massive flexibility of the static web and how many
CGI-applications really are just far away from the original idea of the
web I really don't see a reason to tailor quark towards CGI. It was
there before, but it just made everything really complicated.

With best regards

Laslo



Re: [hackers] [quark][PATCH] Fix overflow when calling strtonum in parse_range

2020-10-31 Thread Laslo Hunhold
On Sun, 1 Nov 2020 01:15:32 +0100
José Miguel Sánchez García  wrote:

Dear José,

> Good point! It could be the case that SIZE_MAX is smaller than 
> LLONG_MAX. Honestly I don't know, but I would do what you are
> proposing just to be sure: it is the safest option, and maybe the
> compiler will take care of replacing the correct value at compile
> time. Way better than leaving another bug lingering until someone
> else finds it again.

we should be safe at that point and I have committed the MIN-solution
in commit 7d26fc695. Thanks for your report!

With best regards

Laslo