On 2017-02-25 08:04, Jakub Jermář wrote:
> Hi Sergio,
>
> thanks for experimenting with HelenOS. Are you going to make a write-up
> about your discoveries? Anyway, please find my further answers below.
I have not yet decided what I'm going to do with the results. I've spent
a lot of time over the years studying microkernels (as an amateur, never
professionally), especially GNU Mach and OSF Mach. I even maintain a
repo [1] of OSF Mach + MkLinux, which I find quite interesting, as it
implements optimizations like kernel colocation and migrating threads
and is a predecessor of Apple's XNU.
Over that time, I came to the conclusion that most of the
user-*perceived* slowness of microkernel systems comes from the overhead
of cached I/O operations. When a request touches the disk, you pay an
additional cost of ~100us on a ~2500us operation, which is significant
but not a big deal.

But when the request is served from memory (a cached read block, or
almost every write except when using O_DIRECT), the cost is ~1us on a
monolithic kernel vs. ~50us on a microkernel, which is a huge
difference.
So when I heard about Magenta, I decided to run a quick test to see if
Google has found a way to mitigate this issue (spoiler: it has not).
Testing on a memory-backed filesystem is quite similar to testing on
cached contents, but without having to deal with real block devices or
caching algorithms.
> On 02/25/2017 02:15 AM, Sergio Lopez wrote:
>> I've been running a simple test on various OSes, which creates a 64MB
>> file on a memory-backed filesystem, writing it in 4K chunks, and then
>> measures the time for a full read and rewrite (without truncating). My
>> intention was to get some metrics for the latency of minimal read()
>> and write() operations.
>
> In this case, involving the test application and the VFS and TMPFS
> servers, the read/rewrite basically tests multiple IPC and memory copy
> latencies from the app via VFS to TMPFS and then back to the app.
Thanks, that matches what I got from a quick read of the code. This puts
HelenOS at a disadvantage against GNU Hurd and Magenta, as it requires
twice as many IPCs to serve the same I/O operations (both Hurd and
Magenta give the application a direct port to the endpoint serving the
corresponding namespace). Still, it's only 10-20% slower than Magenta [2].
> Can you show us what you have done exactly (perhaps even including the
> testcase) so that we know what fixes the issue? That way we could merge
> it and give you full credit for it, or suggest improvements / alternate
> versions.
I'm sending both the test (read_latency_test.c), which is very simple
and self-explanatory, and the patch I've applied (helenos_realloc.patch).
Sergio.
[1] https://github.com/slp/mkunity
[2] https://gist.github.com/slp/4ae33c06f3d3f56e0c355de06f1c6fb4
=== modified file 'uspace/lib/c/generic/malloc.c'
--- uspace/lib/c/generic/malloc.c	2016-08-28 11:27:42 +0000
+++ uspace/lib/c/generic/malloc.c	2017-02-26 08:53:13 +0000
@@ -936,6 +936,7 @@
 		heap_block_head_t *next_head =
 		    (heap_block_head_t *) (((void *) head) + head->size);
 
+retry:
 		if (((void *) next_head < area->end) &&
 		    (head->size + next_head->size >= real_size) &&
 		    (next_head->free)) {
@@ -945,8 +946,13 @@
 			ptr = ((void *) head) + sizeof(heap_block_head_t);
 			next_fit = NULL;
-		} else
+			reloc = false;
+		} else if (!reloc) {
 			reloc = true;
+			if (area_grow(area, (area->end - area->start) + head->size + real_size)) {
+				goto retry;
+			}
+		}
 	}
 
 	heap_unlock();
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

#define TFILE "/tmp/readtest"
#define BSIZE 4096
// Magenta's memfs can't create files longer
// than this.
#define BNUM 16384
char buf[BSIZE];

static double get_elapsed_usec(struct timeval *start, struct timeval *end)
{
	double s, e;

	s = start->tv_sec * 1000 * 1000;
	s += start->tv_usec;
	e = end->tv_sec * 1000 * 1000;
	e += end->tv_usec;

	return e - s;
}
static void measure_time(void (*test)(int fd), int fd)
{
	struct timeval start, end;
	double etime;

	gettimeofday(&start, NULL);
	(*test)(fd);
	gettimeofday(&end, NULL);

	etime = get_elapsed_usec(&start, &end);
	if (etime > 1000) {
		printf("elapsed time: %.2f ms", etime / 1000);
	} else {
		printf("elapsed time: %.2f us", etime);
	}
	printf(" cost per call: %.2f us\n", etime / BNUM);
}
static void realloc_test(int fd)
{
	int i;
	char *buf;

	(void) fd;

	buf = malloc(BSIZE);
	for (i = 2; i < BNUM; ++i) {
		buf = realloc(buf, BSIZE * i);
	}
	free(buf);
}
static void write_file(int fd)
{
	int i;

	for (i = 0; i < BNUM; ++i) {
		if (write(fd, buf, BSIZE) != BSIZE) {
			printf("Error writing to file\n");
			exit(-1);
		}
	}
}

static void read_sequential(int fd)
{
	int i;

	for (i = 0; i < BNUM; ++i) {
		if (read(fd, buf, BSIZE) != BSIZE) {
			printf("Error reading from file\n");
			exit(-1);
		}
	}
}
int main(int argc, char **argv)
{
	int fd;

	measure_time(&realloc_test, -1);

	if ((fd = open(TFILE, O_CREAT | O_RDWR, 0644)) == -1) {
		printf("Can't open file\n");
		return -1;
	}

	write_file(fd);
	lseek(fd, 0, SEEK_SET);
	measure_time(&write_file, fd);
	lseek(fd, 0, SEEK_SET);
	measure_time(&read_sequential, fd);

	close(fd);

	return 0;
}
_______________________________________________
HelenOS-devel mailing list
[email protected]
http://lists.modry.cz/listinfo/helenos-devel