On (25/06/08 16:50), Andrew Hastings didst pronounce:
> Mel Gorman wrote:
> >HUGETLB_MORECORE currently exists to allow glibc to back malloc() with
> >large pages instead of small pages. However, not all applications use glibc
> >malloc() nor is it always desirable to back malloc() with huge pages. There
> >exists a requirement that a hugepage-aware application be able to allocate
> >hugepages directly.
> >
> >Currently, each application is expected to discover the filesystem 
> >themselves,
> >mmap() the file and other house-keeping tasks. libhugetlbfs already 
> >implements
> >much of this complex logic internally. This patch exposes a simple API for 
> >the
> >allocation and freeing of regions backed by hugepages that applications 
> >linking
> >directly to libhugetlbfs can use. The implementation is a little simplistic
> >but can be optimised later if and when applications perceive its 
> >performance
> >to be a bottleneck. The API itself should not need to change as a 
> >multi-page
> >aware API would be an additional rather than a replacement interface.
> >
> >Ideally, this or something like it would be merged for 2.0. Comments?
> 
> Sure, this seems like a reasonable idea.
> 

Sweet

> >Signed-off-by: Mel Gorman <[EMAIL PROTECTED]>
> >---
> > Makefile    |    5 +
> > alloc.c     |  152 
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > hugetlbfs.h |   11 ++++
> > version.lds |    6 ++
> > 4 files changed, 173 insertions(+), 1 deletion(-)
> >
> >diff -rup -X /usr/src/patchset-0.6/bin//dontdiff 
> >libhugetlbfs-clean/alloc.c libhugetlbfs-userspace-alloc/alloc.c
> >--- libhugetlbfs-clean/alloc.c       2008-06-24 11:15:54.000000000 -0700
> >+++ libhugetlbfs-userspace-alloc/alloc.c     2008-06-25 
> >13:40:37.000000000 -0700
> >@@ -0,0 +1,131 @@
> >+/*
> >+ * libhugetlbfs - Easy use of Linux hugepages
> >+ * alloc.c - Simple allocator of regions backed by hugepages
> >+ *
> >+ * This library is free software; you can redistribute it and/or
> >+ * modify it under the terms of the GNU Lesser General Public License
> >+ * as published by the Free Software Foundation; either version 2.1 of
> >+ * the License, or (at your option) any later version.
> >+ *
> >+ * This library is distributed in the hope that it will be useful, but
> >+ * WITHOUT ANY WARRANTY; without even the implied warranty of
> >+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >+ * Lesser General Public License for more details.
> >+ *
> >+ * You should have received a copy of the GNU Lesser General Public
> >+ * License along with this library; if not, write to the Free Software
> >+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 
> >USA
> >+ */
> >+
> >+#define _GNU_SOURCE
> >+#include <errno.h>
> >+#include <stdio.h>
> >+#include <stdlib.h>
> >+#include <string.h>
> >+#include <unistd.h>
> >+#include <sys/mman.h>
> >+#include <sys/types.h>
> >+
> >+#include "hugetlbfs.h"
> >+#include "libhugetlbfs_internal.h"
> >+
> >+/**
> >+ * hugepage_malloc - Allocate an amount of memory backed by huge pages
> 
> The function name in the comment doesn't match the function definition. 
>  (Personally, I prefer "hugepage_malloc" over "hugepages_malloc" but 
> perhaps this is an Americanism.)
> 

I have poor naming taste. I changed it to hugepages_malloc() because it
can be multiple pages depending on the value of len. I do not feel
stronly and am open to suggestion until a release with the symbol
occurs.

> >+ * len: Size of the region to allocate
> >+ * flags: Flags specifying the behaviour of the function
> >+ *
> >+ * This function allocates a buffer similar in principal to malloc() 
> >except
> >+ * that the region is backed by huge pages and at least hugepage-aligned.
> >+ * This is not a suitable drop-in for malloc() and is only suitable in the
> >+ * event the length is expected to be hugepage-aligned. However, a
> >+ * malloc-like library could use this function to create additional heap
> >+ * similar in principal to what morecore does for glibc malloc.
> >+ */
> >+void *hugepages_malloc(size_t len, ghp_t flags)
> >+{
> >+    void *buf;
> >+    int heap_fd;
> >+
> >+    /* Create a file descriptor for the new region */
> >+    heap_fd = hugetlbfs_unlinked_fd();
> >+    if (heap_fd < 0) {
> >+            ERROR("Couldn't open hugetlbfs file for %d sized heap\n", 
> >len);
> 
> I think you need "%zd" for size_t or the value will be truncated on 
> 64-bit platforms.  Grammar nit:  "%d-sized" or "size %d".

Good spot. Fixed
 

> >+            return NULL;
> >+    }
> >+
> >+    /* Map the requested region */
> >+    buf = mmap(NULL, len, PROT_READ|PROT_WRITE,
> >+             MAP_PRIVATE, heap_fd, len);
> >+    if (buf == MAP_FAILED) {
> >+            WARNING("New heap segment map failed: %s\n", 
> >strerror(errno));
> >+            return NULL;
> 
> Still need a close(heap_fd) on mmap failure.
> 

Very good spot, fixed.

> >+    }
> >+
> >+    /* Close the file so we do not have to track the descriptor */
> >+    if (close(heap_fd) != 0) {
> >+            WARNING("Failed to close new heap fd: %s\n", 
> >strerror(errno));
> >+            munmap(buf, len);
> >+            return NULL;
> >+    }
> >+
> >+    /* woo, new buffer of shiny */
> >+    return buf;
> >+}
> >+
> >+#define MAPS_BUF_SZ 4096
> >+/**
> >+ * hugepages_free - Free a region allocated that was backed by large pages
> >+ * ptr - The pointer to the buffer returned by hugepages_malloc()
> >+ *
> >+ * This function finds a region to free based on the contents of
> >+ * /proc/pid/maps. The assumption is made that the ptr is the start of
> >+ * a hugepage region allocated with hugepages_malloc. No checking is made
> >+ * that the pointer is to a hugepage backed region.
> >+ */
> >+void hugepages_free(void *ptr)
> >+{
> >+    FILE *fd;
> >+    char line[MAPS_BUF_SZ];
> >+    unsigned long start = 0, end = 0;
> >+
> >+    /*
> >+     * /proc/self/maps is used to determine the length of the original
> >+     * allocation. As mappings are based on different files, we can
> >+     * assume that maps will not merge. If the hugepages were truely
> 
> "truly"

Fixed

> 
> >+     * anonymous, this assumption would be broken.
> >+     */
> >+    fd = fopen("/proc/self/maps", "r");
> >+    if (!fd) {
> >+            ERROR("Failed to open /proc/self/maps\n");
> >+            return;
> >+    }
> >+
> >+    /* Parse /proc/maps for address ranges line by line */
> >+    while (!feof(fd)) {
> >+            char *bufptr;
> >+            char *saveptr = NULL;
> >+
> >+            /* Read a line of input */
> >+            if (fgets(line, MAPS_BUF_SZ, fd) == NULL)
> >+                    break;
> >+    
> >+            /* Parse the line to get the start and end of each mapping */
> >+            bufptr = strtok_r(line, " ", &saveptr);
> >+            bufptr = strtok_r(bufptr, "-", &saveptr);
> >+            start = strtoull(bufptr, NULL, 16);
> >+            bufptr = strtok_r(NULL, "-", &saveptr);
> >+
> >+            /* If the correct mapping is found, remove it */
> >+            if (start == (unsigned long)ptr) {
> >+                    end = strtoull(bufptr, NULL, 16);
> >+                    munmap(ptr, end - start);
> >+                    break;
> >+            }
> >+    }
> >+
> >+    /* Print a warning if the ptr appeared to point nowhere */
> >+    if (end == 0)
> >+            ERROR("hugepages_free using invalid or double free\n");
> >+
> >+    fclose(fd);
> >+}
> >diff -rup -X /usr/src/patchset-0.6/bin//dontdiff 
> >libhugetlbfs-clean/hugetlbfs.h libhugetlbfs-userspace-alloc/hugetlbfs.h
> >--- libhugetlbfs-clean/hugetlbfs.h   2008-05-16 13:43:11.000000000 -0700
> >+++ libhugetlbfs-userspace-alloc/hugetlbfs.h 2008-06-25 
> >13:28:51.000000000 -0700
> >@@ -33,4 +33,15 @@ long dump_proc_pid_maps(void);
> > 
> > #define PF_LINUX_HUGETLB    0x100000
> > 
> >+/*
> >+ * Direct alloc flags and types
> >+ * GHP_FORCELARGE - Returns NULL if a huge page cannot be allocated 
> >(default)
> >+ */
> >+typedef unsigned long ghp_t;
> >+#define GHP_FORCELARGE      0x01UL
> 
> Looks like ghp* and GHP* might be part of an unrelated patch?
> 

No, it just happens to be the default behaviour right now. Later I
envision that flags could be used that say "allocate a region with huge
pages if possible and small pages otherwise, but do not return NULL".

This was a bad choice though. Maybe specifying GHP_DEFAULT would be a
better option as with a number of flags, there may be a desire to
specify what the default flag combination would be. Would that be
better?

I wanted to have the flags arguement for the API either way for
extending later without altering the function signature.

> >+
> >+/* Direct alloc functions */
> >+void *hugepages_malloc(size_t len, ghp_t flags);
> >+void hugepages_free(void *ptr);
> >+
> > #endif /* _HUGETLBFS_H */
> >diff -rup -X /usr/src/patchset-0.6/bin//dontdiff 
> >libhugetlbfs-clean/Makefile libhugetlbfs-userspace-alloc/Makefile
> >--- libhugetlbfs-clean/Makefile      2008-05-16 13:43:11.000000000 -0700
> >+++ libhugetlbfs-userspace-alloc/Makefile    2008-06-25 
> >10:33:38.000000000 -0700
> >@@ -1,7 +1,8 @@
> > PREFIX = /usr/local
> > 
> >-LIBOBJS = hugeutils.o version.o init.o morecore.o debug.o
> >+LIBOBJS = hugeutils.o version.o init.o morecore.o debug.o alloc.o
> > INSTALL_OBJ_LIBS = libhugetlbfs.so libhugetlbfs.a
> >+INSTALL_HEADERS = hugetlbfs.h
> > LDSCRIPT_TYPES = B BDT
> > LDSCRIPT_DIST_ELF = elf32ppclinux elf64ppc elf_i386 elf_x86_64
> > INSTALL_OBJSCRIPT = ld.hugetlbfs
> >@@ -89,6 +90,7 @@ endif
> > LIBOBJS32 += $(LIBOBJS:%=obj32/%)
> > LIBOBJS64 += $(LIBOBJS:%=obj64/%)
> > 
> >+HEADERDIR = $(PREFIX)/include
> > LIBDIR32 = $(PREFIX)/$(LIB32)
> > LIBDIR64 = $(PREFIX)/$(LIB64)
> > LDSCRIPTDIR = $(PREFIX)/share/libhugetlbfs/ldscripts
> >@@ -254,6 +256,7 @@ objscript.%: %
> > install: libs $(OBJDIRS:%=%/install) $(INSTALL_OBJSCRIPT:%=objscript.%)
> >     @$(VECHO) INSTALL
> >     $(INSTALL) -d $(DESTDIR)$(LDSCRIPTDIR)
> >+    $(INSTALL) -m 644 $(INSTALL_HEADERS) $(HEADERDIR)
> >     $(INSTALL) -m 644 $(INSTALL_LDSCRIPTS:%=ldscripts/%) 
> >     $(DESTDIR)$(LDSCRIPTDIR)
> >     $(INSTALL) -d $(DESTDIR)$(BINDIR)
> >     for x in $(INSTALL_OBJSCRIPT); do \
> >diff -rup -X /usr/src/patchset-0.6/bin//dontdiff 
> >libhugetlbfs-clean/version.lds libhugetlbfs-userspace-alloc/version.lds
> >--- libhugetlbfs-clean/version.lds   2008-05-16 13:43:11.000000000 -0700
> >+++ libhugetlbfs-userspace-alloc/version.lds 2008-06-25 
> >10:21:25.000000000 -0700
> >@@ -7,3 +7,9 @@ VERS_1.0 {
> >     local:
> >             *;
> > };
> >+
> >+HTLBFS_2.0 {
> >+    global:
> >+            hugepages_malloc;
> >+            hugepages_free;
> >+};
> 
> -Andrew Hastings
>  Cray Inc.
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://sourceforge.net/services/buy/index.php
_______________________________________________
Libhugetlbfs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel

Reply via email to