Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2013-06-28 Thread Andres Freund
Hi,

On 2013-06-28 23:03:22 -0400, Bruce Momjian wrote:
> Did we decide against specifying huge pages in Postgres?

I don't think so. We need somebody to make some last modifications to
the patch though and Christian doesn't seem to have the time atm.

I think the bigger memory (size of the per process page table) and #cpus
(number of pg backends => number of page tables allocated) the more
imporant it gets to implement this. We already can spend gigabytes of
memory on those on big systems.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2013-06-28 Thread Bruce Momjian

Did we decide against specifying huge pages in Postgres?

---

On Tue, Oct 30, 2012 at 09:16:07PM +0100, Christian Kruse wrote:
> Hey,
> 
> ok, I think I implemented all of the changes you requested. All but
> the ia64 dependent, I have to do more research for this one.
> 
> 
> Greetings,
>  CK

> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index b4fcbaf..66ed10f 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -1049,6 +1049,37 @@ include 'filename'
>
>   
>  
> + 
> +  huge_tlb_pages (enum)
> +  
> +   huge_tlb_pages configuration parameter
> +  
> +  
> +   
> +Enables/disables the use of huge tlb pages. Valid values are
> +on, off and 
> try.
> +The default value is try.
> +   
> +
> +   
> +With huge_tlb_pages set to on
> +mmap() will be called with 
> MAP_HUGETLB.
> +If the call fails the server will fail fatally.
> +   
> +
> +   
> +With huge_tlb_pages set to off 
> we
> +will not use MAP_HUGETLB at all.
> +   
> +
> +   
> +With huge_tlb_pages set to try
> +we will try to use MAP_HUGETLB and fall back to
> +mmap() without MAP_HUGETLB.
> +   
> +  
> + 
> +
>   
>temp_buffers (integer)
>
> diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
> index df06312..f9de239 100644
> --- a/src/backend/port/sysv_shmem.c
> +++ b/src/backend/port/sysv_shmem.c
> @@ -27,10 +27,14 @@
>  #ifdef HAVE_SYS_SHM_H
>  #include 
>  #endif
> +#ifdef MAP_HUGETLB
> +#include 
> +#endif
>  
>  #include "miscadmin.h"
>  #include "storage/ipc.h"
>  #include "storage/pg_shmem.h"
> +#include "utils/guc.h"
>  
>  
>  typedef key_t IpcMemoryKey;  /* shared memory key passed to 
> shmget(2) */
> @@ -61,6 +65,19 @@ typedef int IpcMemoryId;   /* shared memory ID 
> returned by shmget(2) */
>  #define MAP_FAILED ((void *) -1)
>  #endif
>  
> +#ifdef MAP_HUGETLB
> +#  ifdef __ia64__
> +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
> +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
> +#  else
> +#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
> +#define PG_MAP_HUGETLB MAP_HUGETLB
> +#  endif
> +#else
> +#  define PG_MAP_HUGETLB 0
> +#endif
> +
> +
>  
>  unsigned long UsedShmemSegID = 0;
>  void*UsedShmemSegAddr = NULL;
> @@ -73,7 +90,6 @@ static void IpcMemoryDelete(int status, Datum shmId);
>  static PGShmemHeader *PGSharedMemoryAttach(IpcMemoryKey key,
>IpcMemoryId *shmid);
>  
> -
>  /*
>   *   InternalIpcMemoryCreate(memKey, size)
>   *
> @@ -342,6 +358,155 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long 
> id2)
>  }
>  
>  
> +#ifdef MAP_HUGETLB
> +#define HUGE_PAGE_INFO_DIR  "/sys/kernel/mm/hugepages"
> +
> +/*
> + *   static long InternalGetFreeHugepagesCount(const char *name)
> + *
> + * Attempt to read the number of available hugepages from
> + * /sys/kernel/mm/hugepages/hugepages-/free_hugepages
> + * Will fail (return -1) if file could not be opened, 0 if no pages are 
> available
> + * and > 0 if there are free pages
> + *
> + */
> +static long
> +InternalGetFreeHugepagesCount(const char *name)
> +{
> + int fd;
> + char buff[1024];
> + size_t len;
> + long result;
> + char *ptr;
> +
> + len = snprintf(buff, 1024, "%s/%s/free_hugepages", HUGE_PAGE_INFO_DIR, 
> name);
> + if (len == 1024) /* I don't think that this will happen ever */
> + {
> + ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
> + (errmsg("Filename %s/%s/free_hugepages is too 
> long", HUGE_PAGE_INFO_DIR, name),
> +  errcontext("while checking hugepage size")));
> + return -1;
> + }
> +
> + fd = open(buff, O_RDONLY);
> + if (fd <= 0)
> + {
> + ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
> + (errmsg("Could not open file %s: %s", buff, 
> strerror(errno)),
> +  errcontext("while checking hugepage size")));
> + return -1;
> + }
> +
> + len = read(fd, buff, 1024);
> + if (len <= 0)
> + {
> + ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
> + (errmsg("Error reading from file %s: %s", buff, 
> strerror(errno)),
> +  errcontext("while checking hugepage size")));
> + close(fd);
> + return -1;
> + }
> +
> + /*
> +  * If the content of free_hugepages is longer than or equal to 1024 
> bytes
> +  * the rest is irrelevant; we simply want to know if there are any
> +  * hugepages left
> +  */
> + if (len == 1024)
> + {
> + buff[1023] = 0;
> + }
> + else
> + 

Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-11-13 Thread Andres Freund
Hi CK,

On 2012-10-30 21:16:07 +0100, Christian Kruse wrote:
> index b4fcbaf..66ed10f 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml

I think a short introduction or at least a reference on how to configure
hugepages would be a good thing.

>   
>temp_buffers (integer)
>
> diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
> index df06312..f9de239 100644
> --- a/src/backend/port/sysv_shmem.c
> +++ b/src/backend/port/sysv_shmem.c
> @@ -27,10 +27,14 @@
>  #ifdef HAVE_SYS_SHM_H
>  #include 
>  #endif
> +#ifdef MAP_HUGETLB
> +#include 
> +#endif

I think a central #define for the MAP_HUGETLB capability would be a good
idea, akin to HAVE_SYS_SHM_H.

E.g. this:
> --- a/src/backend/utils/misc/guc.c
> +++ b/src/backend/utils/misc/guc.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #ifdef HAVE_SYSLOG
>  #include 
>  #endif

is unlikely to fly on windows.


> +/*
> + *   static long InternalGetHugepageSize()
> + *
> + * Attempt to get a valid hugepage size from /sys/kernel/mm/hugepages/ by
> + * reading directory contents
> + * Will fail (return -1) if the directory could not be opened or no valid
> + * page sizes are available. Will return the biggest hugepage size on
> + * success.
> + *
> + */

The "biggest" remark is out of date.


> +static long
> +InternalGetHugepageSize()
> +{
> ...
> + if ((smallest_size == -1 || size < smallest_size)
> + && InternalGetFreeHugepagesCount(ent->d_name) > 
> 0)
> + {
> + smallest_size = size;
> + }
> ...
> +
> + if (smallest_size == -1)
> + {
> + ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
> + (errmsg("Could not find a valid hugepage size"),
> +  errhint("This error usually means that either 
> CONFIG_HUGETLB_PAGE "
> +  "is not in kernel or that your 
> architecture does not "
> +  "support hugepages or you did 
> not configure hugepages")));
> + }

I think differentiating the error message between no hugepages found and
InternalGetFreeHugepagesCount(ent->d_name) always beeing zero would be a
good idea. Failing this way if
InternalGetFreeHugepagesCount(ent->d_name) < 0 seems fine.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-31 Thread Christian Kruse
Hey,

On 30/10/12 20:33, Andres Freund wrote:
> +#ifdef MAP_HUGETLB
> +#  ifdef __ia64__
> +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
> +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
> +#  else
>
> Not your fault, but that looks rather strange to me. The level of 
> documentation
> around the requirement of using MAP_FIXED is subpar...

Yeah, looks strange to me, too. This is why I asked if this is really
all we have to do.

> I dislike automatically using the biggest size here. There are platforms with
> 16GB hugepages... I think we should rather always use the smallest size and
> leave it to the kernel to choose whatever pagesize he wants to use.

Good point. I will change this to the smallest available page size.

> Thats going to log two warnings if the current system doesn't have hugepage
> support compiled in and thus no /sys/kernel/mm/hugepages directory.
> I think you should 1. only test this if TRY or ON is configured, 2. make the
> messages in InternalGetHugepageSize DEBUG1 at least for the TRY case.

Ok, point taken.

> […]
> + HUGE_TLB_TRY, huge_tlb_options,
> + NULL, NULL, NULL
> + },
>
> you use HUGE_TLB_TRY with ifdef here.

Ah, right. Setting it to HUGE_TLB_OFF when no MAP_HUGETLB is available.

Greetings,
 CK


pgprK0vhwogK9.pgp
Description: PGP signature


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

On Tuesday 30 October 2012 20:33:18 Andres Freund wrote:
> +#ifdef MAP_HUGETLB
> +#  ifdef __ia64__
> +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
> +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
> +#  else
> 
> Not your fault, but that looks rather strange to me. The level of
> documentation around the requirement of using MAP_FIXED is subpar...

Ok, after further reading I think with MAP_FIXED one has to „generate“ SHM 
segment addresses hisself. So 0x8000UL would just be one possible 
location for a hugepage.

Since this feature is not used in PG for now, the code should work. Could 
someone with access to a ia64 architecture verify it?

Greetings,
 CK



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

ok, I think I implemented all of the changes you requested. All but
the ia64 dependent, I have to do more research for this one.


Greetings,
 CK
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b4fcbaf..66ed10f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1049,6 +1049,37 @@ include 'filename'
   
  
 
+ 
+  huge_tlb_pages (enum)
+  
+   huge_tlb_pages configuration parameter
+  
+  
+   
+Enables/disables the use of huge tlb pages. Valid values are
+on, off and 
try.
+The default value is try.
+   
+
+   
+With huge_tlb_pages set to on
+mmap() will be called with 
MAP_HUGETLB.
+If the call fails the server will fail fatally.
+   
+
+   
+With huge_tlb_pages set to off we
+will not use MAP_HUGETLB at all.
+   
+
+   
+With huge_tlb_pages set to try
+we will try to use MAP_HUGETLB and fall back to
+mmap() without MAP_HUGETLB.
+   
+  
+ 
+
  
   temp_buffers (integer)
   
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index df06312..f9de239 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -27,10 +27,14 @@
 #ifdef HAVE_SYS_SHM_H
 #include 
 #endif
+#ifdef MAP_HUGETLB
+#include 
+#endif
 
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"
 
 
 typedef key_t IpcMemoryKey;/* shared memory key passed to 
shmget(2) */
@@ -61,6 +65,19 @@ typedef int IpcMemoryId; /* shared memory ID 
returned by shmget(2) */
 #define MAP_FAILED ((void *) -1)
 #endif
 
+#ifdef MAP_HUGETLB
+#  ifdef __ia64__
+#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
+#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
+#  else
+#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
+#define PG_MAP_HUGETLB MAP_HUGETLB
+#  endif
+#else
+#  define PG_MAP_HUGETLB 0
+#endif
+
+
 
 unsigned long UsedShmemSegID = 0;
 void  *UsedShmemSegAddr = NULL;
@@ -73,7 +90,6 @@ static void IpcMemoryDelete(int status, Datum shmId);
 static PGShmemHeader *PGSharedMemoryAttach(IpcMemoryKey key,
 IpcMemoryId *shmid);
 
-
 /*
  * InternalIpcMemoryCreate(memKey, size)
  *
@@ -342,6 +358,155 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long 
id2)
 }
 
 
+#ifdef MAP_HUGETLB
+#define HUGE_PAGE_INFO_DIR  "/sys/kernel/mm/hugepages"
+
+/*
+ * static long InternalGetFreeHugepagesCount(const char *name)
+ *
+ * Attempt to read the number of available hugepages from
+ * /sys/kernel/mm/hugepages/hugepages-/free_hugepages
+ * Will fail (return -1) if file could not be opened, 0 if no pages are 
available
+ * and > 0 if there are free pages
+ *
+ */
+static long
+InternalGetFreeHugepagesCount(const char *name)
+{
+   int fd;
+   char buff[1024];
+   size_t len;
+   long result;
+   char *ptr;
+
+   len = snprintf(buff, 1024, "%s/%s/free_hugepages", HUGE_PAGE_INFO_DIR, 
name);
+   if (len == 1024) /* I don't think that this will happen ever */
+   {
+   ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+   (errmsg("Filename %s/%s/free_hugepages is too 
long", HUGE_PAGE_INFO_DIR, name),
+errcontext("while checking hugepage size")));
+   return -1;
+   }
+
+   fd = open(buff, O_RDONLY);
+   if (fd <= 0)
+   {
+   ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+   (errmsg("Could not open file %s: %s", buff, 
strerror(errno)),
+errcontext("while checking hugepage size")));
+   return -1;
+   }
+
+   len = read(fd, buff, 1024);
+   if (len <= 0)
+   {
+   ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+   (errmsg("Error reading from file %s: %s", buff, 
strerror(errno)),
+errcontext("while checking hugepage size")));
+   close(fd);
+   return -1;
+   }
+
+   /*
+* If the content of free_hugepages is longer than or equal to 1024 
bytes
+* the rest is irrelevant; we simply want to know if there are any
+* hugepages left
+*/
+   if (len == 1024)
+   {
+   buff[1023] = 0;
+   }
+   else
+   {
+   buff[len] = 0;
+   }
+
+   close(fd);
+
+   result = strtol(buff, &ptr, 10);
+
+   if (ptr == NULL)
+   {
+   ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+   (errmsg("Could not convert contents of file 
%s/%s/free_hugepages to number", HUGE_PAGE_INFO_DIR, name),
+errcontext("while checking hugepage size")));
+   

Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

On 30/10/12 20:33, Andres Freund wrote:
> +#ifdef MAP_HUGETLB
> +#  ifdef __ia64__
> +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
> +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
> +#  else
>
> Not your fault, but that looks rather strange to me. The level of 
> documentation
> around the requirement of using MAP_FIXED is subpar...

Yeah, looks strange to me, too. This is why I asked if this is really
all we have to do.

> I dislike automatically using the biggest size here. There are platforms with
> 16GB hugepages... I think we should rather always use the smallest size and
> leave it to the kernel to choose whatever pagesize he wants to use.

Good point. I will change this to the smallest available page size.

> Thats going to log two warnings if the current system doesn't have hugepage
> support compiled in and thus no /sys/kernel/mm/hugepages directory.
> I think you should 1. only test this if TRY or ON is configured, 2. make the
> messages in InternalGetHugepageSize DEBUG1 at least for the TRY case.

Ok, point taken.

> […]
> + HUGE_TLB_TRY, huge_tlb_options,
> + NULL, NULL, NULL
> + },
>
> you use HUGE_TLB_TRY with ifdef here.

Ah, right. Setting it to HUGE_TLB_OFF when no MAP_HUGETLB is available.

Greetings,
 CK


pgp6PpeR5Kpxp.pgp
Description: PGP signature


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Andres Freund
On Tuesday, October 30, 2012 08:20:33 PM Christian Kruse wrote:
> Hey,
> 
> Oh man, first I didn't sent the email to the list and now I forgot the
> attachment. I should really get some sleep, sorry for any
> inconveniences :(

+#ifdef MAP_HUGETLB
+#  ifdef __ia64__
+#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
+#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
+#  else

Not your fault, but that looks rather strange to me. The level of documentation 
around the requirement of using MAP_FIXED is subpar...

+long
+InternalGetHugepageSize()
+{
+...
+   if (biggest_size < size && 
InternalGetFreeHugepagesCount(ent-
>d_name) > 0)
+   {
+   biggest_size = size;
+   }

I dislike automatically using the biggest size here. There are platforms with 
16GB hugepages... I think we should rather always use the smallest size and 
leave it to the kernel to choose whatever pagesize he wants to use.

+#ifdef MAP_HUGETLB
+   longpagesize = InternalGetHugepageSize();
+
+   if (pagesize <= 0)
+   pagesize = sysconf(_SC_PAGE_SIZE);
+#else
longpagesize = sysconf(_SC_PAGE_SIZE);
+#endif

Thats going to log two warnings if the current system doesn't have hugepage 
support compiled in and thus no /sys/kernel/mm/hugepages directory.
I think you should 1. only test this if TRY or ON is configured, 2. make the 
messages in InternalGetHugepageSize DEBUG1 at least for the TRY case.
 
+   {
+   {"huge_tlb_pages",
+#ifdef MAP_HUGETLB
+   PGC_SUSET,
+#else
+   PGC_INTERNAL,
+#endif
+   RESOURCES_MEM,
+   gettext_noop("Enable/disable the use of the hugepages 
feature"),
+   NULL
+   },
+   &huge_tlb_pages,
+   HUGE_TLB_TRY, huge_tlb_options,
+   NULL, NULL, NULL
+   },

you use HUGE_TLB_TRY with ifdef here.


Looking forward to having this...

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

Oh man, first I didn't sent the email to the list and now I forgot the
attachment. I should really get some sleep, sorry for any
inconveniences :(

Greetings,
 CK
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b4fcbaf..66ed10f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1049,6 +1049,37 @@ include 'filename'
   
  
 
+ 
+  huge_tlb_pages (enum)
+  
+   huge_tlb_pages configuration parameter
+  
+  
+   
+Enables/disables the use of huge tlb pages. Valid values are
+on, off and 
try.
+The default value is try.
+   
+
+   
+With huge_tlb_pages set to on
+mmap() will be called with 
MAP_HUGETLB.
+If the call fails the server will fail fatally.
+   
+
+   
+With huge_tlb_pages set to off we
+will not use MAP_HUGETLB at all.
+   
+
+   
+With huge_tlb_pages set to try
+we will try to use MAP_HUGETLB and fall back to
+mmap() without MAP_HUGETLB.
+   
+  
+ 
+
  
   temp_buffers (integer)
   
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index df06312..73cfdef 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -27,10 +27,14 @@
 #ifdef HAVE_SYS_SHM_H
 #include 
 #endif
+#ifdef MAP_HUGETLB
+#include 
+#endif
 
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"
 
 
 typedef key_t IpcMemoryKey;/* shared memory key passed to 
shmget(2) */
@@ -61,6 +65,19 @@ typedef int IpcMemoryId; /* shared memory ID 
returned by shmget(2) */
 #define MAP_FAILED ((void *) -1)
 #endif
 
+#ifdef MAP_HUGETLB
+#  ifdef __ia64__
+#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
+#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
+#  else
+#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
+#define PG_MAP_HUGETLB MAP_HUGETLB
+#  endif
+#else
+#  define PG_MAP_HUGETLB 0
+#endif
+
+
 
 unsigned long UsedShmemSegID = 0;
 void  *UsedShmemSegAddr = NULL;
@@ -342,6 +359,154 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long 
id2)
 }
 
 
+#ifdef MAP_HUGETLB
+#define HUGE_PAGE_INFO_DIR  "/sys/kernel/mm/hugepages"
+
+/*
+ * static long InternalGetFreeHugepagesCount(const char *name)
+ *
+ * Attempt to read the number of available hugepages from
+ * /sys/kernel/mm/hugepages/hugepages-/free_hugepages
+ * Will fail (return -1) if file could not be opened, 0 if no pages are 
available
+ * and > 0 if there are free pages
+ *
+ */
+static long
+InternalGetFreeHugepagesCount(const char *name)
+{
+   int fd;
+   char buff[1024];
+   size_t len;
+   long result;
+   char *ptr;
+
+   len = snprintf(buff, 1024, "%s/%s/free_hugepages", HUGE_PAGE_INFO_DIR, 
name);
+   if (len == 1024) /* I don't think that this will happen ever */
+   {
+   ereport(WARNING,
+   (errmsg("Filename %s/%s/free_hugepages is too 
long", HUGE_PAGE_INFO_DIR, name),
+errcontext("while checking hugepage size")));
+   return -1;
+   }
+
+   fd = open(buff, O_RDONLY);
+   if (fd <= 0)
+   {
+   ereport(WARNING,
+   (errmsg("Could not open file %s: %s", buff, 
strerror(errno)),
+errcontext("while checking hugepage size")));
+   return -1;
+   }
+
+   len = read(fd, buff, 1024);
+   if (len <= 0)
+   {
+   ereport(WARNING,
+   (errmsg("Error reading from file %s: %s", buff, 
strerror(errno)),
+errcontext("while checking hugepage size")));
+   close(fd);
+   return -1;
+   }
+
+   /*
+* If the content of free_hugepages is longer than or equal to 1024 
bytes
+* the rest is irrelevant; we simply want to know if there are any
+* hugepages left
+*/
+   if (len == 1024)
+   {
+   buff[1023] = 0;
+   }
+   else
+   {
+   buff[len] = 0;
+   }
+
+   close(fd);
+
+   result = strtol(buff, &ptr, 10);
+
+   if (ptr == NULL)
+   {
+   ereport(WARNING,
+   (errmsg("Could not convert contents of file 
%s/%s/free_hugepages to number", HUGE_PAGE_INFO_DIR, name),
+errcontext("while checking hugepage size")));
+   return -1;
+   }
+
+   return result;
+}
+
+/*
+ * static long InternalGetHugepageSize()
+ *
+ * Attempt to get a valid hugepage size from /sys/kernel/mm/hugepages/ by
+ * reading directory contents
+ * Will fail (return -1) if the directory could not be opened or no valid
+ * page sizes are available. Will return the biggest hugepage size on
+ * success.
+ *
+ */
+long
+InternalGetHugepage

Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hi,

On 29/10/12 21:14, Peter Geoghegan wrote:
> I have a few initial observations on this.

Thanks for your feedback.

>
> * I think you should be making the new GUC PGC_INTERNAL on platforms
> where MAP_HUGETLB is not defined or available. See also,
> effective_io_concurrency. This gives sane error handling.

Ok, sounds good for me.

> * Why aren't you failing when HUGE_TLB_ON and the mmap fails? You do
> the same thing as HUGE_TLB_TRY, contrary to your documentation:
>
>  +if (huge_tlb_pages == HUGE_TLB_ON || huge_tlb_pages == 
> HUGE_TLB_TRY)

No, what I did was mmap()ing the memory with MAP_HUGETLB and falling
back to mmap() w/o MAP_HUGETLB when mmap() failed. But there was
another bug, I didn't mmap() at all when huge_tlb_pages == off.

> * I think you should add MAP_HUGETLB to PG_MMAP_FLAGS directly, rather
> than XOR'ing after the fact. We already build the flags that comprise
> PG_MMAP_FLAGS using another set of intermediary flags, based on
> platform considerations, so what you're doing here is just
> inconsistent.

This would mean that I have to disable the bit when huge_tlb_pages ==
off. I think it is better to enable it if wanted and to just leave the
flags as they were when this feature has been turned off, do you
disagree?

> Don't wrap the mmap() code in ifdefs, and instead rely
> on the GUC being available on all platforms, with setting the GUC
> causing an error on unsupported platforms, in the style of
> effective_io_concurrency, as established in commit
> 3b07182e613a0e63ff662e3dc7f820f010923ec7 . Build a PG_MAP_HUGETLB
> intermediary flag on all platforms.

Ok, this sounds good. It will remove complexity from the code.

> * Apparently you're supposed to do this for the benefit of Itanic [1]:
>
> /* Only ia64 requires this */
> #ifdef __ia64__
> #define ADDR (void *)(0x8000UL)
> #define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_FIXED)
> #else
> #define ADDR (void *)(0x0UL)
> #define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
> #endif

Hm… is this enough? Or do we have to do more work for ia64?

> * You aren't following our style guide with respect to error messages [2].

Thanks, I wasn't aware of this. I attached a new version of the patch.

Greetings,
 CK


pgp3AmMtkv8SY.pgp
Description: PGP signature