Re: shmmax tops out at 2G?

2009-03-13 Thread Ivan Voras
Brian A. Seklecki wrote:
 Thanks to all; with the r1.114 changes, our staff reports the following:
 
 Postgres is able to start with a ~3GB postgresql.conf(5) $shared_buffer
 on 8-CURRENT/amd64:

It has recently also been MFC-ed to 7-STABLE :)

(beware of instabilities and debugging in -CURRENT!)

   PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
  1036 pgsql       1  44    0  3013M 79296K select   0:00  0.00% postgres
 
 kern.ipc.shmall: 786432
 kern.ipc.shmmax: 3221225472
 
 FreeBSD db0X 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Thu Mar 12 09:38:36 EDT
 2009 f...@db02:/usr/obj/usr/src/sys/GENERIC  amd64
 
 ~BAS
 
 On Mon, 2009-02-23 at 10:50 -0500, Brian A. Seklecki wrote:
 On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
 In response to Bill Moran wmoran at collaborativefusion.com:
 sysctl kern.ipc.shmmax=22
 kern.ipc.shmmax: 21 -> -2094967296
 Someone was nice enough to file a PR related to this:

 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130274

 We'd be happy to sponsor development in -current to address this
 limitation.  ~BAS
 
 
 ___
 freebsd-hackers@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
 To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
 





Re: shmmax tops out at 2G?

2009-03-12 Thread Brian A. Seklecki
Thanks to all; with the r1.114 changes, our staff reports the following:

Postgres is able to start with a ~3GB postgresql.conf(5) $shared_buffer
on 8-CURRENT/amd64:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 1036 pgsql       1  44    0  3013M 79296K select   0:00  0.00% postgres

kern.ipc.shmall: 786432
kern.ipc.shmmax: 3221225472

FreeBSD db0X 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Thu Mar 12 09:38:36 EDT
2009 f...@db02:/usr/obj/usr/src/sys/GENERIC  amd64

~BAS

On Mon, 2009-02-23 at 10:50 -0500, Brian A. Seklecki wrote:
  On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
  In response to Bill Moran wmoran at collaborativefusion.com:
   sysctl kern.ipc.shmmax=22
   kern.ipc.shmmax: 21 -> -2094967296
 
 Someone was nice enough to file a PR related to this:
 
 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130274
 
 We'd be happy to sponsor development in -current to address this
 limitation.  ~BAS




Re: shmmax tops out at 2G?

2009-02-24 Thread Garrett Cooper
On Mon, Feb 23, 2009 at 12:16 PM, Bill Moran wmo...@potentialtech.com wrote:
 In response to Christian Peron c...@freebsd.org:

 On Mon, Feb 23, 2009 at 11:58:09AM -0800, Garrett Cooper wrote:
 [..]
 
      Why isn't the field an unsigned int / size_t? I don't see much value
  in having the size be signed...

 No idea :) This code long predates me.

 It's that way because the original Sun spec for the API said so.

 It makes little sense to change it just to unsigned.  The additional 2G
 it would give doesn't really solve the tuning problem on a 64G system.
 This is simply a spec that has become outdated by modern hardware.

Ah, but an unsigned integer on a 64-bit system supports that kind of
precision ;). Or are you saying you're crazy enough to run PAE mode
with that much RAM 0-o?

Then again the bug filer's statement doesn't make sense given the data
-- there must be an int32_t used somewhere that's mucking up the
system. Trying to compile the test app with -Wall, this is what I see:

[gcoo...@optimus ~]$ gcc -Wall -o test4 test4.c
test4.c: In function 'main':
test4.c:11: warning: 'return' with no value, in function returning non-void
test4.c:13: warning: format '%zd' expects type 'signed size_t', but
argument 2 has type 'int'

#include <stdio.h>
#include <sys/shm.h>

int main() {
        size_t size = 2*1024*1024*1024l - 4096;
        int segid;
        printf("Requested: %zd\n", size);
        segid = shmget(234, size, IPC_CREAT);
        if(segid == -1) {
                perror("Died");
                return;
        }
        printf("SHM ID : %zd\n", segid);
}

So I'm not 100% sure that this issue isn't a coding error, or that the
sample app isn't just incorrect... When the error comes back from
perror, though, it is EINVAL, not ENOMEM or something similar.
Not sure if it's because the value is truly invalid, or if it's
just a bad return code.

My 2 cents...
-Garrett


Re: shmmax tops out at 2G?

2009-02-24 Thread Nate Eldredge

On Tue, 24 Feb 2009, Garrett Cooper wrote:


On Mon, Feb 23, 2009 at 12:16 PM, Bill Moran wmo...@potentialtech.com wrote:

In response to Christian Peron c...@freebsd.org:


On Mon, Feb 23, 2009 at 11:58:09AM -0800, Garrett Cooper wrote:
[..]


    Why isn't the field an unsigned int / size_t? I don't see much value
in having the size be signed...


No idea :) This code long predates me.


It's that way because the original Sun spec for the API said so.

It makes little sense to change it just to unsigned.  The additional 2G
it would give doesn't really solve the tuning problem on a 64G system.
This is simply a spec that has become outdated by modern hardware.


Ah, but an unsigned integer on a 64-bit system supports that kind of
precision ;). Or are you saying you're crazy enough to run PAE mode
with that much RAM 0-o?


int and unsigned on amd64 are 32-bit types.  To get a 64-bit integer, you 
need (unsigned) long.


--

Nate Eldredge
neldre...@math.ucsd.edu

Re: shmmax tops out at 2G?

2009-02-24 Thread Kostik Belousov
On Mon, Feb 23, 2009 at 01:08:28PM -0600, Christian Peron wrote:
 This issue has come up a number of times.  I was looking into fixing this but I
 just have not had the time.  The basic issue is our shmid_ds structure:
 
  struct shmid_ds {
  struct ipc_perm shm_perm;   /* operation permission structure */
  int shm_segsz;  /* size of segment in bytes */
  pid_t   shm_lpid;   /* process ID of last shared memory op */
  pid_t   shm_cpid;   /* process ID of creator */
  short   shm_nattch; /* number of current attaches */
  time_t  shm_atime;  /* time of last shmat() */
  time_t  shm_dtime;  /* time of last shmdt() */
  time_t  shm_ctime;  /* time of last change by shmctl() */
  void   *shm_internal; /* sysv stupidity */
  };
 
 
 Basically the shm_segsz member needs to be switched from 32 bits (int) to
 64 bits.  The problem is that this breaks the ABI and older versions of
 postgresql will not work.  The solution is to add additional syscalls.
 
 However, every time this issue comes up, the question of whether we should
 fix struct ipc_perm at the same time is asked.  The answer imho is that
 we should; however, this is more complex since semaphores, messages and
 shared memory segments all use it.
 
 The fixes are straightforward; however, making sure we maintain backward
 compatibility is where things become complicated, especially since there
 are multiple layers of backward compat we need to look after.

Yes, this is the right solution. Meanwhile, below is what we use at the
moment to get around this limitation. In usermode, struct shmid_ds is only
used for the IPC_STAT call, ignoring ipcs(1). If we allow that to break for
segments of 2GB and larger, we get an otherwise good workaround. Luckily,
shmget takes size_t instead of int as the segment size.

It might be further tweaked to only allow such large allocations after some
sysctl is set, but I do not see the point.

diff --git a/sys/kern/sysv_shm.c b/sys/kern/sysv_shm.c
index 4e9854d..a945523 100644
--- a/sys/kern/sysv_shm.c
+++ b/sys/kern/sysv_shm.c
@@ -121,7 +121,8 @@ static sy_call_t *shmcalls[] = {
 #define	SHMSEG_ALLOCATED	0x0800
 #define	SHMSEG_WANTED		0x1000
 
-static int shm_last_free, shm_nused, shm_committed, shmalloced;
+static int shm_last_free, shm_nused, shmalloced;
+size_t shm_committed;
 static struct shmid_kernel *shmsegs;
 
 struct shmmap_state {
@@ -250,7 +251,7 @@ shm_deallocate_segment(shmseg)
 
 	vm_object_deallocate(shmseg->u.shm_internal);
 	shmseg->u.shm_internal = NULL;
-	size = round_page(shmseg->u.shm_segsz);
+	size = round_page(shmseg->shm_bsegsz);
 	shm_committed -= btoc(size);
 	shm_nused--;
 	shmseg->u.shm_perm.mode = SHMSEG_FREE;
@@ -270,7 +271,7 @@ shm_delete_mapping(struct vmspace *vm, struct shmmap_state *shmmap_s)
 
 	segnum = IPCID_TO_IX(shmmap_s->shmid);
 	shmseg = &shmsegs[segnum];
-	size = round_page(shmseg->u.shm_segsz);
+	size = round_page(shmseg->shm_bsegsz);
 	result = vm_map_remove(&vm->vm_map, shmmap_s->va, shmmap_s->va + size);
 	if (result != KERN_SUCCESS)
 		return (EINVAL);
@@ -390,7 +391,7 @@ kern_shmat(td, shmid, shmaddr, shmflg)
 		error = EMFILE;
 		goto done2;
 	}
-	size = round_page(shmseg->u.shm_segsz);
+	size = round_page(shmseg->shm_bsegsz);
 #ifdef VM_PROT_READ_IS_EXEC
 	prot = VM_PROT_READ | VM_PROT_EXECUTE;
 #else
@@ -422,7 +423,8 @@ kern_shmat(td, shmid, shmaddr, shmflg)
 
 	vm_object_reference(shmseg->u.shm_internal);
 	rv = vm_map_find(&p->p_vmspace->vm_map, shmseg->u.shm_internal,
-		0, &attach_va, size, (flags & MAP_FIXED)?0:1, prot, prot, 0);
+		0, &attach_va, size, (flags & MAP_FIXED) ? VMFS_NO_SPACE :
+		VMFS_ANY_SPACE, prot, prot, 0);
 	if (rv != KERN_SUCCESS) {
 		vm_object_deallocate(shmseg->u.shm_internal);
 		error = ENOMEM;
@@ -720,7 +722,7 @@ shmget_existing(td, uap, mode, segnum)
 	if (error != 0)
 		return (error);
 #endif
-	if (uap->size && uap->size > shmseg->u.shm_segsz)
+	if (uap->size && uap->size > shmseg->shm_bsegsz)
 		return (EINVAL);
 	td->td_retval[0] = IXSEQ_TO_IPCID(segnum, shmseg->u.shm_perm);
 	return (0);
@@ -732,7 +734,8 @@ shmget_allocate_segment(td, uap, mode)
 	struct shmget_args *uap;
 	int mode;
 {
-	int i, segnum, shmid, size;
+	int i, segnum, shmid;
+	size_t size;
 	struct ucred *cred = td->td_ucred;
 	struct shmid_kernel *shmseg;
 	vm_object_t shm_object;
@@ -790,6 +793,7 @@ shmget_allocate_segment(td, uap, mode)
 	shmseg->u.shm_perm.mode = (shmseg->u.shm_perm.mode & SHMSEG_WANTED) |
 	    (mode & ACCESSPERMS) | SHMSEG_ALLOCATED;
 	shmseg->u.shm_segsz = uap->size;
+	shmseg->shm_bsegsz = uap->size;

Re: shmmax tops out at 2G?

2009-02-23 Thread Brian A. Seklecki
 On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
 In response to Bill Moran wmoran at collaborativefusion.com:
  sysctl kern.ipc.shmmax=22
  kern.ipc.shmmax: 21 -> -2094967296

Someone was nice enough to file a PR related to this:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130274

We'd be happy to sponsor development in -current to address this
limitation.  ~BAS




Re: shmmax tops out at 2G?

2009-02-23 Thread Christian Peron
This issue has come up a number of times.  I was looking into fixing this but I
just have not had the time.  The basic issue is our shmid_ds structure:

 struct shmid_ds {
 struct ipc_perm shm_perm;   /* operation permission structure */
 int shm_segsz;  /* size of segment in bytes */
 pid_t   shm_lpid;   /* process ID of last shared memory op */
 pid_t   shm_cpid;   /* process ID of creator */
 short   shm_nattch; /* number of current attaches */
 time_t  shm_atime;  /* time of last shmat() */
 time_t  shm_dtime;  /* time of last shmdt() */
 time_t  shm_ctime;  /* time of last change by shmctl() */
 void   *shm_internal; /* sysv stupidity */
 };


Basically the shm_segsz member needs to be switched from 32 bits (int) to
64 bits.  The problem is that this breaks the ABI and older versions of
postgresql will not work.  The solution is to add additional syscalls.

However, every time this issue comes up, the question of whether we should
fix struct ipc_perm at the same time is asked.  The answer imho is that
we should; however, this is more complex since semaphores, messages and
shared memory segments all use it.

The fixes are straightforward; however, making sure we maintain backward
compatibility is where things become complicated, especially since there
are multiple layers of backward compat we need to look after.


On Mon, Feb 23, 2009 at 10:50:07AM -0500, Brian A. Seklecki wrote:
  On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
  In response to Bill Moran wmoran at collaborativefusion.com:
   sysctl kern.ipc.shmmax=22
   kern.ipc.shmmax: 21 -> -2094967296
 
 Someone was nice enough to file a PR related to this:
 
 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130274
 
 We'd be happy to sponsor development in -current to address this
 limitation.  ~BAS
 
 


Re: shmmax tops out at 2G?

2009-02-23 Thread Garrett Cooper

On Feb 23, 2009, at 11:08 AM, Christian Peron wrote:

This issue has come up a number of times.  I was looking into fixing this but I
just have not had the time.  The basic issue is our shmid_ds structure:

struct shmid_ds {
    struct ipc_perm shm_perm;   /* operation permission structure */
    int     shm_segsz;  /* size of segment in bytes */
    pid_t   shm_lpid;   /* process ID of last shared memory op */
    pid_t   shm_cpid;   /* process ID of creator */
    short   shm_nattch; /* number of current attaches */
    time_t  shm_atime;  /* time of last shmat() */
    time_t  shm_dtime;  /* time of last shmdt() */
    time_t  shm_ctime;  /* time of last change by shmctl() */
    void   *shm_internal; /* sysv stupidity */
};

Basically the shm_segsz member needs to be switched from 32 bits (int) to
64 bits.  The problem is that this breaks the ABI and older versions of
postgresql will not work.  The solution is to add additional syscalls.

However, every time this issue comes up, the question of whether we should
fix struct ipc_perm at the same time is asked.  The answer imho is that
we should; however, this is more complex since semaphores, messages and
shared memory segments all use it.

The fixes are straightforward; however, making sure we maintain backward
compatibility is where things become complicated, especially since there
are multiple layers of backward compat we need to look after.


On Mon, Feb 23, 2009 at 10:50:07AM -0500, Brian A. Seklecki wrote:

On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:

In response to Bill Moran wmoran at collaborativefusion.com:

sysctl kern.ipc.shmmax=22
kern.ipc.shmmax: 21 -> -2094967296


Someone was nice enough to file a PR related to this:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/130274

We'd be happy to sponsor development in -current to address this
limitation.  ~BAS


	Why isn't the field an unsigned int / size_t? I don't see much value  
in having the size be signed...

-Garrett


Re: shmmax tops out at 2G?

2009-02-23 Thread Christian Peron
On Mon, Feb 23, 2009 at 11:58:09AM -0800, Garrett Cooper wrote:
[..]
 
   Why isn't the field an unsigned int / size_t? I don't see much value 
 in having the size be signed...

No idea :) This code long predates me. 


Re: shmmax tops out at 2G?

2009-02-23 Thread Bill Moran
In response to Christian Peron c...@freebsd.org:

 On Mon, Feb 23, 2009 at 11:58:09AM -0800, Garrett Cooper wrote:
 [..]
  
  Why isn't the field an unsigned int / size_t? I don't see much value 
  in having the size be signed...
 
 No idea :) This code long predates me. 

It's that way because the original Sun spec for the API said so.

It makes little sense to change it just to unsigned.  The additional 2G
it would give doesn't really solve the tuning problem on a 64G system.
This is simply a spec that has become outdated by modern hardware.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/


Re: shmmax tops out at 2G?

2006-12-13 Thread Bill Moran
In response to Bill Moran [EMAIL PROTECTED]:
 
 [I sent this to questions@ yesterday and have yet to get a response.  I
 suspect it may be a little more technical than [EMAIL PROTECTED]
 
 uname -a
 FreeBSD db00.lab00 6.2-BETA3 FreeBSD 6.2-BETA3 #1: Fri Dec  8 09:27:37 EST 
 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DB-2850-amd64  amd64
 
 sysctl kern.ipc.shmmax=22
 kern.ipc.shmmax: 21 -> -2094967296
 
 Looks like a signed 32-bit int.  That doesn't seem to scale as well as
 would be expected on 64-bit arch (or PAE for that matter).
 
 Is this a mistake, or intentional?  I'm working with some big memory
 systems, and I sure would like to allocate more than 2G for PostgreSQL
 to use ...

Responding/following up on my own post ...

Kris Kennaway [EMAIL PROTECTED] wrote:
 Bill's guess is probably right, so someone needs to go over the sysv
 ipc code and make it 64-bit capable.

OK.  Looks like I've volunteered.  I know little to nothing about the
internals of the sysv system, so some of this may look pretty
uninformed.

A first run through finds src/sys/sys/shm.h, which has a few obvious
data structures that need to be updated:

struct shmid_ds {
struct ipc_perm shm_perm;   /* operation permission structure */
int shm_segsz;  /* size of segment in bytes */
pid_t   shm_lpid;   /* process ID of last shared memory op */
pid_t   shm_cpid;   /* process ID of creator */
short   shm_nattch; /* number of current attaches */
time_t  shm_atime;  /* time of last shmat() */
time_t  shm_dtime;  /* time of last shmdt() */
time_t  shm_ctime;  /* time of last change by shmctl() */
void   *shm_internal;   /* sysv stupidity */
};

struct shminfo {
int shmmax, /* max shared memory segment size (bytes) */
shmmin, /* min shared memory segment size (bytes) */
shmmni, /* max number of shared memory identifiers */
shmseg, /* max shared memory segments per process */
shmall; /* max amount of shared memory (pages) */
};

However, looking at some function declarations in that same file:

int shmget(key_t, size_t, int);

It appears as if those values should have been size_t all along.  I'm
_assuming_ that the return value is an identifier and not a memory
address, which is what the docs seem to imply.

So, my first thought is that all the int values in those structures
should be changed to size_t.  If I understand the use of that type
correctly, it should always be the native word size on the architecture,
but will that make this work for PAE as well, or should those be
changed to uint64_t so they're 8 bytes wide on all archs?

Once I understand a little more about what the correct type is for
those, I'll start looking for where they're used ...

-- 
Bill Moran
Collaborative Fusion Inc.

[EMAIL PROTECTED]
Phone: 412-422-3463x4023


Re: shmmax tops out at 2G?

2006-12-13 Thread Peter Jeremy
On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
In response to Bill Moran [EMAIL PROTECTED]:
 sysctl kern.ipc.shmmax=22
 kern.ipc.shmmax: 21 -> -2094967296
 
 Looks like a signed 32-bit int.  That doesn't seem to scale as well as
 would be expected on 64-bit arch (or PAE for that matter).
 
 Is this a mistake, or intentional?  I'm working with some big memory
 systems, and I sure would like to allocate more than 2G for PostgreSQL
 to use ...

I thought POSIX specified 'int' but I may be mis-remembering.  Tru64
uses int (and 2GB max) whilst Solaris allows 64-bit values.
Logically, shm_segsz and shm{min,max} should be intptr_t, shmall is
less clear but probably should be similar.

int shmget(key_t, size_t, int);

It appears as if those values should have been size_t all along.  I'm
_assuming_ that the return value is an identifier and not a memory
address, which is what the docs seem to imply.

shmget() returns an id that uniquely refers to a shared memory
segment (stupidly designed SysV IPC namespace) and shmat() takes
the id and returns the address.

So, my first thought is that all the int values in those structures
should be changed to size_t.  If I understand the use of that type
correctly, it should always be the native word size on the architecture,

I believe intptr_t is more logical - an integer size that is the
same size as a pointer.  Unfortunately, as I mentioned above, some
of this is specified in standards and logic is usually only present
by accident in such documents.

but will that make this work for PAE as well, or should those be
changed to uint64_t so they're 8 bytes wide on all archs?

PAE is kernel only - userland still sees only 32 bits.  (You can
fit more RAM into the box, but each process is still limited to
4GB - KVM size).  Don't unnecessarily use [u]int64_t as it is
comparatively inefficient on 32-bit architectures.

I know Oracle (at least) avoids the problem on Tru64 by using
multiple SHM segments to allow SGA exceeding 2GB.
-- 
Peter Jeremy




Re: shmmax tops out at 2G?

2006-12-13 Thread Bill Moran

[Kris -- are you interested in this or should I trim you from the CC?]

In response to Peter Jeremy [EMAIL PROTECTED]:
 On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
 In response to Bill Moran [EMAIL PROTECTED]:
  sysctl kern.ipc.shmmax=22
  kern.ipc.shmmax: 21 -> -2094967296
  
  Looks like a signed 32-bit int.  That doesn't seem to scale as well as
  would be expected on 64-bit arch (or PAE for that matter).
  
  Is this a mistake, or intentional?  I'm working with some big memory
  systems, and I sure would like to allocate more than 2G for PostgreSQL
  to use ...
 
 I thought POSIX specified 'int' but I may be mis-remembering.  Tru64
 uses int (and 2GB max) whilst Solaris allows 64-bit values.
 Logically, shm_segsz and shm{min,max} should be intptr_t, shmall is
 less clear but probably should be similar.

So, in your opinion:
struct shmid_ds {
struct ipc_perm shm_perm;   /* operation permission structure */
intptr_t shm_segsz;  /* size of segment in bytes */
pid_t   shm_lpid;   /* process ID of last shared memory op */
pid_t   shm_cpid;   /* process ID of creator */
short   shm_nattch; /* number of current attaches */
time_t  shm_atime;  /* time of last shmat() */
time_t  shm_dtime;  /* time of last shmdt() */
time_t  shm_ctime;  /* time of last change by shmctl() */
void   *shm_internal;   /* sysv stupidity */
};

struct shminfo {
intptr_t shmmax,/* max shared memory segment size (bytes) */
 shmmin;/* min shared memory segment size (bytes) */
int  shmmni,/* max number of shared memory identifiers */
 shmseg,/* max shared memory segments per process */
 shmall;/* max amount of shared memory (pages) */
};

 int shmget(key_t, size_t, int);
 
 It appears as if those values should have been size_t all along.  I'm
 _assuming_ that the return value is an identifier and not a memory
 address, which is what the docs seem to imply.
 
 shmget() returns an id that uniquely refers to a shared memory
 segment (stupidly designed SysV IPC namespace) and shmat() takes
 the id and returns the address.
 
 So, my first thought is that all the int values in those structures
 should be changed to size_t.  If I understand the use of that type
 correctly, it should always be the native word size on the architecture,
 
 I believe intptr_t is more logical - an integer size that is the
 same size as a pointer.  Unfortunately, as I mentioned above, some
 of this is specified in standards and logic is usually only present
 by accident in such documents.

Well, I guess there are a few questions if I want to make changes that
will end up back in the tree:
1) Can anyone quote the standards so we know what they expect?  I got
   the impression that you weren't sure about the standards.
2) If the standards attempt to lock us in to the 2G limit, is FreeBSD
   willing to move forward, thus breaking standards compliance?

 but will that make this work for PAE as well, or should those be
 changed to uint64_t so they're 8 bytes wide on all archs?
 
 PAE is kernel only - userland still sees only 32 bits.  (You can
 fit more RAM into the box, but each process is still limited to
 4GB - KVM size).  Don't unnecessarily use [u]int64_t as it is
 comparatively inefficient on 32-bit architectures.

So intptr_t makes the most sense here, as it will Do the Right Thing
on 64-bit arch, 32-bit arch, and PAE.

-- 
Bill Moran
Collaborative Fusion Inc.


Re: shmmax tops out at 2G?

2006-12-13 Thread John Baldwin
On Wednesday 13 December 2006 13:34, Peter Jeremy wrote:
 On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
 In response to Bill Moran [EMAIL PROTECTED]:
  sysctl kern.ipc.shmmax=22
  kern.ipc.shmmax: 21 -> -2094967296
  
  Looks like a signed 32-bit int.  That doesn't seem to scale as well as
  would be expected on 64-bit arch (or PAE for that matter).
  
  Is this a mistake, or intentional?  I'm working with some big memory
  systems, and I sure would like to allocate more than 2G for PostgreSQL
  to use ...
 
 I thought POSIX specified 'int' but I may be mis-remembering.  Tru64
 uses int (and 2GB max) whilst Solaris allows 64-bit values.
 Logically, shm_segsz and shm{min,max} should be intptr_t, shmall is
 less clear but probably should be similar.

Actually, unless you are holding a pointer, it should be size_t.  size_t is 
also the same size as a pointer (in practice), but it's for the size of 
objects in memory (i.e. what sizeof() returns), so I do think size_t is more 
appropriate.  The painful thing here will be destroying the SYSVIPC ABI on 
64-bit archs.  Bill, you should go talk to Robert Watson and time it with him 
as he wants to fix SYSVIPC to use uid_t which breaks the ABI, and if we're 
going to break the ABI, we should do it all at once. :)

-- 
John Baldwin


shmmax tops out at 2G?

2006-12-12 Thread Bill Moran

[I sent this to questions@ yesterday and have yet to get a response.  I
suspect it may be a little more technical than [EMAIL PROTECTED]

uname -a
FreeBSD db00.lab00 6.2-BETA3 FreeBSD 6.2-BETA3 #1: Fri Dec  8 09:27:37 EST 2006 
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/DB-2850-amd64  amd64

sysctl kern.ipc.shmmax=22
kern.ipc.shmmax: 21 -> -2094967296

Looks like a signed 32-bit int.  That doesn't seem to scale as well as
would be expected on 64-bit arch (or PAE for that matter).

Is this a mistake, or intentional?  I'm working with some big memory
systems, and I sure would like to allocate more than 2G for PostgreSQL
to use ...

-- 
Bill Moran
Collaborative Fusion Inc.