Re: Stacking lots of IP's on a single box - any 'gotchas'?

2013-07-31 Thread Sergey Kandaurov
On 31 July 2013 13:37, Karl Pielorz kpielorz_...@tdx.co.uk wrote:

 Hi,

 We've got a number of boxes we'd like to consolidate - this could mean
 upward of 1,500 IP's on a single box (9.1 amd64).

 Last time we did anything like this we hit an issue at around 900 (ntpd
 'binds' by default to all available IP's - I think we had a workaround for
 that).


This is because select() is limited to FD_SETSIZE (1024) descriptors.
If it tries to select > 1024 fds, bad things could happen.
Newer ntpd (not in base) has a feature to bind only to a specific
interface; this was used to run ntpd on boxes with > 1200 IPs on 1 i/face.
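
To illustrate the FD_SETSIZE limit mentioned above, here is a minimal sketch (not ntpd's actual code) of the guard a select()-based daemon needs before adding a descriptor to an fd_set. The helper name safe_fd_set() is made up for this example.

```c
#include <sys/select.h>

/*
 * Add fd to set only if it fits. FD_SET() on a descriptor >= FD_SETSIZE
 * writes past the end of the fd_set bitmap, which is the "bad things"
 * case. Returns 0 on success, -1 if fd cannot be represented.
 * (safe_fd_set is a hypothetical helper, not a libc function.)
 */
int
safe_fd_set(int fd, fd_set *set)
{
	if (fd < 0 || fd >= FD_SETSIZE)
		return (-1);
	FD_SET(fd, set);
	return (0);
}
```

A daemon that opens one socket per address reaches this limit at roughly FD_SETSIZE addresses, so it either has to bind fewer sockets or use an interface such as poll(2) or kqueue(2) that has no fixed bitmap size.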

 But is there any hard limit we're likely to encounter putting so many IP's
 on a single machine? - Are there any limits that would likely need tuning to
 support that many IP's?


Unlikely, besides those unrelated things like ntpd+select() et al.

-- 
wbr,
pluknet
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: hw.physmem/hw.realmem question

2013-07-10 Thread Sergey Kandaurov
On 3 July 2013 01:45, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl wrote:
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
 real memory  = 34359738368 (32768 MB)
 avail memory = 32191340544 (30700 MB)


 2GB memory disappears too even when you don't set anything.

 i asked such a question for other machine some time ago without much answer.


 in your laptop it may be shared graphics memory reserved by chipset

 still on my dell server


 real memory  = 34359738368 (32768 MB)
 avail memory = 33166921728 (31630 MB)

 i have over 1GB unavailable and it doesn't have shared graphics memory.

 it would be nice to be able to look exactly how memory is used.

On amd64 about 3% is cut on startup for page structures, see vm_page_startup().
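
As a rough check of that figure against the dmesg numbers quoted in this thread, the gap between "real" and "avail" memory can be computed directly. This is only arithmetic on the reported values; the function names are made up for the example.

```c
#include <stdint.h>

/* Bytes reported as "real memory" but not as "avail memory". */
uint64_t
reserved_bytes(uint64_t real, uint64_t avail)
{
	return (real - avail);
}

/* The same gap in hundredths of a percent of real memory (347 == 3.47%). */
unsigned
reserved_centipercent(uint64_t real, uint64_t avail)
{
	return ((unsigned)((real - avail) * 10000 / real));
}
```

For the Dell server (34359738368 real, 33166921728 avail) this gives 1192816640 bytes, about 3.47%, which is in line with the estimate above. For the first machine (32191340544 avail) it gives about 6.31%, so something beyond the page structures is presumably reserved there as well.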

-- 
wbr,
pluknet


Re: UFS+QUOTA+GIANT

2012-05-04 Thread Sergey Kandaurov
On 3 May 2012 23:01, Bryan Drewery br...@shatow.net wrote:

 Hi,

 I recently was re-evaluating my needs for a custom kernel vs GENERIC.
 One of these was due to QUOTA support, which apparently is not in
 GENERIC due to the GIANT lock [1].

This is no longer true since 6.4.
It's just that nobody has cared to turn quotas on in GENERIC.

-- 
wbr,
pluknet


Re: Approaching the limit on PV entries

2012-03-21 Thread Sergey Kandaurov
On 21 March 2012 19:19, John Baldwin j...@freebsd.org wrote:
 On Tuesday, March 20, 2012 11:37:57 am Sergey Kandaurov wrote:
 On 22 November 2011 19:29, Mark Saad nones...@longcount.org wrote:
  Hello All

 [found this mail in my drafts, not sure if my answer is still useful]

   I want to get to the bottom of a warning in dmesg. On 7.2-RELEASE and
  7.3-RELEASE I have seen the following warning in dmesg.
 
  Approaching the limit on PV entries, consider increasing either the
  vm.pmap.shpgperproc or the vm.pmap.pv_entry_max sysctl.
 
  So looking around I see a few posts here and there about how to tune
  the sysctls to address the warning however I am not 100% sure what
  each value does.
  It appears changing vm.pmap.shpgperproc affects the value of
  vm.pmap.pv_entry_max . Can someone explain the relationship of the two
  sysctls. Also

 This is how they are calculated.

 pv_entry_max = shpgperproc * maxproc + cnt.v_page_count;

 and, respectively,

 shpgperproc = (pv_entry_max - cnt.v_page_count) / maxproc;

 So changing one sysctl changes the other, and vice versa.

  what pitfalls of changing them are.

 Not known to me (on amd64 platform).
 I have had vm.pmap.shpgperproc=15000 on 8.1 amd64 with 4G RAM
 to make some badly written commercial software work until it
 was decommissioned and scrapped.

 FYI, Alan just removed this warning and the associated sysctls from HEAD
 yesterday because they were made obsolete several years ago.  I think they are
 obsolete even on 7.  Certainly on 8.

Yep, and since the switch to the direct map (somewhere around 7.x on amd64?)
made the PV entry limit effectively obsolete, this is really cool.

-- 
wbr,
pluknet


Re: Approaching the limit on PV entries

2012-03-20 Thread Sergey Kandaurov
On 22 November 2011 19:29, Mark Saad nones...@longcount.org wrote:
 Hello All

[found this mail in my drafts, not sure if my answer is still useful]

  I want to get to the bottom of a warning in dmesg. On 7.2-RELEASE and
 7.3-RELEASE I have seen the following warning in dmesg.

 Approaching the limit on PV entries, consider increasing either the
 vm.pmap.shpgperproc or the vm.pmap.pv_entry_max sysctl.

 So looking around I see a few posts here and there about how to tune
 the sysctls to address the warning however I am not 100% sure what
 each value does.
 It appears changing vm.pmap.shpgperproc affects the value of
 vm.pmap.pv_entry_max . Can someone explain the relationship of the two
 sysctls. Also

This is how they are calculated.

pv_entry_max = shpgperproc * maxproc + cnt.v_page_count;

and, respectively,

shpgperproc = (pv_entry_max - cnt.v_page_count) / maxproc;

So changing one sysctl changes the other, and vice versa.
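
The two formulas can be written as a pair of C helpers to see how they round-trip. This is a sketch of the arithmetic only, not the pmap code; the function names and the sample values in the note below are made up.

```c
/* pv_entry_max as derived from shpgperproc (first formula above). */
long
pv_entry_max_from(long shpgperproc, long maxproc, long v_page_count)
{
	return (shpgperproc * maxproc + v_page_count);
}

/* shpgperproc as derived from pv_entry_max (second formula above). */
long
shpgperproc_from(long pv_entry_max, long maxproc, long v_page_count)
{
	/* Integer division: a hand-set pv_entry_max may round down here. */
	return ((pv_entry_max - v_page_count) / maxproc);
}
```

With illustrative values shpgperproc=200, maxproc=6164 and v_page_count=1000000, the first helper gives 2232800 and the second maps that back to 200.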

 what pitfalls of changing them are.

Not known to me (on amd64 platform).
I have had vm.pmap.shpgperproc=15000 on 8.1 amd64 with 4G RAM
to make some badly written commercial software work until it
was decommissioned and scrapped.

 Also why would setting
 kern.ipc.shm_use_phys=1  affect the pv entries. Is this supposed to
 lower the pv entries ?

Changing this sysctl and restarting a quite busy PgSQL server helped
me reduce PV entries from 14M to tens of thousands (though that
could just have coincided with a decrease in workload).

-- 
wbr,
pluknet


Re: Tiny 'tunefs' bug

2012-02-10 Thread Sergey Kandaurov
2012/2/10  rank1see...@gmail.com:
 Seems like there is a little bug in 'tunefs' binary.
 When I strip '/dev/' to have a shorter CMD, it reports an error, even though it does
 finish its task.

 # tunefs -j enable ufsid/4edc992e27d147ce
 tunefs: Can't stat ufsid/4edc992e27d147ce: No such file or directory
 Using inode 13 in cg 0 for 8388608 byte journal
 tunefs: soft updates journaling set

 # tunefs -p ufsid/4edc992e27d147ce
 tunefs: Can't stat ufsid/4edc992e27d147ce: No such file or directory
 tunefs: POSIX.1e ACLs: (-a)                                disabled
 tunefs: NFSv4 ACLs: (-N)                                   disabled
 tunefs: MAC multilabel: (-l)                               disabled
 tunefs: soft updates: (-n)                                 enabled
 tunefs: soft update journaling: (-j)                       enabled
 tunefs: gjournal: (-J)                                     disabled
 tunefs: trim: (-t)                                         disabled
 tunefs: maximum blocks per file in a cylinder group: (-e)  2048
 tunefs: average file size: (-f)                            16384
 tunefs: average number of files in a directory: (-s)       64
 tunefs: minimum percentage of free space: (-m)             8%
 tunefs: optimization preference: (-o)                      time
 tunefs: volume label: (-L)

It seems like this was changed with svn rev 207421 in 9.x.

-- 
wbr,
pluknet


Re: accepting rtadv broken on 9-STABLE, re driver?

2012-01-06 Thread Sergey Kandaurov
On 6 January 2012 22:19, Mark Felder f...@feld.me wrote:
 Hi guys,


Hi,

 I upgraded my desktop at work just around christmas to 9-PRERELEASE builds
 and ipv6 has been broken since then. I've been too busy at work to fix it
 but today I finally had the chance to figure it out.

 Currently I'm running:

 12:11:15 tech304:~  uname -a
 FreeBSD tech304.office.supranet.net 9.0-STABLE FreeBSD 9.0-STABLE #2
 r229703M: Fri Jan  6 11:01:58 CST 2012
 r...@tech304.office.supranet.net:/usr/obj/tank/svn/sys/GENERIC  amd64

 and my ipv6 is not working. In rc.conf I have
 ipv6_enable_all_interfaces=YES which sets the link local and I had

You mean ipv6_activate_all_interfaces=YES ?

 net.inet6.ip6.accept_rtadv=1 in sysctl.conf. I can confirm that it was
 indeed activated in sysctl, but ifconfig didn't think so:

 re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether d0:67:e5:17:e1:32
        inet6 fe80::d267:e5ff:fe17:e132%re0 prefixlen 64 scopeid 0x2
        inet 192.168.93.23 netmask 0xffffff00 broadcast 192.168.93.255
        nd6 options=23<PERFORMNUD,AUTO_LINKLOCAL>    ## Where's the
 ACCEPT_RTADV???
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active

 I have to manually do

 # ifconfig re0 inet6 accept_rtadv

 to get it to work. Am I missing something? Grepping /etc/rc.d/ for rtadv
 finds no clues. Is this broken for everyone, for the re driver, or am I just
 crazy?

What is in your rc.conf? Do you have the accept_rtadv keyword in it?
IIRC it should be enough to specify ifconfig_re0_ipv6="inet6 accept_rtadv"
without additional tweaks. Consult rc.conf(5).

HTH,
pluknet


Re: [PATCH] Fix kenv(1) output in w/respect to new boot loader variables

2011-12-29 Thread Sergey Kandaurov
On 28 December 2011 05:26, Devin Teske devin.te...@fisglobal.com wrote:
 D'Oh! Attached wrong (OLD; already applied) patch.

 Please find appropriate patch attached!

Hi.

I committed your patch to head as svn r228985.
Thank you!


 -----Original Message-----
 From: Devin Teske [mailto:devin.te...@fisglobal.com]
 Sent: Tuesday, December 27, 2011 5:24 PM
 To: 'freebsd-hackers@freebsd.org'
 Cc: Garrett Cooper; devin.te...@fisglobal.com
 Subject: [PATCH] Fix kenv(1) output in w/respect to new boot loader variables

 Garrett Cooper and a few others have requested that I write a patch to fix a
 regression w/respect to kenv(1) output in FreeBSD-9.0 and HEAD.

 The issue is with the new boot loader menu. It adds many loader variables
 including ones that contain ANSI color escapes.

 Obviously, these ANSI codes don't play well with serial consoles when kenv(1)
 is executed without arguments (reports vary as to what happens, but it's never
 pretty).

 Attached is a patch to the Forth code that clears-out the menu-associated
 variables before invoking the kernel.

 The net-effect is that kenv(1) no longer reports menu-related variables.

 In essence, kenv(1) output should now appear the same as on RELENG_8 (which
 lacks the new boot loader and didn't use any such variables). Thus, restoring
 serial console glory.
 --
 Devin


Great!

-- 
wbr,
pluknet


Re: strange printf(9) format specifier (Z) in dev/drm code

2011-12-05 Thread Sergey Kandaurov
On 5 December 2011 02:22, Alexander Best arun...@freebsd.org wrote:
 hi there,

 i was going through the clang warnings from a GENERIC buildkernel and noticed
 the following:

 ===> drm/mga (all)
 /usr/subversion-src/sys/modules/drm/mga/../../../dev/drm/mga_state.c:56:2:
 error: invalid conversion specifier 'Z' [-Werror,-Wformat-invalid-specifier]
        BEGIN_DMA(2);
        ^~~~~~~~~~~~
 @/dev/drm/mga_drv.h:291:35: note: expanded from:
                DRM_INFO( "   space=0x%x req=0x%Zx\n",                  \
                                                ^
 @/dev/drm/drmP.h:317:60: note: expanded from:
 #define DRM_INFO(fmt, ...)  printf("info: [" DRM_NAME "] " fmt , ##__VA_ARGS__)
                                                            ^
 these lines should cover all warnings:

 otaku% egrep -r "%[0-9]*Zx" /usr/src/sys/dev/drm
 dev/drm/mga_drv.h:              DRM_INFO( "   space=0x%x req=0x%Zx\n",                  \
 dev/drm/mga_drv.h:              DRM_INFO( "   DMA_WRITE( 0x%08x ) at 0x%04Zx\n",        \

 ... i couldn't find a reference to an uppercase Z in the printf(9) man page.
 i talked to dinoex on #freebsd-clang (EFNet) and he said that the Z might
 come from linux's libc5 and is the equivalent to glibc's z.

 can we adjust those lines, so the clang warnings disappear?

Hi, Alexander.

Can you build-test with this change?
Thanks in advance.

Index: sys/dev/drm/mga_drv.h
===================================================================
--- sys/dev/drm/mga_drv.h  (revision 228276)
+++ sys/dev/drm/mga_drv.h  (working copy)
@@ -288,7 +288,7 @@
 do {                                                                   \
        if ( MGA_VERBOSE ) {                                            \
                DRM_INFO( "BEGIN_DMA( %d )\n", (n) );                   \
-               DRM_INFO( "   space=0x%x req=0x%Zx\n",                  \
+               DRM_INFO( "   space=0x%x req=0x%x\n",                   \
                          dev_priv->prim.space, (n) * DMA_BLOCK_SIZE ); \
        }                                                               \
        prim = dev_priv->prim.start;                                    \
@@ -338,7 +338,7 @@
 #define DMA_WRITE( offset, val )                                       \
 do {                                                                   \
        if ( MGA_VERBOSE ) {                                            \
-               DRM_INFO( "   DMA_WRITE( 0x%08x ) at 0x%04Zx\n",        \
+               DRM_INFO( "   DMA_WRITE( 0x%08x ) at 0x%04x\n",         \
                          (u32)(val), write + (offset) * sizeof(u32) ); \
        }                                                               \
        *(volatile u32 *)(prim + write + (offset) * sizeof(u32)) = val; \
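
For comparison, and not as part of the proposed patch: in standard C99 the lowercase 'z' length modifier is the portable way to print a size_t, which is what the old 'Z' was reported to mean. A small userland sketch, with a made-up helper name:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t in hex with the C99 'z' length modifier.
 * Returns the snprintf() result (characters that were or would be written).
 */
int
format_req(char *buf, size_t buflen, size_t req)
{
	return (snprintf(buf, buflen, "req=0x%zx", req));
}
```

Whether %x or %zx is the better replacement in the driver depends on the actual type of the argument, which should be checked against the macro's callers.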

-- 
wbr,
pluknet


Re: Acquiring ACPI_LOCK(acpi) from kernel module during boot process

2011-10-14 Thread Sergey Kandaurov
On 14 October 2011 16:11, Maxim Ignatenko gelraen...@gmail.com wrote:
 Hi,

 I have this code:
 https://gitorious.org/acpi_call-freebsd/acpi_call-freebsd/blobs/5e6a79869721a2bd8de88b5cfa90c14b429cb5c7/acpi_call.c
 It works just fine when loaded into kernel manually, but crashes if
 loaded during boot via loader.conf: http://i.imgur.com/fLPen.png

 I've added some printf's to acpi_register_ioctl() to track down where
 it hangs and crashes after about one minute:
 http://pastebin.com/vvJutWLA

 What am I missing? Do I need to somehow (how?) specify module
 initialization order? Or just call acpi_register_ioctl() by some other
 mean when it would not cause panic?


Hi.

The call of mtx_lock_spin() (as seen from your attached screenshot)
on MTX_DEF acpi mutex tells me that you try to use it before it was
initialized. This is usually done in acpi_attach() routine which is
called with SI_SUB_DRIVERS (? - correct me if I'm wrong) order.
Your module is initialized with the earlier SI_SUB_KLD order.
That also might depend on whether acpi.ko is statically compiled in,
even though you have MODULE_DEPEND(acpi_call, acpi, 1, 1, 1);

First I would change the order in DECLARE_MODULE() to a more
common SI_SUB_EXEC.

-- 
wbr,
pluknet


Re: module_register_init fails, but driver is still loaded?

2011-08-05 Thread Sergey Kandaurov
On 4 August 2011 20:23, Garrett Cooper yaneg...@gmail.com wrote:
 Hi hackers,
    I noticed that if anything fails while initializing a driver, the
 driver stays attached to the kernel as a module instead of being
 kicked when all references to the driver go to 0. Is this desired
 behavior (it doesn't seem like it, but I can see potential pros and
 cons of kicking the driver out of the kernel immediately when a
 failure state occurs)? I've seen this on 7.2 ~ 9-CURRENT. Example
 sourcecode and invocation attached below.

Hi.
I have cooked up something that might work, though I don't know how
correct it is from the locking & cleanup side. Can you try it? Anyway, in its
current form we cannot return an error from module_register_init() because
it's usually called from SYSINIT, so kldload(8) will print nonsense:
can't load ./bad_module.ko: No error: 0.

Index: sys/kern/kern_module.c
===================================================================
--- sys/kern/kern_module.c  (revision 224471)
+++ sys/kern/kern_module.c  (working copy)
@@ -112,6 +117,7 @@ module_register_init(const void *arg)
        const moduledata_t *data = (const moduledata_t *)arg;
        int error;
        module_t mod;
+       linker_file_t lf;

        mtx_lock(&Giant);
        MOD_SLOCK;
@@ -123,12 +129,14 @@ module_register_init(const void *arg)
        error = MOD_EVENT(mod, MOD_LOAD);
        if (error) {
                MOD_EVENT(mod, MOD_UNLOAD);
+               lf = mod->file;
                MOD_XLOCK;
                module_release(mod);
                MOD_XUNLOCK;
                printf("module_register_init: MOD_LOAD (%s, %p, %p) error"
                    " %d\n", data->name, (void *)data->evhand, data->priv,
                    error);
+               linker_release_module(NULL, NULL, lf);
        } else {
                MOD_XLOCK;
                if (mod->file) {


-- 
wbr,
pluknet


Re: strange 'vmstat -z' output

2011-07-06 Thread Sergey Kandaurov
On 6 July 2011 05:08, Jason Hellenthal jh...@dataix.net wrote:


 On Wed, Jul 06, 2011 at 04:40:54AM +0400, Sergey Kandaurov wrote:
 On 6 July 2011 02:46, Alexander Best arun...@freebsd.org wrote:
  hi there,
 
  i'm seeing the following with 'vmstat -z' on CURRENT, running on amd64:
 
  ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
  128 Bucket:            1048,      0,     150,       0,    1650,12746,   0
 
  ...how can the number of failures be greater than the number of requests?

 Here REQ is the total number of successful allocations, not just requests.


 I think he is referring to FAIL column not REQ.


Yes. And FAIL is not a subset of REQ. They are separate success / failure counters.
That's why FAIL can be less or greater than REQ.
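
A toy model may make the counter semantics clearer. This is not the UMA implementation, just a sketch with made-up names in which a refused allocation bumps only the failure counter:

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy zone with independent counters (illustrative, not kernel code). */
struct toy_zone {
	size_t limit;        /* max items that may be in use at once */
	size_t used;         /* items currently allocated */
	unsigned long req;   /* successful allocations only */
	unsigned long fail;  /* refused allocations only */
};

void *
toy_alloc(struct toy_zone *z, size_t size)
{
	void *p;

	if (z->used >= z->limit || (p = malloc(size)) == NULL) {
		z->fail++;       /* a failure never touches req */
		return (NULL);
	}
	z->used++;
	z->req++;
	return (p);
}
```

With a limit of one item, one successful call followed by three refused calls leaves req at 1 and fail at 3, so fail exceeding req is nothing unusual under this accounting.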

-- 
wbr,
pluknet


Re: strange 'vmstat -z' output

2011-07-05 Thread Sergey Kandaurov
On 6 July 2011 02:46, Alexander Best arun...@freebsd.org wrote:
 hi there,

 i'm seeing the following with 'vmstat -z' on CURRENT, running on amd64:

 ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
 128 Bucket:            1048,      0,     150,       0,    1650,12746,   0

 ...how can the number of failures be greater than the number of requests?

Here REQ is the total number of successful allocations, not just requests.

-- 
wbr,
pluknet


Re: kvm_open errors on /proc/*/mem in top

2011-06-10 Thread Sergey Kandaurov
On 10 June 2011 00:01, Jim Bryant kc5vdj.free...@gmail.com wrote:
 i'm not sure which list this belongs to, so i'm posting to -hackers and
 -stable.

 i've noticed for a while now that during heavy activity (for instance
 buildworld), that top will get these kvm_read errors when reading proc
 mem entries.

Hi.
I think that is a question of whether it's acceptable to hide all errors
originating in kvm(3):

Index: usr.bin/top/machine.c
===================================================================
--- usr.bin/top/machine.c   (revision 222893)
+++ usr.bin/top/machine.c   (working copy)
@@ -265,7 +265,7 @@
        else if (namelength > UPUNAMELEN)
                namelength = UPUNAMELEN;

-       kd = kvm_open(NULL, _PATH_DEVNULL, NULL, O_RDONLY, "kvm_open");
+       kd = kvm_open(NULL, _PATH_DEVNULL, NULL, O_RDONLY, NULL);
        if (kd == NULL)
                return (-1);

Or rewrite top(1) a little more to open kvm with kvm_openfiles(), and let
the caller decide for itself where it needs to print an error with
kvm_geterr().


 i have included a screenshot of what happens during such events...

 last pid: 92024;  load averages:  4.79,  4.58,
 4.10
 up 0+00:49:07  15:30:53
 225 processes: 10 running, 197 sleeping, 18 waiting
 CPU: 90.6% user,  0.0% nice,  9.4% system,  0.0% interrupt,  0.0% idle
 Mem: 493M Active, 1337M Inact, 604M Wired, 632K Cache, 315M Buf, 524M Free
 Swap: 4097M Total, 4097M Free
 kvm_open: cannot open /proc/86755/mem
  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 91943 root          1  97    0 39536K 33620K RUN     1   0:01  7.37%
 [cc1plus]
 2859 jbryant       1  48    0   406M 72332K select  0   3:10  5.96%
 kwin -session 1028b2382461f50001270420560001955_13
 2747 root          1  46    0   419M   370M select  0   1:43  4.39%
 /usr/local/bin/X :0 -nolisten tcp -auth /var/run/xauth/A:0
 1464 root          1  44    0  8068K  1384K select  0   0:03  0.39%
 /usr/sbin/moused -p /dev/ums0 -t auto -I /var/run/moused.u
 11219 jbryant       7  44    0   299M   109M select  1   0:17  0.29%
 /usr/local/lib/thunderbird/thunderbird-bin
 2865 jbryant       1  45    0   453M 86140K select  0   0:21  0.20%
 kdeinit4: kdeinit4: plasma-desktop (kdeinit4)
 2882 jbryant       1  44    0   391M 60996K select  0   0:17  0.10%
 kdeinit4: kdeinit4: kmix -session 102511e52251c60001304471
 92001 root          1  97    0 23452K 22256K CPU1    1   0:00  0.00% [cc1]
 92017 root          1  96    0 16172K 13440K RUN     0   0:00  0.00% [cc1]

[snip]

-- 
wbr,
pluknet


Re: fdopendir prototype on 7.3-RELEASE amd64

2011-06-01 Thread Sergey Kandaurov
On 1 June 2011 19:27, Klaus T. Aehlig aeh...@linta.de wrote:

 [Please CC me, as I'm not subscribed to this list]

 Hallo,

 while dealing with PR ports/157274 [1], I found that the following
 program cause a segmentation fault on 7.3-RELEASE amd64, even though
 my understanding of the man page of fdopendir(3) says it should not.

 #include <fcntl.h>
 #include <sys/types.h>
 #include <dirent.h>

 int main(int argc, char **argv) {
  DIR *dirp;
  int fd;

  fd = open(".", O_RDONLY);
  dirp = fdopendir(fd);
  (void) readdir(dirp);

 }

 Compiling gives the warning "assignment makes pointer from integer without a
 cast", referring to the line with the fdopendir call. Indeed, adding the prototype

 extern DIR *fdopendir(int);

 right after the #include lines solves this problem. Is my understanding of the
 man page that the above #include lines should suffice incorrect? Is this
 problem known---or even fixed already?

That is because 7.3 mistakenly lacks the fdopendir() declaration in
<dirent.h>, though it is the first release in 7.x that ought to support it.
That was fixed in 7.3-STABLE after the 7.3 release. There should be no problem
for any release from the 8.x branch. Also, the manpage only says that the
function appeared in 8.0; it says nothing about 7.x.

The segmentation fault is indeed due to the missing declaration. Here gcc assumes
that the return type of fdopendir() is int, and truncates the return value to
sizeof(int). [On amd64 a pointer is 64 bits wide and an int is 32 bits.
I guess that 7.3 i386 does not fail here, though it prints the warning.]
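
The effect of the implicit int return type can be modelled without relying on undefined behaviour. The sketch below (helper name and addresses are made up) shows what a 64-bit pointer value looks like after it has passed through a 32-bit int and been widened again:

```c
#include <stdint.h>

/*
 * Model a 64-bit pointer value passing through a 32-bit int return
 * type: the upper half is dropped, then the int is sign-extended back
 * to pointer width.
 */
uint64_t
through_int(uint64_t ptr)
{
	uint32_t low = (uint32_t)ptr;          /* truncation to 32 bits */

	if (low & 0x80000000u)                 /* negative as an int */
		return (0xffffffff00000000ull | low);
	return (low);
}
```

A heap address such as 0x800c01000 comes back as 0xc01000, which no longer points at the DIR structure, hence the crash in readdir(). An address that fits in 31 bits survives unchanged, which matches the guess that i386 is unaffected.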


 I have reports that indicate that this problem also seems to exist on 
 7.3-RELEASE-p4 amd64
 and 8.1-RELEASE i386. The above program does not segfault on my 8.2-STABLE 
 amd64.

Can you recheck it for 8.1? It should not be so.

-- 
wbr,
pluknet


Re: issue with devstat_buildmatch(3) and certain strings

2011-04-01 Thread Sergey Kandaurov
On 1 April 2011 15:37, Sergey Kandaurov pluk...@freebsd.org wrote:
 On 1 April 2011 01:03, Alexander Best arun...@freebsd.org wrote:
 hi there,

 devstat_buildmatch(3) crashes with certain strings. you can test this by
 doing one of:

 iostat -t ,
 iostat -t ,,
 iostat -t da,
 iostat -t ,da,
 iostat -t ,da
 iostat -t da,scsi,
 iostat -t ,da,scsi
 iostat -t da,,scsi

 [Someone told me, -hackers isn't appropriate for patches, Cc: -current.]

 The problem is devstat(3) increments num_args even when strsep()
 returned an empty string.
 I think that should work (all your tests pass):

 Index: lib/libdevstat/devstat.c
 ===================================================================
 --- lib/libdevstat/devstat.c    (revision 220102)
 +++ lib/libdevstat/devstat.c    (working copy)
 @@ -1014,11 +1014,12 @@
         * Break the (comma delimited) input string out into separate strings.
         */
        for (tempstr = tstr, num_args  = 0;
 -            (*tempstr = strsep(&match_str, ",")) != NULL && (num_args < 5);
 -            num_args++)
 -               if (**tempstr != '\0')
 +            (*tempstr = strsep(&match_str, ",")) != NULL && (num_args < 5); )
 +               if (**tempstr != '\0') {
 +                       num_args++;
                        if (++tempstr >= &tstr[5])

BTW,
this game with pointers might prevent devstat(3) from working on big-endian.

                                break;
 +               }

        /* The user gave us too many type arguments */
        if (num_args > 3) {

 Please review, and I will commit the patch.


-- 
wbr,
pluknet


Re: issue with devstat_buildmatch(3) and certain strings

2011-04-01 Thread Sergey Kandaurov
On 1 April 2011 01:03, Alexander Best arun...@freebsd.org wrote:
 hi there,

 devstat_buildmatch(3) crashes with certain strings. you can test this by
 doing one of:

 iostat -t ,
 iostat -t ,,
 iostat -t da,
 iostat -t ,da,
 iostat -t ,da
 iostat -t da,scsi,
 iostat -t ,da,scsi
 iostat -t da,,scsi

[Someone told me, -hackers isn't appropriate for patches, Cc: -current.]

The problem is devstat(3) increments num_args even when strsep()
returned an empty string.
I think that should work (all your tests pass):

Index: lib/libdevstat/devstat.c
===================================================================
--- lib/libdevstat/devstat.c    (revision 220102)
+++ lib/libdevstat/devstat.c    (working copy)
@@ -1014,11 +1014,12 @@
         * Break the (comma delimited) input string out into separate strings.
         */
        for (tempstr = tstr, num_args  = 0;
-            (*tempstr = strsep(&match_str, ",")) != NULL && (num_args < 5);
-            num_args++)
-               if (**tempstr != '\0')
+            (*tempstr = strsep(&match_str, ",")) != NULL && (num_args < 5); )
+               if (**tempstr != '\0') {
+                       num_args++;
                        if (++tempstr >= &tstr[5])
                                break;
+               }

        /* The user gave us too many type arguments */
        if (num_args > 3) {

Please review, and I will commit the patch.
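
The behaviour of strsep() that matters here can be shown in isolation: it returns an empty string for every empty field (as in ",," or a trailing ","), so a counter must only advance for non-empty fields. A standalone sketch, not the libdevstat code, with a made-up function name:

```c
#define _DEFAULT_SOURCE   /* expose strsep() on glibc; unused on FreeBSD */
#include <stddef.h>
#include <string.h>

/*
 * Split a comma-delimited string in place, keeping only non-empty fields.
 * Returns the number of fields stored in out[] (at most max).
 */
int
split_nonempty(char *spec, char *out[], int max)
{
	char *field;
	int n = 0;

	while (n < max && (field = strsep(&spec, ",")) != NULL) {
		if (*field != '\0')   /* "" means an empty field: skip it */
			out[n++] = field;
	}
	return (n);
}
```

Here the count and the output index are the same variable, so they cannot drift apart the way num_args and tempstr did for inputs like "da,," or ",".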

-- 
wbr,
pluknet


Re: issue with devstat_buildmatch(3) and certain strings

2011-04-01 Thread Sergey Kandaurov
On 1 April 2011 18:50, Warner Losh i...@bsdimp.com wrote:
 On Apr 1, 2011, at 5:40 AM, Sergey Kandaurov wrote:

                        if (++tempstr >= &tstr[5])


 BTW,
 this game with pointers might prevent devstat(3) from working on big-endian.

 I'm very curious about your reasoning here.
 Warner

I meant the above comparison of pointers might not work
(I'm not sure, as I have no big-endian machine to test on). Look:

# iostat -t da,scsi,pass
tempstr=0x7fffcfa0, tstr[5]=0x7fffcfc8
tempstr=0x7fffcfa8, tstr[5]=0x7fffcfc8
tempstr=0x7fffcfb0, tstr[5]=0x7fffcfc8


D'oh.. endianness doesn't matter with arrays *blush*
(Unless that's some system with decreasing memory
addressing. Ok, never mind.)

-- 
wbr,
pluknet


Re: sched_setscheduler() behaviour changed??

2011-02-18 Thread Sergey Kandaurov
On 17 February 2011 12:50, Mats Lindberg mats.w.lindb...@gmail.com wrote:
 All,
 I have been using a small program (rt) that utilizes the sched_setscheduler()
 syscall to set the scheduling policy of a process to SCHED_RR. Been running
 it on FBSD 5.x and 6.x. Now when migrating to FBSD 8.1 I get EPERM back at me.
 I used to be able to run it like e.g.
 ./rt -sr -p2 -- prog

 which started prog in SCHED_RR policy with priority 2.

 now in FBSD 8.1 I get EPERM

 But If I do
 rtprio 10 ./rt -sr -p2 -- prog

 I don't get EPERM.

 I'm always root when doing this.

 My problem is that I have customers that need to run their old 5.x 6.x
 applications 'as is' in 8.1 without changing anything.


[just thinking aloud]

Perhaps, you might have stumbled upon the change in
sched_setscheduler() restricting permission to superuser:

src/sys/posix4/p1003_1b.c#rev1.24.2.1
MFC revision 1.27.
Don't allow non-root user to set a scheduler policy.

@@ -195,6 +195,10 @@ int sched_setscheduler(struct thread *td
        struct thread *targettd;
        struct proc *targetp;

+       /* Don't allow non root user to set a scheduler policy */
+       if (suser(td) != 0)
+               return (EPERM);
+
        e = copyin(uap->param, &sched_param, sizeof(sched_param));
        if (e)
                return (e);
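
To see from userland which error the kernel actually returns, a small probe like the following can help. It is only a sketch using the POSIX calls; the function name is made up and it does not reproduce the rt program from the report.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/*
 * Try to switch the calling process to SCHED_RR at the lowest RR
 * priority. Returns 0 on success, otherwise the errno value (EPERM
 * when the kernel refuses for lack of privilege).
 */
int
try_sched_rr(void)
{
	struct sched_param sp;
	int prio;

	prio = sched_get_priority_min(SCHED_RR);
	if (prio == -1)
		return (errno);
	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;
	if (sched_setscheduler(0, SCHED_RR, &sp) == -1)
		return (errno);
	return (0);
}
```

Running it once directly and once under rtprio(1) should show whether the EPERM depends on the caller's existing scheduling class rather than on the uid.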


-- 
wbr,
pluknet


Re: problem with build mcelog

2011-02-18 Thread Sergey Kandaurov
On 18 February 2011 14:13, venom samflan...@gmail.com wrote:
 On 02/11/2011 11:31 PM, John Baldwin wrote:

 On Friday, February 11, 2011 7:48:39 am venom wrote:

 Hello.

 i am trying build mcelog


 FreeBSD  8.1-RELEASE-p2 FreeBSD 8.1-RELEASE-p2 #0: Fri Jan 14
 04:15:56
 UTC 2011 root@freebsd:/usr/obj/usr/src/sys/GENERIC amd64


 # fetch
 http://ftp2.pl.freebsd.org/pub/FreeBSD/distfiles/mcelog-1.0pre2.tar.gz
 # tar -xf mcelog-1.0pre2.tar.gz
 # cd mcelog-1.0pre2
 # fetch http://people.freebsd.org/~jhb/mcelog/mcelog.patch
 # fetch http://people.freebsd.org/~jhb/mcelog/memstream.c

 Oops, I just updated mcelog.patch and it should work fine now.


 |--- //depot/vendor/mcelog/tsc.c    2010-03-05 20:24:22.0 
 |+++ //depot/projects/mcelog/tsc.c    2010-03-05 21:09:24.0 
 --
 Patching file tsc.c using Plan A...
 Hunk #1 succeeded at 15.
 Hunk #2 succeeded at 52.
 Hunk #3 succeeded at 75.
 Hunk #4 succeeded at 156.
 done
 12:12:46 ~/temp/MCE/mcelog-1.0pre2
 # gmake FREEBSD=yes
 Makefile:92: .depend: No such file or directory
 cc -MM -I. p4.c k8.c mcelog.c dmi.c tsc.c core2.c bitfield.c intel.c
 nehalem.c dunnington.c tulsa.c config.c memutil.c msg.c eventloop.c
 leaky-bucket.c memdb.c server.c client.c cache.c rbtree.c memstream.c >
 .depend.X && mv .depend.X .depend
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o mcelog.o mcelog.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o p4.o p4.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o k8.o k8.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o dmi.o dmi.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o tsc.o tsc.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o core2.o core2.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o bitfield.o
 bitfield.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o intel.o intel.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o nehalem.o nehalem.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o dunnington.o
 dunnington.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o tulsa.o tulsa.c
 cc -c -g -Os  -Wall -Wextra -Wno-missing-field-initializers
 -Wno-unused-parameter -Wstrict-prototypes -Wformat-security
 -Wmissing-declarations -Wdeclaration-after-statement  -o config.o config.c
 config.c:135: error: static declaration of 'getline' follows non-static
 declaration
 /usr/include/stdio.h:370: error: previous declaration of 'getline' was here
 gmake: *** [config.o] Error 1


The local getline() needs a FreeBSD version check.

%%%
--- config.c.olg	2011-02-18 14:57:52.0 +0300
+++ config.c	2011-02-18 15:07:59.0 +0300
@@ -18,6 +18,9 @@
    Author: Andi Kleen
 */
 #define _GNU_SOURCE 1
+#ifdef __FreeBSD__
+# include <sys/param.h>
+#endif
 #include <stdio.h>
 #include <string.h>
 #include <ctype.h>
@@ -126,7 +129,7 @@
        return s;
 }

-#ifdef __FreeBSD__
+#if (defined __FreeBSD__) && (__FreeBSD_version < 800067)
 /*
  * Newer versions do have getline(), so this should use a version test
  * at some point.
%%%
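
For illustration only, the version test above could guard a fallback along
these lines. This is a minimal sketch: compat_getline() is a hypothetical name
(the real helper in mcelog's config.c is not shown in this thread), and the
800067 threshold is simply the one used in the patch.

```c
#ifdef __FreeBSD__
# include <sys/param.h>		/* provides __FreeBSD_version */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for a local getline(): reads one line, growing
 * *lineptr as needed. Returns the line length, or -1 on EOF or error.
 */
static ssize_t
compat_getline(char **lineptr, size_t *n, FILE *fp)
{
	size_t len = 0;
	char *p;
	int c;

	if (lineptr == NULL || n == NULL || fp == NULL)
		return (-1);
	if (*lineptr == NULL || *n < 2) {
		if ((p = realloc(*lineptr, 128)) == NULL)
			return (-1);
		*lineptr = p;
		*n = 128;
	}
	while ((c = fgetc(fp)) != EOF) {
		if (len + 2 > *n) {	/* room for this char plus the NUL */
			if ((p = realloc(*lineptr, *n * 2)) == NULL)
				return (-1);
			*lineptr = p;
			*n *= 2;
		}
		(*lineptr)[len++] = (char)c;
		if (c == '\n')
			break;
	}
	if (len == 0)
		return (-1);
	(*lineptr)[len] = '\0';
	return ((ssize_t)len);
}

/* Only an old FreeBSD libc lacks getline(); elsewhere use the system one. */
#if defined(__FreeBSD__) && (__FreeBSD_version < 800067)
# define getline compat_getline
#endif
```

With this layout the fallback is always compiled but only takes over the
getline name on releases older than the version test allows.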

-- 
wbr,
pluknet


Re: [rfc] allow to boot with >= 256GB physmem

2011-02-07 Thread Sergey Kandaurov
On 22 January 2011 00:43, Alan Cox alan.l@gmail.com wrote:
 On Fri, Jan 21, 2011 at 2:58 PM, Alan Cox alan.l@gmail.com wrote:

 On Fri, Jan 21, 2011 at 11:44 AM, John Baldwin j...@freebsd.org wrote:

 On Friday, January 21, 2011 11:09:10 am Sergey Kandaurov wrote:
  Hello.
 
  Some time ago I faced with a problem booting with 400GB physmem.
  The problem is that vm.max_proc_mmap type overflows with
  such high value, and that results in a broken mmap() syscall.
  The max_proc_mmap value is a signed int and roughly calculated
  at vmmapentry_rsrc_init() as u_long vm_kmem_size quotient:
  vm_kmem_size / sizeof(struct vm_map_entry) / 100.
 
  Although at the time it was introduced at svn r57263 the value
  was quite low (f.e. the related commit log stands:
  The value defaults to around 9000 for a 128MB machine.),
  the problem is observed on amd64 where KVA space after
  r212784 is factually bound to the only physical memory size.
 
  With INT_MAX being 0x7fffffff here and sizeof(struct vm_map_entry)
  being 120, slightly less than 256GB is enough to reproduce
  the problem.
 
  I rewrote vmmapentry_rsrc_init() to set large enough limit for
  max_proc_mmap just to protect from integer type overflow.
  As it's also possible to live tune this value, I also added a
  simple anti-shoot constraint to its sysctl handler.
  I'm not sure though if it's worth to commit the second part.
 
  As this patch may cause some bikeshedding,
  I'd like to hear your comments before I will commit it.
 
  http://plukky.net/~pluknet/patches/max_proc_mmap.diff

 Is there any reason we can't just make this variable and sysctl a long?


 Or just delete it.

 1. Contrary to what the commit message says, this sysctl does not
 effectively limit the number of vm map entries.  It only limits the number
 that are created by one system call, mmap().  Other system calls create vm
 map entries just as easily, for example, mprotect(), madvise(), mlock(), and
 minherit().  Basically, anything that alters the properties of a mapping.
 Thus, in 2000, after this sysctl was added, the same resource exhaustion
 induced crash could have been reproduced by trivially changing the program
 in PR/16573 to do an mprotect() or two.

 In a nutshell, if you want to really limit the number of vm map entries
 that a process can allocate, the implementation is a bit more involved than
 what was done for this sysctl.

 2. UMA implements M_WAITOK, whereas the old zone allocator in 2000 did
 not.  Moreover, vm map entries for user maps are allocated with M_WAITOK.
 So, the exact crash reported in PR/16573 couldn't happen any longer.


 Actually, I take back part of what I said here.  The old zone allocator did
 implement something like M_WAITOK, and that appears to have been used for
 user maps.  However, the crash described in PR/16573 was actually on the
 allocation of a vm map entry within the *kernel* address space for a process
 U area.  This type of allocation did not use the old zone allocator's
 equivalent to M_WAITOK.  However, we no longer have U areas, so the exact
 crash scenario is clearly no longer possible.  Interestingly, the sysctl in
 question has no direct effect on the allocation of kernel vm map entries.

 So, I remain skeptical that this sysctl is preventing any resource
 exhaustion based panics in the current kernel.  Again, I would be thrilled
 to see one or more people do some testing, such as rerunning the program
 from PR/16573.


 3. We now have the vmemoryuse resource limit.  When this sysctl was
 defined, we didn't.  Limiting the virtual memory indirectly but effectively
 limits the number of vm map entries that a process can allocate.

 In summary, I would do a little due diligence, for example, run the
 program from PR/16573 with the limit disabled.  If you can't reproduce the
 crash, in other words, nothing contradicts point #2 above, then I would just
 delete this sysctl.


I tried the test from PR/16573, running as root. Unmodified, it quickly
runs into the kern.maxproc limit. So I added signal(SIGCHLD, SIG_IGN); so that
no zombie processes are created at all, which gives it more workload. With this
change it also survived. The submitter reported that it crashes with 1 iterations.
After increasing the limit up to 100 I still couldn't get it to crash.

* The testing was done with the max_proc_mmap part commented out.
The change effectively reverts r57263.
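
A minimal sketch of the SIGCHLD change mentioned above, for readers who want to
reproduce the setup. spawn_without_zombies() is a hypothetical helper; the
actual test program from PR/16573 is not reproduced here.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork n short-lived children with SIGCHLD ignored, so that terminated
 * children are reaped automatically instead of lingering as zombies.
 * Returns the number of children started, or -1 if signal() fails.
 */
static int
spawn_without_zombies(int n)
{
	int i, started = 0;

	if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
		return (-1);
	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid == 0)
			_exit(0);	/* child: exit immediately */
		if (pid > 0)
			started++;
	}
	return (started);
}
```

Because exited children no longer occupy process table slots, the parent can
keep forking without running into kern.maxproc as quickly.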

-- 
wbr,
pluknet


vm_mmap_maxprocmmap.diff
Description: Binary data

Re: broken INCLUDE in sys/conf/kern.pre.mk for opensolaris code?

2011-02-02 Thread Sergey Kandaurov
On 6 January 2011 04:40, Alexander Best arun...@freebsd.org wrote:
 hi there,

 while building target buildkernel with 'clang -v' i noticed a lot of these
 lines:

 ignoring nonexistent directory 
 /usr/subversion-src/sys/contrib/opensolaris/compat

 i checked sys/conf/kern.pre.mk and there's a line refering to a non-existing
 directory:

 # ...  and OpenSolaris
 INCLUDES+= -I$S/contrib/opensolaris/compat

Hi, I just removed that path in r218189.


 i suspect this should actually be:

 # ...  and OpenSolaris
 INCLUDES+= -I$S/cddl/compat/opensolaris

 but i'm not sure of it.


I found that it's instead included from the modules' Makefiles
(e.g. as done for /sys/modules/cyclic).

-- 
wbr,
pluknet


Re: broken INCLUDE in sys/conf/kern.pre.mk for opensolaris code?

2011-02-02 Thread Sergey Kandaurov
On 2 February 2011 20:20, Alexander Best arun...@freebsd.org wrote:
 On Wed Feb  2 11, Sergey Kandaurov wrote:
 On 6 January 2011 04:40, Alexander Best arun...@freebsd.org wrote:
  hi there,
 
  while building target buildkernel with 'clang -v' i noticed a lot of these
  lines:
 
  ignoring nonexistent directory 
  /usr/subversion-src/sys/contrib/opensolaris/compat
 
  i checked sys/conf/kern.pre.mk and there's a line refering to a 
  non-existing
  directory:
 
  # ...  and OpenSolaris
  INCLUDES+= -I$S/contrib/opensolaris/compat

 Hi, I just removed that path in r218189.

 thanks a bunch. :)

 i might do a 'make universe' build at some point with clang -v in order to
 check, if there are more cases where non-existing include paths exist in the
 freebsd src.

Thanks a lot. That would be great, I think.


-- 
wbr,
pluknet


Re: weird characters in top(1) output

2011-02-01 Thread Sergey Kandaurov
On 1 February 2011 15:24, Alexander Best arun...@freebsd.org wrote:
 hi there,

 i was doing the following:

 top inf > ~/output

 when i noticed that this was missing the overall statistics line. so i went
 ahead and did:

 top -d2 inf > ~/output

 funny thing is that for the second output some weird characters seem to get
 spammed into the overall statistics line:

 last pid: 14320;  load averages:  0.42,  0.44,  0.37  up 1+14:02:02    
 13:21:05
 249 processes: 1 running, 248 sleeping
 CPU: ^[[3;6H 7.8% user,  0.0% nice, 10.6% system,  0.6% interrupt, 81.0% idle
 Mem: 1271M Active, 205M Inact, 402M Wired, 67M Cache, 212M Buf, 18M Free
 Swap: 18G Total, 782M Used, 17G Free, 4% Inuse

 this only seems to happen when i redirect the top(1) output to a file. if i 
 do:

 top -d2 inf

 ...everything works fine. i verified the issue under zsh(1) and sh(1).

My quick check shows that this is a regression between 7.2 and 7.3.
Reverting r196382 fixes this bug for me.
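
For reference, the stray ^[[3;6H in the CPU line is an ANSI cursor-position
sequence (ESC [ row ; column H). The sketch below only decodes such a sequence
to show what the bytes mean; parse_cursor_position() is a hypothetical helper
and is not part of top(1).

```c
#include <stdio.h>

/*
 * If s starts with an ANSI cursor-position sequence (ESC [ row ; col H),
 * store the coordinates and return the sequence length; otherwise return 0.
 */
static int
parse_cursor_position(const char *s, int *row, int *col)
{
	int n = 0;

	if (sscanf(s, "\033[%d;%d%n", row, col, &n) != 2 || n == 0)
		return (0);
	if (s[n] != 'H')
		return (0);
	return (n + 1);
}
```

Decoded, the sequence asks for row 3, column 6, which is the position right
after "CPU: " on the third line of the display. That is harmless on a
terminal but ends up as literal bytes when the output goes to a file.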

-- 
wbr,
pluknet


[rfc] allow to boot with >= 256GB physmem

2011-01-21 Thread Sergey Kandaurov
Hello.

Some time ago I ran into a problem booting with 400GB physmem.
The problem is that the vm.max_proc_mmap type overflows with
such a high value, and that results in a broken mmap() syscall.
The max_proc_mmap value is a signed int, roughly calculated
in vmmapentry_rsrc_init() as a quotient of the u_long vm_kmem_size:
vm_kmem_size / sizeof(struct vm_map_entry) / 100.

Although the value was quite low at the time it was introduced
in svn r57263 (e.g. the related commit log states:
"The value defaults to around 9000 for a 128MB machine."),
the problem is observed on amd64, where KVA space after
r212784 is effectively bounded only by the physical memory size.

With INT_MAX being 0x7fffffff here and sizeof(struct vm_map_entry)
being 120, slightly less than 256GB is enough to
reproduce the problem.

I rewrote vmmapentry_rsrc_init() to set a large enough limit for
max_proc_mmap, just to protect against integer overflow.
As it's also possible to tune this value live, I also added a
simple foot-shooting guard to its sysctl handler.
I'm not sure, though, whether the second part is worth committing.

As this patch may cause some bikeshedding,
I'd like to hear your comments before I commit it.

http://plukky.net/~pluknet/patches/max_proc_mmap.diff
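
The arithmetic above can be checked with a small model. It is only a sketch:
it assumes the intermediate quotient vm_kmem_size / sizeof(struct vm_map_entry)
is what gets stored into the 32-bit int before the division by 100 (which is
what the 256GB figure implies), and the function names are made up for this
example.

```c
#include <stdint.h>

#define MAP_ENTRY_SIZE	120	/* sizeof(struct vm_map_entry), per the text */

/* Reinterpret the low 32 bits of v as a two's-complement signed value. */
static int64_t
wrap_to_int32(uint64_t v)
{
	uint64_t low = v & 0xffffffffULL;

	return (low > INT32_MAX ? (int64_t)low - 4294967296LL : (int64_t)low);
}

/*
 * Model of the computation: the first quotient is truncated to a 32-bit
 * int (assumption), and only then divided by 100.
 */
static int64_t
model_max_proc_mmap(uint64_t kmem_size)
{
	return (wrap_to_int32(kmem_size / MAP_ENTRY_SIZE) / 100);
}

/* Smallest vm_kmem_size whose first quotient no longer fits in an int. */
static uint64_t
overflow_threshold(void)
{
	return ((uint64_t)MAP_ENTRY_SIZE * ((uint64_t)INT32_MAX + 1));
}
```

Under these assumptions the threshold is 120 * 2^31 bytes, i.e. 240 GiB, which
agrees with "slightly less than 256GB", and a 400GB vm_kmem_size yields a
negative limit.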

-- 
wbr,
pluknet


max_proc_mmap.diff
Description: Binary data

Re: [rfc] allow to boot with >= 256GB physmem

2011-01-21 Thread Sergey Kandaurov
On 21 January 2011 20:44, John Baldwin j...@freebsd.org wrote:
 On Friday, January 21, 2011 11:09:10 am Sergey Kandaurov wrote:
 Hello.

 Some time ago I faced with a problem booting with 400GB physmem.
 The problem is that vm.max_proc_mmap type overflows with
 such high value, and that results in a broken mmap() syscall.
 The max_proc_mmap value is a signed int and roughly calculated
 at vmmapentry_rsrc_init() as u_long vm_kmem_size quotient:
 vm_kmem_size / sizeof(struct vm_map_entry) / 100.

 Although at the time it was introduced at svn r57263 the value
 was quite low (f.e. the related commit log stands:
 The value defaults to around 9000 for a 128MB machine.),
 the problem is observed on amd64 where KVA space after
 r212784 is factually bound to the only physical memory size.

 With INT_MAX being 0x7fffffff here and sizeof(struct vm_map_entry)
 being 120, slightly less than 256GB is enough to reproduce
 the problem.

 I rewrote vmmapentry_rsrc_init() to set large enough limit for
 max_proc_mmap just to protect from integer type overflow.
 As it's also possible to live tune this value, I also added a
 simple anti-shoot constraint to its sysctl handler.
 I'm not sure though if it's worth to commit the second part.

 As this patch may cause some bikeshedding,
 I'd like to hear your comments before I will commit it.

 http://plukky.net/~pluknet/patches/max_proc_mmap.diff

 Is there any reason we can't just make this variable and sysctl a long?


That was my initial thought, but now I'm afraid this can result
in a 32-bit vs 64-bit comparison issue further down in the code.

-- 
wbr,
pluknet


Re: broken INCLUDE in sys/conf/kern.pre.mk for opensolaris code?

2011-01-05 Thread Sergey Kandaurov
On 6 January 2011 04:40, Alexander Best arun...@freebsd.org wrote:
 hi there,

 while building target buildkernel with 'clang -v' i noticed a lot of these
 lines:

 ignoring nonexistent directory 
 /usr/subversion-src/sys/contrib/opensolaris/compat

 i checked sys/conf/kern.pre.mk and there's a line refering to a non-existing
 directory:

 # ...  and OpenSolaris
 INCLUDES+= -I$S/contrib/opensolaris/compat

I guess that's a leftover from the early dtrace stages in Perforce.
See //depot/projects/dtrace/src/sys/contrib/opensolaris/compat/sys


 i suspect this should actually be:

 # ...  and OpenSolaris
 INCLUDES+= -I$S/cddl/compat/opensolaris

 but i'm not sure of it.



-- 
wbr,
pluknet


Re: SYSCALL_MODULE() macro and modfind() issues

2010-10-27 Thread Sergey Kandaurov
On 26 October 2010 17:34, John Baldwin j...@freebsd.org wrote:
 On Tuesday, October 26, 2010 4:00:14 am Selphie Keller wrote:
 Thanks Andriy,

 Took a look at the change to src/sys/sys/sysent.h

 @@ -149,7 +149,7 @@ static struct syscall_module_data name##
  };                                                             \
                                                                 \
  static moduledata_t name##_mod = {                             \
 -       #name,                                                  \
 +       "sys/" #name,                                           \
         syscall_module_handler,                                 \
         &name##_syscall_mod                                     \
  };                                                             \

 applied the MFC prefix to pmap port:

 --- /usr/ports/sysutils/pmap/work/pmap/pmap/pmap.c.orig 2010-10-26
 00:55:32.0 -0700
 +++ /usr/ports/sysutils/pmap/work/pmap/pmap/pmap.c      2010-10-26
 00:56:10.0 -0700
 @@ -86,12 +86,12 @@ main(int argc, char **argv)
      struct kinfo_proc *kp;
      int        pmap_helper_syscall;

 -    if ((modid = modfind("pmap_helper")) == -1) {
 +    if ((modid = modfind("sys/pmap_helper")) == -1) {
                 /* module not found, try to load */
                 modid = kldload("pmap_helper.ko");
                 if (modid == -1)
                         err(1, "unable to load pmap_helper module");
 -               modid = modfind("pmap_helper");
 +               modid = modfind("sys/pmap_helper");
                 if (modid == -1)
                         err(1, "pmap_helper module loaded but not found");
         }

 which restored functionality on freebsd 8.1.

 The best approach might be to have something like this:

 static int
 pmap_find(void)
 {
        int modid;

        modid = modfind("pmap_helper");
        if (modid == -1)
                modid = modfind("sys/pmap_helper");
        return (modid);
 }

 then in the original main() routine use this:

        if ((modid = pmap_find()) == -1) {
                /* module not found, try to load */
                modid = kldload("pmap_helper.ko");
                if (modid == -1)
                        err(1, "unable to load pmap_helper module");
                modid = pmap_find();
                if (modid == -1)
                        err(1, "pmap_helper module loaded but not found");
        }

 This would make the code work for both old and new versions.


Here is another approach, one of several I generally use at work.
It lacks compat32 syscall handling, though (we don't use those).

/*
 * We have to extract __FreeBSD_version from the live kernel,
 * as we depend on a kernel feature and can run on an older world.
 */
if (sysctlbyname("kern.osreldate", &osreldate, &intlen, NULL, 0) < 0)
	err(-2, "sysctl(kern.osreldate)");
if (osreldate >= 800505)	/* See r206346 in stable/8. */
	strcpy(modname, "sys/foo");
else
	strcpy(modname, "foo");
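
The name selection can also be factored into a small helper that is easy to
test without a FreeBSD kernel. This is only a sketch: syscall_modname() and
SYS_PREFIX_OSRELDATE are hypothetical names, and the single 800505 threshold is
the one quoted above for stable/8; other branches may need a different value.

```c
#include <stdio.h>

#define SYS_PREFIX_OSRELDATE	800505	/* threshold quoted above (r206346) */

/*
 * Build the name to pass to modfind(2) for a syscall module called base,
 * given the kernel's kern.osreldate. Returns 0 on success, -1 if the
 * result does not fit in buf.
 */
static int
syscall_modname(int osreldate, const char *base, char *buf, size_t len)
{
	int n;

	n = snprintf(buf, len, "%s%s",
	    osreldate >= SYS_PREFIX_OSRELDATE ? "sys/" : "", base);
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

Using snprintf() with an explicit length also avoids the unchecked strcpy()
into modname.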

-- 
wbr,
pluknet


Re: [PATCH] Fix 'implicit declaration' warning and update vgone(9)

2010-10-27 Thread Sergey Kandaurov
On 27 October 2010 10:23, Lars Hartmann l...@chaotika.org wrote:
 The vgonel function isnt declarated in any header, the vgonel prototype
 in vgone(9) isnt correct - found by Ben Kaduk ka...@mit.edu

Hi.

I'm afraid it's just an overlooked man page after many VFS changes in 5.x.
As vgonel() is a static (i.e. private and not visible from outside) function
IMO it should be removed from vgone(9) man page.

-- 
wbr,
pluknet


Re: [PATCH] Fix 'implicit declaration' warning and update vgone(9)

2010-10-27 Thread Sergey Kandaurov
On 27 October 2010 15:33, Sergey Kandaurov pluk...@gmail.com wrote:
 On 27 October 2010 10:23, Lars Hartmann l...@chaotika.org wrote:
 The vgonel function isnt declarated in any header, the vgonel prototype
 in vgone(9) isnt correct - found by Ben Kaduk ka...@mit.edu

 Hi.

 I'm afraid it's just an overlooked man page after many VFS changes in 5.x.
 As vgonel() is a static (i.e. private and not visible from outside) function
 IMO it should be removed from vgone(9) man page.


Something like this I'd like to check in. Comments?

Index: ObsoleteFiles.inc
===
--- ObsoleteFiles.inc   (revision 214414)
+++ ObsoleteFiles.inc   (working copy)
@@ -14,6 +14,8 @@
 # The file is partitioned: OLD_FILES first, then OLD_LIBS and OLD_DIRS last.
 #

+# 20101027: vgonel(9) has gone to private API a while ago
+OLD_FILES+=usr/share/man/man9/vgonel.9.gz
 # 20101020: catch up with vm_page_sleep_if_busy rename
 OLD_FILES+=usr/share/man/man9/vm_page_sleep_busy.9.gz
 # 20101011: removed subblock.h from liblzma
Index: share/man/man9/Makefile
===
--- share/man/man9/Makefile (revision 214413)
+++ share/man/man9/Makefile (working copy)
@@ -1317,7 +1317,6 @@
vfs_getopt.9 vfs_setopt_part.9 \
vfs_getopt.9 vfs_setopts.9
 MLINKS+=VFS_LOCK_GIANT.9 VFS_UNLOCK_GIANT.9
-MLINKS+=vgone.9 vgonel.9
 MLINKS+=vhold.9 vdrop.9 \
vhold.9 vdropl.9 \
vhold.9 vholdl.9
Index: share/man/man9/vgone.9
===
--- share/man/man9/vgone.9  (revision 214413)
+++ share/man/man9/vgone.9  (working copy)
@@ -26,24 +26,21 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd November 21, 2001
+.Dd October 27, 2010
 .Dt VGONE 9
 .Os
 .Sh NAME
-.Nm vgone , vgonel
+.Nm vgone
 .Nd prepare a vnode for reuse
 .Sh SYNOPSIS
 .In sys/param.h
 .In sys/vnode.h
 .Ft void
 .Fn vgone "struct vnode *vp"
-.Ft void
-.Fn vgonel "struct vnode *vp" "struct thread *td"
 .Sh DESCRIPTION
+The
 .Fn vgone
-and
-.Fn vgonel
-prepare a vnode for reuse by another file system.
+function prepares a vnode for reuse by another file system.
 The preparation includes the cleaning of all file system specific data and
 the removal from its mount point vnode list.
 .Pp
@@ -55,17 +52,11 @@
 as in most cases the vnode
 is about to be reused, or its file system is being unmounted.
 .Pp
-The difference between
+The
 .Fn vgone
-and
-.Fn vgonel
-is that
-.Fn vgone
-locks the vnode interlock and then calls
-.Fn vgonel
-while
-.Fn vgonel
-expects the interlock to already be locked.
+function takes an unlocked vnode and returns with the vnode unlocked.
+.Sh SEE ALSO
+.Xr vnode 9
 .Sh AUTHORS
 This manual page was written by
 .An Chad David Aq dav...@acns.ab.ca .
Index: share/man/man9/vflush.9
===
--- share/man/man9/vflush.9 (revision 214413)
+++ share/man/man9/vflush.9 (working copy)
@@ -75,7 +75,6 @@
 will be returned.
 .Sh SEE ALSO
 .Xr vgone 9 ,
-.Xr vgonel 9 ,
 .Xr vrele 9
 .Sh AUTHORS
 This manual page was written by


-- 
wbr,
pluknet


Re: issue with unsetting 'arch' flag

2010-10-07 Thread Sergey Kandaurov
On 7 October 2010 22:45, Jaakko Heinonen j...@freebsd.org wrote:
 On 2010-10-06, Alexander Best wrote:
 $ sudo rm -d /tmp/chflags.XX
 $ tmpfile=`mktemp /tmp/chflags.XX`
 $ sudo chflags arch $tmpfile
 $ chflags noarch $tmpfile

 is what's causing the problem. the last chflags call should fail, but it
 doesn't.

 Here is a patch for UFS:

 %%%
 Index: sys/ufs/ufs/ufs_vnops.c
 ===
 --- sys/ufs/ufs/ufs_vnops.c     (revision 213507)
 +++ sys/ufs/ufs/ufs_vnops.c     (working copy)
 @@ -556,6 +556,9 @@ ufs_setattr(ap)
                             & (SF_NOUNLINK | SF_IMMUTABLE | SF_APPEND) ||
                            (vap->va_flags & UF_SETTABLE) != vap->va_flags)
                                return (EPERM);
 +                       if ((ip->i_flags & SF_SETTABLE) !=
 +                           (vap->va_flags & SF_SETTABLE))
 +                               return (EPERM);
                        ip->i_flags &= SF_SETTABLE;
                        ip->i_flags |= (vap->va_flags & UF_SETTABLE);
                        DIP_SET(ip, i_flags, ip->i_flags);
 %%%

 The patch has a potential to break something if someone assumes that
 non-super-user can modify UF_SETTABLE flags with the SF_SETTABLE part
 set to zero. However with a quick peek this seems to be what NetBSD
 does.

Just for reference:
this comes from NetBSD PR kern/3491, which was fixed before 1.3R.
I just checked the arch test, and it works as expected with the change.
All chflags tests from the fstest suite passed as well.
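
To make the effect of the extra check easier to see, here is a userland model
of the non-super-user branch. It is a sketch only: the DEMO_* values are
modelled on sys/stat.h and should be treated as assumptions, and
setflags_nonroot() is a hypothetical function, not kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative flag values; treat them as assumptions. */
#define DEMO_UF_SETTABLE	0x0000ffffU	/* owner-changeable flags */
#define DEMO_UF_NODUMP		0x00000001U
#define DEMO_SF_SETTABLE	0xffff0000U	/* super-user-changeable flags */
#define DEMO_SF_ARCHIVED	0x00010000U
#define DEMO_SF_IMMUTABLE	0x00020000U
#define DEMO_SF_APPEND		0x00040000U
#define DEMO_SF_NOUNLINK	0x00100000U

/*
 * Model of the non-root path: i_flags are the flags currently on the
 * inode, va_flags the requested ones. "patched" enables the added check.
 */
static int
setflags_nonroot(uint32_t *i_flags, uint32_t va_flags, int patched)
{
	if ((*i_flags &
	    (DEMO_SF_NOUNLINK | DEMO_SF_IMMUTABLE | DEMO_SF_APPEND)) ||
	    (va_flags & DEMO_UF_SETTABLE) != va_flags)
		return (EPERM);
	/* Added check: the request must leave the SF_* part untouched. */
	if (patched &&
	    (*i_flags & DEMO_SF_SETTABLE) != (va_flags & DEMO_SF_SETTABLE))
		return (EPERM);
	*i_flags &= DEMO_SF_SETTABLE;
	*i_flags |= (va_flags & DEMO_UF_SETTABLE);
	return (0);
}
```

With only the arch flag set on the file, a request of 0 returns success and
leaves the flag in place when the check is off, and returns EPERM when it is
on, which matches the behaviour reported in this thread.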

-- 
wbr,
pluknet


Re: issue with unsetting 'arch' flag

2010-10-06 Thread Sergey Kandaurov
On 6 October 2010 23:38, Alexander Best arun...@freebsd.org wrote:
 On Wed Oct  6 10, Garrett Cooper wrote:
 On Wed, Oct 6, 2010 at 10:35 AM, Alexander Best arun...@freebsd.org wrote:
  On Wed Oct  6 10, Garrett Cooper wrote:
  On Tue, Oct 5, 2010 at 4:50 PM, Alexander Best arun...@freebsd.org 
  wrote:
   hi there,
  
   i think the following example shows the problem better than a long 
   explanation:
  
   `touch ftest && chflags arch ftest && chflags -vv 0 ftest`.
    ^^non-root     ^^root                ^^non-root
  
   chflags claims to have cleared the 'arch' flag (which should be 
   impossible as
   non-root user), but indeed has done nothing.
  
   i've tried the same with 'sappnd' and that works as can be expected.
  
   The issue was confirmed to exist in HEAD (me), stable/8 (pgollucc1, 
   jpaetzel)
   and stable/7 (nox).
   On stable/6 it does NOT exist (jpaetzel). chflags properly fails with 
   EPERM.
 
      Fails for me when I call the syscall directly, as I would expect,
  and passes when I'm superuser:
 
  $ ./test_chflags
  (uid, euid) = (1000, 1000)
  test_chflags: chflags: Operation not permitted
  test_chflags: lchflags: Operation not permitted
  $ sudo ./test_chflags
  (uid, euid) = (0, 0)
 
      According to my basic inspection in strtofflags
  (.../lib/libc/gen/strtofflags.c), it works as well.
      And last but not least, executing the commands directly on the CLI 
  work:
 
  $ tmpfile=`mktemp /tmp/chflags.XX`
  $ chflags arch $tmpfile
  chflags: /tmp/chflags.nQm1IL: Operation not permitted
  $ rm $tmpfile
  $ tmpfile=`mktemp /tmp/chflags.XX`
  $ sudo chflags arch $tmpfile
  $ sudo chflags noarch $tmpfile
  $ rm $tmpfile
 
  thanks for your test app and helping out with this problem. i'm not sure
  however you understood the problem. probably i didn't explain it right:
 
  $ sudo rm -d /tmp/chflags.XX
  $ tmpfile=`mktemp /tmp/chflags.XX`
  $ sudo chflags arch $tmpfile
  $ chflags noarch $tmpfile
 
  is what's causing the problem. the last chflags call should fail, but it
  doesn't.

 Sorry... my CLI based example was stupid. I meant:

 $ tmpfile=`mktemp /tmp/chflags.XX`
 $ chflags arch $tmpfile
 chflags: /tmp/chflags.V2NpXR: Operation not permitted
 $ chflags noarch $tmpfile
 $ rm $tmpfile

 Currently chflags(2) states:

      The SF_IMMUTABLE, SF_APPEND, SF_NOUNLINK, and SF_ARCHIVED flags may only
      be set or unset by the super-user.  Attempts to set these flags by non-
      super-users are rejected,  attempts by non-superusers to clear
 flags that
      are already unset are silently ignored.   These flags may be set at 
 any
      time, but normally may only be unset when the system is in single-user
      mode.  (See init(8) for details.)

 So this behavior is already well documented :). The EPERM section
 should really note SF_ARCHIVED though (whoever added the flag forgot
 to add that particular item to the ERRORS section).

 that's perfectly alright. clearing an unset flag shouldn't cause any error to
 be returned. however in my example arch *does* get set and still trying to
 unset it as normal user doesn't return an error.


It's even more interesting.

As far as I could parse the code:
- UFS has no special handling for SF_ARCHIVED (I found it only for msdosfs).
- ufs_setattr() does not handle unsetting SF_ARCHIVED,
  so all it does is return zero.
- /bin/chflags doesn't check the actual flags value in the inode after
  calling the chflags() syscall, and blindly assumes all is well if chflags()
  returns zero.

-- 
wbr,
pluknet