Re: msleep() on recursively locked mutexes

2007-04-27 Thread Bosko Milekic

On 4/26/07, Hans Petter Selasky [EMAIL PROTECTED] wrote:

Hi,

In the new USB stack I have defined the following:


Could you perhaps describe some of the codepaths in the USB stack that
require this behavior?

--
Bosko Milekic [EMAIL PROTECTED]
http://www.crowdedweb.com/
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: More user developers friendly memguard.

2005-12-29 Thread Bosko Milekic
I like this very much.  Please commit and feel free to continue
improving Memguard for your (and everyone else's) benefit.

On 12/27/05, Pawel Jakub Dawidek [EMAIL PROTECTED] wrote:
 Here is the patch:

 http://people.freebsd.org/~pjd/patches/kern_malloc.c.3.patch

 It allows configuring the memory type to debug without recompiling the
 kernel. It also allows debugging kernel modules with memguard.

 The rules:
 1. If the memory type is compiled into the kernel, vm.memguard_desc should be
configured in /boot/loader.conf.
 2. If the memory type is in a kernel module, the vm.memguard_desc sysctl should be
configured before loading the module.

 --
 Pawel Jakub Dawidek   http://www.wheel.pl
 [EMAIL PROTECTED]   http://www.FreeBSD.org
 FreeBSD committer Am I Evil? Yes, I Am!





--
Bosko Milekic [EMAIL PROTECTED]

To see all the content I generate on the web, check out my Peoplefeeds
profile at
http://peoplefeeds.com/bosko/profile


Re: generic network protocols parser ?

2005-03-04 Thread Bosko Milekic

On Fri, Mar 04, 2005 at 11:07:34AM -0500, Aziz KEZZOU wrote:
 Hi all,
 I am wondering if anyone knows of a generic parser which takes a
 packet (mbuf) of a certain protocol (e.g. RSVP) as input and generates
 some data structure representing the packet?
 
 I've been searching for a while and found that ethereal and tcpdump,
 for example, use specific data structures and functions to dissect each
 protocol's packets. Is this the only possible approach?
 
 My supervisor suggested using a TLV (Type/Length/Value) approach
 instead. Any opinions about that?
 
 If no such parser exists, is there any practical reason why?
 
 Thanks,
 Aziz

  You can only go so far with generic parsing.  Eventually you will want
  some protocol specific value to be extracted and your parser will have
  to know about what the packet looks like.

  What are you trying to do, exactly?
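
  To make the TLV idea from the question concrete, here is a minimal
  sketch of a generic Type/Length/Value walker.  The record layout
  (1-byte type, 2-byte big-endian length), the name tlv_walk() and the
  callback signature are assumptions made up for illustration; they are
  not taken from RSVP or any real protocol.  The walker can find record
  boundaries, but it has no idea what a value means, which is where the
  protocol-specific knowledge has to come in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout: 1-byte type, 2-byte big-endian length, value. */
struct tlv {
	uint8_t        type;
	uint16_t       len;
	const uint8_t *value;	/* points into the caller's buffer */
};

typedef void (*tlv_cb)(const struct tlv *rec, void *arg);

/*
 * Walk a buffer of back-to-back TLV records, calling cb (if non-NULL)
 * for each one.  Returns the number of records seen, or -1 if a record
 * is truncated.
 */
int
tlv_walk(const uint8_t *buf, size_t buflen, tlv_cb cb, void *arg)
{
	size_t off = 0;
	int n = 0;

	assert(buf != NULL || buflen == 0);
	while (off < buflen) {
		struct tlv rec;

		if (buflen - off < 3)
			return (-1);	/* header does not fit */
		rec.type = buf[off];
		rec.len = (uint16_t)((buf[off + 1] << 8) | buf[off + 2]);
		off += 3;
		if (buflen - off < rec.len)
			return (-1);	/* value runs past the buffer */
		rec.value = buf + off;
		off += rec.len;
		if (cb != NULL)
			cb(&rec, arg);
		n++;
	}
	return (n);
}
```

  Anything beyond this (checksums, nested objects, what type 7 actually
  means) would have to live in the callback, per protocol.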

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]


Re: MBUF statistics

2005-02-16 Thread Bosko Milekic
On Tue, 15 Feb 2005 16:54:52 +0100, Max Laier [EMAIL PROTECTED] wrote:
 On Tuesday 15 February 2005 12:38, Borja Marcos wrote:
Hello,
 
Looking at the mbuf statistics available in FreeBSD 4 and FreeBSD 5 I
  can see that the statistics available in FreeBSD 5 are, surprisingly,
  much less comprehensive. Is there any other place where I can find out
  how many mbuf requests have been done, how many of them have waited,
  how many have failed, etc?
 
 I use 'vmstat -z | grep Mbuf'.  The netstat -m output is broken, because
 fixing this would impose an additional atomic operation on each alloc/free
 which is a real performance killer.

Yeah, unfortunately statistics are too hard to do completely correctly
right now (too hard on performance, that is).  To make things worse,
the more involved UMA zone design used for Mbuf and Cluster
allocations means that some of the UMA zone statistics you get with
vmstat -z are not entirely accurate.  The UMA zone statistics code
needs to be changed to accommodate this structure, but in addition to
that a way to make statistics gathering cheaper would be nice, because
currently doing a 'vmstat -z' is really terrible for performance. 
Which reminds me... those of you doing benchmarks, please don't run
'vmstat -z' while you're doing them, it might skew/pessimize your
results.

-- 
Bosko Milekic - If I were a number, I'd be irrational.
Contact Info: http://bmilekic.unixdaemons.com/contact.txt


Re: Playing with mbuf in userland

2004-08-22 Thread Bosko Milekic

Paolo Pisati wrote:
Hi,

i'm developing a little app that manipulates mbuf.
Right now i'm still working on it as userland app but i
would like to test it with some real mbufs straight
from the stack.
Do you know how i can get some of these structs in
an easy way?
I mean, is it possible to copy some of these struct from
stack to userland?
Or should i fake it in userland?

  One way to do this would be to instrument a for-superuser-only
  socket option that would copy out all of the data, including the
  metadata and mbuf headers, out to userland, while taking care to
  modify references within the mbufs to userland locations {*}.  To do
  this, in turn, you would need to obtain the userland target addresses
  of all mbufs and clusters you're copying out beforehand, and overwrite
  all mbufs' m_next, m_nextpkt, m_data, and in some cases, m_ext.ext_buf
  references before doing the copyout in-kernel.  This can be a pretty
  involved copy and would require careful implementation.

{*} The data in the socket buffer is kept as an mbuf chain, so this is
possible.

  Another option to look into would be to implement a sysctl(8)-exported
  handler that iterates over the mbuf chain and prints out the mbuf chains
  in something like XML, which your userland application can then more or less
  easily parse, and reproduce the chain (fake it up) in userland.  This
  solution is rather attractive because you can do all sorts of things
  with the intermediate-parsing-language right from the kernel, as well
  as from userland (at the parsing stages).  To see an example of a sysctl(8)
  handler, refer to src/sys/vm/uma_core.c (the bottom) in FreeBSD 5.x.
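
  As a rough userland illustration of the "print the chain as XML" idea,
  the sketch below walks a chain and renders it into a text buffer.
  struct fake_mbuf is a stand-in with only three fields borrowed from
  the real mbuf naming (m_next, m_len, m_data); the XML element names
  and the functions xml_append() and chain_to_xml() are invented for
  this example.  A real in-kernel handler would be written against the
  sysctl handler interface instead of snprintf into a caller's buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Userland stand-in for struct mbuf; only the fields used here. */
struct fake_mbuf {
	struct fake_mbuf    *m_next;	/* next buffer in the chain */
	int                  m_len;	/* bytes of data in this buffer */
	const unsigned char *m_data;	/* start of the data */
};

/* Append formatted text to the NUL-terminated string in out; -1 if full. */
static int
xml_append(char *out, size_t outlen, const char *fmt, ...)
{
	size_t off = strlen(out);
	va_list ap;
	int w;

	assert(off < outlen);
	va_start(ap, fmt);
	w = vsnprintf(out + off, outlen - off, fmt, ap);
	va_end(ap);
	return ((w < 0 || (size_t)w >= outlen - off) ? -1 : 0);
}

/*
 * Render a chain as XML with hex-encoded data.  Returns the number of
 * mbufs written, or -1 if out is too small.
 */
int
chain_to_xml(const struct fake_mbuf *m, char *out, size_t outlen)
{
	int i, n = 0;

	if (out == NULL || outlen == 0)
		return (-1);
	out[0] = '\0';
	if (xml_append(out, outlen, "<chain>") != 0)
		return (-1);
	for (; m != NULL; m = m->m_next, n++) {
		if (xml_append(out, outlen, "<mbuf len=\"%d\">", m->m_len) != 0)
			return (-1);
		for (i = 0; i < m->m_len; i++)
			if (xml_append(out, outlen, "%02x",
			    (unsigned)m->m_data[i]) != 0)
				return (-1);
		if (xml_append(out, outlen, "</mbuf>") != 0)
			return (-1);
	}
	if (xml_append(out, outlen, "</chain>") != 0)
		return (-1);
	return (n);
}
```

  The userland side would then parse this text and rebuild an
  equivalent chain, so no kernel pointers ever need translating.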

  In any case, I would be very interested in seeing what you come up with,
  as this could be a very useful diagnostic tool.

-Bosko
--
Bosko Milekic [EMAIL PROTECTED] [EMAIL PROTECTED]
For the wicked / Carry us away / Captivity require from us a song /
How can we sing king alpha's song in a strange land? --Bob Marley



Re: Playing with mbuf in userland

2004-08-22 Thread Bosko Milekic

I wrote:
  Another option to look into would be to implement a sysctl(8)-exported
  handler that iterates over the mbuf chain and prints out the mbuf chains
  in something like XML, which your userland application can then more or less
  easily parse, and reproduce the chain (fake it up) in userland.  This
  solution is rather attractive because you can do all sorts of things
  with the intermediate-parsing-language right from the kernel, as well
  as from userland (at the parsing stages).  To see an example of a sysctl(8)
  handler, refer to src/sys/vm/uma_core.c (the bottom) in FreeBSD 5.x.

I should also add: you should have various sysctl OIDs that call this
handler passing, say, the mbuf chain as an argument (a reference to the
top mbuf).  This way certain OIDs can send out a snapshot of an mbuf
chain at a particular point in the stack, and others can send snapshots
from the socket buffer and driver entry-points (you can get some perception
of how the chain changes as it makes its way up-and-down the layers).

-Bosko
 


Re: [HEADS-UP] mbuma is in the tree

2004-06-02 Thread Bosko Milekic

  Bosko,

[deletia]

  are you going to convert mbuf tag allocator to UMA? Now
tags are allocated with malloc(). AFAIK, tags are used heavily in pf,
and forthcoming ALTQ. Moving to UMA should affect their performance
positively.

  First off, malloc() *is* UMA.  With mbuma in the tree, I don't believe
  we have any remaining custom-allocators in the tree.

  As for what to do with m_tags, it is still unclear to me.  Personally,
  I'm conflicted about their use.  On one hand, they offer a clean way
  to attach metadata to packets, but on the other hand they are quite
  expensive.

  If you read the paper on mbuma, you'll notice that I point out that it
  would be worth investigating whether, in scenarios where an m_tag is
  ALWAYS required per packet (e.g., MAC), providing a secondary zone with
  pre-allocated m_tags for packet headers might be worth it.  Prior to
  this work, however, I suggest we investigate the possibility of using
  smaller mini-mbufs whenever clusters are used so that space wastage
  is reduced.

  -Bosko



[HEADS-UP] mbuma is in the tree

2004-05-31 Thread Bosko Milekic
(Hello Chris Haalboom? :-))

Hello,

  In order to avoid having to type everything again, I'll refer
  to the commit log.  PLEASE READ IT IN FULL:

Bring in mbuma to replace mballoc.

mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.

Extensions to UMA worth noting:
  - Better layering between slab <-> zone caches; introduce
Keg structure which splits off slab cache away from the
zone structure and allows multiple zones to be stacked
on top of a single Keg (single type of slab cache);
perhaps we should look into defining a subset API on
top of the Keg for special use by malloc(9),
for example.
  - UMA_ZONE_REFCNT zones can now be added, and reference
counters automagically allocated for them within the end
of the associated slab structures.  uma_find_refcnt()
does a kextract to fetch the slab struct reference from
the underlying page, and lookup the corresponding refcnt.

mbuma things worth noting:
  - integrates mbuf & cluster allocations with extended UMA
and provides caches for commonly-allocated items; defines
several zones (two primary, one secondary) and two kegs.
  - change up certain code paths that always used to do:
m_get() + m_clget() to instead just use m_getcl() and
try to take advantage of the newly defined secondary
Packet zone.
  - netstat(1) and systat(1) quickly hacked up to do basic
stat reporting but additional stats work needs to be
done once some other details within UMA have been taken
care of and it becomes clearer how stats will work
within the modified framework.

From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used.  The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.

Additional things worth noting/known issues (READ):
   - One report of 'ips' (ServeRAID) driver acting really
 slow in conjunction with mbuma.  Need more data.
 Latest report is that ips is equally sucking with
 and without mbuma.
   - Giant leak in NFS code sometimes occurs, can't
 reproduce but currently analyzing; brueffer is
 able to reproduce but THIS IS NOT an mbuma-specific
 problem and currently occurs even WITHOUT mbuma.
   - Issues in network locking: there is at least one
 code path in the rip code where one or more locks
 are acquired and we end up in m_prepend() with
 M_WAITOK, which causes WITNESS to whine from within
 UMA.  Current temporary solution: force all UMA
 allocations to be M_NOWAIT from within UMA for now
 to avoid deadlocks unless WITNESS is defined and we
 can determine with certainty that we're not holding
 any locks when we're M_WAITOK.
   - I've seen at least one weird socketbuffer empty-but-
 mbuf-still-attached panic.  I don't believe this
 to be related to mbuma but please keep your eyes
 open, turn on debugging, and capture crash dumps.

This change removes more code than it adds.

A paper is available detailing the change and considering
various performance issues, it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.

Testing and Debugging:
rwatson,
brueffer,
Ketrien I. Saihr-Kesenchedra,
...
Reviewed by: Lots of people (for different parts)


SHOULD YOU HAVE ANY ISSUES:

  - Turn on INVARIANTS
  - Turn on WITNESS
  - Send stack trace and if possible capture crash dump
  - Might require further information from you, please provide
reachable Email address.
  - When you Email me, please include MBUMA in the Subject
line.

Cheers,
Bosko



Network buffer allocations: mbuma, PLEASE TEST

2004-05-26 Thread Bosko Milekic
Hi,

  If you're running -CURRENT, please test this:

http://people.freebsd.org/~bmilekic/code/mbuma2.diff

  It consists of several extensions to UMA, with mbuf & cluster
  allocation built on top of them.

  Once you apply the patch from src/, you need to rebuild and
  reinstall src/usr.bin/netstat, src/usr.bin/systat, and then
  a new kernel.  When you're configuring your new kernel,
  you should remove the NMBCLUSTERS compile-time option, it's
  no longer needed.  Clusters will still be capped off
  according to maxusers (which is auto-tuned itself).
  Alternately, if you want theoretically unlimited number
  of clusters, you can tune the boot-time kern.ipc.nmbclusters
  tunable to zero.
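
  For reference, a boot-time tunable like this is normally set in
  /boot/loader.conf.  A minimal fragment, assuming the usual
  name="value" loader.conf syntax:

```
# /boot/loader.conf
# Remove the cap on mbuf clusters (theoretically unlimited).
kern.ipc.nmbclusters="0"
```

  Leave the line out entirely to keep the default cap derived from
  maxusers.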

  Unless final issues arise I'm going to commit this tomorrow
  morning; it's been tested already quite a bit, and performance
  considered.  A paper is available and was presented at
  BSDCan 2004; in case you missed it:

http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf

  It has been looked at for quite some time now.  Additional
  code cleanups will need to occur following commit, maybe.
  Future work is also possible, see the paper if you're
  interested in taking some of it on.

  Oh, and keep me in the CC; I have no idea if I'm
  subscribed to these lists anymore.  You should also follow
  up to this thread on -net and not on -hackers (trim
  -hackers from CC in the future).  Thanks and happy
  hacking!

Regards,
--
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]
 


Re: panic in uma_zdestroy

2003-08-01 Thread Bosko Milekic

I screwed up...  fix coming shortly.  Sorry!

On Fri, Aug 01, 2003 at 07:00:19PM +0200, Harti Brandt wrote:
 
 Hi,
 
 with a kernel from yesterday I get a panic on an SMP system when I
 destroy a zone immediately after creating it. I have a driver (with the
 probe routine set to return ENXIO) and the following module event
 function:
 
 /*
  * Module loaded/unloaded
  */
 int
 en_modevent(module_t mod __unused, int event, void *arg __unused)
 {
 
         switch (event) {

         case MOD_LOAD:
                 en_vcc_zone = uma_zcreate("EN vccs", sizeof(struct en_vcc),
                     NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
                 if (en_vcc_zone == NULL)
                         return (ENOMEM);
                 break;

         case MOD_UNLOAD:
                 uma_zdestroy(en_vcc_zone);
                 break;
         }
         return (0);
 }
 
 When I load the module and unload it I get a panic with the following trace:
 
 db> trace
 uma_zfree_internal(c083a200,0,0,0,c627b3c4) at uma_zfree_internal+0xb0
 cache_drain(c627b300,1,c030547c,245,c0369740) at cache_drain+0xe3
 zone_drain_common(c627b300,1,c030547c,461,0) at zone_drain_common+0x62
 zone_dtor(c627b300,f4,0,dad4fc40,c01b0255) at zone_dtor+0x55
 uma_zfree_internal(c0369660,c627b300,0,0,dad4fc60) at uma_zfree_internal+0x35
 uma_zdestroy(c627b300,dad4fc84,c01adce0,c6302c40,1) at uma_zdestroy+0x2a
 en_modevent(c6302c40,1,0,c5ea2000,c632c700) at en_modevent+0x4b
 driver_module_handler(c6302c40,1,c658a804,dad4fcc0,c0183f61) at driver_module_handler+0x120
 module_unload(c6302c40,c02f00d9,1f1,0,0) at module_unload+0x1e
 linker_file_unload(c632c700,0,c02f00d9,31b,c632f250) at linker_file_unload+0x81
 kldunload(c6046ab0,dad4fd10,c0309978,3ee,1) at kldunload+0x9b
 syscall(2f,2f,2f,bfbffd03,bfbffc1c) at syscall+0x2b3
 Xint0x80_syscall() at Xint0x80_syscall+0x1d
 --- syscall (305, FreeBSD ELF32, kldunload), eip = 0x80485b3, esp = 0xbfbff76c, ebp = 0xbfbffbcc ---
 db>
 
 The uma_zfree_internal call is the first one in cache_drain (the one
 that frees uc_allocbucket). The second argument to uma_zfree_internal in the
 trace above seems rather strange to me. What is the problem here?
 
 harti
 
 -- 
 harti brandt,
 http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
 [EMAIL PROTECTED], [EMAIL PROTECTED]
 

-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: panic in uma_zdestroy

2003-08-01 Thread Bosko Milekic

On Fri, Aug 01, 2003 at 01:32:05PM +, Bosko Milekic wrote:
 
 I screwed up...  fix coming shortly.  Sorry!
 
 On Fri, Aug 01, 2003 at 07:00:19PM +0200, Harti Brandt wrote:
  
  Hi,
  
  with a kernel from yesterday I get a panic on an SMP system when I
  destroy a zone immediately after creating it. I have a driver (with the
  probe routine set to return ENXIO) and the following module event
  function:
...

  Again, I apologize.  I just committed something which should fix
  this:

bmilekic    2003/08/01 10:42:27 PDT

  FreeBSD src repository

  Modified files:
  sys/vm   uma_core.c
  Log:
  Only free the pcpu cache buckets if they are non-NULL.

  Crashed this person's machine: harti
  Pointy-hat to: me

  Revision  Changes  Path
  1.70      +6 -4    src/sys/vm/uma_core.c

  Let me know if you're still having problems.
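
  The shape of that fix, sketched in userland: the drain loop only
  hands a bucket to the free routine when the pointer is non-NULL, which
  is the case for a zone that was created and destroyed before a given
  CPU ever allocated from it.  This is not the actual revision 1.70
  diff.  The struct layouts, the field names allocbucket and freebucket,
  and the functions bucket_free() and cache_drain() are all stand-ins
  for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative stand-ins, not the real UMA structures. */
struct bucket {
	int   count;
	void *items[16];
};

struct cpu_cache {
	struct bucket *allocbucket;	/* may be NULL if never used */
	struct bucket *freebucket;	/* may be NULL if never used */
};

/* Like the kernel free path, this dereferences b and must not see NULL. */
static void
bucket_free(struct bucket *b)
{
	assert(b != NULL);
	b->count = 0;
	free(b);
}

/* Drain every per-CPU cache; returns the number of buckets released. */
int
cache_drain(struct cpu_cache *caches, int ncpus)
{
	int cpu, freed = 0;

	for (cpu = 0; cpu < ncpus; cpu++) {
		struct cpu_cache *c = &caches[cpu];

		if (c->allocbucket != NULL) {	/* the guard that was missing */
			bucket_free(c->allocbucket);
			c->allocbucket = NULL;
			freed++;
		}
		if (c->freebucket != NULL) {
			bucket_free(c->freebucket);
			c->freebucket = NULL;
			freed++;
		}
	}
	return (freed);
}
```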

-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: complicated downgrade

2003-07-22 Thread Bosko Milekic

On Tue, Jul 22, 2003 at 09:01:06AM +0300, Valentin Nechayev wrote:
  Mon, Jul 21, 2003 at 23:40:05, des wrote about Re: complicated downgrade: 
 
  I need to downgrade a remote FreeBSD system from 5.1-release to 4.8-release
  remotely without any local help (except possible hitting Reset).
  Maybe if you tell us why you need to do this we can figure out a way
  for you to avoid doing it?
 
 System periodically hangs up. Average uptime is ~6 hours. No crash info
 is available. No serial console is available. Different invariants didn't
 help, AFAIK (this testing was done by another admin, so I'm not 100% sure).
 4.8 in any case is considered more stable, so switching can exclude
 some software problems or software-caused triggerings of hardware problems.

  This sounds like the same symptoms as the latest USB problem...
  when/if you track -current or even run one of the 5.x releases, it's
  key to realize that this is very active code that you're running; it's
  not the same thing as running 4.x, for example.  The code in 5.x is
  constantly actively changing, whereas the code in 4.x only receives
  comparatively well-regulated merges from 5.x, for the most part.
  Therefore, one of the things to always try is to update to the latest
  -current, rebuild, and see if you can reproduce.  Chances are, your
  problem may have been fixed and, if not, at least we can be confident
  that it's reproducible on your hardware with the latest sources.

 Just now question isn't so important because it was decided to move to another
 box (including more friendly environment), so my question is more theoretical
 than practical. But, there is opportunity to play with configs, so I'll try
 again to play with invariants, witnesses, etc.
 
 Thanks to all for help.
 
 
 -netch-

Cheers,
-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: SMP problem with uma_zalloc

2003-07-18 Thread Bosko Milekic

On Fri, Jul 18, 2003 at 07:05:58PM +0200, Harti Brandt wrote:
 
 Hi all,
 
 it seems there is a problem with the zone allocator in SMP systems.
 
 I have a zone, that has an upper limit on items that resolves to an
 upper limit of pages of 1. It turns out, that allocations from this
 zone get stuck from time to time. It seems to me, that the following
 happens:
 
 - on the first call to uma_zalloc a page is allocated and all the free
 items are put into the cache of the CPU. uz_free of the zone is 0 and
 uz_cachefree holds all the free items.
 
 - when the next call to uma_zalloc occurs on the same CPU, everything is
 fine. uma_zalloc just gets the next item from the cache.
 
 - when the call happens on another CPU, the code finds uz_free to be 0 and
 checks the page limit (uma_core.c:1492). It finds the limit already
 reached and puts the process to sleep (uma_zalloc was called with
 M_WAITOK).
 
 - the process may sleep there forever (depending on circumstances).
 
 If M_WAITOK is not set, the code will falsely return NULL while there
 are still free items (albeit in the cache of another CPU).
 
 I wonder whether this is intended behaviour. If yes, this should be
 definitely documented. uma_zone_set_max() seems to be documented only in
 the header file and it does not mention, that free items may not actually
 be allocatable because they happen to sit in another CPU's cache.
 
 If it is not intended (I would prefer this), I wonder how one can get the
 items out of another's CPU cache. I'm not too familiar with this code.
 I suppose this should be done somewhere around uma_core.c:1485?
 
 Regards,
 harti
 -- 
 harti brandt,
 http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
 [EMAIL PROTECTED], [EMAIL PROTECTED]

  If the per-cpu caches are relatively small (which they ought to be,
  especially when you've hit a maximum number of allocations from the
  zone), then this is actually not that bad of a behavior.

  I spoke to Jeff about this and it seemed to me that he was leaning
  toward keeping the behavior this way and, in fact, also perhaps _not_
  even doing an internal free to the zone when UMA_ZFLAG_FULL is in
  effect but we still have space in the pcpu cache.  While I'm not sure
  if going that far is a good idea, I _don't_ really think that the
  current behavior is a bad idea.  As mentioned, when you have a zone
  that is mostly starved, all future frees will go back to the zone and
  not the per-cpu caches, but if you have some free items in another
  per-cpu cache, you're not likely to hit a starvation situation unless
  something is horribly wrong.  And having the free code actually drain
  the per-cpu caches in a zone-full situation may lead to bad behavior
  under heavy load.  Think about what happens under heavy load... your
  zone is starved and if you then flush all the pcpu caches and the load
  is still heavy, you're likely to have other threads try to allocate
  anyway, so they'll end up having to dip into the zone anyway;
  therefore, there doesn't seem to be much of a reason to push the
  cached objects back into the zone (if they're going to leave it again
  soon anyway).
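
  To make the sequence under discussion easier to follow, here is a toy
  model of it in plain C.  It is only a simulation of the behavior
  described in this thread, not UMA code: struct toy_zone, toy_alloc(),
  toy_free(), the two-CPU setup and the four-items-per-page figure are
  all invented.  Allocation refills the calling CPU's cache from the
  zone and fails once the page limit is reached, without looking at
  other CPUs' caches; frees on a zone at its limit go back to the zone
  rather than to a per-CPU cache.

```c
#include <assert.h>

#define NCPU           2
#define ITEMS_PER_PAGE 4	/* arbitrary figure for the model */

struct toy_zone {
	int pages;		/* pages currently backing the zone */
	int max_pages;		/* limit, as uma_zone_set_max() would impose */
	int zone_free;		/* free items held by the zone itself */
	int cache_free[NCPU];	/* free items in each CPU's private cache */
};

/* Returns 1 on success, 0 where the real code would sleep or return NULL. */
int
toy_alloc(struct toy_zone *z, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	if (z->cache_free[cpu] == 0) {
		if (z->zone_free == 0) {
			if (z->pages >= z->max_pages)
				return (0);	/* other caches are not consulted */
			z->pages++;
			z->zone_free += ITEMS_PER_PAGE;
		}
		/* Refill: move the zone's free items into this CPU's cache. */
		z->cache_free[cpu] = z->zone_free;
		z->zone_free = 0;
	}
	z->cache_free[cpu]--;
	return (1);
}

/* Free one item; a zone at its limit takes it back directly. */
void
toy_free(struct toy_zone *z, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	if (z->pages >= z->max_pages)
		z->zone_free++;
	else
		z->cache_free[cpu]++;
}
```

  With max_pages set to 1, an allocation on CPU 0 followed by one on
  CPU 1 fails on CPU 1 while CPU 0 still caches three free items.  As
  soon as anything is freed, the item lands in the zone and CPU 1 can
  allocate again.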

-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: running 5.1-RELEASE with no procfs mounted (lockups?)

2003-07-16 Thread Bosko Milekic

On Tue, Jul 15, 2003 at 10:43:19PM -0700, Josh Brooks wrote:
[...]
 One of the systems, the one I am doing all the work on, is an SMP system,
 and it keeps locking up on me - the lockups are always the same - things
 are going fine, and suddenly a process fails to complete - maybe it is
 pwd, maybe I type   :q!   in vi  and it just sticks there - either
 way, randomly, processes just begin to lock up ... if I log in on another
 session, I can see the PID, but I cannot kill it - I can kill -9 (PID) 100
 times and it will still exist.  Eventually the entire system will lock up,
 although you can always ping the system.

  When this happens and you start another session to kill the original
  process, can you perhaps run 'ps pid -l' and get the MWCHAN column?
  The process could be stuck blocking somewhere in the kernel, which is
  why your signal is not being delivered.

  Anyway, this is just one possibility.  See if all the processes you
  describe as 'frozen' have the same MWCHAN and, if so, what is it?

-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: NSS Modules

2003-07-10 Thread Bosko Milekic

On Wed, Jul 09, 2003 at 06:27:21PM -0400, Ben Goodwin wrote:
  Hi guys ...
 
 I thought I'd give you a heads-up that I'm porting libnss-mysql to the NSS
 API that FreeBSD 5.1 has adopted in case anyone has input, suggestions,
 wants to test, etc..
 I'm also curious about including it eventually .. via ports or something
 perhaps?
 Is anyone else developing NSS modules for FreeBSD?
 
 I believe I've figured out the API .. I've got a rudimentary test working,
 so ...
 
 Actually, I do have one question .. As I support more operating systems,
 I've wondered about how to autoconf the different APIs .. right now if I see
 nss.h I know it's one OS, and if I see nsswitch.h I know it's the other
 (Linux vs. Solaris) .. but that doesn't hold true with FreeBSD added to the
 mix.  Any recommendations on what I could do to create an API define that
 holds the current O/S in a clean and reliable fashion?
 Thanks!

  You should be able to do:

  #if defined(__FreeBSD__)

  to test if you're on FreeBSD.  This is built as part of the
  freebsd-spec in gcc so it will be defined at least if you're using our
  system (stock) compiler.

  Ideally, though, the API would be the same. :-)

 -=| Ben
 
 http://libnss-mysql.sourceforge.net

-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: NSS Modules

2003-07-10 Thread Bosko Milekic

On Thu, Jul 10, 2003 at 05:12:03PM -0400, Ben Goodwin wrote:
 I'd like to support Sun's cc, however .. so I'm betting that isn't defined
 (I will check) ...  I figured that would be available under gcc but assumed
 it wasn't portable enough ...

  You can still test for whether or not it is defined.

#if defined(__FreeBSD__)
  /* Do freebsd-specific stuff */
#else
  /* Other systems? */
#endif
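
  Extending the same test to the three-way case from the original
  question, one possible layout is a single NSS_API define that the rest
  of the code switches on.  The NSS_API_* names and nss_api_name() are
  made up for this sketch.  The __sun and __linux__ tests are
  assumptions about what the Solaris and Linux compilers predefine, so
  check them against the compilers you actually support (Sun's cc in
  particular).

```c
#include <assert.h>

/* Hypothetical API identifiers for this sketch. */
#define NSS_API_UNKNOWN 0
#define NSS_API_GLIBC   1	/* Linux, nss.h style */
#define NSS_API_SOLARIS 2	/* Solaris, nsswitch.h style */
#define NSS_API_FREEBSD 3	/* FreeBSD 5.1 and later */

#if defined(__FreeBSD__)
#define NSS_API NSS_API_FREEBSD
#elif defined(__sun)		/* assumed Solaris compiler macro */
#define NSS_API NSS_API_SOLARIS
#elif defined(__linux__)	/* assumed Linux compiler macro */
#define NSS_API NSS_API_GLIBC
#else
#define NSS_API NSS_API_UNKNOWN
#endif

/* Human-readable name for an NSS_API_* value, e.g. for a startup log. */
const char *
nss_api_name(int api)
{
	assert(api >= NSS_API_UNKNOWN && api <= NSS_API_FREEBSD);
	switch (api) {
	case NSS_API_FREEBSD:
		return ("FreeBSD");
	case NSS_API_SOLARIS:
		return ("Solaris (nsswitch.h)");
	case NSS_API_GLIBC:
		return ("Linux (nss.h)");
	default:
		return ("unknown");
	}
}
```

  Testing the OS macro first and the headers second avoids the
  ambiguity of deciding from nss.h versus nsswitch.h alone.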

-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: Kernel Support for System Call Performance Monitoring

2003-06-16 Thread Bosko Milekic

On Mon, Jun 16, 2003 at 01:20:03PM -0400, Yaoping Ruan wrote:
 We have been working on improving Web server performance on
 FreeBSD, and think you may be interested in the results and
 techniques we used. Specifically, we focus on the SpecWeb99
 benchmark and the Flash Web Server, and have roughly quadrupled
 its performance. We did this by adding support for a very
 low-cost kernel performance monitoring system, which allowed
 us to find and fix a number of bad interactions between the
 server and the OS. We additionally augmented one of the system
 calls, sendfile, to be more useful for this kind of server.
 We think that our observations may be useful for other servers,
 and may present opportunities for performance improvement in
 FreeBSD.
 
 A paper describing our system can be found at
 http://www.cs.princeton.edu/~yruan/DeBox and we can provide the
 patches we made if anyone's interested. We welcome any comments
 and feedback that you have.

  First off, thank you for choosing FreeBSD for your research.
  The more effort is put into doing this sort of research, the better it
  is for both the academic community and the industry.

  I've read your paper and have a few brief notes:

  - On DeBox implementation.  I understand that the DeBox implementation
is primarily a tool used for tracking down potential application
bottlenecks and so the relative importance of the crudeness of the
implementation is not so high.  However, I'm looking at this from
the perspective of introducing DeBox as a permanent option in
FreeBSD, and two immediate problems are:

1) User-visible DeBoxInfo structure has the magic number 5
   PerSleepInfo structs and the magic number 200 CallTrace
   structs.  It seems that it would be somewhat less crude to turn
   the struct arrays in DeBoxInfo into pointers in which case you
   have several options.  You could provide a library to link
   applications compiled for DeBox use with that would take care of
   allocating the space in which to store maxSleeps and
   maxTrace-worth of memory and hooking the data into resultBuf or
   providing the addresses as separate arguments to the
   DeBoxControl() system call.  For what concerns the kernel, you
   could take a similar approach and dynamically pre-allocate the
   PerSleepInfo and CallTrace structures, based on the requirements
   given by the DeBoxControl system call.

2) The problem of modifying entry-exit paths in function calls.
   Admittedly, this is hard, but crudely modifying a select number
   of functions to Do The Right Thing for what concerns call tracing
   is hard to justify from a general perspective.  I don't mean to
   spread FUD here; the change you made is totally OK from a
   measurement perspective and serves great for the paper, it's just
   tougher to integrate this stuff into the mainline code.
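
    To illustrate point 1, here is a rough userland sketch of what a
    pointer-based layout could look like.  Only the names DeBoxInfo,
    PerSleepInfo, CallTrace, maxSleeps and maxTrace come from the
    paper; the placeholder fields, the DeBoxInfoDyn name and the
    debox_info_alloc()/debox_info_free() helpers are hypothetical and
    are not part of DeBox.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder record types; the real field lists are in the DeBox paper. */
struct PerSleepInfo { int placeholder; };
struct CallTrace    { int placeholder; };

/*
 * Hypothetical variant of DeBoxInfo that points at caller-sized arrays
 * instead of embedding 5 PerSleepInfo and 200 CallTrace slots.
 */
struct DeBoxInfoDyn {
	int                  maxSleeps;	/* entries in sleeps[] */
	int                  maxTrace;	/* entries in traces[] */
	struct PerSleepInfo *sleeps;
	struct CallTrace    *traces;
};

/* Release a structure obtained from debox_info_alloc(); NULL is a no-op. */
void
debox_info_free(struct DeBoxInfoDyn *info)
{
	if (info == NULL)
		return;
	free(info->sleeps);
	free(info->traces);
	free(info);
}

/* Allocate the structure and both arrays; returns NULL on failure. */
struct DeBoxInfoDyn *
debox_info_alloc(int maxSleeps, int maxTrace)
{
	struct DeBoxInfoDyn *info;

	assert(maxSleeps > 0 && maxTrace > 0);
	info = calloc(1, sizeof(*info));
	if (info == NULL)
		return (NULL);
	info->sleeps = calloc((size_t)maxSleeps, sizeof(*info->sleeps));
	info->traces = calloc((size_t)maxTrace, sizeof(*info->traces));
	if (info->sleeps == NULL || info->traces == NULL) {
		debox_info_free(info);
		return (NULL);
	}
	info->maxSleeps = maxSleeps;
	info->maxTrace = maxTrace;
	return (info);
}
```

    A support library linked into DeBox-enabled applications could do
    this allocation once and pass the addresses to the kernel.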

  - On the Case Study.  I was most interested in the sendfile
modifications you talk about and would be interested in seeing
patches.  I know that some of the modifications you mention have
already been done in 5.x; Notably, if you have not already, you'll
want to glance at:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/uipc_syscalls.c? \
rev=1.144&content-type=text/x-cvsweb-markup

(regarding your mapping caching in sf_bufs)

and this [gigantic] thread:

http://www.freebsd.org/cgi/getmsg.cgi?fetch=12432+15802+ \
/usr/local/www/db/text/2003/freebsd-arch/20030601.freebsd-arch

(subject: sendfile(2) SF_NOPUSH flag proposal on freebsd-arch@, at
 least).

   You may want to contact Igor Sysoev or other concerned parties in
   that thread to show them that you actually have performance results
   resulting from such a change.

   Finally, I'd like to make a longshot proposal; more of an "if
   you have the time" follow-up to your work that someone might be able
   to perform, and that would certainly be interesting to see: how all
   this works out when forward-ported to FreeBSD 5.x.

 Sincerely
 
 - Yaoping
 [EMAIL PROTECTED]

Regards,
-- 
Bosko Milekic  *  [EMAIL PROTECTED]  *  [EMAIL PROTECTED]
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/


Re: HEADS UP! Major commits in the tree coming soon

2003-05-30 Thread Bosko Milekic

For the benefit of the majority:  This post was FAKE.

Now please return to your regularly scheduled discussion and kindly
ignore all future posts to this thread.

-Bosko

On Thu, May 29, 2003 at 01:50:33PM -0400, Kenneth Culver wrote:
  The HEAD code freeze was extended by three days to
  allow for some final pending work to be committed and
  prepare 5.1 to be a good release. The code freeze will
 
  likely end sometime tomorrow, May 30.
 
  We ask that large scale changes still be deferred
  until after 5.1 is actually released so that any
   problems can be dealt with.  The release
  engineering team will send out emails explicitly
  stating when HEAD has thawed and when large changes
  like new compilers and dynamic-linked worlds can go
  in.
 
  The most important changes I'm going to commit today:
 
  - Remove gcc and replace it with a new TenDRA
  snapshot.
 
 I'm just wondering... but is there a reason why gcc is being replaced? Is
 there a page or a previous list mail that explains the reasons? URL?
 Thanks.
 
  - Remove GNU tar.
  - Fix httpd.ko to make it work on buggy AMD
  processors.
  - Drop support for 386 and 486 cpus.
  - Remove ext2 support (GPL encumbered).
  - Add perl 5.8 *and* python 2.2 to base.
  - Remove Sendmail and replace it with Postfix.
 
  If anyone has any reason why these should not be
  committed, I'll give a 5 hours grace time. Send
  replies
  to the list.
 
  Thank you.
 
   Thorsten and the rest of the release engineering team.
 
 Thanks
 
 Ken
 
 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]
 

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]



Re: bootp_subr.c

2003-02-21 Thread Bosko Milekic

On Fri, Feb 21, 2003 at 03:06:51PM +, omestre wrote:
 
 
  Hello,
  I'm working on FreeBSD diskless machine projects... and
 I have written a patch to the bootp_subr.c code (luigi's code).
  I have posted a PR too. (kern/46174).
 
  luigi did not reply... no one did.
  I have more contact with Linux. Is this the FreeBSD world?
 
  Thanks!
 
 [EMAIL PROTECTED]
 SDF Public Access UNIX System - http://sdf.lonestar.org

  Hi,

One of the problems could be that I don't see a patch in the PR.
You included the source file itself but there is no indication as to
what exactly was changed.

diff -u bootp_subr.c.old bootp_subr.c

would be good enough.

Or if you have the sources checked out with cvs,

cvs diff -u bootp_subr.c

I think this would generate a quicker response.  As for the idea
itself, it sounds reasonable.

Thanks,
-- 
Bosko Milekic * [EMAIL PROTECTED] * [EMAIL PROTECTED]


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: Fast interrupts

2002-08-26 Thread Bosko Milekic


On Mon, Aug 26, 2002 at 09:41:43AM -0700, Maksim Yevmenkin wrote:
 John Baldwin wrote:
  
  On 26-Aug-2002 M. Warner Losh wrote:
   can you call wakeup(9) from a fast interrupt handler?  
 
 [ ...]
 
   The only reason I ask is because sio seems to go out of its way to
   schedule a soft interrupt to deal with waking up processes, which then
   calls wakeup...
  
  Since wakeup only needs a spin lock, it is probably ok.  You just can't call
  anything that would sleep (in any interrupt handler) or block on a non-spin
  mutex.
 
 what is the general locking technique for interrupt handlers?
 there must be some sort of locking, right?

  You are allowed to use mutex locks (both spin and MTX_DEF), but you
  may only use the former in fast interrupt handlers.

-- 
Bosko Milekic * [EMAIL PROTECTED] * [EMAIL PROTECTED]





Re: Fast interrupts

2002-08-26 Thread Bosko Milekic


On Mon, Aug 26, 2002 at 10:14:32AM -0700, Maksim Yevmenkin wrote:
 Bosko Milekic wrote:
  
  On Mon, Aug 26, 2002 at 09:41:43AM -0700, Maksim Yevmenkin wrote:
   John Baldwin wrote:
   
On 26-Aug-2002 M. Warner Losh wrote:
 can you call wakeup(9) from a fast interrupt handler?
  
   [ ...]
  
 The only reason I ask is because sio seems to go out of its way to
 schedule a soft interrupt to deal with waking up processes, which then
 calls wakeup...
   
Since wakeup only needs a spin lock, it is probably ok.  You just can't call
anything that would sleep (in any interrupt handler) or block on a non-spin
 ^^
 my understanding is that John was talking about any
 interrupt handler. Not just fast interrupt handler.

   Yeah, you can't call anything that would _sleep_ (e.g., msleep()).
   You could still grab a MTX_DEF mutex for a non-fast interrupt
   handler and possibly block waiting to get it.
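
   The rule discussed in this thread can be written down as a small
   userland model.  This is only a sketch: the enum names and
   lock_op_allowed() are hypothetical and are not kernel interfaces.
   It encodes nothing beyond what is said above (fast handlers may take
   spin mutexes only; other interrupt handlers may also block on a
   MTX_DEF mutex; no handler may sleep).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not kernel interfaces. */
enum intr_ctx { CTX_FAST_INTR, CTX_ITHREAD };
enum lock_op { OP_SPIN_MUTEX, OP_DEF_MUTEX, OP_SLEEP };

/*
 * Return true if a handler running in context 'ctx' may perform 'op',
 * according to the rule stated in the thread.
 */
static bool
lock_op_allowed(enum intr_ctx ctx, enum lock_op op)
{
	assert(ctx == CTX_FAST_INTR || ctx == CTX_ITHREAD);

	switch (op) {
	case OP_SPIN_MUTEX:
		return (true);			/* fine in any handler */
	case OP_DEF_MUTEX:
		return (ctx == CTX_ITHREAD);	/* may block: not in fast handlers */
	case OP_SLEEP:
		return (false);			/* e.g. msleep(): never in a handler */
	}
	return (false);
}
```

   For example, lock_op_allowed(CTX_FAST_INTR, OP_DEF_MUTEX) is false,
   while lock_op_allowed(CTX_ITHREAD, OP_DEF_MUTEX) is true.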

-- 
Bosko Milekic * [EMAIL PROTECTED] * [EMAIL PROTECTED]





Re: Memory corruption in CURRENT

2002-08-22 Thread Bosko Milekic


We have seen weird problems regarding the pmap PG_G related stuff (well
sort of, it has to do with PSE and PG_G) on ppro and pII chips
(apparently, this is not the case with at least Xeons) but what
happened, for the record, was this:

We would enable PSE and switch the pde corresponding to the first 4M
to the new entry describing a 4M page, instead of the one describing the
location of the ptes covering those 4M.  Then, what we would do is walk
all the ptes, including those old stale and useless ones that previously
described those first 4M and set the PG_G bit there (Note: we've already
set PG_G on our 4M page).  Normally, we don't really need to touch the
old ptes but we did it just because it was more convenient (i.e. a few
lines less code).  Oddly enough, on the ppro and pII what would happen
is that we would page fault on that page where we kept the old ptes
covering those first 4M, and only on that page!  The other ptes - the
ones that actually mattered - were all fine.  The ptes are mapped above
the 4M so I don't see how changing the pde for those first 4M would have
done anything.  To fix the problem, we (actually Peter) committed code
that basically just jumps beyond that first page of stale ptes when
setting the PG_G bit for the 4K pages, and since then, the problem seems
to have gone away.  Although we are not sure, this seems like a silicon
bug.
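
To make the workaround concrete, here is a userland toy model of it; it
is not the actual pmap code.  PG_V, PG_PS and PG_G are the usual i386
page-table bits, but NPTEPG, the flat pte array, the PG_V check, and
set_global_4k() are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PG_V	0x001u	/* entry is valid */
#define PG_PS	0x080u	/* pde maps a 4M page (PSE) */
#define PG_G	0x100u	/* global mapping */
#define NPTEPG	1024	/* 4-byte ptes per page-table page (non-PAE) */

/*
 * Set PG_G on every valid 4K pte in ptes[0..npte), except those in the
 * first page-table page.  Those entries used to describe the first 4M
 * and are stale once the pde for that range describes a 4M page.
 */
static void
set_global_4k(uint32_t *ptes, size_t npte)
{
	size_t i;

	assert(ptes != NULL);

	/* Start past the first page of (stale) ptes instead of at 0. */
	for (i = NPTEPG; i < npte; i++) {
		if (ptes[i] & PG_V)
			ptes[i] |= PG_G;
	}
}
```

The only difference from the "convenient" version described above is
the loop's starting index: NPTEPG rather than 0.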

Since then, Peter had some work planned to load the kernel above the
first 4M to see if that fixed the problems.  I'm wondering if this
problem on the PIVs could be related.  Please let us know if the removal
of those two options really makes 5-10 buildworlds in a row work out for
you.

Regards,
Bosko

On Thu, Aug 22, 2002 at 01:34:11PM +0200, Mark Santcroos wrote:
 On Thu, Aug 22, 2002 at 04:23:46AM -0700, Terry Lambert wrote:
  Ugh!  Wait until it seems to work for a statistically significant
  sample size, and for more than one person before calling it happy!
  
  Also, I'm not sure looking at the code whether or not the PG_G is
  truly significant, or just perturbs the workaround.  The problem
  I've referred to in my hunch here is actually related solely to
  the PSE, but with the recent code reorganization in locore.s, etc.,
  it could have become more significant.
 
 I was just giving a slight report, not yelling hallelujah yet ;-)
 
 It's doing the 2nd buildworld now.
 
 Do you also want me to try to split up the disabling of the two options?
 
 Mark
 
 -- 
 Mark Santcroos                    RIPE Network Coordination Centre
 http://www.ripe.net/home/mark/    New Projects Group/TTM
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with unsubscribe freebsd-current in the body of the message
 

-- 
Bosko Milekic * [EMAIL PROTECTED] * [EMAIL PROTECTED]





Re: m_freem() in tcp_respond()

2002-08-10 Thread Bosko Milekic
 in panic (fmt=0xc03edd84 from debugger)
 at /usr/src/sys/kern/kern_shutdown.c:595
 #3  0xc014cbb9 in db_panic (addr=-1071517796, have_addr=0, count=-1, 
 modif=0xdc319b3c ) at /usr/src/sys/ddb/db_command.c:435
 #4  0xc014cb59 in db_command (last_cmdp=0xc0463918, cmd_table=0xc0463758, 
 aux_cmd_tablep=0xc04c0cb8) at /usr/src/sys/ddb/db_command.c:333
 #5  0xc014cc1e in db_command_loop () at /usr/src/sys/ddb/db_command.c:457
 #6  0xc014ed5b in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:71
 #7  0xc03b84ce in kdb_trap (type=12, code=0, regs=0xdc319c90)
 at /usr/src/sys/i386/i386/db_interface.c:158
 #8  0xc03c8e14 in trap_fatal (frame=0xdc319c90, eva=0)
 at /usr/src/sys/i386/i386/trap.c:969
 #9  0xc03c8aed in trap_pfault (frame=0xdc319c90, usermode=0, eva=0)
 at /usr/src/sys/i386/i386/trap.c:867
 #10 0xc03c8667 in trap (frame={tf_fs = 16, tf_es = -600768496, tf_ds = 16, 
   tf_edi = -1048332032, tf_esi = 6422528, tf_ebp = -600728360, 
   tf_isp = -600728388, tf_ebx = 0, tf_edx = 6756410, tf_ecx = 0, 
   tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1071517796, tf_cs = 8, 
   tf_eflags = 66199, tf_esp = -1048331972, tf_ss = -1048331972})
 at /usr/src/sys/i386/i386/trap.c:466
 #11 0xc021ef9c in m_freem (m=0x0) at /usr/src/sys/kern/uipc_mbuf.c:706
 ---Type return to continue, or q return to quit---
 #12 0xc0273a0f in tcp_respond (tp=0x0, ipgen=0xc183b93c, th=0xc183b950, 
 m=0xc183b900, ack=2100704027, seq=0, flags=20)
 at /usr/src/sys/netinet/tcp_subr.c:396
 #13 0xc0271eff in tcp_input (m=0xc183b900, off0=20, proto=6)
 at /usr/src/sys/netinet/tcp_input.c:2204
 #14 0xc026b874 in ip_input (m=0xc183b900)
 at /usr/src/sys/netinet/ip_input.c:821
 #15 0xc026b8d3 in ipintr () at /usr/src/sys/netinet/ip_input.c:842
 #16 0xc03ba809 in swi_net_next ()
 #17 0xc0224929 in connect (p=0xd86e1f20, uap=0xdc319f80)
 at /usr/src/sys/kern/uipc_syscalls.c:396
 #18 0xc03c90f5 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
   tf_edi = 22273, tf_esi = 3, tf_ebp = -1077938064, tf_isp = -600727596, 
   tf_ebx = 671650276, tf_edx = -1077938288, tf_ecx = 13, tf_eax = 98, 
   tf_trapno = 12, tf_err = 2, tf_eip = 672133692, tf_cs = 31, 
   tf_eflags = 659, tf_esp = -1077938252, tf_ss = 47})
 at /usr/src/sys/i386/i386/trap.c:1175
 #19 0xc03b93a5 in Xint0x80_syscall ()
 #20 0x2806fcbd in ?? ()
 #21 0x8048d88 in ?? ()
 #22 0x8048add in ?? ()
 (kgdb) frame 12
 #12 0xc0273a0f in tcp_respond (tp=0x0, ipgen=0xc183b93c, th=0xc183b950, 
 m=0xc183b900, ack=2100704027, seq=0, flags=20)
 at /usr/src/sys/netinet/tcp_subr.c:396
 396 m_freem(m->m_next);
 (kgdb) print m
 $1 = (struct mbuf *) 0xc183b900
 (kgdb) print m->m_hdr.mh_next
 $2 = (struct mbuf *) 0x0
 (kgdb) frame 11
 #11 0xc021ef9c in m_freem (m=0x0) at /usr/src/sys/kern/uipc_mbuf.c:706
 706 if (mcl_pool_now  mcl_pool_max  m->m_next == NULL 
 (kgdb) 
 
 
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with unsubscribe freebsd-net in the body of the message
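
  The session above shows m_freem() being handed a NULL pointer (the
  m_next of a single-mbuf chain) and faulting on its first dereference.
  A minimal userland sketch of a free loop that tolerates NULL follows.
  struct tmbuf, tm_free() and tm_freem() are toy stand-ins invented for
  this example; they are not the kernel's mbuf code.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct mbuf: only the chain link matters here. */
struct tmbuf {
	struct tmbuf *m_next;
};

/* Free one toy mbuf and return its successor (NULL at end of chain). */
static struct tmbuf *
tm_free(struct tmbuf *m)
{
	struct tmbuf *n;

	assert(m != NULL);
	n = m->m_next;
	free(m);
	return (n);
}

/*
 * Free a whole chain.  A NULL chain is simply a no-op because the
 * pointer is tested before it is ever dereferenced.
 * Returns the number of toy mbufs freed.
 */
static int
tm_freem(struct tmbuf *m)
{
	int freed = 0;

	while (m != NULL) {
		m = tm_free(m);
		freed++;
	}
	return (freed);
}
```

  With this shape, a caller doing the equivalent of
  tm_freem(m->m_next) is safe even when m is the only mbuf in its chain.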
 

-- 
Bosko Milekic * [EMAIL PROTECTED] * [EMAIL PROTECTED]





Re: ARM Port: Help with UMA subsystem needed

2002-08-03 Thread Bosko Milekic


On Sat, Aug 03, 2002 at 11:07:11AM -0400, Stephane E. Potvin wrote:
 On Thu, Aug 01, 2002 at 08:05:12PM -0400, Stephane E. Potvin wrote:
  I've been busy trying to bring the port back in sync with current.
  Now, each time I start my NetWinder, I get the following panic which
  I don't seem able to track the source. I would greatly appreciate if
  anybody knowledgeable with the UMA subsystem could give me a hint on
  what could be causing this.
  
 
 I just found out that reverting this commit fixes the problem. Any
 ideas about why other arches don't encounter the problem?
 
 jeff2002/06/19 13:49:44 PDT
 
   Modified files:
 sys/vm   uma.h uma_core.c
   Log:
   - Remove bogus use of kmem_alloc that was inherited from the old zone
 allocator.

   This looks like the problem, or at least that which uncovers the
   problem.  The pmap code is calling the zone allocator as well and
   what happens is that you recurse on the kmem_map lockmgr lock because
   you allocate recursively from kmem_map.  Previously, we could also
   allocate from kernel_map, if the kernel_map lockmgr lock wasn't held,
   so this way if we had a recursive call we would get around this
   problem.  I think this whole thing is flaky in general (if this was
   the way to get around recursion, we should fix it).

   JHB and/or JeffR: why is the kmem_map lockmgr lock not recursive? 

Regards,
-- 
Bosko Milekic * [EMAIL PROTECTED] * [EMAIL PROTECTED]





Re: ARM Port: Help with UMA subsystem needed

2002-08-03 Thread Bosko Milekic


On Sat, Aug 03, 2002 at 03:51:20PM -0400, Jeff Roberson wrote:
 These locks cannot be made recursive safely.  In this case you would just
 recurse forever and never satisfy the allocation.  All pmap modules do
 something like the following:
 
 static void *
 pmap_allocf(uma_zone_t zone, int bytes, u_int8_t *flags, int wait)
 {
         *flags = UMA_SLAB_PRIV;
         return (void *)kmem_alloc(kernel_map, bytes);
 }
 
 pvzone = uma_zcreate("PV ENTRY", sizeof (struct pv_entry), NULL, NULL,
     NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_VM);
 uma_zone_set_allocf(pvzone, pmap_allocf);
 uma_prealloc(pvzone, initial_pvs);

   Assuming ARM is following the same example, perhaps it needs to
   pre-allocate more pvs.  Although I somehow doubt it's doing the right
   thing here because the panic seems to happen early on during boot,
   according to the trace first provided.
 
 Is arm using a separate allocf?

 Jeff

-- 
Bosko Milekic * [EMAIL PROTECTED] * [EMAIL PROTECTED]





Re: tunings for many httpds...

2002-06-26 Thread Bosko Milekic


On Wed, Jun 26, 2002 at 11:24:47AM -0700, Matthew Dillon wrote:
 
 :[commenting live from ottawa]
 
 Pictures!  We want pictures!

It's pretty cool that the Linux camp has decided to do the
Summit stuff too (I'm assuming that this is a relatively new
phenomenon).  What's even cooler is that they picked Ottawa.  I
think I may be in Ottawa this weekend (Canada Day is then), so
if it's still going on then, or if something else is planned, I
would love to attend - if you folks don't mind a BSD developer
hanging around. :-)

   -Matt
   Matthew Dillon 
   [EMAIL PROTECTED]

Regards,
-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: m_cat() does not update m_pkthdr.len

2002-06-24 Thread Bosko Milekic


On Mon, Jun 24, 2002 at 09:15:09PM -0500, Mike Silbersack wrote:
 
 On Sun, 23 Jun 2002, Yahel Zamir wrote:
 
  Hi,
 
  During development of networking code in FreeBSD kernel,
  we noticed that m_cat(p1, p2) does NOT do some necessary things:
  p1->m_pkthdr.len += p2->m_pkthdr.len;
  p2->m_flags &= ~M_PKTHDR;
 
  Thanks,
  Yahel.
 
 Please notify Luigi or Bosko.  See the -net archives as well, I believe
 that this has been discussed in the last week or two.
 
 Mike Silby Silbersack

  There is not much about m_cat() that says that p1 has to be a packet
  header and that also says that p2 is a packet header or that, if it
  is, its packet headerness will be removed.  It is up to the
  surrounding code to make sure that it properly deals with p1 and p2
  before concatenating p2 to p1 (or after).  To place the added
  requirement that p1 be a packet header type mbuf would be placing an
  additional requirement on callers to m_cat().
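
  As an illustration of "the surrounding code deals with it", here is a
  userland sketch of a caller-side wrapper.  struct tmbuf, TM_PKTHDR,
  tm_cat() and tm_cat_pkt() are toy names made up for this example (the
  flag value is arbitrary); the real m_cat() and struct mbuf are more
  involved.

```c
#include <assert.h>
#include <stddef.h>

#define TM_PKTHDR 0x0002	/* toy flag standing in for M_PKTHDR */

/* Toy stand-in for struct mbuf. */
struct tmbuf {
	struct tmbuf *m_next;
	int m_len;		/* bytes held by this mbuf */
	int m_flags;
	int pkthdr_len;		/* total packet length; valid if TM_PKTHDR */
};

/* Link chain p2 onto the tail of chain p1; touches no header fields. */
static void
tm_cat(struct tmbuf *p1, struct tmbuf *p2)
{
	while (p1->m_next != NULL)
		p1 = p1->m_next;
	p1->m_next = p2;
}

/*
 * Caller-side wrapper for the case where both chains start with a
 * packet header: fold p2's length into p1's header, strip p2's header
 * flag, then concatenate.
 */
static void
tm_cat_pkt(struct tmbuf *p1, struct tmbuf *p2)
{
	assert((p1->m_flags & TM_PKTHDR) != 0);
	assert((p2->m_flags & TM_PKTHDR) != 0);

	p1->pkthdr_len += p2->pkthdr_len;
	p2->m_flags &= ~TM_PKTHDR;
	tm_cat(p1, p2);
}
```

  Keeping the header fix-up in the wrapper leaves the plain
  concatenation usable for chains that carry no packet header at all.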

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: Some small projects for mutt(1)

2002-06-20 Thread Bosko Milekic


On Thu, Jun 20, 2002 at 03:27:24PM -0500, Brandon D. Valentine wrote:
 On Thu, 20 Jun 2002, Bosko Milekic wrote:
 
 On Thu, Jun 20, 2002 at 01:10:39PM -0700, Matthew Hunt wrote:
  This shouldn't be hard to glue together without modifying mutt itself.
  Make a little program, foo, that takes the message on stdin, passes
  it through formail -x subject, massages it into a procmail rule, and
  appends it to some procmail rule file.  The massage step should include
  escaping characters that have special meanings in procmail regexps, and
  adding something like (Re: *)? at the beginning of the subject when
  appropriate.  Shouldn't be more than a screenful of Perl.
 
   Interesting.  How would you have a key bound sequence in mutt set off
 the script on the message, though?  For instance, if I do a ctrl+B, how
 would you ensure that the Right Thing happens, without modifying mutt
 code?
 
 Check out mutt2procmailrc written by my good friend timball:
 
 http://www.ghettohack.net/timball/

  Hey, this is awesome stuff!  Thanks!  How come we don't have a port?

 It rocks.
 
 Brandon D. Valentine
 -- 
 http://www.geekpunk.net [EMAIL PROTECTED]
 ++[++-][++-].[+-][+-]+.+++..++
 +.+[++-]++.+++..+++.--..+.

Regards,
-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: The problem with FreeBSD

2002-06-18 Thread Bosko Milekic


  Bill H., is that you?

On Tue, Jun 18, 2002 at 08:39:57AM +, Bill Flamerola wrote:
 Okay, this is not really intended as a flame, but kinda necessary, given the 
 current situation in the FreeBSD camp.
[...useless stuff...]

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: ICU_LEN with IO APIC

2002-05-31 Thread Bosko Milekic


On Fri, May 31, 2002 at 12:12:00PM +0300, Aaro J Koskinen wrote:
 Hello,
 
 Is there any particular reason why the number of interrupts is limited
 to 32 on APIC systems? Is it just a conservative guess on the number of
 interrupts anyone might want to need...?

  I'm not sure, but this is perhaps historical (and now also required
  again): if we use a word to mask out interrupts, then after 32 we
  run out of bits.  Who needs more than 32 interrupts anyway?! :-)
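
  A quick sketch of the "one bit per interrupt in a word" idea, in
  plain userland C.  NIRQ, irq_mask_set() and irq_is_masked() are names
  invented for this example, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NIRQ 32		/* one bit per interrupt in a 32-bit word */

/* Return 'mask' with interrupt 'irq' masked (its bit set). */
static uint32_t
irq_mask_set(uint32_t mask, unsigned irq)
{
	assert(irq < NIRQ);	/* irq 32 and up has no bit to live in */
	return (mask | (UINT32_C(1) << irq));
}

/* Return true if interrupt 'irq' is masked in 'mask'. */
static bool
irq_is_masked(uint32_t mask, unsigned irq)
{
	assert(irq < NIRQ);
	return ((mask & (UINT32_C(1) << irq)) != 0);
}
```

  Interrupt 31 uses the top bit of the word; a 33rd interrupt would
  need a wider mask type or an array of words.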

 A.
 
 -- 
 Aaro Koskinen
 E-mail: [EMAIL PROTECTED]    I'm the ocean, I'm the giant undertow.
 http://www.iki.fi/aaro

Regards,
-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: splimp() during panic?

2002-05-24 Thread Bosko Milekic


Archie Cobbs wrote:
 Hi,
 
 I'm trying to debug a mbuf corruption bug in the kernel. I've added
 an mbuf sanity check routine which calls panic() if anything is amiss
 with the mbuf free list, etc. This function runs at splimp() and if/when
 it calls panic() the cpl is still at splimp().
 
 My question is: does this guarantee that the mbuf free lists, etc. will
 not be modified between the time panic() is called and the time a core
 file is generated? For example, if an incoming packet causes a networking
 interrupt after panic() has been called but before the core file is
 written, will that interrupt be blocked when it calls splimp()?

  splimp() ensures that no driver handlers will be executed.  Further,
  dumpsys() is called from panic() at splhigh() which would also mean
  that none of those potentially troublesome handlers will run.

 I've been working under this assumption but it seems to not be
 valid, because I seem to be seeing panics for situations that are
 not true in the core file.

  Are you seeing invalid stuff from DDB but valid stuff from the core
  file?  Because if so, that's REALLY WEIRD.  If you're just seeing two
  different but invalid things, then perhaps something is happening when
  Debugger() runs (is it possible that the cpl() is changed after
  or before a breakpoint()?).

 If this is not a valid assumption, is there an easy way to 'freeze'
 the mbuf free lists long enough to generate the core file when an
 inconsistency is found (other than adding the obvious hack)?

  To make doubly-sure, what you can do is just keep a variable 'foo'
  which you initialize to 0.  Before any mbuf free list manipulations,
  place a 'if (foo == 0)' check.  Atomically set foo to 1 before the
  panic.  See if the inconsistency changes.  If you're seeing garbage in
  both cases, but the garbage is inconsistent, perhaps there's a memory
  problem or the dump isn't working properly (I've never heard of
  anything like this before).
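
  The 'foo' suggestion could look roughly like this.  It is a userland
  sketch using C11 atomics; freelist_frozen, freelist_len and the two
  functions are hypothetical stand-ins, and in the kernel the check
  would sit in front of the real free list manipulations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* The 'foo' of the suggestion: 0 while running normally, 1 once frozen. */
static atomic_int freelist_frozen;

/* Stand-in for the real mbuf free list: just a length counter. */
static int freelist_len;

/* Guarded manipulation: refuses to touch the list once it is frozen. */
static bool
freelist_push(void)
{
	if (atomic_load(&freelist_frozen) != 0)
		return (false);
	assert(freelist_len >= 0);	/* toy version of the sanity check */
	freelist_len++;
	return (true);
}

/* Call this right before panic(): atomically set the flag to 1. */
static void
freelist_freeze(void)
{
	atomic_store(&freelist_frozen, 1);
}
```

  Note that the flag check and the list update are not one atomic step
  here; the sketch only shows where the check goes, which is enough to
  see whether the inconsistency changes.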

 Thanks,
 -Archie
 
 __
 Archie Cobbs * Packet Design * http://www.packetdesign.com

Regards,
-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: deltas for sys/kern/vfs_syscalls.c and sys/kern/vfs_subr.c

2002-05-03 Thread Bosko Milekic


On Fri, May 03, 2002 at 10:45:32PM +0100, Hiten Pandya wrote:
 Hi all,
 
 I am submitting a patch which removes the register keyword from 
 sys/kern/vfs_syscalls.c.  The reason I am doing this is very simple.
 
 The 'register' keyword has no effect, as compilers do enough optimizations
 on their own.   Also, I have seen commits made before which do the same
 thing which I am doing now.  I have talked about this patch with jmallett,
 and various other developers.
 
 This patch is located at: 
 http://storm.uk.FreeBSD.org/~hiten/diffs/vfs_syscalls.diff.1

  Looks good.

 The second issue, is what I am not very sure about, but I had a little
 discussion about this with rwatson.  The vfs_subr.c module contains
 a large #if 0'ed section, which basically contains some sysctls.  I
 think it has been forgotten for removal, so I am submitting a delta which
 can be used to remove that #if 0'ed section.
 
 Note, I am not very sure about this, that is why I am posting this to
 -hackers.
 
 The patch is located at: 
 http://storm.uk.FreeBSD.org/~hiten/diffs/vfs_subr.c.diff.1

  I don't think that removing the code is a problem.  The real person to
  ask would be dillon, since he was the one who placed the #if 0 around
  the block.

 Thanks.  If anyone finds them interesting, please commit them to the
 CVS repository.
 
 P.S. Please do not hesitate to contact me for more information reg.
 these deltas.
 
 -- 
 Hiten Pandya
 http://storm.uk.FreeBSD.org/~hiten
 Finger [EMAIL PROTECTED] for PGP public key
 -- 4FB9 C4A9 4925 CF97 9BF3  ADDA 861D 5DBD E4E3 03C3 

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: I want to help

2002-04-03 Thread Bosko Milekic


On Wed, Apr 03, 2002 at 07:14:59PM +0200, digital wrote:
 Hi, I'm from Serbia and I love computers and programming. I am very
 familiar with the FreeBSD operating system and I know how to program
 in C, and if necessary I can learn C++. I would be very happy if I
 could help in some way with the development of the FreeBSD operating
 system (maybe socket programming). I am a student of astrophysics and
 in my free time I am learning FreeBSD, so I came to the idea that I
 could get actively involved in FreeBSD development.
 I have computer: CPU celeron2 633 SL3VS, 256 RAM,VGA TNT2 AGP M64 
 32MB,main board ABIT133 RAID, hard disk IDE 20.5GB Quantum Fireball 
 Plus(LM20A011), CD-ROM IDE 40X Teac CD-540E.
 My email is : [EMAIL PROTECTED]
 If you think I can help let me know,

   How about you start by helping out the bsd.org.yu crew with
   documentation translation?

   The rest will/may come with time.

 A lot of regards and wish for best work,
 
 Dragoslav Zaric

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: Patch to remove MFREE() macro entirely

2002-02-03 Thread Bosko Milekic


On Sat, Feb 02, 2002 at 11:54:17PM -0800, David Greenman wrote:
 Oh what a tangled web we weave.  This should be really easy for people
 to take a quick look at to see if I made any mistakes.  I'm basically
 untangling the (small) mess that people made of the code while trying to
 use the MFREE() macro over the last N years.
 
 If nobody sees any problems it will go into -current next week some
 time and then be MFC'd to stable.
 
Looks good to me. I'm definitely very much in favor of killing MFREE().

  Absolutely! Especially in light of the fact that in -CURRENT
nowadays, MFREE() has no benefits and pretty much ALL the mbuf
macros are deprecated (they just wrap calls to the appropriate
functions). They were really big for macros and actually used to make
things slower by busting the cache.

 -DG
 
 David Greenman
 Co-founder, The FreeBSD Project - http://www.freebsd.org
 President, TeraSolutions, Inc. - http://www.terasolutions.com
 President, Download Technologies, Inc. - http://www.downloadtech.com
 Pave the road of life with opportunities.

-- 
Bosko Milekic
[EMAIL PROTECTED]
[EMAIL PROTECTED]





Re: mbuf chains

2002-01-18 Thread Bosko Milekic


On Fri, Jan 18, 2002 at 04:46:18PM -0800, Skye Poier wrote:
 What are the rules around mbuf chain construction?
 (I've read man mbuf, doesn't go into much detail)
 
 In particular, I'm assuming:
 - all mbufs must be same type
 - the head mbuf must have M_PKTHDR set
 - the head mbuf.m_pkthdr.len must be the len of the entire chain
 Anything to add?

  Take a look at the mchain interface for a nice way to deal with mbufs
in certain cases: src/sys/kern/subr_mchain.c
  The `rules' you state are good advice but are not _technically_
obligatory in the most general case. In other words, it is technically
up to the implementor to decide on how to chain mbufs and what their
meaning is.

 My confusion is around splitting/concatenating -
 
 When splitting an mbuf chain, the two resultant chains must be as above
 (heads have M_PKTHDR and mbuf.m_pkthdr.len set) right?

 When concatenating chains, what do you do with the M_PKTHDR that is now
 in the middle of the chain?  m_cat doesn't seem very sophisticated in
 this regard.  And of course update head mbuf.m_pkthdr.len

  Again, it all depends on what you're doing. Typically a packet
consists of a chain with a head mbuf that is M_PKTHDR and contains the
additional information. You don't normally do what you wrote, but again,
it depends on the implementation, ultimately.
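
  For the typical packet layout described above, the conventions can be
  checked mechanically.  The following is a userland sketch only:
  struct tmbuf, TM_PKTHDR, tm_chain_len() and tm_is_packet() are toy
  names for this example, and the check encodes the usual convention,
  not a rule the mbuf code enforces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TM_PKTHDR 0x0002	/* toy flag standing in for M_PKTHDR */

/* Toy stand-in for struct mbuf. */
struct tmbuf {
	struct tmbuf *m_next;
	int m_len;		/* bytes held by this mbuf */
	int m_flags;
	int pkthdr_len;		/* total packet length; valid if TM_PKTHDR */
};

/* Sum of m_len over the whole chain. */
static int
tm_chain_len(const struct tmbuf *m)
{
	int len = 0;

	for (; m != NULL; m = m->m_next) {
		assert(m->m_len >= 0);
		len += m->m_len;
	}
	return (len);
}

/*
 * True if the chain follows the conventional packet layout: only the
 * head carries a packet header, and the header length equals the sum
 * of the per-mbuf lengths.
 */
static bool
tm_is_packet(const struct tmbuf *head)
{
	const struct tmbuf *m;

	if (head == NULL || (head->m_flags & TM_PKTHDR) == 0)
		return (false);
	for (m = head->m_next; m != NULL; m = m->m_next) {
		if (m->m_flags & TM_PKTHDR)
			return (false);	/* header in the middle of the chain */
	}
	return (head->pkthdr_len == tm_chain_len(head));
}
```

  After splitting or concatenating chains, a check of this kind on each
  resulting head would catch a stale length or a leftover header flag.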

 Thanks
 Skye

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: huge MFREE() macro?

2002-01-02 Thread Bosko Milekic


On Tue, Jan 01, 2002 at 10:16:25PM -0800, Matthew Dillon wrote:
 I noticed a bunch of routines use MFREE() instead of m_free() (which 
 just calls MFREE()).  MFREE() is a huge macro.
 
    text    data     bss     dec     hex filename
 1986399  252380  145840 2384619  2462eb kernel
 
    text    data     bss     dec     hex filename
 1983343  252380  145840 2381563  2456fb kernel
 
 We save about 3K.  Any problems with this?  Maybe also MFC to -stable
 to save some bytes?

  In -CURRENT, MFREE() just wraps a call to m_free().

 (The #if 0's wouldn't be in a commit, I'd actually delete the code)
 
 Also, if you do a search for XXX, I think there was an MFREE in there
 that should have been an m_freem().  Could someone check that?

  See below.

 The patch is against -stable.
 
   -Matt
   Matthew Dillon 
   [EMAIL PROTECTED]

[...]
 Index: i386/isa/if_lnc.c
 ===
 RCS file: /home/ncvs/src/sys/i386/isa/Attic/if_lnc.c,v
 retrieving revision 1.68.2.4
 diff -u -r1.68.2.4 if_lnc.c
 --- i386/isa/if_lnc.c 8 Jan 2001 15:37:59 -   1.68.2.4
 +++ i386/isa/if_lnc.c 2 Jan 2002 06:12:24 -
 @@ -839,9 +839,13 @@
   sc->mbuf_count++;
   start->buff.mbuf = 0;
   } else {
 +#if 0
   struct mbuf *junk;
   MFREE(start->buff.mbuf, junk);
 - start->buff.mbuf = 0;
 +#endif
 + /* XXX shouldn't this be m_freem ?? */
 + m_free(start->buff.mbuf);
 + start->buff.mbuf = NULL;

  I guess it depends on whether start->buff.mbuf is always a single mbuf
or if it ever becomes a chain. If it becomes a chain then it should
certainly be m_freem(). How about placing a loop there to traverse
forward and count the number of mbufs before m_next == NULL? And, if it
is ever more than 1, then change that to an m_freem(). Anyone using lnc?
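
  The counting loop could be as small as this.  It is a userland sketch
  with a toy struct tmbuf; tm_count() and tm_needs_freem() are
  hypothetical helpers, not existing kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct mbuf: only the chain link matters here. */
struct tmbuf {
	struct tmbuf *m_next;
};

/* Count the mbufs in a chain by following m_next until NULL. */
static int
tm_count(const struct tmbuf *m)
{
	int n = 0;

	for (; m != NULL; m = m->m_next)
		n++;
	return (n);
}

/* True if the buffer is a real chain and so needs m_freem(), not m_free(). */
static bool
tm_needs_freem(const struct tmbuf *m)
{
	assert(m != NULL);
	return (tm_count(m) > 1);
}
```

  Dropped in as a temporary diagnostic, a check like this would show
  whether the buffer is ever longer than one mbuf on real traffic.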

   }
   }
   sc-pending_transmits--;
 @@ -1702,8 +1706,12 @@
   m->m_len -= chunk;
   m->m_data += chunk;
   if (m->m_len <= 0) {
 +#if 0
   MFREE(m, head->m_next);
   m = head->m_next;
 +#endif
 + m = m_free(m);
 + head->m_next = m;
   }
   }
   }

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Patch #3 (TCP / Linux / Performance)

2001-12-02 Thread Bosko Milekic


On Sun, Dec 02, 2001 at 11:18:42AM -0800, Matthew Dillon wrote:
[...]
 :Does the FreeBSD tcp stack do zero copy (page flip the data to 
 :userspace)? In the localhost case, it seems like there are two copies 
 :to/from userspace there.
 :
 :-- 
 :Richard Sharpe, [EMAIL PROTECTED], LPIC-1
 
 There are zero-copy patches floating around but I haven't looked at
 them to determine how messy they might be.

http://people.freebsd.org/~ken/zero_copy/

The main issues with the patch are, afaics:

  1. It's fairly large and difficult to maintain, especially in light of
 the large amount of SMPng-related changes.

  2. The receive code only works with certain network cards. [This is
 expected]. The performance may also vary based on the behavior of
 the application.

   -Matt
   Matthew Dillon 
   [EMAIL PROTECTED]

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Open Source Load Balancer

2001-08-20 Thread Bosko Milekic


On Mon, Aug 20, 2001 at 01:05:28PM -0400, Bill Kish wrote:
 
 Hi All,
 
  Sorry if this posting is somewhat off topic, but I think the answer to my
 question will be found among the subscribers to this list.
 
  I'm trying to convince the powers that be here at Coyote Point Systems that
 we should release the source for our Equalizer load balancer software to the
 Open Source community. I think I can pull this off, but I'd like to see what
 sort of interest there might be in this project and perhaps begin recruiting
 a project team.
 
  Any thoughts on the best forum to begin this process? This mostly kernel IP
 stack code which currently runs on FreeBSD.

Most probably [EMAIL PROTECTED] - assuming of course that this is
some sort of `network load balancer.'

 Thanks in advance,
 
 -=BK
 --
 ---
 Bill Kish  Ph: 650.969.6000
 Chief Engineer,3350 Scott Blvd, Bldg 20
 Coyote Point Systems Inc.   Santa Clara  California  95054
 Email: [EMAIL PROTECTED]   http://www.coyotepoint.com/
 ---
 For support call: 1-888-891-8150  Email: [EMAIL PROTECTED]
 ---

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Allocate a page at interrupt time

2001-08-07 Thread Bosko Milekic


On Mon, Aug 06, 2001 at 11:27:56PM -0700, Terry Lambert wrote:
 I keep wondering about the sagacity of running interrupts in
 threads... it still seems like an incredibly bad idea to me.
 
 I guess my major problem with this is that by running in
 threads, it's made it nearly impossible to avoid receiver
 livelock situations, using any of the classical techniques
 (e.g. Mogul's work, etc.).

References to published works?
 
 It also has the unfortunate property of locking us into virtual
 wire mode, when in fact Microsoft demonstrated that wiring down
 interrupts to particular CPUs was good practice, in terms of
 assuring best performance.  Specifically, running in virtual

Can you point us at any concrete information that shows this?
Specifically, without being Microsoft biased (as is most data published by
Microsoft)? -- i.e. preferably third-party performance testing that attributes
wiring down of interrupts to particular CPUs as _the_ performance advantage.

 wire mode means that all your CPUs get hit with the interrupt,
 whereas running with the interrupt bound to a particular CPU
 reduces the overall overhead.  Even what we have today, with

Obviously.

 the big giant lock and redirecting interrupts to the CPU in
 the kernel is better than that...
 
 -- Terry

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Allocate a page at interrupt time

2001-08-07 Thread Bosko Milekic


On Tue, Aug 07, 2001 at 12:19:01PM -0700, Matt Dillon wrote:
 Cache line invalidation does not require an IPI.  TLB
 shootdowns require IPIs.  TLB shootdowns are unrelated to
 interrupt threads, they only occur when shared mmu mappings
 change.  Cache line invalidation can waste cpu cycles --
 when cache mastership changes occur between cpus due to
 threads being switched between cpus.  I consider this a
 serious problem in -current.

I don't think it's fair to consider this a serious problem seeing as
how, as far as I'm aware, we've intended to eventually introduce code that will
favor keeping threads running on one CPU on that same CPU as long as it is
reasonable to do so (which should be most of the time).
I think after briefly discussing with Alfred on IRC that Alfred has
some CPU affinity patches on the way, but I'm not sure if they address
thread scheduling with the above intent in mind or if they merely introduce
an _interface_ to bind a thread to a single CPU.
 
   -Matt

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: cluster size

2001-07-28 Thread Bosko Milekic


On Fri, Jul 27, 2001 at 08:23:37PM -0400, Zhihui Zhang wrote:
 
 I thought doing a memory free is always safe in an interrupt context. Now
 it seems doing an allocation of memory is safe too.  Does MCLGET() call
 vm_page_alloc() or malloc() eventually?  If so, it might block.

It never calls malloc(). Sometimes, although rarely, it may end up
in kmem_malloc() which calls vm_page_alloc(), but vm_page_alloc() should not
block as in this case it will be called with the VM_ALLOC_INTERRUPT flag.
 
 -Zhihui

--
 Bosko Milekic
 [EMAIL PROTECTED]





Re: cluster size

2001-07-28 Thread Bosko Milekic


On Sat, Jul 28, 2001 at 01:44:25PM -0700, Terry Lambert wrote:
 Zhihui Zhang wrote:
  I thought doing a memory free is always safe in an interrupt context. Now
  it seems doing an allocation of memory is safe too.  Does MCLGET() call
  vm_page_alloc() or malloc() eventually?  If so, it might block.
 
 The mbuf allocator uses the zone allocator.

No, it doesn't. When it needs to dip into the VM code (which is
rare), it uses kmem_malloc() and so vm_page_alloc()-ates via the kmem_object.
Since kmem_map is scaled accordingly to accommodate the mbuf maps (mbuf_map and
clust_map), which are submaps of the former, this works out similarly to
the zone allocator. Note that the mbuf allocations were NEVER done with the
zone allocator.
The cool thing about managing mbufs via a map is that it *does* allow
for us to unwire associated pages in case we decide to actually free back to
the map. Previously, this was never implemented but with the new allocator,
the framework is present to allow for freeing of pages to be implemented. If
implemented properly, this could allow for the system to re-adapt even if
the character of the load changes with time, without affecting allocation
performance.
 
 The reason this works at interrupt is that the page table
 entries for the memory are already in place in the kernel,
 but the actual allocations have not taken place.

 When you are running with less than a full complement of
 RAM (e.g. 4G on a 32 bit Intel machine), this will permit
 you to do the allocations of physical RAM later, and have
 a KVA (kernel virtual address) space that exceeds the amount
 of physical memory.

 In practice, this means that your system is not specifically
 tuned for particular loading, until the memory is committed
 (when that happens, say, by using all possible mbufs, then
 you are unable to recover the memory to the system memory
 pool: it has become type stable).  This lets you have a mostly
 general system that then commits resources based on the
 character of its load, yet which does not permit the character
 of the load to change over time.

See previous paragraph.
 
 When you have all the memory you can address in physical
 space, then the problem changes somewhat, and you basically do
 not overcommit resources.
 
 The upshot of having the page descriptors preallocated,
 however, is that you can allocate in interrupt context, and
 the zone headers are statically allocated at compile time,
 instead of being malloc'ed later in the kernel boot cycle.
 
 You should look at the ziniti and zalloci code: the zone
 allocator code.  The mbuf issue has recently been a bit
 obfuscated by the -current commit of a replacement allocator,
 which is mbuf specific.  I think this new allocator has some
 unforgivable drawbacks; you would be better off looking at
 the 4.3 kernel source code to get an idea of why interrupt
 allocations work.

Again, the actual allocation code has changed LITTLE even in the new
allocator. I simply don't understand where you get the idea that mbufs were
ever allocated with the zone allocator, but I suspect that if you went ahead
and read the new code, you'd realize that as far as actual memory allocation
is concerned, very little has changed vis-a-vis the older allocator.

 So, in general:
 
 1)Only some allocators can be used at interrupt time
 2)If they can, they must precommit kernel address space
   to the task
 3)Once memory is allocated from one of these pools, it
   is never returned to the system for reuse

This (3) only applies to the zone allocator. With maps, you *can* free
back to the map and unwire the wired pages (freeing physical memory).

 4)The general malloc() code _can not_ be used at interrupt
   time in FreeBSD (but SVR4's allocator can).

Huh? Do you realize that in much much earlier versions of FreeBSD
(not long after the import from 4.4BSD, or whatever it was uipc_mbuf.c was
initially imported from) all _MBUFS_ were allocated directly with malloc()?
Obviously, mbufs are allocatable at interrupt time (and always were, afaic
remember). All that you have to make sure to do, when allocating at interrupt
time is to allocate with the M_NOWAIT flag.

 -- Terry

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: cluster size

2001-07-26 Thread Bosko Milekic


On Thu, Jul 26, 2001 at 10:18:09AM -0700, Terry Lambert wrote:
 The real reason behind all this is to make the input and output
 routines symmetric, since mbuf's can be allocated at interrupt,
 and clusters can't (or couldn't, last time I looked at 4.3).

They can. Whether they are or not I'm not sure.
 
 -- Terry

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: cluster size

2001-07-26 Thread Bosko Milekic


On Thu, Jul 26, 2001 at 10:51:40AM -0700, Terry Lambert wrote:
 Alfred Perlstein wrote:
   On Thu, Jul 26, 2001 at 10:18:09AM -0700, Terry Lambert wrote:
The real reason behind all this is to make the input and output
routines symmetric, since mbuf's can be allocated at interrupt,
and clusters can't (or couldn't, last time I looked at 4.3).
  
 They can. Whether they are or not I'm not sure.
  
  Er, wouldn't that be the only way for cards to refill their DMA
  receive buffers?
 
 Look at the Tigon II and FXP drivers.  The allocations in
 the macros turn into m_get, not m_clusterget.

From if_fxp.c (fxp_add_rfabuf(), sometimes called from fxp_intr()):

	MGETHDR(...);		/* get mbuf */
	if (m != NULL) {
		MCLGET(...);	/* get cluster */
		...
	}
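
The same two-step pattern can be sketched as a userland analogy. This is
not the driver code: alloc_nowait(), get_rx_buffer(), struct pkt_buf and
alloc_budget are hypothetical stand-ins for non-blocking mbuf and cluster
allocation, and the 2048-byte size is simply the cluster size discussed in
this thread. The point is only that each non-blocking step can fail and the
header must be released if the cluster cannot be attached.

```c
#include <stdlib.h>

/* Hypothetical packet buffer: a header plus optional external storage. */
struct pkt_buf {
	void *cluster;		/* NULL if no cluster is attached */
};

/* Number of allocations that may still succeed (simulates exhaustion). */
int alloc_budget = 2;

/* Never sleeps: returns NULL right away when nothing is available. */
void *
alloc_nowait(size_t size)
{
	if (alloc_budget <= 0)
		return (NULL);
	alloc_budget--;
	return (malloc(size));
}

/* Get a header, then a cluster; undo the first step if the second fails. */
struct pkt_buf *
get_rx_buffer(void)
{
	struct pkt_buf *m;

	m = alloc_nowait(sizeof(*m));		/* "get mbuf" */
	if (m == NULL)
		return (NULL);
	m->cluster = alloc_nowait(2048);	/* "get cluster" */
	if (m->cluster == NULL) {
		free(m);
		return (NULL);
	}
	return (m);
}
```

A caller running where it must not block would treat a NULL return as
"try again later" rather than waiting for memory.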
 
 -- Terry

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: cluster size

2001-07-25 Thread Bosko Milekic


On Wed, Jul 25, 2001 at 01:51:51PM -0400, Zhihui Zhang wrote:
 
 
 On Tue, 24 Jul 2001, Terry Lambert wrote:
 
  Zhihui Zhang wrote:
Hi,
in freebsd can we change the cluster size from 2048
bytes.If yes how can we do that?
do we have to configure in some file?
   
   You must be asking why the mbuf cluster size is chosen as 2048, right? It
   is probably a tradeoff between memory efficiency and speed.
  
  Ask yourselves:
  
  What is the minimum cluster size I would have to have
   to be able to contain the maximum MTU worth of data,
   yet remain an even multiple of sizeof(mbuf) -- 256
   bytes?
 
 A dumb question: why even not odd multiple?
 
 -Zhihui

It actually has to do with the fact that 2K is the smallest size equal to
or greater than the maximum MTU worth of data that divides evenly into a page
(in other words, page size modulo 2K is zero).
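
The arithmetic can be checked with a small userland sketch. The function
name smallest_cluster_size() is made up for illustration, and the 1500-byte
payload used in the note below is an assumption (the usual Ethernet MTU);
the thread itself only says "maximum MTU".

```c
#include <stddef.h>

/*
 * Return the smallest size >= min_payload that divides page_size with no
 * remainder, or 0 if there is none (min_payload is 0 or exceeds a page).
 */
size_t
smallest_cluster_size(size_t page_size, size_t min_payload)
{
	size_t size;

	if (min_payload == 0 || min_payload > page_size)
		return (0);
	for (size = min_payload; size <= page_size; size++) {
		if (page_size % size == 0)
			return (size);
	}
	return (0);
}
```

With a 4096-byte page and a 1500-byte payload this yields 2048, and the
same holds for an 8192-byte page.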

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: cluster size

2001-07-25 Thread Bosko Milekic


On Wed, Jul 25, 2001 at 02:17:38PM -0400, Zhihui Zhang wrote:
 
 I see.  It has something to do with the power-of-two allocator we are
 using inside the kernel.

No, it has nothing to do with the power-of-two allocation strategy
used in some cases inside the kernel. 2K is just the most convenient size
for a cluster as it fits the maximum MTU size while at the same time fitting
nicely into a page, reducing allocation complexity.
 
 -Zhihui

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: kernel malloc

2001-07-23 Thread Bosko Milekic


On Mon, Jul 23, 2001 at 12:37:55PM +0100, vishwanath pargaonkar wrote:
 Hi,
 
 thx for ur reply.
 i wanted to know in side kernel is there any limit to
 the malloc that a user can do.what you told in ur
 previous mail is that at a time user can malloc 4k.but

No. You _can_ malloc over 4k and I never said that you could not. All
I said was that if you do malloc() a buffer larger than PAGE_SIZE that the
buffer will likely not be contiguous in physical memory. What that means
is that your buffer may span across two non-contiguous physical pages. Usually
you won't care unless you're DMAing into the buffer, or relying on the
physical pages to be contiguous.
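
To make the page-spanning point concrete, here is a small userland sketch
that counts how many pages a virtual range touches. pages_spanned() is a
hypothetical helper, not a kernel interface; each page it counts may be
backed by a different, non-adjacent physical page.

```c
#include <stddef.h>

/*
 * Number of pages touched by the byte range [start, start + len).
 * Returns 0 for an empty range or a zero page size.
 */
size_t
pages_spanned(size_t start, size_t len, size_t page_size)
{
	if (len == 0 || page_size == 0)
		return (0);
	return ((start + len - 1) / page_size - start / page_size + 1);
}
```

For example, a 5k buffer that starts on a page boundary with 4k pages
touches two pages, so a device doing DMA into it cannot assume one
physically contiguous region.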

 suppose i am doing 2k memory allocations. how many
 such mallocs i can do?

In the kernel, you can do as many as you want. That is, until you
run out of physical memory or until you exhaust the kmem_map virtual address
space, whichever comes first.

 is there any configuration we can do depending on our
 RAM size?
 please reply.
 thx
 vishwanath 

Regards,
-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: cluster size

2001-07-23 Thread Bosko Milekic


On Mon, Jul 23, 2001 at 11:01:27AM -0500, Dan Nelson wrote:
 In the last episode (Jul 23), vishwanath pargaonkar said:
  in freebsd can we change the cluster size from 2048 bytes.If yes how
  can we do that? do we have to configure in some file?
 
 Actually, the block size is 8192 bytes by default, with fragment size
 of 1024 bytes.  You pick the sizes when you run newfs with the -b and
 -f options.

I think he was referring to the mbuf cluster size being 2K. In any
case, I think the question is way too ambiguous to be answered properly.
 
 -- 
   Dan Nelson
   [EMAIL PROTECTED]

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: kernel malloc

2001-07-20 Thread Bosko Milekic


On Fri, Jul 20, 2001 at 10:17:20AM +0100, vishwanath pargaonkar wrote:
 Hi,
 
 can any one please help me with this. i want allocate
 a memory in the kernel -a buffer of size 2k to 5k.
 can i do it using malloc with second parameter as
 M_TEMP and third as M_WAITOK.
 
 can anybody tell me what M_TEMP means .what is maximum
 malloc i can do with M_TEMP?
 will the OS allow me to malloc 4k buffer in side
 kernel??shd i give M_WAITOK or M_DONTWAIT???

M_TEMP is merely there for statistics gathering. If you're writing
a subsystem and plan to malloc() a lot of things for the subsystem you may
want to create your own malloc type (see malloc(9)).
On another note, remember that if you allocate a 5k buffer with malloc()
on x86, where the page size is 4k, you're not guaranteed to have a
physically contiguous backing.
 
 please tell me.
 thanx in advance.

Regards,
-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Network performance roadmap.

2001-07-17 Thread Bosko Milekic


On Fri, Jul 13, 2001 at 04:37:46PM -0500, Mike Silbersack wrote:
 Jiangyi Liu has been working on mbuf limiting code for the past week or
 so.  What he has is pretty complete, I expect to get most of it committed
 once Bosko gets back.

Well, I'm back. I'm now going to bed but my INBOX awaits. 
 
 Mike Silby Silbersack

-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Article: Network performance by OS

2001-06-19 Thread Bosko Milekic
 you to characterize the load you
 expect, which I am sure results in some non-defaults for
 a number of tuning parameters.
 
 Similarly, it has opportunity to notice the network
 hardware installed: if you install a GigaBit Ethernet
 card, it's probably a good bet that you will be running
 heavy network services off the machine.  If you install
 SCSI disks, it's a pretty good bet you will be serving
 static content, either as a file server, or as an FTP
 or web server.
 
 Tuning for mail services is different; the hardware
 doesn't really tell you that's the use to which you will
 put the box.
 
 On the other hand, some of the tuning was front-loaded
 by the architecture of the software being better suited
 to heavy-weight threads implementations.  Contrary to
 their design claims, they are effectively running in a
 bunch of different processes.  Linux would potentially
 beat NT on this mix, simply because NT has more things
 running in the background to cause context switches to
 the non-shared address spaces of other tasks.  Put the
 same test to a 4 processor box with 4 NIC cards, and I
 have no doubt that an identically configured NT box will
 beat the Linux box hands down.
 
 
 A common thread in these complaints that the results
 were somehow FreeBSD's fault, rather than the fault of
 tuning and architecture of the application being run,
 is, frankly, ridiculous.

I completely agree. :-)))
 
 -- Terry

Cheers,
-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: Article: Network performance by OS

2001-06-16 Thread Bosko Milekic


On Sat, Jun 16, 2001 at 02:14:14PM -0700, Matt Dillon wrote:
 It's certainly true that a greater degree of dynamic tuning could be
 done, but all this benchmark proves (in regards to the TCP results)
 is that FreeBSD puts its foot down earlier than other OS's in regards
 to how much it is willing to dedicate to the network.  In a real life
 situation where you may be running a multi-user load or a large database,
 the very last thing you want to do is shift every last bit of your
 resources away from the users or the database and to the network when
 an 'unexpected load' comes in (unexpected meaning something that is a
 factor of 100 or 1000x what the machine normally handles).  The
 truth of the matter is that no amount of dynamic tuning can handle
 every situation... at some point you have to manually tune the box.
 FreeBSD does exactly the right thing on an untuned box by capping the
 network resources.  If the authors want to run the machine into the 
 ground with a benchmark, they have to tune the machine properly to handle
 the load because FreeBSD anyway is more interested in keeping the
 integrity of the machine as a whole together than it is tuning itself
 to match some idiot who thinks he is God's own gift to humanity running
 a benchmark.

This is the best written paragraph on the issue in this entire thread.
This is exactly my philosophy toward the whole thing. And I can tell you from
previous dealings with companies that use FreeBSD as their main platform that
this is one of the main reasons why.

   -Matt

Regards,
-- 
 Bosko Milekic
 [EMAIL PROTECTED]





Re: questions regarding the MGET function.

2001-02-28 Thread Bosko Milekic


Shankar Agarwal wrote:

 Hi,
 Can you please tell me when the MGET function changed its
 implementation from using MALLOC to using pool_get to allocate an mbuf? I

Never. We don't use pool_get(). That's a NetBSD-ism. :-)

The mbuf subsystem uses its own allocator and stats are kept in
mbstat which is exported via sysctl. Things like netstat(1) can fetch
this information for you.

 am having trouble finding out how memstats keeps track of the
 mbufs allocated through pool_get.
 Thanks
 Regards
 Shankar






Re: Re: Re: postfix: No buffer space available

2001-02-20 Thread Bosko Milekic


Since nobody else has asked this, I think I will:

What network device are you using and with what driver?
Please show the output of `ifconfig -a' when you notice this problem.
Finally, try `ifconfig the_interface down' followed by `ifconfig
the_interface up' when you notice this, and see if it temporarily
fixes the problem.

Thanks to Matthew Dodd and NetBSD, I think we may have a solution to
the ep wedging problems (which has similar symptoms, by the way)
sometime soon (i.e. when I get around to it this weekend, after first
mid-term, if no one beats me to it).

In the meantime, it would be nice to know if there are other devices
exhibiting this behavior.

(All this assuming, of course, that what you're describing is not the
result of a kernel resource shortage, such as mbuf starvation, etc.)

Regards,
Bosko.

Renaud Waldura wrote:

  But neither parameter takes effect.

 They may be read-only if you're running with securelevel > 0. Otherwise they
 "take effect" just fine.


  Anybody got any other ideas how to scale FreeBSD up to postfix's needs?


 Yes, recompile your kernel with "maxusers 128" or more. This tweaks a bunch
 of stuff, notably mbufs.

 E.g. with 128 "users" I've got:

 226/1920/10240 mbufs in use (current/peak/max):
 159 mbufs allocated to data
 67 mbufs allocated to packet headers
 130/1438/2560 mbuf clusters in use (current/peak/max)
 3116 Kbytes allocated to network (9% in use)
 0 requests for memory denied
 0 requests for memory delayed
 0 calls to protocol drain routines


 --Renaud





 - Original Message -
 From: "Len Conrad" [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Tuesday, February 20, 2001 1:36 PM
 Subject: Fwd: Re: Re: postfix: No buffer space available


  Here's what has happened with the advice earlier:
 
  tried to add the following via sysctl.conf
  
  kern.ipc.maxsockets = 5000
  kern.ipc.maxsockbuf = 524288
  
  But neither parameter takes effect.
 
  are these read-only values?? and:
 
  # netstat -m
  445/720/4096 mbufs in use (current/peak/max):
   172 mbufs allocated to data
   273 mbufs allocated to packet headers
  154/252/1024 mbuf clusters in use (current/peak/max)
  684 Kbytes allocated to network (61% in use)
  0 requests for memory denied
  0 requests for memory delayed
  0 calls to protocol drain routines
 
  Anybody got any other ideas how to scale FreeBSD up to postfix's needs?
 
  tia,
  Len
 
 
  http://BIND8NT.MEIway.com : Binary for ISC BIND 8.2.3 for NT4 
W2K
  http://IMGate.MEIway.com  : Build free, hi-perf, anti-spam mail
gateways
 
 
 








Re: Problems attaching an interrupt handler

2001-01-21 Thread Bosko Milekic

This: bus_teardown_intr(dev, sc->irq, sc->ih) != 0 ); looks pretty odd. See
your ir_detach().

Alex wrote:

 Hi,

 I started experimenting with kernel hacking to write an
 infrared device driver. Therefore I read Alexander Langer's
 article on DaemonNews and started modifying the led.c
 example code.

 Unfortunately I can't get my interrupt handler working.

 Could anyone please have a short look on my code.

 On loading the module the first time everything stays
 stable and vmstat -i shows 1 INT on my device. After
 unloading the module and reloading it the kernel
 crashes on the next incoming interrupt.

 Any ideas?

 Alex







Re: One thing linux does better than FreeBSD...

2001-01-16 Thread Bosko Milekic

Hey!

These images are hip! :-)

Cheers,
Bosko.

Matthew N. Dodd wrote:

 On Tue, 16 Jan 2001, Poul-Henning Kamp wrote:
  Isn't there *anybody* here who has a SO/family member/neighbor in the
  graphic/design business ?
 
 Yes.
 
 http://www.svaha.net/daemon/index.html
 
 -- 
 | Matthew N. Dodd  | '78 Datsun 280Z | '75 Volvo 164E | FreeBSD/NetBSD  |
 | [EMAIL PROTECTED] |   2 x '84 Volvo 245DL| ix86,sparc,pmax |
 | http://www.jurai.net/~winter | This Space For Rent  | ISO8802.5 4ever |
 
 
 
 






Re: FIN_WAIT_2 / TIME_WAIT Confusion

2001-01-08 Thread Bosko Milekic

Hi Michael,

What version of FreeBSD are you running? If it's not too much trouble,
can you please provide the code you're using to simulate the problem? Are the
TIME_WAIT state connections eventually timing out/disappearing?

Michael wrote:

 If this is not the proper place to ask this, let me know and I'll go elsewhere,
 as it is a TCP question. . . but I specifically use (and prefer) FreeBSD.

 I wrote a simple little I/O multiplexing thing that can act as a client or
 server as a personal project in network programming. Everything seems fine,
 except that when I use the client to make multiple connections to a web
 server.

 Even though I don't primarily use it for this, the following behavior has me
 curious.

 I will run my client about 2 or three times, each time it makes 5
 connections, pulling back the main page. Then the weird behavior starts:

 1. I will get all data back from all connections except for one, perhaps
 two, which then sit in a FIN_WAIT_2 or sometimes TIME_WAIT state.

 2. When I run netstat -a, it indicates that there is data in the read queue
 for these clients, but select() always returns 0 ready file descriptors.

 That's what puzzles me. There is data there to be gotten, but I am not
 getting it. When I look at the data that comes back in tcpflow, it doesn't
 look like the whole document has made it back either.

 A couple of runs might work perfectly, then once or twice will be weird. And
 it seems to multiplex more connections more reliably than fewer (the weird
 behavior seems inversely proportional to the number of connections---to a point
 of course. The client runs reliably more times with 50 connections than with
 5).


 Three notes:

 1. It seems to happen more if I access machines on my LAN than over the
 Internet.

 2. I do make sure to shut down the write side of the socket after I send the
 HTTP request so as to avoid keeping the web server in FIN_WAIT_2.

 3. I am sure about having the maxfd + 1 in select() correct, so that's not
 the problem.

 Does anyone have any ideas as to what's going on?








Re: FreeBSD vs Linux, Solaris, and NT

2000-12-29 Thread Bosko Milekic

Dennis wrote:
 : Still, I personally believe, that "core" or general "freebsd community"
 : should explicitly state, that support for binary drivers and support for
 : easier inclusion of binary driver or just third party driver is eagerly
 : encouraged. And as much as possible, easy inclusion of binary drivers
 : should be kept in mind whether making changes to /usr/src/Makefile or
 : kernel interfaces or even discussions on the freebsd lists.
 
 Core has stated in the past a strong desire for developers not to
 break kernel interfaces within minor releases.


 4.1 broke that "policy" rather badly. Perhaps it's time to get rid of the
 mbuf macros, as any change to that structure breaks binary compatibility in
 the worst way possible.

 DB

The "problem" was not with the macros themselves, but with the fact that
 your outdated binary was compiled with old definitions of some structures
 which were later changed (mbstat structure). The changes that happened
 there were relatively minor. I'm sure you would know all this had you
 debugged the problem yourself, but it turns out that all you provided in
 terms of "support" was whining and directing blame at the FreeBSD team.

I disagree with not merging in fixes to -STABLE that help maintain code
 in general, for the entire project; in this case, the change helped userland
 code such as netstat(1) deal with mbtypes. This wasn't a "big interface
 change" by any means. Plus, it was discussed on -net and since -net directly
 concerns you and your driver, perhaps you should read it every once in a
 while. Had we not merged this change to -STABLE, I'm sure we would have had
 just as many, if not more requests: "MFC MFC, you guys are ignoring -STABLE!"
 as we have now with you complaining about the change being made. A wise man
 once said something along the lines: "you can never win with tire-kickers,"
 and now I see how he was right.

Regards,
Bosko.







Re: Why not another style thread? (was Re: cvs commit: src/lib/libc/gen getgrent.c)

2000-12-17 Thread Bosko Milekic


On Sun, 17 Dec 2000, Chris Costello wrote:

 On Sunday, December 17, 2000, Jacques A. Vidrine wrote:
  What do folks think about
  
   1)  if (data)
           free(data);
  
  versus
  
   2)  free(data);
  
  versus
  
   3)  #define xfree(x) if ((x) != NULL) free(x);
       xfree(data);
 
2.  The C standard dictates that free() does nothing when it
 gets a NULL argument.  The other two are just extra clutter.

Agreed. However, in the kernel, all free()s should be made as in (1),
  in my opinion. (2) is dangerous, and (3) would just obfuscate the code.
  (I know this does not apply to the commit, but should be noted)
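
For the userland case (2), a minimal sketch of what relying on the C
library's free() looks like. The helper name release() is hypothetical and
is not part of the commit under discussion. It does not apply to the
kernel, where free() takes a malloc type as a second argument and the
caveat above stands.

```c
#include <stdlib.h>

/*
 * Free the buffer *pp points to and clear the pointer.
 * No NULL check is needed: the C library's free(NULL) does nothing.
 */
void
release(void **pp)
{
	free(*pp);
	*pp = NULL;
}
```

Clearing the pointer afterwards means a second release() of the same
variable is harmless. Note also that macro (3) as written would misbehave
inside an if/else, since it expands to an unbraced if followed by an extra
semicolon.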

 -- 
 +---+-+
 | Chris Costello| This system will self-destruct in five minutes. |
 | [EMAIL PROTECTED] | |
 +---+-+

  Later,
  Bosko Milekic
  [EMAIL PROTECTED]







Re: crash on 4.2-stable (sendto() system call)

2000-11-23 Thread Bosko Milekic


  Hello,

Can you please also get the instruction at which the page fault
  occurred? You can try "where" from gdb or you can get the instruction
  pointer from the original page fault message and then you can probably
  "disassemble fr_makefrip" and get us the contents around the instruction
  generating the fault.


On Thu, 23 Nov 2000, FengYue wrote:

 
 Hi, got a crash on 4.2-stable. the machine was running 4.1.1-stable
 and had no problem at all.  10 hours after upgrade to 4.2-stable I got
 a vmcore.   Here it's the trace and could someone take a look, it looks
 like it was the sendto() call triggered the crash but I don't know
 how to reproduce it.
 
 Thanks
 
 ---
 initial pcb at 24c320
 panicstr: page fault
 panic messages:
 ---
 dmesg: kvm_read: 
 ---
 #0  0xc013336e in dumpsys ()
 (kgdb) bt
 #0  0xc013336e in dumpsys ()
 #1  0xc013318f in boot ()
 #2  0xc013350c in poweroff_wait ()
 #3  0xc0200461 in trap_fatal ()
 #4  0xc0200139 in trap_pfault ()
 #5  0xc01ffd1f in trap ()
 #6  0xc01882dd in fr_makefrip ()
 #7  0xc018e20c in fr_checkicmpmatchingstate ()
 #8  0xc018e44d in fr_checkstate ()
 #9  0xc0188ecc in fr_check ()
 #10 0xc017d124 in ip_output ()
 #11 0xc017b416 in icmp_send ()
 #12 0xc017b397 in icmp_reflect ()
 #13 0xc017acbd in icmp_error ()
 #14 0xc0185be4 in udp_input ()
 #15 0xc017bdcb in ip_input ()
 #16 0xc017be2b in ipintr ()
 #17 0xc01f69d5 in swi_net_next ()
 #18 0xc0153881 in sendit ()
 #19 0xc0153975 in sendto ()
 #20 0xc020070d in syscall2 ()
 #21 0xc01f5575 in Xint0x80_syscall ()
 Cannot access memory at address 0xbfbffc8c.
 

  Regards,
  Bosko Milekic
  [EMAIL PROTECTED]







Re: Log analysis program running under apache reboots server!

2000-11-13 Thread Bosko Milekic


Likely, you're getting a panic() and since you likely don't have
  debugging options, the machine eventually reboots itself.
Notice that this is all "likely" and that since we don't have a crash
  dump, stack trace, or similar debugging information, that there's not
  much that can be done except guessing.
I would suggest that you try to reproduce the problem on a local
  machine and get some debugging info.

On Mon, 13 Nov 2000, Nicole wrote:

   Silent reboot :(
 
  I hate to respond to my own message.. But the server is remote.. But there is
 nothing in the logs afterwards.. and nothing appears on the screen when it
 occurs.  
 
Nicole

[...]
   apacheuser:\
   :manpath=/usr/share/man /usr/X11R6/man /usr/local/man:\
   :cputime=4h:\
   :datasize=64M:\
   :stacksize=4M:\
   :filesize=infinity:\
   :memoryuse=64M:\
   :priority=0:\
  :datasize-cur=32M:\
  :stacksize-cur=32M:\
  :coredumpsize-cur=0:\
  :maxmemorysize-cur=64M:\
  :memorylocked=32M:\
  :maxproc=128:\
  :openfiles=256:\
   :tc=standard:
   
  ## standard - standard user defaults
  ##
   standard:\
   :copyright=/etc/COPYRIGHT:\
   :welcome=/etc/motd:\
   :setenv=MAIL=/var/mail/$,BLOCKSIZE=K:\
   :path=~/bin /bin /usr/bin /usr/local/bin:\
   :manpath=/usr/share/man /usr/local/man:\
   :nologin=/var/run/nologin:\
   :cputime=1h30m:\
   :datasize=8M:\
   :stacksize=2M:\
   :memorylocked=4M:\
   :memoryuse=8M:\
   :filesize=8M:\
   :coredumpsize=8M:\
   :openfiles=24:\
   :maxproc=32:\
   :priority=0:\
   :requirehome:\
   :passwordtime=90d:\
   :umask=002:\
   :ignoretime@:\
   :tc=default:
   
   default:\
   :cputime=infinity:\
   :datasize-cur=22M:\
   :stacksize-cur=8M:\
   :memorylocked-cur=10M:\
   :memoryuse-cur=30M:\
   :filesize=infinity:\
   :coredumpsize=infinity:\
   :maxproc-cur=64:\
   :openfiles-cur=64:\
   :priority=0:\
   :requirehome@:\
   :umask=022:\

For starters, I don't see "sbsize" in there, although it doesn't
  sound like something that should be causing a panic() anymore anyway.
  Please provide more debugging info.

  Thanks,
  Bosko Milekic
  [EMAIL PROTECTED]







Re: zero copy TCP

2000-11-13 Thread Bosko Milekic



On Mon, 13 Nov 2000, Jin Guojun wrote:

 Both, but I may do either way, depending on which way is easier.
 If we can directly DMA from a disk drive to a NIC, that will be great.
 If the current implementation requires preloaded buffer, that works.
 So, where can I look for the patch?
 
 Thanks,
 
   -Jin

Please see sendfile(2).
  
  Regards,
  Bosko Milekic
  [EMAIL PROTECTED]







Re: post-install of kernal sources, maxusers max?

2000-11-08 Thread Bosko Milekic


On Wed, 8 Nov 2000, Mike Silbersack wrote:

 I think you can up the mbuf related settings while the system is
 running.  Give it a try.  The two sysctls you'll want to fiddle with are:
 
 kern.ipc.nmbclusters
 kern.ipc.nmbufs

Nope.
These are read-only but can be tuned from loader.

 You can determine which is needed more through a quick netstat -m.

 Mike "Silby" Silbersack

  Cheers,
  Bosko Milekic
  [EMAIL PROTECTED]







Re: When do you want to see panics?

2000-10-05 Thread Bosko Milekic


  No, not in the general case, they are not normal! So feel free to provide
  the info. :-)

On Thu, 5 Oct 2000, Michael Lucas wrote:

 Not sure if this is on-topic, but what the heck:
 
 I've started playing a little more freely with my laptop.  One result
 is comparatively frequent panics when doing things I know damn well
 are almost certain to fail, say, while playing with the Linuxulator or
 in mount_union.
 
 Are these panics & debugger dumps something people want to see, or is
 the general attitude "then don't *do* that!" ?
 
 If you folks want 'em, I'll send them.
 
 (I suppose the generalized form of this question is, "Are panics
 normal when the sysadmin is behaving like a damned fool?" ;)
 
 Thanks,
 Michael
 
 -- 
 Michael Lucas
 [EMAIL PROTECTED]
 http://www.blackhelicopters.org/~mwlucas/
 Big Scary Daemons: http://www.oreillynet.com/pub/q/Big_Scary_Daemons

  Regards,
  Bosko Milekic
  [EMAIL PROTECTED]







Re: mbuf re-write(s), v 0.1

2000-07-06 Thread Bosko Milekic


On Thu, 6 Jul 2000, Bosko Milekic wrote:

   I've recently had the chance to get some profiling done.
 
   I used metrics obtained from gprof, as well as the (basic block
   length) * (number of executions) metric generated by kernbb. The latter
   reveals an approximate 30% increase in the new code, but does not
   necessarily imply that time of execution is increased by that amount.
   gprof makes a fair estimate of execution time, and reveals that the
   new code is, worst-case scenario, 30% slower, and best-case scenario,
   negligibly slower. Of course, I'm leaving out some details here, because
   I've decided to change things a little, in order to further improve (and
   significantly, at that) the performance of the new code. Note however
   that the 30% overall APPROXIMATE increase is not something I would
   consider significant, especially since the allocator/free routines don't
   hold much %time, and are not the bottleneck in any of the call graphs. I
   did decide to make drastic changes, however, in order to maintain with
   the 0-tolerance policy, even if it involves somewhat getting rid of a
   cleaner interface and adopting a "kernel process." See below.

You can disregard the above data. I actually found something
detrimental (seriously) to performance.

During MFREE, the code would free the page in question if at the
  time the number of mbufs on the free list exceeds (even by a little)
  min_on_avail. This is fine. The problem was in MGET/MGETHDR where the
  code would explicitly allocate when how==M_WAIT and number of mbufs on
  free list < min_on_avail (this was a feeble attempt at making M_NOWAIT
  allocations even faster). The potential problem is not so obvious:
  numerous M_WAIT allocs will ALWAYS allocate a page from the map while
  min_on_avail > mbufs on free lists. And, MFREE would almost ALWAYS have
  to free back to the map as at this point, the number of mbufs on the free
  lists fairly quickly reaches min_on_avail. So what would happen is a page
  would be allocated, freed, allocated, freed, etc. m_get + m_free would be
  an endless cycle of m_mbmapalloc and m_mbmapfree, which increases
  overhead significantly.
After fixing MGET/MGETHDR, I'm getting more promising results. I'll
  get some hard data and post it later tonight, hopefully.
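
A toy user-space model of the cycle described above may make the cost easier to see. This is only a sketch, not the kernel code: `struct pool`, `toy_get()`, `toy_free()`, the `grow_below_min` flag and the 16-mbufs-per-page figure are all made up for illustration. It counts how often a page is taken from and returned to the "map" under the old M_WAIT policy and under the fixed one.

```c
#include <assert.h>
#include <stddef.h>

#define MBUFS_PER_PAGE 16   /* assumed: 4096-byte page / 256-byte mbuf */
#define MAX_PAGES      64   /* size of the toy "map" */

struct page { int mapped; int nfree; };

struct pool {
    struct page pages[MAX_PAGES];
    int total_free;       /* free mbufs across all mapped pages */
    int min_on_avail;     /* number of mbufs to keep cached */
    int grow_below_min;   /* 1 = old M_WAIT behaviour, 0 = fixed */
    long map_allocs, map_frees;
};

/* Take a fresh page from the toy map; returns its index or -1. */
static int map_page(struct pool *p)
{
    for (int i = 0; i < MAX_PAGES; i++) {
        if (!p->pages[i].mapped) {
            p->pages[i].mapped = 1;
            p->pages[i].nfree = MBUFS_PER_PAGE;
            p->total_free += MBUFS_PER_PAGE;
            p->map_allocs++;
            return i;
        }
    }
    return -1;
}

/* Allocate one mbuf; returns the index of the page it came from, or -1. */
static int toy_get(struct pool *p)
{
    int pg = -1;

    if (p->total_free == 0 ||
        (p->grow_below_min && p->total_free < p->min_on_avail)) {
        pg = map_page(p);          /* old policy: grow even if mbufs are free */
    } else {
        for (int i = 0; i < MAX_PAGES; i++)
            if (p->pages[i].mapped && p->pages[i].nfree > 0) { pg = i; break; }
    }
    if (pg < 0)
        return -1;
    p->pages[pg].nfree--;
    p->total_free--;
    return pg;
}

/* Free one mbuf back to page pg; give the page back if it is complete
 * and the free count exceeds min_on_avail. */
static void toy_free(struct pool *p, int pg)
{
    struct page *page = &p->pages[pg];

    assert(page->mapped && page->nfree < MBUFS_PER_PAGE);
    page->nfree++;
    p->total_free++;
    if (page->nfree == MBUFS_PER_PAGE && p->total_free > p->min_on_avail) {
        page->mapped = 0;
        page->nfree = 0;
        p->total_free -= MBUFS_PER_PAGE;
        p->map_frees++;
    }
}

/* Run n back-to-back get/free pairs. */
static void run_pairs(struct pool *p, int n)
{
    for (int i = 0; i < n; i++) {
        int pg = toy_get(p);
        if (pg >= 0)
            toy_free(p, pg);
    }
}
```

With min_on_avail set to 20 and 1000 get/free pairs, this model takes a page from the map 1000 times under the old policy and only once under the fixed one.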

  Oh, and I'm still open to the kernel process idea. I'll need one such
  beast anyway, because it will help minimize page fragmentation for the
  allocator, on request.


--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s), v 0.1

2000-07-03 Thread Bosko Milekic


On Sun, 2 Jul 2000, David Greenman wrote:

Yes, malloc is slow for other reasons, but it is especially slow when VM
 pages are freed back to the general pool. Of course it is possible to
 introduce hysteresis in the algorithm such that it doesn't free the pages as
 often, but this (and all the tunables that you proposed) has the negative
 effect of making the allocator more complex. We've tried very hard not to do
 this in the current mbuf allocator, making it nearly as efficient as you can
 get.

  * Have you looked at the code I proposed?
 http://24.201.62.9/code/mbuf/
 (I did some simplification recently, but it's not done yet, so you may
 want to look at it).
 
  * Again, I did NOT use malloc()/free() to allocate mbufs. Effectively, I
  do something similar to NetBSD's "pool" interface, only much SIMPLER.

  * I only proposed ONE additional tunable, and that's the one I mentioned
  previously. It has the effect of maintaining speed for those who would
  prefer to have it done in a similar way to before.

  * I agree with this:  - the present allocator is simple
- the present allocator is efficient
So is the new one, but since it introduces a new useful feature,
which has the effect of freeing physical memory when it isn't needed
and when the administrator agrees to do so, it's "simple" and
"efficient" in its own class. By the way, I'm very open to comments
and optimisation suggestions, so if it's not as efficient as possible
right now, then I'd love to hear suggestions pertaining to that, but
that would maintain the new functionality.

I guess I just don't see the problem on any of the servers that I manage
 (ftp.cdrom.com and ftp.freesoftware.com, for example). There are peaks in 
 usage, but they tend to reach the peaks often enough that freeing the pages
 for short term memory gain is just a waste of CPU cycles. Memory is so cheap
 these days that throwing memory at the problem seems to be a very reasonable
 solution, especially when the system clearly needs it during the peaks.

 -DG
 
 David Greenman
 Co-founder, The FreeBSD Project - http://www.freebsd.org
 Manufacturer of high-performance Internet servers - http://www.terasolutions.com
 Pave the road of life with opportunities.

I'm getting the unfortunate impression that evolution is being
  frowned upon here. Are there other people out there who frown upon the
  proposal to this extent? (i.e. "don't change it if it works") I'd like to
  hear some important voices on this issue so that I can decide whether to
  just drop this entire thing and forget about it. (in other words, what do
  committers and/or core have to say about this?)

Aside from this, I've gotten several other "pro" opinions on this;
  some people have even sent suggestions. So I know that I am not the only
  one (not by far, in fact) to see an opportunity to benefit from this.
  Either way, I know *I* will be using this code in time to come, so I
  suppose the question is:
Would you consider committing this code or should I stop posting any
changes I make in the future altogether?

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s), v 0.1

2000-07-03 Thread Bosko Milekic
dr struct in mbufs (pointer) that points to the
mbuf's corresponding page descriptor structure, so that pointer is
acquired and the free mbuf chain is extracted from the structure to
which the freed mbuf is attached (as it always was). I guess the only
real addition in CPU cycles here is the following: a simple check was
added that just checks if the entry is on the "empty" list and if it
is, moves it over to the "free list." If that's not the case, then
there is a possibility that the freed mbuf completes a page and the
page can be freed, so if that's the case and min_on_avail allows it,
then the page is freed back to the map (notice that this behavior is
tunable - again - with min_on_avail).

I'm not trying to 'frown upon evolution', unless the particular form of
 evolution is to make the software worse than it was. I *can* be convinced
 that your proposed changes are a good thing and I'm asking you to step up
 to the plate and prove it.

That sounds fair.

 
 -DG
 
 David Greenman
 Co-founder, The FreeBSD Project - http://www.freebsd.org
 Manufacturer of high-performance Internet servers - http://www.terasolutions.com
 Pave the road of life with opportunities.


--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s), v 0.1

2000-07-03 Thread Bosko Milekic


On Mon, 3 Jul 2000, Poul-Henning Kamp wrote:

 Considering the prominence of DoS attacks and similar, I think it
 makes a lot of sense to be able to free the memory again, and if
 the hysteresis you have built in means that there is no measurable
 performance impact I think you will face no objections.

That was one of the reasons for writing it. Oh, and there's something I
forgot to mention previously. The code I presently have frees memory
dedicated to mbufs, so obviously, it's significant, but it's even
more significant in the case of mbuf clusters, as they are larger. I
still haven't finished writing the cluster stuff though but expect it
to be similar in concept and design.

 Is it possible to auto-tune min_on_avail somehow ?
 
 What if instead you made it free only when more than 50% of the
 memory allocated from the map was unused ?

min_on_avail is presently a sysctl but I do expect to have it
optionally autotuned - read below.

 Could that freeing be done by a timeout routine which runs every
 N seconds ?

Ah! Finally, you've read my mind! The design has been made with the
idea of the possibility of a "kernel process" running [optionally]
periodically which will take care of such issues.

* reducing fragmentation by moving page descriptor structure nodes
with almost complete free lists to the bottom of the "free"
doubly-linked list

* possible auto-tuning of min_on_avail; I will be expanding mbstat to
include allocator statistics, so that the number of times the
VM allocation routine and the VM free routine have been called can be
recorded and used for such purposes.

* drain routine to free pages back to VM system

  In other words, the free page back to mb_map routine takes as an argument
  a node on the free list, so the "timeout" daemon can be made to walk the
  free list and pick out full available pages from the list and return the
  space to the map, on the condition that min_on_avail is respected. The
  issue with doing this, however, is that it will have to splimp() while
  walking the lists, so the question is whether it's really much of an
  advantage (as opposed to freeing from MFREE if necessary).

  On the other hand, what I think would be more of an advantage is having
  MFREE only call m_mbmapfree() [the new free routine] if (how) == M_WAIT.
  If (how) == M_NOWAIT, then the mbuf will just be attached to its
  corresponding page descriptor's free chain. I try to take advantage of
  (how) being M_WAIT as much as possible. For instance, during allocation,
  even if the free list is not empty but (how) is M_WAIT, the system will
  still fetch a new page and allocate from it if the number of free mbufs
  is less than min_on_avail. This is to minimize the calling of
  m_mbmapalloc() when allocations are to be done with M_NOWAIT (i.e. from
  interrupts).
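
The drain pass mentioned above could look roughly like the following user-space sketch. It is an illustration under assumptions, not the posted patch: `struct pg_descr`, `struct free_list`, `list_add()` and `drain()` are hypothetical names, a page is taken to hold 16 mbufs, and `free()` stands in for returning a page to the map. In the kernel the whole walk would have to run at splimp(), which is the cost discussed above.

```c
#include <assert.h>
#include <stdlib.h>

#define MBUFS_PER_PAGE 16   /* assumed: 4096-byte page / 256-byte mbuf */

/* One node per page on the doubly-linked "free" list. */
struct pg_descr {
    struct pg_descr *prev, *next;
    int nfree;                      /* free mbufs on this page */
};

struct free_list {
    struct pg_descr *head;
    int total_free;                 /* free mbufs across all nodes */
};

/* Push a page descriptor with nfree free mbufs onto the list head. */
static struct pg_descr *list_add(struct free_list *fl, int nfree)
{
    struct pg_descr *d = malloc(sizeof(*d));

    if (d == NULL)
        return NULL;
    assert(nfree > 0 && nfree <= MBUFS_PER_PAGE);
    d->nfree = nfree;
    d->prev = NULL;
    d->next = fl->head;
    if (fl->head != NULL)
        fl->head->prev = d;
    fl->head = d;
    fl->total_free += nfree;
    return d;
}

/* Walk the list and release completely free pages, stopping as soon as
 * releasing another one would leave fewer than min_on_avail mbufs cached.
 * Returns the number of pages released. */
static int drain(struct free_list *fl, int min_on_avail)
{
    int released = 0;
    struct pg_descr *d, *next;

    for (d = fl->head; d != NULL; d = next) {
        next = d->next;             /* saved first: d may be freed below */
        if (d->nfree != MBUFS_PER_PAGE)
            continue;               /* page still has mbufs in use */
        if (fl->total_free - MBUFS_PER_PAGE < min_on_avail)
            break;                  /* respect min_on_avail */
        if (d->prev != NULL)
            d->prev->next = d->next;
        else
            fl->head = d->next;
        if (d->next != NULL)
            d->next->prev = d->prev;
        fl->total_free -= MBUFS_PER_PAGE;
        free(d);                    /* stands in for freeing to the map */
        released++;
    }
    return released;
}
```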

 --
 Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
 [EMAIL PROTECTED] | TCP/IP since RFC 956
 FreeBSD coreteam member | BSD since 4.3-tahoe
 Never attribute to malice what can adequately be explained by incompetence.

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s), v 0.1

2000-06-29 Thread Bosko Milekic


On Thu, 29 Jun 2000, David Greenman wrote:

We used to do this in FreeBSD, but found that it was a bad idea for
 performance reasons. Freeing and reallocating memory from the high-level
 VM system is quite expensive and the trend in NICs these days is towards
 needing the code to be even faster, not slower. Further, if the 'peak' is
 reached often, then you're probably not really gaining much by freeing
 the memory back to the common pool.
 
 -DG

What was previously done at some point was use the kernel malloc() to
  allocate mbufs. As you know, this is a general purpose allocator that has
  to first determine what algorithm to use and then store the object
  correctly according to its size. This allocator is faster than that
  one. This allocator knows that it only has to deal with mbufs and knows
  that all of these mbufs are of the same size. 
I am not proposing to return to malloc(), I am proposing the new
  allocator.
Also, the "peak" in this case is not reached often, obviously. It is
  designed with just that idea in mind. But, if the administrator feels
  that it is, I have provided the following mechanism:

  { jehovah:/home/bmilekic } sysctl -A | grep min_on_avail
  kern.ipc.min_on_avail: 0

  With this sysctl, the administrator can set a "minimum required" count
  for mbufs. In other words, it is possible to easily tell the system to
  keep as many mbufs as you'd like cached on the free lists.

 David Greenman
 Co-founder, The FreeBSD Project - http://www.freebsd.org
 Manufacturer of high-performance Internet servers - http://www.terasolutions.com
 Pave the road of life with opportunities.

 -Bosko

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: Re[2]: mbuf re-write(s), v 0.1

2000-06-29 Thread Bosko Milekic


On Thu, 29 Jun 2000, Joe McGuckin wrote:

 
 What about a slab allocator 
 (e.g. http://www.cnds.jhu.edu/~jesus/418/SlabAllocator.pdf)
 
 Joe

What's your motivation behind this recommendation?

What you're essentially suggesting is a replacement for our kernel
  malloc(). This will not make mbuf allocations faster by any means. The
  mbuf allocator that I presented in code a week or so ago does something
  very simple, as there is no point in making same-sized object allocations
  complicated, really, especially when they are small objects; in other
  words, I did NOT suggest going back to using malloc() for mbufs. 
  I wrote a new, simpler and faster customized allocator which does
  essentially this:

  * Check free list, which is a doubly-linked list of "mb_map page
  descriptor structures" (new structure, mbpl_pg_descr). This structure
  contains very basic and essential information, such as the address of the
  VM page, the number of mbufs that are "in use" on that page, etc. If
  there is a node present on that general free list, then there is a free
  mbuf, and allocation from the map is not necessary. Grab free mbuf. If
  node now no longer contains any free mbufs, detach it from this list and
  attach it to the empty list.

  * If nothing on free list, allocate from map, also allocate memory for
  mbpl_pg_descr node for the obtained page and break page down into n
  objects, attaching it to the free list. Future allocations can allocate
  from that page until we run out.

  Freeing is equally simple:

  * Compute index into global array of pointers to mbpl_pg_descr structures
  based on the address of the mbuf. Locate node and determine which list
  it's on.

  * Place mbuf back on that mbpl_pg_descr's free list and if the node was
  previously on the empty list, move it to the free list, as there is now
  at least one free mbuf available on it.

  * If the freed mbuf completes the page, the page can be freed back to the
  map, but ONLY free it back if min_on_avail is met (sysctl). So you see,
  it is possible in this way to control the free list, and have many
  objects cached on the free list, essentially going back to the behavior
  we presently have, if that's what the sysadmin wants, with only the
  little overhead of having to deal with the linked lists (which isn't
  much, as they are both doubly-linked, so insertion/removal is fast).

That's it, roughly. I hope this clears up some things for those of
  you who didn't look at the actual code.
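
For readers who want something concrete, the allocation and free steps above can be sketched in user space as follows. This is a simplified model under assumptions, not the posted patch: a `malloc()`ed arena stands in for mb_map, the descriptor table is a plain array indexed by page number rather than an array of pointers, the names are hypothetical, and "min_on_avail is met" is read as "at least min_on_avail mbufs remain cached after the page is released".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096
#define MBUF_SIZE 256
#define PER_PAGE  (PAGE_SIZE / MBUF_SIZE)
#define MAP_PAGES 8

struct toy_mbuf { struct toy_mbuf *next; };   /* free-chain link inside the buffer */

struct pg_descr {
    struct pg_descr *prev, *next;   /* links on free_list or empty_list */
    struct toy_mbuf *chain;         /* free mbufs on this page */
    int nfree;
    int in_use;                     /* page currently taken from the map */
};

static char *map_base;                       /* stands in for mb_map */
static struct pg_descr descr[MAP_PAGES];     /* indexed by page number */
static struct pg_descr *free_list, *empty_list;
static int total_free, min_on_avail;

static void unlink_pg(struct pg_descr **head, struct pg_descr *d)
{
    if (d->prev) d->prev->next = d->next; else *head = d->next;
    if (d->next) d->next->prev = d->prev;
}

static void push_pg(struct pg_descr **head, struct pg_descr *d)
{
    d->prev = NULL;
    d->next = *head;
    if (*head) (*head)->prev = d;
    *head = d;
}

/* Take a page from the map, carve it into mbufs, put it on free_list. */
static struct pg_descr *map_alloc(void)
{
    if (map_base == NULL &&
        (map_base = malloc((size_t)MAP_PAGES * PAGE_SIZE)) == NULL)
        return NULL;
    for (int i = 0; i < MAP_PAGES; i++) {
        struct pg_descr *d = &descr[i];
        char *page = map_base + (size_t)i * PAGE_SIZE;

        if (d->in_use)
            continue;
        d->in_use = 1; d->chain = NULL; d->nfree = 0;
        for (int j = 0; j < PER_PAGE; j++) {
            struct toy_mbuf *m = (struct toy_mbuf *)(page + (size_t)j * MBUF_SIZE);
            m->next = d->chain; d->chain = m; d->nfree++;
        }
        total_free += PER_PAGE;
        push_pg(&free_list, d);
        return d;
    }
    return NULL;
}

static void *toy_get(void)
{
    struct pg_descr *d = free_list ? free_list : map_alloc();
    struct toy_mbuf *m;

    if (d == NULL)
        return NULL;
    m = d->chain;
    d->chain = m->next; d->nfree--; total_free--;
    if (d->nfree == 0) {            /* page exhausted: move to empty list */
        unlink_pg(&free_list, d);
        push_pg(&empty_list, d);
    }
    return m;
}

static void toy_free(void *p)
{
    size_t idx = (size_t)((char *)p - map_base) / PAGE_SIZE;
    struct pg_descr *d = &descr[idx];
    struct toy_mbuf *m = p;

    assert(idx < MAP_PAGES && d->in_use);
    if (d->nfree == 0) {            /* was on empty list: now has a free mbuf */
        unlink_pg(&empty_list, d);
        push_pg(&free_list, d);
    }
    m->next = d->chain; d->chain = m; d->nfree++; total_free++;
    if (d->nfree == PER_PAGE && total_free - PER_PAGE >= min_on_avail) {
        unlink_pg(&free_list, d);   /* complete page: give it back */
        d->in_use = 0;
        total_free -= PER_PAGE;
    }
}
```

Raising `min_on_avail` in this model keeps complete pages cached on the free list, which corresponds to the "behavior we presently have" option described above.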

  Regards,
  Bosko.

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s): v 0.2: request-for-comments

2000-06-28 Thread Bosko Milekic


On Wed, 28 Jun 2000, Dennis wrote:

 YES! This is wonderful news.
 
 I started coding device drivers on Digital UNIX and have long missed
 this feature.  I can't count the number of times I've gotten 90% of
 the way through doing something with ext mbufs & thought to myself 
 "oh  hell, now what am I going to do for an m_ext.ext_ref() function?"
 
 On a less enthusiastic note, the amount of whitespace changes make it
 very difficult to eyeball your diff.  Could you re-roll your diffs with
 -b (to ignore your whitespace changes).
 
 It's not really "wonderful" to those that have already implemented something
 using the old method.
 
 What version is this "patch" likely to find its way into the mainstream
 code (or will it), as it's likely to break our drivers.
 
 Dennis

You can cast the void * argument to basically anything you like, so
  there is little chance that it will break your drivers to the order which
  you appear to be suggesting. All it would really do is reduce code bloat
  and make things less scattered. Actually, network device drivers were one
  of the motivations for this part of the patch: Bill Paul implements jumbo
  bufs in if_sk, for example, and has to literally "hide" the address of
  the softc structure inside the buffer so that he can use it inside his
  ext_{free, ref} calls. All that this would do is clean things up for him.
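
As an illustration of what the void * argument buys a driver, here is a small stand-alone sketch. It is not taken from the patch or from if_sk: `struct toy_softc`, `struct toy_ext` and the `toy_*` functions are hypothetical, and only the idea (the free routine receives a caller-supplied pointer, so nothing has to be hidden inside the buffer) comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device state a driver wants back in its free routine. */
struct toy_softc {
    int jumbo_free;                 /* jumbo buffers back on the driver's list */
};

/* Toy stand-in for the external-storage part of an mbuf. */
struct toy_ext {
    char *ext_buf;                              /* start of the buffer */
    void (*ext_free)(char *buf, void *args);    /* second argument is void * */
    void *ext_args;                             /* opaque caller pointer */
};

/* Driver free routine: recovers its softc straight from the argument. */
static void toy_jumbo_free(char *buf, void *args)
{
    struct toy_softc *sc = args;

    (void)buf;                      /* a real driver would requeue buf here */
    sc->jumbo_free++;
}

/* Attach external storage together with its free routine and argument. */
static void toy_ext_attach(struct toy_ext *e, char *buf,
                           void (*fn)(char *, void *), void *args)
{
    e->ext_buf = buf;
    e->ext_free = fn;
    e->ext_args = args;
}

/* Release the storage by calling the caller's routine with its argument. */
static void toy_ext_release(struct toy_ext *e)
{
    assert(e->ext_free != NULL);
    e->ext_free(e->ext_buf, e->ext_args);
    e->ext_buf = NULL;
}
```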

As this patch is rather big, and does more than just this (i.e. it
  also completely changes the way mbufs are allocated and freed and thus
  allows pages allocated from mb_map to be freed back to the map, therefore
  freeing physical pages as a consequence, etc. -- and there are more
  changes to come), I think that it's safe to say that there is still a
  little bit before this goes through.

  Regards,
  Bosko.

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s): v 0.2: request-for-comments

2000-06-28 Thread Bosko Milekic


On Wed, 28 Jun 2000, Andrew Gallatin wrote:

 YES! This is wonderful news.
 
 I started coding device drivers on Digital UNIX and have long missed
 this feature.  I can't count the number of times I've gotten 90% of
 the way through doing something with ext mbufs & thought to myself 
 "oh  hell, now what am I going to do for an m_ext.ext_ref() function?"

I can imagine. As I've previously mentioned, I'm thinking of
  adopting NetBSD's reference idea, as it seems very handy here. What it
  basically assumes is that if you're going to increase the reference count
  to an object, that you know one of the mbufs also referencing that object
  (since what you're doing is probably "copying" the data without having to
  actually perform a memory-to-memory copy -- the reason we have reference
  counts in the first place). So, if you know the mbuf referencing the same
  object, you will pass it to the macro and it will "increase a reference
  count" for it itself. What actually occurs is that the m_ext structure
  holds a forward/backward pointer (in the style of doubly-linked list) and
  is linked to all the other mbufs referencing the same object. This would
  isolate the referencing of external objects to the mbuf subsystem, such
  that callers don't have to worry about it at all, and can essentially get
  rid of the ext_ref() routine altogether.
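
A minimal sketch of that linked-reference idea, assuming a circular doubly-linked ring through the mbufs that share one buffer. The field and function names here are illustrative only (loosely in the spirit of the NetBSD scheme, not copied from it), and `malloc()`/`free()` stand in for the real storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy mbuf carrying only what the reference ring needs. */
struct toy_mbuf {
    char *ext_buf;                              /* shared external storage */
    struct toy_mbuf *ext_nextref, *ext_prevref; /* ring of co-owners */
};

/* First owner: a ring of one. */
static void ext_init(struct toy_mbuf *m, char *buf)
{
    m->ext_buf = buf;
    m->ext_nextref = m->ext_prevref = m;
}

/* "Copy" by reference: link n into the ring right after o. */
static void ext_addref(struct toy_mbuf *o, struct toy_mbuf *n)
{
    assert(o->ext_buf != NULL);
    n->ext_buf = o->ext_buf;
    n->ext_nextref = o->ext_nextref;
    n->ext_prevref = o;
    o->ext_nextref->ext_prevref = n;
    o->ext_nextref = n;
}

/* The storage is shared if the ring holds anyone besides m. */
static int ext_is_shared(const struct toy_mbuf *m)
{
    return m->ext_nextref != m;
}

/* Drop m's reference; the last owner frees the storage.
 * Returns 1 if the storage was freed, 0 otherwise. */
static int ext_release(struct toy_mbuf *m)
{
    int last = !ext_is_shared(m);

    if (last) {
        free(m->ext_buf);
    } else {
        m->ext_prevref->ext_nextref = m->ext_nextref;
        m->ext_nextref->ext_prevref = m->ext_prevref;
    }
    m->ext_buf = NULL;
    m->ext_nextref = m->ext_prevref = m;
    return last;
}
```

No separate counter exists in this scheme: whether the buffer is still referenced is answered by whether the ring has more than one member.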

 On a less enthusiastic note, the amount of whitespace changes make it
 very difficult to eyeball your diff.  Could you re-roll your diffs with
 -b (to ignore your whitespace changes).

Yeah, I made some "appearance/consistency/cleanliness" changes in
  /sys/sys/mbuf.h in order to maintain consistency and ensure easy
  readability of the final product. However, for readability purposes, I
  posted the no-whitespace-changes diff to the same place:
http://www.technokratis.com/code/mbuf/

I should have done this immediately; thanks for the advice! Hope this
  helps. :-)

 --
 Andrew Gallatin, Sr Systems Programmer   http://www.cs.duke.edu/~gallatin
 Duke University                          Email: [EMAIL PROTECTED]
 Department of Computer Science           Phone: (919) 660-6590

 Cheers,
 Bosko.
--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s): v 0.2: request-for-comments

2000-06-28 Thread Bosko Milekic



On Wed, 28 Jun 2000, Dave Baukus wrote:

 All this talk of mbuf prompts me to point out a small bug in M_PREPEND that
 was introduced somewhere between 3.3 and 4.0; maybe it's also in 5.x.
 
[...]
 If m_prepend() fails then 

No longer an issue in 5.0-CURRENT, and I'm looking at version 1.50 of
  mbuf.h
Although your pointing it out did lead me to look at m_prepend()
  itself, and to notice some bad style issues, like casting on NULLs (ick!)
  which I'll fix in the patch along with adding the new reference stuff.
  Thanks!

 --
 Dave Baukus
 [EMAIL PROTECTED]
   Chiaro Networks ltd.
   Richardson, Texas, USA.


--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: mbuf re-write(s): v 0.2: request-for-comments

2000-06-28 Thread Bosko Milekic


On Wed, 28 Jun 2000, Kenneth D. Merry wrote:

 FWIW, I'm in favor of a pointer argument as well.  The way I implemented it
 was actually with a third argument, instead of changing the int to void.
 i.e.:

[...]

 I don't feel too strongly about it either way -- I suppose it's about the
 same amount of work to port older code.  (I just put an ifdef in the
 sendfile code, which doesn't use the third argument in my tree.)

The u_int is really unnecessary. If the caller needs more important
  information, he can pass anything he likes, including a data structure,
  or even a pointer to the mbuf. So this information can be extracted in
  either case.

 
 Ken
 -- 
 Kenneth Merry
 [EMAIL PROTECTED]
 
 


--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







mbuf re-write(s): v 0.2: request-for-comments

2000-06-27 Thread Bosko Milekic


  Hello,

I just added a MEXTADD() routine to the [now getting bigger] mbuf
  re-write patch, as well as fixed and changed a few little things here and
  there (once again). Thus, so-called "version 2" of the diff is again
  available: http://www.technokratis.com/code/mbuf/

This code includes all that was discussed in the previous Email, as
  well as a better/actually working external storage facility for clusters.
  Previously, it was very difficult to allocate external storage, attach it
  to the mbuf, _and_ as well maintain a reference counter for it, primarily
  due to the arguments that were taken by ext_free() and ext_ref(). These
  have been changed to have a new void * pointer passed in as the second
  argument (following the base address of the storage buffer). Also
  included is a void * multi-purpose ext_args pointer in the m_ext
  struct, so the caller has much more flexibility now. In fact, the caller
  can now attach a "management" or "reference" structure to the m_ext
  struct via the ext_args pointer, and have it passed to his ext_free and
  ext_ref routines. Naturally, for dynamically sized malloc() external
  buffers, the caller can also allocate along with it space for its
  reference counter and attach to the mbuf via the ext_args pointer. It
  will be incremented/decremented properly as ext_args can be passed as the
  second argument to the two functions. When ext_free, ext_ref, and
  ext_args are all NULL, but M_EXT is set, then the external storage
  corresponds to an mcluster.
These changes will surely help out/make cleaner some code, like some
  of Bill Paul's device drivers (if_sk, if_ti, if_wb). For other purposes,
  such as sf_bufs, for example, it's not _as_ significant, mainly since
  sf_bufs are allocated from their own map such that the system can easily
  produce a unique index for a reference counter array just by looking at
  the offset base_addr_of_sf_buf - base_addr_of_map, like we do for
  mclusters. However, obviously, we don't want a new map for every new type
  of external storage we want to attach to an mbuf. :-)
(Yes, this means easy attaching of dynamically sized buffers)
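
The offset-based indexing mentioned above for sf_bufs and mclusters can be sketched like this. It is a stand-alone illustration under assumptions: the buffer size, the buffer count and all names (`map_base`, `refcnt`, `buf_index()`, `buf_ref()`, `buf_unref()`) are made up, and a static array stands in for the dedicated map.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 2048   /* assumed size of each fixed-size buffer */
#define NBUFS    32     /* assumed number of buffers in the map */

static char map_base[NBUFS * BUF_SIZE];   /* stands in for the dedicated map */
static unsigned char refcnt[NBUFS];       /* one counter per buffer */

/* Any address inside a buffer maps to that buffer's counter slot,
 * because every buffer lives at a fixed-size offset from the map base. */
static size_t buf_index(const char *addr)
{
    size_t off = (size_t)(addr - map_base);

    assert(off < sizeof(map_base));
    return off / BUF_SIZE;
}

/* Take another reference on the buffer containing addr. */
static void buf_ref(const char *addr)
{
    refcnt[buf_index(addr)]++;
}

/* Drop a reference; returns 1 when the last one is gone. */
static int buf_unref(const char *addr)
{
    return --refcnt[buf_index(addr)] == 0;
}
```

This only works because all buffers come from one contiguous region and share one size, which is why it does not extend to arbitrary external storage without a new map per type.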

What I still have left to do before I look into
finding/bugging/annoying a committer (sigh) to reviewing/committing
all of this:

* Re-write the mcluster allocations/deallocations in the same style
as the new mbuf allocator/deallocator. ... If someone has a more
suitable proposition, please let me know. I love to hear suggestions.

* I'm thinking of adopting NetBSD's "cute" and "clean" reference
count system; they maintain their mbufs linked through the m_ext when
they reference the same storage object. This will remove all fear
from external callers/code having to deal with references in the
first place, and will isolate it all to the mbuf code. Once this is
done, I can also add a NetBSD-like MEXTMALLOC() macro, in addition to
the just-added MEXTADD() macro. This would automate dynamic
malloc()ing of external storage objects, and make it quite a bit
cleaner/easier for the caller.

* Patch up userland to deal with all of these changes.

* Get some profiling / optimisation done.

  Since my initial post, I have received quite a few hits/requests for the
  posted code, and have even received a few comments/suggestions. These
  have been most helpful. I invite many more,... please!

  Regards,
  Bosko.

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







mbuf re-write(s), v 0.1

2000-06-20 Thread Bosko Milekic


In an attempt to eliminate or significantly reduce the hogging of
  physical memory by unused mbufs, I have begun re-writing some of the mbuf
  subsystem. I've re-written the allocator and designed an actual free
  routine, and have also considerably re-written the MGET, MGETHDR, and
  MFREE macros. I still have some work to do with this, notably
  optimisation, but I have not been able to do any profiling whatsoever as
  profiling, I repeat, seems presently broken on -CURRENT.

This is particularly useful for machines which see "peak" mbuf usage
  periods, where many mbufs are allocated, only to be freed a little while
  later, but which will unfortunately remain on the free list, holding on
  to physical memory (for a graphical example, see the THIRD graph at
  http://www.technokratis.com/stats/mbuf.html).

Previously, we used to use the kernel malloc() to do mbuf
  allocations, coupled with the free() routine to do the freeing. However,
  the new allocator does not have to worry about choosing the right
  algorithm, and notably, variable sized objects. Of course, I still have
  some performance tuning to do, but need the profiling to work for that.

Of course, there is a min_on_avail variable added to the code, which
  is yet to be made sysctl-tunable, and which represents the minimum number
  of mbufs that must reside on the free lists, so that the system will not
  explicitly free pages on every occasion it gets.

The reason I named this "v 0.1" has to do with the work that is left
  to be done here. I've, for the moment, removed the m_reclaim() and wait
  code for mbufs, but this will all have to be re-placed appropriately (not
  much voodoo involved here). However, I've moved the mclusters to their
  own map, mcl_map, which is the correct thing to do here, in order to
  avoid having to worry about fragmentation in the allocation routines (we
  want most efficiency possible). I'll go ahead and change the mcluster
  stuff soon, too, and hopefully fix up some of the mclrefcnt usage for
  clusters. I'll discuss more of this in time to come, and post the URL
  here. Also, I'm planning to write an optional "mbuf daemon" that can
  periodically walk the mbuf system's AVAIL_LST, and EMPTY_LST, and
  re-organize order of elements on, particularly, the AVAIL_LST, in order
  to minimize fragmentation during allocations, and augment % utilization
  for the allocator(s). It should also optionally do some other neat tasks,
  but I haven't exactly decided on which ones, although I'd like to avoid
  having it raise to splimp() for too long, though.

Unlike what some of you may be thinking right now, this is not
  theoretical work, I have some diffs right here:
 http://www.technokratis.com/code/mbuf/
  (you'll have to excuse my big tabs)

The diffs provided for now are context diffs, and they do several
  things, among which (not to go too much into details):

  1* Implement new mbuf allocator, implement free routine, re-write mbuf
  allocation and free macros. Add necessary lists / structures for the new
  system.

  2* Change to OID_AUTO for all sysctls in uipc_mbuf.c

  3* Make /sys/sys/mbuf.h look nicer, more consistent comments, etc.

  4* Have mbuf clusters remain the same for now, but move them over to
  mcl_map

  5* Remove (temporarily) mbuf wait/reclaim stuff.

The diffs are in working condition on -CURRENT (as of a couple of
  days ago, at least), and I'm running them with no apparent problems as we
  speak. % utilization is great, for now, and I hope that the
  daemon-to-come will bring it up even higher. I can also tune it with the
  min_on_avail variable. Of course, from the above 5 points, you'll quickly
  note that I still have to go around and rebuild userland stuff, but
  that will wait until the end of all mbuf system modifications.

Comments welcome. Special thanks to Mike Silbersack for already
  discussing such issues with me.

 Regards,
 Bosko
 
--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/




To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



ether_output() : WIERD PROBLEM

2000-06-13 Thread Bosko Milekic


  Hello,

  I've been doing some mbuf-related work on my -CURRENT machine lately.
  Particularly, I've re-written the allocator and free routines, amongst
  other things. However, I've encountered a peculiar problem that surfaces
  in ether_output().

  What happens is that one of my daemons, for example, natd, or httpd,
  etc., performs a system call, which eventually results in a call to
  ether_output (following tcp_output, ip_output, etc.). At the bottom of
  ether_output(), after an IF_ENQUEUE, and an splx(s), there is the
  following check:
  
        if (m->m_flags & M_MCAST)
                ifp->if_omcasts++;

  The if () part results in a testb $0x2, 0x13(%ebx)
  IF I REMEMBER correctly.

  For some weird reason, when the mbuf in question is at a location:
  0xstuffF00 (256 bytes into a page, the second mbuf on a page), there is a
  page fault. And it's _always_ when the mbuf is at such an address.
 
  Where the weirdness begins is when I actually examine the contents of the
  mbuf... I can actually see them, no page fault, no nothing. In fact, if I
  `continue' from the debugger, things continue to work fine... until the
  next 0xstuffF00 mbuf goes through ether_output() and reaches that check.

  If I move the check of the m_flags to just above the splx(s), but after
  the IF_ENQUEUE, then the page fault still occurs in the same way, except
  that I even get a page fault when trying to examine the contents of the
  mbuf. In other words, I can't even `continue' in this case.

  If I move the m_flags check before the IF_ENQUEUE, this doesn't happen at
  all!

  Furthermore, if I revert my mbuf changes, I don't catch this problem.

  Anyone got any hints/clues?

  Regards,
  Bosko.

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: ether_output() : WIERD PROBLEM

2000-06-13 Thread Bosko Milekic


Wow, a reply to myself. I feel kind of lame. :-)

Anyway, this is just an update, with more info.

I've checked the status of my new system's lists, once the fault
occurs, and I can _guarantee_ that the management lists I wrote the
code for are actually not corrupt when this happens. I've looked at
the dump of the memory at where they are stored from DDB, and
everything looks in order. So my assumption is that this has just
uncovered a bug in ether_output(). Although I can't confirm it 100%.

I do have another bit of valuable information, though. If I move the
m_flags check to after the ENQUEUE, but prior to the call to the
interface "start" routine (see the end of ether_output()), then
things are fine. The problem only occurs if the check is moved to
_after_ the if_start call.
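
One possible reading of these observations, which the thread does not
confirm, is that once the mbuf has been queued and the start routine has
run, the driver may already have transmitted and freed it, so nothing
should read from it afterwards. The userland sketch below models that by
sampling the flag before the handoff. struct pkt, PKT_MCAST,
queue_and_start(), output_pkt() and out_mcasts are stand-ins invented for
the illustration, not kernel names.

```c
#include <stdlib.h>

#define PKT_MCAST 0x2	/* stand-in for M_MCAST */

struct pkt {
	int flags;
};

static unsigned long out_mcasts;	/* stand-in for ifp->if_omcasts */

/* Models IF_ENQUEUE + if_start: the "driver" transmits and frees p. */
static void
queue_and_start(struct pkt *p)
{
	free(p);
}

/* Read everything needed from p before ownership passes to the driver. */
static void
output_pkt(struct pkt *p)
{
	int is_mcast = (p->flags & PKT_MCAST) != 0;

	queue_and_start(p);	/* p must not be dereferenced after this */
	if (is_mcast)
		out_mcasts++;
}
```

The design choice is simply to copy the needed bits into a local while the
caller still owns the buffer, rather than to reorder the statistics update.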


On Tue, 13 Jun 2000, Bosko Milekic wrote:

 
   Hello,
 
   I've been doing some mbuf-related work on my -CURRENT machine lately.
   Particularly, I've re-written the allocator and free routines, amongst
   other things. However, I've encountered a peculiar problem that surfaces
   in ether_output().
 
   What happens is that one of my daemons, for example, natd, or httpd,
   etc., performs a system call, which eventually results in a call to
   ether_output (following tcp_output, ip_output, etc.). At the bottom of
   ether_output(), after an IF_ENQUEUE, and an splx(s), there is the
   following check:
   
        if (m->m_flags & M_MCAST)
                ifp->if_omcasts++;
 
   The if () part results in a testb $0x2, 0x13(%ebx)
   IF I REMEMBER correctly.
 
   For some weird reason, when the mbuf in question is at a location:
   0xstuffF00 (256 bytes into a page, the second mbuf on a page), there is a
   page fault. And it's _always_ when the mbuf is at such an address.
  
   Where the weirdness begins is when I actually examine the contents of the
   mbuf... I can actually see them, no page fault, no nothing. In fact, if I
   `continue' from the debugger, things continue to work fine... until the
   next 0xstuffF00 mbuf goes through ether_output() and reaches that check.
 
   If I move the check of the m_flags to just above the splx(s), but after
   the IF_ENQUEUE, then the page fault still occurs in the same way, except
   that I even get a page fault when trying to examine the contents of the
   mbuf. In other words, I can't even `continue' in this case.
 
   If I move the m_flags check before the IF_ENQUEUE, this doesn't happen at
   all!
 
   Furthermore, if I revert my mbuf changes, I don't catch this problem.
 
   Anyone got any hints/clues?
 
   Regards,
   Bosko.

--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/







Re: Mbuf waiting mfc to 3

2000-06-10 Thread Bosko Milekic


[re-directed to -hackers, as more appropriate there, also
[EMAIL PROTECTED] is in the CC, make _SURE_ to remove him from there
before you post ANY replies!!!]

 Mike, your patch looks fine. However, I found a bug in /sys/netkey code.
 (and it's related to the wait stuff, although I don't believe it directly
 concerns your patch so, as far as I'm concerned, your stuff is ready to go
 in.)

  However, I believe this code is only for 4.x and -CURRENT people.

 keysock.c's key_sendup() does a silly thing with the mbuf allocation: it
 sets n->m_len before checking whether MGETHDR/MGET returned NULL.

 Attached is a patch that fixes it.

 This applies to v 1.2 of the file.
 here is the Id:
 /* KAME @(#)$Id: keysock.c,v 1.2 1999/08/16 19:30:36 shin Exp $ */
 
 (Jlemon, can you commit this?)

 Oh yeah, and please also commit pr=18471 as it's been sitting there for a
 while.

Thanks in advance,
Bosko.

On Fri, 9 Jun 2000, Mike Silbersack wrote:

 Well, it's been nearly a month since I posted the mbuf waiting MFC for 3.4
 to -net, although I haven't heard any complaints about it messing up
 systems, there have been a few complaints on bugtraq of mbuf
 exhaustion attacks which would be much less serious with it. :)
 
 In any case, the patch is still available at
 http://www.silby.com/patches/mbuf-wait-mfc-2.patch for review.  I'm
 fairly confident in its reliability, but I'd prefer a few more people to
 test it if they have the time.  If there are no negative complaints, I'd
 like to get it committed before the end of next week to ensure that we
 don't miss getting it into 3.5.
 
 There are no changes between this patch and the last one I posted other
 than a single version line I had messed up in the previous one, so if
 you're currently testing that one, there's no need to download this
 one.  Please post your experiences with it in any case, though.
 
 The small memory leak I alluded to in my previous posting of the patch has
 been found and committed separately (as it affected 3, 4, and 5).  So,
 please CVSUP before testing this patch to ensure you're seeing its true
 colors.
 
 Thanks,
 
 Mike "Silby" Silbersack


--
 Bosko Milekic  *  Voice/Mobile: 514.865.7738  *  Pager: 514.921.0237
[EMAIL PROTECTED]  *  http://www.technokratis.com/



--- keysock.old.c   Sat Jun 10 03:09:05 2000
+++ keysock.c   Sat Jun 10 03:13:43 2000
@@ -419,18 +419,25 @@
        while (tlen > 0) {
                if (tlen == len) {
                        MGETHDR(n, M_DONTWAIT, MT_DATA);
+                       if (n == NULL) {
+                               if (m) m_freem(m);
+                               return ENOBUFS;
+                       }
                        n->m_len = MHLEN;
                } else {
                        MGET(n, M_DONTWAIT, MT_DATA);
+                       if (n == NULL) {
+                               if (m) m_freem(m);
+                               return ENOBUFS;
+                       }
                        n->m_len = MLEN;
                }
-               if (!n)
-                       return ENOBUFS;
+
                if (tlen > MCLBYTES) {  /*XXX better threshold? */
                        MCLGET(n, M_DONTWAIT);
                        if ((n->m_flags & M_EXT) == 0) {
                                m_free(n);
-                               m_freem(m);
+                               if (m) m_freem(m);
                                return ENOBUFS;
                        }
                        n->m_len = MCLBYTES;
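
The ordering this patch establishes (check the allocation before touching
the result, and release the partially built chain on failure) can be
sketched in userland as follows. This is an illustration only: struct node,
chain_free() and chain_build() are hypothetical stand-ins for mbufs,
m_freem() and the loop in key_sendup(), and the fail_at parameter exists
purely to exercise the error path.

```c
#include <errno.h>
#include <stdlib.h>

struct node {
	struct node *next;
	size_t len;
};

/* Free a whole chain; a NULL chain is allowed, like `if (m) m_freem(m)`. */
static void
chain_free(struct node *head)
{
	while (head != NULL) {
		struct node *next = head->next;

		free(head);
		head = next;
	}
}

/*
 * Build a chain of `count` nodes.  `fail_at` injects an allocation
 * failure at that index (pass -1 for none).  Returns 0 or ENOBUFS;
 * *out is NULL on failure.
 */
static int
chain_build(int count, int fail_at, struct node **out)
{
	struct node *head = NULL, **tail = &head;

	*out = NULL;
	for (int i = 0; i < count; i++) {
		struct node *n = (i == fail_at) ? NULL : malloc(sizeof(*n));

		if (n == NULL) {		/* check before any n->... */
			chain_free(head);	/* do not leak what we built */
			return (ENOBUFS);
		}
		n->next = NULL;
		n->len = sizeof(*n);
		*tail = n;
		tail = &n->next;
	}
	*out = head;
	return (0);
}
```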



Re: kerneld for FreeBSD

2000-06-06 Thread Bosko Milekic


On Wed, 7 Jun 2000, void wrote:

 Doesn't Solaris auto-unload unused drivers when memory gets tight?
 
 -- 
  Ben
 
 220 go.ahead.make.my.day ESMTP Postfix

An Operating System should only do that when the administrator is so
  stupid that he/she actually loads "unused" drivers.

--
 Bosko Milekic
 [EMAIL PROTECTED]







Re: fatal trap 12: page fault while in kernel mode

2000-05-26 Thread Bosko Milekic


On Fri, 26 May 2000, Greg Skouby wrote:

 Hello,
 
 I posted a message to -questions yesterday about a machine that had the
 /dev directory somewhat corrupt. I could ls -la /dev/wd0* but when I was
 in the /dev directory and did an ls, it was not showing any of the files.
 Now, today the machine was rebooting over and over again, freezing with
 this message:
 
 
 fatal trap 12: page fault while in kernel mode
 
 fault virtual address = 0xc33a3c6d
 
 fault code = supervisor read, page not present
 
 Instruction Pointer  = 0x8:0xc022798F
 

You have to post more information. For example, what is at the
  location pointed at by the instruction pointer? Get a stack trace, if
  possible (from the debugger), and any other relevant info., most of which
  is explained in the Handbook.
  

--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]







Linux Module problems

2000-05-26 Thread Bosko Milekic



On my -CURRENT machine,

FreeBSD jehovah 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Sat May 13 15:11:13 EDT
2000 root@jehovah:/usr/src/sys/compile/JEHOVAH  i386

(obviously a little out-dated), I have recently noticed unusual
  problems with the linux module which, by the way, is of the same date.
The first problem came up while building the
  StarOffice5 port. After checking the dependency for linux's libc5, it
  _spontaneously_ reboots. No panic(), hence no debugger. I've never seen
  this sort of behavior before and have no idea what could have caused it.
However, I noticed a related incident, which I can reproduce. What I
  did was, for kicks, kldunload linux, and then make install the
  staroffice5 port, and this time, I got a page fault and panic() from
  within malloc, which was trying to move something located at an address
  on an unmapped page to a register. I can reproduce this easily at the
  moment, with the following:

  #!/bin/sh
  while true; do
   kldload linux;
   kldunload linux;
  done

  A quick kldunload linux followed by a quick kldload linux does it on the
  first iteration.

  What's more odd is that now, after panic()ing the machine a couple of
  times with the above, I can reproduce the spontaneous reboot easily too,
  by just starting up linux Netscape!

  At the moment, I cvsup-ed new sources, and am rebuilding world and a
  fresh new kernel, at which point I'll try to reproduce this again. I
  remember seeing this in earlier -CURRENT, too, just never got around to
  playing with it. Anyone?

--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]







Re: Linux Module problems

2000-05-26 Thread Bosko Milekic


On Fri, 26 May 2000, Alain Thivillon wrote:

   I had the same problem with all statically linked Linux
 binaries, including rpm. I guess that the loader does not recognize them as
 Linux, launches them as FreeBSD static binaries, and one of the syscalls is
 mapped to halt() (for example, if I don't launch rpm as root, I get
 "Segmentation violation" instead of a reboot).

I just re-cvsuped and rebuilt everything, and I am still having the
  same problem. In fact, I've noticed something else: after the reboot, the
  _time_ (not the date, though) is shifted, generally by +4 hours. I have
  no idea why this would be happening.

 
   As explained in /usr/src/UPDATING, you have to rebrand them:
 
 brandelf -t Linux static-binary
 
 The first candidate (and I think this explains your problem)
 is of course /compat/linux/sbin/ldconfig.

Am giving it a shot. 

 
 -- 
 Alain Thivillon -+- [EMAIL PROTECTED] -+- Hervé Schauer Consultants
 
 


--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]







Re: Linux Module problems

2000-05-26 Thread Bosko Milekic


  As explained in /usr/src/UPDATING, you have to rebrand them:
  
  brandelf -t Linux static-binary
  
  The first candidate (and I think this explains your problem)
  is of course /compat/linux/sbin/ldconfig.
 
   Am giving it a shot. 
 

This worked. Thanks!

--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]







socket leak(s)...

2000-05-18 Thread Bosko Milekic



I'm afraid my earlier message was incorrect.
This is _still_ an issue...

(the always_keepalive is unrelated)

... apologies; tomorrow will be a back-to-poking-around day.



--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]






Re: leaking sockets (closure)

2000-05-17 Thread Bosko Milekic


Hi Mike,

On Wed, 17 May 2000, Mike Silbersack wrote:

 Heh, that's sorta neat, I guess.  It'll be interesting to find out if the
 leak is due to the mbuf waiting in some way, or a totally unrelated bug
 we're tickling.  I'd almost guess the latter.

I finally peeked at the tcp_timer stuff and quickly realized:

`grep keepalive /etc/defaults/rc.conf'

or, equivalently,

`sysctl -A | grep keepalive'

should quickly make things clear... :-)

Notice the explicit initialization of always_keepalive to zero in
  tcp_timer.c, which is what tripped me up at first glance.

(I have re-simulated the exhaustion and all seems fine).

 -Bosko

--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]

 "Give a man a fish and he will eat for a day. Teach him how
  to fish, and he will sit in a boat and drink beer all day."







Re: What do people think of maybe using the sourceforge software?

2000-05-16 Thread Bosko Milekic


On Tue, 16 May 2000, Nick Hibma wrote:

 
 I guess that most people leading a project could do with a bit of
 feature creep, features being shoved under their noses. Even if at first
 you think that source control solves all our problems, it still could be
 a way to develop new tools and get them running and tried out before
 committing them to the tree.
 
 Second, the projects page we have now, with all due respect to the
 people that try to keep it reasonably organised, is a mess due to the
 lack of updates. People only maintain their project pages perhaps, but
 certainly not the links that lead to them. 
 
 Being able to work with more people on the same project on an equal
 basis would be a good idea IMHO.
 
 Nick
 

Although I have no control over what goes on behind the curtains, I
  must say the following:

My feeling is that a lot of the doc people are working really hard to
  make this sort of stuff happen. I know, for instance, that Jeroen
  (Asmodai) has great ideas in place for centralization of project
  listings, and TODO lists, etc. The only thing left is to bind these ideas
  together and make things like this happen. One of the big issues, I feel,
  is the duplication of efforts and I, as a "guy who develops from the
  sidelines" can tell you right now: a centralized information-base such as
  the one [I believe] these people are working on is key to what I choose to 
  poke at next. Please remember that a lot of people who contribute to the
  project are not necessarily committers and do not read -committers mail.
  The centralization of documentation and various other data will make
  collaboration possible and, best of all, it'll make it fun (which is what
  open source is about for many of us).
With the centralization of information will come direction.

  Cheers,
  Bosko.

--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]

 "Give a man a fish and he will eat for a day. Teach him how
  to fish, and he will sit in a boat and drink beer all day."







Regarding PR 5877: sb_cc issues.

2000-04-30 Thread Bosko Milekic


As I haven't seen [other] feedback yet to this PR posted by Bill Fenner,
  I'm curious as to what people's opinions are. (I should have cross-posted
  to -hackers when I first replied to it, as I believe that I've seen this
  mentioned previously, but not getting many replies).

--
 Bosko Milekic * pages.infinit.net/bmilekic/index.html * www.technokratis.com
 [EMAIL PROTECTED] * [EMAIL PROTECTED] * [EMAIL PROTECTED]

 "Give a man a fish and he will eat for a day. Teach him how
  to fish, and he will sit in a boat and drink beer all day."







Comments above kmem_malloc() (vm/vm_kern.c)

2000-03-25 Thread Bosko Milekic



 Is the following comment above kmem_malloc()'s definition in:
/sys/vm/vm_kern.c
 ... still valid? (I hope and suspect not):

 " *  Note that this still only works in a uni-processor environment and
   *  when called at splhigh(). "

 The only places, as far as I've seen, that call kmem_malloc are the
 kernel's malloc() and the mbuf allocation routines. Neither of these
 seems to do it at splhigh(), either.

--Bosko Milekic [EMAIL PROTECTED]







Re: Where is pci_intr_establish() _thread_sys_read()?

2000-03-06 Thread Bosko Milekic



On Mon, 6 Mar 2000, Chris Costello wrote:

On Monday, March 06, 2000, Zhihui Zhang wrote:
 Can anyone tell me where is the code for pci_intr_establish() and
 _thread_sys_read()? I could not find them under /usr/src.

   I can tell you offhand that _thread_sys_anything is the _real_
syscall for `anything'.  This is because a lot of syscalls are
reimplemented within libc_r for reasons that are kind of obvious
(directly calling the read syscall from one thread would block
all the other threads in a process).  So _thread_sys_open() ==
open(2), _thread_sys_read() == read(2), etc.

   I don't know about pci_intr_establish.

-- 
|Chris Costello [EMAIL PROTECTED]
|Today's assembler command :  EXOP   Execute Operator
`


 pci_intr_establish is not part of FreeBSD's interface(s), as far as I
know.

 This probably belongs to either NetBSD or OpenBSD (since the drivers that use
this routine to set up an interrupt use it under #if defined(__OpenBSD__)
or __NetBSD__ blocks). See our bus interface code (e.g. bus_if.[ch]).

  --Bosko

 ..
  Bosko Milekic * [EMAIL PROTECTED] * http://pages.infinit.net/bmilekic/
  Montreal, Quebec, Canada. *  Technokratis:  http://www.technokratis.com/







rtfree panic() (fwd)

2000-02-07 Thread Bosko Milekic


 Hmmm. Judging from the last CVS log entry for route.c (See r1.59), this
 problem can manifest itself in -current as well. I'm cross posting on the
 initial send, but please, when replying, redirect to the [single, truly]
 appropriate list.

 It *appears* that rtfree() is puking because the rnh pointer is somehow (?)
 NULL. The rt_tables tree for the given address family either doesn't hold
 what's being looked for, or there's a problem with the rt_key macro.

 In any case, I'm not that comfortable with this part of the code yet, so
 if some route.c guru with radix tree know-how could take a look at this, I
 think that Shrihari (along with the others who have experienced
 this?) would appreciate it. 

 Note that if more info is needed, request it at Shrihari's address
 below.

 -
| Bosko Milekic   | Coffee vector: 1.0i+1.0j+1.0k |
| Email: [EMAIL PROTECTED]  | Sleep vector: -1.0i-1.0j-1.0k |
| WWW: http://pages.infinit.net/bmilekic/ | Resulting life: 0i+0j+0k (DNE)|
 -


-- Forwarded message --
Date: Mon, 7 Feb 2000 16:53:28 -0500
From: Shrihari Pandit [EMAIL PROTECTED]
To: Bosko Milekic [EMAIL PROTECTED]
Subject: rtfree panic()

Hey there.

I was hoping you might be able to give us a hand with this problem:

We have a couple of machines that run FreeBSD 3.4-STABLE and they are
panicking randomly in rtfree().  These systems contain over 70,000 routes
in the routing table.

IdlePTD 2600960
initial pcb at 210e34
panicstr: rtfree
panic messages:
---
panic: rtfree

syncing disks... done

dumping to dev 20001, offset 0
dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 
235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 
214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 
193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 
172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 
151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 
130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 
109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 
84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 
55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 
26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:285
285 in ../../kern/kern_shutdown.c
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:285
#1  0xc012b8e4 in at_shutdown (
function=0xc01f20a2 <__set_sysctl__debug_sym_sysctl___debug_if_tun_debug+934>, 
arg=0x3, queue=-1071640608) at ../../kern/kern_shutdown.c:446
#2  0xc016650f in rtfree (rt=0xc2413000) at ../../net/route.c:206
#3  0xc01668f7 in rtrequest (req=2, dst=0xc243de00, gateway=0xc243de10,
netmask=0xc2415670, flags=3, ret_nrt=0x0) at ../../net/route.c:509
#4  0xc016be81 in in_ifadownkill (rn=0xc243e800, xap=0xc0201038)
at ../../netinet/in_rmx.c:390
#5  0xc0165d68 in rn_walktree (h=0xc23f4200, f=0xc016be4c <in_ifadownkill>,
w=0xc0201038) at ../../net/radix.c:959
#6  0xc016bec8 in in_ifadown (ifa=0xc23fb500) at ../../netinet/in_rmx.c:410
#7  0xc016ff7f in rip_ctlinput (cmd=0, sa=0xc23fb548, vip=0x0)
at ../../netinet/raw_ip.c:396
#8  0xc014148d in pfctlinput (cmd=0, sa=0xc23fb548)
at ../../kern/uipc_domain.c:265
#9  0xc015e343 in if_unroute (ifp=0xc023ebc4, flag=1, fam=0)
at ../../net/if.c:414
#10 0xc015e3cf in if_down (ifp=0xc023ebc4) at ../../net/if.c:449
#11 0xc01e7308 in etp_linkdown ()
#12 0xc01e97ab in cisco_keepalive ()
#13 0xc01e9ba8 in cisco_notify ()
#14 0xc01ecbad in etp_notify ()
---Type <return> to continue, or q <return> to quit---
#15 0xc01e91e4 in hdlc_rcvhandler ()
#16 0xc01cf35e in l3_rcvhandler ()
#17 0xc01c857d in lind_event ()
#18 0xc01e7445 in hdlc_timeout ()
#19 0xc0130112 in softclock () at ../../kern/kern_timeout.c:132

---







Re: Acceptable MBUF levels?

2000-01-29 Thread Bosko Milekic


On Fri, 28 Jan 2000, Doug White wrote:

That would be correct, at least looking at the appropriate code in
/sys/kern/uipc_mbuf.c.  The read-only sysctls kern.ipc.nmbclusters and
kern.ipc.nmbufs hold the max mbuf clusters and the max mbufs, respectively.
kern.ipc.nmbufs is bound to an nmbufs value in there, but I can't figure
out what value it's initialized to.

`nmbufs' is actually NMBCLUSTERS * 4, unless a value is fetched from
  the environment (see `loader'). A similar initialization is done for
  `nmbclusters,' only nmbclusters defaults to NMBCLUSTERS unless something
  else is provided through the getenv() call (see `TUNABLE_INT_DECL').

Increasing maxusers has the side effect of increasing NMBCLUSTERS
according to this formula (from /sys/conf/param.c):

#ifndef NMBCLUSTERS
#define NMBCLUSTERS (512 + MAXUSERS * 16)
#endif

You only have to override NMBCLUSTERS by hand if you want a truly gigantic
(i.e. > 10,000) number of nmbclusters.  Just be VERY CAREFUL doing so
since you can *reduce* the number, and that's not good!

From personal experience, 512 maxusers and 16384 nmbclusters is more than
enough for just about anything -- just make sure you can handle a 17MB
kernel. :-)

Yes, that's exactly right. Good thing you pointed it out too. :-)
  However, increasing MAXUSERS also ends up increasing other global
  parameters in the kernel, so you could end up with a rather large kernel
  when all you really want to do is increase NMBCLUSTERS, and nothing else.
But yeah, your point is very valid.
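
As a worked illustration of the defaults discussed in this message, the
sketch below evaluates the param.c formula and mimics a tunable that falls
back to a compiled-in default. It is a userland approximation, not the
kernel's TUNABLE_INT_DECL: nmbclusters_default(), parse_tunable() and
tunable_int() are hypothetical helpers, and whatever variable name is
passed to getenv() is the caller's choice.

```c
#include <stdlib.h>

/* Default from /sys/conf/param.c as quoted in this thread. */
static int
nmbclusters_default(int maxusers)
{
	return (512 + maxusers * 16);
}

/*
 * Parse a tunable string.  Anything that is missing, non-numeric,
 * non-positive or too large falls back to the default.
 */
static int
parse_tunable(const char *s, int defval)
{
	char *end;
	long v;

	if (s == NULL || *s == '\0')
		return (defval);
	v = strtol(s, &end, 10);
	if (*end != '\0' || v <= 0 || v > 0x7fffffffL)
		return (defval);
	return ((int)v);
}

/* Look the tunable up in the environment, the way the loader provides it. */
static int
tunable_int(const char *name, int defval)
{
	return (parse_tunable(getenv(name), defval));
}
```

With MAXUSERS 32 the formula gives 512 + 32 * 16 = 1024 clusters and, by
the NMBCLUSTERS * 4 rule above, 4096 mbufs, unless an override is supplied.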

Cheers,
Bosko.

 -
| Bosko Milekic   | Coffee vector: 1.0i+1.0j+1.0k |
| Email: [EMAIL PROTECTED]  | Sleep vector: -1.0i-1.0j-1.0k |
| WWW: http://pages.infinit.net/bmilekic/ | Resulting life: 0i+0j+0k (DNE)|
 -







Re: Acceptable MBUF levels?

2000-01-26 Thread Bosko Milekic


On Wed, 26 Jan 2000, Doug White wrote:

When people refer to mbufs, they refer to mbuf clusters, of which there's
a fixed number.  The kernel will allocate more mbufs as necessary.

Uhm, actually, mbufs are also allocated from mb_map. Thus, they are
  also capped. (Unless I'm missing something big again... :-) )

The usual rule of thumb is that the peak should never exceed 75% of the
max mbufs in the system to allow for sufficient overhead in extreme
situations.  In this case you're at 80%, so you should probably recompile
your kernel and bump maxusers.

Actually, for mbufs and mbuf clusters, you should increase
  NMBCLUSTERS, which will serve as an indication of allocate-able clusters
  as well as, ultimately, mbufs.

--
 Bosko Milekic
 Email:  [EMAIL PROTECTED]







Re: PR kern/14034: gettimeofday() returns negative value?

2000-01-19 Thread Bosko Milekic


On Wed, 19 Jan 2000, Sabrina Minshall wrote:

What's going on here? Successive calls to gettimeofday 
yield negative elapsed time?

Any fixes?

[ code snipped ]

  Well, the PR considers a different problem. What your code does is call
  gettimeofday() once, record the value, and then a little later, call it
  again while proceeding to calculate a delta between the latter and
  previous results. Note that the issue mentioned in the PR has been
  attributed to faulty hardware.

  Now, I assure you, this is a problem with your code snippet. I tried this
  code on a DEC box running:

  OSF1 oracle.dsuper.net V4.0 1091 alpha

  And got the exact same results.

  The problem is the tv1 = tv2 structure equality. Since the byte order is
  different, you get your usec from tv1 ending up in tv2's usec field.
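
Since the original code was snipped, the following is only a generic sketch
of how an elapsed time is usually computed from two gettimeofday() samples;
tv_elapsed_usec() is a made-up helper and not part of the poster's program.
Subtracting the tv_usec fields alone goes negative whenever the second
sample has crossed a second boundary, so both fields are folded together.

```c
#include <stdint.h>
#include <sys/time.h>

/* Microseconds from *start to *end; negative if end precedes start. */
static int64_t
tv_elapsed_usec(const struct timeval *start, const struct timeval *end)
{
	int64_t sec = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
	int64_t usec = (int64_t)end->tv_usec - (int64_t)start->tv_usec;

	return (sec * 1000000 + usec);
}
```

A caller would take one sample with gettimeofday(&t1, NULL), do the work,
take a second sample into t2, and pass &t1 and &t2 in that order.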

  Regards,
  Bosko.


--
 Bosko Milekic
 Email:  [EMAIL PROTECTED]







Re: splimp for PCI

1999-12-20 Thread Bosko Milekic


On Mon, 20 Dec 1999, Alex wrote:

!This message was sent from Geocrawler.com by "Alex" [EMAIL PROTECTED]
!Be sure to reply to that address.
!
!Hello,
!I'm a little confused about using splimp/splx in a driver
!that supports a PCI board. The IRQ is shared for PCI.
!Can using splimp cause problems?
!Thanks a lot
!Alex
!
!Geocrawler.com - The Knowledge Archive

Here's my `first glance' shot at an answer: [ ;-) ]

Nope, using splimp() will not cause problems in terms of shared IRQs.
  However, it may cause problems if you're blocking interrupt handlers of
  priorities that you don't want to.
Shared IRQs are dealt with thanks to a linked list of handlers for
  each shared IRQ, all of which end up being called when at least
  one of the devices "registered" for that IRQ asserts an interrupt request. Each
  device, however, is "registered" with its own 'mask,' and this mask should
  correspond to one of several given values, where _handlers_ executing at
  that priority level will be blocked as per the present priority level.
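
The handler-list idea can be sketched in plain C as follows. This is a
simplified userland model, not the code in intr_machdep.c: struct
intr_handler, struct irq_line, irq_register(), irq_dispatch() and
count_hit() are invented names, and the per-device mask and priority
handling described above is deliberately left out.

```c
#include <stddef.h>

typedef void (*intr_fn)(void *arg);

/* One entry per device registered on a shared IRQ line. */
struct intr_handler {
	intr_fn fn;
	void *arg;
	struct intr_handler *next;
};

struct irq_line {
	struct intr_handler *head;
};

/* Link a caller-provided handler record onto the line. */
static void
irq_register(struct irq_line *line, struct intr_handler *h,
    intr_fn fn, void *arg)
{
	h->fn = fn;
	h->arg = arg;
	h->next = line->head;
	line->head = h;
}

/*
 * The line cannot tell which device raised the interrupt, so every
 * registered handler is called and each one checks its own hardware.
 * Returns the number of handlers invoked.
 */
static int
irq_dispatch(const struct irq_line *line)
{
	const struct intr_handler *h;
	int n = 0;

	for (h = line->head; h != NULL; h = h->next) {
		h->fn(h->arg);
		n++;
	}
	return (n);
}

/* Example handler: bump a per-device counter passed in as arg. */
static void
count_hit(void *arg)
{
	(*(int *)arg)++;
}
```

Registering two devices with count_hit and dispatching once runs both
handlers, which is why each real handler has to check whether its own
device actually raised the interrupt.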
If you haven't done so already, my suggestion is to take a look
  at spl(9) [e.g. `man 9 spl'], and, if still interested, to take a look at
  *some* of i386/isa/intr_machdep.c as well as most of
  sys/i386/i386/nexus.c -- which brings up the question: Is anybody
  _currently_ working on cleaning this stuff up, and completely getting rid
  of the remains of the "old" interface?

  Bosko.

 .. . . . . .  .   .    .  .. . .
 . Bosko Milekic  --  [EMAIL PROTECTED]   .   .
 .   .  .  ..  . . . . ..   .
 . WWW: http://pages.infinit.net/bmilekic/  . 
 
 






Re: ip checksum

1999-11-22 Thread Bosko Milekic

On Tue, 23 Nov 1999, Parthasarathy M. Aji wrote:

!Hey,
!
!I am trying to recompute the checksum of an IP packet. I use
!netinet/in_chksum.c to do this. The values returned are not correct. I've
!reset the ip_sum field to 0 before doing the sum. Is there something
!missing?   
!
!thanks
!
!


Would you be able to provide some code to illustrate the situation?
  There are several things that may go wrong. What exactly are you trying
  to do here? (You may be using the wrong procedure) and what are you
  getting for return values?
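
For comparison, here is a minimal standalone sketch of the standard
Internet checksum (RFC 1071) over an IP header. It is not the kernel's
in_cksum(), which operates on mbuf chains; inet_cksum() is a hypothetical
helper that works on a flat byte buffer and returns the checksum as a
host-order number.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Internet checksum (RFC 1071) over `len` bytes.  The checksum field
 * inside the buffer must be zero when computing a new checksum; when
 * verifying, leave it in place and expect a result of 0.
 */
static uint16_t
inet_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	while (len > 1) {
		/* Add the next big-endian 16-bit word. */
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint32_t)buf[0] << 8;	/* pad odd trailing byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* fold carries */
	return ((uint16_t)~sum);
}
```

Common causes of a wrong value are summing over more than the header
length, forgetting to zero ip_sum first, or storing the result with the
bytes swapped; the high byte of the returned number goes into the header
first.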

  --Bosko

--
  Bosko Milekic [EMAIL PROTECTED]

  "I want now to tell you, gentlemen, whether you care to hear it or not,
  why I could not even become an insect." --F. Dostoyevski







Re: PCI DMA lockups in 3.2 (3.3 maybe?)

1999-11-21 Thread Bosko Milekic


On Mon, 22 Nov 1999, Dennis wrote:

!It's a late 3.2-STABLE, so it's not that old. Surely someone knows if
!something in this area was fixed or not?
!
!Since it's a DMA lockup, how would you suggest that the information about
!what instruction was executing be obtained?
!
!The nightmare of instability of 3.x continues whilst the braintrust flogs
!away at 4.x. It's really a damn shame. And why is 3.x so much slower than
!2.2.8?  Will 4.0 be slower yet?
!
!DB 
!

Can you quantify how much slower the 3.x code is? What exactly is slower
about it? A lot of people are willing to help, but without concrete
information there is little they can do.

In the meantime, did you happen to get a chance to reproduce the
problem in 3.3-STABLE? It appears from your description of the problem
that it's somewhat tougher to debug, and knowing whether 3.3 remedies the
problem would be of some help.


--
 Bosko Milekic [EMAIL PROTECTED]







mbuf wait code (revisited) -- review?

1999-11-11 Thread Bosko Milekic


Hi,

Attached are some diffs that provide a couple of wait routines for the
out-of-mbuf and/or out-of-mbuf-cluster case(s). The attached diffs are for -STABLE,
and I would be grateful if somebody could review them and give feedback. I have diffs for
-CURRENT but am not posting them because I haven't had much of a chance to test
that code, whereas the diffs below have been tested for a while now on several -STABLE
machines.
Since the problematic situation has been described numerous times before both
on the list and in several PRs, I am not going to go over it again. Instead, a
[fairly] accurate description of the situation in question can be found at:

http://www.freebsd.org/cgi/query-pr.cgi?pr=14042

[Note that the patches posted in the original PR should not be considered.]

There are several other open PRs which refer to a similar problem.
Furthermore, I've also spotted at least one other PR which addresses a potentially
related issue:

http://www.freebsd.org/cgi/query-pr.cgi?pr=9883

The above PR mentions MGET turning the provided mbuf pointer to a NULL
pointer even if the call was made with M_WAIT. I don't see how this can be the case,
especially since presently the code is set to panic() in the m_reclaim when out of
mbufs and calling with M_WAIT. In any case, with the code below, MGET will
potentially be capable of setting that NULL pointer, which is something that really
can't be avoided even if the call is made with M_WAIT. The whole idea behind the
provided diffs is to add a 'sleep' time before actually deciding to explicitly
"fail". This sleep time can be tuned at run time: the diffs add a sysctl,
kern.ipc.mbuf_wait, which sets the timeout passed to tsleep().
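
The "wait a bounded time, then fail" idea can be sketched in user space as
follows. This is only an illustration under loud assumptions, not the code from
the diffs: struct buf_pool, pool_get_wait() and the tick callback are all
hypothetical names, and the sketch polls once per simulated tick, whereas the
kernel version would tsleep() and be woken up when a buffer is freed. max_wait
plays the role of the mbuf_wait tunable.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

struct buf {
    int in_use;
};

struct buf_pool {
    struct buf bufs[POOL_SIZE];
    unsigned ticks_slept;           /* total simulated ticks spent waiting */
};

/* Stand-in for "one tick passes"; it may return buffers to the pool. */
typedef void (*tick_fn)(struct buf_pool *);

/* Non-blocking allocation: NULL when the pool is exhausted. */
static struct buf *
pool_get_nowait(struct buf_pool *p)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < POOL_SIZE; i++) {
        if (!p->bufs[i].in_use) {
            p->bufs[i].in_use = 1;
            return &p->bufs[i];
        }
    }
    return NULL;
}

static void
pool_put(struct buf *b)
{
    b->in_use = 0;
}

/*
 * Waiting allocation: retry for at most max_wait ticks, then give up
 * and return NULL. Callers must therefore check the result even when
 * they asked to wait.
 */
static struct buf *
pool_get_wait(struct buf_pool *p, unsigned max_wait, tick_fn tick)
{
    struct buf *b;
    unsigned waited;

    for (waited = 0; (b = pool_get_nowait(p)) == NULL; waited++) {
        if (waited == max_wait)
            return NULL;            /* timed out: explicit failure */
        p->ticks_slept++;
        tick(p);                    /* stand-in for a one-tick sleep */
    }
    return b;
}

/* Example tick callback: frees bufs[0] once the third tick has elapsed. */
static void
free_on_third_tick(struct buf_pool *p)
{
    if (p->ticks_slept == 3)
        pool_put(&p->bufs[0]);
}
```

With an exhausted pool and free_on_third_tick as the callback, a max_wait of 2
gives up and returns NULL, while a max_wait of 32 succeeds after three ticks.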

Anyway, I would really appreciate feedback and/or suggestions. Furthermore,
if anybody's interested in testing it, I can post the -CURRENT version of the diffs
(which are only slightly different). Finally, if this looks good, the next step
would be to dig through all the code that uses the MGET, MGETHDR, MCLGET, MCLALLOC
macros and the m_get, m_gethdr, m_clalloc functions in order to make sure that all
of that code checks whether the returned pointer is NULL (most of the problematic
code resides in sys/nfs, from what I've seen).


--
 Bosko Milekic [EMAIL PROTECTED]

"I counted the steps in my walks and calculated the cubic contents of soup
plates, coffee cups, and pieces of food -- otherwise my meal was unenjoyable. All
repeated acts or operations I performed had to be divisible by three and if I missed
I felt impelled to do it all over again, even if it took hours."
   -- Nikola Tesla, 1919. 



(Note: If the diffs below generate problems, please let me know and I'll post this
stuff somewhere on the WWW).
--snip snip--

diff -ruN sys.old/conf/param.c sys/conf/param.c
--- sys.old/conf/param.c	Sun Oct 31 23:34:16 1999
+++ sys/conf/param.c	Mon Nov  1 20:07:46 1999
@@ -82,6 +82,7 @@
 int	maxfiles = MAXFILES;	/* system wide open files limit */
 int	maxfilesperproc = MAXFILES;	/* per-process open files limit */
 int	ncallout = 16 + NPROC + MAXFILES;	/* maximum # of timer events */
+int	mbuf_wait = 32;	/* mbuf sleep time */

 /* maximum # of mbuf clusters */
 #ifndef NMBCLUSTERS
diff -ruN sys.old/kern/uipc_mbuf.c sys/kern/uipc_mbuf.c
--- sys.old/kern/uipc_mbuf.c	Wed Sep  8 20:45:50 1999
+++ sys/kern/uipc_mbuf.c	Fri Nov  5 21:44:51 1999
@@ -47,6 +47,10 @@
 #include <vm/vm_kern.h>
 #include <vm/vm_extern.h>

+#ifdef INVARIANTS
+#include <machine/cpu.h>
+#endif
+
 static void mbinit __P((void *));
 SYSINIT(mbuf, SI_SUB_MBUF, SI_ORDER_FIRST, mbinit, NULL)

@@ -60,6 +64,8 @@
 int	max_hdr;
 int	max_datalen;

+static u_int m_mballoc_wid = 0, m_clalloc_wid = 0;
+
 SYSCTL_INT(_kern_ipc, KIPC_MAX_LINKHDR, max_linkhdr, CTLFLAG_RW,
   &max_linkhdr, 0, "");
 SYSCTL_INT(_kern_ipc, KIPC_MAX_PROTOHDR, max_protohdr, CTLFLAG_RW,
@@ -67,13 +73,14 @@
 SYSCTL_INT(_kern_ipc, KIPC_MAX_HDR, max_hdr, CTLFLAG_RW, &max_hdr, 0, "");
 SYSCTL_INT(_kern_ipc, KIPC_MAX_DATALEN, max_datalen, CTLFLAG_RW,
   &max_datalen, 0, "");
+SYSCTL_INT(_kern_ipc, OID_AUTO, mbuf_wait, CTLFLAG_RW,
+  &mbuf_wait, 0, "");
 SYSCTL_STRUCT(_kern_ipc, KIPC_MBSTAT, mbstat, CTLFLAG_RW, &mbstat, mbstat, "");
 
 static void	m_reclaim __P((void));
 
 /* "number of clusters of pages" */
 #define NCL_INIT	1
-
 #define NMB_INIT	16
 
 /* ARGSUSED*/
@@ -125,6 +132,9 @@
 * any more (nothing is ever freed back to the map) (XXX which
 * is dumb). (however you are not dead as m_reclaim might
 * still be able to free a substantial amount of space).
+* XXX Furthermore, we can also work with "recycled" mbufs (when
+* we're calling with M_WAIT

Re: mbuf shortage situations (followup)

1999-09-13 Thread Bosko Milekic



!I think that what needs to be done is to split the problem in two.  First,
!allow the mbuf routines to return a failure even with M_WAIT.  If M_WAIT
!is used, it simply means 'try harder, sleeping a bit if necessary'.  This 
!requires ensuring that all the networking code deal with the failure
!case - a time consuming but straightforward task.  If a failure occurs,
!one simply drops the packet, not the connection or anything else drastic.
!just the packet.

Yes, this is mainly the part I've been working on recently. The
sleeping and whatnot (as I'm sure you've seen from the patches, if you
looked at them) has already been completed. Adding a counter that will
expire and return a pre-defined error is trivial in this case.

The only real issue here (if we can call it that) is getting _all_ the
networking code to recognize this. Anyone want to help? :-)
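
To make "recognizing this" concrete, here is a rough user-space sketch of the
caller-side pattern: check the allocation result even when waiting was
requested and, on failure, drop only the packet and report an error. Everything
in it is hypothetical (alloc_fn, budget_alloc, handle_packet, struct pkt_stats);
the allocator merely stands in for MGET()/m_get() called with M_WAIT, and
ENOBUFS is just one plausible choice for the pre-defined error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct pkt_stats {
    unsigned long delivered;
    unsigned long dropped;
};

/* Hypothetical allocator type standing in for MGET()/m_get() with M_WAIT. */
typedef void *(*alloc_fn)(void *ctx);

/*
 * Toy allocator: succeeds while *budget > 0, then fails the way an
 * exhausted pool would after its wait has expired.
 */
static void *
budget_alloc(void *ctx)
{
    static char storage[64];
    int *budget = ctx;

    assert(budget != NULL);
    if (*budget <= 0)
        return NULL;
    (*budget)--;
    return storage;
}

/*
 * Caller-side pattern: a NULL result is an expected outcome. Drop the
 * packet, count the drop, and return an error; nothing more drastic.
 */
static int
handle_packet(alloc_fn alloc, void *ctx, struct pkt_stats *st)
{
    void *m = alloc(ctx);

    if (m == NULL) {
        st->dropped++;
        return ENOBUFS;
    }
    ((char *)m)[0] = 0;             /* placeholder for real packet processing */
    st->delivered++;
    return 0;
}
```

The audit would then amount to checking that every call site has an equivalent
of the `m == NULL` branch.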


!
!The second problem that needs to be addressed is resource exhaustion.
!For example, allocating thousands of connections and socket-opting their
!buffers as large as possible, or programs such as syslog accepting new
!connections ad-infinitum.  This is a harder problem to fix properly,
!but a lot of the various issues such as those with syslog can be dealt 
!with in userland rather than the kernel.
!
!  -Matt
!

I agree. The issue here is somewhat related (if I understand your
explanation correctly) to [local] processes attempting to grab a lot of
socket buffer space. I was a little less concerned with this issue since,
as I previously mentioned, Brian Feldman is working on limiting socket
buffer space. Nonetheless, if we do not consider limiting, here's what I
believe will need to be done:

As explained above, when we run out of mbufs and/or mbuf
clusters (and some are needed) and the call is M_WAIT (when processes
setsockopt their buffers as large as possible, the call is usually made
with M_WAIT), we will end up tsleep()ing for certain periods of time, until
our counter expires and we return our pre-defined error (as mentioned
above). When we do return this error, however, the caller (for instance,
sosend(), which, if I remember correctly, is one of the callers of MGET()
when we setsockopt a large buffer and then write() to this socket) will
also have to know how to properly deal with this error (e.g.: kill the
process?).

Killing the process may seem somewhat sadistic to some ( :-) ),
but remember that if we do get to the point where 'normal' local processes
eat up so much buffer space that we run out, we should probably be
increasing NMBCLUSTERS and/or maxusers anyway.

As for script weenies, I hope that Brian (and whoever else may be
working on it) gets that sockbuf limiting code done, because, to be quite
honest, I don't think that script kids having to compromise more than one
account just so they can DoS a box will be much of an issue (if worst
comes to worst, we can limit per gid as opposed to per uid). With
exhaustion attacks such as these, we're better off just limiting.

Regards,
Bosko Milekic.







Re: mbuf shortage situations (followup)

1999-09-13 Thread Bosko Milekic



On Mon, 13 Sep 1999, Garrett Wollman wrote:

!On Sun, 12 Sep 1999 23:19:13 -0400 (EDT), Bosko Milekic [EMAIL PROTECTED] said:
!
!   This message is in MIME format.  The first part should be readable text,
!   while the remaining parts are likely unreadable without MIME-aware tools.
!   Send mail to [EMAIL PROTECTED] for more info.
!
!It would be preferable if text were sent as text, since MIME-encoded
!patches require more effort to read.
!

I definitely agree. This is obviously my mistake: I was
somewhat in a rush, very lagged (modem, eurgh), using pine, and made
several [dumb] typos in the 'Attachment' field.


! I'm also aware of the possibility of some people not liking the
! fact that we tsleep() forever (e.g. tsleep(x,x,x,0)). 
!
!
!I don't have any problem with sleeping forever -- but I am concerned
!about the possibility of deadlock, especially when client-NFS is
!involved.  If the problem just moves around and has harder-to-recover
!symptoms, the change isn't helping.

Well, the main purpose of the code is basically to sleep until
something is freed after we've already exhausted the mb_map arena (as I'm
sure you've seen if you were able to grab the attachments). This is
really borderline stuff. In other words, if 'normal' local programs are
having trouble because of mb_map exhaustion, then maxusers and nmbclusters
would have to be increased.

!
!The 4.3BSD code had two different behaviors:
!
!  - For clusters, if M_WAIT was specified and there was no space
!left in mb_map, it panicked.  However, m_clalloc was never called with
!M_WAIT, so that panic was effectively dead code.

Hmmm. If m_clalloc was never called with M_WAIT, then all the code
calling m_clalloc definitely checked its return value. It probably had
specific ways to deal with m_clalloc returning failures, too?

!
!  - For mbufs, if M_WAIT was specified and there were no mbufs
!available, it would sleep at PZERO - 1 (which was interruptible).
!
!In 4.3, the code was able to deal with cluster allocation failing.  We
!have a somewhat different situation now, because many network
!interface devices have less-flexible DMA mechanisms which don't allow
!packet reception into non-contiguous buffers, so we need to have at
!least a certain number of clusters available for this purpose.

Exactly. This is the next challenge. As for things being
interruptible, as I mentioned in a reply to Matt Dillon just a few
seconds ago, getting the tsleep to occasionally expire is trivial. As you
say above, it's dealing with the failure that is the issue.

!
!-GAWollman
!
!--
!Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
![EMAIL PROTECTED]  | O Siem / The fires of freedom 
!Opinions not those of| Dance in the burning flame
!MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick
!

Cheers,
Bosko Milekic.






