Re: mmap() question

2013-10-13 Thread Dmitry Sivachenko

On 12.10.2013, at 18:14, Konstantin Belousov kostik...@gmail.com wrote:

 
 First I tried with some swap space configured.  The OS started to swap out 
 my process after it reached about 20GB which is also not what I expected:  
 what is the reason to swap out regions of read-only mmap()ed files?  Is it 
 the expected behaviour?
 
 How did you concluded that the pages from your r/o mappings were paged out ?
 VM never does this.  Only anonymous memory could be written to swap file,
 including the shadow pages for the writeable COW mappings.  I suspect that
 you have another 20GB of something used on the machine meantime.
 


Yes, sorry, I tried again with swap space configured and it is really some 
other processes which are being swapped out:
sshd, other users' shells, etc.


 
 
 Below is the prototype patch, against HEAD.  It is not applicable to
 stable, please use HEAD kernel for test.
 



I tried your patch on a stable/10 system and I can confirm that my process is 
no longer killed because of OOM.


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: mmap() question

2013-10-12 Thread Konstantin Belousov
On Fri, Oct 11, 2013 at 09:57:24AM +0400, Dmitry Sivachenko wrote:
 
 On 11.10.2013, at 9:17, Konstantin Belousov kostik...@gmail.com wrote:
 
  On Wed, Oct 09, 2013 at 03:42:27PM +0400, Dmitry Sivachenko wrote:
  Hello!
  
  I have a program which mmap()s a lot of large files (total size more that 
  RAM and I have no swap), but it needs only small parts of that files at a 
  time.
  
  My understanding is that when using mmap when I access some memory region 
  OS reads the relevant portion of that file from disk and caches the result 
  in memory.  If there is no free memory, OS will purge previously read part 
  of mmap'ed file to free memory for the new chunk.
  
  But this is not the case.  I use the following simple program which gets 
  list of files as command line arguments, mmap()s them all and then selects 
  random file and random 1K parts of that file and computes a XOR of bytes 
  from that region.
  After some time the program dies:
  pid 63251 (a.out), uid 1232, was killed: out of swap space
  
  It seems I incorrectly understand how mmap() works, can you please clarify 
  what's going wrong?
  
  I expect that program to run indefinitely, purging some regions out of RAM 
  and reading the relevant parts of files.
  
  
  You did not specified several very important parameters for your test:
  1. total amount of RAM installed
 
 
 24GB
 
 
  2. count of the test files and size of the files
 
 To be precise: I used 57 files with size varied form 74MB to 19GB.
 The total size of these files is 270GB.
 
  3. which filesystem files are located at
 
 
 UFS @ SSD drive
 
  4. version of the system.
 
 
 FreeBSD 9.2-PRERELEASE #0 r254880M: Wed Aug 28 11:07:54 MSK 2013

I was not able to reproduce the situation locally. I even tried to start
a lot of threads accessing the mapped regions, to try to outrun the
pagedaemon. The user threads sleep on the disk read, while pagedaemon
has a lot of time to rebalance the queues. It might be a case when SSD
indeed makes a difference.

Still, I see how this situation could appear. The code which triggers
OOM never fires if there is free space in the swapfile, so the
absence of swap is a necessary condition to trigger the bug.  Next, the OOM
calculation does not account for the possibility that almost all pages on
the queues can be reused. It just fires if free pages are depleted too much
or the free target cannot be reached.

IMO one possible solution is to account for the queued pages in
addition to the swap space.  This is not entirely accurate, since some
pages on the queues cannot be reused, at least transiently.  The most precise
algorithm would count the held and busy pages globally and subtract
this count from the queue lengths, but it is probably too costly.

Instead, I think we could rely on the numbers which are counted by
pagedaemon threads during the passes.  Due to the transient nature of the
pagedaemon failures, this should be fine.
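
The reasoning above can be illustrated with a toy user-space model. This is
only a sketch under stated assumptions, not the kernel logic and not the
patch: the struct, its field names and both functions are hypothetical, and
"free pages depleted" is reduced to a single threshold comparison.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the counters the discussion refers to. */
struct vm_counts {
	unsigned long free_pages;	/* pages currently free */
	unsigned long free_min;		/* below this, memory counts as depleted */
	unsigned long swap_free;	/* free swap pages, 0 when no swap exists */
	unsigned long queued;		/* pages on the active and inactive queues */
	unsigned long sticky;		/* queued pages a pass could not process */
};

/* Described behaviour: free swap always suppresses OOM. */
bool
oom_fires_described(const struct vm_counts *c)
{
	if (c->swap_free > 0)
		return (false);
	return (c->free_pages < c->free_min);
}

/* Proposed idea: reusable queued pages count like free swap. */
bool
oom_fires_proposed(const struct vm_counts *c)
{
	unsigned long reusable;

	reusable = c->queued > c->sticky ? c->queued - c->sticky : 0;
	if (c->swap_free + reusable > 0)
		return (false);
	return (c->free_pages < c->free_min);
}
```

With no swap, few free pages and many reusable queued pages, the first
function reports OOM while the second does not, which is the case
described in this thread.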

Below is the prototype patch, against HEAD.  It is not applicable to
stable, please use HEAD kernel for test.

diff --git a/sys/sys/vmmeter.h b/sys/sys/vmmeter.h
index d2ad920..ee5159a 100644
--- a/sys/sys/vmmeter.h
+++ b/sys/sys/vmmeter.h
@@ -93,9 +93,10 @@ struct vmmeter {
u_int v_free_min;   /* (c) pages desired free */
u_int v_free_count; /* (f) pages free */
u_int v_wire_count; /* (a) pages wired down */
-   u_int v_active_count;   /* (q) pages active */
+   u_int v_active_count;   /* (a) pages active */
u_int v_inactive_target; /* (c) pages desired inactive */
-   u_int v_inactive_count; /* (q) pages inactive */
+   u_int v_inactive_count; /* (a) pages inactive */
+   u_int v_queue_sticky;   /* (a) pages on queues but cannot process */
u_int v_cache_count;/* (f) pages on cache queue */
u_int v_cache_min;  /* (c) min pages desired on cache queue */
u_int v_cache_max;  /* (c) max pages in cached obj (unused) */
diff --git a/sys/vm/vm_meter.c b/sys/vm/vm_meter.c
index 713a2be..4bb1f1f 100644
--- a/sys/vm/vm_meter.c
+++ b/sys/vm/vm_meter.c
@@ -316,6 +316,7 @@ VM_STATS_VM(v_active_count, Active pages);
 VM_STATS_VM(v_inactive_target, Desired inactive pages);
 VM_STATS_VM(v_inactive_count, Inactive pages);
 VM_STATS_VM(v_cache_count, Pages on cache queue);
+VM_STATS_VM(v_queue_sticky, Pages which cannot be moved from queues);
 VM_STATS_VM(v_cache_min, Min pages on cache queue);
 VM_STATS_VM(v_cache_max, Max pages on cached queue);
 VM_STATS_VM(v_pageout_free_min, Min pages reserved for kernel);
diff --git a/sys/vm/vm_page.h b/sys/vm/vm_page.h
index 7846702..6943a0e 100644
--- a/sys/vm/vm_page.h
+++ b/sys/vm/vm_page.h
@@ -226,6 +226,7 @@ struct vm_domain {
long vmd_segs;  /* bitmask of the segments */
boolean_t vmd_oom;
int vmd_pass;   /* local pagedaemon pass */
+   int vmd_queue_sticky;   /* pages on queues which cannot be processed */
struct vm_page vmd_marker; /* marker for pagedaemon private 

Re: mmap() question

2013-10-12 Thread Dmitry Sivachenko

On 12.10.2013, at 13:59, Konstantin Belousov kostik...@gmail.com wrote:
 
 I was not able to reproduce the situation locally. I even tried to start
 a lot of threads accessing the mapped regions, to try to outrun the
 pagedaemon. The user threads sleep on the disk read, while pagedaemon
 has a lot of time to rebalance the queues. It might be a case when SSD
 indeed makes a difference.
 


With an ordinary SATA drive it would take hours just to read 20GB of data from 
disk: because of the random access pattern it does a lot of seeks, and the read 
speed is extremely low.

An SSD dramatically improves the read speed.


 Still, I see how this situation could appear. The code, which triggers
 OOM, never fires if there is a free space in the swapfile, so the
 absense of swap is neccessary condition to trigger the bug.  Next, OOM
 calculation does not account for a possibility that almost all pages on
 the queues can be reused. It just fires if free pages depleted too much
 or free target cannot be reached.


First I tried with some swap space configured.  The OS started to swap out my 
process after it reached about 20GB which is also not what I expected:  what is 
the reason to swap out regions of read-only mmap()ed files?  Is it the expected 
behaviour?


 
 Below is the prototype patch, against HEAD.  It is not applicable to
 stable, please use HEAD kernel for test.


Thanks, I will test the patch soon and report the results.


Re: mmap() question

2013-10-12 Thread Konstantin Belousov
On Sat, Oct 12, 2013 at 04:04:31PM +0400, Dmitry Sivachenko wrote:
 
 On 12.10.2013, at 13:59, Konstantin Belousov kostik...@gmail.com wrote:
  
  I was not able to reproduce the situation locally. I even tried to start
  a lot of threads accessing the mapped regions, to try to outrun the
  pagedaemon. The user threads sleep on the disk read, while pagedaemon
  has a lot of time to rebalance the queues. It might be a case when SSD
  indeed makes a difference.
  
 
 
 With ordinary SATA drive it will take hours just to read 20GB of data from 
 disk because of random access, it will do a lot of seeks and reading speed 
 will be extremely low.
 
 SSD dramatically improves reading speed.
 
 
  Still, I see how this situation could appear. The code, which triggers
  OOM, never fires if there is a free space in the swapfile, so the
  absense of swap is neccessary condition to trigger the bug.  Next, OOM
  calculation does not account for a possibility that almost all pages on
  the queues can be reused. It just fires if free pages depleted too much
  or free target cannot be reached.
 
 
 First I tried with some swap space configured.  The OS started to swap out my 
 process after it reached about 20GB which is also not what I expected:  what 
 is the reason to swap out regions of read-only mmap()ed files?  Is it the 
 expected behaviour?
 
How did you conclude that the pages from your r/o mappings were paged out?
The VM never does this.  Only anonymous memory can be written to the swap file,
including the shadow pages for writeable COW mappings.  I suspect that
you have another 20GB of something in use on the machine in the meantime.

 
  
  Below is the prototype patch, against HEAD.  It is not applicable to
  stable, please use HEAD kernel for test.
 
 
 Thanks, I will test the patch soon and report the results.




Re: mmap() question

2013-10-11 Thread Dmitry Sivachenko

On 11.10.2013, at 9:17, Konstantin Belousov kostik...@gmail.com wrote:

 On Wed, Oct 09, 2013 at 03:42:27PM +0400, Dmitry Sivachenko wrote:
 Hello!
 
 I have a program which mmap()s a lot of large files (total size more that 
 RAM and I have no swap), but it needs only small parts of that files at a 
 time.
 
 My understanding is that when using mmap when I access some memory region OS 
 reads the relevant portion of that file from disk and caches the result in 
 memory.  If there is no free memory, OS will purge previously read part of 
 mmap'ed file to free memory for the new chunk.
 
 But this is not the case.  I use the following simple program which gets 
 list of files as command line arguments, mmap()s them all and then selects 
 random file and random 1K parts of that file and computes a XOR of bytes 
 from that region.
 After some time the program dies:
 pid 63251 (a.out), uid 1232, was killed: out of swap space
 
 It seems I incorrectly understand how mmap() works, can you please clarify 
 what's going wrong?
 
 I expect that program to run indefinitely, purging some regions out of RAM 
 and reading the relevant parts of files.
 
 
 You did not specified several very important parameters for your test:
 1. total amount of RAM installed


24GB


 2. count of the test files and size of the files

To be precise: I used 57 files with sizes varying from 74MB to 19GB.
The total size of these files is 270GB.

 3. which filesystem files are located at


UFS @ SSD drive

 4. version of the system.


FreeBSD 9.2-PRERELEASE #0 r254880M: Wed Aug 28 11:07:54 MSK 2013


Re: mmap() question

2013-10-10 Thread Konstantin Belousov
On Wed, Oct 09, 2013 at 03:42:27PM +0400, Dmitry Sivachenko wrote:
 Hello!
 
 I have a program which mmap()s a lot of large files (total size more that RAM 
 and I have no swap), but it needs only small parts of that files at a time.
 
 My understanding is that when using mmap when I access some memory region OS 
 reads the relevant portion of that file from disk and caches the result in 
 memory.  If there is no free memory, OS will purge previously read part of 
 mmap'ed file to free memory for the new chunk.
 
 But this is not the case.  I use the following simple program which gets list 
 of files as command line arguments, mmap()s them all and then selects random 
 file and random 1K parts of that file and computes a XOR of bytes from that 
 region.
 After some time the program dies:
 pid 63251 (a.out), uid 1232, was killed: out of swap space
 
 It seems I incorrectly understand how mmap() works, can you please clarify 
 what's going wrong?
 
 I expect that program to run indefinitely, purging some regions out of RAM 
 and reading the relevant parts of files.
 

You did not specify several very important parameters for your test:
1. total amount of RAM installed
2. count of the test files and size of the files
3. which filesystem the files are located on
4. version of the system.




Re: mmap() question

2013-10-09 Thread RW
On Wed, 9 Oct 2013 15:42:27 +0400
Dmitry Sivachenko wrote:

 Hello!
 
 I have a program which mmap()s a lot of large files (total size more
 that RAM and I have no swap), but it needs only small parts of that
 files at a time.
 
 My understanding is that when using mmap when I access some memory
 region OS reads the relevant portion of that file from disk and
 caches the result in memory.  If there is no free memory, OS will
 purge previously read part of mmap'ed file to free memory for the new
 chunk.
 
...

 It seems I incorrectly understand how mmap() works, can you please
 clarify what's going wrong?
 
 I expect that program to run indefinitely, purging some regions out
 of RAM and reading the relevant parts of files.

I think your problem is that you are accessing the memory so rapidly
that the pages can't even get out of the active queue. The VM system
isn't optimized for this kind of abnormal access. 
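
The test program itself is not included in this thread. The access pattern
it is described as using (map files read-only, then XOR the bytes of a 1K
region at some offset) could be sketched as below. The helper names
map_readonly() and xor_chunk() and the CHUNK constant are hypothetical,
chosen only for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define	CHUNK	1024	/* size of the region read per access */

/* Map a whole file read-only; returns NULL on failure. */
const unsigned char *
map_readonly(const char *path, size_t *len)
{
	struct stat st;
	void *p;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (NULL);
	if (fstat(fd, &st) == -1 || st.st_size == 0) {
		close(fd);
		return (NULL);
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return (NULL);
	*len = (size_t)st.st_size;
	return (p);
}

/* XOR up to CHUNK bytes of the mapping, starting at offset off. */
unsigned char
xor_chunk(const unsigned char *base, size_t len, size_t off)
{
	unsigned char x = 0;
	size_t end, i;

	if (off >= len)
		return (0);
	end = (len - off < CHUNK) ? len : off + CHUNK;
	for (i = off; i < end; i++)
		x ^= base[i];	/* touching the page faults it in from the file */
	return (x);
}
```

A driver would map every file named on the command line and then call
xor_chunk() in a loop with a randomly chosen mapping and offset.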


Re: another question

2013-07-01 Thread mdf
On Mon, Jul 1, 2013 at 5:42 PM, David Sanford
david.lee...@programmer.net wrote:

 Hi,

 Thanks for your responses to my first question. They were very helpful.

 In looking at the code, I ran across the functions setprogname and
 getprogname. According to the man page:
 In FreeBSD, the name of the program is set by the start-up code that is
 run before  *main*(); thus, running  *setprogname*() is not necessary.
 I'm confused by how this is done. Where is this start-up code defined?
 Is this included in all executables compiled on FreeBSD? Even the programs
 released under the GNU GPL?


I believe the code that does this is in lib/csu/common/ignore_init.c; see
handle_argv() and the use of __progname[].

This will run for anything that links against csu, which is anything
compiled on FreeBSD.  The same csu library sets the ABI note.tag, which
tells the kernel which syscall table to use when the binary is executed.
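
As a rough illustration of what such start-up code does with argv[0], the
following sketch derives a program name by keeping the part after the last
'/'. It is a portable approximation written for this note, not the csu
source; the function name derive_progname() is hypothetical.

```c
#include <stddef.h>

/*
 * Return a pointer to the last path component of argv0, or argv0
 * itself when it contains no '/'.  No memory is allocated; the
 * result points into the caller's string.
 */
const char *
derive_progname(const char *argv0)
{
	const char *name, *s;

	if (argv0 == NULL)
		return (NULL);
	name = argv0;
	for (s = argv0; *s != '\0'; s++) {
		if (*s == '/')
			name = s + 1;
	}
	return (name);
}
```

Called with "/usr/bin/ls" this returns a pointer to "ls"; a program would
then read the stored value through getprogname() rather than recomputing it.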

Cheers,
matthew


Re: stupid question about sendmail

2013-05-24 Thread Chris Rees
On 24 May 2013 08:34, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl
wrote:

 how to redirect recipient address. i mean - if someone try to send to
x...@y.pl from serwer then it should be redirected to local account, while the
rest of mails to domain @y.pl should get out normally.

 alternatively outgoing mail to x...@y.pl should be rejected.


 tried access.db -

 To:x...@y.pl REJECT

 doesn't work


 any idea. thank you

Try a sendmail list?

Chris


Re: stupid question about sendmail

2013-05-24 Thread Trond Endrestøl
On Fri, 24 May 2013 09:33+0200, Wojciech Puchar wrote:

 how to redirect recipient address. i mean - if someone try to send to 
 x...@y.pl
 from serwer then it should be redirected to local account, while the rest of
 mails to domain @y.pl should get out normally.
 
 alternatively outgoing mail to x...@y.pl should be rejected.
 
 tried access.db -
 
 To:x...@y.pl REJECT
 
 doesn't work
 
 any idea. thank you

Don't use /etc/mail/access, use /etc/mail/aliases.

E.g.:

x:  /dev/null

-- 
+---++
| Vennlig hilsen,   | Best regards,  |
| Trond Endrestøl,  | Trond Endrestøl,   |
| IT-ansvarlig, | System administrator,  |
| Fagskolen Innlandet,  | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,   | Cellular...: +47 952 62 567,   |
| sentralbord 61 14 54 00.  | Switchboard: +47 61 14 54 00.  |
+---++

Re: stupid question about sendmail

2013-05-24 Thread Wojciech Puchar


To:x...@y.pl REJECT

doesn't work

any idea. thank you


Don't use /etc/mail/access, use /etc/mail/aliases.

E.g.:

x:  /dev/null


x is NOT on my server. it will not work.

all i want is that when someone sends a mail from my server to x...@y.pl (which is 
someone else's domain) it does not get there and is blocked or redirected



Re: stupid question about sendmail

2013-05-24 Thread Trond Endrestøl
On Fri, 24 May 2013 09:55+0200, Wojciech Puchar wrote:

   
   To:x...@y.pl REJECT
   
   doesn't work
   
   any idea. thank you
  
  Don't use /etc/mail/access, use /etc/mail/aliases.
  
  E.g.:
  
  x:  /dev/null
 
 x is NOT on my server. it will not work.
 
 all i want is when someone send a mail from my server to x...@y.pl (which is
 someone else domain) it will not get there and be blocked or redirected

My bad, take a look at the /etc/mail/genericstable file:

http://www.sendmail.com/sm/open_source/docs/m4/features.html

Maybe a line like this one will help you achieve your goal:

j...@bar.com  error:5.7.0:550 Address invalid

-- 
+---++
| Vennlig hilsen,   | Best regards,  |
| Trond Endrestøl,  | Trond Endrestøl,   |
| IT-ansvarlig, | System administrator,  |
| Fagskolen Innlandet,  | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,   | Cellular...: +47 952 62 567,   |
| sentralbord 61 14 54 00.  | Switchboard: +47 61 14 54 00.  |
+---++

Re: stupid question about sendmail

2013-05-24 Thread Trond Endrestøl
On Fri, 24 May 2013 10:19+0200, Trond Endrestøl wrote:

 My bad, take a look at the /etc/mail/genericstable file:
 
 http://www.sendmail.com/sm/open_source/docs/m4/features.html
 
 Maybe a line like this one will help you achieve your goal:
 
 j...@bar.com  error:5.7.0:550 Address invalid

I was wrong again, sorry, but I believe I got it right this time:

1. Edit the /etc/mail/access file.

2. Insert a line like this one:

To:mail...@some.domain.tld REJECT

3. Save the /etc/mail/access file.

4. Change to the /etc/mail directory if not already there.

5. Run the make command to update the /etc/mail/access.db.

6. Try to send email to the blacklisted recipient.

7. If successful, the sender should receive:

reason: 550 5.2.1 mail...@some.domain.tld... Mailbox disabled for this 
recipient

-- 
+---++
| Vennlig hilsen,   | Best regards,  |
| Trond Endrestøl,  | Trond Endrestøl,   |
| IT-ansvarlig, | System administrator,  |
| Fagskolen Innlandet,  | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,   | Cellular...: +47 952 62 567,   |
| sentralbord 61 14 54 00.  | Switchboard: +47 61 14 54 00.  |
+---++

Re: stupid question about sendmail

2013-05-24 Thread Wojciech Puchar

all i want is when someone send a mail from my server to x...@y.pl (which is
someone else domain) it will not get there and be blocked or redirected


My bad, take a look at the /etc/mail/genericstable file:

http://www.sendmail.com/sm/open_source/docs/m4/features.html

Maybe a line like this one will help you achieve your goal:

j...@bar.com  error:5.7.0:550 Address invalid


i tried it. maybe i do something wrong but tried on my home server

woj...@3miasto.net.pl   error:5.7.0:550 Address invalid


(TAB separates fields)

in /etc/mail/genericstable

and


FEATURE(`genericstable')

in wojtek.tensor.gdynia.pl.mc

did
make
make install
/etc/rc.d/sendmail restart


and tried to send mail to woj...@3miasto.net.pl

mail is not blocked


any more ideas? thank you!



Re: stupid question about sendmail

2013-05-24 Thread Wojciech Puchar


http://www.sendmail.com/sm/open_source/docs/m4/features.html

Maybe a line like this one will help you achieve your goal:

j...@bar.com  error:5.7.0:550 Address invalid


I was wrong again, sorry, but I believe I got it right this time:

1. Edit the /etc/mail/access file.

2. Insert a line like this one:

To:mail...@some.domain.tld REJECT


tried too.

doesn't work.


Re: stupid question about sendmail

2013-05-24 Thread Chris Rees
On 24 May 2013 11:05, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl
wrote:


 http://www.sendmail.com/sm/open_source/docs/m4/features.html

 Maybe a line like this one will help you achieve your goal:

 j...@bar.com error:5.7.0:550 Address invalid


 I was wrong again, sorry, but I believe I got it right this time:

 1. Edit the /etc/mail/access file.

 2. Insert a line like this one:

 To:mail...@some.domain.tld REJECT


 tried too.

 doesn't work.


http://www.sendmail.com/sm/open_source/support/public_forums/

There is also an IRC channel, #sendmail on Freenode.

Chris


Re: stupid question about sendmail

2013-05-24 Thread Trond Endrestøl
On Fri, 24 May 2013 12:03+0200, Wojciech Puchar wrote:

  1. Edit the /etc/mail/access file.
  
  2. Insert a line like this one:
  
  To:mail...@some.domain.tld REJECT
 
 tried too.
 
 doesn't work.

Make sure you edit the /etc/mail/access file, not the 
/etc/mail/access.db file.

The latter is a hashmap used by sendmail for rapid lookup. The former 
is the source used to generate the /etc/mail/access.db file.

Sendmail will never read the /etc/mail/access file.

Don't forget to run the make command afterwards to update the 
/etc/mail/access.db file, and other changed files.

Your hostname.mc file must contain this line exactly as shown:

FEATURE(access_db, `hash -o -TTMPF /etc/mail/access')

Yes, sendmail will indeed access the /etc/mail/access.db file, not the 
/etc/mail/access file.

If you changed the .mc file, then install the corresponding .cf files 
using the make install command.

-- 
+---++
| Vennlig hilsen,   | Best regards,  |
| Trond Endrestøl,  | Trond Endrestøl,   |
| IT-ansvarlig, | System administrator,  |
| Fagskolen Innlandet,  | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,   | Cellular...: +47 952 62 567,   |
| sentralbord 61 14 54 00.  | Switchboard: +47 61 14 54 00.  |
+---++

Re: stupid question about sendmail

2013-05-24 Thread Trond Endrestøl
On Fri, 24 May 2013 12:45+0200, Trond Endrestøl wrote:

 On Fri, 24 May 2013 12:03+0200, Wojciech Puchar wrote:
 
   1. Edit the /etc/mail/access file.
   
   2. Insert a line like this one:
   
   To:mail...@some.domain.tld REJECT
  
  tried too.
  
  doesn't work.
 
 Make sure you edit the /etc/mail/access file, not the 
 /etc/mail/access.db file.
 
 The latter is a hashmap used by sendmail for rapid lookup. The former 
 is the source used to generate the /etc/mail/access.db file.
 
 Sendmail will never read the /etc/mail/access file.
 
 Don't forget to run the make command afterwards to update the 
 /etc/mail/access.db file, and other changed files.
 
 Your hostname.mc file must contain this line exactly as shown:
 
 FEATURE(access_db, `hash -o -TTMPF /etc/mail/access')
 
 Yes, sendmail will indeed access the /etc/mail/access.db file, not the 
 /etc/mail/access file.
 
 If you changed the .mc file, then install the corresponding .cf files 
 using the make install command.

One final(?) note: You might need this line as well:

FEATURE(blacklist_recipients)

-- 
+---++
| Vennlig hilsen,   | Best regards,  |
| Trond Endrestøl,  | Trond Endrestøl,   |
| IT-ansvarlig, | System administrator,  |
| Fagskolen Innlandet,  | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,   | Cellular...: +47 952 62 567,   |
| sentralbord 61 14 54 00.  | Switchboard: +47 61 14 54 00.  |
+---++

Re: stupid question about sendmail

2013-05-24 Thread Claus Assmann
On Fri, May 24, 2013, Trond Endrestøl wrote:

[freebsd-hackers doesn't seem like the appropriate list...]

  FEATURE(access_db, `hash -o -TTMPF /etc/mail/access')

Do NOT use -o. Moreover, do not specify arguments that are default.
FEATURE(`access_db')
is the best choice.

 One final(?) note: You might need this line as well:

 FEATURE(blacklist_recipients)

That's not a might, that's a MUST for this case.

Note: (sendmail's) cf/README is a rather useful document.

Re: stupid question about sendmail

2013-05-24 Thread Trond Endrestøl
On Fri, 24 May 2013 08:34-0700, Claus Assmann wrote:

 On Fri, May 24, 2013, Trond Endrestøl wrote:
 
 [freebsd-hackers doesn't seem like the appropriate list...]
 
   FEATURE(access_db, `hash -o -TTMPF /etc/mail/access')
 
 Do NOT use -o. Moreover, do not specify arguments that are default.
 FEATURE(`access_db')
 is the best choice.

Then I guess the defaults in freebsd.mc should be changed as well:

http://svnweb.freebsd.org/base/stable/9/etc/sendmail/freebsd.mc?revision=249867&view=markup

  One final(?) note: You might need this line as well:
 
  FEATURE(blacklist_recipients)
 
 That's not a might, that's a MUST for this case.
 
 Note: (sendmail's) cf/README is a rather useful document.

Thank you for the clarification. I only hope our friend in Poland got 
what he needed.

-- 
+---++
| Vennlig hilsen,   | Best regards,  |
| Trond Endrestøl,  | Trond Endrestøl,   |
| IT-ansvarlig, | System administrator,  |
| Fagskolen Innlandet,  | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,   | Cellular...: +47 952 62 567,   |
| sentralbord 61 14 54 00.  | Switchboard: +47 61 14 54 00.  |
+---++

Re: stupid question about sendmail

2013-05-24 Thread Claus Assmann
On Fri, May 24, 2013, Trond Endrestøl wrote:
 On Fri, 24 May 2013 08:34-0700, Claus Assmann wrote:

FEATURE(access_db, `hash -o -TTMPF /etc/mail/access')

  Do NOT use -o. Moreover, do not specify arguments that are default.

 Then I guess the defaults in freebsd.mc should be changed as well:
 http://svnweb.freebsd.org/base/stable/9/etc/sendmail/freebsd.mc?revision=249867&view=markup

That default was probably chosen so the MTA does not complain
if the map doesn't exist.
Of course that doesn't work so well if you really want to use
the map but make some mistake -- then an error is silently
ignored and you wonder "Why doesn't this work?"
Hence for this case: do NOT use -o.

Re: stupid question about sendmail

2013-05-24 Thread Wojciech Puchar

works fine after your advice. thank you very much.

FEATURE(`access_db')
FEATURE(`blacklist_recipients')


On Fri, 24 May 2013, Claus Assmann wrote:


On Fri, May 24, 2013, Trond Endrestøl wrote:

[freebsd-hackers doesn't seem like the appropriate list...]


FEATURE(access_db, `hash -o -TTMPF /etc/mail/access')


Do NOT use -o. Moreover, do not specify arguments that are default.
FEATURE(`access_db')
is the best choice.


One final(?) note: You might need this line as well:



FEATURE(blacklist_recipients)


That's not a might, that's a MUST for this case.

Note: (sendmail's) cf/README is a rather useful document.




Re: Stupid question about integer sizes

2013-02-19 Thread mdf
On Tue, Feb 19, 2013 at 5:11 AM, Borja Marcos bor...@sarenet.es wrote:

 Hello,

 I'm really sorry if this is a stupid question, but as far as I know, 
 u_int64_t defined in /usr/include/sys/types.h should *always* be
 a 64 bit unsigned integer, right?

 Seems there's a bug (or I need more and stronger coffee). Compiling a program 
 on a 64 bit system with -m32 gets the 64 bit integer types wrong.

 % cat tachin.c
 #include <sys/types.h>
 #include <stdio.h>


 main()
 {
 printf("sizeof uint64_t = %d\n", sizeof(uint64_t));
 printf("sizeof u_int64_t = %d\n", sizeof(u_int64_t));
 }



 uname -a
 FreeBSD splunk 9.1-RELEASE FreeBSD 9.1-RELEASE #14: Wed Jan 23 17:24:05 CET 
 2013 root@splunk:/usr/obj/usr/src/sys/SPLUNK  amd64

 % gcc -o tachin tachin.c
 % ./tachin
 sizeof uint64_t = 8
 sizeof u_int64_t = 8

 % ./tachin
 sizeof uint64_t = 4   === WRONG!!
 sizeof u_int64_t = 4=== WRONG!!

 The same happens with clang.

 % clang -m32 -o tachin tachin.c
 tachin.c:5:1: warning: type specifier missing, defaults to 'int' 
 [-Wimplicit-int]
 main()
 ^~~~
 1 warning generated.
 % ./tachin
 sizeof uint64_t = 4 === WRONG!!
 sizeof u_int64_t = 4=== WRONG!!


 if I do the same on a i386 system (8.2-RELEASE, but it should be irrelevant) 
 the u_int64 types have the correct size.

 %gcc -o tachin tachin.c
 %./tachin
 sizeof uint64_t = 8
 sizeof u_int64_t = 8





 Am I missing anything? Seems like a too stupid problem to be a real bug. 
 Sorry if I am wrong.


Last I knew -m32 still wasn't quite supported on 9.1.  This is fixed
in CURRENT.  The problem is in the machine-dependent headers.
sys/types.h includes sys/_types.h which includes machine/_types.h.
But the machine alias can only point to one place; it's the amd64
headers since you're running the amd64 kernel.
sys/amd64/include/_types.h defines __uint64_t as unsigned long, which
is correct in 64-bit mode but wrong in 32-bit mode.

On CURRENT there is a merged x86/_types.h which uses #ifdef guards to
define __uint64_t as unsigned long or unsigned long long, depending on
__LP64__, so that the size is correct on a 32-bit compiler invocation.
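
A minimal sketch of the kind of guard described above. The type name
demo_uint64_t and the helper demo_uint64_size() are made up for
illustration; this is not the text of x86/_types.h, only the __LP64__
test is taken from the description.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Illustration only: demo_uint64_t is a hypothetical name, not the
 * real __uint64_t definition from the FreeBSD headers.
 */
#ifdef __LP64__
typedef unsigned long demo_uint64_t;		/* LP64: long is 64 bits wide */
#else
typedef unsigned long long demo_uint64_t;	/* otherwise use long long */
#endif

/* Fail the build instead of printing a wrong size at run time. */
static_assert(sizeof(demo_uint64_t) * CHAR_BIT == 64,
    "demo_uint64_t must be exactly 64 bits");

/* Size in bytes of the guarded type. */
static size_t
demo_uint64_size(void)
{
	return (sizeof(demo_uint64_t));
}
```

When printing the result, %zu is the matching conversion for a size_t
value such as the result of sizeof; %d expects an int.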

Cheers,
matthew
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Stupid question about integer sizes

2013-02-19 Thread Borja Marcos

On Feb 19, 2013, at 3:52 PM, m...@freebsd.org wrote:

 Last I knew -m32 still wasn't quite supported on 9.1.  This is fixed

Ahh I see. It should print a warning, then. It's the typical thing that can 
drive you nuts ;)

Thanks,





Borja.



Re: Stupid question about integer sizes

2013-02-19 Thread Konstantin Belousov
On Tue, Feb 19, 2013 at 06:52:34AM -0800, m...@freebsd.org wrote:
 On Tue, Feb 19, 2013 at 5:11 AM, Borja Marcos bor...@sarenet.es wrote:
 
  Hello,
 
  I'm really sorry if this is a stupid question, but as far as I know, 
  u_int64_t defined in /usr/include/sys/types.h should *always* be
  a 64 bit unsigned integer, right?
 
  Seems there's a bug (or I need more and stronger coffee). Compiling a 
  program on a 64 bit system with -m32 gets the 64 bit integer types wrong.
 
  % cat tachin.c
  #include <sys/types.h>
  #include <stdio.h>
 
 
  main()
  {
          printf("sizeof uint64_t = %d\n", sizeof(uint64_t));
          printf("sizeof u_int64_t = %d\n", sizeof(u_int64_t));
  }
 
 
 
  uname -a
  FreeBSD splunk 9.1-RELEASE FreeBSD 9.1-RELEASE #14: Wed Jan 23 17:24:05 CET 
  2013 root@splunk:/usr/obj/usr/src/sys/SPLUNK  amd64
 
  % gcc -o tachin tachin.c
  % ./tachin
  sizeof uint64_t = 8
  sizeof u_int64_t = 8
 
  % ./tachin
  sizeof uint64_t = 4   === WRONG!!
  sizeof u_int64_t = 4=== WRONG!!
 
  The same happens with clang.
 
  % clang -m32 -o tachin tachin.c
  tachin.c:5:1: warning: type specifier missing, defaults to 'int' 
  [-Wimplicit-int]
  main()
  ^~~~
  1 warning generated.
  % ./tachin
  sizeof uint64_t = 4 === WRONG!!
  sizeof u_int64_t = 4=== WRONG!!
 
 
  if I do the same on a i386 system (8.2-RELEASE, but it should be 
  irrelevant) the u_int64 types have the correct size.
 
  %gcc -o tachin tachin.c
  %./tachin
  sizeof uint64_t = 8
  sizeof u_int64_t = 8
 
 
 
 
 
  Am I missing anything? Seems like a too stupid problem to be a real bug. 
  Sorry if I am wrong.
 
 
 Last I knew -m32 still wasn't quite supported on 9.1.  This is fixed
 in CURRENT.  The problem is in the machine-dependent headers.
 sys/types.h includes sys/_types.h, which includes machine/_types.h.
 But the machine alias can only point to one place; it points to the amd64
 headers since you're running the amd64 kernel.
 sys/amd64/include/_types.h defines __uint64_t as unsigned long, which
 is correct in 64-bit mode but wrong in 32-bit mode.
 
 On CURRENT there is a merged x86/_types.h which uses #ifdef guards to
 define __uint64_t as unsigned long or unsigned long long, depending on
 __LP64__, so that the size is correct on a 32-bit compiler invocation.

Yes, but there are still some unconverted headers, mostly related to
the machine context and signal handlers.

Quick look:

machine/sigframe.h (probably not much useful for usermode)
machine/ucontext.h
machine/frame.h (again, usermode most likely does not need it)
machine/signal.h
machine/elf.h

libm machdep headers.




Re: A question about creating a system call

2012-11-08 Thread Robert Watson

Hi Dave:

This wiki page may be of value:

http://wiki.freebsd.org/AddingAuditEvents

Robert N M Watson
Computer Laboratory
University of Cambridge

On Thu, 8 Nov 2012, dave jones wrote:


Hello,

I know how to create system calls, but I'm a bit confused about how the
sys/kern/syscalls.master file works. For example, if I have a
foo system call, the following line is added:

532 AUE_NULL STD { int foo(char *str); }

The question is about column two, AUE_NULL: can I replace it with AUE_FOO?
How do I determine whether the system call should be audited or not? Thank you.

Regards,
Dave.


Re: priv_check() question

2011-07-05 Thread Robert Watson


On Sun, 3 Jul 2011, exorcistkiller wrote:

Hi! I am taking a FreeBSD course this summer and I'm doing a homework. A new 
system call uidkill() is to be added. uidkill(uid_t uid, int signum) sends 
signal specified by signum to all processes owned by uid, excluding the 
calling process itself.


I'm almost done, however I get stuck with priv_check(). If the calling 
process is trying to send signal to processes owned by others, permission 
should be denied. My implementation simply uses an if (p->p_ucred->cr_uid == 
ksi.ksi_uid) to deny it, however priv_check() is required. My question is: 
what privilege a process should have to send signal to processes owned by 
others? PRIV_SIGNAL_DIFFCRED?


The right way to think about privileges in FreeBSD is that they exempt 
subjects (usually processes) from normal access control rules -- typically as 
a result of a root uid.  The access control rules for signalling are captured 
by p_cansignal() and cr_cansignal(), depending on whether the subject is a 
process or a cached credential.  Processes have access to slightly greater 
rights than raw credentials due to additional context -- for example, 
information about parent-child relationships.  These functions then invoke 
further privilege checks if required, perhaps overriding the normal 
requirement that uids match, etc.  kill() implements a couple of broadcast 
modes for signals -- you may want to look at the implementation there to see 
how this is done.
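
As a rough illustration of that ordering (normal rule first, privilege
only as an exemption), here is a toy userland model. It is not kernel
code: struct model_proc, model_cansignal() and model_uidkill() are
hypothetical names, and the "privileged" flag merely stands in for a
privilege check that succeeds. The return convention of model_uidkill()
is also an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userland model; none of these names are kernel interfaces. */
struct model_proc {
	unsigned int uid;	/* owner of the process */
	bool privileged;	/* stands in for a successful privilege check */
};

/* Normal rule first (uids match); privilege only exempts from it. */
static int
model_cansignal(const struct model_proc *subject,
    const struct model_proc *target)
{
	if (subject->uid == target->uid)
		return (0);
	if (subject->privileged)
		return (0);
	return (EPERM);
}

/*
 * Broadcast in the style of the uidkill() homework: visit every process
 * owned by uid except the caller and count those that may be signalled.
 * Returns the count, or -EPERM if every matching process was denied.
 */
static int
model_uidkill(const struct model_proc *self, const struct model_proc *procs,
    size_t nprocs, unsigned int uid)
{
	int delivered = 0;
	bool denied = false;

	for (size_t i = 0; i < nprocs; i++) {
		const struct model_proc *p = &procs[i];

		if (p == self || p->uid != uid)
			continue;	/* skip the caller and other owners */
		if (model_cansignal(self, p) == 0)
			delivered++;	/* a real version would signal here */
		else
			denied = true;
	}
	if (delivered == 0 && denied)
		return (-EPERM);
	return (delivered);
}
```

The point of the sketch is only that the uid comparison lives inside the
access check, and the privilege is consulted when that comparison fails,
rather than the caller testing uids itself.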


Robert



Re: Mount_nfs question

2011-05-31 Thread Robert Watson


On Mon, 30 May 2011, Mark Saad wrote:


 So I am stumped on this one.  I want to know the IP of each
nfs server that is providing each nfs export. I am running 7.4-RELEASE.
When I run mount -t nfs I see something like this:

VIP-01:/export/source on /mnt/src
VIP-02:/export/target   on /mnt/target
VIP-01:/export/logs on /mnt/logs
VIP-02:/export/package   on /mnt/pkg

The issue is I use a load balanced nfs server, from isilon. So VIP-01 could 
be any one of a group of IPs. I am trying to track down a network 
congestion issue and I can't find a way to match the output of lsof and 
netstat to the output of mount -t nfs. Does anyone have any ideas how I 
could track this down? Is there a way to run mount and have it show the IP 
and not the name of the source server?


Unfortunately, there's not a good answer to this question.  nfsstat(1) should 
have a mode that can iterate down active mount points displaying statistics 
and connection information for each, but doesn't.  NFS sockets generally don't 
appear in sockstat(1) either.  However, they should appear in netstat(1), so 
you can at least identify the sockets open to various NFS server IP addresses 
(especially if they are TCP mounts).
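
A small sketch of how the netstat(1) output could be filtered for NFS
servers. It assumes the numeric "address.port" form that netstat -n
prints for foreign addresses and that the server listens on the usual
NFS port, 2049. The helper name nfs_server_ip() and the sample
addresses are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define NFS_PORT 2049	/* assumed: the standard NFS port */

/*
 * Split a "10.1.2.3.2049"-style foreign address (address, dot, port)
 * at the last dot.  If the port is NFS_PORT, copy the address part
 * into ip and return true; otherwise return false.
 */
static bool
nfs_server_ip(const char *foreign, char *ip, size_t iplen)
{
	const char *dot = strrchr(foreign, '.');
	char *end;
	long port;
	size_t len;

	if (dot == NULL || dot[1] == '\0')
		return (false);
	port = strtol(dot + 1, &end, 10);
	if (*end != '\0' || port != NFS_PORT)
		return (false);		/* not numeric, or not NFS */
	len = (size_t)(dot - foreign);
	if (len == 0 || len >= iplen)
		return (false);		/* empty address or buffer too small */
	memcpy(ip, foreign, len);
	ip[len] = '\0';
	return (true);
}
```

Feeding it the foreign-address column of each established TCP
connection would give the set of server IPs currently in use, though it
still does not say which mount point each connection belongs to.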


Enhancing nfsstat(1) to display more detailed information would, I think, be a 
very useful task for someone to get up to (and perhaps should appear on our 
ideas list).  Something that would be nice to have, in support of this, is a 
way for file systems to provide extended status via a system call that queries 
mountpoints, both portable information that spans file systems, and file 
system-specific data.  Morally, similar to nmount(2) but for statistics rather 
than setting things.  The easier route is to add new sysctls that dump 
per-mountpoint state directly from NFS, but given how much other information 
we'd like to export, it would be great to have a more general mechanism.


(The more adventurous can, with a fairly high degree of safety, use kgdb on 
/dev/mem (read-only) to walk the NFS stack's mount tables, but that's not much 
fun.)


Robert


Re: Mount_nfs question

2011-05-31 Thread Rick Macklem
 Maybe you can use showmount -a SERVER-IP, foreach server you have...
 
That might work. NFS doesn't actually have a notion of a mount, but
the mount protocol daemon (typically called mountd) does try and keep
track of NFSv3 mounts from the requests it sees. How well this works for
NFSv3 will depend on how well the server keeps track of these things and
how easily they are lost during a server reboot or similar.

Since NFSv4 doesn't use the mount protocol, it will be useless for NFSv4.

 Thiago
 2011/5/30 Mark Saad nones...@longcount.org:
  On Mon, May 30, 2011 at 8:13 PM, Rick Macklem rmack...@uoguelph.ca
  wrote:
  Hello All
  So I am stumped on this one. I want to know what the IP of each
  nfs server that is providing each nfs export. I am running
  7.4-RELEASE
  When I run mount -t nfs I see something like this
 
  VIP-01:/export/source on /mnt/src
  VIP-02:/export/target on /mnt/target
  VIP-01:/export/logs on /mnt/logs
  VIP-02:/export/package on /mnt/pkg
 
  The issue is I use a load balanced nfs server , from isilon. So
  VIP-01
  could be any one of a group of IPs . I am trying to track down a
  network congestion issue and I cant find a way to match the output
  of
  lsof , and netstat to the output of mount -t nfs . Does anyone
  have
  any ideas how I could track this down , is there a way to run
  mount
  and have it show the IP and not the name of the source server ?
 
  Just fire up wireshark (or tcpdump) and watch the traffic. tcpdump
  doesn't know much about NFS, but if al you want are the IP#s, it'll
  do.
 
  But, no, mount won't tell you more than what the argument looked
  like.
 
  rick
 
  Wireshark seams like using a tank to swap a fly.
 
Maybe, but watching traffic isn't that scary and over the years I've
discovered things I would have never expected from doing it. Like a
case where one specific TCP segment was being dropped by a network
switch (it was a hardware problem in the switch that didn't manifest
itself any other way). Or, that one client was generating a massive
number of Getattr and Lookup RPCs. (That one turned out to be a grad
student who had made themselves an app that had a bunch of threads
continually scanning for fs changes. Not a bad idea, but the threads
never took a break and continually did it.)

I've always found watching traffic kinda fun, but then I'm weird, rick


Re: Mount_nfs question

2011-05-30 Thread Rick Macklem
 Hello All
 So I am stumped on this one. I want to know what the IP of each
 nfs server that is providing each nfs export. I am running 7.4-RELEASE
 When I run mount -t nfs I see something like this
 
 VIP-01:/export/source on /mnt/src
 VIP-02:/export/target on /mnt/target
 VIP-01:/export/logs on /mnt/logs
 VIP-02:/export/package on /mnt/pkg
 
 The issue is I use a load balanced nfs server , from isilon. So VIP-01
 could be any one of a group of IPs . I am trying to track down a
 network congestion issue and I cant find a way to match the output of
 lsof , and netstat to the output of mount -t nfs . Does anyone have
 any ideas how I could track this down , is there a way to run mount
 and have it show the IP and not the name of the source server ?
 
Just fire up wireshark (or tcpdump) and watch the traffic. tcpdump
doesn't know much about NFS, but if all you want are the IP#s, it'll do.

But, no, mount won't tell you more than what the argument looked like.

rick


Re: Mount_nfs question

2011-05-30 Thread Mark Saad
On Mon, May 30, 2011 at 8:13 PM, Rick Macklem rmack...@uoguelph.ca wrote:
 Hello All
 So I am stumped on this one. I want to know what the IP of each
 nfs server that is providing each nfs export. I am running 7.4-RELEASE
 When I run mount -t nfs I see something like this

 VIP-01:/export/source on /mnt/src
 VIP-02:/export/target on /mnt/target
 VIP-01:/export/logs on /mnt/logs
 VIP-02:/export/package on /mnt/pkg

 The issue is I use a load balanced nfs server , from isilon. So VIP-01
 could be any one of a group of IPs . I am trying to track down a
 network congestion issue and I cant find a way to match the output of
 lsof , and netstat to the output of mount -t nfs . Does anyone have
 any ideas how I could track this down , is there a way to run mount
 and have it show the IP and not the name of the source server ?

 Just fire up wireshark (or tcpdump) and watch the traffic. tcpdump
 doesn't know much about NFS, but if al you want are the IP#s, it'll do.

 But, no, mount won't tell you more than what the argument looked like.

 rick

Wireshark seems like using a tank to swat a fly.


-- 
mark saad | nones...@longcount.org


Re: Mount_nfs question

2011-05-30 Thread Thiago Damas
  Maybe you can use showmount -a SERVER-IP, foreach server you have...

Thiago
2011/5/30 Mark Saad nones...@longcount.org:
 On Mon, May 30, 2011 at 8:13 PM, Rick Macklem rmack...@uoguelph.ca wrote:
 Hello All
 So I am stumped on this one. I want to know what the IP of each
 nfs server that is providing each nfs export. I am running 7.4-RELEASE
 When I run mount -t nfs I see something like this

 VIP-01:/export/source on /mnt/src
 VIP-02:/export/target on /mnt/target
 VIP-01:/export/logs on /mnt/logs
 VIP-02:/export/package on /mnt/pkg

 The issue is I use a load balanced nfs server , from isilon. So VIP-01
 could be any one of a group of IPs . I am trying to track down a
 network congestion issue and I cant find a way to match the output of
 lsof , and netstat to the output of mount -t nfs . Does anyone have
 any ideas how I could track this down , is there a way to run mount
 and have it show the IP and not the name of the source server ?

 Just fire up wireshark (or tcpdump) and watch the traffic. tcpdump
 doesn't know much about NFS, but if al you want are the IP#s, it'll do.

 But, no, mount won't tell you more than what the argument looked like.

 rick

 Wireshark seams like using a tank to swap a fly.


 --
 mark saad | nones...@longcount.org


Re: make question

2011-04-29 Thread Warner Losh

On Apr 28, 2011, at 7:37 PM, Arnaud Lacombe wrote:
 On Thu, Apr 28, 2011 at 11:52 AM, Hartmut Brandt hartmut.bra...@dlr.de 
 wrote:
 I think we can change this, because it would break makefiles that assume
 that the entire script is given to the shell in one piece.
 
 I'm not sure to parse that. We can change it because it would break stuff.
 
 That said, if something was to be broken, it would already shows up
 when using -j N, and thus should be considered as a bug.

There are bugs in the code that handles the output, which often make it
wrong, so there are bugs both ways.  There's archival history on this here
in hackers@.

Warner



Re: make question

2011-04-29 Thread Roman Divacky
On Thu, Apr 28, 2011 at 08:50:27PM +0200, Hartmut Brandt wrote:
 On Thu, 28 Apr 2011, Roman Divacky wrote:
 
 RDOn Thu, Apr 28, 2011 at 05:52:58PM +0200, Hartmut Brandt wrote:
 RD Hi Roman,
 RD 
 RD On Wed, 27 Apr 2011, Roman Divacky wrote:
 RD 
 RD RDYou seem to have messed with bsd make so I have a question for you  :)
 RD 
 RD Yeah, that was some time ago ...
 RD 
 RD RDWhen a job is about to be executed in JobStart() a pipe is created with
 RD RDits ends connected to job->inPipe/job->outPipe. When the job is actually
 RD RDcreated in JobExec() the ps.out is set to job->outPipe so that in
 RD RDJobDoOutput() we can read from that pipe and basically just parse the output
 RD RDfor shell->noPrint and leave it out of the output. This is meant (I think)
 RD RDfor suppressing the filter thing, i.e. if we do some @command the
 RD RDrestoration of the quiet mode setting is filtered out.
 RD RD
 RD RD
 RD RDIn -B mode we do it differently, as we invoke one shell per command 
 we don't
 RD RDhave to insert quiet/verbose commands and thus avoid all the 
 piping/parsing
 RD RDdance.
 RD RD
 RD RDSo my question is - why don't we invoke one shell per command by 
 default
 RD RDand avoid the piping/parsing? Is this because of performance? I think 
 that
 RD RDthe piping/parsing of the output can have worse impact than invoking 
 a shell
 RD RDfor every command. Especially given that most targets consists of 
 just one
 RD RDcommand.
 RD 
 RD The answer is in /usr/share/doc/psd/12.make. This is so one can write 
 RD something like
 RD 
 RD debug:
 RD  DEBUG_FLAGS=-g  
 RD  for i in $(SUBDIR); do
 RD  $(MAKE) -C $$i all
 RD  done
 RD 
 RD instead of:
 RD 
 RD debug:
 RD  DEBUG_FLAGS=-g \
 RD  for i in $(SUBDIR); do \
 RD  $(MAKE) -C $$i all ; \
 RD  done
 RD 
 RD -B means 'backward compatible' and does what the original v7 make did: 
 one 
 RD shell per command. This means you don't have to write the backslashes 
 and 
 RD the shell variable will be seen in the sub-makes and programs.
 RD 
 RD I think we can change this, because it would break makefiles that assume 
 RD that the entire script is given to the shell in one piece.
 RD
 RDI think you answered the question why we parse the target. But I asked why
 RDwe parse the output from it.
 
 My intention was to say why we use one shell for all commands for a given 
 rule. If we'd use one shell per line the above would not work, because the 
 first shell would see just the environment variable assignment (which 
 would be completly useless). The next shell would see a partial 'for' 
 statement and complain, and would not have the environment variable and so 
 on. So this is not so much about parsing, but about execution.
 
 I suppose that the tricky point is with @-lines in the middle of a 
 multi-line script.
 
Unless I am reading the code wrong, one shell per command is the default
mode.

See in main.c:

/*
 * Be compatible if user did not specify -j and did not explicitly
 * turned compatibility on
 */
if (!compatMake && !forceJobs)
	compatMake = TRUE;

You have to specify -j to turn off the compat mode.
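
To spell out what that test does for each combination of flags, here is
a tiny model of it. The parameter names mirror compatMake and forceJobs
from the snippet; effective_compat() itself is a hypothetical helper and
not a function in make, and the sketch models only this one test, not
whatever else make does with -B and -j.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the quoted default: compat mode is switched on unless the
 * user asked for jobs (-j).  Hypothetical helper, not make source.
 */
static bool
effective_compat(bool compatMake, bool forceJobs)
{
	if (!compatMake && !forceJobs)
		compatMake = true;
	return (compatMake);
}
```

With neither flag given the result is true, so a plain "make" runs in
compat mode; only the combination "no -B, with -j" yields false.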

 RDAnyway, so you think it would be ok to change it to one shell per command 
 and
 RDavoid the shell output parsing or not?
 
 Unless I misunderstand the question I would say no, because this would 
 certainly render makefiles invalid that rely on the multi-line scripts 
 beeing handled by a single shell.
 
I think the chances of this breakage are pretty low as this is the default mode.

 RDI am interested in this so that make -j* lets the command know that the 
 RDoutput is a TTY, eg. clang can emit coloured warnings.
 
 Hmm. I see. Just a wild guess: couldn't we use a pty to talk to the shell? 
 If that could work the question is of course what one would expect from 
 something like: make 2>&1 >make.out

I looked at what gnu make does and I think they do pretty much the same as
what I suggested, i.e. one shell per command with no parsing of output,
printing directly to stdout. Thus they have no problem with the process
detecting stdout being a tty.


RE: make question

2011-04-29 Thread Hartmut.Brandt
s/can/can't/

harti

From: Arnaud Lacombe [lacom...@gmail.com]
Sent: Friday, April 29, 2011 3:37 AM
To: Brandt, Hartmut
Cc: Roman Divacky; hack...@freebsd.org
Subject: Re: make question

Hi,

On Thu, Apr 28, 2011 at 11:52 AM, Hartmut Brandt hartmut.bra...@dlr.de wrote:
 I think we can change this, because it would break makefiles that assume
 that the entire script is given to the shell in one piece.

I'm not sure how to parse that. We can change it because it would break stuff?

That said, if something was to be broken, it would already show up
when using -j N, and thus should be considered a bug.

 - Arnaud


Re: make question

2011-04-29 Thread Hartmut Brandt
On Fri, 29 Apr 2011, Roman Divacky wrote:

RDOn Thu, Apr 28, 2011 at 08:50:27PM +0200, Hartmut Brandt wrote:
RD On Thu, 28 Apr 2011, Roman Divacky wrote:
RD 
RD RDOn Thu, Apr 28, 2011 at 05:52:58PM +0200, Hartmut Brandt wrote:
RD RD Hi Roman,
RD RD 
RD RD On Wed, 27 Apr 2011, Roman Divacky wrote:
RD RD 
RD RD RDYou seem to have messed with bsd make so I have a question for you 
 :)
RD RD 
RD RD Yeah, that was some time ago ...
RD RD 
RD RD RDWhen a job is about to be executed in JobStart() a pipe is created with
RD RD RDits ends connected to job->inPipe/job->outPipe. When the job is actually
RD RD RDcreated in JobExec() the ps.out is set to job->outPipe so that in
RD RD RDJobDoOutput() we can read from that pipe and basically just parse the output
RD RD RDfor shell->noPrint and leave it out of the output. This is meant (I think)
RD RD RDfor suppressing the filter thing, i.e. if we do some @command the
RD RD RDrestoration of the quiet mode setting is filtered out.
RD RD RD
RD RD RD
RD RD RDIn -B mode we do it differently, as we invoke one shell per 
command we don't
RD RD RDhave to insert quiet/verbose commands and thus avoid all the 
piping/parsing
RD RD RDdance.
RD RD RD
RD RD RDSo my question is - why don't we invoke one shell per command by 
default
RD RD RDand avoid the piping/parsing? Is this because of performance? I 
think that
RD RD RDthe piping/parsing of the output can have worse impact than 
invoking a shell
RD RD RDfor every command. Especially given that most targets consists of 
just one
RD RD RDcommand.
RD RD 
RD RD The answer is in /usr/share/doc/psd/12.make. This is so one can write 
RD RD something like
RD RD 
RD RD debug:
RD RD   DEBUG_FLAGS=-g  
RD RD   for i in $(SUBDIR); do
RD RD   $(MAKE) -C $$i all
RD RD   done
RD RD 
RD RD instead of:
RD RD 
RD RD debug:
RD RD   DEBUG_FLAGS=-g \
RD RD   for i in $(SUBDIR); do \
RD RD   $(MAKE) -C $$i all ; \
RD RD   done
RD RD 
RD RD -B means 'backward compatible' and does what the original v7 make 
did: one 
RD RD shell per command. This means you don't have to write the backslashes 
and 
RD RD the shell variable will be seen in the sub-makes and programs.
RD RD 
RD RD I think we can change this, because it would break makefiles that 
assume 
RD RD that the entire script is given to the shell in one piece.
RD RD
RD RDI think you answered the question why we parse the target. But I asked 
why
RD RDwe parse the output from it.
RD 
RD My intention was to say why we use one shell for all commands for a given 
RD rule. If we'd use one shell per line the above would not work, because the 
RD first shell would see just the environment variable assignment (which 
RD would be completly useless). The next shell would see a partial 'for' 
RD statement and complain, and would not have the environment variable and so 
RD on. So this is not so much about parsing, but about execution.
RD 
RD I suppose that the tricky point is with @-lines in the middle of a 
RD multi-line script.
RD 
RDUnless I am reading the code wrong the one shell per command is the default
RDmode.
RD
RDsee in main.c:
RD
RD/*
RD * Be compatible if user did not specify -j and did not explicitly
RD * turned compatibility on
RD */
RDif (!compatMake && !forceJobs)
RDcompatMake = TRUE;
RD
RDYou have to specify -j to turn off the compat mode.

Wow. This breakage was introduced in our make in 1996 by an import from 
NetBSD (I did not check why they introduced it). I fail to see the logic 
in this handling of the -B option (if the user doesn't provide it, I 
(make) do it, unless the user said -j). But, the good side is, we can 
probably change the behavior of the non-compat mode, because the 
probability of there being makefiles that rely on it is now rather low.

In any case I recommend to check what is the performance implication of 
executing a lot more shells in -j mode. I expect this to be in the noise 
level, but a check would not hurt...

harti


Re: make question

2011-04-29 Thread Bob Bishop
Hi,

This whole area is quite a mess. See for instance bin/10985 on interactions 
between -j, -B and .NOTPARALLEL

--
Bob Bishop
r...@gid.co.uk






Re: make question

2011-04-28 Thread Hartmut Brandt
Hi Roman,

On Wed, 27 Apr 2011, Roman Divacky wrote:

RDYou seem to have messed with bsd make so I have a question for you  :)

Yeah, that was some time ago ...

RDWhen a job is about to be executed in JobStart() a pipe is created with
RDits ends connected to job->inPipe/job->outPipe. When the job is actually
RDcreated in JobExec() the ps.out is set to job->outPipe so that in
RDJobDoOutput() we can read from that pipe and basically just parse the output
RDfor shell->noPrint and leave it out of the output. This is meant (I think)
RDfor suppressing the filter thing, i.e. if we do some @command the
RDrestoration of the quiet mode setting is filtered out.
RD
RD
RDIn -B mode we do it differently, as we invoke one shell per command we don't
RDhave to insert quiet/verbose commands and thus avoid all the piping/parsing
RDdance.
RD
RDSo my question is - why don't we invoke one shell per command by default
RDand avoid the piping/parsing? Is this because of performance? I think that
RDthe piping/parsing of the output can have worse impact than invoking a shell
RDfor every command. Especially given that most targets consists of just one
RDcommand.

The answer is in /usr/share/doc/psd/12.make. This is so one can write 
something like

debug:
	DEBUG_FLAGS=-g
	for i in $(SUBDIR); do
		$(MAKE) -C $$i all
	done

instead of:

debug:
	DEBUG_FLAGS=-g \
	for i in $(SUBDIR); do \
		$(MAKE) -C $$i all ; \
	done

-B means 'backward compatible' and does what the original v7 make did: one 
shell per command. This means you don't have to write the backslashes and 
the shell variable will be seen in the sub-makes and programs.

I think we can change this, because it would break makefiles that assume 
that the entire script is given to the shell in one piece.

harti


Re: make question

2011-04-28 Thread Roman Divacky
On Thu, Apr 28, 2011 at 05:52:58PM +0200, Hartmut Brandt wrote:
 Hi Roman,
 
 On Wed, 27 Apr 2011, Roman Divacky wrote:
 
 RDYou seem to have messed with bsd make so I have a question for you  :)
 
 Yeah, that was some time ago ...
 
 RDWhen a job is about to be executed in JobStart() a pipe is created with
 RDits ends connected to job->inPipe/job->outPipe. When the job is actually
 RDcreated in JobExec() the ps.out is set to job->outPipe so that in
 RDJobDoOutput() we can read from that pipe and basically just parse the output
 RDfor shell->noPrint and leave it out of the output. This is meant (I think)
 RDfor suppressing the filter thing, i.e. if we do some @command the
 RDrestoration of the quiet mode setting is filtered out.
 RD
 RD
 RDIn -B mode we do it differently, as we invoke one shell per command we 
 don't
 RDhave to insert quiet/verbose commands and thus avoid all the piping/parsing
 RDdance.
 RD
 RDSo my question is - why don't we invoke one shell per command by default
 RDand avoid the piping/parsing? Is this because of performance? I think that
 RDthe piping/parsing of the output can have worse impact than invoking a 
 shell
 RDfor every command. Especially given that most targets consists of just one
 RDcommand.
 
 The answer is in /usr/share/doc/psd/12.make. This is so one can write 
 something like
 
 debug:
   DEBUG_FLAGS=-g  
   for i in $(SUBDIR); do
   $(MAKE) -C $$i all
   done
 
 instead of:
 
 debug:
   DEBUG_FLAGS=-g \
   for i in $(SUBDIR); do \
   $(MAKE) -C $$i all ; \
   done
 
 -B means 'backward compatible' and does what the original v7 make did: one 
 shell per command. This means you don't have to write the backslashes and 
 the shell variable will be seen in the sub-makes and programs.
 
 I think we can change this, because it would break makefiles that assume 
 that the entire script is given to the shell in one piece.

I think you answered the question why we parse the target. But I asked why
we parse the output from it.

Anyway, so you think it would be ok to change it to one shell per command and
avoid the shell output parsing or not?

I am interested in this so that make -j* lets the command know that the 
output is a TTY, eg. clang can emit coloured warnings.
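
For illustration, a minimal sketch of why a command run through such a
pipe does not see a terminal. It assumes the command decides about
colour with an isatty()-style test on its output descriptor; the helper
name pipe_end_is_tty() is made up.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Create a pipe, as the job code does for a child's output, and report
 * whether the write end looks like a terminal.  Returns 1 if it is a
 * tty, 0 if not, -1 if the pipe cannot be created.
 */
static int
pipe_end_is_tty(void)
{
	int fds[2];
	int result;

	if (pipe(fds) == -1)
		return (-1);
	result = isatty(fds[1]) ? 1 : 0;	/* a pipe is never a tty */
	close(fds[0]);
	close(fds[1]);
	return (result);
}
```

A child whose stdout is the write end of such a pipe gets 0 from the
same test, so it falls back to plain, uncoloured output.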

Thank you, roman


Re: make question

2011-04-28 Thread Hartmut Brandt
On Thu, 28 Apr 2011, Roman Divacky wrote:

RDOn Thu, Apr 28, 2011 at 05:52:58PM +0200, Hartmut Brandt wrote:
RD Hi Roman,
RD 
RD On Wed, 27 Apr 2011, Roman Divacky wrote:
RD 
RD RDYou seem to have messed with bsd make so I have a question for you  :)
RD 
RD Yeah, that was some time ago ...
RD 
RD RDWhen a job is about to be executed in JobStart() a pipe is created with
RD RDits ends connected to job->inPipe/job->outPipe. When the job is actually
RD RDcreated in JobExec() the ps.out is set to job->outPipe so that in
RD RDJobDoOutput() we can read from that pipe and basically just parse the output
RD RDfor shell->noPrint and leave it out of the output. This is meant (I think)
RD RDfor suppressing the filter thing, i.e. if we do some @command the
RD RDrestoration of the quiet mode setting is filtered out.
RD RD
RD RD
RD RDIn -B mode we do it differently, as we invoke one shell per command we 
don't
RD RDhave to insert quiet/verbose commands and thus avoid all the 
piping/parsing
RD RDdance.
RD RD
RD RDSo my question is - why don't we invoke one shell per command by default
RD RDand avoid the piping/parsing? Is this because of performance? I think 
that
RD RDthe piping/parsing of the output can have worse impact than invoking a 
shell
RD RDfor every command. Especially given that most targets consists of just 
one
RD RDcommand.
RD 
RD The answer is in /usr/share/doc/psd/12.make. This is so one can write 
RD something like
RD 
RD debug:
RD     DEBUG_FLAGS=-g
RD     for i in $(SUBDIR); do
RD         $(MAKE) -C $$i all
RD     done
RD 
RD instead of:
RD 
RD debug:
RD     DEBUG_FLAGS=-g \
RD     for i in $(SUBDIR); do \
RD         $(MAKE) -C $$i all ; \
RD     done
RD 
RD -B means 'backward compatible' and does what the original v7 make did: one 
RD shell per command. This means you don't have to write the backslashes and 
RD the shell variable will be seen in the sub-makes and programs.
RD 
RD I think we can change this, because it would break makefiles that assume 
RD that the entire script is given to the shell in one piece.
RD
RDI think you answered the question why we parse the target. But I asked why
RDwe parse the output from it.

My intention was to say why we use one shell for all commands for a given 
rule. If we'd use one shell per line the above would not work, because the 
first shell would see just the environment variable assignment (which 
would be completely useless). The next shell would see a partial 'for' 
statement and complain, and would not have the environment variable and so 
on. So this is not so much about parsing, but about execution.
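
The difference between one shell per line and one shell per rule can be
seen outside of make as well. The following is only a minimal sketch, not
make's actual code: it assumes a POSIX /bin/sh behind system(3), and
DEMO_FLAG_X and the function names are made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hand one script to its own shell, as system(3) does via "sh -c". */
static int run_in_own_shell(const char *script)
{
    assert(script != NULL);
    return system(script);
}

/*
 * One shell per line (what -B is described as doing): the assignment
 * lives and dies in the first shell. Returns 1 if the second shell
 * still saw the variable, 0 otherwise.
 */
int seen_with_one_shell_per_line(void)
{
    (void)run_in_own_shell("DEMO_FLAG_X=yes");
    return run_in_own_shell("test \"$DEMO_FLAG_X\" = yes") == 0;
}

/*
 * One shell for the whole rule: both lines reach the same shell, so
 * the second line sees the variable set by the first.
 */
int seen_with_one_shell_per_rule(void)
{
    return run_in_own_shell("DEMO_FLAG_X=yes\n"
                            "test \"$DEMO_FLAG_X\" = yes") == 0;
}
```

Calling the first function shows the variable being lost between shells,
while the second one keeps it, which is the property multi-line rules
rely on.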

I suppose that the tricky point is with @-lines in the middle of a 
multi-line script.

RDAnyway, so you think it would be ok to change it to one shell per command and
RDavoid the shell output parsing or not?

Unless I misunderstand the question I would say no, because this would 
certainly render makefiles invalid that rely on the multi-line scripts 
being handled by a single shell.

RDI am interested in this so that make -j* lets the command know that the 
RDoutput is a TTY, eg. clang can emit coloured warnings.

Hmm. I see. Just a wild guess: couldn't we use a pty to talk to the shell? 
If that could work the question is of course what one would expect from 
something like: make 2>&1 >make.out
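
For context on why the pipe matters here: a program that decides about
coloured output by calling isatty(3) on its output descriptor gets 0 when
that descriptor is a pipe. The sketch below only shows that check on a
freshly created pipe; whether clang uses exactly this test is an
assumption, and pipe_write_end_is_tty() is a hypothetical helper name.
The pty variant (posix_openpt(3) or openpty(3)) is not shown.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Returns 1 if the write end of a new pipe looks like a terminal,
 * 0 if it does not, and -1 if the pipe could not be created.
 */
int pipe_write_end_is_tty(void)
{
    int fds[2];
    int is_tty;

    if (pipe(fds) != 0)
        return -1;

    /* This is the descriptor a job would inherit as its stdout. */
    is_tty = isatty(fds[1]);

    close(fds[0]);
    close(fds[1]);

    assert(is_tty == 0 || is_tty == 1);
    return is_tty;
}
```

A job whose stdout is such a pipe has no way to learn that make itself was
started on a terminal, which is why a pty would be needed to change that.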

harti


Re: make question

2011-04-28 Thread Arnaud Lacombe
Hi,

On Thu, Apr 28, 2011 at 11:52 AM, Hartmut Brandt hartmut.bra...@dlr.de wrote:
 I think we can change this, because it would break makefiles that assume
 that the entire script is given to the shell in one piece.

I'm not sure how to parse that. We can change it because it would break stuff.

That said, if something were to break, it would already show up
when using -j N, and thus should be considered a bug.

 - Arnaud


Re: OMAP3 Question

2011-04-27 Thread Freddie Cash
On Wed, Apr 27, 2011 at 1:30 PM, Chris Richardson
chris.richardson@gmail.com wrote:
 I want to emulate an OMAP3 processor. Is there an approach I can use to
 emulate OMAP3 without the need for any hardware?

Qemu has some basic support for this:
http://code.google.com/p/qemu-omap3/

No idea how good it is, or if it's even usable.

-- 
Freddie Cash
fjwc...@gmail.com


Re: SMP question w.r.t. reading kernel variables

2011-04-20 Thread Rick Macklem
 On Tue, Apr 19, 2011 at 12:00:29PM +,
 freebsd-hackers-requ...@freebsd.org wrote:
  Subject: Re: SMP question w.r.t. reading kernel variables
  To: Rick Macklem rmack...@uoguelph.ca
  Cc: freebsd-hackers@freebsd.org
  Message-ID: 201104181712.14457@freebsd.org
 
 [John Baldwin]
  On Monday, April 18, 2011 4:22:37 pm Rick Macklem wrote:
On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
 ...
   All of this makes sense. What I was concerned about was memory
   cache
   consistency and whet (if anything) has to be done to make sure a
   thread
   doesn't see a stale cached value for the memory location.
  
   Here's a generic example of what I was thinking of:
   (assume x is a global int and y is a local int on the thread's
   stack)
   - time proceeds down the screen
   thread X on CPU 0          thread Y on CPU 1
   x = 0;
                              x = 0; /* 0 for x's location
                                        in CPU 1's memory cache */
   x = 1;
                              y = x;
   -- now, is y guaranteed to be 1 or can it get the stale cached
   0 value?
   if not, what needs to be done to guarantee it?
 
  Well, the bigger problem is getting the CPU and compiler to order
  the
  instructions such that they don't execute out of order, etc. Because
  of that,
  even if your code has 'x = 0; x = 1;' as adjacent statements in thread
  X,
  the 'x = 1' may actually execute a good bit after the 'y = x' on CPU
  1.
 
 Actually, as I recall the rules for C, it's worse than that. For
 this (admittedly simplified scenario), x=0; in thread X may never
 execute unless it's declared volatile, as the compiler may optimize it
 out and emit no code for it.
 
 
  Locks force that to synchronize as the CPUs coordinate around the
  lock cookie
  (e.g. the 'mtx_lock' member of 'struct mutex').
 
   Also, I see cases of:
        mtx_lock(np);
        np->n_attrstamp = 0;
        mtx_unlock(np);
   in the regular NFS client. Why is the assignment mutex locked? (I
   had assumed
   it was related to the above memory caching issue, but now I'm not
   so sure.)
 
  In general I think writes to data that are protected by locks should
  always be
  protected by locks. In some cases you may be able to read data using
  weaker
  locking (where no locking can be a form of weaker locking, but
  also a
  read/shared lock is weak, and if a variable is protected by multiple
  locks,
  then any single lock is weak, but sufficient for reading while all of
  the
  associated locks must be held for writing) than writing, but writing
  generally
  requires full locking (write locks, etc.).
 
Oops, I now see that you've differentiated between writing and reading.
(I mistakenly just stated that you had recommended a lock for reading.
 Sorry about my misinterpretation of the above on the first quick read.)

 What he said. In addition to all that, lock operations generate
 atomic barriers which a compiler or optimizer is prevented from
 moving code across.
 
All good and useful comments, thanks.

The above example was meant to be contrived, to indicate what I was
worried about w.r.t. memory caches.
Here's a somewhat simplified version of what my actual problem is:
(Mostly fyi, in case you are interested.)

Thread X is doing a forced dismount of an NFS volume, it (in dounmount()):
- sets MNTK_UNMOUNTF
- calls VFS_SYNC()/nfs_sync()
  - so this doesn't get hung on an unresponsive server it must test
for MNTK_UNMOUNTF and return an error if it is set. This seems fine,
since it is the same thread and in a called function. (I can't
imagine that the optimizer could move setting of a global flag
to after a function call which might use it.)
- calls VFS_UNMOUNT()/nfs_unmount()
  - now the fun begins...
  after some other stuff, it calls nfscl_umount() to get rid of the
  state info (opens/locks...)
  nfscl_umount() - synchronizes with other threads that will use this
state (see below) using the combination of a mutex and a
shared/exclusive sleep lock. (Because of various quirks in the
code, this shared/exclusive lock is a locally coded version and
I happened to call the shared case a refcnt and the exclusive
case just a lock.)

Other threads that will use state info (open/lock...) will:
-call nfscl_getcl()
  - this function does two things that are relevant
  1 - it allocates a new clientid, as required, while holding the mutex
  - this case needs to check for MNTK_UNMOUNTF and return error, in
case the clientid has already been deleted by nfscl_umount() above.
  (This happens before #2 because the sleep lock is in the clientid
  structure.)
-- it must see the MNTK_UNMOUNTF set if it happens after (in a temporal sense)
  being set by dounmount()
  2 - while holding the mutex, it acquires the shared lock
  - if this happens before nfscl_umount() gets the exclusive lock, it is
fine, since acquisition of the exclusive lock above will wait for its

Re: SMP question w.r.t. reading kernel variables

2011-04-20 Thread Alan Cox
On Wed, Apr 20, 2011 at 7:42 AM, Rick Macklem rmack...@uoguelph.ca wrote:

  On Tue, Apr 19, 2011 at 12:00:29PM +,
  freebsd-hackers-requ...@freebsd.org wrote:
   Subject: Re: SMP question w.r.t. reading kernel variables
   To: Rick Macklem rmack...@uoguelph.ca
   Cc: freebsd-hackers@freebsd.org
   Message-ID: 201104181712.14457@freebsd.org
 
  [John Baldwin]
   On Monday, April 18, 2011 4:22:37 pm Rick Macklem wrote:
 On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
  ...
All of this makes sense. What I was concerned about was memory
cache
consistency and whet (if anything) has to be done to make sure a
thread
doesn't see a stale cached value for the memory location.
   
Here's a generic example of what I was thinking of:
(assume x is a global int and y is a local int on the thread's
stack)
- time proceeds down the screen
thread X on CPU 0          thread Y on CPU 1
x = 0;
                           x = 0; /* 0 for x's location
                                     in CPU 1's memory cache */
x = 1;
                           y = x;
-- now, is y guaranteed to be 1 or can it get the stale cached
0 value?
if not, what needs to be done to guarantee it?
  
   Well, the bigger problem is getting the CPU and compiler to order
   the
   instructions such that they don't execute out of order, etc. Because
   of that,
   even if your code has 'x = 0; x = 1;' as adjacent statements in thread
   X,
   the 'x = 1' may actually execute a good bit after the 'y = x' on CPU
   1.
 
  Actually, as I recall the rules for C, it's worse than that. For
  this (admittedly simplified scenario), x=0; in thread X may never
  execute unless it's declared volatile, as the compiler may optimize it
  out and emit no code for it.
 
 
   Locks force that to synchronize as the CPUs coordinate around the
   lock cookie
   (e.g. the 'mtx_lock' member of 'struct mutex').
  
Also, I see cases of:
 mtx_lock(np);
 np->n_attrstamp = 0;
 mtx_unlock(np);
in the regular NFS client. Why is the assignment mutex locked? (I
had assumed
it was related to the above memory caching issue, but now I'm not
so sure.)
  
   In general I think writes to data that are protected by locks should
   always be
   protected by locks. In some cases you may be able to read data using
   weaker
   locking (where no locking can be a form of weaker locking, but
   also a
   read/shared lock is weak, and if a variable is protected by multiple
   locks,
   then any single lock is weak, but sufficient for reading while all of
   the
   associated locks must be held for writing) than writing, but writing
   generally
   requires full locking (write locks, etc.).
 
 Oops, I now see that you've differentiated between writing and reading.
 (I mistakenly just stated that you had recommended a lock for reading.
  Sorry about my misinterpretation of the above on the first quick read.)

  What he said. In addition to all that, lock operations generate
  atomic barriers which a compiler or optimizer is prevented from
  moving code across.
 
 All good and useful comments, thanks.

 The above example was meant to be contrived, to indicate what I was
 worried about w.r.t. memory caches.
 Here's a somewhat simplified version of what my actual problem is:
 (Mostly fyi, in case you are interested.)

 Thread X is doing a forced dismount of an NFS volume, it (in dounmount()):
 - sets MNTK_UNMOUNTF
 - calls VFS_SYNC()/nfs_sync()
  - so this doesn't get hung on an unresponsive server it must test
for MNTK_UNMOUNTF and return an error if it is set. This seems fine,
since it is the same thread and in a called function. (I can't
imagine that the optimizer could move setting of a global flag
to after a function call which might use it.)
 - calls VFS_UNMOUNT()/nfs_unmount()
  - now the fun begins...
  after some other stuff, it calls nfscl_umount() to get rid of the
  state info (opens/locks...)
  nfscl_umount() - synchronizes with other threads that will use this
state (see below) using the combination of a mutex and a
shared/exclusive sleep lock. (Because of various quirks in the
code, this shared/exclusive lock is a locally coded version and
I happened to call the shared case a refcnt and the exclusive
case just a lock.)

 Other threads that will use state info (open/lock...) will:
 -call nfscl_getcl()
  - this function does two things that are relevant
  1 - it allocates a new clientid, as required, while holding the mutex
  - this case needs to check for MNTK_UNMOUNTF and return error, in
case the clientid has already been deleted by nfscl_umount() above.
  (This happens before #2 because the sleep lock is in the clientid
 structure.)
 -- it must see the MNTK_UNMOUNTF set if it happens after (in a temporal
 sense)
  being set by dounmount()
  2 - while holding the mutex, it acquires the shared lock

Re: SMP question w.r.t. reading kernel variables

2011-04-20 Thread Rick Macklem
[good stuff snipped for brevity]
 
 1. Set MNTK_UNMOUNTF
 2. Acquire a standard FreeBSD mutex m.
 3. Update some data structures.
 4. Release mutex m.
 
 Then, other threads that acquire m after step 4 has occurred will
 see
 MNTK_UNMOUNTF as set. But, other threads that beat thread X to step 2
 may
 or may not see MNTK_UNMOUNTF as set.
 
First off, Alan, thanks for the great explanation. I think it would be
nice if this was captured somewhere in the docs, if it isn't already
there somewhere (I couldn't spot it, but that doesn't mean anything:-).
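
For readers who want the four quoted steps in code form, here is a
userland analogue as a minimal sketch. It assumes POSIX threads and C11
atomics rather than the kernel's mutex(9) API; unmountf, state_generation,
unmount_side() and getcl_side() are hypothetical names, and EBUSY is an
arbitrary choice of error value.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static atomic_int unmountf;      /* stands in for MNTK_UNMOUNTF */
static int state_generation;     /* some data protected by m */

/* Thread X: the four steps from the quoted explanation. */
void unmount_side(void)
{
    /* Step 1: set the flag, without holding m. */
    atomic_store_explicit(&unmountf, 1, memory_order_relaxed);

    pthread_mutex_lock(&m);      /* Step 2 */
    state_generation++;          /* Step 3 */
    assert(state_generation > 0);
    pthread_mutex_unlock(&m);    /* Step 4 */
}

/*
 * Any other thread: if it acquires m after step 4, the unlock/lock
 * pair orders the flag store before this load, so it sees the flag.
 * A thread that gets m before step 2 may see either value.
 */
int getcl_side(void)
{
    int error;

    pthread_mutex_lock(&m);
    error = atomic_load_explicit(&unmountf, memory_order_relaxed)
        ? EBUSY : 0;
    pthread_mutex_unlock(&m);
    return error;
}
```

The flag is an atomic_int only so that the "beat thread X to step 2" case
is not a data race in C11 terms; the ordering guarantee itself comes from
the mutex release and the later acquire.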

 The question that I have about your specific scenario is concerned
 with
 VOP_SYNC(). Do you care if another thread performing nfscl_getcl()
 after
 thread X has performed VOP_SYNC() doesn't see MNTK_UNMOUNTF as set?

Well, no and yes. It doesn't matter if it doesn't see it after thread X
performed nfs_sync(), but it does matter that the threads calling nfscl_getcl()
see it before they compete with thread X for the sleep lock.

 Another
 relevant question is Does VOP_SYNC() acquire and release the same
 mutex as
 nfscl_umount() and nfscl_getcl()?
 
No. So, to get this to work correctly it sounds like I have to do one
of the following:
1 - mtx_lock(m); mtx_unlock(m); in nfs_sync(), where m is the mutex used
by nfscl_getcl() for the NFS open/lock state.
or
2 - mtx_lock(m); mtx_unlock(m); mtx_lock(m); before the point where I care
that the threads executing nfscl_getcl() see MNTK_UNMOUNTF set in
nfscl_umount().
or
3 - mtx_lock(m2); mtx_unlock(m2); in nfscl_getcl(), where m2 is the mutex used
by thread X when setting MNTK_UNMOUNTF, before mtx_lock(m); and then testing
MNTK_UNMOUNTF plus acquiring the sleep lock. (By doing it before, I can avoid
any LOR issue and do an msleep() without worrying about having two mutex
locks.)

I think #3 reads the best, so I'll probably do that one.

One more question, if you don't mind.

Is step 3 in your explanation necessary for this to work? If it is, I can
just create some global variable that I assign a value to between
mtx_lock(m2); mtx_unlock(m2); but it won't be used for anything, so I
thought I'd check if it is necessary?

Thanks again for the clear explanation, rick


Re: SMP question w.r.t. reading kernel variables

2011-04-20 Thread Rick Macklem
 [good stuff snipped for brevity]
 
  1. Set MNTK_UNMOUNTF
  2. Acquire a standard FreeBSD mutex m.
  3. Update some data structures.
  4. Release mutex m.
 
  Then, other threads that acquire m after step 4 has occurred will
  see
  MNTK_UNMOUNTF as set. But, other threads that beat thread X to step
  2
  may
  or may not see MNTK_UNMOUNTF as set.
 
 First off, Alan, thanks for the great explanation. I think it would be
 nice if this was captured somewhere in the docs, if it isn't already
 there somewhere (I couldn't spot it, but that doesn't mean
 anything:-).
 
  The question that I have about your specific scenario is concerned
  with
  VOP_SYNC(). Do you care if another thread performing nfscl_getcl()
  after
  thread X has performed VOP_SYNC() doesn't see MNTK_UNMOUNTF as set?
 
 Well, no and yes. It doesn't matter if it doesn't see it after thread
 X
 performed nfs_sync(), but it does matter that the threads calling
 nfscl_getcl()
 see it before they compete with thread X for the sleep lock.
 
  Another
  relevant question is Does VOP_SYNC() acquire and release the same
  mutex as
  nfscl_umount() and nfscl_getcl()?
 
 No. So, to get this to work correctly it sounds like I have to do one
 of the following:
 1 - mtx_lock(m); mtx_unlock(m); in nfs_sync(), where m is the mutex
 used
 by nfscl_getcl() for the NFS open/lock state.
 or
 2 - mtx_lock(m); mtx_unlock(m); mtx_lock(m); before the point where I
 care that the threads executing nfscl_getcl() see MNTK_UNMOUNTF set in
 nfscl_umount().
 or
 3 - mtx_lock(m2); mtx_unlock(m2); in nfscl_getcl(), where m2 is the
 mutex used by thread X when setting MNTK_UNMOUNTF, before mtx_lock(m);
 and then testing MNTK_UNMOUNTF plus acquiring the sleep lock. (By doing
 it before, I can avoid any LOR issue and do an msleep() without worrying
 about having two mutex locks.)
 
 I think #3 reads the best, so I'll probably do that one.
 
 One more question, if you don't mind.
 
 Is step 3 in your explanation necessary for this to work? If it is, I
 can just create
 some global variable that I assign a value to between mtx_lock(m2);
 mtx_unlock(m2);
 but it won't be used for anything, so I thought I'd check if it is
 necessary?
 
Oops, I screwed up this question. For my #3, all that needs to be done
in nfscl_getcl() before I care if it sees MNTK_UNMOUNTF set is mtx_lock(m2);
since that has already gone through your steps 1-4.

The question w.r.t. do you really need your step 3 would apply to the
cases where I was using m (the mutex nfscl_umount() and nfscl_getcl()
already use instead of the one used by thread X).

rick


Re: SMP question w.r.t. reading kernel variables

2011-04-19 Thread Clifton Royston
On Tue, Apr 19, 2011 at 12:00:29PM +, freebsd-hackers-requ...@freebsd.org 
wrote:
 Subject: Re: SMP question w.r.t. reading kernel variables
 To: Rick Macklem rmack...@uoguelph.ca
 Cc: freebsd-hackers@freebsd.org
 Message-ID: 201104181712.14457@freebsd.org

[John Baldwin]
 On Monday, April 18, 2011 4:22:37 pm Rick Macklem wrote:
   On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
...
  All of this makes sense. What I was concerned about was memory cache
  consistency and whet (if anything) has to be done to make sure a thread
  doesn't see a stale cached value for the memory location.
  
  Here's a generic example of what I was thinking of:
  (assume x is a global int and y is a local int on the thread's stack)
  - time proceeds down the screen
  thread X on CPU 0          thread Y on CPU 1
  x = 0;
                             x = 0; /* 0 for x's location in CPU
                                       1's memory cache */
  x = 1;
                             y = x;
  -- now, is y guaranteed to be 1 or can it get the stale cached 0 value?
  if not, what needs to be done to guarantee it?
 
 Well, the bigger problem is getting the CPU and compiler to order the
 instructions such that they don't execute out of order, etc.  Because of that,
 even if your code has 'x = 0; x = 1;' as adjacent statements in thread X,
 the 'x = 1' may actually execute a good bit after the 'y = x' on CPU 1.

  Actually, as I recall the rules for C, it's worse than that.  For
this (admittedly simplified) scenario, x=0; in thread X may never
execute unless it's declared volatile, as the compiler may optimize it
out and emit no code for it.


 Locks force that to synchronize as the CPUs coordinate around the lock cookie
 (e.g. the 'mtx_lock' member of 'struct mutex').
 
  Also, I see cases of:
   mtx_lock(np);
   np->n_attrstamp = 0;
   mtx_unlock(np);
  in the regular NFS client. Why is the assignment mutex locked? (I had 
  assumed
  it was related to the above memory caching issue, but now I'm not so sure.)
 
 In general I think writes to data that are protected by locks should always be
 protected by locks.  In some cases you may be able to read data using weaker
 locking (where no locking can be a form of weaker locking, but also a
 read/shared lock is weak, and if a variable is protected by multiple locks,
 then any single lock is weak, but sufficient for reading while all of the
 associated locks must be held for writing) than writing, but writing generally
 requires full locking (write locks, etc.).

  What he said.  In addition to all that, lock operations generate
atomic barriers which a compiler or optimizer is prevented from
moving code across.

  -- Clifton

-- 
Clifton Royston  --  clift...@iandicomputing.com / clift...@lava.net
   President  - I and I Computing * http://www.iandicomputing.com/
 Custom programming, network design, systems and network consulting services


Re: SMP question w.r.t. reading kernel variables

2011-04-18 Thread John Baldwin
On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
 Hi,
 
 I should know the answer to this, but... When reading a global kernel
 variable, where its modifications are protected by a mutex, is it
 necessary to get the mutex lock to just read its value?
 
 For example:
 A   if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
         return (EPERM);
 versus
 B   MNT_ILOCK(mp);
     if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
         MNT_IUNLOCK(mp);
         return (EPERM);
     }
     MNT_IUNLOCK(mp);
 
 My hunch is that B is necessary if you need an up-to-date value
 for the variable (mp->mnt_kern_flag in this case).
 
 Is that correct?

You already have good followups from Attilio and Kostik, but one thing to keep 
in mind is that if a simple read is part of a larger atomic operation then 
it may still need a lock.  In this case Kostik points out that another lock 
prevents updates to mnt_kern_flag so that this is safe.  However, if not for 
that you would need to consider the case that another thread sets the flag on 
the next instruction.  Even the B case above might still have that problem 
since you drop the lock right after checking it and the rest of the function 
is implicitly assuming the flag is never set perhaps (or it needs to handle 
the case that the flag might become set in the future while MNT_ILOCK() is 
dropped).

One way you can make that code handle that race is by holding MNT_ILOCK() 
around the entire function, but that approach is often only suitable for a 
simple routine.
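
The check-then-act point can be sketched in userland terms as follows.
This is only an illustration under assumptions: it uses a POSIX mutex in
place of MNT_ILOCK(), and mnt_lock, unmounting, active_ops and the three
functions are hypothetical names, not anything from the VFS code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t mnt_lock = PTHREAD_MUTEX_INITIALIZER;
static int unmounting;   /* stands in for MNTK_UNMOUNTF */
static int active_ops;   /* work that must not start once unmounting */

/* The unmount side sets the flag under the lock. */
void begin_unmount(void)
{
    pthread_mutex_lock(&mnt_lock);
    unmounting = 1;
    pthread_mutex_unlock(&mnt_lock);
}

/*
 * The check and the action that depends on it share one critical
 * section, so the flag cannot become set between the two.
 */
int start_op(void)
{
    int error = 0;

    pthread_mutex_lock(&mnt_lock);
    if (unmounting)
        error = EPERM;
    else
        active_ops++;    /* still under the lock */
    pthread_mutex_unlock(&mnt_lock);
    return error;
}

/* Read the counter under the same lock. */
int active_op_count(void)
{
    int n;

    pthread_mutex_lock(&mnt_lock);
    n = active_ops;
    pthread_mutex_unlock(&mnt_lock);
    assert(n >= 0);
    return n;
}
```

Dropping the lock between the test of unmounting and the increment would
reintroduce the window described above, in which the flag is set right
after it was checked.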

-- 
John Baldwin


Re: SMP question w.r.t. reading kernel variables

2011-04-18 Thread Rick Macklem
 On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
  Hi,
 
  I should know the answer to this, but... When reading a global
  kernel
  variable, where its modifications are protected by a mutex, is it
  necessary to get the mutex lock to just read its value?
 
  For example:
  A   if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
          return (EPERM);
  versus
  B   MNT_ILOCK(mp);
      if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
          MNT_IUNLOCK(mp);
          return (EPERM);
      }
      MNT_IUNLOCK(mp);
 
  My hunch is that B is necessary if you need an up-to-date value
  for the variable (mp->mnt_kern_flag in this case).
 
  Is that correct?
 
 You already have good followups from Attilio and Kostik, but one thing
 to keep
 in mind is that if a simple read is part of a larger atomic
 operation then
 it may still need a lock. In this case Kostik points out that another
 lock
 prevents updates to mnt_kern_flag so that this is safe. However, if
 not for
 that you would need to consider the case that another thread sets the
 flag on
 the next instruction. Even the B case above might still have that
 problem
 since you drop the lock right after checking it and the rest of the
 function
 is implicitly assuming the flag is never set perhaps (or it needs to
 handle
 the case that the flag might become set in the future while
 MNT_ILOCK() is
 dropped).
 
 One way you can make that code handle that race is by holding
 MNT_ILOCK()
 around the entire function, but that approach is often only suitable
 for a
 simple routine.
 
All of this makes sense. What I was concerned about was memory cache
consistency and whet (if anything) has to be done to make sure a thread
doesn't see a stale cached value for the memory location.

Here's a generic example of what I was thinking of:
(assume x is a global int and y is a local int on the thread's stack)
- time proceeds down the screen
thread X on CPU 0          thread Y on CPU 1
x = 0;
                           x = 0; /* 0 for x's location in CPU 1's
                                     memory cache */
x = 1;
                           y = x;
-- now, is y guaranteed to be 1 or can it get the stale cached 0 value?
if not, what needs to be done to guarantee it?

For the original example, I am fine so long as the bit is seen as set after
dounmount() has set it.

Also, I see cases of:
 mtx_lock(np);
 np->n_attrstamp = 0;
 mtx_unlock(np);
in the regular NFS client. Why is the assignment mutex locked? (I had assumed
it was related to the above memory caching issue, but now I'm not so sure.)

Thanks a lot for all the good responses, rick
ps: I guess it comes down to whether or not atomic includes ensuring memory
cache consistency. I'll admit I assumed atomic meant that the memory
access or modify couldn't be interleaved with one done to the same location
by another CPU, but not memory cache consistency.


Re: SMP question w.r.t. reading kernel variables

2011-04-18 Thread Rick Macklem
 
 All of this makes sense. What I was concerned about was memory cache
 consistency and whet (if anything) has to be done to make sure a
Oops, whet should have been what..


Re: SMP question w.r.t. reading kernel variables

2011-04-18 Thread John Baldwin
On Monday, April 18, 2011 4:22:37 pm Rick Macklem wrote:
  On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote:
   Hi,
  
   I should know the answer to this, but... When reading a global
   kernel
   variable, where its modifications are protected by a mutex, is it
   necessary to get the mutex lock to just read its value?
  
   For example:
   A   if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
           return (EPERM);
   versus
   B   MNT_ILOCK(mp);
       if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
           MNT_IUNLOCK(mp);
           return (EPERM);
       }
       MNT_IUNLOCK(mp);
  
   My hunch is that B is necessary if you need an up-to-date value
   for the variable (mp->mnt_kern_flag in this case).
  
   Is that correct?
  
  You already have good followups from Attilio and Kostik, but one thing
  to keep
  in mind is that if a simple read is part of a larger atomic
  operation then
  it may still need a lock. In this case Kostik points out that another
  lock
  prevents updates to mnt_kern_flag so that this is safe. However, if
  not for
  that you would need to consider the case that another thread sets the
  flag on
  the next instruction. Even the B case above might still have that
  problem
  since you drop the lock right after checking it and the rest of the
  function
  is implicitly assuming the flag is never set perhaps (or it needs to
  handle
  the case that the flag might become set in the future while
  MNT_ILOCK() is
  dropped).
  
  One way you can make that code handle that race is by holding
  MNT_ILOCK()
  around the entire function, but that approach is often only suitable
  for a
  simple routine.
  
 All of this makes sense. What I was concerned about was memory cache
 consistency and whet (if anything) has to be done to make sure a thread
 doesn't see a stale cached value for the memory location.
 
 Here's a generic example of what I was thinking of:
 (assume x is a global int and y is a local int on the thread's stack)
 - time proceeds down the screen
 thread X on CPU 0          thread Y on CPU 1
 x = 0;
                            x = 0; /* 0 for x's location in CPU 1's
                                      memory cache */
 x = 1;
                            y = x;
 -- now, is y guaranteed to be 1 or can it get the stale cached 0 value?
 if not, what needs to be done to guarantee it?

Well, the bigger problem is getting the CPU and compiler to order the
instructions such that they don't execute out of order, etc.  Because of that,
even if your code has 'x = 0; x = 1;' as adjacent statements in thread X,
the 'x = 1' may actually execute a good bit after the 'y = x' on CPU 1.
Locks force that to synchronize as the CPUs coordinate around the lock cookie
(e.g. the 'mtx_lock' member of 'struct mutex').

 Also, I see cases of:
  mtx_lock(np);
  np->n_attrstamp = 0;
  mtx_unlock(np);
 in the regular NFS client. Why is the assignment mutex locked? (I had assumed
 it was related to the above memory caching issue, but now I'm not so sure.)

In general I think writes to data that are protected by locks should always be
protected by locks.  In some cases you may be able to read data using weaker
locking (where no locking can be a form of weaker locking, but also a
read/shared lock is weak, and if a variable is protected by multiple locks,
then any single lock is weak, but sufficient for reading while all of the
associated locks must be held for writing) than writing, but writing generally
requires full locking (write locks, etc.).

The case above may be excessive caution on my part, but I'd rather be safe than
sorry for writes.

-- 
John Baldwin


Re: SMP question w.r.t. reading kernel variables

2011-04-17 Thread Attilio Rao
2011/4/17 Rick Macklem rmack...@uoguelph.ca:
 Hi,

 I should know the answer to this, but... When reading a global kernel
 variable, where its modifications are protected by a mutex, is it
 necessary to get the mutex lock to just read its value?

 For example:
 A   if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
         return (EPERM);
 versus
 B   MNT_ILOCK(mp);
     if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
         MNT_IUNLOCK(mp);
         return (EPERM);
     }
     MNT_IUNLOCK(mp);

 My hunch is that B is necessary if you need an up-to-date value
 for the variable (mp->mnt_kern_flag in this case).

 Is that correct?

 Thanks in advance for help with this, rick

In general, the FreeBSD kernel assumes that reads and writes on the word
boundary and of the int types are always atomic.

Considering this, if a kernel variable is of int type or of word boundary
size, you don't strictly need a lock there.
Anyway, locking also brings some side effects, like the use of memory and
compiler barriers... while it is true that bounded reads/writes should
be seq points, it is not too obvious to predict whether the barrier is
necessary or not, or is implied or not, on every architecture we support.

That said, for the sake of consistency and better semantics, I
prefer to also lock simple reads/writes when the objects are
explicitly equipped to do that.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


Re: SMP question w.r.t. reading kernel variables

2011-04-17 Thread Kostik Belousov
On Sun, Apr 17, 2011 at 03:49:48PM -0400, Rick Macklem wrote:
 Hi,
 
 I should know the answer to this, but... When reading a global kernel
 variable, where its modifications are protected by a mutex, is it
 necessary to get the mutex lock to just read its value?
 
 For example:
 A    if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
          return (EPERM);
 versus
 B    MNT_ILOCK(mp);
      if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
          MNT_IUNLOCK(mp);
          return (EPERM);
      }
      MNT_IUNLOCK(mp);
 
 My hunch is that B is necessary if you need an up-to-date value
 for the variable (mp->mnt_kern_flag in this case).
 
 Is that correct?

mnt_kern_flag read is atomic on all architectures.
If, as I suspect, the fragment is for the VFS_UNMOUNT() fs method,
then VFS guarantees the stability of mnt_kern_flag, by blocking
other attempts to unmount until current one is finished.
If not, then either you do not need the lock, or the provided snippet
which takes the lock is insufficient, since you drop the lock
but continue with an action that depends on the flag not being set.
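
The last point, testing the flag under the lock and then acting after the lock is dropped, can be shown with a small userland sketch. This is not the VFS code: it uses POSIX threads, and `struct mount_like`, `FLAG_UNMOUNTF`, `do_op_racy()` and `do_op_locked()` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define FLAG_UNMOUNTF 0x1   /* hypothetical stand-in for MNTK_UNMOUNTF */

/* Hypothetical stand-in for struct mount: a flag word guarded by a mutex. */
struct mount_like {
    pthread_mutex_t lock;
    int             kern_flag;
    int             ops_done;   /* work that must not happen once the flag is set */
};

/*
 * Racy variant: the flag is tested under the lock, but the lock is dropped
 * before the dependent action, so the flag may be set in between.
 */
int do_op_racy(struct mount_like *mp)
{
    pthread_mutex_lock(&mp->lock);
    if ((mp->kern_flag & FLAG_UNMOUNTF) != 0) {
        pthread_mutex_unlock(&mp->lock);
        return EPERM;
    }
    pthread_mutex_unlock(&mp->lock);
    /* window: another thread can set FLAG_UNMOUNTF right here */
    mp->ops_done++;
    return 0;
}

/* Safer variant: the test and the dependent action share one critical section. */
int do_op_locked(struct mount_like *mp)
{
    int error = 0;

    pthread_mutex_lock(&mp->lock);
    if ((mp->kern_flag & FLAG_UNMOUNTF) != 0)
        error = EPERM;
    else
        mp->ops_done++;
    pthread_mutex_unlock(&mp->lock);
    return error;
}
```

In the racy variant the lock adds nothing over the plain read in A. The lock only helps if it is held until the action that depends on the flag is complete, or if something else (as VFS does for unmount) keeps the flag stable.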




Re: SMP question w.r.t. reading kernel variables

2011-04-17 Thread Rick Macklem
 On Sun, Apr 17, 2011 at 03:49:48PM -0400, Rick Macklem wrote:
  Hi,
 
  I should know the answer to this, but... When reading a global
  kernel
  variable, where its modifications are protected by a mutex, is it
  necessary to get the mutex lock to just read its value?
 
  For example:
  A    if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
           return (EPERM);
  versus
  B    MNT_ILOCK(mp);
       if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
           MNT_IUNLOCK(mp);
           return (EPERM);
       }
       MNT_IUNLOCK(mp);
 
  My hunch is that B is necessary if you need an up-to-date value
  for the variable (mp->mnt_kern_flag in this case).
 
  Is that correct?
 
 mnt_kern_flag read is atomic on all architectures.
 If, as I suspect, the fragment is for the VFS_UNMOUNT() fs method,
 then VFS guarantees the stability of mnt_kern_flag, by blocking
 other attempts to unmount until current one is finished.
 If not, then either you do not need the lock, or the provided snippet
 which takes the lock is insufficient, since you drop the lock
 but continue with an action that depends on the flag not being set.

Sounds like A should be ok then. The tests matter when dounmount()
calls VFS_SYNC() and VFS_UNMOUNT(), pretty much as you guessed. To
be honest, most of it will be the thread doing the dounmount() call,
although other threads fall through VOP_INACTIVE() while they are
terminating in VFS_UNMOUNT() and these need to do the test, too.
{ I just don't know much about the SMP stuff, so I don't know when
  a cache on another core might still have a stale copy of a value.
  I've heard the term memory barrier, but don't really know what it
  means.:-)

Thanks, rick
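
For readers wondering what a memory barrier buys, here is a minimal userland sketch using C11 atomics and POSIX threads. It is an illustration only, not kernel code: the kernel has its own primitives in atomic(9) (for example `atomic_store_rel_int()` and `atomic_load_acq_int()`), and all names below (`payload`, `ready`, `run_handoff()`) are hypothetical. The writer publishes data with a release store; the reader uses an acquire load, so once it sees the flag it is also guaranteed to see the data written before the flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data published by the writer */
static atomic_int ready;   /* flag meaning "payload is valid" */

/* Writer: store the data, then publish it with release ordering. */
static void *writer(void *arg)
{
    (void)arg;
    payload = 42;
    /* release: the payload store cannot be reordered after this store */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Reader: wait for the flag with acquire ordering, then read the data. */
static void *reader(void *arg)
{
    int *out = arg;

    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;   /* spin until the writer has published */
    *out = payload;
    return NULL;
}

/* Run one writer and one reader; return the value the reader observed. */
int run_handoff(void)
{
    pthread_t w, r;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, 0);
    pthread_create(&r, NULL, reader, &seen);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Without the release/acquire pair, the compiler or a weakly ordered CPU would be free to make the flag visible before the payload. Taking and releasing a mutex implies barriers of this kind, which is one reason locked reads are the conservative choice.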


Re: Scheduler question

2011-02-23 Thread Daniel O'Connor

On 04/02/2011, at 13:26, Daniel O'Connor wrote:
 I am writing a program which reads from a data acquisition chassis connected 
 to a radar via USB. The interface is a Cypress FX2 and I am communicating via 
 libusb.

I ended up writing a kernel driver (thank you hps for usb_fifo_*!) and it has 
greatly improved things which is good news for me :)

I will run some of the tests suggested by various people soon; I have to wait 
for a new PC to do them, though.

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-18 Thread Daniel O'Connor

On 04/02/2011, at 13:26, Daniel O'Connor wrote:
 I only have about 10 milliseconds of buffering (96kbyte FIFO, 8Mbyte/sec) in 
 the hardware, however I have about 128Mb of USB requests queued up to libusb. 
 hps@ informed me that libusb will only queue 16kbyte (2msec) in the kernel at 
 one time although I have increased this.

We have upped the hardware FIFO size to 768kbyte, which is 91msec at 8Mbyte/sec, 
although because we only start reading out when it's 1/6th full the 
effective buffer is 75msec.

It does seem much more resilient to CPU load, however heavy disk activity on 
the same drive still stalls it for too long :(

Given the large buffering in the program it does seem very odd that it would 
stall for long enough unless both threads are put to sleep while one is waiting 
for disk IO (which seems like a bug to me).

BTW I have changed to -current (without WITNESS).

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-14 Thread Daniel O'Connor

On 11/02/2011, at 6:58, Matthew Dillon wrote:
   It sounds like there are at least two issues involved.
 
   The first could be a buffer cache starvation issue due to the load on
   the filesystem from the tar.  If the usb program is doing any filesystem
   operation at all, even at low bandwidths, it could be hitting blockages
   due to the disk intensive tar eating up available buffer cache buffers
   (e.g. causing an excessive ratio of dirty buffers vs clean buffers).
   This would NOT be a scheduler problem per se, but instead a kernel
   resource management problem.

OK..
Note that my program is split into 2 threads and queues up a large number of 
buffers. One thread just calls the libusb event handler so if the main thread 
is blocked for IO it should still run.. right? :)

   The way to test this is to double-buffer or triple-buffer the output
   via shared memory.  A pipe might not do the job if it gets stuck doing
   direct transfers (I eventually gave up trying to optimize pipes in DFly
   due to a similar problem and just run everything through a kernel buffer
   now).  Still, it may be possible to test against this particular problem
   by having the program write to a pipe and another program or fork handle
   the actual writing to the disk or filesystem.

Hmm.. in effect I have this as I write all data to disk via mbuffer and this 
did help, but it still drops out which seems to indicate to me that my libusb 
event loop thread is being stalled. 

Note that the total CPU consumed by it is very low (1%) and that thread does 
no I/O.
 
   Another way to test this is to comment out the writing in the usb program
   entirely and see if things improve.

If I write to /dev/null it works fine.

   The second issue sounds more scheduler-related.  Try running the
   usb program at nice -20?  You could even run it at a pseudo-realtime
   priority using rtprio but nice -20 had better work properly against
   a md5 or there is something seriously broken in the scheduler.

Unfortunately neither of these improves things; I am pretty surprised a nice -20 
or rtprio'd thread doesn't beat a pure CPU user doing no IO :(
 
   Dynamic priority handling is supposed to deal with this sort of thing
   automatically, particularly if the usb program is not using a lot of
   cpu, but sometimes it can't tell whether a newly-exec'd program is
   going to be interactive or batch until after it has run for a while.
 
   Tuning initial conditions after an exec for the scheduler is not an
   easy task.  Simply giving a program a more batch/bulk-run priority on
   exec and letting the dynamic priority shift it more to interactive
   operation tends to mess up interactive shells in the face of
   cpu-intensive system operation, for example.  Theoretically dynamic
   priority handling should bump up the priority for the usb program well
   beyond any initial conditions for exec once it has been running a while,
   assuming it doesn't use tons of cpu.

Hmm.. It is unfortunate the hinting mechanisms are very coarse :(

   An md5, or any single-file reading operation, would not overload the
   buffer cache.  File writing and/or multi-file operations (such as a
   tar extraction or a tar-up) can create blockages in the buffer cache.

The md5 process is just reading /dev/null - I run it to soak up the CPU because 
in production the system will be doing CPU intensive data analysis.

   It takes a considerable amount of VM/buffer-cache tuning to get those
   subsystems to pipeline properly and sometimes things can go stale and
   stop pipelining properly for months without anyone realizing it.

:(
I am waiting on a new buffer card with 8 times bigger FIFOs which should help I 
hope..

Also I am writing a kernel driver in the hope it will be more robust :)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-10 Thread Matthew Dillon
   It sounds like there are at least two issues involved.

   The first could be a buffer cache starvation issue due to the load on
   the filesystem from the tar.  If the usb program is doing any filesystem
   operation at all, even at low bandwidths, it could be hitting blockages
   due to the disk intensive tar eating up available buffer cache buffers
   (e.g. causing an excessive ratio of dirty buffers vs clean buffers).
   This would NOT be a scheduler problem per se, but instead a kernel
   resource management problem.

   The way to test this is to double-buffer or triple-buffer the output
   via shared memory.  A pipe might not do the job if it gets stuck doing
   direct transfers (I eventually gave up trying to optimize pipes in DFly
   due to a similar problem and just run everything through a kernel buffer
   now).  Still, it may be possible to test against this particular problem
   by having the program write to a pipe and another program or fork handle
   the actual writing to the disk or filesystem.

   Another way to test this is to comment out the writing in the usb program
   entirely and see if things improve.
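
   The pipe-and-fork test described above could look roughly like the
   following. It is a sketch under stated assumptions, not the poster's
   program: `start_disk_writer()` and `stop_disk_writer()` are hypothetical
   helper names, and the 64 KB copy buffer is an arbitrary choice. The
   acquisition process writes into the pipe and a forked child is the only
   process that touches the filesystem.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that copies everything from a pipe into `path`.
 * Returns the write end of the pipe (or -1) and stores the child pid.
 */
int start_disk_writer(const char *path, pid_t *child)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: only this process ever blocks on the filesystem */
        char buf[65536];
        ssize_t n;
        int out;

        close(fds[1]);
        out = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (out == -1)
            _exit(1);
        while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
            ssize_t off = 0;
            while (off < n) {               /* handle short writes */
                ssize_t w = write(out, buf + off, (size_t)(n - off));
                if (w == -1)
                    _exit(1);
                off += w;
            }
        }
        close(out);
        _exit(n == 0 ? 0 : 1);
    }
    close(fds[0]);
    *child = pid;
    return fds[1];
}

/* Close the pipe and wait for the writer; returns its exit status or -1. */
int stop_disk_writer(int wfd, pid_t child)
{
    int status;

    close(wfd);
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

   Note that a pipe holds only a limited amount of data in the kernel, so
   the producer can still block once it fills; a shared-memory ring with
   two or three large buffers avoids that at the cost of more code.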

   --

   The second issue sounds more scheduler-related.  Try running the
   usb program at nice -20?  You could even run it at a pseudo-realtime
   priority using rtprio but nice -20 had better work properly against
   a md5 or there is something seriously broken in the scheduler.
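
   The same priority changes can also be requested from inside the
   program. The sketch below is an assumption about how one might do it,
   not part of the original program: `request_nice_minus_20()` and
   `request_realtime()` are hypothetical helper names. The first uses the
   portable setpriority(2) call; the second uses FreeBSD's rtprio(2) and
   is compiled only on FreeBSD. Both normally need root privileges, so
   they return an errno value rather than assuming success.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#ifdef __FreeBSD__
#include <sys/rtprio.h>
#endif

/* Ask for nice value -20 for the calling process. Returns 0 or an errno value. */
int request_nice_minus_20(void)
{
    if (setpriority(PRIO_PROCESS, 0, -20) == -1)
        return errno;
    return 0;
}

/*
 * Ask for the highest realtime priority via rtprio(2).
 * Returns 0 or an errno value; ENOSYS on systems other than FreeBSD.
 */
int request_realtime(void)
{
#ifdef __FreeBSD__
    struct rtprio rtp = { .type = RTP_PRIO_REALTIME, .prio = 0 };

    if (rtprio(RTP_SET, 0, &rtp) == -1)
        return errno;
    return 0;
#else
    return ENOSYS;
#endif
}
```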

   Dynamic priority handling is supposed to deal with this sort of thing
   automatically, particularly if the usb program is not using a lot of
   cpu, but sometimes it can't tell whether a newly-exec'd program is
   going to be interactive or batch until after it has run for a while.

   Tuning initial conditions after an exec for the scheduler is not an
   easy task.  Simply giving a program a more batch/bulk-run priority on
   exec and letting the dynamic priority shift it more to interactive
   operation tends to mess up interactive shells in the face of
   cpu-intensive system operation, for example.  Theoretically dynamic
   priority handling should bump up the priority for the usb program well
   beyond any initial conditions for exec once it has been running a while,
   assuming it doesn't use tons of cpu.

   --

   An md5, or any single-file reading operation, would not overload the
   buffer cache.  File writing and/or multi-file operations (such as a
   tar extraction or a tar-up) can create blockages in the buffer cache.

   It takes a considerable amount of VM/buffer-cache tuning to get those
   subsystems to pipeline properly and sometimes things can go stale and
   stop pipelining properly for months without anyone realizing it.

-Matt


Re: Scheduler question

2011-02-07 Thread Ivan Voras
On 07/02/2011 04:12, Daniel O'Connor wrote:

 On 07/02/2011, at 13:02, Ivan Voras wrote:
 I'll be looking at it on Monday, I will let you know :)

 No luck with mlock(), so it wouldn't appear that paging is the issue :(

 I'm also interested in raw device vs file system access!

 Oops, sorry.. I just tried that now but it doesn't improve things :(

Meaning: you still get jitter?

 I am writing directly to /dev/ad10 but stressing /dev/ad14 (sudo tar -cf 
 /dev/null /local0)

Can you do only one of those things? I.e. leave all the file systems
alone and just do something like 'diskinfo -vt /dev/ad14'?


Re: Scheduler question

2011-02-07 Thread Daniel O'Connor

On 07/02/2011, at 21:07, Ivan Voras wrote:
 I'm also interested in raw device vs file system access!
 
 Oops, sorry.. I just tried that now but it doesn't improve things :(
 
 Meaning: you still get jitter?

Yes, well I didn't measure the read frequency but it dropped out (stopped 
streaming due to a full FIFO) no less often.

 I am writing directly to /dev/ad10 but stressing /dev/ad14 (sudo tar -cf 
 /dev/null /local0)
 
 Can you do only one of those things? I.e. leave all the file systems
 alone and just do something like 'diskinfo -vt /dev/ad14'?

OK, I wrote the data to /dev/null from USB and ran diskinfo in a loop and it 
doesn't drop out.

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-07 Thread Ivan Voras
On 7 February 2011 13:38, Daniel O'Connor docon...@gsoft.com.au wrote:

 I am writing directly to /dev/ad10 but stressing /dev/ad14 (sudo tar -cf 
 /dev/null /local0)

 Can you do only one of those things? I.e. leave all the file systems
 alone and just do something like 'diskinfo -vt /dev/ad14'?

 OK, I wrote the data to /dev/null from USB and ran diskinfo in a loop and it 
 doesn't drop out.

Maybe I misunderstood you and it's a different problem than what I was
experiencing; is this a better description of your problem:

1) you have a program communicating with a USB device
2) it reads from the device and writes to a file
3) you experience stalls when you write the data received from the USB
device to the file but only if the file system you're writing on is
also loaded by something else - heavy reads?

?


Re: Scheduler question

2011-02-07 Thread Daniel O'Connor

On 07/02/2011, at 23:36, Ivan Voras wrote:
 OK, I wrote the data to /dev/null from USB and ran diskinfo in a loop and it 
 doesn't drop out.
 
 Maybe I misunderstood you and it's a different problem than what I was
 experiencing; is this a better description of your problem:
 
 1) you have a program communicating with a USB device
 2) it reads from the device and writes to a file
 3) you experience stalls when you write the data received from the USB
 device to the file but only if the file system you're writing on is
 also loaded by something else - heavy reads?
 
 ?

Yes, however CPU loading also seems to affect it.

Unfortunately I don't have a useful measurement to show the problem - ie I 
don't have a metric which correlates with the hardware FIFO filling up.

This makes the testing rather annoying :)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-06 Thread Daniel O'Connor

On 05/02/2011, at 12:43, Daniel O'Connor wrote:
 On 05/02/2011, at 11:09, Ivan Voras wrote:
 It doesn't allocate memory once it's going, everything is preallocated 
 before the data transfer starts.
 
 I'll have a go with mlock() and see what happens.
 
 Did you find anything interesting?
 
 I'll be looking at it on Monday, I will let you know :)

No luck with mlock(), so it wouldn't appear that paging is the issue :(

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-06 Thread Ivan Voras
On 7 February 2011 02:41, Daniel O'Connor docon...@gsoft.com.au wrote:

 On 05/02/2011, at 12:43, Daniel O'Connor wrote:
 On 05/02/2011, at 11:09, Ivan Voras wrote:
 It doesn't allocate memory once it's going, everything is preallocated 
 before the data transfer starts.

 I'll have a go with mlock() and see what happens.

 Did you find anything interesting?

 I'll be looking at it on Monday, I will let you know :)

 No luck with mlock(), so it wouldn't appear that paging is the issue :(

I'm also interested in raw device vs file system access!


Re: Scheduler question

2011-02-06 Thread Daniel O'Connor

On 07/02/2011, at 13:02, Ivan Voras wrote:
 I'll be looking at it on Monday, I will let you know :)
 
 No luck with mlock(), so it wouldn't appear that paging is the issue :(
 
 I'm also interested in raw device vs file system access!

Oops, sorry.. I just tried that now but it doesn't improve things :(

I am writing directly to /dev/ad10 but stressing /dev/ad14 (sudo tar -cf 
/dev/null /local0)

It is interesting also that if I have md5's soaking up CPU then it's much less 
likely to start streaming properly and generally bombs out straight away. If I 
start it streaming and then start md5 it stays running... (even if it's 
rtprio'd)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-04 Thread Ivan Voras

On 04/02/2011 03:56, Daniel O'Connor wrote:


I hooked up a logic analyser and I can see most of the time it's fairly 
regularly transferring 16k of data every 2msec.

If I load up the disk by, eg, tar -cf /dev/null /local0 I find it drops out and 
I can see gaps in the transfers until eventually the FIFO fills up and it stops.

I am wondering if this is a scheduler problem (or I am expecting too much :) in 
that it is not running my libusb thread reliably under load. The other 
possibility is that it is a USB issue, although I am looking at using 
isochronous transfers instead of bulk.


I'm surprised this isn't complained about more often - I also regularly 
see that file system activity blocks other, non-file-using processes 
which are mostly CPU and memory intensive (but since I'm not running 
realtime things, it fell under the "good enough" category). Maybe there 
is a global-ish lock of some kind which the VM or the VFS holds 
which would interfere with normal operation of other processes (maybe 
when the processes use malloc() to grow their memory?).


Could you try 2 things:

	1) instead of doing file IO, could you directly use a disk device (e.g. 
/dev/ad0), possibly with some more intensive utility than dd (e.g. 
diskinfo -vt) and see if there is any difference?


	2) if there is a difference in 1), try modifying your program to not 
use malloc() in the critical path (if applicable) and/or use mlock(2)?





Re: Scheduler question

2011-02-04 Thread Daniel O'Connor

On 04/02/2011, at 21:48, Ivan Voras wrote:
 I am wondering if this is a scheduler problem (or I am expecting too much :) 
 in that it is not running my libusb thread reliably under load. The other 
 possibility is that it is a USB issue, although I am looking at using 
 isochronous transfers instead of bulk.
 
 I'm surprised this isn't complained about more often - I also regularly see 
 that file system activity blocks other, non-file-using processes which are 
 mostly CPU and memory intensive (but since I'm not running realtime things, 
 it fell under the good enough category). Maybe there is kind of global-ish 
 lock of some kind which the VM or the VFS hold which would interfere with 
 normal operation of other processes (maybe when the processes use malloc() to 
 grow their memory?).

I guess for an interactive user anything less than 100msec is probably not 
noticeable unless it happens reasonably regularly when watching a video.

 Could you try 2 things:
 
   1) instead of doing file IO, could you directly use a disk device (e.g. 
 /dev/ad0), possibly with some more intensive utility than dd (e.g. diskinfo 
 -vt) and see if there is any difference?

OK, I'll give it a shot.

   2) if there is a difference in 1), try modifying your program to not 
 use malloc() in the critical path (if applicable) and/or use mlock(2)?

It doesn't allocate memory once it's going, everything is preallocated before 
the data transfer starts.

I'll have a go with mlock() and see what happens.

Thanks :)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: Scheduler question

2011-02-04 Thread Ivan Voras

On 04/02/2011 12:45, Daniel O'Connor wrote:


On 04/02/2011, at 21:48, Ivan Voras wrote:

I am wondering if this is a scheduler problem (or I am expecting too much :) in 
that it is not running my libusb thread reliably under load. The other 
possibility is that it is a USB issue, although I am looking at using 
isochronous transfers instead of bulk.


I'm surprised this isn't complained about more often - I also regularly see that file 
system activity blocks other, non-file-using processes which are mostly CPU and memory 
intensive (but since I'm not running realtime things, it fell under the good 
enough category). Maybe there is kind of global-ish lock of some kind which the VM 
or the VFS hold which would interfere with normal operation of other processes (maybe 
when the processes use malloc() to grow their memory?).


I guess for an interactive user anything less than 100msec is probably not 
noticeable unless it happens reasonably regularly when watching a video.


Could you try 2 things:

1) instead of doing file IO, could you directly use a disk device (e.g. 
/dev/ad0), possibly with some more intensive utility than dd (e.g. diskinfo 
-vt) and see if there is any difference?


OK, I'll give it a shot.


2) if there is a difference in 1), try modifying your program to not 
use malloc() in the critical path (if applicable) and/or use mlock(2)?


It doesn't allocate memory once it's going, everything is preallocated before 
the data transfer starts.

I'll have a go with mlock() and see what happens.


Did you find anything interesting?



Re: Scheduler question

2011-02-04 Thread Daniel O'Connor

On 05/02/2011, at 11:09, Ivan Voras wrote:
 It doesn't allocate memory once it's going, everything is preallocated 
 before the data transfer starts.
 
 I'll have a go with mlock() and see what happens.
 
 Did you find anything interesting?

I'll be looking at it on Monday, I will let you know :)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C








Re: A question about WARNING: attempt to domain_add(xyz) after domainfinalize()

2011-01-13 Thread Stefan Esser
Am 13.01.2011 06:42, schrieb Julian Elischer:
 On 1/12/11 5:26 AM, Svatopluk Kraus wrote:
 Hi,

 I'd like to add a new network domain into kernel (and never remove it)
 from loadable module. In fact, I did it, but I got following warning
 from domain_add(): WARNING: attempt to domain_add(xyz) after
 domainfinalize(). Now, I try to figure out what is behind the
 warning, which seems to become KASSERT (now, in notyet section part of
 code, which is 6 years old).
 
 just ignore that message, everyone else does :-)

Well, yes, but there are actual problems in the current code and it
needs to be worked on. I checked the situation the last time the
question arose about half a year ago, and there are races and real
bugs (IIRC, you can only add one domain from a KLD, a second one
will not be initialized). My spare time was (and to date is) very
limited, but I still intend to prepare a fix (not sure whether the
races can be completely avoided, but they are extremely hard to
trigger since they only exist during module load).

 The problem is that the idea of domainfinalize() is incompatible with
 having the ability to add domains from modules.
 Luckily domainfinalize() doesn't actually do anything that stops your new
 domain from working so it doesn't matter.

Not exactly true: In fact, domainfinalize performs some init work
for all domains and interfaces that have been compiled into the kernel.
After domainfinalize has been called, this domain initialization (the
initialization of pointers in the interface structure of all existing
network devices) has to be performed by a call to the same code that
domainfinalize calls in a loop for compiled in network interfaces.
But there are checks in that code, which need to be checked and fixed.
I had made annotations to the affected files, half a year ago, but do
not have access to my development system right now and thus I cannot
provide details, now.

 you'll get the same message if you add netgraph.ko as an object.

The netgraph domain is probably the only one that is often loaded from
a KLD in generic systems. If another driver adds a domain and netgraph
is also required, then I'd strongly suggest to compile netgraph into
the kernel (and thus to have only one domain added from a KLD).

Best regards, STefan


Re: A question about WARNING: attempt to domain_add(xyz) after domainfinalize()

2011-01-12 Thread Julian Elischer

On 1/12/11 5:26 AM, Svatopluk Kraus wrote:

Hi,

I'd like to add a new network domain into kernel (and never remove it)
from loadable module. In fact, I did it, but I got following warning
from domain_add(): WARNING: attempt to domain_add(xyz) after
domainfinalize(). Now, I try to figure out what is behind the
warning, which seems to become KASSERT (now, in notyet section part of
code, which is 6 years old).



just ignore that message, everyone else does :-)

The problem is that the idea of domainfinalize() is incompatible with
having the ability to add domains from modules.
Luckily domainfinalize() doesn't actually do anything that stops your
new domain from working, so it doesn't matter.

you'll get the same message if you add netgraph.ko as an object.



I found a few iterations over the domains list and each domain's protosw table,
which are not protected by any lock. OK, it is a problem, but when I only
add a domain (it's added at the head of domains list) and never remove
it then that could be safe. Moreover, it seems that without any
limits, it is possible to add a new protocol into domain on reserved
place labeled as PROTO_SPACER by pf_proto_register() function. Well,
it's not a list so it's a different case (but a copy into spacer isn't
atomic operation).

I found two global variables (max_hdr,max_datalen) which are evaluated
in each domain_init() from other variables (max_linkhdr,max_protohdr)
and a global variable (max_keylen) which is evaluated from all known
domains (dom_maxrtkey entry). The variables are used in other parts of
kernel. Further, I know about 'dom_ifattach' and 'dom_ifdetach'
pointers to functions defined on each domain, which are responsible
for 'if_afdata' entry on ifnet structure.

Is there something more I didn't find in current kernel?
Will there be something more in future kernels that would justify the
KASSERT in domain_add()?

My network domain doesn't influence any mentioned global variables,
doesn't define dom_ifattach() and dom_ifdetach() functions, and should
be only added from loadable module and never removed. So, I think it's
safe. But I'm a little bit nervous because of planned KASSERT in
domain_add().

Well, I can implement an empty domain with some spacers for protocols,
link it into kernel (thus shut down the warning), and make loadable
module in which I only register protocols I want on already added
domain, but why should I do it in that (for me good-for-nothing) way?

  Thanks for any response, Svata


Re: Scheduler Question

2010-10-12 Thread Garrett Cooper
On Sat, Oct 9, 2010 at 4:46 PM, Eknath Venkataramani
eknath.i...@gmail.com wrote:
 D&I of the FreeBSD Operating System says it's gonna refer to the BSD default
 scheduler, the 'time share scheduler'. Does this mean sched_4BSD.c (in the
 introduction section of Chapter 4) handles only time-share processes?
 If so, then how (or where) are the kernel processes/real-time processes
 scheduled?

The Design and Implementation of the FreeBSD Operating System is
unfortunately extremely out of date (my edition which I think is the
latest one refers to FreeBSD 5.2). The FreeBSD scheduler was switched
over to sched_ule.c as the default scheduler in 7.1.
So I'd invest more time in determining how SCHED_ULE works rather
than SCHED_4BSD going forward (even though learning about SCHED_4BSD
is a good lesson in history of design of FreeBSD).
FWIW the algorithm of prioritization, quantization of time slices,
etc for SCHED_4BSD are discussed more in depth in the chapter.
Cheers,
-Garrett


Re: pageout question

2010-07-26 Thread RW
On Sun, 25 Jul 2010 23:43:08 +0300
Andriy Gapon a...@freebsd.org wrote:

 on 25/07/2010 23:28 RW said the following:

  I didn't say it was guaranteed. I just think the scenario
  where a first pass ends up between the watermarks is rare. And when
  it happens I don't see a compelling reason to do extra paging to
  reach an arbitrary target.
 
 Well, it seems neither I nor you have data to show whether it's rare
 or not (and it would greatly depend on workload too).
 As to arbitrary target - well, that's the whole point of
 hysteresis-like behavior.  We start paging also at an arbitrary
 point.


If after the first pass with light-paging the high watermark isn't
reached then the choices are

1) loop and immediately do a heavy-paging pass.

2) wait and let the daemon get woken-up for another light-paging pass -
only go to heavy-paging when this strategy isn't keeping up with demand.

To me (2) is doing the right thing. It's trying to satisfy  demand from
existing clean pages, and only paging heavily as a last resort. 


Re: pageout question

2010-07-26 Thread Andriy Gapon
on 25/07/2010 23:43 Andriy Gapon said the following:
 on 25/07/2010 23:28 RW said the following:
 I didn't say it was guaranteed. I just think the scenario where
 a first pass ends up between the watermarks is rare. And when it
 happens I don't see a compelling reason to do extra paging to reach an
 arbitrary target.
 
 Well, it seems neither I nor you have data to show whether it's rare or not 
 (and
 it would greatly depend on workload too).
 As to arbitrary target - well, that's the whole point of hysteresis-like
 behavior.  We start paging also at an arbitrary point.

Well, it seems that you are right (at least to a certain degree) - with
moderately high memory load (starting lots of memory hungry real
applications and not letting them sit idle) a single pass was always sufficient.
 Even with my suggested change! :-)  I.e. that single pass was always able to
shoot to or over the high watermark.
So, in fact, there is not much (any?) difference between current code and
patched code in this case.

But not quite so with stress2 swap test.
In that case more than one pass was needed in almost all the cases.  Again, this
is with patched vm_pageout().

Which brings another interesting point which was overlooked initially.
vm_pageout() loop can make at most two passes back-to-back, after that it slows
down to make an additional pass every 1/2 seconds:
	if (vm_pages_needed) {
		/*
		 * Still not done, take a second pass without waiting
		 * (unlimited dirty cleaning), otherwise sleep a bit
		 * and try again.
		 */
		++pass;
		if (pass > 1)
			msleep(&vm_pages_needed,
			    &vm_page_queue_free_mtx, PVM, "psleep",
			    hz / 2);
	} else {

With the patched code and stress2 I indeed observed pagedaemon spending time in
this sleep.
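
The pass accounting in the loop quoted above can be modelled in userland roughly as follows. This is only a sketch under assumptions: next_pass() and pass_delay_ms() are made-up names, not kernel functions, and the 500 ms figure simply stands in for hz / 2.

```c
#include <stdbool.h>

/*
 * Userland model of the pass counter in the vm_pageout() loop.
 * Hypothetical helper, not kernel code.
 * Returns the new pass count: reset to 0 once the shortage is gone,
 * otherwise incremented.
 */
int
next_pass(int pass, bool pages_still_needed)
{
	return (pages_still_needed ? pass + 1 : 0);
}

/*
 * Delay (in milliseconds) before the pass with the given number starts.
 * Pass 1, i.e. the second scan, runs immediately; later passes wait
 * for about half a second, standing in for hz / 2 ticks.
 */
int
pass_delay_ms(int pass)
{
	return (pass > 1 ? 500 : 0);
}
```

The point of the model is that the delay only kicks in while the counter keeps growing. If the counter is reset after every scan, pass_delay_ms() never sees a value above 1 and no throttling happens.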

On the other hand, current unpatched code is more optimistic about calling it
done.  So even if only a handful of pages is freed and available memory goes
just above low watermark, pagedaemon would decide that it had a successful pass
and would reset pass count to zero.  Those freed pages would, of course, get
consumed immediately and a new pass would be requested.  Since the history is
lost at this point, there would be no rate limit for the new pass.

So my _theory_ is that in very harsh conditions doing true hysteresis would
result in many _accounted_ passes and thus throttled down pagedaemon.  On the
other hand, the current code would still do many passes because of the constant
memory pressure, but they will be (mostly) unaccounted and thus pagedaemon would
be scanning pages 'like crazy'.

In other words: with current code available page count would rapidly oscillate
around low watermark, while with patched code available page count would mostly
stay low.

Not sure which one is better.  But for me, in such extreme conditions,  slowing
things down sounds better than spinning pagedaemon.
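
For reference, "true hysteresis" as used in this thread can be sketched as a two-watermark state machine. This is a simplified userland illustration, not the vm_pageout() code: struct hyst and hyst_update() are hypothetical names, and the real daemon works with several thresholds rather than two plain integers.

```c
#include <stdbool.h>

/* Hypothetical two-watermark controller; all names are illustrative. */
struct hyst {
	int low;	/* start paging when free pages drop below this */
	int high;	/* stop paging once free pages reach this */
	bool paging;	/* current state */
};

/* Update the state for the current free page count and return it. */
bool
hyst_update(struct hyst *h, int free_pages)
{
	if (!h->paging && free_pages < h->low)
		h->paging = true;
	else if (h->paging && free_pages >= h->high)
		h->paging = false;
	return (h->paging);
}
```

Once paging has started, a free page count between the two watermarks keeps it going. Stopping as soon as the count climbs back above the low watermark, as the unpatched code is described as doing, corresponds to comparing against h->low in both branches.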

P.S.
Just in case, I would like to point out that the patch doesn't change condition
when the waiters are notified about available memory - it is still
!vm_page_count_min().  The patch only changes when vm_pages_needed is reset.
This is kind of obvious, but I decided to make it explicit.

-- 
Andriy Gapon


Re: pageout question

2010-07-26 Thread Andriy Gapon
on 26/07/2010 20:53 RW said the following:
 If after the first pass with light-paging the high watermark isn't
 reached then the choices are
 
 1) loop and immediately do a heavy-paging pass.
 
 2) wait and let the daemon get woken-up for another light-paging pass -
 only go to heavy-paging when this strategy isn't keeping up with demand.
 
 To me (2) is doing the right thing. It's trying to satisfy  demand from
 existing clean pages, and only paging heavily as a last resort. 

Well, based on my observations, if the first pass doesn't reach the high
watermark, then we are in a high pressure situation and so we would have to do
some heavy-lifting anyways.  In my opinion, it's better to start doing more work
 at once than trying to pretend that situation would somehow resolve itself.

-- 
Andriy Gapon


Re: pageout question

2010-07-25 Thread Andriy Gapon
on 25/07/2010 02:31 RW said the following:
 On Sat, 24 Jul 2010 23:23:07 +0300
 Andriy Gapon a...@freebsd.org wrote:
 
 There is a good deal of comments in the vm_pageout.c code that imply
 that we use a hysteresis approach to deal with low available pages
 condition.


 In general, the hysteresis, the comments and the code make sense.
 My doubt, though, is about the block of code that is right below the
 comment quoted above:
 	if (vm_pages_needed && !vm_page_count_min()) {
 		if (!vm_paging_needed())
 			vm_pages_needed = 0;
 		wakeup(&cnt.v_free_count);
 	}
 
 As I understand it the hysteresis is done inside vm_pageout_scan, and
 the expectation is that one pass will typically satisfy this because the
 design aims to keep enough clean pages in the inactive queue.  

I have seen these lines in vm_pageout_scan:
/*
 * Calculate the number of pages we want to either free or move
 * to the cache.
 */
page_shortage = vm_paging_target() + addl_page_shortage_init;
...
/*
 * Compute the number of pages we want to try to move from the
 * active queue to the inactive queue.
 */
page_shortage = vm_paging_target() +
cnt.v_inactive_target - cnt.v_inactive_count;
page_shortage += addl_page_shortage;

But I am not sure about clean pages in the inactive queue part.
From what I can see in the code,  pagedaemon only tries to maintain a certain
number of pages on inactive queue - I am speaking about  
vm_pageout_page_stats().
But I do not see any code ensuring level of _clean_ inactive pages.
And, if I am not mistaken, there is no guarantee even that those pages will not
be re-activated when pagedaemon actually scans them.

 I'm not sure if  the vm_paging_needed() call is correct or not, but it
 may be that the intent is to avoid immediately going back to a
 depleted inactive queue when cache+free is within normal bounds,
 because it could result in avoidable paging to swap. 

Well, OTOH, if the current pass results in many pages being re-activated and
many pages still left on the inactive queue because they are dirty (see
maxlaunder in vm_pageout_scan), then it is premature to quit paging when we only
reached bare minimum of available pages (see pass and maxlaunder again).  IMHO,
of course.


As a side discussion, I wonder if current setting of v_inactive_target is
adequate.  It feels that it should be bigger.

-- 
Andriy Gapon


Re: pageout question

2010-07-25 Thread RW
On Sun, 25 Jul 2010 13:07:21 +0300
Andriy Gapon a...@freebsd.org wrote:

 on 25/07/2010 02:31 RW said the following:

  As I understand it the hysteresis is done inside vm_pageout_scan,
  and the expectation is that one pass will typically satisfy this
  because the design aims to keep enough clean pages in the inactive
  queue.  
 

 But I am not sure about clean pages in the inactive queue ... But I
 do not see any code ensuring level of _clean_ inactive pages. 

In FreeBSD the inactive queue contains disk cache pages which normally
provide most of the clean pages needed. In addition pages are dribbled
out to swap, and the resulting clean pages are placed at the back of
the inactive queue to make another pass. 

 
  I'm not sure if  the vm_paging_needed() call is correct or not, but
  it may be that the intent is to avoid immediately going back
  to a depleted inactive queue when cache+free is within normal
  bounds, because it could result in avoidable paging to swap. 
 
 Well, OTOH, if the current pass results in many pages being
 re-activated and many pages still left on the inactive queue because
 they are dirty (see maxlaunder in vm_pageout_scan), 

Dirty pages make three passes through the inactive queue: twice dirty,
once clean. They are paged out at the end of the second pass, so it's
unlikely that they get reactivated except under very heavy thrashing.

 then it is
 premature to quit paging when we only reached bare minimum of
 available pages (see pass and maxlaunder again).  IMHO, of course.

It's not the bare minimum, that's another level that vm_page_count_min()
tests for.


Re: pageout question

2010-07-25 Thread Andriy Gapon
on 25/07/2010 16:41 RW said the following:
 On Sun, 25 Jul 2010 13:07:21 +0300
 Andriy Gapon a...@freebsd.org wrote:
 
 on 25/07/2010 02:31 RW said the following:
 
 As I understand it the hysteresis is done inside vm_pageout_scan,
 and the expectation is that one pass will typically satisfy this
 because the design aims to keep enough clean pages in the inactive
 queue.  
 
 But I am not sure about clean pages in the inactive queue ... But I
 do not see any code ensuring level of _clean_ inactive pages. 
 
 In FreeBSD the inactive queue contains disk cache pages which normally
 provide most of the clean pages needed. In addition pages are dribbled
 out to swap, and the resulting clean pages are placed at the back of
 the inactive queue to make another pass. 

Well, normally and most are not quite quantitative.
Personally, I do not see any guarantees that inactive queue would contain enough
clean pages to reach paging target on a single pass.

 I'm not sure if  the vm_paging_needed() call is correct or not, but
 it may be that the intent is to avoid immediately going back
 to a depleted inactive queue when cache+free is within normal
 bounds, because it could result in avoidable paging to swap. 
 Well, OTOH, if the current pass results in many pages being
 re-activated and many pages still left on the inactive queue because
 they are dirty (see maxlaunder in vm_pageout_scan), 
 
 Dirty pages make three passes through the inactive queue: twice dirty,
 once clean. They are paged out at the end of the second pass, so it's
 unlikely that they get reactivated except under very heavy thrashing.

I didn't mean to say that dirty pages would get re-activated.
Clean pages can perfectly well be re-activated if they were referenced since
their de-activation time.

 then it is
 premature to quit paging when we only reached bare minimum of
 available pages (see pass and maxlaunder again).  IMHO, of course.
 
 It's not the bare minimum, that's another level that vm_page_count_min()
 tests for.

I meant bare minimum to stop paging, that is, going above lower watermark of the
paging hysteresis.

-- 
Andriy Gapon


Re: pageout question

2010-07-25 Thread RW
On Sun, 25 Jul 2010 17:19:41 +0300
Andriy Gapon a...@freebsd.org wrote:

 on 25/07/2010 16:41 RW said the following:

  In FreeBSD the inactive queue contains disk cache pages which
  normally provide most of the clean pages needed. In addition pages
  are dribbled out to swap, and the resulting clean pages are placed
  at the back of the inactive queue to make another pass. 
 
 Well, normally and most are not quite quantitative.
 Personally, I do not see any guarantees that inactive queue would
 contain enough clean pages to reach paging target on a single pass.

I didn't say it was guaranteed. I just think the scenario where
a first pass ends up between the watermarks is rare. And when it
happens I don't see a compelling reason to do extra paging to reach an
arbitrary target.

I think the comment about not clearing vm_pages_needed is referring to
clearing it below the low-watermark because the daemon would then get
woken-up almost immediately.

 I meant bare minimum to stop paging, that is, going above lower
 watermark of the paging hysteresis.
 


Re: pageout question

2010-07-25 Thread Andriy Gapon
on 25/07/2010 23:28 RW said the following:
 On Sun, 25 Jul 2010 17:19:41 +0300
 Andriy Gapon a...@freebsd.org wrote:
 
 on 25/07/2010 16:41 RW said the following:
 
 In FreeBSD the inactive queue contains disk cache pages which
 normally provide most of the clean pages needed. In addition pages
 are dribbled out to swap, and the resulting clean pages are placed
 at the back of the inactive queue to make another pass. 
 Well, normally and most are not quite quantitative.
 Personally, I do not see any guarantees that inactive queue would
 contain enough clean pages to reach paging target on a single pass.
 
 I didn't say it was guaranteed. I just think the scenario where
 a first pass ends up between the watermarks is rare. And when it
 happens I don't see a compelling reason to do extra paging to reach an
 arbitrary target.

Well, it seems neither I nor you have data to show whether it's rare or not (and
it would greatly depend on workload too).
As to arbitrary target - well, that's the whole point of hysteresis-like
behavior.  We start paging also at an arbitrary point.

-- 
Andriy Gapon


Re: pageout question

2010-07-24 Thread RW
On Sat, 24 Jul 2010 23:23:07 +0300
Andriy Gapon a...@freebsd.org wrote:

 
 There is a good deal of comments in the vm_pageout.c code that imply
 that we use a hysteresis approach to deal with low available pages
 condition.
 
 
 In general, the hysteresis, the comments and the code make sense.
 My doubt, though, is about the block of code that is right below the
 comment quoted above:
 	if (vm_pages_needed && !vm_page_count_min()) {
 		if (!vm_paging_needed())
 			vm_pages_needed = 0;
 		wakeup(&cnt.v_free_count);
 	}

As I understand it the hysteresis is done inside vm_pageout_scan, and
the expectation is that one pass will typically satisfy this because the
design aims to keep enough clean pages in the inactive queue.  

I'm not sure if  the vm_paging_needed() call is correct or not, but it
may be that the intent is to avoid immediately going back to a
depleted inactive queue when cache+free is within normal bounds,
because it could result in avoidable paging to swap. 


Re: sysctl question

2010-07-13 Thread Ed Schouten
* Andreas Tobler andreast-l...@fgznet.ch wrote:
 But now I wonder how I can teach the sysctl to print my temperature
 the same way as my userland app does.

I seem to remember all the other temperature sensors expose their value
using tenth Kelvin precision. There is some kind of modifier you can add
to the SYSCTL_INT declaration to denote that it's a temperature value.
The sysctl(8) code on HEAD seems to suggest the type is IK.

Greetings,
-- 
 Ed Schouten e...@80386.nl
 WWW: http://80386.nl/




Re: sysctl question

2010-07-13 Thread Andreas Tobler

On 13.07.10 10:48, Ed Schouten wrote:

* Andreas Tobler andreast-l...@fgznet.ch wrote:

But now I wonder how I can teach the sysctl to print my temperature
the same way as my userland app does.


I seem to remember all the other temperature sensors expose their value
using tenth Kelvin precision. There is some kind of modifier you can add
to the SYSCTL_INT declaration to denote that it's a temperature value.
The sysctl(8) code on HEAD seems to suggest the type is IK.


Thanks Ed!

I need to figure out how to format my temperature value to fit with this 
modifier.
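
A minimal sketch of the conversion, assuming the driver holds its reading in tenths of a degree Celsius and that sysctl expects tenths of a Kelvin for the IK type. The 2732 offset (273.15 K rounded to one decimal) and both function names are assumptions made for illustration, not something taken from the sysctl sources.

```c
/*
 * Convert between tenths of a degree Celsius and tenths of a Kelvin.
 * 0 C is 273.15 K, rounded here to 2732 tenths of a Kelvin
 * (an assumption; check what the consumer of the value uses).
 */
#define DECIKELVIN_ZERO_C	2732

/* Tenths of a degree Celsius to tenths of a Kelvin. */
int
decicelsius_to_decikelvin(int dc)
{
	return (dc + DECIKELVIN_ZERO_C);
}

/* Tenths of a Kelvin back to tenths of a degree Celsius. */
int
decikelvin_to_decicelsius(int dk)
{
	return (dk - DECIKELVIN_ZERO_C);
}
```

With these helpers a reading of 45.5 C, stored as 455, would be exported as 3187.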


Regards,
Andreas


Re: c question

2010-04-23 Thread Eitan Adler
 - use a matrix is faster than use a linked list?

For what?
For insertion and deletion no - linked list is faster. For sequential
access they are the same speed (forgetting look-ahead caching). For
random access matrix is faster.


Re: c question

2010-04-23 Thread Joerg Sonnenberger
On Fri, Apr 23, 2010 at 06:18:46PM +0300, Eitan Adler wrote:
  - use a matrix is faster than use a linked list?
 
 For what?
 For insertion and deletion no - linked list is faster. For sequential
 access they are the same speed (forgetting look-ahead caching). For
 random access matrix is faster.

Actually -- it depends. Removing the tail and inserting at tail is
amortised constant time for arrays if done using the double-on-full
trick. In that case, array can be the faster datastructure too.
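
A minimal sketch of the double-on-full trick, for illustration only. struct vec, vec_push() and vec_pop() are hypothetical names, and the initial capacity of 4 is an arbitrary choice.

```c
#include <stdlib.h>

/* Hypothetical growable int array; names are illustrative. */
struct vec {
	int *data;
	size_t len;	/* elements in use */
	size_t cap;	/* elements allocated */
};

/*
 * Append x, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if realloc() fails (array left unchanged).
 */
int
vec_push(struct vec *v, int x)
{
	if (v->len == v->cap) {
		size_t ncap = v->cap ? v->cap * 2 : 4;
		int *p = realloc(v->data, ncap * sizeof(*p));

		if (p == NULL)
			return (-1);
		v->data = p;
		v->cap = ncap;
	}
	v->data[v->len++] = x;
	return (0);
}

/* Remove and return the last element; the caller ensures len > 0. */
int
vec_pop(struct vec *v)
{
	return (v->data[--v->len]);
}
```

Each doubling copies at most as many elements as were pushed since the previous doubling, which is where the amortised constant cost per push comes from. A zero-initialised struct vec is a valid empty array, and the caller frees v->data when done.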

Joerg


Re: c question

2010-04-23 Thread Pieter de Goeje
On Friday 23 April 2010 17:40:12 Joerg Sonnenberger wrote:
 On Fri, Apr 23, 2010 at 06:18:46PM +0300, Eitan Adler wrote:
   - use a matrix is faster than use a linked list?
 
  For what?
  For insertion and deletion no - linked list is faster. For sequential
  access they are the same speed (forgetting look-ahead caching). For
  random access matrix is faster.

 Actually -- it depends. Removing the tail and inserting at tail is
 amortised constant time for arrays if done using the double-on-full
 trick. In that case, array can be the faster datastructure too.

Random deletes can be made O(1) if you don't care about the order of the 
elements in an array.
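
A small sketch of such an unordered delete, assuming a plain int array whose length is tracked separately. swap_delete() is a made-up name used only for this example.

```c
#include <stddef.h>

/*
 * Delete a[i] from an array holding *len ints in O(1) by overwriting
 * it with the last element and shrinking the length by one.
 * Element order is not preserved. The caller ensures i < *len.
 */
void
swap_delete(int *a, size_t *len, size_t i)
{
	a[i] = a[*len - 1];
	(*len)--;
}
```

Deleting index 1 from {10, 20, 30, 40} leaves {10, 40, 30}, so this only fits when nothing depends on the original order or on stable indices.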

- Pieter


Re: c question

2010-04-09 Thread Alexander Churanov
2010/4/9 Leinier Cruz Salfran salfrancl.lis...@gmail.com

 - use a matrix is faster than use a linked list?

 example:

 char *szColumnName[10];
 unsigned short iColumnAge[10];


 struct _llList {
  struct _llList *prev, *next;
  char szName[64];
  unsigned short iAge;
  };


Leinier ,

This depends on what kind of operations are performed. For sequential
traversing, both are very appropriate. However, you can not perform a binary
search on a list. You also can not combine two arrays into a single one with
constant complexity.

Lists also have greater memory overhead for small structures.

My advice: always use arrays.
Use lists if:

1) Copying items when the dynamic array grows is inappropriate.
2) List-specific operations like O(1) splicing or O(1) insertions and
deletions are required.

Alexander Churanov


Re: c question

2010-04-09 Thread Leinier Cruz Salfran
On Fri, Apr 9, 2010 at 10:52 AM, Alexander Churanov
alexanderchura...@gmail.com wrote:
 2010/4/9 Leinier Cruz Salfran salfrancl.lis...@gmail.com

 - use a matrix is faster than use a linked list?

 example:

 char *szColumnName[10];
 unsigned short iColumnAge[10];


 struct _llList {
  struct _llList *prev, *next;
  char szName[64];
  unsigned short iAge;
  };


 Leinier ,
 This depends on what kind of operations are performed. For sequential
 traversing, both are very appropriate. However, you can not perform a binary
 search on a list. You also can not combine two arrays into a single one with
 constant complexity.
 Lists also have greater memory overhead for small structures.
 My advice: always use arrays.
 Use lists if:
 1) Copying items when the dynamic array grows is inappropriate.
 2) List-specific operations like O(1) splicing or O(1) insertions and
 deletions are required.
 Alexander Churanov


hello alexander

i supposed that a matrix is much faster .. i coded my program to use
matrix in that portion but i sent the question to see what others
think about this

thanks


Re: c question

2010-04-09 Thread KAYVEN RIESE

On Fri, 9 Apr 2010, Leinier Cruz Salfran wrote:


hello all

i want to know your opinions about this:

- use a matrix is faster than use a linked list?


yes.




example:

char *szColumnName[10];
unsigned short iColumnAge[10];


struct _llList {
 struct _llList *prev, *next;
 char szName[64];
 unsigned short iAge;
};



*--*
  Kayven Riese, BSCS, MS (Physiology and Biophysics)
  (415) 902 5513 cellular
  http://kayve.net
  Webmaster http://ChessYoga.org
*--*


Re: Newbie question: kernel image a dynamically linked binary?

2010-04-01 Thread Gary Jennejohn
On Thu, 1 Apr 2010 15:53:50 +0530
Daniel Rodrick daniel.rodr...@gmail.com wrote:

 Hello List,
 
 I'm a newbie and coming from Linux background, and am trying to learn
 FreeBSD now. The first thing I find a little confusing is that the
 final FreeBSD kernel image is shown as a DYNAMICALLY LINKED binary:
 
 $
 $ pwd
 /boot/kernel
 $
 $ file kernel
 kernel: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD),
 dynamically linked (uses shared libs), not stripped
 $
 
 How can the kernel image use shared libraries? And which ones does it
 use, if any?
 
 Also, I cannot find out the libraries the image uses using the
 traditional ldd command:
 
 $ ldd kernel
 kernel:
 kernel: signal 6
 $
 
 Can some please throw some light?
 

file is confused.  FreeBSD uses a monolithic kernel and no shared
libraries are involved.  However, it is possible to dynamically load
modules using kldload.  See the appropriate man page.

--
Gary Jennejohn


Re: Newbie question: kernel image a dynamically linked binary?

2010-04-01 Thread Oliver Fromme
Hi,

Please don't crosspost to many lists.  This topic is probably
suitable for hackers@ but not for the other lists.

Daniel Rodrick daniel.rodr...@gmail.com wrote:
  I'm a newbie and coming from Linux background, and am trying to learn
  FreeBSD now. The first thing I find a little confusing is that the
  final FreeBSD kernel image is shown as a DYNAMICALLY LINKED binary:
  
  $
  $ pwd
  /boot/kernel
  $
  $ file kernel
  kernel: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD),
  dynamically linked (uses shared libs), not stripped
  $
  
  How can the kernel image use shared libraries? And which ones does it
  use, if any?
  
  Also, I cannot find out the libraries the image uses using the
  traditional ldd command:
  
  $ ldd kernel
  kernel:
  kernel: signal 6
  $

ldd works by actually executing the binary with a special
flag for rtld(1).  Compare:

$ ldd /bin/sh
/bin/sh:
        libedit.so.7 => /lib/libedit.so.7 (0x280a8000)
        libncurses.so.8 => /lib/libncurses.so.8 (0x280bd000)
        libc.so.7 => /lib/libc.so.7 (0x280fc000)

$ LD_TRACE_LOADED_OBJECTS=1 /bin/sh
        libedit.so.7 => /lib/libedit.so.7 (0x280a8000)
        libncurses.so.8 => /lib/libncurses.so.8 (0x280bd000)
        libc.so.7 => /lib/libc.so.7 (0x280fc000)

Of course you cannot execute the kernel (only the boot loader
knows how to load and boot the kernel), so ldd fails on the
kernel.

But you can use objdump(1) to list dynamic dependencies.

$ objdump -p /bin/sh | grep NEEDED
  NEEDED  libedit.so.7
  NEEDED  libncurses.so.8
  NEEDED  libc.so.7

$ objdump -p /boot/kernel/kernel | grep NEEDED
  NEEDED  hack.So

As far as I know, the kernel and all kernel modules need to
be dynamic binaries so the kernel linker works, which is
required for dynamically loading kernel modules.

So what is that hack.So object?  It's just a dummy that's
required for technical reasons.  You can see the details in
/sys/conf/kern.post.mk which contains this paragraph:

# This is a hack.  BFD optimizes away dynamic mode if there are no
# dynamic references.  We could probably do a '-Bforcedynamic' mode like
# in the a.out ld.  For now, this works.
HACK_EXTRA_FLAGS?= -shared
hack.So: Makefile
	:> hack.c
	${CC} ${HACK_EXTRA_FLAGS} -nostdlib hack.c -o hack.So
	rm -f hack.c

  Can some please throw some light?

I hope I did.  :-)

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

Unix gives you just enough rope to hang yourself --
and then a couple of more feet, just to be sure.
-- Eric Allman


Re: Newbie question: kernel image a dynamically linked binary?

2010-04-01 Thread John Baldwin
On Thursday 01 April 2010 6:23:50 am Daniel Rodrick wrote:
 Hello List,
 
 I'm a newbie and coming from Linux background, and am trying to learn
 FreeBSD now. The first thing I find a little confusing is that the
 final FreeBSD kernel image is shown as a DYNAMICALLY LINKED binary:
 
 $
 $ pwd
 /boot/kernel
 $
 $ file kernel
 kernel: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD),
 dynamically linked (uses shared libs), not stripped
 $
 
 How can the kernel image use shared libraries? And which ones does it
 use, if any?
 
 Also, I cannot find out the libraries the image uses using the
 traditional ldd command:
 
 $ ldd kernel
 kernel:
 kernel: signal 6
 $
 
 Can some please throw some light?

It's a hack that is used so that the kernel linker is able to link in kernel 
modules that are built as shared objects.  The kernel is mostly built from 
static objects, but a single dynamic object (that is empty) is linked in:

# This is a hack.  BFD optimizes away dynamic mode if there are no
# dynamic references.  We could probably do a '-Bforcedynamic' mode like
# in the a.out ld.  For now, this works.
HACK_EXTRA_FLAGS?= -shared
hack.So: Makefile
	:> hack.c
	${CC} ${HACK_EXTRA_FLAGS} -nostdlib hack.c -o hack.So
	rm -f hack.c

-- 
John Baldwin


Re: Newbie question: kernel image a dynamically linked binary?

2010-04-01 Thread Dag-Erling Smørgrav
Dag-Erling Smørgrav d...@des.no writes:
 File is right.  The kernel contains relocation entries so kernel modules
 can be linked against it.

relocation entries is possibly not the right term, someone with better
knowledge of ELF will have to correct me.

DES
-- 
Dag-Erling Smørgrav - d...@des.no


Re: Newbie question: kernel image a dynamically linked binary?

2010-04-01 Thread Dag-Erling Smørgrav
Gary Jennejohn gary.jennej...@freenet.de writes:
 Daniel Rodrick daniel.rodr...@gmail.com writes:
  $ file kernel
  kernel: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD),
  dynamically linked (uses shared libs), not stripped
 file is confused.  FreeBSD uses a monolithic kernel and no shared
 libraries are involved.  However, it is possible to dynamically load
 modules using kldload.  See the appropriate man page.

File is right.  The kernel contains relocation entries so kernel modules
can be linked against it.

monolithic means something else entirely.

DES
-- 
Dag-Erling Smørgrav - d...@des.no


RE: ptrace question

2009-07-27 Thread Diskin, Gal
Hi Kostik,
I'm tracing a native FreeBSD process. I tried looking at the Linux code to find 
a hint how to port my existing Linux code to FreeBSD. 

This is exactly what I was looking for - Thank you!

Thanks,
Gal




Re: bsd.lib.mk question

2009-07-27 Thread Xin LI
Hi,

Gábor Kövesdán wrote:
 Hi,
 
 I wonder if there's a conventional way of building _only_ shared
 libraries using bsd.lib.mk. At default, it builds static, shared and
 profiled libraries, which is a waste of time because I only need shared
 libraries, which I use as on-demand loadable modules. Adjusting _LIBS
 after the inclusion of bsd.lib.mk doesn't help and there are no knobs to
 control the behaviour. What should I do?

If you define LIB= (or do not define it at all) and define both SHLIB and
SHLIB_MAJOR, then only the shared library is built and installed.

Example:

LIB=
SHLIB=  test
SHLIB_MAJOR=0

This would build libtest.so.0, but neither libtest.a nor libtest_p.a.

Cheers,
-- 
Xin LI delp...@delphij.net http://www.delphij.net/
FreeBSD - The Power to Serve!


Re: bsd.lib.mk question

2009-07-26 Thread Ed Schouten
Hi Gabor,

* Gábor Kövesdán ga...@kovesdan.org wrote:
 I wonder if there's a conventional way of building _only_ shared
 libraries using bsd.lib.mk. At default, it builds static, shared and
 profiled libraries, which is a waste of time because I only need
 shared libraries, which I use as on-demand loadable modules.
 Adjusting _LIBS after the inclusion of bsd.lib.mk doesn't help and
 there are no knobs to control the behaviour. What should I do?

Be sure to look at the Makefiles used by the PAM modules
(lib/libpam/modules). I guess NO_PROFILE and NO_INSTALLLIB should be
sufficient.

-- 
 Ed Schouten e...@80386.nl
 WWW: http://80386.nl/



Re: ptrace question

2009-07-26 Thread Kostik Belousov
On Sun, Jul 26, 2009 at 06:11:25PM +0300, Diskin, Gal wrote:
 Hi,
 I'm using ptrace to execute one application under the control
 of another (surprisingly :P). I'm trying to find the number
 of the last system call executed in the traced process from
 the tracing process. In Linux this is done using orig_eax
 (or orig_rax) but as far as I can tell it does not have a
 counterpart in FreeBSD (correct me if I'm wrong). I've looked
 at the kernel sources in hope of finding out how the conversion
 was done in the Linux emulation layer. The file linux_ptrace.c
 (http://fxr.watson.org/fxr/source/i386/linux/linux_ptrace.c?v=FREEBSD72#L118)
 seems to be the place the conversion is taking place. However,
 in spite of the comment at the top of the conversion function mentioning
 that the translation is not straightforward, the translation done is
 simply copying eax to orig_eax.

 My question is: Is there a way to find the number of the last system
 call executed in the traced application from the tracing application
 (using ptrace)?

Are you trying to trace a Linux process or a native FreeBSD one?
And is the tracer a Linux process or a FreeBSD one?
It seems that you are talking about a Linux process; note that the Linux
PTRACE_SYSCALL is not implemented in the linuxolator.

For native FreeBSD tracers, you can use PT_TO_SCE, which stops the process
at syscall entry, and PT_TO_SCX, which stops it at syscall exit.
The truss source code is most likely the best illustration of the usage.
These requests allow tracing both FreeBSD and Linux processes.

After the process is stopped, you should get the registers of the traced
process. Upon syscall entry, %eax contains the syscall number.
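
Editor's sketch of the approach described above, not code from the thread: a tracer that resumes the child with PT_TO_SCE, waits for the stop, and reads the syscall number with PT_GETREGS. The function name trace_syscall_entries() is hypothetical. The tracing path is only compiled on FreeBSD i386/amd64 (elsewhere the function just returns -1), and it treats every SIGTRAP stop as a syscall entry, which is a simplification. On amd64 the register is %rax rather than %eax. Indirect system calls made through syscall(2) may need extra decoding, as truss does.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

#if defined(__FreeBSD__) && (defined(__i386__) || defined(__amd64__))
#include <sys/ptrace.h>
#include <machine/reg.h>
#define HAVE_SCE_TRACING 1
#endif

/*
 * Run argv[0] under ptrace and print the syscall number at every
 * syscall entry.  Returns the number of entries seen, or -1 on error
 * or on a platform this sketch does not cover.
 */
long
trace_syscall_entries(char *const argv[])
{
#ifdef HAVE_SCE_TRACING
    struct reg regs;
    long seen = 0;
    int status, sig = 0;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced, then run the target program. */
        ptrace(PT_TRACE_ME, 0, NULL, 0);
        execv(argv[0], argv);
        _exit(127);
    }
    /* The child stops with SIGTRAP once the exec has completed. */
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    for (;;) {
        /* addr 1: resume where it stopped; data: signal to deliver. */
        if (ptrace(PT_TO_SCE, pid, (caddr_t)1, sig) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            break;                      /* child exited or was killed */
        if (WSTOPSIG(status) != SIGTRAP) {
            sig = WSTOPSIG(status);     /* pass other signals through */
            continue;
        }
        sig = 0;
        if (ptrace(PT_GETREGS, pid, (caddr_t)&regs, 0) == -1)
            return -1;
#if defined(__i386__)
        fprintf(stderr, "syscall %ld\n", (long)regs.r_eax);
#else
        fprintf(stderr, "syscall %ld\n", (long)regs.r_rax);
#endif
        seen++;
    }
    return seen;
#else
    (void)argv;
    return -1;
#endif
}
```

Swapping PT_TO_SCE for PT_TO_SCX in the loop would stop at syscall exit instead, where the same register holds the return value rather than the syscall number.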




Re: c question: *printf'ing arrays

2009-07-08 Thread Alexander Best
thx for all the great help guys.

cheers,
alex

Carlos A. M. dos Santos wrote on 2009-07-02:
 2009/7/2 Dag-Erling Smørgrav d...@des.no:
  Alexander Best alexbes...@math.uni-muenster.de writes:
      for (i=0; i < sizeof(hdr->nintendo_logo); i++)
          fprintf(stderr, "%x", hdr->nintendo_logo[i]);

  What will this print if nintendo_logo is { 0x01, 0x02, 0x03, 0x04
  }?

 Good catch. It will print 0x1234 but it should print 0x01020304.
 My example has the same error. The conversion specification should be
 %02x, not just %x.
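
Editor's sketch of the difference between the two conversions: the hypothetical helper bytes_to_hex() below formats a byte array into a string, either with "%x" or with "%02x", so the two results can be compared side by side. The name and the buffer-based interface are assumptions made for illustration; the thread itself prints straight to stderr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format n bytes from buf as hex digits into out (always
 * NUL-terminated).  With pad != 0 every byte becomes exactly two
 * digits ("%02x"); with pad == 0 leading zeros are dropped ("%x").
 * Returns the number of characters written, not counting the NUL.
 */
size_t
bytes_to_hex(char *out, size_t outsz, const unsigned char *buf, size_t n,
    int pad)
{
    size_t used = 0;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int w = snprintf(out + used, outsz - used,
            pad ? "%02x" : "%x", (unsigned int)buf[i]);
        if (w < 0 || (size_t)w >= outsz - used) {
            out[used] = '\0';   /* no room: drop the partial byte */
            break;
        }
        used += (size_t)w;
    }
    return used;
}
```

For the bytes { 0x01, 0x02, 0x03, 0x04 }, the unpadded form gives the digits 1234 and the padded form gives 01020304, which is the ambiguity pointed out above.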



Re: c question: *printf'ing arrays

2009-07-04 Thread Giorgos Keramidas
On Tue, 30 Jun 2009 20:21:03 +0200 (CEST), Alexander Best 
alexbes...@math.uni-muenster.de wrote:
 thanks. now the output gets redirected using . i'm quite new to programming
 under unix. sorry for the inconvenience.

 so i guess there is no really easy way to output an inhomogeneous struct to
 stdout without using a loop to output each array contained in the struct.

No, not really.  You have to do the sizeof() dance.
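
Editor's sketch of the "sizeof() dance": each member of the struct is dumped by its own loop, with sizeof supplying the length so the code keeps working if an array size changes. The struct layout, the DUMP_FIELD macro and dump_header() are all hypothetical names chosen for illustration; only nintendo_logo appears in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical header layout, for illustration only. */
struct rom_header {
    uint8_t entry[2];
    uint8_t nintendo_logo[4];
    uint8_t version;
};

/* Print "name: <hex bytes>\n" for one member of the struct. */
static void
dump_bytes(FILE *fp, const char *name, const void *data, size_t len)
{
    const uint8_t *bytes = data;

    fprintf(fp, "%s: ", name);
    for (size_t i = 0; i < len; i++)
        fprintf(fp, "%02x", (unsigned int)bytes[i]);
    fputc('\n', fp);
}

/* sizeof on the member works for arrays and scalars alike. */
#define DUMP_FIELD(fp, hdr, field) \
    dump_bytes((fp), #field, &(hdr)->field, sizeof((hdr)->field))

void
dump_header(FILE *fp, const struct rom_header *hdr)
{
    assert(fp != NULL && hdr != NULL);
    DUMP_FIELD(fp, hdr, entry);
    DUMP_FIELD(fp, hdr, nintendo_logo);
    DUMP_FIELD(fp, hdr, version);
}
```

Calling dump_header(stderr, &hdr) prints one line per member. The macro keeps the member name and its size in one place, so adding a member to the dump is a single line.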


