Re: Corruption of UFS filesystems after using md(4)

2010-11-03 Thread Peter Holm
On Tue, Nov 02, 2010 at 07:33:50PM +, Bruce Cran wrote:
 On Tuesday 02 November 2010 19:12:14 Bruce Cran wrote:
  I've noticed in recent months that I appear to be getting silent corruption
  of my UFS filesystems - and I think it may be linked to using md(4) or
  creating sparse files.
 
 I've confirmed this is a UFS bug related to sparse files: running
 "truncate -s20G f1" followed by "rm f1" is enough to trigger the error and
 start generating .viminfo files that appear to be 20GB. When running fsck I
 get an "Invalid block count" error if I just reboot without removing the
 .viminfo file; if I do remove it, I get a "Partially allocated inode" error.
 

I'm able to verify this by:

m.sh 49L, 1917C written
$ ./m.sh
Local config: x4
+ mdconfig -a -t swap -s 1g -u 5
+ bsdlabel -w md5 auto
+ newfs -U md5a
+ mount /dev/md5a /mnt
+ truncate -s20G /mnt/f1
+ rm /mnt/f1
+ umount /mnt
+ fsck -t ufs -y /dev/md5a
** /dev/md5a
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
PARTIALLY ALLOCATED INODE I=4
UNEXPECTED SOFT UPDATE INCONSISTENCY

CLEAR? yes

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

2 files, 2 used, 506481 free (25 frags, 63307 blocks, 0.0% fragmentation)

* FILE SYSTEM IS CLEAN *

* FILE SYSTEM WAS MODIFIED *
+ mdconfig -d -u 5
$ 

- Peter
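
For readers unfamiliar with what "truncate -s20G" produces: it creates a
sparse file, whose reported size is far larger than the space actually
allocated for it. The following is a minimal userland sketch in Python and is
not part of the original report. The helper name make_sparse and the 64 MiB
size are illustrative assumptions, and the sketch does not reproduce the UFS
bug itself.

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file of `size` bytes without writing any data to it.

    Returns (apparent size, bytes actually allocated).
    """
    with open(path, "wb"):
        pass                      # create an empty file
    os.truncate(path, size)       # extend it; no data blocks are written
    st = os.stat(path)
    # On POSIX systems st_blocks counts allocated 512-byte units.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        size, allocated = make_sparse(os.path.join(d, "f1"), 64 * 1024 * 1024)
        print("apparent size:", size, "allocated bytes:", allocated)
```

On a filesystem that supports sparse files the allocated figure should be
much smaller than the apparent size.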
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: wpa_supplicant gets points for trying, I suppose....

2010-11-03 Thread Bernhard Schmidt
On Tuesday, November 02, 2010 22:55:18 David Wolfskill wrote:
 On Tue, Nov 02, 2010 at 06:30:10PM +0100, Bernhard Schmidt wrote:
  
  Thanks. I had quick look into that and I currently do not see an easy
  way to address that issue, as in tell wpa_supplicant about the device's
  state. This might change though once a newer wpa_supplicant has been
  imported.
 
 I'm not entirely surprised -- a quick look I took at sys/dev/iwn seemed
 to indicate to me that while iwn(4) could whine about the switch, it
 didn't have much in the way of ability to actually provide information
 about that status in some other way (e.g., a non-zero return from
 attempt to mess with the device).  But I don't claim extensive expertise
 in that area.

There is ieee80211_notify_radio(), granted iwn(4) misses the calls.. that 
function is supposed to notify upper layers about the radio state (0 = off, 1 
= on). Anyways, once wpa_supplicant import/update is done, I'll probably have 
a look into that again.

  For now just add wpa_supplicant_flags=- to rc.conf.
 :
 :-}  That, or decide to ignore the silly messages  Noted, thanks.
 
 Peace,
 david

-- 
Bernhard


Re: calcru: runtime went backwards

2010-11-03 Thread Aragon Gouveia
I recently saw these on a Dell server.  It was caused by power saving 
being enabled in the BIOS in a mode where the BIOS takes control instead 
of handing it off to the OS.  Disable power saving or set it such that 
the OS has control (and then enable powerd(8) if you like).



Regards,
Aragon


On 10/30/10 21:19, David Rhodus wrote:

I haven't seen much of this since 5.x days.  Anyone else see calcru
messages lately ?

-DR

NFS# uname -a
FreeBSD NFS.Lesmilde.com 9.0-CURRENT FreeBSD 9.0-CURRENT #2: Fri Oct
29 01:07:40 CDT 2010
r...@nfs.lesmilde.com:/usr/obj/usr/src/sys/GENERIC  amd64
NFS# tail -25 /var/log/messages
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 91464
usec to 40935 usec for pid 2709 (csh)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 4334
usec to 1927 usec for pid 2134 (getty)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 4814
usec to 2140 usec for pid 2133 (getty)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 4752
usec to 2113 usec for pid 2132 (getty)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 5322
usec to 2366 usec for pid 2131 (getty)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 5183
usec to 2304 usec for pid 2130 (getty)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 4495
usec to 1998 usec for pid 2129 (getty)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 4501
usec to 2001 usec for pid 2128 (getty)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 15315
usec to 6809 usec for pid 2127 (login)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from
32057357 usec to 28943929 usec for pid 2063 (cron)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 1381
usec to 613 usec for pid 2015 (rsync)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 1606
usec to 936 usec for pid 1940 (smbd)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 20818
usec to 9600 usec for pid 1895 (smbd)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 18992
usec to 8440 usec for pid 1760 (cupsd)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 3378
usec to 1501 usec for pid 1720 (mountd)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 1458
usec to 648 usec for pid 1681 (nfsuserd)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 568
usec to 308 usec for pid 1335 (devd)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 214373
usec to 95273 usec for pid 1335 (devd)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 965
usec to 428 usec for pid 132 (adjkerntz)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 191
usec to 84 usec for pid 15 (vmdaemon)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 74
usec to 33 usec for pid 7 (sctp_iterator)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 984227
usec to 748883 usec for pid 4 (g_down)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from
1281130 usec to 979529 usec for pid 3 (g_up)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 10320
usec to 4890 usec for pid 1 (init)
Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from
6244341 usec to 2848133 usec for pid 1 (init)
NFS#
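
When comparing reports like the one above, it can help to pull the numbers
out of the log. The following is a rough Python sketch, not something posted
in the thread. The function name parse_calcru is an assumption, and the
regular expression only covers the message format shown in this log.

```python
import re

# Matches the kernel message format seen in the log above.
CALCRU_RE = re.compile(
    r"calcru: runtime went backwards from (\d+) usec to (\d+) usec "
    r"for pid (\d+) \(([^)]+)\)"
)


def parse_calcru(text):
    """Return a list of (pid, name, old_usec, new_usec) tuples."""
    # The mail client wrapped the log lines, so collapse all whitespace
    # (including newlines) to single spaces before matching.
    flat = " ".join(text.split())
    return [
        (int(pid), name, int(old), int(new))
        for old, new, pid, name in CALCRU_RE.findall(flat)
    ]


if __name__ == "__main__":
    sample = """Oct 30 19:13:25 NFS kernel: calcru: runtime went backwards from 91464
usec to 40935 usec for pid 2709 (csh)"""
    for pid, name, old, new in parse_calcru(sample):
        print(pid, name, old - new, "usec lost")
```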



Re: calcru: runtime went backwards

2010-11-03 Thread Sam Fourman Jr.
On Wed, Nov 3, 2010 at 4:36 AM, Aragon Gouveia ara...@phat.za.net wrote:
 I recently saw these on a Dell server.  It was caused by power saving being
 enabled in the BIOS in a mode where the BIOS takes control instead of
 handing it off to the OS.  Disable power saving or set it such that the OS
 has control (and then enable powerd(8) if you like).



We disabled AMD C1E support in the BIOS (it was enabled), and we have
not seen this problem for 2 days now.


-- 

Sam Fourman Jr.
Fourman Networks
http://www.fourmannetworks.com


Re: wpa_supplicant gets points for trying, I suppose....

2010-11-03 Thread David Wolfskill
On Wed, Nov 03, 2010 at 08:27:02AM +0100, Bernhard Schmidt wrote:
 ...
 There is ieee80211_notify_radio(), granted iwn(4) misses the calls.. that 
 function is supposed to notify upper layers about the radio state (0 = off, 1 
 = on). Anyways, once wpa_supplicant import/update is done, I'll probably have 
 a look into that again.
 

Cool.  If you want/need testing, I'll be happy to help.  :-)

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: Openoffice doesn't work with kernel+world built with Clang

2010-11-03 Thread Renato Botelho
On Fri, Oct 22, 2010 at 8:54 AM, Renato Botelho rbga...@gmail.com wrote:
 I have a 9.0-current (r214167) amd64, kernel and world built
 with clang and all ports built with gcc, and i cannot start
 openoffice anymore, it shows splash, start to go up and die.

 If I reinstall world+kernel built with gcc openoffice works fine.

 There is a ktrace result available [1], let me know if you need
 more information or tests.

For now I solved my problem by adding this to /etc/src.conf:

.if ${.CURDIR} == /usr/src/gnu/lib/libgcc
CC=cc
CXX=c++
.endif

This way libgcc_s.so is built using gcc instead of clang and the problem
is gone. I just wonder what other problems we may find, since something in
libgcc_s.so is broken when built with clang.

-- 
Renato Botelho


Re: calcru: runtime went backwards

2010-11-03 Thread Andriy Gapon
on 03/11/2010 12:06 Sam Fourman Jr. said the following:
 On Wed, Nov 3, 2010 at 4:36 AM, Aragon Gouveia ara...@phat.za.net wrote:
 I recently saw these on a Dell server.  It was caused by power saving being
 enabled in the BIOS in a mode where the BIOS takes control instead of
 handing it off to the OS.  Disable power saving or set it such that the OS
 has control (and then enable powerd(8) if you like).


 
 We disabled AMD C1E support in BIOS (it was Enabled)
 now we have not seen this problem for 2 days now.

What revision do you run?
What is the output of the kern.eventtimer sysctl sub-tree?

-- 
Andriy Gapon


Re: Openoffice doesn't work with kernel+world built with Clang

2010-11-03 Thread Renato Botelho
On Wed, Nov 3, 2010 at 11:44 AM, Ed Schouten e...@80386.nl wrote:
 Garga!

 * Renato Botelho rbga...@gmail.com, 20101103 13:36:
 For now i solve my problem adding this to /etc/src.conf

 .if ${.CURDIR} == /usr/src/gnu/lib/libgcc
 CC=cc
 CXX=c++
 .endif

 This way libgcc_s.so is built using gcc instead of clang and the problem
 is gone. I just wonder other problems we can find since something on
 libgcc_s.so is broken when built with clang.

 Would it be hard to figure out which exact object file causes this?

Hi Ed,

I've submitted a ktrace result of the OpenOffice execution [1]. I just
saw it got a SIGBUS at some point, but debugging OpenOffice doesn't
seem to be a trivial task.

I don't know if we can build OO with debug symbols to make it
easier to debug. If you know what I can do to help debugging,
just let me know and I can provide any information.

[1] - http://people.freebsd.org/~garga/ktrace-error2.txt.gz
-- 
Renato Botelho


KDE 3.5.10_6 compiled (was Re: 9-CURRENT: ports/net/kdenetwork3 does not compile)

2010-11-03 Thread Matthias Apitz

Just for the record: After applying the patch for net/kdenetwork3 from
Ed (thanks again for this), the port x11/kde3 (3.5.10_6) compiled fine
on -CURRENT without further tweaking; and it comes up fine too :-)

I have two smaller problems to investigate:

1) I can't zap the Xorg server with CTRL-ALT-BS (I've checked with
xev(1) that the keys are working)

2) after KDE shutdown, the X server restarts again in background and I
must kill the X proc manually...

Any ideas about this?

Thanks

matthias

-- 
Matthias Apitz
t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211
e g...@unixarea.de - w http://www.unixarea.de/


Re: Openoffice doesn't work with kernel+world built with Clang

2010-11-03 Thread Ed Schouten
* Renato Botelho rbga...@gmail.com, 20101103 15:36:
 On Wed, Nov 3, 2010 at 11:44 AM, Ed Schouten e...@80386.nl wrote:
  Garga!
 
  * Renato Botelho rbga...@gmail.com, 20101103 13:36:
  For now i solve my problem adding this to /etc/src.conf
 
  .if ${.CURDIR} == /usr/src/gnu/lib/libgcc
  CC=cc
  CXX=c++
  .endif
 
  This way libgcc_s.so is built using gcc instead of clang and the problem
  is gone. I just wonder other problems we can find since something on
  libgcc_s.so is broken when built with clang.
 
  Would it be hard to figure out which exact object file causes this?
 
 Hi Ed,
 
 I've submitted a ktrace result of openoffice execution [1], i just
 saw it got a SIGBUS at some point, but debug openoffice doesn't
 seem to be a trivial task.
 
 I don't know if we can build OO with debug symbols to make it
 easier to debug. If you know what i can do to help debugging,
 just let me know and i can provide any information.

Well, I mean, can you build some of libgcc's object files with Clang and
others with GCC? Hint: Just build everything with GCC. Afterwards, go
into the object directory, rm some of the .o files and make CC=clang.

Since OOo is a C++ application, I suspect the unwind-related object
files to be the culprit.
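
The procedure suggested above (rebuild only some object files with Clang and
see whether the failure returns) amounts to a bisection. Below is a minimal
Python sketch of that idea, offered as an illustration rather than as what
was actually run. It assumes exactly one object file is at fault. The
callback is_broken is hypothetical: it stands for "rebuild this subset with
Clang, the rest with GCC, and report whether the application still crashes".

```python
def find_culprit(objects, is_broken):
    """Bisect for the single object file that misbehaves when Clang-built.

    objects:   list of object file names.
    is_broken: callable taking a list of names; returns True if building
               exactly those with Clang reproduces the failure.
    """
    candidates = list(objects)
    if not candidates:
        raise ValueError("no object files given")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if is_broken(half):
            candidates = half                     # culprit is in this half
        else:
            candidates = candidates[len(half):]   # culprit is in the rest
    return candidates[0]


if __name__ == "__main__":
    # Hypothetical object names, with a fake oracle standing in for a rebuild.
    names = ["a.o", "b.o", "c.o", "d.o"]
    print(find_culprit(names, lambda subset: "c.o" in subset))
```

Each round halves the candidate set, so the number of rebuilds grows only
logarithmically with the number of object files.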

-- 
 Ed Schouten e...@80386.nl
 WWW: http://80386.nl/




Re: Openoffice doesn't work with kernel+world built with Clang

2010-11-03 Thread Renato Botelho
On Wed, Nov 3, 2010 at 12:44 PM, Ed Schouten e...@80386.nl wrote:
 * Renato Botelho rbga...@gmail.com, 20101103 15:36:
 On Wed, Nov 3, 2010 at 11:44 AM, Ed Schouten e...@80386.nl wrote:
  Garga!
 
  * Renato Botelho rbga...@gmail.com, 20101103 13:36:
  For now i solve my problem adding this to /etc/src.conf
 
  .if ${.CURDIR} == /usr/src/gnu/lib/libgcc
  CC=cc
  CXX=c++
  .endif
 
  This way libgcc_s.so is built using gcc instead of clang and the problem
  is gone. I just wonder other problems we can find since something on
  libgcc_s.so is broken when built with clang.
 
  Would it be hard to figure out which exact object file causes this?

 Hi Ed,

 I've submitted a ktrace result of openoffice execution [1], i just
 saw it got a SIGBUS at some point, but debug openoffice doesn't
 seem to be a trivial task.

 I don't know if we can build OO with debug symbols to make it
 easier to debug. If you know what i can do to help debugging,
 just let me know and i can provide any information.

 Well, I mean, can you build some of libgcc's object files with Clang and
 others with GCC? Hint: Just build everything with GCC. Afterwards, go
 into the object directory, rm some of the .o files and make CC=clang.

 Since OOo is a C++ application, I suspect the unwind-related object
 files to be the culprit.

Bingo! When I build everything but unwind-dw2.o with clang it works.
This is the object that is causing the problem.

-- 
Renato Botelho


Re: calcru: runtime went backwards

2010-11-03 Thread Alexander Churanov
2010/10/30 David Rhodus sdrho...@gmail.com:
 I haven't seen much of this since 5.x days.  Anyone else see calcru
 messages lately ?

The following was performed on my Xen-powered VPS by RootBSD:

$ grep calcru /var/log/messages | wc -l
 125
$ grep calcru /var/log/messages | head -n3
Oct 26 18:05:59 vps-1 kernel: calcru: runtime went backwards from 23
usec to 20 usec for pid 5 (sctp_iterator)
Oct 26 18:43:10 vps-1 kernel: calcru: runtime went backwards from
7227818 usec to 7102572 usec for pid 97039 (emacs)
Oct 26 18:43:10 vps-1 kernel: calcru: runtime went backwards from
1364643 usec to 1340996 usec for pid 97039 (emacs)
$ grep calcru /var/log/messages | tail -n3
Oct 29 14:25:55 vps-1 kernel: calcru: runtime went backwards from
353859 usec to 341567 usec for pid 37537 (zsh)
Oct 29 14:25:55 vps-1 kernel: calcru: runtime went backwards from
2837565876 usec to 2734396425 usec for pid 37537 (zsh)
Oct 29 17:44:47 vps-1 kernel: calcru: runtime went backwards from
111596 usec to 105716 usec for pid 44694 (zsh)
$

Alexander Churanov


Re: calcru: runtime went backwards

2010-11-03 Thread Thomas E. Spanjaard
On 10/30/2010 19:19, David Rhodus wrote:
 I haven't seen much of this since 5.x days.  Anyone else see calcru
 messages lately ?

I only get them right after boot due to ntpd_sync_on_start=YES.

Cheers,
-- 
Thomas E. Spanjaard
t...@netphreax.net
t...@deepbone.net





MTX_DEF versus MTX_SPIN

2010-11-03 Thread mdf
It's not clear to me from the man pages (perhaps I didn't look at the
right one?) in which environments I need a spinlock.  For example, I
wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
(i.e one that was registered with BUS_SETUP_INTR), but I see some code
lying around here that does it and nothing I'm aware of has broken.

Perhaps this comes down to me still not understanding exactly how
interrupts work on FreeBSD.  If I capture the stack in a hard
interrupt, it looks something like:

#0 0xff87b07f43b5 at rnv_hard_intr+0x35
#1 0x8026e7ce at ithread_execute_handlers+0x9e
#2 0x8026ead0 at ithread_loop+0x70
#3 0x8026b84c at fork_exit+0x9c
#4 0x804a7f7e at fork_trampoline+0xe

And there the stack ends.  From my perspective, this doesn't look like
anything was actually interrupted.

By way of explaining what I mean, on AIX we defined 10 levels of
software interrupt, and we trained the kernel debugger to understand
the save frames that were acquired on interrupt to print a full stack.
 So a full stack dump might show something like a few frames from one
interrupt handler, then a save area, then a few frames from a lower
priority interrupt, then another save area, then the base-level stack.
 In this kind of programming environment, it was important to know at
what interrupt level your handler would execute, so that locks
acquired to synchronize between the top-half and bottom-half were
acquired with interrupts disabled to the same level.

So, back to my question.  Is it safe to take a MTX_DEF in a hard
interrupt?  What about a soft interrupt?  I have to assume it's okay
in a soft-interrupt context (swi_sched, callout, etc.), since
softclock() will acquire a MTX_DEF on behalf of a callout.

Thanks,
matthew


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Andriy Gapon
on 03/11/2010 18:27 m...@freebsd.org said the following:
 It's not clear to me from the man pages (perhaps I didn't look at the
 right one?) in which environments I need a spinlock.  For example, I
 wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
 (i.e one that was registered with BUS_SETUP_INTR), but I see some code
 lying around here that does it and nothing I'm aware of has broken.

Such a handler runs in an interrupt thread.
The harder interrupt handler is called an interrupt filter in FreeBSD
terminology.  I think that it was formerly known as a fast interrupt.

-- 
Andriy Gapon


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread mdf
On Wed, Nov 3, 2010 at 9:42 AM, Andriy Gapon a...@icyb.net.ua wrote:
 on 03/11/2010 18:27 m...@freebsd.org said the following:
 It's not clear to me from the man pages (perhaps I didn't look at the
 right one?) in which environments I need a spinlock.  For example, I
 wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
 (i.e one that was registered with BUS_SETUP_INTR), but I see some code
 lying around here that does it and nothing I'm aware of has broken.

 Such a handler runs in an interrupt thread.
 The harder interrupt handler is called interrupt filter in FreeBSD 
 terminology.
  I think that it was formerly known as fast interrupt.

So a MTX_DEF is okay in that environment?

What would be considered best practice for deciding which code should run
in the interrupt handler versus a soft interrupt?  In this case the
kinds of things we have to do at some level of interrupt are:

 - handle a heartbeat interrupt from firmware a few times a second
 - get a DMA completion interrupt (completely handling this requires
calling biodone on all the associated bios)
 - receive an ECC interrupt (this requires reading registers off the
card for details)

At the moment we're on stable/7, but we will be migrating the code
base to something more recent in another year or so, if that affects
the answer.

Is there any documentation on best practices for writing a FreeBSD driver?

Thanks,
matthew


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Andriy Gapon
on 03/11/2010 19:04 m...@freebsd.org said the following:
 On Wed, Nov 3, 2010 at 9:42 AM, Andriy Gapon a...@icyb.net.ua wrote:
 on 03/11/2010 18:27 m...@freebsd.org said the following:
 It's not clear to me from the man pages (perhaps I didn't look at the
 right one?) in which environments I need a spinlock.  For example, I
 wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
 (i.e one that was registered with BUS_SETUP_INTR), but I see some code
 lying around here that does it and nothing I'm aware of has broken.

 Such a handler runs in an interrupt thread.
 The harder interrupt handler is called interrupt filter in FreeBSD 
 terminology.
  I think that it was formerly known as fast interrupt.
 
 So a MTX_DEF is okay in that environment?

Yes, I think so.

 What would best practices be considered for what code should be run
 in the interrupt handler versus a soft interrupt?

Sorry for not going into details, but I personally think that there is no reason
to use soft interrupts.  If you can do everything in interrupt handler (i.e. on
interrupt threads), then that should be all you need.

-- 
Andriy Gapon


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Ryan Stone
On Wed, Nov 3, 2010 at 12:27 PM,  m...@freebsd.org wrote:
 It's not clear to me from the man pages (perhaps I didn't look at the
 right one?) in which environments I need a spinlock.  For example, I
 wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
 (i.e one that was registered with BUS_SETUP_INTR), but I see some code
 lying around here that does it and nothing I'm aware of has broken.

You can get either a hard interrupt handler (or fast handler in FreeBSD
parlance) or a soft handler using BUS_SETUP_INTR.  On FreeBSD 7 and
later, fast interrupt handlers are passed as the filter argument to
BUS_SETUP_INTR and soft handlers are passed as the ithread argument (on
earlier versions you had to pass the INTR_FAST flag to get a fast
handler).  You are correct that fast interrupt handlers may only
acquire spinlocks, not MTX_DEF mutexes.  Soft interrupt handlers have their
own thread associated with them and so it's safe to acquire MTX_DEF
locks in that thread.

In your particular example you are running from the context of a
software interrupt thread, so you are safe to acquire MTX_DEF mutexes.


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Kostik Belousov
On Wed, Nov 03, 2010 at 10:04:13AM -0700, m...@freebsd.org wrote:
 On Wed, Nov 3, 2010 at 9:42 AM, Andriy Gapon a...@icyb.net.ua wrote:
  on 03/11/2010 18:27 m...@freebsd.org said the following:
  It's not clear to me from the man pages (perhaps I didn't look at the
  right one?) in which environments I need a spinlock.  For example, I
  wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
  (i.e one that was registered with BUS_SETUP_INTR), but I see some code
  lying around here that does it and nothing I'm aware of has broken.
 
  Such a handler runs in an interrupt thread.
  The harder interrupt handler is called interrupt filter in FreeBSD 
  terminology.
   I think that it was formerly known as fast interrupt.
 
 So a MTX_DEF is okay in that environment?
 
 What would best practices be considered for what code should be run
 in the interrupt handler versus a soft interrupt?  In this case the
 kinds of things we have to do at some level of interrupt are:
 
  - handle a heartbeat interrupt from firmware a few times a second

Doing this in the filter would only assert that interrupts are not
disabled. If you perform the heartbeat notification from the interrupt
thread instead, you have some assurance that scheduling works.

  - get a DMA completion interrupt (completely handling this requires
 calling biodone on all the associated bios)

Calling into geom and possibly fs/VFS level should be done from the
interrupt thread. I thought that the g_up thread is used to handle
the completion of I/O?

  - receive an ECC interrupt (this requires reading registers off the
 card for details)
 
 At the moment we're on stable/7, but we will be migrating the code
 base to something more recent in another year or so, if that affects
 the answer.
 
 Is there any documentation on best practices for writing a FreeBSD driver?

Not that I am aware of. You can read locking(9) in HEAD to get the answer
to your question about spin mutexes.




Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread John Baldwin
On Wednesday, November 03, 2010 1:04:13 pm m...@freebsd.org wrote:
 On Wed, Nov 3, 2010 at 9:42 AM, Andriy Gapon a...@icyb.net.ua wrote:
  on 03/11/2010 18:27 m...@freebsd.org said the following:
  It's not clear to me from the man pages (perhaps I didn't look at the
  right one?) in which environments I need a spinlock.  For example, I
  wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
  (i.e one that was registered with BUS_SETUP_INTR), but I see some code
  lying around here that does it and nothing I'm aware of has broken.
 
  Such a handler runs in an interrupt thread.
  The harder interrupt handler is called interrupt filter in FreeBSD 
  terminology.
   I think that it was formerly known as fast interrupt.
 
 So a MTX_DEF is okay in that environment?

Yes.  In fact, the reason to have threads for interrupt handlers is to allow
interrupt handlers to use non-spin locks that block when the lock is held.

MTX_SPIN locks are generally not needed in device drivers.  The only reason a
driver would use one is if it used a filter handler which does not run in a
threaded context.

 What would best practices be considered for what code should be run
 in the interrupt handler versus a soft interrupt?  In this case the
 kinds of things we have to do at some level of interrupt are:
 
  - handle a heartbeat interrupt from firmware a few times a second
  - get a DMA completion interrupt (completely handling this requires
 calling biodone on all the associated bios)
  - receive an ECC interrupt (this requires reading registers off the
 card for details)
 
 At the moment we're on stable/7, but we will be migrating the code
 base to something more recent in another year or so, if that affects
 the answer.

I suspect all of these are fine to handle in a regular interrupt handler.
If you need to run a task that needs to block on a condition (e.g. cv_*wait*()
or *sleep()), then you probably want to use a task to defer that to a
taskqueue.  At this point taskqueues are probably the closest thing FreeBSD
really has to a true software interrupt.  FreeBSD does have software interrupts
still, but the taskqueue API is actually easier to work with for device
drivers.
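
The rules stated in this thread can be summarised as a small lookup table.
The Python sketch below is only a reading aid and is not kernel code. The
context labels ("filter", "ithread", "taskqueue"), the table CONTEXT_RULES
and both function names are assumptions made for illustration.

```python
# What each execution context may use, as described in this thread:
#   filter handler   -> spin mutexes only (no thread context)
#   interrupt thread -> may also block on MTX_DEF mutexes
#   taskqueue task   -> may additionally wait on a condition (cv_*wait*, *sleep)
CONTEXT_RULES = {
    "filter": {"MTX_SPIN"},
    "ithread": {"MTX_SPIN", "MTX_DEF"},
    "taskqueue": {"MTX_SPIN", "MTX_DEF", "sleep"},
}


def allowed(context, primitive):
    """Return True if `primitive` may be used in `context`."""
    return primitive in CONTEXT_RULES[context]


def where_to_run(needs):
    """Return the first context, from least to most deferred, that permits
    every primitive in `needs`."""
    for context in ("filter", "ithread", "taskqueue"):
        if set(needs) <= CONTEXT_RULES[context]:
            return context
    raise ValueError("unknown primitive in %r" % (sorted(needs),))


if __name__ == "__main__":
    print(where_to_run({"MTX_DEF"}))
    print(where_to_run({"MTX_DEF", "sleep"}))
```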

 Is there any documentation on best practices for writing a FreeBSD driver?

Not really. :-/

-- 
John Baldwin


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Andriy Gapon
on 03/11/2010 19:11 Kostik Belousov said the following:
 On Wed, Nov 03, 2010 at 10:04:13AM -0700, m...@freebsd.org wrote:
 Is there any documentation on best practices for writing a FreeBSD driver?
 Not that I am aware of. You can read locking(9) in HEAD to get the answer
 on your question about spin mutexes.

BTW, I think that BUS_SETUP_INTR(9) contains the most up-to-date information on
the interrupt handling piece of a driver.  Unlike e.g. ithread(9).

-- 
Andriy Gapon


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Julian Elischer

On 11/3/10 10:17 AM, John Baldwin wrote:

On Wednesday, November 03, 2010 1:04:13 pm m...@freebsd.org wrote:


So a MTX_DEF is okay in that environment?

Yes.  In fact, the reason to have threads for interrupt handlers is to allow
interrupt handlers to use non-spin locks that block when the lock is held.

MTX_SPIN locks are generally not needed in device drivers.  The only reason a
driver would use one is if it used a filter handler which does not run in a
threaded context.


It should be noted that in the case where you really just want to spin for a
few instructions because some other thread is accessing a structure you want,
a MTX_DEF mutex will spin briefly before descheduling you, so you don't
always incur the scheduling overhead.
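
The idea of spinning briefly before giving up the CPU can be illustrated in
userland. The Python sketch below is an analogy only and says nothing about
how the kernel implements its mutexes. The function name adaptive_acquire
and the spin_tries parameter are assumptions.

```python
import threading


def adaptive_acquire(lock, spin_tries=100):
    """Acquire `lock`, spinning a bounded number of times before blocking.

    Returns "spun" if a non-blocking attempt succeeded, or "blocked" if the
    caller had to fall back to a blocking acquire.
    """
    for _ in range(spin_tries):
        if lock.acquire(blocking=False):   # cheap attempt, no sleeping
            return "spun"
    lock.acquire()                         # give up spinning and block
    return "blocked"


if __name__ == "__main__":
    lock = threading.Lock()
    print(adaptive_acquire(lock))          # uncontended, so no blocking needed
    lock.release()
```

If the holder releases the lock quickly, the caller never pays for a blocking
wait; otherwise it falls back to one.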



Re: minidump size on amd64

2010-11-03 Thread Andriy Gapon
on 08/10/2010 10:46 Alan Cox said the following:
 Andriy Gapon wrote:
 Here's an updated patch:
 http://people.freebsd.org/~avg/amd64-minidump.3.diff
 
 The kernel part of the patch looks good.  That said, I have one suggestion.
 The current generation of AMD and Intel processors has support for 1GB pages.
 If you want to make sure that this change will last us a long time, I would
 suggest translating the old trick of generating a fake page table page for
 2MB pages into generating a fake page directory page for 1GB pages, rather
 than disposing of this code.

Let me double-check if I understand your suggestion correctly.
So, if a 1GB page is used, then there will be a PDP entry for it with the PS
bit set, and PD entries for the range covered by the page will not be correctly
set up?  That is, I assumed that even if 1GB pages are used, then the
corresponding PD entries are still correctly set up.  But you say that this may
not be the case in reality?

So, I have to check the PDP entry first, and only if it's valid and it doesn't
point to a 1GB page should I examine the corresponding PD entries.
Correct?
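
The check order described above can be sketched as follows. This is an
illustrative Python model and not the patch itself. The bit positions follow
the x86-64 paging format (bit 0 = valid, bit 7 = PS). PG_V, PG_PS and
PG_FRAME mirror the usual amd64 names, while PG_1G_FRAME and classify_pdpe
are names invented for this sketch.

```python
PG_V = 0x001                      # entry is valid (present)
PG_PS = 0x080                     # entry maps a large page directly
PG_FRAME = 0x000FFFFFFFFFF000     # 4KB-aligned physical address bits
PG_1G_FRAME = 0x000FFFFFC0000000  # 1GB-aligned physical address bits


def classify_pdpe(pdpe):
    """Decide how a page-directory-pointer entry should be handled.

    Returns a (kind, physical_address) pair where kind is one of
    "invalid", "1g_page" or "page_directory".
    """
    if not pdpe & PG_V:
        return ("invalid", None)              # nothing mapped here
    if pdpe & PG_PS:
        return ("1g_page", pdpe & PG_1G_FRAME)  # no PD entries to walk
    return ("page_directory", pdpe & PG_FRAME)  # walk the PD it points to


if __name__ == "__main__":
    kind, addr = classify_pdpe(0x40000000 | PG_PS | PG_V)
    print(kind, hex(addr))
```

Only the "page_directory" case requires examining PD entries.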

P.S. is there a macro for extracting frame address from PDPE?

-- 
Andriy Gapon


Re: ZFS v28 is ready for wider testing.

2010-11-03 Thread Olivier Smedts
2010/8/31 Pawel Jakub Dawidek p...@freebsd.org:
 Hello.

 I'd like to give you ZFS v28 for testing. If you are neither brave nor
 mad, you can stop here.

 The patchset is very experimental. It can eat your cookie and hurt your
 teddy bear, so be warned. Don't try it for anything except testing.

 This patchset is also a message we, as the FreeBSD project, would like
 to send to our users: Even though OpenSolaris is dead, the ZFS file
 system is going to stay in FreeBSD. At this point we have quite a few
 developers involved in ZFS on FreeBSD as well as several companies.
 We are also looking forward to working with IllumOS.

 So, what does this new ZFS bring?

 - Data deduplication. Read more here:

        http://blogs.sun.com/bonwick/entry/zfs_dedup

 - Triple parity RAIDZ (RAIDZ3). Read more here:

        http://dtrace.org/blogs/ahl/2009/07/21/triple-parity-raid-z/

 - zfs diff. Read more here:

        http://arc.opensolaris.org/caselog/PSARC/2010/105/20100328_tim.haley

 - zpool split. Read more here:

        http://arc.opensolaris.org/caselog/PSARC/2009/511/20090924_mark.musante

 - Snapshot holds. Read more here:

        http://arc.opensolaris.org/caselog/PSARC/2009/297/20090511_chris.kirby

 - zpool import -F. Allows rewinding a corrupted pool to an earlier
  transaction group.

 - Possibility to import a pool in read-only mode.

 And much, much more, including plenty of performance improvements and bug
 fixes.

 So test whatever you can and report back. Look for regressions, strange
 behaviour, missing features, deadlocks, livelocks, performance
 degradation, etc.

 The boot code is not updated at all, so booting off of ZFS doesn't
 currently work.

 The patch is against today's FreeBSD HEAD.

 The patch enables (in sys/modules/zfs/Makefile) ZFS internal debugging,
 please don't turn it off. Also, compile your kernel with the following
 options:

        options         KDB
        options         DDB
        options         INVARIANTS
        options         INVARIANT_SUPPORT
        options         WITNESS
        options         WITNESS_SKIPSPIN
        options         DEBUG_LOCKS
        options         DEBUG_VFS_LOCKS

 Ignore all the LOR (Lock Order Reversal) reports from WITNESS. There will
 be plenty of those, and you'll desperately want to report them, but please
 don't.

 The best way to report a problem is to reply to this e-mail with the
 shortest possible procedure to reproduce it, plus debugging info. I'd
 prefer a textdump if possible. Below is a quick procedure for setting up
 textdumps:

        Choose spare/swap disk/partition in your system, let's say it is
        /dev/ad0s1b.

        Add the following line to /etc/fstab:

                /dev/ad0s1b     none    swap    sw      0       0

        Add the following line to /etc/rc.conf:

                ddb_enable=YES

        Run the following commands:

                # /etc/rc.d/swap1 start
                # /etc/rc.d/dumpon start
                # /etc/rc.d/ddb start

        This will set up swap, mark it as the dump device and set up some
        DDB scripts. Or you can just reboot.

        Now when your system panics or deadlocks, enter DDB and run the
        following command:

                ddb run kdb.enter.panic

        It will execute all the commands I need, dump their output in text
        format to your swap device and reboot the machine.

        After the reboot, you should find a textdump.tar.0 file in the
        /var/crash/ directory. This is the debug info I need.

 End of textdumps procedure.

 Ok, now that I know you read everything carefully, here is the patch:

        http://people.freebsd.org/~pjd/patches/zfs_20100831.patch.bz2

Hello,

Any status update on this ? I regularly check
http://people.freebsd.org/~pjd/patches/ to see if there's an updated
version of your patch. 2 months old is quite a bit for -CURRENT, which
often receives commits on zfsco parts.

Thanks for all your work on FreeBSD (not only ZFS).


 Good luck! :

 --
 Pawel Jakub Dawidek                       http://www.wheelsystems.com
 p...@freebsd.org                           http://www.FreeBSD.org
 FreeBSD committer                         Am I Evil? Yes, I Am!


-- 
Olivier Smedts                                                 _
                                        ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org        - against HTML email  vCards  X
www: http://www.gid0.org    - against proprietary attachments / \

  Il y a seulement 10 sortes de gens dans le monde :
  ceux qui comprennent le binaire,
  et ceux qui ne le comprennent pas.


Re: minidump size on amd64

2010-11-03 Thread Andriy Gapon
on 03/11/2010 20:44 Alan Cox said the following:
[snip]

Thank you for the confirmation!

 Andriy Gapon wrote:
 P.S. is there a macro for extracting frame address from PDPE?
 To a lower level page table page or to a 1GB physical page?  For the latter,
 you can use PG_PS_FRAME.

To a 1GB page.
I see in the architecture manual that the lower bits are marked as MBZ, so this
macro should work.  Actually, it seems that even PG_FRAME should work in place
of PG_PS_FRAME for exactly the same reason (MBZ) on amd64.
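
To make the mask argument concrete, here is a small userland sketch.  The two
mask values are written as I recall them from the amd64 pmap headers and
should be treated as assumptions, as should the helper name pdpe_1gb_frame and
the GB1_MASK constant.  It shows that masking a 1GB PDPE with PG_PS_FRAME
yields a 1GB-aligned address when the MBZ bits really are zero.

```c
#include <assert.h>
#include <stdint.h>

#define PG_FRAME	0x000ffffffffff000ULL	/* assumed: bits 51:12 */
#define PG_PS_FRAME	0x000fffffffe00000ULL	/* assumed: bits 51:21 */
#define GB1_MASK	((1ULL << 30) - 1)	/* hypothetical: low 30 bits */

/* Hypothetical helper: physical base address of a 1GB page from its PDPE. */
uint64_t
pdpe_1gb_frame(uint64_t pdpe)
{
	uint64_t pa = pdpe & PG_PS_FRAME;

	/* Bits 29:21 are MBZ for a 1GB mapping, so pa must be 1GB aligned. */
	assert((pa & GB1_MASK) == 0);
	return (pa);
}
```

One caveat worth checking against the manual: in large-page entries bit 12 is
the PAT bit rather than an MBZ bit, so PG_FRAME and PG_PS_FRAME give the same
result only while that bit is clear.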

Thank you!
-- 
Andriy Gapon


Sanity Check, HP Bigger Iron DL580G7

2010-11-03 Thread Sean Bruno
Been working on getting this machine serviceable under FreeBSD with HP.
I need to try out some patches for other problems, but I wanted to link
folks to the pciconf output first.

Most importantly, I see an unsupported ethernet controller in this box,
but others may find other devices of interest as well.

Sean

http://people.freebsd.org/~sbruno/dl580g7_pciconf.txt




Re: ZFS v28 is ready for wider testing.

2010-11-03 Thread Pawel Jakub Dawidek
On Wed, Nov 03, 2010 at 07:28:15PM +0100, Olivier Smedts wrote:
         http://people.freebsd.org/~pjd/patches/zfs_20100831.patch.bz2
 
 Hello,
 
 Any status update on this ? I regularly check
 http://people.freebsd.org/~pjd/patches/ to see if there's an updated
 version of your patch. 2 months old is quite a bit for -CURRENT, which
 often receives commits on zfsco parts.
 
 Thanks for all your work on FreeBSD (not only ZFS).

It took a while, but I should have something new shortly. I recently
finished boot support for v28 (the most-missed feature in the previous
patch?) and will work on a new patch soon. I'm heading to meetBSD
California tomorrow and I'll be back in a week, so nothing will happen
till then for sure.

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!




Re: minidump size on amd64

2010-11-03 Thread Andriy Gapon

So, here is the next version of the patch:
http://people.freebsd.org/~avg/amd64-minidump.4.diff

Changes since the last version:
1. libkvm - try to support both the new and the previous formats/versions of
amd64 minidump.  I am not entirely sure about the style in which I handled
version 1 minidumps.  Identifier names like pmapsize (for page map size) and
page_map could also be improved, perhaps.
2. kernel - implemented dumping of 1GB pages via fake 512 x 2MB pages per
Alan's suggestion.
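
For readers who want to picture item 2, here is a rough userland sketch of
turning one 1GB PDPE into a fake page directory page of 512 x 2MB entries.
It is not taken from the patch: the function name fake_pd_for_1gb, the
SIZE_2MB constant and the choice of flag bits on the fake entries are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PG_V		0x001ULL		/* valid */
#define PG_RW		0x002ULL		/* writable */
#define PG_PS		0x080ULL		/* large page */
#define PG_PS_FRAME	0x000fffffffe00000ULL	/* assumed mask, bits 51:21 */
#define NPDEPG		512			/* PD entries per PD page */
#define SIZE_2MB	(1ULL << 21)		/* hypothetical name */

/*
 * Hypothetical helper: fill fake_pd with 512 entries, each describing one
 * 2MB slice of the 1GB page that pdpe maps.
 */
void
fake_pd_for_1gb(uint64_t pdpe, uint64_t fake_pd[NPDEPG])
{
	uint64_t pa = pdpe & PG_PS_FRAME;
	int i;

	assert((pdpe & PG_V) != 0 && (pdpe & PG_PS) != 0);
	for (i = 0; i < NPDEPG; i++)
		fake_pd[i] = (pa + (uint64_t)i * SIZE_2MB) |
		    PG_V | PG_RW | PG_PS;
}
```

A reader of the dump then sees an ordinary page directory full of 2MB
mappings and never needs to know a 1GB page was involved.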

The change is only compile tested so far.  Not sure if it's possible to test
handling 1GB pages yet :-)

As always, reviews, testing and suggestions are very welcome.
Thank you for your help!
-- 
Andriy Gapon


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Rick Macklem
 
  Is there any documentation on best practices for writing a FreeBSD
  driver?
 
 Not really. :-/
 
Just a dumb obvious suggestion. Imho, there is no better doc. than some
well written code, so maybe someone familiar with the drivers can suggest
one (or two) that they consider well written and use the current conventions
as examples?

rick, who never trusts documentation anyhow;-)


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Julian Elischer

On 11/3/10 10:52 AM, Julian Elischer wrote:

On 11/3/10 10:17 AM, John Baldwin wrote:

On Wednesday, November 03, 2010 1:04:13 pm m...@freebsd.org wrote:


So a MTX_DEF is okay in that environment?
Yes.  In fact, the reason to have threads for interrupt handlers is to allow
interrupt handlers to use non-spin locks that block when the lock is held.


MTX_SPIN locks are generally not needed in device drivers.  The only reason a
driver would use one is if it used a filter handler which does not run in a
threaded context.



oops
a line got deleted I think..
It should be noted that in the case where you really just want to spin a few
instructions because some other thread is accessing a structure you want,

 ... then the MTX_DEF code will spin for a short while before ...

descheduling you. so you don't always incur the scheduling overhead.
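
The spin-briefly-then-give-up-the-CPU idea can be illustrated with a toy
userland lock.  This is only an analogy and not the kernel's mutex code:
struct toy_mtx, toy_mtx_lock, toy_mtx_unlock and SPIN_LIMIT are made-up names,
and sched_yield() stands in for being descheduled.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define SPIN_LIMIT	100	/* hypothetical: tries before yielding */

struct toy_mtx {
	atomic_flag held;
};

/*
 * Acquire the lock.  Returns 0 if it was obtained while spinning and 1 if
 * the caller had to yield the CPU at least once.
 */
int
toy_mtx_lock(struct toy_mtx *m)
{
	int i, yielded = 0;

	assert(m != NULL);
	for (;;) {
		/* Cheap path: the holder may release within a few tries. */
		for (i = 0; i < SPIN_LIMIT; i++) {
			if (!atomic_flag_test_and_set_explicit(&m->held,
			    memory_order_acquire))
				return (yielded);
		}
		/* Expensive path: stop burning cycles and let others run. */
		sched_yield();
		yielded = 1;
	}
}

void
toy_mtx_unlock(struct toy_mtx *m)
{
	assert(m != NULL);
	atomic_flag_clear_explicit(&m->held, memory_order_release);
}
```

When the lock is free or released quickly, the caller returns from the inner
loop and the scheduling cost is never paid, which is the point made above.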



Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Julian Elischer

On 11/3/10 2:56 PM, Rick Macklem wrote:

Is there any documentation on best practices for writing a FreeBSD
driver?

Not really. :-/


Just a dumb obvious suggestion. Imho, there is no better doc. than some
well written code, so maybe someone familiar with the drivers can suggest
one (or two) that they consider well written and use the current conventions
as examples?


we try every now and then to put good examples in /usr/share/examples
but I think what's there is hopelessly out of date.


rick, who never trusts documentation anyhow;-)


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread Rick Macklem
 On 11/3/10 2:56 PM, Rick Macklem wrote:
  Is there any documentation on best practices for writing a FreeBSD
  driver?
  Not really. :-/
 
  Just a dumb obvious suggestion. Imho, there is no better doc. than
  some
  well written code, so maybe someone familiar with the drivers can
  suggest
  one (or two) that they consider well written and use the current
  conventions
  as examples?
 
 we try every now and then to put good examples in /usr/share/examples
 but I think what's there is hopelessly out of date.
 
Yep, that's the inevitable problem with this kind of doc. What I was
suggesting was to list a couple of the current drivers in src/sys as
good examples of best practice and hope they stay current, if they're
in the kernel source tree and being used for current hardware.

But, just a suggestion (and the list could/will get out of date
someday), rick


Re: Web feeds for UPDATING files [and commits too]

2010-11-03 Thread Lapo Luchini
Alexander Kojevnikov wrote:
 The site now features Atom feeds for the following files:
 
 * ports/UPDATING
 * head/UPDATING
 * stable/7/UPDATING
 * stable/8/UPDATING
 
 Hope you find the feeds useful.

Useful indeed! Also, this is probably a nice thread to point out that a
commit log feed exists.

It was made by ale@ many months ago but (as far as GoogleReader says) I
think I'm the only user so far; these are the FeedBurner-cached entries
(to avoid hitting ale@'s ADSL too much):

src/
http://feeds.feedburner.com/FreeBSD-6-src
http://feeds.feedburner.com/FreeBSD-7-src
http://feeds.feedburner.com/FreeBSD-8-src
http://feeds.feedburner.com/FreeBSD-HEAD-src

ports/
http://feeds.feedburner.com/FreeBSD-ports

-- 
Lapo Luchini - http://lapo.it/

“There are two major products that come out of Berkeley: LSD and UNIX.
We don't believe this to be a coincidence.” (Jeremy S. Anderson)



Re: re(4) driver dropping packets when reading NFS files

2010-11-03 Thread Rick Macklem
 
 I'm more interested in number of dropped frames. See below how to
 extract that information.
 

I've attached the stats. I'm guessing that the
Rx missed frames : 14792
is the culprit.

This was for a read of a fairly large file via NFS over TCP,
getting a read rate of about 450Kbytes/sec. (No DEVICE_POLLING option.)
(with your patch applied)

 
  (It almost looks like it only handles the first received packet,
  although
   it appears to be using a receive ring of 64 buffers.)
 
 
 No, re(4) uses 256 TX/RX buffers for RTL810xE controllers.
 

Oops, my mistake. At a quick glance, I had thought rl_type was
set to 8139, but I now see it's 8169.

Btw, I printed out the hwrev and it's a RL_HWREV_8102EL_SPIN1,
if that is of any use to you.

 
 Ok, here is patch.
 http://people.freebsd.org/~yongari/re/re.intr.patch
 
 The patch has the following changes.
 o 64-bit DMA support for PCIe controllers.
 o Hardware MAC statistics counter support. You can extract these
   counters with sysctl dev.re.0.stats=1 and check the output on the
   console or in dmesg. Extracting these counters seems to take a lot
   of time, so I didn't try to accumulate them. You can see how many
   frames are dropped from the output. I saw a lot of FAE (frame
   alignment errors) under high RX load and I can't explain how this
   can happen. This may indicate that the PHY hardware is poor or that
   it needs DSP fixups. Realtek seems to maintain a large set of DSP
   fixups for each PHY revision and re(4) does not have the magic code
   at this moment.
 o Overhaul the MSI interrupt handler so that it gives fairness to TX
   as well as serving RX. Because re(4) controllers do not have an
   interrupt moderation mechanism, a naive interrupt handler can
   generate more than 125k intrs/sec under high load. Fortunately,
   Bill implemented TX interrupt moderation with a timer register and
   it seems to work well on the TX path. One drawback of the approach
   is that it requires extra timer register accesses in the fast path.
   There is no second timer register to use in the RX path, so no RX
   interrupt moderation is done in the driver, and it can generate
   about 25k intrs/sec under high RX load. However, I think most
   systems can handle that interrupt load. Note, this feature is
   activated only when MSI is in use and DEVICE_POLLING is not defined.
 
 From my limited testing, it seems it works as expected. Would you
 give it a try and let me know how well it behaves with NFS?
 
Without DEVICE_POLLING it behaves just like the unpatched one.

I'm going to look at the driver tomorrow and try some hacks on it, rick
re0 statistics:
Transmit good frames : 100966
Receive good frames : 133470
Tx errors : 0
Rx errors : 0
Rx missed frames : 14792
Rx frame alignment errs : 0
Tx single collisions : 0
Tx multiple collisions : 0
Rx unicast frames : 133463
Rx broadcast frames : 0
Rx multicast frames : 7
Tx aborts : 0
Tx underruns : 0

Re: re(4) driver dropping packets when reading NFS files

2010-11-03 Thread Pyun YongHyeon
On Wed, Nov 03, 2010 at 07:27:20PM -0400, Rick Macklem wrote:
  
  I'm more interested in number of dropped frames. See below how to
  extract that information.
  
 
 I've attached the stats. I'm guessing that the
 Rx missed frames : 14792
 is the culprit.
 

Because that counter is 16-bit, it's also possible it wrapped
multiple times. Could you verify that?

 This was for a read of a fairly large file via NFS over TCP,
 getting a read rate of about 450Kbytes/sec. (No DEVICE_POLLING option.)
 (with your patch applied)
 
  
   (It almost looks like it only handles the first received packet,
   although
it appears to be using a receive ring of 64 buffers.)
  
  
  No, re(4) uses 256 TX/RX buffers for RTL810xE controllers.
  
 
 Oops, my mistake. At a quick glance, I had thought rl_type was
 set to 8139, but I now see it's 8169.
 
 Btw, I printed out the hwrev and it's a RL_HWREV_8102EL_SPIN1,
 if that is of any use to you.
 
  
  Ok, here is patch.
  http://people.freebsd.org/~yongari/re/re.intr.patch
  
  The patch has the following changes.
  o 64-bit DMA support for PCIe controllers.
  o Hardware MAC statistics counter support. You can extract these
    counters with sysctl dev.re.0.stats=1 and check the output on the
    console or in dmesg. Extracting these counters seems to take a lot
    of time, so I didn't try to accumulate them. You can see how many
    frames are dropped from the output. I saw a lot of FAE (frame
    alignment errors) under high RX load and I can't explain how this
    can happen. This may indicate that the PHY hardware is poor or that
    it needs DSP fixups. Realtek seems to maintain a large set of DSP
    fixups for each PHY revision and re(4) does not have the magic code
    at this moment.
  o Overhaul the MSI interrupt handler so that it gives fairness to TX
    as well as serving RX. Because re(4) controllers do not have an
    interrupt moderation mechanism, a naive interrupt handler can
    generate more than 125k intrs/sec under high load. Fortunately,
    Bill implemented TX interrupt moderation with a timer register and
    it seems to work well on the TX path. One drawback of the approach
    is that it requires extra timer register accesses in the fast path.
    There is no second timer register to use in the RX path, so no RX
    interrupt moderation is done in the driver, and it can generate
    about 25k intrs/sec under high RX load. However, I think most
    systems can handle that interrupt load. Note, this feature is
    activated only when MSI is in use and DEVICE_POLLING is not defined.
  
  From my limited testing, it seems it works as expected. Would you
  give it a try and let me know how well it behaves with NFS?
  
 Without DEVICE_POLLING it behaves just like the unpatched one.
 

Hmm, that's strange. Are you sure you rebuilt the kernel without the polling
option? Just disabling polling with ifconfig(8) is not enough for the patch
to take effect.

 I'm going to look at the driver tomorrow and try some hacks on it, rick

 re0 statistics:
 Transmit good frames : 100966
 Receive good frames : 133470
 Tx errors : 0
 Rx errors : 0
 Rx missed frames : 14792

If the counter has not wrapped, it seems you lost more than 10% of the
total RX frames. That is a lot of loss and there should be a way to
mitigate it.
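
As a back-of-the-envelope aid, the arithmetic behind the wrap question and the
loss estimate can be sketched like this.  The helper names missed_total and
permille are made up for illustration, and the number of wraps is something
that has to be determined separately; the hardware counter alone cannot tell
you.

```c
#include <assert.h>
#include <stdint.h>

/* True missed-frame count if the 16-bit counter wrapped `wraps` times. */
uint64_t
missed_total(uint16_t raw, unsigned wraps)
{
	return ((uint64_t)wraps * 65536 + raw);
}

/* part/whole in tenths of a percent, rounded down. */
unsigned
permille(uint64_t part, uint64_t whole)
{
	assert(whole > 0);
	return ((unsigned)(part * 1000 / whole));
}
```

With the posted figures (14792 missed, 133470 good) and no wrap, the missed
frames come to about 11% of the good-frame count, or just under 10% of good
plus missed frames.  A single wrap would raise the missed count to 80328.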

 Rx frame alignment errs : 0
 Tx single collisions : 0
 Tx multiple collisions : 0
 Rx unicast frames : 133463
 Rx broadcast frames : 0
 Rx multicast frames : 7
 Tx aborts : 0
 Tx underruns : 0
