Re: Medical database Vidal

2009-01-11 Thread Harald
On Fri, Jan 09, 2009 at 07:10:07PM +, Ben Morrow wrote:
 
> I would guess that your CD has both Rock Ridge and Joliet extensions,
> and that the creator has hidden the Win32-specific files from the Unix
> directory tree because they thought they wouldn't be useful. If for some
> reason you need to see the CD as a Win32 machine would, you can use the
> -r option to mount_cd9660.

Thank you very much indeed for your detailed explanation.

Before asking for help I had tried all the options of mount_cd9660,
one after the other and in combination, without understanding what
they mean. I obviously missed the one that works.

`mount_cd9660 -r /dev/acd0 /cdrom' works like a charm.

`wine /cdrom/setup.exe' does the job as well, unfortunately with a
certain number of `err:' and `fixme:' lines.
`cd path/to/VidalCD ; wine VidalCD.exe' starts the application with
the same or similar error lines (which is not surprising).
The programme does run, but is not really usable: it is too slow,
and exiting cleanly requires pressing `Ctrl+Alt+Backspace'!

I have not yet had time to see whether I can fix anything without
further help.

Harald
-- 
FreeBSD 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Incorrect super block

2010-02-22 Thread Harald
On Fri, Feb 19, 2010 at 01:03:37AM -0800, per...@pluto.rain.com wrote:
> Ivan Voras  wrote:
> > On 02/18/10 16:26, Harald Weis wrote:
> > > Has anybody encountered the following problem ?
> > >
> > > Mac OS X does recognize FreeBSD partitions on USB disks, but
> > > doesn't want to mount them because ``Incorrect super block''.
> > > This is extremely annoying for my ``client'' because he relies
> > > on daily backups on USB keys. Is there a solution ?
> >
> > Are you using UFS1 or UFS2? If one, try the other :)
> 
> This is not the first time I've heard of one OS having problems with
> another OS' instantiation of UFS.  For the particular application at
> hand, I'd use tar(1) to collect the files into a single stream and
> write the tarfile onto a USB key formatted as FAT32.  OS X should
> have no trouble reading a FreeBSD tarfile.

Many thanks for all replies, on- and off-list. I'll try the various
options (ufs1, fat32, etc.) as soon as possible. Prompted by the
off-list post, I learned from sysutils/fusefs-ntfs/pkg-descr that
ntfs doesn't preserve file ownership and access rights. I'm afraid this
is also true for fat32. If that is confirmed, ntfs and fat32 are useless
in combination with rsync(1), which indeed leaves only tar(1) on
fat32|ntfs :-(
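
The reason tar(1) still works on such a file system is that ownership and permissions are recorded inside the archive, not taken from the file system that stores it. A minimal sketch using Python's standard tarfile module; the file names and the helper name are made up for illustration:

```python
import os
import tarfile
import tempfile


def archive_and_inspect(mode=0o640):
    """Archive one file and return the mode and uid stored in the tar header."""
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "report.txt")  # hypothetical file name
        with open(src, "w") as fh:
            fh.write("backup me\n")
        os.chmod(src, mode)

        # The archive is an ordinary file, so it could just as well be
        # written to a FAT32 or NTFS volume.
        archive = os.path.join(work, "backup.tar")
        with tarfile.open(archive, "w") as tar:
            tar.add(src, arcname="report.txt")

        # Read the metadata back from the archive, not from the file system.
        with tarfile.open(archive, "r") as tar:
            member = tar.getmember("report.txt")
            return member.mode & 0o7777, member.uid
```

Whether the stored owner is restored on extraction depends on who extracts the archive and with which options.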

With the permission of the author I have bounced the off-list reply. The
list seems to have filtered it. So here it is:
===
Date: Thu, 18 Feb 2010 10:31:54 -0500
From: Lucas Holt 
Subject: Re: Incorrect super block
To: Harald Weis 

Harald Weis wrote:
>Has anybody encountered the following problem ?
> 
>Mac OS X does recognize FreeBSD partitions on USB disks, but doesn't
>want to mount them because ``Incorrect super block''.
>This is extremely annoying for my ``client'' because he relies on daily
>backups on USB keys. Is there a solution ?
> 
>Thank you in advance.
> 
You could use another file system such as FAT32 or NTFS (fuse ntfs from
ports).  OS X can read both of those.  Apple has been moving away from
UFS support in OS X for a while; Snow Leopard is quite stale on this
front.

Luke
===

Thanks again. I'll report the outcome of the ufs1 experiment a.s.a.p.

Harald




Re: status of flash9/flash10 support in RELENG_7 ?

2009-08-10 Thread Harald
On Sun, Aug 09, 2009 at 11:04:52PM +0100, Ben Morrow wrote:
 
> I was about to say 'I believe the vuxml entry for firefox is incorrect',
> but I see it's been fixed. Neither 3.0.13 nor 3.5.2 are vulnerable, and
> vuxml now correctly reports this.

Today security/vuxml/vuln.xml says:


  
  firefox
  linux-firefox
  3.*,1
  3.*,1  3.0.13,1
  3.5.*,1  3.5.2,1
  

1. Could someone tell me the meaning of the ``*'' values please ?
I can't see the logic of the range lines.

2. Yesterday I installed firefox quickly with ``pkg_add -r firefox3''
and got firefox-3.0.10,1.
Portaudit declares it vulnerable which seems to correspond
to the second range line.
I guess I have to compile firefox3 to be clean ?

Harald



Re: cam SCSI negotiation issues (mpt in that case), only 3.300MB/s transfers

2012-12-05 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 23.11.2012 14:39 (localtime):
> ...
> found out that hint.mpt.0.msi_enable="1"
> solves the interrupt storm problem, although dmesg output still is
> exactly the same:
> mpt0:  port 0x4000-0x40ff mem
> 0xd644-0xd645,0xd642-0xd643 irq 18 at device 0.0 on pci3
> mpt0: MPI Version=1.2.15.0
> Btw, I'm still curious what the sysctl "hw.mpt.0.role" means? With my
> LSI1030 it's "3", on another 1068 it's "1"

Any hint as to what "hw.mpt.0.role" means?


> ...I had no luck using "camcontrol negotiate sa0 -R 40" to
> alter the negotiation parameters.
> It always shows 3,300MB/s and seems to reflect reality

I still need help here, if possible.
Is there a starting point for non-developers? Any SCSI command
"secrets" I can try?
If anybody is interested in solving this but lacks the hardware, I'd
donate an LSI20320IE.

Thanks,

-Harry (not subscribed to freebsd-scsi@)



signature.asc
Description: OpenPGP digital signature


Re: cam SCSI negotiation issues (mpt in that case), only 3.300MB/s transfers

2012-12-05 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 05.12.2012 21:04 (localtime):
> Harald Schmalzbauer wrote on 23.11.2012 14:39 (localtime):
>> ...
>> found out that hint.mpt.0.msi_enable="1"
>> solves the interrupt storm problem, although dmesg output still is
>> exactly the same:
>> mpt0:  port 0x4000-0x40ff mem
>> 0xd644-0xd645,0xd642-0xd643 irq 18 at device 0.0 on pci3
>> mpt0: MPI Version=1.2.15.0
>> Btw, I'm still curious what the sysctl "hw.mpt.0.role" means? With my
>> LSI1030 it's "3", on another 1068 it's "1"
> Any hint for me what "hw.mpt.0.role" means?
>
>
>> ...I had no luck using "camcontrol negotiate sa0 -R 40" to
>> alter the negotiation parameters.
>> It always shows 3,300MB/s and seems to reflect reality
> I'd still need help here if possible.
> Is there any point to start for non-developers? Any SCSI command
> "secrets" I can try?
> If anybody is interested in solving, but lacks hardware, I'd donate one
> LSI20320IE.
>

Found dev.mpt.0.debug, set to 8:

Dec  5 21:05:36 wega kernel: mpt0: exit mpt_intr
Dec  5 21:05:36 wega kernel: mpt0: enter mpt_intr
. (some hundred times repeated)

kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: TMF complete: req 0xff8030ace480:304 status 0x0
kernel: mpt0: exit mpt_intr
kernel: SCSI IO Request @ 0xff823ffa7b00
kernel: Chain Offset  0x00
kernel: MsgFlags  0x00
kernel: MsgContext0x0001012a
kernel: Bus:0
kernel: TargetID0
kernel: SenseBufferLength   32
kernel: LUN:  0x0
kernel: Control   0x0500  NODATATRANSFER  UNTAGGED
kernel: DataLength 0x
kernel: SenseBufAddr   0x911255e0
kernel: CDB[0:6]   00 00 00 00 00 00
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: Address Reply:
kernel: SCSI IO Request Reply @ 0xff823fff9a00
kernel: IOC StatusSuccess
kernel: IOCLogInfo0x
kernel: MsgLength 0x08
kernel: MsgFlags  0x00
kernel: MsgContext0x0001012a
kernel: Bus:  0
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: TMF complete: req 0xff8030ace480:304 status 0x0
kernel: mpt0: exit mpt_intr
kernel: SCSI IO Request @ 0xff823ffa7b00
kernel: Chain Offset  0x00
kernel: MsgFlags  0x00
kernel: MsgContext0x0001012a
kernel: Bus:0
kernel: TargetID0
kernel: SenseBufferLength   32
kernel: LUN:  0x0
kernel: Control   0x0500  NODATATRANSFER  UNTAGGED
kernel: DataLength 0x
kernel: SenseBufAddr   0x911255e0
kernel: CDB[0:6]   00 00 00 00 00 00
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: SCSI IO Request Reply @ 0xff823fff9a00
kernel: IOC StatusSuccess
kernel: IOCLogInfo0x
kernel: MsgLength 0x08
kernel: MsgFlags  0x00
kernel: MsgContext0x0001012a
kernel: Bus:  0
kernel: TargetID  0
kernel: CDBLength 6
kernel: SCSI Status:  Check Condition
kernel: SCSI State:   (0x0001)AutoSense_Valid
kernel: TransferCnt   0x
kernel: SenseCnt  0x0012
kernel: ResponseInfo  0x
kernel: mpt0: exit mpt_intr
kernel: SCSI IO Request @ 0xff80002ae3d0
kernel: Chain Offset  0x00
kernel: MsgFlags  0x00
kernel: MsgContext0x0001012b
kernel: Bus:0
kernel: TargetID0
kernel: SenseBufferLength   18
kernel: LUN:  0x0
kernel: Control   0x02000500  READ  UNTAGGED
kernel: DataLength 0x0024
kernel: SenseBufAddr   0x911257e0
kernel: CDB[0:6]   12 00 00 00 24 00
kernel: 64_BIT_ADDRESSING LAST_ELEMENT END_OF_BUFFER END_OF_LIST
kernel: mpt0: enter mpt_intr
kernel: mpt0: Context Reply: 0x0001012b
kernel: mpt0: exit mpt_intr
kernel: SCSI IO Request @ 0xff80002ae3d0
kernel: Chain Offset  0x00
kernel: MsgFlags  0x00
kernel: MsgContext0x0001012c
kernel: Bus:0
kernel: TargetID0
kernel: SenseBufferLength   18
kernel: LUN:  0x0
kernel: Control   0x02000500  READ  UNTAGGED
kernel: DataLength 0x0038
kernel: SenseBufAddr   0x911259e0
kernel: CDB[0:6]   12 00 00 00 38 00
kernel: SE64 0xff8240020830: Addr=0x57ce18f8
FlagsLength=0xd338
kernel: 64_BIT_ADDRESSING LAST_ELEMENT END_OF_BUFFER END_OF_LIST
kernel: mpt0: enter mpt_intr
kernel: mpt0: Context Reply: 0x0001012c
kernel: mpt0: exit mpt_intr
kernel: mpt0: enter mpt_intr
kernel: mpt0: exit mpt_intr
kernel: mpt0: mpt_get_s
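
For readers of the log above, the `CDB[0:6]` lines are the raw 6-byte SCSI command blocks sent to the target. A minimal sketch for decoding them, assuming only the standard opcodes 0x00 (TEST UNIT READY) and 0x12 (INQUIRY); the function name is made up for illustration and this is not part of the mpt driver:

```python
# Opcode names from the SCSI primary command set; only the two opcodes
# that appear in the log above are listed.
OPCODES = {0x00: "TEST UNIT READY", 0x12: "INQUIRY"}


def decode_cdb6(hex_bytes):
    """Decode a 6-byte CDB given as a string such as '12 00 00 00 24 00'."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 6:
        raise ValueError("expected exactly 6 bytes")
    name = OPCODES.get(cdb[0], "unknown (0x%02x)" % cdb[0])
    # For INQUIRY, byte 4 carries the low byte of the allocation length,
    # i.e. how many bytes of data the initiator asks for.
    alloc_len = cdb[4] if cdb[0] == 0x12 else None
    return name, alloc_len
```

Applied to the log, the two INQUIRY requests ask for 36 and 56 bytes, which agrees with the DataLength values ending in 24 and 38 (hex) printed next to them.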

Re: cam SCSI negotiation issues (mpt in that case), only 3.300MB/s transfers

2012-12-05 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 05.12.2012 22:23 (localtime):
> kernel: IOC StatusSCSI: Data Underrun

Searching for this topic turned up a 53c1030 errata fix:

https://patchwork.kernel.org/patch/94223/

As you may have guessed, I can't make use of it myself, but perhaps
someone else can?

Thanks,

-Harry





can't reach jails own ipv4 from inside anymore

2012-12-19 Thread Harald Schmalzbauer
 Hello,

with 8.2 I could "ssh IPofTheJail" inside the jail and got connected to
the sshd in the corresponding jail. Same with "ssh localhost".

With 9.1, it's not possible anymore.
I have assigned a different FIB to my jail in both cases.

The picture is different for IPv6. "ping6 IPofTheJail" does work!

I have more oddities I wanted to check with jails and lagg-interfaces
together with VLANs, but I have no idea why I can't connect from one
jail to its own IP(v4) anymore!

Was there any special security-extension added after 8.2?

Thanks,

-Harry





Re: FIB and jail regression [Was: can't reach jails own ipv4 from inside anymore]

2012-12-19 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 19.12.2012 12:19 (localtime):
>  Hello,
>
> with 8.2 I could "ssh IPofTheJail" inside the jail and got connected to
> the sshd in the corresponding jail. Same with "ssh localhost".
>
> With 9.1, it's not possible anymore.
> I have assigned a different FIB to my jail in both cases.
>
> The picture is different for IPv6. "ping6 IPofTheJail" does work!
>
> I have more oddities I wanted to check with jails and lagg-interfaces
> together with VLANs, but I have no idea why I can't connect from one
> jail to it's own IP(v4) anymore!

Found out that defining a different FIB causes that behaviour in 9.1.
But using a different FIB doesn't cause the same in 8.2!

Can anybody tell me what has changed regarding FIBs after 8.2?

Thanks,

-Harry





Re: FIB and jail regression [Was: can't reach jails own ipv4 from inside anymore]

2012-12-19 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 19.12.2012 12:56 (localtime):
>  ...
>>
>> I have more oddities I wanted to check with jails and lagg-interfaces
>> together with VLANs, but I have no idea why I can't connect from one
>> jail to it's own IP(v4) anymore!
> Found out that defining a different FIB causes that behaviour in 9.1.
> But using a different FIB doesn't caus the same in 8.2!

Easiest way to reproduce:

Just do a ping on the host (not jail)

setfib 0 ping anyLocalIP -> works
setfib 1 ping anyLocalIP -> doesn't work

Anybody with 9.1 and ROUTINGTABLES in custom kernel out there who can't
confirm that?

It turned out that 9.0-stable from Feb. 2012 doesn't show that problem.
So this problem seems to have been introduced between 9.0 and 9.1.

Thanks,

-Harry
 





Re: FIB and jail regression [Was: can't reach jails own ipv4 from inside anymore]

2012-12-19 Thread Harald Schmalzbauer
Göran Löwkrantz wrote on 19.12.2012 14:44 (localtime):
>
>
> --On December 19, 2012 13:48:34 +0100 Harald Schmalzbauer
>  wrote:
>
>> Harald Schmalzbauer wrote on 19.12.2012 12:56 (localtime):
>>>  ...
>>>>
>>>> I have more oddities I wanted to check with jails and lagg-interfaces
>>>> together with VLANs, but I have no idea why I can't connect from one
>>>> jail to it's own IP(v4) anymore!
>>> Found out that defining a different FIB causes that behaviour in 9.1.
>>> But using a different FIB doesn't caus the same in 8.2!
>>
>> Easiest way to reproduce:
>>
>> Just do a ping on the host (not jail)
>>
>> setfib 0 ping anyLocalIP -> works
>> setfib 1 ping anyLocalIP -> doesn't work
>>
>> Anybody with 9.1 and ROUTINGTABLES in custom kernel out there who can't
>> confirm that?
>>
>> Turned out that 9.0-stable from Feb. 2012 doesn't show that problem.
>> So this problem seems to be introdued between 9.0 and 9.1.
>> Thanks,
>>
>> -Harry
>>
>>
> Works for me:
> # uname -a
> FreeBSD 9.1-PRERELEASE r243951: Fri Dec  7 02:29:14 CET 2012
> # sysctl -a | grep fib
> net.my_fibnum: 0
> net.add_addr_allfibs: 1
> net.fibs: 2
>
> # ifconfig sis1
> sis1: flags=8843 metric 0 mtu
> 1500
> options=83808
> ...
> inet 176.57.193.193 netmask 0xfff0 broadcast 176.57.193.207
> 
> media: Ethernet autoselect (100baseTX )
> status: active
>
> # setfib 0 ping 176.57.193.193
> PING 176.57.193.193 (176.57.193.193): 56 data bytes
> 64 bytes from 176.57.193.193: icmp_seq=0 ttl=64 time=0.497 ms
> 64 bytes from 176.57.193.193: icmp_seq=1 ttl=64 time=0.481 ms
> ^C
> --- 176.57.193.193 ping statistics ---
> 2 packets transmitted, 2 packets received, 0.0% packet loss
> round-trip min/avg/max/stddev = 0.481/0.489/0.497/0.008 ms
> # setfib 1 ping 176.57.193.193
> PING 176.57.193.193 (176.57.193.193): 56 data bytes
> 64 bytes from 176.57.193.193: icmp_seq=0 ttl=64 time=0.912 ms
> 64 bytes from 176.57.193.193: icmp_seq=1 ttl=64 time=0.650 ms
> ^C
> --- 176.57.193.193 ping statistics ---
> 2 packets transmitted, 2 packets received, 0.0% packet loss
> round-trip min/avg/max/stddev = 0.650/0.781/0.912/0.131 ms
>
> I have no kernel with both VIMAGE and ROUTINGTABLES so can't test that,
> this has ROUTINGTABLES 2

I don't have vimage either.

Thanks a lot for your feedback!
That brought one more insight: the problem only affects alias addresses!
I took a different machine and at first couldn't reproduce the problem
there either.
Then I added an additional inet alias -> the problem initially described
occurs.
That's why my jail setup stopped working -> all addresses are alias
addresses.

Any help highly appreciated!

-Harry





Re: FIB and jail regression [Was: can't reach jails own ipv4 from inside anymore]

2012-12-19 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 19.12.2012 13:48 (localtime):
> ...
> Easiest way to reproduce:
>
> Just do a ping on the host (not jail)
>
> setfib 0 ping anyLocalIP -> works
> setfib 1 ping anyLocalIP -> doesn't work
>
> Anybody with 9.1 and ROUTINGTABLES in custom kernel out there who can't
> confirm that?
>
> Turned out that 9.0-stable from Feb. 2012 doesn't show that problem.
> So this problem seems to be introdued between 9.0 and 9.1.

I have to correct myself. As it turned out, the problem only affects
inet alias addresses. I tested the non-alias IP at first, which made it
seem that 9.0-stable from Feb. 2012 was not affected. But as soon as I
repeat the test with an alias IP, there's the same problem and I get
"ping: sendto: Host is down" as answer.

So this regression could be older. Unfortunately I don't have any
machine between 8.2-release and 9.0-stable_2/2012 for testing.
Any hints on how to find out which commit could be the cause?

Thanks,

-Harry






setfacl man page states "d=delete_child" and "D=delete"

2013-02-08 Thread Harald Schmalzbauer
 Hello,

I think there's a confusion in the man page setfacl(1).

In my tests, "D" means "delete_child" and "d" means "delete", as is true
for other NFSv4 implementations. But the man page says it the other way
around.

Since things didn't work as expected when I followed the man page, I
checked the following as a member of group "intern":
1st test, following the man page: this ACL should prevent users of group
"intern" from deleting anything inside "testdir":

>: getfacl testdir
# file: testdir/
# owner: root
# group: intern
owner@:rwxp--aARWcCos:--:allow
group@:rwxp--a-R-c--s:--:allow
group@:-d:--:deny
 everyone@:--a-R-c--s:--:allow

>: ls -l testdir
total 3
drwxr-xr-x  2 root  intern  2  8 Feb 15:38 2nd
-rw-r--r--  1 root  intern  0  8 Feb 15:44 testfile

But:
>: rm testdir/testfile
override rw-r--r--  root/intern for testdir/testfile? y

>: ls -l testdir
total 2
drwxr-xr-x  2 root  intern  2  8 Feb 15:38 2nd

*pow*
"testfile" of directory testdir got unlinked, since write permission to
"testdir" has overridden group-readonly of "testfile" and no
"delete_child" restriction took place.

2nd test, using "D" instead of "d":
#: setfacl -m group@:D::deny shared/testdir

>: getfacl testdir
# file: testdir
# owner: root
# group: intern
owner@:rwxp--aARWcCos:--:allow
group@:rwxp--a-R-c--s:--:allow
group@:D-:--:deny
 everyone@:--a-R-c--s:--:allow

>: ls -l testdir
total 3
drwxr-xr-x  2 root  intern  2  8 Feb 15:38 2nd
-rw-r--r--  1 root  intern  0  8 Feb 15:55 testfile

>: rm testdir/testfile
override rw-r--r--  root/intern for testdir/testfile? y
rm: testdir/testfile: Operation not permitted

>: ls -l testdir
total 3
drwxr-xr-x  2 root  intern  2  8 Feb 15:38 2nd
-rw-r--r--  1 root  intern  0  8 Feb 15:55 testfile


Shall I file a PR? Or do I completely misunderstand things?

Thanks,

-Harry

P.S.: Btw., can anybody explain to me why (at some time, someone decided
that) write permission to a directory overrides the file permissions
inside the directory? I can't see the sense in it.  Of course, there's the
sticky bit, but that's not inheritable. I can't imagine why the sticky
bit doesn't work inverted. The default behaviour should be as with the
sticky bit set, and by setting something like the sticky bit,
optionally, one could empower the users/groups with directory write
permission to override file permissions inside. That's the only thing
I've ever needed.
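
The behaviour questioned in the P.S. follows from the traditional Unix rule that removing a file modifies the directory, not the file. A simplified model of that rule, ignoring ACLs and file flags; the function and parameter names are made up for illustration and this is not kernel code:

```python
import stat


def may_unlink(uid, can_write_dir, dir_mode, dir_owner, file_owner):
    """Model the traditional Unix rule for removing a directory entry.

    The file's own permission bits are never consulted.
    """
    if uid == 0:
        return True              # the superuser bypasses the checks
    if not can_write_dir:
        return False             # write access to the directory is required
    if dir_mode & stat.S_ISVTX:  # sticky bit set on the directory
        return uid in (dir_owner, file_owner)
    return True
```

In the first test above, a member of "intern" with write access to "testdir" falls through to the last line, so the read-only mode of "testfile" plays no part.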





multiple ACEs with the same ACL qualifier

2013-02-08 Thread Harald Schmalzbauer
 Hello,

I'd like to duplicate the following ACL:
# file: /data/shared/
# owner: harry
# group: harry
 group:1stgroup:r-x---a-R-c--s:fd:allow
 group:2ndgroup:rwxp--a-R-c--s:-d:allow
 group:2ndgroup:D-:-d:deny
 group:2ndgroup:r-a-R-c--s:f-i---:allow
owner@:rwxpDdaARWcCos:fd:allow
group@:r-xp--a-R-c--s:fd:allow
 everyone@:--:fd:allow

So there are two "group:2ndgroup:::allow" entries.
While it's annoying that I can't modify a specific one of these with "-m"
(both get altered without warning/confirmation request), I also can't use
"-M" to apply it from a file.

Are there any workarounds?

The intention is to make sure newly created files can only be
deleted/altered by their owner, while two other groups need read-only
access to files and directories; one of them also needs write access but
mustn't delete foreign files/directories.
I've never had so many problems applying real-world needs... I've done
such a setup a hundred times without effort, but on other file systems...

Thanks,

-Harry





problem stoping jails with jail(8), jail.conf and mount.fstab

2013-02-12 Thread Harald Schmalzbauer
 Hello,

on 9.1-R, I highly appreciate the new jail(8) and jail.conf
capabilities. Thanks for that extension!

But I have one problem: If I want to stop a jail with 'jail -r
jailname', I get "umount: unmount of /.jail.jailname failed: Device busy"

It seems to me that the order of the fstab.jailname entries is not
reversed by jail(8) when shutting down/umounting.
My C skills don't allow me to verify/fix that in usr.sbin/jail/command.c

Can anybody help please?

Thanks,

-Harry

(not subscribed to jail@)





Re: problem stoping jails with jail(8), jail.conf and mount.fstab

2013-02-12 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 12.02.2013 15:47 (localtime):
>  Hello,
>
> on 9.1-R, I highly appreciate the new jail(8) and jail.conf
> capabilities. Thanks for that extension!
>
> But I have one problem: If I want to stop a jail with 'jaill -r
> jailname', I get "umount: unmount of /.jail.jailname failed: Device busy"
>
> It seems to me that the order of fstab.jailname entries are not reverted
> by jail(8) when shutting down/umounting.
> My C skills don't allow me to verify/fix that in usr.sbin/jail/command.c

Btw, confirming this experimentally is no problem:
fstab.jail1:
/dev/gpt/jail1ROOT  /.jail.jail1      ufs  ro          0 0
/dev/gpt/jail1VAR   /.jail.jail1/var  ufs  rw,noatime  0 0

Starting the jail with 'jail -c jail1': everything fine.

Stopping the jail with 'jail -r jail1': error when fstab.jail1 is as above,
but the error vanishes if I swap the two lines above before stopping!
So the root cause seems obvious.
But as mentioned, I can't fix that myself :-(
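
The expected unmount order can be illustrated with a short sketch: mount in fstab order, unmount in the reverse order, so nested file systems go before their parents. This is not the jail(8) code from usr.sbin/jail/command.c; the function name is made up and the fstab text is the two-entry example from this message:

```python
def unmount_order(fstab_text):
    """Return mount points in the reverse of their fstab (mount) order."""
    mount_points = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        mount_points.append(fields[1])  # second field is the mount point
    # Nested file systems must be unmounted before their parents,
    # so walk the list backwards.
    return list(reversed(mount_points))


FSTAB_JAIL1 = """\
/dev/gpt/jail1ROOT  /.jail.jail1      ufs  ro          0 0
/dev/gpt/jail1VAR   /.jail.jail1/var  ufs  rw,noatime  0 0
"""

print(unmount_order(FSTAB_JAIL1))
```

For the example fstab this yields /.jail.jail1/var first and /.jail.jail1 second, which is the order in which the manual umount succeeds.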

Thanks,

-Harry





Why scf (sfcd) monitoring sometimes doesn't work

2013-02-14 Thread Harald Schmalzbauer
 Hello,

I found fsc (http://www.freshports.org/sysutils/fsc/) to be extremely
useful.
Unfortunately, I can't get some services to be monitored; "fscadm
enable" just fails with "Could not monitor service."
I don't know how the kqueue interaction works, so I can't guess why
some services can be monitored fine and others not.
How can I start finding out what goes wrong?
What role does the rc-name play in that?

Thanks,

-Harry






Why fsc (fscd) monitoring sometimes doesn't work [Was: Re: Why scf (sfcd) monitoring sometimes doesn't work]

2013-02-14 Thread Harald Schmalzbauer
Harald Schmalzbauer wrote on 14.02.2013 13:34 (localtime):
>  Hello,
>
> I found fsc (http://www.freshports.org/sysutils/fsc/) to be extremely
> useful.
> Unfortunately, I can't get some services to be monitored, "fscadm
> enable" just failes with "Could not monitor service."
> I don't know how kqueue interaction is working, so I can't guess why
> some services can be monitored fine and others not.
> How can I start finding out what goes wrong?
> How does the rc-name play into that role?
>

Sorry for the ugly typo in the topic!





new jail(8) ignoring devfs_ruleset?

2013-02-15 Thread Harald Schmalzbauer
 Hello,

as already posted, on 9.1-R, I highly appreciate the new jail(8) and
jail.conf capabilities. Thanks for that extension!

By accident I saw that "devfs_ruleset" seems to be ignored.
If I list /dev/ I see all the host's disk devices etc.
I set "devfs_ruleset = 4;" and "enforce_statfs = 1;" in jail.conf.
Inside the jail, sysctl security.jail.devfs_ruleset returns "1".
But as mentioned, I can access all devices...

Thanks for any help,

-Harry

(not subscribed to freebsd-jail@)





mount lag, umounting returns wrong "Device busy"

2013-02-15 Thread Harald Schmalzbauer
 Hello,

while playing with the new jail features, I noticed that manually
umounting doesn't work as I'd expect.
After the jail has been destroyed, the following mountpoint is active:
/dev/gpt/jailname1ROOT on /.jail.jailname1 (ufs, local, read-only)

There was var mounted to /.jail.jailname1/var but that successfully umounted.
'fstat' also shows no open files in /.jail.jailname1

But when I do 'umount /.jail.jailname' I get "Device busy" returned.
Some minutes later umounting works.
But I always have to wait some time, although nothing is open and
nothing is mounted above.

Does anybody have an idea what could cause that false "Device busy"?

-Harry





Re: mount lag, umounting returns wrong "Device busy"

2013-02-15 Thread Harald Schmalzbauer
Mateusz Guzik wrote on 15.02.2013 17:50 (localtime):
> On Fri, Feb 15, 2013 at 05:43:16PM +0100, Harald Schmalzbauer wrote:
>>  Hello,
>>
>> while playing with new jail features, I recognized that manually
>> umounting doesn't work as I'd expect.
>> After jail has been destroyed, the following mountpoint is active:
>> /dev/gpt/jailname1ROOT on /.jail.jailname1 (ufs, local, read-only)
>>
>> There was var mounted to /.jail.jailname1/var but that sucessfully umounted.
>> 'fstat' also shows no open files in /.jail.jailname1
>>
>> But when I do 'umount /.jail.jailname' I get "Device busy" returned.
>> Some minutes later umounting works.
>> But I always have to wait some time, although nothing is open and
>> nothing is mounted above.
>>
>> Does anybody have an idea what could cause that false "Device busy"?
>>
> My guess is that the jail was not dead yet and it held a reference for
> /.jail.jailname1's vnode.
>
> jls -v should show the jail.
>
> I don't know if this can happen, but my guess is that not-yet-expired
> network connections hold reference to a jail preventing it from being
> destroyed. So I would definitely checkout netstat output. There may be
> other posibilities, but nothing obvious comes to my mind at the moment.

Good hint, I found out that returning the NIC (using jail with vnet)
takes some time and as soon as the NIC shows up back in the host, I also
can umount the jail's root mount point.
I have no idea about the internals of moving NICs. Is it "normal" that
it takes some time to return the NIC?
Almost every time I remove the jail (jail -r), I have to issue the
command twice. First, I see services getting stopped, but then the line:
  jail: kevent: No such process
'jail -r' cancels at that point (jls shows it active)
After the second 'jail -r' I get the following lines:
.
Terminated
gentlemail: removed
umount: unmount of /.jail.jailname1 failed: Device busy

Then 'jls' doesn't list the jail anymore, but the NIC still doesn't show
up in the host's network stack.
And that's what keeps the root mountpoint busy...
Could that be related to the wrong umount order with 'jail -r'?

Thanks,

-Harry





Re: new jail(8) ignoring devfs_ruleset?

2013-02-18 Thread Harald Schmalzbauer
Jamie Gritton wrote on 16.02.2013 00:40 (localtime):
> On 02/15/13 09:27, Harald Schmalzbauer wrote:
>>   Hello,
>>
>> like already posted, on 9.1-R, I highly appreciate the new jail(8) and
>> jail.conf capabilities. Thanks for that extension!
>>
>> Accidentally I saw that "devfs_ruleset" seems to be ignored.
>> If I list /dev/ I see all the hosts disk devices etc.
>> I set "devfs_ruleset = 4;" and "enforce_statfs = 1;" in jail.conf.
>>Inside the jail,
>> sysctl security.jail.devfs_ruleset returnes "1".
>> But like mentioned, I can access all devices...
>>
>> Thanks for any help,
>>
>> -Harry
>
> devfs_ruleset is only used along with mount.devfs - do you also have
> that set in jail.conf?

Thanks for your response.

Yes, I have mount.devfs; set.
Otherwise I wouldn't have any device inside my jail. Verified - and as
intended, right?
Another notable discrepancy: the man page says that devfs_ruleset is "4"
by default.
But when I don't set devfs_ruleset in jail.conf at all, inside the jail,
'sysctl security.jail.devfs_ruleset' returns 0.
When set, as mentioned above, it returns the corresponding value, but
it doesn't have any effect.
How does devfs_ruleset get handled? Does jail(8) do the whole job? I'd like
to help find the source, but I have missed the whole new jail evolution...
Inside my jails, I don't have an fstab; outside I have them defined and
enabled with "mount" - and noticed the non-reversed umounting.

Thanks,

-Harry






intel kms, xorg and triple head?

2013-02-18 Thread Harald Schmalzbauer
 Hello,

I wasn't able to find information about multi-head support for the new
intel kms with FreeBSD 9.1.
Is it possible to have xorg driving 3 displays? I know of the
two-PLL-pipe limitation of Intel's Ivy Bridge CPUs/GPUs, but I don't
know whether the new driver supports the possible configurations (e.g.
2x1600x1200 + 1x1920x1200).
Is anybody running xorg with 3 displays on i915kms? Or is it at least
said to be supported?

Thanks,

-Harry





pf loosing (v6) TCP states much too early, "no-route" not working with IPv6

2013-05-31 Thread Harald Schmalzbauer
 Hello,

my default pf config blocks everything and allows specific connections.
One of them is "in from x to self port ssh" which expands to "port ssh
keep state flags S/SA" by default.

After ssh login, I see the corresponding entry in the states table:
all tcp 2001:db8:f0bb:1::1[22] <- 2001:db8:f0bb:1::3:1[42730]  
ESTABLISHED:ESTABLISHED

pfctl -s info claims:
TIMEOUTS:
...
tcp.established   86400s
...

After a couple of hours of inactivity, the ssh session silently stalls.
Here's what I have in the log:
rule 3/0(match): block in on rl1: 2001:db8:f0bb:1::3:1.42730 >
2001:db8:f0bb:1::1.22: Flags [P.], ack 1444009640, win 65535, length 48

The rule evaluation by itself is correct: it's not a TCP SYN, so it gets
blocked. But this packet should not reach the ruleset at all, at
least not before 86400s of idle connection. In my case, it was after ~3
hours. And the port numbers are exactly the same as in the state table
entry from some hours before. So the state table entry seems to have
been lost!

My question:

Is such a problem known?
Did I miss anything else?

System runs 8.1-STABLE/x86

Another issue was that "no-route" doesn't work for IPv6 connections. I
had to replace it with "any".

Thanks for any hints in advance,

-Harry

P.S.: It's an embedded box where upgrading is overdue, but not that easy...





Midori > Preferences > Segmentation fault

2013-06-11 Thread Harald Weis
Does anyone use the midori browser in 9.1 ?

Since I switched from 8.3 to 9.1-RELEASE I've got the following problem.

Midori > Edit > Preferences results, nearly every time, in: 



*** NSPlugin Viewer  *** ERROR: rpc_end_sync called when not in sync!   

Segmentation fault (core dumped)



Apart from this everything seems okay, for example flash videos work fine.

Compiling without any option (there are only four) does not help.

Does somebody know whether x...@freebsd.org is aware of this bug ?

Thank you in advance, 
Harald


Re: Midori > Preferences > Segmentation fault

2013-06-12 Thread Harald Weis
On Tue, Jun 11, 2013 at 05:14:05PM -0500, Edwin L. Culp W. wrote:
 > I am and also use chrome, firefox and opera and often dislike them all.  I
 > have minor problems with loads with all but I like Midori for most things.
 > It is still missing plugins, etc. and I dislike searching but not very
 > important.
 > 
 > I'm running 9.1-RELEASE-p3 FreeBSD 9.1-RELEASE-p3 #494 r251615: Tue Jun 11
 > 06:41:14 CDT 2013 and midori midori-0.5.2.

Thank you for replying.

Chrome, Firefox and Opera work fine, but every now and then they become
vulnerable. The annoying thing is that compiling them takes hours.
Midori is lightweight and gives me all I need.

My system is FreeBSD 9.1-RELEASE #0 r243826: Tue Dec  4 06:55:39 UTC 2012.
I cannot see how freebsd-update could help. I'll do the update anyway.

midori -V yields:
Midori midori-0.5.2 ((null))
GTK+ 2.24.18 (2.24.18)    Glib 2.34.3 (2.34.3)
WebKitGTK+ 1.8.3 (1.8.3)  libSoup 2.40.3
cairo 1.10.2 (1.10.2)     libnotify 0.7.3
gcr No                    granite No
single instance Sockets

I have now reported the bug on
https://bugs.launchpad.net/midori

Bye,
Harald


ftp-proxy(8) doesn't respect "-a" (source address for outgoing control connection)

2013-06-20 Thread Harald Schmalzbauer
 Hello,

according to ftp-proxy(8), "-a 1.2.3.4" should instruct ftp-proxy
to use 1.2.3.4 as source address for outgoing control connections.
But it doesn't. It seems to ignore that directive completely, since I can
pass any address, even if the machine doesn't own it. It always uses
the egress interface's inet address - inet6 not tested.

Any ideas why?

Thanks,

-Harry





MFC vm_page.c and vm_phys.c missing [Was: Re: Some missing patches in 9.2-RC1]

2013-08-08 Thread Harald Schmalzbauer
 ...

Can someone please have a look why this wasn't MFCd?
http://svnweb.freebsd.org/base?view=revision&revision=252653

Thanks,

-Harry





Re: status of autotuning freebsd for 9.2

2013-08-16 Thread Harald Schmalzbauer
 Regarding Pascal Drecker's message of 16.07.2013 21:42 (localtime):
> ...
>>>> IMHO, this is considered a new feature, and not a critical bug fix.
>>>> re@ asked from the start of the code slush to avoid new features, and
>>>> at this point, it is too late. It is not worth introducing possible
>>>> regressions, which will only delay the 9.2-RELEASE.
>>>>
>>>> Glen
>>>>
>>> OK, then we need a release notes telling people a sane value for
>>> nmbclusters and friends so that they know how to make 10gigE work.
>>>
>>> I'll poll my team for a value if someone else has one, that would be
>>> even better.
>>>
>>> --
>>> Alfred Perlstein
>>> VP Software Engineering, iXsystems
>>
>>
>>Is there a possibility that a separate unofficial patch set could be
>>released for people who want the autotuning but do not want to run 9
>>stable after 9.2 is released.
>>I would like the autotuning, but I am a little reluctant to use other
>>stable stuff I will get when tracking stable.
>>
>>Regards
>>Johan
>
>Hi,
>
>I think that's a good point.
>
>In our company, it's not allowed to use the stable tree for any
>production system. Little and useful patches are still allowed.
>
>Having a central point with a description of each patch it would be
>much easier to update the release version with the needed patches.

You're welcome to use my "deploy-tools" patchsets.
I'm deploying RELENG only, but with a local patchset policy. Originally,
these are handled automatically during the build process by deploy-tools,
but of course you can selectively/manually apply the desired patches from
the "local-patches" directory:

ftp://ftp.omnilan.de/pub/FreeBSD/OmniLAN/deploy-tools/

Best regards,

-Harry





Re: [CFT] VMware vmxnet3 ethernet driver

2013-08-21 Thread Harald Schmalzbauer
 Regarding Bryan Venteicher's message of 05.08.2013 02:12 (localtime):
> Hi,
>
> I've ported the OpenBSD vmxnet3 ethernet driver to FreeBSD. I did a
> lot of cleanup, bug fixes, new features, etc (+2000 new lines) along
> the way so there is not much of a resemblance left.
>
> The driver is in good enough shape I'd like additional testers. A patch
> against -CURRENT is at [1]. Alternatively, the driver and a Makefile is
> at [2]; this should compile at least as far back as 9.1. I can look at
> 8-STABLE if there is interest.
>
> Obviously, besides reports of 'it works', I'm interested performance vs
> the emulated e1000, and (for those using it) the VMware tools vmxnet3
> driver. Hopefully it is no worse :)

Hello Bryan,

thanks a lot for your hard work!

It seems if_vmx doesn't support jumbo frames. If I set mtu 9000, I get
»vmx0: cannot populate Rx queue 0«. I have no problems using jumbo
frames with vmxnet3.

I took an oldish host (4x2.8GHz Core2[LGA775]) with recent software: ESXi
5.1U1 and FreeBSD-9.2-RC2
Two guests are connected to one MTU9000 "VMware Software Switch".

Simple iperf (standard TCP) results:

vmxnet3jumbo <-> vmxnet3jumbo
5.3 Gbits/sec, load: 40-60%Sys 0.5-2%Intr

vmxnet3 <-> vmxnet3
1.85 Gbits/sec, load: 60-80%Sys 0-0.8%Intr


if_vmx <-> if_vmx
1.51 Gbits/sec, load: 10-45%Sys 40-48%Intr
!!!
if_vmxjumbo <-> if_vmxjumbo not possible


if_em(e1000) <-> if_em(e1000)
1.23 Gbits/sec, load: 80-60%Sys 0.5-8%Intr

if_em(e1000)jumbo <-> if_em(e1000)jumbo
2.27 Gbits/sec, load: 40-30%Sys 0.5-5%Intr


if_igb(e1000e)jumbo <-> if_igb(e1000e)jumbo
5.03 Gbits/sec, load: 70-60%Sys 0.5%Intr

if_igb(e1000e) <-> if_igb(e1000e)
1.39 Gbits/sec, load: 60-80%Sys 0.5%Intr


if_igb(e1000e) <-> if_igb(e1000e), both hw.em.[rt]xd=4096
1.66 Gbits/sec, load: 65-90%Sys 0.5%Intr

if_igb(e1000e)jumbo <-> if_igb(e1000e)jumbo, both hw.em.[rt]xd=4096
4.81 Gbits/sec, load: 65%Sys 0.5%Intr

Conclusion:
if_vmx performs well compared to the regular emulated NICs at standard
MTU, but it's behind the tuned e1000e NIC emulation and can't reach vmxnet3
performance with regular MTU. If one needs throughput, the missing jumbo
frame support in if_vmx is a show stopper.
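
The per-packet cost behind that conclusion can be sketched with a bit of
arithmetic: at a given throughput, a 9000-byte MTU needs roughly a sixth
of the packets per second that a 1500-byte MTU does. A minimal sketch in
Python, assuming every packet carries a full MTU and ignoring
Ethernet/IP/TCP header overhead as well as TSO/LRO effects; the function
name is made up for illustration, and 1.51 Gbit/s is simply the if_vmx
result from the list above.

```python
def packets_per_second(gbits_per_sec: float, mtu_bytes: int) -> float:
    """Approximate packets/s needed to carry the given throughput.

    Assumption: every packet is a full MTU and header overhead is ignored.
    """
    bits_per_packet = mtu_bytes * 8
    return gbits_per_sec * 1e9 / bits_per_packet


if __name__ == "__main__":
    for mtu in (1500, 9000):
        pps = packets_per_second(1.51, mtu)
        print(f"MTU {mtu}: ~{pps:,.0f} packets/s at 1.51 Gbit/s")
```

Fewer packets for the same amount of data means fewer per-packet
operations in the guest and the hypervisor, which is one plausible reason
why the jumbo runs above reach several Gbit/s.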

e1000e is preferable over e1000, even if it is not officially selectable
with "FreeBSD" as guest type (edit the .vmx file and set
ethernet0.virtualDev = "e1000e", and don't forget to set
hw.em.enable_msix=0 in loader.conf, although the driver that attaches to
e1000e is if_igb!)

Thanks,

-Harry





if_em, legacy nic and GbE saturation

2013-08-26 Thread Harald Schmalzbauer
 Hello,

I recycled an older box and put an i350-2 together with a second 82541GI
(PCI slot, one already on-board) into it.
The two i350 ports are used with VMDq for ESXi 5.1.
The two 82541GI are used as lagg NICs by a 9.2-RC (amd64) guest as
passthrough PCI devices.
I've always had good results with such setups, but found out that NICs
which use the legacy driver part of if_em max out at ~0.6Gbits/s (1500 MTU).

There's another NIC on board of this recycled box, a 82566-PHY (ICH9
integrated MAC).
This one also uses if_em, but not the legacy code; it reports version 7.3.8
(compared to 1.0.6).
And it has no problem fully saturating GbE (~925Mbits/s, no jumbo frame
support anyway).

I'm using iperf, with and without lagg (which doesn't change anything, just
as it doesn't influence tests on some other boxes with 82576 and i350 (igb)).
I see enough idle cycles, so the CPU shouldn't limit the legacy if_em NICs.
Also, I see the 82541 consuming around 8k irqs. So does the
82566-PHY, but with much higher throughput...

I'd like to know whether there is any reason why I generally can't expect
older (PCI) GbE NICs to saturate the line... I can remember Tigon cards from
more than a decade ago, which indeed seemed to lack the performance to reach
GbE, but I thought that was no longer an issue shortly afterwards and no
"modern" Intel GbE card had such constraints!?
Is there any special tuning for legacy if_em (no need for any TCP
tuning, the 82566 doesn't have any issue)?

Thanks,

-Harry





Re: if_em, legacy nic and GbE saturation

2013-08-26 Thread Harald Schmalzbauer
 Regarding Adrian Chadd's message of 26.08.2013 10:34 (localtime):
> Hi,
>
> There's bus limits on how much data you can push over a PCI bus. You
> can look around online to see what 32/64 bit, 33/66MHz PCI throughput
> estimates are.
>
> It changes massively if you use small versus large frames as well.
>
> The last time I tried it i couldn't hit gige on PCI; I only managed to
> get to around 350mbit doing TCP tests.

Thanks, I'm roughly aware of the PCI bus limit, but I guess it should
be good for almost GbE: 33*10^6*32 = 1056 Mbit/s, so if one considers
overhead and other bus-blocking things (nothing of significance is active
on the PCI bus in this case), I'd expect at least 800Mbits/s, which is what
I get with jumbo frames.
I also know that lagg won't help with regard to concurrent throughput
because of the PCI limit. But redundancy is the reason why I also use 2 NICs
in that parking machine.
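
The estimate above can be written down as a small sketch for the common
PCI variants. This is only the theoretical peak (one bus-width transfer
per clock), using the same round 33 MHz and 66 MHz figures as the
calculation above; the function name is made up for illustration, and
real-world throughput will be lower because of arbitration and protocol
overhead.

```python
def pci_peak_mbits(clock_hz: float, bus_width_bits: int) -> float:
    """Theoretical peak PCI transfer rate in Mbit/s (one transfer per clock)."""
    return clock_hz * bus_width_bits / 1e6


if __name__ == "__main__":
    for clock_hz, width in ((33e6, 32), (66e6, 32), (33e6, 64), (66e6, 64)):
        peak = pci_peak_mbits(clock_hz, width)
        print(f"{width}-bit @ {clock_hz / 1e6:.0f} MHz: {peak:.0f} Mbit/s peak")
```

For the 32-bit/33 MHz case this gives the 1056 Mbit/s quoted above, which
leaves very little headroom over the GbE line rate.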

I just have no explanation why I see that noticeable difference between
mtu 1500 and 9000 on the legacy if_em NIC, which doesn't show up with the
second on-board NIC (82566), which uses different if_em code.
I can imagine that it's related to PCI transfer limits (the 82566 is
ICH9 integrated, which connects via DMI to the CPU, so no PCI
constraint), but if someone has more than a guess, an explanation
would be highly appreciated :-)

But if you saw similar constraints on other (non-if_em?) PCI-connected
NICs, I'll leave it as it is. I just wanted some kind of confirmation that
it's normal that single GbE doesn't play well with PCI.

Thank you,

-Harry






Re: [CFT] VMware vmxnet3 ethernet driver

2013-08-27 Thread Harald Schmalzbauer
 Regarding Bryan Venteicher's message of 27.08.2013 06:18 (localtime):

...

>> It seems if_vmx doesn't support jumbo frames. If I set mtu 9000, I get
>> »vmx0: cannot populate Rx queue 0«, I have no problems using jumbo
>> frames with vmxnet3.
>>
> This could fail for two reasons - could not allocate an mbuf cluster,
> or the call to bus_dmamap_load_mbuf_sg() failed. For the former, you
> should check vmstat -z. For the later, the behavior of 
> bus_dmamap_load_mbuf_sg()
> changed between 9.1 and 9.2, and I know it was broken for awhile. I don't
> recall exactly when I fixed it (I think shortly after I made the original
> announcement). Could you retry with the files from HEAD @ [1]? Also, there
> are new sysctl oids (dev.vmx.X.mbuf_load_failed & dev.vmx.X.mgetcl_failed)
> for these errors.
>
> I just compiled the driver on 9.2-RC2 with the sources from HEAD and was
> able to change the MTU to 9000.
>
> [1]- http://svnweb.freebsd.org/base/head/sys/dev/vmware/vmxnet3/

Thanks a lot for your ongoing work!
I can confirm that with recent if_vmx.c from head and compiled for
9.2-RC3, setting mtu to 9000 works as expected :-)


>> I took a oldish host (4x2,8GHz Core2[LGA775]) with recent software: ESXi
>> 5.1U1 and FreeBSD-9.2-RC2
>> Two guests are connected to one MTU9000 "VMware Software Switch".
>>
> I've got a few performance things to still look at. What's the sysctl 
> dev.vmx.X output for the if_vmx<->if_vmx tests?

Just repeated the simple if_vmx iperf bench; results vary slightly from
one standard 10sec run to the next, but Intr usage is still noticeably high:

if_vmx <-> if_vmx
1.32 Gbits/sec, load: 10-45%Sys 40-48%Intr

if_vmxJumbo <-> if_vmxJumbo
5.01 Gbits/sec, load: 10-45%Sys 40-48%Intr

Please find attached the different outputs of dev.vmx.X (the mtu9000 run was 
only 3.47GBits/sec in that case, took the numbers anyway)

wbr,

-Harry

dev.vmx.0.%desc: VMware VMXNET3 Ethernet Adapter
dev.vmx.0.%driver: vmx
dev.vmx.0.%location: slot=0 function=0 handle=\_SB_.PCI0.PE40.S1F0
dev.vmx.0.%pnpinfo: vendor=0x15ad device=0x07b0 subvendor=0x15ad 
subdevice=0x07b0 class=0x02
dev.vmx.0.%parent: pci3
dev.vmx.0.ntxqueues: 1
dev.vmx.0.nrxqueues: 1
dev.vmx.0.collapsed: 0
dev.vmx.0.mgetcl_failed: 0
dev.vmx.0.mbuf_load_failed: 0
dev.vmx.0.txq0.ringfull: 133479
dev.vmx.0.txq0.offload_failed: 0
dev.vmx.0.txq0.hstats.tso_packets: 564986
dev.vmx.0.txq0.hstats.tso_bytes: 1686184580
dev.vmx.0.txq0.hstats.ucast_packets: 570604
dev.vmx.0.txq0.hstats.unicast_bytes: 1694679608
dev.vmx.0.txq0.hstats.mcast_packets: 0
dev.vmx.0.txq0.hstats.mcast_bytes: 0
dev.vmx.0.txq0.hstats.error: 0
dev.vmx.0.txq0.hstats.discard: 0
dev.vmx.0.txq0.debug.cmd_head: 106
dev.vmx.0.txq0.debug.cmd_next: 106
dev.vmx.0.txq0.debug.cmd_ndesc: 512
dev.vmx.0.txq0.debug.cmd_gen: 0
dev.vmx.0.txq0.debug.comp_next: 238
dev.vmx.0.txq0.debug.comp_ndesc: 512
dev.vmx.0.txq0.debug.comp_gen: 1
dev.vmx.0.rxq0.hstats.lro_packets: 0
dev.vmx.0.rxq0.hstats.lro_bytes: 0
dev.vmx.0.rxq0.hstats.ucast_packets: 579137
dev.vmx.0.rxq0.hstats.unicast_bytes: 38409312
dev.vmx.0.rxq0.hstats.mcast_packets: 0
dev.vmx.0.rxq0.hstats.mcast_bytes: 0
dev.vmx.0.rxq0.hstats.bcast_packets: 29
dev.vmx.0.rxq0.hstats.bcast_bytes: 1740
dev.vmx.0.rxq0.hstats.nobuffer: 0
dev.vmx.0.rxq0.hstats.error: 0
dev.vmx.0.rxq0.debug.cmd0_fill: 94
dev.vmx.0.rxq0.debug.cmd0_ndesc: 256
dev.vmx.0.rxq0.debug.cmd0_gen: 0
dev.vmx.0.rxq0.debug.cmd1_fill: 0
dev.vmx.0.rxq0.debug.cmd1_ndesc: 256
dev.vmx.0.rxq0.debug.cmd1_gen: 0
dev.vmx.0.rxq0.debug.comp_next: 94
dev.vmx.0.rxq0.debug.comp_ndesc: 512
dev.vmx.0.rxq0.debug.comp_gen: 0
dev.vmx.0.%desc: VMware VMXNET3 Ethernet Adapter
dev.vmx.0.%driver: vmx
dev.vmx.0.%location: slot=0 function=0 handle=\_SB_.PCI0.PE40.S1F0
dev.vmx.0.%pnpinfo: vendor=0x15ad device=0x07b0 subvendor=0x15ad 
subdevice=0x07b0 class=0x02
dev.vmx.0.%parent: pci3
dev.vmx.0.ntxqueues: 1
dev.vmx.0.nrxqueues: 1
dev.vmx.0.collapsed: 0
dev.vmx.0.mgetcl_failed: 0
dev.vmx.0.mbuf_load_failed: 0
dev.vmx.0.txq0.ringfull: 58950
dev.vmx.0.txq0.offload_failed: 0
dev.vmx.0.txq0.hstats.tso_packets: 230508
dev.vmx.0.txq0.hstats.tso_bytes: 4314020112
dev.vmx.0.txq0.hstats.ucast_packets: 235382
dev.vmx.0.txq0.hstats.unicast_bytes: 4356943552
dev.vmx.0.txq0.hstats.mcast_packets: 0
dev.vmx.0.txq0.hstats.mcast_bytes: 0
dev.vmx.0.txq0.hstats.error: 0
dev.vmx.0.txq0.hstats.discard: 0
dev.vmx.0.txq0.debug.cmd_head: 333
dev.vmx.0.txq0.debug.cmd_next: 333
dev.vmx.0.txq0.debug.cmd_ndesc: 512
dev.vmx.0.txq0.debug.cmd_gen: 0
dev.vmx.0.txq0.debug.comp_next: 376
dev.vmx.0.txq0.debug.comp_ndesc: 512
dev.vmx.0.txq0.debug.comp_gen: 0
dev.vmx.0.rxq0.hstats.lro_packets: 0
dev.vmx.0.rxq0.hstats.lro_bytes: 0
dev.vmx.0.rxq0.hstats.ucast_packets: 255918
dev.vmx.0.rxq0.hstats.unicast_bytes: 17043918
dev.vmx.0.rxq0.hstats.mcast_packets: 0
dev.vmx.0.rxq0.hstats.mcast_bytes: 0
dev.vmx.0.rxq0.hstats.bcast_packets: 15
dev.vmx.0.rxq0.hstats.bcast_bytes: 900
dev.vmx.0.rxq0.hstats.nobuffer: 0
dev.vmx.0.rxq0.hstats.error: 0
dev.vmx.0.rxq0.debug.cm

umcs (4-Port-USB-serial) triggering way too much ehci IRQs

2013-09-16 Thread Harald Schmalzbauer
 Hello,

I have some of these 4-port serial USB hubs:
http://www.delock.com/produkte/F_673_USB---Seriell_87414/merkmale.html
They have the MosChip MCS7840 inside, which also understands RS485/422
besides RS232.

FreeBSD's umcs(4) supports the RS232 mode with standard baud rates and
works with that device.

Unfortunately, as soon as I open any of the 4 cuaU0.x ports, there are
500 irq/s from ehci.
These irqs also occur after the port is closed again (until I unload the
umcs module).
These irqs prevent virtual machines and also real hardware from entering
deeper sleep states. In my case, that's a huge amount of lost power saving.
I also saw the Windows driver for the MCS7840 consuming a lot of
CPU cycles (XP guest on ESXi).
Is there any chance to calm this down?

The 4-port Prolific model
(http://www.delock.com/produkte/F_673_USB---Seriell_61518/merkmale.html)
doesn't show these unnecessary irqs, btw.!

Thanks,

-Harry





Re: umcs (4-Port-USB-serial) triggering way too much ehci IRQs

2013-09-17 Thread Harald Schmalzbauer
 Regarding Hans Petter Selasky's message of 17.09.2013 07:14 (localtime):
> Hi,
>
> Check using usbdump -i usbusX -f Y -s 65536 -vvv
>
> what is going on. Maybe some USB transfers are returning zero length data 
> from the chip.

Thanks for your help!
I can't really read the numbers, but these 4 actions loop all the time,
even when no connection is open:
08:39:19.889658 usbus1.4 SUBM-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 16 bytes
 flags 0xa 
 status 0xcb023

08:39:19.891655 usbus1.4 DONE-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=8,IVAL=2,ERR=0
 frame[0] READ 5 bytes
   C1 C1 C1 01 55 -- -- --  -- -- -- -- -- -- -- --  |U   |
 flags 0xa 
 status 0xeb021

08:39:19.891658 usbus1.4 SUBM-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 16 bytes
 flags 0xa 
 status 0xeb023

08:39:19.893656 usbus1.4 DONE-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=8,IVAL=2,ERR=0
 frame[0] READ 5 bytes
   C1 C1 C1 01 55 -- -- --  -- -- -- -- -- -- -- --  |U   |
 flags 0xa 
 status 0xcb021
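
The timestamps in this excerpt already hint at the interrupt rate: the
two DONE-INTR completions are about 2 ms apart. A minimal sketch in
Python that derives the completion rate from such timestamps; the helper
name is made up for illustration, and it assumes each completed interrupt
transfer corresponds to roughly one host-controller interrupt, which the
dump itself does not prove.

```python
from datetime import datetime

# DONE-INTR timestamps copied from the usbdump excerpt above.
done_stamps = ["08:39:19.891655", "08:39:19.893656"]


def completions_per_second(stamps):
    """Average completion rate from a list of HH:MM:SS.ffffff timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    span = (times[-1] - times[0]).total_seconds()
    return (len(times) - 1) / span


if __name__ == "__main__":
    print(f"~{completions_per_second(done_stamps):.0f} completions/s")
```

With a longer capture, passing all DONE timestamps to the same function
would give a more reliable average.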


Thanks,

-Harry






Re: umcs (4-Port-USB-serial) triggering way too much ehci IRQs

2013-09-17 Thread Harald Schmalzbauer
 Regarding Lev Serebryakov's message of 17.09.2013 10:44 (localtime):
> Hello, Hans.
> You wrote on 17 September 2013, 9:14:17:
>
> HPS> Check using usbdump -i usbusX -f Y -s 65536 -vvv
> HPS> what is going on. Maybe some USB transfers are returning zero length 
> data from the chip.
>  Ok, I got 500 irq/s here from my device, so, I have same problem, as
>  topicstarter. My usbdump (9-STABLE) doesn't understand '-f Y', without

The Y is the device addr @ bus #X.

I posted the same some seconds ago :-)

Thanks,

-Harry

>  filter I get:
>
> 12:42:58.929604 usbus3.2 SUBM-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=0,IVAL=2
>  frame[0] READ 16 bytes
>  flags 0xa 
>  status 0xeb023 
> 
> 12:42:58.931601 usbus3.2 
> DONE-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=8,IVAL=2,ERR=0
>  frame[0] READ 5 bytes
>    C1 01 01 01 55 -- -- --  -- -- -- -- -- -- -- --  |U   |
>  flags 0xa 
>  status 0xcb021 
> 
> 12:42:58.931607 usbus3.2 SUBM-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=0,IVAL=2
>  frame[0] READ 16 bytes
>  flags 0xa 
>  status 0xcb023 
> 
> 12:42:58.933601 usbus3.2 
> DONE-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=8,IVAL=2,ERR=0
>  frame[0] READ 5 bytes
>    C1 01 01 01 55 -- -- --  -- -- -- -- -- -- -- --  |U   |
>  flags 0xa 
>  status 0xeb021 
> 
> 12:42:58.933610 usbus3.2 SUBM-INTR-EP=0089,SPD=HIGH,NFR=1,SLEN=0,IVAL=2
>  frame[0] READ 16 bytes
>  flags 0xa 
>  status 0xeb023 
> 
>
>


-- 
OmniLAN - UNIX & Windows Netze + Systeme
Harald Schmalzbauer
Weidmannstraße 16
80997 München
Telefon: +49 (0)89 18947781
Notruf: +49 (0)89 85639293
USt-IdNr.: DE253184753
http://www.omnilan.de/






Re: umcs (4-Port-USB-serial) triggering way too much ehci IRQs

2013-09-17 Thread Harald Schmalzbauer
 Regarding Hans Petter Selasky's message of 17.09.2013 10:57 (localtime):
> On 09/17/13 10:47, Lev Serebryakov wrote:
>> Hello, Harald.
>> You wrote on 17 September 2013, 12:46:25:
>>
>> HS> The Y is the device addr @ bus #X.
>>Oh :)
>>
>> HS> Same posted some seconds ago :-)
>>Yep, exactly the same pattern.
>>
>
>
> Hi,
>
> Could you show the configuration descriptor for your device?
>
> usbconfig -d X.Y dump_curr_config_desc

Shall we switch to non-list-comm?

Again, thanks for your help!
tk:/usr/home/admin/#:21 usbconfig -d 1.4 dump_curr_config_desc
ugen1.4:  at usbus1, cfg=0 md=HOST
spd=HIGH (480Mbps) pwr=ON (100mA)


 Configuration index 0

bLength = 0x0009
bDescriptorType = 0x0002
wTotalLength = 0x0051
bNumInterfaces = 0x0001
bConfigurationValue = 0x0001
iConfiguration = 0x  
bmAttributes = 0x00a0
bMaxPower = 0x0032

Interface 0
  bLength = 0x0009
  bDescriptorType = 0x0004
  bInterfaceNumber = 0x
  bAlternateSetting = 0x
  bNumEndpoints = 0x0009
  bInterfaceClass = 0x00ff
  bInterfaceSubClass = 0x
  bInterfaceProtocol = 0x00ff
  iInterface = 0x  

 Endpoint 0
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0081  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 1
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0002  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 2
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0083  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 3
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0004  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 4
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0085  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 5
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0006  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 6
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0087  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 7
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0008  
bmAttributes = 0x0002  
wMaxPacketSize = 0x0200
bInterval = 0x00ff
bRefresh = 0x
bSynchAddress = 0x

 Endpoint 8
bLength = 0x0007
bDescriptorType = 0x0005
bEndpointAddress = 0x0089  
bmAttributes = 0x0003  
wMaxPacketSize = 0x0010
bInterval = 0x0005
bRefresh = 0x
bSynchAddress = 0x
>
> The interrupt endpoint in question can be throttled by the USB stack,
> if the latency of these events are not important to your application.
>

Hmm, in my case, this 4-port serial USB hub will be used as a console
concentrator. So most of the time it's doing nothing, just feeding tmux
with console output. What latency are we talking about? Less than a few
milliseconds should be fine.
What I'm curious about is why my Prolific USB-serial converter doesn't
generate these high irq rates.
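
The 500 irq/s figure appears to follow from the descriptor of Endpoint 8
above (bEndpointAddress 0x89, interrupt type, bInterval 0x05). For
high-speed interrupt endpoints, USB 2.0 encodes the polling period as
2^(bInterval-1) microframes of 125 us each. A minimal sketch in Python of
that decoding; the function name is made up for illustration, and the
link between polls and host interrupts is an assumption.

```python
MICROFRAME_US = 125  # high-speed USB microframe length in microseconds


def hs_interrupt_period_ms(b_interval: int) -> float:
    """Polling period in ms for a high-speed interrupt endpoint.

    USB 2.0 encodes the period as 2**(bInterval - 1) microframes,
    with bInterval in the range 1..16.
    """
    if not 1 <= b_interval <= 16:
        raise ValueError("bInterval must be 1..16 for high-speed endpoints")
    return (2 ** (b_interval - 1)) * MICROFRAME_US / 1000


if __name__ == "__main__":
    period = hs_interrupt_period_ms(0x05)  # Endpoint 8 in the dump above
    print(f"period {period} ms, ~{1000 / period:.0f} polls/s")
```

A 2 ms period would also be consistent with the IVAL=2 shown by usbdump
for this endpoint.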

Thanks,

-Harry






Re: umcs (4-Port-USB-serial) triggering way too much ehci IRQs

2013-09-17 Thread Harald Schmalzbauer
 Regarding Hans Petter Selasky's message of 17.09.2013 11:24 (localtime):
> On 09/17/13 11:06, Harald Schmalzbauer wrote:
>> ...
>> Shall we switch to non-list-comm?
>
> Hi,
>
> That's OK.
>
>> Hmm, in my case, this 4-port-serial-USB-hub will be used as console
>> concentrator. So most time it's doing nothing, just feeding tmux with
>> consoles output. What latency are we talking about? Less than a some
>> milliseconds should be fine.
>> What I'm curious about is why my prolific USB-serial converter doesn't
>> generate these high irqs.
>
> Try this patch and see what happens:
>
> ==
> --- umcs.c      (revision 255492)
> +++ umcs.c      (local)
> @@ -230,6 +230,7 @@
>                  .bufsize = 0,   /* use wMaxPacketSize */
>                  .callback = &umcs7840_intr_callback,
>                  .if_index = 0,
> +                .interval = 20, /* ms */
>          },
>  };
>
>
> BTW: I see that the umcs driver shouldn't do synchronous control
> transfers from the USB interrupt transfer callback. This should be
> postponed into some worker thread, for example the USB explore thread.
> See USB audio driver for an example.
>
> --HPS

I tried your patch and it works as expected: IRQs decreased to ~64/s
when idle/disconnected.
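
The ~64/s is a little higher than the 50/s that a plain 20 ms interval
would give. One possible explanation, which is purely an assumption here
and not confirmed anywhere in this thread, is that the host controller
schedule only supports power-of-two periods, so that 20 ms is effectively
rounded down to 16 ms. A minimal sketch in Python of that arithmetic,
with made-up helper names:

```python
def polls_per_second(interval_ms: float) -> float:
    """Polling rate for a given polling interval in milliseconds."""
    return 1000.0 / interval_ms


def round_down_pow2(n: int) -> int:
    """Largest power of two that is <= n (for n >= 1)."""
    return 1 << (n.bit_length() - 1)


if __name__ == "__main__":
    requested_ms = 20  # value from the patch
    effective_ms = round_down_pow2(requested_ms)  # assumed rounding
    print(f"requested: {polls_per_second(requested_ms)} polls/s")
    print(f"rounded to {effective_ms} ms: {polls_per_second(effective_ms)} polls/s")
```

Under that assumption the idle rate would be 62.5/s, which is close to
the observed value.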

One interesting thing I never measured before:
A console connection at 115k2 via umcs and 'while ( 2>1 ) echo "---..."
end' results in 8000 irqs/s :-( But that's also true for the Prolific
(uplcom). The latter just goes down to 0.0 irqs/s when idle.

Doing the same with uart0 results in 1444irqs/s.
Is it by design/unavoidable that transferring the same data via USB
multiplies the interrupt rate by a factor of 5-6?

Thanks,

-Harry





Re: umcs (4-Port-USB-serial) triggering way too much ehci IRQs

2013-09-17 Thread Harald Schmalzbauer
 Regarding Lev Serebryakov's message of 16.09.2013 23:37 (localtime):
> ...
>
>   To be honest, I didn't know much about USB at all, I wrote this driver
> without complete understanding USB magic and use USB only as transport to
> access MCS7840 registers...
>
>  Maybe, local USB Guru Hans Petter Selasky could give some advice how to
>  debug this situation. I've added freebsd-usb@ to CC:

Is that worth a try?
http://www.asix.com.tw/FrootAttach/driver/MCS7840_7820_FreeBSD_driver_v1.1.zip

I'd just compile it and see what it does; my skills don't suffice for
merging it or helping with umcs :-(

At least, it seems to be possible to enable RS485-mode :-) :-)





Re: umcs (4-Port-USB-serial) triggering way too much ehci IRQs

2013-09-17 Thread Harald Schmalzbauer
 Regarding Ian Lepore's message of 17.09.2013 18:16 (localtime):
> On Tue, 2013-09-17 at 17:38 +0200, Harald Schmalzbauer wrote:
>> ...
>>> Try this patch and see what happens:
>>>
>>> ==
>>> --- umcs.c      (revision 255492)
>>> +++ umcs.c      (local)
>>> @@ -230,6 +230,7 @@
>>>                  .bufsize = 0,   /* use wMaxPacketSize */
>>>                  .callback = &umcs7840_intr_callback,
>>>                  .if_index = 0,
>>> +                .interval = 20, /* ms */
>>>          },
>>>  };
>>>
>>>
>>> BTW: I see that the umcs driver shouldn't do synchronous control
>>> transfers from the USB interrupt transfer callback. This should be
>>> postponed into some worker thread, for example the USB explore thread.
>>> See USB audio driver for an example.
>>>
>>> --HPS
>> I tried your patch and it works as expected: IRQs decreased to ~64/s
>> when idle/disconnected.
>>
>> One interesting thing I never measured before:
>> Console connection with 115k2 via umcs and 'while ( 2>1 ) echo "---..."
>> end' results in 8000 irqs/s :-( But that's also true for the prolific
>> (uplcom). The latter just goes down to 0.0 irqs/s when idle.
>>
>> Doing the same with uart0 results in 1444irqs/s.
>> Is it by design/unavoidable that transfering the same via USB multiplies
>> by factor 5-6?
>>
>> Thanks,
>>
>> -Harry
>>
> I don't know about that chipset, but with the FTDI chips it does xfers
> in 64 byte chunks and high speed bulk data results in an astronomical
> number of interrupts (and if you go fast enough, lost data).  I have

According to the ASIX product brief, the MCS7840 has a 512 byte buffer.
That's pretty big for a UART, I think, and it should make 115k2 baud
connections work with fewer than 30 transfers/s, or am I missing something?
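
The "fewer than 30 transfers/s" estimate can be checked with a short
calculation. A minimal sketch in Python, assuming 8N1 framing (10 bits on
the wire per byte) and that every transfer drains a completely full
buffer; the function name is made up for illustration. The 64-byte case
is the FTDI chunk size mentioned in the quoted text.

```python
def transfers_per_second(baud: int, buffer_bytes: int, bits_per_char: int = 10) -> float:
    """Minimum buffer-sized transfers/s needed to keep up with a serial line.

    bits_per_char=10 assumes 8N1 framing (start bit + 8 data bits + stop bit).
    """
    bytes_per_second = baud / bits_per_char
    return bytes_per_second / buffer_bytes


if __name__ == "__main__":
    for name, size in (("MCS7840, 512-byte buffer", 512), ("64-byte chunks", 64)):
        rate = transfers_per_second(115200, size)
        print(f"{name}: {rate} transfers/s at 115200 baud")
```

Whether the driver and chip actually wait for a full buffer before
transferring is a separate question that this arithmetic does not answer.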


> some patches that assemble lots of the little chip-size buffers into
> bigger xfers that the ohci/ehci controller can handle without
> interrupting the processor; that helps the problem a bunch.

I think I also have at least one FTDI adapter around, so I'd happily
test them if I can make them compile on RELENG_9_2.

Thanks,

-Harry







HFC-4S and pcm_slave

2013-09-18 Thread Harald Schmalzbauer
 Hello,

thanks to Hans Petter Selasky, isdn4bsd (i4b) was easy to install and
seems to do the same great job these days with 9.2 as a decade ago with
3.x :-)
But I had a hard time getting isdntest-connection working with my atcom
AX-4S (HFC-4S).
By accident I read on HPS's site that HFC-4S/8S are by default
initialized in pcm_slave mode.
I had no idea about pcm_mode (a very quick search found nothing useful),
but I understand the need for and the possibilities of clock sources.

But why is the default to rely on external clock source with HFC-4S?

And how – if it's describable in one sentence – does the HFC-4S read
external clock?

Is there something like 'hint.hfc.X.pcm_master' (planned)?

Why aren't isdn4bsd and libcapi in the official ports tree?

Thanks a lot,

-Harry






9.2 panic with wcb4xxp (dahdi-kmod26-2.6.1.r10738)

2013-09-19 Thread Harald Schmalzbauer
 Hello,

unloading the kernel module of dahdi-kmod26-2.6.1.r10738 leads to this
panic:

panic: blockable sleep lock (sleep mutex) 16 @
/usr/local/share/deploy-tools/RELENG_9_2/src/sys/vm/uma_core.c:2553
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper(c0a3d5bf,4c45522f,5f474e45,2f325f39,2f637273,...)
at db_trace_self_wrapper+0x26/frame 0xf00709cc
kdb_backtrace(c0a84539,1,c0a4124d,f0070a60,1,...) at
kdb_backtrace+0x2a/frame 0xf0070a28
panic(c0a4124d,c0a73003,c09e67b9,c0a71b3c,9f9,...) at panic+0x16f/frame
0xf0070a54
witness_checkorder(c15a5788,9,c0a71b3c,9f9,0,...) at
witness_checkorder+0xaa/frame 0xf0070aac
_mtx_lock_flags(c15a5788,0,c0a71b3c,9f9,0,...) at
_mtx_lock_flags+0xb1/frame 0xf0070ad8
uma_zfree_arg(c15a4a80,c7549320,c7549cb8,c7549320,c7549320,...) at
uma_zfree_arg+0x59/frame 0xf0070b1c
free(c7549320,c85d1680,c85cedce,2f7,c743b180,...) at free+0xd8/frame
0xf0070b40
dahdi_unregister_echocan_factory(c85ce60c,c0a36b31,108,0,c743b180,...)
at dahdi_unregister_echocan_factory+0xbd/frame 0xf0070b60
dahdi_cleanup(0,f0070ba4,c06ded93,c743b180,1,...) at
dahdi_cleanup+0x13/frame 0xf0070b7c
_linux_module_modevent(c743b180,1,c85d10a0,108,0,...) at
_linux_module_modevent+0x50/frame 0xf0070b88
module_unload(c743b180,c0a34a5c,284,292,2a7,...) at
module_unload+0x43/frame 0xf0070ba4
linker_file_unload(c78e3000,0,c0a34a5c,2a7,0,...) at
linker_file_unload+0x15e/frame 0xf0070bd4
linker_file_unload(c78e3200,0,c0a34a5c,449,c85a9000,...) at
linker_file_unload+0x444/frame 0xf0070c04
kern_kldunload(c865b2f0,3,0,f0070cfc,c09bd39b,...) at
kern_kldunload+0xd1/frame 0xf0070c30
sys_kldunloadf(c865b2f0,f0070ccc,c0a85bb4,c0a428ae,c0a87198,...) at
sys_kldunloadf+0x2b/frame 0xf0070c44
syscall(f0070d08) at syscall+0x2bb/frame 0xf0070cfc
Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xf0070cfc
--- syscall (444, FreeBSD ELF32, sys_kldunloadf), eip = 0x280c088b, esp
= 0xbfbfd27c, ebp = 0xbfbfdac8 ---
KDB: enter: panic

––
Loading the wcb4xxp kernel module leads to some hundred of these:

uma_zalloc_arg: zone "256" with the following non-sleepable locks held:
exclusive sleep mutex registration_mutex (registration_mutex) r = 0
(0xc85e08ac) locked @
/usr/local/ports-wrktree/usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/dahdi/../../drivers/dahdi/dahdi-base.c:7296
KDB: stack backtrace:
db_trace_self_wrapper(c0a3d5bf,64736265,362e322d,722d312e,33373031,...)
at db_trace_self_wrapper+0x26/frame 0xf002e674
kdb_backtrace(c0730080,1,,c0c72e74,f002e720,...) at
kdb_backtrace+0x2a/frame 0xf002e6d0
_witness_debugger(c0a40d44,f002e734,4,1,0,...) at
_witness_debugger+0x25/frame 0xf002e6e8
witness_warn(5,0,c0a7210b,c0a667f3,f002e768,...) at
witness_warn+0x20d/frame 0xf002e720
uma_zalloc_arg(c159d840,0,502,2,c861df34,...) at
uma_zalloc_arg+0x34/frame 0xf002e780
malloc(ec,c0aafc2c,502,c84ace00,f002e7c4,...) at malloc+0x115/frame
0xf002e7a4
devfs_alloc(0,c852f5e0,f002e7e0,246,c85c60e0,...) at
devfs_alloc+0x31/frame 0xf002e7c4
make_dev_credv(b,0,0,0,1a4,...) at make_dev_credv+0x38/frame 0xf002e7f8
make_dev(c85c60e0,b,0,0,1a4,...) at make_dev+0x4a/frame 0xf002e824
_dahdi_assign_span(1,0,c85c3dce,1c80,0,...) at
_dahdi_assign_span+0x39e/frame 0xf002e860
dahdi_register_device(c70ccac0,c861800c,4,0,3,...) at
dahdi_register_device+0xd0/frame 0xf002e884
b4xxp_register(c8618000,c85a4970,c861801c,2,c0ab808c,...) at
b4xxp_register+0x367/frame 0xf002e8b4
b4xxp_device_attach(c7164900,c74a685c,c0ab808c,c0a3c75f,8003,...) at
b4xxp_device_attach+0x141/frame 0xf002e8e0
device_attach(c7164900,4,c0a3c5e7,aa5) at device_attach+0x3c3/frame
0xf002e920
device_probe_and_attach(c7164900,c715fa00,f002e954,c6e51a00,1,...) at
device_probe_and_attach+0x4e/frame 0xf002e93c
pci_driver_added(c7164980,c85a49c0,c0ab7e5c,c85a49c0,c745e180,...) at
pci_driver_added+0xe6/frame 0xf002e964
devclass_driver_added(c85a49c0,c0ac7088,101,0,c85a4a0c,...) at
devclass_driver_added+0x74/frame 0xf002e988
devclass_add_driver(c6e96e00,c85a49c0,7fff,c85a4f40,c85a49f4,...) at
devclass_add_driver+0x156/frame 0xf002e9a8
driver_module_handler(c74da680,0,c85a49f4,75,c06b82d1,...) at
driver_module_handler+0x85/frame 0xf002e9d4
module_register_init(c85a4a0c,0,c0a34a5c,e9,0,...) at
module_register_init+0xa7/frame 0xf002e9fc
linker_load_module(0,f002ec0c,c0a34a5c,40e,0,...) at
linker_load_module+0xa36/frame 0xf002ebec
kern_kldload(c852f5e0,c748f400,f002ec34,0,c8525000,...) at
kern_kldload+0xca/frame 0xf002ec1c
sys_kldload(c852f5e0,f002eccc,c0a85bb4,c0a4214f,c0a87198,...) at
sys_kldload+0x74/frame 0xf002ec44
syscall(f002ed08) at syscall+0x2bb/frame 0xf002ecfc
Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xf002ecfc
--- syscall (304, FreeBSD ELF32, sys_kldload), eip = 0x280c24ab, esp =
0xbfbfd92c, ebp = 0xbfbfde18 ---
uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
exclusive sleep mutex registration_mutex (registration_mutex) r = 0
(0xc85e08ac) locked @
/usr/local/ports-wrktree/u

Re: 9.2 panic with wcb4xxp (dahdi-kmod26-2.6.1.r10738)

2013-09-23 Thread Harald Schmalzbauer
Regarding Amitabh Kant's message of 21.09.2013 03:24 (localtime):
> On Thu, Sep 19, 2013 at 7:35 PM, Harald Schmalzbauer
> <h.schmalzba...@omnilan.de> wrote:
>
> Hello,
>
> unloading the kernel module of dahdi-kmod26-2.6.1.r10738 leads to this
> panic:
>
>
> 
>
> wcb4xxp0: <6>Did not do the highestorder stuff
> <6>dahdi: Detected time shift.
> <5>dahdi_echocan_mg2: Registered echo canceler 'MG2'
>
> Starting asterisk afterwards also leads to panic.
> I guess dahdi development stalled, but I wanted to try it because I'd
> prefer freeswitch and need BRI support...
> Is somebody familiar with dahdi and interested in making it work with
> FreeBSD 9.2?
>
> Thanks,
>
> -Harry (not subscribed to isdn@)
>
>
>
> Have you been able to solve the problem? I am running Freeswitch (from
> git, not port) and dahdi/dahdi-kmod26 (from port) with PRI line
> (Digium 8 span and single span) without any problems on 9.1. Will test
> it on 9.2 and get back to you if I see a panic .
Hello Amitabh,

couldn't solve my problem.
First, dahdi_scan doesn't detect ports jumpered to NT mode. I need 2 ports in
NT mode, so trying anything else with dahdi before my settings are correctly
recognized is probably not worth the time.
I also have to investigate whether it is still true that libpri doesn't support
ptmp in NT mode!?!
In general, the FreeBSD dahdi port doesn't seem to be in good shape. I couldn't
find any docs about the sysctls (dahdi.wcb4xxp.teignorered, '-d' shows nothing
:-( ), there is no man page, and it is hard to find out anything about dahdi on
FreeBSD; not even the supported hardware seems to be documented anywhere...

Any hints are highly appreciated, although I think the better way would be to
teach FreeTDM to speak CAPI. HPS does a great job keeping all kinds of ISDN
hardware supported by i4b (ISDN4BSD)!
Or to make chan_capi work with asterisk11, a lesser evil than fighting
dahdi...

Thanks,

-Harry






newfs-msdos and default fat32 parameters

2008-08-02 Thread Harald Schmalzbauer

Hello,

lately I wanted to create some DOS-bootable SD cards (simply for BIOS
updates, disk diagnostic tools, etc.).
After newfs_msdos -F32 -B VBR.bin (2.5G partition) the system just
didn't continue booting after the MBR was loaded (VBR.bin is a 3-sector
dump of the DOS boot record which sys creates).
When I wrote the dump directly back to sectors 63-65 and 69-71, the
system booted!
So I put on my hex glasses and found some unfortunate default parameters
in newfs_msdos.


- MediaType is f0 but probably should read f8 (fixed disk)
- The backup boot record should be located at offset 6, not 2.
- 63 hidden sectors should be defined

With 'newfs_msdos -F32 -m 0xf8 -B VBR.bin -k 0x6 -o 63 -i 0x1 /dev/da6s1'
everything was fine.
I'm no expert, I just found some FAT info. Maybe the current defaults
are wisely chosen. Maybe not?
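
A minimal sketch, assuming the standard FAT32 boot-sector layout (media
descriptor at byte 0x15, hidden sectors at 0x1C, FSInfo sector at 0x30,
backup boot sector at 0x32), of how the fields listed above can be read back
from a boot record to check what newfs_msdos actually wrote. The function
names and the synthetic demo sector are illustrative only, not part of any
FreeBSD tool.

```python
import struct

def fat32_bpb_fields(sector):
    """Return the FAT32 boot-sector fields discussed in this mail."""
    if len(sector) < 512:
        raise ValueError("expected at least one 512-byte sector")
    media = sector[0x15]                                      # media descriptor, 1 byte
    (hidden,) = struct.unpack_from("<I", sector, 0x1C)        # hidden sectors, 4 bytes
    fsinfo, backup = struct.unpack_from("<HH", sector, 0x30)  # FSInfo and backup boot sector
    return {
        "media": media,
        "hidden_sectors": hidden,
        "fsinfo_sector": fsinfo,
        "backup_boot_sector": backup,
    }

def make_demo_sector(media, hidden, fsinfo, backup):
    """Build a synthetic sector for illustration (not a valid file system)."""
    sector = bytearray(512)
    sector[0x15] = media
    struct.pack_into("<I", sector, 0x1C, hidden)
    struct.pack_into("<HH", sector, 0x30, fsinfo, backup)
    return bytes(sector)

# Values matching '-m 0xf8 -o 63 -i 0x1 -k 0x6' from the command above.
demo = make_demo_sector(0xF8, 63, 1, 6)
print(fat32_bpb_fields(demo))
```

On a real card the 512 bytes would come from the start of the partition
(/dev/da6s1 in the command above) instead of the synthetic sector.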


Best regards,

-Harry





'diskinfo' problem with eSATA device (initio 1611)

2008-08-02 Thread Harald Schmalzbauer

Hello,

for quick harddrive tests (SMART, noise, backup etc..) I bought a very 
nice "docking" station connected to my ICH9 SATA controller 
(http://www.sharkoon.com/html/produkte/externe_gehaeuse/sata_quickport_pro/index_en.html)
I can read/write to inserted disks, also smartctl works, but my 
favourite test doesn't run:

diskinfo -t /dev/ad10 returns:
ioctl(DIOCGMEDIASIZE) failed, probably not a disk.: Inappropriate ioctl 
for device


The eSATA bridge is an Initio 1611 chip.
I have another external SATA enclosure and it shows the same problem.
Any ideas what the reason is and how to "fix" it?

Best regards,

-Harry





Feature request dev.ata.X.detached=1

2008-08-02 Thread Harald Schmalzbauer

Hello,

Now that eSATA is widely used, I don't like having to detach a channel
first before I can hotplug a new disk.
Would it be possible to implement a sysctl which tells the controller at
boot time to keep some channels detached?


Best regards,

-Harry





test message, please ignore

2008-08-08 Thread Harald Weis

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Audio CD problem on laptop VGN-SZ61MN

2008-08-08 Thread Harald Weis
Is there anyone out there who has installed FreeBSD on the above Sony
laptop ?

Both ''cat filename > /dev/dsp0.0'' and ''vlc cdda:///dev/[EMAIL PROTECTED]'' are
OK.

If I run ''cdcontrol -f /dev/acd0 play'', there is no sound.
But the output of ''cdcontrol -f /dev/acd0 status audio'' is alright.
(same behaviour for cd0 instead of acd0) 

And the output of ''mplayer cdda://1//dev/acd0'' is:
Playing cdda://1//dev/acd0.
++ WARN: open: Inappropriate ioctl for device
**ERROR: fread (): Invalid argument

On the other hand, DVDs play fine, for example with
either ''vlc dvd:///dev/[EMAIL PROTECTED]'' or ''mplayer dvd://2''

I wonder whether the problem is related to the FAILURE line in dmesg:
acd0: DVDR  at ata0-master UDMA33
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 
cd0 at ata0 bus 0 target 0 lun 0
cd0:  Removable CD-ROM SCSI-0 device 
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present

Adding ''device atapicam'' in the kernel doesn't make a difference.

I found some messages concerning the FAILURE line in the mailing list
archives, but no solution.

Thank you in advance for any help.

Harald Weis
--
FreeBSD 7.0-RELEASE #0: Thu Aug  7 13:00:47 CEST 2008


Re: Audio CD problem on laptop VGN-SZ61MN

2008-08-09 Thread Harald Weis
On Fri, Aug 08, 2008 at 09:17:40PM -0300, Carlos A. M. dos Santos wrote:
> On Fri, Aug 8, 2008 at 5:13 AM, Harald Weis <[EMAIL PROTECTED]> wrote:

> > acd0: DVDR  at ata0-master UDMA33
> > acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
> > acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
> > cd0 at ata0 bus 0 target 0 lun 0
> > cd0:  Removable CD-ROM SCSI-0 device
> > cd0: 33.000MB/s transfers
> > cd0: Attempt to query device size failed: NOT READY, Medium not present
 
> Same result on my notebook (HP nx6320) and this is not a surprise. The
> "mixer" command does not show a "cd" input, meaning that there is no
> input port for the analog audio output of the CD drive (if any). The
> "status audio" argument instructs cdcontrol to query the drive, not
> the audio device.
> 
> NetBSD's "cdplay" supports digital transfer mode since version 4.0.
> Perhaps this feature could be implemented on cdcontrol.

Yes, that explains what happens on my notebook and corresponds to
what the Handbook says:
If all goes well, you should now have a functioning sound card. If your
CD-ROM or DVD-ROM drive's audio-out pins are properly connected to your
sound card, you can put a CD in the drive and play it with
cdcontrol(1):
% cdcontrol -f /dev/acd0 play 1

Indeed, both the 'mixer' command and the (much more comfortable) 'rexima'
port show only 5 mixer devices:
Mixer vol  is currently set to  12:12
Mixer pcm  is currently set to  30:30
Mixer speaker  is currently set to   8:8
Mixer mic  is currently set to   0:0
Mixer rec  is currently set to   0:0

There is no 'cd' mixer device, and the only mixer device file on
this notebook is '/dev/mixer0'.
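
A minimal sketch of the check described above: it parses 'mixer' output in
the format shown and reports whether a 'cd' mixer device is present. The
function name and the embedded sample text are illustrative; the sample is
simply the five lines quoted above.

```python
import re

# Matches lines such as "Mixer vol  is currently set to  12:12".
MIXER_LINE = re.compile(r"^Mixer\s+(\S+)\s+is currently set to\s+(\d+):(\d+)$")

def parse_mixer(text):
    """Map each mixer device name to its (left, right) level."""
    levels = {}
    for line in text.splitlines():
        match = MIXER_LINE.match(line.strip())
        if match:
            levels[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return levels

SAMPLE = """\
Mixer vol  is currently set to  12:12
Mixer pcm  is currently set to  30:30
Mixer speaker  is currently set to   8:8
Mixer mic  is currently set to   0:0
Mixer rec  is currently set to   0:0
"""

devices = parse_mixer(SAMPLE)
print(sorted(devices))
print("cd input present:", "cd" in devices)
```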

But this does not explain why mplayer, in contrast with vlc, refuses
to play the CD, does it ?

What is the meaning and reason of the above FAILURE lines ? 
2 identical lines with the GENERIC kernel,
1 single line, if I add 'device atapicam'.
That is apparently the only difference between the two cases.
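
As a partial answer to the "meaning" half of the question, a small sketch
that translates the asc/ascq pair from such a dmesg line into the
standard SCSI additional-sense text. The table holds only three well-known
codes and is nowhere near complete; the function name is made up for this
example. Code 0x24/0x00 reads "invalid field in CDB", which suggests the
drive rejected a field of the INQUIRY command rather than the command as a
whole; it does not by itself explain why.

```python
import re

# Tiny excerpt of the SCSI additional sense code table (ASC, ASCQ) -> text.
ASC_TEXT = {
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
    (0x3A, 0x00): "MEDIUM NOT PRESENT",
}

FAILURE = re.compile(r"asc=0x([0-9a-fA-F]+)\s+ascq=0x([0-9a-fA-F]+)")

def decode_failure(line):
    """Return the sense text for a dmesg FAILURE line, or None if no code is found."""
    match = FAILURE.search(line)
    if match is None:
        return None
    key = (int(match.group(1), 16), int(match.group(2), 16))
    return ASC_TEXT.get(key, "unknown (not in this small table)")

print(decode_failure("acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00"))
```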

Thank you again and in advance for any further help.

Harald
-- 
FreeBSD 7.0-RELEASE #0: Thu Aug  7 13:00:47 CEST 2008


Re: Audio CD problem on laptop VGN-SZ61MN

2008-08-09 Thread Harald Weis
On Sat, Aug 09, 2008 at 11:56:23AM -0300, Carlos A. M. dos Santos wrote:

> 1. Try to insert the CD and wait until it stops spinning before
> starting mplayer. Some drives are a bit lazy on media recognition.
> 
> 2. Run "truss mplayer ..." and look for error messages.
> 
> 3. Attempt to use a different player. Xine and/or one of its alternate
> front-ends like GXine or Kaffeine are good choices.

I'll be away from keyboard for two weeks or so.
I shall reply then as soon as possible.

Thanks again, Carlos.

Bye
Harald



Re: udf

2008-08-20 Thread Harald Schmalzbauer

Andriy Gapon wrote on 19.05.2008 16:15 (localtime):

on 16/05/2008 19:48 Scott Long said the following:

There is no write support in UDF in FreeBSD.  When I wrote the fs code,


Should mount_udf work for UDF 1.02 DVDs?
Today I tried to check an unlabeled DVD, which was suspected to be a
Vista copy, but I couldn't mount it:

mount_udf: /dev/cd0: Invalid argument
/dev/acd has been completely broken for me since SATA drives... And burncd
stopped working a long time ago in FreeBSD 7 (ATAPI, ICHx).

Any hope of seeing these optical media issues fixed for 7.1?

Thanks,

-Harry


packet writing was the only way to do discrete writes, and it's very
hard to make that work with a traditional VM system.  Now with DVD+R,
it's probably worth someone's time to look at it (though the append-only
nature of +R means that there are still some nasty VM complications to
deal with).  Until that happens, mkisofs can be used to create a static
UDF filesystem.


BTW, Remko has kindly notified me that Reinoud Zandijk has completed his
long work on UDF write support in NetBSD. I think that porting his work
is our best chance to get write support in FreeBSD too.






Re: Audio CD problem on laptop VGN-SZ61MN

2008-09-11 Thread Harald Weis
On Sun, Aug 10, 2008 at 08:54:17AM +0200, Harald Weis wrote:
> On Sat, Aug 09, 2008 at 11:56:23AM -0300, Carlos A. M. dos Santos wrote:
> 
> > 1. Try to insert the CD and wait until it stops spinning before
> > starting mplayer. Some drives are a bit lazy on media recognition.

Doesn't make a difference.

> > 2. Run "truss mplayer ..." and look for error messages.

The output file does not mention the "Invalid argument" message
or, as far as I can understand, any other fatal stuff.

> > 3. Attempt to use a different player. Xine and/or one of its alternate
> > front-ends like GXine or Kaffeine are good choices.
> 
> I'll be away from keyboard for two weeks or so.
> I shall reply then as soon as possible.

Xine:
``xine cdda://dev/acd0/n'' works fine despite six lines
   ``Cannot find address for OpenGl extension function ...
   which should be available according to extension specs.''.
But the GUI does not work properly. For example, the CD button generates
``CDIOCREADAUDIO: Invalid argument'' messages.

No wonder that gxine and kaffeine which use the xine engine don't work either.

Finally, the only (audioCD) command which does work on this laptop is
``vlc cdda:///dev/[EMAIL PROTECTED]''.
Disappointing, but better than nothing.

Thank you again for your comments.
Harald





Samsung SCX-4200 printer

2009-01-04 Thread Harald Weis
Is there a way to install the SCX-4200 printer on a FreeBSD box ?
The printer is delivered with the install software required for Linux.
And CUPS does not seem to "know" it.

Thank you in advance for any help.

Harald Weis
-- 
FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008


Re: Samsung SCX-4200 printer

2009-01-07 Thread Harald Weis
On Tue, Jan 06, 2009 at 05:06:32PM -0500, Peter C. Lai wrote:
> What language does this printer use? I am a big fan of the minimalistic
> BSD LPD/R, so I usually just whip up an output filter that cats everything
> (since just about every app natively prints as postscript these days 
> [besides gimp-app I guess]) to ghostscript. Works for basically any printer
> with PCL or PS support (actually, with some of the elcheapo PS 
> imitations out there, usually PCL5/XL works better). To me, CUPS/Foomatic 
> comes with some fancy PPDs and filters, but I think a lot of that is 
> bloated since it does similar things under the hood but wrapped in more
> layers of magic...
> 
> On 2009-01-06 04:46:40PM -0500, SDH Support wrote:
> > 
> > > Is there a way to install the SCX-4200 printer on a FreeBSD box ?
> > 
> > I would recommend googling this printer and determining its support on linux
> > first and then perhaps following the large amount of documentation with
> > installing CUPS for freebsd. I've gotten many different printers working on
> > my own.
> > 
> > 
> > 
> > 
> > ---
> > Kevin K.
> > Systems Administrator
> > www.webcanadahosting.com 
> > 
> 
> -- 
> ===
> Peter C. Lai | Bard College at Simon's Rock
> Systems Administrator| 84 Alford Rd.
> Information Technology Svcs. | Gt. Barrington, MA 01230 USA
> peter AT simons-rock.edu | (413) 528-7428
> ===

Many thanks for all your comments and hints.

Briefly for your information: The printer was bought by a person
who was a Linux user from the very beginning. He had no time to install
it though. In the meantime he has become a FreeBSD user and is working
all-day on a FreeBSD laptop (as a general practitioner, perhaps the
only one on earth). He's still short of time. A year ago or so I
tried to install the printer on his desktop machine. But I could not
find any SCX-4200 reference on openprinting.org. This time I reinstalled
all CUPS components on a 7.0-RELEASE, and finally looked for your help.

Well, I've started with the first advice, found UnifiedLinuxDriver.tar.gz,
installed manually scx4200.ppd and rastertosamsungspl. I tried all
possible file locations for rastertosamsungspl. But CUPS keeps saying
that it cannot find rastertosamsungspl. It seems to me that CUPS
would be happy with these two files.

'spl' seems to stand for Samsung Printer Language.

Harald
-- 
FreeBSD 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008


Re: Samsung SCX-4200 printer

2009-01-08 Thread Harald Weis
On Mon, Jan 05, 2009 at 07:01:02PM -0500, Alex Goncharov wrote:
> On Mon, Jan 05, 2009 at 11:26:23PM +0100, Roland Smith wrote:
> > On Mon, Jan 05, 2009 at 05:44:03PM +0100, Erwan David wrote:
> > > On Mon, Jan 05, 2009 at 05:36:35PM CET, Torfinn Ingolfsen 
> > >  said:
> > > > On Sun, 04 Jan 2009 23:14:22 +0100
> > > > Harald Weis  wrote:
> > > > 
> > > > > Is there a way to install the SCX-4200 printer on a FreeBSD box ?
> > > > > The printer is delivered with the install software required for Linux.
> > > > > And CUPS does not seem to "know" it.
> > > It is not always sufficient. My Brother DCP-540 CN is said to work
> > > perfectly, but only with brother binary linux drivers, under linux. I
> > > did not find any way to make it work under freeBSD.
> > 
> > This should be a FAQ: do yourself a favor and get a printer that
> > supports postscript. It will work with little effort with most
> > UNIX-based program (because they usually support postscript output) and
> > with most spoolers.
> 
> Try to install the cupsys, cupsys-bsd, cupsomatic-ppd and foomatic-db

None of these exist in the FreeBSD port index (/usr/ports/INDEX-7)

> ports and take a look at this link:
> 
>http://forums.linux-foundation.org/read.php?31,302,320,quote=1

Refers only to Linux :-(

Harald
-- 
FreeBSD 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008


Re: Samsung SCX-4200 printer

2009-01-08 Thread Harald Weis
On Thu, Jan 08, 2009 at 01:01:57PM -0500, Alex Goncharov wrote:
> ,--- You/Harald (Thu, 8 Jan 2009 12:11:27 +0100) *
> | > Try to install the cupsys, cupsys-bsd, cupsomatic-ppd and foomatic-db
> | > ports and take a look at this link:
> | 
> | None of these exist in the FreeBSD port index (/usr/ports/INDEX-7)
> 
> So, you've now looked :-)

Oh, yes, sorry, foomatic-db does indeed exist.

> 
> I've had good experience with foomatic use for most various printers
> in the past and wanted to give you a pointer to that package, no
> promises, since you didn't seem to be familiar with it.
> 
> I didn't have the printing packages installed in the machine I sent
> the original mail from, so could not check the correct package names
> -- I can check what I have now:
> 
> 
> $ pkg_info -L foomatic-db-20070124_1| grep -ic samsung
> 62
> 
> $ pkg_info -L foomatic-db-20070124_1| grep -ic scx
> 0
> 
> 
> So, that printer is not yet in BSD foomatic-db.
> 
> | 
> | Refers only to Linux :-(
> 
> I know -- but I often used Linux-based advice as a clue for solving
> printing problems on FreeBSD.
> 
> Sorry this didn't help you (and I am sure you saw this
> http://www.openprinting.org/show_printer.cgi?recnum=Samsung-SCX-4200).

Yes, I did.

Harald
-- 
FreeBSD 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008


Re: Samsung SCX-4200 printer

2009-01-08 Thread Harald Weis
On Thu, Jan 08, 2009 at 03:06:42PM +0200, Alexander Shikoff wrote:
> On Sun, Jan 04, 2009 at 11:14:22PM +0100, Harald Weis wrote:
> > Is there a way to install the SCX-4200 printer on a FreeBSD box ?
> > The printer is delivered with the install software required for Linux.
> > And CUPS does not seem to "know" it.
> > 
> > Thank you in advance for any help.
> 
> I've successfully setup printing on SCX-4521F connected to MS Windows box.
> I use Linux binary driver from Samsung website. On my FreeBSD box 
> (7.1-PRERELEASE)
> I use cups and samba-client.
> 
> Unfortunately I have detailed description of all steps in Russian only.
> 
> But I'll try to give you a short summary. I hope it will let you
> find the right way:
> 

snip

> 
> That's all. If you have any issues/question feel free to ask. Have a nice day!
> 

Thank you very much. I'll report as soon as possible.

Harald
-- 
FreeBSD 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008


Medical database Vidal

2009-01-09 Thread Harald Weis
When mounting a (cd9660) CD-ROM of the medical database Vidal in order
to try an installation with wine, I've discovered that I cannot see
two files (visible under Windows), setup.exe and some .ini file the
full name of which I have forgotten now, while I can perfectly see
the merlin-vcd-data.zip file in the dat directory.

How on Unix earth is this possible ??

Harald  
-- 
FreeBSD 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008


lenovo t400 does not start 7.1

2009-01-13 Thread Harald Servat
Hello,

  I downloaded FreeBSD 7.1 (DVD iso image) for amd64 architecture (with
correct SHA256 checksum), but I'm unable to start the system (Lenovo T400).

  The boot process starts fine, the BTX messages appear, the 7-option menu
also appears, but when I hit enter (or when I choose start without ACPI or
start in safe mode) the \ symbol starts spinning for a while and then
freezes.

  I also tried with the 7.1-amd64-CD1, 7.1-i386-CD1 and a snapshot of
8.0-CURRENT dated from December. And the result is the same for all of them.

  Does anyone have any idea on what can be happening? Or at least, how can I
gather more information about this issue?

Thank you.
-- 
_
Empty your memory,
with a free()...
like a pointer!

If you cast a pointer to an integer,
it becomes an integer,
if you cast a pointer to a struct,
it becomes a struct.

The pointer can crash...,
and can overflow.

Be a pointer my friend...


Re: lenovo t400 does not start 7.1

2009-01-13 Thread Harald Servat
On Tue, Jan 13, 2009 at 8:47 PM, Oleksandr Tymoshenko wrote:

> Harald Servat wrote:
>
>> Hello,
>>
>>  I downloaded FreeBSD 7.1 (DVD iso image) for amd64 architecture (with
>> correct SHA256 checksum), but I'm unable to start the system (Lenovo
>> T400).
>>
>>  The boot process starts fine, the BTX messages appear, the 7-option menu
>> also appears, but when I hit enter (or when I choose start without ACPI or
>> start in safe mode) the \ symbol starts spinning for a while and then
>> freezes.
>>
>>  I also tried with the 7.1-amd64-CD1, 7.1-i386-CD1 and a snapshot of
>> 8.0-CURRENT dated from December. And the result is the same for all of
>> them.
>>
>>  Does anyone have any idea on what can be happening? Or at least, how can
>> I
>> gather more information about this issue?
>>
>Not sure about install CDs but I tried all these systems on my t400.
> I installed 7.0 then upgraded  it to 7.1 and when it turned out that
> atheros is not 100% supported by it I upgraded to CURRENT. (I used cvsup
> for upgrades). What is your HW configuration?
>
>
Ok, thanks... I'll give a try to 7.0, and let's see if it works.




Re: lenovo t400 does not start 7.1

2009-01-13 Thread Harald Servat
On Tue, Jan 13, 2009 at 8:59 PM, Ben Kaduk  wrote:

> On Tue, Jan 13, 2009 at 1:57 PM, Harald Servat  wrote:
> > Hello,
> >
> >  I downloaded FreeBSD 7.1 (DVD iso image) for amd64 architecture (with
> > correct SHA256 checksum), but I'm unable to start the system (Lenovo
> T400).
> >
> >  The boot process starts fine, the BTX messages appear, the 7-option menu
> > also appears, but when I hit enter (or when I choose start without ACPI
> or
> > start in safe mode) the \ symbol starts spinning for a while and then
> > freezes.
>
> Have you tried an ISO for FreeBSD-CURRENT?
>
> I also have a T400, and definitely had FreeBSD running on it for a while,
> but I don't remember if it was 7 or current.
>
> -Ben Kaduk
>

I tried 8.0-CURRENT snapshot from December with no luck. I'll give 7.0 a try
(Oleksandr also suggested the same).

Thanks!




Re: lenovo t400 does not start 7.1

2009-01-13 Thread Harald Servat
On Tue, Jan 13, 2009 at 9:40 PM, Harald Servat  wrote:

>
>
> On Tue, Jan 13, 2009 at 8:47 PM, Oleksandr Tymoshenko 
> wrote:
>
>> Harald Servat wrote:
>>
>>> Hello,
>>>
>>>  I downloaded FreeBSD 7.1 (DVD iso image) for amd64 architecture (with
>>> correct SHA256 checksum), but I'm unable to start the system (Lenovo
>>> T400).
>>>
>>>  The boot process starts fine, the BTX messages appear, the 7-option menu
>>> also appears, but when I hit enter (or when I choose start without ACPI
>>> or
>>> start in safe mode) the \ symbol starts spinning for a while and then
>>> freezes.
>>>
>>>  I also tried with the 7.1-amd64-CD1, 7.1-i386-CD1 and a snapshot of
>>> 8.0-CURRENT dated from December. And the result is the same for all of
>>> them.
>>>
>>>  Does anyone have any idea on what can be happening? Or at least, how can
>>> I
>>> gather more information about this issue?
>>>
>>Not sure about install CDs but I tried all these systems on my
>> t400.
>> I installed 7.0 then upgraded  it to 7.1 and when it turned out that
>> atheros is not 100% supported by it I upgraded to CURRENT. (I used cvsup
>> for upgrades). What is your HW configuration?
>>
>>
> Ok, thanks... I'll give a try to 7.0, and let's see if it works.
>
>
Uhm... FreeBSD 7.0 i386 CD1 behaves in the same manner. It freezes while
the backslash symbol is spinning.

The BIOS reports 4GB of RAM and a SATA disk configured in AHCI mode. Should I
look for something unusual? And how?

Thank you.


Re: lenovo t400 does not start 7.1

2009-01-14 Thread Harald Servat
On Wed, Jan 14, 2009 at 3:18 AM, Ganbold  wrote:

> Ganbold wrote:
>
>> Harald Servat wrote:
>>
>>> Hello,
>>>
>>>  I downloaded FreeBSD 7.1 (DVD iso image) for amd64 architecture (with
>>> correct SHA256 checksum), but I'm unable to start the system (Lenovo
>>> T400).
>>>
>>>  The boot process starts fine, the BTX messages appear, the 7-option menu
>>> also appears, but when I hit enter (or when I choose start without ACPI
>>> or
>>> start in safe mode) the \ symbol starts spinning for a while and then
>>> freezes.
>>>
>>>  I also tried with the 7.1-amd64-CD1, 7.1-i386-CD1 and a snapshot of
>>> 8.0-CURRENT dated from December. And the result is the same for all of
>>> them.
>>>
>>>  Does anyone have any idea on what can be happening? Or at least, how can
>>> I
>>> gather more information about this issue?
>>>
>>>
>>
>> Please try setting "Integrated Graphics" or "Switchable Graphics" mode on
>> Display setting in
>> BIOS. AFAICT it is known issue and FreeBSD doesn't boot when set to
>> Discrete Graphics in BIOS.
>>
>
> Or maybe it boots but nothing shows on screen.
>
> Ganbold
>
>

Ganbold,

  you were right. I've switched the BIOS>Display into Switchable graphics
and now the system does not freeze while the backslash symbol is spinning.
It behaves the same as you described in the -current list (see
http://lists.freebsd.org/pipermail/freebsd-current/2009-January/001941.html)

  However, let me tell you what I've found now (I tried these options in the
following order).

**
  a) I've started with option 2 (ACPI disabled) because I read somewhere
that FreeBSD did not support Lenovo T400's ACPI.

  The system boots and freezes after showing
  (something related with Timecounters)
  md0: Preloaded image  4194304 bytes at 
  Trying to mount root from ufs:/dev/md0

  b) When I chose option Safe mode (3, IIRC)

  The same happens as in a)

  c) I've tried with verbose logging (option 5, I think)

  The initialization dumps a lot of debugging information and after some
time, it brings me the "Choosing region" screen. I didn't get further
because I will not install the system right now (I have to prepare the
backups first ;) )

  I tried to use ScrollLock + PgUp to look for the conflicting line found
in a) or any suspicious message but it didn't work. I also tried with a
holographic shell and the livefs without luck.

  d) I also tried option 1 (default boot) -- just in order to check.

  It also worked. Like c) but without debugging information.

**

  So, right now, it seems that the system allows me to begin the
installation of FreeBSD 7.1 amd64 on the T400 but I'm scared due to the lack
of functionality of options 2 and 3. This behavior is not normal, is it? Any
thoughts about this?

Thank you very much!


Re: lenovo t400 does not start 7.1

2009-01-15 Thread Harald Servat
On Thu, Jan 15, 2009 at 8:28 AM, Ganbold  wrote:

> Harald Servat wrote:
>
>>  However, let me tell you what I've found now (I tried these options in
>> the
>> following order).
>>
>> **
>>  a) I've started with option 2 (ACPI disabled) because I read somewhere
>> that FreeBSD did not support Lenovo T400's ACPI.
>>
>>  The system boots and freezes after showing
>>  (something related with Timecounters)
>>  md0: Preloaded image  4194304 bytes at 
>>  Trying to mount root from ufs:/dev/md0
>>
>>  b) When I chose option Safe mode (3, IIRC)
>>
>>  The same happens as in a)
>>
>>  c) I've tried with verbose logging (option 5, I think)
>>
>>  The initialization dumps a lot of debugging information and after some
>> time, it brings me the "Choosing region" screen. I didn't get further
>> because I will not install the system right now (I have to prepare the
>> backups first ;) )
>>
>>  I tried to use ScrollLock + PgUp to look for the conflicting line found
>> in a) or any suspicious message but it didn't work. I also tried with a
>> holographic shell and the livefs without luck.
>>
>>  d) I also tried option 1 (default boot) -- just in order to check.
>>
>>  It also worked. Like c) but without debugging information.
>>
>> **
>>
>>  So, right now, it seems that the system allows me to begin the
>> installation of FreeBSD 7.1 amd64 on the T400 but I'm scared due to the
>> lack
>> of functionality of options 2 and 3. This behavior is not normal, is it?
>> Any
>> thoughts about this?
>>
>>
> I'm using integrated graphics right now; I didn't try options 2 and 3.
> I have to find a FireWire cable to check whether anything is going on
> when booting with discrete graphics. You can install FreeBSD with
> integrated graphics, configure network, ssh etc., and then later try to
> boot with discrete graphics.
> As the boot screen stops with |, and as kib@ suggests, some time later you
> can check from the network whether you can log in to the machine via ssh.
> You can also try to ping it at that time, and observe the hard disk
> activity light.
> If you can ping or ssh, then I guess it means boot works and the kernel
> loads, but something prevents output from being displayed on the screen.
>
> Ganbold
>
> --
> BACHELOR: A guy who is footloose and fiancee-free.
>

I don't remember whether the HD LED shows activity when booting from the
DVD, but it's worth trying once the system has been installed. I'll keep
you informed once I install the system.
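
The network check Ganbold suggests (is the machine reachable via ssh even
though the screen shows nothing?) could be sketched roughly as below. This
is only an illustration: the function name and the address in the usage
note are made up, and port 22 assumes sshd listens on its default port.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A successful connect to the ssh port suggests the kernel booted and the
    network came up, even if nothing is displayed on the console.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all as "not reachable".
        return False
```

For example, `tcp_port_open("192.0.2.10", 22)` would probe sshd on a
(hypothetical) T400 at that address from another box on the same LAN.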

Thank you Ganbold et altri.


Broken multimedia/dvdauthor repaired at last

2009-11-29 Thread Harald Weis
I just found a way to ''repair'' dvdauthor, which has been broken for
several months.
It suffices to comment out two lines - 1082 and 1083 - in subreader.c, like
so:
  //fribidi_set_mirroring (FRIBIDI_TRUE);
  //fribidi_set_reorder_nsm (FRIBIDI_FALSE);

# cd /usr/ports/multimedia/dvdauthor
# make
This produces the error message. Change subreader.c as said.
# vi work/dvdauthor-0.6.14/src/subreader.c
# make install

It works for me. I don't know what functionality is missing now, but
lxdvdrip, which requires dvdauthor, is working as usual. :)

Hope it helps everybody else.
-- 
Harald Weis


Re: Broken multimedia/dvdauthor repaired at last

2009-11-29 Thread Harald Weis
On Sun, Nov 29, 2009 at 07:26:53PM +0100, Julian H. Stacey wrote:
> Hi,
> Reference:
> > From:       Harald Weis  
> > Date:   Sun, 29 Nov 2009 16:43:14 +0100 
> > Message-id: <20091129154314.ga2...@pollux.local.net> 
> 
> Harald Weis wrote:
> > I just found a way to ''repair'' dvdauthor which is broken since several
> > months.
> > It suffices to comment two lines - 1082 and 1083 - in subreader.c like
> > so:
> >   //fribidi_set_mirroring (FRIBIDI_TRUE);
> >   //fribidi_set_reorder_nsm (FRIBIDI_FALSE);
> > 
> > # cd /usr/ports/multimedia/dvdauthor
> > # make
> > This produces the error message. Change subreader.c as said.
> > # vi work/dvdauthor-0.6.14/src/subreader.c
> > # make install
> > 
> > It works for me. Don't know what is missing now. But lxdvdrip which
> > requires dvdauthor is working as usual. :)
> > 
> > Hope it helps everybody else.
> 
> Please use send-pr so this will be seen by those who can commit your fix.

Okay, done.
Harald


top Segmentation faulting on 8.0p2 amd64

2010-01-19 Thread Harald Schmalzbauer

Dear all,

I have no idea why top crashes with a segmentation fault on my amd64 
machine running FreeBSD 8.0-RELEASE-p2.
If someone wants to have a look at the core dump: 
http://www.schmalzbauer.de/downloads/top.core

But I think I should recompile it with DEBUG=-g first, right?
World and kernel are in sync; I ran an extra build to make sure of that.
My kernconf follows.
Any help appreciated.

Thanks,

FreeBSD qaweb.hn.sand.spsnetz.de 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 
#0: Tue Jan 19 22:00:17 CET 2010 
ad...@qaweb.hn.sand.spsnetz.de:/usr/obj/usr/src/sys/FJS-RX100S3  amd64


include GENERIC

makeoptions none

nooptions   COMPAT_FREEBSD4 # Compatible with FreeBSD4
nooptions   COMPAT_FREEBSD5 # Compatible with FreeBSD5

# Floppy drives
nodevice        fdc

# ATA and ATAPI devices
#device         ata
#device         atadisk         # ATA disk drives
nodevice        ataraid         # ATA RAID drives
#device         atapicd         # ATAPI CDROM drives
nodevice        atapifd         # ATAPI floppy drives
nodevice        atapist         # ATAPI tape drives
#options        ATA_STATIC_ID   # Static device numbering

# SCSI Controllers
nodevice        ahb             # EISA AHA1742 family
nodevice        ahc             # AHA2940 and onboard AIC7xxx devices
nooptions       AHC_REG_PRETTY_PRINT    # Print register bitfields in debug
                                        # output.  Adds ~128k to driver.
nodevice        ahd             # AHA39320/29320 and onboard AIC79xx devices
nooptions       AHD_REG_PRETTY_PRINT    # Print register bitfields in debug
                                        # output.  Adds ~215k to driver.
nodevice        amd             # AMD 53C974 (Tekram DC-390(T))
nodevice        hptiop          # Highpoint RocketRaid 3xxx series
nodevice        isp             # Qlogic family
##device        ispfw           # Firmware for QLogic HBAs- normally a module
nodevice        mpt             # LSI-Logic MPT-Fusion
##device        ncr             # NCR/Symbios Logic
nodevice        sym             # NCR/Symbios Logic (newer chipsets + those of `ncr')
nodevice        trm             # Tekram DC395U/UW/F DC315U adapters

nodevice        adv             # Advansys SCSI adapters
nodevice        adw             # Advansys wide SCSI adapters
nodevice        aha             # Adaptec 154x SCSI adapters
nodevice        aic             # Adaptec 15[012]x SCSI adapters, AIC-6[23]60.
nodevice        bt              # Buslogic/Mylex MultiMaster SCSI adapters

nodevice        ncv             # NCR 53C500
nodevice        nsp             # Workbit Ninja SCSI-3
nodevice        stg             # TMC 18C30/18C50

# SCSI peripherals
device          scbus           # SCSI bus (required for SCSI)
device          ch              # SCSI media changers
#device         da              # Direct Access (disks)
#device         sa              # Sequential Access (tape etc)
#device         cd              # CD
#device         pass            # Passthrough device (direct SCSI access)
#device         ses             # SCSI Environmental Services (and SAF-TE)

# RAID controllers interfaced to the SCSI subsystem
nodevice        amr             # AMI MegaRAID
nodevice        arcmsr          # Areca SATA II RAID
##XXX it is not 64-bit clean, -scottl
##device        asr             # DPT SmartRAID V, VI and Adaptec SCSI RAID
nodevice        ciss            # Compaq Smart RAID 5*
nodevice        dpt             # DPT Smartcache III, IV - See NOTES for options
nodevice        hptmv           # Highpoint RocketRAID 182x
nodevice        hptrr           # Highpoint RocketRAID 17xx, 22xx, 23xx, 25xx
nodevice        iir             # Intel Integrated RAID
nodevice        ips             # IBM (Adaptec) ServeRAID
nodevice        mly             # Mylex AcceleRAID/eXtremeRAID
nodevice        twa             # 3ware 9000 series PATA/SATA RAID

# RAID controllers
nodevice        aac             # Adaptec FSA RAID
nodevice        aacp            # SCSI passthrough for aac (requires CAM)
nodevice        ida             # Compaq Smart RAID
nodevice        mfi             # LSI MegaRAID SAS
nodevice        mlx             # Mylex DAC960 family
#XXX pointer/int warnings
##device        pst             # Promise Supertrak SX6000
nodevice        twe             # 3ware ATA RAID

# atkbdc0 controls both the keyboard and the PS/2 mouse
#device         atkbdc          # AT keyboard controller
#device         atkbd           # AT keyboard
#device         psm             # PS/2 mouse

nodevice        kbdmux          # keyboard multiplexer

#device         vga             # VGA video card dr

Re: Pack of CAM improvements

2010-01-22 Thread Harald Schmalzbauer

Alexander Motin wrote on 19.01.2010 17:12 (localtime):
...

Patch can be found here:
http://people.freebsd.org/~mav/cam-ata.20100119.patch

Feedback as always welcome.


Again, thanks a lot for your ongoing great work!
The patch doesn't cleanly apply with vpo, but I don't use vpo so I 
didn't care.

Otherwise I couldn't find any problems.
The system detects reinserted SATA drives on ICH9 fine.

This was tested on a zfs backup server which went to the backbone 
yesterday, so I can't physically remove any devices any more for testing...


But I have some questions about zfs raidz states. I think this isn't a 
matter of atacam, but when I removed one disk, zpool status still showed 
the ada3 device as "online".
After reinserting it (with proper detection/initialization by cam; ada3 
was present again) and running zpool clear, the device was marked UNAVAIL 
because of I/O errors.

I couldn't get the device into the pool again, no matter what I tried.
Only rebooting the machine helped. Then I could clear and scrub.

What are the steps needed to make a reinserted hard disk available to geom? 
With the latest patches I don't need to issue any reset/rescan command, 
right?

So it's a zfs problem, right? Or a mistake in my understanding?

Thanks,

-Harry





Re: top Segmentation faulting on 8.0p2 amd64 (nss_ldapd problem?)

2010-01-22 Thread Harald Schmalzbauer

Mikolaj Golub wrote on 22.01.2010 23:26 (localtime):

On Wed, 20 Jan 2010 08:06:23 +0100 Harald Schmalzbauer wrote:


Dear all,

I have no idea why top crashes with segmentation fault on my amd64
machine running FreeBSD 8.0-RELEASE-p2.
If someone wants to have a look at the core dump:
http://www.schmalzbauer.de/downloads/top.core


core file is useless without binary and libraries. So it is better to run gdb
on your host, produce backtrace and post here:

gdb /usr/bin/top top.core
bt

And sure a backtrace from the top built with -g would be much better.

cd /usr/src/usr.bin/top
CFLAGS=-g make


Unfortunately nss_ldap seems to be the culprit.

gdb /usr/bin/top top.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `top'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libncurses.so.8...done.
Loaded symbols for /lib/libncurses.so.8
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libkvm.so.5...done.
Loaded symbols for /lib/libkvm.so.5
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/local/lib/nss_ldap.so.1...done.
Loaded symbols for /usr/local/lib/nss_ldap.so.1
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
bt:
#0  0x000800d08403 in __nss_compat_gethostbyname () from 
/usr/local/lib/nss_ldap.so.1
#0  0x000800d08403 in __nss_compat_gethostbyname () from 
/usr/local/lib/nss_ldap.so.1
#1  0x000800d0606f in _nss_ldap_getpwent_r () from 
/usr/local/lib/nss_ldap.so.1

#2  0x0008009ffc54 in __nss_compat_getpwent_r () from /lib/libc.so.7
#3  0x000800a84a3d in nsdispatch () from /lib/libc.so.7
#4  0x000800a50976 in getpwent_r () from /lib/libc.so.7
#5  0x000800a50596 in sysctlbyname () from /lib/libc.so.7
#6  0x00406c6d in machine_init (statics=0x7fffea30, 
do_unames=1 '\001')

at /usr/src/usr.bin/top/machine.c:257
#7  0x00407a10 in main (argc=1, argv=0x7fffeb08)
at /usr/src/usr.bin/top/../../contrib/top/top.c:458

I'm using nss_ldapd-0.7.2 and there's no way to live without ldap...

Any help highly appreciated!

Thanks,

-Harry





Re: top Segmentation faulting on 8.0p2 amd64 (nss_ldapd problem?)

2010-01-24 Thread Harald Schmalzbauer

Alexander V. Chernikov wrote on 24.01.2010 10:24 (localtime):
...

gdb /usr/bin/top top.core
bt

And sure a backtrace from the top built with -g would be much better.

cd /usr/src/usr.bin/top
CFLAGS=-g make


Unfortunately nss_ldap seems to be the culprit.
There is some strange problem with TLS and gcc optimization I can't 
localize


Please try to rebuild port with

post-configure:
	@${REINPLACE_CMD} -e 's/^\(CFLAGS .*\)-O2 \(.*\)$$/\1 -O0 \2/' ${WRKSRC}/nss/Makefile


I'll submit updated port later


That indeed fixed the problem. Thank you very much.
But I found another point for improvement:
When deinstalling/installing, nss_ldap.conf gets deleted/overwritten. I 
think it's better to install nss_ldap.conf.sample, like many other ports do.

I also like the way lighttpd port is managed:
@unexec if cmp -s %D/etc/lighttpd.conf %D/etc/lighttpd.conf.sample; then rm -f %D/etc/lighttpd.conf; fi

etc/lighttpd.conf.sample
@exec [ -f %B/lighttpd.conf ] || cp %B/%f %B/lighttpd.conf
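
To spell out what those plist lines do, here is a rough Python rendering
of the same logic. It is only a sketch: the function names are invented
for illustration, and the real work is of course done by the shell
commands in the packing list, not by anything like this.

```python
import filecmp
import shutil
from pathlib import Path


def install_config(etc_dir, name):
    """Mimic the @exec line: copy NAME.sample to NAME only if NAME is absent.

    Returns True if the sample was copied, False if an existing
    configuration file was left untouched.
    """
    conf = Path(etc_dir) / name
    sample = Path(etc_dir) / (name + ".sample")
    if conf.exists():
        return False
    shutil.copy(sample, conf)
    return True


def deinstall_config(etc_dir, name):
    """Mimic the @unexec line: remove NAME only if it equals NAME.sample.

    A configuration the admin has edited differs from the sample and is
    therefore kept. Returns True if the file was removed.
    """
    conf = Path(etc_dir) / name
    sample = Path(etc_dir) / (name + ".sample")
    if conf.exists() and sample.exists() and filecmp.cmp(conf, sample, shallow=False):
        conf.unlink()
        return True
    return False
```

The point of the scheme is that a locally modified nss_ldap.conf would
survive both deinstall and reinstall, while an untouched copy is cleaned up.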

Thanks,

-Harry





8.0-RELEASE hangs with lighttpd, unionfs related? Some traces included

2010-02-05 Thread Harald Schmalzbauer

Hello,

when I start lighttpd at boot time, the system half-locks: any process 
that accesses /usr/local/etc stalls. It's also impossible to shut down.

/usr/local/etc is unionfs mounted.
I compiled a kernel with debug options.

When mounting unionfs at boot time, here's the first LOR with trace:

lock order reversal:
 1st 0xff00018b47f8 unionfs (unionfs) @ 
/usr/src/sys/fs/unionfs/union_subr.c:356

 2nd 0xff00018d9d80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2188
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
vrele() at vrele+0x120
unionfs_noderem() at unionfs_noderem+0x1c4
unionfs_reclaim() at unionfs_reclaim+0x11
vgonel() at vgonel+0xf1
vrecycle() at vrecycle+0x58
unionfs_inactive() at uniougen2.2:  at usbus2
nfs_inactive+ukbd0: 01.10/1.01, addr 2> on usbus2

x20
vinactive() at vinactive+0x6b
vput() at vput+0x216
kern_statatkbd1 at ukbd0
_vnhook() at kern_statat_vnhook+0xe9
kern_statat() at kern_statat+0x15
stat() at stat+0x22
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- suhid0: yaddr 2> on usbus2
scall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 
0x7fffe5b8, rbp = 0x800b312c0 ---

KDB: enter: witness_checkorder
[thread pid 27 tid 100068 ]
Stopped at  kdb_enter+0x3d: movq$0,0x4c04dc(%rip)
Tracing pid 27 tid 100068 td 0xff00016fe720
kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
vrele() at vrele+0x120
unionfs_noderem() at unionfs_noderem+0x1c4
unionfs_reclaim() at unionfs_reclaim+0x11
vgonel() at vgonel+0xf1
vrecycle() at vrecycle+0x58
unionfs_inactive() at unionfs_inactive+0x20
vinactive() at vinactive+0x6b
vput() at vput+0x216
kern_statat_vnhook() at kern_statat_vnhook+0xe9
kern_statat() at kern_statat+0x15
stat() at stat+0x22
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 
0x7fffe5b8, rbp = 0x800b312c0 -
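
For readers unfamiliar with the term: a lock order reversal means two
locks were once taken in one order and are now being taken in the opposite
order. The toy checker below illustrates the idea only. It is not how
witness(4) is implemented, and the class and method names are made up.

```python
class LockOrderChecker:
    """Toy lock-order checker: remembers every (held, acquired) pair."""

    def __init__(self):
        self.order = set()  # (first, second) pairs observed so far
        self.held = []      # locks currently held, oldest first

    def acquire(self, name):
        """Record an acquisition and return the reversals it causes.

        Each reversal is a (1st, 2nd) tuple, in the same sense as the
        "1st ... 2nd ..." lines of the kernel message.
        """
        reversals = []
        for h in self.held:
            if (name, h) in self.order:
                # Earlier we saw `name` taken before `h`; now it is the
                # other way round.
                reversals.append((h, name))
            self.order.add((h, name))
        self.held.append(name)
        return reversals

    def release(self, name):
        self.held.remove(name)
```

Taking "ufs" then "unionfs" once, and later "unionfs" then "ufs", makes the
second `acquire("ufs")` return `[("unionfs", "ufs")]`, which is the shape of
the report above.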



As mentioned, there is that strange problem with lighttpd started at 
boot time. Other /usr/local/etc/rc.d startups don't lead to a 
/usr/local/etc deadlock.

Unfortunately I don't get any panic or anything else when the hang happens.
How can I acquire more information?
It's no problem to log in and to do everything else outside 
/usr/local/etc...



==

Here's a LOR at shutdown with trace:

lock order reversal:
 1st 0xff0001bc2098 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1200
 2nd 0xff0001bc1ba8 devfs (devfs) @ 
/usr/src/sys/ufs/ffs/ffs_vfsops.c:1194

KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_flushfiles() at ffs_flushfiles+0x93
ffs_unmount() at ffs_unmount+0x48
dounmount() at dounmount+0x2ac
vfs_unmountall() at vfs_unmountall+0x54
boot() at boot+0x814
mkdumpheader() at mkdumpheader
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 
0x7fffe738, rbp = 0x402290 ---

KDB: enter: witness_checkorder
[thread pid 1 tid 12 ]
Stopped at  kdb_enter+0x3d: movq$0,0x4c04dc(%rip)

Tracing pid 1 tid 12 td 0xff0001310ab0
kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_flushfiles() at ffs_flushfiles+0x93
ffs_unmount() at ffs_unmount+0x48
dounmount() at dounmount+0x2ac
vfs_unmountall() at vfs_unmountall+0x54
boot() at boot+0x814
mkdumpheader() at mkdumpheader
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 
0x7fffe738, rbp = 0x402290 ---


Any help highly appreciated!

Thanks,

-Harry





Reboot Loop: ffs_snapshot/bufwait LORs [Was: Re: 8.0-RELEASE hangs with lighttpd, unionfs related? Some traces included]

2010-02-05 Thread Harald Schmalzbauer

Harald Schmalzbauer wrote on 05.02.2010 12:31 (localtime):

Hello,

when I start lighttpd at boot time, the system half-locks: any process 
that accesses /usr/local/etc stalls. It's also impossible to shut down.

/usr/local/etc is unionfs mounted.
I compiled a kernel with debug options.

When mounting unionfs at boot time, here's the first LOR with trace:

lock order reversal:
 1st 0xff00018b47f8 unionfs (unionfs) @ 
/usr/src/sys/fs/unionfs/union_subr.c:356

 2nd 0xff00018d9d80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2188
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
vrele() at vrele+0x120
unionfs_noderem() at unionfs_noderem+0x1c4
unionfs_reclaim() at unionfs_reclaim+0x11
vgonel() at vgonel+0xf1
vrecycle() at vrecycle+0x58
unionfs_inactive() at uniougen2.2:  at usbus2
nfs_inactive+ukbd0: 01.10/1.01, addr 2> on usbus2

x20
vinactive() at vinactive+0x6b
vput() at vput+0x216
kern_statatkbd1 at ukbd0
_vnhook() at kern_statat_vnhook+0xe9
kern_statat() at kern_statat+0x15
stat() at stat+0x22
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- suhid0: yaddr 2> on usbus2
scall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 
0x7fffe5b8, rbp = 0x800b312c0 ---

KDB: enter: witness_checkorder
[thread pid 27 tid 100068 ]
Stopped at  kdb_enter+0x3d: movq$0,0x4c04dc(%rip)
Tracing pid 27 tid 100068 td 0xff00016fe720
kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
vrele() at vrele+0x120
unionfs_noderem() at unionfs_noderem+0x1c4
unionfs_reclaim() at unionfs_reclaim+0x11
vgonel() at vgonel+0xf1
vrecycle() at vrecycle+0x58
unionfs_inactive() at unionfs_inactive+0x20
vinactive() at vinactive+0x6b
vput() at vput+0x216
kern_statat_vnhook() at kern_statat_vnhook+0xe9
kern_statat() at kern_statat+0x15
stat() at stat+0x22
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 
0x7fffe5b8, rbp = 0x800b312c0 -



As mentioned, there is that strange problem with lighttpd started at 
boot time. Other /usr/local/etc/rc.d startups don't lead to a 
/usr/local/etc deadlock.

Unfortunately I don't get any panic or anything else when the hang happens.
How can I acquire more information?
It's no problem to log in and to do everything else outside 
/usr/local/etc...



==

Here's a LOR at shutdown with trace:

lock order reversal:
 1st 0xff0001bc2098 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1200
 2nd 0xff0001bc1ba8 devfs (devfs) @ 
/usr/src/sys/ufs/ffs/ffs_vfsops.c:1194

KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_flushfiles() at ffs_flushfiles+0x93
ffs_unmount() at ffs_unmount+0x48
dounmount() at dounmount+0x2ac
vfs_unmountall() at vfs_unmountall+0x54
boot() at boot+0x814
mkdumpheader() at mkdumpheader
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 
0x7fffe738, rbp = 0x402290 ---

KDB: enter: witness_checkorder
[thread pid 1 tid 12 ]
Stopped at  kdb_enter+0x3d: movq$0,0x4c04dc(%rip)

Tracing pid 1 tid 12 td 0xff0001310ab0
kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_flushfiles() at ffs_flushfiles+0x93
ffs_unmount() at ffs_unmount+0x48
dounmount() at dounmount+0x2ac
vfs_unmountall() at vfs_unmountall+0x54
boot() at boot+0x814
mkdumpheader() at mkdumpheader
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 
0x7fffe738, rbp = 0x402290 ---


Any help highly appreciated!

Thanks,

-Harry


Additional LORs during regular machine operation (background fsck), which 
lead to a reboot!
I have access over the serial console, but the machine is unresponsive 
after that. So I'm now in an endless reboot loop with the debug kernel...


lock order reversal:
 1st 0xff0001b899d0 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:423
 2nd 0xff802970fc28 bufwait (bufwait) @ 
/usr/src/sys/kern/vfs_bio.c:2559

 3rd 0xff00018b4448 ufs (ufs) @ /usr/src/

Re: Reboot Loop: ffs_snapshot/bufwait LORs [Was: Re: 8.0-RELEASE hangs with lighttpd, unionfs related? Some traces included]

2010-02-05 Thread Harald Schmalzbauer

Harald Schmalzbauer wrote on 05.02.2010 12:39 (localtime):

Harald Schmalzbauer wrote on 05.02.2010 12:31 (localtime):

...


Additional LORs during regular machine operation (background fsck), which 
lead to a reboot!
I have access over the serial console, but the machine is unresponsive 
after that. So I'm now in an endless reboot loop with the debug kernel...


lock order reversal:
 1st 0xff0001b899d0 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:423
 2nd 0xff802970fc28 bufwait (bufwait) @ 
/usr/src/sys/kern/vfs_bio.c:2559

 3rd 0xff00018b4448 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:544
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x1b70
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffe988, rbp = 0x800a028e-

KDB: enter: witness_checkorder
[thread pid 947 tid 100073 ]
Stopped at  kdb_enter+0x3d: movq$0,0x4c04dc(%rip)
db> lock order reversal:
 1st 0xff00018b4470 vnode interlock (vnode interlock) @ 
/usr/src/sys/ufs/ffs/ffs_snapshot.c:523
 2nd 0xff8000422028 uhci2 (uhci2) @ 
/usr/src/sys/dev/usb/controller/uhci.c:1551

KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_mtx_lock_flags() at _mtx_lock_flags+0x68
uhci_do_poll() at uhci_do_poll+0x2e
usbd_transfer_poll() at usbd_transfer_poll+0x18d
ukbd_do_poll() at ukbd_do_poll+0x63
ukbd_get_key() at ukbd_get_key+0xa8
ukbd_read_char() at ukbd_read_char+0xaa
scgetc() at scgetc+0x5b
sc_cngetc() at sc_cngetc+0xf2
cncheckc() at cncheckc+0x65
cngetc() at cngetc+0x1c
db_readline() at db_readline+0x79
db_read_line() at db_read_line+0x15
db_command_loop() at db_command_loop+0x38
db_trap() at db_trap+0x87
kdb_trap() at kdb_trap+0x82
trap() at trap+0x18f
calltrap() at calltrap+0x8
--- trap 0x3, rip = 0x80381141, rsp = 0xff803e959000, rbp = 
0xff803e959020 ---

kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x1b70
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffe988, rbp = 0x800a028e-

lock order reversal:
 1st 0xff00018b4470 vnode interlock (vnode interlock) @ 
/usr/src/sys/ufs/ffs/ffs_snapshot.c:523
 2nd 0xff0001747890 USB device mutex (USB device mutex) @ 
/usr/src/sys/dev/usb/usb_device.c:1410

KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_mtx_lock_flags() at _mtx_lock_flags+0x68
usbd_clear_stall_proc() at usbd_clear_stall_proc+0x49
usbd_transfer_poll() at usbd_transfer_poll+0x1c0
ukbd_do_poll() at ukbd_do_poll+0x63
ukbd_get_key() at ukbd_get_key+0xa8
ukbd_read_char() at ukbd_read_char+0xaa
scgetc() at scgetc+0x5b
sc_cngetc() at sc_cngetc+0xf2
cncheckc() at cncheckc+0x65
cngetc() at cngetc+0x1c
db_readline() at db_readline+0x79
db_read_line() at db_read_line+0x15
db_command_loop() at db_command_loop+0x38
db_trap() at db_trap+0x87
kdb_trap() at kdb_trap+0x82
trap() at trap+0x18f
calltrap() at calltrap+0x8
--- trap 0x3, rip = 0x80381141, rsp = 0xff803e959000, rbp = 
0xff803e959020 ---

kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x1b70
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffe988, rbp = 0x800a028e-

lock order reversal: (Giant after non-sleepable)
 1st 0xff00018b4470 vnode interlock (vnode interlock) @ 
/usr/src/sys/ufs/ffs/ffs_snapshot.c:523
 2nd 0x80820780 Giant (Giant) @ 
/usr/src/sys/dev/usb/usb_transfer.c:1952

KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_mtx_lock_flags() at _mtx_lock_flags+0x68
usb_callback_proc() at usb_callback_proc+0x48
usbd_transfer_p

best practice to watch TCP parms of established sockets

2010-02-17 Thread Harald Schmalzbauer

Hello,

while doing some ZFS tests with RELENG_8 I noticed a mysterious 
performance drop after an hour of uptime.
Now my first idea is to compare MSS and window sizes before and after 
the performance drop.

How do I best capture them? tcpdump? It's GbE link speed...
Or is netstat capable of showing these values? Or is there any way to read 
out the values stored in tcp.hostcache?


Thanks,

-Harry





Re: best practice to watch TCP parms of established sockets

2010-02-17 Thread Harald Schmalzbauer
On 17.02.2010 19:56, Chuck Swiger wrote:
> Hi--
> 
> On Feb 17, 2010, at 8:03 AM, Harald Schmalzbauer wrote:
>> while doing some ZFS tests with RELENG_8 I noticed a mysterious 
>> performance drop after an hour of uptime.
>> Now my first idea is to compare MSS and window sizes before and after the 
>> performance drop.
>> How do I best capture them? tcpdump? It's GbE link speed...
> 
> It seems more likely that ZFS is running into slowdowns from resource 
> contention, memory fragmentation, etc than your network would suddenly drop 
> out, but tcpdump -w outfile.pcap is a good method of looking

Thanks, but first tests showed that ZFS is not causing the slowdown.
I noticed that disabling window scaling (rfc1323) speeds up the transfer
rate by a factor of 1.5 (rsync transfers start at 75MB/s, settling down at
55MB/s, while enabling rfc1323 leads to half the window size (8k), and
rsync transfers start at 50MB/s and go down to 38MB/s).
Now I'm going through "Internet Core Protocols" because it's been a long
time since I had such a deep look into TCP/IP and I don't have all the TCP
sequences and options in memory.
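
As a back-of-the-envelope check on how window size and transfer rate
relate: a sender can have at most one window of unacknowledged data in
flight per round trip, so the rate is bounded by window / RTT. The sketch
below is only arithmetic under assumptions that are mine, not measured:
"8k" is taken as an effective window of 8192 bytes and MB as 10^6 bytes.

```python
def window_limited_rate(window_bytes, rtt_s):
    """Upper bound on throughput (bytes/s) with one window in flight per RTT."""
    return window_bytes / rtt_s


def max_rtt_for_rate(window_bytes, rate_bytes_per_s):
    """Largest RTT (seconds) at which the given window still sustains the rate."""
    return window_bytes / rate_bytes_per_s


# Assumed effective window of 8192 bytes at the observed 50 MB/s and 38 MB/s.
for rate in (50e6, 38e6):
    rtt_us = max_rtt_for_rate(8192, rate) * 1e6
    print(f"{rate / 1e6:.0f} MB/s needs RTT <= {rtt_us:.1f} us")
```

If the measured round-trip time on the link is well below these figures,
the window alone would not explain the limit under these assumptions.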

Not verified yet, but as soon as I open a second high data rate
transfer simultaneously with the rsync (an smb transfer of a large file,
for example), the shared speed stays at that level, even after the second
transfer has finished. Meaning: I can rsync at 50MB/s. If I
simultaneously transfer another file via samba, I have two 25MB/s
transfers. From that moment on I can never get more than 25MB/s per
transfer until I reboot the machine.

As mentioned earlier, I first have to refresh some networking basics,
but expect some more results soon.

Thanks,

-Harry





Help for TCP understanding wanted, ACK-MSS-Window [Was: Re: best practice to watch TCP parms of established sockets]

2010-02-18 Thread Harald Schmalzbauer

Harald Schmalzbauer wrote on 17.02.2010 20:15 (localtime):
...

Now my first idea is to compare MSS and window sizes before and after the 
performance drop.
How do I best capture them? tcpdump? It's GbE link speed...

It seems more likely that ZFS is running into slowdowns from resource 
contention, memory fragmentation, etc than your network would suddenly drop 
out, but tcpdump -w outfile.pcap is a good method of looking


Thanks, but first tests showed that ZFS is not causing the slowdown.


Hello,

I got exactly the same limitations when using tmpfs. So for now I'll 
concentrate on that and get back to ZFS later.


Please clarify my TCP understanding.
If I have the window set to 65535 in the header and an MSS of 1460, how 
often should the receiver send ACK segments? window/MSS, right?
Now I see every two segments acknowledged in my dump (rsync between two 
em0 interfaces).

I'd like to understand
a) why disabling net.inet.tcp.rfc1323 gives slightly better rsync 
throughput than leaving it enabled

b) why I can't transfer more than 50MB/s between my directly linked GbE boxes.

But right now I don't even understand the dump I see. As far as I 
understand, I should only see one ACK segment for every 45 data segments. 
That would clearly explain to me why I can't saturate my GbE link. But I 
can't imagine this is an undiscovered bug, so I guess I haven't 
understood TCP.
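
The figure of 45 comes from dividing the window by the MSS. A quick check
of that arithmetic, using only the two numbers given above (the function
name is just for illustration):

```python
def segments_per_window(window_bytes, mss_bytes):
    """Return (full segments that fit in the window, exact window/MSS ratio)."""
    return window_bytes // mss_bytes, window_bytes / mss_bytes


full, ratio = segments_per_window(65535, 1460)
print(full, round(ratio, 2))  # 44 44.89
```

So 44 full-sized segments fit, roughly the 45 quoted. Note that, as far as
I know, the advertised window bounds how much unacknowledged data may be in
flight; it does not by itself dictate how often the receiver sends ACKs.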


Please help.

Thanks in advance,

-Harry





Incorrect super block

2010-02-18 Thread Harald Weis
Has anybody encountered the following problem?

Mac OS X does recognize FreeBSD partitions on USB disks, but doesn't
want to mount them because of an ``Incorrect super block''.
This is extremely annoying for my ``client'' because he relies on daily
backups on USB keys. Is there a solution?

Thank you in advance.
-- 
Harald Weis


RELENG_8 ignoring TCP window size? [Was: Re: Help for TCP understanding wanted, ACK-MSS-Window [Was: Re: best practice to watch TCP parms of established sockets]]

2010-02-18 Thread Harald Schmalzbauer

Patrick Mahan wrote on 18.02.2010 16:20 (localtime):

See inline...

...

Please clarify my TCP understanding.
If I have the window set to 65535 in the header and a MSS of 1460, how 
often should the receiver send ACK segments? window/MSS, right?


How soon you see the ACK is based on two values in the kernel:
   net.inet.tcp.delacktime
   net.inet.tcp.delayed_ack

The first one controls how soon the peer replies with an ACK if there is
no data to send back, ie. it is just a plain ack.  Van Jacobson first
recommended it in the early days of TCP/IP.  Historically, it has been
implemented as a 200 ms timer, but in FreeBSD it is a 100 ms timer.


Thank you for your hint. I had heard of that but never thought about it, 
because 100ms is orders of magnitude higher than my µs-range LAN delay 
and because I'm not suffering from extremely slow speeds.


...
a) why disabling net.inet.tcp.rfc1323 gives slightly better rsync 
throughput than enabled


rfc1323 deals with window scaling and timestamp options.  Perhaps these
are getting in the way?


b) why I can't transfer more than 50MB/s between my directly linked GbE boxes.

But right now I don't even understand the dump I see. As far as I 
understand, I should see only one ACK segment for every 45 data segments. 
Seeing ACKs more often than that would explain to me why I can't saturate 
my GbE link. But I can't imagine this is an undiscovered bug, so I guess I 
haven't understood TCP.
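
The "45 data segments" figure above comes from dividing the advertised window by the MSS. A minimal sketch of that arithmetic follows, using the 65535-byte window and 1460-byte MSS quoted in the post. The function name is only illustrative. Note that this number bounds how much unacknowledged data may be in flight; as later replies in this thread point out, it does not determine how often the receiver sends ACKs.

```python
def segments_per_window(window_bytes: int, mss_bytes: int) -> float:
    """Number of full-sized segments that fit into one advertised window."""
    return window_bytes / mss_bytes


window = 65535  # advertised window from the post, in bytes
mss = 1460      # maximum segment size from the post, in bytes

ratio = segments_per_window(window, mss)
print(f"{ratio:.1f} segments per window")             # fractional value
print(f"{int(ratio)} full-sized segments fit completely")
```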




No, we are also seeing similar behavior over the em(4) interface under
FreeBSD 8.0-STABLE.


Some experimental results:
When rsyncing with Windows, with FreeBSD as the receiver, I see the same ACK 
every two segments, but the speed is 72MB/s.
When FreeBSD is the sender and Windows the receiver, it looks more like what 
I expected. There are about 20 data segments before an ACK is returned. And 
there are TCP Window Update segments, reflecting smaller receive 
buffers on the Windows side. But this happens at a throughput of 
82MB/s!!! So the Windows machine is behaving the way I understand TCP 
flow control.

Any explanation why the FreeBSD machine seems to ignore window size?

Thanks,

-Harry





Re: RELENG_8 ignoring TCP window size? [Was: Re: Help for TCP understanding wanted, ACK-MSS-Window [Was: Re: best practice to watch TCP parms of established sockets]]

2010-02-18 Thread Harald Schmalzbauer

Stephen Hurd wrote on 18.02.2010 17:01 (local time):

Harald Schmalzbauer wrote:

Some experimental results:
When rsyncing with Windows, with FreeBSD as the receiver, I see the same 
ACK every two segments, but the speed is 72MB/s.
When FreeBSD is the sender and Windows the receiver, it looks more like what 
I expected. There are about 20 data segments before an ACK is returned. 
And there are TCP Window Update segments, reflecting smaller receive 
buffers on the Windows side. But this happens at a throughput of 
82MB/s!!! So the Windows machine is behaving the way I understand TCP 
flow control.

Any explanation why the FreeBSD machine seems to ignore window size?


IIRC, the delayed ACK RFC requires an ACK at least every second segment.


Good hint, but disabling delayed ACKs leads to an ACK after every single data segment?!

Thanks,

-Harry





Re: RELENG_8 ignoring TCP window size? [Was: Re: Help for TCP understanding wanted, ACK-MSS-Window [Was: Re: best practice to watch TCP parms of established sockets]]

2010-02-18 Thread Harald Schmalzbauer

Stephen Hurd wrote on 18.02.2010 17:09 (local time):
...
A TCP SHOULD implement a delayed ACK, but an ACK should not be 
excessively delayed; in particular, the delay MUST be less than 0.5 
seconds, and in a stream of full-sized segments there SHOULD be an ACK 
for at least every second segment.
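
As a rough illustration of the rule quoted above, the sketch below counts how many ACKs a receiver would send for a burst of full-sized segments when it acknowledges at least every second one. This is a simplified model with a hypothetical helper name: it ignores the delayed-ACK timer, piggybacking on outgoing data, and window updates.

```python
def count_acks(num_segments: int, ack_every: int = 2) -> int:
    """ACKs sent for a burst of full-sized segments.

    One ACK is sent per `ack_every` segments; a trailing remainder is
    still acknowledged (in reality after the delayed-ACK timer fires).
    """
    if num_segments <= 0:
        return 0
    # Ceiling division without importing math.
    return -(-num_segments // ack_every)


print(count_acks(45))               # delayed ACK, every second segment
print(count_acks(45, ack_every=1))  # delayed ACK disabled
```

With the delayed ACK disabled (`ack_every=1`), every segment is acknowledged, which matches the behaviour reported earlier in this thread.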


That's why I asked for help understanding TCP. I'm surely wrong then. I 
thought the ACK segment gets sent once the n segments transferred add up 
to the window size. I don't understand that window size yet... I'm back 
to my books.


The idea of delayed ACKs is to allow an ACK to be sent with data if 
there will be data sent right away, not to combine ACKs... leaving out 
ACKs makes calculation of RTT problematical which causes performance 
problems all over the place... maybe the dearth of ACKs from the windows 
system is causing the problem?


The problem is not with the windows box, these transfer rates are 
sensible. The problem is with two RELENG_8 machines.


I'm doing this whole thing because I observed slowdowns to under 20MB/s and 
I am trying to reproduce and investigate this. But first I have to get the 
idea right... If I don't understand what is going on when transfers make 
sense, I won't be able to determine what happens when transfers are 
slowed down...


Thanks,

-Harry





Re: RELENG_8 ignoring TCP window size? [Was: Re: Help for TCP understanding wanted, ACK-MSS-Window [Was: Re: best practice to watch TCP parms of established sockets]]

2010-02-18 Thread Harald Schmalzbauer

Adam Vande More wrote on 18.02.2010 17:28 (local time):
On Thu, Feb 18, 2010 at 10:24 AM, Harald Schmalzbauer 
<h.schmalzba...@omnilan.de> wrote:


The problem is not with the windows box, these transfer rates are
sensible. The problem is with two RELENG_8 machines.

I'm doing this whole thing because I observed slowdowns to under 20MB/s
and I am trying to reproduce and investigate this. But first I have to get
the idea right... If I don't understand what is going on when
transfers make sense, I won't be able to determine what happens when
transfers are slowed down...


Have you considered the possibility it's not a tcp issue at all, maybe a 
nic driver?


I did, but then the transfers to the Windows box would also be affected, I guess.

And to answer the iperf wondering of Patrick:


Client connecting to banana, TCP port 2121
TCP window size: 4.00 MByte (default)

[  3] local 192.168.147.249 port 65230 connected with 192.168.147.11 port 2121

[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.07 GBytes   918 Mbits/sec

Though I have no idea what this tells me. Never used iperf before...
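
For readers unfamiliar with iperf output, the bandwidth column is simply the transferred volume divided by the interval. The sketch below redoes that arithmetic for the figures above. It assumes that iperf counts GBytes in powers of 1024 and Mbits in powers of 1000; the small difference to the reported 918 Mbits/sec would then come from the rounding of "1.07 GBytes". The function name is only illustrative.

```python
def iperf_rate_mbits(gbytes: float, seconds: float) -> float:
    """Convert a transfer volume and duration into Mbit/s.

    Assumption: GBytes are 1024-based, Mbits are 1000-based.
    """
    bits = gbytes * 1024**3 * 8
    return bits / seconds / 1e6


rate = iperf_rate_mbits(1.07, 10.0)
print(f"{rate:.0f} Mbit/s")    # close to the 918 Mbits/sec iperf reported
print(f"{rate / 8:.0f} MB/s")  # the same rate expressed in megabytes per second
```

In other words, the raw TCP path between the two hosts runs close to gigabit line rate, well above the rsync rates discussed in this thread.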

Thanks,

-Harry





Re: RELENG_8 ignoring TCP window size? [Was: Re: Help for TCP understanding wanted, ACK-MSS-Window [Was: Re: best practice to watch TCP parms of established sockets]]

2010-02-18 Thread Harald Schmalzbauer

Kevin Oberman wrote on 18.02.2010 20:23 (local time):
...

window allows for many packets to be in flight and with a 3 Gbps flow,
that is a LOT of data. While an ACK is sent every two packets of
received data, the transmitting side does not wait for the ACKs. It

...

That is a VERY simple and incomplete explanation of what is happening
with the window, but most of that is irrelevant in local transfers with


Thanks a lot, then I understood it at least half correctly ;) My 
misunderstanding was that I thought the receiver would reduce ACKs... 
Now I know more :)
But unfortunately that makes it more mysterious where the throughput 
problem lies...
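
What the window does limit is the amount of unacknowledged data in flight, so the achievable rate is at most one window per round trip. A minimal sketch of that bound for a 65535-byte window follows. The round-trip times are hypothetical values chosen for illustration, not measurements from this thread.

```python
def max_throughput_mbytes(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput in MB/s: one full window per round trip."""
    return window_bytes / rtt_seconds / 1e6


window = 65535  # bytes, unscaled TCP window

# Hypothetical round-trip times, for illustration only.
for rtt_us in (100, 500, 1000):
    bound = max_throughput_mbytes(window, rtt_us / 1e6)
    print(f"RTT {rtt_us:>4} us -> at most {bound:.1f} MB/s")
```

On a LAN with a round-trip time well below a millisecond, this bound lies above what gigabit Ethernet can carry, which is consistent with the remark that the window is mostly irrelevant for local transfers.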


Thanks to everyone so far,

-Harry





ahcich timeouts, only with ahci, not with ataahci

2010-02-23 Thread Harald Schmalzbauer

Hello,

I'm frequently getting my machine locked with ahcichX timeouts:
ahcich2: Timeout on slot 0
ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr 


ahcich2: Timeout on slot 8
ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr 


ahcich2: Timeout on slot 8
ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr 


...

This happens when a backup over GbE overloads the ZFS/HDD capabilities.
I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking 
up almost immediately, but it still happens.
When I use ataahci instead of ahci (the old driver, if I understand things 
correctly) I also see the ZFS burst write congestion, but it doesn't 
lead to controller timeouts that block the machine.


Sometimes the machine recovers from the disk lock, but most often I have 
to reboot.


Kernel is from Feb. 19, so the recent ahci improvements are active.
Controller is an ICH9R with 3 Samsung F3 SpinPoints.

Any ideas how to work around the hangs other than using the old ahci 
driver?


Thanks,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-02-23 Thread Harald Schmalzbauer

Alexander Motin wrote on 23.02.2010 16:10 (local time):

Harald Schmalzbauer wrote:

I'm frequently getting my machine locked with ahcichX timeouts:
ahcich2: Timeout on slot 0
ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr

...


Given that `is` (interrupt status) is zero and `rs == cs | ss` (the running
command bitmasks in driver and hardware), the controller has not reported
command completion. Given the TFD status 0xc0 with the BUSY bit set, I
would suppose that either the disk is stuck in command processing for some
reason, or the controller missed the command completion status.
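
The two checks described above can be written down as a small sketch. The register values used here are hypothetical, since the logged values in this archive appear truncated; the helper names are also only illustrative. BSY is bit 7 (0x80) of the ATA status byte, so 0xc0 means BSY plus DRDY, while the 0x40 seen in the last line of the older log has BSY clear.

```python
ATA_STATUS_BSY = 0x80   # bit 7 of the ATA status byte: device busy
ATA_STATUS_DRDY = 0x40  # bit 6 of the ATA status byte: device ready


def commands_still_outstanding(cs: int, ss: int, rs: int) -> bool:
    """True if the driver's running-slot mask matches what the hardware
    still reports as issued (cs) or active (ss)."""
    return rs != 0 and rs == (cs | ss)


def device_busy(tfd: int) -> bool:
    """True if the BSY bit is set in the task file status."""
    return bool(tfd & ATA_STATUS_BSY)


# Hypothetical register snapshot, not copied from the logs in this thread.
cs, ss, rs, tfd = 0x00000001, 0x00000000, 0x00000001, 0xC0
print(commands_still_outstanding(cs, ss, rs))  # command not completed
print(device_busy(tfd))                        # drive still reports BUSY
```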

Have you noticed a 30-second pause (the default ATA timeout) before the
timeout message is printed? I just want to be sure that the driver waited
long enough before giving up.


Yes, there is some pause between the occurrence of the hang and the first 
timeout message. But I can't tell you whether it's exactly 30 seconds. I 
would guess rather more than 30 sec.



This happens when backup over GbE overloads ZFS/HDD capabilities.
I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
up almost immediately, but from it still happens.
When I don't use ahci but ataahci (the old driver if I understand things
correct) I also see the ZFS burst write congestion, but this doesn't
lead to controller timeouts, thus blocking the machine.

Sometimes the machine recovers from the disk lock, but most often I have
to reboot.


How does it look when it doesn't? Can you send me the full log messages?


Unfortunately not. That happened only once (that I noticed), 3 days 
ago, and the messages log has rotated 5 times since then...
But I have some messages from 02/15, with a kernel from January. Usually 
the messages continue to pop up until I reset the machine. This time 
there were only the three above, even after waiting half an hour (I had to 
go on site). The old messages:


ahcich2: Timeout on slot 20
ahcich2: is  cs ff07 ss fff7 rs fff7 tfd c0 serr 


ahcich4: Timeout on slot 24
ahcich4: is  cs f07f ss ff7f rs ff7f tfd c0 serr 


ahcich2: Timeout on slot 17
ahcich2: is  cs fff9 ss  rs  tfd c0 serr 


ahcich4: Timeout on slot 20
ahcich4: is  cs 0030 ss  rs 0030 tfd c0 serr 


ahcich2: Timeout on slot 15
ahcich2: is  cs fff87fff ss  rs  tfd c0 serr 


ahcich4: Timeout on slot 22
ahcich4: is  cs fc0f ss ffcf rs ffcf tfd c0 serr 


ahcich2: Timeout on slot 13
ahcich2: is  cs 1fff ss  rs  tfd c0 serr 


ahcich4: Timeout on slot 16
ahcich4: is  cs 0001 ss  rs 0001 tfd c0 serr 


ahcich2: Timeout on slot 11
ahcich2: is  cs c7ff ss  rs  tfd c0 serr 


ahcich4: Timeout on slot 16
ahcich4: is  cs  ss 0001 rs 0001 tfd 40 serr 



Maybe it's helpful to you. Since I haven't seen the hang after 
upgrading, although doing extensive network transfer tests, I thought it 
vanished and haven't kept logs safe...



Kernel is from Feb. 19, so recent ahci improovements are active.
Controller is ICH9R with 3 Samsung F3 SpinPoints.

Any ideas how to work arround the hangs other than using the old ahci
driver?


Old ataahci driver wasn't using NCQ. NCQ may trigger some bugs in drive
firmware or expose some protocol inconsistencies. I would recommend you
to search for some errata for your drive and possibly firmware update.


Sounds reasonable.
How can I disable NCQ with the new ahci driver?
I guess if it's an HDD firmware issue with NCQ, the hang shouldn't happen 
when NCQ is disabled.
Btw, I found
camcontrol cmd ada0 -a "EF 85 00 00 00 00 00 00 00 00 00 00"
for disabling APM, and another one for disabling AAM. I did that for 
my drives. Is there a wiki where we can place such valuable commands?


Thanks,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-02-23 Thread Harald Schmalzbauer

Alexander Motin wrote on 23.02.2010 17:18 (local time):
...

I guess if it's a HDD firmware issue with NCQ the hang shouldn't happen
when NCQ is disabled.


Just in case it is a real I/O timeout, run a full surface test with SMART.


Unfortunately I couldn't find new firmware from Samsung, although one 
drive shows version 1AG01113 while the other two have 1AG01118. But the 
timeout happened at different channels, so it's not one certain disk...


One question for my understanding: If the drive doesn't complete a command, 
whether due to a firmware bug, a disk surface error or 
whatever, is there no way for the driver to terminate the request and 
take the drive offline after some time? This would be very important 
behaviour for me. It doesn't make sense to build RAIDZ storage when a 
failing drive hangs the complete machine, even if the system partitions 
are on a completely different SSD.



Btw, I found camcontrol cmd ada0 -a "EF 85 00 00 00 00 00 00 00 00 00
00" for disabling APM and another one for disabling AAM. I did that for
my drives. Is there a wiki where we can place such valuable commands?


Probably not. These are just ATA commands, taken from the ATA specification,
but it is definitely not a very easy way.


Hmm, I saw some FreeBSD wikis, but I don't know if there's _one_ 
official wiki. I'll see if there's a possibility to leave some useful hints 
for such purposes.


Thanks,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-02-23 Thread Harald Schmalzbauer

Alexander Motin wrote on 23.02.2010 18:35 (local time):
...

One understanding question: If the drive doesn't complete a command,
regardless if it's due to a firmware bug, a disk surface error or
whatever, is there no way for the driver to terminate the request and
take the drive offline after some time? This would be a very important
behaviour for me. It doesn't make sense building RAIDz storage when a
failing drive hangs the complete machine, even if the system partitions
are on a complete different SSD.


That's what timeouts are for. When a timeout is detected, the driver resets
the device and reports an error to the upper layer. After receiving the
error, CAM reinitializes the device. If the device is completely dead,
reinitialization will fail and the device will be dropped immediately. If the
device is still alive, reinitialization succeeds and CAM retries the command.
If all retries fail, the error is reported to the GEOM layer and then
possibly to the file system. I have no idea how RAIDZ behaves in such a case.
Maybe after a few such errors it should drop that device from the array.

A timeout is the worst possible case for any device, as it takes too much
time and doesn't give any recovery information. A half-dead device is the
worst possible case of timeout. It is difficult to say which way is
better: drop the last drive from a degraded array and lose all data, or retry
forever. There is probably no right answer.
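
The recovery path described above can be summarised as a small decision sketch. This is only a model of the prose, not the actual CAM or ahci(4) code: the function name, its parameters and the returned strings are all hypothetical.

```python
from typing import Sequence


def handle_timeout(reinit_succeeds: bool, retry_results: Sequence[bool]) -> str:
    """Model of the described flow after a command timeout.

    reinit_succeeds: whether the device answers when CAM reinitializes it
                     after the driver reset.
    retry_results:   outcome of each retry of the timed-out command,
                     True meaning the command completed.
    """
    # Step 1: driver resets the device and reports the error upwards.
    # Step 2: CAM tries to reinitialize the device.
    if not reinit_succeeds:
        return "device dropped"
    # Step 3: device is alive, so the command is retried.
    for completed in retry_results:
        if completed:
            return "command completed"
    # Step 4: all retries failed.
    return "error reported to GEOM"


print(handle_timeout(False, []))                     # completely dead device
print(handle_timeout(True, [False, True]))           # recovers on a retry
print(handle_timeout(True, [False, False, False]))   # half-dead device
```

The third case is the problematic one from the discussion: each failed retry costs another full timeout before the error finally reaches the upper layers.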


I see. Thanks a lot for the clarification.
Before getting the machine on site I did some ZFS tests, like removing one 
disk while a cvs checkout was running.
I remember that ZFS didn't show the removed drive as offline, but 
there was no hang. The pool was degraded, and after reinserting and 
rebooting I could resilver the pool. I couldn't manage to get it 
consistent without rebooting, but I accepted that since I would have to 
go on site to change the drive anyway.
I'll restore the default vfs.zfs.txg.timeout=30, so the hang can be 
easily reproduced, and see if I can 'camcontrol stop' the drive. Do you 
think I can get useful information from that test?


Thanks,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-02-23 Thread Harald Schmalzbauer
On 23.02.2010 19:14, Alexander Motin wrote:
...
>> I can remember that ZFS hadn't showed the removed drive as offline, but
>> there was no hang. The pool was degraded and after reinserting and
>> rebooting I could resilver the pool. I couldn't manage to get it
>> consistent without rebooting, but I accepted that since I would have to
>> walk on site for changing the drive any way.
> 
> That's question to ZFS. CAM and GEOM destroying/creating device
> automatically and fast enough.
> 
>> I'll restore the default vfs.zfs.txg.timeout=30, so the hang can be
>> easily reproduced and see if I can 'camcontrol stop' the drive. Do you
>> think I can get usefull information with that test?
> 
> Stop won't work for ATA devices. It is SCSI command. And all it does -
> stops spindle. It won't destroy device. AFAIK there is no method in CAM
> now to manually disable some device on-flight. If some device half-died,
> the best way is to mechanically disconnect it. It will help CAM to
> recover as fast as possible.

Thank you very much again. I wasn't aware of that. I thought 'camcontrol
stop' was similar to 'atacontrol detach'.
It's important for me to be able to manage my systems remotely, so I need
to stay with the old ataahci driver. The detach feature has been a
life-saver for me several times, especially with IDE disks. I often had
drives going nuts, and detach/attach with a gmirror/graid3 rebuild always
solved the problem. I expect to see SATA drive oddities too (maybe like
now) when they replace the old IDE servers.

One last quick question: I read about a new feature adapting the old ata
driver to CAM. Do you have a link to useful information? Or will a mailman
search list all the useful info?

Thanks,

-Harry





Re: RELENG_8 ignoring TCP window size? [Was: Re: Help for TCP understanding wanted, ACK-MSS-Window [Was: Re: best practice to watch TCP parms of established sockets]]

2010-02-25 Thread Harald Schmalzbauer

Nikos Ntarmos wrote on 24.02.2010 16:37 (local time):

On Fri, Feb 19, 2010 at 11:55:39AM +0200, Nikos Ntarmos wrote:

On Thu, Feb 18, 2010 at 10:41:28PM +0100, Harald Schmalzbauer wrote:

Kevin Oberman wrote on 18.02.2010 20:23 (local time):
...

window allows for many packets to be in flight and with a 3 Gbps flow,
that is a LOT of data. While an ACK is sent every two packets of
received data, the transmitting side does not wait for the ACKs. It

...

That is a VERY simple and incomplete explanation of what is happening
with the window, but most of that is irrelevant in local transfers with

Thanks a lot, then I understood it at least half correctly ;) My
misunderstanding was that I thought the receiver would reduce
ACKs... Now I know more :)
But unfortunately that makes it more mysterious where the throughput
problem lies...

Thanks to everyone so far,

Hi there.

This is a long shot but have you tried disabling checksum and
segmentation offloading? I've found that they cause trouble with some
NICs. FYI on FreeBSD the first is done through 'ifconfig -rxcsum' while
the latter through 'sysctl net.inet.tcp.tso=0'. If you try this out,
remember to disable these features on both communicating boxes for the
period of the test, just to be sure that it's not the other box causing
these issues. You mentioned samba so if one of them is a win32 box, you
can access these settings through the hardware options of your NIC.


Hi again.

Just a friendly nudge. :) Did you find the root of these issues yet?


I don't have solid conclusions. But I ruled out some things.
First, the em driver has no problems in my setup. Neither disabling 
rx/txcsum nor disabling TSO made any difference in throughput, and there was 
no directly recognizable change in load on my 2x3GHz machine. Maybe checksum 
offload is beneficial on slower machines.
One major problem was that one of the two FreeBSD boxes was running on VMware, 
which slowed things down a bit, although the FreeBSD system showed 
plenty of free resources.


Disabling rfc1323 definitely increases the throughput on gigabit 
Ethernet. I can rsync between two (native) FreeBSD machines at 72-92 
MB/s, averaging 80+MB/s.
What I couldn't investigate yet is why I always get 10% more throughput 
when one side is Windows, no matter which direction, no matter what 
application (cifs, rsync, ftp).
Another big point on my todo list is to find out why tcp.inflight very often 
throttles my webserver downloads to less than a quarter of the 
available bandwidth (client bw; server bw is 100mb). I saw many 10mb/s E3 
pipes with 15ms delay, but transfer rates limited to 200kb/s. Disabling 
tcp.inflight releases that brake.


Since I could only capture "bad IP length" packets with checksum 
offloading enabled on the em, I guess it also does IP checksum 
offloading, not only TCP. I'll have to set up a port mirroring test bed 
to capture the packets on the wire. I hope I'll find the cause of the 
rfc1323 slowdown and of the difference between FreeBSD clients and Windows 
clients. But at the moment I don't have spare equipment.


Greets,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-03-02 Thread Harald Schmalzbauer

Alexander Motin wrote on 23.02.2010 16:10 (local time):

Harald Schmalzbauer wrote:

I'm frequently getting my machine locked with ahcichX timeouts:
ahcich2: Timeout on slot 0
ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr

...


Looking that is (Interrupt status) is zero and `rs == cs | ss` (running
command bitmasks in driver and hardware), controller doesn't report
command completion. Looking on TFD status 0xc0 with BUSY bit set, I
would suppose that either disk stuck in command processing for some
reason, or controller missed command completion status.

Have you noticed 30 second (default ATA timeout) pause before timeout
message printed? Just want to be sure that driver waited enough before
give up.


This happens when backup over GbE overloads ZFS/HDD capabilities.
I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
up almost immediately, but from it still happens.
When I don't use ahci but ataahci (the old driver if I understand things
correct) I also see the ZFS burst write congestion, but this doesn't
lead to controller timeouts, thus blocking the machine.

Sometimes the machine recovers from the disk lock, but most often I have
to reboot.


How it looks when it doesn't? Can you send me full log messages?


Hello, this morning I had a stall, but the machine recovered after about 
one minute. Here's what I got from the kernel:

ahcich2: Timeout on slot 29
ahcich2: is  cs 0003 ss e003 rs e003 tfd c0 serr 


em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
ahcich2: Timeout on slot 10
ahcich2: is  cs 6000 ss 7c00 rs 7c00 tfd c0 serr 


ahcich2: Timeout on slot 18
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr 


ahcich2: Timeout on slot 2
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr 


ahcich2: Timeout on slot 2
ahcich2: is  cs  ss 000c rs 000c tfd 40 serr 



Does this tell you something useful?

Thanks,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-03-03 Thread Harald Schmalzbauer

Alexander Motin wrote on 03.03.2010 09:18 (local time):

Harald Schmalzbauer wrote:

Alexander Motin schrieb am 23.02.2010 16:10 (localtime):

Harald Schmalzbauer wrote:

I'm frequently getting my machine locked with ahcichX timeouts:
ahcich2: Timeout on slot 0
ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr

...

Looking that is (Interrupt status) is zero and `rs == cs | ss` (running
command bitmasks in driver and hardware), controller doesn't report
command completion. Looking on TFD status 0xc0 with BUSY bit set, I
would suppose that either disk stuck in command processing for some
reason, or controller missed command completion status.

Have you noticed 30 second (default ATA timeout) pause before timeout
message printed? Just want to be sure that driver waited enough before
give up.


This happens when backup over GbE overloads ZFS/HDD capabilities.
I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
up almost immediately, but from it still happens.
When I don't use ahci but ataahci (the old driver if I understand things
correct) I also see the ZFS burst write congestion, but this doesn't
lead to controller timeouts, thus blocking the machine.

Sometimes the machine recovers from the disk lock, but most often I have
to reboot.

How it looks when it doesn't? Can you send me full log messages?

Hello, this morning I had a stall, but the machine recovered after about
 one Minute. Here's what I got from the kernel:
ahcich2: Timeout on slot 29
ahcich2: is  cs 0003 ss e003 rs e003 tfd c0 serr

em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
ahcich2: Timeout on slot 10
ahcich2: is  cs 6000 ss 7c00 rs 7c00 tfd c0 serr

ahcich2: Timeout on slot 18
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs  ss 000c rs 000c tfd 40 serr


Does this tell you something useful?


It doesn't. Looking on logged register content - commands are indeed
still running and no interrupts requested. Interesting to see em1
watchdog timeout there. Aren't they related somehow?


dmesg | grep "irq 18":
uhci0:  port 0x20c0-0x20df irq 18 at 
device 26.0 on pci0
uhci4:  port 0x2040-0x205f irq 18 at 
device 29.2 on pci0
em1:  port 0x1000-0x103f 
mem 0xe192-0xe193,0xe190-0xe191 irq 18 at device 2.0 on pci3
ichsmb0:  port 0x2000-0x201f mem 
0xe1a22000-0xe1a220ff irq 18 at device 31.3 on pci0


They don't share the same IRQ at least.
dmesg | grep "irq 21"
uhci1:  port 0x20a0-0x20bf irq 21 at 
device 26.1 on pci0
ahci0:  port 
0x2408-0x240f,0x2414-0x2417,0x2400-0x2407,0x2410-0x2413,0x2020-0x203f 
mem 0xe1a21000-0xe1a217ff irq 21 at device 31.2 on pci0


The em1 has no cable attached. I get many of these em watchdog timeouts. 
I never thought they could be related to ahci. I'll see if the em watchdog 
timeouts happen in any relation to disk usage.


Thank you!

-Harry





Re: Incorrect super block

2010-03-06 Thread Harald Weis
On Mon, Feb 22, 2010 at 02:51:59PM -0800, Chris Knight wrote:
> This problem is caused by a big-endian, little-endian difference
> between the OSX implementation of UFS and the FreeBSD implementation.
> http://forums.macosxhints.com/showthread.php?t=86385

Yes, that's a good reason why both ufs1 and ufs2 don't work.

> 
> I solved this problem for myself by installing MacFuse

MacFuse is not yet available for Snow Leopard (10.6).

I've made some trials to understand the tar options and had a big
surprise yesterday: tar seems to have an enormous bug. For example:

tar -c -f etc.tar /etc
tar -r -f etc.tar /home/me/.icewm/
tar -u -f etc.tar /etc

The last command should not modify etc.tar.
But it does.
There seems to be no difference between the -r and -u options.
How on earth is this possible?

Harald


Re: ahcich timeouts, only with ahci, not with ataahci

2010-03-13 Thread Harald Schmalzbauer
On 03.03.2010 12:06, Jeremy Chadwick wrote:
> On Wed, Mar 03, 2010 at 09:28:25AM +0100, Harald Schmalzbauer wrote:
>> Alexander Motin schrieb am 03.03.2010 09:18 (localtime):
>>> Harald Schmalzbauer wrote:
>>>> Alexander Motin schrieb am 23.02.2010 16:10 (localtime):
>>>>> Harald Schmalzbauer wrote:
>>>>>> I'm frequently getting my machine locked with ahcichX timeouts:
>>>>>> ahcich2: Timeout on slot 0
>>>>>> ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr
>>>>>> 
>>>>>> ahcich2: Timeout on slot 8
>>>>>> ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr
>>>>>> 
>>>>>> ahcich2: Timeout on slot 8
>>>>>> ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr
>>>>>> 
>>>>>> ...
>>>>> Looking that is (Interrupt status) is zero and `rs == cs | ss` (running
>>>>> command bitmasks in driver and hardware), controller doesn't report
>>>>> command completion. Looking on TFD status 0xc0 with BUSY bit set, I
>>>>> would suppose that either disk stuck in command processing for some
>>>>> reason, or controller missed command completion status.
>>>>>
>>>>> Have you noticed 30 second (default ATA timeout) pause before timeout
>>>>> message printed? Just want to be sure that driver waited enough before
>>>>> give up.
>>>>>
>>>>>> This happens when backup over GbE overloads ZFS/HDD capabilities.
>>>>>> I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
>>>>>> up almost immediately, but from it still happens.
>>>>>> When I don't use ahci but ataahci (the old driver if I understand things
>>>>>> correct) I also see the ZFS burst write congestion, but this doesn't
>>>>>> lead to controller timeouts, thus blocking the machine.
>>>>>>
>>>>>> Sometimes the machine recovers from the disk lock, but most often I have
>>>>>> to reboot.
>>>>> How it looks when it doesn't? Can you send me full log messages?
>>>> Hello, this morning I had a stall, but the machine recovered after about
>>>> one Minute. Here's what I got from the kernel:
>>>> ahcich2: Timeout on slot 29
>>>> ahcich2: is  cs 0003 ss e003 rs e003 tfd c0 serr
>>>> 
>>>> em1: watchdog timeout -- resetting
>>>> em1: watchdog timeout -- resetting
>>>> ahcich2: Timeout on slot 10
>>>> ahcich2: is  cs 6000 ss 7c00 rs 7c00 tfd c0 serr
>>>> 
>>>> ahcich2: Timeout on slot 18
>>>> ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr
>>>> 
>>>> ahcich2: Timeout on slot 2
>>>> ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr
>>>> 
>>>> ahcich2: Timeout on slot 2
>>>> ahcich2: is  cs  ss 000c rs 000c tfd 40 serr
>>>> 
>>>>
>>>> Does this tell you something useful?
>>>
>>> It doesn't. Looking on logged register content - commands are indeed
>>> still running and no interrupts requested. Interesting to see em1
>>> watchdog timeout there. Aren't they related somehow?
>>
>>  dmesg | grep "irq 18":
>> uhci0:  port 0x20c0-0x20df irq
>> 18 at device 26.0 on pci0
>> uhci4:  port 0x2040-0x205f irq
>> 18 at device 29.2 on pci0
>> em1:  port
>> 0x1000-0x103f mem 0xe192-0xe193,0xe190-0xe191 irq 18
>> at device 2.0 on pci3
>> ichsmb0:  port 0x2000-0x201f
>> mem 0xe1a22000-0xe1a220ff irq 18 at device 31.3 on pci0
>>
>> The don't share the same IRQ at least.
>> dmesg | grep "irq 21"
>> uhci1:  port 0x20a0-0x20bf irq
>> 21 at device 26.1 on pci0
>> ahci0:  port
>> 0x2408-0x240f,0x2414-0x2417,0x2400-0x2407,0x2410-0x2413,0x2020-0x203f
>> mem 0xe1a21000-0xe1a217ff irq 21 at device 31.2 on pci0
>>
>> The em1 has no cable attached. I get many of these em watchdog
>> timeouts. Never thought they could be related to ahci. I'll see if
>> the em watchdog timeouts happens in any relation to disk usage.
> 
> Please provide output from the commands I provided.  dmesg|grep is not
> sufficient for helping

Re: ahcich timeouts, only with ahci, not with ataahci

2010-03-14 Thread Harald Schmalzbauer

Harald Schmalzbauer wrote on 13.03.2010 22:27 (local time):

Am 03.03.2010 12:06, schrieb Jeremy Chadwick:

On Wed, Mar 03, 2010 at 09:28:25AM +0100, Harald Schmalzbauer wrote:

Alexander Motin schrieb am 03.03.2010 09:18 (localtime):

Harald Schmalzbauer wrote:

Alexander Motin schrieb am 23.02.2010 16:10 (localtime):

Harald Schmalzbauer wrote:

I'm frequently getting my machine locked with ahcichX timeouts:
ahcich2: Timeout on slot 0
ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr

...

Looking that is (Interrupt status) is zero and `rs == cs | ss` (running
command bitmasks in driver and hardware), controller doesn't report
command completion. Looking on TFD status 0xc0 with BUSY bit set, I
would suppose that either disk stuck in command processing for some
reason, or controller missed command completion status.

Have you noticed 30 second (default ATA timeout) pause before timeout
message printed? Just want to be sure that driver waited enough before
give up.


This happens when backup over GbE overloads ZFS/HDD capabilities.
I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
up almost immediately, but from it still happens.
When I don't use ahci but ataahci (the old driver if I understand things
correct) I also see the ZFS burst write congestion, but this doesn't
lead to controller timeouts, thus blocking the machine.

Sometimes the machine recovers from the disk lock, but most often I have
to reboot.

How it looks when it doesn't? Can you send me full log messages?

Hello, this morning I had a stall, but the machine recovered after about
one Minute. Here's what I got from the kernel:
ahcich2: Timeout on slot 29
ahcich2: is  cs 0003 ss e003 rs e003 tfd c0 serr

em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
ahcich2: Timeout on slot 10
ahcich2: is  cs 6000 ss 7c00 rs 7c00 tfd c0 serr

ahcich2: Timeout on slot 18
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs  ss 000c rs 000c tfd 40 serr


Does this tell you something useful?

It doesn't. Looking at the logged register content, commands are indeed
still running and no interrupts are requested. It is interesting to see the
em1 watchdog timeout there. Aren't they related somehow?

dmesg | grep "irq 18":
uhci0:  port 0x20c0-0x20df irq 18 at device 26.0 on pci0
uhci4:  port 0x2040-0x205f irq 18 at device 29.2 on pci0
em1:  port 0x1000-0x103f mem 0xe192-0xe193,0xe190-0xe191 irq 18 at device 2.0 on pci3
ichsmb0:  port 0x2000-0x201f mem 0xe1a22000-0xe1a220ff irq 18 at device 31.3 on pci0

They don't share the same IRQ, at least.
dmesg | grep "irq 21":
uhci1:  port 0x20a0-0x20bf irq 21 at device 26.1 on pci0
ahci0:  port 0x2408-0x240f,0x2414-0x2417,0x2400-0x2407,0x2410-0x2413,0x2020-0x203f mem 0xe1a21000-0xe1a217ff irq 21 at device 31.2 on pci0
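
The same comparison can be done mechanically. The following Python sketch groups devices by IRQ from dmesg-style text. It assumes each device entry sits on a single line (wrapped lines joined), and the helper names `devices_by_irq` and `share_irq` are invented for this example.

```python
import re
from collections import defaultdict

# Matches lines such as "em1: ... irq 18 at device 2.0 on pci3".
IRQ_LINE = re.compile(r"^(\w+):.*\birq (\d+) at device")


def devices_by_irq(dmesg_text):
    """Map each IRQ number to the list of device names attached to it."""
    groups = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = IRQ_LINE.match(line)
        if match:
            groups[int(match.group(2))].append(match.group(1))
    return dict(groups)


def share_irq(groups, dev_a, dev_b):
    """True if both devices appear under the same IRQ."""
    return any(dev_a in devs and dev_b in devs for devs in groups.values())


# Lines taken from the dmesg output quoted above, joined onto one line each.
sample = """\
uhci0:  port 0x20c0-0x20df irq 18 at device 26.0 on pci0
uhci4:  port 0x2040-0x205f irq 18 at device 29.2 on pci0
em1:  port 0x1000-0x103f mem 0xe192-0xe193,0xe190-0xe191 irq 18 at device 2.0 on pci3
ichsmb0:  port 0x2000-0x201f mem 0xe1a22000-0xe1a220ff irq 18 at device 31.3 on pci0
uhci1:  port 0x20a0-0x20bf irq 21 at device 26.1 on pci0
ahci0:  port 0x2408-0x240f,0x2414-0x2417,0x2400-0x2407,0x2410-0x2413,0x2020-0x203f mem 0xe1a21000-0xe1a217ff irq 21 at device 31.2 on pci0
"""

groups = devices_by_irq(sample)
print(groups)
print(share_irq(groups, "em1", "ahci0"))
```

On this input em1 ends up on IRQ 18 together with uhci0, uhci4 and ichsmb0, while ahci0 is on IRQ 21 with uhci1.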

The em1 has no cable attached. I get many of these em watchdog
timeouts. I never thought they could be related to ahci. I'll see whether
the em watchdog timeouts happen in any relation to disk usage.

Please provide output from the commands I provided.  dmesg|grep is not
sufficient for helping track this down, specifically with regards to the
em1 watchdog timeouts.


Sorry for the delay, here are the details:
hos...@pci0:0:0:0:  class=0x06 card=0x34d08086 chip=0x29f08086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '3200 Chipset (Bearlake) Processor to I/O Controller'
    class      = bridge
    subclass   = HOST-PCI
e...@pci0:0:25:0:  class=0x02 card=0x34d08086 chip=0x10bd8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82566DM Gigabit Ethernet Adapter (82566DM)'
    class      = network
    subclass   = ethernet
uh...@pci0:0:26:0:  class=0x0c0300 card=0x34d08086 chip=0x29378086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801IB/IR/IH (ICH9 Family) USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
uh...@pci0:0:26:1:  class=0x0c0300 card=0x34d08086 chip=0x29388086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801IB/IR/IH (ICH9 Family) USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
eh...@pci0:0:26:7:  class=0x0c0320 card=0x34d08086 chip=0x293c8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'

ahc parity errors with RELENG_8 from today vs. 4 weeks ago

2010-03-14 Thread Harald Schmalzbauer

Hello,

Today I updated one -stable machine to the current RELENG_8, and now I
see hundreds of lines like the excerpt below. I have never seen this before.
There's a DAT72 drive at 0:14:0.

Do I have problems with my drive or with the new kernel?

(probe28:ahc1:0:14:0): parity error detected in Data-in phase. SEQADDR(0x6c) SCSIRATE(0x93)

ahc1: Recovery Initiated
>> Dump Card State Begins <
ahc1: Dumping Card State in Data-in phase, at SEQADDR 0x54
Card was paused
ACCUM = 0x40, SINDEX = 0x8a, DINDEX = 0xe4, ARG_2 = 0x3c
HCNT = 0x20 SCBPTR = 0x0
SCSIPHASE[0x4]:(MSG_OUT_PHASE) SCSISIGI[0xb6]:(REQI|BSYI|ATNI|MSGI|CDI)
ERROR[0x0] SCSIBUSL[0x20] LASTPHASE[0x40]:(IOI) 
SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)

SBLKCTL[0xa]:(SELWIDE|SELBUSB) SCSIRATE[0x93]:(SINGLE_EDGE|WIDEXFER)
SEQCTL[0x10]:(FASTMODE) SEQ_FLAGS[0x20]:(DPHASE) SSTAT0[0x0]
SSTAT1[0x1]:(REQINIT) SSTAT2[0x40]:(SHVALID) SSTAT3[0x1]
SIMODE0[0x8]:(ENSWRAP) 
SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)

SXFRCTL0[0x80]:(DFON) DFCNTRL[0x28]:(HDMAEN|SCSIEN)
DFSTATUS[0x80]:(PRELOAD_AVAIL)
STACK: 0x85 0x85 0x85 0x180
SCB count = 254
Kernel NEXTQSCB = 238
Card NEXTQSCB = 238
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 
20 21 22 23 24 25 26 27 28 29 30 31

Sequencer SCB Info:
  0 SCB_CONTROL[0x0] SCB_SCSIID[0xe7]:(TWIN_CHNLB) SCB_LUN[0x0]
SCB_TAG[0xf0]
  1 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  2 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  3 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  4 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  5 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  6 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  7 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  8 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
  9 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 10 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 11 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 12 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 13 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 14 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 15 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 16 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 17 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 18 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 19 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 20 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 21 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 22 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 23 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 24 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 25 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 26 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 27 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 28 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 29 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 30 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
 31 SCB_CONTROL[0x0] SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) SCB_TAG[0xff]
Pend

Re: ahci errors on 8-stable

2010-03-15 Thread Harald Schmalzbauer

Nenhum_de_Nos schrieb am 09.03.2010 00:44 (localtime):

On Mon, 8 Mar 2010 14:26:53 -0800
Jeremy Chadwick  wrote:


On Mon, Mar 08, 2010 at 06:38:02PM -0300, Nenhum_de_Nos wrote:

I've seen these errors in a production machine in deep disk load (scp and
bsdtar in heavy use):

Please provide the output from the following commands:


Since there was heavy disk activity when those messages appeared, I rebooted
afterwards, and now there are no more of them. I think the vmstat command
should be issued while the problem is happening, right? (If so, I can run the
backup tars and see what happens.)


What disks do you use?
I have similar timeouts, and mav suspects the HD firmware to be the
culprit:
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2010-02/msg00737.html

In my case it's the Samsung EcoGreen SpinPoint F2 1.5TB, firmware
1AG01113 and 1AG01118. The disk on ahcich2 (where the timeouts appear)
has the newer firmware.

-Harry





Re: ahci errors on 8-stable

2010-03-17 Thread Harald Schmalzbauer

Nenhum_de_Nos schrieb am 16.03.2010 02:01 (localtime):

On Mon, March 15, 2010 14:21, Harald Schmalzbauer wrote:

Nenhum_de_Nos schrieb am 09.03.2010 00:44 (localtime):

On Mon, 8 Mar 2010 14:26:53 -0800
Jeremy Chadwick  wrote:


On Mon, Mar 08, 2010 at 06:38:02PM -0300, Nenhum_de_Nos wrote:

I've seen these errors in a production machine in deep disk load (scp
and
bsdtar in heavy use):

Please provide the output from the following commands:

Since there was heavy disk activity when those messages appeared, I rebooted
afterwards, and now there are no more of them. I think the vmstat command
should be issued while the problem is happening, right? (If so, I can run the
backup tars and see what happens.)

What disks do you use?
I have similar timeouts, and mav suspects the HD firmware to be the
culprit:
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2010-02/msg00737.html

In my case it's the Samsung EcoGreen SpinPoint F2 1.5TB, firmware
1AG01113 and 1AG01118. The disk on ahcich2 (where the timeouts appear)
has the newer firmware.


2 Seagate 1TB disks:

Mar  8 13:49:45 optimus kernel: ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
Mar  8 13:49:45 optimus kernel: ada1:  ATA-8 SATA 2.x device
Mar  8 13:49:45 optimus kernel: ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO size 8192bytes)
Mar  8 13:49:45 optimus kernel: ada1: Command Queueing enabled
Mar  8 13:49:45 optimus kernel: ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
Mar  8 13:49:45 optimus kernel: ada2 at ahcich3 bus 0 scbus3 target 0 lun 0
Mar  8 13:49:45 optimus kernel: ada2:  ATA-8 SATA 2.x device
Mar  8 13:49:45 optimus kernel: ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO size 8192bytes)
Mar  8 13:49:45 optimus kernel: ada2: Command Queueing enabled
Mar  8 13:49:45 optimus kernel: ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
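
The size in the kernel lines above can be cross-checked from the sector count. The short Python sketch below does the arithmetic; the function name is made up for this example, and it assumes the "MB" in the kernel message means units of 2^20 bytes, which is what the numbers work out to.

```python
SECTOR_BYTES = 512          # sector size reported in the log line
SECTORS = 1953525168        # sector count reported in the log line


def capacity(sectors, sector_bytes=SECTOR_BYTES):
    """Return (bytes, whole units of 2**20 bytes, decimal terabytes)."""
    total_bytes = sectors * sector_bytes
    return total_bytes, total_bytes // 2**20, total_bytes / 10**12


total_bytes, mib, tb = capacity(SECTORS)
print(total_bytes)   # 1000204886016
print(mib)           # 953869
print(round(tb, 3))  # 1.0
```

So the "953869MB" figure and the marketing size of 1TB describe the same disk, just in binary and decimal units.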

Are those known to be bad?


In my experience, these are reliable drives, and they are completely different
from mine. So I think it's not likely to be a firmware bug.
I hope the problem can be pinned down. If there's anything I can help with,
please let me know.


Thanks,

-Harry






Does zfs have its own nfs server?

2010-03-17 Thread Harald Schmalzbauer

Hello,

I observed some very strange filesystem security problems.
Now I found that if I set sharenfs=yes on data/pub, I can mount_nfs, but it
doesn't respect any settings in /etc/exports. I also get very strange uid
numbers when writing.

If I turn sharenfs off, limitations in /etc/exports work as expected.
I thought sharenfs and sharesmb are only working on OpenSolaris. What 
about shareiscsi?


Thanks,

-Harry





Strong elcheapo-ZFS-disk recommendation [Was: Re: ahcich timeouts, only with ahci, not with ataahci]

2010-03-25 Thread Harald Schmalzbauer

Harald Schmalzbauer schrieb am 14.03.2010 12:12 (localtime):

Harald Schmalzbauer schrieb am 13.03.2010 22:27 (localtime):

Am 03.03.2010 12:06, schrieb Jeremy Chadwick:

On Wed, Mar 03, 2010 at 09:28:25AM +0100, Harald Schmalzbauer wrote:

Alexander Motin schrieb am 03.03.2010 09:18 (localtime):

Harald Schmalzbauer wrote:

Alexander Motin schrieb am 23.02.2010 16:10 (localtime):

Harald Schmalzbauer wrote:

I'm frequently getting my machine locked with ahcichX timeouts:
ahcich2: Timeout on slot 0
ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr

...

Given that is (Interrupt status) is zero and `rs == cs | ss` (the running
command bitmasks in the driver and the hardware), the controller doesn't report
command completion. Looking at TFD status 0xc0 with the BUSY bit set, I
would suppose that either the disk got stuck in command processing for some
reason, or the controller missed the command completion status.

Have you noticed a 30-second (default ATA timeout) pause before the timeout
message is printed? I just want to be sure that the driver waited long enough
before giving up.


This happens when backup over GbE overloads ZFS/HDD capabilities.
I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
up almost immediately, but it still happens.
When I don't use ahci but ataahci (the old driver, if I understand things
correctly), I also see the ZFS burst write congestion, but this doesn't
lead to controller timeouts that block the machine.

Sometimes the machine recovers from the disk lock, but most often I have
to reboot.

How does it look when it doesn't? Can you send me the full log messages?

Hello, this morning I had a stall, but the machine recovered after about
one minute. Here's what I got from the kernel:
ahcich2: Timeout on slot 29
ahcich2: is  cs 0003 ss e003 rs e003 tfd c0 serr

em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
ahcich2: Timeout on slot 10
ahcich2: is  cs 6000 ss 7c00 rs 7c00 tfd c0 serr

ahcich2: Timeout on slot 18
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs  ss 000c rs 000c tfd 40 serr


Does this tell you something useful?

It doesn't. Looking at the logged register content, commands are indeed
still running and no interrupts are requested. It is interesting to see the
em1 watchdog timeout there. Aren't they related somehow?

dmesg | grep "irq 18":
uhci0:  port 0x20c0-0x20df irq 18 at device 26.0 on pci0
uhci4:  port 0x2040-0x205f irq 18 at device 29.2 on pci0
em1:  port 0x1000-0x103f mem 0xe192-0xe193,0xe190-0xe191 irq 18 at device 2.0 on pci3
ichsmb0:  port 0x2000-0x201f mem 0xe1a22000-0xe1a220ff irq 18 at device 31.3 on pci0

They don't share the same IRQ, at least.

...
For the record: I replaced the Samsung F2 1.5TB 5200rpm EcoGreen drives.
In my dreams, that should have improved my 3-disk RAIDZ from 33MB/s on average
(>5G transfers) to about 60MB/s.
In reality, it improved it to 90MB/s, _and_ completely eliminated the
ahcich timeouts, as well as the burst writes where the complete machine
got stuck while ZFS flushed/wrote transaction groups.
So the difference in ZFS usage between the disks is far beyond my
imagination.
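
To put the three figures side by side, here is a tiny Python calculation of the ratios. Only the 33, 60 and 90 MB/s numbers come from the message; the function name is made up for this example.

```python
def speedup(before_mb_s, after_mb_s):
    """Ratio of the new throughput to the old one."""
    return after_mb_s / before_mb_s


old, hoped, measured = 33, 60, 90   # MB/s, figures quoted in the message

hoped_ratio = round(speedup(old, hoped), 2)
measured_ratio = round(speedup(old, measured), 2)
print(hoped_ratio)     # 1.82
print(measured_ratio)  # 2.73
```

The measured result is roughly 2.7 times the old average, against the hoped-for factor of about 1.8.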

I can highly recommend the:
=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Deskstar 7K2000
Device Model:     Hitachi HDS722020ALA330
Serial Number:    JK1174YAH9ZH7W
Firmware Version: JKAOA28A
User Capacity:    2,000,398,934,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Thu Mar 25 23:48:13 2010 CET

Some TB restored so far, no errors, no oddities, no problems at all. 
Same server, same FreeBSD, but ahci.ko enabled again (so with NCQ, 
thanks mav and friends).


I can confirm that the Samsung F2 drives worked fine with the old ata
driver (that is, without NCQ enabled) and ZFS. They did their job for 2
weeks without any error, but reproducibly showed ahcich
timeouts (with the newer ahci.ko) when the load was higher than about 50MB/s
on a raidz with 3 disks (same ICH9).
So I got my problem solved by replacing my HDDs (even the old ones had
the latest firmware) and also got triple performance :))


Just to share the info.

Thanks,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-03-29 Thread Harald Schmalzbauer

Alexander Motin schrieb am 03.03.2010 09:18 (localtime):

Harald Schmalzbauer wrote:

Alexander Motin schrieb am 23.02.2010 16:10 (localtime):

Harald Schmalzbauer wrote:

I'm frequently getting my machine locked with ahcichX timeouts:
ahcich2: Timeout on slot 0
ahcich2: is  cs 0001 ss  rs 0001 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs 0100 ss  rs 0100 tfd c0 serr

ahcich2: Timeout on slot 8
ahcich2: is  cs f07f ss ff7f rs ff7f tfd c0 serr

...

Given that is (Interrupt status) is zero and `rs == cs | ss` (the running
command bitmasks in the driver and the hardware), the controller doesn't report
command completion. Looking at TFD status 0xc0 with the BUSY bit set, I
would suppose that either the disk got stuck in command processing for some
reason, or the controller missed the command completion status.

Have you noticed a 30-second (default ATA timeout) pause before the timeout
message is printed? I just want to be sure that the driver waited long enough
before giving up.


This happens when backup over GbE overloads ZFS/HDD capabilities.
I reduced vfs.zfs.txg.timeout to 1 to prevent the machine from locking
up almost immediately, but it still happens.
When I don't use ahci but ataahci (the old driver, if I understand things
correctly), I also see the ZFS burst write congestion, but this doesn't
lead to controller timeouts that block the machine.

Sometimes the machine recovers from the disk lock, but most often I have
to reboot.

How does it look when it doesn't? Can you send me the full log messages?

Hello, this morning I had a stall, but the machine recovered after about
one minute. Here's what I got from the kernel:
ahcich2: Timeout on slot 29
ahcich2: is  cs 0003 ss e003 rs e003 tfd c0 serr

em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
ahcich2: Timeout on slot 10
ahcich2: is  cs 6000 ss 7c00 rs 7c00 tfd c0 serr

ahcich2: Timeout on slot 18
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs 0004 ss  rs 0004 tfd c0 serr

ahcich2: Timeout on slot 2
ahcich2: is  cs  ss 000c rs 000c tfd 40 serr


Does this tell you something useful?


It doesn't. Looking at the logged register content, commands are indeed
still running and no interrupts are requested. It is interesting to see the
em1 watchdog timeout there. Aren't they related somehow?


I now have the drives running in another server with an ICH7 chipset.
Using UFS, the complete machine locks up for ~30 secs with a disk load of
3.5MB/s. But I don't get any timeout messages, and the machine has always
recovered.

Changing to the old ata driver solves the problem.
Any chance to get this problem fixed? I couldn't see lockups on another 
OS with NCQ in AHCI mode enabled. I'd ship such a disk to anyone who is 
willing to debug.


Thanks,

-Harry





Re: ahcich timeouts, only with ahci, not with ataahci

2010-03-30 Thread Harald Schmalzbauer

Alexander Motin schrieb am 29.03.2010 21:25 (localtime):

Harald Schmalzbauer wrote:

I now have the drives running in another server with an ICH7 chipset.
Using UFS, the complete machine locks up for ~30 secs with a disk load of
3.5MB/s. But I don't get any timeout messages, and the machine has always
recovered.


Most ICH7s do not support AHCI. What about yours?


It does, it's a FujitsuSiemens Server and has ERST-II (LSI Software 
RAID) along with AHCI.



Changing to the old ata driver solves the problem.

...

Any chance to get this problem fixed? I couldn't see lockups on another
OS with NCQ in AHCI mode enabled. I'd ship such a disk to anyone who is
willing to debug.


It's difficult to fix something until the problem can be reproduced.


I understand!
My report of the machine locking up at 3.5MB/s was wrong, sorry. The culprit
was not the HD but an intermediate router...
But there is still the problem that ZFS stalls if I use these drives
with ahci, but not with ataahci.



If you wish to send drive - my address is:
Topol-2, b34, f150, Dnepropetrovsk, 49040, Ukraine.
Phone: +380503622312.
Do not use courier services, only regular mail. Ask for tracking number.


Can you use such a drive? I mean for yourself. If yes, then I'll ship 
it, but if you say "na, thanks, no such crap" then I don't want to waste 
your time and highly appreciated skills to bother with vendor-specific 
problems.


Thanks,

-Harry


--
OmniLAN - UNIX & Windows Netze + Systeme
Harald Schmalzbauer
Flintsbacher Str. 3
80686 München
+49 (0) 89 18947781
+49 (0) 160 93860101
USt-IdNr.: DE253184753
http://www.omnilan.de/





em regression, UDP LOR followed by ssh stall

2010-04-16 Thread Harald Schmalzbauer

Hello,

With RELENG_8 from 6 weeks ago, I never ran into the problem that my ssh
connection stalled.
With today's RELENG_8, it reproducibly hangs at first login. After some
time I can open another ssh session, which seems to stay up without
problems, but the first session always dies a few seconds after login.

here's a LOR:

lock order reversal:
 1st 0xff0001801018 em0:rx(1) (em0:rx(1)) @ /usr/src/sys/dev/e1000/if_em.c:4057
 2nd 0x80938908 udp (udp) @ /usr/src/sys/netinet/udp_usrreq.c:474
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_rw_rlock() at _rw_rlock+0x58
udp_input() at udp_input+0x1cd
ip_input() at ip_input+0xb3
netisr_dispatch_src() at netisr_dispatch_src+0x9e
ether_demux() at ether_demux+0x176
ether_input() at ether_input+0x176
em_rxeof() at em_rxeof+0x175
em_msix_rx() at em_msix_rx+0x22
intr_event_execute_handlers() at intr_event_execute_handlers+0x67
ithread_loop() at ithread_loop+0xae
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff8075507d30, rbp = 0 ---
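
For readers unfamiliar with this kind of report: a lock order reversal is flagged when two locks are taken in the opposite order from one seen earlier. The toy Python checker below only illustrates that idea. It is not how the kernel's WITNESS code is implemented, the class and method names are invented, and the "earlier" code path that takes the udp lock first is purely hypothetical; only the lock names are copied from the report above.

```python
class LockOrderChecker:
    """Toy checker that remembers every (held, acquired) pair it has seen."""

    def __init__(self):
        self.orders = set()

    def acquire(self, held, new):
        """Record taking `new` while the locks in `held` are owned.

        Returns the (held_lock, new) pairs that contradict an order
        recorded earlier.
        """
        reversals = [(h, new) for h in held if (new, h) in self.orders]
        for h in held:
            self.orders.add((h, new))
        return reversals


checker = LockOrderChecker()
# Hypothetical earlier code path: udp lock held, then the em0 rx lock taken.
first = checker.acquire(held=["udp"], new="em0:rx")
# Path from the backtrace: em0 rx lock held, then the udp lock taken.
second = checker.acquire(held=["em0:rx"], new="udp")
print(first)
print(second)
```

The first acquisition establishes an order and reports nothing; the second one contradicts it and is reported, with the lock held first listed first, as in the "1st"/"2nd" lines of the report.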

FreeBSD 8.0-STABLE #0: Fri Apr 16 10:34:36 CEST 2010 ad...@korso.rzhp.omnilan.net:/usr/obj/usr/src/sys/ILZ-S32  amd64


em0: flags=8843 metric 0 mtu 1500
options=399b
ether 00:1b:21:3e:90:52
inet 10.21.0.2 netmask 0xff00 broadcast 10.21.0.255
media: Ethernet autoselect (1000baseT )
status: active

em0:  port 0x2000-0x201f mem 0xe1a8-0xe1a9,0xe1a0-0xe1a7,0xe1aa-0xe1aa3fff irq 16 at device 0.0 on pci1

em0: Using MSIX interrupts with 5 vectors
em0: [ITHREAD]
em0: [ITHREAD]
em0: [ITHREAD]
em0: [ITHREAD]
em0: [ITHREAD]
em0: Ethernet address: 00:1b:21:3e:90:52


Thanks for any help




