Re: Issue with igb and lagg (was Re: Problem with link aggregation + sshd)

2012-09-13 Thread Giulio Ferro

On 09/12/2012 10:51 PM, Freddie Cash wrote:

On Wed, Sep 12, 2012 at 1:48 PM, Jack Vogel  wrote:

On Wed, Sep 12, 2012 at 12:40 PM, Freddie Cash  wrote:

Thanks for checking.  I've used lagg(4) with igb, just not on 9.x.

You're right, it seems to be pointing to the igb(4) driver in 9.x
compared to < 9.0.


How do you determine that since it doesn't happen without lagg?  I've no
reports of igb hanging otherwise and its being used extensively.


Well, I did say "seems to".  :)

igb+lagg worked for us on 8.3.  Haven't tried it since moving to 9.0
and 9-STABLE on those three boxes.

igb+lagg doesn't work for him on 9.0.  Although, I don't recall if
non-LACP options were tried earlier in this thread, or if it's just
the LACP mode that's failing.  If one mode works (say failover) and
LACP mode doesn't, that "seems to" point to lagg.



Sorry, I forgot to mention it. I tried both failover and LACP: neither
worked. The switch is a Dell PowerConnect 6248 with ports configured for
aggregation.


I first tried on a 9.1 prerelease, then on a 9.0 release to have
everything clean. On both, ssh becomes unresponsive and unkillable,
both as server and as client.

The problem might also lie within ssh/d, but I somehow doubt it.
I haven't tried other network services.
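For anyone repeating the failover test mentioned above, a minimal /etc/rc.conf sketch could look like this. It assumes the same port names (igb1 to igb3) and the placeholder address used elsewhere in this thread; only the laggproto keyword differs from the LACP setup. Whether the switch-side aggregation has to be removed for this test is an assumption that was not verified here.

```sh
# /etc/rc.conf fragment (sketch): same member ports as in the thread,
# but with the failover protocol instead of LACP.
ifconfig_igb1="up"
ifconfig_igb2="up"
ifconfig_igb3="up"

cloned_interfaces="lagg0"
# 192.168.x.x/24 is the placeholder address from the thread, not a real value.
ifconfig_lagg0="laggproto failover laggport igb1 laggport igb2 laggport igb3 192.168.x.x/24"
```

If this mode hangs sshd in the same way, the LACP negotiation itself is less likely to be the trigger.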
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Issue with igb and lagg (was Re: Problem with link aggregation + sshd)

2012-09-12 Thread Giulio Ferro

On 09/11/2012 11:34 PM, Freddie Cash wrote:


On Sep 11, 2012 2:12 PM, "Giulio Ferro" <au...@zirakzigil.org> wrote:
 >
 > Well, there definitely seems to be a problem with igb and lagg.
 >
 > igb alone works as it should, but doesn't seem to work properly in lagg.
 >
 > To be sure I started from scratch from a 9.0 release with nothing but:
 >
 > /etc/rc.conf
 > ---
 > ifconfig_igb0="inet ..."
 >
 > ifconfig_igb1="up"
 > ifconfig_igb2="up"
 > ifconfig_igb3="up"
 >
 > cloned_interfaces="lagg0"
 > ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.x.x/24"
 >
 > sshd_enable="YES"
 > ---
 >
 > This doesn't even manage to start sshd, it just hangs there at boot.
 >
 > With the lagg configuration disabled, everything works correctly.
 >

Just curious: does it work if you split the lagg configuration from the
IP config:

ifconfig_lagg0="laggproto ..."
ifconfig_lagg0_alias0="inet 192..."

I've had problems in the past with cloned interfaces not working right
if you do everything in one ifconfig line. Never spent much time
debugging it, though, as the split config always worked.
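Spelled out in full, the split configuration suggested above might look like the following sketch. The member ports and the placeholder address are taken from the configuration quoted earlier in this message; the elided parts of the suggestion ("...") are filled in on that assumption only.

```sh
# /etc/rc.conf fragment (sketch): create the lagg first, then add the
# address through a separate alias line instead of one long ifconfig line.
ifconfig_igb1="up"
ifconfig_igb2="up"
ifconfig_igb3="up"

cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3"
# 192.168.x.x/24 is the placeholder used in the thread.
ifconfig_lagg0_alias0="inet 192.168.x.x/24"
```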



Nope, that doesn't work either. It always hangs at boot and cannot be killed
(FreeBSD 9.0-RELEASE).


I still think the problem is with lagg and / or igb.
Someone should look into it.




Issue with igb and lagg (was Re: Problem with link aggregation + sshd)

2012-09-11 Thread Giulio Ferro

Well, there definitely seems to be a problem with igb and lagg.

igb alone works as it should, but doesn't seem to work properly in lagg.

To be sure I started from scratch from a 9.0 release with nothing but:

/etc/rc.conf
---
ifconfig_igb0="inet ..."

ifconfig_igb1="up"
ifconfig_igb2="up"
ifconfig_igb3="up"

cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.x.x/24"


sshd_enable="YES"
---

This doesn't even manage to start sshd, it just hangs there at boot.

With the lagg configuration disabled, everything works correctly.

This installation is a zfs root, but I don't think this has anything to
do with this.


Yes, I think that the maintainer of igb and/or lagg driver should
absolutely look into this...


On 09/07/2012 12:01 PM, Simon Dick wrote:

We've had similar problems with lagg at work, each lagg is made up of
one igb and one em port, sometimes for no apparent reason they seem to
stop passing through traffic. The easiest way we've found to get it
working again is ifconfig down and up on one of the physical
interfaces. This is on 8.1
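The workaround described above (ifconfig down and up on one physical member) can be sketched as a small shell helper. The function names and the DRY_RUN switch are illustrative and not part of any FreeBSD tool; the port name is whatever member you choose to bounce.

```sh
# Print commands instead of running them when DRY_RUN is set
# (illustrative helper, handy for checking what would be executed).
run() { ${DRY_RUN:+echo} "$@"; }

# Take one lagg member port down and bring it back up.
bounce_port() {
    run ifconfig "$1" down
    run ifconfig "$1" up
}
```

As root, "bounce_port igb1" would cycle igb1; setting DRY_RUN=1 first only prints the two ifconfig commands.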

On 3 September 2012 19:25, Giulio Ferro  wrote:

Does anybody have any idea why this bug happens? Any patches?



On 08/29/2012 10:22 PM, Giulio Ferro wrote:


On 08/28/2012 11:12 AM, Damien Fleuriot wrote:


Hi Giulio,



Just to clear things up:
igb0: 192.168.9.60/24
lagg0: 192.168.12.21/24



Yes.
Actually I notice now that the lagg0 address is different from what
I wrote below in my rc.conf (192.168.12.7). I've run many tests
with different configurations, but no matter what, it just doesn't work...




What's the IP of the host you're trying ssh connections from ?



I'm just trying to connect to and from management interface igb0
(192.168.9.60).
  From external pc I do : ssh myuser@192.168.9.60
  From that server I do : ssh myuser@pcaddress

Just to be more precise, the consequences are:
1) daemon sshd on the server gets stuck and becomes unkillable
2) the first connection may work, but then the program ssh on the
server becomes unresponsive and unkillable

If I don't create a lagg0 interface and just connect (say) igb1 to
the data switch, I've no problem and everything works.

Just to answer others' question, I connect igb1, igb2 and igb3 to the
same data switch in ports configured for aggregation.
I connect igb0 to another management switch (of course not configured
for aggregation)




Also, just in case, did you enable any firewall ? (PF, ipfw)



As I already said, no. Nothing is working/active on this server, just
sshd.

Thank you.






On 27 August 2012 21:22, Giulio Ferro  wrote:


Hi, thanks for the answer

Here is what you asked for:

# ifconfig igb0
igb0: flags=8843 metric 0 mtu 1500


options=4401bb

ether ...
inet 192.168.9.60 netmask 0xff00 broadcast 192.168.9.255
  inet6  prefixlen 64 scopeid 0x1
  nd6 options=29
  media: Ethernet autoselect (1000baseT )
  status: active



# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.9.1        UGS         0        0   igb0
127.0.0.1          link#12            UH          0        0    lo0
192.168.9.0/24     link#1             U           0       14   igb0
192.168.9.60       link#1             UHS         0        0    lo0
192.168.12.0/24    link#13            U           0      109  lagg0
192.168.12.21      link#13            UHS         0        0    lo0

Internet6:
Destination                       Gateway                           Flags  Netif Expire
::/96                             ::1                               UGRS   lo0
::1                               link#12                           UH     lo0
:::0.0.0.0/96                     ::1                               UGRS   lo0
fe80::/10                         ::1                               UGRS   lo0
fe80::%igb0/64                    link#1                            U      igb0
fe80::ea39:35ff:feb6:a0d4%igb0    link#1                            UHS    lo0
fe80::%igb1/64                    link#2                            U      igb1
fe80::ea39:35ff:feb6:a0d5%igb1    link#2                            UHS    lo0
fe80::%igb2/64                    link#3                            U      igb2
fe80::ea39:35ff:feb6:a0d6%igb2    link#3                            UHS    lo0
fe80::%igb3/64                    link#4                            U      igb3
fe80::ea39:35ff:feb6:a0d7%igb3    link#4                            UHS    lo0
fe80::%lo0/64                     link#12                           U      lo0
fe80::1%lo0                       link#12                           UHS    lo0
fe80::%lagg0/64                   link#13                           U      lagg0
fe80::ea39:35ff:feb6:a0d5%lagg0   link#13                           UHS    lo0
ff01::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0    U      igb0
ff01::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1    U      igb1
ff01::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2    U      igb2
ff01::%igb

Re: Problem with link aggregation + sshd

2012-09-03 Thread Giulio Ferro

Does anybody have any idea why this bug happens? Any patches?


On 08/29/2012 10:22 PM, Giulio Ferro wrote:

On 08/28/2012 11:12 AM, Damien Fleuriot wrote:

Hi Giulio,



Just to clear things up:
igb0: 192.168.9.60/24
lagg0: 192.168.12.21/24



Yes.
Actually I notice now that the lagg0 address is different from what
I wrote below in my rc.conf (192.168.12.7). I've run many tests
with different configurations, but no matter what, it just doesn't work...




What's the IP of the host you're trying ssh connections from ?


I'm just trying to connect to and from management interface igb0
(192.168.9.60).
 From external pc I do : ssh myuser@192.168.9.60
 From that server I do : ssh myuser@pcaddress

Just to be more precise, the consequences are:
1) daemon sshd on the server gets stuck and becomes unkillable
2) the first connection may work, but then the program ssh on the
server becomes unresponsive and unkillable

If I don't create a lagg0 interface and just connect (say) igb1 to
the data switch, I've no problem and everything works.

Just to answer others' question, I connect igb1, igb2 and igb3 to the
same data switch in ports configured for aggregation.
I connect igb0 to another management switch (of course not configured
for aggregation)




Also, just in case, did you enable any firewall ? (PF, ipfw)


As I already said, no. Nothing is working/active on this server, just sshd.

Thank you.






On 27 August 2012 21:22, Giulio Ferro  wrote:

Hi, thanks for the answer

Here is what you asked for:

# ifconfig igb0
igb0: flags=8843 metric 0 mtu 1500

options=4401bb

ether ...
inet 192.168.9.60 netmask 0xff00 broadcast 192.168.9.255
 inet6  prefixlen 64 scopeid 0x1
 nd6 options=29
 media: Ethernet autoselect (1000baseT )
 status: active



# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.9.1        UGS         0        0   igb0
127.0.0.1          link#12            UH          0        0    lo0
192.168.9.0/24     link#1             U           0       14   igb0
192.168.9.60       link#1             UHS         0        0    lo0
192.168.12.0/24    link#13            U           0      109  lagg0
192.168.12.21      link#13            UHS         0        0    lo0

Internet6:
Destination                       Gateway                           Flags  Netif Expire
::/96                             ::1                               UGRS   lo0
::1                               link#12                           UH     lo0
:::0.0.0.0/96                     ::1                               UGRS   lo0
fe80::/10                         ::1                               UGRS   lo0
fe80::%igb0/64                    link#1                            U      igb0
fe80::ea39:35ff:feb6:a0d4%igb0    link#1                            UHS    lo0
fe80::%igb1/64                    link#2                            U      igb1
fe80::ea39:35ff:feb6:a0d5%igb1    link#2                            UHS    lo0
fe80::%igb2/64                    link#3                            U      igb2
fe80::ea39:35ff:feb6:a0d6%igb2    link#3                            UHS    lo0
fe80::%igb3/64                    link#4                            U      igb3
fe80::ea39:35ff:feb6:a0d7%igb3    link#4                            UHS    lo0
fe80::%lo0/64                     link#12                           U      lo0
fe80::1%lo0                       link#12                           UHS    lo0
fe80::%lagg0/64                   link#13                           U      lagg0
fe80::ea39:35ff:feb6:a0d5%lagg0   link#13                           UHS    lo0
ff01::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0    U      igb0
ff01::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1    U      igb1
ff01::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2    U      igb2
ff01::%igb3/32                    fe80::ea39:35ff:feb6:a0d7%igb3    U      igb3
ff01::%lo0/32                     ::1                               U      lo0
ff01::%lagg0/32                   fe80::ea39:35ff:feb6:a0d5%lagg0   U      lagg0
ff02::/16                         ::1                               UGRS   lo0
ff02::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0    U      igb0
ff02::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1    U      igb1
ff02::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2    U      igb2
ff02::%igb3/32                    fe80::ea39:35ff:feb6:a0d7%igb3    U      igb3
ff02::%lo0/32                     ::1                               U      lo0
ff02::%lagg0/32                   fe80::ea39:35ff:feb6:a0d5%lagg0   U      lagg0



# netstat -aln | grep 22
tcp4       0      0 *.22                   *.*                    LISTEN
tcp6       0      0 *.22                   *.*                    LISTEN

Note that I already tried to listen only on the igb0 interface
(192.168.9.60) in sshd_config, but the results are exactly the same
as described below.







On 08/25/2012 01:22 PM, Damien Fleuriot wrote:


In the meantime kindly post:


ifconfig for your igb0
netstat -rn
netstat -aln | grep 22



On 25 Aug 2012, at 13:18, Damien Fleuriot  wrote:


I'll get back to you regarding link aggregation when I'm done with
groceries.

We use it here in production and it works flawlessly.



On 25 Aug 2012, at 09:54, Giuli

Re: Problem with link aggregation + sshd

2012-08-29 Thread Giulio Ferro

On 08/28/2012 11:12 AM, Damien Fleuriot wrote:

Hi Giulio,



Just to clear things up:
igb0: 192.168.9.60/24
lagg0: 192.168.12.21/24



Yes.
Actually I notice now that the lagg0 address is different from what
I wrote below in my rc.conf (192.168.12.7). I've run many tests
with different configurations, but no matter what, it just doesn't work...




What's the IP of the host you're trying ssh connections from ?


I'm just trying to connect to and from management interface igb0
(192.168.9.60).
From external pc I do : ssh myuser@192.168.9.60
From that server I do : ssh myuser@pcaddress

Just to be more precise, the consequences are:
1) daemon sshd on the server gets stuck and becomes unkillable
2) the first connection may work, but then the program ssh on the
server becomes unresponsive and unkillable

If I don't create a lagg0 interface and just connect (say) igb1 to
the data switch, I've no problem and everything works.

Just to answer others' question, I connect igb1, igb2 and igb3 to the
same data switch in ports configured for aggregation.
I connect igb0 to another management switch (of course not configured
for aggregation)




Also, just in case, did you enable any firewall ? (PF, ipfw)


As I already said, no. Nothing is working/active on this server, just sshd.

Thank you.






On 27 August 2012 21:22, Giulio Ferro  wrote:

Hi, thanks for the answer

Here is what you asked for:

# ifconfig igb0
igb0: flags=8843 metric 0 mtu 1500

options=4401bb
ether ...
inet 192.168.9.60 netmask 0xff00 broadcast 192.168.9.255
 inet6  prefixlen 64 scopeid 0x1
 nd6 options=29
 media: Ethernet autoselect (1000baseT )
 status: active



# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.9.1        UGS         0        0   igb0
127.0.0.1          link#12            UH          0        0    lo0
192.168.9.0/24     link#1             U           0       14   igb0
192.168.9.60       link#1             UHS         0        0    lo0
192.168.12.0/24    link#13            U           0      109  lagg0
192.168.12.21      link#13            UHS         0        0    lo0

Internet6:
Destination                       Gateway                           Flags  Netif Expire
::/96                             ::1                               UGRS   lo0
::1                               link#12                           UH     lo0
:::0.0.0.0/96                     ::1                               UGRS   lo0
fe80::/10                         ::1                               UGRS   lo0
fe80::%igb0/64                    link#1                            U      igb0
fe80::ea39:35ff:feb6:a0d4%igb0    link#1                            UHS    lo0
fe80::%igb1/64                    link#2                            U      igb1
fe80::ea39:35ff:feb6:a0d5%igb1    link#2                            UHS    lo0
fe80::%igb2/64                    link#3                            U      igb2
fe80::ea39:35ff:feb6:a0d6%igb2    link#3                            UHS    lo0
fe80::%igb3/64                    link#4                            U      igb3
fe80::ea39:35ff:feb6:a0d7%igb3    link#4                            UHS    lo0
fe80::%lo0/64                     link#12                           U      lo0
fe80::1%lo0                       link#12                           UHS    lo0
fe80::%lagg0/64                   link#13                           U      lagg0
fe80::ea39:35ff:feb6:a0d5%lagg0   link#13                           UHS    lo0
ff01::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0    U      igb0
ff01::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1    U      igb1
ff01::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2    U      igb2
ff01::%igb3/32                    fe80::ea39:35ff:feb6:a0d7%igb3    U      igb3
ff01::%lo0/32                     ::1                               U      lo0
ff01::%lagg0/32                   fe80::ea39:35ff:feb6:a0d5%lagg0   U      lagg0
ff02::/16                         ::1                               UGRS   lo0
ff02::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0    U      igb0
ff02::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1    U      igb1
ff02::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2    U      igb2
ff02::%igb3/32                    fe80::ea39:35ff:feb6:a0d7%igb3    U      igb3
ff02::%lo0/32                     ::1                               U      lo0
ff02::%lagg0/32                   fe80::ea39:35ff:feb6:a0d5%lagg0   U      lagg0



# netstat -aln | grep 22
tcp4       0      0 *.22                   *.*                    LISTEN
tcp6       0      0 *.22                   *.*                    LISTEN

Note that I already tried to listen only on the igb0 interface
(192.168.9.60) in sshd_config, but the results are exactly the same
as described below.







On 08/25/2012 01:22 PM, Damien Fleuriot wrote:


In the meantime kindly post:


ifconfig for your igb0
netstat -rn
netstat -aln | grep 22



On 25 Aug 2012, at 13:18, Damien Fleuriot 

Re: Problem with link aggregation + sshd

2012-08-27 Thread Giulio Ferro

Hi, thanks for the answer

Here is what you asked for:

# ifconfig igb0
igb0: flags=8843 metric 0 mtu 1500

options=4401bb
ether ...
inet 192.168.9.60 netmask 0xff00 broadcast 192.168.9.255
inet6  prefixlen 64 scopeid 0x1
nd6 options=29
media: Ethernet autoselect (1000baseT )
status: active



# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.9.1        UGS         0        0   igb0
127.0.0.1          link#12            UH          0        0    lo0
192.168.9.0/24     link#1             U           0       14   igb0
192.168.9.60       link#1             UHS         0        0    lo0
192.168.12.0/24    link#13            U           0      109  lagg0
192.168.12.21      link#13            UHS         0        0    lo0

Internet6:
Destination                       Gateway                           Flags  Netif Expire
::/96                             ::1                               UGRS   lo0
::1                               link#12                           UH     lo0
:::0.0.0.0/96                     ::1                               UGRS   lo0
fe80::/10                         ::1                               UGRS   lo0
fe80::%igb0/64                    link#1                            U      igb0
fe80::ea39:35ff:feb6:a0d4%igb0    link#1                            UHS    lo0
fe80::%igb1/64                    link#2                            U      igb1
fe80::ea39:35ff:feb6:a0d5%igb1    link#2                            UHS    lo0
fe80::%igb2/64                    link#3                            U      igb2
fe80::ea39:35ff:feb6:a0d6%igb2    link#3                            UHS    lo0
fe80::%igb3/64                    link#4                            U      igb3
fe80::ea39:35ff:feb6:a0d7%igb3    link#4                            UHS    lo0
fe80::%lo0/64                     link#12                           U      lo0
fe80::1%lo0                       link#12                           UHS    lo0
fe80::%lagg0/64                   link#13                           U      lagg0
fe80::ea39:35ff:feb6:a0d5%lagg0   link#13                           UHS    lo0
ff01::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0    U      igb0
ff01::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1    U      igb1
ff01::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2    U      igb2
ff01::%igb3/32                    fe80::ea39:35ff:feb6:a0d7%igb3    U      igb3
ff01::%lo0/32                     ::1                               U      lo0
ff01::%lagg0/32                   fe80::ea39:35ff:feb6:a0d5%lagg0   U      lagg0
ff02::/16                         ::1                               UGRS   lo0
ff02::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0    U      igb0
ff02::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1    U      igb1
ff02::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2    U      igb2
ff02::%igb3/32                    fe80::ea39:35ff:feb6:a0d7%igb3    U      igb3
ff02::%lo0/32                     ::1                               U      lo0
ff02::%lagg0/32                   fe80::ea39:35ff:feb6:a0d5%lagg0   U      lagg0




# netstat -aln | grep 22
tcp4       0      0 *.22                   *.*                    LISTEN
tcp6       0      0 *.22                   *.*                    LISTEN

Note that I already tried to listen only on the igb0 interface
(192.168.9.60) in sshd_config, but the results are exactly the same
as described below.






On 08/25/2012 01:22 PM, Damien Fleuriot wrote:

In the meantime kindly post:


ifconfig for your igb0
netstat -rn
netstat -aln | grep 22



On 25 Aug 2012, at 13:18, Damien Fleuriot  wrote:


I'll get back to you regarding link aggregation when I'm done with groceries.

We use it here in production and it works flawlessly.



On 25 Aug 2012, at 09:54, Giulio Ferro  wrote:


No answer, so it seems that link aggregation doesn't really work in freebsd,
this may help others with the same problem...

I reverted back to one link for management and one for service, and ssh
works as it should...


On 08/21/2012 11:18 PM, Giulio Ferro wrote:

Scenario : freebsd 9 stable (yesterday) amd64 on HP server with 4 nic (igb)

1 nic is connected standalone to the management switch, the 3 other nics
are connected to a switch configured for aggregation.

If I configure the first nic (igb0) there is no problem, I can operate
as I normally do and sshd functions normally.

The problems start when I configure the 3 other nics for aggregation:

in /etc/rc.conf
...
ifconfig_igb1="up"
ifconfig_igb2="up"
ifconfig_igb3="up"

cloned_interfaces=lagg0
ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.12.7/24"
...

I restart the server and the aggregation seems to work correctly, in
fact ifconfig returns the correct lagg0 interface with the aggregated
links, the correct protocol (lacp) and the correct ip address and the
status is active. I can

Re: Problem with link aggregation + sshd

2012-08-25 Thread Giulio Ferro

No answer, so it seems that link aggregation doesn't really work in freebsd,
this may help others with the same problem...

I reverted back to one link for management and one for service, and ssh
works as it should...


On 08/21/2012 11:18 PM, Giulio Ferro wrote:
Scenario : freebsd 9 stable (yesterday) amd64 on HP server with 4 nic (igb)


1 nic is connected standalone to the management switch, the 3 other nics
are connected to a switch configured for aggregation.

If I configure the first nic (igb0) there is no problem, I can operate
as I normally do and sshd functions normally.

The problems start when I configure the 3 other nics for aggregation:

in /etc/rc.conf
...
ifconfig_igb1="up"
ifconfig_igb2="up"
ifconfig_igb3="up"

cloned_interfaces=lagg0
ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.12.7/24"

...

I restart the server and the aggregation seems to work correctly, in
fact ifconfig returns the correct lagg0 interface with the aggregated
links, the correct protocol (lacp) and the correct ip address and the
status is active. I can ping other IPs on the aggregated link.

Also the other (standalone) link seems to work correctly. I can ping
that address from other machines, and I can ping other IPs from that
server.

DNS lookups work OK too. I can also use telnet to connect to pop3
servers, so there seems to be no problem with the network stack.

But if I try to connect to the sshd service on that server, it hangs
indefinitely. On the server I find two sshd processes:
/usr/sbin/sshd
/usr/sbin/sshd -R

There is no message in the logs.

If I try to kill sshd (/etc/rc.d/sshd stop) I can't: it just stays there
forever waiting for the pid to die (it never does).

Even the ssh client doesn't seem to work. In fact, if I try to connect to
another server, the ssh client may start to work correctly, then sooner
or later it just hangs there forever, and I can't kill it with ctrl-c.

No firewall is configured, there is nothing else working on this server.

Thanks for any suggestions...




Problem with link aggregation + sshd

2012-08-21 Thread Giulio Ferro

Scenario : freebsd 9 stable (yesterday) amd64 on HP server with 4 nic (igb)

1 nic is connected standalone to the management switch, the 3 other nics
are connected to a switch configured for aggregation.

If I configure the first nic (igb0) there is no problem, I can operate
as I normally do and sshd functions normally.

The problems start when I configure the 3 other nics for aggregation:

in /etc/rc.conf
...
ifconfig_igb1="up"
ifconfig_igb2="up"
ifconfig_igb3="up"

cloned_interfaces=lagg0
ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.12.7/24"

...

I restart the server and the aggregation seems to work correctly, in
fact ifconfig returns the correct lagg0 interface with the aggregated
links, the correct protocol (lacp) and the correct ip address and the
status is active. I can ping other IPs on the aggregated link.

Also the other (standalone) link seems to work correctly. I can ping
that address from other machines, and I can ping other IPs from that
server.

DNS lookups work OK too. I can also use telnet to connect to pop3
servers, so there seems to be no problem with the network stack.

But if I try to connect to the sshd service on that server, it hangs
indefinitely. On the server I find two sshd processes:
/usr/sbin/sshd
/usr/sbin/sshd -R

There is no message in the logs.

If I try to kill sshd (/etc/rc.d/sshd stop) I can't: it just stays there
forever waiting for the pid to die (it never does).

Even the ssh client doesn't seem to work. In fact, if I try to connect to
another server, the ssh client may start to work correctly, then sooner
or later it just hangs there forever, and I can't kill it with ctrl-c.

No firewall is configured, there is nothing else working on this server.
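When a process cannot be killed, it is usually blocked inside the kernel, and its wait channel and kernel stack tend to say more than the logs do. The helper below is a minimal sketch of how that information could be collected with ps(1) and procstat(1). The function names and the DRY_RUN switch are illustrative, and nothing like this was actually run in the thread.

```sh
# Print commands instead of running them when DRY_RUN is set
# (illustrative helper).
run() { ${DRY_RUN:+echo} "$@"; }

# For each PID given, show process state and wait channel, then the
# kernel stack of every thread in that process.
inspect_hung() {
    for pid in "$@"; do
        run ps -o pid,state,wchan,command -p "$pid"
        run procstat -kk "$pid"
    done
}
```

Typical use would be "inspect_hung $(pgrep -x sshd)" as root. The wchan column and the top frames of the stack are what a driver maintainer would likely ask for.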

Thanks for any suggestions...


Re: kerberized NFS

2012-02-19 Thread Giulio Ferro

On 02/19/2012 02:55 AM, Rick Macklem wrote:

I just updated the patch:
   http://people.freebsd.org/~rmacklem/rpcsec_gss-9.patch

If you already downloaded it, please do so again, because
it had two arguments reversed in order and would not have
worked.

I think this one is correct, although I don't currently
have a Kerberos setup to test it with.

Good luck with it, rick




Thanks a lot. I'm trying next week.


kerberized NFS

2012-02-17 Thread Giulio Ferro

Thanks everybody again for your help with setting up a working
kerberized nfsv4 system.

I was able to user-mount a nfsv4 share with krb5 security, and I was
trying to do the same as root.

Unfortunately the patch I found here:
http://people.freebsd.org/~rmacklem/rpcsec_gss.patch

fails to apply cleanly on a 9 stable system.

Is there a more recent patch available or some better way to automatically
mount the share at boot time?

Thanks again.


Re: kerberized NFS

2012-01-28 Thread Giulio Ferro

Thank you to all of you for your replies. I'll try next week
and let you know.

My mail server was down for a few hours, but everything should
be ok now...

Giulio.


Re: kerberized NFS

2012-01-28 Thread Giulio Ferro

I forgot to mention that I compiled both servers with
options KGSSAPI and device crypto, and I enabled gssd
on both.

Is there anyone who was able to configure this setup?


kerberized NFS

2012-01-27 Thread Giulio Ferro

I'm trying to set up a kerberized NFS system made of a server and a
client (both FreeBSD 9-STABLE amd64).

I've tried to follow this howto:
http://code.google.com/p/macnfsv4/wiki/FreeBSD8KerberizedNFSSetup

But I couldn't get much out of it.

First question: is this howto still valid, or should something more recent
be followed? I've searched with Google but I've come up empty.

I've set up kerberos heimdal, created the dns entries for both
client and server, set up krb5.keytab and copied it to client, set
up nfs4 according to man nfsv4:

(server)
cat /etc/exports
V4: /usr/src -sec=krb5:krb5i:krb5p

and then tried to mount it from the client:

mount_nfs -o nfsv4,sec=krb5i,gssname=nfs nfsinternal1.dcssrl.it:/usr/src /usr/src


but it failed with :
[tcp] nfsinternal1.dcssrl.it:/usr/src: Permission denied
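A "Permission denied" on a krb5 mount can have many causes. One cheap first check is whether the keytab and the ticket cache contain what you expect. The sketch below assumes Heimdal's ktutil syntax and the default keytab path; the function names and the DRY_RUN switch are illustrative. Which principals are actually required for this setup is not established in this thread.

```sh
# Print commands instead of running them when DRY_RUN is set
# (illustrative helper).
run() { ${DRY_RUN:+echo} "$@"; }

# List the principals stored in a keytab (Heimdal ktutil syntax) and the
# tickets currently held by the calling user.
check_krb5() {
    keytab="${1:-/etc/krb5.keytab}"
    run ktutil -k "$keytab" list
    run klist
}
```

Running "check_krb5" on both client and server would show whether the nfs service entries made it into each keytab.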

Can you point me to something that I might have got wrong?

Thanks in advance.


Re: ZFS wrong size stats with amavis

2010-10-22 Thread Giulio Ferro

On 10/22/2010 03:54 PM, Jeremy Chadwick wrote:

On Fri, Oct 22, 2010 at 03:03:53PM +0200, Giulio Ferro wrote:

[...snip; focusing specifically on this piece...]

If I launch du -s under /zfs (host machine, not jail) I get a total space
of about 750GB, but df -h always turns up with 2,7TB space occupied.


Is the ZFS filesystem (not pool) using compression?



Nope.


ZFS wrong size stats with amavis

2010-10-22 Thread Giulio Ferro

I've seen people discuss this, and it happened to me as well.

FreeBSD 8.1-STABLE (25th September), amd64.
This server has a single boot disk with UFS plus one array (hardware, 3ware)
which holds 2,7TB of data. I've formatted the latter with ZFS.

Things seemed to work ok until I upgraded the system. One day the server
jails didn't boot anymore. I checked and the array occupation was 100% with
0 byte free.

This sounded strange, so I looked around and found a hidden file under
/var/amavis/.spamassassin.

I removed the dir, started again, but after a while it grew again until
the system became unusable.

This is how I solved it: I moved the amavis partition to UFS, then
mounted that dir with nullfs under /var in the zfs jail.

That solved it in the sense that it didn't grow anymore, but the fs occupation
stayed very near 100%, with only 1GB of free space.

If I launch du -s under /zfs (host machine, not jail) I get a total space
of about 750GB, but df -h always turns up with 2,7TB space occupied.

I tried a zpool upgrade zfs and zpool scrub zfs, but to no avail.
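One possible explanation for du and df disagreeing this much is space held by snapshots, or by datasets outside the tree that du walked. This is an assumption; nothing in the thread confirms it. The sketch below shows how that could be checked with zfs(8). The pool name "zfs" comes from the zpool commands above, while the function names and the DRY_RUN switch are illustrative.

```sh
# Print commands instead of running them when DRY_RUN is set
# (illustrative helper).
run() { ${DRY_RUN:+echo} "$@"; }

# Show per-snapshot and per-dataset space accounting for a pool.
zfs_space_check() {
    pool="$1"
    run zfs list -r -t snapshot -o name,used,referenced "$pool"
    run zfs get -r used,referenced "$pool"
}
```

"zfs_space_check zfs" would list every snapshot with the space it pins. A large gap between "used" and "referenced" on a dataset points at snapshots or child datasets rather than at live files.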

What should I do, short of moving the data, destroying the pool and
creating it again (which I can't very easily do)?

Thanks.


nfsv4 with kgssapi

2010-10-12 Thread Giulio Ferro

I'm trying to set up an NFS server which uses the kerberized RPC
header, so as to overcome the problem with 16 groups:

http://www.mail-archive.com/freebsd-stable@freebsd.org/msg109809.html


FreeBSD 8 amd64 stable last (yesterday)

Following the man page for nfsv4 I have compiled the kernel with
-
options KGSSAPI
device crypto
-

My files:


/etc/exports
-
V4: /mydir -sec=krb5:krb5i:krb5p -network 192.168.0 -mask 255.255.255.0
-

/etc/rc.conf
-
...
nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"
gssd_enable="YES"
...
-

All daemons start ok, but in the logs I see:
nfsd[...]: no gssd, using AUTH_SYS only

Even though gssd is up and running.

What's wrong?

Thanks in advance.


ZFS tuning

2010-10-11 Thread Giulio Ferro

I got lost in all the posts concerning the nefarious "kmem_size too
small" bug, and I'm finally going to upgrade my system (it's currently
8.0 stable from 1st May).

What is now (FreeBSD 8.1, latest stable) the state of the art for the tuning
I should do on my system (amd64, of course)?

To put it as plainly as possible, what should I write in my /boot/loader.conf?

Thanks.


About zfs + nfs stability

2010-08-31 Thread Giulio Ferro

I have a 8.0 stable (last build around April 2010) which I use as a nfs
server : amd64, 8GB RAM, ~7TB storage.

I had a lot of grief with the (sadly) well-known "kmem map too small"
bug, which really compromised the quality of the service that server
was dedicated to.

There wasn't (and there still isn't) any relevant indication in the
official FreeBSD ZFS documentation on how to work around the problem.
Only thanks to the effort and goodwill of other users on this list,
and after hours on end of trying,
could I come up with something working:
(in /boot/loader.conf)
vm.kmem_size="6096M"
vfs.zfs.arc_max="3584M"
vfs.zfs.prefetch_disable="1"
vfs.zfs.txg.timeout="5"
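After a reboot it can be worth confirming that these loader tunables actually took effect, and comparing the current ARC size against the configured maximum. The helper below is a minimal sketch using sysctl(8); the function names and the DRY_RUN switch are illustrative, not part of FreeBSD.

```sh
# Print commands instead of running them when DRY_RUN is set
# (illustrative helper).
run() { ${DRY_RUN:+echo} "$@"; }

# Show the tunables set in /boot/loader.conf and the current ARC size.
show_zfs_tuning() {
    run sysctl vm.kmem_size vfs.zfs.arc_max vfs.zfs.prefetch_disable vfs.zfs.txg.timeout
    run sysctl kstat.zfs.misc.arcstats.size
}
```

The values come back in bytes, so "6096M" should appear as 6392119296 for vm.kmem_size if the setting was honoured.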

The freezes are gone, thankfully, but I often get huge slow-downs: 
looking in the logs of the nfs clients I get plenty of:

... kernel: nfs server ...:/path/to/dir: lockd not responding
... kernel: nfs server ...:/path/to/dir: lockd is alive again

I don't know if this has anything to do with zfs.
What I'd like to know is the answer to the following questions
by other users and/or developers.

I don't need opinions, only precise facts people have verified for
themselves.

1) Is it a good idea to upgrade this production system to the latest 8
stable (8.1 stable I believe)? Is it really stable?

2) Are the aforementioned zfs tunings in /boot/loader.conf still necessary?

3) Is it a good idea to switch to nfsv4? Performance? Stability?

and above all:

4) will I get a more stable and performant system by upgrading?

Thanks in advance for the answers...


Re: PF + BRIDGE still causes system freezing

2010-05-30 Thread Giulio Ferro

Max Laier wrote:

On Friday 28 May 2010 07:46:07 Giulio Ferro wrote:
  

Months ago I reported a system freeze whenever bridge was used
with pf. This still happens now in 8.1 prerelease: after the bridge has
been active for several minutes to hours, the system becomes unresponsive.



As I told you last time you reported this problem: you need to simplify your
setup in order to track down the problem.  For all I know, you have created a
routing or ethernet loop that is the cause of your problems.  Unless you can
provide a simple setup that can be reproduced, you have to track down the
issue yourself - sorry.


Max
  


Ok, I've moved the vpn-bridging service to a server without pf, and now
it seems to work correctly.

I maintain that this issue needs to be looked into anyway.
I don't think that a system freeze is acceptable, even when the
administrator makes some configuration mistakes: the OS should complain
about a "routing or ethernet loop" instead of leaving him wondering...
(How can I find such loops, anyway?)

Thanks for your help.


Re: PF + BRIDGE still causes system freezing

2010-05-28 Thread Giulio Ferro

On 28.05.2010 07:46, Giulio Ferro wrote:

Would it be a good idea to try the netgraph bridge?
Or is the underlying implementation the same as in if_bridge?





Re: PF + BRIDGE still causes system freezing

2010-05-28 Thread Giulio Ferro

On 28.05.2010 07:46, Giulio Ferro wrote:


I've also tried to disable all filtering:

net.link.bridge.pfil_onlyip=0
net.link.bridge.pfil_member=0
net.link.bridge.pfil_bridge=0
net.link.bridge.pfil_local_phys=0
net.link.bridge.ipfw=0
net.link.bridge.ipfw_arp=0

But to no avail. It always freezes...





PF + BRIDGE still causes system freezing

2010-05-27 Thread Giulio Ferro

Months ago I reported a system freeze whenever bridge was used
with pf. This still happens now in 8.1 prerelease: after the bridge has
been active for several minutes to hours, the system becomes unresponsive.

# uname -a
FreeBSD firewall1 8.1-PRERELEASE FreeBSD 8.1-PRERELEASE #0: Thu May 27 18:03:48 CEST 2010 r...@data1:/usr/obj/usr/src/sys/FIREWALL  amd64



cat /etc/sysctl.conf

net.inet.ip.forwarding=1
net.inet.ip.fastforwarding=1
net.inet.carp.preempt=1

Services running : sshd, named, inetd, ntpd, openvpn (tap), racoon, pptp, asterisk


2 physical interfaces : bce0, bce1
11 vlan interfaces : vlan1, ..., vlan11 (vlandev bce1)
11 carp interfaces : carp1, ..., carp11  (carp1 has 23 alias addresses)
1 bridge interface : bridge0 addm vlan35 (used by openvpn)
2 gif interfaces : gif0, gif1 (racoon / IPSEC)

8 static routes

pf packet filter : 12 rdr rules, 3 nat rules, set skip{lo0, bridge0, vlan35},
4 pass quick, block log all, about 30 pass keep state




When the system freezes, I get this from the debugger
-
db> show allchains
db> show alllocks
Process 12 (intr) thread 0xff00024293e0 (100028)
exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xff000270ea18) locked @ /usr/src/sys/net/if_bridge.c:2184
Process 12 (intr) thread 0xff00022693e0 (100016)
exclusive sleep mutex Giant (Giant) r = 1 (0x80c93dc0) locked @ /usr/src/sys/dev/usb/usb_transfer.c:3023
Process 12 (intr) thread 0xff00022607c0 (106)
exclusive sleep mutex carp_if (carp_if) r = 0 (0xff00027329e0) locked @ /usr/src/sys/netinet/ip_carp.c:881
db>
-

Even if there is no solution yet, is there any quick and dirty workaround
I can try?
I need this rather badly...

Thanks.



Re: Freebsd 8.0 kmem map too small

2010-05-05 Thread Giulio Ferro

On 05.05.2010 11:11, Simun Mikecin wrote:

- Original Message 

I'm really astounded at how unstable zfs is, it's causing me a lot
of problems.

Why isn't it stated in the handbook that zfs isn't up to production yet?

Why aren't the people responsible for ZFS on freebsd saying anything?
What's the status of this issue? Is someone working on this???


How much RAM do you have? Are you using i386 or amd64? Have you tried removing
all zfs and kmem sysctl's so the system uses default values?

   



This is the first post of the current thread: it should tell you everything


NFS server amd64 Freebsd 8.0 recent (2 days ago)

This server has been running for several months without problems.
Beginning last week, however, I'm experiencing panics (about 1 per day)
with the error in the subject

Current settings:


vm.kmem_size_scale: 3
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 2764046336
...
hw.physmem: 8568225792
hw.usermem: 6117404672
hw.realmem: 9395240960
...
vfs.zfs.l2arc_noprefetch: 0
vfs.zfs.l2arc_feed_secs_shift: 1
vfs.zfs.l2arc_feed_secs: 1
vfs.zfs.l2arc_headroom: 128
vfs.zfs.l2arc_write_boost: 67108864
vfs.zfs.l2arc_write_max: 67108864
vfs.zfs.arc_meta_limit: 431882240
vfs.zfs.arc_meta_used: 431874720
vfs.zfs.arc_min: 215941120
vfs.zfs.arc_max: 1727528960


I've set nothing in either /boot/loader.conf or sysctl.conf


What should I do?




Re: Freebsd 8.0 kmem map too small

2010-05-05 Thread Giulio Ferro

On 05.05.2010 09:52, Jeremy Chadwick wrote:

Nope, it's happened again... Now I've tried to raise vm.kmem_size to 6G...



Did you set both vm.kmem_size and vfs.zfs.arc_max, setting the latter to
something *less* than vm.kmem_size?

   


Yes.
After your suggestion, I set
vfs.zfs.arc_max: 3758096384
vm.kmem_size: 4G

Now:
vfs.zfs.arc_max: 3758096384
vm.kmem_size: 6392119296



I'm really astounded at how unstable zfs is, it's causing me a lot
of problems.

Why isn't it stated in the handbook that zfs isn't up to production yet?
 

I'm not at liberty to comment + answer this question.

   


Why aren't the people responsible for ZFS on freebsd saying anything?
What's the status of this issue? Is someone working on this???


Re: Freebsd 8.0 kmem map too small

2010-05-04 Thread Giulio Ferro

Giulio Ferro wrote:

Thanks, I'll try these settings.

I'll keep you posted.



Nope, it's happened again... Now I've tried to raise vm.kmem_size to 6G...

I'm really astounded at how unstable zfs is, it's causing me a lot of
problems.


Why isn't it stated in the handbook that zfs isn't up to production yet?


Re: Freebsd 8.0 kmem map too small

2010-05-03 Thread Giulio Ferro

On 03.05.2010 13:01, Jeremy Chadwick wrote:

On Mon, May 03, 2010 at 12:41:50PM +0200, Giulio Ferro wrote:
   

NFS server amd64 Freebsd 8.0 recent (2 days ago)

This server has been running for several months without problems.
Beginning last week, however, I'm experiencing panics (about 1 per day)
with the error in the subject

Current settings:


vm.kmem_size_scale: 3
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 2764046336
...
hw.physmem: 8568225792
hw.usermem: 6117404672
hw.realmem: 9395240960
...
vfs.zfs.l2arc_noprefetch: 0
vfs.zfs.l2arc_feed_secs_shift: 1
vfs.zfs.l2arc_feed_secs: 1
vfs.zfs.l2arc_headroom: 128
vfs.zfs.l2arc_write_boost: 67108864
vfs.zfs.l2arc_write_max: 67108864
vfs.zfs.arc_meta_limit: 431882240
vfs.zfs.arc_meta_used: 431874720
vfs.zfs.arc_min: 215941120
vfs.zfs.arc_max: 1727528960


I've set nothing in either /boot/loader.conf or sysctl.conf


What should I do?
 

You need to adjust vm.kmem_size to provide more space for the ARC.

Below are ZFS-relevant entries in our /boot/loader.conf on production
RELENG_8 systems with 8GB of RAM.  The reason we set kmem_size to half
our physical system memory is because I didn't want to risk other
processes which use a larger maxdsiz/dfldsiz/maxssiz to potentially
exhaust all memory.


# Increase vm.kmem_size to allow for ZFS ARC to utilise more memory.
vm.kmem_size="4096M"
vfs.zfs.arc_max="3584M"

# Disable ZFS prefetching
# http://southbrain.com/south/2008/04/the-nightmare-comes-slowly-zfs.html
# Increases overall speed of ZFS, but when disk flushing/writes occur,
# system is less responsive (due to extreme disk I/O).
# NOTE: 8.0-RC1 disables this by default on systems <= 4GB RAM anyway
# NOTE: System has 8GB of RAM, so prefetch would be enabled by default.
vfs.zfs.prefetch_disable="1"

# Decrease ZFS txg timeout value from 30 (default) to 5 seconds.  This
# should increase throughput and decrease the "bursty" stalls that
# happen during immense I/O with ZFS.
# http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007343.html
# http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007355.html
vfs.zfs.txg.timeout="5"
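
As a rough illustration of the rule of thumb discussed in this thread (keep vfs.zfs.arc_max below vm.kmem_size), the Python sketch below parses loader.conf-style lines and compares the two values. The helper names parse_size, parse_loader_conf and check_arc_fits are hypothetical, and reading the K/M/G suffixes as powers of 1024 is an assumption, not something stated in the thread.

```python
import re

# Multipliers for the size suffixes used in the settings above
# (assumed here to be binary units).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a value such as "4096M" or "3758096384" to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix]


def parse_loader_conf(lines):
    """Return {name: value} for name="value" lines, skipping comments."""
    settings = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


def check_arc_fits(settings):
    """True if vfs.zfs.arc_max is strictly smaller than vm.kmem_size."""
    kmem = parse_size(settings["vm.kmem_size"])
    arc = parse_size(settings["vfs.zfs.arc_max"])
    return arc < kmem


# The two size settings quoted above, as plain text lines.
example = [
    "# Increase vm.kmem_size to allow for ZFS ARC to utilise more memory.",
    'vm.kmem_size="4096M"',
    'vfs.zfs.arc_max="3584M"',
]
settings = parse_loader_conf(example)
print(check_arc_fits(settings))
```

The check only compares the two numbers; it says nothing about whether a given pair of values is actually sufficient for a particular workload.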


   



Thanks, I'll try these settings.

I'll keep you posted.


Freebsd 8.0 kmem map too small

2010-05-03 Thread Giulio Ferro

NFS server amd64 Freebsd 8.0 recent (2 days ago)

This server has been running for several months without problems.
Beginning last week, however, I'm experiencing panics (about 1 per day)
with the error in the subject

Current settings:


vm.kmem_size_scale: 3
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 2764046336
...
hw.physmem: 8568225792
hw.usermem: 6117404672
hw.realmem: 9395240960
...
vfs.zfs.l2arc_noprefetch: 0
vfs.zfs.l2arc_feed_secs_shift: 1
vfs.zfs.l2arc_feed_secs: 1
vfs.zfs.l2arc_headroom: 128
vfs.zfs.l2arc_write_boost: 67108864
vfs.zfs.l2arc_write_max: 67108864
vfs.zfs.arc_meta_limit: 431882240
vfs.zfs.arc_meta_used: 431874720
vfs.zfs.arc_min: 215941120
vfs.zfs.arc_max: 1727528960


I've set nothing in either /boot/loader.conf or sysctl.conf


What should I do?
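
To put the posted figures in proportion, a little arithmetic helps. The Python sketch below only restates numbers copied from the sysctl output above; the ratios it prints are observations about these particular values, not a statement of how the kernel derives its defaults.

```python
from fractions import Fraction

# Values copied from the sysctl output above (bytes).
kmem_size = 2764046336
physmem = 8568225792
arc_max = 1727528960
arc_min = 215941120
arc_meta_limit = 431882240
arc_meta_used = 431874720

GIB = 1024**3

print(f"vm.kmem_size = {kmem_size / GIB:.2f} GiB")
print(f"hw.physmem   = {physmem / GIB:.2f} GiB")

# Exact ratios between the reported values.
print("arc_max / kmem_size      =", Fraction(arc_max, kmem_size))
print("arc_min / arc_max        =", Fraction(arc_min, arc_max))
print("arc_meta_limit / arc_max =", Fraction(arc_meta_limit, arc_max))

# How close the metadata usage is to its limit.
print(f"arc_meta_used is {arc_meta_used / arc_meta_limit:.4%} of arc_meta_limit")
```

With these figures the ARC ceiling works out to 5/8 of vm.kmem_size, and arc_meta_used sits just under arc_meta_limit.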


Re: NFS permission strangeness

2010-04-16 Thread Giulio Ferro

On 16.04.2010 10:29, Sean wrote:



Yes, I have more than 16 groups, 22 actually...
 

Then there's nothing "wrong" per se, you're just hitting the fact that NFS v2 
and v3 only support 16 groups on the wire. That's just the way the protocol is defined.
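
To see whether an account runs into the limit described in the reply above, one can simply count its groups. The sketch below is an illustration only: AUTH_SYS_MAX_GROUPS and groups_over_limit are hypothetical names, and the group list is passed in as plain data rather than read from the system.

```python
# Number of groups NFS v2/v3 carry on the wire, as stated in the reply above.
AUTH_SYS_MAX_GROUPS = 16


def groups_over_limit(groups, limit=AUTH_SYS_MAX_GROUPS):
    """Return how many distinct groups exceed the limit (0 if within it)."""
    return max(0, len(set(groups)) - limit)


# 22 groups, matching the count reported in this thread; the names
# themselves are placeholders.
groups = [f"group{n}" for n in range(1, 23)]
excess = groups_over_limit(groups)
print(f"{len(groups)} groups, {excess} over the limit")
```

On a live system the list could come from the output of `id -Gn` or from Python's `os.getgrouplist`; which groups actually get left out when the list is too long is not covered by this sketch.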

   


Oops, I didn't know that...

Is there any solution solid enough for a production environment? Maybe NFSv4?

Please advise...

Giulio.


Re: NFS permission strangeness

2010-04-16 Thread Giulio Ferro

On 16.04.2010 02:30, Rick Macklem wrote:

What happens is that I can access "subdir2" on the server machine when I
login as "giulio", but when I try to access that same dir on the
client machine I get:
$ cd /path/to/root/dir/etc
(ok)
$ cd subdir2
subdir2/: Permission denied.


Yes, it should work. I just tried the same thing with a server running
UFS/FFS and it worked fine, so I think that the problem might be ZFS
related. (You will get into trouble with more than 16 groups, since
that is all that AUTH_SYS for Sun RPC handles, but I did 10 like your
example and it worked ok for me, using FreeBSD-CURRENT client/server,
except that my server uses UFS/FFS.)

Hopefully someone with ZFS expertise can help out here?

If you can conveniently do the same test using a server that exports
a UFS/FFS file system, that would be helpful w.r.t. isolating the
problem.

rick


Yes, I have more than 16 groups, 22 actually...

However, I still think this might be an NFS problem, since when I log in on
the server machine I can access that directory all right; the problem arises
only when I try to access that dir on the client machine...

Giulio




NFS permission strangeness

2010-04-15 Thread Giulio Ferro

Here's the setup:
server : NFS server machine (FreeBSD 8 stable amd64)
client : NFS client machine (as above)

server and client both share the same permission database through ldap:


Both have in /etc/nsswitch.conf
...
group: files ldap
...
passwd: files ldap

This issue isn't related to ldap, however. I get the same result if I
manually add groups to the /etc/group file (read on).

Let's suppose I have user "giulio" configured in my system.
giulio is also part (-G) of groups:
group1, group2, group3, ... , group10

server is exporting the directory
/path/to/root (on zfs)

the directory
/path/to/root/dir/etc/subdir1
has permission 770 and group ownership "group3"

When I log in as user "giulio" on server, I can enter the "subdir1" directory,
since I'm a member of group "group3".

I then log in as user "giulio" on client, and I can do the same (as expected).



When groups are more than a few, however, I get this strange behavior:

let's suppose the directory:
/path/to/root/dir/etc/subdir2
has permission 770 and group ownership "group10"

What happens is that I can access "subdir2" on the server machine when I
log in as "giulio", but when I try to access that same dir on the client
machine I get:
$ cd /path/to/root/dir/etc
(ok)
$ cd subdir2
subdir2/: Permission denied.

If I issue this command on the client:
$ id
I get:
uid=1000 (giulio), gid=1000 (giuliogroup), groups=group1(1001), group2(1002),
group3(1003),...,group10(1010)

So there shouldn't really be any reason for me not to be able to access
that dir...


Any idea?




NFS lockd problem

2010-03-26 Thread Giulio Ferro

Setup:
1 NFS server (with lockd)
2 NFS clients (with lockd)

The clients serve several jails with apache, whose data (www) resides on
the server.

From time to time everything seems to freeze. Then, after one minute or
so, the system works again as if nothing had happened.

On these occasions I get this in the logs on the client machines:
Mar 26 10:29:38 virt1 kernel: nfs server 192.168.40.121:/data/mount_servers/wwwsec/www: lockd not responding

followed shortly after by:

Mar 26 10:29:38 virt1 kernel: nfs server 192.168.40.121:/data/mount_servers/wwwsec/www: lockd is alive again
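
To measure how long each stall lasts, client log lines like the two above can be paired up. The Python sketch below is one possible way to do it, under stated assumptions: each log entry is on a single line, the syslog timestamp has no year (so the year is supplied by the caller, 2010 here), and the function name lockd_outages is hypothetical.

```python
import re
from datetime import datetime

# Matches the kernel messages quoted above, one entry per line.
LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: nfs server "
    r"(?P<mount>\S+): lockd (?P<state>not responding|is alive again)$"
)


def lockd_outages(lines, year=2010):
    """Yield (mount, seconds) for each not-responding/alive-again pair."""
    pending = {}
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue
        when = datetime.strptime(
            f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
        )
        mount = match["mount"]
        if match["state"] == "not responding":
            # Remember the first unanswered report for this mount.
            pending.setdefault(mount, when)
        elif mount in pending:
            yield mount, (when - pending.pop(mount)).total_seconds()


# The two client log entries from this message, each on one line.
sample = [
    "Mar 26 10:29:38 virt1 kernel: nfs server "
    "192.168.40.121:/data/mount_servers/wwwsec/www: lockd not responding",
    "Mar 26 10:29:38 virt1 kernel: nfs server "
    "192.168.40.121:/data/mount_servers/wwwsec/www: lockd is alive again",
]
for mount, seconds in lockd_outages(sample):
    print(f"{mount}: {seconds:.0f} s")
```

Because syslog timestamps have one-second resolution, a pair logged within the same second is reported as 0 seconds, as happens with the two entries quoted here.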



On the server I only get this:
Mar 26 10:29:31 data1 kernel: NLM: failed to contact remote rpcbind, stat = 5, port = 28416


I don't think it's a network problem, since all connections are local
and high speed (1Gb/s).


I must admit that, with the other nfs problem I reported weeks ago, this
kind of freebsd system seems less than stable to me, and this is very
disappointing...

Anyway I'd appreciate any pointer on this issue...


Re: Bridge causes freezes

2010-03-15 Thread Giulio Ferro

I confirm this problem for another server:
stable 8 amd64 + vlan + carp

Whenever I join a bridge with a vlan interface:

ifconfig bridge0 addm vlan35

The system sooner or later freezes.

This time it has happened after 3 days of normal behavior.

No logs, no dump.


On 03.03.2010 12:30, Giulio Ferro wrote:

I'm setting up an openvpn daemon in bridge mode on a firewall.

Scenario:
freebsd 8 amd64 stable (last week), pf, vlans, openvpn in tun mode
(different port, of course), many routes

I've created the bridge interface in rc.conf like this:
cloned_interfaces="vlan.. .. .. bridge0"
...
ifconfig_bridge0="addm vlan35 up"


Everything seems to work as expected as far as networking is concerned.

The problem arises after an hour or so: the system simply freezes, and
no relevant log can be found after restart.

This _always_ happens, even when I don't start the openvpn bridge
daemon...


Any idea, anybody?

Thanks.
___
freebsd-...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"




Re: Fwd: Re: NFS Client error

2010-03-09 Thread Giulio Ferro

On 09.03.2010 10:14, Daniel Braniss wrote:

Thanks for your kind reply, I'm forwarding it there...


 Original Message 
Subject:Re: NFS Client error
Date:   Mon, 08 Mar 2010 23:59:29 +0100
From:   vol...@vwsoft.com
To: Giulio Ferro
CC: freebsd-hack...@freebsd.org, freebsd-...@freebsd.org



On 03/08/10 12:16, Giulio Ferro wrote:
 

  Freebsd 8 stable amd64

  It mounts different file systems by NFS (with locking) on a
  data server directly connected (gigabit) to the server

  Apache is running in several jails on those nfs folders.

  Now and then I get huge slow-downs. When I look in the logs
  I get thousands of lines like these:
  Mar  5 11:50:52 virt2 kernel: vm_fault: pager read error, pid 46487 (httpd)
  Mar  5 11:50:52 virt2 kernel: pid 46487 (httpd), uid 80: exited on
  signal 11


  What should I do?
   

If the binary (httpd) is on an nfs server, then if the binary got
modified this is what usually happens
   


Nope. The binary is in the jails on the local machine.
Only the configuration dir (etc/apache22) and data dir (www)
are on the nfs server.


------------------
| NFS CLIENT     |
| jail 1 : httpd |
| jail 2 : httpd | --> NFS SERVER
| jail 3 : httpd |
| ...            |
------------------


Giulio.










my 2c
danny

   



Fwd: Re: NFS Client error

2010-03-09 Thread Giulio Ferro

Thanks for your kind reply, I'm forwarding it there...


 Original Message 
Subject:Re: NFS Client error
Date:   Mon, 08 Mar 2010 23:59:29 +0100
From:   vol...@vwsoft.com
To: Giulio Ferro 
CC: freebsd-hack...@freebsd.org, freebsd-...@freebsd.org



On 03/08/10 12:16, Giulio Ferro wrote:

 Freebsd 8 stable amd64

 It mounts different file systems by NFS (with locking) on a
 data server directly connected (gigabit) to the server

 Apache is running in several jails on those nfs folders.

 Now and then I get huge slow-downs. When I look in the logs
 I get thousands of lines like these:
 Mar  5 11:50:52 virt2 kernel: vm_fault: pager read error, pid 46487 (httpd)
 Mar  5 11:50:52 virt2 kernel: pid 46487 (httpd), uid 80: exited on
 signal 11


 What should I do?


Giulio,

it seems this is anyhow not related to network (nfs) operations. It's
looking like a problem in the VM. I think it makes sense to have a look
at the httpd.core file if the binary has been linked with debugging
symbols turned on. Also I think at first, it may not hurt to look at
vmstat -m output.

You may want to change ${subject} and post to stable@ to drive more
attention to your problem.

Volker

