from:"Emile `iMil' Heitor"

Re: NPF ruleset not blocking IPs

2022-06-05 Thread Emile `iMil&#x27; Heitor


On Fri, 3 Jun 2022, Emile `iMil' Heitor wrote:



As the rules in the ruleset are declared as "final", I presume the default
`pass all` is not reached, am I right?


So, no, I was wrong. Changing the order made the rules apply. I simply removed
the "external" group and inserted the ruleset before the pass all:

group default {
pass final on lo0 all
pass stateful out final all

ruleset "blacklistd"
block in final from 

pass all

block in family inet6 all
pass proto ipv6-icmp all
pass stateful in family inet6 proto tcp to any port $tcp_allowed
pass stateful in family inet6 proto udp to any port $udp_allowed
}


------------
Emile `iMil' Heitor  | https://imil.net

NPF ruleset not blocking IPs

2022-06-03 Thread Emile `iMil&#x27; Heitor




I am trying to use npf along with blacklistd as an anti-bruteforce system.
Configuration-wide, everything seems to work together, yet blacklisted IPs,
while present in the "blacklistd" ruleset, don't seem to be blocked.

Here's my npf.conf file:

# npf.conf

$ext = vioif0
$ip4 = inet4(vioif0)
$ip6 = inet6(vioif0)

set bpf.jit on;
alg "icmp"

$tcp_allowed = {25, 53, 465, 587, 995, ssh, http, https}
$udp_allowed = {53}

table  type ipset file "/etc/npf_blacklist"

procedure "log" {
log: npflog0
}

group "external" on $ext {
ruleset "blacklistd"

block in final from 
}

group default {
pass final on lo0 all
pass stateful out final all
pass all

block in family inet6 all
pass proto ipv6-icmp all
pass stateful in family inet6 proto tcp to any port $tcp_allowed
pass stateful in family inet6 proto udp to any port $udp_allowed
}
# end of npf.conf

This virtual machine acts like an IPv6 router, hence the default rules.
Here's an extract of rules inserted by blacklistd:

$ sudo npfctl rule blacklistd list
ruleset block in final family inet4 proto udp from 64.231.104.8/32 to any port 53 # id="1" 
ruleset block in final family inet4 proto udp from 94.181.160.42/32 to any port 53 # id="2" 
ruleset block in final family inet4 proto udp from 209.126.8.168/32 to any port 53 # id="3" 
ruleset block in final family inet4 proto udp from 85.28.98.113/32 to any port 53 # id="4" 
ruleset block in final family inet4 proto udp from 44.200.125.213/32 to any port 53 # id="5" 
ruleset block in final family inet4 proto udp from 120.71.145.56/32 to any port 53 # id="6" 
ruleset block in final family inet4 proto udp from 90.90.90.90/32 to any port 53 # id="7" 
ruleset block in final family inet4 proto udp from 107.119.41.101/32 to any port 53 # id="8" 
ruleset block in final family inet4 proto udp from 78.116.212.157/32 to any port 53 # id="9" 
ruleset block in final family inet4 proto udp from 189.203.104.245/32 to any port 53 # id="a" 
ruleset block in final family inet4 proto udp from 193.124.7.9/32 to any port 53 # id="b" 
ruleset block in final family inet4 proto udp from 173.179.63.249/32 to any port 53 # id="c" 
ruleset block in final family inet4 proto udp from 174.244.240.203/32 to any port 53 # id="d" 
ruleset block in final family inet4 proto udp from 72.9.7.72/32 to any port 53 # id="e" 
ruleset block in final family inet4 proto udp from 95.105.64.219/32 to any port 53 # id="f" 
ruleset block in final family inet4 proto udp from 185.156.46.34/32 to any port 53 # id="10" 
ruleset block in final family inet4 proto tcp from 183.134.6.42/32 to any port 22 # id="7276" 
ruleset block in final family inet4 proto tcp from 185.220.100.253/32 to any port 22 # id="729a" 
ruleset block in final family inet4 proto udp from 35.174.16.235/32 to any port 53 # id="72b6"


Yet none of those IPs are blocked, I tried with a server of mine, it gets added 
to the list but is not blocked.

As the rules in the ruleset are declared as "final", I presume the default
`pass all` is not reached, am I right?
I am probably missing something obvious but can't figure out what.

Any ideas?

Thanks


Emile `iMil' Heitor  | https://imil.net

Re: blacklistd not reacting to postfix/smtpd AUTH failures

2020-08-08 Thread Emile `iMil&#x27; Heitor


On Fri, 7 Aug 2020, Martin Neitzel wrote:


You have to check the smtpd source to see if blacklist{,_r,_sa}
could be called at the point where the issue is logged.


Indeed the source code delivered. It suggests the notification should be
triggered when the auth attempt reach the smtpd_hard_error_limit:

if (state->error_count >= var_smtpd_hard_erlim) {
state->reason = REASON_ERROR_LIMIT;
state->error_mask |= MAIL_ERROR_PROTOCOL;
smtpd_chat_reply(state, "421 4.7.0 %s Error: too many errors",
 var_myhostname);
pfilter_notify(1, vstream_fileno(state->client));
break;
}

Which I had not set in the main.cf file. After setting it to 5, failed attempts
would be sent to blacklistd:

$ postconf smtpd_hard_error_limit
smtpd_hard_error_limit = 5

$ sudo blacklistctl dump -ab|egrep '32:25'
   186.159.2.57/32:25   1/3 2020/08/08 07:31:19
194.213.125.169/32:25   1/3 2020/08/08 07:17:08
185.4.44.60/32:25   1/3 2020/08/08 07:26:26
 94.243.219.122/32:25   1/3 2020/08/08 07:21:28
  202.40.186.26/32:25   1/3 2020/08/08 07:50:47

Maybe this should be documented...

More on connections limit http://www.postfix.org/TUNING_README.html#conn_limit

------------
Emile `iMil' Heitor  | https://imil.net


!DSPAM:5f2e3ea253355886372770!

blacklistd not reacting to postfix/smtpd AUTH failures

2020-08-07 Thread Emile `iMil&#x27; Heitor




Hi,

On this machine:

NetBSD senate.imil.net 9.0 NetBSD 9.0 (GENERIC) #0: Fri Feb 14 00:06:28 UTC 
2020  mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64

I have the following setup:

$ cat /etc/blacklistd.conf
[local]
domain  dgram   *   *   *   3   24h
smtpstream  *   *   *   3   24h
submission  stream  *   *   *   3   24h
imaps   stream  *   *   *   3   24h
ssh stream  *   *   *   3   24h

$ cat /etc/npf.conf

$ext = vioif0

set bpf.jit on;
alg "icmp"

table  type ipset file "/etc/npf_blacklist"

group "external" on $ext {
ruleset "blacklistd"
block in final from 
pass final all
}

group default {
pass final all
}

This works, i.e. blocks bruteforce attempts on ports 53 and 22, but
authentication failures on port 25 are not catched and thus no blacklisting
takes place:

$ sudo grep AUTH /var/log/maillog|tail -6
Aug  7 14:17:08 senate postfix/smtpd[16590]: lost connection after AUTH from 
unknown[78.128.113.116]
Aug  7 14:25:11 senate postfix/smtpd[3931]: lost connection after AUTH from 
unknown[78.128.113.116]
Aug  7 14:25:16 senate postfix/smtpd[3931]: lost connection after AUTH from 
unknown[78.128.113.116]
Aug  7 14:25:21 senate postfix/smtpd[7936]: lost connection after AUTH from 
unknown[78.128.113.116]
Aug  7 14:25:25 senate postfix/smtpd[3931]: lost connection after AUTH from 
unknown[78.128.113.116]
Aug  7 14:25:29 senate postfix/smtpd[7936]: lost connection after AUTH from 
unknown[78.128.113.116]

$ sudo grep blacklist /var/log/messages
Aug  7 12:38:04 senate blacklistd[1955]: released 1.192.90.183/32:53 after 
86400 seconds
Aug  7 13:53:47 senate blacklistd[1955]: released 3.237.190.49/32:53 after 
86400 seconds
Aug  7 14:05:09 senate blacklistd[1955]: blocked 3.235.107.224/32:53 for 86400 
seconds

$ sudo blacklistctl dump -ab
address/ma:port id  nfail   last access
 89.248.167.135/32:53   1/3 2020/08/07 02:23:22
  195.144.21.56/32:53   1/3 2020/08/07 06:57:38
  146.88.240.15/32:53   1/3 2020/08/06 16:39:09
  3.235.107.224/32:53   3   3/3 2020/08/07 14:05:09
 146.88.240.128/32:53   2/3 2020/08/06 21:51:36
2001:bc8:234c:1/128:22  1/3 2020/08/06 16:21:34
 71.6.232.7/32:53   1/3 2020/08/07 05:42:50
80.82.65.90/32:53   2/3 2020/08/06 18:25:48
 74.82.47.2/32:53   1/3 2020/08/07 02:42:22
   146.88.240.4/32:53   1/3 2020/08/06 16:22:46
  193.29.15.169/32:53   2/3 2020/08/06 18:54:24
  185.232.65.36/32:53   1/3 2020/08/06 22:06:34
 192.35.168.251/32:53   1/3 2020/08/07 01:58:55
185.50.66.1/32:53   1/3 2020/08/07 12:52:59

smtpd is indeed linked over libblacklist:

$ ldd /usr/libexec/postfix/smtpd |grep black
-lblacklist.0 => /usr/lib/libblacklist.so.0

Anything I am missing here?

Thanks,

------------
Emile `iMil' Heitor  | https://imil.net


!DSPAM:5f2d57f9205059030080223!

Re: Poor network performances

2016-10-01 Thread Emile `iMil&#x27; Heitor


On Fri, 30 Sep 2016, Emile `iMil' Heitor wrote:


I tried tweaking sysctl a bit like indicated here:

https://wiki.netbsd.org/tutorials/tuning_netbsd_for_performance/


I found these values to help a lot:

http://proj.sunet.se/E2E/netbsd.txt

from http://proj.sunet.se/E2E/tcptune.html


Insightful thread on this topic and how to read & understand those parameters:

https://mail-index.netbsd.org/tech-net/2015/08/03/msg005317.html

Long story short: I can now get 1Gbps from my re(4) NIC on NetBSD 7.0/amd64.

----
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \


!DSPAM:57ef6ce1128371868716790!

Re: Poor network performances

2016-09-30 Thread Emile `iMil&#x27; Heitor


On Fri, 30 Sep 2016, Emile `iMil' Heitor wrote:



I tried tweaking sysctl a bit like indicated here:

https://wiki.netbsd.org/tutorials/tuning_netbsd_for_performance/


I found these values to help a lot:

http://proj.sunet.se/E2E/netbsd.txt

from http://proj.sunet.se/E2E/tcptune.html


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \


!DSPAM:57ee8907262441074220105!

Poor network performances

2016-09-30 Thread Emile `iMil&#x27; Heitor



Hi,

I've been witnessing poor performances while using NetBSD 7.0/amd64 on a
Gigabit network. I tried this with 2 differents NICs.

Default scenario, either re(4) or alc(4):

$ ifconfig re0 # relevant bits
re0: flags=8843 mtu 1500
capabilities=3f00
capabilities=3f00
enabled=0
ec_capabilities=3
ec_enabled=0
address: f8:df:2f:f7:af:f2
media: Ethernet autoselect (1000baseT full-duplex)
status: active
[...]

On the actual gigabit LAN:

$ iperf3 -c coruscant -l16k
Connecting to host coruscant, port 5201
[  4] local 192.168.1.57 port 32792 connected to 192.168.1.249 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  4]   0.00-1.00   sec  6.73 MBytes  56.5 Mbits/sec0   69.3 KBytes 
[  4]   1.00-2.00   sec  12.1 MBytes   102 Mbits/sec0102 KBytes 
[  4]   2.00-3.00   sec  14.1 MBytes   118 Mbits/sec0136 KBytes 
[  4]   3.00-4.00   sec  15.0 MBytes   126 Mbits/sec   19154 KBytes 
[  4]   4.00-5.00   sec  16.4 MBytes   138 Mbits/sec0188 KBytes 
[  4]   5.00-6.00   sec  16.7 MBytes   140 Mbits/sec   30187 KBytes 
[  4]   6.00-7.00   sec  18.3 MBytes   153 Mbits/sec0195 KBytes 
[  4]   7.00-8.00   sec  17.8 MBytes   149 Mbits/sec0195 KBytes 
[  4]   8.00-9.00   sec  18.1 MBytes   152 Mbits/sec0195 KBytes 
[  4]   9.00-10.00  sec  18.0 MBytes   151 Mbits/sec0195 KBytes 
- - - - - - - - - - - - - - - - - - - - - - - - -

[ ID] Interval   Transfer Bandwidth   Retr
[  4]   0.00-10.00  sec   153 MBytes   129 Mbits/sec   49 sender
[  4]   0.00-10.00  sec   152 MBytes   128 Mbits/sec  receiver

The client machine is a linux box which actually reaches Gb transfer with
another linux host.

Over my FO Internet connection:

NetBSD:
$ iperf3 -c ping.online.net
[...]
[ ID] Interval   Transfer Bandwidth   Retr
[  6]   0.00-10.01  sec  44.3 MBytes  37.1 Mbits/sec   45 sender
[  6]   0.00-10.01  sec  44.1 MBytes  37.0 Mbits/sec  receiver

Linux:
$ iperf3 -c ping.online.net
[...]
[ ID] Interval   Transfer Bandwidth   Retr
[  4]   0.00-10.00  sec   124 MBytes   104 Mbits/sec   49 sender
[  4]   0.00-10.00  sec   121 MBytes   102 Mbits/sec  receiver

To be 100% honest, the Linux box is connected through a PLC while the NetBSD box
is directly connected to the ISP router...

I tried tweaking sysctl a bit like indicated here:

https://wiki.netbsd.org/tutorials/tuning_netbsd_for_performance/

without success.

Hints? Thoughts?


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \


!DSPAM:57ee707c128551504132269!

Re: NetBSD 6 and 7 panic on kvm

2016-01-28 Thread Emile `iMil&#x27; Heitor


On Thu, 28 Jan 2016, Christos Zoulas wrote:


Indeed, but can't you use the images from:

http://nyftp.netbsd.org/pub/NetBSD-daily/netbsd-7/201601281010Z/amd64/


Just did it, as predicted, everything went smooth. Thanks!


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: NetBSD 6 and 7 panic on kvm

2016-01-28 Thread Emile `iMil&#x27; Heitor


On Thu, 28 Jan 2016, Emile `iMil' Heitor wrote:


http://imil.net/stuff/NetbSD-7.0-ddb-bt.png

seems related to the virtio driver.


So I found this PR:

https://mail-index.netbsd.org/netbsd-bugs/2015/12/15/msg043768.html

Reducing the memory for the virtual machine allowed to install the OS, and
answering to Christos question on this PR, no, the official iso image does not
bundle the updated version of bus_dma.c.
It would really be a good idea to update those so people can actually try NetBSD
on kvm.


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: NetBSD 6 and 7 panic on kvm

2016-01-28 Thread Emile `iMil&#x27; Heitor


On Thu, 28 Jan 2016, Emile `iMil' Heitor wrote:



Hi,

I've been trying to install NetBSD/amd64 7.0 and 6.1.5 as a virtual machine
using Linux's kvm, and it led to panic while extracting sets:

http://imil.net/stuff/NetBSD-7.0-kvm.png


A backtrace might help:

http://imil.net/stuff/NetbSD-7.0-ddb-bt.png

seems related to the virtio driver.

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

NetBSD 6 and 7 panic on kvm

2016-01-28 Thread Emile `iMil&#x27; Heitor



Hi,

I've been trying to install NetBSD/amd64 7.0 and 6.1.5 as a virtual machine
using Linux's kvm, and it led to panic while extracting sets:

http://imil.net/stuff/NetBSD-7.0-kvm.png

I came across kern/50139 which suggests to run the installation with 1 vCPU,
but it gave the same result.
How can I proceed in order to help debug this issue? kvm is a very popular
virtualization system on Linux and recently took over Xen regarding overall
performances, many cloud providers are using it.

Thanks,

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: NetBSD 6.1 NFS server performances

2015-12-03 Thread Emile `iMil&#x27; Heitor


On Thu, 3 Dec 2015, Michael van Elst wrote:


reading with rsize=64k: ~ 90MB/s
reading with rsize=32k: ~ 60MB/s
writing with wsize=64k: ~ 40MB/s
writing with wsize=32k: ~ 30MB/s


Well, I have similar results except I get better performances while
reading/writing with {r,w}size=32k instead of 64, about +20MB/s


The disk itself gets 110MB/s reading and 90MB/s writing locally
on the NFS server.


I naively thought that with a 200MB/s write average I would top wire-speed with
NFS (which I do using dd|nc / nc>file)


Writing via NFS is always slower because even with NFS3 it's
partially synchronous.


Understood. Thanks again for your feedback!

----
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: NetBSD 6.1 NFS server performances

2015-12-03 Thread Emile `iMil&#x27; Heitor


On Thu, 3 Dec 2015, Emile `iMil' Heitor wrote:


I know, right? And yes results are identical with differents bs values.

I've tried a bazillion NFS options on the clients (TCP, UDP, {r,w}size from 
8192

to 64k...), tried many OSes as a client, the NFS results are consistent,
always between 20 and 30MB/s.


interestingly enough, on a NetBSD client:

$ cd ${NFS_SHARE}
$ dd if=/dev/zero of=./test bs=32k count=31000
31000+0 records in
31000+0 records out
1015808000 bytes transferred in 12.673 secs (80155290 bytes/sec)

It's not yet wire-speed but much better than the results I posted earlier that
were obtained on Linux and OS/X clients.

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

NetBSD 6.1 NFS server performances

2015-12-03 Thread Emile `iMil&#x27; Heitor



Continuing on the NAS performances topic, now it's NFS server's turn.
First things first, I've checked both network and disk throughput, neither cause
a bottleneck:

tatooine is the client
coruscant is the server

$ iperf -c coruscant -p 2828 -t 10

Client connecting to coruscant, TCP port 2828
TCP window size: 43.8 KByte (default)

[  3] local 192.168.1.1 port 51371 connected with 192.168.1.2 port 2828
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec  1.06 GBytes   908 Mbits/sec

$ dd if=/dev/zero bs=1024K count=1000 | nc -v coruscant 2828
Connection to coruscant 2828 port [tcp/*] succeeded!
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 9.27363 s, 113 MB/s

Gigabit link, all clear.

Now using NFS:

$ dd if=/dev/zero bs=1024K count=1000 >Desktop/nfs@coruscant/imil/tmp/test
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 51.8476 s, 20.2 MB/s

I know, right? And yes results are identical with differents bs values.

I've tried a bazillion NFS options on the clients (TCP, UDP, {r,w}size from 8192
to 64k...), tried many OSes as a client, the NFS results are consistent,
always between 20 and 30MB/s.
NFS server is started via rc.d with the following rc.conf variables:

rpcbind=YES
mountd=YES
nfs_server=YES
nfsd_flags="-6tun 8"
lockd=YES
statd=YES

And yes I tried increasing or reducing thread number.

/etc/exports is pretty simple:

/export -alldirs -noresvport -maproot=root:wheel -network 192.168.1.0/24

I've read an extensive number of "NetBSD NFS server performances" posts here,
applied every suggestion without any luck, any idea would be highly appreciated.

Thanks,

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: priocscan vs fcfs

2015-12-02 Thread Emile `iMil&#x27; Heitor


On Wed, 2 Dec 2015, Michael van Elst wrote:


fcfs is also the "neutral" queue for drivers stacking on top of
each other. The queue sorting should really only be done
at one level.

But raidframe is more complicated because it does its own queuing
and sorting outside of this schema, in particular when it has to
read-modify-write stripe sets for small I/O.

That's probably why setting the queues all to fcfs is the best
for you.


Thanks a lot for this clear analysis Michael.

I can now confirm the results I've witnessed earlier, I've ran a couple of
benchmarks, including bonnie++ and iozone, the latter shows a ratio of x5 in
favor of the fsfc strategy for every type of operation. For those interested,
iozone spreadsheet output is available here (OOo / LibreOffice):

https://home.imil.net/tmp/coruscant-iozone-priocscan.ods
https://home.imil.net/tmp/coruscant-iozone-fsfc.ods

For each subset, first column is the amount of data written (from 64K to 4M)
and first row is the block size.

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

priocscan vs fcfs

2015-12-02 Thread Emile `iMil&#x27; Heitor



Hi,

I've never been really happy with my NetBSD RAIDframe NAS, never really got the
speed I was supposed to even with the right alignment / raid layout etc.

Today I dug into `dkctl(8)' while searching if cache was enabled for read and
write, and I came across the "strategy" command.
Long story short, changing from priocscan to fcfs strategy multiplied my NAS's
write speed by 6! I changed the strategy for all disk members:

# dkctl wd0 strategy fcfs
# dkctl wd1 strategy fcfs
# dkctl wd2 strategy fcfs

and also for the RAIDframe:

# dkctl raid0 strategy fcfs
/dev/rraid0d: priocscan -> fcfs

as changing it only for the disk members was apparently counter-productive.
And there we go, from a 40/50MB/s write average to a stunning 200 to 300MB/s,
which is more like what the disks can theroically do.

Could anyone with some background on these strategies explain what's behind the
curtain? I couldn't really find precise documentation on this matter...

Thanks,

------------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)

2015-11-27 Thread Emile `iMil&#x27; Heitor


On Thu, 26 Nov 2015, Emile `iMil' Heitor wrote:

If this issue _is_ NFS related, which I doubt now, it is then read-related, 
as

the build is done in tmpfs.


Pushing the logic further, I just tried with pkgsrc itself being in tmpfs, and
it froze even faster.


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)

2015-11-26 Thread Emile `iMil&#x27; Heitor


On Thu, 26 Nov 2015, Manuel Bouyer wrote:


what does 'show uvm' report ?


db{0}> show uvm
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
, ncolors=8  7444115 VM pages: 53990 active, 1807 inactive, 1 wired, 7302474 fre
e
  pages  26626 anon, 25029 file, 4143 exec
  freemin=4096, free-target=5461, wired-max=2481371
  cpu0:
faults=2789285, traps=2796277, intrs=1396531, ctxswitch=1585247
softint=792597, syscalls=1259477
  cpu1:
faults=1696656, traps=1698017, intrs=180486, ctxswitch=127378
softint=36160, syscalls=608644
  cpu2:
faults=1207093, traps=1208093, intrs=160266, ctxswitch=65538
softint=18178, syscalls=412550
  cpu3:
faults=1434344, traps=1435516, intrs=174413, ctxswitch=100028
softint=24126, syscalls=512909
  cpu4:
faults=1273978, traps=1275187, intrs=161384, ctxswitch=68847
softint=19305, syscalls=424913
  cpu5:
faults=1622825, traps=1624084, intrs=171817, ctxswitch=105319
softint=31330, syscalls=510165
  cpu6:
faults=1734292, traps=1735749, intrs=170374, ctxswitch=99131
softint=26841, syscalls=551106
  cpu7:
faults=1392652, traps=1393985, intrs=166582, ctxswitch=81469
softint=20174, syscalls=442880
  cpu8:
faults=1492063, traps=1493265, intrs=166791, ctxswitch=88768
softint=24325, syscalls=492824
  cpu9:
faults=1579170, traps=1580406, intrs=167471, ctxswitch=89049
softint=23423, syscalls=506804
  cpu10:
faults=2153399, traps=2154831, intrs=184225, ctxswitch=149924
softint=40597, syscalls=828691
  cpu11:
faults=3136585, traps=3138031, intrs=219926, ctxswitch=251413
softint=67270, syscalls=1262227
  cpu12:
faults=4211510, traps=4213265, intrs=222549, ctxswitch=273560
softint=78470, syscalls=1584403
  cpu13:
faults=3938228, traps=3940765, intrs=252763, ctxswitch=368601
softint=110598, syscalls=1636441
  cpu14:
faults=1720207, traps=1721476, intrs=183332, ctxswitch=138148
softint=43486, syscalls=759336
  cpu15:
faults=1547431, traps=1548457, intrs=177462, ctxswitch=126099
softint=36803, syscalls=657976
  fault counts:
noram=0, noanon=0, pgwait=0, pgrele=0
ok relocks(total)=19975519(19975516), anget(retrys)=1498606(0), amapcopy=186
2658
neighbor anon/obj pg=1672558/779689, gets(lock/unlock)=20195408/19975523
cases: anon=1035657, anoncow=462949, obj=18439642, prcopy=1755801, przero=11
148433
  daemon and swap counts:
woke=0, revs=0, scans=0, obscans=0, anscans=0
busy=0, freed=0, reactivate=0, deactivate=0
pageouts=0, pending=0, nswget=0
nswapdev=0, swpgavail=0
swpages=0, swpginuse=0, swpgonly=0, paging=0


----
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)

2015-11-26 Thread Emile `iMil&#x27; Heitor


On Thu, 26 Nov 2015, Emile `iMil' Heitor wrote:


Again, as there's no log at all, what would help debugging this behaviour?


FWIW, some ddb output (ddb is triggered by hitting + on domU's console):

fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 8012e5ad cs e030 rflags 202 cr2 7f7ff6c1e049
ilevel 8 rsp a0051864cc58
curlwp 0xa00035538840 pid 0.2 lowest kstack 0xa0051864a2c0
Stopped in pid 0.2 (system) at  netbsd:breakpoint+0x5:  leave
breakpoint() at netbsd:breakpoint+0x5
xencons_tty_input() at netbsd:xencons_tty_input+0xb2
xencons_handler() at netbsd:xencons_handler+0x65
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x19
evtchn_do_event() at netbsd:evtchn_do_event+0x281
do_hypervisor_callback() at netbsd:do_hypervisor_callback+0x143
hypervisor_callback() at netbsd:hypervisor_callback+0x9e
idle_loop() at netbsd:idle_loop+0xe8
ds  3c80
es  c780
fs  c040
gs  7524
rdi a0003a62b330
rsi 8437d01f
rbp a0051864cc58
rbx 8437d01f
rdx 2b
rcx 2b
rax 1
r8  0
r9  805fc780cpu_info_primary
r10 cdd9f51e239cbb87
r11 246
r12 a0003d754c00
r13 8437d020
r14 a0003a62b330
r15 1
rip 8012e5adbreakpoint+0x5
cs  e030
rflags  202
rsp a0051864cc58
ss  e02b
netbsd:breakpoint+0x5:  leave
db{0}> ps
PIDLID S CPU FLAGS   STRUCT LWP *   NAME WAIT
162591 3  13 0   a0003f435080awk netio
242691 3  14 0   a0003f446a80cat nfsrcv
233011 3  1480   a0003f3f8a60 sh wait
118351 3  1480   a0003f40c0c0  bmake wait
9493 1 3  1480   a0003f26d4a0 sh wait
4831 1 3  1480   a0003f42d320  bmake wait
285131 3  1480   a0003f3f09e0 sh wait
6232 1 3  1580   a0003ed15ae0 sh wait
1172 1 3  1580   a0003f083000  bmake wait
165441 3  1580   a0003f439500 sh wait
181411 3  1580   a0003f430420 sh wait
2349 1 3  1580   a0003f448280   bash wait
490  1 3   580   a0003f447680   sshd select
2135 1 3   080   a0003f445640   sshd select
2234 1 3   480   a0003f2be580   bash ttyraw
381  1 3  1280   a0003f2be9a0   sshd select
382  1 3  1380   a0003e9cc680   sshd select
1868 1 3  15 0   a0003dca59c0  getty nfsrcv
2354 1 3   5 0   a0003ed51b00   cron nfsrcv
1675 1 3  1180   a0003edbd2e0  inetd kqueue
2105 1 3  1280   a0003ee0d720   nrpe select
2086 1 3  14   100   a0003eca46a0   qmgr nfsrcv
2033 1 3  13 0   a0003eca4ac0 pickup nfskqdet
2055 1 3   4 0   a0003edbd700 master tstile
164013 5  11  1000   a0003efff740  python2.7
1640 9 3  1180   a0003ee0db40  python2.7 kqueue
1640 8 3  1280   a0003ed152a0  python2.7 kqueue
1640 1 3   980   a0003e0f6240  python2.7 select
1555 1 3  1380   a0003e0f6660   sshd select
1407 1 3  1380   a0003dd15a20 powerd kqueue
892  1 3   280   a0003e099640  rpc.lockd select
884  1 3  1580   a0003e099a60  rpc.statd select
686  1 3   780   a0003dd151e0rpcbind select
677  1 3   5 0   a0003dd6fa40syslogd nfsrcv
11 3   880   a0003d75b100   init wait
0  131 3   3   200   a0003d75d140  nfskqpoll nfsrcv
0  129 3   4   200   a0003dc4e160   aiodoned aiodoned
0  128 3   7   200   a0003dc4e580ioflush syncer
0  127 3   0   200   a0003dc4e9a0   pgdaemon pgdaemon
0  124 3  14   200   a0003d75a920  nfsio nfsiod
0  123 3  13   200   a0003d75a500  nfsio nfsiod
0  122 3   9   200   a0003d75a0e0  nfsio nfsiod
0  121 3  15   200   a0003d75b940  nfsio nfsiod
0  120 3   0   200   a0003d75b520  cryptoret crypto_w
0  119 3   0   200   a0003d7530c0  unpgc unpgc
0  118 3   0   200   a0003d75c960xen_balloon xen_balloon
0  117 3   9   200   a0003d75c540vmem_rehash vmem_rehash
0  116 3   0   200   a0003d75d980

NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)

2015-11-26 Thread Emile `iMil&#x27; Heitor


On Mon, 2 Nov 2015, Emile `iMil' Heitor wrote:


I'm trying to get rid of those hangs for weeks now, tried every mount flag
combination without success, the system would freeze randomly, leaving the 
whole
OS unresponsive. There's no log, no kernel message, the domU actually 
responds

to network solicitations (ping, telnet 22...) but once it's frozen, it is
impossible to run any command, it will just hang.

The exact same setup is successfully running since Sept 2014 on
NetBSD 6.1/amd64.

Any idea how to get some valuable information to help tracking down this
awful behaviour?


A bit of follow-up. I've been trying many workarounds during the past weeks, and
right now I'm not convinced it even is an NFS problem.
I've setup a tmpfs bulk build directory, and even that way, NetBSD 7.0 would
freeze randomly after a couple of minutes while processing `pbulk.sh'.
What I can say:

- the server is a fresh diskless NetBSD 7.0 domU (PXE/NFS)
- there's not a single information about the freeze, not even in the console
- I've only witnessed those freezes when calling `pbulk.sh' (couldn't get
  further anyway)
- cvs co pkgsrc does not freezes, I ran it many times without issues
- the domU stays up for days if no operation is made
- I started this domU on various dom0s to validate this was not a hardware
  problem, always had the same symptoms
- I tried a custom 7.0_STABLE kernel without success

If this issue _is_ NFS related, which I doubt now, it is then read-related, as
the build is done in tmpfs.

Again, as there's no log at all, what would help debugging this behaviour?

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: Raspberry Pi 2, nfs mount hangs after some time

2015-11-02 Thread Emile `iMil&#x27; Heitor


On Mon, 2 Nov 2015, Christos Zoulas wrote:


Can you get into ddb?


unfortunately no, the system hangs but does not panic, it just becomes unusable.


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: Raspberry Pi 2, nfs mount hangs after some time

2015-11-02 Thread Emile `iMil&#x27; Heitor


On Mon, 26 Oct 2015, Robert Elz wrote:


For workarounds, mount using tcp (won't cure the problem, but will
make it far less common), and use interruptible mounts (mount_nfs -T -i)
so when it does hang, you can kill the process(es) at least.


A mee-too reply.

I've setup a NetBSD 7.0/amd64 bulk-build domU as I did for NetBSD 6.1/amd64, it
uses our platform's NetApp NFS servers (thousands of Linux domUs are using
those, the hardware is not guilty).
I'm trying to get rid of those hangs for weeks now, tried every mount flag
combination without success, the system would freeze randomly, leaving the whole
OS unresponsive. There's no log, no kernel message, the domU actually responds
to network solicitations (ping, telnet 22...) but once it's frozen, it is
impossible to run any command, it will just hang.

The exact same setup is successfully running since Sept 2014 on
NetBSD 6.1/amd64.

Any idea how to get some valuable information to help tracking down this
awful behaviour?

------------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

IPsec over GRE slow performances

2015-07-14 Thread Emile `iMil&#x27; Heitor



Hi,

I have set up an IPsec over GRE connection with a remote host, both are NetBSD
6.1 based. The "client" is connected to the Internet through a 400Mbps fiber
connection. The "server" is located on a 10Gbps network. Both machines have
1Gbps NICs which behave perfectly, meaning they both reach the link speed limit
when transferring data outside the IPsec tunnel.
When doing a transfer through the tunnel, speed drops at a factor 5 to 10:

direct connection:

/dev/null27%[> ] 503.19M  45.3MB/s   eta 83s

IPsec connection:

/dev/null 2%[  ]  47.76M  6.05MB/s   eta 5m 3s

The tunnel is setup this way:

On the server, which is a NetBSD domU running on a debian/amd64 dom0:

$ cat /etc/ifconfig.xennet0
# server interface
up
inet 192.168.1.2 netmask 255.255.255.0
inet 172.16.1.1 netmask 0xfffc alias
$ cat /etc/ifconfig.gre0 
create

tunnel 172.16.1.1 172.16.1.2 up
inet 172.16.1.5 172.16.1.6 netmask 255.255.255.252

IPsec traffic is forwarded from dom0's public IP to the domU's xennet0 interface
through an iptables NAT rule:

-A PREROUTING -i eth0 -p udp -m udp --dport 500 -j DNAT --to-destination 192.168.1.2:500 
-A PREROUTING -i eth0 -p esp -j DNAT --to-destination 192.168.1.2 
-A PREROUTING -i eth0 -p ah -j DNAT --to-destination 192.168.1.2


On the client:

$ cat /etc/ifconfig.vlan8 
# client public interface

create
vlan 8 vlanif re0
!dhcpcd -i $int
inet 172.16.1.2 netmask 0xfffc alias
$ cat /etc/ifconfig.gre1 
create

tunnel 172.16.1.2 172.16.1.1 up
inet 172.16.1.6 172.16.1.5 netmask 255.255.255.252

On racoon side, I tried various hash / encryption algorithms combinations, even
enc_null, but nothing changes really, transfer is still stuck at a 6MB/s max.

Here's the racoon setup:

On the server:

remote office.public.ip {
exchange_mode main;
lifetime time 28800 seconds;
proposal {
encryption_algorithm blowfish;
hash_algorithm sha1;
authentication_method pre_shared_key;
dh_group 2;
}
generate_policy off;
}

sainfo address 172.16.1.1/30 any address 172.16.1.2/30 any {
pfs_group 2;
encryption_algorithm blowfish;
authentication_algorithm hmac_sha1;
compression_algorithm deflate;
lifetime time 3600 seconds;
}

On the client:

remote node.public.ip {
exchange_mode main;
lifetime time 28800 seconds;
proposal {
encryption_algorithm blowfish;
hash_algorithm sha1;
authentication_method pre_shared_key;
dh_group 2;
}
generate_policy off;
}

sainfo address 172.16.1.2/30 any address 172.16.1.1/30 any {
pfs_group 2;
encryption_algorithm blowfish;
authentication_algorithm hmac_sha1;
compression_algorithm deflate;
lifetime time 3600 seconds;
}

The tunnel establishes with no issue, the only problem here is transfer drop.
Again, when transferring from / to the server from / to the client without
tunnel, speed is optimal, drop occurs _only_ through IPsec.

Both machines are intel-based CPUs running at 2+GHz, plenty of memory and very
little CPU time consumed by anything else than forwarding / NAT.

Has anyone witnessed such a behaviour? Any idea on where to look further?

Thanks,

------------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: alc(4) current for NetBSD 6.1.5, patch included

2015-01-25 Thread Emile `iMil&#x27; Heitor


On Sun, 25 Jan 2015, Leonardo Taccari wrote:


If someone has succesfully tested alc(4) on 813x/815x chipsets too can
we request a pull up for netbsd-7? It seems that the 816x chipsets are
common on various motherboards and laptopts.

What do you think?


Definitely a good idea as those chips are widely spread for a very long time
now. Also, I believe backporting it to 7.0 should be pretty straightforward.


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

alc(4) current for NetBSD 6.1.5, patch included

2015-01-24 Thread Emile `iMil&#x27; Heitor



Hi,

Using Leonardo Taccari's work on alc(4) for NetBSD current, I've enabled my
AR8171 Gbe NIC on NetBSD 6.1.5 by porting the driver.
Anyone interested, just apply the enclosed patch in src/sys/dev/pci then:

make -f Makefile.pcidevs in src/sys/dev/pci

and rebuild your kernel as usual.

HTH

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \Index: if_alc.c
===
RCS file: /cvsroot/src/sys/dev/pci/if_alc.c,v
retrieving revision 1.5
diff -u -r1.5 if_alc.c
--- if_alc.c29 Aug 2011 14:47:08 -  1.5
+++ if_alc.c24 Jan 2015 21:47:00 -
@@ -95,6 +95,16 @@
"Atheros AR8152 v1.1 PCIe Fast Ethernet" },
{ PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8152_B2, 6 * 1024,
"Atheros AR8152 v2.0 PCIe Fast Ethernet" },
+   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8161, 9 * 1024,
+   "Atheros AR8161 PCIe Gigabit Ethernet" },
+   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8162, 9 * 1024,
+   "Atheros AR8162 PCIe Fast Ethernet" },
+   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8171, 9 * 1024,
+   "Atheros AR8171 PCIe Gigabit Ethernet" },
+   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8172, 9 * 1024,
+   "Atheros AR8172 PCIe Fast Ethernet" },
+   { PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_E2200, 9 * 1024,
+   "Killer E2200 Gigabit Ethernet" },
{ 0, 0, 0, NULL },
 };
 
@@ -103,33 +113,50 @@
 static int alc_detach(device_t, int);
 
 static int alc_init(struct ifnet *);
+static int alc_init_backend(struct ifnet *, bool);
 static voidalc_start(struct ifnet *);
 static int alc_ioctl(struct ifnet *, u_long, void *);
 static voidalc_watchdog(struct ifnet *);
 static int alc_mediachange(struct ifnet *);
 static voidalc_mediastatus(struct ifnet *, struct ifmediareq *);
 
-static voidalc_aspm(struct alc_softc *, int);
+static voidalc_aspm(struct alc_softc *, int, int);
+static voidalc_aspm_813x(struct alc_softc *, int);
+static voidalc_aspm_816x(struct alc_softc *, int);
 static voidalc_disable_l0s_l1(struct alc_softc *);
 static int alc_dma_alloc(struct alc_softc *);
 static voidalc_dma_free(struct alc_softc *);
+static voidalc_dsp_fixup(struct alc_softc *, int);
 static int alc_encap(struct alc_softc *, struct mbuf **);
 static struct alc_ident *
alc_find_ident(struct pci_attach_args *);
 static voidalc_get_macaddr(struct alc_softc *);
+static voidalc_get_macaddr_813x(struct alc_softc *);
+static voidalc_get_macaddr_816x(struct alc_softc *);
+static voidalc_get_macaddr_par(struct alc_softc *);
 static voidalc_init_cmb(struct alc_softc *);
 static voidalc_init_rr_ring(struct alc_softc *);
-static int alc_init_rx_ring(struct alc_softc *);
+static int alc_init_rx_ring(struct alc_softc *, bool);
 static voidalc_init_smb(struct alc_softc *);
 static voidalc_init_tx_ring(struct alc_softc *);
 static int alc_intr(void *);
 static voidalc_mac_config(struct alc_softc *);
+static uint32_talc_mii_readreg_813x(struct alc_softc *, int, int);
+static uint32_talc_mii_readreg_816x(struct alc_softc *, int, int);
+static voidalc_mii_writereg_813x(struct alc_softc *, int, int, int);
+static voidalc_mii_writereg_816x(struct alc_softc *, int, int, int);
 static int alc_miibus_readreg(device_t, int, int);
 static voidalc_miibus_statchg(device_t);
 static voidalc_miibus_writereg(device_t, int, int, int);
-static int alc_newbuf(struct alc_softc *, struct alc_rxdesc *, int);
+static uint32_talc_miidbg_readreg(struct alc_softc *, int);
+static voidalc_miidbg_writereg(struct alc_softc *, int, int);
+static uint32_talc_miiext_readreg(struct alc_softc *, int, int);
+static uint32_talc_miiext_writereg(struct alc_softc *, int, int, int);
+static int alc_newbuf(struct alc_softc *, struct alc_rxdesc *, bool);
 static voidalc_phy_down(struct alc_softc *);
 static voidalc_phy_reset(struct alc_softc *);
+static voidalc_phy_reset_813x(struct alc_softc *);
+static voidalc_phy_reset_816x(struct alc_softc *);
 static voidalc_reset(struct alc_softc *);
 static voidalc_rxeof(struct alc_softc *, struct rx_rdesc *);
 static int alc_rxintr(struct alc_softc *);
@@ -159,12 +186,34 @@
 alc_miibus_readreg(device_t dev, int phy, int reg)
 {
struct alc_softc *sc = device_private(dev);
+   int v;
+
+   if ((sc->alc_flags & AL

Re: poor write performances with RAID 5 Raidframe

2014-12-06 Thread Emile `iMil&#x27; Heitor


On Sun, 30 Nov 2014, Emile `iMil' Heitor wrote:


$ dd if=/dev/zero of=/export/imil/tmp/test bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 42.619 secs (24603486 bytes/sec)


Turns out performances are not *that* bad considering the disks I'm using[1][2]
plus RAID 5 overead. I re-ran some benchmarks using benchmarks/bonnie++ and dd
but with 4k blocks this time, and boths results were consistent, between 30 and
40MB/s write speed.

[1]: 1x http://www.cnet.com/products/seagate-barracuda-7200-14-2tb/specs/
[2]: 2x 
http://www.storagereview.com/seagate_barracuda_green_2tb_review_st2000dl003

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: poor write performances with RAID 5 Raidframe

2014-11-30 Thread Emile `iMil&#x27; Heitor




# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5

~

# newfs -O2 -b64k -I dk0



This might not be the best configuration; you've got two RAIDframe
SUs per FFS block.


So you'd recommend to run newfs with 32k as the block size?

--------
Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

poor write performances with RAID 5 Raidframe

2014-11-29 Thread Emile `iMil&#x27; Heitor



Hi,

I'm witnessing poor performances while writing to a RAID 5 Raidframe I setup
to be my home NAS a couple of months ago.
Following some documentations, tutorials, and mailing-list posts, I think I
had everything aligned right as the following configuration suggests:

raid1.conf:
START array
# numRow numCol numSpare
1 3 0

START disks
/dev/wd2a
/dev/wd3a
/dev/wd4a

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5

START queue
fifo 100

# gpt show raid1
   startsize  index  contents
   0   1 PMBR
   1   1 Pri GPT header
   2  32 Pri GPT table
  34  94
 128  7814057951  1  GPT part - NetBSD FFSv1/FFSv2
  7814058079  32 Sec GPT table
  7814058111   1 Sec GPT header

# dkctl raid1 listwedges
/dev/rraid1d: 1 wedge:
dk0: 5bdb599c-744a-11e3-9160-002354666e0f, 7814057951 blocks at 128, type: ffs

newfs(8) was done like this:

# newfs -O2 -b64k -I dk0

The disks involved are Seagate Barracuda which are known to have as such high
write transfer rate as ~100MB/s, and I merely get a 25MB/s with those:

$ dd if=/dev/zero of=/export/imil/tmp/test bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 42.619 secs (24603486 bytes/sec)

On the other hand, read speed is really good:

$ dd if=/export/imil/media/foobar_adventures.mp4 of=/dev/null bs=1m
1395+1 records in
1395+1 records out
1463376863 bytes transferred in 7.899 secs (185261028 bytes/sec)

Any advice on this matter? Does this configuration looks good?

Thanks,


Emile `iMil' Heitor * 
  _
| http://imil.net| ASCII ribbon campaign ( )
| http://www.NetBSD.org  |  - against HTML email  X
| http://gcu.info|  & vCards / \

Re: [RAIDframe] system hangs while manipulating large files

2014-01-03 Thread Emile `iMil&#x27; Heitor




  FWIW, it is documented in mount(8):


log (FFS only) Mount the file system with wapbl(4) meta-
data journaling, also known simply as logging.  It
provides rapid metadata updates and eliminates the
need to check file system consistency after a system
outage.  A file system mounted with log can not be
mounted with async.  [...]


Well, my bad :(

Wouldn't it be wise that mount(8) forbids the use of both flags ?

--
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

Re: [RAIDframe] system hangs while manipulating large files

2014-01-03 Thread Emile `iMil&#x27; Heitor




asynchronous + log should be impossible.


I've just discovered that the hard way :)


Did you find out wether the system is really hanging or just very slow?
In particular, is there any disk activity and does the system respond
on the network?


Like I said earlier on this thread, the system hangs, no service responds
anymore, but the server still replies to ICMP and does not panic. I waited for
about 4 hours to see if the operations would proceed but they didn't.

------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

Re: [RAIDframe] system hangs while manipulating large files

2014-01-03 Thread Emile `iMil&#x27; Heitor




Could one of those mount flags cause any harm with a RAIDframe array ?

/dev/dk0 on /export type ffs (asynchronous, log, noatime, NFS exported, 
local)


well well, it seems like removing the `async' flag solved the issue, I've just
ran multiple tests that caused instant fail yesterday which passed right away
without asynchronous I/O... some kind of memleak maybe ?

------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

Re: [RAIDframe] system hangs while manipulating large files

2014-01-03 Thread Emile `iMil&#x27; Heitor




This value can be set to zero, however up to a factor of three
in throughput will be lost over the performance obtained at a 5%
threshold.


hm, I missed that point. Will raise up.


Well, I have applied both recommendations, and yet the machine just had the same
behaviour while rsync'ing data to the newly created RAID5. Again, the system
was kind of "up" but no command could be typed and services did not respond
anymore. Everything was frozen.

Could one of those mount flags cause any harm with a RAIDframe array ?

/dev/dk0 on /export type ffs (asynchronous, log, noatime, NFS exported, local)

Just to be clear, the trivial dd(1) test as of the rsync took place locally
on the server, no network transfers yet.

------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

Re: [RAIDframe] system hangs while manipulating large files

2014-01-02 Thread Emile `iMil&#x27; Heitor



Hi Greg,


Any particular reason why you set minfree to 0 instead of leaving it at
the default?  Especially given that the man-page for tunefs says:


Just because that volume is dedicated in storing media and stuff, no need to
reserve space for root, and hey, 5% of 4T is a lot :\


This value can be set to zero, however up to a factor of three
in throughput will be lost over the performance obtained at a 5%
threshold.


hm, I missed that point. Will raise up.


Losing a factor of 3 in throughput in addition to the RAID5 write
penalty seems pretty expensive, performance-wise :(  (You might also
get better write performance with '64 1 1 5' in this configuration,
instead of '32 1 1 5')


Noted, will definitely try.

Thanks for the replies!

----------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

Re: [RAIDframe] system hangs while manipulating large files

2014-01-02 Thread Emile `iMil&#x27; Heitor




Does it hang or does it just take an enormously long time (>30min)
to complete the operation?


oh, I must say I haven't waited that long, this morning it's been struggling
for about 20 minutes, I had to hard-reset the machine as I needed it for work.

------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

[RAIDframe] system hangs while manipulating large files

2014-01-02 Thread Emile `iMil&#x27; Heitor



Hi and Happy New year fellow NetBSD users,

I'm setting up a RAID5 NAS using NetBSD 6.1.2/amd64. The RAID5 array is used as
a media storage system, it is composed of 3x2T SATA disks. I followed various
on-line documents such as:

http://www.netbsd.org/docs/guide/en/chap-rf.html
http://mail-index.netbsd.org/netbsd-users/2011/09/02/msg008979.html
http://abs0d.blogspot.fr/2011/08/setting-up-8tb-netbsd-file-server.html
http://pbraun.nethence.com/unix/sysutils_bsd/raidframe.html

I've copied about 10G of data composed of 1 to 100M files, everything went
smoothly. Then, in order to have some performances figures, I used `dd' like
this:

$ dd if=/dev/zero of=./test bs=1m count=5000

withing a couple of seconds, the whole system becomes irresponsive and hangs
totally but does not panic. I have reproduced this behaviour a couple of times,
the system hangs everytime.

Here's the RAID setup:

$ cat /etc/raid1.conf
START array
# numRow numCol numSpare
1 3 0

START disks
/dev/wd1a
/dev/wd3a
/dev/wd5a

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
32 1 1 5

START queue
fifo 100

and here's how I did the setup:

dd if=/dev/zero of=/dev/rwd1d bs=8k count=1
dd if=/dev/zero of=/dev/rwd3d bs=8k count=1
dd if=/dev/zero of=/dev/rwd5d bs=8k count=1
disklabel -r -e -I wd1 # s/4.2BSD/RAID/
disklabel -r -e -I wd3 # s/4.2BSD/RAID/
disklabel -r -e -I wd5 # s/4.2BSD/RAID/
raidctl -v -C raid1.conf raid1
raidctl -v -I `date +%s` raid1
raidctl -v -A yes raid1
raidctl -i raid1
gpt add -b 128 raid1
dkctl raid1 addwedge export 128 7814058015 ffs
newfs -O2 -b64k dk0
tunefs -m0 raid1
mount -o rw,log,async,noatime /dev/dk0 /export

Any hint? Have someone already faced this issue?

Thanks

----------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

Re: no file system for xbd0 (dev 0x8e00)

2013-05-30 Thread Emile `iMil&#x27; Heitor


On Thu, 30 May 2013, Jean-Yves Migeon wrote:

Do you get the same output between "disklabel /dev/xbd1" and "disklabel -Ar 
/dev/xbd1"?


I would also try dd'ing the LV and pass it through vnconfig(8) (never did it, 
YMMV).


I've seen weird behavior from domU when it falls on specific forbidden 
operations from dom0's backend (like trying to overwrite the disk's disklabel 
after a silent corruption).


I finally fixed the domU's fs, it appears that the superblock list given by
newfs -N was wrong, and I found a working alternative superblock. Luckily,
only the /dev directory and a couple of files have been wiped by fsck(8)
and I've been able to boot the virtual machine and put it to work again.
Nevertheless, that story is a bit frightening, because no error message has
never showed neither on the domU nor on the dom0, the fs corruption just
"happened silently", just like you said.

Thanks for your help anyway!

----------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

Re: no file system for xbd0 (dev 0x8e00)

2013-05-30 Thread Emile `iMil&#x27; Heitor




I suggest checking this from a dom0 perspective first before continuing the 
hunt for weird and uncommon file system corruption issues in the domU.

In the dom0, make sure your lvm is active, e.g. marked with an 'a' rather than 
a 'd'
in the output of lvm lvs. I suspect you couldn't even get a disklabel if it 
wasn't, but it can't hurt to check the lvm status in dom0 anyway.

I'd also suggest in dom0 dd'ing the lvm's raw device to /dev/null to make sure 
there are no i/o errors.


lvs output, from the dom0, seems ok:

  webserver   vg1  owi-ao  20.00g

The `o'rigin attribute is set because I've just done a snapshot just in case..

dd from the vbd to /dev/null went well also.

I have a bad feeling about this...

----------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

no file system for xbd0 (dev 0x8e00)

2013-05-30 Thread Emile `iMil&#x27; Heitor



Hi,

Since yesterday, a NetBSD 6.0.1/amd64 domU of mine can't mount its root fs
anymore:

boot device: xbd0
root on xbd0a dumps on xbd0b
Your machine does not initialize mem_clusters; sparse_dumps disabled
Supported file systems: union umap tmpfs smbfs puffs ptyfs procfs overlay null 
ntfs nfs msdos mfs lfs kernfs fdesc ext2fs ffs coda cd9660
no file system for xbd0 (dev 0x8e00)
cannot mount root, error = 79
root device (default xbd0a):

That domU used to boot / work peacefully, but for an unknown reason, the
virtual block device is not recognized anymore. When trying to mount that
device in another virtual machine, I get the following:

[~] mount -o log /dev/xbd1a /mnt
mount_ffs: /dev/xbd1a on /mnt: incorrect super block

Whereas disklabel indicates:

[~] disklabel /dev/xbd1
# /dev/xbd1d:
type: unknown
disk: Xen Virtual ESD
label: 
flags:

bytes/sector: 512
sectors/track: 2048
tracks/cylinder: 1
sectors/cylinder: 2048
cylinders: 20480
total sectors: 41943040
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#sizeoffset fstype [fsize bsize cpg/sgs]
 a:  37748736 0 4.2BSD   2048 16384 0  # (Cyl.  0 -  18431)
 b:   4194304  37748736   swap # (Cyl.  18432 -  20479)
 c:  41943040 0 unused  0 0# (Cyl.  0 -  20479)
 d:  41943040 0 unused  0 0# (Cyl.  0 -  20479)

The virtual block device is a LVM logical volume, I use that setup in almost
all of the domUs I run, it is declared very simply in domU's configuration
file:

disk = [ 'phy:/dev/mapper/vg1-webserver,hda,w' ]

Running fsck_ffs gives:

[~] fsck_ffs /dev/xbd1a 
** /dev/rxbd1a

BAD SUPER BLOCK: CAN'T FIND SUPERBLOCK
/dev/rxbd1a: CANNOT FIGURE OUT SECTORS PER CYLINDER

And specifying another superblock doesn't change anything:

[~] fsck_ffs -b 32 /dev/xbd1a 
Alternate super block location: 32

** /dev/rxbd1a
BAD SUPER BLOCK: MAGIC NUMBER WRONG

Any idea on what can I try in order to recover that virtual drive?

----------
Emile `iMil' Heitor .°. 
_
  | http://imil.net| ASCII ribbon campaign ( )
  | http://www.NetBSD.org  |  - against HTML email  X
  | http://gcu.info|  & vCards / \

38 matches

Mail list logo