Re: NPF ruleset not blocking IPs
On Fri, 3 Jun 2022, Emile `iMil' Heitor wrote:

> As the rules in the ruleset are declared as "final", I presume the
> default `pass all` is not reached, am I right?

So, no, I was wrong. Changing the order made the rules apply. I simply
removed the "external" group and inserted the ruleset before the `pass all`:

group default {
        pass final on lo0 all
        pass stateful out final all
        ruleset "blacklistd"
        block in final from <blacklistd>
        pass all
        block in family inet6 all
        pass proto ipv6-icmp all
        pass stateful in family inet6 proto tcp to any port $tcp_allowed
        pass stateful in family inet6 proto udp to any port $udp_allowed
}
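For completeness, after editing npf.conf this way the rules have to be
reloaded and blacklistd nudged to re-insert its entries; assuming the stock
rc.d scripts, something along these lines should do:

# npfctl reload                  # revalidate and load /etc/npf.conf
# /etc/rc.d/blacklistd restart   # re-populate the "blacklistd" ruleset

------------
Emile `iMil' Heitor | https://imil.net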
NPF ruleset not blocking IPs
I am trying to use npf along with blacklistd as an anti-bruteforce system.
Configuration-wise, everything seems to work together, yet blacklisted IPs,
while present in the "blacklistd" ruleset, don't seem to be blocked.

Here's my npf.conf file:

# npf.conf

$ext = vioif0
$ip4 = inet4(vioif0)
$ip6 = inet6(vioif0)

set bpf.jit on;

alg "icmp"

$tcp_allowed = { 25, 53, 465, 587, 995, ssh, http, https }
$udp_allowed = { 53 }

table <blacklistd> type ipset file "/etc/npf_blacklist"

procedure "log" {
        log: npflog0
}

group "external" on $ext {
        ruleset "blacklistd"
        block in final from <blacklistd>
}

group default {
        pass final on lo0 all
        pass stateful out final all
        pass all
        block in family inet6 all
        pass proto ipv6-icmp all
        pass stateful in family inet6 proto tcp to any port $tcp_allowed
        pass stateful in family inet6 proto udp to any port $udp_allowed
}

# end of npf.conf

This virtual machine acts as an IPv6 router, hence the default rules.

Here's an extract of the rules inserted by blacklistd:

$ sudo npfctl rule blacklistd list
ruleset block in final family inet4 proto udp from 64.231.104.8/32 to any port 53 # id="1"
ruleset block in final family inet4 proto udp from 94.181.160.42/32 to any port 53 # id="2"
ruleset block in final family inet4 proto udp from 209.126.8.168/32 to any port 53 # id="3"
ruleset block in final family inet4 proto udp from 85.28.98.113/32 to any port 53 # id="4"
ruleset block in final family inet4 proto udp from 44.200.125.213/32 to any port 53 # id="5"
ruleset block in final family inet4 proto udp from 120.71.145.56/32 to any port 53 # id="6"
ruleset block in final family inet4 proto udp from 90.90.90.90/32 to any port 53 # id="7"
ruleset block in final family inet4 proto udp from 107.119.41.101/32 to any port 53 # id="8"
ruleset block in final family inet4 proto udp from 78.116.212.157/32 to any port 53 # id="9"
ruleset block in final family inet4 proto udp from 189.203.104.245/32 to any port 53 # id="a"
ruleset block in final family inet4 proto udp from 193.124.7.9/32 to any port 53 # id="b"
ruleset block in final family inet4 proto udp from 173.179.63.249/32 to any port 53 # id="c"
ruleset block in final family inet4 proto udp from 174.244.240.203/32 to any port 53 # id="d"
ruleset block in final family inet4 proto udp from 72.9.7.72/32 to any port 53 # id="e"
ruleset block in final family inet4 proto udp from 95.105.64.219/32 to any port 53 # id="f"
ruleset block in final family inet4 proto udp from 185.156.46.34/32 to any port 53 # id="10"
ruleset block in final family inet4 proto tcp from 183.134.6.42/32 to any port 22 # id="7276"
ruleset block in final family inet4 proto tcp from 185.220.100.253/32 to any port 22 # id="729a"
ruleset block in final family inet4 proto udp from 35.174.16.235/32 to any port 53 # id="72b6"

Yet none of those IPs are blocked. I tried with a server of mine: it gets
added to the list but is not blocked. As the rules in the ruleset are
declared as "final", I presume the default `pass all` is not reached, am I
right? I am probably missing something obvious but can't figure out what.

Any ideas?

Thanks

------------
Emile `iMil' Heitor | https://imil.net
Re: blacklistd not reacting to postfix/smtpd AUTH failures
On Fri, 7 Aug 2020, Martin Neitzel wrote:

> You have to check the smtpd source to see if blacklist{,_r,_sa} could be
> called at the point where the issue is logged.

Indeed, the source code delivered. It suggests the notification should be
triggered when the number of failed auth attempts reaches
smtpd_hard_error_limit:

    if (state->error_count >= var_smtpd_hard_erlim) {
        state->reason = REASON_ERROR_LIMIT;
        state->error_mask |= MAIL_ERROR_PROTOCOL;
        smtpd_chat_reply(state, "421 4.7.0 %s Error: too many errors",
                         var_myhostname);
        pfilter_notify(1, vstream_fileno(state->client));
        break;
    }

Which I had not set in the main.cf file. After setting it to 5, failed
attempts would be sent to blacklistd:

$ postconf smtpd_hard_error_limit
smtpd_hard_error_limit = 5
$ sudo blacklistctl dump -ab | egrep '32:25'
186.159.2.57/32:25      1/3     2020/08/08 07:31:19
194.213.125.169/32:25   1/3     2020/08/08 07:17:08
185.4.44.60/32:25       1/3     2020/08/08 07:26:26
94.243.219.122/32:25    1/3     2020/08/08 07:21:28
202.40.186.26/32:25     1/3     2020/08/08 07:50:47

Maybe this should be documented...

More on connection limits:
http://www.postfix.org/TUNING_README.html#conn_limit
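For reference, the change boils down to one line in Postfix's configuration
(path assumed to be the default /etc/postfix/main.cf), followed by a
`postfix reload':

# /etc/postfix/main.cf
# maximum number of protocol errors a remote SMTP client may make before
# the session is dropped; on NetBSD, reaching it also notifies blacklistd
smtpd_hard_error_limit = 5

------------
Emile `iMil' Heitor | https://imil.net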
blacklistd not reacting to postfix/smtpd AUTH failures
Hi,

On this machine:

NetBSD senate.imil.net 9.0 NetBSD 9.0 (GENERIC) #0: Fri Feb 14 00:06:28 UTC
2020 mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64

I have the following setup:

$ cat /etc/blacklistd.conf
[local]
domain          dgram   *       *       *       3       24h
smtp            stream  *       *       *       3       24h
submission      stream  *       *       *       3       24h
imaps           stream  *       *       *       3       24h
ssh             stream  *       *       *       3       24h

$ cat /etc/npf.conf
$ext = vioif0

set bpf.jit on;

alg "icmp"

table <blacklistd> type ipset file "/etc/npf_blacklist"

group "external" on $ext {
        ruleset "blacklistd"
        block in final from <blacklistd>
        pass final all
}

group default {
        pass final all
}

This works, i.e. it blocks bruteforce attempts on ports 53 and 22, but
authentication failures on port 25 are not caught and thus no blacklisting
takes place:

$ sudo grep AUTH /var/log/maillog | tail -6
Aug  7 14:17:08 senate postfix/smtpd[16590]: lost connection after AUTH from unknown[78.128.113.116]
Aug  7 14:25:11 senate postfix/smtpd[3931]: lost connection after AUTH from unknown[78.128.113.116]
Aug  7 14:25:16 senate postfix/smtpd[3931]: lost connection after AUTH from unknown[78.128.113.116]
Aug  7 14:25:21 senate postfix/smtpd[7936]: lost connection after AUTH from unknown[78.128.113.116]
Aug  7 14:25:25 senate postfix/smtpd[3931]: lost connection after AUTH from unknown[78.128.113.116]
Aug  7 14:25:29 senate postfix/smtpd[7936]: lost connection after AUTH from unknown[78.128.113.116]

$ sudo grep blacklist /var/log/messages
Aug  7 12:38:04 senate blacklistd[1955]: released 1.192.90.183/32:53 after 86400 seconds
Aug  7 13:53:47 senate blacklistd[1955]: released 3.237.190.49/32:53 after 86400 seconds
Aug  7 14:05:09 senate blacklistd[1955]: blocked 3.235.107.224/32:53 for 86400 seconds

$ sudo blacklistctl dump -ab
        address/ma:port         id      nfail   last access
89.248.167.135/32:53                    1/3     2020/08/07 02:23:22
195.144.21.56/32:53                     1/3     2020/08/07 06:57:38
146.88.240.15/32:53                     1/3     2020/08/06 16:39:09
3.235.107.224/32:53             3       3/3     2020/08/07 14:05:09
146.88.240.128/32:53                    2/3     2020/08/06 21:51:36
2001:bc8:234c:1/128:22                  1/3     2020/08/06 16:21:34
71.6.232.7/32:53                        1/3     2020/08/07 05:42:50
80.82.65.90/32:53                       2/3     2020/08/06 18:25:48
74.82.47.2/32:53                        1/3     2020/08/07 02:42:22
146.88.240.4/32:53                      1/3     2020/08/06 16:22:46
193.29.15.169/32:53                     2/3     2020/08/06 18:54:24
185.232.65.36/32:53                     1/3     2020/08/06 22:06:34
192.35.168.251/32:53                    1/3     2020/08/07 01:58:55
185.50.66.1/32:53                       1/3     2020/08/07 12:52:59

smtpd is indeed linked against libblacklist:

$ ldd /usr/libexec/postfix/smtpd | grep black
        -lblacklist.0 => /usr/lib/libblacklist.so.0

Anything I am missing here?

Thanks,
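(Side note: one way to watch what blacklistd actually receives is to stop
the service and run the daemon in the foreground; -d and -v are the debug
and verbose flags from blacklistd(8):

# /etc/rc.d/blacklistd stop
# blacklistd -d -v

Notifications coming in from smtpd, or their absence, then show up directly
on the terminal.)

------------
Emile `iMil' Heitor | https://imil.net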
Re: Poor network performance
On Fri, 30 Sep 2016, Emile `iMil' Heitor wrote:

> I tried tweaking sysctl a bit like indicated here:
> https://wiki.netbsd.org/tutorials/tuning_netbsd_for_performance/

I found these values to help a lot:

http://proj.sunet.se/E2E/netbsd.txt
from
http://proj.sunet.se/E2E/tcptune.html

Insightful thread on this topic and how to read & understand those
parameters:

https://mail-index.netbsd.org/tech-net/2015/08/03/msg005317.html

Long story short: I can now get 1Gbps from my re(4) NIC on NetBSD 7.0/amd64.
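For the archive: the tuning amounts to raising the TCP socket buffer limits
via sysctl. The exact numbers to use are in the links above; the sketch
below only illustrates the shape of such an /etc/sysctl.conf, with made-up
4MB values (the sysctl names exist on NetBSD, the sizes are placeholders):

# /etc/sysctl.conf, illustrative values only
kern.sbmax=4194304                # kernel socket buffer ceiling
net.inet.tcp.sendbuf_max=4194304  # upper bound for auto-sized send buffer
net.inet.tcp.recvbuf_max=4194304  # upper bound for auto-sized recv buffer
net.inet.tcp.sendbuf_auto=1       # enable send buffer auto-sizing
net.inet.tcp.recvbuf_auto=1       # enable recv buffer auto-sizing

----
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \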
Re: Poor network performance
On Fri, 30 Sep 2016, Emile `iMil' Heitor wrote:

> I tried tweaking sysctl a bit like indicated here:
> https://wiki.netbsd.org/tutorials/tuning_netbsd_for_performance/

I found these values to help a lot:

http://proj.sunet.se/E2E/netbsd.txt
from
http://proj.sunet.se/E2E/tcptune.html

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Poor network performance
Hi,

I've been witnessing poor performance while using NetBSD 7.0/amd64 on a
Gigabit network. I tried this with two different NICs. Default scenario,
either re(4) or alc(4):

$ ifconfig re0 # relevant bits
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=3f00
        capabilities=3f00
        enabled=0
        ec_capabilities=3
        ec_enabled=0
        address: f8:df:2f:f7:af:f2
        media: Ethernet autoselect (1000baseT full-duplex)
        status: active
[...]

On the actual gigabit LAN:

$ iperf3 -c coruscant -l16k
Connecting to host coruscant, port 5201
[  4] local 192.168.1.57 port 32792 connected to 192.168.1.249 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  6.73 MBytes  56.5 Mbits/sec    0   69.3 KBytes
[  4]   1.00-2.00   sec  12.1 MBytes   102 Mbits/sec    0    102 KBytes
[  4]   2.00-3.00   sec  14.1 MBytes   118 Mbits/sec    0    136 KBytes
[  4]   3.00-4.00   sec  15.0 MBytes   126 Mbits/sec   19    154 KBytes
[  4]   4.00-5.00   sec  16.4 MBytes   138 Mbits/sec    0    188 KBytes
[  4]   5.00-6.00   sec  16.7 MBytes   140 Mbits/sec   30    187 KBytes
[  4]   6.00-7.00   sec  18.3 MBytes   153 Mbits/sec    0    195 KBytes
[  4]   7.00-8.00   sec  17.8 MBytes   149 Mbits/sec    0    195 KBytes
[  4]   8.00-9.00   sec  18.1 MBytes   152 Mbits/sec    0    195 KBytes
[  4]   9.00-10.00  sec  18.0 MBytes   151 Mbits/sec    0    195 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   153 MBytes   129 Mbits/sec   49          sender
[  4]   0.00-10.00  sec   152 MBytes   128 Mbits/sec               receiver

The client machine is a Linux box which actually reaches gigabit speed when
transferring with another Linux host.

Over my fiber-optic Internet connection:

NetBSD:

$ iperf3 -c ping.online.net
[...]
[ ID] Interval           Transfer     Bandwidth       Retr
[  6]   0.00-10.01  sec  44.3 MBytes  37.1 Mbits/sec   45          sender
[  6]   0.00-10.01  sec  44.1 MBytes  37.0 Mbits/sec               receiver

Linux:

$ iperf3 -c ping.online.net
[...]
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   124 MBytes   104 Mbits/sec   49          sender
[  4]   0.00-10.00  sec   121 MBytes   102 Mbits/sec               receiver

To be 100% honest, the Linux box is connected through a PLC (powerline)
adapter while the NetBSD box is directly connected to the ISP router...

I tried tweaking sysctl a bit like indicated here:

https://wiki.netbsd.org/tutorials/tuning_netbsd_for_performance/

without success.

Hints? Thoughts?

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: NetBSD 6 and 7 panic on kvm
On Thu, 28 Jan 2016, Christos Zoulas wrote:

> Indeed, but can't you use the images from:
> http://nyftp.netbsd.org/pub/NetBSD-daily/netbsd-7/201601281010Z/amd64/

Just did it; as predicted, everything went smoothly.

Thanks!

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: NetBSD 6 and 7 panic on kvm
On Thu, 28 Jan 2016, Emile `iMil' Heitor wrote:

> http://imil.net/stuff/NetbSD-7.0-ddb-bt.png
>
> seems related to the virtio driver.

So I found this PR:

https://mail-index.netbsd.org/netbsd-bugs/2015/12/15/msg043768.html

Reducing the virtual machine's memory allowed the OS to install, and to
answer Christos' question in this PR: no, the official ISO image does not
bundle the updated version of bus_dma.c. It would really be a good idea to
update those images so people can actually try NetBSD on kvm.
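For context, the guest was started with a plain kvm invocation of this kind
(a sketch, not the exact command line from this thread; the relevant bits
are the virtio disk, if=virtio, which exercises the affected driver, the
single vCPU suggested by kern/50139, and the reduced memory size):

$ qemu-system-x86_64 -enable-kvm -smp 1 -m 1024 \
    -drive file=netbsd7.img,if=virtio \
    -cdrom NetBSD-7.0-amd64.iso -boot d

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \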
Re: NetBSD 6 and 7 panic on kvm
On Thu, 28 Jan 2016, Emile `iMil' Heitor wrote:

> Hi,
>
> I've been trying to install NetBSD/amd64 7.0 and 6.1.5 as a virtual
> machine using Linux's kvm, and it led to a panic while extracting sets:
> http://imil.net/stuff/NetBSD-7.0-kvm.png

A backtrace might help:

http://imil.net/stuff/NetbSD-7.0-ddb-bt.png

seems related to the virtio driver.

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
NetBSD 6 and 7 panic on kvm
Hi,

I've been trying to install NetBSD/amd64 7.0 and 6.1.5 as a virtual machine
using Linux's kvm, and it led to a panic while extracting sets:

http://imil.net/stuff/NetBSD-7.0-kvm.png

I came across kern/50139, which suggests running the installation with
1 vCPU, but it gave the same result.

How can I proceed in order to help debug this issue? kvm is a very popular
virtualization system on Linux; it recently overtook Xen in overall
performance, and many cloud providers are using it.

Thanks,

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: NetBSD 6.1 NFS server performance
On Thu, 3 Dec 2015, Michael van Elst wrote:

> reading with rsize=64k: ~ 90MB/s
> reading with rsize=32k: ~ 60MB/s
> writing with wsize=64k: ~ 40MB/s
> writing with wsize=32k: ~ 30MB/s

Well, I have similar results, except I get better performance while
reading/writing with {r,w}size=32k instead of 64k, about +20MB/s.

The disk itself gets 110MB/s reading and 90MB/s writing locally on the NFS
server. I naively thought that with a 200MB/s write average I would reach
wire speed with NFS (which I do using dd | nc / nc > file).

> Writing via NFS is always slower because even with NFS3 it's partially
> synchronous.

Understood. Thanks again for your feedback!
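For future reference, the {r,w}size knobs are plain mount_nfs(8) flags; a
32k mount over TCP of the export discussed here would look like this on the
client (mount point assumed):

# mount_nfs -T -r 32768 -w 32768 coruscant:/export /mnt

----
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \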
Re: NetBSD 6.1 NFS server performance
On Thu, 3 Dec 2015, Emile `iMil' Heitor wrote:

> I know, right? And yes, results are identical with different bs values.
> I've tried a bazillion NFS options on the clients (TCP, UDP, {r,w}size
> from 8192 to 64k...), tried many OSes as a client; the NFS results are
> consistent, always between 20 and 30MB/s.

Interestingly enough, on a NetBSD client:

$ cd ${NFS_SHARE}
$ dd if=/dev/zero of=./test bs=32k count=31000
31000+0 records in
31000+0 records out
1015808000 bytes transferred in 12.673 secs (80155290 bytes/sec)

It's not yet wire speed, but much better than the results I posted earlier,
which were obtained on Linux and OS X clients.

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
NetBSD 6.1 NFS server performance
Continuing on the NAS performance topic, now it's the NFS server's turn.

First things first, I've checked both network and disk throughput; neither
causes a bottleneck:

tatooine is the client
coruscant is the server

$ iperf -c coruscant -p 2828 -t 10
Client connecting to coruscant, TCP port 2828
TCP window size: 43.8 KByte (default)
[  3] local 192.168.1.1 port 51371 connected with 192.168.1.2 port 2828
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.06 GBytes   908 Mbits/sec

$ dd if=/dev/zero bs=1024K count=1000 | nc -v coruscant 2828
Connection to coruscant 2828 port [tcp/*] succeeded!
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 9.27363 s, 113 MB/s

Gigabit link, all clear. Now using NFS:

$ dd if=/dev/zero bs=1024K count=1000 >Desktop/nfs@coruscant/imil/tmp/test
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 51.8476 s, 20.2 MB/s

I know, right? And yes, results are identical with different bs values.
I've tried a bazillion NFS options on the clients (TCP, UDP, {r,w}size from
8192 to 64k...), tried many OSes as a client; the NFS results are
consistent, always between 20 and 30MB/s.

The NFS server is started via rc.d with the following rc.conf variables:

rpcbind=YES
mountd=YES
nfs_server=YES
nfsd_flags="-6tun 8"
lockd=YES
statd=YES

And yes, I tried increasing and reducing the number of threads.

/etc/exports is pretty simple:

/export -alldirs -noresvport -maproot=root:wheel -network 192.168.1.0/24

I've read an extensive number of "NetBSD NFS server performance" posts here
and applied every suggestion without any luck; any idea would be highly
appreciated.

Thanks,

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: priocscan vs fcfs
On Wed, 2 Dec 2015, Michael van Elst wrote:

> fcfs is also the "neutral" queue for drivers stacking on top of each
> other. The queue sorting should really only be done at one level. But
> raidframe is more complicated because it does its own queuing and sorting
> outside of this schema, in particular when it has to read-modify-write
> stripe sets for small I/O. That's probably why setting the queues all to
> fcfs is the best for you.

Thanks a lot for this clear analysis, Michael.

I can now confirm the results I witnessed earlier: I've run a couple of
benchmarks, including bonnie++ and iozone, and the latter shows a ratio of
x5 in favor of the fcfs strategy for every type of operation.

For those interested, the iozone spreadsheet output is available here
(OOo / LibreOffice):

https://home.imil.net/tmp/coruscant-iozone-priocscan.ods
https://home.imil.net/tmp/coruscant-iozone-fsfc.ods

For each subset, the first column is the amount of data written (from 64K
to 4M) and the first row is the block size.

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
priocscan vs fcfs
Hi,

I've never been really happy with my NetBSD RAIDframe NAS; I never really
got the speed I was supposed to, even with the right alignment / RAID
layout etc.

Today I dug into `dkctl(8)' while checking whether the cache was enabled
for read and write, and I came across the "strategy" command. Long story
short, changing from the priocscan to the fcfs strategy multiplied my NAS's
write speed by 6!

I changed the strategy for all disk members:

# dkctl wd0 strategy fcfs
# dkctl wd1 strategy fcfs
# dkctl wd2 strategy fcfs

and also for the RAIDframe device:

# dkctl raid0 strategy fcfs
/dev/rraid0d: priocscan -> fcfs

as changing it only for the disk members was apparently counter-productive.

And there we go, from a 40/50MB/s write average to a stunning 200 to
300MB/s, which is more like what the disks can theoretically do.

Could anyone with some background on these strategies explain what's behind
the curtain? I couldn't really find precise documentation on this matter...

Thanks,
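Note for anyone replaying this: as far as I know the strategy set with
dkctl(8) is a runtime setting and does not survive a reboot, so something
like the following in /etc/rc.local (a sketch; adjust the device list to
your members) keeps it applied:

# reapply the fcfs buffer queue strategy at boot
for d in wd0 wd1 wd2 raid0; do
        /sbin/dkctl $d strategy fcfs
done

------------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \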
Re: NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)
On Thu, 26 Nov 2015, Emile `iMil' Heitor wrote:

> If this issue _is_ NFS related, which I doubt now, it is then
> read-related, as the build is done in tmpfs.

Pushing the logic further, I just tried with pkgsrc itself being in tmpfs,
and it froze even faster.

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)
On Thu, 26 Nov 2015, Manuel Bouyer wrote:

> what does 'show uvm' report ?

db{0}> show uvm
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12, ncolors=8
  7444115 VM pages: 53990 active, 1807 inactive, 1 wired, 7302474 free
  pages  26626 anon, 25029 file, 4143 exec
  freemin=4096, free-target=5461, wired-max=2481371
  cpu0: faults=2789285, traps=2796277, intrs=1396531, ctxswitch=1585247
    softint=792597, syscalls=1259477
  cpu1: faults=1696656, traps=1698017, intrs=180486, ctxswitch=127378
    softint=36160, syscalls=608644
  cpu2: faults=1207093, traps=1208093, intrs=160266, ctxswitch=65538
    softint=18178, syscalls=412550
  cpu3: faults=1434344, traps=1435516, intrs=174413, ctxswitch=100028
    softint=24126, syscalls=512909
  cpu4: faults=1273978, traps=1275187, intrs=161384, ctxswitch=68847
    softint=19305, syscalls=424913
  cpu5: faults=1622825, traps=1624084, intrs=171817, ctxswitch=105319
    softint=31330, syscalls=510165
  cpu6: faults=1734292, traps=1735749, intrs=170374, ctxswitch=99131
    softint=26841, syscalls=551106
  cpu7: faults=1392652, traps=1393985, intrs=166582, ctxswitch=81469
    softint=20174, syscalls=442880
  cpu8: faults=1492063, traps=1493265, intrs=166791, ctxswitch=88768
    softint=24325, syscalls=492824
  cpu9: faults=1579170, traps=1580406, intrs=167471, ctxswitch=89049
    softint=23423, syscalls=506804
  cpu10: faults=2153399, traps=2154831, intrs=184225, ctxswitch=149924
    softint=40597, syscalls=828691
  cpu11: faults=3136585, traps=3138031, intrs=219926, ctxswitch=251413
    softint=67270, syscalls=1262227
  cpu12: faults=4211510, traps=4213265, intrs=222549, ctxswitch=273560
    softint=78470, syscalls=1584403
  cpu13: faults=3938228, traps=3940765, intrs=252763, ctxswitch=368601
    softint=110598, syscalls=1636441
  cpu14: faults=1720207, traps=1721476, intrs=183332, ctxswitch=138148
    softint=43486, syscalls=759336
  cpu15: faults=1547431, traps=1548457, intrs=177462, ctxswitch=126099
    softint=36803, syscalls=657976
  fault counts:
    noram=0, noanon=0, pgwait=0, pgrele=0
    ok relocks(total)=19975519(19975516), anget(retrys)=1498606(0), amapcopy=1862658
    neighbor anon/obj pg=1672558/779689, gets(lock/unlock)=20195408/19975523
    cases: anon=1035657, anoncow=462949, obj=18439642, prcopy=1755801, przero=11148433
  daemon and swap counts:
    woke=0, revs=0, scans=0, obscans=0, anscans=0
    busy=0, freed=0, reactivate=0, deactivate=0
    pageouts=0, pending=0, nswget=0
    nswapdev=0, swpgavail=0
    swpages=0, swpginuse=0, swpgonly=0, paging=0

----
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)
On Thu, 26 Nov 2015, Emile `iMil' Heitor wrote:

> Again, as there's no log at all, what would help debugging this
> behaviour?

FWIW, some ddb output (ddb is triggered by hitting + on the domU's console):

fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 8012e5ad cs e030 rflags 202 cr2 7f7ff6c1e049
ilevel 8 rsp a0051864cc58
curlwp 0xa00035538840 pid 0.2 lowest kstack 0xa0051864a2c0
Stopped in pid 0.2 (system) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
xencons_tty_input() at netbsd:xencons_tty_input+0xb2
xencons_handler() at netbsd:xencons_handler+0x65
intr_biglock_wrapper() at netbsd:intr_biglock_wrapper+0x19
evtchn_do_event() at netbsd:evtchn_do_event+0x281
do_hypervisor_callback() at netbsd:do_hypervisor_callback+0x143
hypervisor_callback() at netbsd:hypervisor_callback+0x9e
idle_loop() at netbsd:idle_loop+0xe8
ds      3c80
es      c780
fs      c040
gs      7524
rdi     a0003a62b330
rsi     8437d01f
rbp     a0051864cc58
rbx     8437d01f
rdx     2b
rcx     2b
rax     1
r8      0
r9      805fc780        cpu_info_primary
r10     cdd9f51e239cbb87
r11     246
r12     a0003d754c00
r13     8437d020
r14     a0003a62b330
r15     1
rip     8012e5ad        breakpoint+0x5
cs      e030
rflags  202
rsp     a0051864cc58
ss      e02b
netbsd:breakpoint+0x5: leave
db{0}> ps
 PID   LID  S  CPU  FLAGS   STRUCT LWP *   NAME         WAIT
16259    1  3   13      0   a0003f435080   awk          netio
24269    1  3   14      0   a0003f446a80   cat          nfsrcv
23301    1  3   14     80   a0003f3f8a60   sh           wait
11835    1  3   14     80   a0003f40c0c0   bmake        wait
 9493    1  3   14     80   a0003f26d4a0   sh           wait
 4831    1  3   14     80   a0003f42d320   bmake        wait
28513    1  3   14     80   a0003f3f09e0   sh           wait
 6232    1  3   15     80   a0003ed15ae0   sh           wait
 1172    1  3   15     80   a0003f083000   bmake        wait
16544    1  3   15     80   a0003f439500   sh           wait
18141    1  3   15     80   a0003f430420   sh           wait
 2349    1  3   15     80   a0003f448280   bash         wait
  490    1  3    5     80   a0003f447680   sshd         select
 2135    1  3    0     80   a0003f445640   sshd         select
 2234    1  3    4     80   a0003f2be580   bash         ttyraw
  381    1  3   12     80   a0003f2be9a0   sshd         select
  382    1  3   13     80   a0003e9cc680   sshd         select
 1868    1  3   15      0   a0003dca59c0   getty        nfsrcv
 2354    1  3    5      0   a0003ed51b00   cron         nfsrcv
 1675    1  3   11     80   a0003edbd2e0   inetd        kqueue
 2105    1  3   12     80   a0003ee0d720   nrpe         select
 2086    1  3   14    100   a0003eca46a0   qmgr         nfsrcv
 2033    1  3   13      0   a0003eca4ac0   pickup       nfskqdet
 2055    1  3    4      0   a0003edbd700   master       tstile
16401    3  5   11   1000   a0003efff740   python2.7
 1640    9  3   11     80   a0003ee0db40   python2.7    kqueue
 1640    8  3   12     80   a0003ed152a0   python2.7    kqueue
 1640    1  3    9     80   a0003e0f6240   python2.7    select
 1555    1  3   13     80   a0003e0f6660   sshd         select
 1407    1  3   13     80   a0003dd15a20   powerd       kqueue
  892    1  3    2     80   a0003e099640   rpc.lockd    select
  884    1  3   15     80   a0003e099a60   rpc.statd    select
  686    1  3    7     80   a0003dd151e0   rpcbind      select
  677    1  3    5      0   a0003dd6fa40   syslogd      nfsrcv
    1    1  3    8     80   a0003d75b100   init         wait
    0  131  3    3    200   a0003d75d140   nfskqpoll    nfsrcv
    0  129  3    4    200   a0003dc4e160   aiodoned     aiodoned
    0  128  3    7    200   a0003dc4e580   ioflush      syncer
    0  127  3    0    200   a0003dc4e9a0   pgdaemon     pgdaemon
    0  124  3   14    200   a0003d75a920   nfsio        nfsiod
    0  123  3   13    200   a0003d75a500   nfsio        nfsiod
    0  122  3    9    200   a0003d75a0e0   nfsio        nfsiod
    0  121  3   15    200   a0003d75b940   nfsio        nfsiod
    0  120  3    0    200   a0003d75b520   cryptoret    crypto_w
    0  119  3    0    200   a0003d7530c0   unpgc        unpgc
    0  118  3    0    200   a0003d75c960   xen_balloon  xen_balloon
    0  117  3    9    200   a0003d75c540   vmem_rehash  vmem_rehash
    0  116  3    0    200   a0003d75d980
NetBSD/amd64 7.0 domU freezes while running pbulk.sh (was Re: Raspberry Pi 2, nfs mount hangs after some time)
On Mon, 2 Nov 2015, Emile `iMil' Heitor wrote:

> I'm trying to get rid of those hangs for weeks now, tried every mount
> flag combination without success; the system would freeze randomly,
> leaving the whole OS unresponsive. There's no log, no kernel message; the
> domU actually responds to network solicitations (ping, telnet 22...) but
> once it's frozen, it is impossible to run any command, it will just hang.
> The exact same setup has been running successfully since Sept 2014 on
> NetBSD 6.1/amd64. Any idea how to get some valuable information to help
> track down this awful behaviour?

A bit of follow-up. I've been trying many workarounds during the past
weeks, and right now I'm not convinced it even is an NFS problem. I've set
up a tmpfs bulk build directory, and even that way, NetBSD 7.0 would freeze
randomly after a couple of minutes while processing `pbulk.sh'.

What I can say:

- the server is a fresh diskless NetBSD 7.0 domU (PXE/NFS)
- there's not a single piece of information about the freeze, not even on
  the console
- I've only witnessed those freezes when calling `pbulk.sh' (couldn't get
  further anyway)
- cvs co pkgsrc does not freeze, I ran it many times without issues
- the domU stays up for days if no operation is made
- I started this domU on various dom0s to validate this was not a hardware
  problem, and always had the same symptoms
- I tried a custom 7.0_STABLE kernel without success

If this issue _is_ NFS related, which I doubt now, it is then read-related,
as the build is done in tmpfs.

Again, as there's no log at all, what would help debugging this behaviour?

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: Raspberry Pi 2, nfs mount hangs after some time
On Mon, 2 Nov 2015, Christos Zoulas wrote:

> Can you get into ddb?

Unfortunately no, the system hangs but does not panic, it just becomes
unusable.

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: Raspberry Pi 2, nfs mount hangs after some time
On Mon, 26 Oct 2015, Robert Elz wrote:

> For workarounds, mount using tcp (won't cure the problem, but will make
> it far less common), and use interruptible mounts (mount_nfs -T -i) so
> when it does hang, you can kill the process(es) at least.

A me-too reply. I've set up a NetBSD 7.0/amd64 bulk-build domU as I did for
NetBSD 6.1/amd64; it uses our platform's NetApp NFS servers (thousands of
Linux domUs are using those, the hardware is not guilty).

I've been trying to get rid of those hangs for weeks now, tried every mount
flag combination without success; the system would freeze randomly, leaving
the whole OS unresponsive. There's no log, no kernel message; the domU
actually responds to network solicitations (ping, telnet 22...) but once
it's frozen, it is impossible to run any command, it will just hang.

The exact same setup has been running successfully since Sept 2014 on
NetBSD 6.1/amd64.

Any idea how to get some valuable information to help track down this awful
behaviour?
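(For completeness, the workaround quoted above translates to a mount
command of this shape; server and mount point here are placeholders:

# mount_nfs -T -i nfs-server:/export /mnt

-T forces TCP transport and -i makes the mount interruptible, per
mount_nfs(8).)

------------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \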
IPsec over GRE slow performance
Hi,

I have set up an IPsec over GRE connection with a remote host; both ends
are NetBSD 6.1 based. The "client" is connected to the Internet through a
400Mbps fiber connection. The "server" is located on a 10Gbps network. Both
machines have 1Gbps NICs which behave perfectly, meaning they both reach
the link speed limit when transferring data outside the IPsec tunnel.

When doing a transfer through the tunnel, speed drops by a factor of 5 to 10:

direct connection:

/dev/null   27% [>          ]  503.19M  45.3MB/s  eta 83s

IPsec connection:

/dev/null    2% [           ]   47.76M  6.05MB/s  eta 5m 3s

The tunnel is set up this way. On the server, which is a NetBSD domU
running on a debian/amd64 dom0:

$ cat /etc/ifconfig.xennet0 # server interface
up
inet 192.168.1.2 netmask 255.255.255.0
inet 172.16.1.1 netmask 0xfffffffc alias

$ cat /etc/ifconfig.gre0
create
tunnel 172.16.1.1 172.16.1.2
up
inet 172.16.1.5 172.16.1.6 netmask 255.255.255.252

IPsec traffic is forwarded from the dom0's public IP to the domU's xennet0
interface through iptables NAT rules:

-A PREROUTING -i eth0 -p udp -m udp --dport 500 -j DNAT --to-destination 192.168.1.2:500
-A PREROUTING -i eth0 -p esp -j DNAT --to-destination 192.168.1.2
-A PREROUTING -i eth0 -p ah -j DNAT --to-destination 192.168.1.2

On the client:

$ cat /etc/ifconfig.vlan8 # client public interface
create
vlan 8 vlanif re0
!dhcpcd -i $int
inet 172.16.1.2 netmask 0xfffffffc alias

$ cat /etc/ifconfig.gre1
create
tunnel 172.16.1.2 172.16.1.1
up
inet 172.16.1.6 172.16.1.5 netmask 255.255.255.252

On the racoon side, I tried various hash / encryption algorithm
combinations, even enc_null, but nothing really changes; transfer is still
stuck at 6MB/s max. Here's the racoon setup:

On the server:

remote office.public.ip {
        exchange_mode main;
        lifetime time 28800 seconds;
        proposal {
                encryption_algorithm blowfish;
                hash_algorithm sha1;
                authentication_method pre_shared_key;
                dh_group 2;
        }
        generate_policy off;
}

sainfo address 172.16.1.1/30 any address 172.16.1.2/30 any {
        pfs_group 2;
        encryption_algorithm blowfish;
        authentication_algorithm hmac_sha1;
        compression_algorithm deflate;
        lifetime time 3600 seconds;
}

On the client:

remote node.public.ip {
        exchange_mode main;
        lifetime time 28800 seconds;
        proposal {
                encryption_algorithm blowfish;
                hash_algorithm sha1;
                authentication_method pre_shared_key;
                dh_group 2;
        }
        generate_policy off;
}

sainfo address 172.16.1.2/30 any address 172.16.1.1/30 any {
        pfs_group 2;
        encryption_algorithm blowfish;
        authentication_algorithm hmac_sha1;
        compression_algorithm deflate;
        lifetime time 3600 seconds;
}

The tunnel establishes with no issue; the only problem here is the transfer
drop. Again, when transferring from / to the server from / to the client
without the tunnel, speed is optimal; the drop occurs _only_ through IPsec.
Both machines have Intel CPUs running at 2+GHz, plenty of memory, and very
little CPU time consumed by anything other than forwarding / NAT.

Has anyone witnessed such a behaviour? Any idea where to look further?

Thanks,
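By the way, since what racoon proposes and what actually gets negotiated
can differ, the kernel SAD can be dumped on either end to double-check the
algorithms really in use:

$ setkey -D

setkey(8) prints each SA together with its encryption and authentication
algorithms.

------------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \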
Re: alc(4) current for NetBSD 6.1.5, patch included
On Sun, 25 Jan 2015, Leonardo Taccari wrote:

> If someone has successfully tested alc(4) on 813x/815x chipsets too, can
> we request a pull up for netbsd-7? It seems that the 816x chipsets are
> common on various motherboards and laptops. What do you think?

Definitely a good idea, as those chips have been widespread for a very long
time now. Also, I believe backporting it to 7.0 should be pretty
straightforward.

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
alc(4) current for NetBSD 6.1.5, patch included
Hi,

Using Leonardo Taccari's work on alc(4) for NetBSD current, I've enabled my
AR8171 GbE NIC on NetBSD 6.1.5 by porting the driver. Anyone interested:
just apply the enclosed patch in src/sys/dev/pci, then run

make -f Makefile.pcidevs

in src/sys/dev/pci and rebuild your kernel as usual.

HTH

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \

Index: if_alc.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/if_alc.c,v
retrieving revision 1.5
diff -u -r1.5 if_alc.c
--- if_alc.c	29 Aug 2011 14:47:08 -0000	1.5
+++ if_alc.c	24 Jan 2015 21:47:00 -0000
@@ -95,6 +95,16 @@
 	    "Atheros AR8152 v1.1 PCIe Fast Ethernet" },
 	{ PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8152_B2, 6 * 1024,
 	    "Atheros AR8152 v2.0 PCIe Fast Ethernet" },
+	{ PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8161, 9 * 1024,
+	    "Atheros AR8161 PCIe Gigabit Ethernet" },
+	{ PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8162, 9 * 1024,
+	    "Atheros AR8162 PCIe Fast Ethernet" },
+	{ PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8171, 9 * 1024,
+	    "Atheros AR8171 PCIe Gigabit Ethernet" },
+	{ PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_AR8172, 9 * 1024,
+	    "Atheros AR8172 PCIe Fast Ethernet" },
+	{ PCI_VENDOR_ATTANSIC, PCI_PRODUCT_ATTANSIC_E2200, 9 * 1024,
+	    "Killer E2200 Gigabit Ethernet" },
 	{ 0, 0, 0, NULL },
 };
@@ -103,33 +113,50 @@
 static int	alc_detach(device_t, int);
 static int	alc_init(struct ifnet *);
+static int	alc_init_backend(struct ifnet *, bool);
 static void	alc_start(struct ifnet *);
 static int	alc_ioctl(struct ifnet *, u_long, void *);
 static void	alc_watchdog(struct ifnet *);
 static int	alc_mediachange(struct ifnet *);
 static void	alc_mediastatus(struct ifnet *, struct ifmediareq *);
-static void	alc_aspm(struct alc_softc *, int);
+static void	alc_aspm(struct alc_softc *, int, int);
+static void	alc_aspm_813x(struct alc_softc *, int);
+static void	alc_aspm_816x(struct alc_softc *, int);
 static void	alc_disable_l0s_l1(struct alc_softc *);
 static int	alc_dma_alloc(struct alc_softc *);
 static void	alc_dma_free(struct alc_softc *);
+static void	alc_dsp_fixup(struct alc_softc *, int);
 static int	alc_encap(struct alc_softc *, struct mbuf **);
 static struct alc_ident *
 		alc_find_ident(struct pci_attach_args *);
 static void	alc_get_macaddr(struct alc_softc *);
+static void	alc_get_macaddr_813x(struct alc_softc *);
+static void	alc_get_macaddr_816x(struct alc_softc *);
+static void	alc_get_macaddr_par(struct alc_softc *);
 static void	alc_init_cmb(struct alc_softc *);
 static void	alc_init_rr_ring(struct alc_softc *);
-static int	alc_init_rx_ring(struct alc_softc *);
+static int	alc_init_rx_ring(struct alc_softc *, bool);
 static void	alc_init_smb(struct alc_softc *);
 static void	alc_init_tx_ring(struct alc_softc *);
 static int	alc_intr(void *);
 static void	alc_mac_config(struct alc_softc *);
+static uint32_t	alc_mii_readreg_813x(struct alc_softc *, int, int);
+static uint32_t	alc_mii_readreg_816x(struct alc_softc *, int, int);
+static void	alc_mii_writereg_813x(struct alc_softc *, int, int, int);
+static void	alc_mii_writereg_816x(struct alc_softc *, int, int, int);
 static int	alc_miibus_readreg(device_t, int, int);
 static void	alc_miibus_statchg(device_t);
 static void	alc_miibus_writereg(device_t, int, int, int);
-static int	alc_newbuf(struct alc_softc *, struct alc_rxdesc *, int);
+static uint32_t	alc_miidbg_readreg(struct alc_softc *, int);
+static void	alc_miidbg_writereg(struct alc_softc *, int, int);
+static uint32_t	alc_miiext_readreg(struct alc_softc *, int, int);
+static uint32_t	alc_miiext_writereg(struct alc_softc *, int, int, int);
+static int	alc_newbuf(struct alc_softc *, struct alc_rxdesc *, bool);
 static void	alc_phy_down(struct alc_softc *);
 static void	alc_phy_reset(struct alc_softc *);
+static void	alc_phy_reset_813x(struct alc_softc *);
+static void	alc_phy_reset_816x(struct alc_softc *);
 static void	alc_reset(struct alc_softc *);
 static void	alc_rxeof(struct alc_softc *, struct rx_rdesc *);
 static int	alc_rxintr(struct alc_softc *);
@@ -159,12 +186,34 @@
 alc_miibus_readreg(device_t dev, int phy, int reg)
 {
 	struct alc_softc *sc = device_private(dev);
+	int v;
+
+	if ((sc->alc_flags & AL
Re: poor write performance with RAID 5 RAIDframe
On Sun, 30 Nov 2014, Emile `iMil' Heitor wrote:

> $ dd if=/dev/zero of=/export/imil/tmp/test bs=1m count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes transferred in 42.619 secs (24603486 bytes/sec)

Turns out performance is not *that* bad considering the disks I'm
using[1][2], plus the RAID 5 overhead. I re-ran some benchmarks using
benchmarks/bonnie++ and dd, but with 4k blocks this time, and both results
were consistent, between 30 and 40MB/s write speed.

[1]: 1x http://www.cnet.com/products/seagate-barracuda-7200-14-2tb/specs/
[2]: 2x http://www.storagereview.com/seagate_barracuda_green_2tb_review_st2000dl003

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: poor write performance with RAID 5 RAIDframe
>> # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
>> 64 1 1 5
>>
>> # newfs -O2 -b64k -I dk0
>
> This might not be the best configuration; you've got two RAIDframe SUs
> per FFS block.

So you'd recommend running newfs with 32k as the block size?
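(In other words, with sectPerSU = 64, one stripe unit is 64 * 512 bytes =
32KB, so matching one FFS block to one SU would mean re-creating the
filesystem with the block size halved:

# newfs -O2 -b32k -I dk0

i.e. the same command as above with -b32k instead of -b64k.)

--------
Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \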
poor write performance with RAID 5 RAIDframe
Hi,

I'm witnessing poor performance while writing to a RAID 5 RAIDframe array
I set up as my home NAS a couple of months ago. Following some
documentation, tutorials, and mailing-list posts, I think I have everything
aligned right, as the following configuration suggests:

raid1.conf:

START array
# numRow numCol numSpare
1 3 0

START disks
/dev/wd2a
/dev/wd3a
/dev/wd4a

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5

START queue
fifo 100

# gpt show raid1
      start        size  index  contents
          0           1         PMBR
          1           1         Pri GPT header
          2          32         Pri GPT table
         34          94
        128  7814057951      1  GPT part - NetBSD FFSv1/FFSv2
 7814058079          32         Sec GPT table
 7814058111           1         Sec GPT header

# dkctl raid1 listwedges
/dev/rraid1d: 1 wedge:
dk0: 5bdb599c-744a-11e3-9160-002354666e0f, 7814057951 blocks at 128, type: ffs

newfs(8) was done like this:

# newfs -O2 -b64k -I dk0

The disks involved are Seagate Barracudas, which are known to reach write
transfer rates as high as ~100MB/s, yet I merely get 25MB/s with those:

$ dd if=/dev/zero of=/export/imil/tmp/test bs=1m count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 42.619 secs (24603486 bytes/sec)

On the other hand, read speed is really good:

$ dd if=/export/imil/media/foobar_adventures.mp4 of=/dev/null bs=1m
1395+1 records in
1395+1 records out
1463376863 bytes transferred in 7.899 secs (185261028 bytes/sec)

Any advice on this matter? Does this configuration look good?

Thanks,

Emile `iMil' Heitor *                            _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: [RAIDframe] system hangs while manipulating large files
> FWIW, it is documented in mount(8):
>
>     log    (FFS only) Mount the file system with wapbl(4) metadata
>            journaling, also known simply as logging. It provides rapid
>            metadata updates and eliminates the need to check file system
>            consistency after a system outage. A file system mounted with
>            log can not be mounted with async.
>     [...]

Well, my bad :(

Wouldn't it be wise for mount(8) to forbid the use of both flags?

--
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: [RAIDframe] system hangs while manipulating large files
> asynchronous + log should be impossible.

I've just discovered that the hard way :)

> Did you find out whether the system is really hanging or just very slow?
> In particular, is there any disk activity and does the system respond on
> the network?

Like I said earlier in this thread, the system hangs: no service responds
anymore, but the server still replies to ICMP and does not panic. I waited
for about 4 hours to see if the operations would proceed, but they didn't.

------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: [RAIDframe] system hangs while manipulating large files
> Could one of those mount flags cause any harm with a RAIDframe array?
>
> /dev/dk0 on /export type ffs (asynchronous, log, noatime, NFS exported,
> local)

Well well, it seems like removing the `async' flag solved the issue. I've
just run multiple tests that failed instantly yesterday; they passed right
away without asynchronous I/O... some kind of memleak maybe?
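(So the mount line to keep, journaling without async, would be along these
lines in /etc/fstab:

/dev/dk0   /export   ffs   rw,log,noatime   0   2

async and log are mutually exclusive, as the mount(8) excerpt quoted in
this thread spells out.)

------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \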
Re: [RAIDframe] system hangs while manipulating large files
>> This value can be set to zero, however up to a factor of three in
>> throughput will be lost over the performance obtained at a 5% threshold.
>
> Hm, I missed that point. Will raise it back up.

Well, I have applied both recommendations, and yet the machine showed the
same behaviour while rsync'ing data to the newly created RAID5. Again, the
system was kind of "up" but no command could be typed and services did not
respond anymore. Everything was frozen.

Could one of those mount flags cause any harm with a RAIDframe array?

/dev/dk0 on /export type ffs (asynchronous, log, noatime, NFS exported, local)

Just to be clear, the trivial dd(1) test as well as the rsync took place
locally on the server; no network transfers yet.

------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: [RAIDframe] system hangs while manipulating large files
Hi Greg,

> Any particular reason why you set minfree to 0 instead of leaving it at
> the default? Especially given that the man-page for tunefs says:

Just because that volume is dedicated to storing media and stuff, no need
to reserve space for root, and hey, 5% of 4T is a lot :\

> This value can be set to zero, however up to a factor of three in
> throughput will be lost over the performance obtained at a 5% threshold.

Hm, I missed that point. Will raise it back up.

> Losing a factor of 3 in throughput in addition to the RAID5 write penalty
> seems pretty expensive, performance-wise :( (You might also get better
> write performance with '64 1 1 5' in this configuration, instead of
> '32 1 1 5')

Noted, will definitely try.

Thanks for the replies!
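(For the record, putting the reserved space back to the 5% default is a
one-liner with tunefs(8), run against the unmounted, or read-only mounted,
filesystem:

# tunefs -m 5 raid1

)

----------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \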
Re: [RAIDframe] system hangs while manipulating large files
> Does it hang or does it just take an enormously long time (>30min) to
> complete the operation?

Oh, I must say I haven't waited that long. This morning it had been
struggling for about 20 minutes; I had to hard-reset the machine as I
needed it for work.

------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
[RAIDframe] system hangs while manipulating large files
Hi, and Happy New Year fellow NetBSD users,

I'm setting up a RAID5 NAS using NetBSD 6.1.2/amd64. The RAID5 array is
used as a media storage system; it is composed of 3x2T SATA disks. I
followed various on-line documents such as:

http://www.netbsd.org/docs/guide/en/chap-rf.html
http://mail-index.netbsd.org/netbsd-users/2011/09/02/msg008979.html
http://abs0d.blogspot.fr/2011/08/setting-up-8tb-netbsd-file-server.html
http://pbraun.nethence.com/unix/sysutils_bsd/raidframe.html

I've copied about 10G of data composed of 1 to 100M files; everything went
smoothly. Then, in order to get some performance figures, I used `dd' like
this:

$ dd if=/dev/zero of=./test bs=1m count=5000

Within a couple of seconds, the whole system becomes unresponsive and hangs
totally, but does not panic. I have reproduced this behaviour a couple of
times; the system hangs every time.

Here's the RAID setup:

$ cat /etc/raid1.conf
START array
# numRow numCol numSpare
1 3 0

START disks
/dev/wd1a
/dev/wd3a
/dev/wd5a

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
32 1 1 5

START queue
fifo 100

and here's how I did the setup:

dd if=/dev/zero of=/dev/rwd1d bs=8k count=1
dd if=/dev/zero of=/dev/rwd3d bs=8k count=1
dd if=/dev/zero of=/dev/rwd5d bs=8k count=1
disklabel -r -e -I wd1 # s/4.2BSD/RAID/
disklabel -r -e -I wd3 # s/4.2BSD/RAID/
disklabel -r -e -I wd5 # s/4.2BSD/RAID/
raidctl -v -C raid1.conf raid1
raidctl -v -I `date +%s` raid1
raidctl -v -A yes raid1
raidctl -i raid1
gpt add -b 128 raid1
dkctl raid1 addwedge export 128 7814058015 ffs
newfs -O2 -b64k dk0
tunefs -m0 raid1
mount -o rw,log,async,noatime /dev/dk0 /export

Any hint? Has anyone already faced this issue?

Thanks

----------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
Re: no file system for xbd0 (dev 0x8e00)
On Thu, 30 May 2013, Jean-Yves Migeon wrote:

> Do you get the same output between "disklabel /dev/xbd1" and "disklabel
> -Ar /dev/xbd1"? I would also try dd'ing the LV and passing it through
> vnconfig(8) (never did it, YMMV). I've seen weird behavior from domUs
> when they fall on specific forbidden operations from the dom0's backend
> (like trying to overwrite the disk's disklabel after a silent
> corruption).

I finally fixed the domU's fs. It appears that the superblock list given by
newfs -N was wrong, and I found a working alternative superblock. Luckily,
only the /dev directory and a couple of files had been wiped by fsck(8),
and I've been able to boot the virtual machine and put it back to work.

Nevertheless, that story is a bit frightening, because no error message
ever showed, neither on the domU nor on the dom0; the fs corruption just
"happened silently", just like you said.

Thanks for your help anyway!
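(For anyone hitting the same wall, the recovery pattern is: ask newfs(8)
for the superblock backup locations it *would* use without touching the
disk, then feed candidates to fsck_ffs(8) until one is accepted. The block
number below is purely illustrative:

# newfs -N /dev/rxbd1a             # -N: print layout only, change nothing
# fsck_ffs -b 163840 /dev/rxbd1a   # try an alternate superblock

As noted above, the list printed by newfs -N may not match what the
filesystem was actually created with, so some trial and error can be
needed.)

----------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \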
Re: no file system for xbd0 (dev 0x8e00)
> I suggest checking this from a dom0 perspective first before continuing
> the hunt for weird and uncommon file system corruption issues in the
> domU. In the dom0, make sure your lvm is active, e.g. marked with an 'a'
> rather than a 'd' in the output of lvm lvs. I suspect you couldn't even
> get a disklabel if it wasn't, but it can't hurt to check the lvm status
> in dom0 anyway. I'd also suggest in dom0 dd'ing the lvm's raw device to
> /dev/null to make sure there are no i/o errors.

The lvs output, from the dom0, seems OK:

webserver vg1 owi-ao 20.00g

The `o'rigin attribute is set because I've just taken a snapshot, just in
case...

dd from the vbd to /dev/null went well too. I have a bad feeling about
this...

----------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \
no file system for xbd0 (dev 0x8e00)
Hi,

Since yesterday, a NetBSD 6.0.1/amd64 domU of mine can't mount its root fs
anymore:

boot device: xbd0
root on xbd0a dumps on xbd0b
Your machine does not initialize mem_clusters; sparse_dumps disabled
Supported file systems: union umap tmpfs smbfs puffs ptyfs procfs overlay
null ntfs nfs msdos mfs lfs kernfs fdesc ext2fs ffs coda cd9660
no file system for xbd0 (dev 0x8e00)
cannot mount root, error = 79
root device (default xbd0a):

That domU used to boot and work peacefully, but for an unknown reason the
virtual block device is not recognized anymore. When trying to mount that
device in another virtual machine, I get the following:

[~] mount -o log /dev/xbd1a /mnt
mount_ffs: /dev/xbd1a on /mnt: incorrect super block

Whereas disklabel indicates:

[~] disklabel /dev/xbd1
# /dev/xbd1d:
type: unknown
disk: Xen Virtual ESD
label:
flags:
bytes/sector: 512
sectors/track: 2048
tracks/cylinder: 1
sectors/cylinder: 2048
cylinders: 20480
total sectors: 41943040
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:  37748736         0     4.2BSD   2048 16384     0  # (Cyl.     0 -  18431)
 b:   4194304  37748736       swap                     # (Cyl. 18432 -  20479)
 c:  41943040         0     unused      0     0        # (Cyl.     0 -  20479)
 d:  41943040         0     unused      0     0        # (Cyl.     0 -  20479)

The virtual block device is an LVM logical volume; I use that setup in
almost all of the domUs I run. It is declared very simply in the domU's
configuration file:

disk = [ 'phy:/dev/mapper/vg1-webserver,hda,w' ]

Running fsck_ffs gives:

[~] fsck_ffs /dev/xbd1a
** /dev/rxbd1a
BAD SUPER BLOCK: CAN'T FIND SUPERBLOCK
/dev/rxbd1a: CANNOT FIGURE OUT SECTORS PER CYLINDER

And specifying another superblock doesn't change anything:

[~] fsck_ffs -b 32 /dev/xbd1a
Alternate super block location: 32
** /dev/rxbd1a
BAD SUPER BLOCK: MAGIC NUMBER WRONG

Any idea what I can try in order to recover that virtual drive?

----------
Emile `iMil' Heitor .°.                          _
| http://imil.net       | ASCII ribbon campaign ( )
| http://www.NetBSD.org | - against HTML email   X
| http://gcu.info       | & vCards              / \