Re: HP Smart array P440 support

2017-09-07 Thread Rainer Duffner

> On 07.09.2017 at 23:13, Priyadarshana Chandrasena wrote:
> 
> I have not installed FreeBSD using iLo and virtual CD method. I will read
> about this method and try it out. 



You will need an iLO license for that.


Also, enjoy Gen10 servers, where even more features are going to be 
„pay-for-play“.



___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

RE: HP Smart array P440 support

2017-09-07 Thread Priyadarshana Chandrasena
I have not installed FreeBSD using iLo and virtual CD method. I will read
about this method and try it out. 
Thanks.

-----Original Message-----
From: Maciej Suszko [mailto:mac...@suszko.eu] 
Sent: Thursday, 7 September 2017 9:34 PM
To: Priyadarshana Chandrasena
Cc: FreeBSD-stable@FreeBSD.org
Subject: Re: HP Smart array P440 support

On Sun, 3 Sep 2017 21:19:08 +0200
Maciej Suszko  wrote:

> "Priyadarshana Chandrasena"  wrote:
> > I have HP DL20 Gen 9 server. But I can not install FreeBSD 10.2 in 
> > it. I have a HP smart array P440 in my server. I can not find out 
> > anywhere if ciss driver support include HP Gen 9 storage 
> > controllers.  Could you please update the ciss driver in 10.3?
> 
> Hi,
> 
> Got two DL20 with P440, UEFI booted 11.0 to logical disk (raid array).
> In a few days I plan to boot it in HBA mode, we'll see if it's working.

Just tested, HBA mode... Machine BIOS U22 v1.80, one-time booted in Legacy
mode with mfsBSD iso in Virtual CD drive (through iLO), recreated system
zpool to mirror of 2 drives, booted in UEFI mode - it does work.

#v+
root@storage-04:~ # camcontrol inquiry da2
pass2:  Fixed Direct Access SPC-4 SCSI device
pass2: Serial Number S4226CNTM70507YN
pass2: 135.168MB/s transfers, Command Queueing Enabled

root@storage-04:~ # camcontrol inquiry da3
pass3:  Fixed Direct Access SPC-4 SCSI device
pass3: Serial Number S4226CHHM70540VJ
pass3: 135.168MB/s transfers, Command Queueing Enabled

root@storage-04:~ # zpool status rpool
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Thu Sep  7 11:56:31 2017
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da2p3   ONLINE       0     0     0
            da3p3   ONLINE       0     0     0

errors: No known data errors
#v-

[ c u t ]
--
regards, Maciej Suszko.



Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)

2017-09-07 Thread Julien Charbon

 Hi Ben,

On 8/31/17 12:04 PM, Ben RUBSON wrote:
>> On 28 Aug 2017, at 11:27, Julien Charbon  wrote:
>>
>> On 8/28/17 10:25 AM, Ben RUBSON wrote:
>>>> On 16 Aug 2017, at 11:02, Ben RUBSON wrote:
>>>>
>>>>> On 15 Aug 2017, at 23:33, Julien Charbon wrote:
>>>>>
>>>>> On 8/11/17 11:32 AM, Ben RUBSON wrote:
>>>>>>> On 08 Aug 2017, at 13:33, Julien Charbon wrote:
>>>>>>>
>>>>>>> On 8/8/17 10:31 AM, Hans Petter Selasky wrote:
>>>>>>>>
>>>>>>>> Suggested fix attached.
>>>>>>>
>>>>>>> I agree with your conclusion.  Just for the record, more precisely this
>>>>>>> regression seems to have been introduced with:
>>>>>>> (...)
>>>>>>> Thus good catch, and your patch looks good.  I am going to just verify
>>>>>>> the other in_pcbrele_wlocked() calls in TCP stack.
>>>>>>
>>>>>> Julien, do you plan to make this fix reach 11.0-p12 ?
>>>>>
>>>>> I am checking if your issue is another flavor of the issue fixed by:
>>>>>
>>>>> https://svnweb.freebsd.org/base?view=revision&revision=307551
>>>>> https://reviews.freebsd.org/D8211
>>>>>
>>>>> This fix is not in 11.0 but is in 11.1.  So far I have not found how an
>>>>> inp in INP_TIMEWAIT state can have been INP_FREED without having its tw
>>>>> set to NULL already, except through the issue fixed by r307551.
>>>>>
>>>>> Thus could you try to apply this patch:
>>>>>
>>>>> https://github.com/freebsd/freebsd/commit/acb5bfda99b753d9ead3529d04f20087c5f7d0a0.patch
>>>>>
>>>>> and see if you can still reproduce this issue?
>>>>
>>>> Thank you for your answer Julien.
>>>> Unfortunately, I'm not sure at all how to reproduce the issue.
>>>> I have other servers which are 100% identical to this one, same workload,
>>>> same some-months uptime, but they did not trigger the bug yet.
>>>>
>>>> If other network stack experts (I'm not) agree with your analysis,
>>>> we could then certainly go further with D8211 / r307551.
>>>>
>>>> One thing that perhaps might help :
>>>> # netstat -an | grep TIME_WAIT$ | wc -l
>>>> 468
>>>>
>>>> Note that due to this running bug, sendmail has a lot of difficulty
>>>> sending outgoing mails.
>>>> As soon as I run the above netstat command, I receive a lot of stacked
>>>> mails (more than 20 this time).
>>>> As if netstat was able to somehow help...
>>>>
>>>> Number of TIME_WAIT connections however does not decrease, but increases.
>>>>
>>>>> And in the spirit of the r307551 fix and based on Hans' patch I will also
>>>>> propose adding a kernel log describing the issue instead of starting an
>>>>> infinite loop when INVARIANTS is not set.
>>>>
>>>> Which should then never be triggered :)
>>>> Good idea I think !
>>>
>>> What about :
>>> D8211/r307551
>>> + Hans' patch
>>> + Julien's idea of a kernel log (sort of "We should not be here but we are")
>>
>> I did this change and I am testing it
> 
> Good news !
> 
>> on your side did you try this patch applied on 11.0?
>>
>> https://github.com/freebsd/freebsd/commit/acb5bfda99b753d9ead3529d04f20087c5f7d0a0.patch
> 
> Yes, patch applied and running correctly,
> however hard to say whether or not it solves this issue,
> as there is no easy way to reproduce it.

 No problem, it is just a matter of not seeing the issue anymore during
a long enough period.

I created a review that includes Hans's patch and uses the same
log(LOG_ERR) logic as r307551:
https://reviews.freebsd.org/D12267

 On my side, TCP smoke tests are ok.  And I am going to launch our TCP
QA on it while receiving review comments.

> Mail sent to FreeBSD Security Team !
> 
> Many thanks, let's stay tuned !

 Thanks to you and Hans for reporting that issue.  And in summary:

 - Applying r307551 on top of 11.0 should prevent this case from happening
 - D12267 will prevent the tcp_tw_2msl_scan() infinite loop while
reporting the error, in case a regression defeating r307551 is introduced

 Thanks.

--
Julien


Re: 11.1 running on HyperV hn interface hangs

2017-09-07 Thread Paul Koch
On Thu, 7 Sep 2017 13:51:11 +0800
Sepherosa Ziehau  wrote:

> Weird, your traffic pattern does not even belong to anything heavy.
> Sending is mainly UDP, which will never be able to saturate the TX
> buffer ring causing the RXBUF ACK sending failure.  This is weird.

It's a bit tricky. The poller is very fast. We ping every device every 15
seconds, and collect every MIB object every 60 seconds. The poller "rate
limits" itself by dividing each minute into 100ms time slots and only sends a
specific amount of pings/snmp packets in each time slot.  The problem is, it
blasts the request packets out really fast at the start of each time slot,
and then sits in a receive loop until the next time slot comes around.  The
requests are not paced over the 100ms, therefore it will blast out a lot
of packets in a few milliseconds.
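
To make the burst pattern concrete, here is a minimal Python sketch of the
slot-based rate limiting described above. It is not the AKIPS poller: the
function names, the 0.01 ms per-packet send cost and the 5 ms measurement
window are assumptions made for illustration, and the budget of 250 packets
per 100 ms slot is only a rough figure derived from the ~1500 pings plus
~1000 snmp requests per second quoted elsewhere in this thread. It contrasts
sending a slot's whole budget back-to-back at the start of the slot with
spacing the same packets evenly across the slot.

```python
SLOT_MS = 100.0  # slot length taken from the description above


def burst_schedule(packets_per_slot, slots, send_cost_ms=0.01):
    """Send times (ms) when each slot's budget goes out back-to-back.

    send_cost_ms is a hypothetical per-packet cost of the send call.
    """
    times = []
    for slot in range(slots):
        start = slot * SLOT_MS
        for i in range(packets_per_slot):
            times.append(start + i * send_cost_ms)
    return times


def paced_schedule(packets_per_slot, slots):
    """Send times (ms) when the same budget is spread evenly over each slot."""
    gap = SLOT_MS / packets_per_slot
    times = []
    for slot in range(slots):
        start = slot * SLOT_MS
        for i in range(packets_per_slot):
            times.append(start + i * gap)
    return times


def busiest_window(times, window_ms=5.0):
    """Largest number of packets that fall inside any window of window_ms."""
    times = sorted(times)
    best = 0
    left = 0
    for right, t in enumerate(times):
        # Slide the left edge until the window spans less than window_ms.
        while t - times[left] >= window_ms:
            left += 1
        best = max(best, right - left + 1)
    return best


if __name__ == "__main__":
    burst = burst_schedule(packets_per_slot=250, slots=10)
    paced = paced_schedule(packets_per_slot=250, slots=10)
    print("burst: max packets in 5 ms =", busiest_window(burst))
    print("paced: max packets in 5 ms =", busiest_window(paced))
```

Both schedules send the same number of packets per slot; only the peak number
of packets within a short window differs, which is the quantity that matters
for a transmit ring of fixed size.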

We used to use a 1 second rate limiting time slot, and didn't interlace
ping/snmp requests, but we found certain interface types on Cisco 6509
switches couldn't keep up with back-to-back pings and would lose them.


> Anyhow, make sure to test this patch:
> 876  2017-Sep-07 02:19  hn_inc_txbr.diff

Yep.  Might take a bit of time to test though because we'll need to get the
customer to spin up a test VM on the same platform, and they are fairly
remote (Perth, Australia).  We don't run any Microsoft servers/HyperV setups
in our lab.

Paul.
-- 
Paul Koch | Founder | CEO
AKIPS Network Monitor | akips.com
Brisbane, Australia


Re: HP Smart array P440 support

2017-09-07 Thread Maciej Suszko
On Sun, 3 Sep 2017 21:19:08 +0200
Maciej Suszko  wrote:

> "Priyadarshana Chandrasena"  wrote:
> > I have HP DL20 Gen 9 server. But I can not install FreeBSD 10.2 in
> > it. I have a HP smart array P440 in my server. I can not find out
> > anywhere if ciss driver support include HP Gen 9 storage
> > controllers.  Could you please update the ciss driver in 10.3?  
> 
> Hi,
> 
> Got two DL20 with P440, UEFI booted 11.0 to logical disk (raid array).
> In a few days I plan to boot it in HBA mode, we'll see if it's working.

Just tested, HBA mode... Machine BIOS U22 v1.80, one-time booted in
Legacy mode with mfsBSD iso in Virtual CD drive (through iLO), recreated
system zpool to mirror of 2 drives, booted in UEFI mode - it does work.

#v+
root@storage-04:~ # camcontrol inquiry da2
pass2:  Fixed Direct Access SPC-4 SCSI device
pass2: Serial Number S4226CNTM70507YN
pass2: 135.168MB/s transfers, Command Queueing Enabled

root@storage-04:~ # camcontrol inquiry da3
pass3:  Fixed Direct Access SPC-4 SCSI device
pass3: Serial Number S4226CHHM70540VJ
pass3: 135.168MB/s transfers, Command Queueing Enabled

root@storage-04:~ # zpool status rpool
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Thu Sep  7 11:56:31 2017
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da2p3   ONLINE       0     0     0
            da3p3   ONLINE       0     0     0

errors: No known data errors
#v-

[ c u t ]
-- 
regards, Maciej Suszko.




Re: 11.1 running on HyperV hn interface hangs

2017-09-07 Thread Sepherosa Ziehau
Weird, your traffic pattern does not even belong to anything heavy.
Sending is mainly UDP, which will never be able to saturate the TX
buffer ring causing the RXBUF ACK sending failure.  This is weird.
Anyhow, make sure to test this patch:
876  2017-Sep-07 02:19  hn_inc_txbr.diff

On Thu, Sep 7, 2017 at 1:07 PM, Paul Koch  wrote:
> On Thu, 7 Sep 2017 10:22:40 +0800
> Sepherosa Ziehau  wrote:
>
>> Is it possible to tell me your workload?  e.g. TX heavy or RX heavy.
>> Enabled TSO or not.  Details like how the send syscalls are issued will
>> be interesting.  And your Windows version, including the patch level,
>> etc.
>>
>> Please try the following patch:
>> https://people.freebsd.org/~sephe/hn_dec_txdesc.diff
>>
>> Thanks,
>> sephe
>
> Hi Sephe,
>
> Here's a bit of an explanation of the environment...
>
> AKIPS Network Monitor workload:
> - 22000 devices (routers/switches/APs/etc)
> - 123000 interfaces (60 snmp polling)
> - 131 netflow exporters
> - ~1500 pings per second
> - ~1000 snmp requests/responses per second (~1.9 million MIB object/min)
> - ~250 netflow packets/sec (~4500 flows/sec incoming)
> - ~130 syslog messages/sec (incoming)
> - ~200 snmp traps/sec (incoming)
>
> The ping/snmp poller is a single monolithic process (no threads).
> Separate processes for each of the syslog/trap/netflow collection.
>
> SNMP requests are sent using the sendto() system call over a non-blocking UDP
> socket for both IPv4 and v6.  We set the UDP socket receive buffer size to
> 4 Mbytes.  Nothing really complex with it.
>
> Pings are interlaced with snmp requests so we limit the bursty nature of
> small back-to-back packets (eliminates issues with switch interfaces dropping
> bursts of packets).  Ping requests are sent using a raw icmp socket.  We
> don't read the responses from the icmp socket, instead we put the interface
> into promiscuous mode and use the BPF info to measure the tx/rx RTT values.
>
> Syslog daemon just listens on a UDP socket with a 4 Mbyte receive buffer.
> Same with the snmp trap daemon.
>
>
> Here's some links to performance graphs of the VM:
>  https://www.akips.com/downloads/hyperv-fbsd11.1p1/system-graphs-last2h.pdf
>  https://www.akips.com/downloads/hyperv-fbsd11.1p1/system-graphs-last24h.pdf
>  https://www.akips.com/downloads/hyperv-fbsd11.1p1/system-graphs-last7d.pdf
>
> The OS was upgraded to 11.1p1 at 5pm on the 5th Sep.  The hn0 interface hung
> at 7:36pm.  The interface hung three times before we reverted to 11.0p9.  It
> takes a few hours after rebooting the VM before the interface hangs.
>
>
> Microsoft Host is running Windows 2012 R2.  Waiting for patch level info from
> the customer.
>
> I'll have to get the customer to spin up a new VM before trying your patch.
>
>
> Here's some info (after a reboot of the VM)
>
> Guest VM dmesg:
>
> FreeBSD 11.1-RELEASE-p1 #0 r322350: Thu Aug 10 22:16:21 UTC 2017
> r...@shed31.akips.com:/usr/obj/usr/src/sys/GENERIC amd64
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM
> 4.0.0)
> VT(vga): text 80x25
> Hyper-V Version: 6.3.9600 [SP18]
>   
> Features=0xe7f
>   PM Features=0x0 [C2]
>   Features3=0x7b2
> Timecounter "Hyper-V" frequency 1000 Hz quality 2000
> CPU: Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz (2300.00-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
>   
> Features=0x1f83fbff
>   Features2=0x80002001
>   AMD Features=0x20100800
>   AMD Features2=0x1
> Hypervisor: Origin = "Microsoft Hv"
> real memory  = 34359738368 (32768 MB)
> avail memory = 33325903872 (31782 MB)
> Event timer "LAPIC" quality 100
> ACPI APIC Table: 
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s)
> random: unblocking device.
> ioapic0: Changing APIC ID to 0
> ioapic0  irqs 0-23 on motherboard
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #2 Launched!
> Timecounter "Hyper-V-TSC" frequency 1000 Hz quality 3000
> random: entropy device external interface
> kbd1 at kbdmux0
> netmap: loaded module
> module_register_init: MOD_LOAD (vesa, 0x80f5b220, 0) error 19
> nexus0
> vtvga0:  on motherboard
> cryptosoft0:  on motherboard
> acpi0:  on motherboard
> acpi0: Power Button (fixed)
> cpu0:  on acpi0
> cpu1:  on acpi0
> cpu2:  on acpi0
> cpu3:  on acpi0
> attimer0:  port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0:  port 0x70-0x71 irq 8 on acpi0
> Event timer "RTC" frequency 32768 Hz quality 0
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
> pcib0:  port 0xcf8-0xcff on acpi0
> vmbus0:  on pcib0
> pci0:  on pcib0
> isab0:  at device 7.0 on pci0
> isa0:  on isab0
> atapci0:  port
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0
> ata0:  at channel 0 on atapci0
> ata1:  at channel 1 on atapci0
> pci0:  at device 7.3 (no drive