Re: [smartos-discuss] network card compatibility questions

2017-11-08 Thread Robert Mustacchi
On 11/7/17 15:35 , Lonnie Cumberland wrote:
> Greetings All,
> 
> As a short test to play around a bit, I just tried to boot up the latest
> SmartOS on my AMD 64-bit HP Pavilion A7-1461 PC, on which I have been running
> Ubuntu 16.04. Although it does boot up reasonably well, I found that
> it does not recognize my network cards: "Qualcomm Atheros AR8161 Gigabit
> Ethernet" and "Broadcom Limited BCM43228 802.11 a/b/g/n" wireless.
> 
> The SmartOS Hardware Requirements pointed me over to http://illumos.org/hcl/
> 
> and sure enough, the Qualcomm Atheros AR8151 Gigabit Ethernet is supported,
> but not the Qualcomm Atheros AR8161 Gigabit Ethernet. The Broadcom Limited
> BCM43228 is not even on the list and I guess that is due to it being
> wireless.

For the interim, as others have suggested, an Intel NIC of some kind will
get you moving. I might ask that you hold onto that AR8161, as it may be
possible for us to add support for it, and having someone with the hardware
will really help. I can't promise anything there, or any timeline, though.

Thanks,
Robert


---
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com


Re: [smartos-discuss] Poor network performance on 10GbE over VLAN

2017-11-09 Thread Robert Mustacchi
Hi Denis,

The fundamental issue right now is that in certain VLAN configurations
the system is not taking advantage of hardware polling as it should be.
This means that the system falls back to high watermarks for packet
rates that are a bit low, which ends up netting performance at
roughly what you're seeing for a single thread.

As you might imagine, we're acutely aware of this problem and are in the
process of implementing RFD 97
(https://github.com/joyent/rfd/tree/master/rfd/0097) to address it,
which is already seeing promising results from our current experiments.

Robert

On 11/9/17 19:35 , Denis Cheong wrote:
> I am adding 10GbE to my existing SmartOS server but am experiencing unusual 
> and severe performance issues that I’m at a loss to explain.
> 
> Over the default untagged 10GbE link, I can get >9Gbit/sec consistently under 
> all configurations.
> As soon as I test over a VLAN, transfer rates plummet to a very inconsistent 
> 3-4Gbit/sec RX, and <1Gbit/sec TX.
> 
> Does anybody have any ideas what might be going on here?
> 
> Performance over default VLAN ID (SmartOS is running iperf3 -s; nb with 
> SmartOS as client and other host as server, performance is identical):
> 
> Connecting to host 192.168.245.14, port 5201
>   local 192.168.245.21 port 56809 connected to 192.168.245.14 port 5201
>   Interval         Transfer     Bandwidth
>   0.00-1.00   sec  1.12 GBytes  9.58 Gbits/sec
>   1.00-2.00   sec  1.12 GBytes  9.62 Gbits/sec
>   2.00-3.00   sec  1.12 GBytes  9.62 Gbits/sec
>   3.00-4.00   sec  1.12 GBytes  9.60 Gbits/sec
>   4.00-5.00   sec  1.12 GBytes  9.59 Gbits/sec
>   5.00-6.00   sec  1.12 GBytes  9.61 Gbits/sec
>   6.00-7.00   sec  1.12 GBytes  9.59 Gbits/sec
>   7.00-8.00   sec  1.10 GBytes  9.47 Gbits/sec
>   8.00-9.00   sec  1.12 GBytes  9.60 Gbits/sec
>   9.00-10.00  sec  1.12 GBytes  9.63 Gbits/sec
>   - - - - - - - - - - - - - - - - - - - - - - - -
>   Interval         Transfer     Bandwidth
>   0.00-10.00  sec  11.2 GBytes  9.59 Gbits/sec  sender
>   0.00-10.00  sec  11.2 GBytes  9.59 Gbits/sec  receiver
> 
> Performance over the same link, but over VLAN 300 (SmartOS is running iperf3 
> -s; note wild variation from 2 - 5Gbit/sec):
> 
> Connecting to host 192.168.245.134, port 5201
>   local 192.168.245.133 port 56786 connected to 192.168.245.134 port 5201
>   Interval         Transfer     Bandwidth
>   0.00-1.00   sec   523 MBytes  4.39 Gbits/sec
>   1.00-2.00   sec   481 MBytes  4.04 Gbits/sec
>   2.00-3.00   sec   608 MBytes  5.10 Gbits/sec
>   3.00-4.00   sec   560 MBytes  4.70 Gbits/sec
>   4.00-5.00   sec   242 MBytes  2.03 Gbits/sec
>   5.00-6.00   sec   592 MBytes  4.96 Gbits/sec
>   6.00-7.00   sec   553 MBytes  4.64 Gbits/sec
>   7.00-8.00   sec   253 MBytes  2.12 Gbits/sec
>   8.00-9.00   sec   569 MBytes  4.77 Gbits/sec
>   9.00-10.00  sec   507 MBytes  4.25 Gbits/sec
>   - - - - - - - - - - - - - - - - - - - - - - - -
>   Interval         Transfer     Bandwidth
>   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec  sender
>   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec  receiver
> 
> Performance over the same link, VLAN 300, SmartOS as client, server on other 
> host (note significantly worse performance on transmit):
> 
> Connecting to host 192.168.245.133, port 5201
>   local 192.168.245.134 port 35851 connected to 192.168.245.133 port 5201
>   Interval         Transfer     Bandwidth
>   0.00-1.00   sec   104 MBytes   875 Mbits/sec
>   1.00-2.00   sec  46.3 MBytes   389 Mbits/sec
>   2.00-3.00   sec   130 MBytes  1.09 Gbits/sec
>   3.00-4.00   sec  76.0 MBytes   638 Mbits/sec
>   4.00-5.00   sec  97.0 MBytes   814 Mbits/sec
>   5.00-6.00   sec  17.4 MBytes   146 Mbits/sec
>   6.00-7.00   sec  67.6 MBytes   567 Mbits/sec
>   7.00-8.00   sec  92.4 MBytes   775 Mbits/sec
>   8.00-9.00   sec  79.7 MBytes   669 Mbits/sec
>   9.00-10.00  sec  73.3 MBytes   615 Mbits/sec
>   - - - - - - - - - - - - - - - - - - - - - - - -
>   Interval         Transfer     Bandwidth
>   0.00-10.00  sec   785 MBytes   658 Mbits/sec  sender
>   0.00-10.00  sec   784 MBytes   658 Mbits/sec  receiver
> 
>

Re: [smartos-discuss] SmartOS functionality question

2017-11-09 Thread Robert Mustacchi
On 11/9/17 14:39 , Lonnie Cumberland wrote:
> Hi All,
> 
> Well, the weekend is finally approaching and I am hoping to get a bit more
> done on my SmartOS projects.
> 
> With that in mind, I was wondering something about the way that SmartOS can
> start/stop zones with VMs.
> 
> I am just wondering if you can suspend a VM and save the current state of
> the system instead of having to shut down? Then later you can just restart
> the VM from the saved state.
> 
> Virtualbox has this type of "pause" function as do a number of other
> hypervisors and I was just wondering about SmartOS in this regard.

Hi Lonnie,

We do not have such functionality at this time.

Robert




Re: [smartos-discuss] No resolv.conf in ubuntu-certified-17.10

2017-11-28 Thread Robert Mustacchi
On 11/21/17 13:07 , Daniel Kontsek wrote:
> Hello,
> 
> We’ve got a bug report about the ubuntu-certified-17.10 image from
> images.joyent.com not setting /etc/resolv.conf from the resolvers property
> in vmadm. The ubuntu-certified-16.04 image and earlier ubuntu-certified
> images set /etc/resolv.conf correctly.
> 
> At first, I was thinking that this has something to do with cloud-init. But 
> after playing around I noticed that the resolvconf package is missing or was 
> removed as there is a symlink to a non-existent file in /run 
> (/etc/resolv.conf -> ../run/resolvconf/resolv.conf). Installing resolvconf 
> solves the problem.
> However, there is also the systemd-resolved service, which maintains 
> /run/systemd/resolve/resolv.conf from the first boot. So changing the symlink 
> to point to /run/systemd/resolve/resolv.conf seems to be the best solution.
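
The symlink change described above can be sketched as a small shell helper to run inside the Ubuntu guest. This is only an illustration, not an official fix from the image maintainers: the function name `fix_resolv_symlink` and its optional root-directory argument are assumptions made here so the change can be tried against a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper (name and argument are illustrative, not from the thread).
# Repoints etc/resolv.conf at the file maintained by systemd-resolved.
# $1 is an optional alternate root; empty means the real root filesystem.
fix_resolv_symlink() {
    root="${1:-}"
    # -s: symbolic link, -f: replace the existing dangling link,
    # -n: do not follow the existing link when replacing it (GNU ln).
    ln -sfn ../run/systemd/resolve/resolv.conf "${root}/etc/resolv.conf"
}

# Example (as root inside the guest): fix_resolv_symlink
```

The relative target mirrors the style of the existing link (`../run/resolvconf/resolv.conf`), so it resolves to /run/systemd/resolve/resolv.conf from /etc.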

Hi Daniel,

Thanks for reporting this. I'll forward this on to some of the folks
working on images and hopefully we'll hear something back on this.

Thanks,
Robert





Re: [smartos-discuss] Create network overlays after boot

2017-11-28 Thread Robert Mustacchi
On 10/31/17 2:16 , Daniel Kontsek wrote:
> Hi,
> 
> This post can be considered a continuation of
> https://www.mail-archive.com/smartos-discuss@lists.smartos.org/msg04922.html
> 
> Overlay networking in SmartOS is great, but the inability to configure overlays
> automatically after boot is a real PITA. We have a proposal for implementing
> overlay persistence using the config file. But before we implement it, I
> want to ask the community: maybe there's already some other way planned for
> overlay persistence.

Hi Daniel,

Thanks for putting this together. Sorry it's taken a while to get back
around to this. When I put together the overlay stuff originally I
wasn't sure how we wanted to expose it and have it make sense.

> The proposal is described here: 
> https://github.com/erigones/esdc-factory/issues/85#issuecomment-340701358 
> 
> 
> Basically, the `overlay_="”` from usbkey/config will 
> be transformed into a valid json in 
> /var/run/smartdc/networking/overlay_rules.json by network/physical SVC.

One thing that we've been trying to do with the nic tags (and really this
is just another form of them) is to have this be managed by nictagadm. I
think the config logic is probably all right, but I'd want to make sure
that we're able to manage it that way, with nictagadm really being the
interface for this rather than having folks continue using the tag.

Cody, what do you think here?

Robert




Re: [smartos-discuss] DHCP server in a zone?

2017-12-19 Thread Robert Mustacchi
On 12/19/17 7:55 , Matthias Teege wrote:
> Hello!
> 
> I've installed SmartOS, created a zone and installed isc-dhcpd
> from the packages. I can see the DHCPDISCOVER and a DHCPOFFER in
> the logs but don't see an answer packet on the network interfaces.
> The client gets no address. I've also tried an lx-branded zone
> with the same result.
> 
> Do I have to "tune" the zone or the root zone to handle DHCP?

Did you change any of the anti-spoofing properties on the zone in
vmadm? By default, a zone is prevented from being a DHCP server.

Robert




Re: [smartos-discuss] DHCP server in a zone?

2017-12-20 Thread Robert Mustacchi
On 12/20/17 3:32 , Matthias Teege wrote:
> On Tue, Dec 19, 2017 at 07:57:27AM -0800, Robert Mustacchi wrote:
> 
> Hello!
> 
>> On 12/19/17 7:55 , Matthias Teege wrote:
> 
>>> Do I have to "tune" the zone or the root zone to handle DHCP?
>>
>> Did you change any of the anti-spoofing properties on the zone in
>> vmadm? By default, a zone is prevented from being a DHCP server.
> 
> I've found the documentation. Setting '"allow_dhcp_spoofing": true'
> solved the problem. Maybe the "dhcp_server" is the better option.

Generally, the "dhcp_server" option is the better one. It's the main
thing we set on our zones in Triton that serve DHCP. I say better
mostly because I like to use the minimal feature set here.
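
As a sketch of how the narrower option might be applied, the helper below builds a `vmadm update` payload that sets `dhcp_server` on a single NIC. The function name and the MAC address are placeholders, not values from the thread; check the property names against the vmadm manual page on your platform before relying on this.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON payload that enables dhcp_server
# on the NIC with the given MAC address.
dhcp_server_payload() {
    printf '{"update_nics": [{"mac": "%s", "dhcp_server": true}]}\n' "$1"
}

# Example, run from the global zone (MAC and <uuid> are placeholders):
#   dhcp_server_payload "02:08:20:aa:bb:cc" | vmadm update <uuid>
```

Setting only `dhcp_server` leaves the other anti-spoofing protections on the NIC in place, which matches the "minimal feature set" reasoning above.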

Robert




Re: [smartos-discuss] "man -s2 stat" is WRONG!

2018-01-18 Thread Robert Mustacchi
On 1/17/18 18:40 , Jesus Cea wrote:
> What would be the right approach to request a man page update? I guess
> this should be pushed through illumos, but I don't know the details.

The bug report to update the manual page for the defined higher
precision values is sufficient. I'll file that and take care of updating
the manual page. It's worth noting that the actual resolution will be
dependent on the file system and hardware clock.

Robert




Re: [smartos-discuss] hanging boot on ASUS box

2018-02-10 Thread Robert Mustacchi
On 2/8/18 16:17 , Robert Fisher wrote:
> No joy with anything ACPI, XHCI or legacy boot. I've tried absolutely
> everything in the BIOS, and I can't even make it fail in a different way.
> FreeBSD, OpenBSD, Ubuntu, all work great. SmartOS, OmniOS, Solaris, all
> lock at the same point.
> 
> Anyone got any more ideas?

Do you by chance happen to have a serial header on that system?

Thanks,
Robert





Re: [smartos-discuss] failure to boot on Intel Core i7 8700 Hexa-Core Processor (Coffee Lake) and Gigabyte Z370XP SLI motherboard

2018-02-10 Thread Robert Mustacchi
On 2/8/18 18:34 , de...@hyltown.com wrote:
> I have what appears to be roughly the same situation as described by Robert 
> Fisher (ASUS H110 board and a 6th gen i5). Booting with -v I end up with:
> 
> root on /ramdisk:a fstype ufs
> 
> Neither F1+A nor Shift+Pause breaks out of this. Following suggestions from 
> Robert's thread, I've attempted "-B disable-xhci=true" and "-B 
> disable-acpi=true" but saw no difference. I have also disabled all the C-states, 
> UEFI, power management, and SpeedStep-type things I could find, but saw 
> no difference. 
> 
> rmustacc said he hadn't seen reports of Coffee Lake at all, working or not.
> 
> ricco386 suggested looking through this seemingly related issue:
> https://github.com/joyent/smartos-live/issues/727
> 
> Though I didn't understand everything going on in that thread, and knowing 
> that I am likely running a different version than what is depicted there, I 
> did follow steps from this point:
> https://github.com/joyent/smartos-live/issues/727#issuecomment-342868065
> 
> Here is what I saw at the point of the hang, which definitely differs from 
> what was shown in that thread:
> https://postimg.org/image/4g4kovbdx/
> 
> 
> So ...
> 
> I've been running SmartOS on Supermicro hardware for several years at small 
> customer sites, using a mix of native zones and KVM to achieve a more-or-less 
> all-in-one server solution. I set them up and automate what management I can, 
> and don't revisit until/unless there are problems. It basically just works - 
> so I never end up getting very deep into troubleshooting. Because of this, 
> I'm unfamiliar with kernel debugging and all that - so I don't have much to 
> share relating to my problem other than what is listed above.
> 
> BUT the Gigabyte/i7 box is one I just built in hopes of playing around - so 
> it's currently available to bang on in case someone cares to hold my hand 
> through doing so. Perhaps this can be used to find/circumvent issues related 
> to coffee lake. Any help would be appreciated.

Is there a serial header that we can use for kmdb on that system? It may
be useful to try and use the module auto load / breakpoint system and/or
maybe disable the boot of other CPUs to try and debug.

Robert




[smartos-discuss] Flag day: OS-6947 ucode shouldn't need install step

2018-06-11 Thread Robert Mustacchi
Hi,

If you don't build the platform image, then you can ignore this e-mail.

I put back "OS-6947 ucode shouldn't need install step". With this you will
need to make sure that you update both smartos-live and illumos-joyent
in tandem. If you have any questions or issues, please let me know. My
apologies to those of you who have build issues due to the initial
integration of OS-6944.

Thanks,
Robert




[smartos-discuss] Heads Up: OS-6992 Want hypervisor API for FPU management

2018-06-12 Thread Robert Mustacchi
Hi,

If you don't build the platform, you can ignore this message.

With the integration of OS-6992 Want hypervisor API for FPU management,
if you update the kvm repo, you will need to make sure that you update
illumos-joyent as well. This is primarily done as part of cleaning up
and laying the groundwork to be able to run both bhyve and kvm at the
same time ala Apple's hypervisor framework. If you have any questions or
issues, please let me know.

Thanks,
Robert




[smartos-discuss] platform flag day: optional gcc6 in -extra

2018-07-17 Thread Robert Mustacchi
Hi all,

If you don't build the platform, then you can skip this message. With
the integration of 'OS-7042 illumos-extra should support building
optional, extra gcc versions' if you update illumos-extra, then you must
update smartos-live and rerun ./configure for each workspace. In other
words the following steps should be taken:

$ gmake clobber
$ gmake update
$ ./configure

If you'd like to build the optional compilers, you should specify:

$ gmake BUILD_EXTRA_GCC=yes live

Note, this gcc6 is not currently being used. We will be more
aggressively introducing more compilers and warnings along with
bootstraps to ease the process.

If you have any questions, please reach out.

Thanks,
Robert




Re: [smartos-discuss] after SmartOS Clean Re-install (20180711T060947Z) issue with Intel 10 Gigabit X710-DA2 SFP+ Dual Port Network Card

2018-07-17 Thread Robert Mustacchi
Hi Daniel,

It looks like your dump device may not be large enough for us to have
the entire dump. Is it possible to increase the dump device? I guess
there's something that's gone wrong as a result of the integration of
the TSO support for i40e.

Robert

On 7/17/18 12:00 , Daniel Plominski wrote:
> Hi,
> 
> after a smartos reinstall and upgrade to a newer platform image, the
> network card will not work after a while.
> The network card is responsible for vlan. Vlan works in a lx zone, but
> not for a kvm anymore.
> 
> 1.  SmartOS Clean Re-install
>  Restore Datasets Settings
>  Restore usbkey/config etc.
>  Restore ZFS Datasets
>  Import Zones to /etc/zones/index
>  Run / Start LX & KVM Zones
> 
> PI from SunOS assg10 5.11 joyent_20180509T053210Z i86pc i386 i86pc to
> SunOS assg10 5.11 joyent_20180717T123432Z i86pc i386 i86pc
> https://github.com/ass-a2s/illumos-joyent/tree/ass-release-20180717
> https://datasets.ass.de/public/SmartOS/20180717T123432Z/smartos-20180717T123432Z-USB.img.bz2
> 
> Rollback on version joyent_20180509T053210Z now shows the same error
> after a while.
> 
> I did not see any abnormalities under
> https://github.com/joyent/illumos-joyent/tree/master/usr/src/uts/common/io/i40e
> 
> after Shutdown all VMs and SmartOS Reboot:
> 
> 2018-07-16T05:18:37.496685+00:00 assg10 genunix: [ID 936769 kern.info]
> mpt_sas2 is /pci@7a,0/pci8086,2f01@0/pci1000,30e0@0/iport@v0
> 2018-07-16T05:18:37.496688+00:00 assg10 genunix: [ID 408114 kern.info]
> /pci@7a,0/pci8086,2f01@0/pci1000,30e0@0/iport@v0 (mpt_sas2) online
> 2018-07-16T05:18:37.496703+00:00 assg10 genunix: [ID 454863 kern.info]
> dump on /dev/zvol/dsk/zones/dump size 10465 MB
> 2018-07-16T05:18:37.496706+00:00 assg10 genunix: [ID 127566 kern.info]
> device pciclass,03@0(display#0) keeps up device sd@0,0(disk#0), but
> the former is not power managed
> 2018-07-16T05:18:37.496709+00:00 assg10 mac: [ID 469746 kern.info]
> NOTICE: aggr1000 registered
> 2018-07-16T05:18:37.496712+00:00 assg10 mac: [ID 435574 kern.info]
> NOTICE: igb1 link up, 1000 Mbps, full duplex
> 2018-07-16T05:18:37.496715+00:00 assg10 mac: [ID 435574 kern.info]
> NOTICE: aggr1000 link up, 1000 Mbps, full duplex
> 2018-07-16T05:18:37.496718+00:00 assg10 mac: [ID 435574 kern.info]
> NOTICE: igb2 link up, 1000 Mbps, full duplex
> 2018-07-16T05:18:37.496721+00:00 assg10 mac: [ID 435574 kern.info]
> NOTICE: igb3 link up, 1000 Mbps, full duplex
> 2018-07-16T05:18:37.496724+00:00 assg10 mac: [ID 435574 kern.info]
> NOTICE: igb0 link up, 1000 Mbps, full duplex
> 2018-07-16T05:18:37.496727+00:00 assg10 genunix: [ID 390243 kern.info]
> Creating /etc/devices/devid_cache
> 2018-07-16T05:18:37.496730+00:00 assg10 genunix: [ID 390243 kern.info]
> Creating /etc/devices/pci_unitaddr_persistent
> 2018-07-16T05:18:37.497026+00:00 assg10 savecore: [ID 570001 auth.error]
> reboot after panic: assertion failed: tcb != NULL, file:
> ../../common/io/i40e/i40e_transceiver.c, line: 2074
> 2018-07-16T05:18:33+00:00 assg10 savecore: [ID 676874 auth.error] Saving
> compressed system crash dump in /var/crash/volatile/vmdump.0
> 2018-07-16T05:18:40.860505+00:00 assg10 unix: [ID 504448 kern.info]
> NOTICE: Fastboot: Couldn't open /platform/i86pc/amd64/boot_archive
> 2018-07-16T05:18:45.870340+00:00 assg10 pseudo: [ID 129642 kern.info]
> pseudo-device: devinfo0
> 2018-07-16T05:18:45.870392+00:00 assg10 genunix: [ID 936769 kern.info]
> devinfo0 is /pseudo/devinfo@0
> 2018-07-16T05:18:50.549203+00:00 assg10 genunix: [ID 390243 kern.info]
> Creating /etc/devices/devname_cache
> 2018-07-16T05:20:04+00:00 assg10 savecore: [ID 320429 auth.error]
> Decompress the crash dump with #012'savecore -vf
> /var/crash/volatile/vmdump.0'
> 2018-07-16T05:20:04.857876+00:00 assg10 rootnex: [ID 349649 kern.info]
> xsvc0 at root: space 0 offset 0
> 2018-07-16T05:20:04.857902+00:00 assg10 genunix: [ID 936769 kern.info]
> xsvc0 is /xsvc@0,0
> 2018-07-16T05:20:06.914513+00:00 assg10 fmd: [ID 377184 daemon.error]
> SUNW-MSG-ID: FMD-8000-2K, TYPE: Defect, VER: 1, SEVERITY:
> Minor#012EVENT-TIME: Mon Jul 16 05:20:06 UTC 2018#012PLATFORM:
> Super-Server, CSN: 9000135765, HOSTNAME:
> assg10.assdomain.intern#012SOURCE: fmd-self-diagnosis, REV:
>#012EVENT-ID: 901bac51-20d6-c3ba-d2c3-df70f5af044f#012DESC: An
> illumos Fault Manager component has experienced an error that required
> the module to be disabled.  Refer to http://illumos.org/msg/FMD-8000-2K
>  for more information.#012AUTO-RESPONSE: The module has been disabled.
> Events destined for the module will be saved for manual
> diagnosis.#012IMPACT: Automated diagnosis and response for subsequent
> events associated with this module will not occur.#012REC-ACTION: Use
> fmdump -v -u  to locate the module.  Use fmadm reset 
> to reset the module.
> [root@assg10 ~]#
> [root@assg10 /var/crash/volatile]# ls -all
> total 20467016
> drwx--   2 root root   5 Juli 16 05:20 .
> drwxr-xr-x   3 root root   3 Juli 15 23:03 ..
> -rw-r--r--   1 root root 

Re: [smartos-discuss] after SmartOS Clean Re-install (20180711T060947Z) issue with Intel 10 Gigabit X710-DA2 SFP+ Dual Port Network Card

2018-07-19 Thread Robert Mustacchi
On 7/19/18 5:39 , Daniel Plominski wrote:
> Hi,
> 
> In order to avoid major disruptions, we have now started the entire
> server infrastructure with version 20180526T113546Z
> https://datasets.ass.de/public/SmartOS/20180526T113546Z/smartos-20180526T113546Z-USB.img.bz2

I really can't recommend doing that. You've made yourself vulnerable to
the eager FPU security issue.

> We need proper support for:
> Intel 10 Gigabit X710-DA2 SFP+ (Firmware NVM 1.5/1.7)

The firmware differences are likely just the driver whinging. The
bigger issue is likely whatever the rare logic bug is with the LSO
additions. If you do have a complete dump that captures it, that would
help: none of our testing has hit this issue, so I'm not sure we're
going to see it in our environment, as it may depend on the traffic
pattern that was going on at your end.

Robert




Re: [smartos-discuss] NVME 1.3

2018-07-27 Thread Robert Mustacchi
On 7/27/18 7:38 , Jan Paul wrote:
> Given the amazing progress Robert has made on making the new Kaby Lake 
> machines work with SmartOS (huge kudos!),
> I'd like to raise a question related to the hardware needs of us "small-scale" hobbyists.
> 
> NUCs still seem to be the most efficient small lab box for playing with 
> SmartOS at home, and those are mostly NVMe-only machines.
> I found that getting NVMe 1.2 M.2 SSDs is almost impossible, as most of the 
> NVMe drives I've been able to get are 1.3 ones.
> 
> Is there any plan to up-rev the illumos nvme driver to support the new 
> version?

Yes, that's on the TODO list.

> So far I work around it by setting strict-version=0 in nvme.conf, and the 
> system works, but I just wanted to bring this question up so
> we can get some feedback on it.

That's good to know. It should usually work and we maybe should relax
that strict version check to only be based on major versions. Which NVMe
1.3 parts are you using?
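
To make the idea of a major-version-only check concrete, here is a small shell sketch. It is purely illustrative: the real check lives in the nvme driver's C code, and the function name and arguments below are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical illustration of a "major version only" policy.
# Succeeds (exit 0) when the device's major version equals the supported one.
nvme_major_matches() {
    device_version="$1"      # e.g. "1.3", as reported by the device
    supported_major="$2"     # e.g. "1"
    # Strip everything from the first dot onward to get the major number.
    [ "${device_version%%.*}" = "$supported_major" ]
}

# Example: a 1.3 device passes against a driver that supports major version 1.
#   nvme_major_matches 1.3 1 && echo "accepted"
```

Under such a policy a 1.3 part would be accepted without needing the strict-version=0 workaround, while a device with a different major version would still be rejected.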

Thanks,
Robert




Re: AW: [smartos-discuss] Latest Smartos version cannot boot Win7 x64 with KVM

2018-07-27 Thread Robert Mustacchi
Hi Gernot,

I believe that some part of the eager FPU and the common FPU API for
hypervisors is likely responsible for this regression. Apologies, I
haven't had the time to hunt that down and am seeing if I can track down
a Win 7 image.

Robert

On 7/27/18 1:02 , Gernot Straßer wrote:
> I would really appreciate a word from Joyent engineers, if this issue is
> being worked on or if there is a work-around yet…
> 
> thanks
> 
> From: Gernot Straßer [mailto:gernot.stras...@freenet.de]
> Sent: Tuesday, 17 July 2018 07:56
> To: smartos-discuss@lists.smartos.org
> Subject: [smartos-discuss] Latest Smartos version cannot boot Win7 x64 with
> KVM
> 
> See also https://github.com/joyent/smartos-live/issues/792
> 
> Regards
> 
> Gernot
> 




Re: [smartos-discuss] based on OS-5492 - add more Advanced-Format drives (wiki.illumos.org 29.07.2018)

2018-08-01 Thread Robert Mustacchi
On 8/1/18 11:18 , Daniel Plominski wrote:
> Hi,
> 
> We have several Samsung 850 PRO (MZ7WD480) SSDs in use, and we need a
> zpool ashift of 13. Maybe adding them to the sd.conf in the current
> joyent/smartos-live repository makes sense:
> 
> https://github.com/ass-a2s/smartos-live/commit/386d25877b44d8c41057aefc933256dd7cc58c7f

Are there correctness problems? What sector sizes is the drive advertising?

Robert




Re: [smartos-discuss] based on OS-5492 - add more Advanced-Format drives (wiki.illumos.org 29.07.2018)

2018-08-02 Thread Robert Mustacchi
On 8/1/18 12:18 , Daniel Plominski wrote:
> Hi Robert,
> 
> before the patch, the smartos setup had only ever used an ashift of 9,
> which is definitely wrong since the Samsung 850 PRO uses native 8k blocks

The reason we did this previously was because those devices were
actually causing ZFS errors on a scrub that seemed to be firmware
related. If it's not actually causing reliability issues with the drive,
then that's going to be a different story. What logical and physical
sector size is the device actually advertising?
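
For reference, ashift is the base-2 logarithm of the sector size ZFS assumes for a device, so the values discussed in this thread map to byte sizes as in the sketch below. The function name is just an illustration.

```shell
#!/bin/sh
# Convert a ZFS ashift value to the sector size in bytes it implies (2^ashift).
ashift_to_bytes() {
    echo $((1 << $1))
}

ashift_to_bytes 9     # prints 512  (the value the setup used before the patch)
ashift_to_bytes 13    # prints 8192 (the 8k size requested for the 850 PRO)
```

This only shows the arithmetic; what the drive actually advertises as its logical and physical sector size still has to be read from the device itself.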

Robert




[smartos-discuss] Subject: L1 Terminal Fault (CVE-2018-3615, CVE-2018-3620, CVE-2018-3646)

2018-08-14 Thread Robert Mustacchi
Hi All,

Several vulnerabilities collectively called L1 Terminal Fault (L1TF)
have been announced: CVE-2018-3615, CVE-2018-3620, and
CVE-2018-3646. I wanted to call attention to the fact that this is a
problem for SmartOS users who are running multi-tenant, untrusted
workloads. The full Joyent security advisory is available at:
https://help.joyent.com/hc/en-us/articles/360007955414-Security-Advisory-Intel-L1-Terminal-Fault-Vulnerabilities-CVE-2018-3615-CVE-2018-3620-CVE-2018-3646-.

We'll have updated platform media available with fixes for this out
shortly. The changes have just been integrated and you can find the
reviews at https://cr.joyent.us/#/c/4679/ and
https://cr.joyent.us/#/c/4680/.

If you have any questions, please reach out.

Thanks,
Robert




[smartos-discuss] ACPI Testing Request

2018-08-16 Thread Robert Mustacchi
Hi All,

There have been a number of issues with boot hangs on some of the more
recent Kaby Lake processors and a few Skylake SKUs. We were able to root
cause this to a deadlock in the core ACPICA code. For the full details
see https://github.com/joyent/smartos-live/issues/727 and
https://smartos.org/bugview/OS-7093.

Since ACPI changes can be a bit gritty, I would like to ask for a bit
more help in testing this across a variety of platforms -- in particular
Desktop platforms. I've put together a series of test images that have
the newer ACPI and also end up logging substantially more ACPI related
information to the console in case something goes wrong (particularly on
debug bits).

If you could test this and ensure that you can boot and reboot OK, I
would greatly appreciate it. I have both debug and non-debug media. If
you'd like to build this yourself, the changes to illumos-joyent that
we've made are available at
https://github.com/rmustacc/illumos-gate/tree/acpi-dev-smartos. I'd also
like to thank Mike Gerdts who wrote a bunch of tools for updating the
ACPI tree in illumos which has made this effort substantially easier.
Based on that we're now able to better track how we're handling changes
and revisions to ACPI. That's available at
https://github.com/joyent/acpica/tree/joyent/20180629-wip.

Please note that these images are based on a platform from last week.

non-debug raw platform:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/non-debug/platform-20180807T230146Z.tgz

non-debug ISO vga:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/non-debug/acpi-nd-vga.iso

non-debug ISO ttya:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/non-debug/acpi-nd-ttya.iso

non-debug ISO ttyb:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/non-debug/acpi-nd-ttyb.iso

non-debug USB vga:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/non-debug/acpi-nd-vga.usb.bz2

non-debug USB ttya:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/non-debug/acpi-nd-ttya.usb.bz2

non-debug USB ttyb:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/non-debug/acpi-nd-ttyb.usb.bz2


debug raw platform:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/debug/platform-20180807T223604Z.tgz

debug ISO vga:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/debug/acpi-debug-vga.iso

debug ISO ttya:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/debug/acpi-debug-ttya.iso

debug ISO ttyb:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/debug/acpi-debug-ttyb.iso

debug USB vga:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/debug/acpi-debug-vga.usb.bz2

debug USB ttya:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/debug/acpi-debug-ttya.usb.bz2

debug USB ttyb:
https://us-east.manta.joyent.com/rmustacc/public/preview/acpi-201808/debug/acpi-debug-ttyb.usb.bz2

Again, thank you in advance for giving this a shot. If you test this,
whether it works or (especially) if it does not, please reply and let me
know which motherboard, processor, and BIOS revision you're using.

Thanks,
Robert




Re: [smartos-discuss] NVME 1.3

2018-08-20 Thread Robert Mustacchi
On 7/27/18 8:51 , Jan Paul wrote:
> 
> 
>> On 27 Jul 2018, at 17:29, Robert Mustacchi  wrote:
>>
>> On 7/27/18 7:38 , Jan Paul wrote:
>>> Given the amazing progress Robert has made on making the new Kaby lake 
>>> machines work with SmartOS (huge kudos!),
>>> I'd like to raise a question related to us "small scale" hobbyists HW needs.
>>>
>>> Given the fact the NUCs seem to be still the most efficient small lab box 
>>> for home SmartOS playing and those are mostly NVME only machines.
>>> I found that getting NVME 1.2 M2 SSDs is almost impossible as most of the 
>>> NVMEs I've been able to get are 1.3 ones.
>>>
>>> Is there any plan on up-reving the Illumos nvme driver with the new version 
>>> support?
>>
>> Yes, that's on the TODO list.
> Perfect!
>>
>>> I so far workaround it via setting the strict-version=0 in nvme.conf and 
>>> the system so far works, but I just wanted to bring this question up so
>>> we can get some feedback on it.
>>
>> That's good to know. It should usually work and we maybe should relax
>> that strict version check to only be based on major versions. Which NVMe
>> 1.3 parts are you using?
>>
> Samsung 970EVO 250G
> 
> it seems to be pretty happy with the workaround (so far about 10 hours of 
> runtime on SmartOS).

FYI, a fix for this went back today:
https://github.com/joyent/illumos-joyent/commit/1eb19b4a7770efe8736592808ccffef5e3c16bb8.

Robert




Re: [smartos-discuss] Call for Testers, usb->ethernet devices (Round 2)

2014-03-04 Thread Robert Mustacchi
On 3/2/14 9:20 , G B wrote:
> I can validate the usb-ethernet drivers work.  I have a Trendnet TU-ET100C, 
> Trendnet TU2-ET100 and another 10/100 usb Ethernet adapter using the axf 
> driver and all three are recognized.  The non-Trendnet adapter is what I 
> currently have plugged into my switch and it is working for an Internet 
> connection from SmartOS.

Thank you for testing. I've landed them in illumos-joyent. They should
be a part of the next release.

Robert

> On Friday, February 28, 2014 3:04 PM, Robert Mustacchi  
> wrote:
>   
> On 2/28/14 12:06 , Garrett D'Amore wrote:
>> Maybe we ought to integrate these upstream.  As part of that effort, if it 
>> were to occur, I’d like to see the legacy STREAMS DLPI stuff yanked — there 
>> is no reason these drivers need that code for any illumos system. (Or 
>> indeed, for any system S10 or newer.)
> 
> Integrating it into illumos is certainly the long term plan.
> 
> Robert
> 
>> On February 28, 2014 at 11:41:10 AM, Robert Mustacchi (r...@joyent.com) 
>> wrote:
>>
>> This is an update of the previous set of test drivers that were built  
>> incorrectly. I've been able to validate that these drivers all at least  
>> load, though I have no devices to attach them to. You can find them at:  
>>
>> http://us-east.manta.joyent.com/rmustacc/public/preview/preview.html 
>>
>> Here's the background from my previous mail on this subject:  
>>
>> Over the years, several folks have been interested in the USB to
>> Ethernet drivers that Masa Murayama has put together at
>> (http://homepage2.nifty.com/mrym3/taiyodo/eng/). After talking with him,  
>> I've gone ahead and put together some test images of SmartOS with these  
>> devices with the goal of integrating them into illumos. If you have one  
>> of these devices and currently build your own driver and image, and  
>> could give these a shot, I'd greatly appreciate it as I don't have any  
>> of these devices myself.  
>>
>> Thanks,  
>> Robert  
>>





[smartos-discuss] KVM / illumos flag day

2014-03-10 Thread Robert Mustacchi
Hi,

I've just pushed a fix for 'HVM-796 kvm time jumps on Sandy Bridge'
which requires a corresponding change in the illumos repo, 'OS-2817 Need
a way to get tsc deltas'. If you update the illumos-kvm repo, you also
need to update your illumos repo; otherwise the build will fail.

Robert




Re: [smartos-discuss] Hung Boot on upgrade

2014-03-15 Thread Robert Mustacchi
On 03/15/2014 12:56 AM, Matthias Götzke wrote:
> Hi,
> 
> We wanted to boot our smartos server with a newer image (tried a few). Sadly 
> the boot seems stuck.
> 
> Booting with noimport=true and manually importing seems stuck too no 
> (observable) progress on zpool import zones we let it run for a few hours, 
> and its a very small pool , just a 4 disks plus cache drives.
> 
> Booting the old image works fine..
> 
> What is the best way to debug such issues ?

The best thing to do here is again to boot noimport. Next, run the zpool
import in the background. Once that appears to be hung, you'll want to
run something like this:

$ pgrep zpool
$ # Note the pid that pgrep prints and substitute it for <pid> below
$ mdb -ke '0t<pid>::pid2proc | ::walk thread | ::findstack -v'

That should generate output and tell us where in the kernel the threads
that appear to be hung are. Based on that, we should be able to make
more progress.
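
If it helps, here's a rough sketch that wraps those steps in a small
shell script. The function names (findstack_dcmd, dump_zpool_stacks) are
made up for illustration, and it assumes you're root in the global zone:

```shell
#!/bin/sh
# Build the mdb dcmd pipeline for one process ID.
# The "0t" prefix tells mdb to read the number as decimal.
findstack_dcmd() {
    printf '0t%s::pid2proc | ::walk thread | ::findstack -v\n' "$1"
}

# Hypothetical helper: dump kernel stacks for every running zpool process.
# mdb -k inspects the live kernel, so this needs root in the global zone.
dump_zpool_stacks() {
    for pid in $(pgrep -x zpool); do
        echo "== zpool pid $pid =="
        mdb -ke "$(findstack_dcmd "$pid")"
    done
}
```

Source the file and run dump_zpool_stacks once the import looks stuck;
redirecting the output to a file makes it easier to attach to a reply.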

Just to make sure, you haven't done anything like remove the cache
device, have you?

Thanks,
Robert




[smartos-discuss] Flag day: Bardiche

2014-03-19 Thread Robert Mustacchi
If you don't build the SmartOS platform, you probably don't need to read
this.

I've just pushed the bardiche project which has changes across
smartos-live, illumos-joyent, kvm and kvm-cmd. You should make sure all
of those are up to date before continuing.

Robert




Re: [smartos-discuss] disk configuration

2014-03-21 Thread Robert Mustacchi
On 03/21/2014 09:09 AM, Alessio wrote:
> Hi. I know this is a question asked several times, but I can't start
> with a nasty configuration, so a suggestion for my case would be welcome.
> 
> I have two brand new servers.
> They are equipped with 6 SATA disks (3TB each one).
> Well, the question is...: what is the best zpool configuration?
> Could RAIDZ2 be the right choice?

What are your constraints with respect to capacity, performance, and
durability?
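
As a starting point for the capacity side of that question, here's a
back-of-the-envelope sketch for six 3 TB disks. It ignores ZFS metadata
overhead and the TB/TiB difference, the three two-way mirrors layout is
only an assumed alternative for comparison, and the function names are
made up:

```shell
#!/bin/sh
# Rough usable capacity in TB for a single raidz vdev.
# args: number of disks, parity level (2 for raidz2), TB per disk
raidz_usable() {
    echo $(( ($1 - $2) * $3 ))
}

# Rough usable capacity in TB for a pool of N-way mirrors.
# args: number of disks, disks per mirror, TB per disk
mirror_usable() {
    echo $(( ($1 / $2) * $3 ))
}

echo "raidz2, 6 x 3TB:            $(raidz_usable 6 2 3) TB"
echo "3 two-way mirrors, 6 x 3TB: $(mirror_usable 6 2 3) TB"
```

raidz2 survives any two disk failures, while the mirrored layout is only
guaranteed to survive one, so capacity is just one of the three
constraints to weigh.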





Re: [smartos-discuss] SmartOS 20140320

2014-03-21 Thread Robert Mustacchi
On 3/21/14 9:51 , Goktug YILDIRIM wrote:
> Hello,
> 
> I've just booted from latest ISO but it seems that my KVMs are not getting
> automatic IPs (tested only windows KVMs, not using dhcp server). If I set
> IP manually, it works.
> I've upgraded from 20131204T101631Z.

Can you clarify your configuration with Windows VMs please? Have you
specified an IP in vmadm or is it set to 'dhcp'? That is, are you expecting
qemu to give you an IP address, or a real DHCP server?

Is this happening with non-Windows guests?

Robert

> My aim is to test bardiche benefits. Firewall works at the first glance.
> (and thanks for doing this great work!).
> 
> -Goktug
> 
> 
> On Fri, Mar 21, 2014 at 5:43 PM, Keith Wesolowski <
> keith.wesolow...@joyent.com> wrote:
> 
>> SmartOS 20140320 is now available.
>>
>> Page of links:
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/20140321T062644Z/index.html
>>
>> Latest directory redirect:
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/latest.html
>>
>> Individual links:
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/20140321T062644Z/smartos-20140321T062644Z.iso
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/20140321T062644Z/smartos-20140321T062644Z-USB.img.bz2
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/20140321T062644Z/smartos-20140321T062644Z.vmwarevm.tar.bz2
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/20140321T062644Z/platform-20140321T062644Z.tgz
>>
>> Snaplinks for the latest build:
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/smartos-latest.iso
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/smartos-latest-USB.img.bz2
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/smartos-latest.vmwarevm.tar.bz2
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/platform-latest.tgz
>>
>> Changelog:
>>
>> https://us-east.manta.joyent.com/Joyent_Dev/public/SmartOS/20140321T062644Z/changelog.txt
>>





Re: [smartos-discuss] SmartOS 20140320

2014-03-21 Thread Robert Mustacchi
On 3/21/14 10:31 , Goktug YILDIRIM wrote:
> Update... It works on linux guest (Joyent's Centos 6).
> Here is the my "vmadm create" json.
> http://pastebin.com/gBPNYsnv
> 
> My previous windows guests are Windows 2008 R2 and Windows 7.
> 
> Would you like me to file a bug? If needed I can gather required diagnostic
> infos.

For now please file a bug on illumos-kvm-cmd with the json that you've
provided. I'll see if I can get some cycles to reproduce it locally,
though it may be a little while for me to do it.

Can you run the following DTrace one liner when forcing Windows to do a
dhcp request and put that into the bug please:

dtrace -n 'pid$target::vnic_*:entry{ @[probefunc] = count(); }' -p <pid>

Robert




Re: [smartos-discuss] Modifying /etc/opt

2014-03-21 Thread Robert Mustacchi
On 3/21/14 11:39 , a b wrote:
> Oh yeah, I see:
> 
>  if [ $? -eq 0 ]; then
> mount -F zfs ${SYS_ZPOOL}/var /var
> mount -F zfs ${SYS_ZPOOL}/config /etc/zones
> mount -F zfs ${SYS_ZPOOL}/opt /opt
> if [[ -n $(/bin/bootparams | grep '^headnode=true') ]]; then
> mkdir -p /opt/smartdc/agents/smf
>   mount -O -F lofs /var/svc/manifest/site /opt/smartdc/agents/smf
> fi
> 
> OK, since I don't yet understand how SmartOS is built, how do I rebuild the 
> miniroot?
> Might there be a document somewhere on this?
> 
> I seem to vaguely remember Keith mentioning something along the lines of 
> modifying the miniroot for something a week or two ago... maybe it was USB 
> drivers?

http://wiki.smartos.org/display/DOC/Building+SmartOS+on+SmartOS

> A search for "miniroot" in the SmartOS wiki turns up nothing.

We don't call it a miniroot, which is why that search term doesn't find
anything on it.

Robert




Re: [smartos-discuss] problem with "spanning-tree portfast trunk" enabled switch port

2014-03-21 Thread Robert Mustacchi
On 3/21/14 14:13 , arnold.w...@gxs.com wrote:
> I'm having an issue with "spanning-tree portfast trunk" enabled switch port 
> with SmartOS machine. The switch port will disable itself, probably due to 
> BPDU packets generated from the NICs. I found this article from Oracle, 
> http://docs.oracle.com/cd/E23824_01/html/821-1458/giyhe.html, which allows me 
> to block the BPDU packets. I enabled "restricted" link-protection mode, 
> following this, 
> http://docs.oracle.com/cd/E23824_01/html/821-1458/giyjl.html#scrolltoc .
> It fixed the issue, as long as the machine stays up, however the problem 
> happens again if the machine is rebooted. I have tried to apply the same fix 
> via the SMF, however it didn't make any difference, I assume it started too 
> late. Any suggestion how I can make this as the NIC parameter so it can be 
> applied when the NIC come online? The usual /etc/ won't work in SmartOS.
> Thanks in advance for any help.
> BTW, I can't set the switch port to "spanning-tree portfast disabled" since 
> I'm running SmartOS machine inside VMware ESX host. I don't think ESX makes 
> any difference to this issue, other than I can't make the switch port change 
> since "spanning-tree portfast trunk" is recommended by VMware. Since there 
> are other machines hosted in the ESX, the system admin won't make that change.

So I would presume that the reason you're seeing spanning tree here is
because you're actually inside VMware. In VMware and other virtualized
environments we create a bridge to work around a lot of the problems of
having multiple vnics, promiscuous mode, etc.

That change was made a long time ago. Since then, there was some work on
the driver side to basically just say: if it's a VMware device, we're
just going to assume it only supports a single mac address, and if we do
anything else, put the device into promiscuous mode. It may be that with
that work the bridge is no longer needed. We'll have to do some
experiments locally.

Robert




Re: [smartos-discuss] Kernel panic

2014-03-21 Thread Robert Mustacchi
On 03/21/2014 03:53 PM, Alex Adriaanse wrote:
> I’ve installed SmartOS on a new server, and I regularly get kernel panics. 
> The last couple of times that this happened it seems to boot up fine after it 
> restarted from the panic, but when I'd restart again it’d give me a panic 
> again. I haven’t tested it enough to see whether this is a consistent 
> pattern. Interestingly enough, another server that I have with fairly similar 
> specs does not run into this issue. The only differences between the two that 
> I’m aware of is that the crashing server has a newer/faster CPU and slightly 
> different set of hard drives. It’s got the same motherboard and disk 
> controller).
> 
> Here is some output from running mdb on the crash dump:
> 
>> ::status
> debugging crash dump vmcore.2 (64-bit) from 
> operating system: 5.11 joyent_20140307T223339Z (i86pc)
> image uuid: (not set)
> panic message: BAD TRAP: type=e (#pf Page fault) rp=ff0040aff0a0 
> addr=7fbd11e10 occurred in module "apix" due to an illegal access to a user 
> address
> dump content: kernel pages only
>> ::stack
> apix_intx_enable+0x42(12)
> apix_enable_vector+0x125(ff090eed3800)
> apix_setup_io_intr+0x58(ff090eed3800)
> apix_addspl+0x6b(8920, 5, 0, 0)
> apix_add_avintr+0x136(ff090eda9d40, 5, f7efaf60, 
> ff09067484e8, 8920, ff090edc2200)
> add_avintr+0x7d(ff090eda9d40, 5, f7efaf60, ff09067484e8, 
> 8920, ff090edc2200)
> pci_enable_intr+0xb7(ff0906749800, ff0906833010, ff090eda9d40, 0)
> pci_common_intr_ops+0x20e(ff0906749800, ff0906833010, 8, 
> ff090eda9d40, 0)
> npe_intr_ops+0x21(ff0906749800, ff0906833010, 8, ff090eda9d40, 0)
> pciide_intr_ops+0x18c(ff0906833010, ff0906831aa0, 8, 
> ff090eda9d40, 0)
> i_ddi_intr_ops+0x4e(ff0906831aa0, ff0906831aa0, 8, ff090eda9d40, 
> 0)
> ddi_intr_enable+0x57(ff090eda9d40)
> ddi_add_intr+0xca(ff0906831aa0, 0, ff090edc2358, 0, f7efaf60, 
> ff090edc2200)
> ghd_register+0x178(f7f024b3, ff090edc2318, ff0906831aa0, 0, 
> ff090edc2200, f7f006e0)
> ata_init_controller+0x31b(ff0906831aa0)
> ata_attach+0x34(ff0906831aa0, 0)
> devi_attach+0x92(ff0906831aa0, 0)
> attach_node+0xa7(ff0906831aa0)
> i_ndi_config_node+0x86(ff0906831aa0, 6, 0)
> i_ddi_attachchild+0x48(ff0906831aa0)
> devi_attach_node+0x5e(ff0906831aa0, 4004048)
> config_immediate_children+0xbf(ff0906833010, 4004048, )
> devi_config_common+0xd9(ff0906833010, 4004048, )
> mt_config_thread+0x58(ff0913062c40)
> thread_start+8()
> 
> Please let me know what I should do to further diagnose (or work around) this 
> issue.

You could work around this issue by passing -Bdisable-apix=true in the
boot parameters. Something similar to this has been seen once before,
but it's never been root caused. Making the dump available will help.
It's very likely that there's some kind of race going on here that isn't
being dealt with properly which is why you see this on some machines,
but not others.

Robert




Re: [smartos-discuss] problem with "spanning-tree portfast trunk" enabled switch port

2014-03-21 Thread Robert Mustacchi
On 3/21/14 20:21 , Nick Perry wrote:
>> In VMware and other virtualized environments we create a bridge to work
> around a lot of the problems of having multiple vnics, promiscuous mode,
> etc.
> 
> But if the ESXi host doesn't have BPDU filtering enabled for the SmartOS
> guest and is connected to a BPDUGuard enabled physical switch port (which I
> would think the vast majority of ESXi servers are), you're going to knock
> each uplink port of the ESX host offline in turn, also disrupting its
> other tenants. :(

Right. This was originally done targeting VBox and the traditional
Desktop based folks, where this was less important. But there are now
probably better methods for what the bridge was trying to achieve
(working around the fact that VMware doesn't properly emulate the NICs
whose PCI IDs it's using): we can hardcode the fact that the hypervisors
are wrong in the driver, based on the VMware sub-system IDs.

Robert




Re: [smartos-discuss] Missing some networking steps in KVM guest deployment?

2014-03-24 Thread Robert Mustacchi
Hi Evan,

I have some responses inline for this and also some comments based on
your most recent e-mail.

First can you please confirm what release you're running?

We should have support for the Intel I354, though it will require a more
recent platform. Support wasn't added into illumos until the beginning
of this month. Next, can you also please confirm what the PCI ID of your
HP NC360T is? You should be able to get that from prtconf -d.

On 3/24/14 15:22 , Evan Rowley wrote:
> The last email I sent was somewhat auto-corrected by my Android phone.
> I apologize to anyone who was confused by the minor typos.
> 
> [root@00-1f-29-61-59-34 ~]# nictagadm list
> NAME   MACADDRESS LINK
> admin  00:1f:29:61:59:34  e1000g0
> 
> 
> ^ This looks normal.

Yes, there's nothing wrong with that.

> The single KVM guest I am running now is paired (somehow? I don't know
> the exact mechanics yet) to the net0 link.

Every entry in the nics array causes the creation of a virtual nic in
the host. The first one for a given zone is called net0, the second
net1, etc. A vnic is created over a physical nic. The physical nic is
then programmed to receive on that mac address. If we've exceeded the
number of mac addresses that the device supports, then it is
transparently put into promiscuous mode.
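
To make that concrete, here's a toy model in shell of the naming and the
promiscuous-mode fallback. It only illustrates the rule described above
and is not how the MAC layer is implemented; mac_slots is a hypothetical
stand-in for the number of unicast addresses a given device supports:

```shell
#!/bin/sh
# Print the vnic names a zone with N entries in its nics array would get:
# net0 for the first entry, net1 for the second, and so on.
vnic_names() {
    count="$1"
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "net$i"
        i=$((i + 1))
    done
}

# Illustrative only: would the physical nic have to fall back to
# promiscuous mode? args: mac_slots (addresses the hardware can filter,
# a made-up parameter) and the number of mac addresses in use on it.
needs_promisc() {
    mac_slots="$1"
    macs_in_use="$2"
    if [ "$macs_in_use" -gt "$mac_slots" ]; then
        echo yes
    else
        echo no
    fi
}
```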

> A couple interesting things in the output for net0:
> 
> LINK PROPERTY PERM VALUE DEFAULT POSSIBLE
> net0 state r- unknown up up,down
> 
> ^ The state is currently "unknown" which the rest of the links have
> "up". I'm not sure why that is. Is it out of the ordinary?

I'll have to dig into how that's generated, but I see the same state on
our KVM instances.

In general, what I'd ask is: if you snoop packets in the global
zone, e.g. run snoop -d e1000g0, do you see traffic coming from the
guest? Keep in mind that depending on the problems that you're seeing,
you may need to pay attention to ARP messages. What we'd like to
ascertain is whether traffic is going out over the wire and, if so,
whether responses are coming back and appearing in the packet capture.

I hope this helps point you in the right direction, let me know if
there's something else I can clarify or suggest with respect to debugging.

Robert




Re: [smartos-discuss] Kernel panic

2014-03-26 Thread Robert Mustacchi
On 3/21/14 20:30 , Alex Adriaanse wrote:
> On 3/21/2014 8:58 PM, Robert Mustacchi wrote:
>> You could work around this issue by passing -Bdisable-apix=true in the
>> boot parameters. Something similar to this has been seen once before,
>> but it's never been root caused. Making the dump available will help.
>> It's very likely that there's some kind of race going on here that isn't
>> being dealt with properly which is why you see this on some machines,
>> but not others.
> 
> Thanks. I've posted the dump at http://oseberg.io/debug/vmdump.gz

I've taken a look at this so far. Can you confirm whether the BIOS has
configured your controller to be in legacy IDE mode or if it's in AHCI mode?

Thanks,
Robert




Re: [smartos-discuss] Kernel panic

2014-03-26 Thread Robert Mustacchi
On 3/26/14 17:37 , Alex Adriaanse wrote:
> On Mar 26, 2014, at 6:02 PM, Robert Mustacchi  wrote:
>> I've taken a look at this so far. Can you confirm whether the BIOS has
>> configured your controller to be in legacy IDE mode or if it's in AHCI mode?
> 
> It’s using legacy IDE mode. Currently there are no drives attached to the 
> built-in controller. I’ve posted a screenshot at 
> http://oseberg.io/debug/sata-config-bios.png

Thanks, that helps confirm a little bit. An alternate workaround may be
to either disable that controller entirely or change it to AHCI mode.

Are you booting from USB? If so, I'd like to put together an anonymous
DTrace script that you could use when you boot so that the next time it
panics, we can get some additional information. It's not entirely clear
right now what's causing us to end up here. I think that should help us
get that information.

Robert




Re: [smartos-discuss] Strange Issues on Boot and Kernel Panics

2014-03-27 Thread Robert Mustacchi
On 03/27/2014 06:53 AM, Chris McVittie wrote:
> Hi,
> We've had a week of experiencing Kernel Panics on one of our servers.  When
> booting we sporadically get: (maybe 2/5 times)
> 
> relocation error: R_AMD64_PC32 /kernel/amd64/genunix file
> /kernel/amd64/genunix: ...
> bad strndx 1822
> 
> sometimes this repeats itself as seen in
> 
> (maybe 1/5 times)
> Sometimes it hangs on GRUB Loading, Please Wait.
> 
> (maybe 1/5 times)
> Kernel trap message
> 
> (maybe 1/5 times)
> Success - then Kernel panic within 24 Hours.
> I'll share the details
> 
> 
> I'm having a hard time diagnosing the issue, or finding out what the root
> cause is.  Can these seemingly different error messages be connected?  Is
> the server at fault, or is this an OS issue?
> 
> Server has Supermicro X8DTU, with Dual Xeons 5520 and 48GB RAM.
> 
> Additional screenshots at http://imgur.com/a/LVqXx

Hi,

What image are you using? Did you build your own or is this one of the
stock ones?

Robert




Re: [smartos-discuss] n00b question

2014-04-01 Thread Robert Mustacchi
On 04/01/2014 06:58 PM, 20052...@use.startmail.com wrote:
> hi,
> I just installed smart os virtually, and created a new zone mainly using
> instructions here:
> http://wiki.smartos.org/display/DOC/How+to+create+a+zone+%28+OS+virtualized+machine+%29+in+SmartOS
> 
> I can't get out or into the zone network wise (ssh, wget, ping), but can
> ssh to the global zone.  I was wondering is something disabling
> networking by default in new zones for security reasons.
> I figured I'd ask before i start ripping my hair out.  I used the latest
> 64bit image.

So, you mentioned that you're running this virtually. Does that mean
that you're running SmartOS inside of a virtual machine? If that's the
case, you generally need to tell the hypervisor software, like
VirtualBox or VMware, to put the underlying network device in promiscuous mode.

If you're running physically, then the first things to check are whether
you can ping the global zone and then whether you can ping your gateway.
You should confirm that the gateway is set by looking at the routing
tables from netstat -rn.
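
A minimal sketch of that check, assuming the usual netstat -rn layout
where the default route shows up as a line starting with "default" and
the gateway is in the second column. The helper names are made up:

```shell
#!/bin/sh
# Read netstat -rn style output on stdin and print the gateway of every
# default route (first column "default", gateway in the second column).
default_gateways() {
    awk '$1 == "default" { print $2 }'
}

# Hypothetical helper: confirm a default route exists, then ping it.
check_gateway() {
    gw=$(netstat -rn | default_gateways | head -n 1)
    if [ -z "$gw" ]; then
        echo "no default route set"
        return 1
    fi
    echo "default gateway is $gw"
    ping "$gw"
}
```

Run check_gateway from inside the zone that can't reach the network; if
it reports no default route, that's the first thing to fix.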

Robert

> Relevant information if anyone can think of anything
> 
> netstat -i
> Name  Mtu  Net/Dest  AddressIpkts  Ierrs Opkts  Oerrs Collis
> Queue
> lo0   8232 loopback  localhost  0  0 0  0 0  0
> net0  1500 192.168.1.0   192.168.1.150  7970 1020 0  0
> 
> **
> 
> ifconfig net0
> net0: flags=40001000843
> mtu 1500 index 2
> inet 192.168.1.150 netmask ff00 broadcast 192.168.1.255
> ether e2:af:54:b:2e:7d
> ***
> ps -ef | grep inet
> root  6771  6242   0 21:28:18 ?   0:00 /usr/lib/inet/inetd
> start
> ***
> more /etc/defaultrouter
> 192.168.1.1
> 
> ***
> more /etc/resolv.conf
> search local
> nameserver 8.8.8.8
> 





Re: [smartos-discuss] disk performance

2014-04-04 Thread Robert Mustacchi
On 04/04/2014 04:16 AM, Alessio wrote:
> Hi.
> I'm not here to reiterate a discussion. Just for curiosity.
> I have a server (Fujitsu Primergy RX300 S8) with a RAID card (LSI
> D2607). And 6 SATA disks, 3TB each one.
> I cannot configure disks as JBOD, but as usual in this case, I've
> created 6 RAID0 volumes.
> 
> I have no way to put zil in an SSD.
> I've then tried to configure the zpool in two ways:
> - 3 two-way mirrors
> - raidz2
> 
> 
> If I use a command like this:
> dd if=/dev/zero of=prova bs=128k count=1
> 
> I get these results (average).
> 
> - mirror
> 
> in the GZ 1734 MB/s
> in a zone 382 MB/s
> in a KVM  149 MB/s
> 
> - raidz2
> 
> in the GZ 1597 MB/s
> in a zone 302 MB/s
> in a KVM  30.8 MB/s
> 
> 
> 30.8 MB/s is poor!
> And why so much difference between zones and GZ?

Well, first off, KVM guests always perform synchronous writes, so
they'll be much more expensive than anything else.

For the other differences, I'd seriously go through and do active
benchmarking (http://brendangregg.com/activebenchmarking.html). There
are many factors that could be at play here, but from your numbers, it's
almost certain that you're testing asynchronous writes in the case of
the GZ and the normal zone, as opposed to a KVM instance, which is
issuing synchronous writes.

You could easily be hitting other things such as the I/O throttle, CPU
throttling, etc. You'll need to dig into your configuration.

Robert




[smartos-discuss] Re: [developer] zpool import hangs

2014-04-04 Thread Robert Mustacchi
On 04/03/2014 08:45 PM, Youzhong Yang wrote:
> We have 3 test servers running SmartOS, host B and C have zpools zp01 and
> zp02 respectively, host A can see disks of zp01 and zp02 (disks of zp01 are
> connected to HBA ports of A and B; disks of zp02 are connected to HBA ports
> of A and C).
> 
> And I have a loop to test zpool import/export:
> 
> for i in {1..100}
> do
> on B: zpool export -f zp01
> on A: zpool import -o cachefile=none zp01
> on A: zpool export -f zp01
> on B: zpool import -o cachefile=none zp01
> on C: zpool export -f zp02
> on A: zpool import -o cachefile=none zp02
> on A: zpool export -f zp02
> on C: zpool import -o cachefile=none zp02
> done
> 
> After a few iterations, zpool import hangs. This can be reproduced
> consistently. Is this a known issue? Should I file a bug report?

Given that you're blocked here in thread creation, particularly for the
metaslab group, and given the means by which you reached here, this
closely corresponds to a reported problem
(http://www.listbox.com/member/archive/182191/2014/03/sort/time_rev/page/3/entry/17:85/20140314181835:B085982A-ABC6-11E3-B7F2-ABEF6D29024B/).
You should probably verify that you have quite a large number of kernel
threads and that that is where the leak is coming from.

I don't know if George ever filed a bug for this or not, but he
indicated that a fix was on the way.

Robert


> 
> # pstack 29207
> 29207:  zpool import -o cachefile=none zp01
> 
> # mdb -ke '0t29207::pid2proc | ::walk thread | ::findstack -v'
> stack pointer for thread ff3289f39040: ff01ed3041e0
> [ ff01ed3041e0 _resume_from_idle+0xf4() ]
>   ff01ed304210 swtch+0x141()
>   ff01ed304250 cv_wait+0x70(ff32330f901e, ff32330f9020)
>   ff01ed304380 vmem_xalloc+0x630(ff32330f9000, 6000, 1000, 0, 0, 0,
> 0,
>   0100)
>   ff01ed3043f0 vmem_alloc+0x137(ff32330f9000, 6000, 100)
>   ff01ed304510 segkp_get_internal+0x11b(fbc33760, 5000, e,
>   ff01ed304528, 0)
>   ff01ed304570 segkp_cache_get+0x103(1)
>   ff01ed304610 thread_create+0x544(0, 0, fbaf35b0,
> ff32c9520b40
>   , 0, fbc30540, ff010002, 003c)
>   ff01ed304660 taskq_thread_create+0x108(ff32c9520b40)
>   ff01ed304710 taskq_create_common+0x1a7(f7e7e238, 0, 32, 3c, a,
>   7fff, fbc30540, ff32, ff0100040008)
>   ff01ed304770 taskq_create+0x50(f7e7e238, 32, 3c, a, 7fff,
> 8)
>   ff01ed3047b0 metaslab_group_create+0x96(ff36fbca70a8,
> ff328784d000
>   )
>   ff01ed304860 vdev_alloc+0x54a(ff32a78fe000, ff01ed304928,
>   ff3285910a80, ff3284e74540, a, 0)
>   ff01ed304900 spa_config_parse+0x48(ff32a78fe000, ff01ed304928,
>   ff3285910a80, ff3284e74540, a, 0)
>   ff01ed3049a0 spa_config_parse+0xda(ff32a78fe000, ff01ed304a18,
>   ff36fbca7f88, 0, 0, 0)
>   ff01ed304a90 spa_load_impl+0xf4(ff32a78fe000, d5c8b305012c90c8,
>   ff32d3417d30, 3, 0, 1, ff01ed304ad8)
>   ff01ed304b30 spa_load+0x14e(ff32a78fe000, 3, 0, 1)
>   ff01ed304b80 spa_tryimport+0xaa(ff3286740180)
>   ff01ed304bd0 zfs_ioc_pool_tryimport+0x51(ff335c22a000)
>   ff01ed304c80 zfsdev_ioctl+0x4a7(5a, 5a06, 804258c, 13,
>   ff32578b3458, ff01ed304e68)
>   ff01ed304cc0 cdev_ioctl+0x39(5a, 5a06, 804258c, 13,
>   ff32578b3458, ff01ed304e68)
>   ff01ed304d10 spec_ioctl+0x60(ff3284335d80, 5a06, 804258c, 13,
>   ff32578b3458, ff01ed304e68, 0)
>   ff01ed304da0 fop_ioctl+0x55(ff3284335d80, 5a06, 804258c, 13,
>   ff32578b3458, ff01ed304e68, 0)
>   ff01ed304ec0 ioctl+0x9b(3, 5a06, 804258c)
>   ff01ed304f10 _sys_sysenter_post_swapgs+0x149()





[smartos-discuss] Re: [developer] zpool import hangs

2014-04-05 Thread Robert Mustacchi
On 04/04/2014 02:08 PM, Youzhong Yang wrote:
> Thanks Robert, surya and everyone, you guys rock!
> 
> I built a new image with the following fix, zpool import works very well,
> 'kstat vmem:35:segkp:*' also shows no sign of memory leaks after 100 times
> zpool export/import.
> 
> https://github.com/dweeezil/zfs/commit/69b0687

Hey Youzhong,

Glad that was the case. Would you be willing to drive getting that back
into illumos proper?

Thanks,
Robert




Re: [smartos-discuss] MPICH compile failed...

2014-04-14 Thread Robert Mustacchi
On 4/14/14 15:13 , Nick Zivkovic wrote:
> Hi.
> 
> I've been trying to get MPICH to compile my very simple hello-world
> program which uses MPI.
> 
> Steps I took:
> 
> mpicc -c hw.c
> mpicc -o hw hw.o
> 
> Which resulted in the following error:
> 
> ] mpicc -o hw hw.o
> ld: fatal: file /usr/lib/amd64/libgcc_s.so.1: version 'GCC_4.5.0' does
> not exist:
> required by file
> /opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib/amd64/libgfortran.so.3
> ld: fatal: file /usr/lib/amd64/libgcc_s.so.1: version 'GCC_4.5.0' does
> not exist:
> required by file
> /opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3/../../../../x86_64-sun-solaris2.11/lib/amd64/libquadmath.so.0
> ld: fatal: file processing errors. No output written to hw
> collect2: error: ld returned 1 exit status
> ]
> 
> I have no idea what it's trying to communicate, so I thought I'd see
> if anyone here has run into similar issues with MPI and SmartOS.
> 
> Any suggestions?
> 
> BTW, here is the output of `mpichversion`.
> ] mpichversion
> MPICH Version:  3.0.4
> MPICH Release date: Wed Apr 24 10:08:10 CDT 2013
> MPICH Device:   ch3:nemesis
> MPICH configure:--datadir=/opt/local/share/mpich
> --sysconfdir=/opt/local/etc/mpich --docdir=/opt/local/share/doc/mpich
> --htmldir=/opt/local/share/doc/mpich/html
> --with-openpa-prefix=/opt/local --with-hwloc-prefix=/opt/local
> --with-pm=hydra:gforker --disable-fc --with-thread-package=posix
> --with-libiconv-prefix=/opt/local --prefix=/opt/local
> --build=x86_64-sun-solaris2.11 --host=x86_64-sun-solaris2.11
> --mandir=/opt/local/man
> MPICH CC:   gcc -O2 -pipe -O2 -I/usr/include -I/opt/local/include   -O2
> MPICH CXX:  g++ -O2 -pipe -O2 -I/usr/include -I/opt/local/include  -O2
> MPICH F77:  f77 -O  -O2
> MPICH FC:   no

Hi Nick,

So, the problem here is that both the platform and pkgsrc provide the
library libgcc_s. When pkgsrc compilers build, they put /opt/local/lib
or /opt/local/lib/amd64 first on both the library and run path, which
you probably are not doing here. This might normally be done by
specifying -L /opt/local/lib/amd64 -R /opt/local/lib/amd64 when it's
linking.
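
For illustration, here is roughly what that link line could look like.
This is only a sketch: the paths are the ones mentioned in this thread,
and the -Wl,-R spelling assumes you're going through the gcc driver
(as mpicc does) rather than calling ld directly.

```shell
# Assumed pkgsrc 64-bit library directory; adjust for your install.
PKGSRC_LIB=/opt/local/lib/amd64

# Put the pkgsrc directory on both the link-time search path (-L) and the
# run path recorded in the binary (-R, passed to the linker via -Wl).
LINK_FLAGS="-L${PKGSRC_LIB} -Wl,-R${PKGSRC_LIB}"

# Print the command rather than running it, since hw.o may not exist here.
echo "mpicc -o hw hw.o ${LINK_FLAGS}"

# After linking, these can help confirm which libgcc_s gets picked up:
#   ldd hw | grep libgcc_s
#   elfdump -d hw | egrep 'RPATH|RUNPATH'
```

The important part is the ordering: the pkgsrc directory has to come
before /usr/lib/amd64 so that the pkgsrc copy of libgcc_s wins.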

Robert




Re: [smartos-discuss] MPICH compile failed...

2014-04-14 Thread Robert Mustacchi
On 4/14/14 16:02 , Nick Zivkovic wrote:
> Ok fixed.
> 
> The command `mpicc` is in fact a bash script, located at 
> `/opt/local/bin/mpicc`.
> 
> There is a variable named LDFLAGS which is defined as follows:
> 
> LDFLAGS="-L/opt/local/lib -Wl,-R/opt/local/lib -L/opt/local/lib
> -L/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3
> -Wl,-R/opt/local/gcc47/lib/gcc/x86_64-sun-solaris2.11/4.7.3
> -L/opt/local/gcc47/lib -Wl,-R/opt/local/gcc47/lib -L/usr/lib/amd64
> -Wl,-R/usr/lib/amd64 -L/opt/local/lib -Wl,-R/opt/local/lib "
> 
> I replaced the `/usr/lib/amd64` with `/opt/local/lib/amd64`.
> 
> Thanks for pointing me in the right direction.

You'll likely want to do the correct ordering with the include path. Did
this come from a pkgsrc package? If so, can you file a bug about that on
github.com/joyent/pkgsrc? We should make sure that this is fixed to do
the right thing.

Thanks,
Robert





Re: [smartos-discuss] moving zone from omnios to smartos or vice versa

2014-04-17 Thread Robert Mustacchi
On 4/17/14 5:17 , Mihai - Cristian Satmarean wrote:
> Hi all,
> 
> I have a question if is possible to move a zone from omnios to smartos?
> or the other way around

Hi,

It's generally not possible to move a zone around between the two
distributions. We have very different models for how zones should work.

In SmartOS, we have sparse zones which means that the global zone
provides the contents of things like /lib and /usr. However, in OmniOS,
those are provided by the zone, but are mostly tied to the kernel.
Further, the general design of a zone in the OmniOS world, based on my
understanding, is that packages are installed into /usr and /usr is
writeable. In our world, that is not true and packages are installed
into another prefix.

What you might want to do is create a delegated dataset that contains
your data, but not programs, and move that around instead.
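
A minimal sketch of what that could look like on the SmartOS side,
assuming the vmadm delegate_dataset property (which, as far as I know,
gives the zone a dataset at zones/<uuid>/data). The alias, memory size,
and image UUID placeholder are made up for illustration.

```shell
# Build a vmadm payload for a zone with a delegated dataset.
# IMAGE_UUID is a placeholder; substitute an image you have imported.
IMAGE_UUID="00000000-0000-0000-0000-000000000000"

payload=$(cat <<EOF
{
  "brand": "joyent",
  "alias": "data-holder",
  "image_uuid": "${IMAGE_UUID}",
  "max_physical_memory": 512,
  "delegate_dataset": true
}
EOF
)

printf '%s\n' "$payload"

# On a SmartOS global zone this would be fed to vmadm:
#   printf '%s\n' "$payload" | vmadm create
# The data could then be moved with zfs send/recv of zones/<uuid>/data.
```

Only the data in that dataset travels with you; the software has to be
installed again on the destination in whatever way that distribution
expects.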

Robert




Re: [smartos-discuss] ZFS latency

2014-04-21 Thread Robert Mustacchi
On 04/21/2014 12:40 AM, Micky wrote:
> Ok, this is pretty odd. I have setup a few GZs with little to no activity,
> and I am seeing a consistent write latency of 15 to 20 microseconds.
> 
> # vfsstat -IMzZ 2 3
>   r/i   w/i  Mr/i  Mw/i ractv wactv read_t writ_t  %r  %w   d/i  del_t zone
> 151.0   6.0   0.3   0.0   0.0   0.0    1.8   15.6   0   0   0.0    0.0 global (0)
> 
> This is 20131128T230213Z, loaded with default ::zfs_params.
> 
> Am I victim of some throttling bug somewhere?

You'll note that the d/i and del_t columns are zero. That means that
you're not seeing any delays.

15 to 20 microseconds seems pretty good. It means that all your reads
are hitting in the ARC and all your writes are asynchronous. You're not
hitting your disks at all. If you were, I'd expect to start seeing times
in milliseconds. Can you describe what behavior you were expecting to
see here that makes this seem odd?

Of course, averages can be misleading. You should certainly break
things down more discretely and consider looking at the zone_vfs kstat
module or using DTrace.
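
As one hedged way to look past the averages: the sketch below filters
parseable kstat output for the per-zone *_ops counters, assuming those
are latency buckets (operations that took at least that long). The
helper name slow_ops and the zone name zoneA are made up.

```shell
# Hypothetical helper: from `kstat -p -m zone_vfs` output on stdin, print
# zone name, counter, and value for every non-zero *_ops counter.
# Each input line looks like: module:instance:name:statistic<TAB>value
slow_ops() {
    awk '{
        n = split($1, k, ":")
        if (n == 4 && k[4] ~ /_ops$/ && $2 + 0 > 0)
            print k[3], k[4], $2
    }'
}

# Illustrative sample; on a live system run: kstat -p -m zone_vfs | slow_ops
printf 'zone_vfs:2:zoneA:100ms_ops\t0\nzone_vfs:2:zoneA:10ms_ops\t1\nzone_vfs:2:zoneA:delay_cnt\t63810\n' | slow_ops
```

A zone that shows up here has had at least some slow operations, even
if the averages from vfsstat look fine.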

Robert




Re: [smartos-discuss] lx brand - howto

2014-04-22 Thread Robert Mustacchi
On 4/22/14 13:32 , Mihai - Cristian Satmarean wrote:
> Dear all,
> 
> is there a howto for using the lx brand?
> Thank you!

No, it is not supported in SmartOS at this time. When it is, we'll be
sure to let folks know.

Robert




Re: [smartos-discuss] Ubuntu Certified Images

2014-04-26 Thread Robert Mustacchi
On 04/26/2014 01:28 PM, Torsten Frank wrote:
> does anybody know, where to get the ubuntu 14.04 dataset
> http://wiki.joyent.com/wiki/display/jpc2/Ubuntu+Certified#UbuntuCertified-14.0420140416 ?
> I can't find it at https://datasets.joyent.com/datasets/
> Thanks, Tolson

At this time Joyent cannot distribute the Canonical Ubuntu Certified
images publicly. I hope that this changes in the future.

Sorry,
Robert




Re: [smartos-discuss] SmartOS on AMD's FX-8350 Vishera?

2014-05-04 Thread Robert Mustacchi
On 04/28/2014 09:12 AM, Evan Rowley via smartos-discuss wrote:
> SmartOS Followers,
> 
> Has anyone tried this? With success or failure? Did it work for zones but
> not KVM, or did KVM work as well?
> 
> This is a fast processor and some 990FX motherboards secretly support ECC
> memory for ZFS use.
> 

Hey Evan,

I haven't personally used this processor, but I know of no reason it
won't work for zones. Because it's a newer generation of processor, KVM
should work well with Joshua Clulow's AMD branch
(https://github.com/jclulow/illumos-kvm).

Robert




Re: [smartos-discuss] Locating data in an OS zone?

2014-05-04 Thread Robert Mustacchi
On 04/24/2014 12:04 PM, Alain O'Dea wrote:
> I would redeploy and move/copy the data in.  I don't think there is a 
> supported path for upgrading a zone's image in place on SmartOS.

While we have an open bug to document this, there is actually vmadm
reprovision, which takes the base image and replaces it with a new base
image. Keep in mind that you'll want to make sure that all of your data
is in a delegated dataset.

We use this quite a lot at Joyent. We build images that include all the
necessary software installed. So an upgrade is just a reprovision, and
in the case of something that isn't stateless, all the relevant data is
in the delegated dataset.
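
A sketch of what such an upgrade could look like, assuming vmadm
reprovision reads a small JSON payload with the new image_uuid on stdin.
The helper name and both UUIDs are placeholders, not real values.

```shell
# Hypothetical helper: emit the JSON payload that vmadm reprovision reads.
reprovision_payload() {
    printf '{"image_uuid": "%s"}\n' "$1"
}

# Placeholder UUIDs for illustration only.
NEW_IMAGE="11111111-1111-1111-1111-111111111111"
VM_UUID="22222222-2222-2222-2222-222222222222"

reprovision_payload "$NEW_IMAGE"

# On a SmartOS global zone (this replaces everything outside the delegated
# dataset, so check first that zones/$VM_UUID/data exists):
#   zfs list "zones/$VM_UUID/data"
#   reprovision_payload "$NEW_IMAGE" | vmadm reprovision "$VM_UUID"
```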

Robert




Re: [smartos-discuss] DHCP for zones

2014-05-04 Thread Robert Mustacchi
Sorry for the delay in getting back to this thread.

On 04/22/2014 09:50 PM, Nicholas Lee wrote:
> So, everything except for the resolver seems to be set correctly from the
> dhcp server.  Is this the default action?  Do I need to manually edit
> /etc/resolv.conf?

The dhcp option in vmadm today does not, to my limited knowledge, ask
dhcp for resolvers. Though someone should look at snoop output for dhcp
and verify what it is and what it isn't sending.

While you can use the resolvers and maintain_resolvers properties that
you mention, that's obviously not the best way. Looking into this is
pretty low on our priority list because we actually never use dhcp at
Joyent. If folks could dig into this and figure out whether we're
actually requesting DNS servers, or whether the answer is being
clobbered for some reason, then we can go from there.

Thanks,
Robert




Re: [smartos-discuss] ZFS latency

2014-05-04 Thread Robert Mustacchi
Sorry for the delay in getting back to this.

On 04/21/2014 08:07 AM, Micky wrote:
> But this is now interesting. I am seeing a behavior of slow slow I/O
> from KVM guest on idle GZ.
> Kstat zfs_vfs looks like this.
> 
> I assume it shows no latency but what does those delay_cnt and delay_time 
> mean?
> 
> 
> module: zone_vfs                        instance: 2
> name:   359c4843-024e-4bab-8e9b-4330db  class:    zone_vfs
>         100ms_ops                       0
>         10ms_ops                        1
>         10s_ops                         0
>         1s_ops                          0
>         crtime                          5758.489956494
>         delay_cnt                       63810
>         delay_time                      3043170
>         nread                           605726
>         nwritten                        5566
>         reads                           66
>         rlentime                        299610
>         rtime                           162938
>         snaptime                        13104.730586639
>         wlentime                        27242641
>         writes                          2527
>         wtime                           27242641
>         zonename                        359c4843-024e-4bab-8e9b-4330dbbd3abf

delay_cnt is the number of times that the ZFS I/O throttle has throttled
the zone since the instance was started via vmadm. delay_time is the
total amount of time, in microseconds, that the zone has been throttled
since it was started via vmadm. These counters are cleared whenever the
zone resets, which does not happen when reboot is typed inside of a KVM
guest.
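
To put those two counters in perspective, here is a quick
back-of-the-envelope calculation using the values quoted above. The
helper name avg_delay_us is made up; it just divides delay_time by
delay_cnt.

```shell
# Hypothetical helper: average throttle delay in microseconds, given
# delay_cnt ($1) and delay_time ($2) from the zone_vfs kstat.
avg_delay_us() {
    awk -v cnt="$1" -v total="$2" 'BEGIN {
        if (cnt > 0)
            printf "%.2f\n", total / cnt
        else
            print "0.00"
    }'
}

# Values from the kstat output quoted above.
avg_delay_us 63810 3043170
```

That works out to a bit under 48 microseconds per delay, or roughly 3
seconds of throttling in total since the zone was started.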

Robert





Re: [smartos-discuss] SmartOS on VMware ESXi, networking?

2014-05-04 Thread Robert Mustacchi
Hi Chris,

Sorry this message slipped through the cracks.

On 04/09/2014 02:08 PM, Chris Ferebee wrote:
> The standard SmartOS VMware platform image doesn’t seem to include support 
> for VMXNET3. Can I install VMware’s Solaris drivers into SmartOS? Is there a 
> tutorial that might help with this? I’m a bit out of my depth here, which is 
> the point - this is meant to be a learning experience. Thanks for any 
> suggestions. :-)

Today there isn't a vmxnet3 driver in illumos so we don't have one that
you can use. In general, Solaris binary drivers can't be assumed to be
compatible with illumos-based distributions. There are some companies
who have a vmxnet3 driver and they started the integration process, but
I don't recall why it got stalled. In general, for all of our VMware use
at Joyent (generally VMware Fusion), we use an instance of e1000g.

Robert




Re: [smartos-discuss] Locating data in an OS zone?

2014-05-04 Thread Robert Mustacchi
On 5/4/14 21:52 , Nicholas Lee wrote:
> How does it deal with differences in /etc and/or any packaging installed
> /opt/.../etc/ config files?

It doesn't, which means that I didn't explain it well enough. The base
image and all data installed in that zfs dataset is replaced. That means
anything that isn't in a delegated dataset, e.g. /opt and /etc, is
replaced with the contents of the new image. The only information kept
around is the delegated dataset and the vmadm metadata. This is why we
build custom images for use that have everything installed and
configured in advance.

Robert




Re: [smartos-discuss] Zone upgrades

2014-05-06 Thread Robert Mustacchi
On 05/06/2014 04:15 AM, G B wrote:
> In Solaris 10 one would detach, then attach a zone.  
> 
> How are zones upgraded in SmartOS?  

SmartOS zones are sparse, so there are two parts to the zone upgrade to
consider. There are the sparse contents, /usr and /lib from the global
zone, and there is everything else.

For /usr and /lib, you need to reboot onto a newer smartos platform and
the zones will be taken care of automatically. Everything in the zone,
pkgsrc and the like, is independent of that.

What in particular is it that you want to change?

Robert




Re: [smartos-discuss] ifconfig cause reboot on hp proliant dl338 gne9

2016-01-12 Thread Robert Mustacchi
On 1/11/16 23:51 , 许若辰 wrote:
> Hi all, I’ve researched this problem and get some new information.
> Now the problem is, when the machine execute "ifconfig bge2 plumb”, the 
> machine will reboot.
> I modified the installation script and skip this command, then I can 
> successfully install smartos in this server. Although then it is failed to 
> start, maybe there have another command in start script.
> So I think there may be some problem with bge2, but, I can install centos7 in 
> this machine and all nics work fine.
> Any idea?

That suggests that the operating system panicked or there is some bug.
Did it produce a dump?

Robert





Re: [smartos-discuss] ifconfig cause reboot on hp proliant dl338 gne9

2016-01-12 Thread Robert Mustacchi
On 1/12/16 0:18 , Chris Ridd wrote:
> I really wish HP used Intel NICs...

Intel NICs are an option on HP Gen 9 products for both 1 GbE and
10 GbE. The 1 GbE parts are called the HP 361T and 366T.

Though, we have both the docs and source to bge and bnxe. So if folks
are having problems with either of them, we should be able to root cause
and fix them.

Robert




Re: [smartos-discuss] zfs: some advice on server configuration

2016-01-21 Thread Robert Mustacchi
On 1/21/16 7:51 , Alessio Ciregia wrote:
> Let's say I want to buy a new server.
> I know that I could search it on google, but I would like some advice
> related to disks configuration in order to separate ZIL and L2ARC from
> data: how many SSD, and wich size? Mirror, stripe?
> I would like some real example from you. What is your configuration?

The first important questions to answer are how much are you willing to
spend and what is it that you want to do with the machine? Are you
trying to have a large amount of tenancy? Are you going to be primarily
file serving over HTTP? Are you going to be running databases?

Basically what's your target working set for the machine going to be? Is
it an amount of memory that can fit in DRAM?

Robert





Re: [smartos-discuss] IPMP Failure with latest build !!!

2016-01-22 Thread Robert Mustacchi
On 1/22/16 1:01 , Skale Franz wrote:
> Hi,
> we upgraded the release from 2015 Jul to the latest version.
> For now, ipmp doesn't work anymore; I had to deactivate all ports except
> one, of course.
> Since we use standby links, the traffic should only occur on this port and
> the other should be STANDBY.
> We found out that in September 2015 there were changes in the ipmp code
> and also one new command has been invented (if_mpadm).

The if_mpadm code has been around since the launch of OpenSolaris. There
don't appear to have been any changes to the ipmp driver during the time
frame that you're referring to.

What changes are you referring to in the IPMP code?

> After activating all links (incl. standby ports), all interfaces produce
> traffic and we have DUPs when checking via remote, as well as enormous
> amounts of congestion.
> dladm show-vnic shows that it uses multicast with a random MAC address. So
> far so good. But ipmp seems not to check that other interfaces are standby
> and the traffic seems to be routed through the standby links but outgoing
> from the online interface. This is good for RR but not for standby mode.
> Is there a major bug in the IPMP implementation?

Can you please provide the exact series of steps that you use to enable
IPMP inside of your zone? Or is this running in the global zone?

Robert




Re: [smartos-discuss] Fastest link between 2 VMs

2016-01-24 Thread Robert Mustacchi
On 1/24/16 23:17 , Jorge Schrauwen wrote:
> Make sure to use the hidden? (undocumented) ETHERSTUB="stub0 stub1 ..." 
> 
> Option in /usbkey/config, you can then assign it to a nictag like you
> normally do by referencing it using stubX instead of the mac address. 

Use nictagadm add -l to do this. There's no need to manually edit the
configuration file.

Robert






Re: [smartos-discuss] Fastest link between 2 VMs

2016-01-25 Thread Robert Mustacchi
On 1/25/16 5:38 , Humberto Ramirez wrote:
> What would you say is the improvement over a standard vnic? Does it
> approach a 10G link speed?

An etherstub is a local virtual switch. VNICs can be created on top of
it like they can be created on top of normal physical devices.

When you're only focusing on loopback devices and virtio devices, link
speed is a red herring and you should just ignore it. Link speed only
matters when you have a physical device, as that speed indicates the
upper bound of the data rate that it can put on the wire.

If you've rigged everything up over an etherstub then you'll never go
out over the physical device; however, devices will still show a link
speed, because there's really no way not to. For example, a virtio
device in a hardware virtualized guest has no way of knowing what the
link speed of the device it's going out over is. It could be 100 Mbit/s,
1 Gbit/s, 10 Gbit/s, or 40 Gbit/s, etc. and still only show the link
speed in the guest as 1 Gbit/s.

Practically, the limits of link speed for a VNIC are based on the
underlying device or the kernel data path, so it can saturate a 10
Gbit/s device. On the flip side, due to how the hardware virtualization
is currently implemented, it is unlikely that you will see speeds much
higher than 1 Gbit/s.

Robert




Re: [smartos-discuss] Fastest link between 2 VMs

2016-01-25 Thread Robert Mustacchi
On 1/25/16 8:53 , de...@hyltown.com wrote:
> - On Jan 25, 2016, at 11:04 AM, Robert Mustacchi r...@joyent.com wrote:
> 
>> On 1/25/16 5:38 , Humberto Ramirez wrote:
>>> What would you say is the improvement over a standard vnic? Does it
>>> approach a 10G link speed?
>>
>> An etherstub is a local virtual switch. VNICs can be created on top of
>> it like they can be created on top of normal physical devices.
>>
>> When you're only focusing on loopback devices and virtio devices, link
>> speed is a red herring and you should just ignore it. Link speed only
>> matters when you have a physical device, as that speed indicates the
>> upper bound of the data rate that it can put on the wire.
>>
>> If you've rigged everything up over an etherstub then you'll never go
>> out over the physical device; however, devices will still show a link
>> speed, because there's really no way not to. For example, a virtio
>> device in a hardware virtualized guest has no way of knowing what the
>> link speed of the device it's going out over is. It could be 100 Mbit/s,
>> 1 Gbit/s, 10 Gbit/s, or 40 Gbit/s, etc. and still only show the link
>> speed in the guest as 1 Gbit/s.
>>
>> Practically, the limits of link speed for a VNIC are based on the
>> underlying device or the kernel data path, so it can saturate a 10
>> Gbit/s device. On the flip side, due to how the hardware virtualization
>> is currently implemented, it is unlikely that you will see speeds much
>> higher than 1 Gbit/s.
> 
> would you please give an example of how one would create an etherstub
> (acting as a virtual switch) in such a way that the vms local to the
> given smartos machine would leverage that, while vms on other smartos
> machines would still be able to reach the vms connected to the etherstub?

Let me try to clarify how all this works. I don't think you've done
anything wrong per se.

Whenever you create a VNIC, you traditionally have to specify it as
being over a specific physical device. Logically, you can think of this
as if every VNIC and the physical device are plugged into a switch,
and frames which don't match any devices on that switch (e.g. other VNICs
and the physical device itself) will be transmitted out over the wire of
the physical device. Note, you never create this switch, nor can you
manage or configure it. It's all set up automatically for you.

An etherstub is similar in concept. You can think of it like the
physical device in the above example, except that if the destination MAC
address is not a VNIC on the etherstub, then the frames will be dropped.
The etherstub itself has no MAC address.

So in this case, the only time I'd employ an etherstub is if I wanted to
have a network that was local to the host itself. There are a couple
reasons you might use this. The primary use case I see is that folks
want to have one zone that acts as a router or firewall and all the rest
are on their own private network that uses the one zone to get out. In
this case, they'll put the router/firewall over both a physical device
and an etherstub.
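
As a rough sketch of that router/firewall layout, the commands below
create an etherstub and put VNICs over it and over a physical link. The
link names (stub0, e1000g0, fwext0, fwint0, app0) are placeholders I made
up, and for real zones you would normally let vmadm create the VNICs
rather than calling dladm by hand. The script only prints the commands
unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: all link names below are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build a host-local network behind a router/firewall zone.
#   $1 = physical link, $2 = etherstub name
make_private_net() {
    phys=$1
    stub=$2
    # The etherstub acts as a host-local virtual switch.
    run dladm create-etherstub -t "$stub"
    # The router/firewall zone gets one VNIC on each side.
    run dladm create-vnic -t -l "$phys" fwext0
    run dladm create-vnic -t -l "$stub" fwint0
    # Every other zone only gets a VNIC on the etherstub.
    run dladm create-vnic -t -l "$stub" app0
}

make_private_net e1000g0 stub0
```

Frames from app0 can only reach fwint0 (or other VNICs on stub0), so
anything leaving the host has to be forwarded by the zone that also owns
fwext0.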

Now, there's also no reason that you have to use etherstubs. For
example, at Joyent, we don't use etherstubs at all, because there's
nothing we have that we want to bind to the confines of a single host.
Even when there are private networks, they span more than one physical host.

Does that help clarify things at all?

Robert


---
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com


Re: [smartos-discuss] Is zfs dead man timer tunable?

2016-01-25 Thread Robert Mustacchi
On 1/25/16 22:50 , Fred Liu wrote:
> [root@pluto /zones/debug]# ls -la
> total 4818285
> drwxr-xr-x   2 root root   5 Jan 26 14:16 .
> drwxr-xr-x  16 root root  19 Jan 25 19:21 ..
> -rw-r--r--   1 root root   2 Jan 26 14:15 bounds
> -rw-r--r--   1 root root1056 Jan 26 14:15 METRICS.csv
> -rw-r--r--   1 root root 4305518592 Jan 26 14:15 vmdump.0
> [root@pluto /zones/debug]# mdb -f vmdump.0
>> echo ::zio_state 
>  mdb:  failed to dereference symbol: operation not supported by target
>   > ::status
>  debugging file 'vmdump.0' (object file)
>  
>  [root@pluto /zones/debug]# echo "::zio_state" | mdb -f vmdump.0
>   invalid command '::zio_state': unknown dcmd name
>  
>  It looks like I can't find too much useful info here.

You don't want to be using mdb -f on a system dump. Instead here, I
would go into that directory, run `savecore -vf vmdump.0 .`.

That will create a unix.0 and vmcore.0 which you can then access with
mdb by running `mdb 0` in that directory. Note that by using mdb -f,
you've asked mdb not to interpret the core dump, but rather treat it as
a raw file which is why the dcmds are not being found.
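
Put together, the sequence looks roughly like the sketch below. The
helper name expand_dump is mine, and it checks that vmdump.0 exists and is
non-empty before handing it to savecore. It only prints the savecore
command unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: expands a compressed dump, then suggests the mdb step.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = directory holding vmdump.0
expand_dump() {
    dir=$1
    # savecore needs a real, non-empty dump file to work from.
    if [ ! -s "$dir/vmdump.0" ]; then
        echo "no usable vmdump.0 in $dir" >&2
        return 1
    fi
    # Writes unix.0 and vmcore.0 next to the compressed dump.
    (cd "$dir" && run savecore -vf vmdump.0 .) || return 1
    echo "next: cd $dir && mdb 0   # then try ::status or ::zio_state"
}

expand_dump /zones/debug || echo "nothing to do"
```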

Robert




Re: [developer] RE: [smartos-discuss] Is zfs dead man timer tunable?

2016-01-25 Thread Robert Mustacchi
On 1/25/16 23:18 , Fred Liu wrote:
> 
> 
>> -----Original Message-----
>> From: Robert Mustacchi [mailto:r...@joyent.com]
>> Sent: Tuesday, January 26, 2016 14:57
>> To: smartos-discuss@lists.smartos.org
>> Cc: illumos-developer
>> Subject: Re: [smartos-discuss] Is zfs dead man timer tunable?
>>
>> On 1/25/16 22:50 , Fred Liu wrote:
>>> [root@pluto /zones/debug]# ls -la
>>> total 4818285
>>> drwxr-xr-x   2 root root   5 Jan 26 14:16 .
>>> drwxr-xr-x  16 root root  19 Jan 25 19:21 ..
>>> -rw-r--r--   1 root root   2 Jan 26 14:15 bounds
>>> -rw-r--r--   1 root root1056 Jan 26 14:15 METRICS.csv
>>> -rw-r--r--   1 root root 4305518592 Jan 26 14:15 vmdump.0
>>> [root@pluto /zones/debug]# mdb -f vmdump.0
>>>> echo ::zio_state
>>>  mdb:  failed to dereference symbol: operation not supported by
>> target
>>>   > ::status
>>>  debugging file 'vmdump.0' (object file)
>>>
>>>  [root@pluto /zones/debug]# echo "::zio_state" | mdb -f vmdump.0
>>>   invalid command '::zio_state': unknown dcmd name
>>>
>>>  It looks like I can't find too much useful info here.
>>
>> You don't want to be using mdb -f on a system dump. Instead here, I
>> would go into that directory, run `savecore -vf vmdump.0 .`.
>>
>> That will create a unix.0 and vmcore.0 which you can then access with
>> mdb by running `mdb 0` in that directory. Note that by using mdb -f,
>> you've asked mdb not to interpret the core dump, but rather treat it as
>> a raw file which is why the dcmds are not being found.
>>
> 
> It looks like "`savecore -vf vmdump.0 .`" won't work:
> 
> [root@pluto /zones/debug]# ls -la
> total 39
> drwxr-xr-x   2 root root   2 Jan 26 15:15 .
> drwxr-xr-x  16 root root  19 Jan 25 19:21 ..
> 
> [root@pluto /zones/debug]# savecore -vf vmdump.0 .
> savecore: stat("vmdump.0"): No such file or directory
> savecore: open("vmdump.0"): No such file or directory
> 
> [root@pluto /zones/debug]# touch vmdump.0
> [root@pluto /zones/debug]# savecore -vf vmdump.0 .
> savecore: pread: Invalid argument
> savecore: pread: Invalid argument
> 
> Since I cleaned "/zones/debug", it looks like I need to panic the server again:
> [root@pluto /zones/debug]# savecore -v /zones/debug
> savecore: dump already processed

It's certainly not going to work if the file's not there. Which your
first ls shows it's not. Did you delete it?





Re: [smartos-discuss] Fastest link between 2 VMs

2016-01-26 Thread Robert Mustacchi
On 1/25/16 12:36 , the outsider wrote:
> The good old Solaris 10 already "sensed" IP traffic from zones and kept all 
> IP-traffic that didn't have to go "on wire" inside its own hardware. 

That was only because it had 'shared networking stacks'. Specifically
during this time, all of the non-global zones shared the ARP/NDP tables,
the routing tables, etc. This actually limited a lot of what these zones
were able to do.

Instead, 'exclusive networking stacks' were introduced. Here every zone
has its own set of networking information, everything from IPF
rulesets, to ARP/NDP tables, routing tables, tunables, etc. As a result
of this, not every zone is necessarily considered for sending data from
one to another at an IP layer. Which is rather important. This allows
for the zones to be on different VLANs and even have the same IP
addresses. This combined with the ability for the global zone to set
antispoofing/link protection properties on the devices, means that even
if the zone or KVM instance wants to change an IP or MAC address, it
can't use them.
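
To give a feel for what those link protection properties look like at the
dladm level, here is a minimal sketch. The VNIC name and address are
placeholders I picked, SmartOS normally sets these for you from the VM's
configuration, and depending on where the link lives you may also need
dladm's zone option. It only prints the commands unless you set
DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: the VNIC name and address are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Pin a VNIC to its assigned MAC and IP address.
#   $1 = VNIC name, $2 = IP address the guest may use
lock_down_vnic() {
    vnic=$1
    ip=$2
    # Drop frames with a spoofed source MAC or source IP.
    run dladm set-linkprop -t -p protection=mac-nospoof,ip-nospoof "$vnic"
    # With ip-nospoof set, only these source addresses are accepted.
    run dladm set-linkprop -t -p allowed-ips="$ip" "$vnic"
}

lock_down_vnic net0 192.0.2.10
```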

> Good habit on hardware with multiple NICs is to keep a NIC unattached from 
> the network so you can assign it to every zone that doesn't need outgoing 
> traffic. By that you spare a port on the switch also. The NIC has to be up 
> and plumbed for zones to use it. 

Basically none of this is really needed anymore. If you want to have a
group of zones isolated on a local-only network, you just create an
etherstub and create VNICs over that.

Also, in general, you don't need to ever explicitly bring links up or
down (from an ifconfig sense) to use them with VNICs.

I hope this helps clarify a bit about what's changed and how things work
these days. Let me know if you have additional questions.

Robert





Re: [smartos-discuss] Fastest link between 2 VMs

2016-01-26 Thread Robert Mustacchi
On 1/26/16 8:21 , Humberto Ramirez wrote:
> "Practically, the limits of link speed for a VNIC are based on the
> underlying device or the kernel data path, so it can saturate a 10
> Gbit/s device. On the flip side, due to how the hardware virtualization
> is currently implemented, it is unlikely that you will see speeds much
> higher than 1 Gbit/s."
> 
> Robert, so based on your experience I will not see 2 VMs talking faster
> than 1 Gbit/s? (At least not in SmartOS)
> Did I understand you correctly?

That's only true for *KVM* guests. Though it may vary.

Traditional zones or lx zones can easily saturate 10+ Gbit/s.

Robert




Re: [smartos-discuss] Fastest link between 2 VMs

2016-01-26 Thread Robert Mustacchi
On 1/26/16 11:07 , David Preece wrote:
> 
>> On 26/01/2016, at 7:10 AM, Robert Mustacchi  wrote:
>>
>> So in this case, the only time I'd employ an etherstub is if I wanted to
>> have a network that was local to the host itself. 
> 
> I had been assuming one could join etherstubs together with an L2 tunnel of 
> some description - or is this a really bad idea?

Are you referring to two different etherstubs on the same physical host
or on different physical hosts?

Robert





Re: [smartos-discuss] Fastest link between 2 VMs

2016-01-26 Thread Robert Mustacchi
On 1/26/16 11:10 , David Preece wrote:
> 
>> On 27/01/2016, at 8:09 AM, Robert Mustacchi  wrote:
>>
>> Are you referring to two different etherstubs on the same physical host
>> or on different physical hosts?
> 
> Different.

You certainly can do something like that. I know folks who did things
like snoop for unsent packets, encapsulate them and forward them on.
However, instead, we developed a different abstraction to take care of
this which we call an 'overlay' which can be created and managed with
dladm on top of which vnics and the like can be created.

While it functions like an etherstub on the local host, frames will be
encapsulated and sent off to remote hosts based on configuration
settings. Effectively, creating 'overlay networks'. This is how we
implement fabrics in SDC/Triton, which is basically point-to-point VXLAN
tunnels while SDC controls the directory.

You can find more information on overlays starting at the following
manual page: https://smartos.org/man/5/overlay.

Robert




Re: [smartos-discuss] lx (centos 6) + postfix network_interfaces

2016-01-27 Thread Robert Mustacchi
On 1/27/16 8:17 , InterNetX - Juergen Gotteswinter wrote:
> Hi,
> 
> I am trying to get postfix working inside an lx zone, which seems to be
> OK as long as I don't change network_interfaces to "all" in main.cf. If
> "all" is set, postfix completely stops working; it's not even logging
> anymore. Even after reverting this setting to localhost or whatever, it
> stays dead (rebooting the zone doesn't help either).
> 
> When I specify the IPs directly, everything is fine ...
> 
> I expect that I might not be the first with this issue, maybe someone
> knows how to fix this?

The most helpful thing to do here would be to strace/truss postfix in
this mode and see what it's trying to do and is failing. Let us know if
you need more pointers to how to do this.

Robert





Re: [smartos-discuss] benchmarking between LX and KVM

2016-01-28 Thread Robert Mustacchi
On 1/28/16 2:44 , Fred Liu wrote:
> Hi,
> 
> Has anyone ever tried performance benchmarking between LX and KVM?
> In my quick-and-dirty test (compiling gcc), LX is 20% slower than KVM. It is 
> sort of disappointing!

Are you performing active benchmarking and on the latest platform? This
is a case where a quick use of the USE method may be rather insightful.
If you just did a fire and forget, it's very easy to end up comparing
different things.

Robert




Re: [smartos-discuss] Win10 KVM installation problems

2016-01-31 Thread Robert Mustacchi
On 1/31/16 17:14 , Jason Lawrence wrote:
> I tried installing Windows 10 (32 bit) in a KVM zone and it appears to
> hang immediately after booting from the ISO, indefinitely displaying the
> Windows startup logo via VNC for several hours. I see zero disk
> activity. The only obvious information I can find relating to KVM is in
> the syslog right after starting the VM:
> 
> 2016-01-31T22:25:58.379728+00:00 smarty genunix: [ID 408114 kern.info]
> /pseudo/zconsnex@1/zcons@12 (zcons12) online
> 2016-01-31T22:25:59.332226+00:00 smarty mac: [ID 469746 kern.info]
> NOTICE: vnic1218 registered
> 2016-01-31T22:25:59.461815+00:00 smarty kvm: [ID 420667 kern.info]
> kvm_lapic_reset: vcpu=ff051d031000, id=0, base_msr= fee00100 PRIx64
> base_address=fee0
> 2016-01-31T22:25:59.461836+00:00 smarty kvm: [ID 710719 kern.info] vmcs
> revision_id = 12
> 2016-01-31T22:25:59.462918+00:00 smarty kvm: [ID 420667 kern.info]
> kvm_lapic_reset: vcpu=ff04ebf3b000, id=1, base_msr= fee0 PRIx64
> base_address=fee0
> 2016-01-31T22:25:59.462927+00:00 smarty kvm: [ID 710719 kern.info] vmcs
> revision_id = 12
> 2016-01-31T22:26:03.139185+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x1010101 data fd7fffdfe880
> 2016-01-31T22:26:03.139211+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x1010101 data fd7fffdfe880
> 2016-01-31T22:26:03.144332+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0xff30ccec data fd7fffdfe850
> 2016-01-31T22:26:03.144407+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x0 data 0
> 2016-01-31T22:26:03.144776+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x0 data 0
> 2016-01-31T22:26:03.144819+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x0 data 0
> 2016-01-31T22:26:03.184712+00:00 smarty kvm: [ID 291337 kern.info] vcpu
> 1 received sipi with vector # 10
> 2016-01-31T22:26:03.184722+00:00 smarty kvm: [ID 420667 kern.info]
> kvm_lapic_reset: vcpu=ff04ebf3b000, id=1, base_msr= fee00800 PRIx64
> base_address=fee0
> 2016-01-31T22:26:04.387896+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x0 data 0
> 2016-01-31T22:26:07.862118+00:00 smarty kvm: [ID 713435 kern.info]
> unhandled rdmsr: 0xff295897
> 2016-01-31T22:26:07.862170+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x50bff7 data fd7ffe9de1c0
> 2016-01-31T22:26:07.867041+00:00 smarty kvm: [ID 713435 kern.info]
> unhandled rdmsr: 0xff295897
> 2016-01-31T22:26:07.867059+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x50bff7 data fd7ffe9de1c0
> 2016-01-31T22:26:07.867357+00:00 smarty kvm: [ID 713435 kern.info]
> unhandled rdmsr: 0x3c0003f
> 2016-01-31T22:26:07.867376+00:00 smarty kvm: [ID 391722 kern.info]
> unhandled wrmsr: 0x0 data 0
> 2016-01-31T22:26:19.488477+00:00 smarty kvm: [ID 713435 kern.info]
> unhandled rdmsr: 0x3a
> 
> This means nothing to me, but any suggestions on where I can start looking?

The first thing that I'd look at is the kvmstat output and see if the
guest is running and trying to do things or not.

Robert





Re: [smartos-discuss] mariadb and centos7 lx

2016-02-10 Thread Robert Mustacchi
On 2/9/16 1:19 , Alessio Ciregia wrote:
> FYI I had to put PrivateTmp=False inside the
> /usr/lib/systemd/system/mariadb.service file, in order to start mariadb
> in a Centos 7 LX zone (dataset uuid aae64e42-c88d-11e5-a49d-87f422b1820b).

Hi Alessio,

Would it be possible to file a bug about this on
github.com/joyent/smartos-live/issues and if possible, to grab the
strace output for the service starting that indicates what actually failed?

Thanks for reporting this.

Robert





Re: [smartos-discuss] Querying ZFS permanent error in zdb

2016-02-15 Thread Robert Mustacchi
On 2/14/16 15:07 , Micky wrote:
> So after getting a permanent error on a zvol inside a mirrored zpool, I
> deleted the zvol and restored it from the backup.
> 
> But after running a scrub, I still get this in verbose status:
> 
> errors: Permanent errors have been detected in the following files:
> <0x11cdc>:<0x1>
> 
> Is there a way to look it up with zdb?
> 
> Or to know if it's about the corrupted metadata?

I'd suggest reaching out to the Open ZFS developer list. There'll be
more folks there who are intimately familiar with ZFS internals and will
be able to better help you.

Robert





[smartos-discuss] Intel I219 support, e1000g, igb testing requests

2016-02-22 Thread Robert Mustacchi
Hi all,

I've updated the igb and e1000g drivers for the most recent changes from
Intel. Most notably, this adds support for the I219 family of devices
which can be found on Skylake systems with the 100 series chipsets.

If you have an I219, in particular, I'd appreciate if you could test
this, as this work is primarily for you.

If you don't have an I219, but do have other Intel 1 gig cards, powered
by the e1000g and igb drivers, I'd appreciate it if you could also test
this. You can see what NICs you have by running dladm show-phys.

Here are links to all of the different formats I have it in:

SmartOS/SDC platform:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/platform-20160221T163907Z.tgz
SmartOS ISO:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/platform-20160221T163907Z.iso
SmartOS USB:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/platform-20160221T163907Z.usb.bz2

e1000g 64-bit x86:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv/amd64/e1000g
e1000g 64-bit x86 debug:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv-debug/amd64/e1000g

e1000g 32-bit x86:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv/e1000g
e1000g 32-bit x86 debug:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv-debug/e1000g

igb 64-bit x86:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv/amd64/igb
igb 64-bit x86 debug:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv-debug/amd64/igb
igb 32-bit x86:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv/igb
igb 32-bit x86 debug:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/drv-debug/igb


webrev:
http://us-east.manta.joyent.com/rmustacc/public/webrevs//index.html
patch:
https://us-east.manta.joyent.com/rmustacc/public/preview/i219/i219.patch

I will send separate mail to the list for review. Please do not reply to
this with any non-testing review feedback at this time.

If you do end up testing this, I ask that you do the following:

1) For each entry in dladm show-phys that's e1000g or igb, run:
prtconf -d /dev/

Note if devices share the same description, then it's not important to
repeat this. e.g. you may have a card with multiple ports.

2) Make sure that everything that used to work, still works. e.g. basic
unicast and multicast traffic flows. VNICs and zones are still all
pingable, etc.

3) If you have an I219, I'd appreciate if you could run the following
test just to make sure that we're properly transitioning the NIC to
promiscuous mode. The test basically is to create sixteen VNICs in total.

After each VNIC is created:
   * Assign an IP address to that VNIC
   * Ensure that you can ping that IP address from another host
   * Create the next VNIC
   * Stop after the 16th one
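
If it helps, the loop above can be sketched as a small script. The VNIC
names (ptest1 through ptest16), the physical link e1000g0 and the
192.0.2.0/24 test prefix are all placeholders of mine, so substitute
whatever fits your network. It only prints the commands unless you set
DRY_RUN=0, in which case it waits for you to confirm each ping from the
other host.

```shell
#!/bin/sh
# Sketch only: link names and the address prefix are hypothetical.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create sixteen VNICs one at a time, giving each an address.
#   $1 = physical link, $2 = first three octets of a test /24
vnic_promisc_test() {
    phys=$1
    prefix=$2
    i=1
    while [ "$i" -le 16 ]; do
        vnic="ptest$i"
        addr="$prefix.$((100 + i))"
        run dladm create-vnic -t -l "$phys" "$vnic"
        run ifconfig "$vnic" plumb
        run ifconfig "$vnic" inet "$addr" netmask 255.255.255.0 up
        echo "now ping $addr from another host before continuing"
        if [ "$DRY_RUN" != 1 ]; then
            printf 'press Enter once the ping succeeds: '
            read -r reply
        fi
        i=$((i + 1))
    done
}

vnic_promisc_test e1000g0 192.0.2
```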

4) If you find yourselves wanting to do some basic stress tests, that'd
be great. I'll make sure that we do some for several of the devices as well.

If you have any questions, please reach out to me and let me know.

Thanks,
Robert




Re: [smartos-discuss] promiscuous-mode nic passthrough (think Snort and SPAN)

2016-02-27 Thread Robert Mustacchi
On 2/26/16 16:23 , Rob Seastrom wrote:
> 
> Hi folks,
> 
> Maybe my Google-fu is failing me (and searching my archives of this list has 
> failed me too)...  but has anyone got a recipe for passing through a physical 
> NIC in a mode where it can go promiscuous mode to a SmartMachine?  Is that 
> even possible with Crossbow in the middle?
> 
> Use case is monitoring span/port mirrors on a couple of switches, or maybe 
> optical taps if I manage to find my junk box.  I see that Snort is in pkgsrc 
> - don't know if that means people are running it just on a SmartMachine to 
> monitor traffic to and from it, or if folks are actually running a full blown 
> network IDS on SmartOS.

While you can't assign a physical nic itself you can opt to allow the
vnic to have unfiltered access to the underlying device's promiscuous
mode with the vmadm property 'nics.*.allow_unfiltered_promisc'.

That should do what you need, I expect, but still allow other zones to
leverage the device (which would not really be possible if you assigned
the NIC fully to the zone).
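
A minimal sketch of setting that property on an existing VM follows. I'm
assuming the usual vmadm update form where a NIC is picked out by its MAC
under update_nics, so please check vmadm(1M) before relying on it. The
MAC address is a placeholder and <uuid> stands for your VM's UUID. The
script only builds and prints the JSON payload; it does not call vmadm.

```shell
#!/bin/sh
# Sketch only: the MAC address below is a hypothetical placeholder.

# Build the update payload for one NIC identified by its MAC.
#   $1 = MAC address of the NIC to change
promisc_payload() {
    printf '{"update_nics": [{"mac": "%s", "allow_unfiltered_promisc": true}]}\n' "$1"
}

mac="02:00:00:00:00:01"
promisc_payload "$mac"
echo "apply with: promisc_payload $mac | vmadm update <uuid>"
```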

Robert




Re: [smartos-discuss] promiscuous-mode nic passthrough (think Snort and SPAN)

2016-02-29 Thread Robert Mustacchi
On 2/29/16 11:46 , Rob Seastrom wrote:
> 
>> On Feb 27, 2016, at 4:34 PM, Robert Mustacchi  wrote:
>>
>> On 2/26/16 16:23 , Rob Seastrom wrote:
>>>
>>> Hi folks,
>>>
>>> Maybe my Google-fu is failing me (and searching my archives of this list 
>>> has failed me too)...  but has anyone got a recipe for passing through a 
>>> physical NIC in a mode where it can go promiscuous mode to a SmartMachine?  
>>> Is that even possible with Crossbow in the middle?
>>>
>>> Use case is monitoring span/port mirrors on a couple of switches, or maybe 
>>> optical taps if I manage to find my junk box.  I see that Snort is in 
>>> pkgsrc - don't know if that means people are running it just on a 
>>> SmartMachine to monitor traffic to and from it, or if folks are actually 
>>> running a full blown network IDS on SmartOS.
>>
>> While you can't assign a physical nic itself you can opt to allow the
>> vnic to have unfiltered access to the underlying device's promiscuous
>> mode with the vmadm property 'nics.*.allow_unfiltered_promisc'.
>>
>> That should do what you need, I expect, but still allow other zones to
>> leverage the device (which would not really be possible if you assigned
>> the NIC fully to the zone).
> 
> 
> Not sure what I'm doing wrong here, but I'm only seeing broadcast and 
> multicast traffic.  The vnic in the zone doesn't show PROMISC in the flags 
> when I'm running tcpdump or snoop.

For what it's worth, I don't see the PROMISC flag on a VNIC normally.

> I can see all traffic just fine when I run snoop in the global zone.
> 
> A possible added difficulty is that the mirror port is spitting out 802.1q 
> tagged traffic.  I was only getting the LLDP traffic between the switch and 
> the router (i.e. untagged) before I configured the nic with a vlan in the 
> smartmachine.

When I originally did the unfiltered promisc bits it was focused on
additional mac addresses for KVM guests which would still be on the same
VLAN. There could be some gotchas there. Though, I'd also run dladm
show-linkprop to verify that it's been properly set. Note that this will
require the zone to be halted and then started up again.

Robert




Re: [smartos-discuss] promiscuous-mode nic passthrough (think Snort and SPAN)

2016-02-29 Thread Robert Mustacchi
On 2/27/16 14:53 , Eric wrote:
> On Sat, Feb 27, 2016 at 4:34 PM Robert Mustacchi  wrote:
> 
>> While you can't assign a physical nic itself you can opt to allow the
>> vnic to have unfiltered access to the underlying device's promiscuous
>> mode with the vmadm property 'nics.*.allow_unfiltered_promisc'.
>>
> 
>  From the vmadm manual, does it mean it's meant to work with just KVMs?

That's certainly the intent of the work that was done. In this case it
looks like it's not explicitly limited to KVM, but that is the intent and
using it elsewhere probably isn't safe to rely on long-term.

Robert

> nics.*.allow_unfiltered_promisc:
> 
> With this property set to true, this VM will be able to have multiple
> MAC addresses (eg. running SmartOS with VNICs). Without this option
> these packets will not be picked up as only those unicast packets
> destined for the VNIC's MAC will get through. Warning: do not enable
> this option unless you fully understand the security implications.
> 
> type: boolean
> vmtype: KVM
> listable: yes (see above)
> create: yes
> update: yes
> default: false





Re: [smartos-discuss] promiscuous-mode nic passthrough (think Snort and SPAN)

2016-02-29 Thread Robert Mustacchi
On 2/29/16 14:56 , Rob Seastrom wrote:
> 
>> On Feb 29, 2016, at 5:27 PM, Robert Mustacchi  wrote:
>>> I can see all traffic just fine when I run snoop in the global zone.
>>>
>>> A possible added difficulty is that the mirror port is spitting out 802.1q 
>>> tagged traffic.  I was only getting the LLDP traffic between the switch and 
>>> the router (i.e. untagged) before I configured the nic with a vlan in the 
>>> smartmachine.
>>
>> When I originally did the unfiltered promisc bits it was focused on
>> additional mac addresses for KVM guests which would still be on the same
>> VLAN. There could be some gotchas there. Though, I'd also run dladm
>> show-linkprop to verify that it's been properly set. Note that this will
>> require the zone to be halted and then started up again.
> 
> Turns out that it was not being set right.  I rebooted the zone, and then 
> prior to starting tcpdump, ran:
> 
> dladm set-linkprop -z 2dc24843-a10c-6e9d-a9d0-c69520ece6d9 -p 
> promisc-filtered=off net1
> 
> I was rewarded with "dladm: warning: invalid link property 
> 'promisc-filtered'", but the current value changed from "on" to "off", and 
> after that tcpdump worked as expected.

That warning came most likely because you missed the -t option.

> Doesn't seem to be persistent across reboots of the zone though.  Any clues 
> to making it persistent?

Actually, looking deeper, the problem is that I trusted my memory too
much as Eric pointed out. We don't actually support passing this through
for non-KVM instances at this moment, per
https://github.com/joyent/smartos-live/blob/master/overlay/generic/usr/lib/brand/joyent/statechange#L21.

Robert





Re: [smartos-discuss] SmartOS on NUC6I3SYH?

2016-03-04 Thread Robert Mustacchi
On 3/4/16 7:07 , Dirk Steinberg wrote:
> Hi,
> 
> does SmartOS run on the NUC6I3SYH? Has anyone tried?
> It has an Intel Core i3-6100U CPU based on the Skylake
> architecture.

There's someone who tried a NUC on the developer list, not sure if it's
that one in particular.

> From experience I know that newer Intel platforms tend 
> to have newer versions of their NIC chips as well and
> sometimes these are not (yet) supported, but I do not
> know what NIC part is in the box.

The I219 is the phy that's shipping with the platform currently.

> I saw that Robert is testing some new NIC drivers (I219)
> so maybe this is what is needed. When will these be 
> integrated into SmartOS?

There are some bugs and I don't have an I219 right now. Some folks
offered to help provide remote access so I could do some debugging, but
I have no ETA yet.

> Does anyone know if the new NUCs (6th gen) will
> support MTU 9000 jumbo frames for SDC fabric networking?

According to the spec, it should.

> On the older Broadwell NUCs (I218) I was unable to set 
> any MTU higher than 1500. I found that disappointing,
> but maybe I did something wrong.

The I218 in theory supports jumbo frames. We'll need to work out a bit
more information about what went wrong. The best starting point is to
get the output of dladm show-linkprop -p mtu on one of the e1000g's for
the I218.
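
For reference, here is a small sketch of what to look for in that output.
I'm assuming the POSSIBLE column for mtu is reported either as a single
value or as a low-high range; the "1500-9216" string and the link name
e1000g0 are made-up sample values, not output from an I218.

```shell
#!/bin/sh
# Sketch only: the range and link name are hypothetical sample values.

# Succeeds if the wanted MTU falls inside the POSSIBLE value.
#   $1 = POSSIBLE value ("1500" or "1500-9216"), $2 = wanted MTU
mtu_in_range() {
    range=$1
    want=$2
    low=${range%%-*}     # text before the first dash
    high=${range##*-}    # text after the last dash
    [ "$want" -ge "$low" ] && [ "$want" -le "$high" ]
}

echo "inspect with: dladm show-linkprop -p mtu e1000g0"
if mtu_in_range "1500-9216" 9000; then
    echo "9000 allowed; try: dladm set-linkprop -p mtu=9000 e1000g0"
else
    echo "driver does not advertise an MTU of 9000 for this link"
fi
```

If the range reported for the I218 stops at 1500, that would point at the
driver rather than at anything you did wrong.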

Robert




Re: [smartos-discuss] Samsung 950 Pro on SmartOS?

2016-03-04 Thread Robert Mustacchi
On 3/4/16 7:14 , Dirk Steinberg wrote:
> Hi,
> 
> has anyone tried the new Samsung 950 Pro on SmartOS?
> It is a PCIe-based NVMe SSD in M.2 form factor.
> 
> Can anyone say if it would be suitable as a log disk?

The question to ask yourself is what your duty cycle is going to be.
For the 256 GB device, the endurance is rated at 200 TBW over its
lifetime. You'll want to work out whether that fits how heavily you
intend to write to the device and how long you expect the part to last.
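
As a rough way to put numbers on that, here is a minimal Python sketch
that turns a TBW rating into an expected lifetime. The 200 TBW figure is
the rating mentioned above; the daily write volumes are made-up examples,
and the function name is just for illustration. It assumes a steady write
rate and decimal units, and ignores write amplification.

```python
def years_until_worn_out(rated_tbw, writes_gb_per_day):
    """Years a drive rated for `rated_tbw` terabytes written would last
    at a steady `writes_gb_per_day` (decimal units: 1 TB = 1000 GB)."""
    if writes_gb_per_day <= 0:
        raise ValueError("writes_gb_per_day must be positive")
    days = rated_tbw * 1000.0 / writes_gb_per_day
    return days / 365.0


# 200 TBW is the rating quoted above; the daily volumes are hypothetical.
for gb_per_day in (20, 100, 500):
    years = years_until_worn_out(200, gb_per_day)
    print(f"{gb_per_day:>4} GB/day -> {years:.1f} years")
```

At 100 GB/day, for instance, 200 TBW works out to 2000 days, or roughly
five and a half years.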

For a home machine, it may be fine depending entirely on what you're
doing with the machine. I would not recommend it for a production server.

Robert




Re: [smartos-discuss] Samsung 950 Pro on SmartOS?

2016-03-04 Thread Robert Mustacchi
On 3/4/16 8:21 , Dirk Steinberg wrote:
> 
>> On 04.03.2016 at 17:10, Robert Mustacchi wrote:
>>
>> On 3/4/16 7:14 , Dirk Steinberg wrote:
>>> Hi,
>>>
>>> has anyone tried the new Samsung 950 Pro on SmartOS?
>>> It is a PCIe-based NVMe SSD in M.2 form factor.
>>>
>>> Can anyone say if it would be suitable as a log disk?
>>
>> The question you need to ask yourself is what is your duty cycle going
>> to be. For the 256 GB device, the endurance is rated to 200 TBW for its
>> lifetime. You'll want to figure out if that lifetime fits your intended
>> usage and how long you intend the part to last and how heavily you're
>> intending to use the device.
>>
>> For a home machine, it may be fine depending entirely on what you're
>> doing with the machine. I would not recommend it for a production server
> 
> OK, this assessment is purely on endurance?

Yes. I'd also look at where its general write latency falls, since
that's the only thing that matters for a SLOG.

> Are there any SMART counters one could read out of an SSD to see how
> many writes one has consumed out of the available, say 200TB?
> So I know that my drive is say 30% used up?

I don't personally know.

> Apart from that: is NVMe support in SmartOS considered stable?

There is driver support for it in the system. I have not heard many
reports about it, positive or negative.





Re: [smartos-discuss] SmartOS on NUC6I3SYH?

2016-03-04 Thread Robert Mustacchi
On 3/4/16 8:32 , Dirk Steinberg wrote:
> 
>>> On the older Broadwell NUCs (I218) I was unable to set 
>>> any MTU higher than 1500. I found that disappointing,
>>> but maybe I did something wrong.
>>
>> The I218 in theory supports jumbo frames. We'll need to work out a bit
>> more information about what went wrong. The best starting point is to
>> get the output of dladm show-linkprop -p mtu on one of the e1000g's for
>> the I218.
> 
> [root@nuc0 ~]# dladm show-phys
> LINK         MEDIA      STATE  SPEED  DUPLEX  DEVICE
> e1000g0      Ethernet   up     1000   full    e1000g0
> [root@nuc0 ~]# dladm show-linkprop -p mtu e1000g0
> LINK         PROPERTY   PERM   VALUE  DEFAULT  POSSIBLE
> e1000g0      mtu        rw     1500   1500     1500-9216
> 
> Setting a higher MTU in a running system does not work 
> (I read that this is expected, although on Linux this has always worked)
> 
> [root@nuc0 ~]# ifconfig e1000g0 mtu 9000
> ifconfig: setifmtu: SIOCSLIFMTU: e1000g0: Invalid argument
> [root@nuc0 ~]# dladm set-linkprop -p mtu=9000 e1000g0
> dladm: warning: cannot set link property 'mtu' on 'e1000g0': link busy

Yes, this is a known limitation with that driver.
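
Separately from the "link busy" limitation, the POSSIBLE column in the
output quoted above is what tells you whether the driver advertises
jumbo frames at all. A small Python sketch of checking a requested MTU
against it, assuming the six-column layout shown in that output (the
function name and the sample text are only for illustration):

```python
def mtu_range(linkprop_output):
    """Return (low, high) from the POSSIBLE column of the mtu row in
    `dladm show-linkprop -p mtu <link>` output."""
    for line in linkprop_output.splitlines():
        fields = line.split()
        # Expected columns: LINK PROPERTY PERM VALUE DEFAULT POSSIBLE
        if len(fields) == 6 and fields[1] == "mtu":
            low, high = fields[5].split("-")
            return int(low), int(high)
    raise ValueError("no mtu property found in output")


# Sample text copied from the output quoted in this thread.
SAMPLE = """\
LINK         PROPERTY   PERM   VALUE  DEFAULT  POSSIBLE
e1000g0      mtu        rw     1500   1500     1500-9216
"""

low, high = mtu_range(SAMPLE)
print("9000 within advertised range:", low <= 9000 <= high)
```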

> So I changed the MTU in the boot-time config in /usbkey/config:
> 
> # underlay_nic is the underlay for SDC fabric networking
> underlay_nic=b8:ae:ed:72:8a:17
> underlay0_vlan_id=4
> underlay0_ip=10.88.88.2
> underlay0_netmask=255.255.255.0
> underlay0_mtu=9000
> 
> But this leads to an error in the boot-up process.
> Or do I have to increase the MTU of the physical
> e1000g0 as well? Since the untagged e1000g0 
> is running the admin network I thought I was 
> supposed to leave that at 1500…

Is this SmartOS or SDC? Is this an SDC headnode?

Robert




Re: [smartos-discuss] SmartOS on NUC6I3SYH?

2016-03-04 Thread Robert Mustacchi
On 3/4/16 10:13 , Dirk Steinberg wrote:
> 
>> On 04.03.2016 at 18:27, Robert Mustacchi wrote:
>>
>> On 3/4/16 8:32 , Dirk Steinberg wrote:
>>>
>>>>> On the older Broadwell NUCs (I218) I was unable to set 
>>>>> any MTU higher than 1500. I found that disappointing,
>>>>> but maybe I did something wrong.
>>>>
>>>> The I218 in theory supports jumbo frames. We'll need to work out a bit
>>>> more information about what went wrong. The best starting point is to
>>>> get the output of dladm show-linkprop -p mtu on one of the e1000g's for
>>>> the I218.
>>>
>>> [root@nuc0 ~]# dladm show-phys
>>> LINK         MEDIA      STATE  SPEED  DUPLEX  DEVICE
>>> e1000g0      Ethernet   up     1000   full    e1000g0
>>> [root@nuc0 ~]# dladm show-linkprop -p mtu e1000g0
>>> LINK         PROPERTY   PERM   VALUE  DEFAULT  POSSIBLE
>>> e1000g0      mtu        rw     1500   1500     1500-9216
>>>
>>> Setting a higher MTU in a running system does not work 
>>> (I read that this is expected, although on Linux this has always worked)
>>>
>>> [root@nuc0 ~]# ifconfig e1000g0 mtu 9000
>>> ifconfig: setifmtu: SIOCSLIFMTU: e1000g0: Invalid argument
>>> [root@nuc0 ~]# dladm set-linkprop -p mtu=9000 e1000g0
>>> dladm: warning: cannot set link property 'mtu' on 'e1000g0': link busy
>>
>> Yes, this is a known limitation with that driver.
>>
>>> So I changed the MTU in the boot-time config in /usbkey/config:
>>>
>>> # underlay_nic is the underlay for SDC fabric networking
>>> underlay_nic=b8:ae:ed:72:8a:17
>>> underlay0_vlan_id=4
>>> underlay0_ip=10.88.88.2
>>> underlay0_netmask=255.255.255.0
>>> underlay0_mtu=9000
>>>
>>> But this leads to an error in the boot-up process.
>>> Or do I have to increase the MTU of the physical
>>> e1000g0 as well? Since the untagged e1000g0 
>>> is running the admin network I thought I was 
>>> supposed to leave that at 1500…
>>
>> Is this SmartOS or SDC? Is this an SDC headnode?
> 
> Well, this WAS supposed to be SDC, but the install failed
> because of the smbios / UUID stuff. So I went back to 
> SmartOS. But I kept the setup with admin untagged and all
> other nictags with tagged VLANs. The physical box has
> only one NIC, there is no way around this.

Okay. Well, part of the problem is that when you edit config files by
hand there's less checking, especially with SmartOS. Using
nictagadm(1M) will help a little bit with some of this but doesn't cover
everything.

So every VNIC that's created in SmartOS is defined by a NIC tag. The NIC
tag determines the maximum MTU of any VNIC we can create on top of it.
When we start up, we look at all the NIC tags defined on a NIC and set
the MTU of the physical device to the maximum of all of them. In this
case, you're seeing an error because no NIC tag defines an MTU of 9000.
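
To make that rule concrete, here is a minimal Python sketch of the logic
as described above. It is not the actual SmartOS startup code; the
function names, the tag names, and the 1500 default are assumptions
chosen for illustration.

```python
def physical_mtu(nic_tag_mtus, default_mtu=1500):
    """MTU for the physical device: the maximum over all NIC tags on it."""
    return max(nic_tag_mtus.values(), default=default_mtu)


def check_vnic(nic_tag_mtus, tag, vnic_mtu):
    """Raise if a VNIC asks for more than its NIC tag allows."""
    limit = nic_tag_mtus[tag]
    if vnic_mtu > limit:
        raise ValueError(
            f"VNIC MTU {vnic_mtu} exceeds NIC tag '{tag}' MTU {limit}")


# Hypothetical tags on a single-port box: nothing defines MTU 9000 yet.
tags = {"admin": 1500, "external": 1500}
print(physical_mtu(tags))
try:
    check_vnic(tags, "external", 9000)
except ValueError as err:
    print("error:", err)

# Once some tag on the port is defined with MTU 9000, the physical device
# is raised to match and a 9000-byte VNIC on that tag passes the check.
tags["underlay"] = 9000
print(physical_mtu(tags))
check_vnic(tags, "underlay", 9000)
```

The design point is that the physical MTU is derived from the tags, so
the fix is to define the MTU on a tag rather than on the port directly.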

I know in your case there's a single port, but this should be fine
presuming that you're running a platform that contains the fix for
OS-5146 (https://smartos.org/bugview/OS-5146).

> BTW, it would be nice for SmartOS to be less restrictive
> about the admin network and allow that to be a tagged VLAN
> as well. Apart from PXE-booting (which I do not use in this case)
> I see no argument against it and configuring the admin 
> VLAN (2 in my case) as untagged on the access port for the
> SmartOS server leads to more pain down the road:
> the normal default VLAN 1 that is the usual customer facing VLAN
> needs to be tagged if the admin VLAN 2 is untagged.
> But adding a VLAN to SmartOS with VLAN-id 1 is not
> supported in SmartOS and gives an error! WFT???!!
> Why can I not use VLAN 1? Is there any way around this?

I'm not sure why off hand that's the case, but I can certainly
understand why it's frustrating. Can you file a bug about this at
github.com/joyent/smartos-live/issues/ and we can dig into it?

Robert




Re: [smartos-discuss] Passing a SmartOS RAM Disk to a KVM

2016-03-10 Thread Robert Mustacchi
On 3/10/16 11:48 , Humberto Ramirez wrote:
> I would like to create a temporary RAMDisk with ramdiskadm for use
> inside a KVM, (Temporary disk intensive task). Am I crazy?? Any
> drawbacks?

There's no supported way to do this in SmartOS. Even worse, you'd still
have to go through virtio to get to the disk, which is likely more of
your problem than ZFS-based storage.

I'd suggest using your KVM guest's native tools for doing this and
allocating that guest extra DRAM and creating that ramdisk inside of the
guest.

Robert





Re: [smartos-discuss] How to set routing table in lx zone?

2016-03-11 Thread Robert Mustacchi
On 3/11/16 19:40 , 贺德嘉 wrote:
> It works! Thanks!

Alternatively you can set all the routes via vmadm(1M).
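
As a sketch of what that could look like: vmadm(1M) documents a "routes"
property that is changed on an existing VM through "set_routes". The
Python below only builds such an update payload; the destination and
gateway addresses are made-up examples, and I have not verified this
against an lx zone specifically.

```python
import json


def route_update_payload(routes):
    """Build a JSON payload for `vmadm update` that sets static routes.
    `routes` maps destinations (IP or CIDR) to gateways."""
    return json.dumps({"set_routes": routes}, indent=2)


# Hypothetical subnet and gateway, for illustration only.
print(route_update_payload({"10.2.2.0/24": "10.2.1.1"}))
```

The resulting JSON would then be passed to vmadm update for the zone's
UUID, either on stdin or via a file with -f.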

Robert

> Best regards,
> Dejia He
> ———
> www.briphant.com
> Briphant Technologies Co., Ltd.
> 
> From: Ian Collins
> Reply-To: 
> "smartos-discuss@lists.smartos.org"
> Date: Saturday, March 12, 2016 at 11:31 AM
> To: 
> "smartos-discuss@lists.smartos.org"
> Subject: Re: [smartos-discuss] How to set routing table in lx zone?
> 
> On 03/12/16 16:22, 贺德嘉 wrote:
> Dear all,
> 
> I’ve tried to set routing table in lx zone(centos 6/7), but it doesn’t work. 
> I got below result when I use route add/del command:
> 
> Try using /native/sbin/route
> 
> --
> Ian.
> 
> 




Re: [smartos-discuss] MongoDB segmentation fault on base-64-lts 15.4.0

2016-03-13 Thread Robert Mustacchi
On 3/13/16 3:38 , Eric wrote:
> 
> On 3/13/2016 4:31 AM, Ian Collins wrote:
>> On 03/13/16 21:16, Eric wrote:
>>
>> It's better to start a new thread than hijack an existing one!
> 
> Metadata! I changed to a new subject line, but the email client pulled
> something extra that was in the previous thread. Sorry about that.
> 
> 
>> Could the mismatch in CLI tools and the database be the problem?
> 
> The /opt/local/bin/mongo program comes with the mongodb package and not
> with the CLI tools. The CLI tools come with:
> 
> mongodump  mongoexport  mongofiles  mongoimport  mongooplog
> mongorestore  mongostat  mongotop
> 
> 
> And, the mongodb package comes with:
> 
> mongo  mongod  mongoperf  mongos  mongosniff
> 

You'll want to go through the core dump and see where it's blowing up,
which will be more important than the truss output as a starting point.
I'd suggest making it available and filing a bug on
github.com/joyent/pkgsrc/issues/.

Robert





Re: [smartos-discuss] Can not install Build20160317T000621Z HPE Proliant DL380 Gen9, 2 CPU (E5-2650 v3)

2016-03-22 Thread Robert Mustacchi
On 3/22/16 8:45 , Richard Elling wrote:
> we find that bge2 is toxic. more later today...

If one of you could boot -kd and get a stack trace for where we're dying
that'd be greatly appreciated and we can start tackling what's going on
there.

It looks like there are some other threads on HP x2apic issues that
we'll follow on there instead.

Robert





Re: [smartos-discuss] Can not install Build20160317T000621Z HPE Proliant DL380 Gen9, 2 CPU (E5-2650 v3)

2016-03-23 Thread Robert Mustacchi
On 3/23/16 2:00 , Benny Kjellgren wrote:
> Thank you
> 
> "-B pci-reprog=off" solved my problem with SmartOS not booting up.
> 
> And have changed from raidz1 with pair of mirrored disks LV
> to raidz2 with single disk raid 0 LV + one spare

I'm glad that you've got this workaround working; however, if either you
or Richard could still boot up the system with -kd without doing the -B
pci-reprog=off and relay where we're panicking, that'd be quite helpful.

I'd like to make sure we get to root cause on this so these workarounds
aren't necessary. Unfortunately right now we don't have enough
information to make progress on that.

Robert




Re: [smartos-discuss] 20151104T185720Z Hangs

2016-03-23 Thread Robert Mustacchi
On 3/21/16 8:38 , Karl Rossing wrote:
> Upgraded to 20160317T000621Z and it still hangs.
> 
> On 2016-03-21 7:09 AM, Karl Rossing wrote:
>> We have a Cisco Systems Inc R200-1120402 (2xE5540, 64GB of ram,
>> ST1000NM0011 drives) that we upgraded to joyent_20151104T185720Z
>>
>> The server stops responding, the console freezes and we can't ssh into
>> it. It did panic once with "panic message: I/O to pool 'zones'
>> appears to be hung" but has locked up twice since with no panic msg.
>>
>> I'm wondering if there is something between 20151104T185720Z and
>> 20160317T000621Z that might explain the problem?
>>
> 

The fact that you have the I/O deadman fire is not always a good sign.
That suggests that for some reason I/O has stopped. When this happens,
are you able to inject an NMI (non-maskable interrupt) into the system?
You can do this via the 'chassis power diag' ipmitool command.

That should force it to generate a dump and we can talk through how to
investigate what's going on there.
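
If you're driving the BMC remotely, the invocation might be assembled
along these lines. This is only a sketch: the "chassis power diag"
subcommand is the one mentioned above, while the hostname, the user name,
and the choice of the lanplus interface are placeholder assumptions, and
password handling is left out entirely.

```python
import shlex


def nmi_command(bmc_host, user, interface="lanplus"):
    """Build an ipmitool command line that asks the BMC to pulse a
    diagnostic interrupt (NMI) into the host."""
    return ["ipmitool", "-I", interface, "-H", bmc_host, "-U", user,
            "chassis", "power", "diag"]


# Placeholder BMC address and user; substitute your own.
print(shlex.join(nmi_command("bmc.example.com", "admin")))
```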

Robert





Re: [smartos-discuss] OpenJDK 1.8 SSL error

2016-03-23 Thread Robert Mustacchi
On 3/22/16 4:20 , the outsider wrote:
> The OpenJDK version openjdk8-1.8.51nb1 of SmartOS doesn't support HTTPS
> connections when used in conjunction with Wildfly 10. 
> 
> ( I suppose with any Java based webserver it will fail) 
> 
>  
> 
> This is related to bug https://bugzilla.redhat.com/show_bug.cgi?id=1167153
> 
>  
> 
> 2016-03-21 10:11:30,884 ERROR [org.xnio.nio] (default I/O-37) XNIO11:
> Task io.undertow.protocols.ssl.SslConduit$4$1@82f8096 failed with an
> exception: java.lang.Error: BAD
>     at sun.security.ssl.HandshakeHash.getFinishedHash(HandshakeHash.java:249)
>     at sun.security.ssl.HandshakeMessage$Finished.getFinished(HandshakeMessage.java:1940)
>     at sun.security.ssl.HandshakeMessage$Finished.verify(HandshakeMessage.java:1909)
>     at sun.security.ssl.ServerHandshaker.clientFinished(ServerHandshaker.java:1679)
>     at sun.security.ssl.ServerHandshaker.processMessage(ServerHandshaker.java:305)
>     at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
>     at sun.security.ssl.Handshaker.process_record(Handshaker.java:914)
>     at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1025)
>     at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:907)
>     at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
>     at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
>     at io.undertow.protocols.ssl.SslConduit.doUnwrap(SslConduit.java:705)
>     at io.undertow.protocols.ssl.SslConduit.doWrap(SslConduit.java:789)
>     at io.undertow.protocols.ssl.SslConduit.doHandshake(SslConduit.java:609)
>     at io.undertow.protocols.ssl.SslConduit.access$600(SslConduit.java:63)
>     at io.undertow.protocols.ssl.SslConduit$4$1.run(SslConduit.java:982)
>     at org.xnio.nio.WorkerThread.safeRun(WorkerThread.java:580)
>     at org.xnio.nio.WorkerThread.run(WorkerThread.java:464)
> 
>  
> 
> I installed the latest Oracle JDK yesterday on my zone and Wildfly ran
> within 1 minute with HTTPS enabled. 
> 
>  
> 
> Should I Jira this? 

Thanks for reporting this. Could you file a bug on
github.com/joyent/pkgsrc/issues please? If you happened to have a series
of steps to reproduce this so we can verify when we've fixed the issue,
that'd be greatly appreciated.

Thanks,
Robert





Re: [smartos-discuss] Fwd: Adding a new disk - too hard :(

2016-03-23 Thread Robert Mustacchi
On 3/20/16 16:32 , David Preece wrote:
> Any ideas on this? If someone can at least point me at the most similar 
> driver then I can start to work on it. Maybe.
> 
>> Begin forwarded message:
>>
>> I'm trying to add a PCIe SSD to act as an L2ARC but it's not going well. The 
>> device is found at boot:
>> Shows up in the picl tree (abbreviated)
>> And under cfgadm (abbreviated)
>> But if I try to configure the disk:
>>
>> [root@i7 ~]# cfgadm -c configure sata1/7
>> cfgadm: Hardware specific failure: Failed to config device at ap_id: 
>> /devices/pci@0,0/pci8086,3b42@1c/pci1b4b,9230@0:7

Hi,

Unfortunately I'm not very familiar with these devices. But I think the
place I'd start is working out why we're detecting a Marvell Virtual
device, as well as using DTrace to see what exactly cfgadm is failing to
configure. I suspect that https://www.illumos.org/issues/4904 is related.

Robert





Re: [smartos-discuss] Can not install Build20160317T000621Z HPE Proliant DL380 Gen9, 2 CPU (E5-2650 v3)

2016-03-23 Thread Robert Mustacchi
On 3/23/16 8:22 , Richard Elling wrote:
> 
>> On Mar 23, 2016, at 6:59 AM, Robert Mustacchi  wrote:
>>
>> On 3/23/16 2:00 , Benny Kjellgren wrote:
>>> Thank you
>>>
>>> "-B pci-reprog=off" solved my problem with SmartOS not booting up.
>>>
>>> And have changed from raidz1 with pair of mirrored disks LV
>>> to raidz2 with single disk raid 0 LV + one spare
>>
>> I'm glad that you've got this workaround working; however, if either you
>> or Richard could still boot up the system with -kd without doing the -B
>> pci-reprog=off and relay where we're panicking, that'd be quite helpful.
> 
> Unfortunately, there is no panic. If there were a panic, we'd be well on our 
> way :-)

So you're seeing a hard reset of the system then?

Can you break through bge_attach() and step over parts of it to see
where we're going awry?

>>
>> I'd like to make sure we get to root cause on this so these workarounds
>> aren't necessary. Unfortunately right now we don't have enough
>> information to make progress on that.
> 
> Agree, and there is precious little public doc on how these things are built.
> Strange PCI configurations are known to cause problems in the past, leading
> to unpredictable behaviour. We suspect similar here. We'll map out the PCI
> fabric as a next step, however that will be model specific.

Okay, please keep us posted as soon as you have any additional
information. I'd like to make sure we don't lose track of these and get
them root caused (along with the x2apic issues).

Robert




Re: [smartos-discuss] FreeBSD KVM image

2016-03-29 Thread Robert Mustacchi
On 3/29/16 3:43 , G B via smartos-discuss wrote:
> Last week I installed the FreeBSD 10.2 image for KVM provided by Joyent.  
> What I really like is being able to do 'vmadm console UUID' and not have to 
> use VNC.  Maybe I've done something wrong in the past, but I don't recall 
> being able to do that.  

I think this was something that we were trying to make sure we got added
to all the images at some point, but may have missed the FBSD one.

Christopher, can you make sure a bug is opened on that if it isn't already?

Thanks,
Robert





Re: [smartos-discuss] vmadm create: timed out waiting for /var/svc/provisioning to move for $uuid

2016-03-31 Thread Robert Mustacchi
On 3/31/16 2:06 , Stefan wrote:
> We have two identical machines (A and B) both equipped with 192 GiB RAM.
>  The first one runs 10 VMs using 16,000 MB RAM each plus two smaller VMs
> with less than 3 GiB RAM:
> 
> 
>    [root@A ~]# echo ::memstat | mdb -k
>    Page Summary          Pages      MB  %Tot
>    ------------       --------  ------  ----
>    Kernel              3726258   14555    7%
>    Boot pages            68306     266    0%
>    ZFS File Data       2976930   11628    6%
>    Anon               41622277  162587   83%
>    Exec and libs          5473      21    0%
>    Page cache            22635      88    0%
>    Free (cachelist)      25908     101    0%
>    Free (freelist)     1879280    7340    4%
> 
>    Total              50327067  196590
>    Physical           50327066  196590
> 
>    [root@A ~]# mdb -ke 'availrmem/D ; pages_pp_maximum/D'
>    availrmem:
>    availrmem:          2350836
>    pages_pp_maximum:
>    pages_pp_maximum:   1947905
> 
> There are 2,350,836 pages of 4,096 bytes available resident memory which
> seems reasonable.  However, on machine B we could create no more than
> six VMs with 16,000 MB RAM before we get timeouts.  Here are its figures:
> 
>    [root@B ~]# echo ::memstat | mdb -k
>    Page Summary          Pages      MB  %Tot
>    ------------       --------  ------  ----
>    Kernel              2532099    9891    5%
>    Boot pages            68306     266    0%
>    ZFS File Data        169799     663    0%
>    Anon               26373657  103022   52%
>    Exec and libs          4432      17    0%
>    Page cache            14865      58    0%
>    Free (cachelist)      13798      53    0%
>    Free (freelist)    21150107   82617   42%
> 
>    Total              50327063  196590
>    Physical           50327062  196590
> 
> 
>    [root@B ~]# mdb -ke 'availrmem/D ; pages_pp_maximum/D'
>    availrmem:
>    availrmem:          8122046
>    pages_pp_maximum:
>    pages_pp_maximum:   1947905
> 
> On this machine vmadm create failed with
> 
>timed out waiting for /var/svc/provisioning to move for $uuid
> 
> A: joyent_20160218T022556Z
> B: joyent_20160317T000621Z
> 
> Any ideas?

What is the sizing of your swap device? e.g. zfs list zones/swap
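
As an aside, the availrmem and pages_pp_maximum values quoted above are
page counts, so they are easier to compare once converted. A small Python
sketch, assuming the 4,096-byte page size stated in the original message
(the function name is just for illustration):

```python
PAGE_SIZE = 4096  # bytes per page, as stated in the quoted message


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count into MiB."""
    return pages * page_size / (1024 * 1024)


# Values copied from the mdb output quoted above.
for label, pages in (("A availrmem", 2350836),
                     ("B availrmem", 8122046),
                     ("pages_pp_maximum", 1947905)):
    print(f"{label:<18} {pages_to_mib(pages):10.1f} MiB")
```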

Robert





Re: [smartos-discuss] Tips for Docker on SmartOS

2016-03-31 Thread Robert Mustacchi
On 3/31/16 22:16 , Eric Ripa wrote:
> Hi,
> 
> I’ve been running a single SmartOS host as my home server for almost a year. 
> While I’m super happy with the setup I would like to start running Docker 
> experiments in a more clean way than setting up KVM hosts and doing the 
> manual shenanigans. I know there is a somewhat finished process for SDC to do 
> this and I know you can manually import containers via imgadm and creating 
> vmadm files for it, but I’m looking for a more polished process.

Hi,

Can you describe what you don't find polished about SDC here? Is it just
that you don't want to run SDC as opposed to SmartOS?

In terms of launching it, once SDC is set up, you just point the stock
docker tools at SDC by setting the host URL.

Thanks,
Robert




Re: [smartos-discuss] vmadm create: timed out waiting for /var/svc/provisioning to move for $uuid

2016-04-06 Thread Robert Mustacchi
On 4/6/16 1:26 , Stefan wrote:
> On 31.03.2016 at 18:30, Robert Mustacchi wrote:
>> On 3/31/16 2:06 , Stefan wrote:
>>> We have two identical machines (A and B) both equipped with 192 GiB RAM.
>>>  The first one runs 10 VMs using 16,000 MB RAM each plus two smaller VMs
>>> with less than 3 GiB RAM:
> 
>>> There are 2,350,836 pages of 4,096 bytes available resident memory which
>>> seems reasonable.  However, on machine B we could create no more than
>>> six VMs with 16,000 MB RAM before we get timeouts.  Here are its
>>> figures:
> 
>>> On this machine vmadm create failed with
> 
>>> Any ideas?
>>
>> What is the sizing of your swap device? e.g. zfs list zones/swap
> 
> The machines were configured with different swap sizes:
> 
>[root@A ~]# swap -sh
>total: 159G allocated + 668M reserved = 160G used, 49G available
> 
>[root@B ~]# swap -sh
>total: 101G allocated + 95M reserved = 101G used, 30G available
> 
> After adjusting B's swap the timeouts are gone.

Okay, great. Glad that's worked out.

Robert





Re: [smartos-discuss] KVM shutdown while rsyncing files

2016-04-06 Thread Robert Mustacchi
On 4/6/16 0:05 , Kilian Ries wrote:
> -  lwp# 85 / thread# 85  
> 
>  df7fff29f0ea _lwp_kill () + a
> 
>  df7fff2338f0 raise (6) + 20
> 
>  df7fff20db78 abort () + 98
> 
>  0054b172 qemu_oom_check (0) + 49
> 
>  0054b1ab qemu_memalign (200, 7e) + 33
> 
>  00508a5d qemu_blockalign (f9dc70, 7e) + 4f
> 
>  0050c485 handle_aiocb_rw (9c51b5570) + c2
> 
>  0050c770 aio_thread (0) + 166
> 
>  df7fff297b5a _thrp_setup (df7fff079240) + 8a
> 
>  df7fff297e70 _lwp_start ()

So based on this thread I think I have an idea of what's happening and
an idea of how to solve it.

Originally we didn't have preadv / pwritev in illumos, and when we
first added them, the number of IOVECS we used was variable and QEMU
didn't really respect IOVEC_MAX. This matters because what QEMU appears
to be doing here is deciding that, since it has an I/O vector that it
can't send as-is, it will allocate a large amount of memory and copy
everything into one contiguous buffer that it can send.

So, in this case I think what we can do is lift the preadv / pwritev
restrictions that were put in place originally. This has the advantage
that it should reduce the burden of memory allocation on QEMU and thus
speed up the I/O processing a bit.
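
To illustrate why a vector-count limit turns into a large allocation,
here is a toy Python sketch. It is not QEMU's code, and the batching
function is not the fix proposed above (which is to lift the restriction
in the platform); the buffer sizes and the limit of 16 are made-up
numbers. It only contrasts copying everything into one contiguous buffer
with handing the same vectors over in groups that fit a limit.

```python
def coalesce(buffers):
    """Fallback seen in the backtrace, in spirit: copy every segment
    into one freshly allocated contiguous buffer."""
    return b"".join(buffers)


def batches(buffers, iov_limit):
    """Alternative with no copy: group the segments so that each group
    fits within a per-call vector limit."""
    if iov_limit < 1:
        raise ValueError("iov_limit must be at least 1")
    return [buffers[i:i + iov_limit]
            for i in range(0, len(buffers), iov_limit)]


# Hypothetical request: 40 segments of 512 bytes, limit of 16 per call.
segments = [bytes(512)] * 40
print("extra bytes allocated by coalescing:", len(coalesce(segments)))
print("segments per vectored call:", [len(b) for b in batches(segments, 16)])
```

The larger the request, the larger the one-off allocation in the first
approach, which is where an out-of-memory abort can come from.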

If I were able to produce a platform or a QEMU binary to test this
against, would you be in a position to run this again, given that it
seems to reproduce fairly frequently for you? It might be a couple days
before I could get around to that.

Robert





Re: [smartos-discuss] KVM shutdown while rsyncing files

2016-04-06 Thread Robert Mustacchi
On 4/6/16 13:12 , Kilian Ries wrote:
> Hi Robert,
> 
> sounds great. Testing should be no problem, i’m able to update my platform 
> image or change the QEMU binary.

Okay. Would you mind opening a bug on
http://github.com/joyent/smartos-live to track this? I'll update it
when I have something put together. Hopefully by the end of this week,
but given what's going on it may be sometime next week.

Thanks,
Robert

> On 06.04.16 at 16:49, "Robert Mustacchi" wrote:
> 
>> On 4/6/16 0:05 , Kilian Ries wrote:
>>> -  lwp# 85 / thread# 85  
>>>
>>>  df7fff29f0ea _lwp_kill () + a
>>>
>>>  df7fff2338f0 raise (6) + 20
>>>
>>>  df7fff20db78 abort () + 98
>>>
>>>  0054b172 qemu_oom_check (0) + 49
>>>
>>>  0054b1ab qemu_memalign (200, 7e) + 33
>>>
>>>  00508a5d qemu_blockalign (f9dc70, 7e) + 4f
>>>
>>>  0050c485 handle_aiocb_rw (9c51b5570) + c2
>>>
>>>  0050c770 aio_thread (0) + 166
>>>
>>>  df7fff297b5a _thrp_setup (df7fff079240) + 8a
>>>
>>>  df7fff297e70 _lwp_start ()
>>
>> So based on this thread I think I have an idea of what's happening and
>> an idea of how to solve it.
>>
>> Originally we didn't have preadv / pwritev in illumos and then when we
>> initially added it, the amount of IOVECS we used was variable and QEMU
>> didn't really respect IOVEC_MAX. Now, this matters because what QEMU
>> appears to be doing here is saying because it has an I/O vector that it
>> can't send, it's going to go ahead and try to basically allocate a large
>> amount of memory to make it all one contiguous amount that it can send.
>>
>> So, in this case I think what we can do is actually release the preadv /
>> pwritev restrictions that came into place originally. This has the
>> advantage that it should reduce the burden of memory allocation on qemu
>> and thus speed up a bit of the I/O processing.
>>
>> If I were able to produce a platform or a QEMU binary to test this
>> against, would you be in a position to run this again, given that it
>> seems to reproduce fairly frequently for you? It might be a couple days
>> before I could get around to that.
>>
>> Robert
>>
> 
> 




Re: [smartos-discuss] Link Aggregation Problem

2016-04-08 Thread Robert Mustacchi
On 4/8/16 6:39 , Kilian Ries wrote:
> When I configure a link aggregation, I see on the switch that the 
> SmartOS host is sending some packets from MAC-address1 and some packets from 
> MAC-address2. The normal behaviour should be that it only sends from one 
> MAC address.

Hi,

You're right, that is a bit surprising. If you run dladm show-aggr -x,
you should see what we think the various MAC addresses involved are.

On which device do you see the improper MAC address going out? I'd
suggest using snoop in the global zone. Also, does this happen with all
traffic, only specific flows, or something else?

Thanks,
Robert




Re: AW: [smartos-discuss] Link Aggregation Problem

2016-04-08 Thread Robert Mustacchi
On 4/8/16 7:50 , Kilian Ries wrote:
> Hi Robert,
> 
> Seems like a driver problem to me... I found some other users in the illumos 
> forum who have problems with that chipset.
> 
> 
> dladm show-aggr -x
> LINK    PORT    SPEED   DUPLEX  STATE  ADDRESS            PORTSTATE
> aggr0   --      1000Mb  full    up     44:a8:42:34:87:63  --
>         bge0    1000Mb  full    up     44:a8:42:34:87:63  attached
>         bge1    1000Mb  full    up     44:a8:42:34:87:64  attached
> 
> 
> The strange thing is:
> 
> While I'm running "snoop -d aggr0" I am able to ping every other host in the 
> subnet and the outgoing MAC address seems to be right. When I cancel 
> snoop, the failure is back again and I can only ping some hosts in the subnet.

Rather than run snoop on the aggr, what happens if you run it on the
individual bge devices? Separately, from your switch, which device is it
that we're seeing the wrong MAC on? I'd presume it's bge1. From the
switch, is there any pattern to the traffic with the incorrect MAC
address?

> I think i'm going to put another network-card into the host ...

If possible, could we do a bit more debugging before you do that?
Otherwise I'm afraid we'll never get to the root cause here.

Robert




Re: AW: AW: [smartos-discuss] Link Aggregation Problem

2016-04-11 Thread Robert Mustacchi
On 4/11/16 5:00 , Kilian Ries wrote:
> @ Robert Mustacchi
> 
> Sure, we can dig a little more into that. If I run snoop against the bge0 
> interface, nothing changes and I am still unable to ping the other host in my 
> subnet:
> 
> ###
> $snoop -d bge0
> 
> $ping 192.168.234.20
> no answer from 192.168.234.20
> ###
> 
> If i run snoop on the bge1 interface i'm able to ping:
> 
> ###
> $snoop -d bge1
> 
> $ping 192.168.234.20
> 192.168.234.20 is alive
> ###
> 
> 
> Some further testing showed that the ICMP request leaves the host via the 
> bge0 interface but returns via bge1:
> 
> ###
> $snoop -d bge0 |grep ICMP
> Using device bge0 (promiscuous mode)
> hostname -> 192.168.234.20 ICMP Echo request (ID: 6326 Sequence number: 0)
> 
> $snoop -d bge1 |grep ICMP
> Using device bge1 (promiscuous mode)
> 192.168.234.20 -> hostname ICMP Echo reply (ID: 6326 Sequence number: 0)
> ###
> 
> 
> Normally it should return on the same interface it was sent from (which is 
> the case if i try to ping the gateway):

Because of hashing and the potential for different hashing algorithms
going on, I don't necessarily know if I expect that; however, I would
expect it to be consistent for a given flow.
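
To make that concrete, here is a minimal sketch of hash-based port
selection. It is a toy model, not the algorithm illumos or any
particular switch uses; the pick_port helper, the CRC32 hash, and the
"switch" salt are all assumptions for illustration. Each side maps a
flow to a port with its own hash, so the outbound and return ports can
differ, but the same flow always maps to the same port on a given side.

```python
import zlib


def pick_port(ports, src, dst, salt=b""):
    """Map a flow (src, dst) to one port; same inputs always give the same port."""
    key = salt + src.encode() + b"|" + dst.encode()
    return ports[zlib.crc32(key) % len(ports)]


if __name__ == "__main__":
    ports = ["bge0", "bge1"]
    flows = [
        ("hostname", "192.168.234.1"),
        ("hostname", "192.168.234.20"),
    ]
    for src, dst in flows:
        # The host hashes the outbound direction. The "switch" hashes the
        # reply direction with its own (hypothetical) salt, standing in for
        # a different hashing algorithm on the other end of the aggregation.
        out_port = pick_port(ports, src, dst)
        in_port = pick_port(ports, dst, src, salt=b"switch")
        print(f"{src} -> {dst}: out on {out_port}, reply in on {in_port}")
```

Running it repeatedly prints the same ports for each flow, which is the
consistency I'd expect to see even when the two directions use
different links.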

> ###
> $snoop -d bge0 |grep ICMP
> Using device bge0 (promiscuous mode)
> hostname -> 192.168.234.1 ICMP Echo request (ID: 6337 Sequence number: 0)
> 192.168.234.1 -> hostname ICMP Echo reply (ID: 6337 Sequence number: 0)
> ###
> 
> 
> So the problem only occurs on some IPs, not on all:
> 
> ping 192.168.234.1 (gateway) works
> ping 192.168.235.16 (other smartos host) works
> ping 192.168.234.20 (other smartos host) doesn't work

Okay, this helps. I have a working theory for what might be happening. I
should have asked for this initially, but could you run
snoop -d bge0 -o /path/to/some/file icmp, then do a few pings of the
kind that go out on bge0 and come back in on bge1, and then ctrl+c and
make that snoop file available please?

Thanks,
Robert



