Re: [OmniOS-discuss] Re-installing OmniOS after Crash, Errors with pkg [Subject Edited]

2013-10-29 Thread Sam M
Hi Eric,

I'm OK with bloody.

The problem I'm having seems minor: I only need to know where I can
find OmniTI's CA root certificate so that I can install it locally.

Sam


On 29 October 2013 19:19, Eric Sproul  wrote:

> On Tue, Oct 29, 2013 at 4:00 AM, Sam M  wrote:
> > Hello Chris.
> >
> > Any update on this? Kindly let us know when the process is completed and I
> > can update.
>
> The unstable (aka bloody) repo is frequently, well, unstable.  If this
> is inconvenient for you, then the stable release (currently r151006,
> soon to be r151008) would be a better choice.
>
> Eric
>
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Tobias Oetiker
Hi Eric,

Today Eric Sproul wrote:

> On Tue, Oct 29, 2013 at 3:10 PM, Tobias Oetiker  wrote:
> > the troubling bit is that during the outage, the kvm hosts on
> > akami0 and nigiri0 were able to talk to the physical network just
> > fine, but they were not able to talk to fugu0 ...  and this is all
> > happening inside the crossbow switch within illumos if I
> > understand the concept correctly ...
>
> I'm not an expert on the Crossbow stack, but essentially I think this
> is correct, so perhaps we're looking at a VNIC issue with fugu0 and
> not anything to do with hardware.

that's what I was afraid of ... maybe some 'bad' packets coming in
over the network (it is a pretty wild setup there) are somehow confusing
the stack ...

> Since you're not using aggregate links, it should be possible to let
> the global zone use igb0 directly without disturbing the KVM vnics.  I
> do this on a number of dev systems.  It might be worth a try to move
> the address fugu0/v4static to igb0/v4static, though you'll of course
> need out-of-band or console access to do that.

did that now ... no need for a console btw, if you are cool :-)
(and have a console just in case)

  ipadm delete-addr fugu0/v4static; ipadm create-addr -T static -a 10.10.10.1/30 igb0/v4static
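
[Editor's sketch] The address move above can be previewed before anything is changed. The helper below only prints the two ipadm steps as a single command line, so they can be reviewed and then pasted in one go. The name plan_addr_move is hypothetical and not part of ipadm; the ipadm subcommands and flags are the ones used in the message above.

```shell
#!/bin/sh
# Sketch only: print (do not run) the commands for moving a static
# address from one address object to another.
# plan_addr_move is a hypothetical helper, not an ipadm feature.
plan_addr_move() {
    old_obj=$1   # e.g. fugu0/v4static
    new_obj=$2   # e.g. igb0/v4static
    cidr=$3      # e.g. 10.10.10.1/30
    if [ -z "$old_obj" ] || [ -z "$new_obj" ] || [ -z "$cidr" ]; then
        echo "usage: plan_addr_move OLD_ADDROBJ NEW_ADDROBJ ADDR/PREFIX" >&2
        return 1
    fi
    # Both steps go on one line so the shell has read the second step
    # before the first one interrupts the remote session.
    printf 'ipadm delete-addr %s; ipadm create-addr -T static -a %s %s\n' \
        "$old_obj" "$cidr" "$new_obj"
}
```

Calling plan_addr_move fugu0/v4static igb0/v4static 10.10.10.1/30 prints the command line shown in the message. Having console access as a fallback is still advisable.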

cheers
tobi
>
> Eric
>
>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch t...@oetiker.ch ++41 62 775 9902 / sb: -9900


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Eric Sproul
On Tue, Oct 29, 2013 at 3:55 PM, Tobias Oetiker  wrote:
> this is how we had it set up originally :-) we thought that it might be
> better to hook zone0 into the virtual switch ... for performance
> since the kvms are using nfs resources from zone0 ...
>
> but you are right, it might make sense to switch back ...

AFAIK, unless you are actually using etherstubs, there is no "virtual
switch" in the strict sense-- there's just a pathway out of one
virtualized TCP/IP stack in the kernel (the one for the KVM vnic) and
into another (fugu0).  I'm not sure it's any different if the
destination address is over a hardware interface as opposed to a vnic,
but it seems simpler to my eyes.  I am not a kernel engineer though.
:)

Eric


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Tobias Oetiker
Today Eric Sproul wrote:

> On Tue, Oct 29, 2013 at 3:10 PM, Tobias Oetiker  wrote:
> > the troubling bit is that during the outage, the kvm hosts on
> > akami0 and nigiri0 were able to talk to the physical network just
> > fine, but they were not able to talk to fugu0 ...  and this is all
> > happening inside the crossbow switch within illumos if I
> > understand the concept correctly ...
>
> I'm not an expert on the Crossbow stack, but essentially I think this
> is correct, so perhaps we're looking at a VNIC issue with fugu0 and
> not anything to do with hardware.
>
> Since you're not using aggregate links, it should be possible to let
> the global zone use igb0 directly without disturbing the KVM vnics.  I
> do this on a number of dev systems.  It might be worth a try to move
> the address fugu0/v4static to igb0/v4static, though you'll of course
> need out-of-band or console access to do that.

this is how we had it set up originally :-) we thought that it might be
better to hook zone0 into the virtual switch ... for performance
since the kvms are using nfs resources from zone0 ...

but you are right, it might make sense to switch back ...

cheers
tobi

>
> Eric
>
>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch t...@oetiker.ch ++41 62 775 9902 / sb: -9900


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Eric Sproul
On Tue, Oct 29, 2013 at 3:10 PM, Tobias Oetiker  wrote:
> the troubling bit is that during the outage, the kvm hosts on
> akami0 and nigiri0 were able to talk to the physical network just
> fine, but they were not able to talk to fugu0 ...  and this is all
> happening inside the crossbow switch within illumos if I
> understand the concept correctly ...

I'm not an expert on the Crossbow stack, but essentially I think this
is correct, so perhaps we're looking at a VNIC issue with fugu0 and
not anything to do with hardware.

Since you're not using aggregate links, it should be possible to let
the global zone use igb0 directly without disturbing the KVM vnics.  I
do this on a number of dev systems.  It might be worth a try to move
the address fugu0/v4static to igb0/v4static, though you'll of course
need out-of-band or console access to do that.

Eric


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Tobias Oetiker
Hi Eric,

Today Eric Sproul wrote:

> On Tue, Oct 29, 2013 at 10:06 AM, Tobias Oetiker  wrote:
> > ADDROBJ           TYPE     STATE        ADDR
> > lo0/v4            static   ok           127.0.0.1/8
> > fugu0/v4static    static   ok           zzz.yy.8.5/23
> > fugu1/v4static    static   ok           10.10.10.1/30
> > lo0/v6            static   ok           ::1/128
> >
> > the dropout does not coincide with a big backup job ... I am
> > running collectd on the omnios host, and it has been faithfully
> > recording what happened on the interface while it was offline.
> >
> > The traffic stats show that packets have been coming into fugu0 but
> > only very few got sent out ... (if it happens again I will do a
> > snoop on the interface)
>
> A snoop would be my first tactic too.  Are these VNICs using VLAN tags
> or is everything untagged?

will do when I get access next ...

> >> For good measure, let's also look at `prtconf -d` to see what this igb
> >> hardware is.
> >
> > pci8086,1d10 (pciex8086,1d10) [Intel Corporation C600/X79 series chipset PCI Express Root Port 1], instance #6
> > pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #0
> > pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #1
> > pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #2
> > pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #3
> >
> > note that the kvm hosts were able to talk via igb0 while fugu (zone0) was not.
>
> OK, so this is I350, for which support should be pretty stable (it's
> been in upstream illumos for over a year and I know Joyent deploys
> I350 heavily in their public cloud).  I don't see any open issues on
> igb or I350 that would be relevant here.

the troubling bit is that during the outage, the kvm hosts on
akami0 and nigiri0 were able to talk to the physical network just
fine, but they were not able to talk to fugu0 ...  and this is all
happening inside the crossbow switch within illumos if I
understand the concept correctly ...

cheers
tobi

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch t...@oetiker.ch ++41 62 775 9902 / sb: -9900


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Eric Sproul
On Tue, Oct 29, 2013 at 10:06 AM, Tobias Oetiker  wrote:
> ADDROBJ           TYPE     STATE        ADDR
> lo0/v4            static   ok           127.0.0.1/8
> fugu0/v4static    static   ok           zzz.yy.8.5/23
> fugu1/v4static    static   ok           10.10.10.1/30
> lo0/v6            static   ok           ::1/128
>
> the dropout does not coincide with a big backup job ... I am
> running collectd on the omnios host, and it has been faithfully
> recording what happened on the interface while it was offline.
>
> The traffic stats show that packets have been coming into fugu0 but
> only very few got sent out ... (if it happens again I will do a
> snoop on the interface)

A snoop would be my first tactic too.  Are these VNICs using VLAN tags
or is everything untagged?

>
>> For good measure, let's also look at `prtconf -d` to see what this igb
>> hardware is.
>
> pci8086,1d10 (pciex8086,1d10) [Intel Corporation C600/X79 series chipset PCI Express Root Port 1], instance #6
> pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #0
> pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #1
> pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #2
> pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #3
>
> note that the kvm hosts were able to talk via igb0 while fugu (zone0) was not.

OK, so this is I350, for which support should be pretty stable (it's
been in upstream illumos for over a year and I know Joyent deploys
I350 heavily in their public cloud).  I don't see any open issues on
igb or I350 that would be relevant here.

Eric


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Tobias Oetiker
Today Eric Sproul wrote:

> On Tue, Oct 29, 2013 at 6:21 AM, Tobias Oetiker  wrote:
> > root@fugu:~# dladm show-link
> > LINK        CLASS     MTU    STATE    BRIDGE     OVER
> > igb0        phys      1500   up       --         --
> > igb1        phys      1500   up       --         --
> > igb2        phys      1500   unknown  --         --
> > igb3        phys      1500   unknown  --         --
> > akami0      vnic      1500   up       --         igb0
> > nigiri0     vnic      1500   up       --         igb0
> > fugu0       vnic      1500   up       --         igb0
> > fugu1       vnic      1500   up       --         igb1
> >
> > the interfaces akami0 and nigiri0 are assigned to two kvm hosts
> > fugu0 is used by the omnios host and fugu1 is a direct link to a
> > second omnios host for zfs send receive backups.
>
> Could we see the output of `ipadm show-addr` in the global zone?  If
> not, are fugu0 and fugu1 in the same subnet?  Does the drop-out
> coincide with any other usage patterns, such as an active backup over
> fugu1?

ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
fugu0/v4static    static   ok           zzz.yy.8.5/23
fugu1/v4static    static   ok           10.10.10.1/30
lo0/v6            static   ok           ::1/128

the dropout does not coincide with a big backup job ... I am
running collectd on the omnios host, and it has been faithfully
recording what happened on the interface while it was offline.

The traffic stats show that packets have been coming into fugu0 but
only very few got sent out ... (if it happens again I will do a
snoop on the interface)
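
[Editor's sketch] The symptom described above (packets arriving on fugu0 while almost nothing is sent) can be expressed as a check on two counter samples, so that a capture can be started while the outage is still in progress. The function name tx_stalled and both thresholds are illustrative assumptions. The commented usage assumes that dladm show-link accepts -s together with -p -o ipackets,opackets on this release, which should be verified locally.

```shell
#!/bin/sh
# tx_stalled PREV_IN PREV_OUT CUR_IN CUR_OUT [MIN_IN] [MAX_OUT]
# Succeeds (exit 0) when, between two samples, at least MIN_IN packets
# were received but no more than MAX_OUT were sent.
# The default thresholds (100 in, 5 out) are arbitrary examples.
tx_stalled() {
    min_in=${5:-100}
    max_out=${6:-5}
    d_in=$(( $3 - $1 ))     # packets received since the previous sample
    d_out=$(( $4 - $2 ))    # packets sent since the previous sample
    [ "$d_in" -ge "$min_in" ] && [ "$d_out" -le "$max_out" ]
}

# Possible use on the host (untested assumption about dladm output):
#   prev=$(dladm show-link -s -p -o ipackets,opackets fugu0)
#   sleep 10
#   cur=$(dladm show-link -s -p -o ipackets,opackets fugu0)
#   if tx_stalled "${prev%%:*}" "${prev##*:}" "${cur%%:*}" "${cur##*:}"; then
#       snoop -d fugu0 -o /var/tmp/fugu0.snoop
#   fi
```

The comparison uses counter deltas rather than absolute values, so it does not depend on how long the link has been up.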

> For good measure, let's also look at `prtconf -d` to see what this igb
> hardware is.

pci8086,1d10 (pciex8086,1d10) [Intel Corporation C600/X79 series chipset PCI Express Root Port 1], instance #6
pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #0
pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #1
pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #2
pci8086,3584 (pciex8086,1521) [Intel Corporation I350 Gigabit Network Connection], instance #3

note that the kvm hosts were able to talk via igb0 while fugu (zone0) was not.

cheers
tobi

>
> Eric
>
>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch t...@oetiker.ch ++41 62 775 9902 / sb: -9900


Re: [OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Eric Sproul
On Tue, Oct 29, 2013 at 6:21 AM, Tobias Oetiker  wrote:
> root@fugu:~# dladm show-link
> LINK        CLASS     MTU    STATE    BRIDGE     OVER
> igb0        phys      1500   up       --         --
> igb1        phys      1500   up       --         --
> igb2        phys      1500   unknown  --         --
> igb3        phys      1500   unknown  --         --
> akami0      vnic      1500   up       --         igb0
> nigiri0     vnic      1500   up       --         igb0
> fugu0       vnic      1500   up       --         igb0
> fugu1       vnic      1500   up       --         igb1
>
> the interfaces akami0 and nigiri0 are assigned to two kvm hosts
> fugu0 is used by the omnios host and fugu1 is a direct link to a
> second omnios host for zfs send receive backups.

Could we see the output of `ipadm show-addr` in the global zone?  If
not, are fugu0 and fugu1 in the same subnet?  Does the drop-out
coincide with any other usage patterns, such as an active backup over
fugu1?

For good measure, let's also look at `prtconf -d` to see what this igb
hardware is.

Eric


Re: [OmniOS-discuss] Re-installing OmniOS after Crash, Errors with pkg [Subject Edited]

2013-10-29 Thread Eric Sproul
On Tue, Oct 29, 2013 at 4:00 AM, Sam M  wrote:
> Hello Chris.
>
> Any update on this? Kindly let us know when the process is completed and I
> can update.

The unstable (aka bloody) repo is frequently, well, unstable.  If this
is inconvenient for you, then the stable release (currently r151006,
soon to be r151008) would be a better choice.

Eric


[OmniOS-discuss] omnios host goes suddenly silent on the network

2013-10-29 Thread Tobias Oetiker
Folks,

I am running omnios OmniOS 5.11 omnios-8d266aa  2013.05.04 on
a box where we serve zfs storage space and also run a few kvm
hosts.

Over the last few days, omnios has intermittently become 'mute' on
its network interface, not answering tcp or icmp requests.

Connecting via the console shows that the host is otherwise fine,
it just can not talk over the network anymore.

Normally the condition cleared after a few minutes.

The kernel does not write anything into the log file (running with
*.debug /var/log/debug.log)

What's more, the virtual machines running on the machine, talking
over the same physical network interface, continue their work
unperturbed, with network connectivity and all ... but they also
can not talk to the omnios server via the network.

I have the following network setup

root@fugu:~# dladm show-link
LINK        CLASS     MTU    STATE    BRIDGE     OVER
igb0        phys      1500   up       --         --
igb1        phys      1500   up       --         --
igb2        phys      1500   unknown  --         --
igb3        phys      1500   unknown  --         --
akami0      vnic      1500   up       --         igb0
nigiri0     vnic      1500   up       --         igb0
fugu0       vnic      1500   up       --         igb0
fugu1       vnic      1500   up       --         igb1

the interfaces akami0 and nigiri0 are assigned to two kvm hosts
fugu0 is used by the omnios host and fugu1 is a direct link to a
second omnios host for zfs send receive backups.
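
[Editor's sketch] The layering described above (three VNICs sharing igb0, one on igb1) can be listed per physical link from dladm's parseable output. The helper name vnics_over is hypothetical. It assumes colon-separated input in the order link:class:over, which is what dladm show-link -p -o link,class,over is expected to emit.

```shell
#!/bin/sh
# vnics_over PHYS
# Reads "link:class:over" lines on stdin and prints the name of every
# VNIC created over the physical link PHYS, one per line.
vnics_over() {
    awk -F: -v phys="$1" '$2 == "vnic" && $3 == phys { print $1 }'
}

# Possible use on the host:
#   dladm show-link -p -o link,class,over | vnics_over igb0
```

With the link table above as input, the igb0 query would list akami0, nigiri0 and fugu0, which are the links that share the physical port.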

has anyone seen such behaviour?
any debugging ideas?

cheers
tobi
-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch t...@oetiker.ch ++41 62 775 9902 / sb: -9900


Re: [OmniOS-discuss] Re-installing OmniOS after Crash, Errors with pkg [Subject Edited]

2013-10-29 Thread Sam M
Hello Chris.

Any update on this? Kindly let us know when the process is completed and I
can update.

Thanks.

Sam.



On 27 October 2013 22:02, Chris Nehren wrote:

> Are you on bloody? We're in the middle of getting signed packages
> ready. Hold off on updating for now.
>
> On Sun, Oct 27, 2013 at 17:46:40 +0530, Sam M wrote:
> > Hello all.
> >
> >  After installing napp-it, OmniOS got hosed. Not sure what caused it,
> > probably not napp-it, because I couldn't boot up my system with any of the
> > boot images, including the original.
> >
> > Now, on a brand new installation, I'm getting the following error -
> >
> > *# pkg update web/ca-bundle pkg*
> > *Creating Plan -*
> > *pkg update: The certificate which issued this
> > certificate:/C=US/ST=Maryland/O=OmniTI/OU=OmniOS/CN=OmniOS r151007 Release
> > Signing Certificate/emailAddress=omnios-supp...@omniti.com could not be
> > found. The issuer is:/C=US/ST=Maryland/L=Fulton/O=OmniTI/CN=OmniTI
> > Certificate Authority*
> > *The package involved is:pkg://omnios/library/python-2/pybonjour@1.1.1,5.11-0.151007:20130516T114553Z*
> >
> > *# uname -a*
> > *SunOS sequoia 5.11 omnios-df542ea i86pc i386 i86pc Solaris*
> >
> > There were no errors earlier after the installation when updating packages.
> >
> > How can I fix this? Where can I find OmniTI's CA certificate?
> >
> > TIA.
> >
> > Sam
>
>
>
> --
> Chris Nehren
> Site Reliability Engineer, OmniTI
>