Re: [OmniOS-discuss] ami instance upgrade problems
Hi,

The problem is caused, I think, by device numbering differences between Xen and EC2.

I solved this problem when creating my AMIs with the following procedure.

1. Create an extra volume in EC2 with the same size as the instance boot volume.
2. Attach the extra volume to the instance.
3. zpool attach the extra volume (c1t1d0?)
     zpool attach rpool c4t0d0 c1t1d0
4. zpool detach the original volume with the incorrect name (c4t0d0)
     zpool detach rpool c4t0d0
5. zpool attach the original volume with the proper name (c1t0d0)
     zpool attach rpool c1t1d0 c1t0d0
6. zpool detach the extra volume.
     zpool detach rpool c1t1d0
7. Detach the extra volume from the instance and delete it.

Double check all disk names in your instances first!

regards
Al

On 20/07/18 01:46, PÁSZTOR György wrote:
> Hi,
>
> I'm learning amazon, so I thought it would be nice, if I'd play with an
> omnios inside my free tier experiments instead of linux.
> I installed omnios from the "official" source: ami-0169c5108d1bdfd57
> (Yes, I chose Ohio instead of N. Virginia to play at)
>
> One small sidenote: ipv6 is not enabled in the official image, however I
> configured so for the instance. After install I manually ran this:
> ipadm create-addr -T addrconf xnf0/v6
> Well, it solved everything. It seems way simpler than toying with
> linux.
>
> But... I can not update my instance.
> pkg update created the omnios-1 be, but I can not activate it.
> root@ip-172-31-28-110:~# beadm list
> BE       Active Mountpoint     Space Policy Created
> omnios   NR     /              801M  static 2018-05-04 18:52
> omnios-1 -      /tmp/tmpTvggQP 158M  static 2018-07-20 00:15
> root@ip-172-31-28-110:~# beadm activate -v omnios-1
> be_do_installboot: device c4t0d0
> be_do_installboot: install failed for device c4t0d0.
>   Command: "/usr/sbin/installboot -m -f /tmp/tmpTvggQP/boot/pmbr /tmp/tmpTvggQP/boot/gptzfsboot /dev/rdsk/c4t0d0s0"
>   Errors:
>         open: No such file or directory
>         Unable to open device /dev/rdsk/c4t0d0s0
> be_run_cmd: command terminated with error status: 1
> Unable to activate omnios-1.
> Error installing boot files.
> root@ip-172-31-28-110:~# ls -l /dev/rdsk/*t0d0s0
> lrwxrwxrwx 1 root root 34 Jul 18 22:07 /dev/rdsk/c1t0d0s0 -> ../../devices/xpvd/xdf@51712:a,raw
> lrwxrwxrwx 1 root root 8 Jul 20 00:21 /dev/rdsk/c4t0d0s0 -> c1t0d0s0
> root@ip-172-31-28-110:~# /usr/sbin/installboot -m -f /tmp/tmpTvggQP/boot/pmbr /tmp/tmpTvggQP/boot/gptzfsboot /dev/rdsk/c1t0d0s0
> bootblock version installed on /dev/rdsk/c1t0d0s0 is more recent or identical
> Use -F to override or install without the -u option
>
> root@ip-172-31-28-110:~# zpool status -v
>   pool: syspool
>  state: ONLINE
>   scan: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         syspool     ONLINE       0     0     0
>           c4t0d0    ONLINE       0     0     0
>
> errors: No known data errors
> root@ip-172-31-28-110:~# echo | format
> Searching for disks...done
>
>
> AVAILABLE DISK SELECTIONS:
>        0. c1t0d0
>           /xpvd/xdf@51712
> Specify disk (enter its number): Specify disk (enter its number):
>
>
> Well, I'm stuck at this point. I don't know how I could fix this.
> I assume the problem is somewhere around the c4 vs c1 numbering, so it
> tries to open the wrong device.
>
> Note: So let's just assume it should work without running the installboot
> command.
>
> root@ip-172-31-28-110:~# cd /usr/sbin
> root@ip-172-31-28-110:/usr/sbin# mv installboot installboot.save
> root@ip-172-31-28-110:/usr/sbin# ln -s ../bin/true installboot
> root@ip-172-31-28-110:/usr/sbin# beadm activate -v omnios-1
> be_do_installboot: device c4t0d0
>   Command: "/usr/sbin/installboot -m -f /tmp/tmpTvggQP/boot/pmbr /tmp/tmpTvggQP/boot/gptzfsboot /dev/rdsk/c4t0d0s0"
> Activated successfully
> root@ip-172-31-28-110:/usr/sbin# reboot
> OmniOS 5.11 omnios-r151026-b6848f4455 June 2018
> root@ip-172-31-28-110:~# beadm list
> BE       Active Mountpoint Space Policy Created
> omnios   -      -          3.66M static 2018-05-04 18:52
> omnios-1 NR     /          1.01G static 2018-07-20 00:15
>
> Well, it's a very dirty hack. Is there a nicer way to fix this c4 vs c1
> thing?
> Btw.: it seems installboot would return false even if it could open the
> device, because it already has the same version of the boot block. Shouldn't
> that circumstance be checked on behalf of beadm?
>
> Cheers,
> gyu

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
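A dry-run sketch of the attach/detach procedure in Al's reply. It only prints the commands and runs nothing. The function name `plan_rename` is hypothetical, and the pool and device names are placeholders taken from the step labels; they must be checked against `zpool status` and `format` on the actual instance. The "wait for the resilver" reminders are an addition not spelled out in the thread.

```shell
#!/bin/sh
# Dry run: print the zpool commands for renaming a root-pool device
# by way of a temporary mirror. Nothing is executed here.
# Pool and device names are caller-supplied assumptions.
plan_rename() {
    pool=$1      # e.g. rpool
    stale=$2     # name recorded in the pool, e.g. c4t0d0
    spare=$3     # temporary extra volume, e.g. c1t1d0
    proper=$4    # name the OS actually exposes, e.g. c1t0d0
    echo "zpool attach $pool $stale $spare"
    echo "# wait for the resilver to complete (zpool status $pool)"
    echo "zpool detach $pool $stale"
    echo "zpool attach $pool $spare $proper"
    echo "# wait for the resilver to complete (zpool status $pool)"
    echo "zpool detach $pool $spare"
}

plan_rename rpool c4t0d0 c1t1d0 c1t0d0
```

Printing the plan first makes it easy to compare every device name against the instance before anything touches the pool.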
Re: [OmniOS-discuss] xnf panic
Hi Andy,

OK, I booted a t2.micro instance from my AMI (Original OmniOS CE r151022 HVM). Updated pkg and then updated to r151022as.

Then shut the instance down, changed the type to m4.large, then started the instance. A panic/reboot loop ensued.

Stopped the instance. Changed the instance back to t2.micro and booted. Applied the hotfix and rebooted. Rebooted fine.

Stopped the instance, changed instance type to m4.large, and started the instance. The instance started and I could log in just fine.

To be sure, I stopped and restarted the instance a few times. It always came back up ok.

So, the patch looks good to me. If you can think of any more tests you would like me to run, I'll give them a go for you.

Thank you for the speedy fix.

regards
Al

On 09/04/18 18:05, Al Slater wrote:
> Hi Andy,
>
> Wow, that was fast. I will test this when I have fed the family.
>
> I have my own AMI, so I can spin a server as a T2 instance, apply the
> fix and then restart after changing to an M4.
>
> regards
>
> Al
>
> On 09/04/18 16:04, Andy Fiddaman wrote:
>>
>> Al,
>>
>> I've prepared a hot-fix containing this driver update if you'd like to
>> test. I tried to test it myself on AWS but can't provision any type of
>> machine from that unofficial CE r151022 AMI in London. Just snapshot the
>> EBS volume first in case of problems.
>>
>> To install:
>>
>>     pkg apply-hot-fix https://downloads.omniosce.org/pkg/r151022/7186-xnf.p5p
>>
>> which will create a new boot-environment, and reboot.
>>
>> Assuming it looks ok, this update will be part of next Monday's release.
>>
>> Regards,
>>
>> Andy
>>
>> On Mon, 9 Apr 2018, Al Slater wrote:
>>
>> ; Hi,
>> ;
>> ; Has the fix for 7186 (xnf: panic on Xen 4.x) been integrated into
>> ; r151022 since the initial CE release?
>> ;
>> ; I have an instance in AWS that they required me to stop and start again
>> ; due to host patching. When I started it again the instance went into a
>> ; panic/reboot loop. The stack dump looked similar to the one in the
>> ; error report.
>> ;
>> ; I managed to get the instance started by changing the instance type from
>> ; m4.large to t2.large. Presumably AWS are migrating towards Xen versions
>> ; > 4 in london region. I don't know how long until the t2 hosts are
>> ; updated.
>> ;
>> ; regards
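The stop / change type / start cycle used in this test could be scripted roughly as below. This is a dry-run sketch that assumes the AWS CLI is used, which the thread does not mention. The function name `plan_resize` and the instance ID are hypothetical placeholders; the script prints the commands and runs nothing.

```shell
#!/bin/sh
# Dry run: print AWS CLI commands for a stop / resize / start cycle.
# The instance ID and target type are placeholders supplied by the caller.
plan_resize() {
    id=$1
    type=$2
    echo "aws ec2 stop-instances --instance-ids $id"
    echo "aws ec2 wait instance-stopped --instance-ids $id"
    echo "aws ec2 modify-instance-attribute --instance-id $id --instance-type Value=$type"
    echo "aws ec2 start-instances --instance-ids $id"
}

plan_resize i-0123456789abcdef0 m4.large
```

As Andy suggests, snapshot the EBS volume before trying this on an instance that matters.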
Re: [OmniOS-discuss] xnf panic
Hi Andy,

Wow, that was fast. I will test this when I have fed the family.

I have my own AMI, so I can spin a server as a T2 instance, apply the fix and then restart after changing to an M4.

regards

Al

On 09/04/18 16:04, Andy Fiddaman wrote:
>
> Al,
>
> I've prepared a hot-fix containing this driver update if you'd like to
> test. I tried to test it myself on AWS but can't provision any type of
> machine from that unofficial CE r151022 AMI in London. Just snapshot the
> EBS volume first in case of problems.
>
> To install:
>
>     pkg apply-hot-fix https://downloads.omniosce.org/pkg/r151022/7186-xnf.p5p
>
> which will create a new boot-environment, and reboot.
>
> Assuming it looks ok, this update will be part of next Monday's release.
>
> Regards,
>
> Andy
>
> On Mon, 9 Apr 2018, Al Slater wrote:
>
> ; Hi,
> ;
> ; Has the fix for 7186 (xnf: panic on Xen 4.x) been integrated into
> ; r151022 since the initial CE release?
> ;
> ; I have an instance in AWS that they required me to stop and start again
> ; due to host patching. When I started it again the instance went into a
> ; panic/reboot loop. The stack dump looked similar to the one in the
> ; error report.
> ;
> ; I managed to get the instance started by changing the instance type from
> ; m4.large to t2.large. Presumably AWS are migrating towards Xen versions
> ; > 4 in london region. I don't know how long until the t2 hosts are updated.
> ;
> ; regards
Re: [OmniOS-discuss] xnf panic
On 09/04/2018 08:49, Al Slater wrote:
> Has the fix for 7186 (xnf: panic on Xen 4.x) been integrated into
> r151022 since the initial CE release?
>
> I have an instance in AWS that they required me to stop and start again
> due to host patching. When I started it again the instance went into a
> panic/reboot loop. The stack dump looked similar to the one in the
> error report.
>
> I managed to get the instance started by changing the instance type from
> m4.large to t2.large. Presumably AWS are migrating towards Xen versions
> > 4 in london region. I don't know how long until the t2 hosts are updated.

Repeatable on any m4 instance type I tried.

panic[cpu0]/thread=ff000f4e6c40: BAD TRAP: type=e (#pf Page fault) rp=ff000f4e69b0 addr=40 occurred in module "xnf" due to a NULL pointer dereference

sched: #pf Page fault
Bad kernel fault at addr=0x40
pid=0, pc=0xf79b6e67, sp=0xff000f4e6aa0, eflags=0x10206
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 1406b8<smep,osxsav,xmme,fxsr,pge,pae,pse,de>
cr2: 40 cr3: c40 cr8: c

rdi: 286 rsi: 6 rdx: c
rcx: ff03d5dbf064 r8: 0 r9: 0
rax: 150 rbx: 3 rbp: ff000f4e6af0
r10: 0 r11: fb800983 r12: ff03d5d9
r13: 0 r14: 15 r15: 9
fsb: 0 gsb: fbc397e0 ds: 4b
es: 4b fs: 0 gs: 1c3
trp: e err: 0 rip: f79b6e67
cs: 30 rfl: 10206 rsp: ff000f4e6aa0
ss: 38

Warning - stack not written to the dump buffer
ff000f4e6890 unix:die+df ()
ff000f4e69a0 unix:trap+e18 ()
ff000f4e69b0 unix:cmntrap+e6 ()
ff000f4e6af0 xnf:xnf_tx_clean_ring+c7 ()
ff000f4e6b60 xnf:tx_slots_get+95 ()
ff000f4e6ba0 xnf:xnf_intr+15b ()
ff000f4e6bf0 unix:av_dispatch_softvect+78 ()
ff000f4e6c20 unix:dispatch_softint+39 ()
ff000f635460 unix:switch_sp_and_call+13 ()
ff000f6354a0 unix:dosoftint+44 ()
ff000f635500 unix:do_interrupt+ba ()
ff000f635510 unix:cmnint+ba ()
fffff7c6aec0 sha2:SHA256TransformBlocks+109f ()

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England Company number: 1957652
VAT number: GB 760 2433 55
[OmniOS-discuss] xnf panic
Hi,

Has the fix for 7186 (xnf: panic on Xen 4.x) been integrated into r151022 since the initial CE release?

I have an instance in AWS that they required me to stop and start again due to host patching. When I started it again the instance went into a panic/reboot loop. The stack dump looked similar to the one in the error report.

I managed to get the instance started by changing the instance type from m4.large to t2.large. Presumably AWS are migrating towards Xen versions > 4 in london region. I don't know how long until the t2 hosts are updated.

regards

--
Al Slater
Re: [OmniOS-discuss] sudo update
Hi Andy,

On 23/11/17 10:40, Andy Fiddaman wrote:
>
> On Thu, 23 Nov 2017, Al Slater wrote:
>
> ; Hi,
> ;
> ; I have just updated a number of my omniosce boxes to r151022y, bringing
> ; in the sudo updates in r151022u.
> ;
> ; All my machines have BSM auditing enabled, and now I am seeing the
> ; following when using sudo
> ;
> ; sudo: au_preselect: Bad file number
>
> Hi, this is something we specifically tested along with the sudo update
> since auditing was an area that changed quite a bit. Could you please check
> that all of your packages are up-to-date (particularly SUNWcs) and that the
> output of the following commands matches on your system?
>
> r151022% auditrecord -e AUE_sudo
>
> sudo
>   program     sudo                 See sudo(1m)
>   event ID    6650                 AUE_sudo
>   class       lo,ua,as             (0x00061000)
>       header
>       subject
>       exec_arguments               command args
>       [text]                       error message (failure only)
>       return
>
> r151022% grep sudo /etc/security/audit_event /usr/lib/audit/audit_record_attr
> /etc/security/audit_event:# sudo event
> /etc/security/audit_event:6650:AUE_sudo:sudo(1m):lo,ua,as
> /usr/lib/audit/audit_record_attr:label=AUE_sudo
>
> If the problem persists, please post the audit configuration that you're
> using so we can try and replicate (auditconfig -getflags)

Ok, I can see the issue. The upgrade installed an audit_event.new into /etc/security, but it was not merged into our modified audit_event.

I can see what I need to do to fix this now.

Thank you for the pointers.

--
Al Slater
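A quick way to spot the unmerged-file situation Al describes is to check both files for the event line that Andy's grep output shows. This is a minimal sketch: the helper name `has_sudo_event` is hypothetical, and it assumes the entry starts with `6650:AUE_sudo:` as in the quoted output.

```shell
#!/bin/sh
# Report whether an audit_event file defines the AUE_sudo event (ID 6650).
# The path is an argument so the same check works for audit_event
# and for the audit_event.new left behind by the upgrade.
has_sudo_event() {
    grep -q '^6650:AUE_sudo:' "$1"
}

for f in /etc/security/audit_event /etc/security/audit_event.new; do
    [ -f "$f" ] || continue          # skip files that do not exist
    if has_sudo_event "$f"; then
        echo "$f: AUE_sudo present"
    else
        echo "$f: AUE_sudo missing"
    fi
done
```

If the entry shows up only in audit_event.new, the locally modified audit_event still needs it merged in by hand.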
[OmniOS-discuss] sudo update
Hi,

I have just updated a number of my omniosce boxes to r151022y, bringing in the sudo updates in r151022u.

All my machines have BSM auditing enabled, and now I am seeing the following when using sudo

sudo: au_preselect: Bad file number

regards

--
Al Slater
Re: [OmniOS-discuss] Problem updating OmniOS machines
Does anyone have any ideas what the cause is here, or how to debug it?

On 21/08/2017 11:22, Al Slater wrote:
> I have a number of omnios boxes running r151022, all upgraded from
> r151014. Currently uname says omnios-r151022-f9693432c2
>
> All but one of them are failing to update. The process just stops after
> "Downloading linked" for each zone with no error message, but with a
> return code of 1. Each machine said I had to upgrade pkg first, which
> was done.
>
>
> aslate-admin@mars:/export/home/aslate-admin$ sudo pkg update -r
>             Packages to update: 123
>        Create boot environment: Yes
> Create backup boot environment:  No
>
> Planning linked: 0/8 done; 1 working: zone:qa-redis1
> Linked image 'zone:qa-redis1' output:
> | Packages to update: 14
> `
> Planning linked: 1/8 done; 1 working: zone:qa-redis3
> Linked image 'zone:qa-redis3' output:
> | Packages to update: 14
> `
> Planning linked: 2/8 done; 1 working: zone:qa-seclb1
> Linked image 'zone:qa-seclb1' output:
> | Packages to update: 14
> `
> Planning linked: 3/8 done; 1 working: zone:pg-ugweb01
> Linked image 'zone:pg-ugweb01' output:
> | Packages to update: 14
> `
> Planning linked: 4/8 done; 1 working: zone:qa-b2cweb05
> Linked image 'zone:qa-b2cweb05' output:
> | Packages to update: 14
> `
> Planning linked: 5/8 done; 1 working: zone:base
> Linked image 'zone:base' output:
> | Packages to update: 14
> `
> Planning linked: 6/8 done; 1 working: zone:qa-lb1
> Linked image 'zone:qa-lb1' output:
> | Packages to update: 14
> `
> Planning linked: 7/8 done; 1 working: zone:qa-tseclb1
> Linked image 'zone:qa-tseclb1' output:
> | Packages to update: 14
> `
> Planning linked: 8/8 done
> DOWNLOAD                    PKGS         FILES    XFER (MB)   SPEED
> Completed                123/123     3553/3553    79.6/79.6    0B/s
>
> Downloading linked: 0/8 done; 1 working: zone:qa-redis1
> Downloading linked: 1/8 done; 1 working: zone:qa-redis3
> Downloading linked: 2/8 done; 1 working: zone:qa-seclb1
> Downloading linked: 3/8 done; 1 working: zone:pg-ugweb01
> Downloading linked: 4/8 done; 1 working: zone:qa-b2cweb05
> Downloading linked: 5/8 done; 1 working: zone:base
> Downloading linked: 6/8 done; 1 working: zone:qa-lb1
> Downloading linked: 7/8 done; 1 working: zone:qa-tseclb1
> Linked progress: /aslate-admin@mars:/export/home/aslate-admin$ echo $?
> 1
>
>
> Running with -v doesn't give any hints.
>
> The machines are updating from my own pkg repo, which is kept in sync
> with the omniosce repo.
>
> Any ideas what is wrong?
>

--
Al Slater
[OmniOS-discuss] Problem updating OmniOS machines
I have a number of omnios boxes running r151022, all upgraded from r151014. Currently uname says omnios-r151022-f9693432c2

All but one of them are failing to update. The process just stops after "Downloading linked" for each zone with no error message, but with a return code of 1. Each machine said I had to upgrade pkg first, which was done.

aslate-admin@mars:/export/home/aslate-admin$ sudo pkg update -r
            Packages to update: 123
       Create boot environment: Yes
Create backup boot environment:  No

Planning linked: 0/8 done; 1 working: zone:qa-redis1
Linked image 'zone:qa-redis1' output:
| Packages to update: 14
`
Planning linked: 1/8 done; 1 working: zone:qa-redis3
Linked image 'zone:qa-redis3' output:
| Packages to update: 14
`
Planning linked: 2/8 done; 1 working: zone:qa-seclb1
Linked image 'zone:qa-seclb1' output:
| Packages to update: 14
`
Planning linked: 3/8 done; 1 working: zone:pg-ugweb01
Linked image 'zone:pg-ugweb01' output:
| Packages to update: 14
`
Planning linked: 4/8 done; 1 working: zone:qa-b2cweb05
Linked image 'zone:qa-b2cweb05' output:
| Packages to update: 14
`
Planning linked: 5/8 done; 1 working: zone:base
Linked image 'zone:base' output:
| Packages to update: 14
`
Planning linked: 6/8 done; 1 working: zone:qa-lb1
Linked image 'zone:qa-lb1' output:
| Packages to update: 14
`
Planning linked: 7/8 done; 1 working: zone:qa-tseclb1
Linked image 'zone:qa-tseclb1' output:
| Packages to update: 14
`
Planning linked: 8/8 done
DOWNLOAD                    PKGS         FILES    XFER (MB)   SPEED
Completed                123/123     3553/3553    79.6/79.6    0B/s

Downloading linked: 0/8 done; 1 working: zone:qa-redis1
Downloading linked: 1/8 done; 1 working: zone:qa-redis3
Downloading linked: 2/8 done; 1 working: zone:qa-seclb1
Downloading linked: 3/8 done; 1 working: zone:pg-ugweb01
Downloading linked: 4/8 done; 1 working: zone:qa-b2cweb05
Downloading linked: 5/8 done; 1 working: zone:base
Downloading linked: 6/8 done; 1 working: zone:qa-lb1
Downloading linked: 7/8 done; 1 working: zone:qa-tseclb1
Linked progress: /aslate-admin@mars:/export/home/aslate-admin$ echo $?
1

Running with -v doesn't give any hints.

The machines are updating from my own pkg repo, which is kept in sync with the omniosce repo.

Any ideas what is wrong?

--
Al Slater
Re: [OmniOS-discuss] initialboot
Hi,

Sorry, I clearly didn't explain well enough.

Normally, initial-boot is enabled in the iso/kayak image that is installed. When it is started on the first boot it runs /.initialboot and then disables itself.

I was thinking that if there was some way to "re-enable" initial-boot, then I could drop a /.initialboot script, re-enable initial-boot and then shut down before creating a new AMI, such that when an instance based upon the new AMI was launched, it would run .initialboot.

The problem is, enabling initial-boot immediately runs the .initialboot script and then disables itself.

So, I hoped there was a way to enable the service such that it did not start immediately, but would start after the next reboot.

Al

On 31/07/17 21:49, PÁSZTOR György wrote:
> Hi,
>
> I hope you don't mind, but I started a new thread with this, since it seems
> a completely new topic.
>
> "Al Slater" <al.sla...@scluk.com> wrote on 2017-07-31 21:05:
>> One more question though, is there any way to enable an SMF service for
>> the next reboot, but not immediately. Specifically, I want to enable
>> the initial-boot service with a .initialboot file in place, then create
>> a new AMI.
>
> I don't completely understand. You want to enable initialboot after the
> boot was complete, and only after a certain amount of time?
> I'm not sure what this initialboot exactly does, but it seems not a simple
> service, it's a milestone. Maybe I would not mess with it.
> Otherwise, if I need a delay between the service and the boot, and it's
> important to remain "disabled" while it's not enabled:
> Create an @reboot cronjob. I don't remember which cron implementation is
> the default. On linux's vixie's cron the time can be @reboot.
>
>> I wish to use .initialboot to grab the instance configuration from
>> amazon (hostname, root keys etc) and configure appropriately when the
>> new instance starts.
>
> Again: I don't completely understand your scenario.
> You created one ami, and you want to "close it back", and clone it several
> times, so after its first reboot, it should do the initialboot steps?
> Why do you want to wait?
> What I just found about the /.initialboot, it's a simple shell script.
> If you need to wait here, why not just put a sleep command into the
> beginning of the script?
> Or if you have to wait for some specific resource: Why don't you poll it
> once every 5 sec or so?
>
> Cheers,
> Gyu
Re: [OmniOS-discuss] Omios, hvm and AWS
On 31/07/17 21:30, Eric Sproul wrote:
> On Mon, Jul 31, 2017 at 4:05 PM, Al Slater <al.sla...@scluk.com> wrote:
>> One more question though, is there any way to enable an SMF service for
>> the next reboot, but not immediately. Specifically, I want to enable
>> the initial-boot service with a .initialboot file in place, then create
>> a new AMI.
>>
>> I wish to use .initialboot to grab the instance configuration from
>> amazon (hostname, root keys etc) and configure appropriately when the
>> new instance starts.
>
> Hi Al,
> The initial-boot service isn't really suitable for this sort of thing.
> You might want to check out
> pkg://omnios/system/management/ec2-credential which specifically
> handles setting up the credentials at first boot. That could be
> trivially extended[1] to set the system hostname and probably any
> other "standard" thing that operators want.
>
> Eric
>
> [1] https://github.com/omniosorg/omnios-build/blob/master/build/ec2-credential/files/install-ec2-credential

Thanks for the pointer Eric.

--
Al Slater
Re: [OmniOS-discuss] Omios, hvm and AWS
On 31/07/2017 11:39, Peter Tribble wrote:
> On Mon, Jul 31, 2017 at 11:09 AM, Al Slater <al.sla...@scluk.com> wrote:
>
>     On 31/07/2017 11:07, Al Slater wrote:
>     > On 30/07/2017 20:15, Peter Tribble wrote:
>     >> The following should get you going:
>     >>
>     >> https://www.prakashsurya.com/post/2017-02-06-creating-a-custom-amazon-ec2-ami-from-iso/
>     >
>     > OK, I followed the above procedure and have produced an AMI.
>     >
>     > When I create an instance and try to boot it, I get the following in the
>     > system log:
>
>     SunOS Release 5.11 Version omnios-r151022-f9693432c2 64-bit
>     Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.
>
>     NOTICE: Cannot read the pool label from '/xpvd/xdf@51728:a'
>     NOTICE: spa_import_rootpool: error 5
>
>     Cannot mount root on /xpvd/xdf@51728:a fstype zfs
>
>     panic[cpu0]/thread=fbc38560: vfs_mountroot: cannot mount root
>
>     Warning - stack not written to the dump buffer
>     fbc7ad70 genunix:vfs_mountroot+39b ()
>     fbc7adb0 genunix:main+138 ()
>     fbc7adc0 unix:_locore_start+90 ()
>
>     How can I fix this?
>
> You're likely the first person down this path.
>
> Generically, this means that the device paths embedded in the pool
> don't match those provided by the "hardware" you're booting on.
>
> So the system thinks it should have a disk at /xpvd/xdf@51728:a
>
> On my instance, I have:
>
> /dev/rdsk/c2t0d0s0 -> ../../devices/xpvd/xdf@51712:a,raw
>
> In other words, 51712 not 51728.
>
> For this to work, you have to set up your xen instance to exactly mirror
> what EC2 provides. Somehow it's gotten mixed up. In your configuration,
> did you use xvda? I think 51728 is what you get if you use xvdb for the
> disk, which won't work. I had:
>
> disk=[ 'file:/home/ptribble/iso/tribblix-0m20.1.iso,hdb:cdrom,r',
>        'file:/root/ami-template.img,xvda,w' ]
>

Thanks Peter, I see what happened...

I started off with the instructions from https://wiki.openindiana.org/oi/Creating+OpenIndiana+EC2+image

Then changed to following the instructions at https://www.prakashsurya.com/post/2017-02-06-creating-a-custom-amazon-ec2-ami-from-iso/ while neglecting to change the disks line in my xen config.

Oh well, starting again...

--
Al Slater
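The two numbers in Peter's explanation follow from how Xen numbers `xvd` block devices: a whole disk is `(202 << 8) | (disk_index << 4)`, with xvda as index 0. A small sketch of that arithmetic follows; the helper name `xvd_devid` is hypothetical and it only handles single-letter names such as xvda or xvdb.

```shell
#!/bin/sh
# Compute the Xen virtual block device number for a whole-disk xvdX name.
# Xen convention for xvd devices: (202 << 8) | (disk_index << 4).
xvd_devid() {
    letter=${1#xvd}                                # "xvda" -> "a"
    index=$(( $(printf '%d' "'$letter") - 97 ))    # a=0, b=1, ...
    echo $(( (202 << 8) | (index << 4) ))
}

xvd_devid xvda
xvd_devid xvdb
```

This prints 51712 for xvda and 51728 for xvdb, which matches the `xdf@51712` path on the working instance and the `xdf@51728` path in the failing boot log.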
Re: [OmniOS-discuss] Omios, hvm and AWS
On 31/07/2017 11:07, Al Slater wrote:
> On 30/07/2017 20:15, Peter Tribble wrote:
>> The following should get you going:
>>
>> https://www.prakashsurya.com/post/2017-02-06-creating-a-custom-amazon-ec2-ami-from-iso/
>
> OK, I followed the above procedure and have produced an AMI.
>
> When I create an instance and try to boot it, I get the following in the
> system log:

SunOS Release 5.11 Version omnios-r151022-f9693432c2 64-bit
Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.

NOTICE: Cannot read the pool label from '/xpvd/xdf@51728:a'
NOTICE: spa_import_rootpool: error 5

Cannot mount root on /xpvd/xdf@51728:a fstype zfs

panic[cpu0]/thread=fbc38560: vfs_mountroot: cannot mount root

Warning - stack not written to the dump buffer
fbc7ad70 genunix:vfs_mountroot+39b ()
fbc7adb0 genunix:main+138 ()
fffffbc7adc0 unix:_locore_start+90 ()

How can I fix this?

--
Al Slater
Re: [OmniOS-discuss] Omios, hvm and AWS
Hi Peter,

On 28/07/17 22:37, Peter Tribble wrote:
>     I wish to run up a number of OmniOS instances in AWS.
>
>     The current OmniOS AMIs in AWS seem to use pv virtualization, precluding
>     their use on the t2 and m4 instance types that I want to use.
>
> Worse; newer regions only support hvm. In my case, this rules out London.

That is precisely where I want to run my instances.

>     So, I thought I would try to produce my own AMI with hvm virtualization.
>
>     I am looking to use omniosce r151022, is this likely to work at all?
>
>     I have read https://omnios.omniti.com/wiki.php/Ec2Ami, does anyone know
>     how that procedure would be amended to cater for loader/hvm instead of
>     pv-grub?
>
> The following should get you going:
>
> https://www.prakashsurya.com/post/2017-02-06-creating-a-custom-amazon-ec2-ami-from-iso/

That looks very helpful, thank you for the link.

> Essentially, if you install any illumos distro you can send the disk image
> up to AWS and create an AMI. If you create the image by installing using Xen
> *exactly* as described, you're done. If you're getting the image from
> somewhere else then the phys_path to the disk embedded in the pool will be
> wrong and need to be rewritten, which basically means going into Xen again.

I will be using xen so hopefully all will be good...

--
Al Slater
Re: [OmniOS-discuss] Omios, hvm and AWS
Thank you, that clarified my understanding.

Al

On 27/07/17 22:47, PÁSZTOR György wrote:
> Hi,
>
> "Al Slater" <al.sla...@scluk.com> wrote on 2017-07-27 12:17:
>> So, I thought I would try to produce my own AMI with hvm virtualization.
>>
>> I am looking to use omniosce r151022, is this likely to work at all?
>
> I haven't tried to upgrade my r151022 with the ce updates, but I'm pretty
> sure that it must work.
>
>> I have read https://omnios.omniti.com/wiki.php/Ec2Ami, does anyone know
>> how that procedure would be amended to cater for loader/hvm instead of
>> pv-grub?
>
> If you use hvm, then there is no need for an extra loader. Just install
> omnios, as you would, onto the "virtual" hdd.
> However, I never tried amazon's env. I'm experimenting with omnios on my
> home nas. (See my mail two days ago)
>
> The only drawback that I found: if the xen hypervisor is >=4.6 (or >4.5.1,
> I don't know yet), then the pv network driver won't work.
>
> Cheers,
> Gyu
>

--
Al Slater
[OmniOS-discuss] Omios, hvm and AWS
Hi,

I wish to run up a number of OmniOS instances in AWS.

The current OmniOS AMIs in AWS seem to use pv virtualization, precluding their use on the t2 and m4 instance types that I want to use.

So, I thought I would try to produce my own AMI with hvm virtualization.

I am looking to use omniosce r151022, is this likely to work at all?

I have read https://omnios.omniti.com/wiki.php/Ec2Ami, does anyone know how that procedure would be amended to cater for loader/hvm instead of pv-grub?

--
Al Slater
Re: [OmniOS-discuss] ANNOUNCEMENT OmniOS Community Edition - OmniOSce r151022h
root` to the PATH given in the list.)
>
>
> 5. Install the new ca-bundle containing our new CA
>
> ```
> # /usr/bin/pkg update -rv web/ca-bundle
> ```
>
> 6. Remove the CA file imported by hand
>
> ```
> # rm /etc/ssl/pkg/omniosce-ca.cert.pem
> ```
>
> 7. Finally update as usual
>
> ```https://pkg.omniosce.org/r151022/core/
> # /usr/bin/pkg update -rv
> ```
>
>
> ## About OmniOS Community Edition Association
>
> OmniOS Community Edition Association (OmniOSce) is a Swiss association,
> dedicated to the continued support and release of OmniOS for the benefit of
> all parties involved. The board of OmniOSce controls access to the OmniOS CA.
> Current board members are: Tobias Oetiker (President), Andy Fiddaman
> (Development), Dominik Hassler (Treasurer).
>
>
> ## About Citrus-IT
>
> Citrus IT is a UK company that provides a managed email service platform to
> companies around the world. For many years they ran their systems on Solaris
> with SPARC hardware but transitioned to OmniOS in 2012. www.citrus-it.net
>
>
> ## About OETIKER+PARTNER AG
>
> OETIKER+PARTNER is a Swiss system management and software development
> company. Employees from O+P are involved in many Open Source Software
> projects. O+P runs most of their server hardware on OmniOS. www.oetiker.ch
>
>
> Press inquiries to i...@omniosce.org
>
> Published July 12, 2017
>
> OmniOSce
> Aarweg 17
> 4600 Olten
> Switzerland
>
> http://www.omniosce.org
>
>

--
Al Slater
Re: [OmniOS-discuss] Updates for OmniOS r151014 & r151016
On 13/11/15 20:13, Dan McDonald wrote:
> 014:
> --
>
> - ilbd memory leak plug

Thanks for getting this in there Dan, we have been running leak free for 3 days now.

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
On 10/11/15 15:26, Dan McDonald wrote:
>
>> On Nov 10, 2015, at 2:50 AM, Al Slater <al.sla...@scluk.com> wrote:
>>
>> On 10/11/2015 07:40, Al Slater wrote:
>>> It seems to me that ilbd_run_probe just needs to call
>>> posix_spawn_file_actions_destroy appropriately.
>>
>> And probably posix_spawnattr_destroy as well?
>
> Wow! Great catch. I'll bet a small sum you nailed this to the wall.
>
> Want me to build you a replacement ilbd?

Yes please :)

Thanks for your, and Bob's, help with this.

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
Hi Dan,

On 06/11/2015 18:31, Dan McDonald wrote:
> You said you had a test box, right?

Yes.

> Can you:
>
> - Disable UMEM_DEBUG
> - RESTART the service.
> - IMMEDIATELY after restart do pmap, and do pmap once per (sec, 10 sec,
>   something) to see how it grows?

Attached is a compressed file with 5hrs or so of 10s pmaps. Hopefully not too big for the list.

> After that, maybe we can dtrace and see what's going on.

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England
Company number: 1957652
VAT number: GB 760 2433 55

Attachment: pmap.6589.gz (application/gzip)
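The periodic sampling requested above can be scripted. What follows is a minimal sketch for a POSIX shell, not the script that produced the attachment: `sample_pmap` and the `PMAP` override variable are hypothetical names introduced here, and the only thing taken from the thread is that `pmap -x` ends its output with a "total Kb" summary line.

```shell
# Print a timestamp followed by the pmap summary line for one process,
# COUNT times, INTERVAL seconds apart.
# PMAP is a hypothetical override for the pmap binary (handy for dry runs).
sample_pmap() {
    pid=$1
    interval=$2
    count=$3
    n=0
    while [ "$n" -lt "$count" ]; do
        date
        # The last line of `pmap -x` output is the "total Kb" summary.
        ${PMAP:-pmap} -x "$pid" | tail -n 1
        n=$((n + 1))
        if [ "$n" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For roughly five hours of 10-second samples, something like `sample_pmap "$(pgrep ilbd)" 10 1800 > pmap.log` would do, started immediately after the service restart.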
Re: [OmniOS-discuss] ILB memory leak?
On 09/11/15 15:43, Dan McDonald wrote:
>
>> On Nov 9, 2015, at 8:39 AM, Al Slater <al.sla...@scluk.com> wrote:
>>
>> Attached is a compressed file with 5hrs or so of 10s pmaps.
>> Hopefully not too big for the list.
>
> It compressed nicely. I'm noticing a pattern:
>
> Mon Nov 9 08:21:45 UTC 2015  total Kb   134008   133504   131416  -
> Mon Nov 9 08:50:21 UTC 2015  total Kb   265080   264576   262488  -
> Mon Nov 9 09:37:42 UTC 2015  total Kb   265088   264580   262492  -
> Mon Nov 9 09:47:40 UTC 2015  total Kb   527232   526724   524636  -
> Mon Nov 9 11:42:19 UTC 2015  total Kb  1051520  1050960  1048872  -
> Mon Nov 9 11:42:29 UTC 2015  total Kb  1051520  1051012  1048924  -
>
> It's mostly linear growth. Notice the time intervals also double
> whenever the footprint essentially doubles?
>
> So I need to back up and ask some things, especially given libumem
> doesn't appear to show leaks or even usage:
>
> 1.) Is the eating of memory affecting your system performance? (If
> you've only 8GB, yeah, I can see that.)

Hmmm... I started investigating after the servers hung a couple of times. I have not conclusively proved that this was the cause, but the machines have been running for months with no issue after I added a cronjob to restart ilb twice a day.

I can see a gradual increase in kernel memory use as well, but I have not investigated that.

> 2.) Is ilb failing after it gets sufficiently large?

Again, no link conclusively proved, but I did see log messages like the following when the memory use had grown to 4Gb...

Nov 5 11:17:01 l1-lb2 ilbd[3041]: [ID 410242 daemon.error] ilbd_hc_probe_timer: cannot restart timer: rule ggp server _ggp.11, disabling it

I looked at the source for ilbd and I think this could be caused by a memory allocation failure in iu_schedule_timer.

After these messages were generated, it looks like the disabled servers were never re-enabled, so eventually this could end up with no enabled servers, and therefore no service, without manual intervention.
--
Al Slater
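The doubling pattern in the "total Kb" figures can be picked out of a long pmap log mechanically. The sketch below is one way to do it, under stated assumptions: `pmap_jumps` is a hypothetical helper, the input is any text containing "total Kb <kbytes> ..." summary lines, and the 1.5x threshold used to call a change a "jump" is an arbitrary choice, not something from the thread.

```shell
# Read a pmap log on stdin and report every jump in the "total Kb" figure
# as "old -> new (xratio)". A jump is growth of at least 1.5x between
# consecutive samples (threshold is an assumption).
pmap_jumps() {
    awk '
    {
        # Find the "total Kb" marker and take the number right after it.
        found = 0
        for (i = 1; i + 2 <= NF; i++) {
            if ($i == "total" && $(i + 1) == "Kb") {
                kb = $(i + 2) + 0
                found = 1
                break
            }
        }
        if (!found) next
        if (prev > 0 && kb >= prev * 1.5)
            printf "%d -> %d (x%.2f)\n", prev, kb, kb / prev
        prev = kb
    }'
}
```

Fed the six samples quoted in the message, `pmap_jumps` reports three jumps (134008 -> 265080, 265088 -> 527232, 527232 -> 1051520), each with a ratio just under 2.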
Re: [OmniOS-discuss] ILB memory leak?
On 05/11/2015 14:57, Dan McDonald wrote:
>> On Nov 5, 2015, at 6:38 AM, Al Slater <al.sla...@scluk.com> wrote:
>>
>> I have the 4Gb core file. Is there anything useful I can extract
>> from it to try and spot where the problem is?
>
> Your one ::findleaks showed nothing. Did your 4GB corefile have
> ::findleaks show nothing as well?

::findleaks against the 4GB corefile showed nothing.

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
On 06/11/15 14:51, Dan McDonald wrote:
>
>> On Nov 6, 2015, at 9:39 AM, Dan McDonald <dan...@omniti.com> wrote:
>>
>> Lots of LARGE anonymous mappings. I wonder why that happened? I'll dig into
>> that a bit more.
>
> pmap(1) works even better on running processes. Could you run, say "pmap -xa
> `pgrep ilbd`" on your running machine?

Here you go...

root@loki:/export/home/BRIGHTON/aslate# pmap -xa `pgrep ilbd`
12346:  /usr/lib/inet/ilbd
Address     Kbytes      RSS     Anon  Locked Mode   Mapped File
08027000       132      132      132       - rw---  [ stack ]
0805            76       76        -       - r-x--  ilbd
08073000         4        4        4       - rw---  ilbd
08074000        96        -        -       - rw---  ilbd
0808C000      1156     1140     1112       - rw---  [ heap ]
0D20        262144   262144   262144       - rwx--  [ anon ]
1D40        524288   524288   524288       - rwx--  [ anon ]
3D60       1048576  1048576  1048576       - rwx--  [ anon ]
7D80       1048576  1048576  1048576       - rwx--  [ anon ]
BDA0        524288   524288   524288       - rwx--  [ anon ]
DDC0        262144   262144   262144       - rwx--  [ anon ]
EDE0        131072   131072   131072       - rwx--  [ anon ]
F600         65536    65536    65536       - rwx--  [ anon ]
FA20         32768    32768    32768       - rwx--  [ anon ]
FC40         16384    16384    16384       - rwx--  [ anon ]
FD60          8192     8192     8192       - rwx--  [ anon ]
FE00          4096     4096     4096       - rwx--  [ anon ]
FE60          2048     2048     2048       - rwx--  [ anon ]
FE8A            36       16        -       - r-x--  libtsol.so.2
FE8B9000         4        4        4       - rw---  libtsol.so.2
FE8C             4        4        4       - rwx--  [ anon ]
FE8D           140      112        -       - r-x--  libbsm.so.1
FE903000        28       28       28       - rw---  libbsm.so.1
FE90A000         4        -        -       - rw---  libbsm.so.1
FE91            16       16        -       - r-x--  libsecdb.so.1
FE924000         4        4        4       - rw---  libsecdb.so.1
FE93          1024     1024     1024       - rwx--  [ anon ]
FEA4           512      512      512       - rwx--  [ anon ]
FEAD           256      256      256       - rwx--  [ anon ]
FEB2           128      128      128       - rwx--  [ anon ]
FEB5            64       64       64       - rwx--  [ anon ]
FEB7            64       16       16       - rwx--  [ anon ]
FEB9             4        4        4       - rwx--  [ anon ]
FEBA            20       20        -       - r-x--  libilb.so.1
FEBB5000         4        4        4       - rw---  libilb.so.1
FEBC            32       32        -       - r-x--  libuutil.so.1
FEBD8000         4        4        4       - rw---  libuutil.so.1
FEBE             4        4        4       - rwx--  [ anon ]
FEBF           172      148        -       - r-x--  libscf.so.1
FEC2B000         4        4        4       - rw---  libscf.so.1
FEC3            20       20        -       - r-x--  libinetutil.so.1
FEC45000         4        4        4       - rw---  libinetutil.so.1
FEC5             4        4        4       - rwx--  [ anon ]
FEC6            20       12        -       - r-x--  libcmdutils.so.1
FEC75000         4        4        4       - rw---  libcmdutils.so.1
FEC8             4        4        -       - r--s-  dev:528,24 ino:2821218250
FEC9            64       64        4       - rwx--  [ anon ]
FECB            64       64        4       - rwx--  [ anon ]
FECD           416      368        -       - r-x--  libnsl.so.1
FED48000         8        8        8       - rw---  libnsl.so.1
FED4A000        20       16        4       - rw---  libnsl.so.1
FED5             4        4        4       - rwx--  [ anon ]
FED6            52       48        -       - r-x--  libsocket.so.1
FED7D000         4        4        4       - rw---  libsocket.so.1
FED8            24       12       12       - rwx--  [ anon ]
FED9          1252      936        -       - r-x--  libc_hwcap1.so.1
FEED9000        36       36       32       - rwx--  libc_hwcap1.so.1
FEEE2000         8        8        8       - rwx--  libc_hwcap1.so.1
FEEF             4        4        4       - rwx--  [ anon ]
FEF0           196      112        -       - r-x--  libumem.so.1
FEF4             8        4        4       - rwx--  libumem.so.1
FEF52000        76       72       16       - rw---  libumem.so.1
FEF65000        24       24       24       - rw---  libumem.so.1
FEF7             4        4        -       - r--s-  ld.config
FEF8             4        4        4       - rwx--  [ anon ]
FEF9             4        4        4       - rw---  [ anon ]
FEFA             4        4        4       - rw---  [ anon ]
FEFB             4        4        4       - rwx--  [ anon ]
FEFB5000       216      216        -       - r-x--  ld.so.1
FEFFB000         8        8        8       - rwx--  ld.so.1
FEFFD000         4        4        4       - rwx--  ld.so.1
--------   -------  -------  -------  ------
total Kb   3936668  3935948  3933588

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
To the mailing list as well...

On 22/10/2015 09:43, Al Slater wrote:
> On 21/10/2015 17:35, Dan McDonald wrote:
>>
>>> On Oct 21, 2015, at 6:08 AM, Al Slater <al.sla...@scluk.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> I am running omnios r151014 on a couple of machines with a couple
>>> of zones each. 1 zone runs apache as an SSL reverse proxy, the
>>> other runs ILB for load balancing web to app tier connections.
>>>
>>> I noticed that in the ILB zone, the ilbd process memory grows to
>>> about 2Gb. Restarting ILB releases the memory, and then the
>>> memory usage gradually increases again, with each memory increase
>>> approximately 2 * the size of the previous one. I run a cronjob
>>> twice a day ( 8am and 8pm) which restarts the ilb service and
>>> releases the memory.
>>>
>>> A graph of memory usage is available at
>>> https://www.dropbox.com/s/zaz51apxslnivlq/ILB_Memory_2_days.png?dl=0
>>>
>>> There are currently 62 rules in the load balancer, with a total
>>> of 664 server/port pairs.
>>>
>>> Is there anything I can provide that would help track this down?
>>
>> You can use svccfg(1M) to enable user-level memory debugging on ilb.
>> It may cause the ilb daemon to dump core. (And you're just noticing
>> this in the process, not kernel memory consumption, correct?)
>
> I am seeing kernel memory consumption increasing as well, but that may
> be a different issue. The ilbd process memory is definitely growing.
>
>> As root:
>>
>> svcadm disable -t ilb
>> svccfg -s ilb setenv LD_PRELOAD libumem.so
>> svccfg -s ilb setenv UMEM_DEBUG default
>> svccfg -s ilb refresh
>> svcadm enable ilb
>>
>> That should enable user-level memory debugging. If you get a
>> coredump, save it and share it. If you don't and the ilb daemon
>> keeps running, eventually please:
>>
>> gcore `pgrep ilbd`
>>
>> and share THAT corefile. You can also do this by yourself:
>>
>> mdb
>> > ::findleaks
>>
>> and share ::findleaks.
>>
>> Once you're done generating corefiles, repeat the steps above, but
>> use "unsetenv LD_PRELOAD" and "unsetenv UMEM_DEBUG" instead of the
>> setenv lines.
>
> Thanks Dan. As we are talking about production boxes here, I will have
> to try and reproduce on another box and then I will give the process
> above a go and see what we come up with.

I have reproduced the problem on a test box.

prstat shows:

3041 daemon 3946M 3946M sleep 59 0 0:48:03 0.1% ilbd/1

memstat:

root@loki:/export/home/BRIGHTON/aslate# echo ::memstat | mdb -k
Page Summary          Pages      MB  %Tot
Kernel               238420     931   12%
ZFS File Data        630861    2464   31%
Anon                1054835    4120   51%
Exec and libs          2204       8    0%
Page cache            10624      41    1%
Free (cachelist)       9236      36    0%
Free (freelist)      105626     412    5%

Total               2051806    8014
Physical            2051805    8014

mdb findleaks:

root@loki:/export/home/BRIGHTON/aslate# mdb core.3041
Loading modules: [ libumem.so.1 libc.so.1 libcmdutils.so.1 libuutil.so.1 ld.so.1 ]
> ::findleaks
findleaks: no memory leaks detected
>

Now, I am seeing lots of log messages like the following in /var/adm/messages

Nov 5 11:17:01 l1-lb2 ilbd[3041]: [ID 410242 daemon.error] ilbd_hc_probe_timer: cannot restart timer: rule ggp server _ggp.11, disabling it

So, I was wrong about growing to 2Gb, the truth is nearer 4Gb.

I am guessing that ilbd_hc_restart_timer is failing because no more memory can be allocated.

I have the 4Gb core file. Is there anything useful I can extract from it to try and spot where the problem is?

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
Hi Dan,

On 05/11/2015 14:57, Dan McDonald wrote:
>> On Nov 5, 2015, at 6:38 AM, Al Slater <al.sla...@scluk.com> wrote:
>>
>> I have the 4Gb core file. Is there anything useful I can extract
>> from it to try and spot where the problem is?
>
> Your one ::findleaks showed nothing. Did your 4GB corefile have
> ::findleaks show nothing as well? ::umausers may be helpful.

root@loki:/export/home/BRIGHTON/aslate# mdb core.3041
Loading modules: [ libumem.so.1 libc.so.1 libcmdutils.so.1 libuutil.so.1 ld.so.1 ]
> ::umausers
71424 bytes for 62 allocations with data size 1152:
        libumem.so.1`umem_cache_alloc_debug+0x1fe
        libumem.so.1`umem_cache_alloc+0x18f
        libumem.so.1`umem_alloc+0x50
        libumem.so.1`umem_malloc+0x36
        libumem.so.1`calloc+0x50
        i_ilbd_alloc_sg+0x13
        ilbd_create_sg+0x9a
        ilbd_scf_instance_walk_pg+0x2a6
        ilbd_walk_sg_pgs+0x37
        i_ilbd_read_config+0x28
        main_loop+0x7f
        main+0x1d3
        _start+0x83
53120 bytes for 664 allocations with data size 80:
        libumem.so.1`umem_cache_alloc_debug+0x1fe
        libumem.so.1`umem_cache_alloc+0x18f
        libumem.so.1`umem_alloc+0x50
        libumem.so.1`umem_malloc+0x36
        libumem.so.1`calloc+0x50
        ilbd_hc_srv_add+0x18
        ilbd_hc_associate_rule+0xd8
        ilbd_create_rule+0x1a3
        ilbd_scf_instance_walk_pg+0x1c4
        ilbd_walk_rule_pgs+0x37
        i_ilbd_read_config+0x4e
        main_loop+0x7f
        main+0x1d3
        _start+0x83
53120 bytes for 664 allocations with data size 80:
        libumem.so.1`umem_cache_alloc_debug+0x1fe
        libumem.so.1`umem_cache_alloc+0x18f
        libumem.so.1`umem_alloc+0x50
        libumem.so.1`umem_malloc+0x36
        libumem.so.1`calloc+0x50
        i_add_srv2sg+0x15
        ilbd_add_server_to_group+0x310
        ilbd_scf_instance_walk_pg+0x2dd
        ilbd_walk_sg_pgs+0x37
        i_ilbd_read_config+0x28
        main_loop+0x7f
        main+0x1d3
        _start+0x83
31584 bytes for 658 allocations with data size 48:
        libumem.so.1`umem_cache_alloc_debug+0x1fe
        libumem.so.1`umem_cache_alloc+0x99
        libumem.so.1`umem_alloc+0x50
        libumem.so.1`umem_malloc+0x36
        libumem.so.1`calloc+0x50
        libinetutil.so.1`iu_schedule_timer_ms+0x2d
        libinetutil.so.1`iu_schedule_timer+0x37
        ilbd_hc_restart_timer+0xbc
        ilbd_hc_probe_timer+0x23
        libinetutil.so.1`iu_expire_timers+0xbe
        ilbd_hc_timeout+0x11
        main_loop+0xe6
        main+0x1d3
        _start+0x83
12288 bytes for 1 allocations with data size 12288:
        libumem.so.1`umem_cache_alloc_debug+0x1fe
        libumem.so.1`umem_cache_alloc+0x18f
        libumem.so.1`umem_alloc+0x50
        libumem.so.1`umem_malloc+0x36
        libc.so.1`ltzset_u+0xa2
        libc.so.1`localtime_r+0x35
        libc.so.1`ctime_r+0x2c
        libc.so.1`vsyslog+0x1e4
        ilbd_log+0x48
        main+0x15e
        _start+0x83
10368 bytes for 54 allocations with data size 192:
        libumem.so.1`umem_cache_alloc_debug+0x1fe
        libumem.so.1`umem_cache_alloc+0x99
        libumem.so.1`umem_alloc+0x50
        libumem.so.1`umem_malloc+0x36
        libumem.so.1`calloc+0x50
        i_alloc_ilbd_rule+0x17
        ilbd_create_rule+0xfa
        ilbd_scf_instance_walk_pg+0x1c4
        ilbd_walk_rule_pgs+0x37
        i_ilbd_read_config+0x4e
        main_loop+0x7f
        main+0x1d3
        _start+0x83

> Sharing the corefile would also be helpful.

I have put it on dropbox
https://www.dropbox.com/s/y6cv78d1xk5j5u7/core.3041.gz?dl=0

> I'm assuming, given you see problems at 4GB that ilbd is a 32-bit
> process, right?

Yes,

# file /usr/lib/inet/ilbd
/usr/lib/inet/ilbd: ELF 32-bit LSB executable 80386 Version 1, dynamically linked, not stripped, no debugging information available

cheers

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England
Company number: 1957652
VAT number: GB 760 2433 55
Re: [OmniOS-discuss] ILB memory leak?
On 21/10/2015 17:35, Dan McDonald wrote:
>
>> On Oct 21, 2015, at 6:08 AM, Al Slater <al.sla...@scluk.com> wrote:
>>
>> Hi,
>>
>> I am running omnios r151014 on a couple of machines with a couple
>> of zones each. 1 zone runs apache as an SSL reverse proxy, the
>> other runs ILB for load balancing web to app tier connections.
>>
>> I noticed that in the ILB zone, the ilbd process memory grows to
>> about 2Gb. Restarting ILB releases the memory, and then the
>> memory usage gradually increases again, with each memory increase
>> approximately 2 * the size of the previous one. I run a cronjob
>> twice a day ( 8am and 8pm) which restarts the ilb service and
>> releases the memory.
>>
>> A graph of memory usage is available at
>> https://www.dropbox.com/s/zaz51apxslnivlq/ILB_Memory_2_days.png?dl=0
>>
>> There are currently 62 rules in the load balancer, with a total
>> of 664 server/port pairs.
>>
>> Is there anything I can provide that would help track this down?
>
> You can use svccfg(1M) to enable user-level memory debugging on ilb.
> It may cause the ilb daemon to dump core. (And you're just noticing
> this in the process, not kernel memory consumption, correct?)

I am seeing kernel memory consumption increasing as well, but that may be a different issue. The ilbd process memory is definitely growing.

> As root:
>
> svcadm disable -t ilb
> svccfg -s ilb setenv LD_PRELOAD libumem.so
> svccfg -s ilb setenv UMEM_DEBUG default
> svccfg -s ilb refresh
> svcadm enable ilb
>
> That should enable user-level memory debugging. If you get a
> coredump, save it and share it. If you don't and the ilb daemon
> keeps running, eventually please:
>
> gcore `pgrep ilbd`
>
> and share THAT corefile. You can also do this by yourself:
>
> mdb
> > ::findleaks
>
> and share ::findleaks.
>
> Once you're done generating corefiles, repeat the steps above, but
> use "unsetenv LD_PRELOAD" and "unsetenv UMEM_DEBUG" instead of the
> setenv lines.

Thanks Dan. As we are talking about production boxes here, I will have to try and reproduce on another box and then I will give the process above a go and see what we come up with.

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England
Company number: 1957652
VAT number: GB 760 2433 55
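The enable and disable sequences in the quoted instructions differ only in the two environment lines, so they can be generated from one place. This is a sketch only: `umem_debug_cmds` is a hypothetical helper name, and it deliberately prints the commands from the message rather than running them, so the output can be reviewed before anything touches the service.

```shell
# Print the command sequence for switching libumem debugging on or off
# for the ilb service. Commands are taken verbatim from the instructions
# above; nothing is executed here.
umem_debug_cmds() {
    case "$1" in
        on)
            env1='setenv LD_PRELOAD libumem.so'
            env2='setenv UMEM_DEBUG default'
            ;;
        off)
            env1='unsetenv LD_PRELOAD'
            env2='unsetenv UMEM_DEBUG'
            ;;
        *)
            echo "usage: umem_debug_cmds on|off" >&2
            return 1
            ;;
    esac
    printf '%s\n' \
        'svcadm disable -t ilb' \
        "svccfg -s ilb $env1" \
        "svccfg -s ilb $env2" \
        'svccfg -s ilb refresh' \
        'svcadm enable ilb'
}
```

On a test box, as root, the reviewed output could then be piped to a shell, e.g. `umem_debug_cmds on | sh -x`, and later `umem_debug_cmds off | sh -x` once the corefiles have been collected.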
[OmniOS-discuss] ILB memory leak?
Hi,

I am running omnios r151014 on a couple of machines with a couple of zones each. 1 zone runs apache as an SSL reverse proxy, the other runs ILB for load balancing web to app tier connections.

I noticed that in the ILB zone, the ilbd process memory grows to about 2Gb. Restarting ILB releases the memory, and then the memory usage gradually increases again, with each memory increase approximately 2 * the size of the previous one. I run a cronjob twice a day ( 8am and 8pm) which restarts the ilb service and releases the memory.

A graph of memory usage is available at
https://www.dropbox.com/s/zaz51apxslnivlq/ILB_Memory_2_days.png?dl=0

There are currently 62 rules in the load balancer, with a total of 664 server/port pairs.

Is there anything I can provide that would help track this down?

--
Al Slater
[OmniOS-discuss] pkgrecv r151014
Hi,

I am trying to pkgrecv r151014 into my own repository and keep bumping into this:

pkgrecv: Invalid contentpath opt/sunstudio12.1/prod/lib/sys/libsunir.so: chash failure: expected: b251c238070b6fdbf392194e85319e2c954a5384 computed: 17d9899f959ac5835569e8870f7e02eb14607242. (happened 4 times)

Is there a problem with this package in the repository?

--
Al Slater
Re: [OmniOS-discuss] pkgrecv r151014
On 06/04/15 11:03, Al Slater wrote:
> Hi,
>
> I am trying to pkgrecv r151014 into my own repository and keep bumping
> into this:
>
> pkgrecv: Invalid contentpath opt/sunstudio12.1/prod/lib/sys/libsunir.so: chash failure: expected: b251c238070b6fdbf392194e85319e2c954a5384 computed: 17d9899f959ac5835569e8870f7e02eb14607242. (happened 4 times)
>
> Is there a problem with this package in the repository?

Same happens with pkg install...

# pkg install pkg:/developer/sunstudio12.1@12.1-0.151014
           Packages to install:  1
       Create boot environment: No
Create backup boot environment: No

DOWNLOAD                    PKGS      FILES    XFER (MB)   SPEED
developer/sunstudio12.1      0/1  5042/7006  203.1/256.3  3.0M/s

Errors were encountered while attempting to retrieve package or file data for the requested operation.
Details follow:

Invalid contentpath opt/sunstudio12.1/prod/lib/sys/libsunir.so: chash failure: expected: b251c238070b6fdbf392194e85319e2c954a5384 computed: 17d9899f959ac5835569e8870f7e02eb14607242. (happened 4 times)

regards

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England
Company number: 1957652
VAT number: GB 760 2433 55
Re: [OmniOS-discuss] pkgrecv r151014
Thanks Eric,

AV on the gateway was the problem.

Al

On 06/04/15 15:14, Eric Sproul wrote:
> On Mon, Apr 6, 2015 at 6:24 AM, Al Slater al.sla...@scluk.com wrote:
>> On 06/04/15 11:03, Al Slater wrote:
>>> Hi,
>>>
>>> I am trying to pkgrecv r151014 into my own repository and keep
>>> bumping into this:
>>>
>>> pkgrecv: Invalid contentpath opt/sunstudio12.1/prod/lib/sys/libsunir.so: chash failure: expected: b251c238070b6fdbf392194e85319e2c954a5384 computed: 17d9899f959ac5835569e8870f7e02eb14607242. (happened 4 times)
>>>
>>> Is there a problem with this package in the repository?
>
> It seems fine from my location:
>
> $ pkg contents -mr developer/sunstudio12.1 | grep libsunir.so
> file 19d832f8b112a9545e9d9b5aaf1384a7a37248f3 chash=b251c238070b6fdbf392194e85319e2c954a5384 elfarch=i386 elfbits=32 elfhash=710138bfbc99dd3aefd4a41dd49b9779cae35f15 group=bin mode=0755
>
> $ curl -s http://pkg.omniti.com/omnios/r151014/file/1/19d832f8b112a9545e9d9b5aaf1384a7a37248f3 | sha1sum
> b251c238070b6fdbf392194e85319e2c954a5384  -
>
> Eric

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England
Company number: 1957652
VAT number: GB 760 2433 55
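The check in the quoted reply (fetch the file from the repository, pipe it through `sha1sum`, compare with the manifest's `chash`) can be wrapped so it returns a pass/fail status. A minimal sketch, assuming `sha1sum` is on the PATH as it is in the quoted session; `chash_matches` is a hypothetical helper name, not a pkg(5) command.

```shell
# Succeed if the SHA-1 of the data on stdin equals the expected chash.
chash_matches() {
    expected=$1
    # sha1sum prints "<hash>  -" for stdin; keep only the hash field.
    computed=$(sha1sum | awk '{ print $1 }')
    [ "$computed" = "$expected" ]
}
```

Run from both sides of a suspect gateway, e.g. `curl -s <file URL> | chash_matches b251c238070b6fdbf392194e85319e2c954a5384 && echo ok`, it would show whether something in the path is altering the payload in transit.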
Re: [OmniOS-discuss] pkgsend generate bug with spaces in file names
On 04/11/14 13:43, Lauri Tirkkonen wrote:
> On Tue, Nov 04 2014 13:35:39 +, Al Slater wrote:
>> I have run into the same problem while packaging cmake. Is there a
>> solution?
>
> I have an open pull request for this, but from what I understand OmniOS'
> pkg isn't currently in a state where they can merge it. The OmniTIers
> can probably elaborate on that, but if you want to apply the patch
> yourself, it's at https://github.com/postwait/pkg5/pull/4

Thanks for that.

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England
Company number: 1957652
VAT number: GB 760 2433 55