Hello

First of all, thanks for your quick answer.

I have tried to upgrade from 3.0 to 3.2, but I get this error when I try to
install 3.2:

"Error: centos-release-ceph-luminous conflicts with
centos-release-ceph-jewel-1.0-1.el7.centos.noarch"

I see that I can get past it by adding --skip-broken, but I am not sure that
is a good idea, so I will wait for your opinion. In my first approach I tried
to deploy SF 3.2 directly, but with our arch I got some errors and in the end
I gave up. I can try again anyway.
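
Instead of --skip-broken, I was thinking of removing the old Ceph Jewel repo
package first and then retrying the install (assuming nothing still depends
on it; please correct me if this is a bad idea):

  # check whether anything still requires the old repo package
  rpm -q --whatrequires centos-release-ceph-jewel

  # if nothing does, remove it so centos-release-ceph-luminous can be
  # installed without the conflict, then retry the 3.2 install
  yum remove centos-release-ceph-jewel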


On the other hand, I remounted /srv/host-rootfs to try, but it doesn't work.
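
For reference, this is roughly what I ran (findmnt just confirms whether the
new options took effect):

  mount -o remount,rw /srv/host-rootfs
  findmnt /srv/host-rootfs    # 'rw' should now appear in the options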


I attach our arch.yaml in case it helps.

________________________________________
From: Tristan Cacqueray [[email protected]]
Sent: Friday, April 12, 2019 2:55
To: Javier Pena; RUIZ LOPEZ Noel; [email protected]
Subject: Re: [Softwarefactory-dev] Zuul NODE FAILURE

On Thu, Apr 11, 2019 at 12:08 Javier Pena wrote:
> ----- Original Message -----
>
>> Hello,
>
>> I have just deployed Software Factory and when I tried to test how Zuul
>> works, I got the following Zuul error:
>
>> NODE_FAILURE
>
>> Now I can see that the nodes always stay in the "building" state.
>
>> nodepool log:
>
>> 2019-04-11 17:55:18,203 ERROR nodepool.NodeLauncher-0000000046: Launch attempt 9/9 failed for node 0000000046:
>> Traceback (most recent call last):
>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py", line 40, in launch
>>     self.handler.pool, hostid, port, self.label)
>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/provider.py", line 149, in createContainer
>>     "Manager %s failed to initialized" % self.provider.name)
>> RuntimeError: Manager oci-provider-hypervisor-oci failed to initialized
>> 2019-04-11 17:55:19,208 ERROR nodepool.NodeLauncher-0000000046: Launch failed for node 0000000046:
>> Traceback (most recent call last):
>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/__init__.py", line 659, in run
>>     self.launch()
>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py", line 57, in launch
>>     self.node.host_keys = key
>> UnboundLocalError: local variable 'key' referenced before assignment
>> 2019-04-11 17:55:19,208 ERROR nodepool.NodeLauncher-0000000045: Launch failed for node 0000000045:
>> Traceback (most recent call last):
>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/__init__.py", line 659, in run
>>     self.launch()
>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py", line 57, in launch
>>     self.node.host_keys = key
>> UnboundLocalError: local variable 'key' referenced before assignment
>> 2019-04-11 17:55:22,918 INFO nodepool.DeletedNodeWorker: Deleting failed instance 0000000045-centos-oci-100-0000000045 from oci-provider-hypervisor-oci
>> 2019-04-11 17:55:22,926 INFO nodepool.NodeDeleter: Deleting ZK node id=0000000045, state=deleting, external_id=0000000045-centos-oci-100-0000000045
>> 2019-04-11 17:55:22,934 INFO nodepool.DeletedNodeWorker: Deleting failed instance 0000000046-centos-oci-100-0000000046 from oci-provider-hypervisor-oci
>> 2019-04-11 17:55:22,940 INFO nodepool.NodeDeleter: Deleting ZK node id=0000000046, state=deleting, external_id=0000000046-centos-oci-100-0000000046
>> 2019-04-11 17:55:26,276 INFO nodepool.NodePool: Creating requests for 2 centos-oci nodes
>> 2019-04-11 17:55:29,822 INFO nodepool.PoolWorker.oci-provider-hypervisor-oci-main: Assigning node request <NodeRequest {'id': '100-0000000047', 'node_types': ['centos-oci'], 'state': 'requested', 'state_time': 1554998126.2781763, 'stat': ZnodeStat(czxid=11466, mzxid=11466, ctime=1554998126279, mtime=1554998126279, version=0, cversion=0, aversion=0, ephemeralOwner=0, dataLength=217, numChildren=0, pzxid=11466), 'nodes': [], 'reuse': False, 'declined_by': [], 'requestor': 'NodePool:min-ready'}>
>> 2019-04-11 17:55:29,845 WARNING nodepool.driver.oci.OpenContainerProvider: Creating container when provider isn't ready
>
>> Any idea?

Hello Noel,

NODE_FAILURE indicates a failure to start the nodes, and the exception you
found in the logs is an issue that has been fixed in newer versions.
It seems like you deployed Software Factory version 3.0; since 3.1 the
driver has been renamed to runC and greatly improved.
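
To double-check which version is deployed, you can query the sf-config
package on the install-server (assuming the standard RPM-based deployment):

  rpm -q sf-config    # the package version tracks the Software Factory release
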
Can you try to upgrade to version 3.2:

https://www.softwarefactory-project.io/docs/3.2/operator/upgrade.html

After the upgrade process, please restart the instance (that's because
we don't support upgrades from 3.0, and a restart is needed to refresh the
services).
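
In short, the documented procedure looks roughly like this (run it on the
install-server; double-check the exact sf-release URL against the page
above, I am quoting it from memory):

  yum install -y https://softwarefactory-project.io/repos/sf-release-3.2.rpm
  yum update -y sf-config
  sfconfig --upgrade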


>
> Hi Noel,
>
> I'm not sure if it's the same situation, but last time I tried to use oci 
> containers I had to remount /srv/host-rootfs as read-write before it would 
> work the first time (it is mounted as read-only by default). After this and a 
> reboot, it worked fine as ro.
>
> So can you try a quick "mount -o remount,rw /srv/host-rootfs" and see if it 
> fixes it?
>

Since version 3.1 (and the rename to runC), we fixed a critical issue
with bubblewrap and this remount shouldn't be needed anymore.

Regards,
-Tristan

________________________________________
arch.yaml:

description: Minimal Software Factory deployment
inventory:
- hostname: managesf.sftests.com
  ip: 10.6.71.81
  name: managesf
  public_url: https://sftests.com
  roles:
  - install-server
  - mysql
  - zookeeper
  - gateway
  - cauth
  - managesf
  - etherpad
  - lodgeit
  - gitweb
  - gerrit
  - gerritbot
  - logserver
  - zuul-scheduler
  - zuul-executor
  - zuul-web
  - nodepool-launcher
  - murmur
  - mirror
  - kibana
  - repoxplorer
  - hydrant
  - firehose
  - grafana
  - rabbitmq
  - storyboard
  - storyboard-webclient
- hostname: elk.sftests.com
  ip: 192.168.71.82
  name: elk
  public_url: http://elk.sftests.com
  roles:
  - elasticsearch
  - logstash
  - influxdb
- hostname: nodepool-builder.sftests.com
  ip: 192.168.71.83
  name: nodepool-builder
  public_url: http://nodepool-builder.sftests.com
  roles:
  - nodepool-builder
- hostname: zuul-merger.sftests.com
  ip: 192.168.71.84
  name: zuul-merger
  public_url: http://zuul-merger.sftests.com
  roles:
  - zuul-merger
- hostname: hypervisor-oci.sftests.com
  ip: 192.168.71.86
  max-servers: 10
  name: hypervisor-oci
  public_url: http://hypervisor-oci.sftests.com
  remote: true
  roles:
  - hypervisor-oci
_______________________________________________
Softwarefactory-dev mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/softwarefactory-dev
