On Thu, Apr 11, 2019 at 12:08 Javier Pena wrote:
> ----- Original Message -----
>
>> Hello,
>
>> I have just deployed software factory and when I tried to test how zuul
>> works, I got the following zuul error:
>
>> NODE_FAILURE
>
>> Now, I can see that nodes always keep "building" state.
>
>> nodepool log :
>
>> 2019-04-11 17:55:18,203 ERROR nodepool.NodeLauncher-0000000046: Launch
>> attempt 9/9 failed for node 0000000046:
>> Traceback (most recent call last):
>> File
>> "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py",
>> line 40, in launch
>> self.handler.pool, hostid, port, self.label)
>> File
>> "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/provider.py",
>> line 149, in createContainer
>> "Manager %s failed to initialized" % self.provider.name)
>> RuntimeError: Manager oci-provider-hypervisor-oci failed to initialized
>> 2019-04-11 17:55:19,208 ERROR nodepool.NodeLauncher-0000000046: Launch failed
>> for node 0000000046:
>> Traceback (most recent call last):
>> File
>> "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/__init__.py",
>> line 659, in run
>> self.launch()
>> File
>> "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py",
>> line 57, in launch
>> self.node.host_keys = key
>> UnboundLocalError: local variable 'key' referenced before assignment
>> 2019-04-11 17:55:19,208 ERROR nodepool.NodeLauncher-0000000045: Launch failed
>> for node 0000000045:
>> Traceback (most recent call last):
>> File
>> "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/__init__.py",
>> line 659, in run
>> self.launch()
>> File
>> "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py",
>> line 57, in launch
>> self.node.host_keys = key
>> UnboundLocalError: local variable 'key' referenced before assignment
>> 2019-04-11 17:55:22,918 INFO nodepool.DeletedNodeWorker: Deleting failed
>> instance 0000000045-centos-oci-100-0000000045 from
>> oci-provider-hypervisor-oci
>> 2019-04-11 17:55:22,926 INFO nodepool.NodeDeleter: Deleting ZK node
>> id=0000000045, state=deleting,
>> external_id=0000000045-centos-oci-100-0000000045
>> 2019-04-11 17:55:22,934 INFO nodepool.DeletedNodeWorker: Deleting failed
>> instance 0000000046-centos-oci-100-0000000046 from
>> oci-provider-hypervisor-oci
>> 2019-04-11 17:55:22,940 INFO nodepool.NodeDeleter: Deleting ZK node
>> id=0000000046, state=deleting,
>> external_id=0000000046-centos-oci-100-0000000046
>> 2019-04-11 17:55:26,276 INFO nodepool.NodePool: Creating requests for 2
>> centos-oci nodes
>> 2019-04-11 17:55:29,822 INFO
>> nodepool.PoolWorker.oci-provider-hypervisor-oci-main: Assigning node request
>> <NodeRequest {'id': '100-0000000047', 'node_types': ['centos-oci'], 'state':
>> 'requested', 'state_time': 1554998126.2781763, 'stat':
>> ZnodeStat(czxid=11466, mzxid=11466, ctime=1554998126279,
>> mtime=1554998126279, version=0, cversion=0, aversion=0, ephemeralOwner=0,
>> dataLength=217, numChildren=0, pzxid=11466), 'nodes': [], 'reuse': False,
>> 'declined_by': [], 'requestor': 'NodePool:min-ready'}>
>> 2019-04-11 17:55:29,845 WARNING nodepool.driver.oci.OpenContainerProvider:
>> Creating container when provider isn't ready
>
>> Any idea?

Hello Noel,

NODE_FAILURE indicates that Nodepool failed to start the nodes, and the
exception you found in the logs is an issue that has been fixed in newer
versions. It looks like you deployed Software Factory 3.0; since 3.1 the
driver has been renamed to runC and greatly improved.
Could you try upgrading to version 3.2?

https://www.softwarefactory-project.io/docs/3.2/operator/upgrade.html

After the upgrade process, please restart the instance (that's because
we don't support upgrades from 3.0, and a restart is needed to refresh
the services).
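
For what it's worth, the second traceback (UnboundLocalError) is just a
follow-on of the first one: when every launch attempt fails, the local
variable holding the host key is never assigned before the driver tries to
store it. Here is a minimal, self-contained sketch of that pattern
(illustrative names only, not the actual oci driver code):

    # Minimal sketch of the failure pattern behind the two tracebacks above
    # (illustrative names only, not the actual nodepool oci driver code).

    def create_container(manager_ready):
        """Stand-in for the container creation step."""
        if not manager_ready:
            # First error in the log: the provider manager never initialized.
            raise RuntimeError("Manager oci-provider failed to initialize")
        return "ssh-rsa AAAA... host key"


    def launch(manager_ready=False, attempts=9):
        for _ in range(attempts):
            try:
                key = create_container(manager_ready)
                break
            except RuntimeError:
                continue  # every attempt fails, so 'key' is never bound
        # With no successful attempt, the next line raises
        # "UnboundLocalError: local variable 'key' referenced before
        # assignment", which is the second error in the log; newer drivers
        # fail the node cleanly instead of reaching this point.
        return key


    if __name__ == "__main__":
        launch()

Running the sketch reproduces the same UnboundLocalError; the real fix is
in the newer runC driver, hence the upgrade suggestion above.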


>
> Hi Noel, 
>
> I'm not sure if it's the same situation, but last time I tried to use oci 
> containers I had to remount /srv/host-rootfs as read-write before it would 
> work the first time (it is mounted as read-only by default). After this and a 
> reboot, it worked fine as ro. 
>
> So can you try a quick "mount -o remount,rw /srv/host-rootfs" and see if it 
> fixes it? 
>

Since version 3.1 (and the rename to runC), we have fixed a critical issue
with bubblewrap, so this remount shouldn't be needed anymore.

Regards,
-Tristan
