Hi,

Starting with 3.2, you should probably begin with the minimal arch:
https://softwarefactory-project.io/cgit/software-factory/sf-config/tree/refarch/minimal.yaml

Then add the other components you need step by step.
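For reference, a minimal arch.yaml uses the same inventory format as the
arch.yaml quoted further down in this thread. The sketch below is only an
illustration (the hostname, IP and exact role list are placeholders); the
authoritative content is the minimal.yaml linked above:

  description: Minimal Software Factory deployment
  inventory:
  - hostname: managesf.example.com
    ip: 192.168.0.10
    name: managesf
    roles:
    - install-server
    - mysql
    - zookeeper
    - gateway
    - cauth
    - managesf
    - gitweb
    - gerrit
    - logserver
    - zuul-scheduler
    - zuul-executor
    - zuul-web
    - zuul-merger
    - nodepool-launcher

Once that deploys cleanly, the extra roles you need (elasticsearch,
nodepool-builder, a hypervisor host, ...) can be added to the inventory one
at a time, re-running sfconfig after each change.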
Regards,
Fabien

On Fri, Apr 12, 2019 at 12:26 PM Tristan Cacqueray <[email protected]> wrote:
>
> On Fri, Apr 12, 2019 at 09:18 RUIZ LOPEZ Noel wrote:
> > Hello
> >
> > First of all, thanks for your quick answer.
> >
> > I have tried to upgrade from 3.0 to 3.2, but I get this error when I try to install the 3.2:
> > "Error: centos-release-ceph-luminous conflicts with centos-release-ceph-jewel-1.0-1.el7.centos.noarch"
> >
>
> Oops, that was addressed by an upgrade note for the 3.1 release, as
> explained here: https://www.softwarefactory-project.io/releases/3.1/
> (search for "Upgrade Notes")
>
> You have to run:
> "yum remove -y centos-release-ceph-jewel" before installing the
> sf-release-3.2.rpm
>
> Regards,
> -Tristan
>
> > I see that I can fix it by adding --skip-broken, but I am not sure
> > this is a good idea, so I will wait for your opinion. In my first
> > attempt I tried to deploy sf 3.2 directly, but with our arch I got
> > some errors and in the end I gave up. Anyway, I can try again.
> >
> > On the other hand, I remounted /srv/host-rootfs to try, but this doesn't work.
> >
> > I attach our arch.yaml in case it helps.
> >
> > ________________________________________
> > From: Tristan Cacqueray [[email protected]]
> > Sent: Friday, April 12, 2019 2:55
> > To: Javier Pena; RUIZ LOPEZ Noel; [email protected]
> > Subject: Re: [Softwarefactory-dev] Zuul NODE FAILURE
> >
> > On Thu, Apr 11, 2019 at 12:08 Javier Pena wrote:
> >> ----- Original Message -----
> >>
> >>> Hello,
> >>
> >>> I have just deployed Software Factory and when I tried to test how zuul
> >>> works, I got the following zuul error:
> >>
> >>> NODE_FAILURE
> >>
> >>> Now, I can see that the nodes always stay in the "building" state.
> >>
> >>> nodepool log:
> >>
> >>> 2019-04-11 17:55:18,203 ERROR nodepool.NodeLauncher-0000000046: Launch attempt 9/9 failed for node 0000000046:
> >>> Traceback (most recent call last):
> >>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py", line 40, in launch
> >>>     self.handler.pool, hostid, port, self.label)
> >>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/provider.py", line 149, in createContainer
> >>>     "Manager %s failed to initialized" % self.provider.name)
> >>> RuntimeError: Manager oci-provider-hypervisor-oci failed to initialized
> >>> 2019-04-11 17:55:19,208 ERROR nodepool.NodeLauncher-0000000046: Launch failed for node 0000000046:
> >>> Traceback (most recent call last):
> >>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/__init__.py", line 659, in run
> >>>     self.launch()
> >>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py", line 57, in launch
> >>>     self.node.host_keys = key
> >>> UnboundLocalError: local variable 'key' referenced before assignment
> >>> 2019-04-11 17:55:19,208 ERROR nodepool.NodeLauncher-0000000045: Launch failed for node 0000000045:
> >>> Traceback (most recent call last):
> >>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/__init__.py", line 659, in run
> >>>     self.launch()
> >>>   File "/opt/rh/rh-python35/root/usr/lib/python3.5/site-packages/nodepool/driver/oci/handler.py", line 57, in launch
> >>>     self.node.host_keys = key
> >>> UnboundLocalError: local variable 'key' referenced before assignment
> >>> 2019-04-11 17:55:22,918 INFO nodepool.DeletedNodeWorker: Deleting failed instance 0000000045-centos-oci-100-0000000045 from oci-provider-hypervisor-oci
> >>> 2019-04-11 17:55:22,926 INFO nodepool.NodeDeleter: Deleting ZK node id=0000000045, state=deleting, external_id=0000000045-centos-oci-100-0000000045
> >>> 2019-04-11 17:55:22,934 INFO nodepool.DeletedNodeWorker: Deleting failed instance 0000000046-centos-oci-100-0000000046 from oci-provider-hypervisor-oci
> >>> 2019-04-11 17:55:22,940 INFO nodepool.NodeDeleter: Deleting ZK node id=0000000046, state=deleting, external_id=0000000046-centos-oci-100-0000000046
> >>> 2019-04-11 17:55:26,276 INFO nodepool.NodePool: Creating requests for 2 centos-oci nodes
> >>> 2019-04-11 17:55:29,822 INFO nodepool.PoolWorker.oci-provider-hypervisor-oci-main: Assigning node request <NodeRequest {'id': '100-0000000047', 'node_types': ['centos-oci'], 'state': 'requested', 'state_time': 1554998126.2781763, 'stat': ZnodeStat(czxid=11466, mzxid=11466, ctime=1554998126279, mtime=1554998126279, version=0, cversion=0, aversion=0, ephemeralOwner=0, dataLength=217, numChildren=0, pzxid=11466), 'nodes': [], 'reuse': False, 'declined_by': [], 'requestor': 'NodePool:min-ready'}>
> >>> 2019-04-11 17:55:29,845 WARNING nodepool.driver.oci.OpenContainerProvider: Creating container when provider isn't ready
> >>
> >>> Any idea?
> >
> > Hello Noel,
> >
> > NODE_FAILURE indicates a failure to start the nodes, and the exception you
> > found in the logs is an issue that has been fixed in newer versions.
> > It seems like you deployed Software Factory version 3.0; since 3.1 the
> > driver has been renamed to runC and greatly improved.
> > Can you try to upgrade to version 3.2:
> >
> > https://www.softwarefactory-project.io/docs/3.2/operator/upgrade.html
> >
> > After the upgrade process, please restart the instance (that's because
> > we don't support upgrades from 3.0, and a restart is needed to refresh
> > the services).
> >
> >
> >>
> >> Hi Noel,
> >>
> >> I'm not sure if it's the same situation, but the last time I tried to use
> >> oci containers I had to remount /srv/host-rootfs as read-write before it
> >> would work the first time (it is mounted as read-only by default). After
> >> this and a reboot, it worked fine as ro.
> >>
> >> So can you try a quick "mount -o remount,rw /srv/host-rootfs" and see
> >> if it fixes it?
> >>
> >
> > Since version 3.1 (and the rename to runC), we fixed a critical issue
> > with bubblewrap and this remount shouldn't be needed anymore.
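Putting the upgrade notes above together, the sequence on the install-server
would look roughly like this (a sketch only: the sf-release URL shown here is
assumed, and the authoritative steps are in the 3.2 upgrade guide linked
above):

  # 3.1 upgrade note: remove the conflicting ceph repo package first
  yum remove -y centos-release-ceph-jewel

  # Install the 3.2 release package (URL assumed, check the release notes),
  # update the packages, then run the upgrade playbooks
  yum install -y https://softwarefactory-project.io/repos/sf-release-3.2.rpm
  yum update -y
  sfconfig --upgrade

  # As Tristan suggests for a 3.0-based deployment, reboot afterwards to
  # make sure all services are refreshed
  reboot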
> >
> > Regards,
> > -Tristan
> >
> > description: Minimal Software Factory deployment
> > inventory:
> > - hostname: managesf.sftests.com
> >   ip: 10.6.71.81
> >   name: managesf
> >   public_url: https://sftests.com
> >   roles:
> >   - install-server
> >   - mysql
> >   - zookeeper
> >   - gateway
> >   - cauth
> >   - managesf
> >   - etherpad
> >   - lodgeit
> >   - gitweb
> >   - gerrit
> >   - gerritbot
> >   - logserver
> >   - zuul-scheduler
> >   - zuul-executor
> >   - zuul-web
> >   - nodepool-launcher
> >   - murmur
> >   - mirror
> >   - kibana
> >   - repoxplorer
> >   - hydrant
> >   - firehose
> >   - grafana
> >   - rabbitmq
> >   - storyboard
> >   - storyboard-webclient
> > - hostname: elk.sftests.com
> >   ip: 192.168.71.82
> >   name: elk
> >   public_url: http://elk.sftests.com
> >   roles:
> >   - elasticsearch
> >   - logstash
> >   - influxdb
> > - hostname: nodepool-builder.sftests.com
> >   ip: 192.168.71.83
> >   name: nodepool-builder
> >   public_url: http://nodepool-builder.sftests.com
> >   roles:
> >   - nodepool-builder
> > - hostname: zuul-merger.sftests.com
> >   ip: 192.168.71.84
> >   name: zuul-merger
> >   public_url: http://zuul-merger.sftests.com
> >   roles:
> >   - zuul-merger
> > - hostname: hypervisor-oci.sftests.com
> >   ip: 192.168.71.86
> >   max-servers: 10
> >   name: hypervisor-oci
> >   public_url: http://hypervisor-oci.sftests.com
> >   remote: true
> >   roles:
> >   - hypervisor-oci
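One thing to watch when carrying this arch.yaml over to 3.2: Tristan notes
above that the oci driver has been renamed to runC since 3.1, so the
hypervisor-oci role on the last host presumably needs to be renamed too. The
snippet below assumes the new role is called hypervisor-runc; please verify
the exact name against the 3.2 refarch examples before using it:

  - hostname: hypervisor-oci.sftests.com
    ip: 192.168.71.86
    max-servers: 10
    name: hypervisor-oci
    public_url: http://hypervisor-oci.sftests.com
    remote: true
    roles:
    - hypervisor-runc  # assumed replacement for hypervisor-oci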
_______________________________________________
Softwarefactory-dev mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/softwarefactory-dev
