Re: [go-cd] Re: GOCD AWS ECS Elastic Agent allocation is falling

Sriram Narayanan Tue, 03 Sep 2024 05:36:05 -0700

(I am ill, so please excuse the limited questions.)
- Does the ECS consumer get created and registered if you remove the user
data script?
- What changed between when this ECS setup used to work and now?
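
One thing that may be worth ruling out while answering the first question: the user data script quoted below writes `--storage-driver devicemapper` into /etc/sysconfig/docker-storage when extra volumes are present, and the instance reports Docker 25.0.5. As far as I know, the devicemapper storage driver was removed in Docker Engine 25.0, so please verify that against the Docker release notes before relying on it. A minimal sketch of the check, where the function names are hypothetical helpers and not part of GoCD or the ECS plugin:

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not GoCD or ECS plugin APIs.

# Extract the major version from `docker --version` output,
# e.g. "Docker version 25.0.5, build 5dc9bcc" -> 25
docker_major() {
    printf '%s\n' "$1" | sed -E 's/^Docker version ([0-9]+)\..*/\1/'
}

# Succeed if the given docker-storage style file selects the devicemapper driver.
uses_devicemapper() {
    grep -Eq -- '--storage-driver[ =]+devicemapper' "$1" 2>/dev/null
}

# Print a one-line verdict for a version string and a storage options file.
# Assumption (verify it): devicemapper is unavailable from Docker Engine 25.0 on.
check_storage_driver() {
    major=$(docker_major "$1")
    case "$major" in
        ''|*[!0-9]*)
            echo "unknown: could not parse Docker version"
            return 1
            ;;
    esac
    if uses_devicemapper "$2" && [ "$major" -ge 25 ]; then
        echo "suspect: devicemapper requested on Docker $major"
    else
        echo "ok: no devicemapper conflict detected"
    fi
}

# Intended use on a failing instance (not executed here):
#   check_storage_driver "$(docker --version)" /etc/sysconfig/docker-storage
```

If it reports "suspect", the output of `journalctl -u docker` on that instance should confirm or refute whether the storage driver is the reason docker.service will not start.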


— Sriram

On Tue, 3 Sep 2024 at 7:23 PM, pradeep devaraj <[email protected]>
wrote:

> Hi Team / Chad Wilson,
>
> The Docker service and the ECS service both fail when a new server comes up.
> AMI ID: ami-0a5f593ecaa0f722d (a community one). When we manually spin up the
> server and attach it via the ASG, it registers with the cluster. When we try
> the same from the GoCD ECS cluster profile (AWS ECS Elastic Agent plugin), it
> does not work, and the Docker service and the ECS service both fail.
>
>
>
> On Monday, September 2, 2024 at 11:21:06 PM UTC+5:30 pradeep devaraj wrote:
>
>> Adding more detail:
>>
>> We are seeing agent creation and deletion in a loop:
>> [go] Received a request to create an agent for the job:
>> [SpecOps_UAT_Elastic_Img_crt/6/test/1/test]
>> [go] No running instance(s) found to build the ECS Task to perform
>> current job.
>> [go] Creating a new container instance to schedule ECS Task.
>> [go] Waiting for instance(s) ([i-061187c3d2ea07317]) to register with
>> cluster.
>> [go] Received a request to create an agent for the job:
>> [SpecOps_UAT_Elastic_Img_crt/6/test/1/test]
>> [go] No running instance(s) found to build the ECS Task to perform
>> current job.
>> [go] Creating a new container instance to schedule ECS Task.
>> [go] Waiting for instance(s) ([i-00bb68d594121ab15]) to register with
>> cluster.
>> [go] Received a request to create an agent for the job:
>> [SpecOps_UAT_Elastic_Img_crt/6/test/1/test]
>> [go] No running instance(s) found to build the ECS Task to perform
>> current job.
>> [go] Creating a new container instance to schedule ECS Task.
>>
>> On Monday, September 2, 2024 at 9:55:48 PM UTC+5:30 pradeep devaraj wrote:
>>
>>> We are using the GoCD AWS ECS Elastic Agent plugin.
>>> GoCD version: 23.4.0
>>>
>>> GoCD Elastic Agent Plugin for Amazon ECS
>>>
>>>    - Version 7.3.0-416
>>>
>>>
>>> *AMI ID:* ami-0ba9fb6bc8faf1fe0
>>>
>>>
>>> *The elastic instance comes up but is not assigned to the ECS
>>> cluster. We logged in to the server and found the error below.*
>>>
>>> [root@ip-******* ~]# systemctl restart docker
>>> Job for docker.service failed because start of the service was attempted
>>> too often. See "systemctl status docker.service" and "journalctl -xe" for
>>> details.
>>> To force a start use "systemctl reset-failed docker.service" followed by
>>> "systemctl start docker.service" again.
>>> [root@ip- *******   ~]# journalctl -xe
>>> -- Defined-By: systemd
>>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>> --
>>> -- Unit ecs.service has finished shutting down.
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]: start
>>> request repeated too quickly for docker.service
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]: Failed to
>>> start Docker Application Container Engine.
>>> -- Subject: Unit docker.service has failed
>>> -- Defined-By: systemd
>>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>> --
>>> -- Unit docker.service has failed.
>>> --
>>> -- The result is failed.
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]:
>>> docker.service failed.
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]: Starting
>>> Amazon Elastic Container Service - container agent...
>>> -- Subject: Unit ecs.service has begun start-up
>>> -- Defined-By: systemd
>>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>> --
>>> -- Unit ecs.service has begun starting up.
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]:
>>> ecs.service: control process exited, code=exited status=1
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon amazon-ecs-init[6236]:
>>> level=info time=2024-09-02T16:03:20Z msg="post-stop"
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon amazon-ecs-init[6236]:
>>> level=info time=2024-09-02T16:03:20Z msg="Cleaning up the credentials
>>> endpoint setup for Amazon El
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon amazon-ecs-init[6236]:
>>> level=error time=2024-09-02T16:03:20Z msg="Error performing action 'delete'
>>> for iptables route: ex
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon amazon-ecs-init[6236]:
>>> level=error time=2024-09-02T16:03:20Z msg="Error performing action 'delete'
>>> for iptables route: ex
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon amazon-ecs-init[6236]:
>>> level=error time=2024-09-02T16:03:20Z msg="Error performing action 'delete'
>>> for iptables route: ex
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon amazon-ecs-init[6236]:
>>> level=error time=2024-09-02T16:03:20Z msg="Error performing action 'delete'
>>> for iptables route: ex
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]: Failed to
>>> start Amazon Elastic Container Service - container agent.
>>> -- Subject: Unit ecs.service has failed
>>> -- Defined-By: systemd
>>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>> --
>>> -- Unit ecs.service has failed.
>>> --
>>> -- The result is failed.
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]: Unit
>>> ecs.service entered failed state.
>>> Sep 02 16:03:20 ip-10-226-11-63.aws.cloud.epsilon systemd[1]:
>>> ecs.service failed.
>>>
>>>
>>>
>>> [root@ipXXXX ~]# df -hT
>>> Filesystem     Type      Size  Used Avail Use% Mounted on
>>> devtmpfs       devtmpfs  7.7G     0  7.7G   0% /dev
>>> tmpfs          tmpfs     7.7G     0  7.7G   0% /dev/shm
>>> tmpfs          tmpfs     7.7G  376K  7.7G   1% /run
>>> tmpfs          tmpfs     7.7G     0  7.7G   0% /sys/fs/cgroup
>>> /dev/nvme0n1p1 xfs       100G  2.4G   98G   3% /
>>> tmpfs          tmpfs     1.6G     0  1.6G   0% /run/user/0
>>> [root@ip-10-226-11-63 ~]# docker --version
>>> Docker version 25.0.5, build 5dc9bcc
>>>
>>> Below is the user data script we are using; it exits with an error while
>>> the instance is spinning up.
>>>
>>> "ECS_INSTANCE_ATTRIBUTES={"server-id":"31e424ad-e242-45d2-a5bb-0ef7be0d8306"}
>>> EOT
>>> echo 'File /etc/ecs/ecs.config successfully created.'
>>> log "Finished executing GoCD's user data script, now executing custom user data script from use, if present."
>>> #!/bin/bash
>>> echo "ECS_CLUSTER=GoCD-ECS-UAT"  >> /etc/ecs/ecs.config
>>> log "Finished executing user specified user data script."
>>> --//
>>> #cloud-config
>>> cloud_final_modules:
>>> - [scripts-user, always]
>>> --//
>>> Content-Type: text/x-shellscript; charset="us-ascii"
>>> MIME-Version: 1.0
>>> Content-Transfer-Encoding: 7bit
>>> Content-Disposition: attachment; filename="initialize_instance_store"
>>> #!/bin/bash
>>> exec > >(tee /var/log/initialize_instance_store.log | logger -t user-data -s 2>/dev/console) 2>&1
>>> function log() {
>>>     echo "[$(date "+%Y-%m-%d %H:%M:%S")] - $1" >> /var/log/initialize_instance_store.log
>>> }
>>> function try() {
>>>    $@
>>>    return 0
>>> }
>>> log "Starting to setup instance store for the docker."
>>> INSTANCE_STORES=$(ls /dev/disk/by-id/*EC2_NVMe_Instance_Storage*-ns-1)
>>> if [ -z "${INSTANCE_STORES}" ]; then
>>>     log "No instance store detected."
>>> fi
>>> VOLUMES="$INSTANCE_STORES"
>>> if [ -e "/dev/xvdcz" ]; then
>>>     log "Instance has /dev/xvdcz EBS volume. Using it for docker logical volume group."
>>>     VOLUMES="$VOLUMES /dev/xvdcz"
>>> fi
>>> if [ -z "${VOLUMES}" ]; then
>>>     log "No addition volumes. Using box standard docker setup."
>>> else
>>>     log "Available instance stores: ${VOLUMES}."
>>>     log "Setting up the docker logical volume group."
>>>     service docker stop
>>>     rm -rf /var/lib/docker/*
>>>     dmsetup remove_all
>>>     VOLUME_GROUP=docker
>>>     LOGICAL_VOLUME=docker-pool
>>>     try vgremove -y "${VOLUME_GROUP}"
>>>     try lvremove -y "${LOGICAL_VOLUME}"
>>>     vgcreate -y "${VOLUME_GROUP}" ${VOLUMES}
>>>     sleep 2
>>>     lvcreate -y -l 5%VG -n ${LOGICAL_VOLUME}\meta ${VOLUME_GROUP}
>>>     lvcreate -y -l 90%VG -n ${LOGICAL_VOLUME} ${VOLUME_GROUP}
>>>     sleep 2
>>>     lvconvert -y --zero n --thinpool ${VOLUME_GROUP}/${LOGICAL_VOLUME} --poolmetadata ${VOLUME_GROUP}/${LOGICAL_VOLUME}\meta
>>>     echo 'DOCKER_STORAGE_OPTIONS=" --storage-driver devicemapper --storage-opt dm.thinpooldev=/dev/mapper/docker-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.use_deferred_deletion=true --storage-opt dm.fs=ext4 --storage-opt dm.use_deferred_deletion=true"' > /etc/sysconfig/docker-storage
>>>     test -f /bin/systemctl && systemctl reset-failed docker.service
>>>     service docker restart
>>>     test -f /bin/systemctl && systemctl enable --no-block --now ecs
>>> fi
>>> log "Setup completed."
>>> --//"
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "go-cd" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/go-cd/763a2904-4962-4c8b-ae2a-b8bf72701e5bn%40googlegroups.com
> <https://groups.google.com/d/msgid/go-cd/763a2904-4962-4c8b-ae2a-b8bf72701e5bn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

