Does renaming the pool back have the potential to break other things?
Alternatively, we could presumably update
https://github.com/apache/beam/blob/4718cdff87fed4f92636e94dbf3a04c2315d6a95/.test-infra/jenkins/job_IODatastoresCredentialsRotation.groovy#L38
to use pool-1 instead.

I put up https://github.com/apache/beam/pull/24466 in case that is
preferable.
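
For illustration only, here is a minimal Python sketch of a third option:
have the rotation step pick whichever pool actually exists instead of
hardcoding one name. This is not what the Groovy job or the PR does. The
helper names (choose_node_pool, build_upgrade_command) are hypothetical, and
the list of pool names is assumed to come from something like
`gcloud container node-pools list`. Only the upgrade flags are taken from the
failing job's log below. The thread spells the new pool both 'pool1' and
'pool-1', so the name used here is a placeholder.

```python
from typing import List, Sequence


def choose_node_pool(available: Sequence[str], preferred: Sequence[str]) -> str:
    """Return the first name in `preferred` that exists in `available`."""
    for name in preferred:
        if name in available:
            return name
    # Fail loudly rather than passing a nonexistent pool name to gcloud.
    raise ValueError(f"none of {list(preferred)} found in {list(available)}")


def build_upgrade_command(cluster: str, zone: str, node_pool: str) -> List[str]:
    """Build the argv for the upgrade step seen in the failing job's log."""
    return [
        "gcloud", "container", "clusters", "upgrade", cluster,
        f"--node-pool={node_pool}",
        f"--zone={zone}",
        "--quiet",
    ]


if __name__ == "__main__":
    # Hypothetical listing; in practice this would be read from the cluster.
    pools = ["pool-1"]
    pool = choose_node_pool(pools, preferred=["default-pool", "pool-1"])
    print(" ".join(build_upgrade_command("io-datastores", "us-central1-a", pool)))
```

The upside would be that a future rename fails with a clear message (or
keeps working if a known name is still present). The downside is more logic
in a job that currently runs a single command.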

On Thu, Dec 1, 2022 at 1:29 PM Yi Hu <ya...@google.com> wrote:

> Thanks for reporting. I bumped the pool size of io-datastores because more
> tests are being added and the default-pool frequently becomes
> unschedulable due to memory constraints. A simple fix is to rename
> 'pool1' back to 'default-pool'.
>
> On Thu, Dec 1, 2022 at 1:26 PM Danny McCormick <dannymccorm...@google.com>
> wrote:
>
>> Yes, I was just starting to look into this. Looks like this is the result
>> of this job failing -
>> https://github.com/apache/beam/blob/ec2a07b38c1f640c62e7c3b96966f18b334a7ce9/.test-infra/jenkins/job_IODatastoresCredentialsRotation.groovy#L49
>>
>> The error is:
>>
>> ```
>> 21:25:58 + gcloud container clusters upgrade io-datastores --node-pool=default-pool --zone=us-central1-a --quiet
>> 21:25:59 ERROR: (gcloud.container.clusters.upgrade) No node pool found matching the name [default-pool].
>> ```
>>
>> from
>> https://ci-beam.apache.org/job/Rotate%20IO-Datastores%20Cluster%20Credentials/6/console
>>
>> It looks like some change to the cluster is causing the job to fail. If
>> we don't fix this and rerun, the cluster's credentials will expire
>> (probably in about a month). I'm not sure what the impact of that would
>> be, but I'd guess broken IO integration tests.
>>
>> @John Casey <johnjca...@google.com> or @Yi Hu <ya...@google.com> might
>> know more about this. I think the cluster in question is
>> https://pantheon.corp.google.com/kubernetes/clusters/details/us-central1-a/io-datastores/details?mods=dataflow_dev&project=apache-beam-testing
>>
>> Next steps are:
>> 1) Figuring out why there's no longer a default-pool
>> 2) Either recreating it or modifying the cred rotation logic
>> 3) (Minor) Fixing the url in the Jenkins job so it actually points to the
>> failing job when we get emails like this
>>
>> On Thu, Dec 1, 2022 at 1:18 PM Byron Ellis via dev <dev@beam.apache.org>
>> wrote:
>>
>>> Is there something we need to do here?
>>>
>>> On Thu, Dec 1, 2022 at 10:10 AM Apache Jenkins Server <
>>> jenk...@builds.apache.org> wrote:
>>>
>>>> Something went wrong during the automatic credentials rotation for
>>>> IO-Datastores Cluster, performed at Thu Dec 01 15:00:47 UTC 2022. It may be
>>>> necessary to check the state of the cluster certificates. For further
>>>> details refer to the following links:
>>>>  * https://ci-beam.apache.org/job/beam_SeedJob_Standalone/
>>>>  * https://ci-beam.apache.org/.
>>>
>>>
