Re: how to troubleshoot: FATAL: canceling authentication due to timeout

2021-03-17 Thread Marc
Hi,
Not much; we don't see any failed logins there.

We have enabled debug logging in the sssd service, since we just found out that
restarting sssd released the user and made it usable again.
So something must be going wrong between postgres and the sssd/PAM modules...

Waiting now for fresh logs if it happens again.
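
While waiting, a small probe can record when logins start to hang. This is only a minimal sketch: the helper name time_login and the 70-second limit are assumptions (the limit is chosen to sit just above PostgreSQL's default authentication_timeout of 60 seconds, so that a server-side cancel shows up as "failed" rather than as a client-side "timeout").

```python
import subprocess
import time


def time_login(cmd, timeout_s=70.0, env=None):
    """Run one login attempt and return (status, elapsed_seconds).

    status is "ok" (exit code 0), "failed" (non-zero exit code)
    or "timeout" (the command did not finish within timeout_s).
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        status = "ok" if result.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        status = "timeout"
    return status, time.monotonic() - start
```

It could be called in a loop with something like ["psql", "-h", "db1.example.com", "-U", "app_user", "-d", "appdb", "-w", "-c", "SELECT 1"], with the password passed through PGPASSWORD in env (for example dict(os.environ, PGPASSWORD=...)). The host, user and database names here are hypothetical. Logging the status and elapsed time next to a timestamp makes it easier to line up a hang with the sssd debug output.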

Thanks!
On Wed, 17 Mar 2021, 22:32 Diego wrote:

> hi!
>
> What do you see in the CentOS log files (/var/log)?


Re: how to troubleshoot: FATAL: canceling authentication due to timeout

2021-03-17 Thread Diego

hi!

What do you see in the CentOS log files (/var/log)?


On 17/03/2021 16:00, Marc wrote:

hi all,

We are facing a problem with a user logging into the database. It happens only
from time to time, when the load is high.
Once we get this error, the user becomes unusable until the database is restarted.
(That user is shared by multiple instances of the same application. It also
happens when we use a dedicated user for each application: one of those
users gets locked out while the rest keep working fine.)

The errors are as follows:
LOG: pam_authenticate failed: Authentication failure
FATAL: canceling authentication due to timeout

Our setup:
3-node cluster
- CentOS 7
- Streaming replication in place (async)
- WAL shipped to an external location
- Pooling done on the client side
- CentOS hosts joined to an Active Directory domain
- Authentication uses the PAM module

The user is completely fine on the AD side, since I can use it to log in to a standby DB.
I guess there must be a lock that prevents this user from completing the first
authentication step, but I have no idea how to find it. I’ve tried common queries to
find locks but I can’t see anything relevant.

I would appreciate it if someone could point me in the right direction!

Thanks a lot!
Marc.





how to troubleshoot: FATAL: canceling authentication due to timeout

2021-03-17 Thread Marc
hi all,

We are facing a problem with a user logging into the database. It happens only
from time to time, when the load is high.
Once we get this error, the user becomes unusable until the database is restarted.
(That user is shared by multiple instances of the same application. It also
happens when we use a dedicated user for each application: one of those
users gets locked out while the rest keep working fine.)

The errors are as follows:
LOG: pam_authenticate failed: Authentication failure
FATAL: canceling authentication due to timeout

Our setup:
3-node cluster
- CentOS 7
- Streaming replication in place (async)
- WAL shipped to an external location
- Pooling done on the client side
- CentOS hosts joined to an Active Directory domain
- Authentication uses the PAM module

The user is completely fine on the AD side, since I can use it to log in to a standby DB.
I guess there must be a lock that prevents this user from completing the first
authentication step, but I have no idea how to find it. I’ve tried common queries to
find locks but I can’t see anything relevant.

I would appreciate it if someone could point me in the right direction!

Thanks a lot!
Marc.
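
For anyone hitting the same pair of messages: the FATAL line is what PostgreSQL reports when a client does not complete authentication within authentication_timeout (60 seconds by default), so it helps to see how often and when the two lines occur. The following is a minimal sketch that counts them per minute. It assumes each log line starts with a timestamp such as "2021-03-17 16:00:01" (for example log_line_prefix = '%t '); the function name count_auth_errors is made up for illustration.

```python
import re
from collections import Counter

# Assumption: lines begin with "YYYY-MM-DD HH:MM:SS"; only the minute is kept.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2})")

# The two messages quoted in the report above.
PATTERNS = {
    "pam_failed": "pam_authenticate failed",
    "auth_timeout": "canceling authentication due to timeout",
}


def count_auth_errors(lines):
    """Return a Counter keyed by (minute, kind) for the two error messages."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        minute = match.group(1) if match else "unknown"
        for kind, text in PATTERNS.items():
            if text in line:
                counts[(minute, kind)] += 1
    return counts
```

Passing an open log file to count_auth_errors and printing the sorted items shows whether the failures cluster around the high-load periods, and whether they stop after sssd is restarted.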