Send netdisco-users mailing list submissions to
netdisco-users@lists.sourceforge.net
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.sourceforge.net/lists/listinfo/netdisco-users
or, via email, send a message with subject or body 'help' to
netdisco-users-requ...@lists.sourceforge.net
You can reach the person managing the list at
netdisco-users-ow...@lists.sourceforge.net
When replying, please edit your Subject line so it is more specific
than "Re: Contents of netdisco-users digest..."
Today's Topics:
1. Re: Out of memory errors in backend log upon upgrading to
Netdisco 2.60.7 (Muris)
2. Re: Out of memory errors in backend log upon upgrading to
Netdisco 2.60.7 (Muris)
--- Begin Message ---
This is what it's showing:
netdisco=> \d+ device_skip;
                                      Table "public.device_skip"
   Column   |            Type             | Collation | Nullable |   Default    | Storage  | Stats target | Description
------------+-----------------------------+-----------+----------+--------------+----------+--------------+-------------
 backend    | text                        |           | not null |              | extended |              |
 device     | inet                        |           | not null |              | main     |              |
 actionset  | text[]                      |           |          | '{}'::text[] | extended |              |
 deferrals  | integer                     |           |          | 0            | plain    |              |
 last_defer | timestamp without time zone |           |          |              | plain    |              |
Access method: heap
On 8/3/2023, 12:06, "Christian Ramseyer" <ramse...@netnea.com> wrote:
Can you show the backend column for these IPs and the \d+ for the table?
There should actually be a primary key there that does not allow the
same backend/ip tuple twice, so either this is missing or the one
backend you have gets recorded with different names. My table/index
looks like this:
\d+ device_skip;
Table "device_skip"
Column | Type | Collation | Nullable | ..
------------+-----------------------------+-----------+----------+
backend | text | | not null |
device | inet | | not null |
actionset | text[] | | |
deferrals | integer | | |
last_defer | timestamp without time zone | | |
Indexes:
"device_skip_pkey" PRIMARY KEY, btree (backend, device)
On 08.03.23 02:28, Muris wrote:
> Thanks..I only have one backend.
>
> Here is the query I executed
>
> netdisco=> select device, count(*) as cnt from device_skip group by device
> having
> netdisco-> count(*) > 1;
> device | cnt
> ---------------+-----
> 10.183.x.55 | 2
> 10.183.x.22 | 2
>
> These devices that are listed are actually access points, probably discovered
> by CDP/LLDP, that netdisco can't get to. So what does that tell you?
> Not sure why it's showing a count of 2 for access points.
>
> Maybe it's a bug of some sort in how it deals with access points?
>
>
> On 8/3/2023, 11:21, "Christian Ramseyer" <ramse...@netnea.com> wrote:
>
>
> Sounds like you're getting multiple device_skip entries for the same device.
>
>
> Unfortunately I'm not well versed in how this really works, especially
> with multiple backends (do you have multiple?). I imagine this query
> will create some output in your db:
>
>
> select device, count(*) as cnt from device_skip group by device having
> count(*) > 1;
> device | cnt
> --------+-----
> (0 rows)
>
>
>
>
> or maybe:
>
>
> select backend, device, count(*) as cnt from device_skip group by
> backend, device having count(*) > 1;
> backend | device | cnt
> ---------+--------+-----
> (0 rows)
>
>
> Until somebody has a better idea, you can try disabling the whole
> device_skip mechanism by setting workers.max_deferrals and retry_after
> to zero.
>
>
> https://github.com/netdisco/netdisco/wiki/Configuration#workers
>
>
> I'm not sure whether this will avoid executing the query that fails,
> but it might be worth a try.
>
>
>
>
> On 07.03.23 23:51, Muris wrote:
>> Thanks Christian, I have done that, also clearing out the admin table;
>> I also increased work_mem to 64M, and now the process workers are
>> running and not getting out of memory.
>>
>> I left it running and it was working fine for an hour or two. I just
>> checked it and all the nd2 processes seem to have stopped again, so I
>> checked the netdisco-backend.log and it shows this
>>
>> [27082] 2023-03-07 22:31:36 error bless( {'msg' =>
>> 'DBIx::Class::Row::update(): Can\'t update
>> App::Netdisco::DB::Result::DeviceSkip=HASH(0x5da4278): updated more than
>> one row at
>> /home/netdisco/perl5/lib/perl5/App/Netdisco/DB/Result/DeviceSkip.pm line 41
>> '}, 'DBIx::Class::Exception' )
>>
>> It seems something in the device_skip table causes it to crash, then all
>> the process workers stop working.. then I have to manually restart
>> netdisco-backend,
>> I have cleared device_skip and still this comes... any ideas about this one?
>>
>> On Wed, Mar 8, 2023 at 8:33 AM Christian Ramseyer <ramse...@netnea.com> wrote:
>>
>> Hi Muris
>>
>> We sometimes see that the admin table or its indexes can get very
>> large, despite not much data being in there. This is due to index
>> bloat in Postgres, i.e. fragmentation of on-disk pages.
>>
>> Since all your errors seem to come from admin, I'd try clearing out
>> the table and recreating the indexes as shown here:
>>
>> <https://github.com/netdisco/netdisco/wiki/Database-Tips#polling-mysteriously-stuck-or-very-large-admindevice_skip-tables-and-indexes>
>>
>> No worries, the table just holds job queue records and clearing it out
>> does not lose any data you care about.
>>
>> If you're still having trouble, you can also look at index bloat in
>> other tables:
>> <https://github.com/netdisco/netdisco/wiki/Database-Tips#unreasonable-database-size-index-bloat>
>> and try one of the proposed solutions there, e.g.
>>
>> netdisco-do psql -e 'reindex database concurrently netdisco'
>>
>> For general Postgres memory tuning, I like using
>> <https://pgtune.leopard.in.ua/#/about> since it just asks a few
>> system questions and then outputs all the ~20 parameters in a
>> reasonable relation to each other.
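For illustration, a postgresql.conf fragment in the spirit of what such a calculator produces for a 16 GB dedicated database host might look like this (the values are examples only, not pgtune's actual output; run the tool for your own system):

```ini
# postgresql.conf -- illustrative values for a dedicated 16 GB host
shared_buffers = 4GB              # ~25% of RAM
effective_cache_size = 12GB       # ~75% of RAM; planner hint, not an allocation
work_mem = 64MB                   # per sort/hash operation, per session
maintenance_work_mem = 1GB        # for VACUUM, CREATE INDEX, REINDEX etc.
```

Note that work_mem is allocated per sort/hash node per backend, so many concurrent workers multiply its effective footprint.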
>>
>>
>>
>> Cheers
>> Christian
>>
>>
>> On 07.03.23 11:20, Muris wrote:
>>> Hey guys,
>>>
>>> I seem to be having some out of memory errors after upgrading to
>>> netdisco 2.60.7, the latest version. Not sure what's going on.
>>>
>>> I followed this guide
>>>
>> https://severalnines.com/blog/architecture-and-tuning-memory-postgresql-databases/
>>>
>>> And I increased work_mem to 32MB, which seems to have helped a bit,
>>> but I'm still noticing errors on the backend from time to time, and
>>> messages like this. Any ideas?
>>>
>>> Running on a Xeon 8-core with 16 GB RAM.
>>>
>>> DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception:
>> DBD::Pg::st
>>> execute failed: ERROR:out of memory
>>>
>>> DETAIL:Failed on request of size 32800 in memory context
>>> "HashBatchContext". [for Statement "SELECT me.job, me.entered,
>>> me.started, me.finished, me.device, me.port, me.action,
>> me.subaction,
>>> me.status, me.username, me.userip, me.log, me.debug, me.device_key,
>>> me.job_priority FROM (WITH my_jobs AS
>>>
>>> (SELECT admin.* FROM admin
>>>
>>> LEFT OUTER JOIN device_skip ds
>>>
>>> ON (ds.backend = ? AND admin.device = ds.device
>>>
>>> AND admin.action = ANY (ds.actionset))
>>>
>>> WHERE admin.status = 'queued'
>>>
>>> AND ds.device IS NULL)
>>>
>>> SELECT my_jobs.*,
>>>
>>> CASE WHEN ( (my_jobs.username IS NOT NULL AND ((ds.deferrals IS
>> NULL AND
>>> ds.last_defer IS NULL)
>>>
>>> OR my_jobs.entered > ds.last_defer))
>>>
>>> OR (my_jobs.action = ANY (string_to_array(btrim(?, '{"}'), '","'))) )
>>>
>>> THEN 100
>>>
>>> ELSE 0
>>>
>>> END AS job_priority
>>>
>>> FROM my_jobs
>>>
>>> LEFT OUTER JOIN device_skip ds
>>>
>>> ON (ds.backend = ? AND ds.device = my_jobs.device)
>>>
>>> WHERE ds.deferrals < ?
>>>
>>> OR (my_jobs.username IS NOT NULL AND (ds.last_defer IS NULL
>>>
>>> OR my_jobs.entered > ds.last_defer))
>>>
>>> OR (ds.deferrals IS NULL AND ds.last_defer IS NULL)
>>>
>>> OR ds.last_defer <= ( LOCALTIMESTAMP - ?::interval )
>>>
>>> ORDER BY job_priority DESC,
>>>
>>> ds.deferrals ASC NULLS FIRST,
>>>
>>> ds.last_defer ASC NULLS LAST,
>>>
>>> device_key DESC NULLS LAST,
>>>
>>> random()
>>>
>>> LIMIT ?
>>>
>>> ) me" with ParamValues: 1='netdisco',
>>>
>> 2='{"contact","hook::exec","hook::http","location","portcontrol","portname","power","snapshot","vlan"}',
>> 3='netdisco', 4='10', 5='7 days', 6='1'] at
>> /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 122
>>>
>>> [5405] 2023-03-07 08:36:21 error bless( {'msg' =>
>>> 'DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception:
>> DBD::Pg::st
>>> execute failed: ERROR:out of memory
>>>
>>> DETAIL:Failed on request of size 72 in memory context
>> "MessageContext".
>>> [for Statement "UPDATE admin SET finished = ?, log = ?, started = ?,
>>> status = ? WHERE ( job = ? )" with ParamValues: 1=\'Tue Mar7
>> 19:06:21
>>> 2023\', 2=\'DBIx::Class::Storage::DBI::_dbh_execute(): DBI
>> Exception:
>>> DBD::Pg::st execute failed: ERROR:out of memory
>>>
>>> DETAIL:Failed on request of size 16 in memory context
>> "MessageContext".
>>> [for Statement "SELECT me.ip, me.port, me.creation, me.descr, me.up,
>>> me.up_admin, me.type, me.duplex, me.duplex_admin, me.speed,
>>> me.speed_admin, me.name, me.mac, me.mtu, me.stp,
>> me.remote_ip,
>>> me.remote_port, me.remote_type, me.remote_id, me.has_subinterfaces,
>>> me.is_master, me.slave_of, me.manual_topo, me.is_uplink, me.vlan,
>>> me.pvid, me.lastchange, me.custom_fields, neighbor_alias.ip,
>>> neighbor_alias.alias, neighbor_alias.subnet, neighbor_alias.port,
>>> neighbor_alias.dns, neighbor_alias.creation, device.ip,
>> device.creation,
>>> device.dns, device.description, device.uptime, device.contact,
>>> device.name, device.location,
>>> device.layers,
>> device.num_ports,
>>> device.mac, device.serial, device.chassis_id, device.model,
>>> device.ps1_type, device.ps2_type, device.ps1_status,
>> device.ps2_status,
>>> device.fan, device.slots, dev...\', 3=\'Tue Mar7 19:06:17 2023\',
>>> 4=\'error\', 5=\'215666754\'] at
>>>
>> /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm
>> line 282
>>>
>>> '}, 'DBIx::Class::Exception' )
>>>
>>> [5427] 2023-03-07 08:36:28 error bless( {'msg' =>
>>> 'DBIx::Class::Storage::DBI::catch {...} (): DBI Connection
>> failed: DBI
>>> connect(\'dbname=netdisco\',\'netdisco\',...) failed: server
>> closed the
>>> connection unexpectedly
>>>
>>> This probably means the server terminated abnormally
>>>
>>> before or while processing the request. at
>>> /home/netdisco/perl5/lib/perl5/DBIx/Class/Storage/DBI.pm line
>> 1639. at
>>>
>> /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm
>> line 292
>>>
>>> '}, 'DBIx::Class::Exception' )
>>>
>>> DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception:
>> DBD::Pg::st
>>> execute failed: ERROR:out of memory
>>>
>>> Use of uninitialized value $op in pattern match (m//) at
>>> /home/netdisco/perl5/lib/perl5/SQL/Abstract/Classic.pm line 327.
>>>
>>> [14006] 2023-03-07 10:05:49 error bless( {'msg' =>
>>> 'DBIx::Class::SQLMaker::ClassicExtensions::puke(): Fatal:
>> Operator calls
>>> in update must be in the form { -op => $arg } at
>>>
>> /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm
>> line 282
>>>
>>> '}, 'DBIx::Class::Exception' )
>>>
>>> Use of uninitialized value $op in pattern match (m//) at
>>> /home/netdisco/perl5/lib/perl5/SQL/Abstract/Classic.pm line 327.
>>>
>>> [14718] 2023-03-07 10:10:16 error bless( {'msg' =>
>>> 'DBIx::Class::SQLMaker::ClassicExtensions::puke(): Fatal:
>> Operator calls
>>> in update must be in the form { -op => $arg } at
>>>
>> /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm
>> line 282
>>>
>>> '}, 'DBIx::Class::Exception' )
>>
>>
>> --
>> Christian Ramseyer, netnea ag
>> Network Management. Security. OpenSource.
>> https://www.netnea.com
>> Phone: +41 79 644 77 64
>>
>
>
--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64
--- End Message ---
--- Begin Message ---
I think at the moment this is causing my problem and seems to be crashing the
jobs: the error message below. It only started happening with the Netdisco
2.60.7 upgrade. Even if I clear device_skip and admin and start again, the
jobs show as polling for a period but then there is no activity. Then I check
the netdisco-backend.log and this appears below.
Maybe Oliver knows if there are any bugs, or what this error could be?
error bless( {'msg' => 'DBIx::Class::SQLMaker::ClassicExtensions::puke():
Fatal: Operator calls in update must be in the form { -op => $arg } at
/home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 282
'}, 'DBIx::Class::Exception' )
On 8/3/2023, 12:07, "Muris" <alcat...@gmail.com> wrote:
I also forgot to mention: I am also getting this as well, which may be
interfering; it shows up in the backend log:
[17503] 2023-03-08 00:31:43 error bless( {'msg' =>
'DBIx::Class::SQLMaker::ClassicExtensions::puke(): Fatal: Operator calls in
update must be in the form { -op => $arg } at
/home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 282
'}, 'DBIx::Class::Exception' )
On 8/3/2023, 11:58, "Muris" <alcat...@gmail.com> wrote:
Thanks..I only have one backend.
Here is the query I executed
netdisco=> select device, count(*) as cnt from device_skip group by device
having
netdisco-> count(*) > 1;
device | cnt
---------------+-----
10.183.x.55 | 2
10.183.x.22 | 2
These devices that are listed are actually access points, probably discovered by
CDP/LLDP, that netdisco can't get to. So what does that tell you?
Not sure why it's showing a count of 2 for access points.
Maybe it's a bug of some sort in how it deals with access points?
On 8/3/2023, 11:21, "Christian Ramseyer" <ramse...@netnea.com> wrote:
Sounds like you're getting multiple device_skip entries for the same device.
Unfortunately I'm not well versed in how this really works, especially
with multiple backends (do you have multiple?). I imagine this query
will create some output in your db:
select device, count(*) as cnt from device_skip group by device having
count(*) > 1;
device | cnt
--------+-----
(0 rows)
or maybe:
select backend, device, count(*) as cnt from device_skip group by
backend, device having count(*) > 1;
backend | device | cnt
---------+--------+-----
(0 rows)
Until somebody has a better idea, you can try disabling the whole
device_skip mechanism by setting workers.max_deferrals and retry_after
to zero.
https://github.com/netdisco/netdisco/wiki/Configuration#workers
I'm not sure whether this will avoid executing the query that fails,
but it might be worth a try.
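For reference, a deployment.yml fragment along those lines could look like this (the nesting of both settings under workers is my guess from the setting names above; check the Configuration wiki page for the exact form):

```yaml
# ~/environments/deployment.yml -- sketch only: disable the device_skip
# deferral mechanism as suggested above (nesting is an assumption)
workers:
  max_deferrals: 0
  retry_after: 0
```

Restart netdisco-backend after changing the configuration so the workers pick it up.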
On 07.03.23 23:51, Muris wrote:
> Thanks Christian, I have done that, also clearing out the admin table;
> I also increased work_mem to 64M, and now the process workers are
> running and not getting out of memory.
>
> I left it running and it was working fine for an hour or two. I just
> checked it and all the nd2 processes seem to have stopped again, so I
> checked the netdisco-backend.log and it shows this
>
> [27082] 2023-03-07 22:31:36 error bless( {'msg' =>
> 'DBIx::Class::Row::update(): Can\'t update
> App::Netdisco::DB::Result::DeviceSkip=HASH(0x5da4278): updated more than
> one row at
> /home/netdisco/perl5/lib/perl5/App/Netdisco/DB/Result/DeviceSkip.pm line 41
> '}, 'DBIx::Class::Exception' )
>
> It seems something in the device_skip table causes it to crash, then all
> the process workers stop working.. then I have to manually restart
> netdisco-backend,
> I have cleared device_skip and still this comes... any ideas about this one?
>
> On Wed, Mar 8, 2023 at 8:33 AM Christian Ramseyer <ramse...@netnea.com> wrote:
>
> Hi Muris
>
> We sometimes see that the admin table or its indexes can get very
> large, despite not much data being in there. This is due to index
> bloat in Postgres, i.e. fragmentation of on-disk pages.
>
> Since all your errors seem to come from admin, I'd try clearing out
> the table and recreating the indexes as shown here:
>
> <https://github.com/netdisco/netdisco/wiki/Database-Tips#polling-mysteriously-stuck-or-very-large-admindevice_skip-tables-and-indexes>
>
> No worries, the table just holds job queue records and clearing it out
> does not lose any data you care about.
>
> If you're still having trouble, you can also look at index bloat in
> other tables:
> <https://github.com/netdisco/netdisco/wiki/Database-Tips#unreasonable-database-size-index-bloat>
> and try one of the proposed solutions there, e.g.
>
> netdisco-do psql -e 'reindex database concurrently netdisco'
>
> For general Postgres memory tuning, I like using
> <https://pgtune.leopard.in.ua/#/about> since it just asks a few system
> questions and then outputs all the ~20 parameters in a reasonable
> relation to each other.
>
>
>
> Cheers
> Christian
>
>
> On 07.03.23 11:20, Muris wrote:
> > Hey guys,
> >
> > I seem to be having some out of memory errors after upgrading to
> > netdisco 2.60.7, the latest version. Not sure what's going on.
> >
> > I followed this guide
> >
> https://severalnines.com/blog/architecture-and-tuning-memory-postgresql-databases/
> >
> > And I increased work_mem to 32MB, which seems to have helped a bit,
> > but I'm still noticing errors on the backend from time to time, and
> > messages like this. Any ideas?
> >
> > Running on a Xeon 8-core with 16 GB RAM.
> >
> > DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception:
> DBD::Pg::st
> > execute failed: ERROR:out of memory
> >
> > DETAIL:Failed on request of size 32800 in memory context
> > "HashBatchContext". [for Statement "SELECT me.job, me.entered,
> > me.started, me.finished, me.device, me.port, me.action,
> me.subaction,
> > me.status, me.username, me.userip, me.log, me.debug, me.device_key,
> > me.job_priority FROM (WITH my_jobs AS
> >
> > (SELECT admin.* FROM admin
> >
> > LEFT OUTER JOIN device_skip ds
> >
> > ON (ds.backend = ? AND admin.device = ds.device
> >
> > AND admin.action = ANY (ds.actionset))
> >
> > WHERE admin.status = 'queued'
> >
> > AND ds.device IS NULL)
> >
> > SELECT my_jobs.*,
> >
> > CASE WHEN ( (my_jobs.username IS NOT NULL AND ((ds.deferrals IS
> NULL AND
> > ds.last_defer IS NULL)
> >
> > OR my_jobs.entered > ds.last_defer))
> >
> > OR (my_jobs.action = ANY (string_to_array(btrim(?, '{"}'), '","'))) )
> >
> > THEN 100
> >
> > ELSE 0
> >
> > END AS job_priority
> >
> > FROM my_jobs
> >
> > LEFT OUTER JOIN device_skip ds
> >
> > ON (ds.backend = ? AND ds.device = my_jobs.device)
> >
> > WHERE ds.deferrals < ?
> >
> > OR (my_jobs.username IS NOT NULL AND (ds.last_defer IS NULL
> >
> > OR my_jobs.entered > ds.last_defer))
> >
> > OR (ds.deferrals IS NULL AND ds.last_defer IS NULL)
> >
> > OR ds.last_defer <= ( LOCALTIMESTAMP - ?::interval )
> >
> > ORDER BY job_priority DESC,
> >
> > ds.deferrals ASC NULLS FIRST,
> >
> > ds.last_defer ASC NULLS LAST,
> >
> > device_key DESC NULLS LAST,
> >
> > random()
> >
> > LIMIT ?
> >
> > ) me" with ParamValues: 1='netdisco',
> >
> 2='{"contact","hook::exec","hook::http","location","portcontrol","portname","power","snapshot","vlan"}',
> 3='netdisco', 4='10', 5='7 days', 6='1'] at
> /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 122
> >
> > [5405] 2023-03-07 08:36:21 error bless( {'msg' =>
> > 'DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st
> > execute failed: ERROR: out of memory
> > DETAIL: Failed on request of size 72 in memory context "MessageContext".
> > [for Statement "UPDATE admin SET finished = ?, log = ?, started = ?,
> > status = ? WHERE ( job = ? )" with ParamValues: 1=\'Tue Mar 7 19:06:21
> > 2023\', 2=\'DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception:
> > DBD::Pg::st execute failed: ERROR: out of memory
> > DETAIL: Failed on request of size 16 in memory context "MessageContext".
> > [for Statement "SELECT me.ip, me.port, me.creation, me.descr, me.up,
> > me.up_admin, me.type, me.duplex, me.duplex_admin, me.speed,
> > me.speed_admin, me.name, me.mac, me.mtu, me.stp, me.remote_ip,
> > me.remote_port, me.remote_type, me.remote_id, me.has_subinterfaces,
> > me.is_master, me.slave_of, me.manual_topo, me.is_uplink, me.vlan,
> > me.pvid, me.lastchange, me.custom_fields, neighbor_alias.ip,
> > neighbor_alias.alias, neighbor_alias.subnet, neighbor_alias.port,
> > neighbor_alias.dns, neighbor_alias.creation, device.ip, device.creation,
> > device.dns, device.description, device.uptime, device.contact,
> > device.name, device.location, device.layers, device.num_ports,
> > device.mac, device.serial, device.chassis_id, device.model,
> > device.ps1_type, device.ps2_type, device.ps1_status, device.ps2_status,
> > device.fan, device.slots, dev...\', 3=\'Tue Mar 7 19:06:17 2023\',
> > 4=\'error\', 5=\'215666754\'] at
> > /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 282
> >
> > '}, 'DBIx::Class::Exception' )
> >
> > [5427] 2023-03-07 08:36:28 error bless( {'msg' =>
> > 'DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI
> > connect(\'dbname=netdisco\',\'netdisco\',...) failed: server closed the
> > connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request. at
> > /home/netdisco/perl5/lib/perl5/DBIx/Class/Storage/DBI.pm line 1639. at
> > /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 292
> >
> > '}, 'DBIx::Class::Exception' )
> >
> > DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st
> > execute failed: ERROR: out of memory
> >
> > Use of uninitialized value $op in pattern match (m//) at
> > /home/netdisco/perl5/lib/perl5/SQL/Abstract/Classic.pm line 327.
> >
> > [14006] 2023-03-07 10:05:49 error bless( {'msg' =>
> > 'DBIx::Class::SQLMaker::ClassicExtensions::puke(): Fatal: Operator calls
> > in update must be in the form { -op => $arg } at
> > /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 282
> > '}, 'DBIx::Class::Exception' )
> >
> > Use of uninitialized value $op in pattern match (m//) at
> > /home/netdisco/perl5/lib/perl5/SQL/Abstract/Classic.pm line 327.
> >
> > [14718] 2023-03-07 10:10:16 error bless( {'msg' =>
> > 'DBIx::Class::SQLMaker::ClassicExtensions::puke(): Fatal: Operator calls
> > in update must be in the form { -op => $arg } at
> > /home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 282
> > '}, 'DBIx::Class::Exception' )
>
>
> --
> Christian Ramseyer, netnea ag
> Network Management. Security. OpenSource.
> https://www.netnea.com
> Phone: +41 79 644 77 64
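[Editor's note, not part of the original thread: the "HashBatchContext" out-of-memory error occurs inside the hash join against device_skip in the job-selection query above, and the earlier discussion points at a missing primary key allowing duplicate (backend, device) rows. A possible diagnostic and repair, sketched only; table and column names are taken from the \d+ output in this thread, and the assumption that duplicates are the cause is unverified. Take a backup and stop the Netdisco backend daemons before running the repair.]

```sql
-- First, check whether duplicate (backend, device) tuples exist at all:
SELECT backend, device, count(*)
  FROM device_skip
 GROUP BY backend, device
HAVING count(*) > 1;

-- If they do, keep one arbitrary row per tuple and restore the primary
-- key that a healthy schema has ("device_skip_pkey" in the thread):
BEGIN;

DELETE FROM device_skip a
  USING device_skip b
  WHERE a.ctid < b.ctid          -- ctid: physical row id, used to pick a survivor
    AND a.backend = b.backend
    AND a.device  = b.device;

ALTER TABLE device_skip
  ADD CONSTRAINT device_skip_pkey PRIMARY KEY (backend, device);

COMMIT;
```

The primary key prevents the duplicates from returning; without it, each deferral can insert a fresh row, and the join fan-out grows until the hash join exhausts memory.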
>
--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64
--- End Message ---
_______________________________________________
Netdisco mailing list - Digest Mode
netdisco-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/netdisco-users