Schema | Name | Type | Owner | Size | Description
--------+----------------------------+-------+----------+------------+-------------
public | admin | table | netdisco | 173 MB |
public | community | table | netdisco | 224 kB |
public | dbix_class_schema_versions | table | netdisco | 40 kB |
public | device | table | netdisco | 3312 kB |
public | device_ip | table | netdisco | 34 MB |
public | device_module | table | netdisco | 895 MB |
public | device_port | table | netdisco | 1656 MB |
public | device_port_log | table | netdisco | 48 kB |
public | device_port_power | table | netdisco | 124 MB |
public | device_port_properties | table | netdisco | 354 MB |
public | device_port_ssid | table | netdisco | 17 MB |
public | device_port_vlan | table | netdisco | 1084 MB |
public | device_port_wireless | table | netdisco | 6776 kB |
public | device_power | table | netdisco | 1760 kB |
public | device_skip | table | netdisco | 5544 kB |
public | device_vlan | table | netdisco | 67 MB |
public | log | table | netdisco | 8192 bytes |
public | netmap_positions | table | netdisco | 288 kB |
public | node | table | netdisco | 317 MB |
public | node_ip | table | netdisco | 2084 MB |
public | node_monitor | table | netdisco | 8192 bytes |
public | node_nbt | table | netdisco | 4328 kB |
public | node_wireless | table | netdisco | 16 MB |
public | oui | table | netdisco | 2160 kB |
public | process | table | netdisco | 8192 bytes |
public | sessions | table | netdisco | 48 kB |
public | statistics | table | netdisco | 200 kB |
public | subnets | table | netdisco | 1296 kB |
public | topology | table | netdisco | 48 kB |
public | user_log | table | netdisco | 600 kB |
public | users | table | netdisco | 48 kB |
*From: *Christian Ramseyer <ramse...@netnea.com>
*Date: *Thursday, 20 January 2022 at 12:45 am
*To: *alcatron <alcat...@gmail.com>,
netdisco-users@lists.sourceforge.net
<netdisco-users@lists.sourceforge.net>, Jethro Binks
<jethro.bi...@strath.ac.uk>
*Subject: *Re: [Netdisco] Netdisco auto discovery tasks suddenly stopped
working
On 19.01.22 14:00, alcatron wrote:
> As for picking up on the error, I saw this in the netdisco-backend log.
> I believe the device_skip table was getting so big that it was running
> out of memory processing it; the device_skip table was about 162 MB.
> I'm sure this will happen again in the next 2-3 months when the
> device_skip table builds up. Perhaps it's some kind of bug where it can
> only handle a device_skip table of a certain size?
It's weird how it would get that big, as IIRC it keeps only one record
per device in your DB at most. Is this including indexes? They might
become quite big, since Postgres can create some "bloat" under our
insert/delete pattern.
device_skip is just used to not poll unreachable devices over and over
again, there is no important data in there. So if in doubt,
delete from device_skip;
vacuum analyze device_skip;
reindex table device_skip;
should allow for a fresh start.
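To answer the "is this including indexes?" question before clearing anything, the table and index sizes can be compared from within "netdisco-do psql". A minimal sketch using only standard PostgreSQL size functions; the reading suggested in the comments is a rule of thumb, not something confirmed for this installation:

```sql
-- Heap size, index size, and combined size of device_skip.
-- The total also counts TOAST storage, so it can exceed the sum of the
-- first two columns slightly.
SELECT pg_size_pretty(pg_relation_size('device_skip'))       AS table_only,
       pg_size_pretty(pg_indexes_size('device_skip'))        AS indexes_only,
       pg_size_pretty(pg_total_relation_size('device_skip')) AS total;

-- Number of rows actually stored. A small count next to a large size
-- suggests bloat rather than real data.
SELECT count(*) FROM device_skip;
```

Running the same two queries again after the cleanup shows how much space was reclaimed.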
There are also the max_deferrals and retry_after options to control the
skip behaviour. I don't think it will affect the table size much though.
https://github.com/netdisco/netdisco/wiki/Configuration#workers
If you're getting these issues regularly I'd definitely experiment with
the Postgres memory settings a bit, starting at work_mem.
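A sketch of how the current values could be inspected and work_mem changed from psql. The 16MB figure is only a placeholder, not a recommendation from this thread, and ALTER SYSTEM needs a superuser connection (PostgreSQL 9.4 or later), which the netdisco role may not have:

```sql
-- Show the values currently in effect.
SHOW work_mem;
SHOW maintenance_work_mem;
SHOW shared_buffers;

-- Persist a new value (written to postgresql.auto.conf) and reload the
-- configuration; work_mem does not require a server restart.
ALTER SYSTEM SET work_mem = '16MB';  -- placeholder value, adjust as needed
SELECT pg_reload_conf();
```

Changing one parameter at a time and watching the backend log afterwards makes it easier to tell which setting, if any, made a difference.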
Cheers
Christian
Both of these errors in netdisco-backend.log referred to entries in the
“device_skip” table. I looked through a lot of logged data and found when
it stopped working.
DETAIL: Failed on request of size 284 in memory context
"CacheMemoryContext". [for Statement "SELECT me.backend, me.device,
me.actionset, me.deferrals, me.last_defer FROM device_skip me WHERE ( (
me.backend = ? AND me.device = ? ) )" with ParamValues: 1=\'server\',
2=\'10.1.1.1\'] at
/home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 261
'}, 'DBIx::Class::Exception' )
[18851] 2022-01-11 01:30:43 error bless( {'msg' =>
'DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st
execute failed: ERROR: out of memory
DETAIL: Failed on request of size 8344 in memory context
"MessageContext". [for Statement "SELECT me.backend, me.device,
me.actionset, me.deferrals, me.last_defer FROM device_skip me WHERE ( (
me.backend = ? AND me.device = ? ) )" with ParamValues: 1=\'server\',
2=\'10.1.1.2\'] at
/home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 261
'}, 'DBIx::Class::Exception' )
*From: *alcatron <alcat...@gmail.com>
*Date: *Wednesday, 19 January 2022 at 10:14 pm
*To: *Christian Ramseyer <ramse...@netnea.com>,
netdisco-users@lists.sourceforge.net <netdisco-users@lists.sourceforge.net>
*Subject: *Re: [Netdisco] Netdisco auto discovery tasks suddenly stopped
working
Hi Christian, thank you for the tips.
I found what the problem was: it was crashing and not getting past a
certain entry in the “device_skip” table in the database.
I truncated that table in psql and let it repopulate, and that fixed
the automatic discovery, arpnip, macsuck, etc.
I have found that after a while, perhaps 2-3 months, something happens in
the “device_skip” table and halts these processes, and then I need to
clear it to make them work again. I had a similar issue a few months
back, and then I remembered what I did.
Muris
*From: *Christian Ramseyer <ramse...@netnea.com>
*Date: *Tuesday, 18 January 2022 at 12:20 pm
*To: *alcatron <alcat...@gmail.com>,
netdisco-users@lists.sourceforge.net <netdisco-users@lists.sourceforge.net>
*Subject: *Re: [Netdisco] Netdisco auto discovery tasks suddenly stopped
working
Hi
> could not connect to
> server: No such file or directory/
This would be very concerning, meaning that Postgres is not running at
all. But since you seem to have the web frontend running that is
probably not the case currently, so I wouldn't worry too much. Might be
an old log entry.
> Failed on request of size 16 in memory context
> "MessageContext".
That, on the other hand, might be the issue. Postgres uses all kinds of
memory parameters; if one of them is too small, the total amount of RAM
in the server doesn't matter much.
I have had various issues with huge, clogged-up discovery queues over the
years. As a first measure I'd try to:
stop netdisco-backend
restart Postgres, connect to the database with "netdisco-do psql" and in
there run a "delete from admin;".
for good measure, also run "reindex table admin;"
restart netdisco-backend
This sounds dangerous but admin is in fact just the queue of actions to
be done, so no important data will be lost.
Also a "select count(*) from admin" first might be interesting, to see
how many rows are in there. If it's an absurdly high number (millions)
you can run e.g. "create table admin_backup as select * from admin;" for
analysis later.
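Put together, the database part of the steps above might look like the following in "netdisco-do psql", run while netdisco-backend is stopped. This is a sketch only; the status column is taken from the SELECT statement in the backend log quoted further down, and the backup step is optional:

```sql
-- How big is the queue, and what is in it?
SELECT count(*) FROM admin;
SELECT status, count(*) AS jobs
  FROM admin
 GROUP BY status
 ORDER BY jobs DESC;

-- Optional: keep a copy for later analysis.
CREATE TABLE admin_backup AS SELECT * FROM admin;

-- Empty the queue and rebuild its indexes.
DELETE FROM admin;
REINDEX TABLE admin;
```

The per-status breakdown is worth a look before deleting, since it shows whether the backlog consists of queued jobs or of old finished ones.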
If you're still getting the memory errors afterwards and it still
doesn't work, I'd try to configure the memory parameters with this
assistant, using the "online transaction processing" db type.
https://pgtune.leopard.in.ua/#/about
Cheers
Christian
On 17.01.22 22:03, alcatron wrote:
Hi all, just wanting to ask your thoughts on what could be causing
netdisco to suddenly stop performing auto discovery tasks.
Only arpnip seems to be working via scheduled tasks; discovery and macsuck
have stopped running automatically. If I go to the device in the web
interface and manually trigger discovery/arpnip/macsuck, it works fine on
the device.
Nothing has changed on the system, which has been running for a few months,
and suddenly auto discovery is partly broken.
In the backend log I see errors like the ones below. The server is
running and operational, as I can still trigger discovery etc. manually.
The server is not out of memory, despite what the messages indicate: it
has about 16 GB, with plenty still unused.
Thanks for any assistance 😊
DBIx::Class::Schema::Versioned::_on_connect(): Your DB is currently
unversioned. Please call upgrade on your schema to sync the DB. at
/home/netdisco/perl5/lib/perl5/DBICx/Sugar.pm line 121
DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI
connect('dbname=netdisco','netdisco',...) failed: could not connect to
server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket
"/var/run/postgresql/.s.PGSQL.5432"? at
/home/netdisco/perl5/lib/perl5/DBIx/Class/Storage/DBI.pm line 1639. at
/home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 50
[25756] error bless( {'msg' =>
'DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st
execute failed: ERROR: out of memory
DETAIL: Failed on request of size 16 in memory context
"MessageContext". [for Statement "SELECT me.job, me.entered, me.started,
me.finished, me.device, me.port, me.action, me.subaction, me.status,
me.username, me.userip, me.log, me.debug, me.device_key FROM admin me
WHERE ( me.job = ? ) FOR UPDATE" with ParamValues: 1=\'186421742\'] at
/home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 267
'}, 'DBIx::Class::Exception' )
[25781] 2022-01-11 01:33:53 error bless( {'msg' =>
'DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st
execute failed: ERROR: out of memory
DETAIL: Failed on request of size 16 in memory context
"MessageContext". [for Statement "SELECT me.job, me.entered, me.started,
me.finished, me.device, me.port, me.action, me.subaction, me.status,
me.username, me.userip, me.log, me.debug, me.device_key FROM admin me
WHERE ( me.job = ? ) FOR UPDATE" with ParamValues: 1=\'186420514\'] at
/home/netdisco/perl5/lib/perl5/App/Netdisco/JobQueue/PostgreSQL.pm line 267
_______________________________________________
Netdisco mailing list
netdisco-users@lists.sourceforge.net
https://sourceforge.net/p/netdisco/mailman/netdisco-users/
--
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com
Phone: +41 79 644 77 64