I've just tested our other system (also running an old version of Lustre; it's another appliance, so we can't trivially update it to the latest Lustre) and am seeing a bunch of changelog errors. I'd reverted this filesystem back to robinhood 2.5:

# rpm -qa | grep lustre
lustre-iokit-2.5.2-3.0.101_0.47.86.1.11753.0.PTF_default
lustre-client-2.5.2-3.0.101_0.47.86.1.11753.0.PTF_default
lustre-client-modules-2.5.2-3.0.101_0.47.86.1.11753.0.PTF_default
robinhood-tmpfs-2.5.5-2.lustre2.5
lustre-client-tests-2.5.2-3.0.101_0.47.86.1.11753.0.PTF_default

which works OK, but when I try to use a newer version of the Lustre client and robinhood (as part of our migration to SLES 12) it fails.

On a freshly dropped database, set up just to follow changelogs:

f001:/etc/robinhood.d # robinhood --read-log -f scratch2_v31.conf
2017/10/12 20:57:26 [124042/1] CheckFS | '/scratch2' matches mount point '/scratch2', type=lustre, fs=10.10.100.3@o2ib:10.10.100.4@o2ib:/snx11037
2017/10/12 20:57:26 [124042/1] ListMgr | Table does not exist: 'SELECT value FROM VARS WHERE varname='VersionFunctionSet'' (Table 'rbh_scratch2.VARS' doesn't exist)
2017/10/12 20:57:26 [124042/1] ListMgr | No function versioning (expected: 1.6). Existing functions will be dropped and re-created.
2017/10/12 20:57:26 [124042/1] ListMgr | Table does not exist: 'SELECT value FROM VARS WHERE varname='VersionTriggerSet'' (Table 'rbh_scratch2.VARS' doesn't exist)
2017/10/12 20:57:26 [124042/1] ListMgr | No trigger versioning (expected: 1.4). Existing triggers will be dropped and re-created.
2017/10/12 20:57:26 [124042/1] ListMgr | table VARS does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | table ENTRIES does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | table NAMES does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | table ANNEX_INFO does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | function sz_range does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | table ACCT_STAT does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | Populating accounting table from existing DB contents. This can take a while...
2017/10/12 20:57:26 [124042/1] ListMgr | table STRIPE_INFO does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | table STRIPE_ITEMS does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | table SOFT_RM does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | trigger ACCT_ENTRY_INSERT does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | trigger ACCT_ENTRY_DELETE does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | trigger ACCT_ENTRY_UPDATE does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | function one_path does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] ListMgr | function this_path does not exist (or wrong version): creating it.
2017/10/12 20:57:26 [124042/1] Main | Daemon started (running modules: log_reader)
2017/10/12 20:57:27 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:27 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:28 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:28 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:29 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:29 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:30 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:30 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:31 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:31 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:32 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:32 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:33 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:33 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:34 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:34 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:35 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:35 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:36 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:36 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:37 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:37 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:38 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.
2017/10/12 20:57:38 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument
^C2017/10/12 20:57:39 [124042/3] SigHdlr | SIGINT received: performing clean daemon shutdown
2017/10/12 20:57:39 [124042/3] ChangeLog | Stop request has been sent to all ChangeLog reader threads
2017/10/12 20:57:39 [124042/2] ChangeLog | Changelog reader thread terminating
2017/10/12 20:57:39 [124042/3] EntryProc | Pipeline successfully flushed
2017/10/12 20:57:39 [124042/3] STATS | ==== EntryProcessor Pipeline Stats ===
2017/10/12 20:57:39 [124042/3] STATS | Idle threads: 0
2017/10/12 20:57:39 [124042/3] STATS | Id constraints count: 0 (hash min=0/max=0/avg=0.0)
2017/10/12 20:57:39 [124042/3] STATS | Name constraints count: 0 (hash min=0/max=0/avg=0.0)
2017/10/12 20:57:39 [124042/3] STATS | Stage | Wait | Curr | Done | Total | ms/op |
2017/10/12 20:57:39 [124042/3] STATS | 0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
2017/10/12 20:57:39 [124042/3] STATS | 1: GET_INFO_DB | 0 | 0 | 0 | 990 | 0.51 |
2017/10/12 20:57:39 [124042/3] STATS | 2: GET_INFO_FS | 0 | 0 | 0 | 990 | 6.70 |
2017/10/12 20:57:39 [124042/3] STATS | 3: PRE_APPLY | 0 | 0 | 0 | 990 | 0.00 |
2017/10/12 20:57:39 [124042/3] STATS | 4: DB_APPLY | 0 | 0 | 0 | 990 | 0.48 | 98.48% batched (avg batch size: 28.7)
2017/10/12 20:57:39 [124042/3] STATS | 5: CHGLOG_CLR | 0 | 0 | 0 | 990 | 0.02 |
2017/10/12 20:57:39 [124042/3] STATS | 6: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
2017/10/12 20:57:39 [124042/3] STATS | DB ops: get=0/ins=990/upd=0/rm=0
2017/10/12 20:57:39 [124042/3] ChangeLog | Error -22 closing changelog: Invalid argument
2017/10/12 20:57:39 [124042/3] STATS | ChangeLog reader #0:
2017/10/12 20:57:39 [124042/3] STATS | fs_name = snx11037
2017/10/12 20:57:39 [124042/3] STATS | mdt_name = MDT0000
2017/10/12 20:57:39 [124042/3] STATS | reader_id = cl1
2017/10/12 20:57:39 [124042/3] STATS | records read = 3309
2017/10/12 20:57:39 [124042/3] STATS | interesting records = 990
2017/10/12 20:57:39 [124042/3] STATS | suppressed records = 2319
2017/10/12 20:57:39 [124042/3] STATS | records pending = 0
2017/10/12 20:57:39 [124042/3] STATS | status = terminating
2017/10/12 20:57:39 [124042/3] STATS | last received: rec_id=2835177183, rec_time=2017/10/12 20:57:26.934430, received at 2017/10/12 20:57:26.952364
2017/10/12 20:57:39 [124042/3] STATS | receive speed: 254.54 rec/sec, log/real time ratio: 2.06
2017/10/12 20:57:39 [124042/3] STATS | last pushed: rec_id=2835177156, rec_time=2017/10/12 20:57:26.607426, pushed at 2017/10/12 20:57:31.955170
2017/10/12 20:57:39 [124042/3] STATS | push speed: 252.46 rec/sec, log/real time ratio: 2.03
2017/10/12 20:57:39 [124042/3] STATS | last committed: rec_id=2835177156, rec_time=2017/10/12 20:57:26.607426, committed at 2017/10/12 20:57:32.736805
2017/10/12 20:57:39 [124042/3] STATS | commit speed: 252.46 rec/sec, log/real time ratio: 2.03
2017/10/12 20:57:39 [124042/3] STATS | last cleared: rec_id=2835177156, rec_time=2017/10/12 20:57:26.607426, cleared at 2017/10/12 20:57:39.958262
2017/10/12 20:57:39 [124042/3] STATS | ChangeLog stats:
2017/10/12 20:57:39 [124042/3] STATS | MARK: 0, CREAT: 729, MKDIR: 112, HLINK: 0, SLINK: 0, MKNOD: 0, UNLNK: 0, RMDIR: 0
2017/10/12 20:57:39 [124042/3] STATS | RENME: 0, RNMTO: 0, OPEN: 0, CLOSE: 1636, LYOUT: 0, TRUNC: 0, SATTR: 832, XATTR: 0
2017/10/12 20:57:39 [124042/3] STATS | HSM: 0, MTIME: 0, CTIME: 0, ATIME: 0, MIGRT: 0
2017/10/12 20:57:39 [124042/3] SigHdlr | Exiting.

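For anyone reading along, the numeric codes in the ChangeLog messages above look like negated Linux errno values (-22 is printed alongside "Invalid argument"). Here is a minimal Python sketch for decoding them and counting how often each one appears in a saved robinhood log. The helper names (describe, summarize) and the regular expression are my own for illustration, not part of robinhood or Lustre, and the sample lines are copied from this thread.

```python
import errno
import os
import re
from collections import Counter

# Matches robinhood ChangeLog error lines such as
#   "ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument."
#   "ChangeLog | ERROR -2 opening changelog for MDT ..."
# The pattern is an assumption based on the log lines quoted in this thread.
ERROR_RE = re.compile(r"ChangeLog\s*\|\s*error\s+(-?\d+)", re.IGNORECASE)


def describe(code):
    """Return a symbolic name and message for an errno, e.g. -22 -> 'EINVAL (...)'."""
    num = abs(code)
    name = errno.errorcode.get(num, "UNKNOWN")
    return f"{name} ({os.strerror(num)})"


def summarize(log_lines):
    """Count ChangeLog error lines per error code."""
    counts = Counter()
    for line in log_lines:
        match = ERROR_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "2017/10/12 20:57:27 [124042/2] ChangeLog | Error -22 in llapi_changelog_recv(): Invalid argument. Trying to reopen it.",
    "2017/10/12 20:57:27 [124042/2] ChangeLog | Error -22 closing changelog: Invalid argument",
    "2017/10/12 21:16:35 [125317/1] ChangeLog | ERROR -2 opening changelog for MDT 'snx11037-MDT0000': No such file or directory",
    "2017/10/12 20:57:26 [124042/1] Main | Daemon started (running modules: log_reader)",
]

for code, count in sorted(summarize(sample).items()):
    print(f"{code}: {describe(code)} x{count}")
```

To run it on a real log, pass an open file instead of the sample list, e.g. summarize(open(path)) with path pointing at wherever the robinhood output was saved.
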
Strangely, when I try to restart this I get the same error as on the other /scratch filesystem, i.e.:

f001:/etc/robinhood.d # robinhood --read-log -f scratch2_v31.conf
2017/10/12 21:16:35 [125317/1] CheckFS | '/scratch2' matches mount point '/scratch2', type=lustre, fs=10.10.100.3@o2ib:10.10.100.4@o2ib:/snx11037
2017/10/12 21:16:35 [125317/1] ChangeLog | ERROR -2 opening changelog for MDT 'snx11037-MDT0000': No such file or directory
2017/10/12 21:16:35 [125317/1] Main | Error 2 initializing ChangeLog Reader
f001:/etc/robinhood.d # lfs changelog_clear snx11037-MDT0000 cl1 0
f001:/etc/robinhood.d # robinhood --read-log -f scratch2_v31.conf
2017/10/12 21:17:05 [125360/1] CheckFS | '/scratch2' matches mount point '/scratch2', type=lustre, fs=10.10.100.3@o2ib:10.10.100.4@o2ib:/snx11037
2017/10/12 21:17:05 [125360/1] ChangeLog | ERROR -2 opening changelog for MDT 'snx11037-MDT0000': No such file or directory
2017/10/12 21:17:05 [125360/1] Main | Error 2 initializing ChangeLog Reader
f001:/etc/robinhood.d # lfs changelog snx11037-MDT0000 | head
2835221624 11CLOSE 13:17:03.595956685 2017.10.12 0x3 t=[0x20001051f:0xcd4d:0x0]
2835221625 13TRUNC 13:17:03.874960367 2017.10.12 0xe t=[0x20001051f:0xcd40:0x0]
2835221626 11CLOSE 13:17:03.875960380 2017.10.12 0x243 t=[0x20001051f:0xcd40:0x0]
2835221627 13TRUNC 13:17:03.915960912 2017.10.12 0xe t=[0x20001051f:0xcd3c:0x0]
2835221628 13TRUNC 13:17:03.917960936 2017.10.12 0xe t=[0x20001051f:0xcd41:0x0]
2835221629 11CLOSE 13:17:03.918960949 2017.10.12 0x242 t=[0x20001051f:0xcd41:0x0]
2835221630 11CLOSE 13:17:03.918960949 2017.10.12 0x243 t=[0x20001051f:0xcd3c:0x0]
2835221631 11CLOSE 13:17:03.959961491 2017.10.12 0x3 t=[0x20001051f:0xcd3e:0x0]
2835221632 11CLOSE 13:17:03.959961492 2017.10.12 0x3 t=[0x20001051f:0xcd3d:0x0]
2835221633 11CLOSE 13:17:03.969961623 2017.10.12 0x3 t=[0x20001051f:0xcd44:0x0]

Andrew
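
In case it helps with comparing record ids, here is a rough Python sketch that splits the `lfs changelog` lines shown above into fields, so the first record still available on the MDT can be set against the "last cleared" rec_id from the earlier stats dump. The field layout (record id, two-digit type number fused with the type name, time, date, flags, target FID) is inferred only from the ten lines quoted here, and the names ChangelogRecord and parse_record are made up for the example. The gap it computes is just arithmetic on the two ids and is not offered as an explanation of the errors.

```python
import re
from typing import NamedTuple, Optional


class ChangelogRecord(NamedTuple):
    rec_id: int
    type_num: int
    type_name: str
    time: str
    date: str
    flags: int
    target_fid: str


# Layout assumed from the quoted output, e.g.
# "2835221624 11CLOSE 13:17:03.595956685 2017.10.12 0x3 t=[0x20001051f:0xcd4d:0x0]"
RECORD_RE = re.compile(
    r"^(\d+)\s+(\d{2})([A-Z_]+)\s+(\S+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+t=\[([^\]]+)\]"
)


def parse_record(line: str) -> Optional[ChangelogRecord]:
    """Parse one line of `lfs changelog` output; return None if it does not match."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None
    rec_id, type_num, type_name, time, date, flags, fid = match.groups()
    return ChangelogRecord(
        int(rec_id), int(type_num), type_name, time, date, int(flags, 16), fid
    )


lines = [
    "2835221624 11CLOSE 13:17:03.595956685 2017.10.12 0x3 t=[0x20001051f:0xcd4d:0x0]",
    "2835221625 13TRUNC 13:17:03.874960367 2017.10.12 0xe t=[0x20001051f:0xcd40:0x0]",
]
records = [rec for rec in map(parse_record, lines) if rec is not None]

# "last cleared" rec_id reported by robinhood in the first run's stats.
last_cleared = 2835177156
first = records[0]
gap = first.rec_id - last_cleared
print(f"first available record: {first.rec_id} ({first.type_name}), gap: {gap}")
```
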
