I have fixed the issue. It was a misconfiguration: two client systems had the same IP address. After changing one of them, everything is OK.
/Zee

On Mon, Oct 8, 2018 at 12:51 PM Zeeshan Ali Shah <javacli...@gmail.com> wrote:
> We are getting the following error when running rsync.
>
> Background: We have three filesystems on the same MDS; the MDTs are on
> different ZFS pools. Would that be an issue?
>
> Any advice?
>
> Error below:
> --------
> [Mon Oct 8 12:29:11 2018] Lustre: sgp-MDT0000-mdc-ffff883ffc2c3000: Connection restored to 172.100.120.25@o2ib (at 172.100.120.25@o2ib)
> [Mon Oct 8 12:29:11 2018] Lustre: Skipped 14 previous similar messages
> [Mon Oct 8 12:31:25 2018] LNet: 31340:0:(o2iblnd_cb.c:1350:kiblnd_reconnect_peer()) Abort reconnection of 172.100.120.25@o2ib: connected
> [Mon Oct 8 12:31:25 2018] LNet: 31340:0:(o2iblnd_cb.c:1350:kiblnd_reconnect_peer()) Skipped 9 previous similar messages
> [Mon Oct 8 12:31:32 2018] Lustre: 54961:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1538991905/real 1538991905] req@ffff8819acf68f00 x1611847503061104/t0(0) o36->sgp-MDT0000-mdc-ffff883ffc2c3000@172.100.120.25@o2ib:12/10 lens 608/33520 e 0 to 1 dl 1538991912 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
> [Mon Oct 8 12:31:32 2018] Lustre: 54961:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
> [Mon Oct 8 12:31:32 2018] Lustre: sgp-MDT0000-mdc-ffff883ffc2c3000: Connection to sgp-MDT0000 (at 172.100.120.25@o2ib) was lost; in progress operations using this service will wait for recovery to complete
> [Mon Oct 8 12:31:32 2018] Lustre: Skipped 2 previous similar messages
> [Mon Oct 8 12:31:32 2018] Lustre: sgp-MDT0000-mdc-ffff883ffc2c3000: Connection restored to 172.100.120.25@o2ib (at 172.100.120.25@o2ib)
> [Mon Oct 8 12:31:32 2018] Lustre: Skipped 2 previous similar messages
> [Mon Oct 8 12:34:01 2018] LNet: 25934:0:(o2iblnd_cb.c:2307:kiblnd_passive_connect()) Stale connection request
> [Mon Oct 8 12:34:01 2018] LNet: 25934:0:(o2iblnd_cb.c:2307:kiblnd_passive_connect()) Skipped 2 previous similar messages
> [Mon Oct 8 12:34:01 2018] LNet: 31340:0:(o2iblnd_cb.c:1350:kiblnd_reconnect_peer()) Abort reconnection of 172.100.120.25@o2ib: connected
> [Mon Oct 8 12:34:01 2018] LNet: 31340:0:(o2iblnd_cb.c:1350:kiblnd_reconnect_peer()) Skipped 3 previous similar messages
> [Mon Oct 8 12:34:08 2018] Lustre: 54961:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1538992061/real 1538992061] req@ffff881ad060f500 x1611847503440304/t0(0) o101->sgp-MDT0000-mdc-ffff883ffc2c3000@172.100.120.25@o2ib:12/10 lens 880/33728 e 0 to 1 dl 1538992068 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
> [Mon Oct 8 12:34:08 2018] Lustre: 54961:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
> [Mon Oct 8 12:34:08 2018] Lustre: sgp-MDT0000-mdc-ffff883ffc2c3000: Connection to sgp-MDT0000 (at 172.100.120.25@o2ib) was lost; in progress operations using this service will wait for recovery to complete
> [Mon Oct 8 12:34:08 2018] Lustre: Skipped 1 previous similar message
> [Mon Oct 8 12:34:08 2018] Lustre: sgp-MDT0000-mdc-ffff883ffc2c3000: Connection restored to 172.100.120.25@o2ib (at 172.100.120.25@o2ib)
> [Mon Oct 8 12:34:08 2018] Lustre: Skipped 1 previous similar message
> ------
>
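For anyone hitting the same symptom, here is a rough sketch of how a duplicate client address could be spotted from an inventory of hosts. It is only an illustration, not something taken from this thread: the host names, the client addresses, and the function name `find_duplicate_nids` are all hypothetical, and it assumes you already have a list of (hostname, NID) pairs collected by whatever means you use.

```python
from collections import defaultdict


def find_duplicate_nids(host_nids):
    """Return {nid: sorted hostnames} for every NID claimed by more than one host.

    host_nids is an iterable of (hostname, nid) pairs, e.g. gathered from
    your own inventory. NIDs are compared case-insensitively after stripping
    surrounding whitespace.
    """
    owners = defaultdict(set)
    for host, nid in host_nids:
        owners[nid.strip().lower()].add(host)
    return {nid: sorted(hosts) for nid, hosts in owners.items() if len(hosts) > 1}


# Hypothetical inventory: client01 and client03 were given the same address.
inventory = [
    ("client01", "172.100.120.31@o2ib"),
    ("client02", "172.100.120.32@o2ib"),
    ("client03", "172.100.120.31@o2ib"),
]

for nid, hosts in find_duplicate_nids(inventory).items():
    print(f"{nid} is configured on: {', '.join(hosts)}")
```

Any NID reported here is claimed by more than one host and is worth checking before looking at the servers.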
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org