Hi,

Are there any special considerations before migrating from 1.6.5.1 to 1.8.3?
Our setup: 1 MGS, 2 filesystems, 3 OSS, 12 OSTs, 80T (we now need 16T OSTs).

We migrated just one client as a test to see how it behaves,
and I am seeing some strange issues.

1. On this client, users can no longer access their quota
---------------------------------------------------------

lfs quota -v -u weill /home
Disk quotas for user weill (uid 1001):
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
           /home     [0]     [0]     [0]             [0]     [0]     [0]
quotactl failed: Operation not permitted
homefs-OST0000_UUID quotactl failed: Operation not permitted
Some errors happened when getting quota info. Some devices may be not working or deactivated. The data in "[]" is inaccurate.

As root it works:

lfs quota -u weill /home
Disk quotas for user weill (uid 1001):
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
           /home 1646536  5000000 5100000           16628       0       0

2. Recurring error -16, only on the migrated node
-------------------------------------------------

May 10 07:15:34 ciclad12 kernel: Lustre: 
4202:0:(client.c:1434:ptlrpc_expire_one_request()) @@@ 
Request x1333645653270608 sent from datafs-OST000a-osc-ffff810c1b6a0c00 to NID 
172.20.176....@tcp 7s 
ago has timed out (7s prior to deadline).
May 10 07:15:34 ciclad-io2 kernel: Lustre: 
7178:0:(ldlm_lib.c:525:target_handle_reconnect()) 
datafs-OST000a: 15f8d8bb-7b73-bcb3-c3bc-2b03195a9360 reconnecting
May 10 07:15:34 ciclad12 kernel: Lustre: datafs-OST000a-osc-ffff810c1b6a0c00: 
Connection to service 
datafs-OST000a via nid 172.20.176....@tcp was lost; in progress operations 
using this service will 
wait for recovery to complete.
May 10 07:15:34 ciclad12 kernel: Lustre: 
4202:0:(client.c:1434:ptlrpc_expire_one_request()) @@@ 
Request x1333645653270610 sent from datafs-OST000a-osc-ffff810c1b6a0c00 to NID 
172.20.176....@tcp 7s 
ago has timed out (7s prior to deadline).
May 10 07:15:34 ciclad-io2 kernel: Lustre: 
7178:0:(ldlm_lib.c:525:target_handle_reconnect()) Skipped 
6776 previous similar messages
May 10 07:15:34 ciclad-io2 kernel: Lustre: 
7178:0:(ldlm_lib.c:760:target_handle_connect()) 
datafs-OST000a: refuse reconnection from 
15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@172.20.176.242@tcp to 
0xffff8101da55a000; still busy with 5 active RPCs
May 10 07:15:34 ciclad-io2 kernel: Lustre: 
7178:0:(ldlm_lib.c:760:target_handle_connect()) Skipped 
6775 previous similar messages
May 10 07:15:34 ciclad-io2 kernel: LustreError: 
7178:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ 
processing error (-16)  r...@ffff8101bab43400 x1333645653270613/t0 
o8->15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@net_0x20000869db0f2_uuid:0/0 lens 
368/200 e 0 to 0 dl 
1273468734 ref 1 fl Interpret:/0/0 rc -16/0
May 10 07:15:34 ciclad-io2 kernel: LustreError: 
7178:0:(ldlm_lib.c:1536:target_send_reply_msg()) 
Skipped 6775 previous similar messages
May 10 07:15:34 ciclad12 kernel: LustreError: 11-0: an error occurred while 
communicating with 
172.20.176....@tcp. The ost_connect operation failed with -16
May 10 07:15:34 ciclad12 kernel: LustreError: Skipped 778 previous similar 
messages
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7253:0:(service.c:1064:ptlrpc_server_handle_request()) 
@@@ Request x1333645653270608 took longer than estimated (6+2s); client may 
timeout. 
r...@ffff81022f851000 x1333645653270608/t54502752 
o4->15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@net_0x20000869db0f2_uuid:0/0 lens 
448/352 e 0 to 0 dl 
1273468533 ref 1 fl Complete:/0/0 rc 0/0
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7253:0:(service.c:1064:ptlrpc_server_handle_request()) 
Skipped 1 previous similar message
May 10 07:15:35 ciclad12 kernel: LustreError: 11-0: an error occurred while 
communicating with 
172.20.176....@tcp. The ost_connect operation failed with -16
May 10 07:15:35 ciclad12 kernel: LustreError: Skipped 596 previous similar 
messages
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7172:0:(ldlm_lib.c:525:target_handle_reconnect()) 
datafs-OST000a: 15f8d8bb-7b73-bcb3-c3bc-2b03195a9360 reconnecting
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7172:0:(ldlm_lib.c:525:target_handle_reconnect()) Skipped 
1364 previous similar messages
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7172:0:(ldlm_lib.c:760:target_handle_connect()) 
datafs-OST000a: refuse reconnection from 
15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@172.20.176.242@tcp to 
0xffff8101da55a000; still busy with 4 active RPCs
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7172:0:(ldlm_lib.c:760:target_handle_connect()) Skipped 
1364 previous similar messages
May 10 07:15:35 ciclad-io2 kernel: LustreError: 
7172:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ 
processing error (-16)  r...@ffff81022f851200 x1333645653271978/t0 
o8->15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@net_0x20000869db0f2_uuid:0/0 lens 
368/200 e 0 to 0 dl 
1273468735 ref 1 fl Interpret:/0/0 rc -16/0
May 10 07:15:35 ciclad-io2 kernel: LustreError: 
7172:0:(ldlm_lib.c:1536:target_send_reply_msg()) 
Skipped 1364 previous similar messages
May 10 07:15:35 ciclad12 kernel: LustreError: 11-0: an error occurred while 
communicating with 
172.20.176....@tcp. The ost_connect operation failed with -16
May 10 07:15:35 ciclad12 kernel: LustreError: Skipped 2918 previous similar 
messages
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7160:0:(ldlm_lib.c:525:target_handle_reconnect()) 
datafs-OST000a: 15f8d8bb-7b73-bcb3-c3bc-2b03195a9360 reconnecting
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7160:0:(ldlm_lib.c:525:target_handle_reconnect()) Skipped 
2607 previous similar messages
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7160:0:(ldlm_lib.c:760:target_handle_connect()) 
datafs-OST000a: refuse reconnection from 
15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@172.20.176.242@tcp to 
0xffff8101da55a000; still busy with 4 active RPCs
May 10 07:15:35 ciclad-io2 kernel: Lustre: 
7160:0:(ldlm_lib.c:760:target_handle_connect()) Skipped 
2607 previous similar messages
May 10 07:15:35 ciclad-io2 kernel: LustreError: 
7160:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ 
processing error (-16)  r...@ffff81022e05c400 x1333645653274586/t0 
o8->15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@net_0x20000869db0f2_uuid:0/0 lens 
368/200 e 0 to 0 dl 
1273468735 ref 1 fl Interpret:/0/0 rc -16/0
May 10 07:15:35 ciclad-io2 kernel: LustreError: 
7160:0:(ldlm_lib.c:1536:target_send_reply_msg()) 
Skipped 2607 previous similar messages
May 10 07:15:36 ciclad-io2 kernel: Lustre: 
7231:0:(service.c:1064:ptlrpc_server_handle_request()) 
@@@ Request x1333645653270612 took longer than estimated (6+3s); client may 
timeout. 
r...@ffff810139315600 x1333645653270612/t54502757 
o4->15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@net_0x20000869db0f2_uuid:0/0 lens 
448/352 e 0 to 0 dl 
1273468533 ref 1 fl Complete:/0/0 rc 0/0
May 10 07:15:36 ciclad12 kernel: LustreError: 11-0: an error occurred while 
communicating with 
172.20.176....@tcp. The ost_connect operation failed with -16
May 10 07:15:36 ciclad12 kernel: LustreError: Skipped 4443 previous similar 
messages
May 10 07:15:36 ciclad-io2 kernel: Lustre: 
7180:0:(ldlm_lib.c:525:target_handle_reconnect()) 
datafs-OST000a: 15f8d8bb-7b73-bcb3-c3bc-2b03195a9360 reconnecting
May 10 07:15:36 ciclad-io2 kernel: Lustre: 
7180:0:(ldlm_lib.c:525:target_handle_reconnect()) Skipped 
4627 previous similar messages
May 10 07:15:36 ciclad-io2 kernel: Lustre: 
7180:0:(ldlm_lib.c:760:target_handle_connect()) 
datafs-OST000a: refuse reconnection from 
15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@172.20.176.242@tcp to 
0xffff8101da55a000; still busy with 2 active RPCs
May 10 07:15:36 ciclad-io2 kernel: Lustre: 
7180:0:(ldlm_lib.c:760:target_handle_connect()) Skipped 
4627 previous similar messages
May 10 07:15:36 ciclad-io2 kernel: LustreError: 
7180:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ 
processing error (-16)  r...@ffff810068c24a00 x1333645653279214/t0 
o8->15f8d8bb-7b73-bcb3-c3bc-2b03195a9...@net_0x20000869db0f2_uuid:0/0 lens 
368/200 e 0 to 0 dl 
1273468736 ref 1 fl Interpret:/0/0 rc -16/0
May 10 07:15:36 ciclad-io2 kernel: LustreError: 
7180:0:(ldlm_lib.c:1536:target_send_reply_msg()) 
Skipped 4627 previous similar messages
May 10 07:15:36 ciclad12 kernel: Lustre: datafs-OST000a-osc-ffff810c1b6a0c00: 
Connection restored to 
service datafs-OST000a using nid 172.20.176....@tcp.
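
For reference when reading the output above: Lustre reports failures as negative Linux errno values, so -16 is EBUSY, which is consistent with the "still busy with N active RPCs" lines, and the quotactl "Operation not permitted" message is EPERM. A minimal Python sketch for decoding such codes follows; the helper name describe_rc is only illustrative (it is not part of any Lustre tool), and the sketch assumes it is run on Linux so that the errno numbering matches the kernel's.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Map a kernel-style negative return code (e.g. -16) to its errno name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{rc}: {name} ({os.strerror(code)})"


if __name__ == "__main__":
    # -16 appears in the ost_connect failures, -1 corresponds to the quotactl failure.
    for rc in (-16, -1):
        print(describe_rc(rc))
```

This only names the error codes; it does not explain why the OST still had active RPCs from the client or why quotactl is refused for non-root users.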



-- 
  Weill Philippe -  Administrateur Systeme et Reseaux
  CNRS/UPMC/IPSL   LATMOS (UMR 8190)
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
