Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Udo Lembke
Hi,
It looks like some OSDs are down?!

What is the output of ceph osd tree?
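It would also help to see the detailed health and which PGs are stuck; roughly something like this (the PG ID in the last command is only a placeholder, substitute one reported as stuck):

    ceph health detail
    ceph pg dump_stuck stale
    ceph pg dump_stuck inactive
    ceph pg dump_stuck unclean
    ceph pg 2.0 query        # placeholder PG ID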

Udo

Am 25.09.2014 04:29, schrieb Aegeaner:
 The cluster health state is WARN:
 
  health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs
 incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive;
 292 pgs stuck stale; 205 pgs stuck unclean; 22 requests are blocked
 > 32 sec; recovery 12474/46357 objects degraded (26.909%)
  monmap e3: 3 mons at
 
 {CVM-0-mon01=172.18.117.146:6789/0,CVM-0-mon02=172.18.117.152:6789/0,CVM-0-mon03=172.18.117.153:6789/0},
 election epoch 24, quorum 0,1,2 CVM-0-mon01,CVM-0-mon02,CVM-0-mon03
  osdmap e421: 9 osds: 9 up, 9 in
   pgmap v2261: 292 pgs, 4 pools, 91532 MB data, 23178 objects
 330 MB used, 3363 GB / 3363 GB avail
 12474/46357 objects degraded (26.909%)
   20 stale+peering
   87 stale+active+clean
8 stale+down+peering
   59 stale+incomplete
  118 stale+active+degraded
 
 
 What do these errors mean? Can these PGs be recovered?
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Udo Lembke
Hi again,
sorry - please ignore my previous post... see

osdmap e421: 9 osds: 9 up, 9 in

shows that all your 9 osds are up!

Do you have trouble with your journal/filesystem?
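A few quick checks on the OSD nodes might help narrow that down; a rough sketch (generic Linux commands, the data paths assume the default /var/lib/ceph layout and /dev/sdX is a placeholder for your disk):

    dmesg | tail -n 50                      # recent XFS/ext4 or disk errors?
    df -h /var/lib/ceph/osd/ceph-*          # are the OSD data dirs mounted?
    ls -l /var/lib/ceph/osd/ceph-*/journal  # journal file or symlink present?
    smartctl -a /dev/sdX                    # drive health, substitute the device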

Udo

Am 25.09.2014 08:01, schrieb Udo Lembke:
 Hi,
 looks that some osds are down?!
 
 What is the output of ceph osd tree
 
 Udo
 
 Am 25.09.2014 04:29, schrieb Aegeaner:
 The cluster healthy state is WARN:

  health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs
 incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive;
 292 pgs stuck stale; 205 pgs stuck unclean; 22 requests are blocked
  32 sec; recovery 12474/46357 objects degraded (26.909%)
  monmap e3: 3 mons at
 
 {CVM-0-mon01=172.18.117.146:6789/0,CVM-0-mon02=172.18.117.152:6789/0,CVM-0-mon03=172.18.117.153:6789/0},
 election epoch 24, quorum 0,1,2 CVM-0-mon01,CVM-0-mon02,CVM-0-mon03
  osdmap e421: 9 osds: 9 up, 9 in
   pgmap v2261: 292 pgs, 4 pools, 91532 MB data, 23178 objects
 330 MB used, 3363 GB / 3363 GB avail
 12474/46357 objects degraded (26.909%)
   20 stale+peering
   87 stale+active+clean
8 stale+down+peering
   59 stale+incomplete
  118 stale+active+degraded


 What does these errors mean? Can these PGs be recovered?



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Irek Fasikhov
osd_op(client.4625.1:9005787)
.


This is due to external factors. For example, the network settings.

2014-09-25 10:05 GMT+04:00 Udo Lembke ulem...@polarzone.de:

 Hi again,
 sorry - forgot my post... see

 osdmap e421: 9 osds: 9 up, 9 in

 shows that all your 9 osds are up!

 Do you have trouble with your journal/filesystem?

 Udo

 Am 25.09.2014 08:01, schrieb Udo Lembke:
  Hi,
  looks that some osds are down?!
 
  What is the output of ceph osd tree
 
  Udo
 
  Am 25.09.2014 04:29, schrieb Aegeaner:
  The cluster healthy state is WARN:
 
   health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs
  incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive;
  292 pgs stuck stale; 205 pgs stuck unclean; 22 requests are blocked
   32 sec; recovery 12474/46357 objects degraded (26.909%)
   monmap e3: 3 mons at
  {CVM-0-mon01=
 172.18.117.146:6789/0,CVM-0-mon02=172.18.117.152:6789/0,CVM-0-mon03=172.18.117.153:6789/0
 },
  election epoch 24, quorum 0,1,2 CVM-0-mon01,CVM-0-mon02,CVM-0-mon03
   osdmap e421: 9 osds: 9 up, 9 in
pgmap v2261: 292 pgs, 4 pools, 91532 MB data, 23178 objects
  330 MB used, 3363 GB / 3363 GB avail
  12474/46357 objects degraded (26.909%)
20 stale+peering
87 stale+active+clean
 8 stale+down+peering
59 stale+incomplete
   118 stale+active+degraded
 
 
  What does these errors mean? Can these PGs be recovered?
 
 




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Aegeaner
Yeah, three of the nine OSDs went down. I recreated them, but the PGs 
could not be recovered.


I didn't know how to erase all the PGs, so I deleted all the OSD pools, 
including data and metadata … Now all PGs are active and clean...


I'm not sure if there are more elegant ways to deal with this.

===
Aegeaner


On 2014-09-25 14:11, Irek Fasikhov wrote:

osd_op(client.4625.1:9005787)
.


This is due to external factors. For example, the network settings.

2014-09-25 10:05 GMT+04:00 Udo Lembke ulem...@polarzone.de:


Hi again,
sorry - forgot my post... see

osdmap e421: 9 osds: 9 up, 9 in

shows that all your 9 osds are up!

Do you have trouble with your journal/filesystem?

Udo

Am 25.09.2014 08:01, schrieb Udo Lembke:
 Hi,
 looks that some osds are down?!

 What is the output of ceph osd tree

 Udo

 Am 25.09.2014 04:29, schrieb Aegeaner:
 The cluster healthy state is WARN:

  health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs
 incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck
inactive;
 292 pgs stuck stale; 205 pgs stuck unclean; 22 requests are
blocked
  32 sec; recovery 12474/46357 objects degraded (26.909%)
  monmap e3: 3 mons at
   
 {CVM-0-mon01=172.18.117.146:6789/0,CVM-0-mon02=172.18.117.152:6789/0,CVM-0-mon03=172.18.117.153:6789/0},
 election epoch 24, quorum 0,1,2
CVM-0-mon01,CVM-0-mon02,CVM-0-mon03
  osdmap e421: 9 osds: 9 up, 9 in
   pgmap v2261: 292 pgs, 4 pools, 91532 MB data, 23178
objects
 330 MB used, 3363 GB / 3363 GB avail
 12474/46357 objects degraded (26.909%)
   20 stale+peering
   87 stale+active+clean
8 stale+down+peering
   59 stale+incomplete
  118 stale+active+degraded


 What does these errors mean? Can these PGs be recovered?






--
Best regards, Irek Fasikhov
Mob.: +79229045757


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-25 Thread Sahana Lokeshappa
Replies Inline :

Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park
C V Raman nagar, Bangalore 560093
T: +918042422283
sahana.lokesha...@sandisk.com

-Original Message-
From: Sage Weil [mailto:sw...@redhat.com]
Sent: Wednesday, September 24, 2014 6:10 PM
To: Sahana Lokeshappa
Cc: Varada Kari; ceph-us...@ceph.com
Subject: RE: [Ceph-community] Pgs are in stale+down+peering state

On Wed, 24 Sep 2014, Sahana Lokeshappa wrote:
 2.a9518 0   0   0   0   2172649472  3001
 3001active+clean2014-09-22 17:49:35.357586  6826'35762
 17842:72706 [12,7,28]   12  [12,7,28]   12
 6826'35762
 2014-09-22 11:33:55.985449  0'0 2014-09-16 20:11:32.693864

Can you verify that 2.a9 exists in the data directory for 12, 7, and/or 28?  If 
so the next step would be to enable logging (debug osd = 20, debug ms = 1) and 
see why peering is stuck...
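For reference, a minimal sketch of raising those debug levels on the running daemon (runtime injection via injectargs, or the equivalent ceph.conf entries before a restart):

    ceph tell osd.12 injectargs '--debug-osd 20 --debug-ms 1'

    # or in ceph.conf under [osd] (or [osd.12]), followed by a daemon restart:
    #   debug osd = 20
    #   debug ms = 1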

Yes, 2.a9 directories are present on osd.12, 7 and 28,

and the 0.49, 0.4d and 0.1c directories are not present on the respective acting OSDs.


Here are the logs I can see when debugs were raised to 20


2014-09-24 18:38:41.706566 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:41.706586 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map
2014-09-24 18:38:41.706592 7f92e2dc8700 20 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] scrub_map_chunk [476de738//0//-1,f38//0//-1)
2014-09-24 18:38:41.711778 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] _scan_list scanning 23 objects deeply
2014-09-24 18:38:41.730881 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.73 7f92eede0700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.822444 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.822519 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.878894 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.878921 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.918307 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.918426 7f92eede0700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.951678 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:41.951709 7f92eede0700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:42.064759 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map_chunk done.
2014-09-24 18:38:42.107016 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.107032 7f92eede0700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.109356 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109372 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109373 7f92f15e5700 20 osd.12 17850 _dispatch 0xeb0d900 
replica scrub(pg: 
2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5)
 v5
2014-09-24 18:38:42.109378 7f92f15e5700 10 osd.12 17850 queueing MOSDRepScrub 
replica scrub(pg: 
2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5)
 v5
2014-09-24 18:38:42.109395 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109396 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109456 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:42.109522 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 

Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-25 Thread Sahana Lokeshappa
Hi Craig,

Sorry for late response. Somehow missed this mail.
All OSDs are up and running. There were no specific logs related to this 
activity, and there is no IO running right now. A few OSDs were marked in and 
out, removed fully and recreated before these PGs came to this stage.
I tried restarting the OSDs; it didn't work.

Thanks
Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park
C V Raman nagar, Bangalore 560093
T: +918042422283
sahana.lokesha...@sandisk.com

From: Craig Lewis [mailto:cle...@centraldesktop.com]
Sent: Wednesday, September 24, 2014 5:44 AM
To: Sahana Lokeshappa
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

Is osd.12  doing anything strange?  Is it consuming lots of CPU or IO?  Is it 
flapping?   Writing any interesting logs?  Have you tried restarting it?

If that doesn't help, try the other involved osds: 56, 27, 6, 25, 23.  I doubt 
that it will help, but it won't hurt.



On Mon, Sep 22, 2014 at 11:21 AM, Varada Kari 
varada.k...@sandisk.com wrote:
Hi Sage,

To give more context on this problem,

This cluster has two pools: rbd and a user-created one.

osd.12 is the primary for some other PGs, but the problem happens for these 
three PGs.

$ sudo ceph osd lspools
0 rbd,2 pool1,

$ sudo ceph -s
cluster 99ffc4a5-2811-4547-bd65-34c7d4c58758
 health HEALTH_WARN 3 pgs down; 3 pgs peering; 3 pgs stale; 3 pgs stuck 
inactive; 3 pgs stuck stale; 3 pgs stuck unclean; 1 requests are blocked > 32 
sec
monmap e1: 3 mons at 
{rack2-ram-1=10.242.42.180:6789/0,rack2-ram-2=10.242.42.184:6789/0,rack2-ram-3=10.242.42.188:6789/0},
 election epoch 2008, quorum 0,1,2 rack2-ram-1,rack2-ram-2,rack2-ram-3
 osdmap e17842: 64 osds: 64 up, 64 in
  pgmap v79729: 2148 pgs, 2 pools, 4135 GB data, 1033 kobjects
12504 GB used, 10971 GB / 23476 GB avail
2145 active+clean
   3 stale+down+peering

Snippet from pg dump:

2.a9518 0   0   0   0   2172649472  30013001
active+clean2014-09-22 17:49:35.357586  6826'35762  17842:72706 
[12,7,28]   12  [12,7,28]   12   6826'35762  2014-09-22 
11:33:55.985449  0'0 2014-09-16 20:11:32.693864
0.590   0   0   0   0   0   0   0   
active+clean2014-09-22 17:50:00.751218  0'0 17842:4472  
[12,41,2]   12  [12,41,2]   12  0'0 2014-09-22 16:47:09.315499  
 0'0 2014-09-16 12:20:48.618726
0.4d0   0   0   0   0   0   4   4   
stale+down+peering  2014-09-18 17:51:10.038247  186'4   11134:498   
[12,56,27]  12  [12,56,27]  12  186'42014-09-18 17:30:32.393188 
 0'0 2014-09-16 12:20:48.615322
0.490   0   0   0   0   0   0   0   
stale+down+peering  2014-09-18 17:44:52.681513  0'0 11134:498   
[12,6,25]   12  [12,6,25]   12  0'0  2014-09-18 17:16:12.986658 
 0'0 2014-09-16 12:20:48.614192
0.1c0   0   0   0   0   0   12  12  
stale+down+peering  2014-09-18 17:51:16.735549  186'12  11134:522   
[12,25,23]  12  [12,25,23]  12  186'12   2014-09-18 17:16:04.457863 
 186'10  2014-09-16 14:23:58.731465
2.17510 0   0   0   0   2139095040  30013001
active+clean2014-09-22 17:52:20.364754  6784'30742  17842:72033 
[12,27,23]  12  [12,27,23]  12   6784'30742  2014-09-22 
00:19:39.905291  0'0 2014-09-16 20:11:17.016299
2.7e8   508 0   0   0   0   2130706432  34333433
active+clean2014-09-22 17:52:20.365083  6702'21132  17842:64769 
[12,25,23]  12  [12,25,23]  12   6702'21132  2014-09-22 
17:01:20.546126  0'0 2014-09-16 14:42:32.079187
2.6a5   528 0   0   0   0   2214592512  28402840
active+clean2014-09-22 22:50:38.092084  6775'34416  17842:83221 
[12,58,0]   12  [12,58,0]   12   6775'34416  2014-09-22 
22:50:38.091989  0'0 2014-09-16 20:11:32.703368

And we couldn't observe any peering events happening on the primary OSD.

$ sudo ceph pg 0.49 query
Error ENOENT: i don't have pgid 0.49
$ sudo ceph pg 0.4d query
Error ENOENT: i don't have pgid 0.4d
$ sudo ceph pg 0.1c query
Error ENOENT: i don't have pgid 0.1c

Not able to explain why the peering is stuck. BTW, the rbd pool doesn't contain 
any data.

Varada

From: Ceph-community [mailto:ceph-community-boun...@lists.ceph.com]
 On Behalf Of Sage Weil
Sent: Monday, September 22, 2014 10:44 PM
To: Sahana Lokeshappa; 

Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-25 Thread Sahana Lokeshappa
Hi All,

Here are the steps I followed to get all PGs back to the active+clean state 
(a rough command sketch follows the list below). I still don't know the root 
cause of this PG state.

1. Force create pgs which are in stale+down+peering
2. Stop osd.12
3. Mark osd.12 as lost
4. Start osd.12
5. All pgs were back to active+clean state
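
Roughly, the commands behind those steps (the PG IDs and OSD number are the ones from this thread; the exact service start/stop syntax depends on the init system, so treat this as a sketch):

    ceph pg force_create_pg 0.49
    ceph pg force_create_pg 0.4d
    ceph pg force_create_pg 0.1c
    sudo service ceph stop osd.12          # or: stop ceph-osd id=12 (upstart)
    ceph osd lost 12 --yes-i-really-mean-it
    sudo service ceph start osd.12         # or: start ceph-osd id=12
    ceph -s                                # verify all PGs are active+clean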

Thanks
Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park
C V Raman nagar, Bangalore 560093
T: +918042422283 
sahana.lokesha...@sandisk.com


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sahana 
Lokeshappa
Sent: Thursday, September 25, 2014 1:26 PM
To: Sage Weil
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

Replies Inline :

Sahana Lokeshappa
Test Development Engineer I
SanDisk Corporation
3rd Floor, Bagmane Laurel, Bagmane Tech Park C V Raman nagar, Bangalore 560093
T: +918042422283
sahana.lokesha...@sandisk.com

-Original Message-
From: Sage Weil [mailto:sw...@redhat.com]
Sent: Wednesday, September 24, 2014 6:10 PM
To: Sahana Lokeshappa
Cc: Varada Kari; ceph-us...@ceph.com
Subject: RE: [Ceph-community] Pgs are in stale+down+peering state

On Wed, 24 Sep 2014, Sahana Lokeshappa wrote:
 2.a9518 0   0   0   0   2172649472  3001
 3001active+clean2014-09-22 17:49:35.357586  6826'35762
 17842:72706 [12,7,28]   12  [12,7,28]   12
 6826'35762
 2014-09-22 11:33:55.985449  0'0 2014-09-16 20:11:32.693864

Can you verify that 2.a9 exists in teh data directory for 12, 7, and/or 28?  If 
so the next step would be to enable logging (debug osd = 20, debug ms = 1) and 
see wy peering is stuck...

Yes 2.a9 directories are present in osd.12, 7 ,28

and 0.49 0.4d and 0.1c directories are not present in respective acting osds.


Here are the logs I can see when debugs were raised to 20


2014-09-24 18:38:41.706566 7f92e2dc8700  7 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] replica_scrub
2014-09-24 18:38:41.706586 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map
2014-09-24 18:38:41.706592 7f92e2dc8700 20 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] scrub_map_chunk [476de738//0//-1,f38//0//-1)
2014-09-24 18:38:41.711778 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] _scan_list scanning 23 objects deeply
2014-09-24 18:38:41.730881 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.73 7f92eede0700 20 osd.12 17850 share_map_peer 
0x89cda20 already has epoch 17850
2014-09-24 18:38:41.822444 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.822519 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd2eb080 already has epoch 17850
2014-09-24 18:38:41.878894 7f92eede0700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.878921 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0xd5cd5a0 already has epoch 17850
2014-09-24 18:38:41.918307 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.918426 7f92eede0700 20 osd.12 17850 share_map_peer 
0x1161bde0 already has epoch 17850
2014-09-24 18:38:41.951678 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:41.951709 7f92eede0700 20 osd.12 17850 share_map_peer 
0x7fc5700 already has epoch 17850
2014-09-24 18:38:42.064759 7f92e2dc8700 10 osd.12 pg_epoch: 17850 pg[2.738( v 
6870'28894 (4076'25093,6870'28894] local-les=17723 n=537 ec=188 les/c 
17723/17725 17722/17722/17709) [57,12,48] r=1 lpr=17722 pi=17199-17721/6 
luod=0'0 crt=0'0 lcod 0'0 active] build_scrub_map_chunk done.
2014-09-24 18:38:42.107016 7f92ed5dd700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.107032 7f92eede0700 20 osd.12 17850 share_map_peer 
0x10377b80 already has epoch 17850
2014-09-24 18:38:42.109356 7f92f15e5700 10 osd.12 17850 do_waiters -- start
2014-09-24 18:38:42.109372 7f92f15e5700 10 osd.12 17850 do_waiters -- finish
2014-09-24 18:38:42.109373 7f92f15e5700 20 osd.12 17850 _dispatch 0xeb0d900 
replica scrub(pg: 
2.738,from:0'0,to:6489'28646,epoch:17850,start:f38//0//-1,end:92371f38//0//-1,chunky:1,deep:1,version:5)
 v5

Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Micha Krause

Hi,

   That's strange.  3.13 is way before any changes that could have had any

such effect.  Can you by any chance try with older kernels to see where
it starts misbehaving for you?  3.12?  3.10?  3.8?


My crush tunables are set to bobtail, so I can't go below 3.9. I will try 3.12 
tomorrow and
report back.


Ok, I have tested 3.12.9 and it also hangs.

I have no other pre-built kernels to test :-(.

If I have to compile kernels anyway I will test 3.16.3 as well :-/.


Micha Krause
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-25 Thread Alexandre DERUMIER
As Dieter asked, what replication level is this, I guess 1? 

Yes, replication x1 for these benchmarks.

Now at 3 nodes and 6 OSDs you're getting about the performance of a single 
SSD, food for thought. 

Yes, sure. I don't have more nodes to test, but I would like to know if it 
scales beyond 20K IOPS with more nodes.

But clearly, the CPU is the limit.



- Original Message - 

From: Christian Balzer ch...@gol.com 
To: ceph-users@lists.ceph.com 
Sent: Thursday, 25 September 2014 06:50:31 
Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
IOPS 

On Wed, 24 Sep 2014 20:49:21 +0200 (CEST) Alexandre DERUMIER wrote: 

 What about writes with Giant? 
 
 I'm around 
 - 4k iops (4k random) with 1osd (1 node - 1 osd) 
 - 8k iops (4k random) with 2 osd (1 node - 2 osd) 
 - 16K iops (4k random) with 4 osd (2 nodes - 2 osd by node) 
 - 22K iops (4k random) with 6 osd (3 nodes - 2 osd by node) 
 
 Seems to scale, but I'm CPU-bound on the node (8 cores, E5-2603 v2 @ 1.80GHz, 
 100% CPU for 2 OSDs) 
 
You don't even need a full SSD cluster to see that Ceph has a lot of room 
for improvements, see my "Slow IOPS on RBD compared to journal and backing 
devices" thread in May. 

As Dieter asked, what replication level is this, I guess 1? 

Now at 3 nodes and 6 OSDs you're getting about the performance of a single 
SSD, food for thought. 

Christian 

 - Original Message - 
 
 From: Sebastien Han sebastien@enovance.com 
 To: Jian Zhang jian.zh...@intel.com 
 Cc: Alexandre DERUMIER aderum...@odiso.com, ceph-users@lists.ceph.com 
 Sent: Tuesday, 23 September 2014 17:41:38 
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 
 2K IOPS 
 
 What about writes with Giant? 
 
 On 18 Sep 2014, at 08:12, Zhang, Jian jian.zh...@intel.com wrote: 
 
  Has anyone ever tested multi-volume performance on a *FULL* SSD 
  setup? We are able to get ~18K IOPS for 4K random read on a single 
  volume with fio (with the rbd engine) on a 12x DC3700 setup, but only able 
  to get ~23K (peak) IOPS even with multiple volumes. Seems the maximum 
  random write performance we can get on the entire cluster is quite 
  close to single-volume performance. 
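  
  For reference, the single-volume runs used an fio job roughly like the 
  one below (the pool, image and client names are placeholders, not our 
  actual setup): 
  
  fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testvol \
      --invalidate=0 --direct=1 --rw=randread --bs=4k --iodepth=32 \
      --runtime=60 --time_based --name=rbd-4k-randread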
  
  Thanks 
  Jian 
  
  
  -Original Message- 
  From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf 
  Of Sebastien Han Sent: Tuesday, September 16, 2014 9:33 PM 
  To: Alexandre DERUMIER 
  Cc: ceph-users@lists.ceph.com 
  Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go 
  over 3, 2K IOPS 
  
  Hi, 
  
  Thanks for keeping us updated on this subject. 
  dsync is definitely killing the ssd. 
  
  I don't have much to add, I'm just surprised that you're only getting 
  5299 with 0.85, since I've been able to get 6.4K; well, I was using the 
  200GB model, that might explain this. 
  
  
  On 12 Sep 2014, at 16:32, Alexandre DERUMIER aderum...@odiso.com 
  wrote: 
  
  Here are the results for the Intel S3500.
  
  Max performance is with Ceph 0.85 + optracker disabled.
  The Intel S3500 doesn't have the d_sync problem like the Crucial.
  
  %util shows almost 100% for read and write, so maybe the SSD disk 
  performance is the limit.
  
  I have some STEC ZeusRAM 8GB in stock (I used them for ZFS ZIL), I'll 
  try to bench them next week.
  
  
  
  
  
  
  INTEL s3500
  -----------
  raw disk
  --------
  
  randread: fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k 
  --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio
  result: bw=288207KB/s, iops=72051
  
  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
  sdb 0,00 0,00 73454,00 0,00 293816,00 0,00 8,00 30,96 0,42 0,42 0,00 0,01 99,90
  
  randwrite: fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k 
  --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio --sync=1
  result: bw=48131KB/s, iops=12032
  
  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
  sdb 0,00 0,00 0,00 24120,00 0,00 48240,00 4,00 2,08 0,09 0,00 0,09 0,04 100,00
  
  
  ceph 0.80
  ---------
  randread: no tuning: bw=24578KB/s, iops=6144
  
  randwrite: bw=10358KB/s, iops=2589
  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
  sdb 0,00 373,00 0,00 8878,00 0,00 34012,50 7,66 1,63 0,18 0,00 0,18 0,06 50,90
  
  
  ceph 0.85:
  ----------
  
  randread: bw=41406KB/s, iops=10351
  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
  sdb 2,00 0,00 10425,00 0,00 41816,00 0,00 8,02 1,36 0,13 0,13 0,00 0,07 75,90
  
  randwrite: bw=17204KB/s, iops=4301
  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
  sdb 0,00 333,00 0,00 9788,00 0,00 57909,00 11,83 1,46 0,15 0,00 0,15 0,07 67,80
  
  
  ceph 0.85 tuning op_tracker=false
  ---------------------------------
  
  randread:

Re: [ceph-users] bug: ceph-deploy does not support jumbo frame

2014-09-25 Thread yuelongguang
Thanks. I have not configured the switch.

I just now learned about it.
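
A quick way to verify whether 9000-byte frames actually pass end to end between the admin node and the OSD hosts (the interface and host names below are placeholders):

    ip link show eth0 | grep mtu        # confirm the interface MTU is 9000
    ping -M do -s 8972 osd-host         # 8972 = 9000 - 20 (IP) - 8 (ICMP) bytes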









On 2014-09-25 12:38:48, Irek Fasikhov malm...@gmail.com wrote:

Have you configured the switch?


2014-09-25 5:07 GMT+04:00 yuelongguang fasts...@163.com:

Hi all,
after I set mtu=9000, ceph-deploy waits for a reply all the time at 'detecting 
platform for host.'
 
How can I know what commands ceph-deploy needs that OSD to run?
 
thanks










--

Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Andrei Mikhailovsky
Guys, 

Have done some testing with 3.16.3-031603-generic downloaded from the Ubuntu utopic 
branch. The hang task problem is gone when using a large block size (tested with 
1M and 4M) and I could no longer reproduce the hang tasks while doing 100 dd 
tests in a for loop. 

However, I can confirm that I am still getting hang tasks while working with a 
4K block size. The hang tasks start after about an hour, but they do not cause 
the server to crash. After a while the dd test times out and continues with the 
loop. This is what I was running: 

for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K count=25K 
oflag=direct ; done 

The following test definitely produces hang tasks like these: 

[23160.549785] INFO: task dd:2033 blocked for more than 120 seconds. 
[23160.588364] Tainted: G OE 3.16.3-031603-generic #201409171435 
[23160.627998] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message. 
[23160.706856] dd D 000b 0 2033 23859 0x 
[23160.706861] 88011cec78c8 0082 88011cec78d8 
88011cec7fd8 
[23160.706865] 000143c0 000143c0 88048661bcc0 
880113441440 
[23160.706868] 88011cec7898 88067fd54cc0 880113441440 
880113441440 
[23160.706871] Call Trace: 
[23160.706883] [81791f69] schedule+0x29/0x70 
[23160.706887] [8179203f] io_schedule+0x8f/0xd0 
[23160.706893] [81219e74] dio_await_completion+0x54/0xd0 
[23160.706897] [8121c6a8] do_blockdev_direct_IO+0x958/0xcc0 
[23160.706903] [810ba81e] ? wake_up_bit+0x2e/0x40 
[23160.706908] [812aa865] ? jbd2_journal_dirty_metadata+0xc5/0x260 
[23160.706914] [81265320] ? ext4_get_block_write+0x20/0x20 
[23160.706919] [8121ca5c] __blockdev_direct_IO+0x4c/0x50 
[23160.706922] [81265320] ? ext4_get_block_write+0x20/0x20 
[23160.706928] [8129f44e] ext4_ind_direct_IO+0xce/0x410 
[23160.706931] [81265320] ? ext4_get_block_write+0x20/0x20 
[23160.706935] [81261fbb] ext4_ext_direct_IO+0x1bb/0x2a0 
[23160.706938] [81290158] ? __ext4_journal_stop+0x78/0xa0 
[23160.706942] [812627fc] ext4_direct_IO+0xec/0x1e0 
[23160.706946] [8120a003] ? __mark_inode_dirty+0x53/0x2d0 
[23160.706952] [8116d39b] generic_file_direct_write+0xbb/0x180 
[23160.706957] [811ffbe2] ? mnt_clone_write+0x12/0x30 
[23160.706960] [8116d707] __generic_file_write_iter+0x2a7/0x350 
[23160.706963] [8125c2b1] ext4_file_write_iter+0x111/0x3d0 
[23160.706969] [81192fd4] ? iov_iter_init+0x14/0x40 
[23160.706976] [811e0c8b] new_sync_write+0x7b/0xb0 
[23160.706978] [811e19a7] vfs_write+0xc7/0x1f0 
[23160.706980] [811e1eaf] SyS_write+0x4f/0xb0 
[23160.706985] [81795ded] system_call_fastpath+0x1a/0x1f 
[23280.705400] INFO: task dd:2033 blocked for more than 120 seconds. 
[23280.745358] Tainted: G OE 3.16.3-031603-generic #201409171435 
[23280.785069] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message. 
[23280.864158] dd D 000b 0 2033 23859 0x 
[23280.864164] 88011cec78c8 0082 88011cec78d8 
88011cec7fd8 
[23280.864167] 000143c0 000143c0 88048661bcc0 
880113441440 
[23280.864170] 88011cec7898 88067fd54cc0 880113441440 
880113441440 
[23280.864173] Call Trace: 
[23280.864185] [81791f69] schedule+0x29/0x70 
[23280.864197] [8179203f] io_schedule+0x8f/0xd0 
[23280.864203] [81219e74] dio_await_completion+0x54/0xd0 
[23280.864207] [8121c6a8] do_blockdev_direct_IO+0x958/0xcc0 
[23280.864213] [810ba81e] ? wake_up_bit+0x2e/0x40 
[23280.864218] [812aa865] ? jbd2_journal_dirty_metadata+0xc5/0x260 
[23280.864224] [81265320] ? ext4_get_block_write+0x20/0x20 
[23280.864229] [8121ca5c] __blockdev_direct_IO+0x4c/0x50 
[23280.864239] [81265320] ? ext4_get_block_write+0x20/0x20 
[23280.864244] [8129f44e] ext4_ind_direct_IO+0xce/0x410 
[23280.864247] [81265320] ? ext4_get_block_write+0x20/0x20 
[23280.864251] [81261fbb] ext4_ext_direct_IO+0x1bb/0x2a0 
[23280.864254] [81290158] ? __ext4_journal_stop+0x78/0xa0 
[23280.864258] [812627fc] ext4_direct_IO+0xec/0x1e0 
[23280.864263] [8120a003] ? __mark_inode_dirty+0x53/0x2d0 
[23280.864268] [8116d39b] generic_file_direct_write+0xbb/0x180 
[23280.864273] [811ffbe2] ? mnt_clone_write+0x12/0x30 
[23280.864284] [8116d707] __generic_file_write_iter+0x2a7/0x350 
[23280.864289] [8125c2b1] ext4_file_write_iter+0x111/0x3d0 
[23280.864295] [81192fd4] ? iov_iter_init+0x14/0x40 
[23280.864300] [811e0c8b] new_sync_write+0x7b/0xb0 
[23280.864302] [811e19a7] vfs_write+0xc7/0x1f0 
[23280.864307] [811e1eaf] SyS_write+0x4f/0xb0 
[23280.864314] [81795ded] system_call_fastpath+0x1a/0x1f 
[23400.861043] INFO: task dd:2033 blocked for 

[ceph-users] pgs stuck in active+clean+replay state

2014-09-25 Thread Pavel V. Kaygorodov
Hi!

16 PGs in our ceph cluster have been in active+clean+replay state for more than one day.
All clients are working fine.
Is this ok?

root@bastet-mon1:/# ceph -w
cluster fffeafa2-a664-48a7-979a-517e3ffa0da1
 health HEALTH_OK
 monmap e3: 3 mons at 
{1=10.92.8.80:6789/0,2=10.92.8.81:6789/0,3=10.92.8.82:6789/0}, election epoch 
2570, quorum 0,1,2 1,2,3
 osdmap e3108: 16 osds: 16 up, 16 in
  pgmap v1419232: 8704 pgs, 6 pools, 513 GB data, 125 kobjects
2066 GB used, 10879 GB / 12945 GB avail
8688 active+clean
  16 active+clean+replay
  client io 3237 kB/s wr, 68 op/s


root@bastet-mon1:/# ceph pg dump | grep replay
dumped all in format plain
0.fd0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.902766  0'0 3108:2628   
[0,7,14,8] [0,7,14,8]   0   0'0 2014-09-23 02:23:49.463704  
0'0 2014-09-23 02:23:49.463704
0.e80   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:21.945082  0'0 3108:1823   
[2,7,9,10] [2,7,9,10]   2   0'0 2014-09-22 14:37:32.910787  
0'0 2014-09-22 14:37:32.910787
0.aa0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.326607  0'0 3108:2451   
[0,7,15,12][0,7,15,12]  0   0'0 2014-09-23 00:39:10.717363  
0'0 2014-09-23 00:39:10.717363
0.9c0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.325229  0'0 3108:1917   
[0,7,9,12] [0,7,9,12]   0   0'0 2014-09-22 14:40:06.694479  
0'0 2014-09-22 14:40:06.694479
0.9a0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:29.325074  0'0 3108:2486   
[0,7,14,11][0,7,14,11]  0   0'0 2014-09-23 01:14:55.825900  
0'0 2014-09-23 01:14:55.825900
0.910   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.839148  0'0 3108:1962   
[0,7,9,10] [0,7,9,10]   0   0'0 2014-09-22 14:37:44.652796  
0'0 2014-09-22 14:37:44.652796
0.8c0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.838683  0'0 3108:2635   
[0,2,9,11] [0,2,9,11]   0   0'0 2014-09-23 01:52:52.390529  
0'0 2014-09-23 01:52:52.390529
0.8b0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:21.215964  0'0 3108:1636   
[2,0,8,14] [2,0,8,14]   2   0'0 2014-09-23 01:31:38.134466  
0'0 2014-09-23 01:31:38.134466
0.500   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:35.869160  0'0 3108:1801   
[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 08:38:53.963779  
0'0 2014-09-13 10:27:26.977929
0.440   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:35.871409  0'0 3108:1819   
[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 11:59:05.208164  
0'0 2014-09-20 11:59:05.208164
0.390   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.653190  0'0 3108:1827   
[0,2,9,10] [0,2,9,10]   0   0'0 2014-09-22 14:40:50.697850  
0'0 2014-09-22 14:40:50.697850
0.320   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:10.970515  0'0 3108:1719   
[2,0,14,9] [2,0,14,9]   2   0'0 2014-09-20 12:06:23.716480  
0'0 2014-09-20 12:06:23.716480
0.2c0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.647268  0'0 3108:2540   
[0,7,12,8] [0,7,12,8]   0   0'0 2014-09-22 23:44:53.387815  
0'0 2014-09-22 23:44:53.387815
0.1f0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:28.651059  0'0 3108:2522   
[0,2,14,11][0,2,14,11]  0   0'0 2014-09-22 23:38:16.315755  
0'0 2014-09-22 23:38:16.315755
0.7 0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:35.848797  0'0 3108:1739   
[7,0,12,10][7,0,12,10]  7   0'0 2014-09-22 14:43:38.224718  
0'0 2014-09-22 14:43:38.224718
0.3 0   0   0   0   0   0   0   
active+clean+replay 2014-09-24 02:38:08.885066  0'0 3108:1640   
[2,0,11,15][2,0,11,15]  2   0'0 2014-09-20 06:18:55.987318  
0'0 2014-09-20 06:18:55.987318

With best regards,
  Pavel.

___
ceph-users mailing list

Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Andrei Mikhailovsky
Right, I've stopped the tests because it is just getting ridiculous. Without 
rbd cache enabled, the dd tests run extremely slowly: 

dd if=/dev/zero of=/tmp/mount/1G bs=1M count=1000 oflag=direct 
230+0 records in 
230+0 records out 
241172480 bytes (241 MB) copied, 929.71 s, 259 kB/s 

Any thoughts on why I am getting 250 kB/s instead of the expected 100 MB/s+ with a 
large block size? 

How do I investigate what's causing this crappy performance? 
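
I suppose one way to narrow it down is to benchmark the cluster directly with 
rados bench, bypassing the mapped rbd and the ext4 on top of it, and to compare 
a buffered dd run; a rough sketch (the pool name is a placeholder): 

    rados bench -p rbd 30 write --no-cleanup   # raw cluster write throughput 
    rados bench -p rbd 30 seq                  # raw cluster sequential reads 
    dd if=/dev/zero of=/tmp/mount/1G bs=1M count=1000 conv=fdatasync   # buffered write 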

Cheers 

Andrei 

- Original Message -

 From: Andrei Mikhailovsky and...@arhont.com
 To: Micha Krause mi...@krausam.de
 Cc: ceph-users@lists.ceph.com
 Sent: Thursday, 25 September, 2014 10:58:07 AM
 Subject: Re: [ceph-users] Frequent Crashes on rbd to nfs gateway
 Server

 Guys,

 Have done some testing with 3.16.3-031603-generic downloaded from
 Ubuntu utopic branch. The hang task problem is gone when using large
 block size (tested with 1M and 4M) and I could no longer preproduce
 the hang tasks while doing 100 dd tests in a for loop.

 However, I can confirm that I am still getting hang tasks while
 working with a 4K block size. The hang tasks start after about an
 hour, but they do not cause the server crash. After a while the dd
 test times out and continues with the loop. This is what I was
 running:

 for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K
 count=25K oflag=direct ; done

 The following test definately produces the hang tasks like these:

 [23160.549785] INFO: task dd:2033 blocked for more than 120 seconds.
 [23160.588364] Tainted: G OE 3.16.3-031603-generic #201409171435
 [23160.627998] echo 0  /proc/sys/kernel/hung_task_timeout_secs
 disables this message.
 [23160.706856] dd D 000b 0 2033 23859 0x
 [23160.706861] 88011cec78c8 0082 88011cec78d8
 88011cec7fd8
 [23160.706865] 000143c0 000143c0 88048661bcc0
 880113441440
 [23160.706868] 88011cec7898 88067fd54cc0 880113441440
 880113441440
 [23160.706871] Call Trace:
 [23160.706883] [81791f69] schedule+0x29/0x70
 [23160.706887] [8179203f] io_schedule+0x8f/0xd0
 [23160.706893] [81219e74] dio_await_completion+0x54/0xd0
 [23160.706897] [8121c6a8] do_blockdev_direct_IO+0x958/0xcc0
 [23160.706903] [810ba81e] ? wake_up_bit+0x2e/0x40
 [23160.706908] [812aa865] ?
 jbd2_journal_dirty_metadata+0xc5/0x260
 [23160.706914] [81265320] ? ext4_get_block_write+0x20/0x20
 [23160.706919] [8121ca5c] __blockdev_direct_IO+0x4c/0x50
 [23160.706922] [81265320] ? ext4_get_block_write+0x20/0x20
 [23160.706928] [8129f44e] ext4_ind_direct_IO+0xce/0x410
 [23160.706931] [81265320] ? ext4_get_block_write+0x20/0x20
 [23160.706935] [81261fbb] ext4_ext_direct_IO+0x1bb/0x2a0
 [23160.706938] [81290158] ? __ext4_journal_stop+0x78/0xa0
 [23160.706942] [812627fc] ext4_direct_IO+0xec/0x1e0
 [23160.706946] [8120a003] ? __mark_inode_dirty+0x53/0x2d0
 [23160.706952] [8116d39b]
 generic_file_direct_write+0xbb/0x180
 [23160.706957] [811ffbe2] ? mnt_clone_write+0x12/0x30
 [23160.706960] [8116d707]
 __generic_file_write_iter+0x2a7/0x350
 [23160.706963] [8125c2b1] ext4_file_write_iter+0x111/0x3d0
 [23160.706969] [81192fd4] ? iov_iter_init+0x14/0x40
 [23160.706976] [811e0c8b] new_sync_write+0x7b/0xb0
 [23160.706978] [811e19a7] vfs_write+0xc7/0x1f0
 [23160.706980] [811e1eaf] SyS_write+0x4f/0xb0
 [23160.706985] [81795ded] system_call_fastpath+0x1a/0x1f
 [23280.705400] INFO: task dd:2033 blocked for more than 120 seconds.
 [23280.745358] Tainted: G OE 3.16.3-031603-generic #201409171435
 [23280.785069] echo 0  /proc/sys/kernel/hung_task_timeout_secs
 disables this message.
 [23280.864158] dd D 000b 0 2033 23859 0x
 [23280.864164] 88011cec78c8 0082 88011cec78d8
 88011cec7fd8
 [23280.864167] 000143c0 000143c0 88048661bcc0
 880113441440
 [23280.864170] 88011cec7898 88067fd54cc0 880113441440
 880113441440
 [23280.864173] Call Trace:
 [23280.864185] [81791f69] schedule+0x29/0x70
 [23280.864197] [8179203f] io_schedule+0x8f/0xd0
 [23280.864203] [81219e74] dio_await_completion+0x54/0xd0
 [23280.864207] [8121c6a8] do_blockdev_direct_IO+0x958/0xcc0
 [23280.864213] [810ba81e] ? wake_up_bit+0x2e/0x40
 [23280.864218] [812aa865] ?
 jbd2_journal_dirty_metadata+0xc5/0x260
 [23280.864224] [81265320] ? ext4_get_block_write+0x20/0x20
 [23280.864229] [8121ca5c] __blockdev_direct_IO+0x4c/0x50
 [23280.864239] [81265320] ? ext4_get_block_write+0x20/0x20
 [23280.864244] [8129f44e] ext4_ind_direct_IO+0xce/0x410
 [23280.864247] [81265320] ? ext4_get_block_write+0x20/0x20
 [23280.864251] [81261fbb] ext4_ext_direct_IO+0x1bb/0x2a0
 [23280.864254] [81290158] ? __ext4_journal_stop+0x78/0xa0
 [23280.864258] 

[ceph-users] ceph debian systemd

2014-09-25 Thread zorg

Hi,
I'm using ceph version 0.80.5.

I am trying to get a Ceph cluster working using Debian and systemd.

I have already managed to install a Ceph cluster on Debian with sysvinit 
without any problem.


But after installing everything using ceph-deploy, without errors,

not all my OSDs start after rebooting (they are not mounted),
and what is stranger, at each reboot it's not the same OSDs that fail to 
start, and some start 10 minutes later.


I have this in the log:

Sep 25 12:18:23 addceph3 systemd-udevd[437]: 
'/usr/sbin/ceph-disk-activate /dev/sdh1' [1005] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[476]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdq1' [1142]
Sep 25 12:18:23 addceph3 systemd-udevd[486]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdg1' [998]
Sep 25 12:18:23 addceph3 systemd-udevd[486]: 
'/usr/sbin/ceph-disk-activate /dev/sdg1' [998] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[476]: 
'/usr/sbin/ceph-disk-activate /dev/sdq1' [1142] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[458]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdi1' [1001]
Sep 25 12:18:23 addceph3 systemd-udevd[458]: 
'/usr/sbin/ceph-disk-activate /dev/sdi1' [1001] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[444]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdj1' [1006]
Sep 25 12:18:23 addceph3 systemd-udevd[460]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdk1' [1152]
Sep 25 12:18:23 addceph3 systemd-udevd[444]: 
'/usr/sbin/ceph-disk-activate /dev/sdj1' [1006] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[460]: 
'/usr/sbin/ceph-disk-activate /dev/sdk1' [1152] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[469]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdm1' [1110]
Sep 25 12:18:23 addceph3 systemd-udevd[470]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdp1' [1189]
Sep 25 12:18:23 addceph3 systemd-udevd[469]: 
'/usr/sbin/ceph-disk-activate /dev/sdm1' [1110] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[470]: 
'/usr/sbin/ceph-disk-activate /dev/sdp1' [1189] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[468]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdl1' [1177]
Sep 25 12:18:23 addceph3 systemd-udevd[447]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdo1' [1181]
Sep 25 12:18:23 addceph3 systemd-udevd[468]: 
'/usr/sbin/ceph-disk-activate /dev/sdl1' [1177] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[447]: 
'/usr/sbin/ceph-disk-activate /dev/sdo1' [1181] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[490]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdr1' [1160]
Sep 25 12:18:23 addceph3 systemd-udevd[490]: 
'/usr/sbin/ceph-disk-activate /dev/sdr1' [1160] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 systemd-udevd[445]: timeout: killing 
'/usr/sbin/ceph-disk-activate /dev/sdn1' [1202]
Sep 25 12:18:23 addceph3 systemd-udevd[445]: 
'/usr/sbin/ceph-disk-activate /dev/sdn1' [1202] terminated by signal 9 
(Killed)
Sep 25 12:18:23 addceph3 kernel: [   39.813701] XFS (sdo1): Mounting 
Filesystem
Sep 25 12:18:23 addceph3 kernel: [   39.854510] XFS (sdo1): Ending clean 
mount
Sep 25 12:22:59 addceph3 systemd[1]: ceph.service operation timed out. 
Terminating.
Sep 25 12:22:59 addceph3 systemd[1]: Failed to start LSB: Start Ceph 
distributed file system daemons at boot time.




I'm not actually very experienced with systemd
and don't really know how Ceph handles systemd.

If someone can give me a bit of information,

thanks
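
For what it's worth, the activation that udev is timing out on can also be run 
by hand after boot, roughly like this (device name taken from the log above):

    ceph-disk list                            # how ceph-disk sees the partitions
    /usr/sbin/ceph-disk-activate /dev/sdh1    # the same call udev killed, one disk at a time
    mount | grep ceph                         # confirm the OSD data dirs got mounted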



--
probeSys - GNU/Linux specialists
site web : http://www.probesys.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Ilya Dryomov
On Thu, Sep 25, 2014 at 1:58 PM, Andrei Mikhailovsky and...@arhont.com wrote:
 Guys,

 Have done some testing with 3.16.3-031603-generic downloaded from Ubuntu
 utopic branch. The hang task problem is gone when using large block size
 (tested with 1M and 4M) and I could no longer preproduce the hang tasks
 while doing 100 dd tests in a for loop.



 However, I can confirm that I am still getting hang tasks while working with
 a 4K block size. The hang tasks start after about an hour, but they do not
 cause the server crash. After a while the dd test times out and continues
 with the loop. This is what I was running:

 for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K count=25K
 oflag=direct ; done

 The following test definately produces the hang tasks like these:

 [23160.549785] INFO: task dd:2033 blocked for more than 120 seconds.
 [23160.588364]   Tainted: G   OE 3.16.3-031603-generic
 #201409171435
 [23160.627998] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables
 this message.
 [23160.706856] dd  D 000b 0  2033  23859
 0x
 [23160.706861]  88011cec78c8 0082 88011cec78d8
 88011cec7fd8
 [23160.706865]  000143c0 000143c0 88048661bcc0
 880113441440
 [23160.706868]  88011cec7898 88067fd54cc0 880113441440
 880113441440
 [23160.706871] Call Trace:
 [23160.706883]  [81791f69] schedule+0x29/0x70
 [23160.706887]  [8179203f] io_schedule+0x8f/0xd0
 [23160.706893]  [81219e74] dio_await_completion+0x54/0xd0
 [23160.706897]  [8121c6a8] do_blockdev_direct_IO+0x958/0xcc0
 [23160.706903]  [810ba81e] ? wake_up_bit+0x2e/0x40
 [23160.706908]  [812aa865] ?
 jbd2_journal_dirty_metadata+0xc5/0x260
 [23160.706914]  [81265320] ? ext4_get_block_write+0x20/0x20
 [23160.706919]  [8121ca5c] __blockdev_direct_IO+0x4c/0x50
 [23160.706922]  [81265320] ? ext4_get_block_write+0x20/0x20
 [23160.706928]  [8129f44e] ext4_ind_direct_IO+0xce/0x410
 [23160.706931]  [81265320] ? ext4_get_block_write+0x20/0x20
 [23160.706935]  [81261fbb] ext4_ext_direct_IO+0x1bb/0x2a0
 [23160.706938]  [81290158] ? __ext4_journal_stop+0x78/0xa0
 [23160.706942]  [812627fc] ext4_direct_IO+0xec/0x1e0
 [23160.706946]  [8120a003] ? __mark_inode_dirty+0x53/0x2d0
 [23160.706952]  [8116d39b] generic_file_direct_write+0xbb/0x180
 [23160.706957]  [811ffbe2] ? mnt_clone_write+0x12/0x30
 [23160.706960]  [8116d707] __generic_file_write_iter+0x2a7/0x350
 [23160.706963]  [8125c2b1] ext4_file_write_iter+0x111/0x3d0
 [23160.706969]  [81192fd4] ? iov_iter_init+0x14/0x40
 [23160.706976]  [811e0c8b] new_sync_write+0x7b/0xb0
 [23160.706978]  [811e19a7] vfs_write+0xc7/0x1f0
 [23160.706980]  [811e1eaf] SyS_write+0x4f/0xb0
 [23160.706985]  [81795ded] system_call_fastpath+0x1a/0x1f
 [23280.705400] INFO: task dd:2033 blocked for more than 120 seconds.
 [23280.745358]   Tainted: G   OE 3.16.3-031603-generic
 #201409171435
 [23280.785069] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables
 this message.
 [23280.864158] dd  D 000b 0  2033  23859
 0x
 [23280.864164]  88011cec78c8 0082 88011cec78d8
 88011cec7fd8
 [23280.864167]  000143c0 000143c0 88048661bcc0
 880113441440
 [23280.864170]  88011cec7898 88067fd54cc0 880113441440
 880113441440
 [23280.864173] Call Trace:
 [23280.864185]  [81791f69] schedule+0x29/0x70
 [23280.864197]  [8179203f] io_schedule+0x8f/0xd0
 [23280.864203]  [81219e74] dio_await_completion+0x54/0xd0
 [23280.864207]  [8121c6a8] do_blockdev_direct_IO+0x958/0xcc0
 [23280.864213]  [810ba81e] ? wake_up_bit+0x2e/0x40
 [23280.864218]  [812aa865] ?
 jbd2_journal_dirty_metadata+0xc5/0x260
 [23280.864224]  [81265320] ? ext4_get_block_write+0x20/0x20
 [23280.864229]  [8121ca5c] __blockdev_direct_IO+0x4c/0x50
 [23280.864239]  [81265320] ? ext4_get_block_write+0x20/0x20
 [23280.864244]  [8129f44e] ext4_ind_direct_IO+0xce/0x410
 [23280.864247]  [81265320] ? ext4_get_block_write+0x20/0x20
 [23280.864251]  [81261fbb] ext4_ext_direct_IO+0x1bb/0x2a0
 [23280.864254]  [81290158] ? __ext4_journal_stop+0x78/0xa0
 [23280.864258]  [812627fc] ext4_direct_IO+0xec/0x1e0
 [23280.864263]  [8120a003] ? __mark_inode_dirty+0x53/0x2d0
 [23280.864268]  [8116d39b] generic_file_direct_write+0xbb/0x180
 [23280.864273]  [811ffbe2] ? mnt_clone_write+0x12/0x30
 [23280.864284]  [8116d707] __generic_file_write_iter+0x2a7/0x350
 [23280.864289]  [8125c2b1] ext4_file_write_iter+0x111/0x3d0
 [23280.864295]  [81192fd4] ? iov_iter_init+0x14/0x40
 [23280.864300]  [811e0c8b] new_sync_write+0x7b/0xb0
 [23280.864302] 

Re: [ceph-users] [ceph-calamari] Setting up Ceph calamari :: Made Simple

2014-09-25 Thread Johan Kooijman
Karan,

Thanks for the tutorial, great stuff. Please note that in order to get the
graphs working, I had to install ipvsadm and create a symlink from
/sbin/ipvsadm to /usr/bin/ipvsadm (CentOS 6).
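
Concretely, that was roughly the following (as root on CentOS 6):

    yum install -y ipvsadm
    ln -s /sbin/ipvsadm /usr/bin/ipvsadm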

On Wed, Sep 24, 2014 at 10:16 AM, Karan Singh karan.si...@csc.fi wrote:

 Hello Cephers

 Now here comes my new blog on setting up Ceph Calamari.

 I hope you would like this step-by-step guide

 http://karan-mj.blogspot.fi/2014/09/ceph-calamari-survival-guide.html


 - Karan -


 ___
 ceph-calamari mailing list
 ceph-calam...@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com




-- 
Met vriendelijke groeten / With kind regards,
Johan Kooijman
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Andrei Mikhailovsky
Ilya, 

I've not used rbd map on older kernels. I'm just experimenting with rbd map to have 
an iSCSI and NFS gateway service for hypervisors such as XenServer and VMware. 
I've tried it with the latest Ubuntu LTS kernel, 3.13 I believe, and noticed the 
issue. 
Can you not reproduce the hang tasks when doing dd testing? Have you tried 4K 
block sizes and running it for some time, like I have done? 

Thanks 

Andrei 

- Original Message -

 From: Ilya Dryomov ilya.dryo...@inktank.com
 To: Andrei Mikhailovsky and...@arhont.com
 Cc: Micha Krause mi...@krausam.de, ceph-users@lists.ceph.com
 Sent: Thursday, 25 September, 2014 12:04:37 PM
 Subject: Re: [ceph-users] Frequent Crashes on rbd to nfs gateway
 Server

 On Thu, Sep 25, 2014 at 1:58 PM, Andrei Mikhailovsky
 and...@arhont.com wrote:
  Guys,
 
  Have done some testing with 3.16.3-031603-generic downloaded from
  Ubuntu
  utopic branch. The hang task problem is gone when using large block
  size
  (tested with 1M and 4M) and I could no longer preproduce the hang
  tasks
  while doing 100 dd tests in a for loop.
 
 
 
  However, I can confirm that I am still getting hang tasks while
  working with
  a 4K block size. The hang tasks start after about an hour, but they
  do not
  cause the server crash. After a while the dd test times out and
  continues
  with the loop. This is what I was running:
 
  for i in {1..100} ; do time dd if=/dev/zero of=/tmp/mount/1G bs=4K
  count=25K
  oflag=direct ; done
 
  The following test definately produces the hang tasks like these:
 
  [23160.549785] INFO: task dd:2033 blocked for more than 120
  seconds.
  [23160.588364] Tainted: G OE 3.16.3-031603-generic
  #201409171435
  [23160.627998] echo 0  /proc/sys/kernel/hung_task_timeout_secs
  disables
  this message.
  [23160.706856] dd D 000b 0 2033 23859
  0x
  [23160.706861] 88011cec78c8 0082 88011cec78d8
  88011cec7fd8
  [23160.706865] 000143c0 000143c0 88048661bcc0
  880113441440
  [23160.706868] 88011cec7898 88067fd54cc0 880113441440
  880113441440
  [23160.706871] Call Trace:
  [23160.706883] [81791f69] schedule+0x29/0x70
  [23160.706887] [8179203f] io_schedule+0x8f/0xd0
  [23160.706893] [81219e74] dio_await_completion+0x54/0xd0
  [23160.706897] [8121c6a8]
  do_blockdev_direct_IO+0x958/0xcc0
  [23160.706903] [810ba81e] ? wake_up_bit+0x2e/0x40
  [23160.706908] [812aa865] ?
  jbd2_journal_dirty_metadata+0xc5/0x260
  [23160.706914] [81265320] ?
  ext4_get_block_write+0x20/0x20
  [23160.706919] [8121ca5c] __blockdev_direct_IO+0x4c/0x50
  [23160.706922] [81265320] ?
  ext4_get_block_write+0x20/0x20
  [23160.706928] [8129f44e] ext4_ind_direct_IO+0xce/0x410
  [23160.706931] [81265320] ?
  ext4_get_block_write+0x20/0x20
  [23160.706935] [81261fbb] ext4_ext_direct_IO+0x1bb/0x2a0
  [23160.706938] [81290158] ? __ext4_journal_stop+0x78/0xa0
  [23160.706942] [812627fc] ext4_direct_IO+0xec/0x1e0
  [23160.706946] [8120a003] ? __mark_inode_dirty+0x53/0x2d0
  [23160.706952] [8116d39b]
  generic_file_direct_write+0xbb/0x180
  [23160.706957] [811ffbe2] ? mnt_clone_write+0x12/0x30
  [23160.706960] [8116d707]
  __generic_file_write_iter+0x2a7/0x350
  [23160.706963] [8125c2b1]
  ext4_file_write_iter+0x111/0x3d0
  [23160.706969] [81192fd4] ? iov_iter_init+0x14/0x40
  [23160.706976] [811e0c8b] new_sync_write+0x7b/0xb0
  [23160.706978] [811e19a7] vfs_write+0xc7/0x1f0
  [23160.706980] [811e1eaf] SyS_write+0x4f/0xb0
  [23160.706985] [81795ded] system_call_fastpath+0x1a/0x1f
  [23280.705400] INFO: task dd:2033 blocked for more than 120
  seconds.
  [23280.745358] Tainted: G OE 3.16.3-031603-generic
  #201409171435
  [23280.785069] echo 0  /proc/sys/kernel/hung_task_timeout_secs
  disables
  this message.
  [23280.864158] dd D 000b 0 2033 23859
  0x
  [23280.864164] 88011cec78c8 0082 88011cec78d8
  88011cec7fd8
  [23280.864167] 000143c0 000143c0 88048661bcc0
  880113441440
  [23280.864170] 88011cec7898 88067fd54cc0 880113441440
  880113441440
  [23280.864173] Call Trace:
  [23280.864185] [81791f69] schedule+0x29/0x70
  [23280.864197] [8179203f] io_schedule+0x8f/0xd0
  [23280.864203] [81219e74] dio_await_completion+0x54/0xd0
  [23280.864207] [8121c6a8]
  do_blockdev_direct_IO+0x958/0xcc0
  [23280.864213] [810ba81e] ? wake_up_bit+0x2e/0x40
  [23280.864218] [812aa865] ?
  jbd2_journal_dirty_metadata+0xc5/0x260
  [23280.864224] [81265320] ?
  ext4_get_block_write+0x20/0x20
  [23280.864229] [8121ca5c] __blockdev_direct_IO+0x4c/0x50
  [23280.864239] [81265320] ?
  ext4_get_block_write+0x20/0x20
  [23280.864244] [8129f44e] ext4_ind_direct_IO+0xce/0x410
  [23280.864247] 

[ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Sage Weil
v0.67.11 Dumpling
===

This stable update for Dumpling fixes several important bugs that affect a 
small set of users.

We recommend that all Dumpling users upgrade at their convenience.  If 
none of these issues are affecting your deployment there is no urgency.


Notable Changes
---

* common: fix sending dup cluster log items (#9080 Sage Weil)
* doc: several doc updates (Alfredo Deza)
* libcephfs-java: fix build against older JNI headers (Greg Farnum)
* librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
* librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
* librbd: fix error path cleanup when failing to open image (#8912 Josh Durgin)
* mon: fix crash when adjusting pg_num before any OSDs are added (#9052 
  Sage Weil)
* mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
* osd: allow scrub and snap trim thread pool IO priority to be adjusted 
  (Sage Weil)
* osd: fix mount/remount sync race (#9144 Sage Weil)
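
For the scrub/snap-trim IO priority item above, a minimal ceph.conf sketch 
(this assumes the osd disk thread ioprio settings, which only take effect with 
the CFQ disk scheduler):

    [osd]
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 7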

Getting Ceph


* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Frequent Crashes on rbd to nfs gateway Server

2014-09-25 Thread Ilya Dryomov
On Thu, Sep 25, 2014 at 7:06 PM, Andrei Mikhailovsky and...@arhont.com wrote:
 Ilya,

 I've not used rbd map on older kernels. Just experimenting with rbd map to
 have an iscsi and nfs gateway service for hypervisors such as xenserver and
 vmware. I've tried it with the latest ubuntu LTS kernel 3.13 I believe and
 noticed the issue.
 Can you not reproduce the hang tasks when doing dd testing? Have you tried
 4K block sizes and running it for some time, like I have done?

I forget which block size I tried, but it was one that you reported on
the tracker, I didn't make up my own.  I'll try it exactly the way you
described in your previous mail.
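
For reference, a run along the lines Andrei describes would be something like the
following (the mount point, block count and duration are assumptions, not his exact
command):

    # sustained 4K direct-I/O writes onto the rbd-backed filesystem
    dd if=/dev/zero of=/mnt/rbd/ddtest.img bs=4k count=2000000 oflag=direct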

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Icehouse Ceph -- live migration fails?

2014-09-25 Thread Daniel Schneller
Hi!

We have an Icehouse system running with librbd based Cinder and Glance
configurations, storing images and volumes in Ceph.

Configuration is (apart from network setup details, of course) by the
book / OpenStack setup guide.

Works very nicely, including regular migration, but live migration of
virtual machines fails. I created a simple machine booting from a volume
based off the Ubuntu 14.04.1 cloud image for testing. 

Using Horizon, I can move this VM from host to host, but when I try to
Live Migrate it from one baremetal host to another, I get an error 
message “Failed to live migrate instance to host ’node02’.

The only related log entry I recognize is in the controller’s nova-api.log:


2014-09-25 17:15:47.679 3616 INFO nova.api.openstack.wsgi 
[req-f3dc3c2e-d366-40c5-a1f1-31db71afd87a f833f8e2d1104e66b9abe9923751dcf2 
a908a95a87cc42cd87ff97da4733c414] HTTP exception thrown: Compute service of 
node02.baremetal.clusterb.centerdevice.local is unavailable at this time.
2014-09-25 17:15:47.680 3616 INFO nova.osapi_compute.wsgi.server 
[req-f3dc3c2e-d366-40c5-a1f1-31db71afd87a f833f8e2d1104e66b9abe9923751dcf2 
a908a95a87cc42cd87ff97da4733c414] 10.102.6.8 POST 
/v2/a908a95a87cc42cd87ff97da4733c414/servers/0f762f35-64ee-461f-baa4-30f5de4d5ddf/action
 HTTP/1.1 status: 400 len: 333 time: 0.1479030

I cannot see anything of value on the destination host itself.

New machines get scheduled there, so the compute service cannot really
be down.
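
A quick way to check what nova itself reports for that host is the service list
(host name taken from the log above; the filter options are assumed to be available
on the Icehouse novaclient):

    nova service-list --binary nova-compute
    nova service-list --host node02.baremetal.clusterb.centerdevice.local

Both should show nova-compute as enabled/up if the scheduler considers the host healthy.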

In this thread
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-March/019944.html
Travis describes a similar situation; however, that was on Folsom, so I wonder if it
is still applicable.

Would be great to get some outside opinion :)

Thanks!
Daniel

-- 
Daniel Schneller
Mobile Development Lead
 
CenterDevice GmbH  | Merscheider Straße 1
   | 42699 Solingen
tel: +49 1754155711| Deutschland
daniel.schnel...@centerdevice.com  | www.centerdevice.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Mike Dawson

On 9/25/2014 11:09 AM, Sage Weil wrote:

v0.67.11 Dumpling
===

This stable update for Dumpling fixes several important bugs that affect a
small set of users.

We recommend that all Dumpling users upgrade at their convenience.  If
none of these issues are affecting your deployment there is no urgency.


Notable Changes
---

* common: fix sending dup cluster log items (#9080 Sage Weil)
* doc: several doc updates (Alfredo Deza)
* libcephfs-java: fix build against older JNI headers (Greg Farnum)
* librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
* librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
* librbd: fix error path cleanup when failing to open image (#8912 Josh Durgin)
* mon: fix crash when adjusting pg_num before any OSDs are added (#9052
   Sage Weil)
* mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
* osd: allow scrub and snap trim thread pool IO priority to be adjusted
   (Sage Weil)


Sage,

Thanks for the great work! Could you provide any links describing how to 
tune the scrub and snap trim thread pool IO priority? I couldn't find 
these settings in the docs.


IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
#9503, right?


Thanks,
Mike Dawson



* osd: fix mount/remount sync race (#9144 Sage Weil)

Getting Ceph


* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Sage Weil
On Thu, 25 Sep 2014, Mike Dawson wrote:
 On 9/25/2014 11:09 AM, Sage Weil wrote:
  v0.67.11 Dumpling
  ===
  
  This stable update for Dumpling fixes several important bugs that affect a
  small set of users.
  
  We recommend that all Dumpling users upgrade at their convenience.  If
  none of these issues are affecting your deployment there is no urgency.
  
  
  Notable Changes
  ---
  
  * common: fix sending dup cluster log items (#9080 Sage Weil)
  * doc: several doc updates (Alfredo Deza)
  * libcephfs-java: fix build against older JNI headers (Greg Farnum)
  * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
  * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
  * librbd: fix error path cleanup when failing to open image (#8912 Josh
  Durgin)
  * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
 Sage Weil)
  * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
  * osd: allow scrub and snap trim thread pool IO priority to be adjusted
 (Sage Weil)
 
 Sage,
 
 Thanks for the great work! Could you provide any links describing how to tune
 the scrub and snap trim thread pool IO priority? I couldn't find these
 settings in the docs.

It's 

 osd disk thread ioprio class = idle
 osd disk thread ioprio priority = 0

Note that this is a short-term solution; we eventually want to send all IO 
through the same queue so that we can prioritize things more carefully.  
This setting will most likely go away in the future.
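
As a sketch, the same values can usually be applied to running OSDs with injectargs,
without a restart (the option spelling below is assumed to match the config names
above, and injectargs may warn if an option is not adjustable at runtime):

    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 0'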

 IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
 #9503, right?

Correct.  That will come later once it's gone through more testing.

Thanks!
sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Dan Van Der Ster
Hi Mike,

 On 25 Sep 2014, at 17:47, Mike Dawson mike.daw...@cloudapt.com wrote:
 
 On 9/25/2014 11:09 AM, Sage Weil wrote:
 v0.67.11 Dumpling
 ===
 
 This stable update for Dumpling fixes several important bugs that affect a
 small set of users.
 
 We recommend that all Dumpling users upgrade at their convenience.  If
 none of these issues are affecting your deployment there is no urgency.
 
 
 Notable Changes
 ---
 
 * common: fix sending dup cluster log items (#9080 Sage Weil)
 * doc: several doc updates (Alfredo Deza)
 * libcephfs-java: fix build against older JNI headers (Greg Farnum)
 * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
 * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
 * librbd: fix error path cleanup when failing to open image (#8912 Josh 
 Durgin)
 * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
   Sage Weil)
 * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
 * osd: allow scrub and snap trim thread pool IO priority to be adjusted
   (Sage Weil)
 
 Sage,
 
 Thanks for the great work! Could you provide any links describing how to tune 
 the scrub and snap trim thread pool IO priority? I couldn't find these 
 settings in the docs.

I use:

[osd]
  osd disk thread ioprio class = 3
  osd disk thread ioprio priority = 0

You’ll need to use the cfq io scheduler for those to have an effect.

FYI, I can make scrubs generally transparent by also adding:

  osd scrub sleep = .1
  osd scrub chunk max = 5
  osd deep scrub stride = 1048576

Your mileage may vary.
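
To check or switch the elevator on an OSD data disk at runtime (sdb is just a
placeholder; persisting the change needs an elevator= boot option or a udev rule):

    cat /sys/block/sdb/queue/scheduler          # the active scheduler is shown in brackets
    echo cfq > /sys/block/sdb/queue/scheduler   # switch to cfq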

 IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
 #9503, right?

Those didn’t make it.

Cheers, Dan



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-maintainers] v0.67.11 dumpling released

2014-09-25 Thread Loic Dachary
Hi,

On 25/09/2014 17:53, Sage Weil wrote:
 On Thu, 25 Sep 2014, Mike Dawson wrote:
 On 9/25/2014 11:09 AM, Sage Weil wrote:
 v0.67.11 Dumpling
 ===

 This stable update for Dumpling fixes several important bugs that affect a
 small set of users.

 We recommend that all Dumpling users upgrade at their convenience.  If
 none of these issues are affecting your deployment there is no urgency.


 Notable Changes
 ---

 * common: fix sending dup cluster log items (#9080 Sage Weil)
 * doc: several doc updates (Alfredo Deza)
 * libcephfs-java: fix build against older JNI headers (Greg Farnum)
 * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
 * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
 * librbd: fix error path cleanup when failing to open image (#8912 Josh
 Durgin)
 * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
Sage Weil)
 * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
 * osd: allow scrub and snap trim thread pool IO priority to be adjusted
(Sage Weil)

 Sage,

 Thanks for the great work! Could you provide any links describing how to tune
 the scrub and snap trim thread pool IO priority? I couldn't find these
 settings in the docs.
 
 It's 
 
  osd disk thread ioprio class = idle
  osd disk thread ioprio priority = 0
 
 Note that this is a short-term solution; we eventually want to send all IO 
 through the same queue so that we can prioritize things more carefully.  
 This setting will most likely go away in the future.
 

The documentation for these can be found at

http://ceph.com/docs/giant/rados/configuration/osd-config-ref/#operations

Control-f ioprio

Cheers

 IIUC, 0.67.11 does not include the proposed changes to address #9487 or 
 #9503, right?
 
 Correct.  That will come later once it's gone through more testing.
 
 Thanks!
 sage
 ___
 Ceph-maintainers mailing list
 ceph-maintain...@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-maintainers-ceph.com
 

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Sage Weil
On Thu, 25 Sep 2014, Dan Van Der Ster wrote:
 Hi Mike,
 
  On 25 Sep 2014, at 17:47, Mike Dawson mike.daw...@cloudapt.com wrote:
  
  On 9/25/2014 11:09 AM, Sage Weil wrote:
  v0.67.11 Dumpling
  ===
  
  This stable update for Dumpling fixes several important bugs that affect a
  small set of users.
  
  We recommend that all Dumpling users upgrade at their convenience.  If
  none of these issues are affecting your deployment there is no urgency.
  
  
  Notable Changes
  ---
  
  * common: fix sending dup cluster log items (#9080 Sage Weil)
  * doc: several doc updates (Alfredo Deza)
  * libcephfs-java: fix build against older JNI headers (Greg Farnum)
  * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
  * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
  * librbd: fix error path cleanup when failing to open image (#8912 Josh 
  Durgin)
  * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
Sage Weil)
  * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
  * osd: allow scrub and snap trim thread pool IO priority to be adjusted
(Sage Weil)
  
  Sage,
  
  Thanks for the great work! Could you provide any links describing how to 
  tune the scrub and snap trim thread pool IO priority? I couldn't find these 
  settings in the docs.
 
 I use:
 
 [osd]
   osd disk thread ioprio class = 3

Sigh.. it looks like the version that went into master and firefly uses 
the string names for classes while the dumpling patch takes the numeric 
ID.  Oops.  You'll need to take some care to adjust this setting when you 
upgrade.
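
In other words, a sketch of the two spellings discussed in this thread (double-check
the release notes for your version when upgrading):

    # dumpling 0.67.11: numeric class id (3 = idle)
    osd disk thread ioprio class = 3
    osd disk thread ioprio priority = 0

    # firefly and later: string class name
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 0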

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [ceph-calamari] Setting up Ceph calamari :: Made Simple

2014-09-25 Thread Dan Mick
Can you explain this a little more, Johan?  I've never even heard of
ipvsadm or its facilities before today, and it ought not be required...
On Sep 25, 2014 7:04 AM, Johan Kooijman m...@johankooijman.com wrote:

 Karan,

 Thanks for the tutorial, great stuff. Please note that in order to get the
 graphs working, I had to install ipvsadm and create a symlink from
 /sbin/ipvsadm to /usr/bin/ipvsadm (CentOS 6).
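
A minimal sketch of that workaround, assuming the stock CentOS 6 package name:

    yum install -y ipvsadm
    ln -s /sbin/ipvsadm /usr/bin/ipvsadm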

 On Wed, Sep 24, 2014 at 10:16 AM, Karan Singh karan.si...@csc.fi wrote:

 Hello Cepher’s

 Now here comes my new blog on setting up Ceph Calamari.

 I hope you would like this step-by-step guide

 http://karan-mj.blogspot.fi/2014/09/ceph-calamari-survival-guide.html


 - Karan -


 ___
 ceph-calamari mailing list
 ceph-calam...@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com




 --
 Met vriendelijke groeten / With kind regards,
 Johan Kooijman

 ___
 ceph-calamari mailing list
 ceph-calam...@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-calamari-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pgs stuck in active+clean+replay state

2014-09-25 Thread Gregory Farnum
I imagine you aren't actually using the data/metadata pool that these
PGs are in, but it's a previously-reported bug we haven't identified:
http://tracker.ceph.com/issues/8758
They should go away if you restart the OSDs that host them (or just
remove those pools), but it's not going to hurt anything as long as
you aren't using them.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
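
For reference, a sketch of how one might do that (the OSD id is an example; the exact
service command depends on your distro and init system):

    ceph pg dump | grep replay          # note the OSDs in the acting sets
    sudo restart ceph-osd id=0          # Ubuntu/upstart
    sudo service ceph restart osd.0     # sysvinit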


On Thu, Sep 25, 2014 at 3:37 AM, Pavel V. Kaygorodov pa...@inasan.ru wrote:
 Hi!

 16 pgs in our ceph cluster are in active+clean+replay state for more than one day.
 All clients are working fine.
 Is this ok?

 root@bastet-mon1:/# ceph -w
 cluster fffeafa2-a664-48a7-979a-517e3ffa0da1
  health HEALTH_OK
  monmap e3: 3 mons at 
 {1=10.92.8.80:6789/0,2=10.92.8.81:6789/0,3=10.92.8.82:6789/0}, election epoch 
 2570, quorum 0,1,2 1,2,3
  osdmap e3108: 16 osds: 16 up, 16 in
   pgmap v1419232: 8704 pgs, 6 pools, 513 GB data, 125 kobjects
 2066 GB used, 10879 GB / 12945 GB avail
 8688 active+clean
   16 active+clean+replay
   client io 3237 kB/s wr, 68 op/s


 root@bastet-mon1:/# ceph pg dump | grep replay
 dumped all in format plain
 0.fd0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.902766  0'0 3108:2628 
   [0,7,14,8] [0,7,14,8]   0   0'0 2014-09-23 02:23:49.463704  
 0'0 2014-09-23 02:23:49.463704
 0.e80   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:21.945082  0'0 3108:1823 
   [2,7,9,10] [2,7,9,10]   2   0'0 2014-09-22 14:37:32.910787  
 0'0 2014-09-22 14:37:32.910787
 0.aa0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.326607  0'0 3108:2451 
   [0,7,15,12][0,7,15,12]  0   0'0 2014-09-23 00:39:10.717363  
 0'0 2014-09-23 00:39:10.717363
 0.9c0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.325229  0'0 3108:1917 
   [0,7,9,12] [0,7,9,12]   0   0'0 2014-09-22 14:40:06.694479  
 0'0 2014-09-22 14:40:06.694479
 0.9a0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.325074  0'0 3108:2486 
   [0,7,14,11][0,7,14,11]  0   0'0 2014-09-23 01:14:55.825900  
 0'0 2014-09-23 01:14:55.825900
 0.910   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.839148  0'0 3108:1962 
   [0,7,9,10] [0,7,9,10]   0   0'0 2014-09-22 14:37:44.652796  
 0'0 2014-09-22 14:37:44.652796
 0.8c0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.838683  0'0 3108:2635 
   [0,2,9,11] [0,2,9,11]   0   0'0 2014-09-23 01:52:52.390529  
 0'0 2014-09-23 01:52:52.390529
 0.8b0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:21.215964  0'0 3108:1636 
   [2,0,8,14] [2,0,8,14]   2   0'0 2014-09-23 01:31:38.134466  
 0'0 2014-09-23 01:31:38.134466
 0.500   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:35.869160  0'0 3108:1801 
   [7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 08:38:53.963779  
 0'0 2014-09-13 10:27:26.977929
 0.440   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:35.871409  0'0 3108:1819 
   [7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 11:59:05.208164  
 0'0 2014-09-20 11:59:05.208164
 0.390   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.653190  0'0 3108:1827 
   [0,2,9,10] [0,2,9,10]   0   0'0 2014-09-22 14:40:50.697850  
 0'0 2014-09-22 14:40:50.697850
 0.320   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:10.970515  0'0 3108:1719 
   [2,0,14,9] [2,0,14,9]   2   0'0 2014-09-20 12:06:23.716480  
 0'0 2014-09-20 12:06:23.716480
 0.2c0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.647268  0'0 3108:2540 
   [0,7,12,8] [0,7,12,8]   0   0'0 2014-09-22 23:44:53.387815  
 0'0 2014-09-22 23:44:53.387815
 0.1f0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.651059  0'0 3108:2522 
   [0,2,14,11][0,2,14,11]  0   0'0 2014-09-22 23:38:16.315755  
 0'0 2014-09-22 23:38:16.315755
 0.7 0   0   0   0   0   0   0   
 

Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Mike Dawson
Looks like the packages have partially hit the repo, but at least the 
following are missing:


Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/librbd1_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/librados2_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/python-ceph_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/ceph_0.67.11-1precise_amd64.deb 
 404  Not Found
Failed to fetch 
http://ceph.com/debian-dumpling/pool/main/c/ceph/libcephfs1_0.67.11-1precise_amd64.deb 
 404  Not Found


Based on the timestamps of the files that made it, it looks like the 
process to publish the packages isn't still in progress, but rather 
failed yesterday.


Thanks,
Mike Dawson


On 9/25/2014 11:09 AM, Sage Weil wrote:

v0.67.11 Dumpling
===

This stable update for Dumpling fixes several important bugs that affect a
small set of users.

We recommend that all Dumpling users upgrade at their convenience.  If
none of these issues are affecting your deployment there is no urgency.


Notable Changes
---

* common: fix sending dup cluster log items (#9080 Sage Weil)
* doc: several doc updates (Alfredo Deza)
* libcephfs-java: fix build against older JNI headers (Greg Farnum)
* librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage Weil)
* librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
* librbd: fix error path cleanup when failing to open image (#8912 Josh Durgin)
* mon: fix crash when adjusting pg_num before any OSDs are added (#9052
   Sage Weil)
* mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
* osd: allow scrub and snap trim thread pool IO priority to be adjusted
   (Sage Weil)
* osd: fix mount/remount sync race (#9144 Sage Weil)

Getting Ceph


* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Alfredo Deza
On Thu, Sep 25, 2014 at 1:27 PM, Mike Dawson mike.daw...@cloudapt.com wrote:
 Looks like the packages have partially hit the repo, but at least the
 following are missing:

 Failed to fetch
 http://ceph.com/debian-dumpling/pool/main/c/ceph/librbd1_0.67.11-1precise_amd64.deb
 404  Not Found
 Failed to fetch
 http://ceph.com/debian-dumpling/pool/main/c/ceph/librados2_0.67.11-1precise_amd64.deb
 404  Not Found
 Failed to fetch
 http://ceph.com/debian-dumpling/pool/main/c/ceph/python-ceph_0.67.11-1precise_amd64.deb
 404  Not Found
 Failed to fetch
 http://ceph.com/debian-dumpling/pool/main/c/ceph/ceph_0.67.11-1precise_amd64.deb
 404  Not Found
 Failed to fetch
 http://ceph.com/debian-dumpling/pool/main/c/ceph/libcephfs1_0.67.11-1precise_amd64.deb
 404  Not Found

 Based on the timestamps of the files that made it, it looks like the process
 to publish the packages isn't still in process, but rather failed yesterday.

That is odd. I just went ahead and re-pushed the packages and they are
now showing up.

Thanks for letting us know!



 Thanks,
 Mike Dawson


 On 9/25/2014 11:09 AM, Sage Weil wrote:

 v0.67.11 Dumpling
 ===

 This stable update for Dumpling fixes several important bugs that affect a
 small set of users.

 We recommend that all Dumpling users upgrade at their convenience.  If
 none of these issues are affecting your deployment there is no urgency.


 Notable Changes
 ---

 * common: fix sending dup cluster log items (#9080 Sage Weil)
 * doc: several doc updates (Alfredo Deza)
 * libcephfs-java: fix build against older JNI headers (Greg Farnum)
 * librados: fix crash in op timeout path (#9362 Matthias Kiefer, Sage
 Weil)
 * librbd: fix crash using clone of flattened image (#8845 Josh Durgin)
 * librbd: fix error path cleanup when failing to open image (#8912 Josh
 Durgin)
 * mon: fix crash when adjusting pg_num before any OSDs are added (#9052
Sage Weil)
 * mon: reduce log noise from paxos (Aanchal Agrawal, Sage Weil)
 * osd: allow scrub and snap trim thread pool IO priority to be adjusted
(Sage Weil)
 * osd: fix mount/remount sync race (#9144 Sage Weil)

 Getting Ceph
 

 * Git at git://github.com/ceph/ceph.git
 * Tarball at http://ceph.com/download/ceph-0.67.11.tar.gz
 * For packages, see http://ceph.com/docs/master/install/get-packages
 * For ceph-deploy, see
 http://ceph.com/docs/master/install/install-ceph-deploy
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 --
 To unsubscribe from this list: send the line unsubscribe ceph-devel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pgs stuck in active+clean+replay state

2014-09-25 Thread Pavel V. Kaygorodov
Hi!

 I imagine you aren't actually using the data/metadata pool that these
 PGs are in, but it's a previously-reported bug we haven't identified:
 http://tracker.ceph.com/issues/8758
 They should go away if you restart the OSDs that host them (or just
 remove those pools), but it's not going to hurt anything as long as
 you aren't using them.

Thanks a lot, restarting the OSDs helped!
BTW, I tried to delete the data and metadata pools just after setup, but ceph 
refused to let me do this.
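
For what it's worth, pool deletion normally needs the pool name given twice plus a
confirmation flag, and the default data/metadata pools may additionally be refused
while the MDS map still references them (the exact behaviour per release is an
assumption here):

    ceph osd pool delete data data --yes-i-really-really-mean-it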

With best regards,
  Pavel.



 On Thu, Sep 25, 2014 at 3:37 AM, Pavel V. Kaygorodov pa...@inasan.ru wrote:
 Hi!
 
 16 pgs in our ceph cluster are in active+clean+replay state for more than one 
 day.
 All clients are working fine.
 Is this ok?
 
 root@bastet-mon1:/# ceph -w
cluster fffeafa2-a664-48a7-979a-517e3ffa0da1
 health HEALTH_OK
 monmap e3: 3 mons at 
 {1=10.92.8.80:6789/0,2=10.92.8.81:6789/0,3=10.92.8.82:6789/0}, election 
 epoch 2570, quorum 0,1,2 1,2,3
 osdmap e3108: 16 osds: 16 up, 16 in
  pgmap v1419232: 8704 pgs, 6 pools, 513 GB data, 125 kobjects
2066 GB used, 10879 GB / 12945 GB avail
8688 active+clean
  16 active+clean+replay
  client io 3237 kB/s wr, 68 op/s
 
 
 root@bastet-mon1:/# ceph pg dump | grep replay
 dumped all in format plain
 0.fd0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.902766  0'0 3108:2628
[0,7,14,8] [0,7,14,8]   0   0'0 2014-09-23 
 02:23:49.463704  0'0 2014-09-23 02:23:49.463704
 0.e80   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:21.945082  0'0 3108:1823
[2,7,9,10] [2,7,9,10]   2   0'0 2014-09-22 
 14:37:32.910787  0'0 2014-09-22 14:37:32.910787
 0.aa0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.326607  0'0 3108:2451
[0,7,15,12][0,7,15,12]  0   0'0 2014-09-23 
 00:39:10.717363  0'0 2014-09-23 00:39:10.717363
 0.9c0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.325229  0'0 3108:1917
[0,7,9,12] [0,7,9,12]   0   0'0 2014-09-22 
 14:40:06.694479  0'0 2014-09-22 14:40:06.694479
 0.9a0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:29.325074  0'0 3108:2486
[0,7,14,11][0,7,14,11]  0   0'0 2014-09-23 
 01:14:55.825900  0'0 2014-09-23 01:14:55.825900
 0.910   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.839148  0'0 3108:1962
[0,7,9,10] [0,7,9,10]   0   0'0 2014-09-22 
 14:37:44.652796  0'0 2014-09-22 14:37:44.652796
 0.8c0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.838683  0'0 3108:2635
[0,2,9,11] [0,2,9,11]   0   0'0 2014-09-23 
 01:52:52.390529  0'0 2014-09-23 01:52:52.390529
 0.8b0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:21.215964  0'0 3108:1636
[2,0,8,14] [2,0,8,14]   2   0'0 2014-09-23 
 01:31:38.134466  0'0 2014-09-23 01:31:38.134466
 0.500   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:35.869160  0'0 3108:1801
[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 
 08:38:53.963779  0'0 2014-09-13 10:27:26.977929
 0.440   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:35.871409  0'0 3108:1819
[7,2,15,10][7,2,15,10]  7   0'0 2014-09-20 
 11:59:05.208164  0'0 2014-09-20 11:59:05.208164
 0.390   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.653190  0'0 3108:1827
[0,2,9,10] [0,2,9,10]   0   0'0 2014-09-22 
 14:40:50.697850  0'0 2014-09-22 14:40:50.697850
 0.320   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:10.970515  0'0 3108:1719
[2,0,14,9] [2,0,14,9]   2   0'0 2014-09-20 
 12:06:23.716480  0'0 2014-09-20 12:06:23.716480
 0.2c0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.647268  0'0 3108:2540
[0,7,12,8] [0,7,12,8]   0   0'0 2014-09-22 
 23:44:53.387815  0'0 2014-09-22 23:44:53.387815
 0.1f0   0   0   0   0   0   0   
 active+clean+replay 2014-09-24 02:38:28.651059  0'0 3108:2522
[0,2,14,11][0,2,14,11]  0   0'0 2014-09-22 
 23:38:16.315755 

Re: [ceph-users] Any way to remove possible orphaned files in a federated gateway configuration

2014-09-25 Thread Lyn Mitchell
Thanks Yehuda for your response, much appreciated.

Using the radosgw-admin object stat option I was able to reconcile the 
objects on master and slave.  There are 10 objects on the master that have 
replicated to the slave; for these 10 objects I was able to confirm the mapping by 
pulling the tag prefix from object stat and verifying size, name, etc.  There are 
still a large number of shadow files in the .region-1.zone-2.rgw.buckets pool which 
have no corresponding object to cross-reference using the object stat command.  
These files are taking up several hundred GB on the OSDs of the region-2 
cluster.  What would be the correct way to remove these shadow files that no 
longer have objects associated with them?  Is there a process that will clean up 
these orphaned objects?  Any steps anyone can provide to remove these files would 
be greatly appreciated.

BTW - Since my original post several objects have been copied via s3 client to 
the master and everything appears to be replicating without issue.  Objects 
have been deleted as well, the sync looks fine, objects are being removed from 
master and slave.  I'm pretty sure the large number of orphaned shadow files 
that are currently in the .region-1.zone-2.rgw.buckets pool are from the 
original sync performed back on Sept. 15.

Thanks in advance,
MLM

-Original Message-
From: yehud...@gmail.com [mailto:yehud...@gmail.com] On Behalf Of Yehuda Sadeh
Sent: Tuesday, September 23, 2014 5:30 PM
To: lyn_mitch...@bellsouth.net
Cc: ceph-users; ceph-commun...@lists.ceph.com
Subject: Re: [ceph-users] Any way to remove possible orphaned files in a 
federated gateway configuration

On Tue, Sep 23, 2014 at 3:05 PM, Lyn Mitchell mitc...@bellsouth.net wrote:
 Is anyone aware of a way to either reconcile or remove possible 
 orphaned “shadow” files in a federated gateway configuration?  The 
 issue we’re seeing is the number of chunks/shadow files on the slave has many 
 more “shadow”
 files than the master, the breakdown is as follows:

 master zone:

 .region-1.zone-1.rgw.buckets = 1737 “shadow” files of which there are 
 10 distinct sets of tags, an example of 1 distinct set is:

 alph-1.80907.1__shadow_.VTZYW5ubV53wCHAKcnGwrD_yGkyGDuG_1 through
 alph-1.80907.1__shadow_.VTZYW5ubV53wCHAKcnGwrD_yGkyGDuG_516



 slave zone:

 .region-1.zone-2.rgw.buckets = 331961 “shadow” files, of which there 
 are 652 distinct sets of  tags, examples:

 1 set having 516 “shadow” files:

 alph-1.80907.1__shadow_.yPT037fjWhTi_UtHWSYPcRWBanaN9Oy_1 through
 alph-1.80907.1__shadow_.yPT037fjWhTi_UtHWSYPcRWBanaN9Oy_516



 236 sets having 515 “shadow” files apiece:

 alph-1.80907.1__shadow_.RA9KCc_U5T9kBN_ggCUx8VLJk36RSiw_1 through
 alph-1.80907.1__shadow_.RA9KCc_U5T9kBN_ggCUx8VLJk36RSiw_515

 alph-1.80907.1__shadow_.aUWuanLbJD5vbBSD90NWwjkuCxQmvbQ_1 through
 alph-1.80907.1__shadow_.aUWuanLbJD5vbBSD90NWwjkuCxQmvbQ_515

These are all part of the same bucket (prefixed by alph-1.80907.1).


 ….



 The number of shadow files in zone-2 is taking quite a bit of space from the
 OSD’s in the cluster.   Without being able to trace back to the original
 file name from an s3 or rados tag, I have no way of knowing which 
 files these are.  Is it possible that the same file may have been 
 replicated multiple times, due to network or connectivity issues?



 I can provide any logs or other information that may provide some 
 help, however at this point we’re not seeing any real errors.



 Thanks in advance for any help that can be provided,

You can also run the following command on the existing objects within that 
specific bucket:

$ radosgw-admin object stat --bucket=bucket --object=object

This will show the mapping from the rgw object to the rados objects that 
construct it.
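
To approach it from the rados side, one rough way to build a list of candidate
orphans is to dump the shadow objects in the pool and compare their tag prefixes
against what object stat reports for the live rgw objects (pool name taken from this
thread; the grep pattern and bucket/object names are assumptions):

    rados -p .region-1.zone-2.rgw.buckets ls | grep __shadow_ > shadow_objects.txt
    radosgw-admin object stat --bucket=mybucket --object=myobject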


Yehuda

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD import slow

2014-09-25 Thread Josh Durgin

On 09/24/2014 04:57 PM, Brian Rak wrote:

I've been doing some testing of importing virtual machine images, and
I've found that 'rbd import' is at least 2x as slow as 'qemu-img
convert'.  Is there anything I can do to speed this process up?  I'd
like to use rbd import because it gives me a little additional flexibility.

My test setup was a 40960MB LVM volume, and I used the following two
commands:

rbd import /dev/lvmtest/testvol test
qemu-img convert /dev/lvmtest/testvol rbd:test/test

rbd import took 13 minutes, qemu-img took 5.

I'm at a loss to explain this, I would have expected rbd import to be
faster.

This is with ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)


rbd import was doing one synchronous I/O after another. Recently import
and export were parallelized according to
--rbd-concurrent-management-ops (default 10), which helps quite a bit.
This will be in giant.
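
A sketch of overriding it on the command line, assuming the usual config-option
override behaviour of the rbd tool and reusing the paths from this thread:

    rbd --rbd-concurrent-management-ops 20 import /dev/lvmtest/testvol test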

Josh
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Best practice about using multiple disks on one single OSD

2014-09-25 Thread James Pan
Hi,

I have several servers and each server has 4 disks.
Now I am going to set up Ceph on these servers and use all 4 disks, but it 
seems one OSD instance can only be configured with one backend storage device. 

So there seem to be two options:

1. Make the 4 disks into a RAID0, then set up an OSD to use this RAID0, but obviously 
this is not good because one disk failure will ruin the entire storage.
2. Build a filesystem on each disk and start 4 OSD instances on the server.

Neither option seems ideal to me, so I am wondering what the best practice is for 
setting up multiple disks on one OSD node with Ceph.


Thanks!
Best Regards,



James Jiaming Pan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Best practice about using multiple disks on one single OSD

2014-09-25 Thread Jean-Charles LOPEZ
Hi James,

the best practice is to set up 1 OSD daemon per physical disk drive.

In your case, each OSD node would hence run 4 OSD daemons, one physical 
drive per daemon, and you would deploy a minimum of 3 servers so that each object copy 
resides on a separate physical server.
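
A minimal sketch with ceph-deploy, assuming hostnames server1..server3 and data
disks sdb..sde (journals colocated on the same disks by default):

    ceph-deploy osd create server1:sdb server1:sdc server1:sdd server1:sde
    ceph-deploy osd create server2:sdb server2:sdc server2:sdd server2:sde
    ceph-deploy osd create server3:sdb server3:sdc server3:sdd server3:sde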

JC



On Sep 25, 2014, at 20:42, James Pan dev...@yahoo.com wrote:

 Hi,
 
 I have several servers and each server has 4 disks.
 Now I am going to setup Ceph on these servers and use all the 4 disks but it 
 seems one OSD instance can be configured with one backend storage. 
 
 So there seems two options to me:
 
 1. Make the 4 disks into a raid0 then setup OSD to use this raid0 but 
 obviously this is not good because one disk failure will ruin the entire 
 storage.
 2. Build FS on each disk and start 4 OSD instances on the server.
 
 Both options are not good. So I am wondering what's the best practice of 
  setting up multiple disks on one OSD for Ceph.
 
 
 Thanks!
 Best Regards,
 
 
 
 James Jiaming Pan
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] iptables

2014-09-25 Thread shiva rkreddy
Hello,
On my ceph cluster OSD node there is a rule to REJECT all.
As per the documentation, I added a rule to allow the traffic on the full
range of ports, but the cluster will not come into a clean state. Can you please share your
experience with the iptables configuration?

Following are the INPUT rules:

5    ACCEPT     tcp  --  10.108.240.192/26    0.0.0.0/0    multiport dports 6800:7100
6    REJECT     all  --  0.0.0.0/0            0.0.0.0/0    reject-with icmp-host-prohibited
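
For comparison, a sketch of a rule that would also cover the monitor port, inserted
above the REJECT rule (the subnet is taken from the rule above; whether your OSDs
also talk over a separate cluster network that needs its own rule is an assumption):

    iptables -I INPUT 5 -p tcp -s 10.108.240.192/26 -m multiport --dports 6789,6800:7100 -j ACCEPT
    service iptables save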

Thanks,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com