[ceph-users] Still risky to remove RBD-Images?

2018-08-20 Thread Mehmet

Hello,

AFAIK, removing big RBD images can lead Ceph to produce blocked 
requests - I don't mean requests caused by poor disks.


Is this still the case with "Luminous (12.2.4)"?

I have a few images of

- 2 terabytes
- 5 terabytes
and
- 20 terabytes

in size and have to delete them.
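For what it's worth, the plain deletion I have in mind looks like this 
(just a sketch - the image names are made up, and I assume the images 
live in the "rbd" pool):

#> rbd rm rbd/image-2tb
#> rbd rm rbd/image-5tb
#> rbd rm rbd/image-20tb

If plain removal still causes blocked requests, maybe moving the images 
to the trash first ("rbd trash mv", if 12.2.4 supports it) and removing 
them later would at least defer the load - but I have not tried that.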

Would be nice if you could enlighten me :)

- Mehmet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Add Partitions to Ceph Cluster

2018-07-23 Thread Mehmet

Hi Dimitri,

what is the output of

- ceph osd tree?

Perhaps you have an initial crush weight of 0; in that case there 
wouldn't be any change in the PGs until you change the weight.
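If that is the case, a minimal sketch of what I would do (assuming the 
new OSD is osd.4, as your df output suggests, and the disk is roughly 
1.8 TB):

#> ceph osd tree | grep osd.4
#> ceph osd crush reweight osd.4 1.8

After the reweight the PGs should start to move onto the new OSD.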


- Mehmet

On 2018-07-10 11:58, Dimitri Roschkowski wrote:

Hi,

is it possible to use just a partition instead of a whole disk for
OSD? On a server I already use hdb for Ceph and want to add hda4 to be
used in the Ceph Cluster, but it didn’t work for me.

On the server with the partition I tried:

ceph-disk prepare /dev/sda4

and

ceph-disk activate /dev/sda4

And with df I see, that ceph did something on the partition:

/dev/sda4   1.8T  2.8G  1.8T   1% /var/lib/ceph/osd/ceph-4


My problem is that after I activated the disk, I didn't see a change
in the ceph status output:

  data:
pools:   6 pools, 168 pgs
objects: 25.84 k objects, 100 GiB
usage:   305 GiB used, 6.8 TiB / 7.1 TiB avail
pgs: 168 active+clean

Can someone help me?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Safe to use rados -p rbd cleanup?

2018-07-15 Thread Mehmet

Hello guys,

in my production cluster I have many objects like this

"#> rados -p rbd ls | grep 'benchmark'"
... .. .
benchmark_data_inkscope.example.net_32654_object1918
benchmark_data_server_26414_object1990
... .. .

Is it safe to run "rados -p rbd cleanup" or is there any risk for my 
images?
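To be on the safe side I would probably limit the cleanup to the 
benchmark objects only - a sketch, assuming my rados version still 
accepts a prefix for cleanup (please correct me if not):

#> rados -p rbd cleanup --prefix benchmark_data

That should only touch objects whose names start with "benchmark_data" 
and leave the rbd_data.* objects of my images alone.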

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Move Ceph-Cluster to another Datacenter

2018-06-25 Thread Mehmet

Hey Ceph people,

I need advice on how to move a Ceph cluster from one datacenter to another 
without any downtime :)


DC 1:
3 dedicated MON-Server (also MGR on this Servers)
4 dedicated OSD-Server (3x12 OSD, 1x 23 OSDs)

3 Proxmox Nodes with connection to our Ceph-Storage (not managed from 
proxmox! Ceph is a standalone installation)


DC 2:
No Ceph-related servers actually

Luminous (12.2.4)
Only one Pool:
NAME ID USED   %USED MAX AVAIL OBJECTS
rbd  0  30638G 63.89 17318G    16778324

I need to move my Ceph installation from DC1 to DC2 and would really be 
happy if you could give me some advice on how to do this without any 
downtime and in a still performant manner.


The latency from DC1 to DC2 is ~1.5 ms - we could perhaps bring up a 10 Gbit 
fiber connection between DC1 and DC2...


A second Ceph cluster in DC2 is not possible for cost reasons, but I 
could bring a 5th OSD server online there.
So "RBD mirror" isn't actually a passable way - but I will try to make 
this possible ^^ ...
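Just to show the direction I am currently thinking of (a rough sketch, 
not tested - the bucket names "dc1"/"dc2" and the host name "osdserver5" 
are made up): stretch the CRUSH tree over both datacenters and then 
migrate the OSD hosts one by one over the new link.

#> ceph osd crush add-bucket dc1 datacenter
#> ceph osd crush add-bucket dc2 datacenter
#> ceph osd crush move dc1 root=default
#> ceph osd crush move dc2 root=default
#> ceph osd crush move osdserver5 datacenter=dc2

Would something like that be survivable with ~1.5 ms latency, or is 
there a better way?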

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph scrub logs: _scan_snaps no head for $object?

2018-02-23 Thread Mehmet

Sage wrote (Tue, 2 Jan 2018 17:57:32 UTC):

Hi Stefan, Mehmet,



Hi Sage,
Sorry for the *extremely late* response!


Are these clusters that were upgraded from prior versions, or fresh
luminous installs?


My cluster was initially installed with Jewel (10.2.1), has seen some 
minor updates, and was finally upgraded from Jewel (10.2.10) to Luminous 
(12.2.1).


Actually installed is:

- ceph version 12.2.2 (cf0baba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)


I had a look in my logfiles and still have log entries like:

... .. .
2018-02-23 11:23:34.247878 7feaa2a2d700 -1 osd.59 pg_epoch: 36269 
pg[0.346( v 36269'30160204 (36269'30158634,36269'30160204] 
local-lis/les=36253/36254 n=12956 ec=141/141 lis/c 36253/36253 les/c/f 
36254/36264/0 36253/36253/36190) [4,59,23] r=1 lpr=36253 luod=0'0 
crt=36269'30160204 lcod 36269'30160203 active] _scan_snaps no head for 
0:62e347cd:::rbd_data.63efee238e1f29.038c:48 (have MIN)

... .. .

Do you need further information?
- Mehmet



This message indicates that there is a stray clone object with no
associated head or snapdir object.  That normally should never
happen--it's presumably the result of a (hopefully old) bug.  The scrub
process doesn't even clean them up, which maybe says something about how
common it is/was...

sage


On Sun, 24 Dec 2017, ceph@xx wrote:

> Hi Stefan,
>
> Am 14 December 2017 09:48:36 CET, Stefan Kooman wrote:
> >Hi,
> >
> >We see the following in the logs after we start a scrub for some osds:
> >
> >ceph-osd.2.log:2017-12-14 06:50:47.180344 7f0f47db2700  0
> >log_channel(cluster) log [DBG] : 1.2d8 scrub starts
> >ceph-osd.2.log:2017-12-14 06:50:47.180915 7f0f47db2700 -1 osd.2
> >pg_epoch: 11897 pg[1.2d8( v 11890'165209 (3221'163647,11890'165209]
> >local-lis/les=11733/11734 n=67 ec=132/132 lis/c 11733/11733 les/c/f
> > >11734/11734/0 11733/11733/11733) [2,45,31] r=0 lpr=11733
> >crt=11890'165209 lcod 11890'165208 mlcod 11890'165208
> >active+clean+scrubbing] _scan_snaps no head for
> >1:1b518155:::rbd_data.620652ae8944a.0126:29 (have MIN)
> >ceph-osd.2.log:2017-12-14 06:50:47.180929 7f0f47db2700 -1 osd.2
> >pg_epoch: 11897 pg[1.2d8( v 11890'165209 (3221'163647,11890'165209]
> >local-lis/les=11733/11734 n=67 ec=132/132 lis/c 11733/11733 les/c/f
> >11734/11734/0 11733/11733/11733) [2,45,31] r=0 lpr=11733
> >crt=11890'165209 lcod 11890'165208 mlcod 11890'165208
> >active+clean+scrubbing] _scan_snaps no head for
> >1:1b518155:::rbd_data.620652ae8944a.0126:14 (have MIN)
> >ceph-osd.2.log:2017-12-14 06:50:47.180941 7f0f47db2700 -1 osd.2
> >pg_epoch: 11897 pg[1.2d8( v 11890'165209 (3221'163647,11890'165209]
> >local-lis/les=11733/11734 n=67 ec=132/132 lis/c 11733/11733 les/c/f
> >11734/11734/0 11733/11733/11733) [2,45,31] r=0 lpr=11733
> >crt=11890'165209 lcod 11890'165208 mlcod 11890'165208
> >active+clean+scrubbing] _scan_snaps no head for
> >1:1b518155:::rbd_data.620652ae8944a.0126:a (have MIN)
> >ceph-osd.2.log:2017-12-14 06:50:47.214198 7f0f43daa700  0
> >log_channel(cluster) log [DBG] : 1.2d8 scrub ok
> >
> >So finally it logs "scrub ok", but what does " _scan_snaps no head for
> >..." mean?
>
> I also see these lines in our logfiles and wonder what this means.
>
> >Does this indicate a problem?
>
> I guess not, because we actually do not have any issues.
>
> >
> >Ceph 12.2.2 with bluestore on lvm
>
> We are using 12.2.2 with filestore on XFS.
>
> - Mehmet
> >
> >Gr. Stefan
> ___
> ceph-users mailing list

> ceph-users@xx

> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


___
ceph-users mailing list
ceph-users@xx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dasboard (12.2.1) does not work (segfault and runtime error)

2017-10-22 Thread Mehmet


Hello John,

On 22 October 2017 13:58:34 CEST, John Spray wrote:
>On Fri, Oct 20, 2017 at 10:10 AM, Mehmet  wrote:
>> Hello,
>>
>> yesterday i've upgraded my "Jewel"-Cluster (10.2.10) to "Luminous"
>(12.2.1).
>> This went realy smooth - Thanks! :)
>>
>> Today i wanted to enable the BuildIn Dasboard via
>>
>> #> vi ceph.conf
>> [...]
>> [mgr]
>> mgr_modules = dashboard
>> [...]
>>
>> #> ceph-deploy --overwrite-conf config push monserver1 monserver2
>monserver3
>>
>> #> systemctl restart ceph-mgr@monserver1
>>
>> Cause the module doesnt seem to be activated so i have to enable it
>"again"
>> via the CLI
>
>Editing mgr_modules in ceph.conf is no longer necessary since
>luminous.  You can skip that part and just use the mgr module enable
>CLI bit.

Nice to know. Thanks :) 

>
>> #> ceph mgr module enable dashboard
>>
>> Then configured my Firewall like
>>
>> #> ufw allow proto tcp from my to any port 7000
>>
>> Cause the deamon seems to listen only on ipv6 i have set
>>
>> #> ceph config-key put mgr/dashboard/server_addr ::
>>
>> ...also tried with set the ip address
>>
>> #> ceph config-key set mgr/dashboard/server_addr 123.123.123.123
>>
>> But i cannot access the dashboard! :*(
>>
>> When i restart the "ceph-mgr" with
>>
>> #> systemctl restart ceph-mgr@monserver1
>>
>> i get *always* a "segmentation fault with thread:ceph-mgr"
>>
>> - https://pastebin.com/6fA239Ec
>>
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]: *** Caught signal
>(Segmentation
>> fault) **
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  in thread 7fbdbb71c500
>> thread_name:ceph-mgr
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  ceph version 12.2.1
>> (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  1: (()+0x3e1014)
>> [0x55c8bffb8014]
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  2: (()+0x11390)
>> [0x7fbdb9f55390]
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  3:
>> (std::__cxx11::_List_base> std::char_traits, std::allocator >,
>> std::allocatorstd::char_traits,
>> std::allocator > > >::_M_clear()+0x10) [0x55c8bfe47d40]
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  4: (std::vector> std::allocator >::~vector()+0x68) [0x55c8c01e0ed8]
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  5: (()+0x39ff8)
>> [0x7fbdb8ee9ff8]
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  6: (()+0x3a045)
>> [0x7fbdb8eea045]
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  7:
>(__libc_start_main()+0xf7)
>> [0x7fbdb8ed0837]
>> Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  8: (_start()+0x29)
>> [0x55c8bfe14e29]
>
>Odd, I thought that was fixed with
>http://tracker.ceph.com/issues/20869 but perhaps not.

Should I do something to bring this to attention, or did you do that already?

>
>> and always a *runtime error when start with
>.../mgr/restful/module.py* (Also
>> included in the logfile above)
>>
>> 2017-10-20 10:16:45.494019 7f7455a34700  1 mgr[restful] Unknown
>request ''
>> 2017-10-20 10:16:45.494133 7f74467fe700  0 mgr[restful] Traceback
>(most
>> recent call last):
>>   File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
>> self._serve()
>>   File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
>> raise RuntimeError('no certificate configured')
>> RuntimeError: no certificate configured
>
>This part is nothing to worry about (there is a patch that will be in
>a later release to make the message less dramatic) -- it's just
>complaining that it can't serve the rest api because it doesn't have a
>certificate.

I have found that already. This can easily be "fixed" with

#> ceph restful create-self-signed-cert
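If someone later wants to actually use the restful API, I believe an 
API key has to be created as well - something like this (the key name 
"admin" is just an example):

#> ceph restful create-key admin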

Thanks again

- Mehmet 
>
>Cheers,
>John
>
>>
>> OS: Linux monserver1 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12
>14:59:54
>> UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>> Ubuntu 16.04.3 LTS
>>
>> Ceph: ceph versions
>> {
>> "mon": {
>> "ceph version 12.2.1
>(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e)
>> luminous (stable)": 3
>> },
>> "mgr": {
>> "ceph version 12.2.1
>(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e)
>> luminous (stable)": 1
>> },
>> "osd": {
>> "ceph version 12.2.1
>(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e)
>> luminous (stable)": 60
>> },
>> "mds": {},
>> "overall": {
>> "ceph version 12.2.1
>(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e)
>> luminous (stable)": 64
>> }
>> }
>>
>> The ceph-mgr was installed before as described below
>>
>> #> ceph-deploy gatherkeys monserver1
>> #> ceph-deploy mgr create monserver1
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dasboard (12.2.1) does not work (segfault and runtime error)

2017-10-20 Thread Mehmet

On 2017-10-20 13:00, Mehmet wrote:

On 2017-10-20 11:10, Mehmet wrote:

Hello,

yesterday i've upgraded my "Jewel"-Cluster (10.2.10) to "Luminous"
(12.2.1). This went realy smooth - Thanks! :)

Today i wanted to enable the BuildIn Dasboard via

#> vi ceph.conf
[...]
[mgr]
mgr_modules = dashboard
[...]

#> ceph-deploy --overwrite-conf config push monserver1 monserver2 
monserver3


#> systemctl restart ceph-mgr@monserver1

Cause the module doesnt seem to be activated so i have to enable it
"again" via the CLI

#> ceph mgr module enable dashboard

Then configured my Firewall like

#> ufw allow proto tcp from my to any port 7000

Cause the deamon seems to listen only on ipv6 i have set

#> ceph config-key put mgr/dashboard/server_addr ::

...also tried with set the ip address

#> ceph config-key set mgr/dashboard/server_addr 123.123.123.123

But i cannot access the dashboard! :*(


This is now fixed - I can access the dashboard (a mate made a mistake, so 
the firewall between my office and the datacenter wasn't properly 
configured... Oo).


The error message below still exists

- Mehmet



When i restart the "ceph-mgr" with

#> systemctl restart ceph-mgr@monserver1

i get *always* a "segmentation fault with thread:ceph-mgr"

- https://pastebin.com/6fA239Ec

Oct 20 10:16:45 monserver1 ceph-mgr[28376]: *** Caught signal
(Segmentation fault) **
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  in thread 7fbdbb71c500
thread_name:ceph-mgr
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  1: (()+0x3e1014) 
[0x55c8bffb8014]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  2: (()+0x11390) 
[0x7fbdb9f55390]

Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  3:
(std::__cxx11::_List_base, std::allocator >,
std::allocator, std::allocator > > >::_M_clear()+0x10)
[0x55c8bfe47d40]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  4: (std::vector >::~vector()+0x68) [0x55c8c01e0ed8]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  5: (()+0x39ff8) 
[0x7fbdb8ee9ff8]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  6: (()+0x3a045) 
[0x7fbdb8eea045]

Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  7:
(__libc_start_main()+0xf7) [0x7fbdb8ed0837]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  8: (_start()+0x29) 
[0x55c8bfe14e29]


and always a *runtime error when start with .../mgr/restful/module.py*
(Also included in the logfile above)

2017-10-20 10:16:45.494019 7f7455a34700  1 mgr[restful] Unknown 
request ''

2017-10-20 10:16:45.494133 7f74467fe700  0 mgr[restful] Traceback
(most recent call last):
  File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
self._serve()
  File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
raise RuntimeError('no certificate configured')
RuntimeError: no certificate configured


This is now being fixed with

#> ceph restful create-self-signed-cert

But the "segmentation error" abov still exists :*/

- Mehmet


OS: Linux monserver1 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12
14:59:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu 16.04.3 LTS

Ceph: ceph versions
{
"mon": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 3
},
"mgr": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 1
},
"osd": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 60
},
"mds": {},
"overall": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 64
}
}

The ceph-mgr was installed before as described below

#> ceph-deploy gatherkeys monserver1
#> ceph-deploy mgr create monserver1


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dasboard (12.2.1) does not work (segfault and runtime error)

2017-10-20 Thread Mehmet

On 2017-10-20 11:10, Mehmet wrote:

Hello,

yesterday i've upgraded my "Jewel"-Cluster (10.2.10) to "Luminous"
(12.2.1). This went realy smooth - Thanks! :)

Today i wanted to enable the BuildIn Dasboard via

#> vi ceph.conf
[...]
[mgr]
mgr_modules = dashboard
[...]

#> ceph-deploy --overwrite-conf config push monserver1 monserver2 
monserver3


#> systemctl restart ceph-mgr@monserver1

Cause the module doesnt seem to be activated so i have to enable it
"again" via the CLI

#> ceph mgr module enable dashboard

Then configured my Firewall like

#> ufw allow proto tcp from my to any port 7000

Cause the deamon seems to listen only on ipv6 i have set

#> ceph config-key put mgr/dashboard/server_addr ::

...also tried with set the ip address

#> ceph config-key set mgr/dashboard/server_addr 123.123.123.123

But i cannot access the dashboard! :*(

When i restart the "ceph-mgr" with

#> systemctl restart ceph-mgr@monserver1

i get *always* a "segmentation fault with thread:ceph-mgr"

- https://pastebin.com/6fA239Ec

Oct 20 10:16:45 monserver1 ceph-mgr[28376]: *** Caught signal
(Segmentation fault) **
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  in thread 7fbdbb71c500
thread_name:ceph-mgr
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  1: (()+0x3e1014) 
[0x55c8bffb8014]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  2: (()+0x11390) 
[0x7fbdb9f55390]

Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  3:
(std::__cxx11::_List_base, std::allocator >,
std::allocator, std::allocator > > >::_M_clear()+0x10)
[0x55c8bfe47d40]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  4: (std::vector >::~vector()+0x68) [0x55c8c01e0ed8]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  5: (()+0x39ff8) 
[0x7fbdb8ee9ff8]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  6: (()+0x3a045) 
[0x7fbdb8eea045]

Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  7:
(__libc_start_main()+0xf7) [0x7fbdb8ed0837]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  8: (_start()+0x29) 
[0x55c8bfe14e29]


and always a *runtime error when start with .../mgr/restful/module.py*
(Also included in the logfile above)

2017-10-20 10:16:45.494019 7f7455a34700  1 mgr[restful] Unknown request 
''

2017-10-20 10:16:45.494133 7f74467fe700  0 mgr[restful] Traceback
(most recent call last):
  File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
self._serve()
  File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
raise RuntimeError('no certificate configured')
RuntimeError: no certificate configured


This is now being fixed with

#> ceph restful create-self-signed-cert

But the "segmentation error" abov still exists :*/

- Mehmet


OS: Linux monserver1 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12
14:59:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu 16.04.3 LTS

Ceph: ceph versions
{
"mon": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 3
},
"mgr": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 1
},
"osd": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 60
},
"mds": {},
"overall": {
"ceph version 12.2.1
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 64
}
}

The ceph-mgr was installed before as described below

#> ceph-deploy gatherkeys monserver1
#> ceph-deploy mgr create monserver1


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Dasboard (12.2.1) does not work (segfault and runtime error)

2017-10-20 Thread Mehmet

Hello,

Yesterday I upgraded my "Jewel" cluster (10.2.10) to "Luminous" 
(12.2.1). This went really smooth - thanks! :)


Today I wanted to enable the built-in dashboard via

#> vi ceph.conf
[...]
[mgr]
mgr_modules = dashboard
[...]

#> ceph-deploy --overwrite-conf config push monserver1 monserver2 
monserver3


#> systemctl restart ceph-mgr@monserver1

Because the module doesn't seem to be activated, I have to enable it 
"again" via the CLI


#> ceph mgr module enable dashboard

Then configured my Firewall like

#> ufw allow proto tcp from my to any port 7000

Because the daemon seems to listen only on IPv6, I have set

#> ceph config-key put mgr/dashboard/server_addr ::

...also tried setting the IP address

#> ceph config-key set mgr/dashboard/server_addr 123.123.123.123

But I cannot access the dashboard! :*(

When I restart the "ceph-mgr" with

#> systemctl restart ceph-mgr@monserver1

I *always* get a "segmentation fault" in thread ceph-mgr

- https://pastebin.com/6fA239Ec

Oct 20 10:16:45 monserver1 ceph-mgr[28376]: *** Caught signal 
(Segmentation fault) **
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  in thread 7fbdbb71c500 
thread_name:ceph-mgr
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  ceph version 12.2.1 
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  1: (()+0x3e1014) 
[0x55c8bffb8014]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  2: (()+0x11390) 
[0x7fbdb9f55390]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  3: 
(std::__cxx11::_List_basestd::char_traits, std::allocator >, 
std::allocator, 
std::allocator > > >::_M_clear()+0x10) [0x55c8bfe47d40]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  4: (std::vectorstd::allocator >::~vector()+0x68) [0x55c8c01e0ed8]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  5: (()+0x39ff8) 
[0x7fbdb8ee9ff8]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  6: (()+0x3a045) 
[0x7fbdb8eea045]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  7: 
(__libc_start_main()+0xf7) [0x7fbdb8ed0837]
Oct 20 10:16:45 monserver1 ceph-mgr[28376]:  8: (_start()+0x29) 
[0x55c8bfe14e29]


and always a *runtime error* when starting with .../mgr/restful/module.py 
(also included in the logfile above)


2017-10-20 10:16:45.494019 7f7455a34700  1 mgr[restful] Unknown request 
''
2017-10-20 10:16:45.494133 7f74467fe700  0 mgr[restful] Traceback (most 
recent call last):

  File "/usr/lib/ceph/mgr/restful/module.py", line 248, in serve
self._serve()
  File "/usr/lib/ceph/mgr/restful/module.py", line 299, in _serve
raise RuntimeError('no certificate configured')
RuntimeError: no certificate configured

OS: Linux monserver1 4.4.0-96-generic #119-Ubuntu SMP Tue Sep 12 
14:59:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Ubuntu 16.04.3 LTS

Ceph: ceph versions
{
    "mon": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 1
    },
    "osd": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 60
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 64
    }
}

The ceph-mgr was installed before as described below

#> ceph-deploy gatherkeys monserver1
#> ceph-deploy mgr create monserver1


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-13 Thread Mehmet
Hey guys,

Does this mean we have to do something additional when upgrading from Jewel 10.2.10 to 
Luminous 12.2.1?

- Mehmet

On 9 October 2017 04:02:14 CEST, kefu chai wrote:
>On Mon, Oct 9, 2017 at 8:07 AM, Joao Eduardo Luis  wrote:
>> This looks a lot like a bug I fixed a week or so ago, but for which I
>> currently don't recall the ticket off the top of my head. It was
>basically a
>
>http://tracker.ceph.com/issues/21300
>
>> crash each time a "ceph osd df" was called, if a mgr was not
>available after
>> having set the luminous osd require flag. I will check the log in the
>> morning to figure out whether you need to upgrade to a newer version
>or if
>> this is a corner case the fix missed. In the mean time, check if you
>have
>> ceph-mgr running, because that's the easy work around (assuming it's 
>the
>> same bug).
>>
>>   -Joao
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
>-- 
>Regards
>Kefu Chai
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimise Setup with Bluestore

2017-08-17 Thread Mehmet
*Resent... this time to the list...*
Hey David, thank you for the response!

My use case is actually only RBD for KVM images, mostly running LAMP 
systems on Ubuntu or CentOS.
All images (RBDs) are created with Proxmox, where the Ceph defaults are used 
(actually Jewel, in the near future Luminous...).

What I primarily want to know is which constellation would be optimal for 
BlueStore?

I.e.,
put the DB and the raw device on HDD and the WAL on NVMe in my case?
Should I replace an HDD with an SSD and use it for the DBs, so that it is 
finally HDD as raw device, DB on SSD and WAL on NVMe?
Or... use the HDD for the raw device and the NVMe for WAL and DB?

Hope you (and others) understand what I mean :)

- Mehmet 

On 16 August 2017 19:01:30 CEST, David Turner wrote:
>Honestly there isn't enough information about your use case.  RBD usage
>with small IO vs ObjectStore with large files vs ObjectStore with small
>files vs any number of things.  The answer to your question might be
>that
>for your needs you should look at having a completely different
>hardware
>configuration than what you're running.  There is no correct way to
>configure your cluster based on what hardware you have.  What hardware
>you
>use and what configuration settings you use should be based on your
>needs
>and use case.
>
>On Wed, Aug 16, 2017 at 12:13 PM Mehmet  wrote:
>
>> :( no suggestions or recommendations on this?
>>
>> On 14 August 2017 16:50:15 CEST, Mehmet wrote:
>>
>>> Hi friends,
>>>
>>> my actual hardware setup per OSD-node is as follow:
>>>
>>> # 3 OSD-Nodes with
>>> - 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no
>>> Hyper-Threading
>>> - 64GB RAM
>>> - 12x 4TB HGST 7K4000 SAS2 (6GB/s) Disks as OSDs
>>> - 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device
>for
>>> 12 Disks (20G Journal size)
>>> - 1x Samsung SSD 840/850 Pro only for the OS
>>>
>>> # and 1x OSD Node with
>>> - 1x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (10 Cores 20 Threads)
>>> - 64GB RAM
>>> - 23x 2TB TOSHIBA MK2001TRKB SAS2 (6GB/s) Disks as OSDs
>>> - 1x SEAGATE ST32000445SS SAS2 (6GB/s) Disk as OSDs
>>> - 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device
>for
>>> 24 Disks (15G Journal size)
>>> - 1x Samsung SSD 850 Pro only for the OS
>>>
>>> As you can see, i am using 1 (one) NVMe (Intel DC P3700 NVMe – 400G)
>>> Device for whole Spinning Disks (partitioned) on each OSD-node.
>>>
>>> When „Luminous“ is available (as next LTE) i plan to switch vom
>>> „filestore“ to „bluestore“ 😊
>>>
>>> As far as i have read bluestore consists of
>>> - „the device“
>>> - „block-DB“: device that store RocksDB metadata
>>> - „block-WAL“: device that stores RocksDB „write-ahead journal“
>>>
>>> Which setup would be usefull in my case?
>>> I Would setup the disks via "ceph-deploy".
>>>
>>> Thanks in advance for your suggestions!
>>> - Mehmet
>>> --
>>>
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph cluster with SSDs

2017-08-17 Thread Mehmet
Which SSDs are used? Are they in production? If so, what is your PG count?

On 17 August 2017 20:04:25 CEST, M Ranga Swami Reddy wrote:
>Hello,
>I am using the Ceph cluster with HDDs and SSDs. Created separate pool
>for each.
>Now, when I ran the "ceph osd bench", HDD's OSDs show around 500 MB/s
>and SSD's OSD show around 280MB/s.
>
>Ideally, what I expected was - SSD's OSDs should be at-least 40% high
>as compared with HDD's OSD bench.
>
>Did I miss anything here? Any hint is appreciated.
>
>Thanks
>Swami
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimise Setup with Bluestore

2017-08-17 Thread Mehmet


Hey Mark :)

On 16 August 2017 21:43:34 CEST, Mark Nelson wrote:
>Hi Mehmet!
>
>On 08/16/2017 11:12 AM, Mehmet wrote:
>> :( no suggestions or recommendations on this?
>>
>> On 14 August 2017 16:50:15 CEST, Mehmet wrote:
>>
>> Hi friends,
>>
>> my actual hardware setup per OSD-node is as follow:
>>
>> # 3 OSD-Nodes with
>> - 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no
>> Hyper-Threading
>> - 64GB RAM
>> - 12x 4TB HGST 7K4000 SAS2 (6GB/s) Disks as OSDs
>> - 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling
>Device for
>> 12 Disks (20G Journal size)
>> - 1x Samsung SSD 840/850 Pro only for the OS
>>
>> # and 1x OSD Node with
>> - 1x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (10 Cores 20
>Threads)
>> - 64GB RAM
>> - 23x 2TB TOSHIBA MK2001TRKB SAS2 (6GB/s) Disks as OSDs
>> - 1x SEAGATE ST32000445SS SAS2 (6GB/s) Disk as OSDs
>> - 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling
>Device for
>> 24 Disks (15G Journal size)
>> - 1x Samsung SSD 850 Pro only for the OS
>
>The single P3700 for 23 spinning disks is pushing it.  They have high 
>write durability but based on the model that is the 400GB version?

Yes, it is a 400 GB version.

>If you are doing a lot of writes you might wear it out pretty fast and

Actually the Intel isdct tool says this one should live 40 years ^^ 
(EnduranceAnalyzer). But this should be proven ;)

>it's 
>a single point of failure for the entire node (if it dies you have a
>lot 
>of data dying with it).  General unbalanced setups like this are 
>trickier to get performing well as well.
>

Yes, that is true. That could happen to all of my 4 nodes. Perhaps the boss 
should see what will happen before I can get money to optimise the nodes...

>>
>> As you can see, i am using 1 (one) NVMe (Intel DC P3700 NVMe –
>400G)
>> Device for whole Spinning Disks (partitioned) on each OSD-node.
>>
>> When „Luminous“ is available (as next LTE) i plan to switch vom
>> „filestore“ to „bluestore“ 😊
>>
>> As far as i have read bluestore consists of
>> - „the device“
>> - „block-DB“: device that store RocksDB metadata
>> - „block-WAL“: device that stores RocksDB „write-ahead journal“
>>
>> Which setup would be usefull in my case?
>> I Would setup the disks via "ceph-deploy".
>
>So typically we recommend something like a 1-2GB WAL partition on the 
>NVMe drive per OSD and use the remaining space for DB.  If you run out 
>of DB space, bluestore will start using the spinning disks to store KV 
>data instead.  I suspect this will still be the advice you will want to
>
>follow, though at some point having so many WAL and DB partitions on
>the 
>NVMe may start becoming a bottleneck.  Something like 63K sequential 
>writes to heavily fragmented objects might be worth testing, but in
>most 
>cases I suspect DB and WAL on NVMe is still going to be faster.
>

Thanks, that's what I expected. Another idea would be to replace a spinning disk 
in the nodes with an Intel SSD for WAL/DB... perhaps for the DBs?
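Just to make sure I understood the advice correctly, I would prepare 
each OSD roughly like this (a sketch only - the device names are 
examples, and I still have to check the exact partition sizes):

#> ceph-disk prepare --bluestore /dev/sdb --block.db /dev/nvme0n1 --block.wal /dev/nvme0n1

i.e. data on the spinning disk and both DB and WAL on the NVMe, either 
letting ceph-disk create the partitions or pre-partitioning the NVMe 
with ~1-2 GB per WAL and the rest split up for the DBs.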

- Mehmet

>>
>> Thanks in advance for your suggestions!
>> - Mehmet
>>
>
>>
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimise Setup with Bluestore

2017-08-16 Thread Mehmet
:( no suggestions or recommendations on this? 

On 14 August 2017 16:50:15 CEST, Mehmet wrote:
>Hi friends,
>
>my actual hardware setup per OSD-node is as follow:
>
># 3 OSD-Nodes with
>- 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no 
>Hyper-Threading
>- 64GB RAM
>- 12x 4TB HGST 7K4000 SAS2 (6GB/s) Disks as OSDs
>- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for
>
>12 Disks (20G Journal size)
>- 1x Samsung SSD 840/850 Pro only for the OS
>
># and 1x OSD Node with
>- 1x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (10 Cores 20 Threads)
>- 64GB RAM
>- 23x 2TB TOSHIBA MK2001TRKB SAS2 (6GB/s) Disks as OSDs
>- 1x SEAGATE ST32000445SS SAS2 (6GB/s) Disk as OSDs
>- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for
>
>24 Disks (15G Journal size)
>- 1x Samsung SSD 850 Pro only for the OS
>
>As you can see, i am using 1 (one) NVMe (Intel DC P3700 NVMe – 400G) 
>Device for whole Spinning Disks (partitioned) on each OSD-node.
>
>When „Luminous“ is available (as next LTE) i plan to switch vom 
>„filestore“ to „bluestore“ 😊
>
>As far as i have read bluestore consists of
>-  „the device“
>-  „block-DB“: device that store RocksDB metadata
>-  „block-WAL“: device that stores RocksDB „write-ahead journal“
>
>Which setup would be usefull in my case?
>I Would setup the disks via "ceph-deploy".
>
>Thanks in advance for your suggestions!
>- Mehmet
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Jewel (10.2.7) osd suicide timeout while deep-scrub

2017-08-15 Thread Mehmet
I am not sure, but perhaps setting the nodown/noout flags could help it finish?
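Just a sketch of what I mean (not tested against your exact situation) 
- set the flags while the deep-scrub runs, then remove them again:

#> ceph osd set nodown
#> ceph osd set noout
... let the deep-scrub / recovery finish ...
#> ceph osd unset nodown
#> ceph osd unset noout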

- Mehmet  

On 15 August 2017 16:01:57 CEST, Andreas Calminder wrote:
>Hi,
>I got hit with osd suicide timeouts while deep-scrub runs on a
>specific pg, there's a RH article
>(https://access.redhat.com/solutions/2127471) suggesting changing
>osd_scrub_thread_suicide_timeout' from 60s to a higher value, problem
>is the article is for Hammer and the osd_scrub_thread_suicide_timeout
>doesn't exist when running
>ceph daemon osd.34 config show
>and the default timeout (60s) suggested in the article doesn't really
>match the sucide timeout time in the logs:
>
>2017-08-15 15:39:37.512216 7fb293137700  1 heartbeat_map is_healthy
>'OSD::osd_op_tp thread 0x7fb231adf700' had suicide timed out after 150
>2017-08-15 15:39:37.518543 7fb293137700 -1 common/HeartbeatMap.cc: In
>function 'bool ceph::HeartbeatMap::_check(const
>ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fb293137700
>time 2017-08-15 15:39:37.512230
>common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")
>
>The suicide timeout (150) does match the
>osd_op_thread_suicide_timeout, however when I try changing this I get:
>ceph daemon osd.34 config set osd_op_thread_suicide_timeout 300
>{
>"success": "osd_op_thread_suicide_timeout = '300' (unchangeable) "
>}
>
>And the deep scrub will sucide timeout after 150 seconds, just like
>before.
>
>The cluster is left with osd.34 flapping. Is there any way to let the
>deep-scrub finish and get out of the infinite deep-scrub loop?
>
>Regards,
>Andreas
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Optimise Setup with Bluestore

2017-08-14 Thread Mehmet

Hi friends,

my actual hardware setup per OSD-node is as follow:

# 3 OSD-Nodes with
- 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no 
Hyper-Threading

- 64GB RAM
- 12x 4TB HGST 7K4000 SAS2 (6GB/s) Disks as OSDs
- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for 
12 Disks (20G Journal size)

- 1x Samsung SSD 840/850 Pro only for the OS

# and 1x OSD Node with
- 1x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (10 Cores 20 Threads)
- 64GB RAM
- 23x 2TB TOSHIBA MK2001TRKB SAS2 (6GB/s) Disks as OSDs
- 1x SEAGATE ST32000445SS SAS2 (6GB/s) Disk as OSDs
- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for 
24 Disks (15G Journal size)

- 1x Samsung SSD 850 Pro only for the OS

As you can see, i am using 1 (one) NVMe (Intel DC P3700 NVMe – 400G) 
Device for whole Spinning Disks (partitioned) on each OSD-node.


When „Luminous“ is available (as the next LTS) I plan to switch from 
„filestore“ to „bluestore“ 😊


As far as I have read, BlueStore consists of
-   „the device“
-   „block-DB“: a device that stores RocksDB metadata
-   „block-WAL“: a device that stores the RocksDB „write-ahead journal“

Which setup would be useful in my case?
I would set up the disks via "ceph-deploy".

Thanks in advance for your suggestions!
- Mehmet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] One Monitor filling the logs

2017-08-08 Thread Mehmet
I guess this is related to
"debug_mgr": "1/5"
but I'm not sure... Give it a try.
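Something like this might quiet it down (untested, and I am not sure 
whether debug_mgr is really the culprit):

#> ceph tell mon.* injectargs '--debug_mgr=0/5'

and, to make it persistent, set "debug mgr = 0/5" in the [mon] section 
of ceph.conf.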

Hth
Mehmet 


On 8 August 2017 16:28:21 CEST, Konrad Riedel wrote:
>Hi Ceph users,
>
>my luminous (ceph version 12.1.1) testcluster is doing fine, except
>that 
>one Monitor is filling the logs
>
>  -rw-r--r-- 1 ceph ceph 119M Aug  8 15:27 ceph-mon.1.log
>
>ceph-mon.1.log:
>
>2017-08-08 15:57:49.509176 7ff4573c4700  0 log_channel(cluster) log 
>[DBG] : Standby manager daemon felix started
>2017-08-08 15:57:49.646006 7ff4573c4700  0 log_channel(cluster) log 
>[DBG] : Standby manager daemon daniel started
>2017-08-08 15:57:49.830046 7ff45d13a700  0 log_channel(cluster) log 
>[DBG] : mgrmap e256330: udo(active)
>2017-08-08 15:57:51.509410 7ff4573c4700  0 log_channel(cluster) log 
>[DBG] : Standby manager daemon felix started
>2017-08-08 15:57:51.646269 7ff4573c4700  0 log_channel(cluster) log 
>[DBG] : Standby manager daemon daniel started
>2017-08-08 15:57:52.054987 7ff45d13a700  0 log_channel(cluster) log 
>[DBG] : mgrmap e256331: udo(active)
>
>I've tried to reduce the debug settings ( "debug_mon": "0/1", 
>"debug_monc": "0/1"), but I still get 3 messages per
>second. Does anybody know how to mute this?
>
>All log settings (defaults):
>
>{
> "name": "mon.1",
> "cluster": "ceph",
> "debug_none": "0/5",
> "debug_lockdep": "0/1",
> "debug_context": "0/1",
> "debug_crush": "1/1",
> "debug_mds": "1/5",
> "debug_mds_balancer": "1/5",
> "debug_mds_locker": "1/5",
> "debug_mds_log": "1/5",
> "debug_mds_log_expire": "1/5",
> "debug_mds_migrator": "1/5",
> "debug_buffer": "0/1",
> "debug_timer": "0/1",
> "debug_filer": "0/1",
> "debug_striper": "0/1",
> "debug_objecter": "0/1",
> "debug_rados": "0/5",
> "debug_rbd": "0/5",
> "debug_rbd_mirror": "0/5",
> "debug_rbd_replay": "0/5",
> "debug_journaler": "0/5",
> "debug_objectcacher": "0/5",
> "debug_client": "0/5",
> "debug_osd": "1/5",
> "debug_optracker": "0/5",
> "debug_objclass": "0/5",
> "debug_filestore": "1/3",
> "debug_journal": "1/3",
> "debug_ms": "0/5",
> "debug_mon": "0/1",
> "debug_monc": "0/1",
> "debug_paxos": "1/5",
> "debug_tp": "0/5",
> "debug_auth": "1/5",
> "debug_crypto": "1/5",
> "debug_finisher": "1/1",
> "debug_heartbeatmap": "1/5",
> "debug_perfcounter": "1/5",
> "debug_rgw": "1/5",
> "debug_civetweb": "1/10",
> "debug_javaclient": "1/5",
> "debug_asok": "1/5",
> "debug_throttle": "1/1",
> "debug_refs": "0/0",
> "debug_xio": "1/5",
> "debug_compressor": "1/5",
> "debug_bluestore": "1/5",
> "debug_bluefs": "1/5",
> "debug_bdev": "1/3",
> "debug_kstore": "1/5",
> "debug_rocksdb": "4/5",
> "debug_leveldb": "4/5",
> "debug_memdb": "4/5",
> "debug_kinetic": "1/5",
> "debug_fuse": "1/5",
> "debug_mgr": "1/5",
> "debug_mgrc": "1/5",
> "debug_dpdk": "1/5",
> "debug_eventtrace": "1/5",
> "host": "felix",
>
>Thanks & regards
>
>Konrad Riedel
>
>--
>
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osds wont start. asserts with "failed to load OSD map for epoch , got 0 bytes"

2017-07-11 Thread Mehmet
Hello Mark,

Perhaps something like

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid 1.fs1 --op export --file /tmp/test

could help you to get your PG back.

I have never used the above command myself. It is a note from a post on the 
mailing list, so I recommend reading more about ceph-objectstore-tool before using it!
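From what I have read the OSD has to be stopped while you run it, and 
the export could then be imported into another (healthy, also stopped) 
OSD roughly like this - again only a sketch, I have not done this 
myself, and "ceph-5" is just an example target:

#> systemctl stop ceph-osd@5
#> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 --op import --file /tmp/test
#> systemctl start ceph-osd@5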

- Mehmet 

On 1 July 2017 01:53:49 CEST, Mark Guz wrote:
>Hi all
>
>I have two osds that are asserting , see
>https://pastebin.com/raw/xmDPg84a
>
>I am running kraken 11.2.0 and am kinda blocked by this.  Anything i
>try 
>to do with these osds, results in a abrt.
>
>Need to recover a down pg from one of the osds and even the pg export
>seg 
>faults.
>
>I'm at my wits end with this...
>
>Regards
>
>Mark Guz
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD node type/count mixes in the cluster

2017-06-18 Thread Mehmet
Hi,

We are actually using 3x Intel servers with 12 OSDs each and one Supermicro 
with 24 OSDs in one Ceph cluster, with journals on NVMe per server. We have 
not seen any issues yet.

Best
Mehmet

On 9 June 2017 19:24:40 CEST, Deepak Naidu wrote:
>Thanks David for sharing your experience, appreciate it.
>
>--
>Deepak
>
>From: David Turner [mailto:drakonst...@gmail.com]
>Sent: Friday, June 09, 2017 5:38 AM
>To: Deepak Naidu; ceph-users@lists.ceph.com
>Subject: Re: [ceph-users] OSD node type/count mixes in the cluster
>
>
>I ran a cluster with 2 generations of the same vendor hardware. 24 osd
>supermicro and 32 osd supermicro (with faster/more RAM and CPU cores). 
>The cluster itself ran decently well, but the load differences was
>drastic between the 2 types of nodes. It required me to run the cluster
>with 2 separate config files for each type of node and was an utter
>PITA when troubleshooting bottlenecks.
>
>Ultimately I moved around hardware and created a legacy cluster on the
>old hardware and created a new cluster using the newer configuration. 
>In general it was very hard to diagnose certain bottlenecks due to
>everything just looking so different.  The primary one I encountered
>was snap trimming due to deleted thousands of snapshots/day.
>
>If you aren't pushing any limits of Ceph, you will probably be fine. 
>But if you have a really large cluster, use a lot of snapshots, or are
>pushing your cluster harder than the average user... Then I'd avoid
>mixing server configurations in a cluster.
>
>On Fri, Jun 9, 2017, 1:36 AM Deepak Naidu
>mailto:dna...@nvidia.com>> wrote:
>Wanted to check if anyone has a ceph cluster which has mixed vendor
>servers both with same disk size i.e. 8TB but different count i.e.
>Example 10 OSD servers from Dell with 60 Disk per server and other 10
>OSD servers from HP with 26 Disk per server.
>
>If so does that change any performance dynamics ? or is it not
>advisable .
>
>--
>Deepak
>---
>This email message is for the sole use of the intended recipient(s) and
>may contain
>confidential information.  Any unauthorized review, use, disclosure or
>distribution
>is prohibited.  If you are not the intended recipient, please contact
>the sender by
>reply email and destroy all copies of the original message.
>---
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Available tools for deploying ceph cluster as a backend storage ?

2017-05-22 Thread Mehmet
Perhaps openATTIC is also an alternative for administering Ceph. Actually I 
prefer ceph-deploy.

On 18 May 2017 15:33:52 CEST, Shambhu Rajak wrote:
>Let me explore the code to my needs. Thanks Chris
>Regards,
>Shambhu
>
>From: Bitskrieg [mailto:bitskr...@bitskrieg.net]
>Sent: Thursday, May 18, 2017 6:40 PM
>To: Shambhu Rajak; wes_dilling...@harvard.edu
>Cc: ceph-users@lists.ceph.com
>Subject: Re: [ceph-users] Available tools for deploying ceph cluster as
>a backend storage ?
>
>
>Shambhu,
>
>If you're looking for something turnkey/dead-simple, you should be
>talking to red hat.  Everything out there that is FOSS is either A. Not
>fully-featured for life cycle management (ceph-deploy) or requires
>nontrivial amounts of time/expertise/etc. To put together and add
>whatever extra management features you need (Ansible, salt, etc.).
>
>That said, we did create a relatively simple framework for complete
>life cycle management of ceph using salt, along with the pieces
>required for cinder, glance, and nova integration.  Some of the stuff
>is environment specific, but those pieces are easy enough to pull out
>and adjust to your needs.  Code is here:
>https://git.cybbh.space/vta/saltstack.
>
>Chris
>
>On May 18, 2017 8:56:52 AM Shambhu Rajak
>mailto:sra...@sandvine.com>> wrote:
>HI Wes
>Since I want a production deployment, full-fledged management would be
>necessary for administrating, maintaining, could you suggest on this
>lines.
>Thanks,
>Shambhu
>
>From: Wes Dillingham
>[mailto:wes_dilling...@harvard.edu]
>Sent: Thursday, May 18, 2017 6:08 PM
>To: Shambhu Rajak
>Cc: ceph-users@lists.ceph.com
>Subject: Re: [ceph-users] Available tools for deploying ceph cluster as
>a backend storage ?
>
>If you dont want a full fledged configuration management approach
>ceph-deploy is your best bet.
>http://docs.ceph.com/docs/master/rados/deployment/ceph-deploy-new/
>
>On Thu, May 18, 2017 at 8:28 AM, Shambhu Rajak
>mailto:sra...@sandvine.com>> wrote:
>Hi ceph-users,
>
>I want to deploy ceph-cluster as a backend storage for openstack, so I
>am trying to find the best tool available for deploying ceph cluster.
>Few are in my mind:
>https://github.com/ceph/ceph-ansible
>https://github.com/01org/virtual-storage-manager/wiki/Getting-Started-with-VSM
>
>Is there anything else that are available that could be much easier to
>use and give production deployment.
>
>Thanks,
>Shambhu Rajak
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>--
>Respectfully,
>
>Wes Dillingham
>wes_dilling...@harvard.edu
>Research Computing | Infrastructure Engineer
>Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 102
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Read from Replica Osds?

2017-05-08 Thread Mehmet
Hi,

I thought that clients also read from Ceph replicas. Sometimes I read on 
the web that reads only happen from the primary PG, like how Ceph handles 
writes... so what is true?

Greetz
Mehmet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Adding New OSD Problem

2017-05-01 Thread Mehmet
Also I would set

osd_crush_initial_weight = 0

in ceph.conf and then increase the crush weight via

ceph osd crush reweight osd.36 0.05000

step by step.
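Spelled out a bit more (a sketch - osd.36 and the step size are 
examples, and the final weight should match the disk size, e.g. roughly 
5.5 for a 6 TB disk):

[osd]
osd_crush_initial_weight = 0

With that in ceph.conf the new OSD joins with weight 0 and nothing 
moves yet. Then raise the weight in small steps, waiting for recovery 
to settle in between:

#> ceph osd crush reweight osd.36 0.05000
#> ceph osd crush reweight osd.36 0.10000
... and so on, until the weight matches the disk size.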

On 25 April 2017 23:19:08 CEST, Reed Dier wrote:
>Others will likely be able to provide some better responses, but I’ll
>take a shot to see if anything makes sense.
>
>With 10.2.6 you should be able to set 'osd scrub during recovery’ to
>false to prevent any new scrubs from occurring during a recovery event.
>Current scrubs will complete, but future scrubs will not being until
>recovery has completed.
>
>Also, adding just one OSD on the new server, assuming all 6 are
>ready(?) will cause a good deal of unnecessary data reshuffling as you
>add more OSD’s.
>And on top of that, assuming the pool’s crush ruleset is ‘chooseleaf
>first 0 type host’ then that should create a bit of an unbalanced
>weighting. Any reason you aren’t bringing in all 6 OSD’s at once?
>You should be able to set noscrub, noscrub-deep, norebalance,
>nobackfill, and norecover flags (also probably want noout to prevent
>rebalance if OSDs flap), wait for scrubs to complete (especially deep),
>add your 6 OSD’s, unset your flags for recovery/rebalance/backfill, and
>it will then move data only once, and hopefully not have the scrub
>load. After recovery, unset the scrub flags, and be back to normal.
>
>Caveat, no VM’s running on my cluster, but those seem like low hanging
>fruit for possible load lightening during a rebalance.
>
>Reed
>
>> On Apr 25, 2017, at 3:47 PM, Ramazan Terzi 
>wrote:
>> 
>> Hello,
>> 
>> I have a Ceph Cluster with specifications below:
>> 3 x Monitor node
>> 6 x Storage Node (6 disk per Storage Node, 6TB SATA Disks, all disks
>have SSD journals)
>> Distributed public and private networks. All NICs are 10Gbit/s
>> osd pool default size = 3
>> osd pool default min size = 2
>> 
>> Ceph version is Jewel 10.2.6.
>> 
>> Current health status:
>>cluster 
>> health HEALTH_OK
>> monmap e9: 3 mons at
>{ceph-mon01=xxx:6789/0,ceph-mon02=xxx:6789/0,ceph-mon03=xxx:6789/0}
>>election epoch 84, quorum 0,1,2
>ceph-mon01,ceph-mon02,ceph-mon03
>> osdmap e1512: 36 osds: 36 up, 36 in
>>flags sortbitwise,require_jewel_osds
>>  pgmap v7698673: 1408 pgs, 5 pools, 37365 GB data, 9436 kobjects
>>83871 GB used, 114 TB / 196 TB avail
>>1408 active+clean
>> 
>> My cluster is active and a lot of virtual machines running on it
>(Linux and Windows VM's, database clusters, web servers etc).
>> 
>> When I want to add a new storage node with 1 disk, I'm getting huge
>problems. With new osd, crushmap updated and Ceph Cluster turns into
>recovery mode. Everything is OK. But after a while, some runnings VM's
>became unmanageable. Servers become unresponsive one by one. Recovery
>process would take an average of 20 hours. For this reason, I removed
>the new osd. Recovery process completed and everythink become normal.
>> 
>> When new osd added, health status:
>>cluster 
>> health HEALTH_WARN
>>91 pgs backfill_wait
>>1 pgs bacfilling
>>28 pgs degraded
>>28 pgs recovery_wait
>>28 phs stuck degraded
>>recovery 2195/18486602 objects degraded (0.012%)
>>recovery 1279784/18486602 objects misplaced (6.923%)
>> monmap e9: 3 mons at
>{ceph-mon01=xxx:6789/0,ceph-mon02=xxx:6789/0,ceph-mon03=xxx:6789/0}
>>election epoch 84, quorum 0,1,2
>ceph-mon01,ceph-mon02,ceph-mon03
>> osdmap e1512: 37 osds: 37 up, 37 in
>>flags sortbitwise,require_jewel_osds
>>  pgmap v7698673: 1408 pgs, 5 pools, 37365 GB data, 9436 kobjects
>>83871 GB used, 114 TB / 201 TB avail
>>2195/18486602 objects degraded (0.012%)
>>1279784/18486602 objects misplaced (6.923%)
>>1286 active+clean
>>91 active+remapped+wait_backfill
>>   28 active+recovery_wait+degraded
>> 2 active+clean+scrubbing+deep
>> 1 active+remapped+backfilling
>> recovery io 430 MB/s, 119 objects/s
>> client io 36174 B/s rrd, 5567 kB/s wr, 5 op/s rd, 700 op/s wr
>> 
>> Some Ceph config parameters:
>> osd_max_backfills = 1
>> osd_backfill_full_ratio = 0.85
>> osd_recovery_max_active = 3
>> osd_recovery_threads = 1
>> 
>> How I can add new OSD's safely?
>> 
>> Best regards,
>> Ramazan
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fujitsu

2017-04-20 Thread Mehmet
Hi Felix,

What happens when you restart the server - is the numbering identical then?

I have this behaviour on Intel servers: when I add disks at runtime, the 
numbering increases, but I would expect to get a number between the two 
slots (of course only when I put a disk between those slots).

Hope you understand what I mean :)

HTH 
Mehmet 

On 20 April 2017 09:19:32 CEST, "Stolte, Felix" wrote:
>Hello cephers,
>
>is anyone using Fujitsu Hardware for Ceph OSDs with the PRAID EP400i
>Raidcontroller in JBOD Mode? We are having three identical servers with
>identical Disk placement. First three Slots are SSDs for journaling and
>remaining nine slots with SATA Disks. Problem is, that in Ubuntu (and I
>would guess in any other distribution as well) the disk paths for the
>same
>physical drive slot differ between the three servers. For example on
>Server
>A first disk is identified as "pci-:01:00.0-scsi-0:0:14:0" and as
>"pci-:01:00.0-scsi-0:0:17:0" on the other. This makes provisioning
>osds
>nearly impossible. Anyone ran into the same issue an knows how to fix
>this?
>Fujitsu support couldn't help (in fact they did not know, that you can
>put
>the controller in JBOD mode ...). I actvated JBOD via the "Enable JBOD"
>option in the controller management menu of the praid ep400i raid
>controller.
>
>Cheers Felix
>
>
>Felix Stolte
>IT-Services
>E-Mail: f.sto...@fz-juelich.de
>Internet: http://www.fz-juelich.de
>
>Forschungszentrum Juelich GmbH
>52425 Juelich
>Sitz der Gesellschaft: Juelich
>Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
>Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
>Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
>Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
>Prof. Dr. Sebastian M. Schmidt
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Working Ceph guide for Centos 7 ???

2017-04-07 Thread Mehmet
Perhaps ceph-deploy can work when you disable the "epel" repo?

Purge everything and try it again.
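Roughly what I have in mind (a sketch, not tested on your setup - the 
node name is taken from your log):

#> yum-config-manager --disable epel
#> ceph-deploy purge FX8-bench
#> ceph-deploy purgedata FX8-bench
#> ceph-deploy install --release jewel FX8-bench

so that the ceph packages come from the Ceph repository instead of the 
old 0.80.x packages in EPEL.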

On 7 April 2017 04:27:59 CEST, Travis Eddy wrote:
>Here is what I tried: (several times)
>Nothing works
>The best I got was following the Ceph guide and adding
>sudo yum install centos-release-ceph-jewel
>
>When I do that its finishes and never works, acts like no one is
>talking.
>(selinux off, firewalld off)
>
>admin node $ ceph health
>2017-04-06 18:45:47.047387 7f16aa7fc700  0 -- :/3745109405 >>
>192.168.234.21:6789/0 pipe(0x7f16ac063cc0 sd=3 :0 s=1 pgs=0 cs=0 l=1
>c=0x7f16ac05c330).fault
>
>mon node $ ceph health
>HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs
>stuck inactive; 64 pgs stuck unclean; no osds
>
>
>the problems with the guides:
>
> http://docs.ceph.com/docs/master/start/quick-ceph-deploy/
> https://wiki.centos.org/SpecialInterestGroup/Storage/ceph-Quickstart
> https://www.server-world.info/en/note?os=CentOS_7&p=ceph
>
>https://www.howtoforge.com/tutorial/how-to-build-a-ceph-cluster-on-centos-7/
>
>
> ceph guide fails at  the step: "ceph-deploy install "
>looks like package dep problem ( i just copied the end, there was alot
>of
>problems)
>
>[FX8-bench][WARNIN] Error: Package: 1:ceph-common-0.80.7-0.8.el7.x86_64
>(epel)
>[FX8-bench][WARNIN]Requires: librbd1 = 1:0.80.7
>[FX8-bench][WARNIN]Installed: 1:librbd1-10.2.6-0.el7.x86_64
>(@Ceph)
>[FX8-bench][WARNIN]librbd1 = 1:10.2.6-0.el7
>[FX8-bench][WARNIN]Available:
>1:librbd1-0.80.7-0.8.el7.x86_64
>(epel)
>[FX8-bench][WARNIN]librbd1 = 1:0.80.7-0.8.el7
>[FX8-bench][WARNIN]Available: 1:librbd1-0.94.5-1.el7.x86_64
>(base)
>[FX8-bench][WARNIN]librbd1 = 1:0.94.5-1.el7
>[FX8-bench][DEBUG ]  You could try running: rpm -Va --nofiles
>--nodigest
>[FX8-bench][ERROR ] RuntimeError: command returned non-zero exit
>status: 1
>[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y
>install ceph ceph-radosgw
>
>if i install the centos-release-ceph package the problem goes away, but
>ceph dosen't work still
>
>
>the centos guide fails:
>https://wiki.centos.org/SpecialInterestGroup/Storage/ceph-Quickstart
>ceph-deploy install --mon c7-ceph-mon0
>
>[am1-test][DEBUG ] Retrieving
>http://ceph.com/rpm-hammer/el7/noarch/ceph-release-1-0.el7.noarch.rpm
>[am1-test][WARNIN] error: open of 24-Apr-2016 failed: No such file or
>directory
>[am1-test][WARNIN] error: open of 00:05 failed: No such file or
>directory
>[am1-test][WARNIN] error: -: not an rpm package (or package manifest):
>[am1-test][WARNIN] error: open of [am1-test][WARNIN] error: open of href=el7/>el7/ failed: No such
>file
>or directory
>[am1-test][WARNIN] error: open of 29-Aug-2016 failed: No such file or
>directory
>[am1-test][WARNIN] error: open of 11:53 failed: No such file or
>directory
>[am1-test][WARNIN] error: -: not an rpm package (or package manifest):
>[am1-test][WARNIN] error: open of [am1-test][WARNIN] error: open of href=fc20/>fc20/ failed: No such
>file
>or directory
>[am1-test][WARNIN] error: open of 07-Apr-2015 failed: No such file or
>directory
>[am1-test][WARNIN] error: open of 19:21 failed: No such file or
>directory
>[am1-test][WARNIN] error: -: not an rpm package (or package manifest):
>[am1-test][WARNIN] error: open of [am1-test][WARNIN] error: open of href=rhel6/>rhel6/ failed: No
>such
>file or directory
>[am1-test][WARNIN] error: open of 07-Apr-2015 failed: No such file or
>directory
>[am1-test][WARNIN] error: open of 19:22 failed: No such file or
>directory
>[am1-test][WARNIN] error: -: not an rpm package (or package manifest):
>[am1-test][WARNIN] error: open of  failed: No such
>file or
>directory
>[am1-test][WARNIN] error: open of  failed: No such file or
>directory
>[am1-test][ERROR ] RuntimeError: command returned non-zero exit status:
>32
>[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm -Uvh
>--replacepkgs
>http://ceph.com/rpm-hammer/el7/noarch/ceph-release-1-0.el7.noarch.rpm
>
>
>
>Normally Server-World is my go-to source, but I'm not going to try
>because
>it wants version "Hammer"
> https://www.server-world.info/en/note?os=CentOS_7&p=ceph
>
>
>
> howtoforge fails at the ceph-deploy install step
>
> [FX8-bench][DEBUG ] Retrieving
>http://ceph.com/rpm-hammer/el7/noarch/ceph-release-1-0.el7.noarch.rpm
>[FX8-bench][WARNIN] error: open of Index failed: No such file or
>directory
>[FX8-bench][WARNIN] error: open of of failed: No such file or directory
>[FX8-bench][WARNIN] error: open of /rpm-hammer/No
>such file or directory
>[FX8-bench][WARNIN] error: open of href=../>../ failed: No such
>file or
>directory
>[FX8-bench][WARNIN] error: open of [FX8-bench][WARNIN] error: open of href=el6/>el6/ failed: No such
>file
>or directory
>[FX8-bench][WARNIN] error: open of 24-Apr-2016 failed: No such file or
>directory
>[FX8-bench][WARNIN] error: open of 00:05 failed: No such file or
>directory
>[FX8-bench][WARNIN] error: -: not an rpm packa

Re: [ceph-users] active+clean+inconsistent and pg repair

2017-03-19 Thread Mehmet

Hi Shain,

what i would do:
take the osd.32 out

# systemctl stop ceph-osd@32
# ceph osd out osd.32

this will cause rebalancing.

to repair/reuse the drive you can do:

# smartctl -t long /dev/sdX
This will start a long self-test on the drive and - I bet - abort 
after a while with something like


# smartctl -a /dev/sdX
[...]
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Failed in segment -->        -      4378  35494670 [0x3 0x11 0x0]

[...]


Now mark the segment as defective - my system was Ubuntu

# apt install sg3-utils/xenial
# sg_verify --lba=35494670 /dev/sdX1
# sg_reassign --address=35494670 /dev/sdX
# sg_reassign --grown /dev/sdX

the next long test should hopefully work fine:
# smartctl -t long /dev/sdX

If not, repeat the above with the newly found defective LBA (a scripted 
sketch follows below).

I've done this three times successfully - but not with an error on a 
primary PG.


After that you can start the osd with

# systemctl start ceph-osd@32
# ceph osd in osd.32
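
Just a rough, untested sketch of how the verify/reassign step could be 
wrapped up (/dev/sdX and the LBA are placeholders - take the LBA from the 
smartctl self-test log, and note that sg_reassign destroys the content of 
that sector):

#!/bin/bash
DEV=/dev/sdX          # placeholder - the disk behind the failing OSD
LBA=35494670          # placeholder - the LBA_first_err from smartctl

sg_verify --lba=$LBA $DEV          # confirm the sector is really unreadable
sg_reassign --address=$LBA $DEV    # remap it to a spare sector
sg_reassign --grown $DEV           # show the grown defect list counter
smartctl -t long $DEV              # start the next long self-test
# if the test fails again, repeat with the newly reported LBA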

HTH
- Mehmet


Am 2017-03-17 20:08, schrieb Shain Miley:

Brian,

Thank you for the detailed information.  I was able to compare the 3
hexdump files and it looks like the primary pg is the odd man out.

I stopped the OSD and then I attempted to move the object:

root@hqosd3:/var/lib/ceph/osd/ceph-32/current/3.2b8_head/DIR_8/DIR_B/DIR_2/DIR_A/DIR_0#
mv rb.0.fe307e.238e1f29.0076024c__head_4650A2B8__3 /root
 mv: error reading
‘rb.0.fe307e.238e1f29.0076024c__head_4650A2B8__3’:
Input/output error
 mv: failed to extend
‘/root/rb.0.fe307e.238e1f29.0076024c__head_4650A2B8__3’:
Input/output error

However I got a nice Input/output error instead.

 I assume that this is not the case normally.

Any ideas on how I should proceed at this point..should I fail out
this OSD and replace the drive (I have had no indication other than
the IO error that there is an issue with this disk), or is there
something I can try first?

Thanks again,

Shain

On 03/17/2017 11:38 AM, Brian Andrus wrote:


We went through a period of time where we were experiencing these
daily...

cd to the PG directory on each OSD and do a find for
"238e1f29.0076024c" (mentioned in your error message). This will
likely return a file that has a slash in the name, something like
rbdudata.238e1f29.0076024c_head_blah_1f...

hexdump -C the object (tab completing the name helps) and pipe the
output to a different location. Once you obtain the hexdumps, do a
diff or cmp against them and find which one is not like the others.

If the primary is not the outlier, perform the PG repair without
worry. If the primary is the outlier, you will need to stop the OSD,
move the object out of place, start it back up and then it will be
okay to issue a PG repair.

Other less common inconsistent PGs we see are differing object sizes
(easy to detect with a simple list of file size) and differing
attributes ("attr -l", but the error logs are usually precise in
identifying the problematic PG copy).

On Fri, Mar 17, 2017 at 8:16 AM, Shain Miley  wrote:


Hello,

Ceph status is showing:

1 pgs inconsistent
1 scrub errors
1 active+clean+inconsistent

I located the error messages in the logfile after querying the pg
in question:

root@hqosd3:/var/log/ceph# zgrep -Hn 'ERR' ceph-osd.32.log.1.gz

ceph-osd.32.log.1.gz:846:2017-03-17 02:25:20.281608 7f7744d7f700
-1 log_channel(cluster) log [ERR] : 3.2b8 shard 32: soid
3/4650a2b8/rb.0.fe307e.238e1f29.0076024c/head candidate had a
read error, data_digest 0x84c33490 != known data_digest 0x974a24a7
from auth shard 62

ceph-osd.32.log.1.gz:847:2017-03-17 02:30:40.264219 7f7744d7f700
-1 log_channel(cluster) log [ERR] : 3.2b8 deep-scrub 0 missing, 1
inconsistent objects

ceph-osd.32.log.1.gz:848:2017-03-17 02:30:40.264307 7f7744d7f700
-1 log_channel(cluster) log [ERR] : 3.2b8 deep-scrub 1 errors

Is this a case where it would be safe to use 'ceph pg repair'? The
documentation indicates there are times where running this command
is less safe than others...and I would like to be sure before I do
so.

Thanks,
Shain

--
NPR | Shain Miley | Manager of Infrastructure, Digital Media |
smi...@npr.org | 202.513.3649 [1]

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [2]


--

Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com [3]


--
NPR | Shain Miley | Manager of Infrastructure, Digital Media |
smi...@npr.org | 202.513.3649


Links:
--
[1] tel:%28202%29%20513-3649
[2] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[3] http://www.dreamhost.com

_

[ceph-users] How to prevent blocked requests?

2017-02-24 Thread Mehmet

Hey friends,

a month ago i had an issue with few blocked requests where some of my 
VMs did freeze while this happened.
I guessed the culprit was a spinning disk with a lot of "delayed ECC" 
(shown via smartctl: 48701).


So we decided to take this osd down/out to do some checks. After this 
blocked requests were gone and we got no more freezes.


Btw, this is related to the mentioned blocked requests
*dmesg* on the Server produced (two times)
[4927177.901845] INFO: task filestore_sync:5907 blocked for more than 
120 seconds.
[4927177.902147]   Tainted: G  I 4.4.0-43-generic 
#63-Ubuntu
[4927177.902416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[4927177.902735] filestore_sync  D 8810073e3e00 0  5907  1 
0x
[4927177.902741]  8810073e3e00 88102a1f0db8 8810367fb700 
8810281b0dc0
[4927177.902745]  8810073e4000 88102a1f0de8 88102a1f0a98 
8810073e3e8c
[4927177.902748]  5638fa13e000 8810073e3e18 8182d7c5 
8810073e3e8c

[4927177.902751] Call Trace:
[4927177.902764]  [] schedule+0x35/0x80
[4927177.902771]  [] wb_wait_for_completion+0x58/0xa0
[4927177.902779]  [] ? 
wake_atomic_t_function+0x60/0x60

[4927177.902782]  [] sync_inodes_sb+0xa3/0x1f0
[4927177.902786]  [] sync_filesystem+0x5a/0xa0
[4927177.902789]  [] SyS_syncfs+0x3e/0x70
[4927177.902794]  [] 
entry_SYSCALL_64_fastpath+0x16/0x71


Later (after smartctl long check) we put the mentioned osd in again and 
had also no more issues.


Finally my question :)

Is Ceph able to deal with "problematic" disks? How can this be tuned? Perhaps 
with special timeouts?
I mean, let's say Ceph cannot read a shard of a PG because there is an 
"I/O error"? Or...

What if an OSD takes too long - like in the dmesg output above?
In our setup we are using size = 3, so when a read/write request takes 
too much time, Ceph should be able to use another copy of the shard (for 
reads).
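
In case it helps to narrow things down, these are the commands I use to see 
where blocked requests are hanging (osd.10 is just an example id; run the 
daemon commands on the host of that OSD):

# which OSDs are currently reporting blocked/slow requests
ceph health detail

# ops that are stuck right now on a suspect OSD
ceph daemon osd.10 dump_ops_in_flight

# the slowest recently completed ops, with their event timeline
ceph daemon osd.10 dump_historic_ops

# the threshold (in seconds) after which an op is reported as blocked
ceph daemon osd.10 config get osd_op_complaint_time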


This is my Setup (in production):

*Software/OS*
- Jewel
#> ceph tell osd.* version | grep version | uniq
"version": "ceph version 10.2.3 
(ecc23778eb545d8dd55e2e4735b53cc93f92e65b)"


#> ceph tell mon.* version
 [...] ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)

- Ubuntu 16.04.01 LTS on all OSD and MON Server
#> uname -a
Linux server 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 
2016 x86_64 x86_64 x86_64 GNU/Linux


*Server*
4x OSD Server, 3x with

- 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no 
Hyper-Threading

- 64GB RAM
- 12x 4TB HGST 7K4000 SAS2 (6GB/s) Disks as OSDs
- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for 
12 Disks (20G Journal size)

- 1x Samsung SSD 840/850 Pro only for the OS

and 1x OSD Server with

- 1x Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz (10 Cores 20 Threads)
- 64GB RAM
- 23x 2TB TOSHIBA MK2001TRKB SAS2 (6GB/s) Disks as OSDs
- 1x SEAGATE ST32000445SS SAS2 (6GB/s) Disk as OSDs
- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for 
24 Disks (15G Journal size)

- 1x Samsung SSD 850 Pro only for the OS

3x MON Server

- Two of them with 1x Intel(R) Xeon(R) CPU E3-1265L V2 @ 2.50GHz (4 
Cores, 8 Threads)
- The third one has 2x Intel(R) Xeon(R) CPU L5430 @ 2.66GHz ==> 8 Cores, 
no Hyper-Threading

- 32 GB RAM
- 1x Raid 10 (4 Disks)

*Network*

- Each Server and Client has 2x 10GB (LACP);
- We do not use Jumbo Frames yet..
- Public and Cluster-Network related Ceph traffic is going through this 
one active (LACP) 10GB Interface on each Server.


*ceph.conf*
[global]
fsid = ----
public_network = xxx.16.0.0/24
cluster_network = xx.0.0.0/24
mon_initial_members = monserver1, monserver2, monserver3
mon_host = xxx.16.0.2,xxx.16.0.3,xxx.16.0.4
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_crush_initial_weight = 0

mon_osd_full_ratio = 0.90
mon_osd_nearfull_ratio = 0.80

[mon]
mon_allow_pool_delete = false

[osd]
#osd_journal_size = 20480
osd_journal_size = 15360

Please ask if you need more information.
Thanks so far.

- Mehmet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Objects Stuck Degraded

2017-01-24 Thread Mehmet
Perhaps a deep scrub will surface a scrub error, which you can then try to fix 
with ceph pg repair (see the sketch below)?

Btw, it seems that you use 2 replicas, which is not recommended except for dev 
environments.
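
Roughly what I have in mind, with pg 4.559 taken from your output (only run 
the repair once you know which copy is the bad one):

# trigger a deep scrub on the affected pg and wait for it to finish
ceph pg deep-scrub 4.559
ceph -w                     # watch for the scrub result

# if a scrub error shows up, it will be listed here
ceph health detail

# and can then be repaired
ceph pg repair 4.559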

Am 24. Januar 2017 22:58:14 MEZ schrieb Richard Bade :
>Hi Everyone,
>I've got a strange one. After doing a reweight of some osd's the other
>night our cluster is showing 1pg stuck unclean.
>
>2017-01-25 09:48:41 : 1 pgs stuck unclean | recovery 140/71532872
>objects degraded (0.000%) | recovery 2553/71532872 objects misplaced
>(0.004%)
>
>When I query the pg it shows one of the osd's is not up.
>
>"state": "active+remapped",
>"snap_trimq": "[]",
>"epoch": 231928,
>"up": [
>155
>],
>"acting": [
>155,
>105
>],
>"actingbackfill": [
>"105",
>"155"
>],
>
>I've tried restarting the osd's, ceph pg repair, ceph pg 4.559
>list_missing, ceph pg 4.559 mark_unfound_lost revert.
>Nothing works.
>I've just tried setting osd.105 out, waiting for backfill to evacuate
>the osd and stopping the osd process to see if it'll recreate the 2nd
>set of data but no luck.
>It would seem that the primary copy of the data on osd.155 is fine but
>the 2nd copy on osd.105 isn't there.
>
>Any ideas how I can force rebuilding the 2nd copy? Or any other ideas
>to resolve this?
>
>We're running Hammer
>ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90)
>
>Regards,
>Richard
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph is rebalancing CRUSH on every osd add

2017-01-23 Thread Mehmet
I guess this is because you are always using the same root tree.
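
What I do to avoid the immediate reshuffle is to let new OSDs come in with 
CRUSH weight 0 and only weight them up once they sit in the right bucket - a 
sketch (bucket names are just examples from your tree):

# in ceph.conf on the OSD hosts, before creating new OSDs:
# [global]
# osd_crush_initial_weight = 0

# place the new OSD where it belongs, still with weight 0
ceph osd crush create-or-move osd.7 0 host=vm2

# raise the weight only when you actually want data to move onto it
ceph osd crush reweight osd.7 1.0

This does not change the fact that rules sharing one root will reshuffle when 
weights change - it only lets you control when that happens.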

Am 23. Januar 2017 10:50:16 MEZ schrieb Sascha Spreitzer :
>Hi all
>
>I reckognized ceph is rebalancing the whole crush map when i add osd's
>that should not affect any of my crush rulesets.
>
>Is there a way to add osd's to the crush map without having the cluster
>change all the osd mappings (rebalancing)?
>
>Or am i doing something wrong terribly?
>
>How does this work internally in general? What happens when you add an
>osd?
>
>Ceph jewel, tunables optimal
>
># rules
>rule replicated_ruleset {
>   ruleset 0
>   type replicated
>   min_size 1
>   max_size 10
>   step take default
>   step chooseleaf firstn 0 type host
>   step emit
>}
>rule sascha-ssd {
>   ruleset 1
>   type replicated
>   min_size 1
>   max_size 10
>   step take sascha
>   step chooseleaf firstn 0 type ssd
>   step emit
>}
>rule sascha-spin {
>   ruleset 2
>   type replicated
>   min_size 1
>   max_size 10
>   step take sascha
>   step chooseleaf firstn 0 type spin
>   step emit
>}
>rule sascha-usb {
>   ruleset 3
>   type replicated
>   min_size 1
>   max_size 10
>   step take sascha
>   step chooseleaf firstn 0 type usb
>   step emit
>}
>rule sascha-archive {
>ruleset 4
>type replicated
>min_size 1
>max_size 10
>step take sascha
>step chooseleaf firstn 1 type ssd
>   step emit
>   step take sascha
>step chooseleaf firstn -1 type usb
>step emit
>}
>
># end crush map
>
>[root@vm1 ceph]# ceph osd tree
>ID WEIGHT  TYPE NAME  UP/DOWN REWEIGHT PRIMARY-AFFINITY
>-1 6.0 root default
>-8 6.0 region sascha
>-9 6.0 room sascha-living
>-2 3.0 host vm1
>-4 3.0 ssd ssd0
> 0 1.0 osd.0   up  1.0  1.0
> 4 1.0 osd.4   up  1.0  1.0
> 5 1.0 osd.5   up  1.0  1.0
>-6   0 usb usb0
>-3 3.0 host vm2
>-5 3.0 ssd ssd1
> 1 1.0 osd.1   up  1.0  1.0
> 2 1.0 osd.2   up  1.0  1.0
> 3 1.0 osd.3   up  1.0  1.0
>-7   0 usb usb1
> 6   0 osd.6 down0  1.0
> 7   0 osd.7   up  1.0  1.0
>[root@vm1 ceph]# ceph osd crush tree
>[
>{
>"id": -1,
>"name": "default",
>"type": "root",
>"type_id": 13,
>"items": [
>{
>"id": -8,
>"name": "sascha",
>"type": "region",
>"type_id": 12,
>"items": [
>{
>"id": -9,
>"name": "sascha-living",
>"type": "room",
>"type_id": 10,
>"items": [
>{
>"id": -2,
>"name": "vm1",
>"type": "host",
>"type_id": 4,
>"items": [
>{
>"id": -4,
>"name": "ssd0",
>"type": "ssd",
>"type_id": 1,
>"items": [
>{
>"id": 0,
>"name": "osd.0",
>"type": "osd",
>"type_id": 0,
>  "crush_weight": 1.00,
>"depth": 5
>},
>{
>"id": 4,
>"name": "osd.4",
>"type": "osd",
>"type_id": 0,
>  "crush_weight": 1.00,
>"depth": 5
>},
>{
>"id": 5,
>"name": "osd.5",
>"type": "osd",
>"type_id": 0,
> 

Re: [ceph-users] Ceph pg active+clean+inconsistent

2016-12-21 Thread Mehmet
Hi Andras,

I am not the most experienced user, but I guess you could have a look at this 
object on each OSD involved in the PG, compare the copies and delete the 
differing one (a sketch follows below). I assume you have size = 3.

Then run pg repair again.

But be careful: IIRC the replica will be recovered from the primary PG.
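
A sketch of what I mean by comparing the copies (the pg and object name are 
taken from your log; ceph-NNN is a placeholder for the OSD ids that 
"ceph pg 6.92c query" lists as acting):

# find out which OSDs hold the pg
ceph pg 6.92c query | grep -A 5 '"acting"'

# then, on each of those OSD hosts:
cd /var/lib/ceph/osd/ceph-NNN/current/6.92c_head
find . -name '*1000187bbb5.0009*' -ls                     # compare sizes
find . -name '*1000187bbb5.0009*' -exec md5sum {} \;      # compare content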

Hth

Am 20. Dezember 2016 22:39:44 MEZ, schrieb Andras Pataki 
:
>Hi cephers,
>
>Any ideas on how to proceed on the inconsistencies below?  At the
>moment 
>our ceph setup has 5 of these - in all cases it seems like some zero 
>length objects that match across the three replicas, but do not match 
>the object info size.  I tried running pg repair on one of them, but it
>
>didn't repair the problem:
>
>2016-12-20 16:24:40.870307 7f3e1a4b1700  0 log_channel(cluster) log
>[INF] : 6.92c repair starts
>2016-12-20 16:27:06.183186 7f3e1a4b1700 -1 log_channel(cluster) log
>[ERR] : repair 6.92c 6:34932257:::1000187bbb5.0009:head on disk
>size (0) does not match object info size (3014656) adjusted for
>ondisk to (3014656)
>2016-12-20 16:27:35.885496 7f3e17cac700 -1 log_channel(cluster) log
>[ERR] : 6.92c repair 1 errors, 0 fixed
>
>
>Any help/hints would be appreciated.
>
>Thanks,
>
>Andras
>
>
>On 12/15/2016 10:13 AM, Andras Pataki wrote:
>> Hi everyone,
>>
>> Yesterday scrubbing turned up an inconsistency in one of our
>placement 
>> groups.  We are running ceph 10.2.3, using CephFS and RBD for some VM
>
>> images.
>>
>> [root@hyperv017 ~]# ceph -s
>> cluster d7b33135-0940-4e48-8aa6-1d2026597c2f
>>  health HEALTH_ERR
>> 1 pgs inconsistent
>> 1 scrub errors
>> noout flag(s) set
>>  monmap e15: 3 mons at 
>>
>{hyperv029=10.4.36.179:6789/0,hyperv030=10.4.36.180:6789/0,hyperv031=10.4.36.181:6789/0}
>> election epoch 27192, quorum 0,1,2 
>> hyperv029,hyperv030,hyperv031
>>   fsmap e17181: 1/1/1 up {0=hyperv029=up:active}, 2 up:standby
>>  osdmap e342930: 385 osds: 385 up, 385 in
>> flags noout
>>   pgmap v37580512: 34816 pgs, 5 pools, 673 TB data, 198 Mobjects
>> 1583 TB used, 840 TB / 2423 TB avail
>>34809 active+clean
>>4 active+clean+scrubbing+deep
>>2 active+clean+scrubbing
>>1 active+clean+inconsistent
>>   client io 87543 kB/s rd, 671 MB/s wr, 23 op/s rd, 2846 op/s wr
>>
>> # ceph pg dump | grep inconsistent
>> 6.13f1  46920   0   0   0 16057314767 3087 3087  
> 
>> active+clean+inconsistent 2016-12-14 16:49:48.391572 342929'41011
>> 342929:43966 [158,215,364]   158 [158,215,364]   158 342928'40540
>
>> 2016-12-14 16:49:48.391511  342928'405402016-12-14 
>> 16:49:48.391511
>>
>> I tried a couple of other deep scrubs on pg 6.13f1 but got repeated 
>> errors.  In the OSD logs:
>>
>> 2016-12-14 16:48:07.733291 7f3b56e3a700 -1 log_channel(cluster) log 
>> [ERR] : deep-scrub 6.13f1 6:8fc91b77:::1000187bb70.0009:head on 
>> disk size (0) does not match object info size (1835008) adjusted for 
>> ondisk to (1835008)
>> I looked at the objects on the 3 OSD's on their respective hosts and 
>> they are the same, zero length files:
>>
>> # cd ~ceph/osd/ceph-158/current/6.13f1_head
>> # find . -name *1000187bb70* -ls
>> 6697380 -rw-r--r--   1 ceph ceph0 Dec 13 17:00 
>>
>./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.0009__head_EED893F1__6
>>
>> # cd ~ceph/osd/ceph-215/current/6.13f1_head
>> # find . -name *1000187bb70* -ls
>> 5398156470 -rw-r--r--   1 ceph ceph0 Dec 13 17:00
>
>>
>./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.0009__head_EED893F1__6
>>
>> # cd ~ceph/osd/ceph-364/current/6.13f1_head
>> # find . -name *1000187bb70* -ls
>> 18814322150 -rw-r--r--   1 ceph ceph0 Dec 13
>17:00 
>>
>./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.0009__head_EED893F1__6
>>
>> At the time of the write, there wasn't anything unusual going on as 
>> far as I can tell (no hardware/network issues, all processes were up,
>
>> etc).
>>
>> This pool is a CephFS data pool, and the corresponding file (inode
>hex 
>> 1000187bb70, decimal 1099537300336) looks like this:
>>
>> # ls -li chr4.tags.tsv
>> 1099537300336 -rw-r--r-- 1 xichen xichen 14469915 Dec 13 17:01 
>> chr4.tags.tsv
>>
>> Reading the file is also ok (no errors, right number of bytes):
>> # cat chr4.tags.tsv > /dev/null
>> # wc chr4.tags.tsv
>>   592251  2961255 14469915 chr4.tags.tsv
>>
>> We are using the standard 4MB block size for CephFS, and if I 
>> interpret this right, this is the 9th chunk, so there shouldn't be
>any 
>> data (or even a 9th chunk), since the file is only 14MB. Should I run
>
>> pg repair on this?  Any ideas on how this could come about? Any other
>
>> recommendations?
>>
>> Thanks,
>>
>> Andras
>> apat...@apataki.net
>>
>
>
>
>
>
>___
>ceph-users 

Re: [ceph-users] PGs stuck at creating forever

2016-11-02 Thread Mehmet
I would try to set pgp_num for your pool equal to 300:

# ceph osd pool set rbd pgp_num 300

("rbd" being the pool name from your health warning)

If that does not help, try to restart osd.7 and osd.15.
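
To check that the change was applied and that the creating PGs drain away, 
something like this should do (pool name "rbd" taken from the health warning):

ceph osd pool get rbd pg_num
ceph osd pool get rbd pgp_num

# the stuck PGs should disappear from this list once peering succeeds
ceph pg dump_stuck inactive
watch -n 10 'ceph -s'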

Hth
- Mehmet 

Am 2. November 2016 14:15:09 MEZ, schrieb Vlad Blando :
>​I have a 3 Cluster Giant setup with 8 OSD each, during the
>installation I
>had to redo a cluster but it looks like the info is still on crush map
>(based on my readings). How do I fix this?
>
>[root@avatar0-ceph1 ~]# ceph -s
>cluster 2f0d1928-2ee5-4731-a259-64c0dc16110a
>health HEALTH_WARN 139 pgs stuck inactive; 139 pgs stuck unclean; 2
>requests are blocked > 32 sec; pool rbd pg_num 300 > pgp_num 64
> monmap e1: 3 mons at {avatar0-ceph0=
>172.40.40.100:6789/0,avatar0-ceph1=172.40.40.101:6789/0,avatar0-ceph2=172.40.40.102:6789/0},
>election epoch 56, quorum 0,1,2
>avatar0-ceph0,avatar0-ceph1,avatar0-ceph2
> osdmap e557: 24 osds: 24 up, 24 in
>  pgmap v1807359: 1500 pgs, 5 pools, 358 GB data, 48728 objects
>737 GB used, 88630 GB / 89368 GB avail
> 139 creating
>1361 active+clean
>  client io 44391 B/s wr, 19 op/s
>[root@avatar0-ceph1 ~]#
>
>
>[root@avatar0-ceph0 current]# ceph health detail
>HEALTH_WARN 139 pgs stuck inactive; 139 pgs stuck unclean; 2 requests
>are
>blocked > 32 sec; 2 osds have slow requests; pool rbd pg_num 300 >
>pgp_num
>64
>pg 0.f4 is stuck inactive since forever, current state creating, last
>acting [7,9]
>pg 0.f2 is stuck inactive since forever, current state creating, last
>acting [16,14]
>pg 0.ef is stuck inactive since forever, current state creating, last
>acting [13,0]
>pg 0.ee is stuck inactive since forever, current state creating, last
>acting [19,0]
>pg 0.ec is stuck inactive since forever, current state creating, last
>acting [12,20]
>pg 0.ea is stuck inactive since forever, current state creating, last
>acting [11,3]
>pg 0.e9 is stuck inactive since forever, current state creating, last
>acting [9,21]
>pg 0.e3 is stuck inactive since forever, current state creating, last
>acting [3,8]
>pg 0.e2 is stuck inactive since forever, current state creating, last
>acting [8,5]
>pg 0.e0 is stuck inactive since forever, current state creating, last
>acting [7,18]
>pg 0.dd is stuck inactive since forever, current state creating, last
>acting [5,17]
>pg 0.dc is stuck inactive since forever, current state creating, last
>acting [3,13]
>pg 0.db is stuck inactive since forever, current state creating, last
>acting [4,10]
>pg 0.da is stuck inactive since forever, current state creating, last
>acting [6,18]
>pg 0.d9 is stuck inactive since forever, current state creating, last
>acting [23,12]
>pg 0.d5 is stuck inactive since forever, current state creating, last
>acting [22,11]
>pg 0.d3 is stuck inactive since forever, current state creating, last
>acting [3,15]
>pg 0.d2 is stuck inactive since forever, current state creating, last
>acting [1,9]
>pg 0.d1 is stuck inactive since forever, current state creating, last
>acting [0,13]
>pg 0.ce is stuck inactive since forever, current state creating, last
>acting [19,7]
>pg 0.cd is stuck inactive since forever, current state creating, last
>acting [13,3]
>pg 0.cc is stuck inactive since forever, current state creating, last
>acting [0,15]
>pg 0.cb is stuck inactive since forever, current state creating, last
>acting [17,3]
>pg 0.ca is stuck inactive since forever, current state creating, last
>acting [15,7]
>pg 0.c9 is stuck inactive since forever, current state creating, last
>acting [18,6]
>pg 0.c8 is stuck inactive since forever, current state creating, last
>acting [3,10]
>pg 0.c5 is stuck inactive since forever, current state creating, last
>acting [10,22]
>pg 0.c4 is stuck inactive since forever, current state creating, last
>acting [0,12]
>pg 0.c1 is stuck inactive since forever, current state creating, last
>acting [2,13]
>pg 0.c0 is stuck inactive since forever, current state creating, last
>acting [17,4]
>pg 0.bf is stuck inactive since forever, current state creating, last
>acting [18,12]
>pg 0.be is stuck inactive since forever, current state creating, last
>acting [13,21]
>pg 0.bd is stuck inactive since forever, current state creating, last
>acting [23,14]
>pg 0.bb is stuck inactive since forever, current state creating, last
>acting [23,8]
>pg 0.ba is stuck inactive since forever, current state creating, last
>acting [17,9]
>pg 0.b9 is stuck inactive since forever, current state creating, last
>acting [0,16]
>pg 0.b7 is stuck inactive since forever, current state creating, last
>acting [1,21]
>pg 0.b6 is stuck inactive since forever, current state creating, last
>acting [0,8]
>pg 0.b4 is stuck inactive s

Re: [ceph-users] [EXTERNAL] Re: pg stuck with unfound objects on non exsisting osd's

2016-11-02 Thread Mehmet
Yes a rolling restart should work. That was enough in my case.
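
A rough, untested sketch of what I mean by a rolling restart (hostnames are 
placeholders, systemd assumed - on sysvinit it would be "service ceph restart 
osd"; since your cluster is already HEALTH_WARN, wait for recovery to settle 
rather than for HEALTH_OK):

#!/bin/bash
for host in osd-node1 osd-node2 osd-node3; do
    # -n keeps ssh from eating stdin, which we need for the prompt below
    ssh -n "$host" 'systemctl restart ceph-osd.target'
    sleep 60                # give the OSDs time to rejoin
    ceph -s                 # check that the cluster has settled again
    read -p "Cluster settled after $host - continue with the next node? " _
done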

Am 2. November 2016 01:20:48 MEZ, schrieb "Will.Boege" :
>Start with a rolling restart of just the OSDs one system at a time,
>checking the status after each restart.
>
>On Nov 1, 2016, at 6:20 PM, Ronny Aasen
>mailto:ronny+ceph-us...@aasen.cx>> wrote:
>
>thanks for the suggestion.
>
>is a rolling reboot sufficient? or must all osd's be down at the same
>time ?
>one is no problem.  the other takes some scheduling..
>
>Ronny Aasen
>
>
>On 01.11.2016 21:52, c...@elchaka.de<mailto:c...@elchaka.de> wrote:
>Hello Ronny,
>
>if it is possible for you, try to Reboot all OSD Nodes.
>
>I had this issue on my test Cluster and it become healthy after
>rebooting.
>
>Hth
>- Mehmet
>
>Am 1. November 2016 19:55:07 MEZ, schrieb Ronny Aasen
><mailto:ronny+ceph-us...@aasen.cx>:
>
>Hello.
>
>I have a cluster stuck with 2 pg's stuck undersized degraded, with 25
>unfound objects.
>
># ceph health detail
>HEALTH_WARN 2 pgs degraded; 2 pgs recovering; 2 pgs stuck degraded; 2
>pgs stuck unclean; 2 pgs stuck undersized; 2 pgs undersized; recovery
>294599/149522370 objects degraded (0.197%); recovery 640073/149522370
>objects misplaced (0.428%); recovery 25/46579241 unfound (0.000%);
>noout flag(s) set
>pg 6.d4 is stuck unclean for 8893374.380079, current state
>active+recovering+undersized+degraded+remapped, last acting [62]
>pg 6.ab is stuck unclean for 8896787.249470, current state
>active+recovering+undersized+degraded+remapped, last acting [18,12]
>pg 6.d4 is stuck undersized for 438122.427341, current state
>active+recovering+undersized+degraded+remapped, last acting [62]
>pg 6.ab is stuck undersized for 416947.461950, current state
>active+recovering+undersized+degraded+remapped, last acting [18,12]pg
>6.d4 is stuck degraded for 438122.427402, current state
>active+recovering+undersized+degraded+remapped, last acting [62]
>pg 6.ab is stuck degraded for 416947.462010, current state
>active+recovering+undersized+degraded+remapped, last acting [18,12]
>pg 6.d4 is active+recovering+undersized+degraded+remapped, acting [62],
>25 unfound
>pg 6.ab is active+recovering+undersized+degraded+remapped, acting
>[18,12]
>recovery 294599/149522370 objects degraded (0.197%)
>recovery 640073/149522370 objects misplaced (0.428%)
>recovery 25/46579241 unfound (0.000%)
>noout flag(s) set
>
>
>have been following the troubleshooting guide at
>http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/
>but gets stuck without a resolution.
>
>luckily it is not critical data. so i wanted to mark the pg lost so it
>could become health-ok<
> br
>/>
>
># ceph pg 6.d4 mark_unfound_lost delete
>Error EINVAL: pg has 25 unfound objects but we haven't probed all
>sources, not marking lost
>
>querying the pg i see that it would want osd.80 and osd 36
>
>  {
> "osd": "80",
> "status": "osd is down"
> },
>
>trying to mark the osd's lost does not work either. since the osd's was
>removed from the cluster a long time ago.
>
># ceph osd lost 80 --yes-i-really-mean-it
>osd.80 is not down or doesn't exist
>
># ceph osd lost 36 --yes-i-really-mean-it
>osd.36 is not down or doesn't exist
>
>
>and this is where i am stuck.
>
>have tried stopping and starting the 3 osd's but that did not have any
>effect.
>
>Anyone have any advice how to proceed ?
>
>full output at:  http://paste.debian.net/hidden/be03a185/
>
>this is hammer 0.94.9  on debian 8.
>
>
>kind regards
>
>Ronny Aasen
>
>
>
>
>
>
>ceph-users mailing list
>ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Jewel 10.2.2 - Error when flushing journal

2016-09-12 Thread Mehmet

Hey Alexey,

sorry - it seems that the log file does not contain the debug messages 
which I got at the command line


here it is
- http://slexy.org/view/s20A6m2Tfr

Mehmet

Am 2016-09-12 15:48, schrieb Alexey Sheplyakov:

Hi,


This is the actual logfile for osd.10

 > - http://slexy.org/view/s21lhpkLGQ [5]

Unfortunately this log does not contain any new data -- for some
reason the log levels haven't changed (see line 36369).

Could you please try the following command:

ceph-osd -d --flush-journal --debug_filestore 20/20 --debug_journal
20/20  -i 10

Best regards,

 Alexey

On Fri, Sep 9, 2016 at 11:24 AM, Mehmet  wrote:


Hello Alexey,

thank you for your mail - my answers inline :)

Am 2016-09-08 16:24, schrieb Alexey Sheplyakov:
Hi,

root@:~# ceph-osd -i 12 --flush-journal
 > SG_IO: questionable sense data, results may be incorrect
 > SG_IO: questionable sense data, results may be incorrect

As far as I understand these lines is a hdparm warning (OSD uses
hdparm command to query the journal device write cache state).

The message means hdparm is unable to reliably figure out if the
drive
write cache is enabled. This might indicate a hardware problem.


 I guess this has to do with the the NVMe-Device (Intel DC P3700 NVMe)
which is used for journaling.
 And so.. a normal behavior?


ceph-osd -i 12 --flush-journal


I think it's a good idea to
a) check the journal drive (smartctl),


 The disks are all fine - checked 2-3 weeks before.


b) capture a more verbose log,

i.e. add this to ceph.conf

[osd]
debug filestore = 20/20
debug journal = 20/20

and try flushing the journal once more (note: this won't fix the
problem, the point is to get a useful log)


 I have flushed the the journal @ ~09:55:26 today and got this lines

 root@:~# ceph-osd -i 10 --flush-journal
 SG_IO: questionable sense data, results may be incorrect
 SG_IO: questionable sense data, results may be incorrect
 *** Caught signal (Segmentation fault) **
  in thread 7f38a2ecf700 thread_name:ceph-osd
  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
  1: (()+0x96bdde) [0x560356296dde]
  2: (()+0x113d0) [0x7f38a81b03d0]
  3: [0x560360f79f00]
 2016-09-09 09:55:26.446925 7f38a2ecf700 -1 *** Caught signal
(Segmentation fault) **
  in thread 7f38a2ecf700 thread_name:ceph-osd

  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
  1: (()+0x96bdde) [0x560356296dde]
  2: (()+0x113d0) [0x7f38a81b03d0]
  3: [0x560360f79f00]
  NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

      0> 2016-09-09 09:55:26.446925 7f38a2ecf700 -1 *** Caught
signal (Segmentation fault) **
  in thread 7f38a2ecf700 thread_name:ceph-osd

  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
  1: (()+0x96bdde) [0x560356296dde]
  2: (()+0x113d0) [0x7f38a81b03d0]
  3: [0x560360f79f00]
  NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

 Segmentation fault

 This is the actual logfile for osd.10
 - http://slexy.org/view/s21lhpkLGQ [5]

 By the way:
 I have done "ceph osd set noout" before stop and flushing.

 Hope this is useful for you!

 - Mehmet


Best regards,
  Alexey

On Wed, Sep 7, 2016 at 6:48 PM, Mehmet  wrote:

Hey again,

now i have stopped my osd.12 via

root@:~# systemctl stop ceph-osd@12

and when i am flush the journal...

root@:~# ceph-osd -i 12 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

     0> 2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught
signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

Segmentation fault

The logfile with further information
- http://slexy.org/view/s2T8AohMfU [1] [4]

I guess i will get same message when i flush the other journals.

- Mehmet

Am 2016-09-07 13:23, schrieb Mehmet:

Hello ceph people,

yesterday i stopped one of my OSDs via

root@:~# systemctl stop ceph-osd@10

and tried to flush the journal for this osd via

root@:~# ceph-osd -i 10 --flush-journal

but getting this output on the screen:

SG_IO: questionable sense data, results ma

Re: [ceph-users] Jewel 10.2.2 - Error when flushing journal

2016-09-12 Thread Mehmet

Hello Alexey,

this time i did not get any error with your given command


ceph-osd -d --flush-journal --debug_filestore 20/20 --debug_journal
20/20  -i 10


for osd.10
- http://slexy.org/view/s21dWEKymn

*but* i have tried another osd (12) and got indeed an error :*(

for osd.12
- http://slexy.org/view/s2vrUnNBEW

Thank you for your investigations :)
HTH

kind regards,
- Mehmet


Am 2016-09-12 15:48, schrieb Alexey Sheplyakov:

Hi,


This is the actual logfile for osd.10

 > - http://slexy.org/view/s21lhpkLGQ [5]

Unfortunately this log does not contain any new data -- for some
reason the log levels haven't changed (see line 36369).

Could you please try the following command:

ceph-osd -d --flush-journal --debug_filestore 20/20 --debug_journal
20/20  -i 10

Best regards,

 Alexey

On Fri, Sep 9, 2016 at 11:24 AM, Mehmet  wrote:


Hello Alexey,

thank you for your mail - my answers inline :)

Am 2016-09-08 16:24, schrieb Alexey Sheplyakov:
Hi,

root@:~# ceph-osd -i 12 --flush-journal
 > SG_IO: questionable sense data, results may be incorrect
 > SG_IO: questionable sense data, results may be incorrect

As far as I understand these lines is a hdparm warning (OSD uses
hdparm command to query the journal device write cache state).

The message means hdparm is unable to reliably figure out if the
drive
write cache is enabled. This might indicate a hardware problem.


 I guess this has to do with the the NVMe-Device (Intel DC P3700 NVMe)
which is used for journaling.
 And so.. a normal behavior?


ceph-osd -i 12 --flush-journal


I think it's a good idea to
a) check the journal drive (smartctl),


 The disks are all fine - checked 2-3 weeks before.


b) capture a more verbose log,

i.e. add this to ceph.conf

[osd]
debug filestore = 20/20
debug journal = 20/20

and try flushing the journal once more (note: this won't fix the
problem, the point is to get a useful log)


 I have flushed the the journal @ ~09:55:26 today and got this lines

 root@:~# ceph-osd -i 10 --flush-journal
 SG_IO: questionable sense data, results may be incorrect
 SG_IO: questionable sense data, results may be incorrect
 *** Caught signal (Segmentation fault) **
  in thread 7f38a2ecf700 thread_name:ceph-osd
  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
  1: (()+0x96bdde) [0x560356296dde]
  2: (()+0x113d0) [0x7f38a81b03d0]
  3: [0x560360f79f00]
 2016-09-09 09:55:26.446925 7f38a2ecf700 -1 *** Caught signal
(Segmentation fault) **
  in thread 7f38a2ecf700 thread_name:ceph-osd

  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
  1: (()+0x96bdde) [0x560356296dde]
  2: (()+0x113d0) [0x7f38a81b03d0]
  3: [0x560360f79f00]
  NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

      0> 2016-09-09 09:55:26.446925 7f38a2ecf700 -1 *** Caught
signal (Segmentation fault) **
  in thread 7f38a2ecf700 thread_name:ceph-osd

  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
  1: (()+0x96bdde) [0x560356296dde]
  2: (()+0x113d0) [0x7f38a81b03d0]
  3: [0x560360f79f00]
  NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

 Segmentation fault

 This is the actual logfile for osd.10
 - http://slexy.org/view/s21lhpkLGQ [5]

 By the way:
 I have done "ceph osd set noout" before stop and flushing.

 Hope this is useful for you!

 - Mehmet


Best regards,
  Alexey

On Wed, Sep 7, 2016 at 6:48 PM, Mehmet  wrote:

Hey again,

now i have stopped my osd.12 via

root@:~# systemctl stop ceph-osd@12

and when i am flush the journal...

root@:~# ceph-osd -i 12 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

     0> 2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught
signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

Segmentation fault

The logfile with further information
- http://slexy.org/view/s2T8AohMfU [1] [4]

I guess i will get same message when i flush the other journals.

- Mehmet

Am 2016-09-07 13:23, schrieb Mehmet:

Hello ceph people,

yesterday i stopped one of my OSDs via


Re: [ceph-users] Jewel 10.2.2 - Error when flushing journal

2016-09-09 Thread Mehmet

Hello Alexey,

thank you for your mail - my answers inline :)

Am 2016-09-08 16:24, schrieb Alexey Sheplyakov:

Hi,


root@:~# ceph-osd -i 12 --flush-journal

 > SG_IO: questionable sense data, results may be incorrect
 > SG_IO: questionable sense data, results may be incorrect

As far as I understand these lines is a hdparm warning (OSD uses
hdparm command to query the journal device write cache state).

The message means hdparm is unable to reliably figure out if the drive
write cache is enabled. This might indicate a hardware problem.


I guess this has to do with the NVMe device (Intel DC P3700 NVMe) 
which is used for journaling.

And so.. a normal behavior?


ceph-osd -i 12 --flush-journal


I think it's a good idea to
a) check the journal drive (smartctl),


The disks are all fine - checked 2-3 weeks before.


b) capture a more verbose log,

i.e. add this to ceph.conf

[osd]
debug filestore = 20/20
debug journal = 20/20

and try flushing the journal once more (note: this won't fix the
problem, the point is to get a useful log)


I have flushed the journal at ~09:55:26 today and got these lines

root@:~# ceph-osd -i 10 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7f38a2ecf700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x560356296dde]
 2: (()+0x113d0) [0x7f38a81b03d0]
 3: [0x560360f79f00]
2016-09-09 09:55:26.446925 7f38a2ecf700 -1 *** Caught signal 
(Segmentation fault) **

 in thread 7f38a2ecf700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x560356296dde]
 2: (()+0x113d0) [0x7f38a81b03d0]
 3: [0x560360f79f00]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.


 0> 2016-09-09 09:55:26.446925 7f38a2ecf700 -1 *** Caught signal 
(Segmentation fault) **

 in thread 7f38a2ecf700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x560356296dde]
 2: (()+0x113d0) [0x7f38a81b03d0]
 3: [0x560360f79f00]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.


Segmentation fault


This is the actual logfile for osd.10
- http://slexy.org/view/s21lhpkLGQ

By the way:
I have done "ceph osd set noout" before stop and flushing.

Hope this is useful for you!

- Mehmet


Best regards,
  Alexey

On Wed, Sep 7, 2016 at 6:48 PM, Mehmet  wrote:


Hey again,

now i have stopped my osd.12 via

root@:~# systemctl stop ceph-osd@12

and when i am flush the journal...

root@:~# ceph-osd -i 12 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

     0> 2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught
signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

Segmentation fault

The logfile with further information
- http://slexy.org/view/s2T8AohMfU [4]

I guess i will get same message when i flush the other journals.

- Mehmet

Am 2016-09-07 13:23, schrieb Mehmet:


Hello ceph people,

yesterday i stopped one of my OSDs via

root@:~# systemctl stop ceph-osd@10

and tried to flush the journal for this osd via

root@:~# ceph-osd -i 10 --flush-journal

but getting this output on the screen:

SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
2016-09-06 22:12:51.850739 7fd846333700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
 NOTE: a copy of the executable, or `objdump -rdS `
is
needed to interpret

Re: [ceph-users] Jewel 10.2.2 - Error when flushing journal

2016-09-07 Thread Mehmet

Hey again,

now i have stopped my osd.12 via

root@:~# systemctl stop ceph-osd@12

and when i am flush the journal...

root@:~# ceph-osd -i 12 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught signal 
(Segmentation fault) **

 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.


 0> 2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught signal 
(Segmentation fault) **

 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.


Segmentation fault

The logfile with further information
- http://slexy.org/view/s2T8AohMfU

I guess i will get same message when i flush the other journals.

- Mehmet

Am 2016-09-07 13:23, schrieb Mehmet:

Hello ceph people,

yesterday i stopped one of my OSDs via

root@:~# systemctl stop ceph-osd@10

and tried to flush the journal for this osd via

root@:~# ceph-osd -i 10 --flush-journal

but getting this output on the screen:

SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
2016-09-06 22:12:51.850739 7fd846333700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

 0> 2016-09-06 22:12:51.850739 7fd846333700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
 NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this.

Segmentation fault

This is the logfile from my osd.10 with further informations
- http://slexy.org/view/s21tfwQ1fZ

Today i stopped another OSD (osd.11)

root@:~# systemctl stop ceph-osd@11

I did not not get the above mentioned error - but this

root@:~# ceph-osd -i 11 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
2016-09-07 13:19:39.729894 7f3601a298c0 -1 flushed journal
/var/lib/ceph/osd/ceph-11/journal for object store
/var/lib/ceph/osd/ceph-11

This is the logfile from my osd.11 with further informations
- http://slexy.org/view/s2AlEhV38m

This is not realy a case actualy cause i will setup the journal
partitions again with 20GB (from 5GB actual) an bring the OSD then
bring up again.
But i thought i should mail this error to the mailing list.

This is my Setup:

*Software/OS*
- Jewel
#> ceph tell osd.* version | grep version | uniq
"version": "ceph version 10.2.2 
(45107e21c568dd033c2f0a3107dec8f0b0e58374)"


#> ceph tell mon.* version
[...] ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)

- Ubuntu 16.04 LTS on all OSD and MON Server
#> uname -a
31.08.2016: Linux reilif 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11
18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

*Server*
3x OSD Server, each with

- 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no 
Hyper-Threading


- 64GB RAM
- 10x 4TB HGST 7K4000 SAS2 (6GB/s) Disks as OSDs

- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device
for 10-12 Disks

- 1x Samsung SSD 840/850 Pro only for the OS

3x MON Server
- Two of them with 1x Intel(R) Xeon(R) CPU E3-1265L V2 @ 2.50GHz (4
Cores, 8 Threads) - The third one has 2x Intel(R) Xeon(R) CPU L5430 @
2.66GHz ==> 8 Cores, no Hyper-Threading

- 32 GB RAM
- 1x Raid 10 (4 Disks)

*Network*
- Actualy each Server and Client has on active connection @ 1x 1GB; In
Short this will be changed to 2x 10GB Fibre perhaps with LACP when
possible.

- We do not use Jumbo Frames yet..

- Public and Cluster-Network related Ceph traffic is actualy going
through this one active 1

[ceph-users] Jewel 10.2.2 - Error when flushing journal

2016-09-07 Thread Mehmet

Hello ceph people,

yesterday i stopped one of my OSDs via

root@:~# systemctl stop ceph-osd@10

and tried to flush the journal for this osd via

root@:~# ceph-osd -i 10 --flush-journal

but getting this output on the screen:

SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
2016-09-06 22:12:51.850739 7fd846333700 -1 *** Caught signal 
(Segmentation fault) **

 in thread 7fd846333700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.


 0> 2016-09-06 22:12:51.850739 7fd846333700 -1 *** Caught signal 
(Segmentation fault) **

 in thread 7fd846333700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.


Segmentation fault

This is the logfile from my osd.10 with further informations
- http://slexy.org/view/s21tfwQ1fZ

Today i stopped another OSD (osd.11)

root@:~# systemctl stop ceph-osd@11

I did not not get the above mentioned error - but this

root@:~# ceph-osd -i 11 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
2016-09-07 13:19:39.729894 7f3601a298c0 -1 flushed journal 
/var/lib/ceph/osd/ceph-11/journal for object store 
/var/lib/ceph/osd/ceph-11


This is the logfile from my osd.11 with further informations
- http://slexy.org/view/s2AlEhV38m

This is not really an issue for me right now, because I will set up the journal 
partitions again with 20GB (instead of the current 5GB) and then bring the OSDs 
back up again.

But i thought i should mail this error to the mailing list.
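
For completeness, the rough procedure I plan to use for recreating the 
journals with 20GB (a sketch for osd.10; the partition uuid is a placeholder, 
and it obviously assumes the flush itself works):

ceph osd set noout                       # keep the cluster from rebalancing
systemctl stop ceph-osd@10
ceph-osd -i 10 --flush-journal           # the step that segfaults here for some OSDs

# recreate the journal partition on the NVMe device with 20G, then:
rm -f /var/lib/ceph/osd/ceph-10/journal
ln -s /dev/disk/by-partuuid/<new-uuid> /var/lib/ceph/osd/ceph-10/journal
ceph-osd -i 10 --mkjournal
systemctl start ceph-osd@10
ceph osd unset noout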

This is my Setup:

*Software/OS*
- Jewel
#> ceph tell osd.* version | grep version | uniq
"version": "ceph version 10.2.2 
(45107e21c568dd033c2f0a3107dec8f0b0e58374)"


#> ceph tell mon.* version
[...] ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)

- Ubuntu 16.04 LTS on all OSD and MON Server
#> uname -a
31.08.2016: Linux reilif 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 
18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux


*Server*
3x OSD Server, each with

- 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no 
Hyper-Threading


- 64GB RAM
- 10x 4TB HGST 7K4000 SAS2 (6GB/s) Disks as OSDs

- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device for 
10-12 Disks


- 1x Samsung SSD 840/850 Pro only for the OS

3x MON Server
- Two of them with 1x Intel(R) Xeon(R) CPU E3-1265L V2 @ 2.50GHz (4 
Cores, 8 Threads) - The third one has 2x Intel(R) Xeon(R) CPU L5430 @ 
2.66GHz ==> 8 Cores, no Hyper-Threading


- 32 GB RAM
- 1x Raid 10 (4 Disks)

*Network*
- Actually each server and client has one active connection at 1x 1GB; shortly 
this will be changed to 2x 10GB fibre, perhaps with LACP when 
possible.


- We do not use Jumbo Frames yet..

- Public and Cluster-Network related Ceph traffic is actualy going 
through this one active 1GB Interface on each Server.


hf
- Mehmet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-30 Thread Mehmet

Good news Jean-Charles :)

now i have deleted the object

[...]
-rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 vm-101-disk-2__head_383C3223__0
[...]

root@:~# rados -p rbd rm vm-101-disk-2

and ran a deep-scrub on 0.223 again.

root@gengalos:~# ceph pg 0.223 query

No blocked requests anymore :)

To be sure i have checked

root@:~# ceph pg 0.223 query | less
[...]
"stat_sum": {
"num_bytes": 110264905728,
"num_objects": 703,
[...]

But this has not changed yet. I guess I have to wait a while.
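
To be sure no other stray object from my failed "rados put" attempts is left, 
I will also scan the pool for unusually large objects - a rough sketch (this 
is slow on a big pool, and it assumes the size is the last field of the 
"rados stat" output):

rados -p rbd ls | while read obj; do
    size=$(rados -p rbd stat "$obj" | awk '{print $NF}')
    [ "$size" -gt $((100 * 1024 * 1024)) ] && echo "$size  $obj"
done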

Thank you very much for your patience and great help!

Now lets play a bit with ceph ^^

Best regards,

- Mehmet

Am 2016-08-30 00:02, schrieb Jean-Charles Lopez:

How Mehmet

OK so it does come from a rados put.

As you were able to check the VM device objet size is 4 MB.

So we'll see after you have removed the object with rados -p rbd rm.

I'll wait for an update.

JC

While moving. Excuse unintended typos.


On Aug 29, 2016, at 14:34, Mehmet  wrote:

Hey JC,

after setting up the ceph-cluster i tried to migrate an image from one 
of our production vm into ceph via


# rados -p rbd put ...

but i have got always "file too large". I guess this file

# -rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 
vm-101-disk-2__head_383C3223__0


is the result of this :) - did not thought that there will be 
something stay in ceph after the mentioned error above.

Seems i was wrong...

This could match the time where the issue happened first time...:

1. i tried to put via "rados -p rbd put..." this did not worked (tried 
to put a ~400G file...)
2. after ~ 1 week i see the blocked requests after first running 
"deep-scrub" (default where ceph starts deep-scrubbing)


I guess the deleting of this file should solve the issue.
Did you see my mail where i wrote the test results of this?

# osd_scrub_chunk_max = 5
# osd_deep_scrub_stride = 1048576

Only corner note.


This seems more to me like a pure rados object of 100GB that was
uploaded to the cluster. From the name it could be a VM disk image
that was uploaded as an object. If it was an RBD object, it’s size
would be in the boundaries of an RBD objects (order 12=4K order
25=32MB).



Verify that when you do a "rados -p rbd ls | grep vm-101-disk-2”
command, you can see an object named vm-101-disk-2.


root@:~# rados -p rbd ls | grep vm-101-disk-2
rbd_id.vm-101-disk-2
vm-101-disk-2

Verify if you have an RBD named this way “rbd -p rbd ls | grep 
vm-101-disk-2"


root@:~# rbd -p rbd ls | grep vm-101-disk-2
vm-101-disk-2


As I’m not familiar with proxmox so I’d suggest the following:
If yes to 1, for security, copy this file somewhere else and then to 
a

rados -p rbd rm vm-101-disk-2.


root@:~# rbd -p rbd info vm-101-disk-2
rbd image 'vm-101-disk-2':
   size 400 GB in 102400 objects
   order 22 (4096 kB objects)
   block_name_prefix: rbd_data.5e7d1238e1f29
   format: 2
   features: layering, exclusive-lock, object-map, fast-diff, 
deep-flatten

   flags:

The VM with the id "101" is up and running. This is using 
"vm-101-disk-2" as disk - i have moved the disk sucessfully in another 
way :) (same name :/) after "rados put" did not worked. And as we can 
see here the objects for this image also exists within ceph


root@:~# rados -p rbd ls | grep "rbd_data.5e7d1238e1f29" | wc -l
53011

I assumed here to get 102400 objects but as ceph is doing thin 
provisining this should be ok.



If no to 1, for security, copy this file somewhere else and then to a
rm -rf vm-101-disk-2__head_383C3223__0


I should be able to delete the mentioned "100G file".


Make sure all your PG copies show the same content and wait for the
next scrub to see what is happening.


Will make a backup of this file and in addition from the vm within 
proxmox tomorrow on all involved osds and then start a deep-scrub and 
of course keep you informed.



If anything goes wrong you will be able to upload an object with the
exact same content from the file you copied.
Is proxmox using such huge objects for something to your knowledge 
(VM
boot image or something else)? Can you search the proxmox mailing 
list

and open tickets to verify.


As i already wrote in this eMail i guess that i am the cause for this 
:*( with the wrong usage of "rados put".
Proxmox is using librbd to talk with ceph so it should not be able to 
create such a large one file.



And is this the cause of the long deep scrub? I do think so but I’m
not in front of the cluster.


Let it see :) - i hope that my next eMail will close this issue.

Thank you very much for your help!

Best regards,
- Mehmet

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread Mehmet

Hey JC,

after setting up the Ceph cluster I tried to migrate an image from one 
of our production VMs into Ceph via


# rados -p rbd put ...

but I always got "file too large". I guess this file

# -rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 
vm-101-disk-2__head_383C3223__0


is the result of this :) - I did not think that anything would remain 
in Ceph after the mentioned error above.

Seems I was wrong...

This could match the time when the issue happened for the first time...:

1. I tried to put the image via "rados -p rbd put..." - this did not work 
(tried to put a ~400G file...)
2. after ~1 week I saw the blocked requests once the first "deep-scrub" 
ran (at the default time when Ceph starts deep-scrubbing)


I guess deleting this file should solve the issue.
Did you see my mail where I wrote up the test results for this?

# osd_scrub_chunk_max = 5
# osd_deep_scrub_stride = 1048576

Just a side note.


This seems to me more like a pure rados object of 100GB that was
uploaded to the cluster. From the name it could be a VM disk image
that was uploaded as a single object. If it were an RBD data object, its
size would be within the boundaries of RBD objects (order 12 = 4K,
order 25 = 32MB).



Verify that when you do a "rados -p rbd ls | grep vm-101-disk-2”
command, you can see an object named vm-101-disk-2.


root@:~# rados -p rbd ls | grep vm-101-disk-2
rbd_id.vm-101-disk-2
vm-101-disk-2

Verify if you have an RBD named this way “rbd -p rbd ls | grep 
vm-101-disk-2"


root@:~# rbd -p rbd ls | grep vm-101-disk-2
vm-101-disk-2


As I’m not familiar with Proxmox, I’d suggest the following:
If yes to 1, for safety, copy this file somewhere else and then do a
rados -p rbd rm vm-101-disk-2.


root@:~# rbd -p rbd info vm-101-disk-2
rbd image 'vm-101-disk-2':
        size 400 GB in 102400 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.5e7d1238e1f29
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:

The VM with the id "101" is up and running. This is using 
"vm-101-disk-2" as disk - i have moved the disk sucessfully in another 
way :) (same name :/) after "rados put" did not worked. And as we can 
see here the objects for this image also exists within ceph


root@:~# rados -p rbd ls | grep "rbd_data.5e7d1238e1f29" | wc -l
53011

I expected to see 102400 objects here, but as Ceph does thin 
provisioning this should be OK.
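
(A quick back-of-the-envelope check on those numbers, plus a common 
one-liner to get the exact allocated size - the awk sum is just a 
convenience, not something Ceph prints directly:)

# sanity check, assuming plain 4 MB chunks:
echo $(( 400 * 1024 / 4 ))     # = 102400 objects if the image were fully written
echo $(( 53011 * 4 / 1024 ))   # = ~207 GB actually allocated for 53011 objects

# exact allocated size of the image, if needed:
rbd diff rbd/vm-101-disk-2 | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'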



If no to 1, for safety, copy this file somewhere else and then do a
rm -rf vm-101-disk-2__head_383C3223__0


I should be able to delete the mentioned "100G file".
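
A minimal sketch of how I plan to do it, assuming there is ~100 GB of 
free space for the copy (the backup path is just an example):

# double-check we have the right object and note its size
rados -p rbd stat vm-101-disk-2

# keep a copy of the stray object before touching it
rados -p rbd get vm-101-disk-2 /root/backup/vm-101-disk-2.stray

# then remove it from the pool
rados -p rbd rm vm-101-disk-2

As far as I understand the format 2 layout, the real image only consists 
of rbd_id.vm-101-disk-2, its rbd_header.* object and the 
rbd_data.5e7d1238e1f29.* chunks, so removing the object literally named 
"vm-101-disk-2" should not touch the running disk - but the backup is 
there just in case.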


Make sure all your PG copies show the same content and wait for the
next scrub to see what is happening.


I will make a backup of this file, and also of the VM within 
Proxmox, tomorrow on all involved OSDs, and then start a deep-scrub and of 
course keep you informed.



If anything goes wrong you will be able to upload an object with the
exact same content from the file you copied.

Is Proxmox using such huge objects for something, to your knowledge (a VM
boot image or something else)? Can you search the Proxmox mailing list
and open tickets to verify?


As I already wrote in this e-mail, I guess that I am the cause of this 
:*( through the wrong usage of "rados put".
Proxmox uses librbd to talk to Ceph, so it should not be able to 
create such a large single object.



And is this the cause of the long deep scrub? I do think so but I’m
not in front of the cluster.


Let's see :) - I hope that my next e-mail will close this issue.

Thank you very much for your help!

Best regards,
- Mehmet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread Mehmet

Hello JC,

in short, for the record:


What you can try doing is to change the following settings on all the
OSDs that host this particular PG and see if it makes things better



[osd]

[...]

osd_scrub_chunk_max = 5
    # Maximum number of chunks the scrub will process in one go. Defaults
    # to 25.
osd_deep_scrub_stride = 1048576
    # Read size during scrubbing operations. The idea here is to do fewer
    # chunks but bigger sequential reads. Defaults to 512KB = 524288.

[...]

I have tried this to see if it would help when object sizes are really 
big ^^ - I know they should not be, and I guess normally they will not be :)


In ceph.conf (on all nodes!) I have added

[...]
[osd]
osd_scrub_chunk_max = 5
osd_deep_scrub_stride = 1048576

and restarted

systemctl restart ceph-mon@*
systemctl restart ceph-osd@*

on each OSD and MON node.
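
(As a side note: the same two values can presumably also be injected into 
the running OSDs without a restart - not persistent across daemon 
restarts, so keeping them in ceph.conf as above is still needed. A sketch:)

ceph tell osd.* injectargs '--osd_scrub_chunk_max 5 --osd_deep_scrub_stride 1048576'

# verify on one daemon via its admin socket
ceph daemon osd.17 config show | grep -E 'osd_scrub_chunk_max|osd_deep_scrub_stride'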

Then i have done "ceph pg deep-scrub 0.223" again.
But again slow/blocked requests :*(

The running config:
root@:~# ceph daemon osd.17 config show | grep scrub
[...]
"osd_scrub_chunk_min": "5",
"osd_scrub_chunk_max": "5",
[...]
"osd_deep_scrub_stride": "1048576",
[...]

root@:~# ceph --show-config | grep scrub
[...]
osd_scrub_chunk_min = 5
osd_scrub_chunk_max = 25
[...]
osd_deep_scrub_stride = 524288
[...]

The last command does not seem to show the actual running config - but 
that is another story. (AFAIK "ceph --show-config" reports the config as 
the ceph CLI client itself computes it - the [osd] section does not apply 
to it - so "ceph daemon osd.X config show" is the output to trust here.)

This was only to keep you informed.

Ah :) I see you wrote me an answer just now ^^. Will respond 
tomorrow - family is waiting ;)

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread Mehmet

Hey JC,

thank you very much! - My answers inline :)

On 2016-08-26 19:26, LOPEZ Jean-Charles wrote:

Hi Mehmet,

what is interesting in the PG stats is that the PG contains around
700+ objects, and IIRC you said that you are using RBD only in your
cluster. With the default RBD order (4MB objects) this would consume
around 2.8GB, yet the PG stats show there is over 100GB of data.


Right, we are actually using this cluster only for KVM disk images - 
accessed through librbd (using Proxmox).

That is really strange - how can that happen?


So here are some extra questions so I can try to imagine better into
more details what could go wrong:
- Have you changed the object allocation size for some RBDs in your
environment, or do you always use the default RBD image order?


we did not change the allocation. All defaults - except setting tunables 
to "jewel" after setting up our Ceph cluster.



- Do you snapshot your RBDs?



- If the answer is yes, is there on particular set of RBDs you
snapshot more often and keep the snapshot for a longer amount of time?


Yes (via Proxmox), but actually only before doing updates within the VMs. 
Normally we delete snapshots after a few days. There are no snapshots at 
the moment - the last one was taken perhaps a month ago and has been 
deleted.



I’d recommend that you inspect the pg 0.223 head directory to find out
which RBD Image objects it is hosting. Issue an ls command on
/var/lib/ceph/osd/ceph-9/current/0.223_head and identify the unique RBD
Image headers. Match those headers against the header you can find for
each RBD Image in your cluster (you can check that with an rbd info
command and by looking at the RBD prefix output). This will
let you know for sure which RBD Images use that particular PG, and then
check the snapshot settings and inspect the order setting for all of
them.


That was a very good point, so this time I had a closer look at the 
objects within

- /var/lib/ceph/osd/ceph-9/current/0.223_head/DIR_3/DIR_2/DIR_2

and found this:

and found this:

root@osdserver1:/var/lib/ceph/osd/ceph-9/current/0.223_head/DIR_3/DIR_2/DIR_2# 
ls -lah | grep -v "4.0M"

total 101G
drwxr-xr-x 2 ceph ceph  16K Aug 29 09:38 .
drwxr-xr-x 6 ceph ceph  24K Jul 31 01:04 ..
-rw-r--r-- 1 ceph ceph  64K Jul 31 00:44 
benchmark\udata\uinkscope.domain.tld\u20940\uobject3394__head_4E4D8223__0
-rw-r--r-- 1 ceph ceph  64K Jul 31 00:44 
benchmark\udata\uinkscope.domain.tld\u22557\uobject2956__head_B9E34223__0
-rw-r--r-- 1 ceph ceph  64K Jul 31 00:44 
benchmark\udata\uinkscope.domain.tld\u22580\uobject214__head_329AE223__0
-rw-r--r-- 1 ceph ceph  64K Jul 31 00:44 
benchmark\udata\uinkscope.domain.tld\u22580\uobject588__head_31751223__0
-rw-r--r-- 1 ceph ceph  64K Jul 31 00:44 
benchmark\udata\uinkscope.domain.tld\u25893\uobject2770__head_7A2F2223__0
-rw-r--r-- 1 ceph ceph  64K Aug  5 13:52 
benchmark\udata\uinkscope.domain.tld\u3982\uobject200__head_9AA23223__0

-rw-r--r-- 1 ceph ceph    0 Jul 31 00:31 __head_0223__0
-rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 vm-101-disk-2__head_383C3223__0

*vm-101-disk-2__head_383C3223__0* seems wrong here and also exists on the 
replica OSDs for this PG.
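
For the record, a rough sketch of the matching procedure JC described - 
the grep pattern is loose on purpose because filestore escapes '_' as 
'\u' in the on-disk file names:

# unique rbd_data prefixes present in this PG directory
ls -R /var/lib/ceph/osd/ceph-9/current/0.223_head/ | grep -oE 'rbd.udata\.[0-9a-f]+' | sort -u

# map each prefix back to an image name
for img in $(rbd -p rbd ls); do
  printf '%s: ' "$img"
  rbd -p rbd info "$img" | grep block_name_prefix
done

Interestingly, this stray object would also roughly explain the 13-15 
minute scrub times seen earlier: ~100 GB read at the 130-170 MB the atop 
logs showed works out to something in the order of 10-13 minutes.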



What you can try doing is to change the following settings on all the
OSDs that host this particular PG and see if it makes things better



[osd]
osd_scrub_begin_hour = {Hour scrubbing starts}
    # Together with osd_scrub_end_hour this lets you define the best window
    # for scrubbing and hence reduce the impact on the VMs at a crucial
    # moment of the day. Defaults to 0.
osd_scrub_end_hour = {Hour scrubbing stops}
    # Values are expressed in 24 hour format for both. Begin=3 End=4 means
    # scrubbing will only take place between 3 and 4 in the morning.
    # Defaults to 24.
osd_scrub_sleep = .1
    # Introduce a delay between scrub operations to give the OSD more disk
    # time to perform client operations. Defaults to 0.0.
osd_scrub_chunk_max = 5
    # Maximum number of chunks the scrub will process in one go. Defaults
    # to 25.
osd_deep_scrub_stride = 1048576
    # Read size during scrubbing operations. The idea here is to do fewer
    # chunks but bigger sequential reads. Defaults to 512KB = 524288.


Thanks for these suggestions.
These are already in my notes, to be set up once the cluster is running 
fine without any issues.
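
(For the record, a concrete example of such an [osd] section - the 
03:00-07:00 window and the sleep value are just an illustration, only the 
chunk and stride values come from the discussion above:)

[osd]
osd_scrub_begin_hour = 3
osd_scrub_end_hour = 7
osd_scrub_sleep = 0.1
osd_scrub_chunk_max = 5
osd_deep_scrub_stride = 1048576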



If your devices use cfq as an elevator, you can add the following 2
lines in the section to lower the priority of the scrubbing read IOs:
osd_disk_thread_ioprio_class = idle
    # Change the cfq priority class for the scrub thread. Default is the
    # same priority class as the OSD.
osd_disk_thread_ioprio_priority = 0
    # Change the priority within the class to the lowest possible. Default
    # is the same priority as the OSD.


Actually they are on the default (deadline), but it seems worth changing 
this to cfq.
We will have a closer look at this once the cluster is working without 
any issues.
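
(A small sketch of what that change would look like - sdk is just a 
placeholder for one of the OSD data disks, and the echo is not persistent 
across reboots:)

# check the current elevator for a given OSD disk
cat /sys/block/sdk/queue/scheduler

# switch it to cfq at runtime; the osd_disk_thread_ioprio_* settings above
# only take effect with cfq
echo cfq > /sys/block/sdk/queue/scheduler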



Keep me posted

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-26 Thread Mehmet

Hello JC,

as promised, here are my:

- ceph.conf (I have done a "diff" on all involved servers - all using the 
same ceph.conf) = ceph_conf.txt

- ceph pg 0.223 query = ceph_pg_0223_query_20161236.txt
- ceph -s = ceph_s.txt
- ceph df = ceph_df.txt
- ceph osd df = ceph_osd_df.txt
- ceph osd dump | grep pool = ceph_osd_dump_pool.txt
- ceph osd crush rule dump = ceph_osd_crush_rule_dump.txt

as attached txt files.

I have done again a "ceph pg deep-scrub 0.223" before I have created the 
files above. The issue still exists today ~ 12:24 on ... :*(

The deep-scrub on this pg has taken ~14 minutes:

- 2016-08-26 12:24:01.463411 osd.9 172.16.0.11:6808/29391 1777 : cluster 
[INF] 0.223 deep-scrub starts
- 2016-08-26 12:38:07.201726 osd.9 172.16.0.11:6808/29391 2485 : cluster 
[INF] 0.223 deep-scrub ok


Ceph: version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
OS: Ubuntu 16.04 LTS (Linux osdserver1 4.4.0-31-generic #50-Ubuntu SMP 
Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux)


As a remark, assuming the size parameter of the rbd pool is set to 3, 
the number of PGs in your cluster should be higher


I know I could increase this to 2048 (with 30 OSDs). But perhaps we will 
create further pools, so I did not want to set this too high for this pool, 
because it is not possible to decrease the PG count of a pool.
Furthermore, if I changed this now and the issue went away, we would 
not know what the cause was... :)
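
(For reference, the usual sizing guideline of roughly 100 PGs per OSD, 
divided by the pool's replica count and rounded up to a power of two, 
gives for this cluster:)

echo $(( 30 * 100 / 3 ))   # = 1000 -> next power of two is 1024; 2048 only if more growth is expected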


If you need further information, please do not hesitate to ask - I 
will provide it as soon as possible.
Please keep in mind that I have 1 additional disk on each OSD node (3 
disks in sum) which I can add to the cluster, so that the acting set for 
this PG could change.
I removed these before, to force other OSDs to become the acting set 
for PG 0.223.


Thank you, your help is very appreciated!

- Mehmet

On 2016-08-25 13:58, c...@elchaka.de wrote:

Hey JC,

 Thank you very much for your mail!

 I will provide the information tomorrow when I am at work again.

 Hope that we will find a solution :)

 - Mehmet

On 24 August 2016 at 16:58:58 MESZ, LOPEZ Jean-Charles wrote:


Hi Mehmet,

I’m just seeing your message and read the thread going with it.

Can you please provide me with a copy of the ceph.conf file on the
MON and OSD side, assuming it’s identical, and if the ceph.conf file
is different on the client side (the VM side), can you please provide
me with a copy of that as well.

Can you also provide me, as attached txt files, with:
- output of your pg query of pg 0.223
- output of ceph -s
- output of ceph df
- output of ceph osd df
- output of ceph osd dump | grep pool
- output of ceph osd crush rule dump

Thank you and I’ll see if I can get something to ease your pain.

As a remark, assuming the size parameter of the rbd pool is set to
3, the number of PGs in your cluster should be higher

If we manage to move forward and get it fixed, we will repost to the
mailing list the changes we made to your configuration.

Regards
JC

On Aug 24, 2016, at 06:41, Mehmet  wrote:

Hello Guys,

the issue still exists :(

If we run a "ceph pg deep-scrub 0.223" nearly all VMs stop for a
while (blocked requests).

- we already replaced the OSDs (SAS Disks - journal on NVMe)
- Removed OSDs so that acting set for pg 0.223 has changed
- checked the filesystem on the acting OSDs
- changed the tunables back from jewel to default
- changed the tunables again to jewel from default
- done a deep-scrub on the whole OSDs (ceph osd deep-scrub osd.)
- only when a deep-scrub on pg 0.223 runs do we get blocked requests

The deep-scrub on pg 0.223 always took 13-15 minutes to finish. It
does not matter which OSDs are in the acting set for this PG.

So, I don't have any idea what could be the issue here.

As long as "ceph osd set nodeep-scrub" is set - so that no
deep-scrub on 0.223 is running - the cluster is fine!

Could this be a bug?

ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
Kernel: 4.4.0-31-generic #50-Ubuntu

Any ideas?
- Mehmet

On 2016-08-02 17:57, c wrote:
On 2016-08-02 13:30, c wrote:
Hello Guys,
this time without the original acting-set osd.4, 16 and 28. The
issue
still exists...
[...]
For the record, this ONLY happens with this PG and no others that
share
the same OSDs, right?
Yes, right.

 [...]


When doing the deep-scrub, monitor (atop, etc) all 3 nodes and
see if a
particular OSD (HDD) stands out, as I would expect it to.
Now I logged all disks via atop each 2 seconds while the deep-scrub
was running ( atop -w osdXX_atop 2 ).
As you expected, all disks were 100% busy - with a constant 150MB
(osd.4), 130MB (osd.28) and 170MB (osd.16)...
- osd.4 (/dev/sdf): http://slexy.org/view/s21emd2u6j [1]
- osd.16 (/dev/sdm): http://slexy.org/view/s20vukWz5E [2]
- osd.28 (/dev/sdh): http://slexy.org/view/s20YX0lzZY [3]
[...]
But what is causing this? A deep-scrub on all other disks - same
model and ordered at the same time - seems to not have this issue.

Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-24 Thread Mehmet

Hello Guys,

the issue still exists :(

If we run a "ceph pg deep-scrub 0.223" nearly all VMs stop for a while 
(blocked requests).


- we already replaced the OSDs (SAS Disks - journal on NVMe)
- Removed OSDs so that acting set for pg 0.223 has changed
- checked the filesystem on the acting OSDs
- changed the tunables back from jewel to default
- changed the tunables again to jewel from default
- done a deep-scrub on the whole OSDs (ceph osd deep-scrub osd.) - 
only when a deep-scrub on pg 0.223 runs do we get blocked requests


The deep-scrub on PG 0.223 always took 13-15 minutes to finish. It does not 
matter which OSDs are in the acting set for this PG.


So, I don't have any idea what could be the issue here.

As long as "ceph osd set nodeep-scrub" is set - so that no deep-scrub on 
0.223 is running - the cluster is fine!
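
(For reference, the flag handling around this is simply:)

ceph osd set nodeep-scrub       # cluster-wide flag: stops scheduled deep-scrubs
ceph -s | grep nodeep-scrub     # the flag shows up as a health warning while set
ceph osd unset nodeep-scrub     # re-enable automatic deep-scrubbing later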


Could this be a bug?

ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
Kernel: 4.4.0-31-generic #50-Ubuntu

Any ideas?
- Mehmet



On 2016-08-02 17:57, c wrote:

On 2016-08-02 13:30, c wrote:

Hello Guys,

this time without the original acting-set osd.4, 16 and 28. The issue
still exists...

[...]

For the record, this ONLY happens with this PG and no others that
share
the same OSDs, right?


Yes, right.

[...]

When doing the deep-scrub, monitor (atop, etc) all 3 nodes and
see if a
particular OSD (HDD) stands out, as I would expect it to.


Now I logged all disks via atop each 2 seconds while the deep-scrub
was running ( atop -w osdXX_atop 2 ).
As you expected, all disks were 100% busy - with a constant 150MB
(osd.4), 130MB (osd.28) and 170MB (osd.16)...

- osd.4 (/dev/sdf) http://slexy.org/view/s21emd2u6j [1]
- osd.16 (/dev/sdm): http://slexy.org/view/s20vukWz5E [2]
- osd.28 (/dev/sdh): http://slexy.org/view/s20YX0lzZY [3]
[...]
But what is causing this? A deep-scrub on all other disks - same
model and ordered at the same time - seems to not have this issue.

[...]

Next week, I will do this

1.1 Remove osd.4 completely from Ceph - again (the current primary
for PG 0.223)


osd.4 is now removed completely.
The primary for this PG is now "osd.9".

# ceph pg map 0.223
osdmap e8671 pg 0.223 (0.223) -> up [9,16,28] acting [9,16,28]
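
(Roughly the usual manual removal sequence from the docs, sketched here 
with osd.4 as the example - the exact steps may have differed a bit:)

ceph osd out osd.4            # let the cluster rebalance the data away first
systemctl stop ceph-osd@4     # once it is safe, stop the daemon
ceph osd crush remove osd.4   # remove it from the CRUSH map
ceph auth del osd.4           # remove its cephx key
ceph osd rm osd.4             # remove it from the OSD map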


1.2 xfs_repair -n /dev/sdf1 (osd.4): to check for possible errors


xfs_repair did not find/show any error


1.3 ceph pg deep-scrub 0.223
- Log with " ceph tell osd.4,16,28 injectargs "--debug_osd 5/5"


Because osd.9 is now the primary for this PG, I have set debug_osd on it
too:

ceph tell osd.9 injectargs "--debug_osd 5/5"

and ran the deep-scrub on 0.223 (and again nearly all of my VMs stopped
working for a while)
Start @ 15:33:27
End @ 15:48:31
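
(Side note: to watch the impact while such a manual deep-scrub runs, 
something like this in a second terminal is enough - osd.9 being the 
current primary:)

ceph -w                                # live cluster log: scrub start/ok and slow request warnings
ceph health detail | grep -i blocked   # which OSDs currently report blocked requests
ceph daemon osd.9 dump_ops_in_flight   # what the primary is actually working on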

The "ceph.log"
- http://slexy.org/view/s2WbdApDLz

The related log files (OSDs 9, 16 and 28) and the atop logs for 
the OSDs:


LogFile - osd.9 (/dev/sdk)
- ceph-osd.9.log: http://slexy.org/view/s2kXeLMQyw
- atop Log: http://slexy.org/view/s21wJG2qr8

LogFile - osd.16 (/dev/sdh)
- ceph-osd.16.log: http://slexy.org/view/s20D6WhD4d
- atop Log: http://slexy.org/view/s2iMjer8rC

LogFile - osd.28 (/dev/sdm)
- ceph-osd.28.log: http://slexy.org/view/s21dmXoEo7
- atop log: http://slexy.org/view/s2gJqzu3uG


2.1 Remove osd.16 completely from Ceph


osd.16 is now removed completely - now replaced with osd.17 within
the acting set.

# ceph pg map 0.223
osdmap e9017 pg 0.223 (0.223) -> up [9,17,28] acting [9,17,28]


2.2 xfs_repair -n /dev/sdh1


xfs_repair did not find/show any error


2.3 ceph pg deep-scrub 0.223
- Log with " ceph tell osd.9,17,28 injectargs "--debug_osd 5/5"


and ran the deep-scrub on 0.223 (and again nearly all of my VMs stopped
working for a while)

Start @ 2016-08-02 10:02:44
End @ 2016-08-02 10:17:22

The "Ceph.log": http://slexy.org/view/s2ED5LvuV2

LogFile - osd.9 (/dev/sdk)
- ceph-osd.9.log: http://slexy.org/view/s21z9JmwSu
- atop Log: http://slexy.org/view/s20XjFZFEL

LogFile - osd.17 (/dev/sdi)
- ceph-osd.17.log: http://slexy.org/view/s202fpcZS9
- atop Log: http://slexy.org/view/s2TxeR1JSz

LogFile - osd.28 (/dev/sdm)
- ceph-osd.28.log: http://slexy.org/view/s2eCUyC7xV
- atop log: http://slexy.org/view/s21AfebBqK


3.1 Remove osd.28 completely from Ceph


Now osd.28 is also removed completely from Ceph - now replaced with 
osd.23


# ceph pg map 0.223
osdmap e9363 pg 0.223 (0.223) -> up [9,17,23] acting [9,17,23]


3.2 xfs_repair -n /dev/sdm1


As expected: xfs_repair did not find/show any error


3.3 ceph pg deep-scrub 0.223
- Log with " ceph tell osd.9,17,23 injectargs "--debug_osd 5/5"


... again nearly all of my VMs stopped working for a while...

Now all "original" OSDs (4,16,28) which were in the acting set when I
wrote my first e-mail to this mailing list are removed. But the issue
still exists with different OSDs (9,17,23) as the acting set, while the
questionable PG 0.223 is still the same!

Suspecting that the "tunables" could be the cause, I have now