[ceph-users] krbd upmap support on kernel-4.16 ?

2018-05-22 Thread Heðin Ejdesgaard Møller
Hello,

I have a test env. with a single centos-7.5 ceph server and one rbd client, 
running Fedora 28.

I have set the minimum compatibility level to luminous, as shown in ceph osd 
dump below.

When I map rbd/demo with krbd, it shows up as a "Jewel" client in
ceph features.
When I map rbd/demo with rbd-nbd 12.2.5, it shows up as a
"Luminous" client in ceph features.

I have 2 questions:
#1 Does the krbd "Jewel" client understand upmap?
#2 How can I enforce the krbd client to present itself as Luminous-compliant?
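
(For reference, the compat floor is inspected and raised with the commands below; it is already set to luminous on this cluster, as the osd dump further down shows. Whether the kernel client then reports the luminous feature bits is exactly the open question here.)

# ceph osd dump | grep min_compat_client
# ceph features
# ceph osd set-require-min-compat-client luminous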


Details are pasted below.

Regards
Heðin

Server is CentOS-7.5
Client is Fedora 28
Both are VM's.

 Server data 
# ceph -v
ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)

# ceph osd dump
epoch 55
fsid 3d9ea97f-3490-4266-894f-e770a742ec9e
created 2018-05-13 20:52:22.288571
modified 2018-05-14 00:08:31.955018
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 15
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client luminous
require_osd_release luminous
pool 1 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins 
pg_num 256 pgp_num 256 last_change 38 flags
hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
max_osd 5
osd.0 up   in  weight 1 up_from 54 up_thru 54 down_at 53 last_clean_interval 
[46,49) 10.10.1.109:6809/2095
10.10.1.109:6810/2095 10.10.1.109:6811/2095 10.10.1.109:6812/2095 exists,up 
af0e4141-6396-426d-ab3b-1fd94451a205
osd.1 up   in  weight 1 up_from 51 up_thru 54 down_at 50 last_clean_interval 
[45,49) 10.10.1.109:6800/1477
10.10.1.109:6801/1477 10.10.1.109:6802/1477 10.10.1.109:6803/1477 exists,up 
33ef93fc-62ad-4a4f-a0e0-1f2abf511255
osd.2 up   in  weight 1 up_from 54 up_thru 54 down_at 50 last_clean_interval 
[46,49) 10.10.1.109:6817/2153
10.10.1.109:6818/2153 10.10.1.109:6819/2153 10.10.1.109:6820/2153 exists,up 
88d7464e-ff15-4549-9555-db5d21c0ad9c
osd.3 up   in  weight 1 up_from 54 up_thru 54 down_at 50 last_clean_interval 
[47,49) 10.10.1.109:6805/2071
10.10.1.109:6806/2071 10.10.1.109:6807/2071 10.10.1.109:6808/2071 exists,up 
3ab68921-008c-476b-8d2c-e681571218c8
osd.4 up   in  weight 1 up_from 54 up_thru 54 down_at 50 last_clean_interval 
[46,49) 10.10.1.109:6813/2144
10.10.1.109:6814/2144 10.10.1.109:6815/2144 10.10.1.109:6816/2144 exists,up 
ba7ebc90-9013-41d7-82ca-e10f38f3189b
pg_upmap_items 1.1 [4,0,3,0]
pg_upmap_items 1.4 [3,0,1,0]

# rbd info rbd/demo
rbd image 'demo':
size 6144 MB in 1536 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.104774b0dc51
format: 2
features: layering
flags:
create_timestamp: Sun May 13 21:24:42 2018

# rbd info rbd/demo2
rbd image 'demo2':
size 6144 MB in 1536 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.5e8074b0dc51
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Mon May 14 00:39:40 2018


# ceph features  (client mapped with: rbd map rbd/demo)
{
"mon": {
"group": {
"features": "0x1ffddff8eea4fffb",
"release": "luminous",
"num": 1
}
},
"osd": {
"group": {
"features": "0x1ffddff8eea4fffb",
"release": "luminous",
"num": 5
}
},
"client": {
"group": {
"features": "0x7010fb86aa42ada",
"release": "jewel",
"num": 1
},
"group": {
"features": "0x1ffddff8eea4fffb",
"release": "luminous",
"num": 2
}
}
}

# ceph features  (client mapped with: rbd-nbd map rbd/demo)
{
"mon": {
"group": {
"features": "0x1ffddff8eea4fffb",
"release": "luminous",
"num": 1
}
},
"osd": {
"group": {
"features": "0x1ffddff8eea4fffb",
"release": "luminous",
"num": 5
}
},
"client": {
"group": {
"features": "0x1ffddff8eea4fffb",
"release": "luminous",
"num": 3
}
}
}


 Client data 
# uname -r
4.16.7-300.fc28.x86_64

# rbd -v
ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)

# dnf list installed|grep -i -E "ceph|rbd"
ceph-common.x86_64 1:12.2.5-1.fc28 @updates
ceph-deploy.noarch 1.5.32-5.fc28   @fedora
libcephfs2.x86_64  1:12.2.5-1.fc28 @updates
librbd1.x86_64 1:12.2.5-1.fc28 @updates
libvirt-daemon-driver-storage-rbd.x86_64   4.1.0-2.fc28@anaconda
python-cephfs.x86_64   1:12.2.5-1.fc28 @updates
python-rbd.x86_64

Re: [ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread Pardhiv Karri
Hi,

Here is the complete crush map that is in use.

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15
device 16 osd.16
device 17 osd.17
device 18 osd.18
device 19 osd.19
device 20 osd.20
device 21 osd.21
device 22 osd.22
device 23 osd.23
device 24 osd.24
device 25 osd.25
device 26 osd.26
device 27 osd.27
device 28 osd.28
device 29 osd.29
device 30 osd.30
device 31 osd.31
device 32 osd.32
device 33 osd.33
device 34 osd.34
device 35 osd.35
device 36 osd.36
device 37 osd.37
device 38 osd.38
device 39 osd.39

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host or1010051251040 {
id -3 # do not change unnecessarily
# weight 20.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item osd.0 weight 2.000 pos 0
item osd.1 weight 2.000 pos 1
item osd.2 weight 2.000 pos 2
item osd.3 weight 2.000 pos 3
item osd.4 weight 2.000 pos 4
item osd.5 weight 2.000 pos 5
item osd.6 weight 2.000 pos 6
item osd.7 weight 2.000 pos 7
item osd.8 weight 2.000 pos 8
item osd.9 weight 2.000 pos 9
}
host or1010051251044 {
id -8 # do not change unnecessarily
# weight 20.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item osd.30 weight 2.000 pos 0
item osd.31 weight 2.000 pos 1
item osd.32 weight 2.000 pos 2
item osd.33 weight 2.000 pos 3
item osd.34 weight 2.000 pos 4
item osd.35 weight 2.000 pos 5
item osd.36 weight 2.000 pos 6
item osd.37 weight 2.000 pos 7
item osd.38 weight 2.000 pos 8
item osd.39 weight 2.000 pos 9
}
rack rack_A1 {
id -2 # do not change unnecessarily
# weight 40.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item or1010051251040 weight 20.000 pos 0
item or1010051251044 weight 20.000 pos 1
}
host or1010051251041 {
id -5 # do not change unnecessarily
# weight 20.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item osd.10 weight 2.000 pos 0
item osd.11 weight 2.000 pos 1
item osd.12 weight 2.000 pos 2
item osd.13 weight 2.000 pos 3
item osd.14 weight 2.000 pos 4
item osd.15 weight 2.000 pos 5
item osd.16 weight 2.000 pos 6
item osd.17 weight 2.000 pos 7
item osd.18 weight 2.000 pos 8
item osd.19 weight 2.000 pos 9
}
host or1010051251045 {
id -9 # do not change unnecessarily
# weight 0.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
}
rack rack_B1 {
id -4 # do not change unnecessarily
# weight 20.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item or1010051251041 weight 20.000 pos 0
item or1010051251045 weight 0.000 pos 1
}
host or1010051251042 {
id -7 # do not change unnecessarily
# weight 20.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item osd.20 weight 2.000 pos 0
item osd.21 weight 2.000 pos 1
item osd.22 weight 2.000 pos 2
item osd.23 weight 2.000 pos 3
item osd.24 weight 2.000 pos 4
item osd.25 weight 2.000 pos 5
item osd.26 weight 2.000 pos 6
item osd.27 weight 2.000 pos 7
item osd.28 weight 2.000 pos 8
item osd.29 weight 2.000 pos 9
}
host or1010051251046 {
id -10 # do not change unnecessarily
# weight 0.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
}
host or1010051251023 {
id -11 # do not change unnecessarily
# weight 0.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
}
rack rack_C1 {
id -6 # do not change unnecessarily
# weight 20.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item or1010051251042 weight 20.000 pos 0
item or1010051251046 weight 0.000 pos 1
item or1010051251023 weight 0.000 pos 2
}
host or1010051251048 {
id -12 # do not change unnecessarily
# weight 0.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
}
rack rack_D1 {
id -13 # do not change unnecessarily
# weight 0.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item or1010051251048 weight 0.000 pos 0
}
root default {
id -1 # do not change unnecessarily
# weight 80.000
alg tree # do not change pos for existing items unnecessarily
hash 0 # rjenkins1
item rack_A1 weight 40.000 pos 0
item rack_B1 weight 20.000 pos 1
item rack_C1 weight 20.000 pos 2
item rack_D1 weight 0.000 pos 3
}

# rules
rule replicated_ruleset {
ruleset 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type rack
step emit
}

# end crush map
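
(For reference, one way to sanity-check how this map places PGs, offline and without touching the cluster; a sketch only -- file names are arbitrary and the rule/replica count should match the pools in question:)

# ceph osd getcrushmap -o crush.bin
# crushtool -d crush.bin -o crush.txt
# crushtool -i crush.bin --test --rule 0 --num-rep 3 --show-utilization
# crushtool -i crush.bin --test --rule 0 --num-rep 3 --show-mappings | head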

Thanks,
Pardhiv Karri

On Tue, May 22, 2018 at 9:58 AM, Pardhiv 

Re: [ceph-users] Web panel is failing where create rpm

2018-05-22 Thread John Spray
On Tue, May 22, 2018 at 6:38 PM, Antonio Novaes wrote:
> Hi people,
> I need your help.
> I am trying to build the Ceph Calamari RPM packages but I get an error.
> The calamari-server package builds OK, but Diamond gets an error.
> I am sending the log.
> My Ceph itself is OK: I can create a pool, put a file, remove a file and remove
> the pool successfully.
> But building the RPM for the web panel is failing.
> I am using vagrant and following the doc https://ceph.com/category/ceph-gui/

I'm not sure if anyone will have any tips or not, but that article is
over three years old, and Calamari hasn't been worked on for quite
some time.

Since Ceph 12.x, there has been a built-in web dashboard (much
improved in 13.x): http://docs.ceph.com/docs/mimic/mgr/dashboard/
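
(On a 12.x cluster the dashboard is a mgr module and just needs enabling -- a minimal sketch, with the bind address and port left at their defaults, which in 12.x means the active mgr serves it on port 7000:)

# ceph mgr module enable dashboard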

John

> Att,
> Antonio Novaes de C. Jr
> Analista TIC - Sistema e Infraestrutura
> Especialista em Segurança de Rede de Computadores
> Information Security Foundation based on ISO/IEC 27002 | ISFS
> EXIN Cloud Computing (CLOUDF)
> Red Hat Certified Engineer (RHCE)
> Red Hat Certified Jboss Administrator (RHCJA)
> Linux Certified Engineer (LPIC-2)
> Novell Certified Linux Administrator (SUSE CLA)
> ID Linux: 481126 | LPI000255169
> LinkedIN: Perfil Público
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Web panel is failing where create rpm

2018-05-22 Thread Antonio Novaes
The error:

--
  ID: cp-artifacts-up Diamond/dist/diamond-*.rpm
Function: cmd.run
Name: cp Diamond/dist/diamond-*.rpm /git
  Result: False
 Comment: Command "cp Diamond/dist/diamond-*.rpm /git" run
 Started: 14:26:08.056879
Duration: 20.077 ms
 Changes:
  --
  pid:
  13258
  retcode:
  1
  stderr:
  cp: cannot stat 'Diamond/dist/diamond-*.rpm': No such
file or directory
  stdout:

Summary for local

Succeeded: 4 (changed=4)
Failed:2



Att,
Antonio Novaes de C. Jr
Analista TIC - Sistema e Infraestrutura
Especialista em Segurança de Rede de Computadores
Information Security Foundation based on ISO/IEC 27002 | ISFS
Red Hat Certified Engineer (RHCE)
Red Hat Certified Jboss Administrator (RHCJA)
Linux Certified Engineer (LPIC-2)
Novell Certified Linux Administrator (SUSE CLA)
ID Linux: 481126 | LPI000255169

On Tue, 22 May 2018 at 14:38, Antonio Novaes wrote:

> Hi people,
> I need your help.
> I am trying to build the Ceph Calamari RPM packages but I get an error.
> The calamari-server package builds OK, but Diamond gets an error.
> I am sending the log.
> My Ceph itself is OK: I can create a pool, put a file, remove a file and remove
> the pool successfully.
> But building the RPM for the web panel is failing.
> I am using vagrant and following the doc
> https://ceph.com/category/ceph-gui/
>
> Att,
> Antonio Novaes de C. Jr
> Analista TIC - Sistema e Infraestrutura
> Especialista em Segurança de Rede de Computadores
> Information Security Foundation based on ISO/IEC 27002 | ISFS
> EXIN Cloud Computing (CLOUDF)
> Red Hat Certified Engineer (RHCE)
> Red Hat Certified Jboss Administrator (RHCJA)
> Linux Certified Engineer (LPIC-2)
> Novell Certified Linux Administrator (SUSE CLA)
> ID Linux: 481126 | LPI000255169
> LinkedIN: Perfil Público
> 
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Data recovery after losing all monitors

2018-05-22 Thread Frank Li
Just having reliable hardware isn't enough to guard against monitor failures. I've
had a case where a wrongly typed command brought down all three monitors via a
segfault, with no way to bring them back since the command had corrupted the
monitor database. I wish there were checkpoints in the monitor database so we
could revert changes. I'm not even sure a regular backup of the monitor database,
say every 5 minutes, would have helped, as it could still leave the OSDs and
monitors out of sync. I've also tried the method of restoring the monitor database
via ceph-objectstore-tool, but just ended up with out-of-sync OSDs and monitors,
where the monitors think an OSD is offline while the OSD is actually up, not to
mention the PGs were all out of whack as well.

https://tracker.ceph.com/issues/22847

--
Efficiency is Intelligent Laziness

From: ceph-users  on behalf of Caspar Smit 

Date: Tuesday, May 22, 2018 at 7:05 AM
To: ceph-users 
Subject: Re: [ceph-users] Data recovery after losing all monitors

2018-05-22 15:51 GMT+02:00 Wido den Hollander 
>:


On 05/22/2018 03:38 PM, George Shuklin wrote:
> Good news, it's not an emergency, just a curiosity.
>
> Suppose I lost all monitors in a ceph cluster in my laboratory. I have
> all OSDs intact. Is it possible to recover something from Ceph?

Yes, there is. Using ceph-objectstore-tool you are able to rebuild the
MON database.

BUT, this isn't something you would really want to do, as you lose your
cephx keys and such, and getting them all back will be a total nightmare.

My advice, make sure you have reliable hardware for your Monitors. Run
them on DC-grade SSDs and you'll be fine.

And be sure to have enough space available on them to sustain a long period of
PGs not being active+clean.
Kind regards,
Caspar

Wido

>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Web panel is failing where create rpm

2018-05-22 Thread Antonio Novaes
Hi people,
I need help for you.
I trying create package rpm ceph calamari but i get error.
The package of calamari server is OK, but Diamond get error.
I send log
My ceph is OK, create pool, put file remove file and remove pool with
sucessful.
But create rpm for web panel is failing
I am using vagrant and following the doc https://ceph.com/category/ceph-gui/

Att,
Antonio Novaes de C. Jr
Analista TIC - Sistema e Infraestrutura
Especialista em Segurança de Rede de Computadores
Information Security Foundation based on ISO/IEC 27002 | ISFS
EXIN Cloud Computing (CLOUDF)
Red Hat Certified Engineer (RHCE)
Red Hat Certified Jboss Administrator (RHCJA)
Linux Certified Engineer (LPIC-2)
Novell Certified Linux Administrator (SUSE CLA)
ID Linux: 481126 | LPI000255169
LinkedIN: Perfil Público

local:
--
  ID: build_deps
Function: pkg.installed
  Result: True
 Comment: All specified packages are already installed
 Started: 14:20:53.044284
Duration: 1203.527 ms
 Changes:   
--
  ID: calamari_clone
Function: git.latest
Name: /git/calamari
  Result: True
 Comment: Repository /home/vagrant/calamari is up-to-date
 Started: 14:20:54.261772
Duration: 1011.265 ms
 Changes:   
--
  ID: build-calamari-server
Function: cmd.run
Name: ./build-rpm.sh
  Result: True
 Comment: Command "./build-rpm.sh" run
 Started: 14:20:55.276892
Duration: 312571.864 ms
 Changes:   
  --
  pid:
  5158
  retcode:
  0
  stderr:
  8159 blocks
  + umask 022
  + cd /home/vagrant/rpmbuild/BUILD
  + cd /home/vagrant/rpmbuild/BUILD
  + rm -rf calamari-server-1.5.2
  + /usr/bin/gzip -dc /home/vagrant/rpmbuild/SOURCES/calamari-server_1.5.2.tar.gz
  + /usr/bin/tar -xf -
  + STATUS=0
  + '[' 0 -ne 0 ']'
  + cd calamari-server-1.5.2
  + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
  + exit 0
  + umask 022
  + cd /home/vagrant/rpmbuild/BUILD
  + cd calamari-server-1.5.2
  + cd selinux
  + for selinuxvariant in targeted minimum mls
  + make NAME=targeted -f /usr/share/selinux/devel/Makefile
  + mv calamari-server.pp calamari-server.pp.targeted
  + make NAME=targeted -f /usr/share/selinux/devel/Makefile clean
  + for selinuxvariant in targeted minimum mls
  + make NAME=minimum -f /usr/share/selinux/devel/Makefile
  + mv calamari-server.pp calamari-server.pp.minimum
  + make NAME=minimum -f /usr/share/selinux/devel/Makefile clean
  + for selinuxvariant in targeted minimum mls
  + make NAME=mls -f /usr/share/selinux/devel/Makefile
  + mv calamari-server.pp calamari-server.pp.mls
  + make NAME=mls -f /usr/share/selinux/devel/Makefile clean
  + cd -
  + exit 0
  + umask 022
  + cd /home/vagrant/rpmbuild/BUILD
  + '[' /home/vagrant/rpmbuild/BUILDROOT/calamari-server-1.5.2-15_g5b8fa14.el7.x86_64 '!=' / ']'
  + rm -rf /home/vagrant/rpmbuild/BUILDROOT/calamari-server-1.5.2-15_g5b8fa14.el7.x86_64
  ++ dirname /home/vagrant/rpmbuild/BUILDROOT/calamari-server-1.5.2-15_g5b8fa14.el7.x86_64
  + mkdir -p /home/vagrant/rpmbuild/BUILDROOT
  + mkdir /home/vagrant/rpmbuild/BUILDROOT/calamari-server-1.5.2-15_g5b8fa14.el7.x86_64
  + cd calamari-server-1.5.2
  + make DESTDIR=/home/vagrant/rpmbuild/BUILDROOT/calamari-server-1.5.2-15_g5b8fa14.el7.x86_64 install-rpm
  + export PYTHONDONTWRITEBYTECODE=1
  + PYTHONDONTWRITEBYTECODE=1
  + cd venv
  ++ ./bin/python -c 'import sys; print "{0}.{1}".format(sys.version_info[0], sys.version_info[1])'
  + pyver=2.7
  + grep -s -q carbon
  + ./bin/python ./bin/pip freeze
  You are using pip version 9.0.1, however version 10.0.1 is available.
  You should consider upgrading via the 'pip install --upgrade pip' command.
  + ./bin/python ./bin/pip install --install-option=--prefix=/home/vagrant/rpmbuild/BUILD/calamari-server-1.5.2/venv --install-option=--install-lib=/home/vagrant/rpmbuild/BUILD/calamari-server-1.5.2/venv/lib/python2.7/site-packages carbon==0.9.15
  

Re: [ceph-users] Delete pool nicely

2018-05-22 Thread David Turner
From my experience, that would cause you some trouble as it would throw
the entire pool into the deletion queue to be processed as it cleans up the
disks and everything.  I would suggest taking a pool listing from `rados -p
.rgw.buckets ls` and iterating on that using some scripts around the `rados
-p .rgw.buckets rm <object>` command, which you could stop, restart at a
faster pace, slow down, etc.  Once the objects in the pool are gone, you
can delete the empty pool without any problems.  I like this option because
it makes it simple to stop if you're impacting your VM traffic.
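
(A minimal sketch of that kind of loop, using the pool name from Simon's mail; the sleep is an arbitrary pacing knob, and the final pool delete only happens once the objects are gone:)

# rados -p .rgw.buckets ls > objects.txt
# while read -r obj; do rados -p .rgw.buckets rm "$obj"; sleep 0.05; done < objects.txt
# ceph osd pool delete .rgw.buckets .rgw.buckets --yes-i-really-really-mean-it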


On Tue, May 22, 2018 at 11:05 AM Simon Ironside 
wrote:

> Hi Everyone,
>
> I have an older cluster (Hammer 0.94.7) with a broken radosgw service
> that I'd just like to blow away before upgrading to Jewel after which
> I'll start again with EC pools.
>
> I don't need the data but I'm worried that deleting the .rgw.buckets
> pool will cause performance degradation for the production RBD pool used
> by VMs. .rgw.buckets is a replicated pool (size=3) with ~14TB data in
> 5.3M objects. A little over half the data in the whole cluster.
>
> Is deleting this pool simply using ceph osd pool delete likely to cause
> me a performance problem? If so, is there a way I can do it better?
>
> Thanks,
> Simon.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread Pardhiv Karri
Hi David,

We are using the tree algorithm.



Thanks,
Pardhiv Karri

On Tue, May 22, 2018 at 9:42 AM, David Turner  wrote:

> Your PG counts per pool per OSD don't show any PGs on osd.38. That
> definitely matches what you're seeing, but I've never seen this happen
> before. The OSD doesn't seem to be misconfigured at all.
>
> Does anyone have any ideas what could be happening here?  I expected to
> see something wrong in one of those outputs, but it all looks good.
> Possibly something with straw vs straw2 or crush tunables.
>
>
> On Tue, May 22, 2018, 12:33 PM Pardhiv Karri 
> wrote:
>
>> Hi David,
>>
>> root@or1010051251044:~# ceph df
>> GLOBAL:
>> SIZE   AVAIL  RAW USED %RAW USED
>> 79793G 56832G   22860G 28.65
>> POOLS:
>> NAMEID USED  %USED MAX AVAIL OBJECTS
>> rbd 0  0 014395G   0
>> compute 1  0 014395G   0
>> volumes 2  7605G 28.6014395G 1947372
>> images  4  0 014395G   0
>> root@or1010051251044:~#
>>
>>
>>
>> pool : 4 0 1 2 | SUM
>> 
>> osd.10 8 10 44 96 | 158
>> osd.11 14 8 58 100 | 180
>> osd.12 12 6 50 95 | 163
>> osd.13 14 4 49 121 | 188
>> osd.14 9 8 54 86 | 157
>> osd.15 12 5 55 103 | 175
>> osd.16 23 5 56 99 | 183
>> osd.30 6 4 31 47 | 88
>> osd.17 8 8 50 114 | 180
>> osd.31 7 1 23 35 | 66
>> osd.18 15 5 42 94 | 156
>> osd.32 12 6 24 54 | 96
>> osd.19 13 5 54 116 | 188
>> osd.33 4 2 28 49 | 83
>> osd.34 7 5 18 62 | 92
>> osd.35 10 2 21 56 | 89
>> osd.36 5 1 34 35 | 75
>> osd.37 4 4 24 45 | 77
>> osd.39 14 8 48 106 | 176
>> osd.0 12 3 27 67 | 109
>> osd.1 8 3 27 43 | 81
>> osd.2 4 5 27 45 | 81
>> osd.3 4 3 19 50 | 76
>> osd.4 4 1 23 54 | 82
>> osd.5 4 2 23 56 | 85
>> osd.6 1 5 32 50 | 88
>> osd.7 9 1 32 66 | 108
>> osd.8 7 4 27 49 | 87
>> osd.9 6 4 24 55 | 89
>> osd.20 7 4 43 122 | 176
>> osd.21 14 5 46 95 | 160
>> osd.22 13 8 51 107 | 179
>> osd.23 11 7 54 105 | 177
>> osd.24 11 6 52 112 | 181
>> osd.25 16 6 36 98 | 156
>> osd.26 15 7 59 101 | 182
>> osd.27 7 9 58 101 | 175
>> osd.28 16 5 60 89 | 170
>> osd.29 18 7 53 94 | 172
>> 
>> SUM : 384 192 1536 3072
>>
>>
>>
>> root@or1010051251044:~# for i in `rados lspools`; do echo "="; echo Working on pool: $i; ceph osd pool get $i pg_num; ceph osd pool get $i pgp_num; done
>> =
>> Working on pool: rbd
>> pg_num: 64
>> pgp_num: 64
>> =
>> Working on pool: compute
>> pg_num: 512
>> pgp_num: 512
>> =
>> Working on pool: volumes
>> pg_num: 1024
>> pgp_num: 1024
>> =
>> Working on pool: images
>> pg_num: 128
>> pgp_num: 128
>> root@or1010051251044:~#
>>
>>
>>
>> Thanks,
>> Pardhiv Karri
>>
>> On Tue, May 22, 2018 at 9:16 AM, David Turner 
>> wrote:
>>
>>> This is all weird. Maybe it just doesn't have any PGs with data on
>>> them.  `ceph df`, how many PGs you have in each pool, and which PGs are on
>>> osd 38.
>>>
>>>
>>> On Tue, May 22, 2018, 11:19 AM Pardhiv Karri 
>>> wrote:
>>>
 Hi David,



 root@or1010051251044:~# ceph osd tree
 ID  WEIGHT   TYPE NAMEUP/DOWN REWEIGHT
 PRIMARY-AFFINITY
  -1 80.0 root default

  -2 40.0 rack rack_A1

  -3 20.0 host or1010051251040

   0  2.0 osd.0 up  1.0
  1.0
   1  2.0 osd.1 up  1.0
  1.0
   2  2.0 osd.2 up  1.0
  1.0
   3  2.0 osd.3 up  1.0
  1.0
   4  2.0 osd.4 up  1.0
  1.0
   5  2.0 osd.5 up  1.0
  1.0
   6  2.0 osd.6 up  1.0
  1.0
   7  2.0 osd.7 up  1.0
  1.0
   8  2.0 osd.8 up  1.0
  1.0
   9  2.0 osd.9 up  1.0
  1.0
  -8 20.0 host or1010051251044

  30  2.0 osd.30up  1.0
  1.0
  31  2.0 osd.31up  1.0
  1.0
  32  2.0 osd.32up  1.0
  1.0
  33  2.0 osd.33up  1.0
  1.0
  34  2.0 osd.34up  1.0
  1.0
  35  2.0 osd.35up  1.0
  1.0
  36  2.0 osd.36up  1.0
  1.0
  37  2.0 osd.37up  1.0
  1.0
  38  

Re: [ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread David Turner
Your PG counts per pool per OSD don't show any PGs on osd.38. That
definitely matches what you're seeing, but I've never seen this happen
before. The OSD doesn't seem to be misconfigured at all.

Does anyone have any ideas what could be happening here?  I expected to see
something wrong in one of those outputs, but it all looks good. Possibly
something with straw vs straw2 or crush tunables.

On Tue, May 22, 2018, 12:33 PM Pardhiv Karri  wrote:

> Hi David,
>
> root@or1010051251044:~# ceph df
> GLOBAL:
> SIZE   AVAIL  RAW USED %RAW USED
> 79793G 56832G   22860G 28.65
> POOLS:
> NAMEID USED  %USED MAX AVAIL OBJECTS
> rbd 0  0 014395G   0
> compute 1  0 014395G   0
> volumes 2  7605G 28.6014395G 1947372
> images  4  0 014395G   0
> root@or1010051251044:~#
>
>
>
> pool : 4 0 1 2 | SUM
> 
> osd.10 8 10 44 96 | 158
> osd.11 14 8 58 100 | 180
> osd.12 12 6 50 95 | 163
> osd.13 14 4 49 121 | 188
> osd.14 9 8 54 86 | 157
> osd.15 12 5 55 103 | 175
> osd.16 23 5 56 99 | 183
> osd.30 6 4 31 47 | 88
> osd.17 8 8 50 114 | 180
> osd.31 7 1 23 35 | 66
> osd.18 15 5 42 94 | 156
> osd.32 12 6 24 54 | 96
> osd.19 13 5 54 116 | 188
> osd.33 4 2 28 49 | 83
> osd.34 7 5 18 62 | 92
> osd.35 10 2 21 56 | 89
> osd.36 5 1 34 35 | 75
> osd.37 4 4 24 45 | 77
> osd.39 14 8 48 106 | 176
> osd.0 12 3 27 67 | 109
> osd.1 8 3 27 43 | 81
> osd.2 4 5 27 45 | 81
> osd.3 4 3 19 50 | 76
> osd.4 4 1 23 54 | 82
> osd.5 4 2 23 56 | 85
> osd.6 1 5 32 50 | 88
> osd.7 9 1 32 66 | 108
> osd.8 7 4 27 49 | 87
> osd.9 6 4 24 55 | 89
> osd.20 7 4 43 122 | 176
> osd.21 14 5 46 95 | 160
> osd.22 13 8 51 107 | 179
> osd.23 11 7 54 105 | 177
> osd.24 11 6 52 112 | 181
> osd.25 16 6 36 98 | 156
> osd.26 15 7 59 101 | 182
> osd.27 7 9 58 101 | 175
> osd.28 16 5 60 89 | 170
> osd.29 18 7 53 94 | 172
> 
> SUM : 384 192 1536 3072
>
>
>
> root@or1010051251044:~# for i in `rados lspools`; do echo "="; echo Working on pool: $i; ceph osd pool get $i pg_num; ceph osd pool get $i pgp_num; done
> =
> Working on pool: rbd
> pg_num: 64
> pgp_num: 64
> =
> Working on pool: compute
> pg_num: 512
> pgp_num: 512
> =
> Working on pool: volumes
> pg_num: 1024
> pgp_num: 1024
> =
> Working on pool: images
> pg_num: 128
> pgp_num: 128
> root@or1010051251044:~#
>
>
>
> Thanks,
> Pardhiv Karri
>
> On Tue, May 22, 2018 at 9:16 AM, David Turner 
> wrote:
>
>> This is all weird. Maybe it just doesn't have any PGs with data on them.
>> `ceph df`, how many PGs you have in each pool, and which PGs are on osd 38.
>>
>>
>> On Tue, May 22, 2018, 11:19 AM Pardhiv Karri 
>> wrote:
>>
>>> Hi David,
>>>
>>>
>>>
>>> root@or1010051251044:~# ceph osd tree
>>> ID  WEIGHT   TYPE NAMEUP/DOWN REWEIGHT
>>> PRIMARY-AFFINITY
>>>  -1 80.0 root default
>>>
>>>  -2 40.0 rack rack_A1
>>>
>>>  -3 20.0 host or1010051251040
>>>
>>>   0  2.0 osd.0 up  1.0
>>>  1.0
>>>   1  2.0 osd.1 up  1.0
>>>  1.0
>>>   2  2.0 osd.2 up  1.0
>>>  1.0
>>>   3  2.0 osd.3 up  1.0
>>>  1.0
>>>   4  2.0 osd.4 up  1.0
>>>  1.0
>>>   5  2.0 osd.5 up  1.0
>>>  1.0
>>>   6  2.0 osd.6 up  1.0
>>>  1.0
>>>   7  2.0 osd.7 up  1.0
>>>  1.0
>>>   8  2.0 osd.8 up  1.0
>>>  1.0
>>>   9  2.0 osd.9 up  1.0
>>>  1.0
>>>  -8 20.0 host or1010051251044
>>>
>>>  30  2.0 osd.30up  1.0
>>>  1.0
>>>  31  2.0 osd.31up  1.0
>>>  1.0
>>>  32  2.0 osd.32up  1.0
>>>  1.0
>>>  33  2.0 osd.33up  1.0
>>>  1.0
>>>  34  2.0 osd.34up  1.0
>>>  1.0
>>>  35  2.0 osd.35up  1.0
>>>  1.0
>>>  36  2.0 osd.36up  1.0
>>>  1.0
>>>  37  2.0 osd.37up  1.0
>>>  1.0
>>>  38  2.0 osd.38up  1.0
>>>  1.0
>>>  39  2.0 osd.39up  1.0
>>>  1.0
>>>  -4 20.0 rack rack_B1
>>>
>>>  -5 20.0 host or1010051251041
>>>
>>>  10  2.0 osd.10up  1.0
>>>  1.0
>>>  

Re: [ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread Pardhiv Karri
Hi David,

root@or1010051251044:~# ceph df
GLOBAL:
SIZE   AVAIL  RAW USED %RAW USED
79793G 56832G   22860G 28.65
POOLS:
NAMEID USED  %USED MAX AVAIL OBJECTS
rbd 0  0 014395G   0
compute 1  0 014395G   0
volumes 2  7605G 28.6014395G 1947372
images  4  0 014395G   0
root@or1010051251044:~#



pool : 4 0 1 2 | SUM

osd.10 8 10 44 96 | 158
osd.11 14 8 58 100 | 180
osd.12 12 6 50 95 | 163
osd.13 14 4 49 121 | 188
osd.14 9 8 54 86 | 157
osd.15 12 5 55 103 | 175
osd.16 23 5 56 99 | 183
osd.30 6 4 31 47 | 88
osd.17 8 8 50 114 | 180
osd.31 7 1 23 35 | 66
osd.18 15 5 42 94 | 156
osd.32 12 6 24 54 | 96
osd.19 13 5 54 116 | 188
osd.33 4 2 28 49 | 83
osd.34 7 5 18 62 | 92
osd.35 10 2 21 56 | 89
osd.36 5 1 34 35 | 75
osd.37 4 4 24 45 | 77
osd.39 14 8 48 106 | 176
osd.0 12 3 27 67 | 109
osd.1 8 3 27 43 | 81
osd.2 4 5 27 45 | 81
osd.3 4 3 19 50 | 76
osd.4 4 1 23 54 | 82
osd.5 4 2 23 56 | 85
osd.6 1 5 32 50 | 88
osd.7 9 1 32 66 | 108
osd.8 7 4 27 49 | 87
osd.9 6 4 24 55 | 89
osd.20 7 4 43 122 | 176
osd.21 14 5 46 95 | 160
osd.22 13 8 51 107 | 179
osd.23 11 7 54 105 | 177
osd.24 11 6 52 112 | 181
osd.25 16 6 36 98 | 156
osd.26 15 7 59 101 | 182
osd.27 7 9 58 101 | 175
osd.28 16 5 60 89 | 170
osd.29 18 7 53 94 | 172

SUM : 384 192 1536 3072



root@or1010051251044:~# for i in `rados lspools`; do echo "="; echo Working on pool: $i; ceph osd pool get $i pg_num; ceph osd pool get $i pgp_num; done
=
Working on pool: rbd
pg_num: 64
pgp_num: 64
=
Working on pool: compute
pg_num: 512
pgp_num: 512
=
Working on pool: volumes
pg_num: 1024
pgp_num: 1024
=
Working on pool: images
pg_num: 128
pgp_num: 128
root@or1010051251044:~#



Thanks,
Pardhiv Karri

On Tue, May 22, 2018 at 9:16 AM, David Turner  wrote:

> This is all weird. Maybe it just doesn't have any PGs with data on them.
> `ceph df`, how many PGs you have in each pool, and which PGs are on osd 38.
>
>
> On Tue, May 22, 2018, 11:19 AM Pardhiv Karri 
> wrote:
>
>> Hi David,
>>
>>
>>
>> root@or1010051251044:~# ceph osd tree
>> ID  WEIGHT   TYPE NAMEUP/DOWN REWEIGHT
>> PRIMARY-AFFINITY
>>  -1 80.0 root default
>>
>>  -2 40.0 rack rack_A1
>>
>>  -3 20.0 host or1010051251040
>>
>>   0  2.0 osd.0 up  1.0
>>  1.0
>>   1  2.0 osd.1 up  1.0
>>  1.0
>>   2  2.0 osd.2 up  1.0
>>  1.0
>>   3  2.0 osd.3 up  1.0
>>  1.0
>>   4  2.0 osd.4 up  1.0
>>  1.0
>>   5  2.0 osd.5 up  1.0
>>  1.0
>>   6  2.0 osd.6 up  1.0
>>  1.0
>>   7  2.0 osd.7 up  1.0
>>  1.0
>>   8  2.0 osd.8 up  1.0
>>  1.0
>>   9  2.0 osd.9 up  1.0
>>  1.0
>>  -8 20.0 host or1010051251044
>>
>>  30  2.0 osd.30up  1.0
>>  1.0
>>  31  2.0 osd.31up  1.0
>>  1.0
>>  32  2.0 osd.32up  1.0
>>  1.0
>>  33  2.0 osd.33up  1.0
>>  1.0
>>  34  2.0 osd.34up  1.0
>>  1.0
>>  35  2.0 osd.35up  1.0
>>  1.0
>>  36  2.0 osd.36up  1.0
>>  1.0
>>  37  2.0 osd.37up  1.0
>>  1.0
>>  38  2.0 osd.38up  1.0
>>  1.0
>>  39  2.0 osd.39up  1.0
>>  1.0
>>  -4 20.0 rack rack_B1
>>
>>  -5 20.0 host or1010051251041
>>
>>  10  2.0 osd.10up  1.0
>>  1.0
>>  11  2.0 osd.11up  1.0
>>  1.0
>>  12  2.0 osd.12up  1.0
>>  1.0
>>  13  2.0 osd.13up  1.0
>>  1.0
>>  14  2.0 osd.14up  1.0
>>  1.0
>>  15  2.0 osd.15up  1.0
>>  1.0
>>  16  2.0 osd.16up  1.0
>>  1.0
>>  17  2.0 osd.17up  1.0
>>  1.0
>>  18  2.0 osd.18up  1.0
>>  1.0
>>  19  2.0 osd.19up  1.0
>>  1.0
>>  -90 

Re: [ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread David Turner
This is all weird. Maybe it just doesn't have any PGs with data on them.
`ceph df`, how many PGs you have in each pool, and which PGs are on osd 38.

On Tue, May 22, 2018, 11:19 AM Pardhiv Karri  wrote:

> Hi David,
>
>
>
> root@or1010051251044:~# ceph osd tree
> ID  WEIGHT   TYPE NAMEUP/DOWN REWEIGHT
> PRIMARY-AFFINITY
>  -1 80.0 root default
>
>  -2 40.0 rack rack_A1
>
>  -3 20.0 host or1010051251040
>
>   0  2.0 osd.0 up  1.0
>  1.0
>   1  2.0 osd.1 up  1.0
>  1.0
>   2  2.0 osd.2 up  1.0
>  1.0
>   3  2.0 osd.3 up  1.0
>  1.0
>   4  2.0 osd.4 up  1.0
>  1.0
>   5  2.0 osd.5 up  1.0
>  1.0
>   6  2.0 osd.6 up  1.0
>  1.0
>   7  2.0 osd.7 up  1.0
>  1.0
>   8  2.0 osd.8 up  1.0
>  1.0
>   9  2.0 osd.9 up  1.0
>  1.0
>  -8 20.0 host or1010051251044
>
>  30  2.0 osd.30up  1.0
>  1.0
>  31  2.0 osd.31up  1.0
>  1.0
>  32  2.0 osd.32up  1.0
>  1.0
>  33  2.0 osd.33up  1.0
>  1.0
>  34  2.0 osd.34up  1.0
>  1.0
>  35  2.0 osd.35up  1.0
>  1.0
>  36  2.0 osd.36up  1.0
>  1.0
>  37  2.0 osd.37up  1.0
>  1.0
>  38  2.0 osd.38up  1.0
>  1.0
>  39  2.0 osd.39up  1.0
>  1.0
>  -4 20.0 rack rack_B1
>
>  -5 20.0 host or1010051251041
>
>  10  2.0 osd.10up  1.0
>  1.0
>  11  2.0 osd.11up  1.0
>  1.0
>  12  2.0 osd.12up  1.0
>  1.0
>  13  2.0 osd.13up  1.0
>  1.0
>  14  2.0 osd.14up  1.0
>  1.0
>  15  2.0 osd.15up  1.0
>  1.0
>  16  2.0 osd.16up  1.0
>  1.0
>  17  2.0 osd.17up  1.0
>  1.0
>  18  2.0 osd.18up  1.0
>  1.0
>  19  2.0 osd.19up  1.0
>  1.0
>  -90 host or1010051251045
>
>  -6 20.0 rack rack_C1
>
>  -7 20.0 host or1010051251042
>
>  20  2.0 osd.20up  1.0
>  1.0
>  21  2.0 osd.21up  1.0
>  1.0
>  22  2.0 osd.22up  1.0
>  1.0
>  23  2.0 osd.23up  1.0
>  1.0
>  24  2.0 osd.24up  1.0
>  1.0
>  25  2.0 osd.25up  1.0
>  1.0
>  26  2.0 osd.26up  1.0
>  1.0
>  27  2.0 osd.27up  1.0
>  1.0
>  28  2.0 osd.28up  1.0
>  1.0
>  29  2.0 osd.29up  1.0
>  1.0
> -100 host or1010051251046
>
> -110 host or1010051251023
>
> root@or1010051251044:~#
>
>
>
>
>
> root@or1010051251044:~# ceph -s
> cluster 6eacac66-087a-464d-94cb-9ca2585b98d5
>  health HEALTH_OK
>  monmap e3: 3 mons at {or1010051251037=
> 10.51.251.37:6789/0,or1010051251038=10.51.251.38:6789/0,or1010051251039=10.51.251.39:6789/0
> }
> election epoch 144, quorum 0,1,2
> or1010051251037,or1010051251038,or1010051251039
>  osdmap e1814: 40 osds: 40 up, 40 in
>   pgmap v446581: 1728 pgs, 4 pools, 7389 GB data, 1847 kobjects
> 1 GB used, 57472 GB / 79793 GB avail
> 1728 active+clean
>   client io 61472 kB/s wr, 30 op/s
> root@or1010051251044:~#
>
>
> Thanks,
> Pardhiv Karri
>
> On Tue, May 22, 2018 at 5:01 AM, David Turner 
> wrote:
>
>> What are your `ceph osd tree` and `ceph status` as well?
>>
>> On Tue, May 22, 2018, 3:05 AM Pardhiv Karri 
>> wrote:
>>
>>> Hi,
>>>
>>> We are using Ceph Hammer 0.94.9. Some of our OSDs never get any data or
>>> PGs even at their full crush weight, up and running. Rest of the OSDs are
>>> at 50% full. Is there a bug in Hammer that is causing this issue? Does
>>> upgrading to Jewel or Luminous fix this issue?
>>>
>>> I tried deleting and recreating this OSD N number of times and still the
>>> same 

[ceph-users] Recovery time is very long till we have a double tree in the crushmap

2018-05-22 Thread Vincent Godin
Two months ago, we had a simple crush map:
- one root
- one region
- two datacenters
- one room per datacenter
- two pools per room (one SATA and one SSD)
- hosts in SATA pool only
- osds in host

So we created a Ceph pool at the SATA level on each site.
After some disk problems which impacted almost all VMs on a site, we
decided to add a level between pool and hosts: rack (3 racks per pool).
The aim was to create Ceph pools based on racks, so that a defective disk on a
server would, in the worst case, only impact the VMs attached to that rack.

Adding a rack between pool and hosts in the current tree would move all the
data already in the old SATA pool. So we decided to create another tree
with only three racks per site, pointing to the servers they own.

It's working, but:

- some ceph commands are no longer possible: adding a new server to both
trees is not possible, despite the documentation. It is possible to add the
server to the SATA pool in the old tree but not to the corresponding rack in
the new tree (even if we specify the new root)
- some ceph commands give strange results: a ceph osd df will show you the
same osd twice

The worst part:

Before adding this new tree, adding 4 new servers took roughly a week. Last
month we added 4 servers and it took 3 weeks to converge and get back to a
Ceph OK state.

Is this normal behaviour? Do we need to fall back to a single tree and
insert the racks between the pool and the hosts, even if it will move a lot
of data?
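
(For reference, inserting rack buckets into an existing tree is normally done with commands along these lines; the bucket and host names below are made up, the parent could equally be a room or site bucket, and moving hosts this way is exactly what triggers the large data movement discussed above:)

# ceph osd crush add-bucket rack1 rack
# ceph osd crush move rack1 root=default
# ceph osd crush move host-01 rack=rack1
# ceph osd crush move host-02 rack=rack1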
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to see PGs of a pool on an OSD

2018-05-22 Thread Pardhiv Karri
This is exactly what I'm looking for. I tested it in our lab and it works
great. Thank you, Caspar!
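
(For anyone without the linked script handy, a rough equivalent is sketched below; the acting-set column position in `ceph pg dump pgs_brief` is an assumption and can differ between releases, so check the header and adjust $5 if needed:)

ceph pg dump pgs_brief 2>/dev/null | awk '
  $1 ~ /^[0-9]+\.[0-9a-f]+$/ {
    pool = $1; sub(/\..*$/, "", pool)          # pg id like "2.1f" -> pool id 2
    acting = $5; gsub(/[\[\]]/, "", acting)    # acting set "[3,14,38]" -> "3,14,38"
    n = split(acting, osds, ",")
    for (i = 1; i <= n; i++) count["osd." osds[i] "  pool " pool]++
  }
  END { for (k in count) print k, count[k] }' | sort -V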

Thanks,
Pardhiv Karri

On Tue, May 22, 2018 at 3:42 AM, Caspar Smit  wrote:

> Here you go:
>
> ps. You might have to map your poolnames to pool ids
>
> http://cephnotes.ksperis.com/blog/2015/02/23/get-the-
> number-of-placement-groups-per-osd
>
> Kind regards,
> Caspar
>
> 2018-05-22 9:13 GMT+02:00 Pardhiv Karri :
>
>> Hi,
>>
>> Our ceph cluster has 12 pools and only 3 pools are really used. How can
>> I see the number of PGs on an OSD and which PGs belong to which pool on that OSD?
>>
>> Something like below,
>> OSD 0 = 1000PGs (500PGs belong to PoolA, 200PGs belong to PoolB, 300PGs
>> belong to PoolC)
>>
>> Thanks,
>> Pardhiv Karri
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>


-- 
*Pardhiv Karri*
"Rise and Rise again until LAMBS become LIONS"
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread Pardhiv Karri
Hi David,



root@or1010051251044:~# ceph osd tree
ID  WEIGHT   TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
 -1 80.0 root default
 -2 40.0 rack rack_A1
 -3 20.0 host or1010051251040
  0  2.0 osd.0 up  1.0  1.0
  1  2.0 osd.1 up  1.0  1.0
  2  2.0 osd.2 up  1.0  1.0
  3  2.0 osd.3 up  1.0  1.0
  4  2.0 osd.4 up  1.0  1.0
  5  2.0 osd.5 up  1.0  1.0
  6  2.0 osd.6 up  1.0  1.0
  7  2.0 osd.7 up  1.0  1.0
  8  2.0 osd.8 up  1.0  1.0
  9  2.0 osd.9 up  1.0  1.0
 -8 20.0 host or1010051251044
 30  2.0 osd.30up  1.0  1.0
 31  2.0 osd.31up  1.0  1.0
 32  2.0 osd.32up  1.0  1.0
 33  2.0 osd.33up  1.0  1.0
 34  2.0 osd.34up  1.0  1.0
 35  2.0 osd.35up  1.0  1.0
 36  2.0 osd.36up  1.0  1.0
 37  2.0 osd.37up  1.0  1.0
 38  2.0 osd.38up  1.0  1.0
 39  2.0 osd.39up  1.0  1.0
 -4 20.0 rack rack_B1
 -5 20.0 host or1010051251041
 10  2.0 osd.10up  1.0  1.0
 11  2.0 osd.11up  1.0  1.0
 12  2.0 osd.12up  1.0  1.0
 13  2.0 osd.13up  1.0  1.0
 14  2.0 osd.14up  1.0  1.0
 15  2.0 osd.15up  1.0  1.0
 16  2.0 osd.16up  1.0  1.0
 17  2.0 osd.17up  1.0  1.0
 18  2.0 osd.18up  1.0  1.0
 19  2.0 osd.19up  1.0  1.0
 -90 host or1010051251045
 -6 20.0 rack rack_C1
 -7 20.0 host or1010051251042
 20  2.0 osd.20up  1.0  1.0
 21  2.0 osd.21up  1.0  1.0
 22  2.0 osd.22up  1.0  1.0
 23  2.0 osd.23up  1.0  1.0
 24  2.0 osd.24up  1.0  1.0
 25  2.0 osd.25up  1.0  1.0
 26  2.0 osd.26up  1.0  1.0
 27  2.0 osd.27up  1.0  1.0
 28  2.0 osd.28up  1.0  1.0
 29  2.0 osd.29up  1.0  1.0
-100 host or1010051251046
-110 host or1010051251023
root@or1010051251044:~#





root@or1010051251044:~# ceph -s
cluster 6eacac66-087a-464d-94cb-9ca2585b98d5
 health HEALTH_OK
 monmap e3: 3 mons at {or1010051251037=
10.51.251.37:6789/0,or1010051251038=10.51.251.38:6789/0,or1010051251039=10.51.251.39:6789/0
}
election epoch 144, quorum 0,1,2
or1010051251037,or1010051251038,or1010051251039
 osdmap e1814: 40 osds: 40 up, 40 in
  pgmap v446581: 1728 pgs, 4 pools, 7389 GB data, 1847 kobjects
1 GB used, 57472 GB / 79793 GB avail
1728 active+clean
  client io 61472 kB/s wr, 30 op/s
root@or1010051251044:~#


Thanks,
Pardhiv Karri

On Tue, May 22, 2018 at 5:01 AM, David Turner  wrote:

> What are your `ceph osd tree` and `ceph status` as well?
>
> On Tue, May 22, 2018, 3:05 AM Pardhiv Karri  wrote:
>
>> Hi,
>>
>> We are using Ceph Hammer 0.94.9. Some of our OSDs never get any data or
>> PGs even at their full crush weight, up and running. Rest of the OSDs are
>> at 50% full. Is there a bug in Hammer that is causing this issue? Does
>> upgrading to Jewel or Luminous fix this issue?
>>
>> I tried deleting and recreating this OSD N number of times and still the
>> same issue. I am seeing this in 3 of our 4 ceph clusters in different
>> datacenters. We are using HDD as OSD and SSD as Journal drive.
>>
>> The below is from our lab and OSD 38 is the one 

[ceph-users] Delete pool nicely

2018-05-22 Thread Simon Ironside

Hi Everyone,

I have an older cluster (Hammer 0.94.7) with a broken radosgw service 
that I'd just like to blow away before upgrading to Jewel after which 
I'll start again with EC pools.


I don't need the data but I'm worried that deleting the .rgw.buckets 
pool will cause performance degradation for the production RBD pool used 
by VMs. .rgw.buckets is a replicated pool (size=3) with ~14TB data in 
5.3M objects. A little over half the data in the whole cluster.


Is deleting this pool simply using ceph osd pool delete likely to cause 
me a performance problem? If so, is there a way I can do it better?


Thanks,
Simon.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW won't start after upgrade to 12.2.5

2018-05-22 Thread Marc Spencer
This is now filed as bug #24228
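
(The workaround described in the quoted mail below, expressed as a ceph.conf sketch; the client section name is an assumption:)

[client.rgw.gateway1]
rgw ldap secret =
rgw s3 auth use ldap = false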
 
    
Marc D. Spencer
Chief Technology Officer
T: 866.808.4937 × 202 
E: mspen...@liquidpixels.com 
www.liquidpixels.com 
  
 
 


> On May 21, 2018, at 10:58 PM, Marc Spencer wrote:
> 
> I found the issue, for the curious.
> 
> The default configuration for rgw_ldap_secret seems to be set to 
> /etc/openldap/secret, which on my system is empty:
> 
> # ceph-conf -D | grep ldap
> rgw_ldap_binddn = uid=admin,cn=users,dc=example,dc=com
> rgw_ldap_dnattr = uid
> rgw_ldap_searchdn = cn=users,cn=accounts,dc=example,dc=com
> rgw_ldap_searchfilter = 
> rgw_ldap_secret = /etc/openldap/secret
> rgw_ldap_uri = ldaps://
> rgw_s3_auth_use_ldap = false
> 
> # cat /etc/openldap/secret
> cat: /etc/openldap/secret: No such file or directory
> 
> But the code assumes that if it is set, the named file has content. Since it 
> doesn’t, safe_read_file() asserts.
> 
> I set it to nothing (rgw_ldap_secret = ) in my configuration, and everything 
> seems happy.
> 
> std::string parse_rgw_ldap_bindpw(CephContext* ctx)
> {
>   string ldap_bindpw;
>   string ldap_secret = ctx->_conf->rgw_ldap_secret;
> 
>   if (ldap_secret.empty()) {
>     ldout(ctx, 10)
>       << __func__ << " LDAP auth no rgw_ldap_secret file found in conf"
>       << dendl;
>   } else {
>     char bindpw[1024];
>     memset(bindpw, 0, 1024);
>     int pwlen = safe_read_file("" /* base */, ldap_secret.c_str(),
>                                bindpw, 1023);
>     if (pwlen) {
>       ldap_bindpw = bindpw;
>       boost::algorithm::trim(ldap_bindpw);
>       if (ldap_bindpw.back() == '\n')
>         ldap_bindpw.pop_back();
>     }
>   }
> 
>   return ldap_bindpw;
> }
> 
> 
>> On May 21, 2018, at 5:27 PM, Marc Spencer wrote:
>> 
>> Hi,
>> 
>>   I have a test cluster of 4 servers running Luminous. We were running 
>> 12.2.2 under Fedora 17 and have just completed upgrading to 12.2.5 under 
>> Fedora 18.
>> 
>>   All seems well: all MONs are up, OSDs are up, I can see objects stored as 
>> expected with rados -p default.rgw.buckets.data ls. 
>> 
>>   But when i start RGW, my load goes through the roof as radosgw 
>> continuously rapid-fire core dumps. 
>> 
>> Log Excerpt:
>> 
>> 
>> … 
>> 
>>-16> 2018-05-21 15:52:48.244579 7fc70eeda700  5 -- 
>> 10.19.33.13:0/3446208184 >> 10.19.33.14:6800/1417 conn(0x55e78a610800 :-1 
>> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=14567 cs=1 l=1). rx osd.6 
>> seq 7 0x55e78a67b500 osd_op_reply(47 notify.6 [watch watch cookie 
>> 94452947886080] v1092'43446 uv43445 ondisk = 0) v8
>>-15> 2018-05-21 15:52:48.244619 7fc70eeda700  1 -- 
>> 10.19.33.13:0/3446208184 <== osd.6 10.19.33.14:6800/1417 7  
>> osd_op_reply(47 notify.6 [watch watch cookie 94452947886080] v1092'43446 
>> uv43445 ondisk = 0) v8  152+0+0 (1199963694 0 0) 0x55e78a67b500 con 
>> 0x55e78a610800
>>-14> 2018-05-21 15:52:48.244777 7fc723656000  1 -- 
>> 10.19.33.13:0/3446208184 --> 10.19.33.15:6800/1433 -- osd_op(unknown.0.0:48 
>> 16.1 16:93e5b521:::notify.7:head [create] snapc 0=[] 
>> ondisk+write+known_if_redirected e1092) v8 -- 0x55e78a67bc00 con 0
>>-13> 2018-05-21 15:52:48.275650 7fc70eeda700  5 -- 
>> 10.19.33.13:0/3446208184 >> 10.19.33.15:6800/1433 conn(0x55e78a65e000 :-1 
>> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=14572 cs=1 l=1). rx osd.2 
>> seq 7 0x55e78a678380 osd_op_reply(48 notify.7 [create] v1092'43453 uv43453 
>> ondisk = 0) v8
>>-12> 2018-05-21 15:52:48.275675 7fc70eeda700  1 -- 
>> 10.19.33.13:0/3446208184 <== osd.2 10.19.33.15:6800/1433 7  
>> osd_op_reply(48 notify.7 [create] v1092'43453 uv43453 ondisk = 0) v8  
>> 152+0+0 (2720997170 0 0) 0x55e78a678380 con 0x55e78a65e000
>>-11> 2018-05-21 15:52:48.275849 7fc723656000  1 -- 
>> 10.19.33.13:0/3446208184 --> 10.19.33.15:6800/1433 -- osd_op(unknown.0.0:49 
>> 16.1 16:93e5b521:::notify.7:head [watch watch cookie 94452947887232] snapc 
>> 0=[] ondisk+write+known_if_redirected e1092) v8 -- 0x55e78a688000 con 0
>>-10> 2018-05-21 15:52:48.296799 7fc70eeda700  5 -- 
>> 10.19.33.13:0/3446208184 >> 10.19.33.15:6800/1433 conn(0x55e78a65e000 :-1 
>> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=14572 cs=1 l=1). rx osd.2 
>> seq 8 0x55e78a688000 osd_op_reply(49 notify.7 [watch watch cookie 
>> 94452947887232] v1092'43454 uv43453 ondisk = 0) v8
>> -9> 2018-05-21 15:52:48.296824 7fc70eeda700  1 -- 
>> 10.19.33.13:0/3446208184 <== osd.2 10.19.33.15:6800/1433 8  
>> osd_op_reply(49 notify.7 [watch watch cookie 94452947887232] 

Re: [ceph-users] Data recovery after losing all monitors

2018-05-22 Thread Caspar Smit
2018-05-22 15:51 GMT+02:00 Wido den Hollander :

>
>
> On 05/22/2018 03:38 PM, George Shuklin wrote:
> > Good news, it's not an emergency, just a curiosity.
> >
> > Suppose I lost all monitors in a ceph cluster in my laboratory. I have
> > all OSDs intact. Is it possible to recover something from Ceph?
>
> Yes, there is. Using ceph-objectstore-tool you are able to rebuild the
> MON database.
>
> BUT, this isn't something you would really want to do, as you lose your
> cephx keys and such, and getting them all back will be a total nightmare.
>
> My advice, make sure you have reliable hardware for your Monitors. Run
> them on DC-grade SSDs and you'll be fine.
>
>
And be sure to have enough space available on them to sustain a long period
of PGs not being active+clean.

Kind regards,
Caspar


> Wido
>
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Data recovery after losing all monitors

2018-05-22 Thread Wido den Hollander


On 05/22/2018 03:38 PM, George Shuklin wrote:
> Good news, it's not an emergency, just a curiosity.
> 
> Suppose I lost all monitors in a ceph cluster in my laboratory. I have
> all OSDs intact. Is it possible to recover something from Ceph?

Yes, there is. Using ceph-objectstore-tool you are able to rebuild the
MON database.

BUT, this isn't something you would really want to do, as you lose your
cephx keys and such, and getting them all back will be a total nightmare.

My advice, make sure you have reliable hardware for your Monitors. Run
them on DC-grade SSDs and you'll be fine.
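
(For the curious, the OSD-based rebuild is roughly the following -- a sketch only, with illustrative paths; the OSDs must be stopped while their stores are read, and the keyring has to be reassembled by hand as described in the disaster-recovery docs:)

# ms=/tmp/mon-store; mkdir -p $ms
# for osd in /var/lib/ceph/osd/ceph-*; do ceph-objectstore-tool --data-path "$osd" --op update-mon-db --mon-store-path "$ms"; done
# ceph-monstore-tool $ms rebuild -- --keyring /path/to/admin.keyring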

Wido

> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Data recovery after losing all monitors

2018-05-22 Thread George Shuklin

Good news, it's not an emergency, just a curiosity.

Suppose I lost all monitors in a ceph cluster in my laboratory. I have 
all OSDs intact. Is it possible to recover something from Ceph?


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Crush Map Changed After Reboot

2018-05-22 Thread Caspar Smit
FWIW, you could also put this into your ceph.conf to explicitly place an OSD
into the correct chassis at startup, if you have other OSDs for which you still
want the crush_update_on_start setting set to true:

[osd.34]
   osd crush location = "chassis=ceph-osd3-internal"
[osd.35]
   osd crush location = "chassis=ceph-osd3-internal"

etc..
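
(Or, to disable the automatic relocation globally, as David suggests -- a sketch:)

[osd]
    osd crush update on start = false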

Kind regards,
Caspar

2018-05-22 3:03 GMT+02:00 David Turner :

> Your problem sounds like osd_crush_update_on_start.  While it is set to the
> default of true, when an osd starts it tells the mons which server it is on
> and the mons will update the crush map to reflect that. With these osds
> running on the host but placed under a custom host in the crush map, when
> they start they will update the map to show which host they are actually
> running on.  You probably want to disable that in the config so that your
> custom crush placement is not altered.
>
>
> On Mon, May 21, 2018, 6:29 PM Martin, Jeremy  wrote:
>
>> Hello,
>>
>> I have had a ceph cluster up and running for a few months now and all has
>> been well and good, except for today when I updated two osd nodes, and all
>> is still well; these two nodes are designated within a rack and the rack is
>> the failure domain, so they are essentially mirrors of each other.  The
>> issue came when I updated and rebooted the third node, which has internal
>> and external disks in a shelf; the failure domain there is at the actual osd
>> level, as these are normal off-the-shelf disks for low-priority storage that
>> is not mission critical.  The issue is that before the reboot the crush map
>> looked and behaved correctly, but after the reboot the crush map was changed
>> and had to be rebuilt to get the storage back online. All was well after the
>> reassignment, but I need to track down why it lost its configuration.  The
>> main difference here is that the first four disks (34-37) are supposed to
>> be assigned to the chassis ceph-osd3-internal (as in the "before" below) and
>> 21-31 assigned to the chassis ceph-osd3-shelf1
>> (again as in the "before").  After the reboot everything (34-37 and 21-31)
>> was reassigned to the host ceph-osd3.  The update was from 12.2.4 to 12.2.5.
>> Any thoughts?
>>
>> Jeremy
>>
>> Before
>>
>> ID  CLASS WEIGHT   TYPE NAME  STATUS REWEIGHT PRI-AFF
>> -58  0 root osd3-internal
>> -54  0 chassis ceph-osd3-internal
>>  34   hdd  0.42899 osd.34 up  1.0 1.0
>>  35   hdd  0.42899 osd.35 up  1.0 1.0
>>  36   hdd  0.42899 osd.36 up  1.0 1.0
>>  37   hdd  0.42899 osd.37 up  1.0 1.0
>> -50  0 root osd3-shelf1
>> -56  0 chassis ceph-osd3-shelf1
>>  21   hdd  1.81898 osd.21 up  1.0 1.0
>>  22   hdd  1.81898 osd.22 up  1.0 1.0
>>  23   hdd  1.81898 osd.23 up  1.0 1.0
>>  24   hdd  1.81898 osd.24 up  1.0 1.0
>>  25   hdd  1.81898 osd.25 up  1.0 1.0
>>  26   hdd  1.81898 osd.26 up  1.0 1.0
>>  27   hdd  1.81898 osd.27 up  1.0 1.0
>>  28   hdd  1.81898 osd.28 up  1.0 1.0
>>  29   hdd  1.81898 osd.29 up  1.0 1.0
>>  30   hdd  1.81898 osd.30 up  1.0 1.0
>>  31   hdd  1.81898 osd.31 up  1.0 1.0
>>  -7  0 host ceph-osd3
>>  -1   47.21199 root default
>> -40   23.59000 rack mainehall
>>  -3   23.59000 host ceph-osd1
>>   0   hdd  1.81898 osd.0  up  1.0 1.0
>>   Additional osd's left off for brevity
>> -5   23.62199 host ceph-osd2
>>  11   hdd  1.81898 osd.11 up  1.0 1.0
>>   Additional osd's left off for brevity
>>
>> After
>>
>> ID  CLASS WEIGHT   TYPE NAME  STATUS REWEIGHT PRI-AFF
>> -58  0 root osd3-internal
>> -54  0 chassis ceph-osd3-internal
>> -50  0 root osd3-shelf1
>> -56  0 chassis ceph-osd3-shelf1
>> -7   0 host ceph-osd3
>>  21   hdd  1.81898 osd.21 up  1.0 1.0
>>  22   hdd  1.81898 osd.22 up  1.0 1.0
>>  23   hdd  1.81898 osd.23 up  1.0 1.0
>>  24   hdd  1.81898 osd.24 up  1.0 1.0
>>  25   hdd  1.81898 osd.25 up  1.0 1.0
>>  26   hdd  1.81898 osd.26 up  1.0 1.0
>>  27   hdd  1.81898 osd.27 up  1.0 1.0
>>  28   hdd  1.81898 osd.28

Re: [ceph-users] samba gateway experiences with cephfs ?

2018-05-22 Thread David Disseldorp
Hi Daniel and Jake,

On Mon, 21 May 2018 22:46:01 +0200, Daniel Baumann wrote:

> Hi
> 
> On 05/21/2018 05:38 PM, Jake Grimmett wrote:
> > Unfortunately we have a large number (~200) of Windows and Macs clients
> > which need CIFS/SMB  access to cephfs.  
> 
> we too, which is why we're (partially) exporting cephfs over samba too,
> 1.5y in production now.
> 
> for us, cephfs-over-samba is significantly slower than cephfs directly
> too, but it's not really an issue here (basically, if people use a
> windows client here, they're already on the slow track anyway).
> 
> we had to do two things to get it working reliably though:
> 
> a) disable all locking on samba (otherwise "opportunistic locking" on
> windows clients killed within hours all mds (kraken at that time))

Have you seen this on more recent versions? Please raise a bug if so -
client induced MDS outages would be a pretty serious issue.

If your share path is isolated from non-samba clients (as you mention
below), then allowing clients to cache reads / writes locally via
oplocks / SMB2+ leases should offer a significant performance
improvements.

> b) only allow writes to a specific space on cephfs, reserved to samba
> (with luminous; otherwise, we'd have problems with data consistency on
> cephfs with people writing the same files from linux->cephfs and
> samba->cephfs concurrently). my hunch is that samba caches writes and
> doesn't give them back appropriately.

If you're sharing a kernel CephFS mount, then the Linux page cache will
be used for Samba share I/O, but Samba will fully abide by client sync
requests if "strict sync = yes" (default in Samba 4.7+).

> > Finally, is the vfs_ceph module for Samba useful? It doesn't seem to be
> > widely available pre-complied for for RHEL derivatives. Can anyone
> > comment on their experiences using vfs_ceph, or point me to a Centos 7.x
> > repo that has it?  
> 
> we use debian, with backported kernel and backported samba, which has
> vfs_ceph pre-compiled. however, we couldn't make vfs_ceph work at all -
> the snapshot patters just don't seem to match/align (and nothing we
> tried seem to work).

vfs_ceph doesn't support snapshots at this stage. I hope to work on this
feature in the near future, so that it's fully integrated with the
Explorer previous versions UI.
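
(For reference, a minimal vfs_ceph share definition looks roughly like the sketch below; the share name, path and cephx user id are assumptions:)

[cephfs]
    # cephx client id is given without the "client." prefix
    path = /exports/samba
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    ceph:user_id = samba
    # vfs_ceph talks to the cluster via libcephfs, so kernel share modes must be off
    kernel share modes = no
    strict sync = yes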

Cheers, David
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous: resilience - private interface down , no read/write

2018-05-22 Thread David Turner
What happens when a storage node loses its cluster network but not its
public network is that all other OSDs in the cluster see that it's down and
report that to the mons, but the node can still talk to the mons, telling
the mons that it is up and that, in fact, everything else is down.

The setting osd_min_reporters (I think that's the name of it off the top
of my head) is designed to help with this scenario. Its default is 1, which
means any osd on either side of the network problem will be trusted by the
mons to mark osds down. What you want to do with this setting is to set it
to at least 1 more than the number of osds in your failure domain. If the
failure domain is host and each node has 32 osds, then setting it to 33
will prevent a full problematic node from being able to cause havoc.

The osds will still try to mark themselves as up and this will still cause
problems for reads until the osd process stops or the network comes back up.
There might be a setting for how long an osd will try telling the mons it's
up, but this isn't really a situation I've come across after initial
testing and installation of nodes.
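
(The option being referred to is most likely mon_osd_min_down_reporters -- worth double-checking the exact name and default for your release; a ceph.conf sketch following the sizing suggestion above, assuming 32 OSDs per host:)

[global]
    mon osd min down reporters = 33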

On Tue, May 22, 2018, 1:47 AM nokia ceph  wrote:

> Hi Ceph users,
>
> We have a cluster with 5 nodes (67 disks), an EC 4+1 configuration, and
> min_size set to 4.
> Ceph version: 12.2.5
> While executing one of our resilience use cases, taking the private interface
> down on one of the nodes, up to Kraken we saw a short rados outage (60s).
>
> Now with Luminous, we see a rados read/write outage of more than
> 200s. In the logs we can see that peer OSDs report that one of the
> node's OSDs are down, but the OSDs defend themselves as wrongly
> marked down and do not move to the down state for a long time.
>
> 2018-05-22 05:37:17.871049 7f6ac71e6700  0 log_channel(cluster) log [WRN]
> : Monitor daemon marked osd.1 down, but it is still running
> 2018-05-22 05:37:17.871072 7f6ac71e6700  0 log_channel(cluster) log [DBG]
> : map e35690 wrongly marked me down at e35689
> 2018-05-22 05:37:17.878347 7f6ac71e6700  0 osd.1 35690 crush map has
> features 1009107927421960192, adjusting msgr requires for osds
> 2018-05-22 05:37:18.296643 7f6ac71e6700  0 osd.1 35691 crush map has
> features 1009107927421960192, adjusting msgr requires for osds
>
>
> Only when all 67 OSDs have moved to the down state does the read/write traffic
> resume.
>
> Could you please help us resolve this issue? If it is a bug, we
> will create a corresponding ticket.
>
> Thanks,
> Muthu
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread David Turner
What are your `ceph osd tree` and `ceph status` as well?

On Tue, May 22, 2018, 3:05 AM Pardhiv Karri  wrote:

> Hi,
>
> We are using Ceph Hammer 0.94.9. Some of our OSDs never get any data or
> PGs even at their full crush weight, up and running. Rest of the OSDs are
> at 50% full. Is there a bug in Hammer that is causing this issue? Does
> upgrading to Jewel or Luminous fix this issue?
>
> I tried deleting and recreating this OSD N number of times and still the
> same issue. I am seeing this in 3 of our 4 ceph clusters in different
> datacenters. We are using HDD as OSD and SSD as Journal drive.
>
> The below is from our lab and OSD 38 is the one that never fills.
>
>
> ID  WEIGHT   REWEIGHT SIZE   USEAVAIL  %USE  VAR  TYPE NAME
>
>  -1 80.0-  0  0  0 00 root default
>
>  -2 40.0- 39812G  6190G 33521G 15.55 0.68 rack rack_A1
>
>  -3 20.0- 19852G  3718G 16134G 18.73 0.82 host
> or1010051251040
>   0  2.0  1.0  1861G   450G  1410G 24.21 1.07 osd.0
>
>   1  2.0  1.0  1999G   325G  1673G 16.29 0.72 osd.1
>
>   2  2.0  1.0  1999G   336G  1662G 16.85 0.74 osd.2
>
>   3  2.0  1.0  1999G   386G  1612G 19.35 0.85 osd.3
>
>   4  2.0  1.0  1999G   385G  1613G 19.30 0.85 osd.4
>
>   5  2.0  1.0  1999G   364G  1634G 18.21 0.80 osd.5
>
>   6  2.0  1.0  1999G   319G  1679G 15.99 0.70 osd.6
>
>   7  2.0  1.0  1999G   434G  1564G 21.73 0.96 osd.7
>
>   8  2.0  1.0  1999G   352G  1646G 17.63 0.78 osd.8
>
>   9  2.0  1.0  1999G   362G  1636G 18.12 0.80 osd.9
>
>  -8 20.0- 19959G  2472G 17387G 12.39 0.55 host
> or1010051251044
>  30  2.0  1.0  1999G   362G  1636G 18.14 0.80 osd.30
>
>  31  2.0  1.0  1999G   293G  1705G 14.66 0.65 osd.31
>
>  32  2.0  1.0  1999G   202G  1796G 10.12 0.45 osd.32
>
>  33  2.0  1.0  1999G   215G  1783G 10.76 0.47 osd.33
>
>  34  2.0  1.0  1999G   192G  1806G  9.61 0.42 osd.34
>
>  35  2.0  1.0  1999G   337G  1661G 16.90 0.74 osd.35
>
>  36  2.0  1.0  1999G   206G  1792G 10.35 0.46 osd.36
>
>  37  2.0  1.0  1999G   266G  1732G 13.33 0.59 osd.37
>
>  38  2.0  1.0  1999G 55836k  1998G  0.000 osd.38
>
>  39  2.0  1.0  1968G   396G  1472G 20.12 0.89 osd.39
>
>  -4 20.0-  0  0  0 00 rack rack_B1
>
>  -5 20.0- 19990G  5978G 14011G 29.91 1.32 host
> or1010051251041
>  10  2.0  1.0  1999G   605G  1393G 30.27 1.33 osd.10
>
>  11  2.0  1.0  1999G   592G  1406G 29.62 1.30 osd.11
>
>  12  2.0  1.0  1999G   539G  1460G 26.96 1.19 osd.12
>
>  13  2.0  1.0  1999G   684G  1314G 34.22 1.51 osd.13
>
>  14  2.0  1.0  1999G   510G  1488G 25.56 1.13 osd.14
>
>  15  2.0  1.0  1999G   590G  1408G 29.52 1.30 osd.15
>
>  16  2.0  1.0  1999G   595G  1403G 29.80 1.31 osd.16
>
>  17  2.0  1.0  1999G   652G  1346G 32.64 1.44 osd.17
>
>  18  2.0  1.0  1999G   544G  1454G 27.23 1.20 osd.18
>
>  19  2.0  1.0  1999G   665G  1333G 33.27 1.46 osd.19
>
>  -90-  0  0  0 00 host
> or1010051251045
>  -6 20.0-  0  0  0 00 rack rack_C1
>
>  -7 20.0- 19990G  5956G 14033G 29.80 1.31 host
> or1010051251042
>  20  2.0  1.0  1999G   701G  1297G 35.11 1.55 osd.20
>
>  21  2.0  1.0  1999G   573G  1425G 28.70 1.26 osd.21
>
>  22  2.0  1.0  1999G   652G  1346G 32.64 1.44 osd.22
>
>  23  2.0  1.0  1999G   612G  1386G 30.62 1.35 osd.23
>
>  24  2.0  1.0  1999G   614G  1384G 30.74 1.35 osd.24
>
>  25  2.0  1.0  1999G   561G  1437G 28.11 1.24 osd.25
>
>  26  2.0  1.0  1999G   558G  1440G 27.93 1.23 osd.26
>
>  27  2.0  1.0  1999G   610G  1388G 30.52 1.34 osd.27
>
>  28  2.0  1.0  1999G   515G  1483G 25.81 1.14 osd.28
>
>  29  2.0  1.0  1999G   555G  1443G 27.78 1.22 osd.29
>
> -100-  0  0  0 00 host
> or1010051251046
> -110-  0  0  0 00 host
> or1010051251023
> TOTAL 79793G 18126G 61566G 22.72
>
> MIN/MAX VAR: 0/1.55  STDDEV: 8.26
>
>
> Thanks
> Pardhiv karri
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] [client.rgw.hostname] or [client.radosgw.hostname] ?

2018-05-22 Thread David Turner
We use radosgw in our deployment. It doesn't really matter as you can
specify the key in the config file.  You could call it
client.thatobjectthing.hostname and it would work fine.
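
For example, a hedged ceph.conf sketch; the host name and keyring path are
placeholders, and the section name just has to match the cephx client name the
keyring was created for:

[client.rgw.ceph-test-rgw-01]
    host = ceph-test-rgw-01
    keyring = /var/lib/ceph/radosgw/ceph-rgw.ceph-test-rgw-01/keyring
    rgw_frontends = "civetweb port=7480"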

On Tue, May 22, 2018, 5:54 AM Massimo Sgaravatto <
massimo.sgarava...@gmail.com> wrote:

> # ls /var/lib/ceph/radosgw/
> ceph-rgw.ceph-test-rgw-01
>
>
> So [client.rgw.ceph-test-rgw-01]
>
> Thanks, Massimo
>
>
> On Tue, May 22, 2018 at 6:28 AM, Marc Roos 
> wrote:
>
>>
>> I can relate to your issue, I am always looking at
>>
>> /var/lib/ceph/
>>
>> See what is used there
>>
>>
>> -Original Message-
>> From: Massimo Sgaravatto [mailto:massimo.sgarava...@gmail.com]
>> Sent: dinsdag 22 mei 2018 11:46
>> To: Ceph Users
>> Subject: [ceph-users] [client.rgw.hostname] or [client.radosgw.hostname]
>> ?
>>
>> I am really confused about the use of [client.rgw.hostname] or
>> [client.radosgw.hostname] in the configuration file. I don't understand
>> if they have different purposes or if there is just a problem with
>> documentation.
>>
>>
>> E.g.:
>>
>> http://docs.ceph.com/docs/luminous/start/quick-rgw/
>>
>>
>> says that [client.rgw.hostname] should be used
>>
>> while:
>>
>> http://docs.ceph.com/docs/luminous/radosgw/config-ref/
>>
>>
>> talks about [client.radosgw.{instance-name}]
>>
>>
>> In my luminous-centos7 cluster it looks like only [client.rgw.hostname]
>> works
>>
>>
>>
>> Thanks, Massimo
>>
>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Several questions on the radosgw-openstack integration

2018-05-22 Thread Massimo Sgaravatto
I have several questions on the radosgw - OpenStack integration.

I was more or less able to set it (using a Luminous ceph cluster
and an Ocata OpenStack cloud), but I don't know if it working as expected.


So, the questions:


1.
I don't understand the meaning of the attribute "rgw keystone implicit tenants".
If I set "rgw keystone implicit tenants = false", accounts are created
using id:

<keystone project id>, and the display name is the name of the OpenStack
project.


If I set "rgw keystone implicit tenants = true", accounts are created using
id:

<keystone project id>$<keystone project id>

and, again, the display name is the name of the OpenStack project


So there is one account per OpenStack project in both cases.
I would have expected two radosgw accounts for 2 OpenStack users belonging
to the same project when setting "rgw keystone implicit tenants = true".


2
Are OpenStack users supposed to access their data only using Swift, or
also via S3?
In the latter case, how can the user find her S3 credentials?
I am not able to find the S3 keys for such OpenStack users even using
radosgw-admin:

# radosgw-admin user info
--uid="a22db12575694c9e9f8650dde73ef565\$a22db12575694c9e9f8650dde73ef565"
--rgw-realm=cloudtest
...
...
 "keys": [],
...
...


3
How is the admin supposed to set default quota for each project/user ?
How can then the admin modify the quota for a user ?
How can the user see the assigned quota ?

I tried relying on the "rgw user default quota max size" attribute to
set the default quota. It works for users created using "radosgw-admin user
create", while
I am not able to get it working for OpenStack users (see also the thread
"rgw default user quota for OpenStack users").

If I explicitly set the quota for a OpenStack user using:

radosgw-admin quota set --quota-scope=user --max-size=2G
--uid="a22db12575694c9e9f8650dde73ef565\$a22db12575694c9e9f8650dde73ef565"
--rgw-realm=cloudtest
radosgw-admin quota enable --quota-scope=user
--uid="a22db12575694c9e9f8650dde73ef565\$a22db12575694c9e9f8650dde73ef565"
--rgw-realm=cloudtest


this works (i.e. quota is enforced) but such quota is not exposed to the
user (at least it is not reported anywhere in the OpenStack dashboard nor
in the "swift stat" output)


4
I tried creating (using the OpenStack dashboard) containers with public
access.
It looks like this works only if "rgw keystone implicit tenants" is set to
false.
Is this expected?


Many thanks, Massimo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to see PGs of a pool on a OSD

2018-05-22 Thread Caspar Smit
Here you go:

PS: you might have to map your pool names to pool IDs.

http://cephnotes.ksperis.com/blog/2015/02/23/get-the-number-of-placement-groups-per-osd
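
Roughly, that approach boils down to grouping the acting set of every PG by
OSD. A minimal Python sketch, assuming "ceph pg dump --format json" returns a
"pg_stats" list whose entries have "pgid" and "acting" fields (key names can
vary a little between releases):

import json
import subprocess
from collections import defaultdict

dump = json.loads(subprocess.check_output(
    ["ceph", "pg", "dump", "--format", "json"]))

# osd id -> pool id -> number of PGs whose acting set contains that OSD
per_osd = defaultdict(lambda: defaultdict(int))
for pg in dump.get("pg_stats", []):
    pool_id = pg["pgid"].split(".")[0]      # a pgid looks like "<pool>.<seq>"
    for osd in pg["acting"]:
        per_osd[osd][pool_id] += 1

for osd in sorted(per_osd):
    summary = ", ".join("pool %s: %d PGs" % (p, n)
                        for p, n in sorted(per_osd[osd].items()))
    print("osd.%s -> %s" % (osd, summary))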

Kind regards,
Caspar

2018-05-22 9:13 GMT+02:00 Pardhiv Karri :

> Hi,
>
> Our Ceph cluster has 12 pools and only 3 pools are really used. How can I
> see the number of PGs on an OSD and which PGs belong to which pool on that OSD?
>
> Something like below,
> OSD 0 = 1000PGs (500PGs belong to PoolA, 200PGs belong to PoolB, 300PGs
> belong to PoolC)
>
> Thanks,
> Pardhiv Karri
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph - Xen accessing RBDs through libvirt

2018-05-22 Thread Eugen Block

Hi,


So "somthing" goes wrong:

# cat /var/log/libvirt/libxl/libxl-driver.log
-> ...
2018-05-20 15:28:15.270+: libxl:
libxl_bootloader.c:634:bootloader_finished: bootloader failed - consult
logfile /var/log/xen/bootloader.7.log
2018-05-20 15:28:15.270+: libxl:
libxl_exec.c:118:libxl_report_child_exitstatus: bootloader [26640]
exited with error status 1
2018-05-20 15:28:15.271+: libxl:
libxl_create.c:1259:domcreate_rebuild_done: cannot (re-)build domain: -3

# cat /var/log/xen/bootloader.7.log
->
Traceback (most recent call last):
  File "/usr/lib64/xen/bin/pygrub", line 896, in 
part_offs = get_partition_offsets(file)
  File "/usr/lib64/xen/bin/pygrub", line 113, in get_partition_offsets
image_type = identify_disk_image(file)
  File "/usr/lib64/xen/bin/pygrub", line 56, in identify_disk_image
fd = os.open(file, os.O_RDONLY)
OSError: [Errno 2] No such file or directory:
'rbd:devel-pool/testvm3.rbd:id=libvirt:key=AQBThwFbGFRYFx==:auth_supported=cephx\\;none:mon_host=10.20.30.1\\:6789\\;10.20.30.2\\:6789\\;10.20.30.3\\:6789'


we used to work with Xen hypervisors before we switched to KVM; all
the VMs are within OpenStack. There was one thing we had to configure
for Xen instances: the base image needed two image properties,
"hypervisor_type = xen" and "kernel_id = <image uuid>", where the image
for the kernel_id was uploaded from /usr/lib/grub2/x86_64-xen/grub.xen.
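
A hedged sketch of how that looks with the OpenStack CLI; the image names and
the kernel UUID are placeholders, and the disk/container format for the
grub.xen image may need adjusting for your Glance setup:

# upload the PV-grub kernel once and note its UUID
openstack image create --disk-format aki --container-format aki \
    --file /usr/lib/grub2/x86_64-xen/grub.xen pvgrub-x86_64

# tag the base image so it boots as a Xen guest with that kernel
openstack image set --property hypervisor_type=xen \
    --property kernel_id=<uuid-of-pvgrub-x86_64> my-base-image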

For VMs independent of OpenStack we had to provide the kernel like this:

# kernel="/usr/lib/grub2/x86_64-xen/grub.xen"
kernel="/usr/lib/grub2/i386-xen/grub.xen"

I'm not sure if this is all that's required in your environment but we  
managed to run Xen VMs with Ceph backend.


Regards,
Eugen


Zitat von thg :


Hi all@list,

my background: I'm doing Xen since 10++ years, many years with DRBD for
high availability, since some time I'm using preferable GlusterFS with
FUSE as replicated storage, where I place the image-files for the vms.

In my current project we started (successfully) with Xen/GlusterFS too,
but the provider where we placed the servers widely uses Ceph,
so we decided to switch in order to get better support for this.

Unfortunately I'm new to Ceph, but with the help of a technician we now
have a 3-node Ceph cluster running that seems to work fine.

Hardware:
- Xeons, 24 Cores, 256 GB RAM,
  2x 240 GB system-SSDs RAID1, 4x 1,92 TB data-SSDs (no RAID)

Software we are using:
- CentOS 7.5.1804
- Kernel: 4.9.86-30.el7 @centos-virt-xen-48
- Xen: 4.8.3-5.el7  @centos-virt-xen-48
- libvirt-xen: 4.1.0-2.xen48.el7@centos-virt-xen-48
- Ceph: 2:12.2.5-0.el7  @Ceph


What is working:
I've converted a vm to a RBD-device, mapped it, mounted it and can start
this as pvm on the Xen hypervisor via xl create:

# qemu-img convert -O rbd img/testvm.img rbd:devel-pool/testvm3.rbd
# rbd ls -l devel-pool
-> NAME  SIZE PARENT FMT PROT LOCK
   ...
   testvm3.rbd 16384M  2
# rbd info devel-pool/testvm3.rbd
-> rbd image 'testvm3.rbd':
   size 16384 MB in 4096 objects
   order 22 (4096 kB objects)
   block_name_prefix: rbd_data.fac72ae8944a
   format: 2
   features: layering, exclusive-lock, object-map, fast-diff,
deep-flatten
   flags:
   create_timestamp: Sun May 20 14:13:42 2018
# qemu-img info rbd:devel-pool/testvm3.rbd
-> image: rbd:devel-pool/testvm3.rbd
   file format: raw
   virtual size: 16G (17179869184 bytes)
   disk size: unavailable

# rbd feature disable devel-pool/testvm2.rbd deep-flatten, fast-diff,
object-map (otherwise mapping does not work)
# rbd info devel-pool/testvm3.rbd
-> rbd image 'testvm3.rbd':
   size 16384 MB in 4096 objects
   order 22 (4096 kB objects)
   block_name_prefix: rbd_data.acda2ae8944a
   format: 2
   features: layering, exclusive-lock
   ...
# rbd map devel-pool/testvm3.rbd
-> /dev/rbd0
# rbd showmapped
-> id pool   image   snap device
   0  devel-pool testvm3.rbd -/dev/rbd0
# fdisk -l /dev/rbd0
-> Disk /dev/rbd0: 17.2 GB, 17179869184 bytes, 33554432 sectors
   Units = sectors of 1 * 512 = 512 bytes
   Sector size (logical/physical): 512 bytes / 512 bytes
   ...
Device Boot  Start End  Blocks   Id  System
   /dev/rbd0p1   *2048 2099199 1048576   83  Linux
   /dev/rbd0p2 20992002936217513631488   83  Linux
   /dev/rbd0p32936217633554431 2096128   82  Linux swap
   ...
# mount /dev/rbd0p2 /mnt
# ll /mnt/
-> ...
   lrwxrwxrwx.  1 root root7 Jan  2 23:42 bin -> usr/bin
   drwxr-xr-x.  2 root root6 Jan  2 23:42 boot
   drwxr-xr-x.  2 root root6 Jan  2 23:42 dev
   drwxr-xr-x. 81 root root 8192 May  7 02:08 etc
   drwxr-xr-x.  8 root root   98 Jan 29 02:19 home
   ...
   drwxr-xr-x. 19 root root  267 Jan  3 13:22 var
# umount /dev/rbd0p2

# cat testvm3.rbd0
-> name = "testvm3"
   ...
   disk = [ 

Re: [ceph-users] [client.rgw.hostname] or [client.radosgw.hostname] ?

2018-05-22 Thread Massimo Sgaravatto
# ls /var/lib/ceph/radosgw/
ceph-rgw.ceph-test-rgw-01


So [client.rgw.ceph-test-rgw-01]

Thanks, Massimo


On Tue, May 22, 2018 at 6:28 AM, Marc Roos  wrote:

>
> I can relate to your issue, I am always looking at
>
> /var/lib/ceph/
>
> See what is used there
>
>
> -Original Message-
> From: Massimo Sgaravatto [mailto:massimo.sgarava...@gmail.com]
> Sent: dinsdag 22 mei 2018 11:46
> To: Ceph Users
> Subject: [ceph-users] [client.rgw.hostname] or [client.radosgw.hostname]
> ?
>
> I am really confused about the use of [client.rgw.hostname] or
> [client.radosgw.hostname] in the configuration file. I don't understand
> if they have different purposes or if there is just a problem with
> documentation.
>
>
> E.g.:
>
> http://docs.ceph.com/docs/luminous/start/quick-rgw/
>
>
> says that [client.rgw.hostname] should be used
>
> while:
>
> http://docs.ceph.com/docs/luminous/radosgw/config-ref/
>
>
> talks about [client.radosgw.{instance-name}]
>
>
> In my luminous-centos7 cluster it looks like only [client.rgw.hostname]
> works
>
>
>
> Thanks, Massimo
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [client.rgw.hostname] or [client.radosgw.hostname] ?

2018-05-22 Thread Marc Roos
 
I can relate to your issue, I am always looking at 

/var/lib/ceph/

See what is used there


-Original Message-
From: Massimo Sgaravatto [mailto:massimo.sgarava...@gmail.com] 
Sent: dinsdag 22 mei 2018 11:46
To: Ceph Users
Subject: [ceph-users] [client.rgw.hostname] or [client.radosgw.hostname] 
?

I am really confused about the use of [client.rgw.hostname] or  
[client.radosgw.hostname] in the configuration file. I don't understand 
if they have different purposes or if there is just a problem with 
documentation. 


E.g.:

http://docs.ceph.com/docs/luminous/start/quick-rgw/


says that [client.rgw.hostname] should be used

while:

http://docs.ceph.com/docs/luminous/radosgw/config-ref/


talks about [client.radosgw.{instance-name}]


In my luminous-centos7 cluster it looks like only [client.rgw.hostname]  
works 



Thanks, Massimo


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [client.rgw.hostname] or [client.radosgw.hostname] ?

2018-05-22 Thread Massimo Sgaravatto
I am really confused about the use of [client.rgw.hostname] or
[client.radosgw.hostname] in the configuration file. I don't understand if
they have different purposes or if there is just a problem with
documentation.


E.g.:

http://docs.ceph.com/docs/luminous/start/quick-rgw/

says that [client.rgw.hostname] should be used

while:

http://docs.ceph.com/docs/luminous/radosgw/config-ref/

talks about [client.radosgw.{instance-name}]


In my luminous-centos7 cluster it looks like only [client.rgw.hostname]
works



Thanks, Massimo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] leveldb to rocksdb migration

2018-05-22 Thread Захаров Алексей
Hi all.
I'm trying to change the OSDs' kv backend using the instructions mentioned here:
http://pic.doit.com.cn/ceph/pdf/20180322/4/0401.pdf

But the ceph-osdomap-tool --check step fails with the following error:
ceph-osdomap-tool: /build/ceph-12.2.5/src/rocksdb/db/version_edit.h:188: void 
rocksdb::VersionEdit::AddFile(int, uint64_t, uint32_t, uint64_t, const 
rocksdb::InternalKey&, const rocksdb::InternalKey&, const SequenceNumber&, 
const SequenceNumber&, bool): Assertion `smallest_seqno <= largest_seqno' 
failed.
*** Caught signal (Aborted) **
 in thread 7f1b8d88e640 thread_name:ceph-osdomap-to
 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous 
(stable)
 1: (()+0x189194) [0x55b3e66fa194]
 2: (()+0x11390) [0x7f1b84297390]
 3: (gsignal()+0x38) [0x7f1b83244428]
 4: (abort()+0x16a) [0x7f1b8324602a]
 5: (()+0x2dbd7) [0x7f1b8323cbd7]
 6: (()+0x2dc82) [0x7f1b8323cc82]
 7: (rocksdb::VersionSet::WriteSnapshot(rocksdb::log::Writer*)+0xad0) 
[0x55b3e675abc0]
 8: (rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*, 
rocksdb::MutableCFOptions const&, rocksdb::autovector const&, rocksdb::InstrumentedMutex*, rocksdb::Directory*, bool, 
rocksdb::ColumnFamilyOptions const*)+0x1624) [0x55b3e6764524]
 9: (rocksdb::DBImpl::RecoverLogFiles(std::vector > const&, unsigned long*, bool)+0x1c48) 
[0x55b3e672d438]
 10: (rocksdb::DBImpl::Recover(std::vector const&, bool, bool, 
bool)+0x8c4) [0x55b3e672e544]
 11: (rocksdb::DB::Open(rocksdb::DBOptions const&, 
std::__cxx11::basic_string 
const&, std::vector const&, 
std::vector >*, rocksdb::DB**)+0xedc) 
[0x55b3e672f90c]
 12: (rocksdb::DB::Open(rocksdb::Options const&, 
std::__cxx11::basic_string 
const&, rocksdb::DB**)+0x6b1) [0x55b3e67311a1]
 13: (RocksDBStore::do_open(std::ostream&, bool)+0x89c) [0x55b3e66e4adc]
 14: (main()+0xa55) [0x55b3e6638ea5]
 15: (__libc_start_main()+0xf0) [0x7f1b8322f830]
 16: (_start()+0x29) [0x55b3e66b73b9]
Aborted

I've found this issue https://github.com/facebook/rocksdb/issues/946, but I
don't see an easy workaround for it.
Has anyone faced the same problem?
Is there a tool to add fields to manifest files? For example, to add
largest_seqno. Or is there a way to ignore this assert?

Using: ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous 
(stable)
Updated from jewel 10.2.10.

-- 
Regards,
Aleksei Zakharov

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw default user quota for OpenStack users

2018-05-22 Thread Massimo Sgaravatto
The openstack nodes have their own ceph config file, but they have the same
content

On Mon, May 21, 2018 at 4:14 PM, David Turner  wrote:

> Is openstack/keystone maintaining its own version of the ceph config
> file?  I know that's the case with software like Proxmox.  That might be a
> good place to start.  You could also look at the keystone code to see if
> it's manually specifying things based on an application config file.
>
> On Mon, May 21, 2018 at 9:21 AM Massimo Sgaravatto <
> massimo.sgarava...@gmail.com> wrote:
>
>> I set:
>>
>>  rgw user default quota max size = 2G
>>
>> in the ceph configuration file and I see that this works for users
>> created using the "radosgw-admin user create" command [**]
>>
>> I see that instead quota is not set for users created through keystone.
>>
>> This [*] is the relevant part of my ceph configuration file
>>
>> Any hints ?
>>
>> Thanks, Massimo
>>
>>
>> [*]
>>
>> [global]
>> rgw user default quota max size = 2G
>>
>> [client.myhostname]
>> rgw_frontends="civetweb port=7480"
>> rgw_zone=cloudtest
>> rgw_zonegroup=cloudtest
>> rgw_realm=cloudtest
>> debug rgw = 5
>> rgw keystone url = https://fqdn:35357
>> rgw keystone accepted roles = project_manager, _member_, user, admin,
>> Member
>> rgw keystone api version = 3
>> rgw keystone admin token = xyz
>> rgw keystone token cache size = 0
>> rgw s3 auth use keystone = true
>> nss_db_path = /var/ceph/nss
>>
>>
>> [**]
>>
>> # radosgw-admin user create --uid=xyz --display-name="xyz"
>> --rgw-realm=cloudtest
>>
>>
>> # radosgw-admin user info --uid=xyz --display-name="xyz"
>> --rgw-realm=cloudtest
>> ...
>> "user_quota": {
>> "enabled": true,
>> "check_on_raw": false,
>> "max_size": 2147483648 <(214)%20748-3648>,
>> "max_size_kb": 2097152,
>> "max_objects": -1
>> },
>> ...
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph - Xen accessing RBDs through libvirt

2018-05-22 Thread thg
Hi Marc,

> in the last weeks we spent some time improving RBDSR, an RBD storage
> repository for XenServer.
> RBDSR is capable of using RBD via fuse, krbd and rbd-nbd.

I will have a look at this, thank you very much!

> I am pretty sure, that we will use this in production in a few weeks :-)

Well, better would be in a "few days", to get the dev environment, which is
still running on GlusterFS, up on Ceph ;-)

> Probably this might be an alternative for you if you have in-depth Xen
> knowledge and Python programming skills.

We use Xen, not XenServer, thus the xl and not the xe tool stack. Python
would be "doable", but my Ceph knowledge is still quite limited.

So a (working) libvirt solution would be the preferred one, and according
to the documentation it should normally be supported ...

-- 

kind regards,

thg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph MeetUp Berlin – May 28

2018-05-22 Thread Robert Sander
On 19.05.2018 00:16, Gregory Farnum wrote:
> Is there any chance of sharing those slides when the meetup has
> finished? It sounds interesting! :)

We usually put a link to the slides on the MeetUp page.

Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How to see PGs of a pool on a OSD

2018-05-22 Thread Pardhiv Karri
Hi,

Our Ceph cluster has 12 pools and only 3 pools are really used. How can I
see the number of PGs on an OSD and which PGs belong to which pool on that OSD?

Something like below,
OSD 0 = 1000PGs (500PGs belong to PoolA, 200PGs belong to PoolB, 300PGs
belong to PoolC)

Thanks,
Pardhiv Karri
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph - Xen accessing RBDs through libvirt

2018-05-22 Thread Marc Schöchlin
Hello thg,

in the last weeks we spent some time improving RBDSR, an RBD storage
repository for XenServer.
RBDSR is capable of using RBD via fuse, krbd and rbd-nbd.

Our improvements are based on
https://github.com/rposudnevskiy/RBDSR/tree/v2.0 and are currently
published at https://github.com/vico-research-and-consulting/RBDSR.

We are using it in rbd-nbd mode, because it minimizes kernel dependencies
while providing good flexibility and performance.
(http://docs.ceph.com/docs/master/man/8/rbd-nbd/)
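
For anyone new to rbd-nbd, basic usage looks roughly like this (the pool and
image names are placeholders):

# rbd-nbd map rbd/vm-disk-1      (prints the attached device, e.g. /dev/nbd0)
# rbd-nbd list-mapped            (show current mappings)
# rbd-nbd unmap /dev/nbd0        (detach again)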

The implementation still seems to need lots of improvements, but our current
test results were promising from a stability and performance point of view.
I am pretty sure that we will use this in production in a few weeks :-)

Probably this might be an alternative for you if you have in-depth Xen
knowledge and Python programming skills.

Regards
Marc


Our setup will look like this:

  * XEN
  o 220 virtual machines
  o 9 XEN Server 7.2 nodes (because of licencing reasons)
  o 4 * 1  GBIT LACP
  * Ceph
  o Luminous/12.2.5
  o Ubuntu 16.04
  o 5 OSD Nodes (24*8 TB HDD OSDs, 48*1TB SSD OSDS, Bluestore, 6Gb
Cache per OSD)
  o Size per OSD, 192GB RAM, 56 HT CPUs)
  o 3 Mons (64 GB RAM, 200GB SSD, 4 visible CPUs)
  o 2 * 10 GBIT, SFP+, bonded xmit_hash_policy layer3+4 for Ceph


Am 20.05.2018 um 20:15 schrieb thg:
> Hi all@list,
>
> my background: I'm doing Xen since 10++ years, many years with DRBD for
> high availability, since some time I'm using preferable GlusterFS with
> FUSE as replicated storage, where I place the image-files for the vms.
>
> In my current project we started (successfully) with Xen/GlusterFS too,
> but the provider where we placed the servers widely uses Ceph,
> so we decided to switch in order to get better support for this.
>
> Unfortunately I'm new to Ceph, but with the help of a technician we now
> have a 3-node Ceph cluster running that seems to work fine.
>
> Hardware:
> - Xeons, 24 Cores, 256 GB RAM,
>   2x 240 GB system-SSDs RAID1, 4x 1,92 TB data-SSDs (no RAID)
>
> Software we are using:
> - CentOS 7.5.1804
> - Kernel: 4.9.86-30.el7 @centos-virt-xen-48
> - Xen: 4.8.3-5.el7  @centos-virt-xen-48
> - libvirt-xen: 4.1.0-2.xen48.el7@centos-virt-xen-48
> - Ceph: 2:12.2.5-0.el7  @Ceph
>
>
> What is working:
> I've converted a vm to a RBD-device, mapped it, mounted it and can start
> this as pvm on the Xen hypervisor via xl create:
>
> # qemu-img convert -O rbd img/testvm.img rbd:devel-pool/testvm3.rbd
> # rbd ls -l devel-pool
> -> NAME  SIZE PARENT FMT PROT LOCK
>...
>testvm3.rbd 16384M  2
> # rbd info devel-pool/testvm3.rbd
> -> rbd image 'testvm3.rbd':
>size 16384 MB in 4096 objects
>order 22 (4096 kB objects)
>block_name_prefix: rbd_data.fac72ae8944a
>format: 2
>features: layering, exclusive-lock, object-map, fast-diff,
> deep-flatten
>flags:
>create_timestamp: Sun May 20 14:13:42 2018
> # qemu-img info rbd:devel-pool/testvm3.rbd
> -> image: rbd:devel-pool/testvm3.rbd
>file format: raw
>virtual size: 16G (17179869184 bytes)
>disk size: unavailable
>
> # rbd feature disable devel-pool/testvm2.rbd deep-flatten, fast-diff,
> object-map (otherwise mapping does not work)
> # rbd info devel-pool/testvm3.rbd
> -> rbd image 'testvm3.rbd':
>size 16384 MB in 4096 objects
>order 22 (4096 kB objects)
>block_name_prefix: rbd_data.acda2ae8944a
>format: 2
>features: layering, exclusive-lock
>...
> # rbd map devel-pool/testvm3.rbd
> -> /dev/rbd0
> # rbd showmapped
> -> id pool   image   snap device
>0  devel-pool testvm3.rbd -/dev/rbd0
> # fdisk -l /dev/rbd0
> -> Disk /dev/rbd0: 17.2 GB, 17179869184 bytes, 33554432 sectors
>Units = sectors of 1 * 512 = 512 bytes
>Sector size (logical/physical): 512 bytes / 512 bytes
>...
> Device Boot  Start End  Blocks   Id  System
>/dev/rbd0p1   *2048 2099199 1048576   83  Linux
>/dev/rbd0p2 20992002936217513631488   83  Linux
>/dev/rbd0p32936217633554431 2096128   82  Linux swap
>...
> # mount /dev/rbd0p2 /mnt
> # ll /mnt/
> -> ...
>lrwxrwxrwx.  1 root root7 Jan  2 23:42 bin -> usr/bin
>drwxr-xr-x.  2 root root6 Jan  2 23:42 boot
>drwxr-xr-x.  2 root root6 Jan  2 23:42 dev
>drwxr-xr-x. 81 root root 8192 May  7 02:08 etc
>drwxr-xr-x.  8 root root   98 Jan 29 02:19 home
>...
>drwxr-xr-x. 19 root root  267 Jan  3 13:22 var
> # umount /dev/rbd0p2
>
> # cat testvm3.rbd0
> -> name = "testvm3"
>...
>disk = [ "phy:/dev/rbd0,xvda,w" ]
>...
> # xl create -c testvm3.rbd0
> -> Parsing config from vpngw1.rbd0
>Using  to parse /grub2/grub.cfg
>...
>Welcome to CentOS Linux 7 (Core)!
>...
>CentOS Linux 7 (Core)
> Kernel 

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-22 Thread Alexandre DERUMIER
Hi, some new stats: mds memory is now 16G.

I have almost the same number of items and bytes in cache vs. some weeks ago when
the mds was using 8G. (ceph 12.2.5)


root@ceph4-2:~# while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf dump | 
jq '.mds_mem.rss'; ceph daemon mds.ceph4-2.odiso.net dump_mempools | jq -c 
'.mds_co'; done
16905052
{"items":43350988,"bytes":5257428143}
16905052
{"items":43428329,"bytes":5283850173}
16905052
{"items":43209167,"bytes":5208578149}
16905052
{"items":43177631,"bytes":5198833577}
16905052
{"items":43312734,"bytes":5252649462}
16905052
{"items":43355753,"bytes":5277197972}
16905052
{"items":43700693,"bytes":5303376141}
16905052
{"items":43115809,"bytes":5156628138}
^C




root@ceph4-2:~# ceph status
  cluster:
id: e22b8e83-3036-4fe5-8fd5-5ce9d539beca
health: HEALTH_OK
 
  services:
mon: 3 daemons, quorum ceph4-1,ceph4-2,ceph4-3
mgr: ceph4-1.odiso.net(active), standbys: ceph4-2.odiso.net, 
ceph4-3.odiso.net
mds: cephfs4-1/1/1 up  {0=ceph4-2.odiso.net=up:active}, 2 up:standby
osd: 18 osds: 18 up, 18 in
rgw: 3 daemons active
 
  data:
pools:   11 pools, 1992 pgs
objects: 75677k objects, 6045 GB
usage:   20579 GB used, 6246 GB / 26825 GB avail
pgs: 1992 active+clean
 
  io:
client:   14441 kB/s rd, 2550 kB/s wr, 371 op/s rd, 95 op/s wr


root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net cache status 
{
"pool": {
"items": 44523608,
"bytes": 5326049009
}
}


root@ceph4-2:~# ceph daemon mds.ceph4-2.odiso.net perf dump 
{
"AsyncMessenger::Worker-0": {
"msgr_recv_messages": 798876013,
"msgr_send_messages": 825999506,
"msgr_recv_bytes": 7003223097381,
"msgr_send_bytes": 691501283744,
"msgr_created_connections": 148,
"msgr_active_connections": 146,
"msgr_running_total_time": 39914.832387470,
"msgr_running_send_time": 13744.704199430,
"msgr_running_recv_time": 32342.160588451,
"msgr_running_fast_dispatch_time": 5996.336446782
},
"AsyncMessenger::Worker-1": {
"msgr_recv_messages": 429668771,
"msgr_send_messages": 414760220,
"msgr_recv_bytes": 5003149410825,
"msgr_send_bytes": 396281427789,
"msgr_created_connections": 132,
"msgr_active_connections": 132,
"msgr_running_total_time": 23644.410515392,
"msgr_running_send_time": 7669.068710688,
"msgr_running_recv_time": 19751.610043696,
"msgr_running_fast_dispatch_time": 4331.023453385
},
"AsyncMessenger::Worker-2": {
"msgr_recv_messages": 1312910919,
"msgr_send_messages": 1260040403,
"msgr_recv_bytes": 5330386980976,
"msgr_send_bytes": 3341965016878,
"msgr_created_connections": 143,
"msgr_active_connections": 138,
"msgr_running_total_time": 61696.635450100,
"msgr_running_send_time": 23491.027014598,
"msgr_running_recv_time": 53858.409319734,
"msgr_running_fast_dispatch_time": 4312.451966809
},
"finisher-PurgeQueue": {
"queue_len": 0,
"complete_latency": {
"avgcount": 1889416,
"sum": 29224.227703697,
"avgtime": 0.015467333
}
},
"mds": {
"request": 1822420924,
"reply": 1822420886,
"reply_latency": {
"avgcount": 1822420886,
"sum": 5258467.616943274,
"avgtime": 0.002885429
},
"forward": 0,
"dir_fetch": 116035485,
"dir_commit": 1865012,
"dir_split": 17,
"dir_merge": 24,
"inode_max": 2147483647,
"inodes": 1600438,
"inodes_top": 210492,
"inodes_bottom": 100560,
"inodes_pin_tail": 1289386,
"inodes_pinned": 1299735,
"inodes_expired": 3476046,
"inodes_with_caps": 1299137,
"caps": 2211546,
"subtrees": 2,
"traverse": 1953482456,
"traverse_hit": 1127647211,
"traverse_forward": 0,
"traverse_discover": 0,
"traverse_dir_fetch": 105833969,
"traverse_remote_ino": 31686,
"traverse_lock": 4344,
"load_cent": 182244014474,
"q": 104,
"exported": 0,
"exported_inodes": 0,
"imported": 0,
"imported_inodes": 0
},
"mds_cache": {
"num_strays": 14980,
"num_strays_delayed": 7,
"num_strays_enqueuing": 0,
"strays_created": 1672815,
"strays_enqueued": 1659514,
"strays_reintegrated": 666,
"strays_migrated": 0,
"num_recovering_processing": 0,
"num_recovering_enqueued": 0,
"num_recovering_prioritized": 0,
"recovery_started": 2,
"recovery_completed": 2,
"ireq_enqueue_scrub": 0,
"ireq_exportdir": 0,
"ireq_flush": 0,
"ireq_fragmentdir": 41,
"ireq_fragstats": 0,
"ireq_inodestats": 0
},
"mds_log": {
"evadd": 357717092,

[ceph-users] Some OSDs never get any data or PGs

2018-05-22 Thread Pardhiv Karri
Hi,

We are using Ceph Hammer 0.94.9. Some of our OSDs never get any data or PGs
even at their full crush weight, up and running. Rest of the OSDs are at
50% full. Is there a bug in Hammer that is causing this issue? Does
upgrading to Jewel or Luminous fix this issue?

I tried deleting and recreating this OSD N number of times and still the
same issue. I am seeing this in 3 of our 4 ceph clusters in different
datacenters. We are using HDD as OSD and SSD as Journal drive.

The below is from our lab and OSD 38 is the one that never fills.


ID  WEIGHT   REWEIGHT SIZE   USEAVAIL  %USE  VAR  TYPE NAME

 -1 80.0-  0  0  0 00 root default

 -2 40.0- 39812G  6190G 33521G 15.55 0.68 rack rack_A1

 -3 20.0- 19852G  3718G 16134G 18.73 0.82 host
or1010051251040
  0  2.0  1.0  1861G   450G  1410G 24.21 1.07 osd.0

  1  2.0  1.0  1999G   325G  1673G 16.29 0.72 osd.1

  2  2.0  1.0  1999G   336G  1662G 16.85 0.74 osd.2

  3  2.0  1.0  1999G   386G  1612G 19.35 0.85 osd.3

  4  2.0  1.0  1999G   385G  1613G 19.30 0.85 osd.4

  5  2.0  1.0  1999G   364G  1634G 18.21 0.80 osd.5

  6  2.0  1.0  1999G   319G  1679G 15.99 0.70 osd.6

  7  2.0  1.0  1999G   434G  1564G 21.73 0.96 osd.7

  8  2.0  1.0  1999G   352G  1646G 17.63 0.78 osd.8

  9  2.0  1.0  1999G   362G  1636G 18.12 0.80 osd.9

 -8 20.0- 19959G  2472G 17387G 12.39 0.55 host
or1010051251044
 30  2.0  1.0  1999G   362G  1636G 18.14 0.80 osd.30

 31  2.0  1.0  1999G   293G  1705G 14.66 0.65 osd.31

 32  2.0  1.0  1999G   202G  1796G 10.12 0.45 osd.32

 33  2.0  1.0  1999G   215G  1783G 10.76 0.47 osd.33

 34  2.0  1.0  1999G   192G  1806G  9.61 0.42 osd.34

 35  2.0  1.0  1999G   337G  1661G 16.90 0.74 osd.35

 36  2.0  1.0  1999G   206G  1792G 10.35 0.46 osd.36

 37  2.0  1.0  1999G   266G  1732G 13.33 0.59 osd.37

 38  2.0  1.0  1999G 55836k  1998G  0.000 osd.38

 39  2.0  1.0  1968G   396G  1472G 20.12 0.89 osd.39

 -4 20.0-  0  0  0 00 rack rack_B1

 -5 20.0- 19990G  5978G 14011G 29.91 1.32 host
or1010051251041
 10  2.0  1.0  1999G   605G  1393G 30.27 1.33 osd.10

 11  2.0  1.0  1999G   592G  1406G 29.62 1.30 osd.11

 12  2.0  1.0  1999G   539G  1460G 26.96 1.19 osd.12

 13  2.0  1.0  1999G   684G  1314G 34.22 1.51 osd.13

 14  2.0  1.0  1999G   510G  1488G 25.56 1.13 osd.14

 15  2.0  1.0  1999G   590G  1408G 29.52 1.30 osd.15

 16  2.0  1.0  1999G   595G  1403G 29.80 1.31 osd.16

 17  2.0  1.0  1999G   652G  1346G 32.64 1.44 osd.17

 18  2.0  1.0  1999G   544G  1454G 27.23 1.20 osd.18

 19  2.0  1.0  1999G   665G  1333G 33.27 1.46 osd.19

 -90-  0  0  0 00 host
or1010051251045
 -6 20.0-  0  0  0 00 rack rack_C1

 -7 20.0- 19990G  5956G 14033G 29.80 1.31 host
or1010051251042
 20  2.0  1.0  1999G   701G  1297G 35.11 1.55 osd.20

 21  2.0  1.0  1999G   573G  1425G 28.70 1.26 osd.21

 22  2.0  1.0  1999G   652G  1346G 32.64 1.44 osd.22

 23  2.0  1.0  1999G   612G  1386G 30.62 1.35 osd.23

 24  2.0  1.0  1999G   614G  1384G 30.74 1.35 osd.24

 25  2.0  1.0  1999G   561G  1437G 28.11 1.24 osd.25

 26  2.0  1.0  1999G   558G  1440G 27.93 1.23 osd.26

 27  2.0  1.0  1999G   610G  1388G 30.52 1.34 osd.27

 28  2.0  1.0  1999G   515G  1483G 25.81 1.14 osd.28

 29  2.0  1.0  1999G   555G  1443G 27.78 1.22 osd.29

-100-  0  0  0 00 host
or1010051251046
-110-  0  0  0 00 host
or1010051251023
TOTAL 79793G 18126G 61566G 22.72

MIN/MAX VAR: 0/1.55  STDDEV: 8.26


Thanks
Pardhiv karri
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com