[ceph-users] Cannot create Initial Monitor

2015-12-03 Thread Aakanksha Pudipeddi-SSI
Hello Cephers,

I am unable to create the initial monitor during ceph cluster deployment. I do 
not know what changed since the same recipe used to work until very recently. 
These are the steps I used:
ceph-deploy new  -- works
dpkg -i -R  -- works
ceph-deploy mon create-initial -- fails

Log:
[ceph_deploy.cli][INFO  ] Invoked (1.5.28): /usr/bin/ceph-deploy mon 
create-initial
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  overwrite_conf: False
[ceph_deploy.cli][INFO  ]  subcommand: create-initial
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  cd_conf   : 

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  keyrings  : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts myhost
[ceph_deploy.mon][DEBUG ] detecting platform for host myhost ...
[Myhost][DEBUG ] connection detected need for sudo
[Myhost][DEBUG ] connected to host: myhost
[Myhost][DEBUG ] detect platform information from remote host
[Myhost][DEBUG ] detect machine type
[Myhost][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
[Myhost][DEBUG ] determining if provided host has same hostname in remote
[Myhost][DEBUG ] get remote short hostname
[Myhost][DEBUG ] deploying mon to myhost
[Myhost][DEBUG ] get remote short hostname
[Myhost][DEBUG ] remote hostname: myhost
[Myhost][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[Myhost][DEBUG ] create the mon path if it does not exist
[Myhost][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-myhost/done
[Myhost][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-myhost/done
[Myhost][INFO  ] creating keyring file: 
/var/lib/ceph/tmp/ceph-myhost.mon.keyring
[Myhost][DEBUG ] create the monitor keyring file
[Myhost][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i myhost 
--keyring /var/lib/ceph/tmp/ceph-myhost.mon.keyring
[Myhost][DEBUG ] ceph-mon: renaming mon.noname-a xx.xx.xxx.xx:6789/0 to 
mon.myhost
[Myhost][DEBUG ] ceph-mon: set fsid to 5573b0c6-02fd-4c45-aa89-b88fd08b3b87
[Myhost][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-myhost for 
mon.myhost
[Myhost][INFO  ] unlinking keyring file 
/var/lib/ceph/tmp/ceph-myhost.mon.keyring
[Myhost][DEBUG ] create a done file to avoid re-doing the mon deployment
[Myhost][DEBUG ] create the init path if it does not exist
[Myhost][DEBUG ] locating the `service` executable...
[Myhost][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph 
id=myhost
[Myhost][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.myhost.asok mon_status
[Myhost][ERROR ] admin_socket: exception getting command descriptions: [Errno 
2] No such file or directory
[Myhost][WARNING] monitor: mon.myhost, might not be running yet

I checked the monitor log in /var/log/ceph and it does not have anything 
unusual, just the pid for the ceph-mon process. However, there is no 
/var/run/ceph/ceph-mon.myhost.asok. I do not know at which step this file is 
created, so I am not able to debug the issue further. Any pointers regarding 
this issue would be appreciated. I am using the BLKIN Ceph branch (wip-blkin), 
i.e. 9.0.1 Ceph packages built from source, and my ceph-deploy version is 1.5.28.
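
As a hedged starting point (assuming the stock Upstart setup on trusty and the 
default /var/run/ceph socket directory), something like the following can show 
whether the mon daemon actually started and why the admin socket is missing:

# check whether the mon job is known to upstart and running
sudo initctl list | grep ceph-mon
sudo status ceph-mon cluster=ceph id=myhost

# if it is not running, start it in the foreground with extra debugging
sudo ceph-mon --cluster ceph -i myhost -d --debug-mon 10

# once the daemon is up, the admin socket should exist again
ls -l /var/run/ceph/
sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.myhost.asok mon_status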

Thanks,
Aakanksha



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-disk activate Permission denied problems

2015-12-03 Thread Goncalo Borges

Hi Adrien...

Thanks for the pointer. It effectively solved our issue.

Cheers
G.

On 12/04/2015 12:53 AM, Adrien Gillard wrote:
This is the clean way to handle this. But you can also use udev to do it at 
boot. This is what I found on the mailing list and got working before 
switching to GUIDs:


cat > /etc/udev/rules.d/89-ceph-journal.rules << EOF
KERNEL=="sda?" SUBSYSTEM=="block" OWNER="ceph" GROUP="disk" MODE="0660"
KERNEL=="sdb?" SUBSYSTEM=="block" OWNER="ceph" GROUP="disk" MODE="0660"
KERNEL=="sdc?" SUBSYSTEM=="block" OWNER="ceph" GROUP="disk" MODE="0660"
EOF
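
A hedged follow-up, assuming a udev-based system: the new rule can be loaded 
and replayed without rebooting:

# reload udev rules and replay "add" events for block devices
sudo udevadm control --reload-rules
sudo udevadm trigger --subsystem-match=block --action=add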

On Thu, Dec 3, 2015 at 2:28 PM, Florent B wrote:


Hi,

Is setting the GUID the only way to fix this? I don't use GPT but MBR,
and I don't want to try a conversion on production servers...

On 12/03/2015 08:39 AM, Adrien Gillard wrote:

You should check that the owner of your ceph partitions (both
journal and data) is 'ceph', otherwise the ceph user won't be able to mount them.

You can simply do : chown ceph:disk /dev/sdc3

If this solves your issue, you should set the GPT GUID [1] of the
partitions with a tool like sgdisk to make the ownership persistent across
reboots.

I think only your journal is affected as ceph-disk does not
prepare the partition (WARNING:ceph-disk:Journal /dev/sdc3 was
not prepared with ceph-disk. Symlinking directly)


[1]
https://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_type_GUIDs
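
As an illustrative sketch (assuming the journal is partition 3 of /dev/sdc, as 
in the chown example above), setting the Ceph journal partition type GUID with 
sgdisk looks roughly like this:

# tag the partition with the Ceph journal type GUID so the ceph udev
# rules pick it up and fix its ownership at boot
sudo sgdisk --typecode=3:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdc
sudo partprobe /dev/sdc   # or reboot to re-read the partition table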






--
-
Adrien GILLARD

+33 (0)6 29 06 16 31
gillard.adr...@gmail.com 


--
Goncalo Borges
Research Computing
ARC Centre of Excellence for Particle Physics at the Terascale
School of Physics A28 | University of Sydney, NSW  2006
T: +61 2 93511937

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: Confused about priority of client OP.

2015-12-03 Thread Wukongming
Hi Haomai,
A bit of a tough question I asked below, but do you know the answer?

-
wukongming ID: 12019
Tel:0571-86760239
Dept:2014 UIS2 ONEStor

-----Original Message-----
From: wukongming 12019 (RD)
Sent: 3 December 2015 22:15
To: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
Cc: zhengbin 08747 (RD)
Subject: Confused about priority of client OP.

Hi, All:
I have a question about priorities. We define 
osd_client_op_priority = 63, while CEPH_MSG_PRIO_LOW = 64.
Since there are multiple kinds of IO competing, why not define 
osd_client_op_priority > 64, so that client IO is always handled at the 
highest priority?


-
wukongming ID: 12019
Tel:0571-86760239
Dept:2014 UIS2 ONEStor


-
This e-mail and its attachments contain confidential information from H3C, 
which is
intended only for the person or entity whose address is listed above. Any use 
of the
information contained herein in any way (including, but not limited to, total 
or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify 
the sender
by phone or email immediately and delete it!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Re: How long will the logs be kept?

2015-12-03 Thread Wukongming
Yes, I can find the ceph logrotate configuration file in the /etc/logrotate.d 
directory.
I also found something weird.

drwxr-xr-x  2 root root   4.0K Dec  3 14:54 ./
drwxrwxr-x 19 root syslog 4.0K Dec  3 13:33 ../
-rw-------  1 root root      0 Dec  2 06:25 ceph.audit.log
-rw-------  1 root root    85K Nov 25 09:17 ceph.audit.log.1.gz
-rw-------  1 root root   228K Dec  3 16:00 ceph.log
-rw-------  1 root root    28K Dec  3 06:23 ceph.log.1.gz
-rw-------  1 root root   374K Dec  2 06:22 ceph.log.2.gz
-rw-r--r--  1 root root   4.3M Dec  3 16:01 ceph-mon.wkm01.log
-rw-r--r--  1 root root   561K Dec  3 06:25 ceph-mon.wkm01.log.1.gz
-rw-r--r--  1 root root   2.2M Dec  2 06:25 ceph-mon.wkm01.log.2.gz
-rw-r--r--  1 root root      0 Dec  2 06:25 ceph-osd.0.log
-rw-r--r--  1 root root    992 Dec  1 09:09 ceph-osd.0.log.1.gz
-rw-r--r--  1 root root    19K Dec  3 10:51 ceph-osd.2.log
-rw-r--r--  1 root root   2.3K Dec  2 10:50 ceph-osd.2.log.1.gz
-rw-r--r--  1 root root    27K Dec  1 10:31 ceph-osd.2.log.2.gz
-rw-r--r--  1 root root    13K Dec  3 10:23 ceph-osd.5.log
-rw-r--r--  1 root root   1.6K Dec  2 09:57 ceph-osd.5.log.1.gz
-rw-r--r--  1 root root    22K Dec  1 09:51 ceph-osd.5.log.2.gz
-rw-r--r--  1 root root    19K Dec  3 10:51 ceph-osd.8.log
-rw-r--r--  1 root root    18K Dec  2 10:50 ceph-osd.8.log.1
-rw-r--r--  1 root root   261K Dec  1 13:54 ceph-osd.8.log.2

I deployed the ceph cluster on Nov 21. The logs from that day through Dec 1 
(roughly 10 continuous days) were compressed into a single file, which is not 
what I want.
Does any operation affect log compression?

Thanks!
Kongming Wu
-
wukongming ID: 12019
Tel:0571-86760239
Dept:2014 UIS2 ONEStor

-----Original Message-----
From: huang jun [mailto:hjwsm1...@gmail.com]
Sent: 3 December 2015 13:19
To: wukongming 12019 (RD)
Cc: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
Subject: Re: How long will the logs be kept?

it will rotate every week by default; you can see the logrotate file 
/etc/logrotate.d/ceph

2015-12-03 12:37 GMT+08:00 Wukongming :
> Hi, All
> Does anyone know how long, or for how many days, the logs.gz files
> (mon/osd/mds) are kept before being removed?
>
> -
> wukongming ID: 12019
> Tel:0571-86760239
> Dept:2014 UIS2 OneStor
>
> -----
> This e-mail and its attachments contain confidential information from 
> H3C, which is intended only for the person or entity whose address is 
> listed above. Any use of the information contained herein in any way 
> (including, but not limited to, total or partial disclosure, 
> reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error, 
> please notify the sender by phone or email immediately and delete it!



--
thanks
huangjun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-disk list crashes in infernalis

2015-12-03 Thread Stolte, Felix
Hi all,

 

I upgraded from hammer to infernalis today, and even though I had a hard time
doing so, I finally got my cluster running in a healthy state (mainly my
fault, because I did not read the release notes carefully).

 

But when I try to list my disks with "ceph-disk list" I get the following
Traceback:

 

 ceph-disk list
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 3576, in <module>
    main(sys.argv[1:])
  File "/usr/sbin/ceph-disk", line 3532, in main
    main_catch(args.func, args)
  File "/usr/sbin/ceph-disk", line 3554, in main_catch
    func(args)
  File "/usr/sbin/ceph-disk", line 2915, in main_list
    devices = list_devices(args)
  File "/usr/sbin/ceph-disk", line 2855, in list_devices
    partmap = list_all_partitions(args.path)
  File "/usr/sbin/ceph-disk", line 545, in list_all_partitions
    dev_part_list[name] = list_partitions(os.path.join('/dev', name))
  File "/usr/sbin/ceph-disk", line 550, in list_partitions
    if is_mpath(dev):
  File "/usr/sbin/ceph-disk", line 433, in is_mpath
    uuid = get_dm_uuid(dev)
  File "/usr/sbin/ceph-disk", line 421, in get_dm_uuid
    uuid_path = os.path.join(block_path(dev), 'dm', 'uuid')
  File "/usr/sbin/ceph-disk", line 416, in block_path
    rdev = os.stat(path).st_rdev
OSError: [Errno 2] No such file or directory: '/dev/cciss!c0d0'

 

 

I'm running ceph 9.2 on Ubuntu 14.04.3 LTS on HP hardware with an HP P400
RAID controller: a 4-node cluster (3 of them are mons), 5-6 OSDs per node, with
journals on a separate drive.

 

Does anyone know how to solve this or did I hit a bug?

 

Regards Felix

 

Forschungszentrum Juelich GmbH

52425 Juelich

Sitz der Gesellschaft: Juelich

Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498

Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher

Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),

Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,

Prof. Dr. Sebastian M. Schmidt

 



smime.p7s
Description: S/MIME cryptographic signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph infernalis - cannot find the dependency package selinux-policy-base-3.13.1-23.el7_1.18.noarch.rpm

2015-12-03 Thread Xiangyu (Raijin, BP Dept)
When installing ceph infernalis (v9.2.0), it requires the package 
selinux-policy-base-3.13.1-23.el7_1.18.noarch.rpm. I tried searching for it 
with Google but found nothing. Does anyone know how to get it?


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Sizing

2015-12-03 Thread Nick Fisk
I would suggest you forget about 15k disks; there probably isn't much point in 
using them vs SSDs nowadays. For 10k disks, if cost is a key factor, I would 
maybe look at the WD Raptor disks.

In terms of numbers of disks, it's very hard to calculate with the numbers you 
have provided. That simple formula is great if the IO load is constant, but 
what you will often find is that not all VMs will be doing 150 IOPS at once, 
so your actual total figure will be a lot less.

But yes, if you have 3x replication, you will need 3 times the number of disk 
IOPS for writes. Without knowing your read/write split, I would imagine this 
would be very hard to calculate though.

Do you have any current systems running to be able to get a rough idea of how 
much IO you might generate? Otherwise other people with similar sized VM 
workloads might be able to give example usage patterns.
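
As a rough, hedged sketch of the arithmetic (every number below is an 
assumption, and it ignores journal double-writes and latency):

# back-of-the-envelope sizing, purely illustrative
VMS=700; IOPS_PER_VM=150        # from the original post
DISK_IOPS=150                   # assumed per 10k spindle, 4k random
REPLICA=3
WRITE_PCT=70                    # assumed 70% writes / 30% reads

TOTAL=$((VMS * IOPS_PER_VM))
# writes hit REPLICA disks, reads hit one
BACKEND=$((TOTAL * WRITE_PCT / 100 * REPLICA + TOTAL * (100 - WRITE_PCT) / 100))
echo "client IOPS: $TOTAL, backend disk IOPS: $BACKEND, spindles: $((BACKEND / DISK_IOPS))"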

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Sam Huracan
> Sent: 03 December 2015 09:02
> To: Srinivasula Maram 
> Cc: Nick Fisk ; ceph-us...@ceph.com
> Subject: Re: [ceph-users] Ceph Sizing
> 
> I'm following this presentation of Mirantis team:
> http://www.slideshare.net/mirantis/ceph-talk-vancouver-20
> 
> They calculate CEPH IOPS = Disk IOPS * HDD Quantity * 0.88 (4-8k random
> read proportion)
> 
> And  VM IOPS = CEPH IOPS / VM Quantity
> 
> But if I use replication of 3, Would VM IOPS be divided by 3?
> 
> 2015-12-03 7:09 GMT+07:00 Sam Huracan :
> IO size is 4 KB, and I need a Minimum sizing, cost optimized
> I intend use SuperMicro Devices
> http://www.supermicro.com/solutions/storage_Ceph.cfm
> 
> What do you think?
> 
> 2015-12-02 23:17 GMT+07:00 Srinivasula Maram
> :
> One more factor we need to consider here is IO size(block size) to get
> required IOPS, based on this we can calculate the bandwidth and design the
> solution.
> 
> Thanks
> Srinivas
> 
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Nick Fisk
> Sent: Wednesday, December 02, 2015 9:28 PM
> To: 'Sam Huracan'; ceph-us...@ceph.com
> Subject: Re: [ceph-users] Ceph Sizing
> 
> You've left out an important factorcost. Otherwise I would just say buy
> enough SSD to cover the capacity.
> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Sam Huracan
> > Sent: 02 December 2015 15:46
> > To: ceph-us...@ceph.com
> > Subject: [ceph-users] Ceph Sizing
> >
> > Hi,
> > I'm building a storage structure for OpenStack cloud System, input:
> > - 700 VM
> > - 150 IOPS per VM
> > - 20 Storage per VM (boot volume)
> > - Some VM run database (SQL or MySQL)
> >
> > I want to ask a sizing plan for Ceph to satisfy the IOPS requirement,
> > I list some factors considered:
> > - Amount of OSD (SAS Disk)
> > - Amount of Journal (SSD)
> > - Amount of OSD Servers
> > - Amount of MON Server
> > - Network
> > - Replica ( default is 3)
> >
> > I will divide to 3 pool with 3 Disk types: SSD, SAS 15k and SAS 10k
> > Should I use all 3 disk types in one server or build dedicated servers
> > for every pool? Example: 3 15k servers for Pool-1, 3 10k Servers for Pool-2.
> >
> > Could you help me a formula to calculate the minimum devices needed
> > for above input.
> >
> > Thanks and regards.
> 
> 
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 






___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New cluster performance analysis

2015-12-03 Thread Nick Fisk
Couple of things to check

1.  Can you create just a normal non-cached pool and test performance, to 
rule out any funnies going on there.
2.  Can you also run something like iostat during the benchmarks and see if 
it looks like all your disks are getting saturated. (A rough sketch of both 
follows below.)
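
A rough sketch of both checks (pool name, PG count and thread count are 
assumptions, not recommendations):

# 1. plain 3x replicated pool, no cache tier, benchmarked with rados bench
ceph osd pool create testbench 1024 1024 replicated
rados bench -p testbench 60 write -t 32 --no-cleanup
rados bench -p testbench 60 rand -t 32

# 2. in parallel, on each OSD host, watch per-disk utilisation
iostat -xm 2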


_
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On 
Behalf Of Adrien Gillard
Sent: 02 December 2015 21:33
To: ceph-us...@ceph.com
Subject: [ceph-users] New cluster performance analysis


Hi everyone, 
 
I am currently testing our new cluster and I would like some 
feedback on the numbers I am getting.
 
For the hardware : 
7 x OSD : 2 x Intel 2640v3 (8x2.6GHz), 64GB RAM, 2x10Gbits LACP 
for public net., 2x10Gbits LACP for cluster net., MTU 9000
1 x MON : 2 x Intel 2630L (6x2GHz), 32GB RAM and Intel DC SSD, 
2x10Gbits LACP for public net., MTU 9000
2 x MON : VMs (8 cores, 8GB RAM), backed by SSD
 
Journals are 20GB partitions on SSD
 
The system is CentOS 7.1 with stock kernel 
(3.10.0-229.20.1.el7.x86_64). No particular system optimizations.
 
Ceph is Infernalis from Ceph repository  : ceph version 9.2.0 
(bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
 
[cephadm@cph-adm-01  ~/scripts]$ ceph -s
cluster 259f65a3-d6c8-4c90-a9c2-71d4c3c55cce
 health HEALTH_OK
 monmap e1: 3 mons at 
{clb-cph-frpar1-mon-02=x.x.x.2:6789/0,clb-cph-frpar2-mon-01=x.x.x.1:6789/0,clb-cph-frpar2-mon-03=x.x.x.3:6789/0}
election epoch 62, quorum 0,1,2 
clb-cph-frpar2-mon-01,clb-cph-frpar1-mon-02,clb-cph-frpar2-mon-03
 osdmap e844: 84 osds: 84 up, 84 in
flags sortbitwise
  pgmap v111655: 3136 pgs, 3 pools, 3166 GB data, 19220 
kobjects
8308 GB used, 297 TB / 305 TB avail
3136 active+clean
 
My ceph.conf :
 
[global]
fsid = 259f65a3-d6c8-4c90-a9c2-71d4c3c55cce
mon_initial_members = clb-cph-frpar2-mon-01, 
clb-cph-frpar1-mon-02, clb-cph-frpar2-mon-03
mon_host = x.x.x.1,x.x.x.2,x.x.x.3
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
public network = 10.25.25.0/24  
cluster network = 10.25.26.0/24  
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcatcher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0
 
[osd]
osd journal size = 0
osd mount options xfs = 
"rw,noatime,inode64,logbufs=8,logbsize=256k"
filestore min sync interval = 5
filestore max sync interval = 15
filestore queue max ops = 2048
filestore queue max bytes = 1048576000
filestore queue committing max ops = 4096
filestore queue committing max bytes = 1048576000
filestore op thread = 32
filestore journal writeahead = true
filestore merge threshold = 40
filestore split multiple = 8
 
journal max write bytes = 1048576000
journal max write entries = 4096
journal queue max ops = 8092
journal queue max bytes = 1048576000
 
osd max write size = 512
osd op threads = 16
osd disk threads = 2
osd op num 

Re: [ceph-users] How long will the logs be kept?

2015-12-03 Thread Jan Schermer
You can set up logrotate however you want; I am not sure what the default is 
for your distro.
Usually logrotate doesn't touch files that are smaller than some size even if 
they are old. It will also not delete logs for OSDs that no longer exist. 

Ceph itself has nothing to do with log rotation, logrotate does the work. Ceph 
packages likely contain default logrotate rules for the logs but you can edit 
them to your liking.
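
A hedged way to check what the packaged rules will actually do, and to force a 
rotation for testing (the path assumes the usual /etc/logrotate.d location):

# dry-run: show what logrotate would do with the ceph rules, without doing it
logrotate -d /etc/logrotate.d/ceph
# force an immediate rotation to verify compression and retention behave as expected
logrotate -f /etc/logrotate.d/ceph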

Jan

> On 03 Dec 2015, at 09:38, Wukongming  wrote:
> 
> Yes, I can find ceph of rotate configure file in the directory of 
> /etc/logrotate.d. 
> Also, I find sth. Weird.
> 
> drwxr-xr-x  2 root root   4.0K Dec  3 14:54 ./
> drwxrwxr-x 19 root syslog 4.0K Dec  3 13:33 ../
> -rw-------  1 root root      0 Dec  2 06:25 ceph.audit.log
> -rw-------  1 root root    85K Nov 25 09:17 ceph.audit.log.1.gz
> -rw-------  1 root root   228K Dec  3 16:00 ceph.log
> -rw-------  1 root root    28K Dec  3 06:23 ceph.log.1.gz
> -rw-------  1 root root   374K Dec  2 06:22 ceph.log.2.gz
> -rw-r--r--  1 root root   4.3M Dec  3 16:01 ceph-mon.wkm01.log
> -rw-r--r--  1 root root   561K Dec  3 06:25 ceph-mon.wkm01.log.1.gz
> -rw-r--r--  1 root root   2.2M Dec  2 06:25 ceph-mon.wkm01.log.2.gz
> -rw-r--r--  1 root root      0 Dec  2 06:25 ceph-osd.0.log
> -rw-r--r--  1 root root    992 Dec  1 09:09 ceph-osd.0.log.1.gz
> -rw-r--r--  1 root root    19K Dec  3 10:51 ceph-osd.2.log
> -rw-r--r--  1 root root   2.3K Dec  2 10:50 ceph-osd.2.log.1.gz
> -rw-r--r--  1 root root    27K Dec  1 10:31 ceph-osd.2.log.2.gz
> -rw-r--r--  1 root root    13K Dec  3 10:23 ceph-osd.5.log
> -rw-r--r--  1 root root   1.6K Dec  2 09:57 ceph-osd.5.log.1.gz
> -rw-r--r--  1 root root    22K Dec  1 09:51 ceph-osd.5.log.2.gz
> -rw-r--r--  1 root root    19K Dec  3 10:51 ceph-osd.8.log
> -rw-r--r--  1 root root    18K Dec  2 10:50 ceph-osd.8.log.1
> -rw-r--r--  1 root root   261K Dec  1 13:54 ceph-osd.8.log.2
> 
> I deployed ceph cluster on Nov 21, from that day to Dec.1, I mean the 
> continue 10 days' logs were compressed into one file, it is not what I want.
> Does any OP affect log compressing?
> 
> Thanks!
>Kongming Wu
> -
> wukongming ID: 12019
> Tel:0571-86760239
> Dept:2014 UIS2 ONEStor
> 
> -----Original Message-----
> From: huang jun [mailto:hjwsm1...@gmail.com]
> Sent: 3 December 2015 13:19
> To: wukongming 12019 (RD)
> Cc: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
> Subject: Re: How long will the logs be kept?
> 
> it will rotate every week by default; you can see the logrotate file 
> /etc/logrotate.d/ceph
> 
> 2015-12-03 12:37 GMT+08:00 Wukongming :
>> Hi ,All
>>Is there anyone who knows How long or how many days will the logs.gz 
>> (mon/osd/mds)be kept, maybe before flushed?
>> 
>> -
>> wukongming ID: 12019
>> Tel:0571-86760239
>> Dept:2014 UIS2 OneStor
>> 
>> -----
>> This e-mail and its attachments contain confidential information from 
>> H3C, which is intended only for the person or entity whose address is 
>> listed above. Any use of the information contained herein in any way 
>> (including, but not limited to, total or partial disclosure, 
>> reproduction, or dissemination) by persons other than the intended
>> recipient(s) is prohibited. If you receive this e-mail in error, 
>> please notify the sender by phone or email immediately and delete it!
> 
> 
> 
> --
> thanks
> huangjun
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Sizing

2015-12-03 Thread Sam Huracan
I'm following this presentation of Mirantis team:
http://www.slideshare.net/mirantis/ceph-talk-vancouver-20

They calculate CEPH IOPS = Disk IOPS * HDD Quantity * 0.88 (4-8k random
read proportion)

And  VM IOPS = CEPH IOPS / VM Quantity

But if I use replication of 3, *Would VM IOPS be divided by 3? *

2015-12-03 7:09 GMT+07:00 Sam Huracan :

> IO size is 4 KB, and I need a Minimum sizing, cost optimized
> I intend use SuperMicro Devices
> http://www.supermicro.com/solutions/storage_Ceph.cfm
>
> What do you think?
>
> 2015-12-02 23:17 GMT+07:00 Srinivasula Maram <
> srinivasula.ma...@sandisk.com>:
>
>> One more factor we need to consider here is IO size(block size) to get
>> required IOPS, based on this we can calculate the bandwidth and design the
>> solution.
>>
>> Thanks
>> Srinivas
>>
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Nick Fisk
>> Sent: Wednesday, December 02, 2015 9:28 PM
>> To: 'Sam Huracan'; ceph-us...@ceph.com
>> Subject: Re: [ceph-users] Ceph Sizing
>>
>> You've left out an important factorcost. Otherwise I would just say
>> buy enough SSD to cover the capacity.
>>
>> > -Original Message-
>> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>> > Of Sam Huracan
>> > Sent: 02 December 2015 15:46
>> > To: ceph-us...@ceph.com
>> > Subject: [ceph-users] Ceph Sizing
>> >
>> > Hi,
>> > I'm building a storage structure for OpenStack cloud System, input:
>> > - 700 VM
>> > - 150 IOPS per VM
>> > - 20 Storage per VM (boot volume)
>> > - Some VM run database (SQL or MySQL)
>> >
>> > I want to ask a sizing plan for Ceph to satisfy the IOPS requirement,
>> > I list some factors considered:
>> > - Amount of OSD (SAS Disk)
>> > - Amount of Journal (SSD)
>> > - Amount of OSD Servers
>> > - Amount of MON Server
>> > - Network
>> > - Replica ( default is 3)
>> >
>> > I will divide to 3 pool with 3 Disk types: SSD, SAS 15k and SAS 10k
>> > Should I use all 3 disk types in one server or build dedicated servers
>> > for every pool? Example: 3 15k servers for Pool-1, 3 10k Servers for
>> Pool-2.
>> >
>> > Could you help me a formula to calculate the minimum devices needed
>> > for above input.
>> >
>> > Thanks and regards.
>>
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-disk list crashes in infernalis

2015-12-03 Thread Loic Dachary
Hi Felix,

This is a bug, I file an issue for you at http://tracker.ceph.com/issues/13970

Cheers

On 03/12/2015 10:56, Stolte, Felix wrote:
> Hi all,
> 
>  
> 
> i upgraded from hammer to infernalis today and even so I had a hard time 
> doing so I finally got my cluster running in a healthy state (mainly my 
> fault, because I did not read the release notes carefully).
> 
>  
> 
> But when I try to list my disks with „ceph-disk list“ I get the following 
> Traceback:
> 
>  
> 
>  ceph-disk list
> Traceback (most recent call last):
>   File "/usr/sbin/ceph-disk", line 3576, in <module>
>     main(sys.argv[1:])
>   File "/usr/sbin/ceph-disk", line 3532, in main
>     main_catch(args.func, args)
>   File "/usr/sbin/ceph-disk", line 3554, in main_catch
>     func(args)
>   File "/usr/sbin/ceph-disk", line 2915, in main_list
>     devices = list_devices(args)
>   File "/usr/sbin/ceph-disk", line 2855, in list_devices
>     partmap = list_all_partitions(args.path)
>   File "/usr/sbin/ceph-disk", line 545, in list_all_partitions
>     dev_part_list[name] = list_partitions(os.path.join('/dev', name))
>   File "/usr/sbin/ceph-disk", line 550, in list_partitions
>     if is_mpath(dev):
>   File "/usr/sbin/ceph-disk", line 433, in is_mpath
>     uuid = get_dm_uuid(dev)
>   File "/usr/sbin/ceph-disk", line 421, in get_dm_uuid
>     uuid_path = os.path.join(block_path(dev), 'dm', 'uuid')
>   File "/usr/sbin/ceph-disk", line 416, in block_path
>     rdev = os.stat(path).st_rdev
> OSError: [Errno 2] No such file or directory: '/dev/cciss!c0d0'
> 
>  
> 
>  
> 
> I’m running ceph 9.2 on Ubuntu 14.04.3 LTS on HP Hardware with HP P400 
> Raidcontroller. 4 Node Cluster (3 of them are Mons), 5-6 OSDs per Node with 
> journals on separate drive.
> 
>  
> 
> Does anyone know how to solve this or did I hit a bug?
> 
>  
> 
> Regards Felix
> 
>  
> 
> Forschungszentrum Juelich GmbH
> 
> 52425 Juelich
> 
> Sitz der Gesellschaft: Juelich
> 
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> 
> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
> 
> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> 
> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> 
> Prof. Dr. Sebastian M. Schmidt
> 
>  
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Confused about priority of client OP.

2015-12-03 Thread huang jun
In SimpleMessenger, client OPs like OSD_OP are dispatched by
ms_fast_dispatch and are not queued in the PrioritizedQueue in the Messenger.
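
A hedged way to see the values in effect on a running OSD (assuming an osd.0 
with a local admin socket):

# dump the op priority / op queue settings the daemon is actually using
ceph daemon osd.0 config get osd_client_op_priority
ceph daemon osd.0 config show | grep -E 'op_priority|op_queue'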

2015-12-03 22:14 GMT+08:00 Wukongming :
> Hi, All:
> I 've got a question about a priority. We defined 
> osd_client_op_priority = 63. CEPH_MSG_PRIO_LOW = 64.
> We are clear there are multiple IO to be discussed. Why not define 
> osd_client_op_priority > 64, so we can just deal with client IO in first 
> priority.
>
>
> -
> wukongming ID: 12019
> Tel:0571-86760239
> Dept:2014 UIS2 ONEStor
>
>
> -
> This e-mail and its attachments contain confidential information from H3C, 
> which is
> intended only for the person or entity whose address is listed above. Any use 
> of the
> information contained herein in any way (including, but not limited to, total 
> or partial
> disclosure, reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error, please 
> notify the sender
> by phone or email immediately and delete it!



-- 
thanks
huangjun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph infernal-can not find the dependency package selinux-policy-base-3.13.1-23.el7_1.18.noarch.rpm

2015-12-03 Thread Alfredo Deza
What distribution and version/release are you trying to install it on?
On a CentOS 7 box I see it is available:

$ sudo yum provides selinux-policy-base
...
selinux-policy-minimum-3.13.1-23.el7.noarch : SELinux minimum base policy
Repo: base
Matched from:
Provides: selinux-policy-base = 3.13.1-23.el7
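
A hedged follow-up on the affected box (assuming CentOS 7 with the stock base 
repo enabled; selinux-policy-targeted and selinux-policy-minimum both provide 
selinux-policy-base):

# refresh the metadata and confirm a provider is visible
sudo yum clean expire-cache && sudo yum makecache
sudo yum provides 'selinux-policy-base*'
# install a concrete provider of selinux-policy-base
sudo yum install selinux-policy-targeted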



On Thu, Dec 3, 2015 at 6:10 AM, Xiangyu (Raijin, BP Dept)
 wrote:
> When install the ceph infernal(v9.2.0) ,it require the package
> selinux-policy-base-3.13.1-23.el7_1.18.noarch.rpm, I tried search it by
> google , but got nothing, if anyone know how to get it?
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-disk activate Permission denied problems

2015-12-03 Thread Adrien Gillard
This is the clean way to handle this. But you can also use udev to do this
at boot. From what I found on the mailing list and made working before
using GUID :

cat > /etc/udev/rules.d/89-ceph-journal.rules << EOF
KERNEL=="sda?" SUBSYSTEM=="block" OWNER="ceph" GROUP="disk" MODE="0660"
KERNEL=="sdb?" SUBSYSTEM=="block" OWNER="ceph" GROUP="disk" MODE="0660"
KERNEL=="sdc?" SUBSYSTEM=="block" OWNER="ceph" GROUP="disk" MODE="0660"
EOF

On Thu, Dec 3, 2015 at 2:28 PM, Florent B  wrote:

> Hi,
>
> Is setting GUID is the only way to fix this ? I don't use GPT but MBR and
> I don't want to try conversion on production servers...
>
> On 12/03/2015 08:39 AM, Adrien Gillard wrote:
>
> You should check that the owner of your ceph partitions (both journal and
> data) is 'ceph', otherwise the ceph user won't mount it.
>
> You can simply do : chown ceph:disk /dev/sdc3
>
> If this solve your issue you should set the GPT GUID [1] of the partitions
> with a tool like sgdisk to make this persistent across reboot.
>
> I think only your journal is affected as ceph-disk does not prepare the
> partition (WARNING:ceph-disk:Journal /dev/sdc3 was not prepared with
> ceph-disk. Symlinking directly)
>
>
> [1]
> https://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_type_GUIDs
>
>
>


-- 
-
Adrien GILLARD

+33 (0)6 29 06 16 31
gillard.adr...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New cluster performance analysis

2015-12-03 Thread Adrien Gillard
I did some more tests :

fio on a raw RBD volume (4K, numjob=32, QD=1) gives me around 3000 IOPS

I also tuned xfs mount options on client (I realized I didn't do that
already) and with
"largeio,inode64,swalloc,logbufs=8,logbsize=256k,attr2,auto,nodev,noatime,nodiratime"
I get better performance :

4k-32-1-randwrite-libaio: (groupid=0, jobs=32): err= 0: pid=26793: Thu Dec
 3 10:45:55 2015
  write: io=1685.3MB, bw=5720.1KB/s, iops=1430, runt=301652msec
slat (usec): min=5, max=1620, avg=41.61, stdev=25.82
clat (msec): min=1, max=4141, avg=14.61, stdev=112.55
 lat (msec): min=1, max=4141, avg=14.65, stdev=112.55
clat percentiles (msec):
 |  1.00th=[3],  5.00th=[4], 10.00th=[4], 20.00th=[4],
 | 30.00th=[4], 40.00th=[5], 50.00th=[5], 60.00th=[5],
 | 70.00th=[5], 80.00th=[6], 90.00th=[7], 95.00th=[7],
 | 99.00th=[  227], 99.50th=[  717], 99.90th=[ 1844], 99.95th=[ 2245],
 | 99.99th=[ 3097]

So, more than 50% improvement, but it actually varies quite a lot between
tests (sometimes I get a bit more than 1000). If I run the test for 30
minutes it drops to 900 IOPS.
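
For reference, a hedged reconstruction of the kind of fio job described above 
(the target file, size and runtime are assumptions):

# 4k random write, 32 jobs, queue depth 1, libaio, direct IO
fio --name=4k-32-1-randwrite-libaio --ioengine=libaio --direct=1 \
    --rw=randwrite --bs=4k --numjobs=32 --iodepth=1 \
    --runtime=300 --time_based --group_reporting \
    --filename=/mnt/rbd/fio.test --size=10G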

As you suggested I also filled a volume with zeros (dd if=/dev/zero
of=/dev/rbd1 bs=1M) and then ran fio on the raw device, I didn't see a lot
of improvement.

If I run fio test directly on block devices I seem to saturate the
spinners, [1] is a graph of IO load on one of the OSD host.
[2] is the same OSD graph but when the test is done on a device mounted and
formatted with XFS on the client.
If I get half of the IOPS on the XFS volume because of the journal,
shouldn't I get the same amount of IOPS on the backend ?
[3] shows what happen if I run the test for 30 minutes.

During the fio tests on the raw device, load average on the OSD servers
increases up to 13/14 and I get a bit of iowait (I guess because the OSDs
are busy).
During the fio tests on the XFS-mounted device, load average on the OSD servers
peaks at the beginning and decreases to 5/6, but goes through the roof on
the client.
Scheduler is deadline for all the drives, I didn't try to change it yet.

What I don't understand, even with your explanations, are the rados
results. From what I understand it performs at the RADOS level and thus
should not be impacted by client filesystem.
Given the results above I guess you are right and this has to do with the
client filesystem.

The cluster will be used for backups, write IO size during backups is
around 150/200K (I guess mostly sequential) and I am looking for the
highest bandwith and parallelization.

@Nick, I will try to create a new stand alone replicated pool.


[1] http://postimg.org/image/qvtvdq1n1/
[2] http://postimg.org/image/nhf6lzwgl/
[3] http://postimg.org/image/h7l0obw7h/

On Thu, Dec 3, 2015 at 1:30 PM, Nick Fisk  wrote:

> Couple of things to check
>
> 1.  Can you create just a normal non cached pool and test performance
> to rule out any funnies going on there.
>
> 2.  Can you also run something like iostat during the benchmarks and
> see if it looks like all your disks are getting saturated.
>
>
>
>_
>   *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com
>   ]* On Behalf Of* Adrien Gillard
>   *Sent:* 02 December 2015 21:33
>   *To:* ceph-us...@ceph.com
>   *Subject:* [ceph-users] New cluster performance analysis
>
>   Hi everyone,
>
>
>
>   I am currently testing our new cluster and I would like some
>   feedback on the numbers I am getting.
>
>
>
>   For the hardware :
>
>   7 x OSD : 2 x Intel 2640v3 (8x2.6GHz), 64GB RAM, 2x10Gbits LACP for
>   public net., 2x10Gbits LACP for cluster net., MTU 9000
>
>   1 x MON : 2 x Intel 2630L (6x2GHz), 32GB RAM and Intel DC SSD,
>   2x10Gbits LACP for public net., MTU 9000
>
>   2 x MON : VMs (8 cores, 8GB RAM), backed by SSD
>
>
>
>   Journals are 20GB partitions on SSD
>
>
>
>   The system is CentOS 7.1 with stock kernel
>   (3.10.0-229.20.1.el7.x86_64). No particular system optimizations.
>
>
>
>   Ceph is Infernalis from Ceph repository  : ceph version 9.2.0
>   (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
>
>
>
>   [cephadm@cph-adm-01  ~/scripts]$ ceph -s
>
>   cluster 259f65a3-d6c8-4c90-a9c2-71d4c3c55cce
>
>health HEALTH_OK
>
>monmap e1: 3 mons at
>   
> {clb-cph-frpar1-mon-02=x.x.x.2:6789/0,clb-cph-frpar2-mon-01=x.x.x.1:6789/0,clb-cph-frpar2-mon-03=x.x.x.3:6789/0}
>
>   election epoch 62, quorum 0,1,2
>   clb-cph-frpar2-mon-01,clb-cph-frpar1-mon-02,clb-cph-frpar2-mon-03
>
>osdmap e844: 84 osds: 84 up, 84 in
>
>   flags sortbitwise
>
> pgmap v111655: 3136 pgs, 3 pools, 3166 GB data, 19220 kobjects
>
>   8308 GB used, 297 TB / 305 TB avail
>
>   3136 active+clean
>
>
>
>   My ceph.conf :
>

[ceph-users] Confused about priority of client OP.

2015-12-03 Thread Wukongming
Hi, All:
I have a question about priorities. We define 
osd_client_op_priority = 63, while CEPH_MSG_PRIO_LOW = 64.
Since there are multiple kinds of IO competing, why not define 
osd_client_op_priority > 64, so that client IO is always handled at the 
highest priority?


-
wukongming ID: 12019
Tel:0571-86760239
Dept:2014 UIS2 ONEStor


-
This e-mail and its attachments contain confidential information from H3C, 
which is
intended only for the person or entity whose address is listed above. Any use 
of the
information contained herein in any way (including, but not limited to, total 
or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify 
the sender
by phone or email immediately and delete it!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v9.2.0 Infernalis released

2015-12-03 Thread François Lafont
Hi,

On 03/12/2015 12:12, Florent B wrote:
> It seems that if some OSD are using journal devices, ceph user needs to
> be a member of "disk" group on Debian. Can someone confirm this ?

Yes, I confirm... if you are talking about the journal partitions of OSDs.

Another solution: via a udev rule, set "ceph" as owner of these partitions. In 
my case, all journal partitions have this kind of partname (GPT partition): 
"osd-$id-journal".
So I have defined this udev rule:

# cat /etc/udev/rules.d/90-ceph.rules 
# Udev rule to have "ceph" owner of each "/dev/disk/by-partlabel/osd-*-journal"
# partition from GPT disks. Totally inspired from this file
# /lib/udev/rules.d/60-persistent-storage.rules
ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_NAME}=="osd-?*-journal", 
OWNER="ceph"

For me, it works.
HTH.

François Lafont
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-osd@.service does not mount OSD data disk

2015-12-03 Thread Timofey Titovets
Some users have already asked on the list about this problem on Debian.
You can fix it with:
ln -sv
or with:
systemctl edit --full ceph-disk@.service

Just choose whichever way you prefer.

2015-12-03 23:00 GMT+03:00 Florent B :
> Ok and /bin/flock is supposed to exist on all systems ? Don't have it on
> Debian... flock is at /usr/bin/flock
>
> My problem is that "ceph" service is doing everything, and all others
> systemd services does not run...
>
> it seems there is a problem switching from old init.d services to new
> systemd..
>
> On 12/03/2015 08:31 PM, Timofey Titovets wrote:
>> Lol, it's opensource guys
>> https://github.com/ceph/ceph/tree/master/systemd
>> ceph-disk@
>>
>> 2015-12-03 21:59 GMT+03:00 Florent B :
>>> "ceph" service does mount :
>>>
>>> systemctl status ceph -l
>>> ● ceph.service - LSB: Start Ceph distributed file system daemons at boot
>>> time
>>>Loaded: loaded (/etc/init.d/ceph)
>>>Active: active (exited) since Thu 2015-12-03 17:48:52 CET; 2h 9min ago
>>>   Process: 931 ExecStart=/etc/init.d/ceph start (code=exited,
>>> status=0/SUCCESS)
>>>
>>> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1218.service.
>>> Dec 03 17:48:47 test3 ceph[931]: Starting ceph-create-keys on test3...
>>> Dec 03 17:48:47 test3 ceph[931]: === mds.1 ===
>>> Dec 03 17:48:47 test3 ceph[931]: Starting Ceph mds.1 on test3...
>>> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1318.service.
>>> Dec 03 17:48:47 test3 ceph[931]: === osd.2 ===
>>> Dec 03 17:48:47 test3 ceph[931]: Mounting xfs on
>>> test3:/var/lib/ceph/osd/ceph-2
>>> Dec 03 17:48:52 test3 ceph[931]: create-or-move updated item name
>>> 'osd.2' weight 0.8447 at location {host=test3,root=default} to crush map
>>> Dec 03 17:48:52 test3 ceph[931]: Starting Ceph osd.2 on test3...
>>> Dec 03 17:48:52 test3 ceph[931]: Running as unit run-1580.service.
>>>
>>>
>>> I don't see any udev rule related to Ceph on my servers...
>>>
>>>
>>> On 12/03/2015 07:56 PM, Adrien Gillard wrote:
 I think OSD are automatically mouted at boot via udev rules and that
 the ceph service does not handle the mounting part.

>>
>>
>



-- 
Have a nice day,
Timofey.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-osd@.service does not mount OSD data disk

2015-12-03 Thread Jan Schermer
echo add >/sys/block/sdX/sdXY/uevent

The easiest way to make it mount automagically
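
A hedged concrete form of the same trick (assuming the OSD data partition is 
/dev/sdb1), plus the udevadm equivalent:

# replay the "add" event for the partition so the ceph udev rules mount it
echo add > /sys/block/sdb/sdb1/uevent
# or, equivalently, through udevadm
udevadm trigger --action=add --sysname-match=sdb1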

Jan

> On 03 Dec 2015, at 20:31, Timofey Titovets  wrote:
> 
> Lol, it's opensource guys
> https://github.com/ceph/ceph/tree/master/systemd
> ceph-disk@
> 
> 2015-12-03 21:59 GMT+03:00 Florent B :
>> "ceph" service does mount :
>> 
>> systemctl status ceph -l
>> ● ceph.service - LSB: Start Ceph distributed file system daemons at boot
>> time
>>   Loaded: loaded (/etc/init.d/ceph)
>>   Active: active (exited) since Thu 2015-12-03 17:48:52 CET; 2h 9min ago
>>  Process: 931 ExecStart=/etc/init.d/ceph start (code=exited,
>> status=0/SUCCESS)
>> 
>> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1218.service.
>> Dec 03 17:48:47 test3 ceph[931]: Starting ceph-create-keys on test3...
>> Dec 03 17:48:47 test3 ceph[931]: === mds.1 ===
>> Dec 03 17:48:47 test3 ceph[931]: Starting Ceph mds.1 on test3...
>> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1318.service.
>> Dec 03 17:48:47 test3 ceph[931]: === osd.2 ===
>> Dec 03 17:48:47 test3 ceph[931]: Mounting xfs on
>> test3:/var/lib/ceph/osd/ceph-2
>> Dec 03 17:48:52 test3 ceph[931]: create-or-move updated item name
>> 'osd.2' weight 0.8447 at location {host=test3,root=default} to crush map
>> Dec 03 17:48:52 test3 ceph[931]: Starting Ceph osd.2 on test3...
>> Dec 03 17:48:52 test3 ceph[931]: Running as unit run-1580.service.
>> 
>> 
>> I don't see any udev rule related to Ceph on my servers...
>> 
>> 
>> On 12/03/2015 07:56 PM, Adrien Gillard wrote:
>>> I think OSD are automatically mouted at boot via udev rules and that
>>> the ceph service does not handle the mounting part.
>>> 
>> 
> 
> 
> 
> -- 
> Have a nice day,
> Timofey.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-osd@.service does not mount OSD data disk

2015-12-03 Thread Timofey Titovets
Lol, it's opensource guys
https://github.com/ceph/ceph/tree/master/systemd
ceph-disk@

2015-12-03 21:59 GMT+03:00 Florent B :
> "ceph" service does mount :
>
> systemctl status ceph -l
> ● ceph.service - LSB: Start Ceph distributed file system daemons at boot
> time
>Loaded: loaded (/etc/init.d/ceph)
>Active: active (exited) since Thu 2015-12-03 17:48:52 CET; 2h 9min ago
>   Process: 931 ExecStart=/etc/init.d/ceph start (code=exited,
> status=0/SUCCESS)
>
> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1218.service.
> Dec 03 17:48:47 test3 ceph[931]: Starting ceph-create-keys on test3...
> Dec 03 17:48:47 test3 ceph[931]: === mds.1 ===
> Dec 03 17:48:47 test3 ceph[931]: Starting Ceph mds.1 on test3...
> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1318.service.
> Dec 03 17:48:47 test3 ceph[931]: === osd.2 ===
> Dec 03 17:48:47 test3 ceph[931]: Mounting xfs on
> test3:/var/lib/ceph/osd/ceph-2
> Dec 03 17:48:52 test3 ceph[931]: create-or-move updated item name
> 'osd.2' weight 0.8447 at location {host=test3,root=default} to crush map
> Dec 03 17:48:52 test3 ceph[931]: Starting Ceph osd.2 on test3...
> Dec 03 17:48:52 test3 ceph[931]: Running as unit run-1580.service.
>
>
> I don't see any udev rule related to Ceph on my servers...
>
>
> On 12/03/2015 07:56 PM, Adrien Gillard wrote:
>> I think OSD are automatically mouted at boot via udev rules and that
>> the ceph service does not handle the mounting part.
>>
>



-- 
Have a nice day,
Timofey.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Sizing

2015-12-03 Thread Warren Wang - ISD
I would be a lot more conservative in terms of what a spinning drive can
do. The Mirantis presentation has pretty high expectations out of a
spinning drive, as they're somewhat ignoring latency (til the last few
slides). Look at the max latencies for anything above 1 QD on a spinning
drive.

If you factor in a latency requirement, the capability of the drives falls
dramatically. You might be able to offset this by using NVMe or something
as a cache layer between the journal and the OSD, using bcache, LVM cache,
etc. In much of the performance testing that we've done, the average isn't
too bad, but 90th percentile numbers tend to be quite bad. Part of it is
probably from locking PGs during a flush, and the other part is just the
nature of spinning drives.

I'd try to get a handle on expected workloads before picking the gear, but
if you have to pick before that, SSD if you have the budget :) You can
offset it a little by using erasure coding for the RGW portion, or using
spinning drives for that.

I think picking gear for Ceph is tougher than running an actual cluster :)
Best of luck. I think you're still starting with better, and more info
than some of us did years ago.

Warren Wang




From:  Sam Huracan 
Date:  Thursday, December 3, 2015 at 4:01 AM
To:  Srinivasula Maram 
Cc:  Nick Fisk , "ceph-us...@ceph.com"

Subject:  Re: [ceph-users] Ceph Sizing


I'm following this presentation of Mirantis team:
http://www.slideshare.net/mirantis/ceph-talk-vancouver-20

They calculate CEPH IOPS = Disk IOPS * HDD Quantity * 0.88 (4-8k random
read proportion)


And  VM IOPS = CEPH IOPS / VM Quantity

But if I use replication of 3, Would VM IOPS be divided by 3?


2015-12-03 7:09 GMT+07:00 Sam Huracan :

IO size is 4 KB, and I need a Minimum sizing, cost optimized
I intend use SuperMicro Devices
http://www.supermicro.com/solutions/storage_Ceph.cfm


What do you think?


2015-12-02 23:17 GMT+07:00 Srinivasula Maram
:

One more factor we need to consider here is IO size(block size) to get
required IOPS, based on this we can calculate the bandwidth and design the
solution.

Thanks
Srinivas

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Nick Fisk
Sent: Wednesday, December 02, 2015 9:28 PM
To: 'Sam Huracan'; ceph-us...@ceph.com
Subject: Re: [ceph-users] Ceph Sizing

You've left out an important factorcost. Otherwise I would just say
buy enough SSD to cover the capacity.

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Sam Huracan
> Sent: 02 December 2015 15:46
> To: ceph-us...@ceph.com
> Subject: [ceph-users] Ceph Sizing
>
> Hi,
> I'm building a storage structure for OpenStack cloud System, input:
> - 700 VM
> - 150 IOPS per VM
> - 20 Storage per VM (boot volume)
> - Some VM run database (SQL or MySQL)
>
> I want to ask a sizing plan for Ceph to satisfy the IOPS requirement,
> I list some factors considered:
> - Amount of OSD (SAS Disk)
> - Amount of Journal (SSD)
> - Amount of OSD Servers
> - Amount of MON Server
> - Network
> - Replica ( default is 3)
>
> I will divide to 3 pool with 3 Disk types: SSD, SAS 15k and SAS 10k
> Should I use all 3 disk types in one server or build dedicated servers
> for every pool? Example: 3 15k servers for Pool-1, 3 10k Servers for
>Pool-2.
>
> Could you help me a formula to calculate the minimum devices needed
> for above input.
>
> Thanks and regards.








___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com












This email and any files transmitted with it are confidential and intended 
solely for the individual or entity to whom they are addressed. If you have 
received this email in error destroy it immediately. *** Walmart Confidential 
***
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-osd@.service does not mount OSD data disk

2015-12-03 Thread Adrien Gillard
I think OSDs are automatically mounted at boot via udev rules and that the
ceph service does not handle the mounting part.

On Thu, Dec 3, 2015 at 7:40 PM, Florent B  wrote:

> Hi,
>
> On 12/03/2015 07:36 PM, Timofey Titovets wrote:
>
>
> On 3 Dec 2015 8:56 p.m., "Florent B" < 
> flor...@coppint.com> wrote:
> >
> > By the way, when system boots, "ceph" service is starting everything
> > fine. So "ceph-osd@" service is disabled => how to restart an OSD ?!
> >
> AFAIK, ceph now have 2 services:
> 1. Mount device
> 2. Start OSD
>
>
> I don't understand, what's the name of the service to mount device ? It
> should be logic to call it before ceph-osd service, no ?
>
> Also, service can be disabled, but this not mean, what it can't work
>
> If ceph-osd@ already started, you can easy see status of service, and can
> restart it, if it's needed ___
>
>
> On a server just rebooted, all Ceph services are managed by "ceph"
> service, so "ceph-osd@" services are unusable :
>
> ● ceph-osd@2.service - Ceph object storage daemon
>Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled)
>Active: inactive (dead)
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
-
Adrien GILLARD

+33 (0)6 29 06 16 31
gillard.adr...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-osd@.service does not mount OSD data disk

2015-12-03 Thread Loic Dachary
Hi,

On 03/12/2015 21:00, Florent B wrote:
> Ok and /bin/flock is supposed to exist on all systems ? Don't have it on
> Debian... flock is at /usr/bin/flock

I filed a bug for this : http://tracker.ceph.com/issues/13975

Cheers

> 
> My problem is that "ceph" service is doing everything, and all others
> systemd services does not run...
> 
> it seems there is a problem switching from old init.d services to new
> systemd..
> 
> On 12/03/2015 08:31 PM, Timofey Titovets wrote:
>> Lol, it's opensource guys
>> https://github.com/ceph/ceph/tree/master/systemd
>> ceph-disk@
>>
>> 2015-12-03 21:59 GMT+03:00 Florent B :
>>> "ceph" service does mount :
>>>
>>> systemctl status ceph -l
>>> ● ceph.service - LSB: Start Ceph distributed file system daemons at boot
>>> time
>>>Loaded: loaded (/etc/init.d/ceph)
>>>Active: active (exited) since Thu 2015-12-03 17:48:52 CET; 2h 9min ago
>>>   Process: 931 ExecStart=/etc/init.d/ceph start (code=exited,
>>> status=0/SUCCESS)
>>>
>>> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1218.service.
>>> Dec 03 17:48:47 test3 ceph[931]: Starting ceph-create-keys on test3...
>>> Dec 03 17:48:47 test3 ceph[931]: === mds.1 ===
>>> Dec 03 17:48:47 test3 ceph[931]: Starting Ceph mds.1 on test3...
>>> Dec 03 17:48:47 test3 ceph[931]: Running as unit run-1318.service.
>>> Dec 03 17:48:47 test3 ceph[931]: === osd.2 ===
>>> Dec 03 17:48:47 test3 ceph[931]: Mounting xfs on
>>> test3:/var/lib/ceph/osd/ceph-2
>>> Dec 03 17:48:52 test3 ceph[931]: create-or-move updated item name
>>> 'osd.2' weight 0.8447 at location {host=test3,root=default} to crush map
>>> Dec 03 17:48:52 test3 ceph[931]: Starting Ceph osd.2 on test3...
>>> Dec 03 17:48:52 test3 ceph[931]: Running as unit run-1580.service.
>>>
>>>
>>> I don't see any udev rule related to Ceph on my servers...
>>>
>>>
>>> On 12/03/2015 07:56 PM, Adrien Gillard wrote:
 I think OSD are automatically mouted at boot via udev rules and that
 the ceph service does not handle the mounting part.

>>
>>
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Flapping OSDs, Large meta directories in OSDs

2015-12-03 Thread Tom Christensen
We were able to prevent the blacklist operations, and now the cluster is
much happier, however, the OSDs have not started cleaning up old osd maps
after 48 hours.  Is there anything we can do to poke them to get them to
start cleaning up old osd maps?
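
A hedged way to watch whether trimming actually resumes (assuming an osd.0 with 
an admin socket, and that the mon report exposes the committed osdmap range):

# per-OSD view: oldest_map should start climbing towards newest_map once trimming resumes
ceph daemon osd.0 status
# cluster-wide view of the committed osdmap range held by the mons
ceph report 2>/dev/null | grep -E 'osdmap_(first|last)_committed'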



On Wed, Dec 2, 2015 at 11:25 AM, Gregory Farnum  wrote:

> On Tue, Dec 1, 2015 at 10:02 AM, Tom Christensen  wrote:
> > Another thing that we don't quite grasp is that when we see slow requests
> > now they almost always, probably 95% have the "known_if_redirected" state
> > set.  What does this state mean?  Does it indicate we have OSD maps that
> are
> > lagging and the cluster isn't really in sync?  Could this be the cause of
> > our growing osdmaps?
>
> This is just a flag set on operations by new clients to let the OSD
> perform more effectively — you don't need to worry about it.
>
> I'm not sure why you're getting a bunch of client blacklist
> operations, but each one will generate a new OSDMap (if nothing else
> prompts one), yes.
> -Greg
>
> >
> > -Tom
> >
> >
> > On Tue, Dec 1, 2015 at 2:35 AM, HEWLETT, Paul (Paul)
> >  wrote:
> >>
> >> I believe that ‘filestore xattr use omap’ is no longer used in Ceph –
> can
> >> anybody confirm this?
> >> I could not find any usage in the Ceph source code except that the value
> >> is set in some of the test software…
> >>
> >> Paul
> >>
> >>
> >> From: ceph-users  on behalf of Tom
> >> Christensen 
> >> Date: Monday, 30 November 2015 at 23:20
> >> To: "ceph-users@lists.ceph.com" 
> >> Subject: Re: [ceph-users] Flapping OSDs, Large meta directories in OSDs
> >>
> >> What counts as ancient?  Concurrent to our hammer upgrade we went from
> >> 3.16->3.19 on ubuntu 14.04.  We are looking to revert to the 3.16 kernel
> >> we'd been running because we're also seeing an intermittent (its
> happened
> >> twice in 2 weeks) massive load spike that completely hangs the osd node
> >> (we're talking about load averages that hit 20k+ before the box becomes
> >> completely unresponsive).  We saw a similar behavior on a 3.13 kernel,
> which
> >> resolved by moving to the 3.16 kernel we had before.  I'll try to catch
> one
> >> with debug_ms=1 and see if I can see it we're hitting a similar hang.
> >>
> >> To your comment about omap, we do have filestore xattr use omap = true
> in
> >> our conf... which we believe was placed there by ceph-deploy (which we
> used
> >> to deploy this cluster).  We are on xfs, but we do take tons of RBD
> >> snapshots.  If either of these use cases will cause lots of osd map size
> >> then, we may just be exceeding the limits of the number of rbd snapshots
> >> ceph can handle (we take about 4-5000/day, 1 per RBD in the cluster)
> >>
> >> An interesting note, we had an OSD flap earlier this morning, and when
> it
> >> did, immediately after it came back I checked its meta directory size
> with
> >> du -sh, this returned immediately, and showed a size of 107GB.  The fact
> >> that it returned immediately indicated to me that something had just
> >> recently read through that whole directory and it was all cached in the
> FS
> >> cache.  Normally a du -sh on the meta directory takes a good 5 minutes
> to
> >> return.  Anyway, since it dropped this morning its meta directory size
> >> continues to shrink and is down to 93GB.  So it feels like something
> happens
> >> that makes the OSD read all its historical maps which results in the OSD
> >> hanging cause there are a ton of them, and then it wakes up and
> realizes it
> >> can delete a bunch of them...
> >>
> >> On Mon, Nov 30, 2015 at 2:11 PM, Dan van der Ster 
> >> wrote:
> >>>
> >>> The trick with debugging heartbeat problems is to grep back through the
> >>> log to find the last thing the affected thread was doing, e.g. is
> >>> 0x7f5affe72700 stuck in messaging, writing to the disk, reading through
> >>> the omap, etc..
> >>>
> >>> I agree this doesn't look to be network related, but if you want to rule
> >>> it out you should use debug_ms=1.
> >>>
> >>> Last week we upgraded a 1200 osd cluster from firefly to 0.94.5 and
> >>> similarly started getting slow requests. To make a long story short, our
> >>> issue turned out to be sendmsg blocking (very rarely), probably due to an
> >>> ancient el6 kernel (these osd servers had ~800 days' uptime). The
> >>> signature of this was 900s of slow requests, then an ms log showing
> >>> "initiating reconnect". Until we got the kernel upgraded everywhere, we
> >>> used a workaround of ms tcp read timeout = 60.
> >>> So, check your kernels, and upgrade if they're ancient. Latest el6
> >>> kernels work for us.
> >>>
> >>> Otherwise, those huge osd leveldb's don't look right. (Unless you're
> >>> using tons and tons of omap...) And it kinda reminds me of the other
> >>> problem we hit after the hammer upgrade, namely the 

Re: [ceph-users] [Ceph-maintainers] ceph packages link is gone

2015-12-03 Thread Dan Mick
This was sent to the ceph-maintainers list; answering here:

On 11/25/2015 02:54 AM, Alaâ Chatti wrote:
> Hello,
> 
> I used to install qemu-ceph on centos 6 machine from
> http://ceph.com/packages/, but the link has been removed, and there is
> no alternative in the documentation. Would you please update the link so
> I can install the version of qemu that supports rbd.
> 
> Thank you

Packages can be found on http://download.ceph.com/
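For example, on an RPM-based system a repo file along these lines can be used
for the ceph packages themselves; the exact baseurl for your release and
distro is an assumption you should verify by browsing download.ceph.com
first:

  cat > /etc/yum.repos.d/ceph.repo <<'EOF'
  [ceph]
  name=Ceph packages
  baseurl=http://download.ceph.com/rpm-hammer/el6/x86_64/
  enabled=1
  gpgcheck=1
  gpgkey=https://download.ceph.com/keys/release.asc
  EOF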

-- 
Dan Mick
Red Hat, Inc.
Ceph docs: http://ceph.com/docs
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-osd@.service does not mount OSD data disk

2015-12-03 Thread Timofey Titovets
On 3 Dec 2015 8:56 p.m., "Florent B"  wrote:
>
> By the way, when the system boots, the "ceph" service starts everything
> fine. So the "ceph-osd@" service is disabled => how do I restart an OSD?!
>
AFAIK, ceph now has 2 services:
1. One to mount the device
2. One to start the OSD

Also, a service can be disabled, but that does not mean it cannot run.

If ceph-osd@ is already started, you can easily check the status of the
service and restart it if needed.
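For a systemd-based install, a minimal example of doing exactly that (OSD id
0 is a placeholder):

  systemctl status ceph-osd@0
  systemctl restart ceph-osd@0
  # optionally enable the per-OSD unit so it is started directly at boot
  systemctl enable ceph-osd@0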
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Remap PGs with size=1 on specific OSD

2015-12-03 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Reweighting the OSD to 0.0 or setting the osd out (but not terminating
the process) should allow it to backfill the PGs to a new OSD. I would
try the reweight first (and in a test environment).
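A concrete sketch of those two options (osd id 12 is just an example, and
either reweight form below may be what was meant):

  # option 1: reweight to 0.0 so data drains off the OSD
  ceph osd reweight 12 0.0
  #   (or, alternatively: ceph osd crush reweight osd.12 0.0)
  # option 2: mark it out while leaving the daemon running
  ceph osd out 12
  # watch backfill progress
  ceph -w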
- 
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Dec 3, 2015 at 10:32 AM, Florent B  wrote:
> Hi everyone,
>
> I have some pools with size=1.
>
> I want to perform some maintenance on some OSDs.
>
> I would like to know if there is a way to remap PGs whose single copy is
> on osd.X.
>
> Is it possible ?
>
> I know about primary affinity, setting an OSD out... but this is not
> specific to PGs with size=1, it will also rebalance PGs with size=2
> which is not necessary in my case.
>
> Thank you a lot.
>
> Florent

-BEGIN PGP SIGNATURE-
Version: Mailvelope v1.3.0
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJWYIuACRDmVDuy+mK58QAA+KIP/36GFkHbbSPXBXvdsBFM
oJagMFjDmw3umUzVRGBsoO3+owqiZeMm8Wi8SpvEG1Xok84ZbUVrwI1JqzRw
canb/nWiIofadrmNzGGYzPPAFO/2fFrgOMD+pjeXIBxp/ClvECe/lA0FEWcp
an3cxRo0aGJCR3c0lzuBYit5/AuXHoxhC5+Yhtif2RebCh9WNktYtObpVRh2
sw6cGM2GNc/7HT1SYmesMLGvSdMyxrvNyQt6NDMP3q+fq3UR/cN93fdPqT/a
1OwME356zje9Ws5oqyg9ma1HFHA84W7Z6uKlaZHl8gQ5A4MTz5TpAPAyTwto
32tAxL0pj5wMQU3DqmVKkMhcXbso0A8XhNvvR8p31bywdxGuPwuVaqozyTCX
WuEwHOM+y+Jn/l0CJz8ZIJI8uSYom8/vtcB6st/AeXSsoBH/N8midgGIg1Co
HasKH4H+2njrKVCc12rIqqXY1OUbPyLJynFHbpu/vNKqKuqn8dVxUbPrRLoN
q+PUx28TIDql3ijhJPll/iPtAITzTe4oBlsGcg0r6UBu/Afe5yrbRB5m/2ON
AeCEPM+pzvOEQLa8E8QpTNuXNAOHxnI13tzMYwM5Vn5iBRaXPFTtSE+SCT6a
yOahKIpcAvAsV3OOTNT1YKd21A8GCc3MOv4aOMg7JTWCIg44MLSeApWvf86R
3sv+
=XSTX
-END PGP SIGNATURE-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Remap PGs with size=1 on specific OSD

2015-12-03 Thread Timofey Titovets
On 3 Dec 2015 9:35 p.m., "Robert LeBlanc"  wrote:
>
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Reweighting the OSD to 0.0 or setting the osd out (but not terminating
> the process) should allow it to backfill the PGs to a new OSD. I would
> try the reweight first (and in a test environment).
> - 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Dec 3, 2015 at 10:32 AM, Florent B  wrote:
> > Hi everyone,
> >
> > I have some pools with size=1.
> >
> > I want to perform some maintenance on some OSDs.
> >
> > I would like to know if there is a way to remap PGs which single copy is
> > on osd.X
> >
> > Is it possible ?
> >
> > I know about primary affinity, setting an OSD out... but this is not
> > specific to PGs with size=1, it will also rebalance PGs with size=2
> > which is not necessary in my case.
> >
> > Thank you a lot.
> >
> > Florent
>

AFAIK, you can only remap PGs away from an OSD by assigning the PGs to
another CRUSH rule. Just create a new, temporary crushmap with such a rule
and point the pool at it.
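Roughly, that would look like the following (the rule id 5 and pool name
"mypool" are made-up placeholders; the temporary rule itself still has to be
written by hand in the decompiled map):

  # dump and decompile the current crushmap
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # edit crushmap.txt: add a temporary rule that avoids the OSD in question
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new
  # point the size=1 pool at the temporary rule
  ceph osd pool set mypool crush_ruleset 5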
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-maintainers] ceph packages link is gone

2015-12-03 Thread Ken Dreyer
On Thu, Dec 3, 2015 at 5:53 PM, Dan Mick  wrote:
> This was sent to the ceph-maintainers list; answering here:
>
> On 11/25/2015 02:54 AM, Alaâ Chatti wrote:
>> Hello,
>>
>> I used to install qemu-ceph on centos 6 machine from
>> http://ceph.com/packages/, but the link has been removed, and there is
>> no alternative in the documentation. Would you please update the link so
>> I can install the version of qemu that supports rbd.
>>
>> Thank you
>
> Packages can be found on http://download.ceph.com/

When we re-arranged the download structure for packages and moved
everything from ceph.com to download.ceph.com, we did not carry
ceph-extras over.

The reason is that the packages there were unmaintained. The EL6 QEMU
binaries were vulnerable to VENOM (CVE-2015-3456) and maybe other
CVEs, and no users should rely on them any more.

We've removed all references to ceph-extras from our test framework
upstream (eg https://github.com/ceph/ceph-cm-ansible/pull/137) and I
recommend that everyone else do the same.

If you need QEMU with RBD support on CentOS, I recommend that you
upgrade from CentOS 6 to CentOS 7.1+. Red Hat's QEMU package in RHEL
7.1 is built with librbd support.
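As a quick sanity check after upgrading, one way to confirm the installed
qemu was built with rbd support is to look for rbd in its supported formats,
or to probe an image directly (the pool/image names below are examples):

  qemu-img --help | grep rbd
  qemu-img info rbd:rbd/myimage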
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs ceph: fill_inode badness

2015-12-03 Thread Don Waterloo
I have a file which is untouchable: ls -i gives an error, stat gives an
error. It shows ??? for all fields except the name.

How do I clean this up?

I'm on ubuntu 15.10, running 0.94.5
# ceph -v
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)

The node that accessed the file then caused a problem with the MDS:

root@nubo-1:/home/git/go/src/github.com/gogits/gogs# ceph status
cluster b23abffc-71c4-4464-9449-3f2c9fbe1ded
 health HEALTH_WARN
mds0: Client nubo-1 failing to respond to capability release
 monmap e1: 3 mons at {nubo-1=
10.100.10.60:6789/0,nubo-2=10.100.10.61:6789/0,nubo-3=10.100.10.62:6789/0}
election epoch 906, quorum 0,1,2 nubo-1,nubo-2,nubo-3
 mdsmap e418: 1/1/1 up {0=nubo-2=up:active}, 2 up:standby
 osdmap e2081: 6 osds: 6 up, 6 in
  pgmap v95696: 560 pgs, 6 pools, 131 GB data, 97784 objects
265 GB used, 5357 GB / 5622 GB avail
 560 active+clean

Trying a different node, I see the same problem.

I'm getting this error dumped to dmesg:

[670243.421212] Workqueue: ceph-msgr con_work [libceph]
[670243.421213]   e800e516 8810cd68f9d8
817e8c09
[670243.421215]    8810cd68fa18
8107b3c6
[670243.421217]  8810cd68fa28 ffea 

[670243.421218] Call Trace:
[670243.421221]  [] dump_stack+0x45/0x57
[670243.421223]  [] warn_slowpath_common+0x86/0xc0
[670243.421225]  [] warn_slowpath_null+0x1a/0x20
[670243.421229]  [] fill_inode.isra.18+0xc5c/0xc90 [ceph]
[670243.421233]  [] ? inode_init_always+0x107/0x1b0
[670243.421236]  [] ? ceph_mount+0x7e0/0x7e0 [ceph]
[670243.421241]  [] ceph_fill_trace+0x332/0x910 [ceph]
[670243.421248]  [] handle_reply+0x525/0xb70 [ceph]
[670243.421255]  [] dispatch+0x3c8/0xbb0 [ceph]
[670243.421260]  [] con_work+0x57b/0x1770 [libceph]
[670243.421262]  [] ? dequeue_task_fair+0x36b/0x700
[670243.421263]  [] ? put_prev_entity+0x31/0x420
[670243.421265]  [] ? __switch_to+0x1f9/0x5c0
[670243.421267]  [] process_one_work+0x1aa/0x440
[670243.421269]  [] worker_thread+0x4b/0x4c0
[670243.421271]  [] ? process_one_work+0x440/0x440
[670243.421273]  [] ? process_one_work+0x440/0x440
[670243.421274]  [] kthread+0xd8/0xf0
[670243.421276]  [] ? kthread_create_on_node+0x1f0/0x1f0
[670243.421277]  [] ret_from_fork+0x3f/0x70
[670243.421279]  [] ? kthread_create_on_node+0x1f0/0x1f0
[670243.421280] ---[ end trace 5cded7a882dfd5d1 ]---
[670243.421282] ceph: fill_inode badness 88179e2d9f28
1004e91.fffe

This problem persisted through a reboot, and there is no fsck to help me.

I also tried with ceph-fuse, but it crashes when I access the file.
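In case it helps anyone hitting the same "failing to respond to capability
release" warning, the client sessions and their caps can be inspected via
the admin socket on the active MDS (nubo-2 in the status output above); this
is only a starting point for debugging, not a fix:

  ceph daemon mds.nubo-2 session ls
  ceph health detail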
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy osd prepare for journal size 0

2015-12-03 Thread Mike Miller

Hi,

For testing, I would like to create some OSDs in the hammer release with
journal size 0.


I included this in ceph.conf:
[osd]
osd journal size = 0

Then I zapped the disk in question with 'ceph-deploy disk zap o1:sda' and
ran 'ceph-deploy disk prepare o1:sda'; the output is appended below, and it
fails at the mkfs.xfs step.

Thank you for your advice on how to prepare an OSD without a journal /
with journal size 0.
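For reference, one possible workaround rather than a confirmed answer: as
far as I know filestore still expects a journal, so instead of size 0 you
could either keep a small journal or put it on a separate device (the 1024
MB value and the sdb device are purely illustrative):

  # in ceph.conf, a small colocated journal instead of 0:
  #   [osd]
  #   osd journal size = 1024
  ceph-deploy disk zap o1:sda
  ceph-deploy osd prepare o1:sda
  # or keep the data on sda and give the journal its own device:
  ceph-deploy osd prepare o1:sda:sdb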


Thanks and regards,

Mike

---
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.22): /usr/bin/ceph-deploy disk 
prepare o1:sda

[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks o1:/dev/sda:
[o1][DEBUG ] connection detected need for sudo
[o1][DEBUG ] connected to host: o1
[o1][DEBUG ] detect platform information from remote host
[o1][DEBUG ] detect machine type
[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] Deploying osd to o1
[o1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[o1][INFO  ] Running command: sudo udevadm trigger 
--subsystem-match=block --action=add
[ceph_deploy.osd][DEBUG ] Preparing host o1 disk /dev/sda journal None 
activate False
[o1][INFO  ] Running command: sudo ceph-disk -v prepare --fs-type xfs 
--cluster ceph -- /dev/sda
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--cluster=ceph --show-config-value=fsid
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--cluster=ceph --show-config-value=osd_journal_size
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_cryptsetup_parameters
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_dmcrypt_key_size
[o1][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-conf 
--cluster=ceph --name=osd. --lookup osd_dmcrypt_type

[o1][WARNIN] INFO:ceph-disk:Will colocate journal with data on /dev/sda
[o1][WARNIN] DEBUG:ceph-disk:Creating journal partition num 2 size 0 on 
/dev/sda
[o1][WARNIN] INFO:ceph-disk:Running command: /sbin/sgdisk --new=2:0:0M 
--change-name=2:ceph journal 
--partition-guid=2:ded83283-2023-4c8e-93ae-b33341710bde 
--typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sda

[o1][DEBUG ] The operation has completed successfully.
[o1][WARNIN] DEBUG:ceph-disk:Calling partprobe on prepared device /dev/sda
[o1][WARNIN] INFO:ceph-disk:Running command: /sbin/partprobe /dev/sda
[o1][WARNIN] INFO:ceph-disk:Running command: /sbin/udevadm settle
[o1][WARNIN] DEBUG:ceph-disk:Journal is GPT partition 
/dev/disk/by-partuuid/ded83283-2023-4c8e-93ae-b33341710bde
[o1][WARNIN] DEBUG:ceph-disk:Journal is GPT partition 
/dev/disk/by-partuuid/ded83283-2023-4c8e-93ae-b33341710bde

[o1][WARNIN] DEBUG:ceph-disk:Creating osd partition on /dev/sda
[o1][WARNIN] INFO:ceph-disk:Running command: /sbin/sgdisk 
--largest-new=1 --change-name=1:ceph data 
--partition-guid=1:87ab533b-e530-4fa3-bfad-8a157a88cc88 
--typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be -- /dev/sda

[o1][DEBUG ] The operation has completed successfully.
[o1][WARNIN] DEBUG:ceph-disk:Calling partprobe on created device /dev/sda
[o1][WARNIN] INFO:ceph-disk:Running command: /sbin/partprobe /dev/sda
[o1][WARNIN] INFO:ceph-disk:Running command: /sbin/udevadm settle
[o1][WARNIN] DEBUG:ceph-disk:Creating xfs fs on /dev/sda1
[o1][WARNIN] INFO:ceph-disk:Running command: /sbin/mkfs -t xfs -f -i 
size=2048 -- /dev/sda1

[o1][WARNIN] warning: device is not properly aligned /dev/sda1
[o1][WARNIN] agsize (251 blocks) too small, need at least 4096 blocks
[o1][WARNIN] Usage: mkfs.xfs
[o1][WARNIN] /* blocksize */[-b log=n|size=num]
[o1][WARNIN] /* data subvol */  [-d 
agcount=n,agsize=n,file,name=xxx,size=num,
[o1][WARNIN] 
(sunit=value,swidth=value|su=num,sw=num),

[o1][WARNIN]sectlog=n|sectsize=num
[o1][WARNIN] /* inode size */   [-i 
log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,

[o1][WARNIN]projid32bit=0|1]
[o1][WARNIN] /* log subvol */   [-l 
agnum=n,internal,size=num,logdev=xxx,version=n
[o1][WARNIN] 
sunit=value|su=num,sectlog=n|sectsize=num,

[o1][WARNIN]lazy-count=0|1]
[o1][WARNIN] /* label */[-L label (maximum 12 characters)]
[o1][WARNIN] /* naming */   [-n log=n|size=num,version=2|ci]
[o1][WARNIN] /* prototype file */   [-p fname]
[o1][WARNIN] /* quiet */[-q]
[o1][WARNIN] /* realtime subvol */  [-r 

[ceph-users] Bug on rbd rm when using cache tiers Was: OSD on XFS ENOSPC at 84% data / 5% inode and inode64?

2015-12-03 Thread Laurent GUERBY
On Fri, 2015-11-27 at 10:00 +0100, Laurent GUERBY wrote:
> > 
> > Hi, from given numbers one can conclude that you are facing some kind
> > of XFS preallocation bug, because ((raw space divided by number of
> > files)) is four times lower than the ((raw space divided by 4MB
> > blocks)). At a glance it could be avoided by specifying relatively
> > small allocsize= mount option, of course by impacting overall
> > performance, appropriate benchmarks could be found through
> > ceph-users/ceph-devel. Also do you plan to preserve overcommit ratio
> > to be that high forever?
> 
> Hi again,
> 
> Looks like we hit a bug in image deletion leaving objects undeleted on
> disk:
> 
> http://tracker.ceph.com/issues/13894
> 
> I assume we'll get a lot more free space when it's fixed :).
> 
> Laurent

Hi,

As the bug above (rbd rm not releasing any real disk space on cache
tiers) has been closed as "Rejected" by the ceph developers, I added the
comment below to the ticket.

Since no usable LTS version will get this issue fixed for a few years,
ceph users should be aware of it:

http://tracker.ceph.com/issues/13894#note-15
<<
The issue we're facing is that the size reported by all ceph tools is
completely wrong, from my ticket:

pool name    KB       objects
ec4p1c       41297    2672953

2.6 million objects taking 41 megabytes according to ceph, but about 10
terabytes on disk.

As most ceph documentation suggests setting target_max_bytes, there will
be no eviction at all on rbd rm until it's too late, as we found out (OSDs
will die due to full disks, ...).

The only way to prevent users from running into this issue is either to
fix the byte counting for cache-promoted "rm"ed objects, or to tell users
for the years to come to use exclusively target_max_objects and not
target_max_bytes to control caching, with target_max_objects based on
their estimate of the available disk space and average object size.

Not fixing this issue will cause endless trouble for users for years to
come.
>>
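For readers who want to follow that advice, the relevant pool settings look
roughly like this (the pool name and object count are placeholders that have
to be sized from your own disks and average object size):

  # cap the cache tier by object count instead of (or in addition to) bytes
  ceph osd pool set my-cache-pool target_max_objects 1000000
  # flush/evict thresholds, expressed as fractions of the target
  ceph osd pool set my-cache-pool cache_target_dirty_ratio 0.4
  ceph osd pool set my-cache-pool cache_target_full_ratio 0.8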

Sincerely,

Laurent


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com