[ceph-users] how the files in /var/lib/ceph/osd/ceph-0 are generated

2018-04-03 Thread Jeffrey Zhang
I am testing Ceph Luminous. The environment is

- CentOS 7.4
- Ceph Luminous (official Ceph repo)
- ceph-deploy 2.0
- bluestore + separate wal and db

I found that the OSD folder `/var/lib/ceph/osd/ceph-0` is mounted
from tmpfs. But where do the files in that folder come from, like `keyring`
and `whoami`?

$ ls -alh /var/lib/ceph/osd/ceph-0/
lrwxrwxrwx.  1 ceph ceph   24 Apr  3 16:49 block -> /dev/ceph-pool/osd0.data
lrwxrwxrwx.  1 root root   22 Apr  3 16:49 block.db -> /dev/ceph-pool/osd0-db
lrwxrwxrwx.  1 root root   23 Apr  3 16:49 block.wal -> /dev/ceph-pool/osd0-wal
-rw-------.  1 ceph ceph   37 Apr  3 16:49 ceph_fsid
-rw-------.  1 ceph ceph   37 Apr  3 16:49 fsid
-rw-------.  1 ceph ceph   55 Apr  3 16:49 keyring
-rw-------.  1 ceph ceph    6 Apr  3 16:49 ready
-rw-------.  1 ceph ceph   10 Apr  3 16:49 type
-rw-------.  1 ceph ceph    2 Apr  3 16:49 whoami

I guess they may be loaded from bluestore, but I cannot find any clue about
this.
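
For reference, this is how I check the mount type (a quick sanity check with
standard tools; nothing Ceph-specific):

$ findmnt /var/lib/ceph/osd/ceph-0      # FSTYPE shows tmpfs for this OSD
$ df -hT /var/lib/ceph/osd/ceph-0       # same information via df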
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how the files in /var/lib/ceph/osd/ceph-0 are generated

2018-04-03 Thread Jeffrey Zhang
By the way, I am using ceph-volume.

I just tested ceph-disk. In that case, the ceph-0 folder is mounted from
/dev/sdb1.

So the tmpfs mount only happens when using ceph-volume? How does it work?
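
Side note: ceph-volume keeps its own record of the OSDs it created, and this is
what I use to see how it tracks them (just the stock ceph-volume subcommand; the
device path is the one from my listing above):

$ ceph-volume lvm list                            # all OSDs ceph-volume knows about, with devices and tags
$ ceph-volume lvm list /dev/ceph-pool/osd0.data   # restrict the output to one device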

On Wed, Apr 4, 2018 at 9:29 AM, Jeffrey Zhang <
zhang.lei.fly+ceph-us...@gmail.com> wrote:

> I am testing Ceph Luminous. The environment is
>
> - CentOS 7.4
> - Ceph Luminous (official Ceph repo)
> - ceph-deploy 2.0
> - bluestore + separate wal and db
>
> I found that the OSD folder `/var/lib/ceph/osd/ceph-0` is mounted
> from tmpfs. But where do the files in that folder come from, like `keyring`
> and `whoami`?
>
> $ ls -alh /var/lib/ceph/osd/ceph-0/
> lrwxrwxrwx.  1 ceph ceph   24 Apr  3 16:49 block -> /dev/ceph-pool/osd0.data
> lrwxrwxrwx.  1 root root   22 Apr  3 16:49 block.db -> /dev/ceph-pool/osd0-db
> lrwxrwxrwx.  1 root root   23 Apr  3 16:49 block.wal -> /dev/ceph-pool/osd0-wal
> -rw-------.  1 ceph ceph   37 Apr  3 16:49 ceph_fsid
> -rw-------.  1 ceph ceph   37 Apr  3 16:49 fsid
> -rw-------.  1 ceph ceph   55 Apr  3 16:49 keyring
> -rw-------.  1 ceph ceph    6 Apr  3 16:49 ready
> -rw-------.  1 ceph ceph   10 Apr  3 16:49 type
> -rw-------.  1 ceph ceph    2 Apr  3 16:49 whoami
>
> I guess they may be loaded from bluestore, but I cannot find any clue about
> this.
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how the files in /var/lib/ceph/osd/ceph-0 are generated

2018-04-06 Thread Jeffrey Zhang
Yes, I am using ceph-volume.

And I found where the keyring comes from.

BlueStore saves this information at the start of the block device
(BDEV_LABEL_BLOCK_SIZE = 4096); this area is used for storing labels,
including keyring, whoami, etc.

These can be read with `ceph-bluestore-tool show-label`:

$ ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0
{
    "/var/lib/ceph/osd/ceph-0/block": {
        "osd_uuid": "c349b2ba-690f-4a36-b6f6-2cc0d0839f29",
        "size": 2147483648,
        "btime": "2018-04-04 10:22:25.216117",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "14941be9-c327-4a17-8b86-be50ee2f962e",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "osd_key": "AQDgNsRaVtsRIBAA6pmOf7y2GBufyE83nHwVvg==",
        "ready": "ready",
        "whoami": "0"
    }
}

So when mounting /var/lib/ceph/osd/ceph-0, Ceph dumps this content into the
tmpfs folder.
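
In other words, activation seems to do roughly the following (a manual sketch of
the equivalent commands, not the exact ceph-volume code path; the LV paths and
OSD id are just from my setup):

$ mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0                  # the tmpfs the OSD dir lives on
$ ln -s /dev/ceph-pool/osd0.data /var/lib/ceph/osd/ceph-0/block  # recreate the block symlink
$ ceph-bluestore-tool prime-osd-dir \
      --dev /dev/ceph-pool/osd0.data \
      --path /var/lib/ceph/osd/ceph-0                            # dump the label (whoami, keyring, ...) into the dir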

On Fri, Apr 6, 2018 at 10:21 PM, David Turner  wrote:

> Likely the differences you're seeing between /dev/sdb1 and tmpfs have to do
> with how ceph-disk vs. ceph-volume manage the OSDs and what their defaults
> are.  ceph-disk creates partitions on devices, while ceph-volume configures
> LVM on the block device.  Also, with bluestore you do not have a standard
> filesystem, so ceph-volume creates a mock folder at /var/lib/ceph/osd/ceph-0
> to hold the necessary information for the OSD and how to start it.
>
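> As a rough illustration (not something ceph-volume documents in this exact
> form), that bookkeeping lives in LVM tags on the logical volume, so you can
> peek at it with plain LVM tooling; the tag names in the comment are examples
> of the kind of keys ceph-volume sets:
>
> $ lvs -o lv_name,vg_name,lv_tags    # tags like ceph.osd_id / ceph.type carry the OSD metadata
>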
> On Wed, Apr 4, 2018 at 6:20 PM Gregory Farnum  wrote:
>
>> On Tue, Apr 3, 2018 at 6:30 PM Jeffrey Zhang wrote:
>>
>>> I am testing Ceph Luminous. The environment is
>>>
>>> - CentOS 7.4
>>> - Ceph Luminous (official Ceph repo)
>>> - ceph-deploy 2.0
>>> - bluestore + separate wal and db
>>>
>>> I found that the OSD folder `/var/lib/ceph/osd/ceph-0` is mounted
>>> from tmpfs. But where do the files in that folder come from, like `keyring`
>>> and `whoami`?
>>>
>>
>> These are generated as part of the initialization process. I don't know
>> the exact commands involved, but the keyring for instance will draw from
>> the results of "ceph osd new" (which is invoked by one of the ceph-volume
>> setup commands). That and whoami are part of the basic information an OSD
>> needs to communicate with a monitor.
>> -Greg
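>>
>> For illustration, the CLI-level equivalent looks something like this (a
>> sketch, not the literal ceph-volume code path; osd.0 is just an example id):
>>
>> $ OSD_UUID=$(uuidgen)
>> $ ceph osd new $OSD_UUID       # registers the OSD with the monitors and prints its id
>> $ ceph auth get osd.0          # the key that ends up in the OSD dir's keyring file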
>>
>>
>>>
>>> $ ls -alh /var/lib/ceph/osd/ceph-0/
>>> lrwxrwxrwx.  1 ceph ceph   24 Apr  3 16:49 block -> /dev/ceph-pool/osd0.data
>>> lrwxrwxrwx.  1 root root   22 Apr  3 16:49 block.db -> /dev/ceph-pool/osd0-db
>>> lrwxrwxrwx.  1 root root   23 Apr  3 16:49 block.wal -> /dev/ceph-pool/osd0-wal
>>> -rw-------.  1 ceph ceph   37 Apr  3 16:49 ceph_fsid
>>> -rw-------.  1 ceph ceph   37 Apr  3 16:49 fsid
>>> -rw-------.  1 ceph ceph   55 Apr  3 16:49 keyring
>>> -rw-------.  1 ceph ceph    6 Apr  3 16:49 ready
>>> -rw-------.  1 ceph ceph   10 Apr  3 16:49 type
>>> -rw-------.  1 ceph ceph    2 Apr  3 16:49 whoami
>>>
>>> I guess they may be loaded from bluestore, but I cannot find any clue about
>>> this.
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Is there any way for ceph-osd to control the max fds?

2018-09-17 Thread Jeffrey Zhang
In one environment, which is deployed through containers, I found that ceph-osd
keeps committing suicide due to "error (24) Too many open files".

Then I increased LimitNOFILE for the container from 65k to 655k, which
fixed the issue. But the number of FDs keeps growing all the time; the maximum
is now around 155k, and I am afraid it will keep growing forever.
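
This is how I watch the count, in case it matters (plain procfs, nothing
Ceph-specific; it assumes a single ceph-osd process in the container):

$ OSD_PID=$(pidof ceph-osd)
$ ls /proc/$OSD_PID/fd | wc -l              # current number of open FDs
$ grep 'open files' /proc/$OSD_PID/limits   # the limit actually applied to the process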

I also found there is an option `max_open_files`, but it seems to be used only
by the upstart scripts at the OS level, and its default value is now 16384 [2].
Whereas with systemd, the `max_open_files` option is never loaded and the limit
is fixed at 1048576 by default [1]. So I guess if ceph-osd lives long enough, it
will still hit the OS-level limit and commit suicide in the end.
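
On plain systemd hosts (outside containers) the workaround I know of is a unit
override, roughly like this (a sketch; adjust the instance name and the value):

$ systemctl edit ceph-osd@0        # creates a drop-in override for the unit
  # in the editor, add:
  #   [Service]
  #   LimitNOFILE=655360
$ systemctl daemon-reload
$ systemctl restart ceph-osd@0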

So here are the questions:
1. Since almost all OS distros have already moved to systemd, is max_open_files
useless now?
2. Is there any mechanism by which ceph-osd could release some FDs?


[0]
https://github.com/ceph/ceph/commit/16c603b26f3d7dfcf0028e17fe3c04d94a434387
[1] https://github.com/ceph/ceph/commit/8453a89
[2]
https://github.com/ceph/ceph/commit/672c56b18de3b02606e47013edfc2e8b679d8797



-- 
Regards,
Jeffrey Zhang
Blog: http://xcodest.me
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is there any way for ceph-osd to control the max fds?

2018-09-19 Thread Jeffrey Zhang
Thanks Gregory for the explanation.

>
> files open (which mostly only applies to FileStore and there's a
> config, defaults to 1024 I think).
>
After searching the Ceph docs, I do not think there is such an option [0].
I found a similar option, `filestore flusher max fds = 512`, but it has been
deprecated since v0.65.

[0]
http://docs.ceph.com/docs/master/rados/configuration/filestore-config-ref/
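
The closest I can get is dumping the runtime config through the admin socket and
grepping for fd-related settings (standard admin-socket usage; osd.0 is just an
example):

$ ceph daemon osd.0 config show | grep -i -e fd -e open_files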

-- 
Regards,
Jeffrey Zhang
Blog: http://xcodest.me
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd + openshift cause cpu stuck now and then

2018-08-24 Thread Jeffrey Zhang
: 0001 RSI: 00b083a5 RDI:
939ccf1a6958
[4381987.771150] RBP: 93a1cf22fdf8 R08:  R09:

[4381987.771151] R10: 939c3f9dbb80 R11: db1b021f3800 R12:
ac232ec0
[4381987.771153] R13: acc73000 R14:  R15:

[4381987.771155] FS:  7fc6c57db880() GS:939c3f9c()
knlGS:
[4381987.771157] CS:  0010 DS:  ES:  CR0: 80050033
[4381987.771171] CR2: 7fc6c499c15c CR3: 00108440 CR4:
001607e0
[4381987.771174] Call Trace:
[4381987.771181]  [] ? shrink_dentry_list+0x3c/0x230
[4381987.771185]  [] shrink_dcache_sb+0x9a/0xe0
[4381987.771190]  [] do_remount_sb+0x51/0x200
[4381987.771195]  [] do_mount+0x757/0xce0
[4381987.771200]  [] ? memdup_user+0x42/0x70
[4381987.771202]  [] SyS_mount+0x83/0xd0
[4381987.771208]  [] system_call_fastpath+0x1c/0x21
[4381987.771210] Code: 44 00 00 85 d2 74 e4 0f 1f 40 00 eb ed 66 0f 1f
44 00 00 b8 01 00 00 00 5d c3 90 0f 1f 44 00 00 31 c0 ba 01 00 00 00
f0 0f b1 17 <85> c0 75 01 c3 55 89 c6 48 89 e5 e8 c5 2c ff ff 5d c3 0f
1f 40
[4381990.001417] rbd: rbd0: encountered watch error: -107
[4382009.590969] libceph: mon0 192.168.100.74:6789 session lost,
hunting for new mon
[4382009.593496] libceph: mon2 192.168.100.75:6789 session established
[4382016.237493] ixgbe :05:00.0 p6p1: initiating reset due to tx timeout
[4382016.237555] ixgbe :05:00.0 p6p1: Reset adapter
[4382035.769185] NMI watchdog: BUG: soft lockup - CPU#14 stuck for
22s! [etcd:214637]
[4382035.770413] Modules linked in: vfat fat isofs ip_vs fuse ext4
mbcache jbd2 rbd libceph dns_resolver cfg80211 rfkill udp_diag
unix_diag tcp_diag inet_diag veth nf_conntrack_netlink nfnetlink
xt_statistic xt_nat xt_recent ipt_REJECT nf_reject_ipv4 xt_mark
xt_comment ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
xt_addrtype iptable_filter xt_conntrack br_netfilter bridge stp llc
overlay(T) scsi_transport_iscsi bonding vport_vxlan vxlan
ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat
nf_conntrack sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi
kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel
lrw gf128mul glue_helper ablk_helper cryptd sg ipmi_ssif joydev mei_me
mei iTCO_wdt iTCO_vendor_support
[4382035.770480]  pcspkr dcdbas ipmi_si ipmi_devintf ipmi_msghandler
shpchp lpc_ich acpi_pad acpi_power_meter wmi nfsd auth_rpcgss nfs_acl
lockd grace sunrpc ip_tables xfs sr_mod sd_mod cdrom mgag200
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops ahci ttm libahci drm ixgbe libata crc32c_intel tg3
megaraid_sas i2c_core mdio dca ptp pps_core dm_mirror dm_region_hash
dm_log dm_snapshot target_core_user uio target_core_mod crc_t10dif
crct10dif_generic crct10dif_pclmul crct10dif_common dm_multipath
dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_mod
libcrc32c
[4382035.770521] CPU: 14 PID: 214637 Comm: etcd Kdump: loaded Tainted:
GWL  T 3.10.0-862.6.3.el7.x86_64 #1
[4382035.770523] Hardware name: Dell Inc. PowerEdge R720/0DCWD1, BIOS
2.6.1 02/12/2018
[4382035.770526] task: 93a7ff2e2f70 ti: 93aaefe94000 task.ti:
93aaefe94000
[4382035.770528] RIP: 0010:[]  []
__d_free+0x18/0x40
[4382035.770536] RSP: :939c3f9c3ea8  EFLAGS: 0292
[4382035.770538] RAX: ac232ec0 RBX: 93a58a68eb40 RCX:
0001002a001f
[4382035.770540] RDX: 93a9cca75eb0 RSI: db1b76329d00 RDI:
93a9cca75e38
[4382035.770541] RBP: 939c3f9c3eb0 R08: 93a9cca74b40 R09:
0001002a001f
[4382035.770543] R10: cca75501 R11: db1b76329d00 R12:
939c3f9c3e18
[4382035.770544] R13: ac7217b2 R14: 939c3f9c3eb0 R15:
93a9cca75e00
[4382035.770546] FS:  7f36d7fff700() GS:939c3f9c()
knlGS:
[4382035.770548] CS:  0010 DS:  ES:  CR0: 80050033
[4382035.770550] CR2: 00c431171000 CR3: 001d1a2a4000 CR4:
001607e0
[4382035.770552] Call Trace:
[4382035.770555]  
[4382035.770565]  [] rcu_process_callbacks+0x1e0/0x580
[4382035.770572]  [] __do_softirq+0xf5/0x280
[4382035.770577]  [] call_softirq+0x1c/0x30
[4382035.770582]  [] do_softirq+0x65/0xa0
[4382035.770585]  [] irq_exit+0x105/0x110
[4382035.770588]  [] smp_apic_timer_interrupt+0x48/0x60
[4382035.770593]  [] apic_timer_interrupt+0x162/0x170
[4382035.770594]  
[4382035.770596] Code:
[4382035.770597] 00 00 01 c6 07 00 0f 1f 40 00 5b 41 5c 5d c3 0f 1f 40
00 0f 1f 44 00 00 55 48 89 e5 53 48 8d 9f 50 ff ff ff 48 8b bf 78 ff
ff ff <48> 8d 43 38 48 39 c7 74 05 e8 ba 43 fc ff 48 8b 3d 4b 65 b1 00
[4382048.912110] libceph: mon2 192.168.100.75:6789 session lost,
hunting for new mon
[4382048.921812] rbd: rbd0: encountered watch error: -107
[4382049.175254] bond2: link status definitely down for interface
p6p1, disabling it
[4382049.243019] ixgbe :05:00.0 p6p1: detected SFP+: