[ceph-users] Ceph Kraken + CephFS + Kernel NFSv3/v4.1 + ESXi

2017-03-13 Thread Timofey Titovets
Hi, has anyone tried this stack? Maybe someone can provide some
feedback about it?

Thanks.

P.S.
AFAIK, right now Ceph RBD + LIO lacks iSCSI HA support, so I'm thinking about NFS.

UPD1:
I did some tests and got strange behavior:
every few minutes IO from the NFS client to the NFS proxy just stops, with no
messages in dmesg on either the client or the server.
A reboot fixed it every time, but it's very bad.
And this is without any HA in place;
just mounting CephFS, exporting it, and running some fio load.
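
A minimal sketch of such a setup (monitor address, paths, export options and
the fio job below are illustrative placeholders, not the exact ones used):

# on the NFS proxy: mount CephFS with the kernel client
mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

# /etc/exports on the proxy (then run: exportfs -ra)
/mnt/cephfs  *(rw,sync,no_root_squash,fsid=100)

# on the NFS client: some write load against the mounted export
fio --name=nfstest --directory=/mnt/nfs --rw=randwrite --bs=4k --size=1G --numjobs=4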

-- 
Have a nice day,
Timofey.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] total storage size available in my CEPH setup?

2017-03-13 Thread Christian Balzer

Hello,

On Mon, 13 Mar 2017 21:32:45 + James Okken wrote:

> Hi all,
> 
> I have a 3 storage node openstack setup using CEPH.
> I believe that means I have 3 OSDs, as each storage node has a one of 3 fiber 
> channel storage locations mounted.

You use "believe" a lot, so I'm assuming you're quite new and unfamiliar
with Ceph.
Reading docs and google are your friends.

> The storage media behind each node is actually single 7TB HP fiber channel 
> MSA array.
> The best performance configuration for the hard drives in the MSA just 
> happened to be 3x 2.3TB RAID10's. And that matched nicely to the 
> 3xStorageNode/OSD of the CEPH setup.

A pretty unusual approach, not that others (including me) haven't done
similar things.
Having just 3 OSDs is iffy, since there are corner cases where Ceph may
not be able to distribute PGs (using default parameters) with such a small
pool of OSDs. 

> I believe my replication factor is 3.
> 
You answered that yourself in the dump, a
"ceph osd pool ls detail"
would have been more elegant and yielded the same information.

You also have a min_size of 1, which can be problematic (search the
archives), but with a really small cluster that may still be advantageous.
Lastly, since your OSDs are RAIDs and thus very reliable, a replication of
2 is feasible.
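
If you go that route, it is a per-pool setting; a sketch using one of the pool
names from your dump (repeat for the other pools as needed):

ceph osd pool set volumes size 2
ceph osd pool set volumes min_size 1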

> My question is how much total CEPH storage does this allow me? Only 2.3TB? or 
> does the way CEPH duplicates data enable more than 1/3 of the storage?

3 means 3, so 2.3TB. Note that Ceph is sparse (thin-provisioned), so that can help quite a bit.
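
Back of the envelope, going by your "ceph osd tree" weights: 3 x 2.18 TB raw
= 6.54 TB total, divided by a replication factor of 3 gives roughly 2.18 TB
of usable space (before filesystem and Ceph overhead).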

> A follow up question would be what is the best way to tell, thru CEPH, the 
> space used and space free? Thanks!!
> 
Well, what do you use with *ix, "df", don'tcha?
"ceph df {detail}"

Christian

> root@node-1:/var/log# ceph osd tree
> ID WEIGHT  TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 6.53998 root default
> -5 2.17999 host node-28
> 3 2.17999 osd.3 up  1.0  1.0
> -6 2.17999 host node-30
> 4 2.17999 osd.4 up  1.0  1.0
> -7 2.17999 host node-31
> 5 2.17999 osd.5 up  1.0  1.0
> 0   0 osd.0   down0  1.0
> 1   0 osd.1   down0  1.0
> 2   0 osd.2   down0  1.0
> 
> 
> 
> ##
> root@node-1:/var/log# ceph osd lspools
> 0 rbd,2 volumes,3 backups,4 .rgw.root,5 .rgw.control,6 .rgw,7 .rgw.gc,8 
> .users.uid,9 .users,10 compute,11 images,
> 
> 
> 
> ##
> root@node-1:/var/log# ceph osd dump
> epoch 216
> fsid d06d61b0-1cd0-4e1a-ac20-67972d0e1fde
> created 2016-10-11 14:15:05.638099
> modified 2017-03-09 14:45:01.030678
> flags
> pool 0 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 2 'volumes' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 130 flags hashpspool stripe_width 0
> removed_snaps [1~5]
> pool 3 'backups' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 14 flags hashpspool stripe_width 0
> pool 4 '.rgw.root' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 16 flags hashpspool stripe_width 0
> pool 5 '.rgw.control' replicated size 3 min_size 1 crush_ruleset 0 
> object_hash rjenkins pg_num 64 pgp_num 64 last_change 18 owner 
> 18446744073709551615 flags hashpspool stripe_width 0
> pool 6 '.rgw' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 20 owner 18446744073709551615 flags 
> hashpspool stripe_width 0
> pool 7 '.rgw.gc' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 21 flags hashpspool stripe_width 0
> pool 8 '.users.uid' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 22 owner 18446744073709551615 flags 
> hashpspool stripe_width 0
> pool 9 '.users' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 24 flags hashpspool stripe_width 0
> pool 10 'compute' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 216 flags hashpspool stripe_width 0
> removed_snaps [1~37]
> pool 11 'images' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 189 flags hashpspool stripe_width 0
> removed_snaps [1~3,5~8,f~4,14~2,18~2,1c~1,1e~1]
> max_osd 6
> osd.0 down out weight 0 up_from 48 up_thru 50 down_at 52 last_clean_interval 
> [44,45) 192.168.0.9:6800/4485 192.168.1.4:6800/4485 192.168.1.4:6801/4485 
> 192.168.0.9:6801/4485 exists,new
> osd.1 down out weight 0 up_from 10 up_thru 48 down_at 50 last_clean_interval 
> [5,8) 192.168.0.7:6800/60912 192.168.1.6:6801/60912 192.168.1.6:6802/60912 
> 192.168.

Re: [ceph-users] speed decrease with size

2017-03-13 Thread Christian Balzer

Hello,

On Mon, 13 Mar 2017 11:25:15 -0400 Ben Erridge wrote:

> On Sun, Mar 12, 2017 at 8:24 PM, Christian Balzer  wrote:
> 
> >
> > Hello,
> >
> > On Sun, 12 Mar 2017 19:37:16 -0400 Ben Erridge wrote:
> >  
> > > I am testing attached volume storage on our openstack cluster which uses
> > > ceph for block storage.
> > > our Ceph nodes have large SSD's for their journals 50+GB for each OSD.  
> > I'm  
> > > thinking some parameter is a little off because with relatively small
> > > writes I am seeing drastically reduced write speeds.
> > >  
> > Large journals are a waste for most people, especially when your backing
> > storage are HDDs.
> >  
> > >
> > > we have 2 nodes withs 12 total OSD's each with 50GB SSD Journal.
> > >  
> > I hope that's not your plan for production, with a replica of 2 you're
> > looking at pretty much guaranteed data loss over time, unless your OSDs
> > are actually RAIDs.
> >
> > I am aware that replica of 3 is suggested thanks.  
> 
> 
> > 5GB journals tend to be overkill already.
> > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008606.html
> >
> > If you were to actually look at your OSD nodes during those tests with
> > something like atop or "iostat -x", you'd likely see that with prolonged
> > writes you wind up with the speed of what your HDDs can do, i.e. see them
> > (all or individually) being quite busy.
> >  
> 
> That is what I was thinking as well which is not what I want. I want to
> better utilize these large SSD journals. If I have 50GB journal
> and I only want to write 5GB of data I should be able to get near SSD speed
> for this operation. Why am I not? 
See the thread above and
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-June/010754.html

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-April/038669.html

> Maybe I should increase
> *filestore_max_sync_interval.*
> 
That is your least worry, even though it seems to be the first parameter
to change.
Use your google foo to find some really old threads about this.

The journal* parameters are what you want to look at, see the threads
above. And AFAIK Ceph will flush the journal at 50% full, no matter what.

And in the end you will likely find that using your 50GB journals in full
is difficult, and doing so without getting very uneven performance is
nearly impossible.
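
For reference, the knobs in question live in the [osd] section of ceph.conf;
the values below are illustrative placeholders only (see the threads linked
above for the defaults and a discussion of sensible ranges), not a
recommendation:

[osd]
filestore_min_sync_interval = 0.01
filestore_max_sync_interval = 15
journal_max_write_bytes = 1073741824
journal_max_write_entries = 1000
journal_queue_max_ops = 3000
journal_queue_max_bytes = 1073741824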

Christian
> 
> >
> > Lastly, for nearly everybody in real life situations the
> > bandwidth/throughput becomes a distant second to latency considerations.
> >  
> 
> Thanks for the advice however.
> 
> 
> > Christian
> >  
> > >
> > >  here is our Ceph config
> > >
> > > [global]
> > > fsid = 19bc15fd-c0cc-4f35-acd2-292a86fbcf7d
> > > mon_initial_members = node-5 node-4 node-3
> > > mon_host = 192.168.0.8 192.168.0.7 192.168.0.13
> > > auth_cluster_required = cephx
> > > auth_service_required = cephx
> > > auth_client_required = cephx
> > > filestore_xattr_use_omap = true
> > > log_to_syslog_level = info
> > > log_to_syslog = True
> > > osd_pool_default_size = 1
> > > osd_pool_default_min_size = 1
> > > osd_pool_default_pg_num = 64
> > > public_network = 192.168.0.0/24
> > > log_to_syslog_facility = LOG_LOCAL0
> > > osd_journal_size = 5
> > > auth_supported = cephx
> > > osd_pool_default_pgp_num = 64
> > > osd_mkfs_type = xfs
> > > cluster_network = 192.168.1.0/24
> > > osd_recovery_max_active = 1
> > > osd_max_backfills = 1
> > >
> > > [client]
> > > rbd_cache = True
> > > rbd_cache_writethrough_until_flush = True
> > >
> > > [client.radosgw.gateway]
> > > rgw_keystone_accepted_roles = _member_, Member, admin, swiftoperator
> > > keyring = /etc/ceph/keyring.radosgw.gateway
> > > rgw_socket_path = /tmp/radosgw.sock
> > > rgw_keystone_revocation_interval = 100
> > > rgw_keystone_url = 192.168.0.2:35357
> > > rgw_keystone_admin_token = ZBz37Vlv
> > > host = node-3
> > > rgw_dns_name = *.ciminc.com
> > > rgw_print_continue = True
> > > rgw_keystone_token_cache_size = 10
> > > rgw_data = /var/lib/ceph/radosgw
> > > user = www-data
> > >
> > > This is the degradation I am speaking of..
> > >
> > >
> > > dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=1k; rm -f
> > > /mnt/ext4/output;
> > > 1024+0 records in
> > > 1024+0 records out
> > > 1048576000 bytes (1.0 GB) copied, 0.887431 s, 1.2 GB/s
> > >
> > > dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=2k; rm -f
> > > /mnt/ext4/output;
> > > 2048+0 records in
> > > 2048+0 records out
> > > 2097152000 bytes (2.1 GB) copied, 3.75782 s, 558 MB/s
> > >
> > >  dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=3k; rm -f
> > > /mnt/ext4/output;
> > > 3072+0 records in
> > > 3072+0 records out
> > > 3145728000 bytes (3.1 GB) copied, 10.0054 s, 314 MB/s
> > >
> > > dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=5k; rm -f
> > > /mnt/ext4/output;
> > > 5120+0 records in
> > > 5120+0 records out
> > > 5242880000 bytes (5.2 GB) copied, 24.1971 s, 217 MB/s
> > >
> > > Any suggestions for improving the large write degradation?  
> >
> >
> > --
> > Christian Bal

Re: [ceph-users] CephFS fuse client users stuck

2017-03-13 Thread John Spray
On Mon, Mar 13, 2017 at 8:15 PM, Andras Pataki
 wrote:
> Dear Cephers,
>
> We're using the ceph file system with the fuse client, and lately some of
> our processes are getting stuck seemingly waiting for fuse operations.  At
> the same time, the cluster is healthy, no slow requests, all OSDs up and
> running, and both the MDS and the fuse client think that there are no
> pending operations.  The situation is semi-reproducible.  When I run a
> various cluster jobs, some get stuck after a few hours of correct operation.
> The cluster is on ceph 10.2.5 and 10.2.6, the fuse clients are 10.2.6, but I
> have tried 10.2.5 and 10.2.3, all of which have the same issue.  This is on
> CentOS (7.2 for the clients, 7.3 for the MDS/OSDs).
>
> Here are some details:
>
> The node with the stuck processes:
>
> [root@worker1070 ~]# ps -auxwww | grep 30519
> apataki   30519 39.8  0.9 8728064 5257588 ? Dl   12:11  60:50 ./Arepo
> param.txt 2 6
> [root@worker1070 ~]# cat /proc/30519/stack
> [] fuse_file_aio_write+0xbb/0x340 [fuse]
> [] do_sync_write+0x8d/0xd0
> [] vfs_write+0xbd/0x1e0
> [] SyS_write+0x7f/0xe0
> [] system_call_fastpath+0x16/0x1b
> [] 0x
>
> [root@worker1070 ~]# ps -auxwww | grep 30533
> apataki   30533 39.8  0.9 8795316 5261308 ? Sl   12:11  60:55 ./Arepo
> param.txt 2 6
> [root@worker1070 ~]# cat /proc/30533/stack
> [] wait_answer_interruptible+0x91/0xe0 [fuse]
> [] __fuse_request_send+0x253/0x2c0 [fuse]
> [] fuse_request_send+0x12/0x20 [fuse]
> [] fuse_send_write+0xd6/0x110 [fuse]
> [] fuse_perform_write+0x2ed/0x590 [fuse]
> [] fuse_file_aio_write+0x2a1/0x340 [fuse]
> [] do_sync_write+0x8d/0xd0
> [] vfs_write+0xbd/0x1e0
> [] SyS_write+0x7f/0xe0
> [] system_call_fastpath+0x16/0x1b
> [] 0x
>
> Presumably the second process is waiting on the first holding some lock ...
>
> The fuse client on the node:
>
> [root@worker1070 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok status
> {
> "metadata": {
> "ceph_sha1": "656b5b63ed7c43bd014bcafd81b001959d5f089f",
> "ceph_version": "ceph version 10.2.6
> (656b5b63ed7c43bd014bcafd81b001959d5f089f)",
> "entity_id": "admin",
> "hostname": "worker1070",
> "mount_point": "\/mnt\/ceph",
> "root": "\/"
> },
> "dentry_count": 40,
> "dentry_pinned_count": 23,
> "inode_count": 123,
> "mds_epoch": 19041,
> "osd_epoch": 462327,
> "osd_epoch_barrier": 462326
> }
>
> [root@worker1070 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok
> mds_sessions
> {
> "id": 3616543,
> "sessions": [
> {
> "mds": 0,
> "addr": "10.128.128.110:6800\/909443124",
> "seq": 338,
> "cap_gen": 0,
> "cap_ttl": "2017-03-13 14:47:37.575229",
> "last_cap_renew_request": "2017-03-13 14:46:37.575229",
> "cap_renew_seq": 12694,
> "num_caps": 713,
> "state": "open"
> }
> ],
> "mdsmap_epoch": 19041
> }
>
> [root@worker1070 ~]# ceph daemon /var/run/ceph/ceph-client.admin.asok
> mds_requests
> {}
>
>
> The overall cluster health and the MDS:
>
> [root@cephosd000 ~]# ceph -s
> cluster d7b33135-0940-4e48-8aa6-1d2026597c2f
>  health HEALTH_WARN
> noscrub,nodeep-scrub,require_jewel_osds flag(s) set
>  monmap e17: 3 mons at
> {hyperv029=10.4.36.179:6789/0,hyperv030=10.4.36.180:6789/0,hyperv031=10.4.36.181:6789/0}
> election epoch 29148, quorum 0,1,2 hyperv029,hyperv030,hyperv031
>   fsmap e19041: 1/1/1 up {0=cephosd000=up:active}
>  osdmap e462328: 624 osds: 624 up, 624 in
> flags noscrub,nodeep-scrub,require_jewel_osds
>   pgmap v44458747: 42496 pgs, 6 pools, 924 TB data, 272 Mobjects
> 2154 TB used, 1791 TB / 3946 TB avail
>42496 active+clean
>   client io 86911 kB/s rd, 556 MB/s wr, 227 op/s rd, 303 op/s wr
>
> [root@cephosd000 ~]# ceph daemon /var/run/ceph/ceph-mds.cephosd000.asok ops
> {
> "ops": [],
> "num_ops": 0
> }
>
>
> The odd thing is that if in this state I restart the MDS, the client process
> wakes up and proceeds with its work without any errors.  As if a request was
> lost and somehow retransmitted/restarted when the MDS got restarted and the
> fuse layer reconnected to it.

Interesting.  A couple of ideas for more debugging:

* Next time you go through this process of restarting the MDS while
there is a stuck client, first increase the client's logging ("ceph
daemon <path to client .asok> config set debug_client 20").  Then we
should get a clear sense of exactly what's happening on the MDS
restart that's enabling the client to proceed.
* When inspecting the client's "mds_sessions" output, also check the
"session ls" output on the MDS side to make sure the MDS and client
both agree that it has an open session.
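
For reference, using the asok paths from earlier in this thread (adjust the
daemon names/paths to your setup), that would be roughly:

# on the client node: bump client-side debug logging
ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_client 20

# on the MDS node: list the sessions the MDS knows about
ceph daemon /var/run/ceph/ceph-mds.cephosd000.asok session ls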

John

>
> When I try to attach a gdb session to either of the client processes, gdb
> just hangs.  However, right after the MDS restart gdb attaches to the
> pr

[ceph-users] total storage size available in my CEPH setup?

2017-03-13 Thread James Okken
Hi all,

I have a 3 storage node openstack setup using CEPH.
I believe that means I have 3 OSDs, as each storage node has a one of 3 fiber 
channel storage locations mounted.
The storage media behind each node is actually single 7TB HP fiber channel MSA 
array.
The best performance configuration for the hard drives in the MSA just happened 
to be 3x 2.3TB RAID10's. And that matched nicely to the 3xStorageNode/OSD of 
the CEPH setup.
I believe my replication factor is 3.

My question is how much total CEPH storage does this allow me? Only 2.3TB? or 
does the way CEPH duplicates data enable more than 1/3 of the storage?
A follow up question would be what is the best way to tell, thru CEPH, the 
space used and space free? Thanks!!

root@node-1:/var/log# ceph osd tree
ID WEIGHT  TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 6.53998 root default
-5 2.17999 host node-28
3 2.17999 osd.3 up  1.0  1.0
-6 2.17999 host node-30
4 2.17999 osd.4 up  1.0  1.0
-7 2.17999 host node-31
5 2.17999 osd.5 up  1.0  1.0
0   0 osd.0   down0  1.0
1   0 osd.1   down0  1.0
2   0 osd.2   down0  1.0



##
root@node-1:/var/log# ceph osd lspools
0 rbd,2 volumes,3 backups,4 .rgw.root,5 .rgw.control,6 .rgw,7 .rgw.gc,8 
.users.uid,9 .users,10 compute,11 images,



##
root@node-1:/var/log# ceph osd dump
epoch 216
fsid d06d61b0-1cd0-4e1a-ac20-67972d0e1fde
created 2016-10-11 14:15:05.638099
modified 2017-03-09 14:45:01.030678
flags
pool 0 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 2 'volumes' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 130 flags hashpspool stripe_width 0
removed_snaps [1~5]
pool 3 'backups' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 14 flags hashpspool stripe_width 0
pool 4 '.rgw.root' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 16 flags hashpspool stripe_width 0
pool 5 '.rgw.control' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 18 owner 18446744073709551615 flags 
hashpspool stripe_width 0
pool 6 '.rgw' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 20 owner 18446744073709551615 flags hashpspool 
stripe_width 0
pool 7 '.rgw.gc' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 21 flags hashpspool stripe_width 0
pool 8 '.users.uid' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 22 owner 18446744073709551615 flags 
hashpspool stripe_width 0
pool 9 '.users' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 24 flags hashpspool stripe_width 0
pool 10 'compute' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 216 flags hashpspool stripe_width 0
removed_snaps [1~37]
pool 11 'images' replicated size 3 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 189 flags hashpspool stripe_width 0
removed_snaps [1~3,5~8,f~4,14~2,18~2,1c~1,1e~1]
max_osd 6
osd.0 down out weight 0 up_from 48 up_thru 50 down_at 52 last_clean_interval 
[44,45) 192.168.0.9:6800/4485 192.168.1.4:6800/4485 192.168.1.4:6801/4485 
192.168.0.9:6801/4485 exists,new
osd.1 down out weight 0 up_from 10 up_thru 48 down_at 50 last_clean_interval 
[5,8) 192.168.0.7:6800/60912 192.168.1.6:6801/60912 192.168.1.6:6802/60912 
192.168.0.7:6801/60912 exists,new
osd.2 down out weight 0 up_from 10 up_thru 48 down_at 50 last_clean_interval 
[5,8) 192.168.0.6:6800/61013 192.168.1.7:6800/61013 192.168.1.7:6801/61013 
192.168.0.6:6801/61013 exists,new
osd.3 up   in  weight 1 up_from 192 up_thru 201 down_at 190 last_clean_interval 
[83,191) 192.168.0.9:6800/2634194 192.168.1.7:6802/3634194 
192.168.1.7:6803/3634194 192.168.0.9:6802/3634194 exists,up 
28b02052-3196-4203-bec8-ac83a69fcbc5
osd.4 up   in  weight 1 up_from 196 up_thru 201 down_at 194 last_clean_interval 
[80,195) 192.168.0.7:6800/2629319 192.168.1.6:6802/3629319 
192.168.1.6:6803/3629319 192.168.0.7:6802/3629319 exists,up 
124b58e6-1e38-4246-8838-cfc3b88e8a5a
osd.5 up   in  weight 1 up_from 201 up_thru 201 down_at 199 last_clean_interval 
[134,200) 192.168.0.6:6800/5494 192.168.1.4:6802/1005494 
192.168.1.4:6803/1005494 192.168.0.6:6802/1005494 exists,up 
ddfca14e-e6f6-4c48-aa8f-0ebfc765d32f
root@node-1:/var/log#


James Okken
Lab Manager
Dialogic Research Inc.
4 Gatehall Drive
Parsippany
NJ 07054
USA

Tel:   973 967 5179
Email:   james.ok...@dialogic.com

Re: [ceph-users] cephfs deep scrub error:

2017-03-13 Thread Gregory Farnum
On Mon, Mar 13, 2017 at 3:28 AM, Dan van der Ster  wrote:
> Hi John,
>
> Last week we updated our prod CephFS cluster to 10.2.6 (clients and
> server side), and for the first time today we've got an object info
> size mismatch:
>
> I found this ticket you created in the tracker, which is why I've
> emailed you: http://tracker.ceph.com/issues/18240
>
> Here's the detail of our error:
>
> 2017-03-13 07:17:49.989297 osd.67 :6819/3441125 262 : cluster
> [ERR] deep-scrub 1.3da 1:5bc0e9dc:::1260f4b.0003:head on disk
> size (4187974) does not match object info size (4193094) adjusted for
> ondisk to (4193094)
>
> All three replica's have the same object size/md5sum:
>
> # ls -l 1260f4b.0003__head_3B9703DA__1
> -rw-r--r--. 1 ceph ceph 4187974 Mar 12 18:50
> 1260f4b.0003__head_3B9703DA__1
> # md5sum 1260f4b.0003__head_3B9703DA__1
> db1e1bab199b33fce3ad9195832626ef 1260f4b.0003__head_3B9703DA__1
>
> And indeed the object info does not agree with the files on disk:
>
> # ceph-dencoder type object_info_t import /tmp/attr1 decode dump_json
> {
> "oid": {
> "oid": "1260f4b.0003",
> "key": "",
> "snapid": -2,
> "hash": 999752666,
> "max": 0,
> "pool": 1,
> "namespace": ""
> },
> "version": "5262'221037",
> "prior_version": "5262'221031",
> "last_reqid": "osd.67.0:1180241",
> "user_version": 221031,

The only string I can see to tug on here is this: that the version and
user_version have clearly diverged, which (most?) commonly means some
kind of recovery or caching op was performed. You're not using cache
tiers, are you? Is it possible somebody ran recovery and overwrote
good objects with the bad one before you got to look at the raw files?
-Greg

> "size": 4193094,
> "mtime": "0.00",
> "local_mtime": "0.00",
> "lost": 0,
> "flags": 52,
> "snaps": [],
> "truncate_seq": 80,
> "truncate_size": 0,
> "data_digest": 2779145704,
> "omap_digest": 4294967295,
> "watchers": {}
> }
>
>
> PG repair doesn't handle these kind of corruptions, but I found a
> recipe in an old thread to fix the object info with hexedit. Before
> doing this I wanted to see if we can understand exactly how this is
> possible.
>
> I managed to find the exact cephfs file, and asked the user how they
> created it. They said the file was the output of a make test on some
> program. The make test was taking awhile, so they left their laptop,
> and when they returned to the computer, the ssh connection to their
> cephfs workstation had broken. I assume this means that the process
> writing the file had been killed while writing to cephfs. But I don't
> understand how a killed client process could result in inconsistent
> object info.
>
> Is there anything else needed to help debug this inconsistency?
>
> Cheers, Dan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RADOS Snapshots on Cephfs Pools?

2017-03-13 Thread Kent Borg

On 03/13/2017 11:54 AM, John Spray wrote:

On Mon, Mar 13, 2017 at 2:13 PM, Kent Borg  wrote:

We have a Cephfs cluster stuck in read-only mode, looks like following the
Disaster Recovery steps, is it a good idea to first make a RADOS snapshot of
the Cephfs pools? Or are there ways that could make matters worse?

Don't do this.  Pool-level snapshots are incompatible with so-called
"self-managed" snapshots where something above rados (cephfs/rbd) is
handling snapshots.


Thanks,

-kb
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CephFS fuse client users stuck

2017-03-13 Thread Andras Pataki

Dear Cephers,

We're using the ceph file system with the fuse client, and lately some 
of our processes are getting stuck seemingly waiting for fuse 
operations.  At the same time, the cluster is healthy, no slow requests, 
all OSDs up and running, and both the MDS and the fuse client think that 
there are no pending operations.  The situation is semi-reproducible.  
When I run a various cluster jobs, some get stuck after a few hours of 
correct operation.  The cluster is on ceph 10.2.5 and 10.2.6, the fuse 
clients are 10.2.6, but I have tried 10.2.5 and 10.2.3, all of which 
have the same issue.  This is on CentOS (7.2 for the clients, 7.3 for 
the MDS/OSDs).


Here are some details:

The node with the stuck processes:

   [root@worker1070 ~]# ps -auxwww | grep 30519
   apataki   30519 39.8  0.9 8728064 5257588 ? Dl   12:11  60:50
   ./Arepo param.txt 2 6
   [root@worker1070 ~]# cat /proc/30519/stack
   [] fuse_file_aio_write+0xbb/0x340 [fuse]
   [] do_sync_write+0x8d/0xd0
   [] vfs_write+0xbd/0x1e0
   [] SyS_write+0x7f/0xe0
   [] system_call_fastpath+0x16/0x1b
   [] 0x

   [root@worker1070 ~]# ps -auxwww | grep 30533
   apataki   30533 39.8  0.9 8795316 5261308 ? Sl   12:11 60:55
   ./Arepo param.txt 2 6
   [root@worker1070 ~]# cat /proc/30533/stack
   [] wait_answer_interruptible+0x91/0xe0 [fuse]
   [] __fuse_request_send+0x253/0x2c0 [fuse]
   [] fuse_request_send+0x12/0x20 [fuse]
   [] fuse_send_write+0xd6/0x110 [fuse]
   [] fuse_perform_write+0x2ed/0x590 [fuse]
   [] fuse_file_aio_write+0x2a1/0x340 [fuse]
   [] do_sync_write+0x8d/0xd0
   [] vfs_write+0xbd/0x1e0
   [] SyS_write+0x7f/0xe0
   [] system_call_fastpath+0x16/0x1b
   [] 0x

Presumably the second process is waiting on the first holding some lock ...

The fuse client on the node:

   [root@worker1070 ~]# ceph daemon
   /var/run/ceph/ceph-client.admin.asok status
   {
"metadata": {
"ceph_sha1": "656b5b63ed7c43bd014bcafd81b001959d5f089f",
"ceph_version": "ceph version 10.2.6
   (656b5b63ed7c43bd014bcafd81b001959d5f089f)",
"entity_id": "admin",
"hostname": "worker1070",
"mount_point": "\/mnt\/ceph",
"root": "\/"
},
"dentry_count": 40,
"dentry_pinned_count": 23,
"inode_count": 123,
"mds_epoch": 19041,
"osd_epoch": 462327,
"osd_epoch_barrier": 462326
   }

   [root@worker1070 ~]# ceph daemon
   /var/run/ceph/ceph-client.admin.asok mds_sessions
   {
"id": 3616543,
"sessions": [
{
"mds": 0,
"addr": "10.128.128.110:6800\/909443124",
"seq": 338,
"cap_gen": 0,
"cap_ttl": "2017-03-13 14:47:37.575229",
"last_cap_renew_request": "2017-03-13 14:46:37.575229",
"cap_renew_seq": 12694,
"num_caps": 713,
"state": "open"
}
],
"mdsmap_epoch": 19041
   }

   [root@worker1070 ~]# ceph daemon
   /var/run/ceph/ceph-client.admin.asok mds_requests
   {}


The overall cluster health and the MDS:

   [root@cephosd000 ~]# ceph -s
cluster d7b33135-0940-4e48-8aa6-1d2026597c2f
 health HEALTH_WARN
noscrub,nodeep-scrub,require_jewel_osds flag(s) set
 monmap e17: 3 mons at
   
{hyperv029=10.4.36.179:6789/0,hyperv030=10.4.36.180:6789/0,hyperv031=10.4.36.181:6789/0}
election epoch 29148, quorum 0,1,2
   hyperv029,hyperv030,hyperv031
  fsmap e19041: 1/1/1 up {0=cephosd000=up:active}
 osdmap e462328: 624 osds: 624 up, 624 in
flags noscrub,nodeep-scrub,require_jewel_osds
  pgmap v44458747: 42496 pgs, 6 pools, 924 TB data, 272 Mobjects
2154 TB used, 1791 TB / 3946 TB avail
   42496 active+clean
  client io 86911 kB/s rd, 556 MB/s wr, 227 op/s rd, 303 op/s wr

   [root@cephosd000 ~]# ceph daemon
   /var/run/ceph/ceph-mds.cephosd000.asok ops
   {
"ops": [],
"num_ops": 0
   }


The odd thing is that if in this state I restart the MDS, the client 
process wakes up and proceeds with its work without any errors.  As if a 
request was lost and somehow retransmitted/restarted when the MDS got 
restarted and the fuse layer reconnected to it.


When I try to attach a gdb session to either of the client processes, 
gdb just hangs.  However, right after the MDS restart gdb attaches to 
the process successfully, and shows that the getting stuck happened on 
closing of a file.  In fact, it looks like both processes were trying to 
write to the same file opened with fopen("filename", "a") and close it:


   (gdb) where
   #0  0x2dc53abd in write () from /lib64/libc.so.6
   #1  0x2dbe2383 in _IO_new_file_write () from /lib64/libc.so.6
   #2  0x2dbe37ec in __GI__IO_do_write () from /lib64/libc.so.6
   #3  0x2dbe30e0 in __GI__

Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Iban Cabrillo
Of course!

[root@cephrgw01 ~]# ps -ef | grep rgw
root   766 1  0 mar09 ?00:00:00 /sbin/dhclient -H cephrgw01
-q -lf /var/lib/dhclient/dhclient--eth0.lease -pf
/var/run/dhclient-eth0.pid eth0
ceph   895 1  0 mar09 ?00:14:39 /usr/bin/radosgw -f
--cluster ceph --name client.rgw.cephrgw --setuser ceph --setgroup ceph
root 14332 10826  0 19:16 pts/000:00:00 grep --color=auto rgw
Regards, I

2017-03-13 19:08 GMT+01:00 Yair Magnezi :

> Thank you Iban.
> Can you please also send me the output of: ps -ef | grep rgw
> Many Thanks.
>
> On Mar 13, 2017 7:32 PM, "Iban Cabrillo"  wrote:
>
>> HI Yair,
>>   This is my conf:
>>
>> [client.rgw.cephrgw]
>> host = cephrgw01
>> rgw_frontends = "civetweb port=8080s ssl_certificate=/etc/pki/tls/c
>> ephrgw01.crt"
>> rgw_zone = RegionOne
>> keyring = /etc/ceph/ceph.client.rgw.cephrgw.keyring
>> log_file = /var/log/ceph/client.rgw.cephrgw.log
>> rgw_keystone_url = https://keystone.xxx,xx:5000
>> rgw_keystone_admin_user = cloud
>> rgw_keystone_admin_password = XXX
>> rgw_keystone_admin_tenant = service
>> rgw_keystone_accepted_roles = admin member _member_
>> rgw keystone admin project = service
>> rgw keystone admin domain = admin
>> rgw keystone api version = 2
>> rgw_s3_auth_use_keystone = true
>> nss_db_path = /var/ceph/nss/
>> rgw_keystone_verify_ssl = true
>>
>> Regards, I
>>
>> 2017-03-13 17:28 GMT+01:00 Yair Magnezi :
>>
>>> But per the doc the client stanza should include
>>>  client.radosgw.instance_name
>>>
>>>
>>> [client.rgw.ceph-rgw-02]
>>> host = ceph-rgw-02
>>> keyring = /etc/ceph/ceph.client.radosgw.keyring
>>> log file = /var/log/radosgw/client.radosgw.gateway.log
>>> rgw_frontends = "civetweb port=8080"
>>>
>>>
>>> "For example, if your node name is gateway-node1, add a section like
>>> this after the [global] section:
>>>
>>> [client.rgw.gateway-node1]
>>> rgw_frontends = "civetweb port=80" "
>>>
>>>
>>> Here, {hostname} is the short hostname (output of command hostname -s)
>>> of the node that is going to provide the gateway service i.e, the
>>> gateway host.
>>>
>>> The [client.radosgw.gateway] portion of the gateway instance identifies
>>> this portion of the Ceph configuration file as configuring a Ceph Storage
>>> Cluster client where the client type is a Ceph Object Gateway (i.e.,
>>> radosgw).
>>>
>>>
>>> Does anyone run the RGW on a different port and can share his
>>> configuration ?
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Mar 13, 2017 at 5:28 PM, Abhishek Lekshmanan 
>>> wrote:
>>>


 On 03/13/2017 04:06 PM, Yair Magnezi wrote:

> Thank you Abhishek
>
> But still ...
>
> root@ceph-rgw-02:/var/log/ceph# ps -ef | grep rgw
> ceph  1332 1  1 14:59 ?00:00:00 /usr/bin/radosgw
> --cluster=ceph --id *rgw.ceph-rgw-02* -f --setuser ceph --setgroup ceph
>
>
> root@ceph-rgw-02:/var/log/ceph# cat /etc/ceph/ceph.conf
> [global]
> fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
> mon_initial_members = ceph-osd-01 ceph-osd-02
> mon_host = 10.83.1.78,10.83.1.79
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> public_network = 10.83.1.0/24 
> rbd default features = 3
> #debug ms = 1
> #debug rgw = 20
>
> [client.radosgw.*rgw.ceph-rgw-02*]
> host = ceph-rgw-02
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> log file = /var/log/radosgw/client.radosgw.gateway.log
> rgw_frontends = "civetweb port=8080"
>


 Try client.rgw.ceph-rgw-02 here (and similar in id), ie. basically what
 you pass as the id should be what the ceph.conf section should look like.


> root@ceph-rgw-02:/var/log/ceph# netstat -an | grep 80
> tcp0  0 0.0.0.0:7480 
>  0.0.0.0:*   LISTEN
> tcp0  0 10.83.1.100:56884 
> 10.83.1.78:6800  ESTABLISHED
> tcp0  0 10.83.1.100:47842 
> 10.83.1.78:6804  TIME_WAIT
> tcp0  0 10.83.1.100:47846 
> 10.83.1.78:6804  ESTABLISHED
> tcp0  0 10.83.1.100:44791 
> 10.83.1.80:6804  ESTABLISHED
> tcp0  0 10.83.1.100:44782 
> 10.83.1.80:6804  TIME_WAIT
> tcp0  0 10.83.1.100:38082 
> 10.83.1.80:6789  ESTABLISHED
> tcp0  0 10.83.1.100:41999 
> 10.83.1.80:6800  ESTABLISHED
> tcp0  0 10.83.1.100:59681 
> 10.83.1.79:6800  

Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Yair Magnezi
Thank you Iban.
Can you please also send me the output of: ps -ef | grep rgw
Many Thanks.

On Mar 13, 2017 7:32 PM, "Iban Cabrillo"  wrote:

> HI Yair,
>   This is my conf:
>
> [client.rgw.cephrgw]
> host = cephrgw01
> rgw_frontends = "civetweb port=8080s ssl_certificate=/etc/pki/tls/
> cephrgw01.crt"
> rgw_zone = RegionOne
> keyring = /etc/ceph/ceph.client.rgw.cephrgw.keyring
> log_file = /var/log/ceph/client.rgw.cephrgw.log
> rgw_keystone_url = https://keystone.xxx,xx:5000
> rgw_keystone_admin_user = cloud
> rgw_keystone_admin_password = XXX
> rgw_keystone_admin_tenant = service
> rgw_keystone_accepted_roles = admin member _member_
> rgw keystone admin project = service
> rgw keystone admin domain = admin
> rgw keystone api version = 2
> rgw_s3_auth_use_keystone = true
> nss_db_path = /var/ceph/nss/
> rgw_keystone_verify_ssl = true
>
> Regards, I
>
> 2017-03-13 17:28 GMT+01:00 Yair Magnezi :
>
>> But per the doc the client stanza should include
>>  client.radosgw.instance_name
>>
>>
>> [client.rgw.ceph-rgw-02]
>> host = ceph-rgw-02
>> keyring = /etc/ceph/ceph.client.radosgw.keyring
>> log file = /var/log/radosgw/client.radosgw.gateway.log
>> rgw_frontends = "civetweb port=8080"
>>
>>
>> "For example, if your node name is gateway-node1, add a section like
>> this after the [global] section:
>>
>> [client.rgw.gateway-node1]
>> rgw_frontends = "civetweb port=80" "
>>
>>
>> Here, {hostname} is the short hostname (output of command hostname -s)
>> of the node that is going to provide the gateway service i.e, the gateway
>>  host.
>>
>> The [client.radosgw.gateway] portion of the gateway instance identifies
>> this portion of the Ceph configuration file as configuring a Ceph Storage
>> Cluster client where the client type is a Ceph Object Gateway (i.e.,
>> radosgw).
>>
>>
>> Does anyone run the RGW on a different port and can share his
>> configuration ?
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Mar 13, 2017 at 5:28 PM, Abhishek Lekshmanan 
>> wrote:
>>
>>>
>>>
>>> On 03/13/2017 04:06 PM, Yair Magnezi wrote:
>>>
 Thank you Abhishek

 But still ...

 root@ceph-rgw-02:/var/log/ceph# ps -ef | grep rgw
 ceph  1332 1  1 14:59 ?00:00:00 /usr/bin/radosgw
 --cluster=ceph --id *rgw.ceph-rgw-02* -f --setuser ceph --setgroup ceph


 root@ceph-rgw-02:/var/log/ceph# cat /etc/ceph/ceph.conf
 [global]
 fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
 mon_initial_members = ceph-osd-01 ceph-osd-02
 mon_host = 10.83.1.78,10.83.1.79
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 public_network = 10.83.1.0/24 
 rbd default features = 3
 #debug ms = 1
 #debug rgw = 20

 [client.radosgw.*rgw.ceph-rgw-02*]
 host = ceph-rgw-02
 keyring = /etc/ceph/ceph.client.radosgw.keyring
 log file = /var/log/radosgw/client.radosgw.gateway.log
 rgw_frontends = "civetweb port=8080"

>>>
>>>
>>> Try client.rgw.ceph-rgw-02 here (and similar in id), ie. basically what
>>> you pass as the id should be what the ceph.conf section should look like.
>>>
>>>
 root@ceph-rgw-02:/var/log/ceph# netstat -an | grep 80
 tcp0  0 0.0.0.0:7480 
  0.0.0.0:*   LISTEN
 tcp0  0 10.83.1.100:56884 
 10.83.1.78:6800  ESTABLISHED
 tcp0  0 10.83.1.100:47842 
 10.83.1.78:6804  TIME_WAIT
 tcp0  0 10.83.1.100:47846 
 10.83.1.78:6804  ESTABLISHED
 tcp0  0 10.83.1.100:44791 
 10.83.1.80:6804  ESTABLISHED
 tcp0  0 10.83.1.100:44782 
 10.83.1.80:6804  TIME_WAIT
 tcp0  0 10.83.1.100:38082 
 10.83.1.80:6789  ESTABLISHED
 tcp0  0 10.83.1.100:41999 
 10.83.1.80:6800  ESTABLISHED
 tcp0  0 10.83.1.100:59681 
 10.83.1.79:6800  ESTABLISHED
 tcp0  0 10.83.1.100:37590 
 10.83.1.79:6804  ESTABLISHED


 2017-03-13 15:05:23.836844 7f5c2fc80900  0 starting handler: civetweb
 2017-03-13 15:05:23.838497 7f5c11379700  0 -- 10.83.1.100:0/2130438046
  submit_message
 mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0
 , failed lossy con, dropping message
 0x7f5bfc011850
 2017-03-13 15:05:23.842769 7f5c11379700  0 monclient: hunting for new
 mon
 20

Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Iban Cabrillo
HI Yair,
  This is my conf:

[client.rgw.cephrgw]
host = cephrgw01
rgw_frontends = "civetweb port=8080s
ssl_certificate=/etc/pki/tls/cephrgw01.crt"
rgw_zone = RegionOne
keyring = /etc/ceph/ceph.client.rgw.cephrgw.keyring
log_file = /var/log/ceph/client.rgw.cephrgw.log
rgw_keystone_url = https://keystone.xxx,xx:5000
rgw_keystone_admin_user = cloud
rgw_keystone_admin_password = XXX
rgw_keystone_admin_tenant = service
rgw_keystone_accepted_roles = admin member _member_
rgw keystone admin project = service
rgw keystone admin domain = admin
rgw keystone api version = 2
rgw_s3_auth_use_keystone = true
nss_db_path = /var/ceph/nss/
rgw_keystone_verify_ssl = true

Regards, I

2017-03-13 17:28 GMT+01:00 Yair Magnezi :

> But per the doc the client stanza should include
>  client.radosgw.instance_name
>
>
> [client.rgw.ceph-rgw-02]
> host = ceph-rgw-02
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> log file = /var/log/radosgw/client.radosgw.gateway.log
> rgw_frontends = "civetweb port=8080"
>
>
> "For example, if your node name is gateway-node1, add a section like this
> after the [global] section:
>
> [client.rgw.gateway-node1]
> rgw_frontends = "civetweb port=80" "
>
>
> Here, {hostname} is the short hostname (output of command hostname -s) of
> the node that is going to provide the gateway service i.e, the gateway
> host.
>
> The [client.radosgw.gateway] portion of the gateway instance identifies
> this portion of the Ceph configuration file as configuring a Ceph Storage
> Cluster client where the client type is a Ceph Object Gateway (i.e.,
> radosgw).
>
>
> Does anyone run the RGW on a different port and can share his
> configuration ?
>
> Thanks
>
>
>
>
>
>
>
> On Mon, Mar 13, 2017 at 5:28 PM, Abhishek Lekshmanan 
> wrote:
>
>>
>>
>> On 03/13/2017 04:06 PM, Yair Magnezi wrote:
>>
>>> Thank you Abhishek
>>>
>>> But still ...
>>>
>>> root@ceph-rgw-02:/var/log/ceph# ps -ef | grep rgw
>>> ceph  1332 1  1 14:59 ?00:00:00 /usr/bin/radosgw
>>> --cluster=ceph --id *rgw.ceph-rgw-02* -f --setuser ceph --setgroup ceph
>>>
>>>
>>> root@ceph-rgw-02:/var/log/ceph# cat /etc/ceph/ceph.conf
>>> [global]
>>> fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
>>> mon_initial_members = ceph-osd-01 ceph-osd-02
>>> mon_host = 10.83.1.78,10.83.1.79
>>> auth_cluster_required = cephx
>>> auth_service_required = cephx
>>> auth_client_required = cephx
>>> public_network = 10.83.1.0/24 
>>> rbd default features = 3
>>> #debug ms = 1
>>> #debug rgw = 20
>>>
>>> [client.radosgw.*rgw.ceph-rgw-02*]
>>> host = ceph-rgw-02
>>> keyring = /etc/ceph/ceph.client.radosgw.keyring
>>> log file = /var/log/radosgw/client.radosgw.gateway.log
>>> rgw_frontends = "civetweb port=8080"
>>>
>>
>>
>> Try client.rgw.ceph-rgw-02 here (and similar in id), ie. basically what
>> you pass as the id should be what the ceph.conf section should look like.
>>
>>
>>> root@ceph-rgw-02:/var/log/ceph# netstat -an | grep 80
>>> tcp0  0 0.0.0.0:7480 
>>>  0.0.0.0:*   LISTEN
>>> tcp0  0 10.83.1.100:56884 
>>> 10.83.1.78:6800  ESTABLISHED
>>> tcp0  0 10.83.1.100:47842 
>>> 10.83.1.78:6804  TIME_WAIT
>>> tcp0  0 10.83.1.100:47846 
>>> 10.83.1.78:6804  ESTABLISHED
>>> tcp0  0 10.83.1.100:44791 
>>> 10.83.1.80:6804  ESTABLISHED
>>> tcp0  0 10.83.1.100:44782 
>>> 10.83.1.80:6804  TIME_WAIT
>>> tcp0  0 10.83.1.100:38082 
>>> 10.83.1.80:6789  ESTABLISHED
>>> tcp0  0 10.83.1.100:41999 
>>> 10.83.1.80:6800  ESTABLISHED
>>> tcp0  0 10.83.1.100:59681 
>>> 10.83.1.79:6800  ESTABLISHED
>>> tcp0  0 10.83.1.100:37590 
>>> 10.83.1.79:6804  ESTABLISHED
>>>
>>>
>>> 2017-03-13 15:05:23.836844 7f5c2fc80900  0 starting handler: civetweb
>>> 2017-03-13 15:05:23.838497 7f5c11379700  0 -- 10.83.1.100:0/2130438046
>>>  submit_message
>>> mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0
>>> , failed lossy con, dropping message
>>> 0x7f5bfc011850
>>> 2017-03-13 15:05:23.842769 7f5c11379700  0 monclient: hunting for new mon
>>> 2017-03-13 15:05:23.846976 7f5c2fc80900  0 starting handler: fastcgi
>>> 2017-03-13 15:05:23.849245 7f5b87a6a700  0 ERROR: no socket server point
>>> defined, cannot start fcgi frontend
>>>
>>>
>>>
>>>
>>> Any more ideas
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Mar 13, 2017 at

Re: [ceph-users] A Jewel in the rough? (cache tier bugs and documentation omissions)

2017-03-13 Thread Gregory Farnum
On Mon, Mar 13, 2017 at 9:10 AM Ken Dreyer  wrote:

> At a general level, is there any way we could update the documentation
> automatically whenever src/common/config_opts.h changes?


GitHub PR hooks that block any change to the file which doesn't include a
documentation patch including those strings?
I don't think anything weaker is likely to be reliable. :)
-Greg



>
> - Ken
>
> On Tue, Mar 7, 2017 at 2:44 AM, Nick Fisk  wrote:
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of John Spray
> >> Sent: 07 March 2017 01:45
> >> To: Christian Balzer 
> >> Cc: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] A Jewel in the rough? (cache tier bugs and
> documentation omissions)
> >>
> >> On Tue, Mar 7, 2017 at 12:28 AM, Christian Balzer 
> wrote:
> >> >
> >> >
> >> > Hello,
> >> >
> >> > It's now 10 months after this thread:
> >> >
> >> > http://www.spinics.net/lists/ceph-users/msg27497.html (plus next
> >> > message)
> >> >
> >> > and we're at the fifth iteration of Jewel and still
> >> >
> >> > osd_tier_promote_max_objects_sec
> >> > and
> >> > osd_tier_promote_max_bytes_sec
> >> >
> >> > are neither documented (master or jewel), nor mentioned in the
> >> > changelogs and most importantly STILL default to the broken reverse
> settings above.
> >>
> >> Is there a pull request?
> >
> > Mark fixed it in this commit, but looks like it was never marked for
> backport to Jewel.
> >
> >
> https://github.com/ceph/ceph/commit/793ceac2f3d5a2c404ac50569c44a21de6001b62
> >
> > I will look into getting the documentation updated for these settings.
> >
> >>
> >> John
> >>
> >> > Anybody coming from Hammer or even starting with Jewel and using cache
> >> > tiering will be having a VERY bad experience.
> >> >
> >> > Christian
> >> > --
> >> > Christian BalzerNetwork/Systems Engineer
> >> > ch...@gol.com   Global OnLine Japan/Rakuten Communications
> >> > http://www.gol.com/
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Yair Magnezi
But per the doc the client stanza should include
 client.radosgw.instance_name


[client.rgw.ceph-rgw-02]
host = ceph-rgw-02
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw_frontends = "civetweb port=8080"


"For example, if your node name is gateway-node1, add a section like this
after the [global] section:

[client.rgw.gateway-node1]
rgw_frontends = "civetweb port=80" "


Here, {hostname} is the short hostname (output of command hostname -s) of
the node that is going to provide the gateway service, i.e., the gateway host.

The [client.radosgw.gateway] portion of the gateway instance identifies
this portion of the Ceph configuration file as configuring a Ceph Storage
Cluster client where the client type is a Ceph Object Gateway (i.e., radosgw
).


Does anyone run the RGW on a different port and can share his configuration
?

Thanks







On Mon, Mar 13, 2017 at 5:28 PM, Abhishek Lekshmanan 
wrote:

>
>
> On 03/13/2017 04:06 PM, Yair Magnezi wrote:
>
>> Thank you Abhishek
>>
>> But still ...
>>
>> root@ceph-rgw-02:/var/log/ceph# ps -ef | grep rgw
>> ceph  1332 1  1 14:59 ?00:00:00 /usr/bin/radosgw
>> --cluster=ceph --id *rgw.ceph-rgw-02* -f --setuser ceph --setgroup ceph
>>
>>
>> root@ceph-rgw-02:/var/log/ceph# cat /etc/ceph/ceph.conf
>> [global]
>> fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
>> mon_initial_members = ceph-osd-01 ceph-osd-02
>> mon_host = 10.83.1.78,10.83.1.79
>> auth_cluster_required = cephx
>> auth_service_required = cephx
>> auth_client_required = cephx
>> public_network = 10.83.1.0/24 
>> rbd default features = 3
>> #debug ms = 1
>> #debug rgw = 20
>>
>> [client.radosgw.*rgw.ceph-rgw-02*]
>> host = ceph-rgw-02
>> keyring = /etc/ceph/ceph.client.radosgw.keyring
>> log file = /var/log/radosgw/client.radosgw.gateway.log
>> rgw_frontends = "civetweb port=8080"
>>
>
>
> Try client.rgw.ceph-rgw-02 here (and similar in id), ie. basically what
> you pass as the id should be what the ceph.conf section should look like.
>
>
>> root@ceph-rgw-02:/var/log/ceph# netstat -an | grep 80
>> tcp0  0 0.0.0.0:7480 
>>  0.0.0.0:*   LISTEN
>> tcp0  0 10.83.1.100:56884 
>> 10.83.1.78:6800  ESTABLISHED
>> tcp0  0 10.83.1.100:47842 
>> 10.83.1.78:6804  TIME_WAIT
>> tcp0  0 10.83.1.100:47846 
>> 10.83.1.78:6804  ESTABLISHED
>> tcp0  0 10.83.1.100:44791 
>> 10.83.1.80:6804  ESTABLISHED
>> tcp0  0 10.83.1.100:44782 
>> 10.83.1.80:6804  TIME_WAIT
>> tcp0  0 10.83.1.100:38082 
>> 10.83.1.80:6789  ESTABLISHED
>> tcp0  0 10.83.1.100:41999 
>> 10.83.1.80:6800  ESTABLISHED
>> tcp0  0 10.83.1.100:59681 
>> 10.83.1.79:6800  ESTABLISHED
>> tcp0  0 10.83.1.100:37590 
>> 10.83.1.79:6804  ESTABLISHED
>>
>>
>> 2017-03-13 15:05:23.836844 7f5c2fc80900  0 starting handler: civetweb
>> 2017-03-13 15:05:23.838497 7f5c11379700  0 -- 10.83.1.100:0/2130438046
>>  submit_message
>> mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0
>> , failed lossy con, dropping message
>> 0x7f5bfc011850
>> 2017-03-13 15:05:23.842769 7f5c11379700  0 monclient: hunting for new mon
>> 2017-03-13 15:05:23.846976 7f5c2fc80900  0 starting handler: fastcgi
>> 2017-03-13 15:05:23.849245 7f5b87a6a700  0 ERROR: no socket server point
>> defined, cannot start fcgi frontend
>>
>>
>>
>>
>> Any more ideas
>>
>> Thanks
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Mar 13, 2017 at 4:34 PM, Abhishek Lekshmanan > > wrote:
>>
>>
>>
>> On 03/13/2017 03:26 PM, Yair Magnezi wrote:
>>
>> Hello Wido
>>
>> yes, this is my /etc/ceph/ceph.conf
>>
>> and yes, radosgw.ceph-rgw-02 is the running instance.
>>
>> root@ceph-rgw-02:/var/log/ceph# ps -ef | grep -i rgw
>> ceph 17226 1  0 14:02 ?00:00:01 /usr/bin/radosgw
>> --cluster=ceph --id rgw.ceph-rgw-02 -f --setuser ceph --setgroup
>> ceph
>>
>>
>> The ID passed to rgw here is `rgw.ceph-rgw-02`, whereas your conf
>> has a section named `radosgw.ceph-rgw-02` try running this service
>> (systemctl start ceph-rado...@radosgw.ceph-rgw-02 maybe?)
>>
>> --
>> Abhishek Lekshmanan
>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham
>> Norton, HRB 21284 (AG Nürnberg)
>>
>>
>>

Re: [ceph-users] A Jewel in the rough? (cache tier bugs and documentation omissions)

2017-03-13 Thread Ken Dreyer
At a general level, is there any way we could update the documentation
automatically whenever src/common/config_opts.h changes?

- Ken

On Tue, Mar 7, 2017 at 2:44 AM, Nick Fisk  wrote:
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
>> John Spray
>> Sent: 07 March 2017 01:45
>> To: Christian Balzer 
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] A Jewel in the rough? (cache tier bugs and 
>> documentation omissions)
>>
>> On Tue, Mar 7, 2017 at 12:28 AM, Christian Balzer  wrote:
>> >
>> >
>> > Hello,
>> >
>> > It's now 10 months after this thread:
>> >
>> > http://www.spinics.net/lists/ceph-users/msg27497.html (plus next
>> > message)
>> >
>> > and we're at the fifth iteration of Jewel and still
>> >
>> > osd_tier_promote_max_objects_sec
>> > and
>> > osd_tier_promote_max_bytes_sec
>> >
>> > are neither documented (master or jewel), nor mentioned in the
>> > changelogs and most importantly STILL default to the broken reverse 
>> > settings above.
>>
>> Is there a pull request?
>
> Mark fixed it in this commit, but looks like it was never marked for backport 
> to Jewel.
>
> https://github.com/ceph/ceph/commit/793ceac2f3d5a2c404ac50569c44a21de6001b62
>
> I will look into getting the documentation updated for these settings.
>
>>
>> John
>>
>> > Anybody coming from Hammer or even starting with Jewel and using cache
>> > tiering will be having a VERY bad experience.
>> >
>> > Christian
>> > --
>> > Christian BalzerNetwork/Systems Engineer
>> > ch...@gol.com   Global OnLine Japan/Rakuten Communications
>> > http://www.gol.com/
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RADOS Snapshots on Cephfs Pools?

2017-03-13 Thread John Spray
On Mon, Mar 13, 2017 at 2:13 PM, Kent Borg  wrote:
> We have a Cephfs cluster stuck in read-only mode, looks like following the
> Disaster Recovery steps, is it a good idea to first make a RADOS snapshot of
> the Cephfs pools? Or are there ways that could make matters worse?

Don't do this.  Pool-level snapshots are incompatible with so-called
"self-managed" snapshots where something above rados (cephfs/rbd) is
handling snapshots.

As the docs suggest, you should always take a backup of your journal.
You can even take a full copy of the metadata pool with "rados export"
if it's manageably small (metadata often is).
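
A minimal sketch of both (the metadata pool name below is an assumption;
substitute whatever "ceph fs ls" reports for your filesystem):

cephfs-journal-tool journal export backup.bin
rados -p cephfs_metadata export metadata-backup.bin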

John

> Thanks,
>
> -kb
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Abhishek Lekshmanan



On 03/13/2017 04:06 PM, Yair Magnezi wrote:

Thank you Abhishek

But still ...

root@ceph-rgw-02:/var/log/ceph# ps -ef | grep rgw
ceph  1332 1  1 14:59 ?00:00:00 /usr/bin/radosgw
--cluster=ceph --id *rgw.ceph-rgw-02* -f --setuser ceph --setgroup ceph


root@ceph-rgw-02:/var/log/ceph# cat /etc/ceph/ceph.conf
[global]
fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
mon_initial_members = ceph-osd-01 ceph-osd-02
mon_host = 10.83.1.78,10.83.1.79
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 10.83.1.0/24 
rbd default features = 3
#debug ms = 1
#debug rgw = 20

[client.radosgw.*rgw.ceph-rgw-02*]
host = ceph-rgw-02
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw_frontends = "civetweb port=8080"



Try client.rgw.ceph-rgw-02 here (and similarly in the id), i.e. basically the
id you pass should match the name of the ceph.conf section.
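
In other words, something along these lines (host/keyring/log paths copied
from your original config; only the section name changes):

[client.rgw.ceph-rgw-02]
host = ceph-rgw-02
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw_frontends = "civetweb port=8080"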




root@ceph-rgw-02:/var/log/ceph# netstat -an | grep 80
tcp0  0 0.0.0.0:7480 
 0.0.0.0:*   LISTEN
tcp0  0 10.83.1.100:56884 
10.83.1.78:6800  ESTABLISHED
tcp0  0 10.83.1.100:47842 
10.83.1.78:6804  TIME_WAIT
tcp0  0 10.83.1.100:47846 
10.83.1.78:6804  ESTABLISHED
tcp0  0 10.83.1.100:44791 
10.83.1.80:6804  ESTABLISHED
tcp0  0 10.83.1.100:44782 
10.83.1.80:6804  TIME_WAIT
tcp0  0 10.83.1.100:38082 
10.83.1.80:6789  ESTABLISHED
tcp0  0 10.83.1.100:41999 
10.83.1.80:6800  ESTABLISHED
tcp0  0 10.83.1.100:59681 
10.83.1.79:6800  ESTABLISHED
tcp0  0 10.83.1.100:37590 
10.83.1.79:6804  ESTABLISHED


2017-03-13 15:05:23.836844 7f5c2fc80900  0 starting handler: civetweb
2017-03-13 15:05:23.838497 7f5c11379700  0 -- 10.83.1.100:0/2130438046
 submit_message
mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0
, failed lossy con, dropping message
0x7f5bfc011850
2017-03-13 15:05:23.842769 7f5c11379700  0 monclient: hunting for new mon
2017-03-13 15:05:23.846976 7f5c2fc80900  0 starting handler: fastcgi
2017-03-13 15:05:23.849245 7f5b87a6a700  0 ERROR: no socket server point
defined, cannot start fcgi frontend




Any more ideas

Thanks







On Mon, Mar 13, 2017 at 4:34 PM, Abhishek Lekshmanan mailto:abhis...@suse.com>> wrote:



On 03/13/2017 03:26 PM, Yair Magnezi wrote:

Hello Wido

yes, this is my /etc/ceph/ceph.conf

and yes, radosgw.ceph-rgw-02 is the running instance.

root@ceph-rgw-02:/var/log/ceph# ps -ef | grep -i rgw
ceph 17226 1  0 14:02 ?00:00:01 /usr/bin/radosgw
--cluster=ceph --id rgw.ceph-rgw-02 -f --setuser ceph --setgroup
ceph


The ID passed to rgw here is `rgw.ceph-rgw-02`, whereas your conf
has a section named `radosgw.ceph-rgw-02` try running this service
(systemctl start ceph-rado...@radosgw.ceph-rgw-02 maybe?)

--
Abhishek Lekshmanan
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham
Norton, HRB 21284 (AG Nürnberg)


Thanks



Yair Magnezi
Storage & Data Protection TL // Kenshoo
Office +972 7 32862423
Mobile +972 50 575-2955



On Mon, Mar 13, 2017 at 4:06 PM, Wido den Hollander
mailto:w...@42on.com>
>> wrote:


> Op 13 maart 2017 om 15:03 schreef Yair Magnezi
mailto:yair.magn...@kenshoo.com>
>>:
>
>
> Hello Cephers .
>
> I'm trying to modify the   civetweb default  port to 80
but from some
> reason it insists on listening on the default 7480 port
>
> My configuration is quiet  simple ( experimental  ) and
looks like this :
>
>
> [global]
> fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
> mon_initial_members = ceph-osd-01 ceph-osd-02
> mon_host = 10.83.1.78,10.83.1.79
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = 

Re: [ceph-users] speed decrease with size

2017-03-13 Thread Ben Erridge
On Sun, Mar 12, 2017 at 8:24 PM, Christian Balzer  wrote:

>
> Hello,
>
> On Sun, 12 Mar 2017 19:37:16 -0400 Ben Erridge wrote:
>
> > I am testing attached volume storage on our openstack cluster which uses
> > ceph for block storage.
> > our Ceph nodes have large SSD's for their journals 50+GB for each OSD.
> I'm
> > thinking some parameter is a little off because with relatively small
> > writes I am seeing drastically reduced write speeds.
> >
> Large journals are a waste for most people, especially when your backing
> storage are HDDs.
>
> >
> > we have 2 nodes withs 12 total OSD's each with 50GB SSD Journal.
> >
> I hope that's not your plan for production, with a replica of 2 you're
> looking at pretty much guaranteed data loss over time, unless your OSDs
> are actually RAIDs.
>
> I am aware that replica of 3 is suggested thanks.


> 5GB journals tend to be overkill already.
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008606.html
>
> If you were to actually look at your OSD nodes during those tests with
> something like atop or "iostat -x", you'd likely see that with prolonged
> writes you wind up with the speed of what your HDDs can do, i.e. see them
> (all or individually) being quite busy.
>

That is what I was thinking as well, which is not what I want. I want to
better utilize these large SSD journals. If I have a 50GB journal
and I only want to write 5GB of data, I should be able to get near SSD speed
for this operation. Why am I not? Maybe I should increase
filestore_max_sync_interval.


>
> Lastly, for nearly everybody in real life situations the
> bandwidth/throughput becomes a distant second to latency considerations.
>

Thanks for the advice however.


> Christian
>
> >
> >  here is our Ceph config
> >
> > [global]
> > fsid = 19bc15fd-c0cc-4f35-acd2-292a86fbcf7d
> > mon_initial_members = node-5 node-4 node-3
> > mon_host = 192.168.0.8 192.168.0.7 192.168.0.13
> > auth_cluster_required = cephx
> > auth_service_required = cephx
> > auth_client_required = cephx
> > filestore_xattr_use_omap = true
> > log_to_syslog_level = info
> > log_to_syslog = True
> > osd_pool_default_size = 1
> > osd_pool_default_min_size = 1
> > osd_pool_default_pg_num = 64
> > public_network = 192.168.0.0/24
> > log_to_syslog_facility = LOG_LOCAL0
> > osd_journal_size = 5
> > auth_supported = cephx
> > osd_pool_default_pgp_num = 64
> > osd_mkfs_type = xfs
> > cluster_network = 192.168.1.0/24
> > osd_recovery_max_active = 1
> > osd_max_backfills = 1
> >
> > [client]
> > rbd_cache = True
> > rbd_cache_writethrough_until_flush = True
> >
> > [client.radosgw.gateway]
> > rgw_keystone_accepted_roles = _member_, Member, admin, swiftoperator
> > keyring = /etc/ceph/keyring.radosgw.gateway
> > rgw_socket_path = /tmp/radosgw.sock
> > rgw_keystone_revocation_interval = 100
> > rgw_keystone_url = 192.168.0.2:35357
> > rgw_keystone_admin_token = ZBz37Vlv
> > host = node-3
> > rgw_dns_name = *.ciminc.com
> > rgw_print_continue = True
> > rgw_keystone_token_cache_size = 10
> > rgw_data = /var/lib/ceph/radosgw
> > user = www-data
> >
> > This is the degradation I am speaking of..
> >
> >
> > dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=1k; rm -f
> > /mnt/ext4/output;
> > 1024+0 records in
> > 1024+0 records out
> > 1048576000 bytes (1.0 GB) copied, 0.887431 s, 1.2 GB/s
> >
> > dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=2k; rm -f
> > /mnt/ext4/output;
> > 2048+0 records in
> > 2048+0 records out
> > 2097152000 bytes (2.1 GB) copied, 3.75782 s, 558 MB/s
> >
> >  dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=3k; rm -f
> > /mnt/ext4/output;
> > 3072+0 records in
> > 3072+0 records out
> > 3145728000 bytes (3.1 GB) copied, 10.0054 s, 314 MB/s
> >
> > dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=5k; rm -f
> > /mnt/ext4/output;
> > 5120+0 records in
> > 5120+0 records out
> > 524288 bytes (5.2 GB) copied, 24.1971 s, 217 MB/s
> >
> > Any suggestions for improving the large write degradation?
>
>
> --
> Christian Balzer   Network/Systems Engineer
> ch...@gol.com   Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
>
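One caveat about the dd runs quoted above: without a flush or direct-I/O flag
they largely measure the client page cache for the smaller sizes, which is why
throughput falls as the write grows. A sketch of the same test with the cache
taken out of the picture (same path and sizes as in the quoted runs):

dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=1k oflag=direct
# or include the final flush in the timing:
dd if=/dev/zero of=/mnt/ext4/output bs=1000k count=1k conv=fdatasync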



-- 
-.
Ben Erridge
Center For Information Management, Inc.
(734) 930-0855
3550 West Liberty Road Ste 1
Ann Arbor, MI 48103
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Yair Magnezi
Thank you Abhishek

But still ...

root@ceph-rgw-02:/var/log/ceph# ps -ef | grep rgw
ceph  1332 1  1 14:59 ?00:00:00 /usr/bin/radosgw
--cluster=ceph --id *rgw.ceph-rgw-02* -f --setuser ceph --setgroup ceph


root@ceph-rgw-02:/var/log/ceph# cat /etc/ceph/ceph.conf
[global]
fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
mon_initial_members = ceph-osd-01 ceph-osd-02
mon_host = 10.83.1.78,10.83.1.79
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 10.83.1.0/24
rbd default features = 3
#debug ms = 1
#debug rgw = 20

[client.radosgw.*rgw.ceph-rgw-02*]
host = ceph-rgw-02
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw_frontends = "civetweb port=8080"


root@ceph-rgw-02:/var/log/ceph# netstat -an | grep 80
tcp0  0 0.0.0.0:7480 0.0.0.0:*   LISTEN
tcp0  0 10.83.1.100:56884   10.83.1.78:6800
ESTABLISHED
tcp0  0 10.83.1.100:47842   10.83.1.78:6804
TIME_WAIT
tcp0  0 10.83.1.100:47846   10.83.1.78:6804
ESTABLISHED
tcp0  0 10.83.1.100:44791   10.83.1.80:6804
ESTABLISHED
tcp0  0 10.83.1.100:44782   10.83.1.80:6804
TIME_WAIT
tcp0  0 10.83.1.100:38082   10.83.1.80:6789
ESTABLISHED
tcp0  0 10.83.1.100:41999   10.83.1.80:6800
ESTABLISHED
tcp0  0 10.83.1.100:59681   10.83.1.79:6800
ESTABLISHED
tcp0  0 10.83.1.100:37590   10.83.1.79:6804
ESTABLISHED


2017-03-13 15:05:23.836844 7f5c2fc80900  0 starting handler: civetweb
2017-03-13 15:05:23.838497 7f5c11379700  0 -- 10.83.1.100:0/2130438046
submit_message mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0,
failed lossy con, dropping message 0x7f5bfc011850
2017-03-13 15:05:23.842769 7f5c11379700  0 monclient: hunting for new mon
2017-03-13 15:05:23.846976 7f5c2fc80900  0 starting handler: fastcgi
2017-03-13 15:05:23.849245 7f5b87a6a700  0 ERROR: no socket server point
defined, cannot start fcgi frontend




Any more ideas?

Thanks








On Mon, Mar 13, 2017 at 4:34 PM, Abhishek Lekshmanan 
wrote:

>
>
> On 03/13/2017 03:26 PM, Yair Magnezi wrote:
>
>> Hello Wido
>>
>> yes, this is my /etc/ceph/ceph.conf
>>
>> and yes, radosgw.ceph-rgw-02 is the running instance.
>>
>> root@ceph-rgw-02:/var/log/ceph# ps -ef | grep -i rgw
>> ceph 17226 1  0 14:02 ?00:00:01 /usr/bin/radosgw
>> --cluster=ceph --id rgw.ceph-rgw-02 -f --setuser ceph --setgroup ceph
>>
>
> The ID passed to rgw here is `rgw.ceph-rgw-02`, whereas your conf has a
> section named `radosgw.ceph-rgw-02`; try running this service
> (systemctl start ceph-rado...@radosgw.ceph-rgw-02 maybe?)
>
> --
> Abhishek Lekshmanan
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB
> 21284 (AG Nürnberg)
>
>>
>> Thanks
>>
>>
>>
>> Yair Magnezi
>> Storage & Data Protection TL // Kenshoo
>> Office +972 7 32862423 // Mobile +972 50 575-2955
>>
>>
>>
>> On Mon, Mar 13, 2017 at 4:06 PM, Wido den Hollander wrote:
>>
>>
>> > On 13 March 2017 at 15:03, Yair Magnezi <yair.magn...@kenshoo.com> wrote:
>> >
>> >
>> > Hello Cephers .
>> >
>> > I'm trying to modify the civetweb default port to 80 but for some
>> > reason it insists on listening on the default 7480 port
>> >
>> > My configuration is quite simple (experimental) and looks like this:
>> >
>> >
>> > [global]
>> > fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
>> > mon_initial_members = ceph-osd-01 ceph-osd-02
>> > mon_host = 10.83.1.78,10.83.1.79
>> > auth_cluster_required = cephx
>> > auth_service_required = cephx
>> > auth_client_required = cephx
>> > public_network = 10.83.1.0/24 
>> > rbd default features = 3
>> > #debug ms = 1
>> > #debug rgw = 20
>> >
>> > [client.radosgw.ceph-rgw-02]
>> > host = ceph-rgw-02
>> > keyring = /etc/ceph/ceph.client.radosgw.keyring
>> > log file = /var/log/radosgw/client.radosgw.gateway.log
>> > rgw_frontends = "civetweb port=80"
>> >
>> >
>>
>> Are you sure this is in /etc/ceph/ceph.conf?
>>
>> In addition, are you also sure the RGW is running as user
>> 'radosgw.ceph-rgw-02' ?
>>
>> Wido
>>
>> > after restart still the same :
>> >
>> > root@ceph-rgw-02:/var/log/ceph# netstat -an |  grep 80
>> > *tcp0  0 0.0.0.0:7480   0.0.0.0:*   LISTEN*
>> > tcp0  0 10.83.1.100:56697   10.83.1.79:6804 ESTABLISHED
>> > tcp0  0 10.83.1.100:59482   10.83.1.79:6800 TIME_WAIT
>> > tc

Re: [ceph-users] cephfs deep scrub error:

2017-03-13 Thread Dan van der Ster
On Mon, Mar 13, 2017 at 1:35 PM, John Spray  wrote:
> On Mon, Mar 13, 2017 at 10:28 AM, Dan van der Ster  
> wrote:
>> Hi John,
>>
>> Last week we updated our prod CephFS cluster to 10.2.6 (clients and
>> server side), and for the first time today we've got an object info
>> size mismatch:
>>
>> I found this ticket you created in the tracker, which is why I've
>> emailed you: http://tracker.ceph.com/issues/18240
>>
>> Here's the detail of our error:
>>
>> 2017-03-13 07:17:49.989297 osd.67 :6819/3441125 262 : cluster
>> [ERR] deep-scrub 1.3da 1:5bc0e9dc:::1260f4b.0003:head on disk
>> size (4187974) does not match object info size (4193094) adjusted for
>> ondisk to (4193094)
>
>
> Hmm, never investigated that particular ticket but did notice that the
> issue never re-occurred on master, or on jewel since the start of
> 2017.
>
> The last revision where it cropped up in testing was
> 5b402f8a7b5a763852e93cd0a5decd34572f4518, looking at the history
> between that commit and the tip of the jewel branch I don't see
> anything that I recognise as a corruption fix (apart from the omap fix
> which is clearly unrelated as we're looking at data objects in these
> failures).
>
> John

Thanks for checking, John. In the end we deleted the inconsistent
file, rescrubbed, and will just keep a look out for similar cases.

-- Dan


>
>
>> All three replicas have the same object size/md5sum:
>>
>> # ls -l 1260f4b.0003__head_3B9703DA__1
>> -rw-r--r--. 1 ceph ceph 4187974 Mar 12 18:50
>> 1260f4b.0003__head_3B9703DA__1
>> # md5sum 1260f4b.0003__head_3B9703DA__1
>> db1e1bab199b33fce3ad9195832626ef 1260f4b.0003__head_3B9703DA__1
>>
>> And indeed the object info does not agree with the files on disk:
>>
>> # ceph-dencoder type object_info_t import /tmp/attr1 decode dump_json
>> {
>> "oid": {
>> "oid": "1260f4b.0003",
>> "key": "",
>> "snapid": -2,
>> "hash": 999752666,
>> "max": 0,
>> "pool": 1,
>> "namespace": ""
>> },
>> "version": "5262'221037",
>> "prior_version": "5262'221031",
>> "last_reqid": "osd.67.0:1180241",
>> "user_version": 221031,
>> "size": 4193094,
>> "mtime": "0.00",
>> "local_mtime": "0.00",
>> "lost": 0,
>> "flags": 52,
>> "snaps": [],
>> "truncate_seq": 80,
>> "truncate_size": 0,
>> "data_digest": 2779145704,
>> "omap_digest": 4294967295,
>> "watchers": {}
>> }
>>
>>
>> PG repair doesn't handle these kind of corruptions, but I found a
>> recipe in an old thread to fix the object info with hexedit. Before
>> doing this I wanted to see if we can understand exactly how this is
>> possible.
>>
>> I managed to find the exact cephfs file, and asked the user how they
>> created it. They said the file was the output of a make test on some
>> program. The make test was taking awhile, so they left their laptop,
>> and when they returned to the computer, the ssh connection to their
>> cephfs workstation had broken. I assume this means that the process
>> writing the file had been killed while writing to cephfs. But I don't
>> understand how a killed client process could result in inconsistent
>> object info.
>>
>> Is there anything else needed to help debug this inconsistency?
>>
>> Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Abhishek Lekshmanan



On 03/13/2017 03:26 PM, Yair Magnezi wrote:

Hello Wido

yes, this is my /etc/ceph/ceph.conf

and yes, radosgw.ceph-rgw-02 is the running instance.

root@ceph-rgw-02:/var/log/ceph# ps -ef | grep -i rgw
ceph 17226 1  0 14:02 ?00:00:01 /usr/bin/radosgw
--cluster=ceph --id rgw.ceph-rgw-02 -f --setuser ceph --setgroup ceph


The ID passed to rgw here is `rgw.ceph-rgw-02`, whereas your conf has a 
section named `radosgw.ceph-rgw-02`; try running this service

(systemctl start ceph-rado...@radosgw.ceph-rgw-02 maybe?)

--
Abhishek Lekshmanan
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, 
HRB 21284 (AG Nürnberg)


Thanks



Yair Magnezi
Storage & Data Protection TL // Kenshoo
Office +972 7 32862423 // Mobile +972 50 575-2955



On Mon, Mar 13, 2017 at 4:06 PM, Wido den Hollander <w...@42on.com> wrote:


> On 13 March 2017 at 15:03, Yair Magnezi <yair.magn...@kenshoo.com> wrote:
>
>
> Hello Cephers .
>
> I'm trying to modify the civetweb default port to 80 but for some
> reason it insists on listening on the default 7480 port
>
> My configuration is quite simple (experimental) and looks like this:
>
>
> [global]
> fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
> mon_initial_members = ceph-osd-01 ceph-osd-02
> mon_host = 10.83.1.78,10.83.1.79
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> public_network = 10.83.1.0/24 
> rbd default features = 3
> #debug ms = 1
> #debug rgw = 20
>
> [client.radosgw.ceph-rgw-02]
> host = ceph-rgw-02
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> log file = /var/log/radosgw/client.radosgw.gateway.log
> rgw_frontends = "civetweb port=80"
>
>

Are you sure this is in /etc/ceph/ceph.conf?

In addition, are you also sure the RGW is running as user
'radosgw.ceph-rgw-02' ?

Wido

> after restart still the same :
>
> root@ceph-rgw-02:/var/log/ceph# netstat -an |  grep 80
> *tcp0  0 0.0.0.0:7480   0.0.0.0:*   LISTEN*
> tcp0  0 10.83.1.100:56697   10.83.1.79:6804 ESTABLISHED
> tcp0  0 10.83.1.100:59482   10.83.1.79:6800 TIME_WAIT
> tcp0  0 10.83.1.100:33129   10.83.1.78:6804 ESTABLISHED
> tcp0  0 10.83.1.100:56318   10.83.1.80:6804 TIME_WAIT
> tcp0  0 10.83.1.100:56324   10.83.1.80:6804 ESTABLISHED
> tcp0  0 10.83.1.100:60990   10.83.1.78:6800 ESTABLISHED
> tcp0  0 10.83.1.100:60985   10.83.1.78:6800 TIME_WAIT
> tcp0  0 10.83.1.100:56691   10.83.1.79:6804 TIME_WAIT
> tcp0  0 10.83.1.100:33123   10.83.1.78:6804 TIME_WAIT
> tcp0  0 10.83.1.100:59494   10.83.1.79:6800 ESTABLISHED
> tcp0  0 10.83.1.100:55924   10.83.1.80:6800 ESTABLISHED
> tcp0  0 10.83.1.100:57629   10.83.1.80:6789 ESTABLISHED
>
>
> Besides that, it also looks like the service tries to start the fcgi
> frontend (besides civetweb). Is there a reason for that? (fastcgi &
> Apache are not installed.)
>
>
> 2017-03-13 13:44:35.938897 7f05f3fd7700  1 handle_sigterm set
alarm for 120
> 2017-03-13 13:44:35.938916 7f06692c7900 -1 shutting down
> 2017-03-13 13:44:36.170559 7f06692c7900  1 final shutdown
> 2017-03-13 13:45:13.980814 7fbdb2e6c900  0 deferred set uid:gid to
> 64045:64045 (ceph:ceph)
> 2017-03-13 13:45:13.980992 7fbdb2e6c900  0 ceph version 10.2.6
> (656b5b63ed7c43bd014bcafd81b001959d5f089f), process radosgw, pid 16995
> *2017-03-13 13:45:14.115639 7fbdb2e6c900  0 starting handler: civetweb*
> 2017-03-13 13:45:14.117003 7fbd8700  0 -- 10.83.1.100:0/3241919968
> submit_message mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0,
> failed lossy con, dropping message 0x7fbd78011cc0
> 2

Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Yair Magnezi
Hello Wido

yes, this is my /etc/ceph/ceph.conf

and yes, radosgw.ceph-rgw-02 is the running instance.

root@ceph-rgw-02:/var/log/ceph# ps -ef | grep -i rgw
ceph 17226 1  0 14:02 ?00:00:01 /usr/bin/radosgw
--cluster=ceph --id rgw.ceph-rgw-02 -f --setuser ceph --setgroup ceph

Thanks



Yair Magnezi
Storage & Data Protection TL // Kenshoo
Office +972 7 32862423 // Mobile +972 50 575-2955



On Mon, Mar 13, 2017 at 4:06 PM, Wido den Hollander  wrote:

>
> > On 13 March 2017 at 15:03, Yair Magnezi wrote:
> >
> >
> > Hello Cephers .
> >
> > I'm trying to modify the civetweb default port to 80 but for some
> > reason it insists on listening on the default 7480 port
> >
> > My configuration is quite simple (experimental) and looks like this:
> >
> >
> > [global]
> > fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
> > mon_initial_members = ceph-osd-01 ceph-osd-02
> > mon_host = 10.83.1.78,10.83.1.79
> > auth_cluster_required = cephx
> > auth_service_required = cephx
> > auth_client_required = cephx
> > public_network = 10.83.1.0/24
> > rbd default features = 3
> > #debug ms = 1
> > #debug rgw = 20
> >
> > [client.radosgw.ceph-rgw-02]
> > host = ceph-rgw-02
> > keyring = /etc/ceph/ceph.client.radosgw.keyring
> > log file = /var/log/radosgw/client.radosgw.gateway.log
> > rgw_frontends = "civetweb port=80"
> >
> >
>
> Are you sure this is in /etc/ceph/ceph.conf?
>
> In addition, are you also sure the RGW is running as user
> 'radosgw.ceph-rgw-02' ?
>
> Wido
>
> > after restart still the same :
> >
> > root@ceph-rgw-02:/var/log/ceph# netstat -an |  grep 80
> > *tcp0  0 0.0.0.0:7480 
> >  0.0.0.0:*   LISTEN*
> > tcp0  0 10.83.1.100:56697   10.83.1.79:6804
> > ESTABLISHED
> > tcp0  0 10.83.1.100:59482   10.83.1.79:6800
> > TIME_WAIT
> > tcp0  0 10.83.1.100:33129   10.83.1.78:6804
> > ESTABLISHED
> > tcp0  0 10.83.1.100:56318   10.83.1.80:6804
> > TIME_WAIT
> > tcp0  0 10.83.1.100:56324   10.83.1.80:6804
> > ESTABLISHED
> > tcp0  0 10.83.1.100:60990   10.83.1.78:6800
> > ESTABLISHED
> > tcp0  0 10.83.1.100:60985   10.83.1.78:6800
> > TIME_WAIT
> > tcp0  0 10.83.1.100:56691   10.83.1.79:6804
> > TIME_WAIT
> > tcp0  0 10.83.1.100:33123   10.83.1.78:6804
> > TIME_WAIT
> > tcp0  0 10.83.1.100:59494   10.83.1.79:6800
> > ESTABLISHED
> > tcp0  0 10.83.1.100:55924   10.83.1.80:6800
> > ESTABLISHED
> > tcp0  0 10.83.1.100:57629   10.83.1.80:6789
> > ESTABLISHED
> >
> >
> > Besides that, it also looks like the service tries to start the fcgi
> > frontend (besides civetweb). Is there a reason for that? (fastcgi &
> > Apache are not installed.)
> >
> >
> > 2017-03-13 13:44:35.938897 7f05f3fd7700  1 handle_sigterm set alarm for
> 120
> > 2017-03-13 13:44:35.938916 7f06692c7900 -1 shutting down
> > 2017-03-13 13:44:36.170559 7f06692c7900  1 final shutdown
> > 2017-03-13 13:45:13.980814 7fbdb2e6c900  0 deferred set uid:gid to
> > 64045:64045 (ceph:ceph)
> > 2017-03-13 13:45:13.980992 7fbdb2e6c900  0 ceph version 10.2.6
> > (656b5b63ed7c43bd014bcafd81b001959d5f089f), process radosgw, pid 16995
> > *2017-03-13 13:45:14.115639 7fbdb2e6c900  0 starting handler: civetweb*
> > 2017-03-13 13:45:14.117003 7fbd8700  0 -- 10.83.1.100:0/3241919968
> > submit_message mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0,
> > failed lossy con, dropping message 0x7fbd78011cc0
> > 2017-03-13 13:45:14.117644 7fbd8700  0 monclient: hunting for new mon
> > *2017-03-13 13:45:14.120767 7fbdb2e6c900  0 starting handler: fastcgi*
> > *2017-03-13 13:45:14.123000 7fbd09f7b700  0 ERROR: no socket server point
> > defined, cannot start fcgi frontend*
> >
> >
> > Any idea what I am missing here?
> >
> >
> > ceph version --> 10.2.6
> > ceph-rgw-02 is the gateway .
> >
> > Many Thanks
> >
> > Yair
> >
> > --
> > This e-mail, as well as any attached document, may contain material which
> > is confidential and privileged and may include trademark, copyright and
> > other intellectual property rights that are proprietary to Kenshoo Ltd,
> >  its subsidiaries or affiliates ("Kenshoo"). This e-mail and its
> > attachments may be read, copied and used only by the addressee for the
> > purpose(s) for which it was disclosed herein. If you have received it in
> > error, please destroy the message and any attachment, and contact us
> > immediately. If you are not the intended recipient, be aware that any
> > review, reliance, disclosure, copying, distribution or use of the
> contents
> > of this message without Kenshoo's express permission is strictly
> prohibited.
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

-- 
This e-mail

[ceph-users] RADOS Snapshots on Cephfs Pools?

2017-03-13 Thread Kent Borg
We have a Cephfs cluster stuck in read-only mode, looks like following 
the Disaster Recovery steps, is it a good idea to first make a RADOS 
snapshot of the Cephfs pools? Or are there ways that could make matters 
worse?


Thanks,

-kb

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] modify civetweb default port won't work

2017-03-13 Thread Wido den Hollander

> On 13 March 2017 at 15:03, Yair Magnezi wrote:
> 
> 
> Hello Cephers .
> 
> I'm trying to modify the civetweb default port to 80 but for some
> reason it insists on listening on the default 7480 port
>
> My configuration is quite simple (experimental) and looks like this:
> 
> 
> [global]
> fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
> mon_initial_members = ceph-osd-01 ceph-osd-02
> mon_host = 10.83.1.78,10.83.1.79
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> public_network = 10.83.1.0/24
> rbd default features = 3
> #debug ms = 1
> #debug rgw = 20
> 
> [client.radosgw.ceph-rgw-02]
> host = ceph-rgw-02
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> log file = /var/log/radosgw/client.radosgw.gateway.log
> rgw_frontends = "civetweb port=80"
> 
> 

Are you sure this is in /etc/ceph/ceph.conf?

In addition, are you also sure the RGW is running as user 'radosgw.ceph-rgw-02'?

Wido

> after restart still the same :
> 
> root@ceph-rgw-02:/var/log/ceph# netstat -an |  grep 80
> *tcp0  0 0.0.0.0:7480 
>  0.0.0.0:*   LISTEN*
> tcp0  0 10.83.1.100:56697   10.83.1.79:6804
> ESTABLISHED
> tcp0  0 10.83.1.100:59482   10.83.1.79:6800
> TIME_WAIT
> tcp0  0 10.83.1.100:33129   10.83.1.78:6804
> ESTABLISHED
> tcp0  0 10.83.1.100:56318   10.83.1.80:6804
> TIME_WAIT
> tcp0  0 10.83.1.100:56324   10.83.1.80:6804
> ESTABLISHED
> tcp0  0 10.83.1.100:60990   10.83.1.78:6800
> ESTABLISHED
> tcp0  0 10.83.1.100:60985   10.83.1.78:6800
> TIME_WAIT
> tcp0  0 10.83.1.100:56691   10.83.1.79:6804
> TIME_WAIT
> tcp0  0 10.83.1.100:33123   10.83.1.78:6804
> TIME_WAIT
> tcp0  0 10.83.1.100:59494   10.83.1.79:6800
> ESTABLISHED
> tcp0  0 10.83.1.100:55924   10.83.1.80:6800
> ESTABLISHED
> tcp0  0 10.83.1.100:57629   10.83.1.80:6789
> ESTABLISHED
> 
> 
> Besides that, it also looks like the service tries to start the fcgi
> frontend (besides civetweb). Is there a reason for that? (fastcgi &
> Apache are not installed.)
> 
> 
> 2017-03-13 13:44:35.938897 7f05f3fd7700  1 handle_sigterm set alarm for 120
> 2017-03-13 13:44:35.938916 7f06692c7900 -1 shutting down
> 2017-03-13 13:44:36.170559 7f06692c7900  1 final shutdown
> 2017-03-13 13:45:13.980814 7fbdb2e6c900  0 deferred set uid:gid to
> 64045:64045 (ceph:ceph)
> 2017-03-13 13:45:13.980992 7fbdb2e6c900  0 ceph version 10.2.6
> (656b5b63ed7c43bd014bcafd81b001959d5f089f), process radosgw, pid 16995
> *2017-03-13 13:45:14.115639 7fbdb2e6c900  0 starting handler: civetweb*
> 2017-03-13 13:45:14.117003 7fbd8700  0 -- 10.83.1.100:0/3241919968
> submit_message mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0,
> failed lossy con, dropping message 0x7fbd78011cc0
> 2017-03-13 13:45:14.117644 7fbd8700  0 monclient: hunting for new mon
> *2017-03-13 13:45:14.120767 7fbdb2e6c900  0 starting handler: fastcgi*
> *2017-03-13 13:45:14.123000 7fbd09f7b700  0 ERROR: no socket server point
> defined, cannot start fcgi frontend*
> 
> 
> Any idea what I am missing here?
> 
> 
> ceph version --> 10.2.6
> ceph-rgw-02 is the gateway .
> 
> Many Thanks
> 
> Yair
> 
> -- 
> This e-mail, as well as any attached document, may contain material which 
> is confidential and privileged and may include trademark, copyright and 
> other intellectual property rights that are proprietary to Kenshoo Ltd, 
>  its subsidiaries or affiliates ("Kenshoo"). This e-mail and its 
> attachments may be read, copied and used only by the addressee for the 
> purpose(s) for which it was disclosed herein. If you have received it in 
> error, please destroy the message and any attachment, and contact us 
> immediately. If you are not the intended recipient, be aware that any 
> review, reliance, disclosure, copying, distribution or use of the contents 
> of this message without Kenshoo's express permission is strictly prohibited.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] modify civetweb default port won't work

2017-03-13 Thread Yair Magnezi
Hello Cephers .

I'm trying to modify the civetweb default port to 80 but for some
reason it insists on listening on the default 7480 port.

My configuration is quite simple (experimental) and looks like this:


[global]
fsid = 00c167db-aea1-41b4-903b-69b0c86b6a0f
mon_initial_members = ceph-osd-01 ceph-osd-02
mon_host = 10.83.1.78,10.83.1.79
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 10.83.1.0/24
rbd default features = 3
#debug ms = 1
#debug rgw = 20

[client.radosgw.ceph-rgw-02]
host = ceph-rgw-02
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw_frontends = "civetweb port=80"


after restart still the same :

root@ceph-rgw-02:/var/log/ceph# netstat -an |  grep 80
*tcp0  0 0.0.0.0:7480 
 0.0.0.0:*   LISTEN*
tcp0  0 10.83.1.100:56697   10.83.1.79:6804
ESTABLISHED
tcp0  0 10.83.1.100:59482   10.83.1.79:6800
TIME_WAIT
tcp0  0 10.83.1.100:33129   10.83.1.78:6804
ESTABLISHED
tcp0  0 10.83.1.100:56318   10.83.1.80:6804
TIME_WAIT
tcp0  0 10.83.1.100:56324   10.83.1.80:6804
ESTABLISHED
tcp0  0 10.83.1.100:60990   10.83.1.78:6800
ESTABLISHED
tcp0  0 10.83.1.100:60985   10.83.1.78:6800
TIME_WAIT
tcp0  0 10.83.1.100:56691   10.83.1.79:6804
TIME_WAIT
tcp0  0 10.83.1.100:33123   10.83.1.78:6804
TIME_WAIT
tcp0  0 10.83.1.100:59494   10.83.1.79:6800
ESTABLISHED
tcp0  0 10.83.1.100:55924   10.83.1.80:6800
ESTABLISHED
tcp0  0 10.83.1.100:57629   10.83.1.80:6789
ESTABLISHED


Besides that, it also looks like the service tries to start the fcgi
frontend (besides civetweb). Is there a reason for that? (fastcgi &
Apache are not installed.)


2017-03-13 13:44:35.938897 7f05f3fd7700  1 handle_sigterm set alarm for 120
2017-03-13 13:44:35.938916 7f06692c7900 -1 shutting down
2017-03-13 13:44:36.170559 7f06692c7900  1 final shutdown
2017-03-13 13:45:13.980814 7fbdb2e6c900  0 deferred set uid:gid to
64045:64045 (ceph:ceph)
2017-03-13 13:45:13.980992 7fbdb2e6c900  0 ceph version 10.2.6
(656b5b63ed7c43bd014bcafd81b001959d5f089f), process radosgw, pid 16995
*2017-03-13 13:45:14.115639 7fbdb2e6c900  0 starting handler: civetweb*
2017-03-13 13:45:14.117003 7fbd8700  0 -- 10.83.1.100:0/3241919968
submit_message mon_subscribe({osdmap=48}) v2 remote, 10.83.1.78:6789/0,
failed lossy con, dropping message 0x7fbd78011cc0
2017-03-13 13:45:14.117644 7fbd8700  0 monclient: hunting for new mon
*2017-03-13 13:45:14.120767 7fbdb2e6c900  0 starting handler: fastcgi*
*2017-03-13 13:45:14.123000 7fbd09f7b700  0 ERROR: no socket server point
defined, cannot start fcgi frontend*


Any idea what I am missing here?


ceph version --> 10.2.6
ceph-rgw-02 is the gateway .

Many Thanks

Yair

-- 
This e-mail, as well as any attached document, may contain material which 
is confidential and privileged and may include trademark, copyright and 
other intellectual property rights that are proprietary to Kenshoo Ltd, 
 its subsidiaries or affiliates ("Kenshoo"). This e-mail and its 
attachments may be read, copied and used only by the addressee for the 
purpose(s) for which it was disclosed herein. If you have received it in 
error, please destroy the message and any attachment, and contact us 
immediately. If you are not the intended recipient, be aware that any 
review, reliance, disclosure, copying, distribution or use of the contents 
of this message without Kenshoo's express permission is strictly prohibited.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-13 Thread Christoph Adomeit
Thanks for the detailed upgrade report.

We have another scenario: We have already upgraded to jewel 10.2.6 but
we are still running all our monitors and osd daemons as root using the
setuser match path directive.

What would be the recommended way to have all daemons running as the
ceph:ceph user?

Could we chown -R the monitor and osd data directories under /var/lib/ceph
one by one while keeping the service up?

Thanks
  Christoph
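For the archives, one common per-daemon approach is roughly the sketch below
(untested here; osd.3 is just an example id). With "setuser match path" still
in ceph.conf the daemon should come back up as ceph:ceph once its directory is
owned by ceph, and the directive can be dropped after every daemon has been
converted:

systemctl stop ceph-osd@3
chown -R ceph:ceph /var/lib/ceph/osd/ceph-3
systemctl start ceph-osd@3
# wait for HEALTH_OK, then repeat for the next OSD; monitors are handled
# the same way via ceph-mon@<hostname> and /var/lib/ceph/mon/<cluster>-<hostname>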

On Sat, Mar 11, 2017 at 12:21:38PM +0100, cephmailingl...@mosibi.nl wrote:
> Hello list,
> 
> A week ago we upgraded our Ceph clusters from Hammer to Jewel and with this
> email we want to share our experiences.
> 
-- 
Christoph Adomeit
GATWORKS GmbH
Reststrauch 191
41199 Moenchengladbach
Sitz: Moenchengladbach
Amtsgericht Moenchengladbach, HRB 6303
Geschaeftsfuehrer:
Christoph Adomeit, Hans Wilhelm Terstappen

christoph.adom...@gatworks.de Internetloesungen vom Feinsten
Fon. +49 2166 9149-32  Fax. +49 2166 9149-10
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs deep scrub error:

2017-03-13 Thread John Spray
On Mon, Mar 13, 2017 at 10:28 AM, Dan van der Ster  wrote:
> Hi John,
>
> Last week we updated our prod CephFS cluster to 10.2.6 (clients and
> server side), and for the first time today we've got an object info
> size mismatch:
>
> I found this ticket you created in the tracker, which is why I've
> emailed you: http://tracker.ceph.com/issues/18240
>
> Here's the detail of our error:
>
> 2017-03-13 07:17:49.989297 osd.67 :6819/3441125 262 : cluster
> [ERR] deep-scrub 1.3da 1:5bc0e9dc:::1260f4b.0003:head on disk
> size (4187974) does not match object info size (4193094) adjusted for
> ondisk to (4193094)


Hmm, never investigated that particular ticket but did notice that the
issue never re-occurred on master, or on jewel since the start of
2017.

The last revision where it cropped up in testing was
5b402f8a7b5a763852e93cd0a5decd34572f4518, looking at the history
between that commit and the tip of the jewel branch I don't see
anything that I recognise as a corruption fix (apart from the omap fix
which is clearly unrelated as we're looking at data objects in these
failures).

John


> All three replicas have the same object size/md5sum:
>
> # ls -l 1260f4b.0003__head_3B9703DA__1
> -rw-r--r--. 1 ceph ceph 4187974 Mar 12 18:50
> 1260f4b.0003__head_3B9703DA__1
> # md5sum 1260f4b.0003__head_3B9703DA__1
> db1e1bab199b33fce3ad9195832626ef 1260f4b.0003__head_3B9703DA__1
>
> And indeed the object info does not agree with the files on disk:
>
> # ceph-dencoder type object_info_t import /tmp/attr1 decode dump_json
> {
> "oid": {
> "oid": "1260f4b.0003",
> "key": "",
> "snapid": -2,
> "hash": 999752666,
> "max": 0,
> "pool": 1,
> "namespace": ""
> },
> "version": "5262'221037",
> "prior_version": "5262'221031",
> "last_reqid": "osd.67.0:1180241",
> "user_version": 221031,
> "size": 4193094,
> "mtime": "0.00",
> "local_mtime": "0.00",
> "lost": 0,
> "flags": 52,
> "snaps": [],
> "truncate_seq": 80,
> "truncate_size": 0,
> "data_digest": 2779145704,
> "omap_digest": 4294967295,
> "watchers": {}
> }
>
>
> PG repair doesn't handle these kind of corruptions, but I found a
> recipe in an old thread to fix the object info with hexedit. Before
> doing this I wanted to see if we can understand exactly how this is
> possible.
>
> I managed to find the exact cephfs file, and asked the user how they
> created it. They said the file was the output of a make test on some
> program. The make test was taking awhile, so they left their laptop,
> and when they returned to the computer, the ssh connection to their
> cephfs workstation had broken. I assume this means that the process
> writing the file had been killed while writing to cephfs. But I don't
> understand how a killed client process could result in inconsistent
> object info.
>
> Is there anything else needed to help debug this inconsistency?
>
> Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-13 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Florian Haas
> Sent: 13 March 2017 10:09
> To: Dan van der Ster 
> Cc: ceph-users 
> Subject: Re: [ceph-users] osd_disk_thread_ioprio_priority help
> 
> On Mon, Mar 13, 2017 at 11:00 AM, Dan van der Ster
>  wrote:
> >> I'm sorry, I may have worded that in a manner that's easy to
> >> misunderstand. I generally *never* suggest that people use CFQ on
> >> reasonably decent I/O hardware, and thus have never come across any
> >> need to set this specific ceph.conf parameter.
> >
> > OTOH, cfq *does* help our hammer clusters. deadline's default
> > behaviour is to delay writes up to 5 seconds if the disk is busy
> > reading -- which it is, of course, while deep scrubbing. And deadline
> > does not offer any sort of fairness between processes accessing the
> > same disk (which is admittedly less of an issue in jewel). But back in
> > hammer days it was nice to be able to make the disk threads only read
> > while the disk was otherwise idle.
> 
> Thanks for pointing out the default 5000-ms write deadline. We frequently
> tune that down to 1500ms. Disabling front merges also sometimes seems to
> help.
> 
> For the archives: those settings are in
> /sys/block/*/queue/iosched/{write_expire,front_merges} and can be
> persisted on Debian/Ubuntu with sysfsutils.

Also it may be of some interest that in Linux 4.10 there is new background
priority writeback functionality

https://kernelnewbies.org/Linux_4.10#head-f6ecae920c0660b7f4bcee913f2c71a859dcc184

I've found this makes quite a big difference to read latency if the cluster
is under heavy writes and the WBThrottle allows 5000 IOs to queue up by
default.
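For context, the writeback throttle referred to here is tunable; a sketch of
the relevant filestore options, with what I recall as the jewel-era XFS
defaults (please verify on your own cluster, e.g. with
"ceph daemon osd.0 config show | grep wbthrottle"):

[osd]
filestore_wbthrottle_enable = true
filestore_wbthrottle_xfs_ios_start_flusher = 500
filestore_wbthrottle_xfs_ios_hard_limit = 5000
filestore_wbthrottle_xfs_bytes_start_flusher = 41943040
filestore_wbthrottle_xfs_bytes_hard_limit = 419430400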

> 
> Cheers,
> Florian
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-13 Thread Piotr Dałek

On 03/13/2017 11:07 AM, Dan van der Ster wrote:

On Sat, Mar 11, 2017 at 12:21 PM,  wrote:


The next and biggest problem we encountered had to do with the CRC errors on 
the OSD map. On every map update, the OSDs that were not upgraded yet, got that 
CRC error and asked the monitor for a full OSD map instead of just a delta 
update. At first we did not understand what exactly happened, we ran the 
upgrade per node using a script and in that script we watch the state of the 
cluster and when the cluster is healthy again, we upgrade the next host. Every 
time we started the script (skipping the already upgraded hosts) the first 
host(s) upgraded without issues and then we got blocked I/O on the cluster. The 
blocked I/O went away within a minute or two (not measured). After investigation 
we found out that the blocked I/O happened when nodes were asking the monitor 
for a (full) OSD map and that resulted shortly in a full saturated network link 
on our monitor.



Thanks for the detailed upgrade report. I wanted to zoom in on this
CRC/fullmap issue because it could be quite disruptive for us when we
upgrade from hammer to jewel.

I've read various reports that the foolproof way to avoid the full
map DoS would be to upgrade all OSDs to jewel before the mons.
Did anyone have success with that workaround? I'm cc'ing Bryan because
he knows this issue very well.


With https://github.com/ceph/ceph/pull/13131 merged into 10.2.6, this issue 
shouldn't be a problem (at least we don't see it anymore).


--
Piotr Dałek
piotr.da...@corp.ovh.com
https://www.ovh.com/us/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs deep scrub error:

2017-03-13 Thread Dan van der Ster
Hi John,

Last week we updated our prod CephFS cluster to 10.2.6 (clients and
server side), and for the first time today we've got an object info
size mismatch:

I found this ticket you created in the tracker, which is why I've
emailed you: http://tracker.ceph.com/issues/18240

Here's the detail of our error:

2017-03-13 07:17:49.989297 osd.67 :6819/3441125 262 : cluster
[ERR] deep-scrub 1.3da 1:5bc0e9dc:::1260f4b.0003:head on disk
size (4187974) does not match object info size (4193094) adjusted for
ondisk to (4193094)

All three replicas have the same object size/md5sum:

# ls -l 1260f4b.0003__head_3B9703DA__1
-rw-r--r--. 1 ceph ceph 4187974 Mar 12 18:50
1260f4b.0003__head_3B9703DA__1
# md5sum 1260f4b.0003__head_3B9703DA__1
db1e1bab199b33fce3ad9195832626ef 1260f4b.0003__head_3B9703DA__1

And indeed the object info does not agree with the files on disk:

# ceph-dencoder type object_info_t import /tmp/attr1 decode dump_json
{
"oid": {
"oid": "1260f4b.0003",
"key": "",
"snapid": -2,
"hash": 999752666,
"max": 0,
"pool": 1,
"namespace": ""
},
"version": "5262'221037",
"prior_version": "5262'221031",
"last_reqid": "osd.67.0:1180241",
"user_version": 221031,
"size": 4193094,
"mtime": "0.00",
"local_mtime": "0.00",
"lost": 0,
"flags": 52,
"snaps": [],
"truncate_seq": 80,
"truncate_size": 0,
"data_digest": 2779145704,
"omap_digest": 4294967295,
"watchers": {}
}


PG repair doesn't handle these kind of corruptions, but I found a
recipe in an old thread to fix the object info with hexedit. Before
doing this I wanted to see if we can understand exactly how this is
possible.

I managed to find the exact cephfs file, and asked the user how they
created it. They said the file was the output of a make test on some
program. The make test was taking awhile, so they left their laptop,
and when they returned to the computer, the ssh connection to their
cephfs workstation had broken. I assume this means that the process
writing the file had been killed while writing to cephfs. But I don't
understand how a killed client process could result in inconsistent
object info.

Is there anything else needed to help debug this inconsistency?
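(In case it helps: on jewel the recorded scrub findings for that PG can also
be dumped with, for example,

rados list-inconsistent-obj 1.3da --format=json-pretty

which lists per-shard sizes and digests; the PG id is just the one from the
log line above.)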

Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Archive search is broken, was Re: MySQL and ceph volumes

2017-03-13 Thread Christopher Kunz
On 08.03.17 at 02:47, Christian Balzer wrote:
> 
> Hello,
> 
> as Adrian pointed out, this is not really Ceph specific.
> 
> That being said, there are literally dozen of threads in this ML about
> this issue and speeding up things in general, use your google-foo.

Yeah about that, the search function on
 is broken:
"htsearch detected an error. Please report this to the webmaster of this
site by sending an e-mail to: mail...@listserver-dap.dreamhost.com The
error message is:

Unable to read word database file
'/dh/mailman/dap/archives/private/ceph-users-ceph.com/htdig/db.words.db'
Did you run htdig?"

Maybe someone can have this looked at, ccing mailman@.

Regards,

--ck



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-13 Thread Florian Haas
On Mon, Mar 13, 2017 at 11:00 AM, Dan van der Ster  wrote:
>> I'm sorry, I may have worded that in a manner that's easy to
>> misunderstand. I generally *never* suggest that people use CFQ on
>> reasonably decent I/O hardware, and thus have never come across any
>> need to set this specific ceph.conf parameter.
>
> OTOH, cfq *does* help our hammer clusters. deadline's default
> behaviour is to delay writes up to 5 seconds if the disk is busy
> reading -- which it is, of course, while deep scrubbing. And deadline
> does not offer any sort of fairness between processes accessing the
> same disk (which is admittedly less of an issue in jewel). But back in
> hammer days it was nice to be able to make the disk threads only read
> while the disk was otherwise idle.

Thanks for pointing out the default 5000-ms write deadline. We
frequently tune that down to 1500ms. Disabling front merges also
sometimes seems to help.

For the archives: those settings are in
/sys/block/*/queue/iosched/{write_expire,front_merges} and can be
persisted on Debian/Ubuntu with sysfsutils.
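A small sketch of what that persistence can look like with sysfsutils, in
/etc/sysfs.conf (sda is only a placeholder for an OSD data disk; repeat per
device):

block/sda/queue/iosched/write_expire = 1500
block/sda/queue/iosched/front_merges = 0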

Cheers,
Florian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-13 Thread Dan van der Ster
On Sat, Mar 11, 2017 at 12:21 PM,  wrote:
>
> The next and biggest problem we encountered had to do with the CRC errors on 
> the OSD map. On every map update, the OSDs that were not upgraded yet, got 
> that CRC error and asked the monitor for a full OSD map instead of just a 
> delta update. At first we did not understand what exactly happened, we ran 
> the upgrade per node using a script and in that script we watch the state of 
> the cluster and when the cluster is healthy again, we upgrade the next host. 
> Every time we started the script (skipping the already upgraded hosts) the 
> first host(s) upgraded without issues and then we got blocked I/O on the 
> cluster. The blocked I/O went away within a minute or two (not measured). After 
> investigation we found out that the blocked I/O happened when nodes were 
> asking the monitor for a (full) OSD map and that resulted shortly in a full 
> saturated network link on our monitor.


Thanks for the detailed upgrade report. I wanted to zoom in on this
CRC/fullmap issue because it could be quite disruptive for us when we
upgrade from hammer to jewel.

I've read various reports that the foolproof way to avoid the full
map DoS would be to upgrade all OSDs to jewel before the mons.
Did anyone have success with that workaround? I'm cc'ing Bryan because
he knows this issue very well.

Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-13 Thread Dan van der Ster
On Mon, Mar 13, 2017 at 10:35 AM, Florian Haas  wrote:
> On Sun, Mar 12, 2017 at 9:07 PM, Laszlo Budai  wrote:
>> Hi Florian,
>>
>> thank you for your answer.
>>
>> We have already set the IO scheduler to cfq in order to be able to lower the
>> priority of the scrub operations.
>> My problem is that I've found different values set for the same parameter,
>> and in each case they were doing it in order to achieve the same thing as we
>> do.
>> This is why I was asking for some more details about this parameter. Is it the
>> value 0 or 7 that will set the scrub to the lowest priority? I want my ceph
>> cluster to be responsive for the client requests, and do the scrub in the
>> background.
>>
>> I'm open to any ideas/suggestions which would help to improve the cluster's
>> responsiveness during deep scrub operations.
>
> I'm sorry, I may have worded that in a manner that's easy to
> misunderstand. I generally *never* suggest that people use CFQ on
> reasonably decent I/O hardware, and thus have never come across any
> need to set this specific ceph.conf parameter.

OTOH, cfq *does* help our hammer clusters. deadline's default
behaviour is to delay writes up to 5 seconds if the disk is busy
reading -- which it is, of course, while deep scrubbing. And deadline
does not offer any sort of fairness between processes accessing the
same disk (which is admittedly less of an issue in jewel). But back in
hammer days it was nice to be able to make the disk threads only read
while the disk was otherwise idle.
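For reference, the hammer/jewel-era settings this thread is about look like the
sketch below; as far as I know they only take effect when the OSD data disks
use the cfq scheduler, and with the idle class the disk (scrub) thread only
gets I/O time when the disk is otherwise idle (within the best-effort class, 7
would be the lowest priority):

[osd]
osd_disk_thread_ioprio_class = idle
osd_disk_thread_ioprio_priority = 7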

-- Dan

>
> Maybe if you share your full ceph.conf and hardware details, I or
> others on this list can offer more useful suggestions than tweaking
> osd_disk_thread_ioprio_priority.
>
> Cheers,
> Florian
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-13 Thread Florian Haas
On Sun, Mar 12, 2017 at 9:07 PM, Laszlo Budai  wrote:
> Hi Florian,
>
> thank you for your answer.
>
> We have already set the IO scheduler to cfq in order to be able to lower the
> priority of the scrub operations.
> My problem is that I've found different values set for the same parameter,
> and in each case they were doing it in order to achieve the same thing as we
> do.
> This is why I was asking for some more details about this parameter. Is it the
> value 0 or 7 that will set the scrub to the lowest priority? I want my ceph
> cluster to be responsive for the client requests, and do the scrub in the
> background.
>
> I'm open to any ideas/suggestions which would help to improve the cluster's
> responsiveness during deep scrub operations.

I'm sorry, I may have worded that in a manner that's easy to
misunderstand. I generally *never* suggest that people use CFQ on
reasonably decent I/O hardware, and thus have never come across any
need to set this specific ceph.conf parameter.

Maybe if you share your full ceph.conf and hardware details, I or
others on this list can offer more useful suggestions than tweaking
osd_disk_thread_ioprio_priority.

Cheers,
Florian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com