Re: [ceph-users] Erasure coding
Hi John,

Thanks for the reply. It seems that to understand the internal mechanism and the algorithmic structure of Ceph, knowledge of information theory is necessary ☺.

Thanks
Kumar

From: John Wilkins [mailto:john.wilk...@inktank.com]
Sent: Monday, May 19, 2014 10:18 PM
To: Gnan Kumar, Yalla
Cc: Loic Dachary; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure coding

I have also added a big part of Loic's discussion of the architecture into the Ceph architecture document here: http://ceph.com/docs/master/architecture/#erasure-coding

On Mon, May 19, 2014 at 5:35 AM, yalla.gnan.ku...@accenture.com wrote:

Hi Loic,

Thanks for the reply.

Thanks
Kumar

-----Original Message-----
From: Loic Dachary [mailto:l...@dachary.org]
Sent: Monday, May 19, 2014 6:04 PM
To: Gnan Kumar, Yalla; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure coding

Hi,

The general idea is to preserve resilience but save space compared to replication. It costs more in terms of CPU and network. You will find a short introduction here:

https://wiki.ceph.com/Planning/Blueprints/Dumpling/Erasure_encoding_as_a_storage_backend
https://wiki.ceph.com/Planning/Blueprints/Firefly/Erasure_coded_storage_backend_%28step_3%29

For the next Ceph release, Pyramid Codes will help reduce the bandwidth requirements: https://wiki.ceph.com/Planning/Blueprints/Giant/Pyramid_Erasure_Code

Cheers

On 19/05/2014 13:52, yalla.gnan.ku...@accenture.com wrote:

Hi All,

What exactly is erasure coding and why is it used in Ceph? I could not get enough explanatory information from the documentation.

Thanks
Kumar

--
Loïc Dachary, Artisan Logiciel Libre

--
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
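Loic's point about saving space can be made concrete: a k+m erasure-coded pool splits each object into k data chunks and adds m coding chunks, so the raw-space overhead is (k+m)/k, versus n× for n-way replication. A quick comparison (plain arithmetic, not tied to any particular Ceph release; the k=8, m=4 profile is just an illustrative choice):

```python
def replication_overhead(replicas: int) -> float:
    # Each object is stored 'replicas' times in full.
    return float(replicas)

def erasure_overhead(k: int, m: int) -> float:
    # Object split into k data chunks; m coding chunks added.
    # Raw bytes stored per byte of user data: (k + m) / k.
    return (k + m) / k

# 3-way replication vs a k=8, m=4 profile (both survive multiple failures,
# but the EC pool tolerates the loss of any 4 chunks):
print(replication_overhead(3))   # 3.0
print(erasure_overhead(8, 4))    # 1.5
```

So an 8+4 pool stores half the raw bytes of 3× replication while tolerating more simultaneous chunk losses, at the cost of the extra CPU and network traffic Loic mentions.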
Re: [ceph-users] Fwd: rbd map command hangs
On Tue, May 20, 2014 at 7:52 AM, Jay Janardhan jay.janard...@kaseya.com wrote:

Got the stack trace when it crashed. I had to enable the serial port to capture this. Would this help?

[ 172.227318] libceph: mon0 192.168.56.102:6789 feature set mismatch, my 40002 < server's 20042040002, missing 2004200
[ 172.451109] libceph: mon0 192.168.56.102:6789 socket error on read
[ 172.539837] ------------[ cut here ]------------
[ 172.640704] kernel BUG at /home/apw/COD/linux/net/ceph/messenger.c:2366!
[ 172.740775] invalid opcode: [#1] SMP
[ 172.805429] Modules linked in: rbd libceph libcrc32c nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc ext2 ppdev microcode psmouse serio_raw parport_pc i2c_piix4 mac_hid lp parport e1000
[ 173.072985] CPU 0
[ 173.143909] Pid: 385, comm: kworker/0:3 Not tainted 3.6.9-030609-generic #201212031610 innotek GmbH VirtualBox/VirtualBox
[ 173.358836] RIP: 0010:[a0183ff7] [a0183ff7] ceph_fault+0x267/0x270 [libceph]
[ 173.629918] RSP: 0018:88007b497d90 EFLAGS: 00010286
[ 173.731786] RAX: fffe RBX: 88007b909298 RCX: 0003
[ 173.901361] RDX: RSI: RDI: 0039
[ 174.040360] RBP: 88007b497dc0 R08: 000a R09: fffb
[ 174.235587] R10: R11: 0199 R12: 88007b9092c8
[ 174.385067] R13: R14: a0199580 R15: a0195773
[ 174.541288] FS: () GS:88007fc0() knlGS:
[ 174.620856] CS: 0010 DS: ES: CR0: 8005003b
[ 174.740551] CR2: 7fefd16c5168 CR3: 7bb41000 CR4: 06f0
[ 174.948095] DR0: DR1: DR2:
[ 175.076881] DR3: DR6: 0ff0 DR7: 0400
[ 175.320731] Process kworker/0:3 (pid: 385, threadinfo 88007b496000, task 880079735bc0)
[ 175.565218] Stack:
[ 175.630655] 88007b909298 88007b909690 88007b9093d0
[ 175.699571] 88007b909418 88007fc0e300 88007b497df0 a018525c
[ 175.710012] 88007b909690 880078e4d800 88007fc1bf00 88007fc0e340
[ 175.859748] Call Trace:
[ 175.909572] [a018525c] con_work+0x14c/0x1c0 [libceph]
[ 176.010436] [810763b6] process_one_work+0x136/0x550
[ 176.131098] [a0185110] ? try_read+0x440/0x440 [libceph]
[ 176.249904] [810775b5] worker_thread+0x165/0x3c0
[ 176.368412] [81077450] ? manage_workers+0x190/0x190
[ 176.512415] [8107c5e3] kthread+0x93/0xa0
[ 176.623469] [816b8c04] kernel_thread_helper+0x4/0x10
[ 176.670502] [8107c550] ? flush_kthread_worker+0xb0/0xb0
[ 176.731089] [816b8c00] ? gs_change+0x13/0x13
[ 176.901284] Code: 00 00 00 00 48 8b 83 38 01 00 00 a8 02 0f 85 f6 fe ff ff 3e 80 a3 38 01 00 00 fb 48 c7 83 40 01 00 00 06 00 00 00 e9 37 ff ff ff 0f 0b 0f 0b 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8
[ 177.088895] RIP [a0183ff7] ceph_fault+0x267/0x270 [libceph]
[ 177.251573] RSP 88007b497d90
[ 177.310320] ---[ end trace f66ddfdda09b9821 ]---

OK, it definitely shouldn't have crashed here, and there is a patch in later kernels that prevents this crash from happening. But, because 3.6 is too old and misses features (which is reported just prior to the crash splat), you wouldn't be able to use it with firefly userspace even if it didn't crash. You are going to need to run at least 3.9 and then disable either the vary_r tunable in your crushmap (vary_r will only be supported starting with 3.15) or primary-affinity adjustments - I can't tell which one it is just from the feature set mismatch message.

Thanks,
Ilya
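The "feature set mismatch" line gives the client's and the server's feature bitmasks; the bits the client lacks are simply the server's mask with the client's bits cleared. The "missing" value in this serial capture looks truncated, but it can be recomputed from the other two numbers:

```python
client = 0x40002          # "my" features from the log
server = 0x20042040002    # "server's" features from the log

# Features the mon requires that this kernel does not implement:
missing = server & ~client
print(hex(missing))        # 0x20042000000

# Each set bit corresponds to one required feature; a newer
# kernel client clears these bits by implementing the features.
for bit in range(missing.bit_length()):
    if missing >> bit & 1:
        print("missing feature bit", bit)
```

Which named features those bits map to depends on the Ceph release, but the arithmetic is enough to see that upgrading the kernel (not the cluster) is what closes the gap.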
[ceph-users] 70+ OSD are DOWN and not coming up
Hello Cephers, I need your suggestions for troubleshooting. My cluster is terribly struggling: 70+ OSDs are down out of 165.

Problem: OSDs are getting marked out of the cluster and are down. The cluster is degraded. On checking the logs of the failed OSDs, we see weird entries that are continuously being generated.

OSD debug logs: http://pastebin.com/agTKh6zB

2014-05-20 10:19:03.699886 7f2328e237a0 0 osd.158 357532 done with init, starting boot process
2014-05-20 10:19:03.700093 7f22ff621700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.109:6802/910005982 pipe(0x8698500 sd=35 :33500 s=1 pgs=0 cs=0 l=0 c=0x83018c0).connect claims to be 192.168.1.109:6802/63896 not 192.168.1.109:6802/910005982 - wrong node!
2014-05-20 10:19:03.700152 7f22ff621700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.109:6802/910005982 pipe(0x8698500 sd=35 :33500 s=1 pgs=0 cs=0 l=0 c=0x83018c0).fault with nothing to send, going to standby
2014-05-20 10:19:09.551269 7f22fdd12700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.109:6803/1176009454 pipe(0x56aee00 sd=53 :40060 s=1 pgs=0 cs=0 l=0 c=0x533fd20).connect claims to be 192.168.1.109:6803/63896 not 192.168.1.109:6803/1176009454 - wrong node!
2014-05-20 10:19:09.551347 7f22fdd12700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.109:6803/1176009454 pipe(0x56aee00 sd=53 :40060 s=1 pgs=0 cs=0 l=0 c=0x533fd20).fault with nothing to send, going to standby
2014-05-20 10:19:09.703901 7f22fd80d700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.113:6802/13870 pipe(0x56adf00 sd=137 :42889 s=1 pgs=0 cs=0 l=0 c=0x8302aa0).connect claims to be 192.168.1.113:6802/24612 not 192.168.1.113:6802/13870 - wrong node!
2014-05-20 10:19:09.704039 7f22fd80d700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.113:6802/13870 pipe(0x56adf00 sd=137 :42889 s=1 pgs=0 cs=0 l=0 c=0x8302aa0).fault with nothing to send, going to standby
2014-05-20 10:19:10.243139 7f22fd005700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).connect claims to be 192.168.1.112:6800/2852 not 192.168.1.112:6800/14114 - wrong node!
2014-05-20 10:19:10.243190 7f22fd005700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).fault with nothing to send, going to standby
2014-05-20 10:19:10.349693 7f22fc7fd700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.109:6800/13492 pipe(0x8698c80 sd=156 :0 s=1 pgs=0 cs=0 l=0 c=0x83070c0).fault with nothing to send, going to standby

# ceph -v
ceph version 0.80-469-g991f7f1 (991f7f15a6e107b33a24bbef1169f21eb7fcce2c)

# ceph osd stat
osdmap e357073: 165 osds: 91 up, 165 in
flags noout

What I have tried:

1. Restarting the problematic OSDs, but no luck.
2. Restarting the entire host, but no luck; the OSDs are still down and keep logging the same messages:

2014-05-20 10:19:10.243139 7f22fd005700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).connect claims to be 192.168.1.112:6800/2852 not 192.168.1.112:6800/14114 - wrong node!
2014-05-20 10:19:10.243190 7f22fd005700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).fault with nothing to send, going to standby
2014-05-20 10:19:10.349693 7f22fc7fd700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.109:6800/13492 pipe(0x8698c80 sd=156 :0 s=1 pgs=0 cs=0 l=0 c=0x83070c0).fault with nothing to send, going to standby
2014-05-20 10:22:23.312473 7f2307e61700 0 osd.158 357781 do_command r=0
2014-05-20 10:22:23.326110 7f2307e61700 0 osd.158 357781 do_command r=0 debug_osd=0/5
2014-05-20 10:22:23.326123 7f2307e61700 0 log [INF] : debug_osd=0/5
2014-05-20 10:34:08.161864 7f230224d700 0 -- 192.168.1.112:6802/3807 >> 192.168.1.102:6808/13276 pipe(0x8698280 sd=22 :41078 s=2 pgs=603 cs=1 l=0 c=0x8301600).fault with nothing to send, going to standby

3. The disks do not have errors; there are no messages in dmesg or /var/log/messages.
4. There was a similar bug in the past (http://tracker.ceph.com/issues/4006); I don't know whether it has come back in Firefly.
5. No activity was performed on the cluster recently, except creating some pools and keys for Cinder/Glance integration.
6. The nodes have enough free resources for the OSDs.
7. There are no issues with the network; OSDs are down across all cluster nodes, not just a single node.

Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/
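The repeated "wrong node!" faults mean the address/nonce a daemon expects for a peer (from its OSD map) no longer matches the daemon actually answering on that address, which typically points at stale maps or restarted daemons. When many OSDs log these, it helps to tally which peer addresses are affected. A small helper for that, assuming only the log line format shown above (the sample line is abbreviated):

```python
import re
from collections import Counter

# Matches: "...connect claims to be <actual> not <expected> - wrong node!"
WRONG_NODE = re.compile(r"connect claims to be (\S+) not (\S+) - wrong node!")

def tally_wrong_nodes(log_lines):
    # Count, per (expected, actual) address pair, how often a
    # different daemon answered on an expected peer address.
    counts = Counter()
    for line in log_lines:
        m = WRONG_NODE.search(line)
        if m:
            actual, expected = m.group(1), m.group(2)
            counts[(expected, actual)] += 1
    return counts

log = [
    "2014-05-20 10:19:03.700093 ... connect claims to be "
    "192.168.1.109:6802/63896 not 192.168.1.109:6802/910005982 - wrong node!",
]
print(tally_wrong_nodes(log))
```

A few distinct pairs concentrated on one host suggests that host's daemons restarted with new nonces; pairs spread across every host, as here, suggests the down OSDs are peering against outdated maps.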
Re: [ceph-users] How do I do deep-scrub manually?
Dear Yang,

I planned to set nodeep-scrub nightly via crontab, and that leaves the cluster with the error "HEALTH_WARN nodeep-scrub flag(s) set". Since I only pay attention to messages from the monitoring tool (e.g. Nagios), I rewrote the Nagios check script so that the "HEALTH_WARN nodeep-scrub flag(s) set" message returns code 0.

On 05/20/2014 10:47 AM, Jianing Yang wrote:

I found that deep scrub has a significant impact on my cluster. I've used "ceph osd set nodeep-scrub" to disable it. But I got an error "HEALTH_WARN nodeep-scrub flag(s) set". What is the proper way to disable deep scrub? And how can I run it manually?
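To answer the manual-run part of the question: a deep scrub can be triggered by hand with "ceph pg deep-scrub <pgid>" for a single placement group or "ceph osd deep-scrub <osd-id>" for everything on one OSD. The nightly-window idea can then be sketched in cron roughly as below; the times, cron path, and osd id are placeholders, not a recommendation:

```
# /etc/cron.d/ceph-scrub -- illustrative sketch only
# Block deep scrubbing during business hours, allow it overnight:
0 7  * * *   root   ceph osd set nodeep-scrub
0 22 * * *   root   ceph osd unset nodeep-scrub
# Optionally kick off one OSD's deep scrub explicitly at night:
30 22 * * *  root   ceph osd deep-scrub 0
```

While the flag is set, HEALTH_WARN is expected behavior; filtering it in the monitoring check, as described above, is the usual workaround.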
[ceph-users] Data still in OSD directories after removing
Hi,

Short version: I removed a 1 TB RBD image, but I still see files belonging to it on an OSD.

Long version:
1) I ran "rbd snap purge $pool/$img", but since it overloaded the cluster, I stopped it (CTRL+C).
2) Later, I ran "rbd snap purge $pool/$img" again.
3) Then, "rbd rm $pool/$img".

Now, on the disk I can find files of this v1 RBD image (the prefix was rb.0.14bfb5a.238e1f29):

# find /var/lib/ceph/osd/ceph-64/ -name 'rb.0.14bfb5a.238e1f29.*'
/var/lib/ceph/osd/ceph-64/current/9.5c1_head/DIR_1/DIR_C/DIR_5/DIR_3/rb.0.14bfb5a.238e1f29.00021431__snapdir_C96635C1__9
/var/lib/ceph/osd/ceph-64/current/9.5c1_head/DIR_1/DIR_C/DIR_5/DIR_3/rb.0.14bfb5a.238e1f29.5622__a252_32F435C1__9
/var/lib/ceph/osd/ceph-64/current/9.5c1_head/DIR_1/DIR_C/DIR_5/DIR_3/rb.0.14bfb5a.238e1f29.00021431__a252_C96635C1__9
/var/lib/ceph/osd/ceph-64/current/9.5c1_head/DIR_1/DIR_C/DIR_5/DIR_3/rb.0.14bfb5a.238e1f29.5622__snapdir_32F435C1__9
/var/lib/ceph/osd/ceph-64/current/9.5c1_head/DIR_1/DIR_C/DIR_5/DIR_9/rb.0.14bfb5a.238e1f29.00011e08__a172_594495C1__9
/var/lib/ceph/osd/ceph-64/current/9.5c1_head/DIR_1/DIR_C/DIR_5/DIR_9/rb.0.14bfb5a.238e1f29.00011e08__snapdir_594495C1__9
/var/lib/ceph/osd/ceph-64/current/9.5c1_head/DIR_1/DIR_C/DIR_5/DIR_A/rb.0.14bfb5a.238e1f29.00021620__a252_779FA5C1__9
...

So, is there a way to force the OSDs to detect orphaned files and remove them?

Thanks,
Olivier
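One common approach is to work at the RADOS level rather than on the OSD filesystem: list the pool's objects, match on the removed image's block-name prefix, and remove stragglers with "rados rm" (the "__snapdir" entries on disk belong to snapshot clones, which the OSDs trim themselves once the snapshots are gone). The filtering step can be sketched like this; the object names below are illustrative, as would come from "rados -p <pool> ls":

```python
prefix = "rb.0.14bfb5a.238e1f29"

def orphan_objects(object_names, prefix):
    # Select objects belonging to the removed image's prefix.
    # RBD v1 data objects are named "<prefix>.<object-number>".
    return [name for name in object_names if name.startswith(prefix + ".")]

listing = [
    "rb.0.14bfb5a.238e1f29.00021431",
    "rb.0.14bfb5a.238e1f29.00005622",
    "rb.0.deadbee.11111111.00000001",   # hypothetical: some other image
]
for obj in orphan_objects(listing, prefix):
    print("rados -p <pool> rm", obj)
```

Double-check the prefix before deleting anything: every object whose name starts with it belongs to the removed image, and nothing else should share it.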
[ceph-users] Expanding pg's of an erasure coded pool
Hi,

On a setup of 400 OSDs (20 nodes, with 20 OSDs per node), I first tried to create an erasure-coded pool with 4096 PGs, but this crashed the cluster. I then started with 1024 PGs, expanding to 2048 (pg_num and pgp_num); when I then try to expand to 4096 (not even quite enough) the cluster crashes again. (Do we need fewer PGs with erasure coding?)

The crash starts with individual OSDs crashing, eventually bringing down the mons (until there is no more quorum or too few OSDs). Out of the logs:

-16 2014-05-20 10:31:55.545590 7fd42f34d700 5 -- op tracker -- , seq: 14301, time: 2014-05-20 10:31:55.545590, event: started, request: pg_query(0.974 epoch 3315) v3
-15 2014-05-20 10:31:55.545776 7fd42f34d700 1 -- 130.246.178.141:6836/10446 --> 130.246.179.191:6826/21854 -- pg_notify(0.974 epoch 3326) v5 -- ?+0 0xc8b4ec0 con 0x9026b40
-14 2014-05-20 10:31:55.545807 7fd42f34d700 5 -- op tracker -- , seq: 14301, time: 2014-05-20 10:31:55.545807, event: done, request: pg_query(0.974 epoch 3315) v3
-13 2014-05-20 10:31:55.559661 7fd3fdb0f700 1 -- 130.246.178.141:6837/10446 >> :/0 pipe(0xce0c380 sd=468 :6837 s=0 pgs=0 cs=0 l=0 c=0x1255f0c0).accept sd=468 130.246.179.191:60618/0
-12 2014-05-20 10:31:55.564034 7fd3bf72f700 1 -- 130.246.178.141:6838/10446 >> :/0 pipe(0xe3f2300 sd=596 :6838 s=0 pgs=0 cs=0 l=0 c=0x129b5ee0).accept sd=596 130.246.179.191:43913/0
-11 2014-05-20 10:31:55.627776 7fd42df4b700 1 -- 130.246.178.141:0/10446 <== osd.170 130.246.179.191:6827/21854 3 osd_ping(ping_reply e3316 stamp 2014-05-20 10:31:52.994368) v2 47+0+0 (855262282 0 0) 0xb6863c0 con 0x1255b9c0
-10 2014-05-20 10:31:55.629425 7fd42df4b700 1 -- 130.246.178.141:0/10446 <== osd.170 130.246.179.191:6827/21854 4 osd_ping(ping_reply e3316 stamp 2014-05-20 10:31:53.509621) v2 47+0+0 (2581193378 0 0) 0x93d6c80 con 0x1255b9c0
-9 2014-05-20 10:31:55.631270 7fd42f34d700 1 -- 130.246.178.141:6836/10446 <== osd.169 130.246.179.191:6841/25473 2 pg_query(7.3ffs6 epoch 3326) v3 144+0+0 (221596234 0 0) 0x10b994a0 con 0x9383860
-8 2014-05-20 10:31:55.631308 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631130, event: header_read, request: pg_query(7.3ffs6 epoch 3326) v3
-7 2014-05-20 10:31:55.631315 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631133, event: throttled, request: pg_query(7.3ffs6 epoch 3326) v3
-6 2014-05-20 10:31:55.631339 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631207, event: all_read, request: pg_query(7.3ffs6 epoch 3326) v3
-5 2014-05-20 10:31:55.631343 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631303, event: dispatched, request: pg_query(7.3ffs6 epoch 3326) v3
-4 2014-05-20 10:31:55.631349 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631349, event: waiting_for_osdmap, request: pg_query(7.3ffs6 epoch 3326) v3
-3 2014-05-20 10:31:55.631363 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631363, event: started, request: pg_query(7.3ffs6 epoch 3326) v3
-2 2014-05-20 10:31:55.631402 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631402, event: done, request: pg_query(7.3ffs6 epoch 3326) v3
-1 2014-05-20 10:31:55.631488 7fd427b41700 1 -- 130.246.178.141:6836/10446 --> 130.246.179.191:6841/25473 -- pg_notify(7.3ffs6(14) epoch 3326) v5 -- ?+0 0xcc7b9c0 con 0x9383860
0 2014-05-20 10:31:55.632127 7fd42cb49700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fd42cb49700 time 2014-05-20 10:31:55.630937
common/Thread.cc: 110: FAILED assert(ret == 0)

ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xa2a6aa]
3: (Accepter::entry()+0x265) [0xb3ca45]
4: (()+0x79d1) [0x7fd4436b19d1]
5: (clone()+0x6d) [0x7fd4423ecb6d]

--- begin dump of recent events ---
0 2014-05-20 10:31:56.622247 7fd3bc5fe700 -1 *** Caught signal (Aborted) ** in thread 7fd3bc5fe700

ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
1: /usr/bin/ceph-osd() [0x9ab3b1]
2: (()+0xf710) [0x7fd4436b9710]
3: (gsignal()+0x35) [0x7fd442336925]
4: (abort()+0x175) [0x7fd442338105]
5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7fd442bf0a5d]
6: (()+0xbcbe6) [0x7fd442beebe6]
7: (()+0xbcc13) [0x7fd442beec13]
8: (()+0xbcd0e) [0x7fd442beed0e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f2) [0xaec612]
10: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
11: (Pipe::connect()+0x2efb) [0xb2850b]
12: (Pipe::writer()+0x9f3) [0xb2a063]
13: (Pipe::Writer::entry()+0xd) [0xb359cd]
14: (()+0x79d1) [0x7fd4436b19d1]
15: (clone()+0x6d) [0x7fd4423ecb6d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
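The failed assertion is in Thread::create, i.e. pthread_create returned an error: the OSD could not spawn another messenger thread. That fits the PG expansion, since every additional PG multiplies peer connections (and thus threads and file descriptors) until a process limit is hit. As a sanity check on sizing, the commonly cited rule of thumb is on the order of 100 PGs per OSD divided by the data spread per object; for an erasure-coded pool each object spans k+m OSDs. A rough calculator (the k=8, m=3 profile below is an assumed example, the thread doesn't state the actual profile):

```python
def suggested_pg_count(num_osds, chunks_per_object, pgs_per_osd=100):
    # chunks_per_object: replica count for replicated pools,
    # or k + m for erasure-coded pools.
    target = num_osds * pgs_per_osd / chunks_per_object
    # Round up to the next power of two, as commonly recommended.
    p = 1
    while p < target:
        p *= 2
    return p

# 400 OSDs, assumed EC profile k=8, m=3 -> 11 chunks per object:
print(suggested_pg_count(400, 8 + 3))   # 4096
# Same cluster with 3-way replication for comparison:
print(suggested_pg_count(400, 3))       # 16384
```

So yes: because each EC PG touches k+m OSDs instead of the replica count, the same per-OSD PG budget is reached with fewer PGs than a replicated pool would use, and raising the limits (or growing pg_num more gradually) is needed to survive the peering spike.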
Re: [ceph-users] Problem with radosgw and some file name characters
Anyone have any idea how to fix the problem of getting 403 when trying to upload files with non-standard characters? I am sure I am not the only one with these requirements.

Cheers

----- Original Message -----
From: Andrei Mikhailovsky and...@arhont.com
To: Yehuda Sadeh yeh...@inktank.com
Cc: ceph-users@lists.ceph.com
Sent: Monday, 19 May, 2014 12:38:29 PM
Subject: Re: [ceph-users] Problem with radosgw and some file name characters

Yehuda,

Never mind my last post; I've found the issue with the rule that you suggested. My fastcgi script is called differently, so that's why I was getting the 404. I've tried your rewrite rule and I am still having the same issues. The same characters are failing with the rule you suggested. Any idea how to fix the issue?

Cheers
Andrei

----- Original Message -----
From: Andrei Mikhailovsky and...@arhont.com
To: Yehuda Sadeh yeh...@inktank.com
Cc: ceph-users@lists.ceph.com
Sent: Monday, 19 May, 2014 9:30:03 AM
Subject: Re: [ceph-users] Problem with radosgw and some file name characters

Yehuda,

I've tried the rewrite rule that you suggested, but it is not working for me. I get 404 when trying to access the service.

RewriteRule ^/(.*) /s3gw.3.fcgi?%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

Any idea what is wrong with this rule?

Cheers
Andrei

----- Original Message -----
From: Yehuda Sadeh yeh...@inktank.com
To: Andrei Mikhailovsky and...@arhont.com
Cc: ceph-users@lists.ceph.com
Sent: Friday, 16 May, 2014 5:44:52 PM
Subject: Re: [ceph-users] Problem with radosgw and some file name characters

Was talking about this. There is a different and simpler rule that we use nowadays; for some reason it's not well documented:

RewriteRule ^/(.*) /s3gw.3.fcgi?%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

I still need to see a more verbose log to make a better educated guess.

Yehuda

On Thu, May 15, 2014 at 3:01 PM, Andrei Mikhailovsky and...@arhont.com wrote:

Yehuda, what do you mean by the rewrite rule? Is this for Apache? I've used the ceph documentation to create it. My rule is:

RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1&params=$2&%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

Or are you talking about something else?

Cheers
Andrei

From: Yehuda Sadeh yeh...@inktank.com
To: Andrei Mikhailovsky and...@arhont.com
Cc: ceph-users@lists.ceph.com
Sent: Thursday, 15 May, 2014 4:05:06 PM
Subject: Re: [ceph-users] Problem with radosgw and some file name characters

Your rewrite rule might be off a bit. Can you provide a log with 'debug rgw = 20'?

Yehuda

On Thu, May 15, 2014 at 8:02 AM, Andrei Mikhailovsky and...@arhont.com wrote:

Hello guys,

I am trying to figure out what the problem is here. Currently running Ubuntu 12.04 with latest updates and radosgw version 0.72.2-1precise. My ceph.conf file is pretty standard, from the radosgw howto.

I am testing radosgw as a backup solution with S3-compatible clients. I am planning to copy a large number of files/folders and I am having issues with a large number of files. The client reports the following error on some files:

<?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code></Error>

Looking on the server side I only see the following errors in the radosgw.log file:

2014-05-13 23:50:35.786181 7f09467dc700 1 == starting new request req=0x245d7e0 =
2014-05-13 23:50:35.786470 7f09467dc700 1 == req done req=0x245d7e0 http_status=403 ==

So, I've made a small file set comprising test files with the following names:

Testing and Testing.txt
Testing ^ Testing.txt
Testing = Testing.txt
Testing _ Testing.txt
Testing - Testing.txt
Testing ; Testing.txt
Testing ! Testing.txt
Testing ? Testing.txt
Testing ( Testing.txt
Testing ) Testing.txt
Testing @ Testing.txt
Testing $ Testing.txt
Testing * Testing.txt
Testing & Testing.txt
Testing # Testing.txt
Testing % Testing.txt
Testing + Testing.txt

From the above list, the files with the following characters give me an Access Denied / 403 error: =;()@$*+

The rest of the files are successfully uploaded. Does anyone know what is required to fix the problem?

Many thanks
Andrei
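The failing characters are exactly the ones that are reserved or significant in URLs and query strings, so whether the client percent-encodes them, and whether the web server decodes them before the fastcgi rewrite, decides if the signature radosgw computes matches the client's. The standard per-character encodings are easy to check (plain urllib here, independent of any S3 client):

```python
from urllib.parse import quote

names = ["Testing = Testing.txt", "Testing + Testing.txt",
         "Testing @ Testing.txt", "Testing _ Testing.txt"]
for n in names:
    # quote() leaves '/' unescaped by default; pass safe="" so the
    # whole object name is encoded as a single path segment.
    print(quote(n, safe=""))
```

Note that '_', '-', '.' and similar unreserved characters pass through unencoded, which matches the report above that those filenames upload fine, while '=', '+', '@' and friends must arrive as %3D, %2B, %40 for the rewritten query string to survive intact.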
[ceph-users] rbd watchers
Hi,

I'm having some trouble with an RBD image. I want to rename the current RBD and create a new RBD with the same name. I renamed the RBD with rbd mv, but it was still mapped on another node, so rbd mv gave me an error that it was unable to remove the source. I then unmapped the original RBD and tried to remove it. Despite it being unmapped, the cluster still believes that there is a watcher on the RBD:

root@ceph-admin:~# rados -p poolname listwatchers rbdname.rbd
watcher=x.x.x.x:0/2329830975 client.26367 cookie=48

root@ceph-admin:~# rbd rm -p poolname rbdname
Removing image: 99% complete...failed.
2014-05-20 11:50:15.023823 7fa6372e4780 -1 librbd: error removing header: (16) Device or resource busy
rbd: error: image still has watchers
This means the image is still open or the client using it crashed. Try again after closing/unmapping it or waiting 30s for the crashed client to timeout.

I've already rebooted the node that the cluster claims is a watcher and confirmed it definitely is not mapped. I'm 99.9% sure that there are no nodes actually using this RBD. Does anyone know how I can get rid of it?

Currently running ceph 0.73-1 on Ubuntu 12.04.

Thanks
J
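When a stale watch outlives the client like this, the usual workaround is to blacklist the watcher's address with "ceph osd blacklist add <addr>", which makes the OSD drop the watch so the image can be removed (the blacklist entry expires on its own, or can be removed with "ceph osd blacklist rm"). Pulling the address out of the listwatchers output can be scripted; the parsing below assumes exactly the output format shown above:

```python
import re

# Line as printed by: rados -p <pool> listwatchers <image>.rbd
line = "watcher=x.x.x.x:0/2329830975 client.26367 cookie=48"

m = re.match(r"watcher=(\S+) client\.(\d+) cookie=(\d+)", line)
addr, client_id, cookie = m.group(1), m.group(2), m.group(3)

# Emit the blacklist command to copy/paste (not executed here):
print("ceph osd blacklist add", addr)
```

Since blacklisting affects everything from that address, only do this after confirming, as above, that no live client is actually using the image.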
Re: [ceph-users] Fwd: rbd map command hangs
Thanks again Ilya. I was following these recommendations: http://ceph.com/docs/master/start/os-recommendations/. Should this page be updated in that case?

I'm going to upgrade to 3.9. Should I update the Ceph cluster nodes as well or just the ceph block device client?

On Tue, May 20, 2014 at 3:20 AM, Ilya Dryomov ilya.dryo...@inktank.com wrote:

[quoted kernel stack trace snipped]

OK, it definitely shouldn't have crashed here and there is a patch in later kernels that prevents this crash from happening. But, because 3.6 is too old and misses features, which is reported just prior to the crash splat, you wouldn't be able to use it with firefly userspace even if it didn't crash. You are going to need to run at least 3.9 and then disable vary_r tunable in your crushmap (vary_r will only be supported starting with 3.15) or primary-affinity adjustments - I can't tell which one is it just from the feature set mismatch message.

Thanks,
Ilya
Re: [ceph-users] Fwd: rbd map command hangs
On Tue, May 20, 2014 at 5:09 PM, Jay Janardhan jay.janard...@kaseya.com wrote:

Thanks again Ilya. I was following these recommendations: http://ceph.com/docs/master/start/os-recommendations/. Should this page be updated in that case? I'm going to upgrade to 3.9. Should I update the Ceph cluster nodes as well or just the ceph block device client?

If you are going to be upgrading, you might as well upgrade to a later kernel. No, only the nodes that you want to do 'rbd map' on.

Thanks,
Ilya
Re: [ceph-users] Access denied error for list users
Hi,

GET /admin/user with no parameter doesn't work. You must use GET /admin/metadata/user to fetch the user list (with the metadata capability).

Alain

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of Shanil S
Sent: Tuesday, May 20, 2014 7:13 AM
To: ceph-users@lists.ceph.com; w...@42on.com; s...@inktank.com; Yehuda Sadeh
Subject: [ceph-users] Access denied error for list users

Hi,

I am trying to create and list all users by using the functions from http://ceph.com/docs/master/radosgw/adminops/. I successfully created the access tokens, but I am getting an access denied error (403) from the list users function. GET /{admin}/user is used for getting the complete user list, but it's not listing and I get the error. The user that called this function has full permissions; I am including its permissions here:

{ "type": "admin", "perm": "*"},
{ "type": "buckets", "perm": "*"},
{ "type": "caps", "perm": "*"},
{ "type": "metadata", "perm": "*"},
{ "type": "usage", "perm": "*"},
{ "type": "users", "perm": "*"}],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": { "enabled": false,
  "max_size_kb": -1,
  "max_objects": -1}}

This is in the log file for the list users call:

GET

application/x-www-form-urlencoded
Tue, 20 May 2014 05:06:57 GMT
/admin/user/
2014-05-20 13:06:59.506233 7f0497fa7700 15 calculated digest=Z8FgXRLk+ah5MUThpP9IBJrMnrA=
2014-05-20 13:06:59.506236 7f0497fa7700 15 auth_sign=Z8FgXRLk+ah5MUThpP9IBJrMnrA=
2014-05-20 13:06:59.506237 7f0497fa7700 15 compare=0
2014-05-20 13:06:59.506240 7f0497fa7700 2 req 98:0.000308::GET /admin/user/:get_user_info:reading permissions
2014-05-20 13:06:59.506244 7f0497fa7700 2 req 98:0.000311::GET /admin/user/:get_user_info:init op
2014-05-20 13:06:59.506247 7f0497fa7700 2 req 98:0.000314::GET /admin/user/:get_user_info:verifying op mask
2014-05-20 13:06:59.506249 7f0497fa7700 20 required_mask= 0 user.op_mask=7
2014-05-20 13:06:59.506251 7f0497fa7700 2 req 98:0.000319::GET /admin/user/:get_user_info:verifying op permissions
2014-05-20 13:06:59.506254 7f0497fa7700 2 req 98:0.000322::GET /admin/user/:get_user_info:verifying op params
2014-05-20 13:06:59.506257 7f0497fa7700 2 req 98:0.000324::GET /admin/user/:get_user_info:executing
2014-05-20 13:06:59.506291 7f0497fa7700 2 req 98:0.000359::GET /admin/user/:get_user_info:http status=403
2014-05-20 13:06:59.506294 7f0497fa7700 1 == req done req=0x7f04c800d7f0 http_status=403 ==
2014-05-20 13:06:59.506302 7f0497fa7700 20 process_request() returned -13

Could you please check what the issue is? I am using ceph version 0.80.1.
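Note that the 403 above is a capability/endpoint problem, not a signature one: the log shows compare=0, i.e. the digest radosgw calculated matched the client's auth_sign. For completeness, that digest is the standard S3 v2 signature, base64(HMAC-SHA1(secret_key, StringToSign)), over the request line shown in the log. A sketch, with a placeholder key and assuming no Content-MD5 and no x-amz-* headers:

```python
import base64
import hmac
from hashlib import sha1

def sign_v2(secret_key: str, method: str, content_type: str,
            date: str, resource: str) -> str:
    # StringToSign: METHOD \n Content-MD5 \n Content-Type \n Date \n resource
    # (Content-MD5 empty here; no canonicalized x-amz-* headers assumed.)
    string_to_sign = "\n".join([method, "", content_type, date, resource])
    mac = hmac.new(secret_key.encode(), string_to_sign.encode(), sha1)
    return base64.b64encode(mac.digest()).decode()

sig = sign_v2("notARealSecret", "GET", "application/x-www-form-urlencoded",
              "Tue, 20 May 2014 05:06:57 GMT", "/admin/user/")
print("Authorization: AWS <access-key>:" + sig)
```

If compare were non-zero the fix would be on this signing side; since it is zero, the fix is the endpoint and caps, as Alain says: use GET /admin/metadata/user with the metadata capability.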
Re: [ceph-users] Fwd: rbd map command hangs
Ilya, how exactly do I disable vary_r in the crushmap? I ran ceph osd crush tunables firefly, but that results in a feature set mismatch: my 384a042a42 < server's 2384a042a42, missing 200

On Tue, May 20, 2014 at 9:15 AM, Ilya Dryomov ilya.dryo...@inktank.com wrote:

On Tue, May 20, 2014 at 5:09 PM, Jay Janardhan jay.janard...@kaseya.com wrote:
Thanks again Ilya. I was following these recommendations: http://ceph.com/docs/master/start/os-recommendations/. Should this page be updated in that case? I'm going to upgrade to 3.9. Should I update the Ceph cluster nodes as well, or just the ceph block device client?

If you are going to be upgrading, you might as well upgrade to a later kernel. No, only the nodes that you want to do 'rbd map' on.

Thanks,
Ilya
[ceph-users] [radosgw] unable to perform any operation using s3 api
Dear ceph users!

I've been running into some issues trying to use the radosgw S3 API to talk to my ceph cluster. To avoid carrying over issues from older installations, I reinstalled everything from the ground up using 0.79 and 0.80.1, on both Debian and Ubuntu, with exactly the same results using Nginx or Apache as a fastcgi gateway. It seems to be related to some failing permission check I can't put my finger on, since the user's digest and tokens do match.

The following is a radosgw log for a 'create bucket' operation using the python boto S3 library on python2.7, which fails with a 403 (S3ResponseError: 403 Forbidden); I get the same issue with the s3cmd tool as well.

2014-05-20 11:34:08.630807 7f7853fff700 20 enqueued request req=0x7f784c00f050
2014-05-20 11:34:08.630826 7f7853fff700 20 RGWWQ:
2014-05-20 11:34:08.630827 7f7853fff700 20 req: 0x7f784c00f050
2014-05-20 11:34:08.630831 7f7853fff700 10 allocated request req=0x7f784c00f340
2014-05-20 11:34:08.630837 7f785a7fc700 20 dequeued request req=0x7f784c00f050
2014-05-20 11:34:08.630845 7f785a7fc700 20 RGWWQ: empty
2014-05-20 11:34:08.630884 7f785a7fc700 20 CONTENT_LENGTH=0
2014-05-20 11:34:08.630885 7f785a7fc700 20 CONTEXT_DOCUMENT_ROOT=/var/www/
2014-05-20 11:34:08.630885 7f785a7fc700 20 CONTEXT_PREFIX=
2014-05-20 11:34:08.630886 7f785a7fc700 20 DOCUMENT_ROOT=/var/www/
2014-05-20 11:34:08.630889 7f785a7fc700 20 FCGI_ROLE=RESPONDER
2014-05-20 11:34:08.630891 7f785a7fc700 20 GATEWAY_INTERFACE=CGI/1.1
2014-05-20 11:34:08.630892 7f785a7fc700 20 HTTP_ACCEPT_ENCODING=identity
2014-05-20 11:34:08.630892 7f785a7fc700 20 HTTP_AUTHORIZATION=AWS 5DFF8DCDXPK2AJ557N3J:F/exUe4uIjwJHRZC6+3MNPOBnIU=
2014-05-20 11:34:08.630893 7f785a7fc700 20 HTTP_DATE=Tue, 20 May 2014 14:34:20 GMT
2014-05-20 11:34:08.630893 7f785a7fc700 20 HTTP_HOST=serverIP
2014-05-20 11:34:08.630894 7f785a7fc700 20 HTTP_USER_AGENT=Boto/2.27.0 Python/2.7.6 Linux/3.14-1-amd64
2014-05-20 11:34:08.630894 7f785a7fc700 20 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
2014-05-20 11:34:08.630895 7f785a7fc700 20 QUERY_STRING=page=my-test-bucket-1params=/
2014-05-20 11:34:08.630895 7f785a7fc700 20 REMOTE_ADDR=clientIP
2014-05-20 11:34:08.630896 7f785a7fc700 20 REMOTE_PORT=50588
2014-05-20 11:34:08.630896 7f785a7fc700 20 REQUEST_METHOD=PUT
2014-05-20 11:34:08.630897 7f785a7fc700 20 REQUEST_SCHEME=http
2014-05-20 11:34:08.630897 7f785a7fc700 20 REQUEST_URI=/my-test-bucket-1/
2014-05-20 11:34:08.630898 7f785a7fc700 20 RGW_LOG_LEVEL=20
2014-05-20 11:34:08.630898 7f785a7fc700 20 RGW_PRINT_CONTINUE=yes
2014-05-20 11:34:08.630899 7f785a7fc700 20 RGW_SHOULD_LOG=yes
2014-05-20 11:34:08.630899 7f785a7fc700 20 SCRIPT_FILENAME=/var/www/s3gw.fcgi
2014-05-20 11:34:08.630900 7f785a7fc700 20 SCRIPT_NAME=/my-test-bucket-1/
2014-05-20 11:34:08.630900 7f785a7fc700 20 SCRIPT_URI=http://serverIP/my-test-bucket-1/
2014-05-20 11:34:08.630902 7f785a7fc700 20 SCRIPT_URL=/my-test-bucket-1/
2014-05-20 11:34:08.630903 7f785a7fc700 20 SERVER_ADDR=serverIP
2014-05-20 11:34:08.630903 7f785a7fc700 20 SERVER_ADMIN=webmaster@localhost
2014-05-20 11:34:08.630904 7f785a7fc700 20 SERVER_NAME=serverIP
2014-05-20 11:34:08.630904 7f785a7fc700 20 SERVER_PORT=80
2014-05-20 11:34:08.630905 7f785a7fc700 20 SERVER_PROTOCOL=HTTP/1.1
2014-05-20 11:34:08.630905 7f785a7fc700 20 SERVER_SIGNATURE=addressApache/2.4.7 (Ubuntu) Server at serverIP Port 80/address
2014-05-20 11:34:08.630906 7f785a7fc700 20 SERVER_SOFTWARE=Apache/2.4.7 (Ubuntu)
2014-05-20 11:34:08.630907 7f785a7fc700 1 == starting new request req=0x7f784c00f050 =
2014-05-20 11:34:08.630917 7f785a7fc700 2 req 13:0.10::PUT /my-test-bucket-1/::initializing
2014-05-20 11:34:08.630920 7f785a7fc700 10 host=serverIP rgw_dns_name=labs.mydomain.com
2014-05-20 11:34:08.630945 7f785a7fc700 10 s-object=NULL s-bucket=my-test-bucket-1
2014-05-20 11:34:08.630948 7f785a7fc700 2 req 13:0.41:s3:PUT /my-test-bucket-1/::getting op
*2014-05-20 11:34:08.630952 7f785a7fc700 2 req 13:0.45:s3:PUT /my-test-bucket-1/:create_bucket:authorizing*
2014-05-20 11:34:08.630979 7f785a7fc700 20 get_obj_state: rctx=0x7f780c0059f0 obj=.users:5DFF8DCDXPK2AJ557N3J state=0x7f780c005f18 s-prefetch_data=0
2014-05-20 11:34:08.630984 7f785a7fc700 10 cache get: name=.users+5DFF8DCDXPK2AJ557N3J : hit
2014-05-20 11:34:08.630988 7f785a7fc700 20 get_obj_state: s-obj_tag was set empty
2014-05-20 11:34:08.630992 7f785a7fc700 10 cache get: name=.users+5DFF8DCDXPK2AJ557N3J : hit
2014-05-20 11:34:08.631011 7f785a7fc700 20 get_obj_state: rctx=0x7f780c0059f0 obj=.users.uid:test123 state=0x7f780c006938 s-prefetch_data=0
2014-05-20 11:34:08.631014 7f785a7fc700 10 cache get: name=.users.uid+test123 : hit
2014-05-20 11:34:08.631016 7f785a7fc700 20 get_obj_state: s-obj_tag was set empty
2014-05-20 11:34:08.631018 7f785a7fc700 10 cache get: name=.users.uid+test123 : hit
2014-05-20 11:34:08.631047 7f785a7fc700 10 get_canon_resource():
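The log above shows rgw computing its own digest ("calculated digest") and comparing it to the signature from the Authorization header ("auth_sign"). A minimal sketch of what the client has to sign for this request; the secret key is made up, and the canonical resource is an assumption based on the REQUEST_URI shown in the log:

```python
import base64
import hmac
from hashlib import sha1

def s3_signature(secret_key, method, date, resource,
                 content_md5="", content_type=""):
    """HMAC-SHA1 over the S3 v2 canonical string; rgw recomputes the
    same value and compares it to the Authorization header."""
    to_sign = "\n".join([method, content_md5, content_type, date, resource])
    mac = hmac.new(secret_key.encode(), to_sign.encode(), sha1)
    return base64.b64encode(mac.digest()).decode()

# values mirroring the failing request in the log (secret is a placeholder)
sig = s3_signature("placeholder-secret", "PUT",
                   "Tue, 20 May 2014 14:34:20 GMT", "/my-test-bucket-1/")
print("Authorization: AWS 5DFF8DCDXPK2AJ557N3J:%s" % sig)
```

If the computed value matches the header but rgw still returns 403, the failure is in authorization (user caps) rather than authentication, which is exactly where the create_bucket:authorizing step in the log sits.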
Re: [ceph-users] Fwd: rbd map command hangs
On Tue, May 20, 2014 at 6:10 PM, Jay Janardhan jay.janard...@kaseya.com wrote:
Ilya, how exactly do I disable vary_r in the crushmap? I ran ceph osd crush tunables firefly but that is resulting in a feature set mismatch: my 384a042a42 < server's 2384a042a42, missing 200

ceph osd getcrushmap -o /tmp/crush
crushtool -i /tmp/crush --set-chooseleaf_vary_r 0 -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new

Thanks,
Ilya
Re: [ceph-users] 70+ OSD are DOWN and not coming up
On Tue, 20 May 2014, Karan Singh wrote:
Hello Cephers, need your suggestion for troubleshooting. My cluster is terribly struggling: 70+ OSDs are down out of 165.

Problem: OSDs are getting marked out of the cluster and are down. The cluster is degraded. On checking the logs of a failed OSD, we see weird entries that are continuously being generated.

Tracking this at http://tracker.ceph.com/issues/8387

The most recent bits you posted in the ticket don't quite make sense: the OSD is trying to connect to an address for an OSD that is currently marked down. I suspect this is just timing between when the logs were captured and when the ceph osd dump was captured. To get a complete picture, please:

1) add debug osd = 20 and debug ms = 1 in [osd] and restart all osds
2) ceph osd set nodown (to prevent flapping)
3) find some OSD that is showing these messages
4) capture a 'ceph osd dump' output

Also happy to debug this interactively over IRC; that will likely be faster!

Thanks-
sage

OSD debug logs: http://pastebin.com/agTKh6zB

1. 2014-05-20 10:19:03.699886 7f2328e237a0 0 osd.158 357532 done with init, starting boot process
2. 2014-05-20 10:19:03.700093 7f22ff621700 0 -- 192.168.1.112:6802/3807 192.168.1.109:6802/910005982 pipe(0x8698500 sd=35 :33500 s=1 pgs=0 cs=0 l=0 c=0x83018c0).connect claims to be 192.168.1.109:6802/63896 not 192.168.1.109:6802/910005982 - wrong node!
3. 2014-05-20 10:19:03.700152 7f22ff621700 0 -- 192.168.1.112:6802/3807 192.168.1.109:6802/910005982 pipe(0x8698500 sd=35 :33500 s=1 pgs=0 cs=0 l=0 c=0x83018c0).fault with nothing to send, going to standby
4. 2014-05-20 10:19:09.551269 7f22fdd12700 0 -- 192.168.1.112:6802/3807 192.168.1.109:6803/1176009454 pipe(0x56aee00 sd=53 :40060 s=1 pgs=0 cs=0 l=0 c=0x533fd20).connect claims to be 192.168.1.109:6803/63896 not 192.168.1.109:6803/1176009454 - wrong node!
5. 2014-05-20 10:19:09.551347 7f22fdd12700 0 -- 192.168.1.112:6802/3807 192.168.1.109:6803/1176009454 pipe(0x56aee00 sd=53 :40060 s=1 pgs=0 cs=0 l=0 c=0x533fd20).fault with nothing to send, going to standby
6. 2014-05-20 10:19:09.703901 7f22fd80d700 0 -- 192.168.1.112:6802/3807 192.168.1.113:6802/13870 pipe(0x56adf00 sd=137 :42889 s=1 pgs=0 cs=0 l=0 c=0x8302aa0).connect claims to be 192.168.1.113:6802/24612 not 192.168.1.113:6802/13870 - wrong node!
7. 2014-05-20 10:19:09.704039 7f22fd80d700 0 -- 192.168.1.112:6802/3807 192.168.1.113:6802/13870 pipe(0x56adf00 sd=137 :42889 s=1 pgs=0 cs=0 l=0 c=0x8302aa0).fault with nothing to send, going to standby
8. 2014-05-20 10:19:10.243139 7f22fd005700 0 -- 192.168.1.112:6802/3807 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).connect claims to be 192.168.1.112:6800/2852 not 192.168.1.112:6800/14114 - wrong node!
9. 2014-05-20 10:19:10.243190 7f22fd005700 0 -- 192.168.1.112:6802/3807 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).fault with nothing to send, going to standby
10. 2014-05-20 10:19:10.349693 7f22fc7fd700 0 -- 192.168.1.112:6802/3807 192.168.1.109:6800/13492 pipe(0x8698c80 sd=156 :0 s=1 pgs=0 cs=0 l=0 c=0x83070c0).fault with nothing to send, going to standby

# ceph -v
ceph version 0.80-469-g991f7f1 (991f7f15a6e107b33a24bbef1169f21eb7fcce2c)

# ceph osd stat
osdmap e357073: 165 osds: 91 up, 165 in flags noout

I have tried:

1. Restarting the problematic OSDs, but no luck.
2. Restarting the entire host, but no luck; the OSDs are still down and logging the same messages:

1. 2014-05-20 10:19:10.243139 7f22fd005700 0 -- 192.168.1.112:6802/3807 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).connect claims to be 192.168.1.112:6800/2852 not 192.168.1.112:6800/14114 - wrong node!
2. 2014-05-20 10:19:10.243190 7f22fd005700 0 -- 192.168.1.112:6802/3807 192.168.1.112:6800/14114 pipe(0x56a8f00 sd=146 :43726 s=1 pgs=0 cs=0 l=0 c=0x8304780).fault with nothing to send, going to standby
3. 2014-05-20 10:19:10.349693 7f22fc7fd700 0 -- 192.168.1.112:6802/3807 192.168.1.109:6800/13492 pipe(0x8698c80 sd=156 :0 s=1 pgs=0 cs=0 l=0 c=0x83070c0).fault with nothing to send, going to standby
4. 2014-05-20 10:22:23.312473 7f2307e61700 0 osd.158 357781 do_command r=0
5. 2014-05-20 10:22:23.326110 7f2307e61700 0 osd.158 357781 do_command r=0 debug_osd=0/5
6. 2014-05-20 10:22:23.326123 7f2307e61700 0 log [INF] : debug_osd=0/5
7. 2014-05-20 10:34:08.161864 7f230224d700 0 -- 192.168.1.112:6802/3807 192.168.1.102:6808/13276 pipe(0x8698280 sd=22 :41078 s=2 pgs=603 cs=1 l=0 c=0x8301600).fault with nothing to send, going to standby

3. Disks do not have errors; no messages in dmesg or /var/log/messages.
4. there was a bug in the past
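The "wrong node" lines encode two addresses: the identity the peer actually claimed and the address (with nonce) the connecting OSD expected from its osdmap. A small parsing sketch, assuming the log format shown above, that pulls out the mismatched pairs; a differing nonce on the same ip:port usually means the peer restarted while the local map is stale:

```python
import re

# matches "...connect claims to be <addr> not <addr> - wrong node!"
WRONG_NODE = re.compile(r"connect claims to be (\S+) not (\S+) - wrong node!")

def find_wrong_nodes(log_lines):
    """Return (claimed, expected) address pairs from 'wrong node' entries.

    Addresses are ip:port/nonce; the nonce changes each time an OSD
    process restarts."""
    pairs = []
    for line in log_lines:
        m = WRONG_NODE.search(line)
        if m:
            pairs.append((m.group(1), m.group(2)))
    return pairs

# one line from the log above, trimmed for illustration
log = [
    "2014-05-20 10:19:03.700093 7f22ff621700 0 pipe(...)."
    "connect claims to be 192.168.1.109:6802/63896 "
    "not 192.168.1.109:6802/910005982 - wrong node!",
]
for claimed, expected in find_wrong_nodes(log):
    print("peer is %s, osdmap still says %s" % (claimed, expected))
```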
Re: [ceph-users] RBD for ephemeral
On Mon, May 19, 2014 at 2:27 PM, Pierre Grandin pierre.gran...@tubemogul.com wrote:

With the help of Josh on IRC, we found that the glance_api_version directive actually has to be in the [default] block of your cinder.conf (I happen to have two storage backends, and this directive was in my rbd block). After fixing this config, my volumes created via cinder are indeed COW.

Now I need to figure out why nova is still doing rbd --import... I'm not sure if the v1 url below is correct... Any idea?

location: http://172.16.128.223:9292/v1/images/6850844a-2899-4bc6-a957-de9705d56130

So using glance --os-image-api-version 2 I was able to confirm that glance does serve direct urls. After digging some more, I've narrowed it down to this call:

https://github.com/jdurgin/nova/blob/havana-ephemeral-rbd/nova/virt/libvirt/imagebackend.py#L787-792

The clone seems to fail, as the volume is not created, but I have nothing in my logs. I've added some debugging and the parameters seem to be correct:

[...] rbd c29a96da-5165-4190-9727-0b728648f6a6_disk
[...] pool images
[...] image 55542add-6dca-4a01-8f17-bc4aed962f5e
[...] snapshot snap

No exception is raised according to my logs. I've also read https://ceph.com/docs/master/rbd/librbdpy/ but I'm not really clear about how to create the contexts, so I can't reproduce it from a simple python script right now. Any help appreciated!

Pierre
Re: [ceph-users] Problem with ceph_filestore_dump, possibly stuck in a loop
It isn't clear to me what could cause a loop there. Just to be sure you don't have filesystem corruption, please try to run a "find" or "ls -R" on the filestore root directory to be sure it completes. Can you send the log you generated? Also, what version of Ceph are you running?

David Zafman
Senior Developer
http://www.inktank.com

On May 16, 2014, at 6:20 AM, Jeff Bachtel jbach...@bericotechnologies.com wrote:

Overnight, I tried to use ceph_filestore_dump to export a pg (one that is missing from the other osds) from an osd, with the intent of manually copying the export to the osds in the pg map and importing it. Unfortunately, what is 59 GB of data on disk had filled 1 TB when I got in this morning, and still hadn't completed.

Is it possible for a loop to develop in a ceph_filestore_dump export? My C++ isn't the best, but I can see in ceph_filestore_dump.cc int export_files that a loop could occur if a broken collection was read, possibly. Maybe. --debug output seems to confirm:

grep '^read' /tmp/ceph_filestore_dump.out | sort | wc -l ; grep '^read' /tmp/ceph_filestore_dump.out | sort | uniq | wc -l
2714
258

(only 258 unique reads are being reported, but each repeated 10 times so far)

From the start of the debug output:

Supported features: compat={},rocompat={},incompat={1=initial feature set(~v.18),2=pginfo object,3=object locator,4=last_epoch_clean,5=categories,6=hobjectpool,7=biginfo,8=leveldbinfo,9=leveldblog,10=snapmapper,11=sharded objects}
On-disk features: compat={},rocompat={},incompat={1=initial feature set(~v.18),2=pginfo object,3=object locator,4=last_epoch_clean,5=categories,6=hobjectpool,7=biginfo,8=leveldbinfo,9=leveldblog,10=snapmapper}
Exporting 0.2f
read 8210002f/100d228.00019150/head//0
size=4194304
data section offset=1048576 len=1048576
data section offset=2097152 len=1048576
data section offset=3145728 len=1048576
data section offset=4194304 len=1048576
attrs size 2

then at line 1810:

read 8210002f/100d228.00019150/head//0
size=4194304
data section offset=1048576 len=1048576
data section offset=2097152 len=1048576
data section offset=3145728 len=1048576
data section offset=4194304 len=1048576
attrs size 2

If this is a loop due to a broken filestore, is there any recourse for repairing it? The osd I'm trying to dump from isn't in the pg map for the cluster; I'm trying to save some data by exporting this version of the pg and importing it on an osd that's mapped. If I'm failing at a basic premise in even trying that, please let me know so I can wave off (in which case, I believe I'd use ceph_filestore_dump to delete all copies of this pg in the cluster so I can force-create it, which is failing at this time).

Thanks,
Jeff
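The grep/sort/uniq check above can be turned into a small script that reports which objects repeat. A sketch, assuming debug output where each object read starts with "read " as in the excerpt:

```python
from collections import Counter

def repeated_reads(debug_lines):
    """Count 'read <object>' lines; any object seen more than once is a
    candidate sign of the export looping over the same objects."""
    counts = Counter(
        line.split(None, 1)[1].strip()
        for line in debug_lines
        if line.startswith("read "))
    return {obj: n for obj, n in counts.items() if n > 1}

# trimmed sample in the shape of the --debug output above
sample = [
    "read 8210002f/100d228.00019150/head//0",
    "data section offset=1048576 len=1048576",
    "read 8210002f/100d228.00019150/head//0",
    "read 8210002f/100d229.00000001/head//0",
]
print(repeated_reads(sample))
```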
Re: [ceph-users] Expanding pg's of an erasure coded pool
This failure means the messenger subsystem is trying to create a thread and is getting an error code back — probably due to a process or system thread limit that you can turn up with ulimit. This is happening because a replicated PG primary needs a connection to only its replicas (generally 1 or 2 connections), but with an erasure-coded PG the primary requires a connection to m+n-1 replicas (everybody who's in the erasure-coding set, including itself). Right now our messenger requires a thread for each connection, so kerblam. (And it actually requires a couple such connections because we have separate heartbeat, cluster data, and client data systems.)

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, May 20, 2014 at 3:43 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote:

Hi,

On a setup of 400 OSDs (20 nodes, with 20 OSDs per node), I first tried to create an erasure-coded pool with 4096 pgs, but this crashed the cluster. I then started with 1024 pgs and expanded to 2048 (pg_num and pgp_num); when I then try to expand to 4096 (not even quite enough), the cluster crashes again. (Do we need fewer pgs with erasure coding?)

The crash starts with individual OSDs crashing, eventually bringing down the mons (until there is no more quorum or too few osds). From the logs:

-16 2014-05-20 10:31:55.545590 7fd42f34d700 5 -- op tracker -- , seq: 14301, time: 2014-05-20 10:31:55.545590, event: started, request: pg_query(0.974 epoch 3315) v3
-15 2014-05-20 10:31:55.545776 7fd42f34d700 1 -- 130.246.178.141:6836/10446 -- 130.246.179.191:6826/21854 -- pg_notify(0.974 epoch 3326) v5 -- ?+0 0xc8b4ec0 con 0x9026b40
-14 2014-05-20 10:31:55.545807 7fd42f34d700 5 -- op tracker -- , seq: 14301, time: 2014-05-20 10:31:55.545807, event: done, request: pg_query(0.974 epoch 3315) v3
-13 2014-05-20 10:31:55.559661 7fd3fdb0f700 1 -- 130.246.178.141:6837/10446 :/0 pipe(0xce0c380 sd=468 :6837 s=0 pgs=0 cs=0 l=0 c=0x1255f0c0).accept sd=468 130.246.179.191:60618/0
-12 2014-05-20 10:31:55.564034 7fd3bf72f700 1 -- 130.246.178.141:6838/10446 :/0 pipe(0xe3f2300 sd=596 :6838 s=0 pgs=0 cs=0 l=0 c=0x129b5ee0).accept sd=596 130.246.179.191:43913/0
-11 2014-05-20 10:31:55.627776 7fd42df4b700 1 -- 130.246.178.141:0/10446 == osd.170 130.246.179.191:6827/21854 3 osd_ping(ping_reply e3316 stamp 2014-05-20 10:31:52.994368) v2 47+0+0 (855262282 0 0) 0xb6863c0 con 0x1255b9c0
-10 2014-05-20 10:31:55.629425 7fd42df4b700 1 -- 130.246.178.141:0/10446 == osd.170 130.246.179.191:6827/21854 4 osd_ping(ping_reply e3316 stamp 2014-05-20 10:31:53.509621) v2 47+0+0 (2581193378 0 0) 0x93d6c80 con 0x1255b9c0
-9 2014-05-20 10:31:55.631270 7fd42f34d700 1 -- 130.246.178.141:6836/10446 == osd.169 130.246.179.191:6841/25473 2 pg_query(7.3ffs6 epoch 3326) v3 144+0+0 (221596234 0 0) 0x10b994a0 con 0x9383860
-8 2014-05-20 10:31:55.631308 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631130, event: header_read, request: pg_query(7.3ffs6 epoch 3326) v3
-7 2014-05-20 10:31:55.631315 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631133, event: throttled, request: pg_query(7.3ffs6 epoch 3326) v3
-6 2014-05-20 10:31:55.631339 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631207, event: all_read, request: pg_query(7.3ffs6 epoch 3326) v3
-5 2014-05-20 10:31:55.631343 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631303, event: dispatched, request: pg_query(7.3ffs6 epoch 3326) v3
-4 2014-05-20 10:31:55.631349 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631349, event: waiting_for_osdmap, request: pg_query(7.3ffs6 epoch 3326) v3
-3 2014-05-20 10:31:55.631363 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631363, event: started, request: pg_query(7.3ffs6 epoch 3326) v3
-2 2014-05-20 10:31:55.631402 7fd42f34d700 5 -- op tracker -- , seq: 14302, time: 2014-05-20 10:31:55.631402, event: done, request: pg_query(7.3ffs6 epoch 3326) v3
-1 2014-05-20 10:31:55.631488 7fd427b41700 1 -- 130.246.178.141:6836/10446 -- 130.246.179.191:6841/25473 -- pg_notify(7.3ffs6(14) epoch 3326) v5 -- ?+0 0xcc7b9c0 con 0x9383860
0 2014-05-20 10:31:55.632127 7fd42cb49700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7fd42cb49700 time 2014-05-20 10:31:55.630937
common/Thread.cc: 110: FAILED assert(ret == 0)

ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
1: (Thread::create(unsigned long)+0x8a) [0xa83f8a]
2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xa2a6aa]
3: (Accepter::entry()+0x265) [0xb3ca45]
4: (()+0x79d1) [0x7fd4436b19d1]
5: (clone()+0x6d) [0x7fd4423ecb6d]
--- begin dump of recent events ---
0 2014-05-20 10:31:56.622247 7fd3bc5fe700 -1 *** Caught signal (Aborted) ** in thread
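Greg's point can be made concrete with a rough back-of-the-envelope calculation. A sketch under the assumptions stated in the comments (one thread per connection, and a small number of separate messenger subsystems); the multipliers are illustrative, not measured:

```python
def threads_estimate(pg_memberships, set_size, messengers=3):
    """Very rough estimate of peer-connection threads on one OSD.

    Assumptions: each PG membership keeps (set_size - 1) peer
    connections, one thread per connection, multiplied by the number
    of messenger subsystems (heartbeat / cluster / client)."""
    return pg_memberships * (set_size - 1) * messengers

# ~100 PG memberships per OSD in both scenarios
replicated = threads_estimate(100, 3)    # size-3 replicated pool
erasure = threads_estimate(100, 10)      # k+m = 10 erasure-coded pool
print(replicated, erasure)
```

Even under these crude assumptions the EC pool needs several times more threads per OSD, which is why the Accepter's pthread_create can start failing against the default process thread limit (ulimit -u) once pg_num grows.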
Re: [ceph-users] RBD for ephemeral
On Tue, May 20, 2014 at 10:14 AM, Pierre Grandin pierre.gran...@tubemogul.com wrote:
After digging some more, i've narrowed to this call: https://github.com/jdurgin/nova/blob/havana-ephemeral-rbd/nova/virt/libvirt/imagebackend.py#L787-792 The clone seems to fail, as the volume is not created, but i have nothing in my logs. No exception is raised according to my logs, and I'm not really clear about how to create the contexts so I can't try to reproduce it from a simple python script right now.

OK, I found the error in my code. If I do the snapshot clone with the image and volume from my previous mail, it works:

import rbd
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
src_ioctx = cluster.open_ioctx('images')
dest_ioctx = cluster.open_ioctx('volumes')
rbd.RBD().clone(src_ioctx, '55542add-6dca-4a01-8f17-bc4aed962f5e', 'snap',
                dest_ioctx, 'c29a96da-5165-4190-9727-0b728648f6a6_disk',
                features=rbd.RBD_FEATURE_LAYERING)

# rbd -p volumes info c29a96da-5165-4190-9727-0b728648f6a6_disk
rbd image 'c29a96da-5165-4190-9727-0b728648f6a6_disk':
    size 2048 MB in 256 objects
    order 23 (8192 kB objects)
    block_name_prefix: rbd_data.1d4886b8b4567
    format: 2
    features: layering
    parent: images/55542add-6dca-4a01-8f17-bc4aed962f5e@snap
    overlap: 2048 MB

Now I need to find out why nova believes it did not work...

--
Pierre
[ceph-users] nginx (tengine) and radosgw
I've just finished converting from nginx/radosgw to tengine/radosgw, and it's fixed all the weird issues I was seeing (uploads failing, random clock skew errors, timeouts).

The problem with nginx and radosgw is that nginx insists on buffering all the uploads to disk. This causes a significant performance hit, and prevents larger uploads from working. Supposedly, there is going to be an option in nginx to disable this, but it hasn't been released yet (nor do I see anything on the nginx devel list about it).

tengine ( http://tengine.taobao.org/ ) is an nginx fork that implements unbuffered uploads to fastcgi. It's basically a drop-in replacement for nginx. My configuration looks like this:

server {
    listen 80;
    server_name *.rados.test rados.test;
    client_max_body_size 10g;
    # This is the important option that tengine has, but nginx does not
    fastcgi_request_buffering off;

    location / {
        fastcgi_pass_header Authorization;
        fastcgi_pass_request_headers on;
        if ($request_method = PUT) {
            rewrite ^ /PUT$request_uri;
        }
        include fastcgi_params;
        fastcgi_pass unix:/path/to/ceph.radosgw.fastcgi.sock;
    }

    location /PUT/ {
        internal;
        fastcgi_pass_header Authorization;
        fastcgi_pass_request_headers on;
        include fastcgi_params;
        fastcgi_param CONTENT_LENGTH $content_length;
        fastcgi_pass unix:/path/to/ceph.radosgw.fastcgi.sock;
    }
}

If anyone else is looking to run radosgw without having to run apache, I would recommend you look into tengine :)
[ceph-users] issues with creating Swift users for radosgw
Hello

I've been experimenting with radosgw and I've had no issues with the S3 interface; however, I cannot get a subuser created for use with the Swift API.

First I created a user:

root@ceph1:~# radosgw-admin user create --uid=shw --display-name="Simon Weald"
{ "user_id": "shw",
  "display_name": "Simon Weald",
  "email": "",
  "suspended": 0,
  "max_buckets": 1000,
  "auid": 0,
  "subusers": [],
  "keys": [
    { "user": "shw",
      "access_key": "1WFY4I8I152WX8P74NZ7",
      "secret_key": "AkYBun7GubMaJq+IV4\/Rd904gkThrTVTLnhDATNm"}],
  "swift_keys": [],
  "caps": [],
  "op_mask": "read, write, delete",
  "default_placement": "",
  "placement_tags": [],
  "bucket_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1},
  "user_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1},
  "temp_url_keys": []}

Then I created a subuser:

root@ceph1:~# radosgw-admin subuser create --uid=shw --subuser=shw:swift --access=full
{ "user_id": "shw",
  "display_name": "Simon Weald",
  "email": "",
  "suspended": 0,
  "max_buckets": 1000,
  "auid": 0,
  "subusers": [
    { "id": "shw:swift",
      "permissions": "full-control"}],
  "keys": [
    { "user": "shw",
      "access_key": "1WFY4I8I152WX8P74NZ7",
      "secret_key": "AkYBun7GubMaJq+IV4\/Rd904gkThrTVTLnhDATNm"},
    { "user": "shw:swift",
      "access_key": "QJDYHDW1E63ZU0B75Z3P",
      "secret_key": ""}],
  "swift_keys": [],
  "caps": [],
  "op_mask": "read, write, delete",
  "default_placement": "",
  "placement_tags": [],
  "bucket_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1},
  "user_quota": { "enabled": false, "max_size_kb": -1, "max_objects": -1},
  "temp_url_keys": []}

The issue comes when trying to create a secret key for the subuser:

root@ceph1:~# radosgw-admin key create --subuser=shw:swift --key-type=swift --gen-secret
2014-05-20 20:13:29.460167 7f579bed5700 0 -- :/1004375 10.16.116.14:6789/0 pipe(0x1f94240 sd=3 :0 s=1 pgs=0 cs=0
could not create key: unable to add access key, unable to store user info
2014-05-20 20:13:32.530032 7f57a5e7a780 0 )

I'm running Firefly on Wheezy. Thanks!
--
PGP key - http://www.simonweald.com/simonweald.asc
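As an aside, once key creation succeeds, the generated secret lands in the swift_keys array of the user-info JSON shown above. A small sketch for pulling the swift secret out of that output; the JSON below is a trimmed, made-up sample in the same shape as the radosgw-admin output:

```python
import json

def swift_secrets(user_info_json):
    """Map each swift subuser to its secret key from radosgw-admin
    user-info style JSON."""
    info = json.loads(user_info_json)
    return {k["user"]: k["secret_key"] for k in info.get("swift_keys", [])}

# trimmed sample in the shape of the radosgw-admin output (secret is made up)
sample = """{
  "user_id": "shw",
  "swift_keys": [
    { "user": "shw:swift", "secret_key": "EXAMPLESECRETKEY" }
  ]
}"""
print(swift_secrets(sample))
```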
Re: [ceph-users] nginx (tengine) and radosgw
That looks very interesting indeed. I've tried to use nginx, but from what I recall it had some SSL-related issues. Have you tried to make SSL work, so that nginx acts as an SSL proxy in front of the radosgw?

Cheers
Andrei

----- Original Message -----
From: Brian Rak b...@gameservers.com
To: ceph-users@lists.ceph.com
Sent: Tuesday, 20 May, 2014 9:11:58 PM
Subject: [ceph-users] nginx (tengine) and radosgw

I've just finished converting from nginx/radosgw to tengine/radosgw, and it's fixed all the weird issues I was seeing (uploads failing, random clock skew errors, timeouts). The problem with nginx and radosgw is that nginx insists on buffering all the uploads to disk. This causes a significant performance hit, and prevents larger uploads from working. Supposedly, there is going to be an option in nginx to disable this, but it hasn't been released yet (nor do I see anything on the nginx devel list about it). tengine ( http://tengine.taobao.org/ ) is an nginx fork that implements unbuffered uploads to fastcgi. It's basically a drop-in replacement for nginx. If anyone else is looking to run radosgw without having to run apache, I would recommend you look into tengine :)
Re: [ceph-users] nginx (tengine) and radosgw
I haven't tried SSL yet. We currently don't have a wildcard certificate for this, so it hasn't been a concern (and in our current use case, all the files are public anyway).

On 5/20/2014 4:26 PM, Andrei Mikhailovsky wrote:
That looks very interesting indeed. I've tried to use nginx, but from what I recall it had some ssl related issues. Have you tried to make the ssl work so that nginx acts as an ssl proxy in front of the radosgw?

Cheers
Andrei
Re: [ceph-users] How do I do deep-scrub manually?
If deep-scrubbing is causing a performance problem, you're going to be very unhappy during recovery. I think you'd be better off improving performance so that you can handle normal deep-scrubbing. If you still want to proceed, I wouldn't set the nodeep-scrub flag. I'd adjust the scrubbing variables: osd scrub load threshold osd scrub min interval osd scrub max interval osd deep scrub interval (See https://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing) If you increase the deep scrub interval to something like a month, then run a weekly cron, that should solve your problem without the warning. If the cron fails, deep-scrubbing will still happen once a month. Deep-scrubbing manually is relatively easy, run ceph pg deep-scrub pg_id. You can get a list of the pg_id's from ceph pg dump. If you're doing that in a script, you might prefer ceph pg dump --format=json. The pg dump contains last_scrub, scrub_stamp, last_deep_scrub, and deep_scrub_stamp for each PG. You might want to sort the PGs by last_deep_scrub, and do the oldest one first. There's a lot of information here that your script can use to make this manually scrubbing work better for you. On 5/20/14 02:29 , Ta Ba Tuan wrote: Dear Yang, I planed set nodeep-scrub at nigh daily by crontab. and with error HEALTH_WARN nodeep-scrub flag(s) set. I only concentrate messages from the monitoring tool (vd: nagios) = and I re-writed nagios'checkscript to with message HEALTH_WARN nodeep-scrub flag(s) set returns code = 0. On 05/20/2014 10:47 AM, Jianing Yang wrote: I found that deep scrub has a significant impact on my cluster. I've used ceph osd set nodeep-scrub disable it. But I got an error HEALTH_WARN nodeep-scrub flag(s) set. What is the proper way to disable deep scrub? and how can I run it manually? -- _ / Install 'denyhosts' to help protect \ | against brute force SSH attacks,| \ auto-blocking multiple attempts./ - \ \ \ .- O -..--. ,---. .-==-. 
--
Craig Lewis
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com
Central Desktop
[ceph-users] PG Selection Criteria for Deep-Scrub
Today I noticed that deep-scrub is consistently missing some of my Placement Groups, leaving me with the following distribution of PGs and the last day they were successfully deep-scrubbed:

    # ceph pg dump all | grep active | awk '{ print $20}' | sort -k1 | uniq -c
          5 2013-11-06
        221 2013-11-20
          1 2014-02-17
         25 2014-02-19
         60 2014-02-20
          4 2014-03-06
          3 2014-04-03
          6 2014-04-04
          6 2014-04-05
         13 2014-04-06
          4 2014-04-08
          3 2014-04-10
          2 2014-04-11
         50 2014-04-12
         28 2014-04-13
         14 2014-04-14
          3 2014-04-15
         78 2014-04-16
         44 2014-04-17
          8 2014-04-18
          1 2014-04-20
         16 2014-05-02
         69 2014-05-04
        140 2014-05-05
        569 2014-05-06
       9231 2014-05-07
        103 2014-05-08
        514 2014-05-09
       1593 2014-05-10
        393 2014-05-16
       2563 2014-05-17
       1283 2014-05-18
       1640 2014-05-19
       1979 2014-05-20

I have been running the default osd deep scrub interval of once per week, but I have disabled deep-scrub on several occasions in an attempt to avoid the associated degraded cluster performance I have written about before.

To get the PGs longest in need of a deep-scrub started, I set the nodeep-scrub flag and wrote a script to manually kick off deep-scrubs according to age. It is processing as expected.

Do you consider this a feature request or a bug? Perhaps the code that schedules PGs to deep-scrub could be improved to prioritize the PGs that have needed a deep-scrub the longest.

Thanks,
Mike Dawson
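The awk pipeline above picks the timestamp by column number, which can shift between Ceph versions. Parsing the JSON dump instead keys on field names, so the same histogram survives header changes. As elsewhere, the "pg_stats" and "last_deep_scrub_stamp" key names are assumptions to verify against your version's actual JSON output.

```python
#!/usr/bin/env python
"""Sketch: histogram of PGs per last-deep-scrub date, parsed from JSON.

Assumes "ceph pg dump --format=json" exposes a "pg_stats" list with a
"last_deep_scrub_stamp" per PG; verify the key names on your version.
"""
import json
import subprocess
import sys
from collections import Counter


def deep_scrub_histogram(pg_stats):
    """Count PGs per deep-scrub date ("YYYY-MM-DD"), oldest date first."""
    dates = Counter(pg["last_deep_scrub_stamp"].split()[0]
                    for pg in pg_stats)
    return sorted(dates.items())


if __name__ == "__main__" and "--run" in sys.argv:
    dump = json.loads(subprocess.check_output(
        ["ceph", "pg", "dump", "--format=json"]))
    for day, count in deep_scrub_histogram(dump["pg_stats"]):
        # Mimic the "uniq -c" layout: count, then date.
        print("%7d %s" % (count, day))
```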
Re: [ceph-users] Ceph booth in Paris at solutionlinux.fr
It was nice to meet you guys! I'll try to come to the next meetup (or any Ceph workshop). It would be fantastic to have some kind of full-day Ceph meetup in the future; there were too many questions for this (too) short meetup ;)

See you soon,

Regards,
Alexandre

----- Original Message -----
From: Loic Dachary l...@dachary.org
To: ceph-users ceph-users@lists.ceph.com
Cc: ERIC MOURGAYA VIRAPATRIN eric.mourgaya-virapat...@arkea.com, Christophe Courtaut christophe.court...@cloudwatt.com
Sent: Tuesday, 20 May 2014 22:18:17
Subject: [ceph-users] Ceph booth in Paris at solutionlinux.fr

Hi Ceph,

The first Ceph User Committee exhibition booth has been a fantastic experience. Guilhem, Christophe and Eric were there most of the time and it was very busy. Not only did we get to talk with Ceph users who attended our previous meetups, it was also an opportunity to explain Ceph to visitors who discovered it on this occasion. The weird Ceph + GlusterFS glasses brought back from Atlanta intrigued people (Eric Mourgaya wears them in the attached picture). The goodies were also appreciated, and we barely have enough left for tomorrow. Last but not least, the booth was invaded by the overwhelming amount of cheese and wine coming from the adjacent booths, http://april.org/ and http://openstack.fr/. The after party went on for hours after the exhibition closed, until everybody was gently kicked out by security.

Cheers

--
Loïc Dachary, Artisan Logiciel Libre
Re: [ceph-users] PG Selection Criteria for Deep-Scrub
For what it's worth, version 0.79 has different headers, and the awk command needs $19 instead of $20. But here is the output I have on a small cluster that I recently rebuilt:

    $ ceph pg dump all | grep active | awk '{ print $19}' | sort -k1 | uniq -c
    dumped all in format plain
          1 2014-05-15
          2 2014-05-17
         19 2014-05-18
        193 2014-05-19
        105 2014-05-20

I have set noscrub and nodeep-scrub, as well as noout and nodown, off and on while I performed various maintenance, but that hasn't (apparently) impeded the regular schedule. With what frequency are you setting the nodeep-scrub flag?

-Aaron

On Tue, May 20, 2014 at 5:21 PM, Mike Dawson mike.daw...@cloudapt.com wrote:
> Today I noticed that deep-scrub is consistently missing some of my Placement Groups, leaving me with the following distribution of PGs and the last day they were successfully deep-scrubbed. [...]
Re: [ceph-users] PG Selection Criteria for Deep-Scrub
I tend to set it whenever I don't want to be bothered by storage performance woes (nights I value sleep, etc.). This cluster is bounded by relentless small writes (it has a couple dozen rbd volumes backing video surveillance DVRs). Some of the software we run is completely unaffected, whereas other software falls apart during periods of deep-scrub. I theorize it has to do with each application's attitude toward flushing to disk and buffering.

- Mike

On 5/20/2014 8:31 PM, Aaron Ten Clay wrote:
> For what it's worth, version 0.79 has different headers, and the awk command needs $19 instead of $20. [...]
> With what frequency are you setting the nodeep-scrub flag?
> -Aaron