[ceph-users] All SSD storage and journals

2014-10-24 Thread Christian Balzer

Hello,

as others have reported in the past and now having tested things here
myself, there really is no point in having journals for SSD backed OSDs on
other SSDs.

It is a zero sum game, because:
a) using that journal SSD as another OSD with integrated journal will
yield the same overall result performance wise, if all SSDs are the same.
And In addition its capacity will be made available for actual storage.
b) if the journal SSD is faster than the OSD SSDs it tends to be priced
accordingly. For example the DC P3700 400GB is about twice as fast (write)
and expensive as the DC S3700 400GB.

Things _may_ be different if one doesn't look at bandwidth but IOPS (though
certainly not in the near future in regard to Ceph actually getting SSDs
busy), but even there the difference is negligible when for example
comparing the Intel S and P models in write performance.
Reads are another thing, but nobody cares about those in journals. ^o^

Obvious things that come to mind in this context would be the ability to
disable journals (difficult, I know, not touching BTRFS, thank you) and
probably K/V store in the future.

Regards,

Christian
-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] can we deploy multi-rgw on one ceph cluster?

2014-10-24 Thread yuelongguang
hi,yehuda
 
1.
can we deploy multi-rgws on one ceph cluster?
if so  does it bring us any problems? 
 
2. what is the major difference between apache and civetweb? 
what is  civetweb's advantage? 
 
thanks


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] get/put files with radosgw once MDS crash

2014-10-24 Thread 廖建锋
dear cepher,
 Today, I use mds to put/get files from ceph storgate cluster as it is 
very easy to use for each side of a company.
But ceph mds is not very stable, So my question:
is it possbile to get the file name and contentes from OSD with radosgw 
once MDS crash and how ?



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Continuous OSD crash with kv backend (firefly)

2014-10-24 Thread Andrey Korolyov
Hi,

during recovery testing on a latest firefly with leveldb backend we
found that the OSDs on a selected host may crash at once, leaving
attached backtrace. In other ways, recovery goes more or less smoothly
for hours.

Timestamps shows how the issue is correlated between different
processes on same node:

core.ceph-osd.25426.node01.1414148261
core.ceph-osd.25734.node01.1414148263
core.ceph-osd.25566.node01.1414148345

The question is about kv backend state in Firefly - is it considered
stable enough to run production test against it or we should better
move to giant/master for this?

Thanks!
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type show copying
and show warranty for details.
This GDB was configured as x86_64-linux-gnu.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /usr/bin/ceph-osd...Reading symbols from 
/usr/lib/debug/usr/bin/ceph-osd...done.
done.
[New LWP 10182]
[New LWP 10183]
[New LWP 10699]
[New LWP 10184]
[New LWP 10703]
[New LWP 10704]
[New LWP 10702]
[New LWP 10708]
[New LWP 10707]
[New LWP 10710]
[New LWP 10700]
[New LWP 10717]
[New LWP 10765]
[New LWP 10705]
[New LWP 10706]
[New LWP 10701]
[New LWP 10712]
[New LWP 10735]
[New LWP 10713]
[New LWP 10750]
[New LWP 10718]
[New LWP 10711]
[New LWP 10716]
[New LWP 10715]
[New LWP 10785]
[New LWP 10766]
[New LWP 10796]
[New LWP 10720]
[New LWP 10725]
[New LWP 10736]
[New LWP 10709]
[New LWP 10730]
[New LWP 11541]
[New LWP 10770]
[New LWP 11573]
[New LWP 10778]
[New LWP 10804]
[New LWP 11561]
[New LWP 9388]
[New LWP 9398]
[New LWP 11538]
[New LWP 10790]
[New LWP 11586]
[New LWP 10798]
[New LWP 9910]
[New LWP 10726]
[New LWP 21823]
[New LWP 10815]
[New LWP 9397]
[New LWP 11248]
[New LWP 10723]
[New LWP 11253]
[New LWP 10728]
[New LWP 10791]
[New LWP 9389]
[New LWP 10724]
[New LWP 10780]
[New LWP 11287]
[New LWP 11592]
[New LWP 10816]
[New LWP 10812]
[New LWP 10787]
[New LWP 20622]
[New LWP 21822]
[New LWP 10751]
[New LWP 10768]
[New LWP 10767]
[New LWP 11874]
[New LWP 10733]
[New LWP 10811]
[New LWP 11574]
[New LWP 11873]
[New LWP 10771]
[New LWP 11551]
[New LWP 10799]
[New LWP 10729]
[New LWP 18254]
[New LWP 10792]
[New LWP 10803]
[New LWP 9912]
[New LWP 11293]
[New LWP 20623]
[New LWP 14805]
[New LWP 10773]
[New LWP 11298]
[New LWP 11872]
[New LWP 10763]
[New LWP 10783]
[New LWP 10769]
[New LWP 11300]
[New LWP 10777]
[New LWP 10764]
[New LWP 10802]
[New LWP 10749]
[New LWP 14806]
[New LWP 10806]
[New LWP 10805]
[New LWP 18255]
[New LWP 10181]
[New LWP 11277]
[New LWP 9913]
[New LWP 10800]
[New LWP 10801]
[New LWP 11555]
[New LWP 11871]
[New LWP 10748]
[New LWP 9915]
[New LWP 10779]
[New LWP 11294]
[New LWP 9916]
[New LWP 10757]
[New LWP 10734]
[New LWP 10786]
[New LWP 10727]
[New LWP 19063]
[New LWP 11279]
[New LWP 9905]
[New LWP 9911]
[New LWP 10772]
[New LWP 10722]
[New LWP 9914]
[New LWP 10789]
[New LWP 11540]
[New LWP 9917]
[New LWP 11289]
[New LWP 10714]
[New LWP 10721]
[New LWP 10719]
[New LWP 10788]
[New LWP 10782]
[New LWP 10784]
[New LWP 10776]
[New LWP 10774]
[New LWP 10737]
[New LWP 19064]
[Thread debugging using libthread_db enabled]
Using host libthread_db library /lib/x86_64-linux-gnu/libthread_db.so.1.
Core was generated by `/usr/bin/ceph-osd -i 1 --pid-file 
/var/run/ceph/osd.1.pid -c /etc/ceph/ceph.con'.
Program terminated with signal 6, Aborted.
#0  0x7ff9ad91eb7b in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) 
Thread 135 (Thread 0x7ff99a492700 (LWP 19064)):
#0  0x7ff9ad91ad84 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00c496da in Wait (mutex=..., this=0x108cd110) at 
./common/Cond.h:55
#2  Pipe::writer (this=0x108ccf00) at msg/Pipe.cc:1730
#3  0x00c5485d in Pipe::Writer::entry (this=optimized out) at 
msg/Pipe.h:61
#4  0x7ff9ad916e9a in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#5  0x7ff9ac4a43dd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x in ?? ()

Thread 134 (Thread 0x7ff975e1d700 (LWP 10737)):
#0  0x7ff9ac498a13 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00c3e73c in Pipe::tcp_read_wait (this=this@entry=0x4a53180) at 
msg/Pipe.cc:2282
#2  0x00c3e9d0 in Pipe::tcp_read (this=this@entry=0x4a53180, 
buf=optimized out, buf@entry=0x7ff975e1cccf \377, len=len@entry=1)
at msg/Pipe.cc:2255
#3  0x00c5095f in Pipe::reader (this=0x4a53180) at msg/Pipe.cc:1421
#4  0x00c5497d in Pipe::Reader::entry (this=optimized out) at 
msg/Pipe.h:49
#5  0x7ff9ad916e9a in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#6  0x7ff9ac4a43dd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x in ?? ()

Thread 133 (Thread 0x7ff972dda700 (LWP 

Re: [ceph-users] Continuous OSD crash with kv backend (firefly)

2014-10-24 Thread Haomai Wang
It's not stable at Firely for kvstore. But for the master branch, it's
should be no existing/known bug.

On Fri, Oct 24, 2014 at 7:41 PM, Andrey Korolyov and...@xdel.ru wrote:
 Hi,

 during recovery testing on a latest firefly with leveldb backend we
 found that the OSDs on a selected host may crash at once, leaving
 attached backtrace. In other ways, recovery goes more or less smoothly
 for hours.

 Timestamps shows how the issue is correlated between different
 processes on same node:

 core.ceph-osd.25426.node01.1414148261
 core.ceph-osd.25734.node01.1414148263
 core.ceph-osd.25566.node01.1414148345

 The question is about kv backend state in Firefly - is it considered
 stable enough to run production test against it or we should better
 move to giant/master for this?

 Thanks!

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best Regards,

Wheat
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds isn't working anymore after osd's running full

2014-10-24 Thread Jasper Siero
Hello Greg and John,

I used the patch on the ceph cluster and tried it again:
 /usr/bin/ceph-mds -i th1-mon001 -c /etc/ceph/ceph.conf --cluster ceph 
--undump-journal 0 journaldumptgho-mon001
undump journaldumptgho-mon001
start 9483323613 len 134213311
writing header 200.
writing 9483323613~1048576
writing 9484372189~1048576


writing 9614395613~1048576
writing 9615444189~1048576
writing 9616492765~1044159
done.

It went well without errors and after that I restarted the mds.
The status went from up:replay to up:reconnect to up:rejoin(lagged or crashed)

In the log there is an error about trim_to  trimming_pos and its like Greg 
mentioned that maybe the dumpfile needs to be truncated to the proper length 
and resetting and undumping again.

How can I truncate the dumped file to the correct length?

The mds log during the undumping and starting the mds:
http://pastebin.com/y14pSvM0

Kind Regards,

Jasper

Van: john.sp...@inktank.com [john.sp...@inktank.com] namens John Spray 
[john.sp...@redhat.com]
Verzonden: donderdag 16 oktober 2014 12:23
Aan: Jasper Siero
CC: Gregory Farnum; ceph-users
Onderwerp: Re: [ceph-users] mds isn't working anymore after osd's running full

Following up: firefly fix for undump is: https://github.com/ceph/ceph/pull/2734

Jasper: if you still need to try undumping on this existing firefly
cluster, then you can download ceph-mds packages from this
wip-firefly-undump branch from
http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/

Cheers,
John

On Wed, Oct 15, 2014 at 8:15 PM, John Spray john.sp...@redhat.com wrote:
 Sadly undump has been broken for quite some time (it was fixed in
 giant as part of creating cephfs-journal-tool).  If there's a one line
 fix for this then it's probably worth putting in firefly since it's a
 long term supported branch -- I'll do that now.

 John

 On Wed, Oct 15, 2014 at 8:23 AM, Jasper Siero
 jasper.si...@target-holding.nl wrote:
 Hello Greg,

 The dump and reset of the journal was succesful:

 [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
 /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
 --dump-journal 0 journaldumptgho-mon001
 journal is 9483323613~134215459
 read 134213311 bytes at offset 9483323613
 wrote 134213311 bytes at offset 9483323613 to journaldumptgho-mon001
 NOTE: this is a _sparse_ file; you can
 $ tar cSzf journaldumptgho-mon001.tgz journaldumptgho-mon001
   to efficiently compress it while preserving sparseness.

 [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
 /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
 --reset-journal 0
 old journal was 9483323613~134215459
 new journal start will be 9621733376 (4194304 bytes past old end)
 writing journal head
 writing EResetJournal entry
 done


 Undumping the journal was not successful and looking into the error 
 client_lock.is_locked() is showed several times. The mds is not running 
 when I start the undumping so maybe have forgot something?

 [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file 
 /var/run/ceph/mds.th1-mon001.pid -c /etc/ceph/ceph.conf --cluster ceph 
 --undump-journal 0 journaldumptgho-mon001
 undump journaldumptgho-mon001
 start 9483323613 len 134213311
 writing header 200.
 osdc/Objecter.cc: In function 'ceph_tid_t 
 Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 2014-10-15 
 09:09:32.020287
 osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())
  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
  1: /usr/bin/ceph-mds() [0x80f15e]
  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
  3: (main()+0x1632) [0x569c62]
  4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
  5: /usr/bin/ceph-mds() [0x567d99]
  NOTE: a copy of the executable, or `objdump -rdS executable` is needed to 
 interpret this.
 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In function 
 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 time 
 2014-10-15 09:09:32.020287
 osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())

  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
  1: /usr/bin/ceph-mds() [0x80f15e]
  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
  3: (main()+0x1632) [0x569c62]
  4: (__libc_start_main()+0xfd) [0x7fec3ca68d5d]
  5: /usr/bin/ceph-mds() [0x567d99]
  NOTE: a copy of the executable, or `objdump -rdS executable` is needed to 
 interpret this.

  0 2014-10-15 09:09:32.021313 7fec3e5ad7a0 -1 osdc/Objecter.cc: In 
 function 'ceph_tid_t Objecter::op_submit(Objecter::Op*)' thread 7fec3e5ad7a0 
 time 2014-10-15 09:09:32.020287
 osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked())

  ceph version 0.80.5 (38b73c67d375a2552d8ed67843c
 [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --p8a65c2c0feba6)
  1: /usr/bin/ceph-mds() [0x80f15e]
  2: (Dumper::undump(char const*)+0x65d) [0x56c7ad]
  3: 

[ceph-users] Object Storage Statistics

2014-10-24 Thread Dane Elwell
Hi list,

We're using the object storage in production and billing people based
on their usage, much like S3. We're also trying to produce things like
hourly bandwidth graphs for our clients.

We're having some issues with the API not returning the correct
statistics. I can see that there is a --sync-stats option for the
command line radosgw-admin, but there doesn't appear to be anything
similar for the admin REST API. Is there an equivalent feature for the
API that hasn't been documented by chance?

Thanks

Dane
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] librados crash in nova-compute

2014-10-24 Thread Xu (Simon) Chen
Hey folks,

I am trying to enable OpenStack to use RBD as image backend:
https://bugs.launchpad.net/nova/+bug/1226351

For some reason, nova-compute segfaults due to librados crash:

./log/SubsystemMap.h: In function 'bool
ceph::log::SubsystemMap::should_gather(unsigned
int, int)' thread 7f1b477fe700 time 2014-10-24 03:20:17.382769
./log/SubsystemMap.h: 62: FAILED assert(sub  m_subsys.size())
ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
1: (()+0x42785) [0x7f1b4c4db785]
2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
4: (()+0x6b50) [0x7f1b6ea93b50]
5: (clone()+0x6d) [0x7f1b6df3e0ed]
NOTE: a copy of the executable, or `objdump -rdS executable` is needed to
interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

I feel that there is some concurrency issue, since this sometimes happen
before and sometimes after this line:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/rbd_utils.py#L208

Any idea what are the potential causes of the crash?

Thanks.
-Simon
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread HURTEVENT VINCENT
Hello,

I was running a multi mon (3) Ceph cluster and in a migration move, I reinstall 
2 of the 3 monitors nodes without deleting them properly into the cluster.

So, there is only one monitor left which is stuck in probing phase and the 
cluster is down.

As I can only connect to mon socket, I don't how if it's possible to add a 
monitor, get and edit monmap.

This cluster is running Ceph version 0.67.1.

Is there a way to force my last monitor into a leader state or re build a lost 
monitor to pass the probe and election phases ?

Thank you,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-24 Thread Sergey Nazarov
Any update?

On Tue, Oct 21, 2014 at 3:32 PM, Sergey Nazarov nataraj...@gmail.com wrote:
 Ouch, I think client log is missing.
 Here it goes:
 https://www.dropbox.com/s/650mjim2ldusr66/ceph-client.admin.log.gz?dl=0

 On Tue, Oct 21, 2014 at 3:22 PM, Sergey Nazarov nataraj...@gmail.com wrote:
 I enabled logging and performed same tests.
 Here is the link on archive with logs, they are only from one node
 (from the node where active MDS was sitting):
 https://www.dropbox.com/s/80axovtoofesx5e/logs.tar.gz?dl=0

 Rados bench results:

 # rados bench -p test 10 write
  Maintaining 16 concurrent writes of 4194304 bytes for up to 10
 seconds or 0 objects
  Object prefix: benchmark_data_atl-fs11_4630
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1  164630   119.967   120  0.201327  0.348463
  2  168872   143.969   168  0.132983  0.353677
  3  16   124   108   143.972   144  0.930837  0.383018
  4  16   155   139   138.976   124  0.899468  0.426396
  5  16   203   187   149.575   192  0.236534  0.400806
  6  16   243   227   151.309   160  0.835213  0.397673
  7  16   276   260   148.549   132  0.905989  0.406849
  8  16   306   290   144.978   120  0.353279  0.422106
  9  16   335   319   141.757   116   1.12114  0.428268
 10  16   376   360143.98   164  0.418921   0.43351
 11  16   377   361   131.254 4  0.499769  0.433693
  Total time run: 11.206306
 Total writes made:  377
 Write size: 4194304
 Bandwidth (MB/sec): 134.567

 Stddev Bandwidth:   60.0232
 Max bandwidth (MB/sec): 192
 Min bandwidth (MB/sec): 0
 Average Latency:0.474923
 Stddev Latency: 0.376038
 Max latency:1.82171
 Min latency:0.060877


 # rados bench -p test 10 seq
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1  166145   179.957   180  0.010405   0.25243
  2  16   10993   185.962   192  0.908263  0.284303
  3  16   151   135   179.965   168  0.255312  0.297283
  4  16   191   175174.97   160  0.836727  0.330659
  5  16   236   220   175.971   180  0.009995  0.330832
  6  16   275   259   172.639   156   1.06855  0.345418
  7  16   311   295   168.545   144  0.907648  0.361689
  8  16   351   335   167.474   160  0.947688  0.363552
  9  16   390   374   166.196   156  0.140539  0.369057
  Total time run:9.755367
 Total reads made: 401
 Read size:4194304
 Bandwidth (MB/sec):164.422

 Average Latency:   0.387705
 Max latency:   1.33852
 Min latency:   0.008064

 # rados bench -p test 10 rand
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1  165539   155.938   156  0.773716  0.257267
  2  169377   153.957   152  0.006573  0.339199
  3  16   135   119   158.629   168  0.009851  0.359675
  4  16   171   155   154.967   144  0.892027  0.359015
  5  16   209   193   154.369   152   1.13945  0.378618
  6  16   256   240159.97   188  0.009965  0.368439
  7  16   295   279 159.4   156  0.195812  0.371259
  8  16   343   327   163.472   192  0.880587  0.370759
  9  16   380   364161.75   148  0.113111  0.377983
 10  16   424   408   163.173   176  0.772274  0.379497
  Total time run:10.518482
 Total reads made: 425
 Read size:4194304
 Bandwidth (MB/sec):161.620

 Average Latency:   0.393978
 Max latency:   1.36572
 Min latency:   0.006448

 On Tue, Oct 21, 2014 at 2:03 PM, Gregory Farnum g...@inktank.com wrote:
 Can you enable debugging on the client (debug ms = 1, debug client
 = 20) and mds (debug ms = 1, debug mds = 20), run this test
 again, and post them somewhere for me to look at?

 While you're at it, can you try rados bench and see what sort of
 results you get?
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com


 On Tue, Oct 21, 2014 at 10:57 AM, Sergey Nazarov nataraj...@gmail.com 
 wrote:
 It is CephFS mounted via ceph-fuse.
 I am getting the same results not depending on how many other clients
 are having this fs mounted and their activity.
 Cluster is working on Debian Wheezy, kernel 

Re: [ceph-users] RGW Federated Gateways and Apache 2.4 problems

2014-10-24 Thread Yehuda Sadeh
On Thu, Oct 23, 2014 at 3:51 PM, Craig Lewis cle...@centraldesktop.com wrote:
 I'm having a problem getting RadosGW replication to work after upgrading to
 Apache 2.4 on my primary test cluster.  Upgrading the secondary cluster to
 Apache 2.4 doesn't cause any problems. Both Ceph's apache packages and
 Ubuntu's packages cause the same problem.

 I'm pretty sure I'm missing something obvious, but I'm not seeing it.

 Has anybody else upgraded their federated gateways to apache 2.4?



 My setup
 2 VMs, each running their own ceph cluster with replication=1
 test0-ceph.cdlocal is the primary zone, named us-west
 test1-ceph.cdlocal is the secondary zone, named us-central
 Before I start, replication works, and I'm running

 Ubuntu 14.04 LTS
 Emperor (0.72.2-1precise, retained using apt-hold)
 Apache 2.2 (2.2.22-2precise.ceph, retained using apt-hold)


 As soon as I upgrade Apache to 2.4 in the primary cluster, replication gets
 permission errors.  radosgw-agent.log:
 2014-10-23T15:13:43.022 31106:ERROR:radosgw_agent.worker:failed to sync
 object bucket3/test6.jpg: state is error

 The access logs from the primary say (using vhost_combined log format):
 test0-ceph.cdlocal:80 172.16.205.1 - - [23/Oct/2014:15:16:51 -0700] PUT
 /test6.jpg HTTP/1.1 200 209 - -- - - [23/Oct/2014:13:24:18 -0700] GET
 /?delimiter=/ HTTP/1.1 200 1254 - - bucket3.test0-ceph.cdlocal
 snip
 test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700] GET
 /admin/log?marker=089.89.3type=bucket-indexbucket-instance=bucket3%3Aus-west.5697.2max-entries=1000
 HTTP/1.1 200 398 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic
 test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700] GET
 /bucket3/test6.jpg?rgwx-uid=us-centralrgwx-region=usrgwx-prepend-metadata=us
 HTTP/1.1 403 249 - -

 172.16.205.143 is the primary cluster, .144 is the secondary cluster, and .1
 is my workstation.


 The access logs on the secondary show:
 test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] GET
 /admin/replica_log?boundstype=bucket-indexbucket-instance=bucket3%3Aus-west.5697.2
 HTTP/1.1 200 643 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic
 test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] PUT
 /bucket3/test6.jpg?rgwx-op-id=test1-ceph0.cdlocal%3A6484%3A3rgwx-source-zone=us-westrgwx-client-id=radosgw-agent
 HTTP/1.1 403 286 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic
 test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] GET
 /admin/opstate?client-id=radosgw-agentobject=bucket3%2Ftest6.jpgop-id=test1-ceph0.cdlocal%3A6484%3A3
 HTTP/1.1 200 355 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic

 If I crank up radosgw debugging, it tells me that the calculated digest is
 correct for the /admin/* requests, but fails for the object GET:
 /admin/log
 2014-10-23 15:44:29.257688 7fa6fcfb9700 15 calculated
 digest=6Tt13P6naWJEc0mJmYyDj6NzBS8=
 2014-10-23 15:44:29.257690 7fa6fcfb9700 15
 auth_sign=6Tt13P6naWJEc0mJmYyDj6NzBS8=
 /bucket3/test6.jpg
 2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
 digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
 2014-10-23 15:44:29.257691 7fa6fcfb9700 15 compare=0
 2014-10-23 15:44:29.257693 7fa6fcfb9700 20 system request
 snip
 /bucket3/test6.jpg
 2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
 digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
 2014-10-23 15:44:29.411573 7fa6fc7b8700 15
 auth_sign=Gv398QNc6gLig9/0QbdO+1UZUq0=
 2014-10-23 15:44:29.411574 7fa6fc7b8700 15 compare=-41
 2014-10-23 15:44:29.411577 7fa6fc7b8700 10 failed to authorize request

 That explains the 403 responses.

 So I have metadata replication working, but the data replication is failing
 with permission problems.  I verified that I can create users and buckets in
 the primary, and have them replicate to the secondary.


 A similar situation was posted to the list before.  That time, the problem
 was that the system users weren't correctly deployed to both the primary and
 secondary clusters.  I verified that both users exist in both clusters, with
 the same access and secret.

 Just to test, I used s3cmd.  I can read and write to both clusters using
 both system user's credentials.


 Anybody have any ideas?


You're hitting issue #9206. Apache 2.4 filters out certain http
headers because they use underscores instead of dashes. There's a fix
for that for firefly, although it hasn't made it to an officially
released version.

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread Dan van der Ster
Hi,

October 24 2014 5:28 PM, HURTEVENT VINCENT vincent.hurtev...@univ-lyon1.fr 
wrote: 
 Hello,
 
 I was running a multi mon (3) Ceph cluster and in a migration move, I 
 reinstall 2 of the 3 monitors
 nodes without deleting them properly into the cluster.
 
 So, there is only one monitor left which is stuck in probing phase and the 
 cluster is down.
 
 As I can only connect to mon socket, I don't how if it's possible to add a 
 monitor, get and edit
 monmap.
 
 This cluster is running Ceph version 0.67.1.
 
 Is there a way to force my last monitor into a leader state or re build a 
 lost monitor to pass the
 probe and election phases ?

Did you already try to remake one of the lost monitors? Assuming your ceph.conf 
has the addresses of the mons, and the keyrings are in place, maybe this will 
work:

ceph-mon --mkfs -i previous name 

then start the process?

I've never been in this situation before, so I don't know if it will work.

Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread Loic Dachary
Bonjour,

Maybe http://ceph.com/docs/giant/rados/troubleshooting/troubleshooting-mon/ can 
help ? Joao wrote that a few month ago and it covers a number of scenarios.

Cheers

On 24/10/2014 08:27, HURTEVENT VINCENT wrote:
 Hello,
 
 I was running a multi mon (3) Ceph cluster and in a migration move, I 
 reinstall 2 of the 3 monitors nodes without deleting them properly into the 
 cluster.
 
 So, there is only one monitor left which is stuck in probing phase and the 
 cluster is down.
 
 As I can only connect to mon socket, I don't how if it's possible to add a 
 monitor, get and edit monmap.
 
 This cluster is running Ceph version 0.67.1.
 
 Is there a way to force my last monitor into a leader state or re build a 
 lost monitor to pass the probe and election phases ?
 
 Thank you,
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-24 Thread Yan, Zheng
On Fri, Oct 24, 2014 at 8:47 AM, Sergey Nazarov nataraj...@gmail.com wrote:
 Any update?


The short answer is that when the command is executed for second time,
the MDS needs to truncate the file zero length. The speed of truncate
a file is limited by the OSD speed. (creating file and write data to
the file are async operations, but truncating a file is sync
operation)

Regards
Yan, Zheng


 On Tue, Oct 21, 2014 at 3:32 PM, Sergey Nazarov nataraj...@gmail.com wrote:
 Ouch, I think client log is missing.
 Here it goes:
 https://www.dropbox.com/s/650mjim2ldusr66/ceph-client.admin.log.gz?dl=0

 On Tue, Oct 21, 2014 at 3:22 PM, Sergey Nazarov nataraj...@gmail.com wrote:
 I enabled logging and performed same tests.
 Here is the link on archive with logs, they are only from one node
 (from the node where active MDS was sitting):
 https://www.dropbox.com/s/80axovtoofesx5e/logs.tar.gz?dl=0

 Rados bench results:

 # rados bench -p test 10 write
  Maintaining 16 concurrent writes of 4194304 bytes for up to 10
 seconds or 0 objects
  Object prefix: benchmark_data_atl-fs11_4630
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1  164630   119.967   120  0.201327  0.348463
  2  168872   143.969   168  0.132983  0.353677
  3  16   124   108   143.972   144  0.930837  0.383018
  4  16   155   139   138.976   124  0.899468  0.426396
  5  16   203   187   149.575   192  0.236534  0.400806
  6  16   243   227   151.309   160  0.835213  0.397673
  7  16   276   260   148.549   132  0.905989  0.406849
  8  16   306   290   144.978   120  0.353279  0.422106
  9  16   335   319   141.757   116   1.12114  0.428268
 10  16   376   360143.98   164  0.418921   0.43351
 11  16   377   361   131.254 4  0.499769  0.433693
  Total time run: 11.206306
 Total writes made:  377
 Write size: 4194304
 Bandwidth (MB/sec): 134.567

 Stddev Bandwidth:   60.0232
 Max bandwidth (MB/sec): 192
 Min bandwidth (MB/sec): 0
 Average Latency:0.474923
 Stddev Latency: 0.376038
 Max latency:1.82171
 Min latency:0.060877


 # rados bench -p test 10 seq
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1  166145   179.957   180  0.010405   0.25243
  2  16   10993   185.962   192  0.908263  0.284303
  3  16   151   135   179.965   168  0.255312  0.297283
  4  16   191   175174.97   160  0.836727  0.330659
  5  16   236   220   175.971   180  0.009995  0.330832
  6  16   275   259   172.639   156   1.06855  0.345418
  7  16   311   295   168.545   144  0.907648  0.361689
  8  16   351   335   167.474   160  0.947688  0.363552
  9  16   390   374   166.196   156  0.140539  0.369057
  Total time run:9.755367
 Total reads made: 401
 Read size:4194304
 Bandwidth (MB/sec):164.422

 Average Latency:   0.387705
 Max latency:   1.33852
 Min latency:   0.008064

 # rados bench -p test 10 rand
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1  165539   155.938   156  0.773716  0.257267
  2  169377   153.957   152  0.006573  0.339199
  3  16   135   119   158.629   168  0.009851  0.359675
  4  16   171   155   154.967   144  0.892027  0.359015
  5  16   209   193   154.369   152   1.13945  0.378618
  6  16   256   240159.97   188  0.009965  0.368439
  7  16   295   279 159.4   156  0.195812  0.371259
  8  16   343   327   163.472   192  0.880587  0.370759
  9  16   380   364161.75   148  0.113111  0.377983
 10  16   424   408   163.173   176  0.772274  0.379497
  Total time run:10.518482
 Total reads made: 425
 Read size:4194304
 Bandwidth (MB/sec):161.620

 Average Latency:   0.393978
 Max latency:   1.36572
 Min latency:   0.006448

 On Tue, Oct 21, 2014 at 2:03 PM, Gregory Farnum g...@inktank.com wrote:
 Can you enable debugging on the client (debug ms = 1, debug client
 = 20) and mds (debug ms = 1, debug mds = 20), run this test
 again, and post them somewhere for me to look at?

 While you're at it, can you try rados bench and see 

Re: [ceph-users] RGW Federated Gateways and Apache 2.4 problems

2014-10-24 Thread Craig Lewis
Thanks!  I'll continue with Apache 2.2 until the next release.

On Fri, Oct 24, 2014 at 8:58 AM, Yehuda Sadeh yeh...@redhat.com wrote:

 On Thu, Oct 23, 2014 at 3:51 PM, Craig Lewis cle...@centraldesktop.com
 wrote:
  I'm having a problem getting RadosGW replication to work after upgrading
 to
  Apache 2.4 on my primary test cluster.  Upgrading the secondary cluster
 to
  Apache 2.4 doesn't cause any problems. Both Ceph's apache packages and
  Ubuntu's packages cause the same problem.
 
  I'm pretty sure I'm missing something obvious, but I'm not seeing it.
 
  Has anybody else upgraded their federated gateways to apache 2.4?
 
 
 
  My setup
  2 VMs, each running their own ceph cluster with replication=1
  test0-ceph.cdlocal is the primary zone, named us-west
  test1-ceph.cdlocal is the secondary zone, named us-central
  Before I start, replication works, and I'm running
 
  Ubuntu 14.04 LTS
  Emperor (0.72.2-1precise, retained using apt-hold)
  Apache 2.2 (2.2.22-2precise.ceph, retained using apt-hold)
 
 
  As soon as I upgrade Apache to 2.4 in the primary cluster, replication
 gets
  permission errors.  radosgw-agent.log:
  2014-10-23T15:13:43.022 31106:ERROR:radosgw_agent.worker:failed to sync
  object bucket3/test6.jpg: state is error
 
  The access logs from the primary say (using vhost_combined log format):
  test0-ceph.cdlocal:80 172.16.205.1 - - [23/Oct/2014:15:16:51 -0700] PUT
  /test6.jpg HTTP/1.1 200 209 - -- - - [23/Oct/2014:13:24:18 -0700]
 GET
  /?delimiter=/ HTTP/1.1 200 1254 - - bucket3.test0-ceph.cdlocal
  snip
  test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700]
 GET
 
 /admin/log?marker=089.89.3type=bucket-indexbucket-instance=bucket3%3Aus-west.5697.2max-entries=1000
  HTTP/1.1 200 398 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic
  test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700]
 GET
 
 /bucket3/test6.jpg?rgwx-uid=us-centralrgwx-region=usrgwx-prepend-metadata=us
  HTTP/1.1 403 249 - -
 
  172.16.205.143 is the primary cluster, .144 is the secondary cluster,
 and .1
  is my workstation.
 
 
  The access logs on the secondary show:
  test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700]
 GET
 
 /admin/replica_log?boundstype=bucket-indexbucket-instance=bucket3%3Aus-west.5697.2
  HTTP/1.1 200 643 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic
  test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700]
 PUT
 
 /bucket3/test6.jpg?rgwx-op-id=test1-ceph0.cdlocal%3A6484%3A3rgwx-source-zone=us-westrgwx-client-id=radosgw-agent
  HTTP/1.1 403 286 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic
  test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700]
 GET
 
 /admin/opstate?client-id=radosgw-agentobject=bucket3%2Ftest6.jpgop-id=test1-ceph0.cdlocal%3A6484%3A3
  HTTP/1.1 200 355 - Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic
 
  If I crank up radosgw debugging, it tells me that the calculated digest
 is
  correct for the /admin/* requests, but fails for the object GET:
  /admin/log
  2014-10-23 15:44:29.257688 7fa6fcfb9700 15 calculated
  digest=6Tt13P6naWJEc0mJmYyDj6NzBS8=
  2014-10-23 15:44:29.257690 7fa6fcfb9700 15
  auth_sign=6Tt13P6naWJEc0mJmYyDj6NzBS8=
  /bucket3/test6.jpg
  2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
  digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
  2014-10-23 15:44:29.257691 7fa6fcfb9700 15 compare=0
  2014-10-23 15:44:29.257693 7fa6fcfb9700 20 system request
  snip
  /bucket3/test6.jpg
  2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated
  digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
  2014-10-23 15:44:29.411573 7fa6fc7b8700 15
  auth_sign=Gv398QNc6gLig9/0QbdO+1UZUq0=
  2014-10-23 15:44:29.411574 7fa6fc7b8700 15 compare=-41
  2014-10-23 15:44:29.411577 7fa6fc7b8700 10 failed to authorize request
 
  That explains the 403 responses.
 
  So I have metadata replication working, but the data replication is
 failing
  with permission problems.  I verified that I can create users and
 buckets in
  the primary, and have them replicate to the secondary.
 
 
  A similar situation was posted to the list before.  That time, the
 problem
  was that the system users weren't correctly deployed to both the primary
 and
  secondary clusters.  I verified that both users exist in both clusters,
 with
  the same access and secret.
 
  Just to test, I used s3cmd.  I can read and write to both clusters using
  both system user's credentials.
 
 
  Anybody have any ideas?
 

 You're hitting issue #9206. Apache 2.4 filters out certain http
 headers because they use underscores instead of dashes. There's a fix
 for that for firefly, although it hasn't made it to an officially
 released version.

 Yehuda

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fio rbd stalls during 4M reads

2014-10-24 Thread Gregory Farnum
There's an issue in master branch temporarily that makes rbd reads
greater than the cache size hang (if the cache was on). This might be
that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Oct 23, 2014 at 5:09 PM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:
 I'm doing some fio tests on Giant using fio rbd driver to measure
 performance on a new ceph cluster.

 However with block sizes  1M (initially noticed with 4M) I am seeing
 absolutely no IOPS for *reads* - and the fio process becomes non
 interrupteable (needs kill -9):

 $ ceph -v
 ceph version 0.86-467-g317b83d (317b831a917f70838870b31931a79bdd4dd0)

 $ fio --version
 fio-2.1.11-20-g9a44

 $ fio read-busted.fio
 env-read-4M: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
 fio-2.1.11-20-g9a44
 Starting 1 process
 rbd engine: RBD version: 0.1.8
 Jobs: 1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
 1158050441d:06h:58m:03s]

 This appears to be a pure fio rbd driver issue, as I can attach the relevant
 rbd volume to a vm and dd from it using 4M blocks no problem.

 Any ideas?

 Cheers

 Mark

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fio rbd stalls during 4M reads

2014-10-24 Thread Mark Nelson
FWIW the specific fio read problem appears to have started after 0.86 
and before commit 42bcabf.


Mark

On 10/24/2014 12:56 PM, Gregory Farnum wrote:

There's an issue in master branch temporarily that makes rbd reads
greater than the cache size hang (if the cache was on). This might be
that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Oct 23, 2014 at 5:09 PM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:

I'm doing some fio tests on Giant using fio rbd driver to measure
performance on a new ceph cluster.

However with block sizes  1M (initially noticed with 4M) I am seeing
absolutely no IOPS for *reads* - and the fio process becomes non
interrupteable (needs kill -9):

$ ceph -v
ceph version 0.86-467-g317b83d (317b831a917f70838870b31931a79bdd4dd0)

$ fio --version
fio-2.1.11-20-g9a44

$ fio read-busted.fio
env-read-4M: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
fio-2.1.11-20-g9a44
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
1158050441d:06h:58m:03s]

This appears to be a pure fio rbd driver issue, as I can attach the relevant
rbd volume to a vm and dd from it using 4M blocks no problem.

Any ideas?

Cheers

Mark

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librados crash in nova-compute

2014-10-24 Thread Josh Durgin

On 10/24/2014 08:21 AM, Xu (Simon) Chen wrote:

Hey folks,

I am trying to enable OpenStack to use RBD as image backend:
https://bugs.launchpad.net/nova/+bug/1226351

For some reason, nova-compute segfaults due to librados crash:

./log/SubsystemMap.h: In function 'bool
ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
7f1b477fe700 time 2014-10-24 03:20:17.382769
./log/SubsystemMap.h: 62: FAILED assert(sub  m_subsys.size())
ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
1: (()+0x42785) [0x7f1b4c4db785]
2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
4: (()+0x6b50) [0x7f1b6ea93b50]
5: (clone()+0x6d) [0x7f1b6df3e0ed]
NOTE: a copy of the executable, or `objdump -rdS executable` is needed
to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted

I feel that there is some concurrency issue, since this sometimes happen
before and sometimes after this line:
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/rbd_utils.py#L208

Any idea what are the potential causes of the crash?

Thanks.
-Simon


This is http://tracker.ceph.com/issues/8912, fixed in the latest
firefly and dumpling releases.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Object Storage Statistics

2014-10-24 Thread Yehuda Sadeh
On Fri, Oct 24, 2014 at 8:17 AM, Dane Elwell dane.elw...@gmail.com wrote:
 Hi list,

 We're using the object storage in production and billing people based
 on their usage, much like S3. We're also trying to produce things like
 hourly bandwidth graphs for our clients.

 We're having some issues with the API not returning the correct
 statistics. I can see that there is a --sync-stats option for the
 command line radosgw-admin, but there doesn't appear to be anything
 similar for the admin REST API. Is there an equivalent feature for the
 API that hasn't been documented by chance?


There are two different statistics that are collected, one is the
'usage' information that collects data about actual operations that
clients do in a period of time. This information can be accessed
through the admin api. The other one is the user stats info that is
part of the user quota system, which at the moment is not hooked into
a REST interface.

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph and hadoop

2014-10-24 Thread Matan Safriel
Hi,

Given HDFS is far from ideal for small files, I am examining the
possibility of using Hadoop on top Ceph. I found mainly one online resource
about it https://ceph.com/docs/v0.79/cephfs/hadoop/. I am wondering whether
there is any reference implementation or blog post you are aware of, about
hadoop on top Ceph. Likewise happy to have any pointers about why _not_ to
attempt just that

Thanks!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How to recover Incomplete PGs from lost time symptom?

2014-10-24 Thread Chris Kitzmiller
I have a number of PGs which are marked as incomplete. I'm at a loss for how to 
go about recovering these PGs and believe they're suffering from the lost 
time symptom. How do I recover these PGs? I'd settle for sacrificing the lost 
time and just going with what I've got. I've lost the ability to mount the RBD 
within this pool and I'm afraid that unless I can resolve this I'll have lost 
all my data.

A query from one of my incomplete PGs: http://pastebin.com/raw.php?i=AJ3RMjz6

My CRUSH map: http://pastebin.com/raw.php?i=gWtJuhsy___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] get/put files with radosgw once MDS crash

2014-10-24 Thread Craig Lewis
No, MDS and RadosGW store their data in different pools.  There's no way
for them to access the other's data.

All of the data is stored in RADOS, and can be accessed via the rados CLI.
It's not easy, and you'd probably have to spend a lot of time reading the
source code to do it.


On Fri, Oct 24, 2014 at 1:49 AM, 廖建锋 de...@f-club.cn wrote:

  dear cepher,
  Today, I use mds to put/get files from ceph storgate cluster as
 it is very easy to use for each side of a company.
 But ceph mds is not very stable, So my question:
 is it possbile to get the file name and contentes from OSD with
 radosgw once MDS crash and how ?




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] can we deploy multi-rgw on one ceph cluster?

2014-10-24 Thread Craig Lewis
You can deploy multiple RadosGW in a single cluster.  You'll need to setup
zones (see http://ceph.com/docs/master/radosgw/federated-config/).  Most
people seem to be using zones for geo-replication, but local replication
works even better.  Multiple zones don't have to be replicated either.  For
example, you could use multiple zones for tiered services.  For example, a
service with 4x replication on pure SSDs, and a cheaper service with 2x
replication on HDDs.

If you do have separate zones in a single cluster, you'll want to configure
different OSDs to serve the different zones.  You want fault isolation
between the zones. The problems this brings are mostly management of the
extra complexity.


CivetWeb is embedded into the RadosGW daemon, where as Apache talks to
RadosGW using FastCGI.  Overall, CivetWeb should be simpler to setup and
manage, since it doesn't require Apache, it's configuration, or the
overhead.

I don't know if Civetweb is considered production ready.  Giant has a bunch
of fixes for Civetweb, so I'm leaning towards not on Firefly unless
somebody more knowledgeable tells me otherwise.


On Thu, Oct 23, 2014 at 11:04 PM, yuelongguang fasts...@163.com wrote:

 hi,yehuda

 1.
 can we deploy multi-rgws on one ceph cluster?
 if so  does it bring us any problems?

 2. what is the major difference between apache and civetweb?
 what is  civetweb's advantage?

 thanks





 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Can't start osd- one osd alway be down.

2014-10-24 Thread Craig Lewis
It looks like you're running into http://tracker.ceph.com/issues/5699

You're running 0.80.7, which has a fix for that bug.  From my reading of
the code, I believe the fix only prevents the issue from occurring.  It
doesn't work around or repair bad snapshots created on older versions of
Ceph.

Were any of the snapshots you're removing up created on older versions of
Ceph?  If they were all created on Firefly, then you should open a new
tracker issue, and try to get some help on IRC or the developers mailing
list.


On Thu, Oct 23, 2014 at 10:21 PM, Ta Ba Tuan tua...@vccloud.vn wrote:

 Dear everyone

 I can't start osd.21, (attached log file).
 some pgs can't be repair. I'm using replicate 3 for my data pool.
 Feel some objects in those pgs be failed,

 I tried to delete some data that related above objects, but still not
 start osd.21
 and, removed osd.21, but other osds (eg: osd.86 down, not start osd.86).

 Guide me to debug it, please! Thanks!

 --
 Tuan
 Ha Noi - VietNam










 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fio rbd stalls during 4M reads

2014-10-24 Thread Mark Kirkwood

Yeah, looks like it. If I disable the rbd ccahe:

$ tail /etc/ceph/ceph.conf
...
[client]
rbd cache = false

then the 2-4M reads work fine (no invalid reads in valgrind either). 
I'll let the fio guys know.


Cheers

Mark

On 25/10/14 06:56, Gregory Farnum wrote:

There's an issue in master branch temporarily that makes rbd reads
greater than the cache size hang (if the cache was on). This might be
that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Oct 23, 2014 at 5:09 PM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:

I'm doing some fio tests on Giant using fio rbd driver to measure
performance on a new ceph cluster.

However with block sizes  1M (initially noticed with 4M) I am seeing
absolutely no IOPS for *reads* - and the fio process becomes non
interrupteable (needs kill -9):

$ ceph -v
ceph version 0.86-467-g317b83d (317b831a917f70838870b31931a79bdd4dd0)

$ fio --version
fio-2.1.11-20-g9a44

$ fio read-busted.fio
env-read-4M: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
fio-2.1.11-20-g9a44
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
1158050441d:06h:58m:03s]

This appears to be a pure fio rbd driver issue, as I can attach the relevant
rbd volume to a vm and dd from it using 4M blocks no problem.

Any ideas?

Cheers

Mark

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librados crash in nova-compute

2014-10-24 Thread Xu (Simon) Chen
Thanks. I found the commit on git and confirms 0.80.7 fixes the issue.

On Friday, October 24, 2014, Josh Durgin josh.dur...@inktank.com wrote:

 On 10/24/2014 08:21 AM, Xu (Simon) Chen wrote:

 Hey folks,

 I am trying to enable OpenStack to use RBD as image backend:
 https://bugs.launchpad.net/nova/+bug/1226351

 For some reason, nova-compute segfaults due to librados crash:

 ./log/SubsystemMap.h: In function 'bool
 ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
 7f1b477fe700 time 2014-10-24 03:20:17.382769
 ./log/SubsystemMap.h: 62: FAILED assert(sub  m_subsys.size())
 ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
 1: (()+0x42785) [0x7f1b4c4db785]
 2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
 3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
 4: (()+0x6b50) [0x7f1b6ea93b50]
 5: (clone()+0x6d) [0x7f1b6df3e0ed]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed
 to interpret this.
 terminate called after throwing an instance of 'ceph::FailedAssertion'
 Aborted

 I feel that there is some concurrency issue, since this sometimes happen
 before and sometimes after this line:
 https://github.com/openstack/nova/blob/master/nova/virt/
 libvirt/rbd_utils.py#L208

 Any idea what are the potential causes of the crash?

 Thanks.
 -Simon


 This is http://tracker.ceph.com/issues/8912, fixed in the latest
 firefly and dumpling releases.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librados crash in nova-compute

2014-10-24 Thread Xu (Simon) Chen
I am actually curious about one more thing.

In the image - rbd case, is rbd_secret_uuid config option really used? I
am running nova-compute as a non-root user, so virsh secret shouldn't be
accessible unless we get it via rootwrap. I had to make ceph keyring file
readable to the nova-compute user for the whole thing to work...


On Friday, October 24, 2014, Xu (Simon) Chen xche...@gmail.com wrote:

 Thanks. I found the commit on git and confirms 0.80.7 fixes the issue.

 On Friday, October 24, 2014, Josh Durgin josh.dur...@inktank.com
 javascript:_e(%7B%7D,'cvml','josh.dur...@inktank.com'); wrote:

 On 10/24/2014 08:21 AM, Xu (Simon) Chen wrote:

 Hey folks,

 I am trying to enable OpenStack to use RBD as image backend:
 https://bugs.launchpad.net/nova/+bug/1226351

 For some reason, nova-compute segfaults due to librados crash:

 ./log/SubsystemMap.h: In function 'bool
 ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread
 7f1b477fe700 time 2014-10-24 03:20:17.382769
 ./log/SubsystemMap.h: 62: FAILED assert(sub  m_subsys.size())
 ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
 1: (()+0x42785) [0x7f1b4c4db785]
 2: (ObjectCacher::flusher_entry()+0xfda) [0x7f1b4c53759a]
 3: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f1b4c54a16d]
 4: (()+0x6b50) [0x7f1b6ea93b50]
 5: (clone()+0x6d) [0x7f1b6df3e0ed]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed
 to interpret this.
 terminate called after throwing an instance of 'ceph::FailedAssertion'
 Aborted

 I feel that there is some concurrency issue, since this sometimes happen
 before and sometimes after this line:
 https://github.com/openstack/nova/blob/master/nova/virt/
 libvirt/rbd_utils.py#L208

 Any idea what are the potential causes of the crash?

 Thanks.
 -Simon


 This is http://tracker.ceph.com/issues/8912, fixed in the latest
 firefly and dumpling releases.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] journals relabeled by OS, symlinks broken

2014-10-24 Thread Steve Anthony
Hello,

I was having problems with a node in my cluster (Ceph v0.80.7/Debian
Wheezy/Kernel 3.12), so I rebooted it and the disks were relabled when
it came back up. Now all the symlinks to the journals are broken. The
SSDs are now sda, sdb, and sdc but the journals were sdc, sdd, and sde:

root@ceph17:~# ls -l /var/lib/ceph/osd/ceph-*/journal
lrwxrwxrwx 1 root root 9 Oct 20 16:47 /var/lib/ceph/osd/ceph-150/journal
- /dev/sde1
lrwxrwxrwx 1 root root 9 Oct 20 16:53 /var/lib/ceph/osd/ceph-157/journal
- /dev/sdd1
lrwxrwxrwx 1 root root 9 Oct 21 08:31 /var/lib/ceph/osd/ceph-164/journal
- /dev/sdc1
lrwxrwxrwx 1 root root 9 Oct 21 16:33 /var/lib/ceph/osd/ceph-171/journal
- /dev/sde2
lrwxrwxrwx 1 root root 9 Oct 22 10:50 /var/lib/ceph/osd/ceph-178/journal
- /dev/sdc2
lrwxrwxrwx 1 root root 9 Oct 22 15:48 /var/lib/ceph/osd/ceph-184/journal
- /dev/sdd2
lrwxrwxrwx 1 root root 9 Oct 23 10:46 /var/lib/ceph/osd/ceph-191/journal
- /dev/sde3
lrwxrwxrwx 1 root root 9 Oct 23 15:22 /var/lib/ceph/osd/ceph-195/journal
- /dev/sdc3
lrwxrwxrwx 1 root root 9 Oct 23 16:59 /var/lib/ceph/osd/ceph-201/journal
- /dev/sdd3
lrwxrwxrwx 1 root root 9 Oct 24 21:32 /var/lib/ceph/osd/ceph-214/journal
- /dev/sde4
lrwxrwxrwx 1 root root 9 Oct 24 21:33 /var/lib/ceph/osd/ceph-215/journal
- /dev/sdd4

Any way to fix this without just removing all the OSDs and re-adding
them? I thought about recreating the symlinks to point at the new SSD
labels, but I figured I'd check here first. Thanks!

-Steve

-- 
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma...@lehigh.edu

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Can't start osd- one osd alway be down.

2014-10-24 Thread Ta Ba Tuan

Hi Craig, Thanks for replying.
When i started that osd, Ceph Log from ceph -w warns pgs 7.9d8 23.596, 
23.9c6, 23.63 can't recovery as pasted log.


Those pgs are active+degraded state.
#ceph pg map 7.9d8
osdmap e102808 pg 7.9d8 (7.9d8) - up [93,49] acting [93,49] (When start 
osd.21 then pg 7.9d8 and three remain pgs  to changed to state 
active+recovering) . osd.21 still down after following logs:



2014-10-25 10:57:48.415920 osd.21 [WRN] slow request 30.835731 seconds 
old, received at 2014-10-25 10:57:17.580013: MOSDPGPush(*7.9d8 *102803 [Push
Op(e13589d8/rbd_data.4b843b2ae8944a.0c00/head//6, version: 
102798'7794851, data_included: [0~4194304], data_size: 4194304, omap_heade
r_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(e13589d8/rbd_data.4b843b2ae8944a.0c00/head//6@102
798'7794851, copy_subset: [0~4194304], clone_subset: {}), 
after_progress: ObjectRecoveryProgress(!first, 
data_recovered_to:4194304, data_complete
:true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_rec

overed_to:, omap_complete:false))]) v2 currently no flag points reached

2014-10-25 10:57:48.415927 osd.21 [WRN] slow request 30.275588 seconds 
old, received at 2014-10-25 10:57:18.140156: MOSDPGPush(*23.596* 102803 [Pus
hOp(4ca76d96/rbd_data.5dd32f2ae8944a.0385/head//24, version: 
102798'295732, data_included: [0~4194304], data_size: 4194304, omap_head
er_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(4ca76d96/rbd_data.5dd32f2ae8944a.0385/head//24@1
02798'295732, copy_subset: [0~4194304], clone_subset: {}), 
after_progress: ObjectRecoveryProgress(!first, 
data_recovered_to:4194304, data_complet
e:true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_re

covered_to:, omap_complete:false))]) v2 currently no flag points reached

2014-10-25 10:57:48.415910 osd.21 [WRN] slow request 30.860696 seconds 
old, received at 2014-10-25 10:57:17.555048: MOSDPGPush(*23.9c6* 102803 [Pus
hOp(efdde9c6/rbd_data.5b64062ae8944a.0b15/head//24, version: 
102798'66056, data_included: [0~4194304], data_size: 4194304, omap_heade
r_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(efdde9c6/rbd_data.5b64062ae8944a.0b15/head//24@10
2798'66056, copy_subset: [0~4194304], clone_subset: {}), after_progress: 
ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:
true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_reco

vered_to:, omap_complete:false))]) v2 currently no flag points reached

2014-10-25 10:57:58.418847 osd.21 [WRN] 26 slow requests, 1 included 
below; oldest blocked for  54.967456 secs
2014-10-25 10:57:58.418859 osd.21 [WRN] slow request 30.967294 seconds 
old, received at 2014-10-25 10:57:27.451488: MOSDPGPush(*23.63c* 102803 [Pus
hOp(40e4b63c/rbd_data.57ed612ae8944a.0c00/head//24, version: 
102748'145637, data_included: [0~4194304], data_size: 4194304, omap_head
er_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: 
ObjectRecoveryInfo(40e4b63c/rbd_data.57ed612ae8944a.0c00/head//24@1
02748'145637, copy_subset: [0~4194304], clone_subset: {}), 
after_progress: ObjectRecoveryProgress(!first, 
data_recovered_to:4194304, data_complet
e:true, omap_recovered_to:, omap_complete:true), before_progress: 
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, 
omap_re

covered_to:, omap_complete:false))]) v2 currently no flag points reached

Thanks!
--
Tuan
HaNoi-VietNam

On 10/25/2014 05:07 AM, Craig Lewis wrote:

It looks like you're running into http://tracker.ceph.com/issues/5699

You're running 0.80.7, which has a fix for that bug.  From my reading 
of the code, I believe the fix only prevents the issue from 
occurring.  It doesn't work around or repair bad snapshots created on 
older versions of Ceph.


Were any of the snapshots you're removing up created on older versions 
of Ceph?  If they were all created on Firefly, then you should open a 
new tracker issue, and try to get some help on IRC or the developers 
mailing list.


On Thu, Oct 23, 2014 at 10:21 PM, Ta Ba Tuan tua...@vccloud.vn 
mailto:tua...@vccloud.vn wrote:


Dear everyone

I can't start osd.21, (attached log file).
some pgs can't be repair. I'm using replicate 3 for my data pool.
Feel some objects in those pgs be failed,

I tried to delete some data that related above objects, but still
not start osd.21
and, removed osd.21, but other osds (eg: osd.86 down, not start
osd.86).

Guide me to debug it, please! Thanks!

--
Tuan
Ha Noi - VietNam










___
ceph-users mailing list