[ceph-users] how to get ceph daemon debug info from ceph-rest-api ?

2014-07-24 Thread zhu qiang
Hi all,

I want to use ceph-rest-api to view some debug details from ceph daemons.
On the Linux shell I can get this output:
# ceph daemon osd.0 dump_ops_in_flight | python -m json.tool
{
    "num_ops": 0,
    "ops": []
}

This is my question:
Can I get this output from ceph-rest-api ?

So far I have tried a few methods (curl, python-cephclient)
but did not get the right response.
Can anyone help me?







Re: [ceph-users] how to get ceph daemon debug info from ceph-rest-api ?

2014-07-24 Thread John Spray
It doesn't currently support that.  ceph-rest-api only wraps commands
that are sent to the mon cluster, whereas the ceph daemon operations
use the local admin socket (/var/run/ceph/*.asok) of the service.
There has been some discussion of enabling calls to admin socket
operations via the mon though.
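
For reference, until something like that exists the admin socket has to be
queried locally on the node that runs the daemon, while ceph-rest-api only
exposes mon-level commands. A rough sketch (socket path, host and port are
examples; adjust them to your setup):

# on the OSD node -- same data as "ceph daemon osd.0 dump_ops_in_flight"
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight

# against ceph-rest-api -- only mon-level commands, e.g. cluster health or the OSD tree
curl -H 'Accept: application/json' http://localhost:5000/api/v0.1/health
curl -H 'Accept: application/json' http://localhost:5000/api/v0.1/osd/tree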

John

On Thu, Jul 24, 2014 at 9:20 AM, zhu qiang zhu_qiang...@foxmail.com wrote:
 Hi all,

 I want to use ceph-rest-api to view some debug details from ceph daemons.
 On the Linux shell I can get this output:
 # ceph daemon osd.0 dump_ops_in_flight | python -m json.tool
  {
      "num_ops": 0,
      "ops": []
  }

 This is my question:
 Can I get this output from ceph-rest-api ?

 So far I have tried a few methods (curl, python-cephclient)
 but did not get the right response.
 Can anyone help me?







Re: [ceph-users] HW recommendations for OSD journals?

2014-07-24 Thread Chris Kitzmiller
I found this article very interesting: 
http://techreport.com/review/26523/the-ssd-endurance-experiment-casualties-on-the-way-to-a-petabyte

I've got Samsung 840 Pros, and while I don't think I'd go with them again,
I am interested in the fact that (in this anecdotal experiment) the drive
seemed to last much longer than the wear leveling indicator would have suggested.

On a side note, if anyone is having performance issues with these drives, I've 
found that this produced a drastic speed up: 
https://wiki.archlinux.org/index.php/SSD_Memory_Cell_Clearing
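
(For anyone curious, the procedure on that page boils down to an ATA secure
erase with hdparm, roughly as below. The device name and password are
placeholders, the drive must not report "frozen", and the erase destroys all
data on it -- please read the wiki before trying this.)

hdparm -I /dev/sdX | grep -A8 Security
hdparm --user-master u --security-set-pass p /dev/sdX
hdparm --user-master u --security-erase p /dev/sdX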


Re: [ceph-users] MON segfaulting when setting a crush ruleset to a pool (firefly 0.80.4)

2014-07-24 Thread Olivier DELHOMME
Hi Joao,

In the meantime I have done the following:

$ ceph osd crush move ceph-osd15 rack=rack1-pdu1
moved item id -17 name 'ceph-osd15' to location {rack=rack1-pdu1} in crush map

$ ceph osd crush rm rack2-pdu3
removed item id -23 name 'rack2-pdu3' from crush map

But that does not solve the problem either.
I saw in the documentation that restarting the OSDs where the PGs are stuck
could help... I restarted all the OSDs, but it leads to the following status:

cluster 4a8669b9-b379-43b2-9488-7fca6e1366bc
 health HEALTH_WARN 80 pgs degraded; 152 pgs peering; 411 pgs stale;
        166 pgs stuck inactive; 411 pgs stuck stale; 620 pgs stuck unclean;
        recovery 51106/694410 objects degraded (7.360%)
 monmap e2: 3 mons at {ceph-mon0=10.1.2.1:6789/0,ceph-mon1=10.1.2.2:6789/0,ceph-mon2=10.1.2.3:6789/0},
        election epoch 68, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
 osdmap e1825: 16 osds: 16 up, 16 in
  pgmap v301798: 712 pgs, 5 pools, 1350 GB data, 338 kobjects
        2763 GB used, 5615 GB / 8379 GB avail
        51106/694410 objects degraded (7.360%)
             152 stale+peering
              73 stale+active+remapped
              80 stale+active+degraded+remapped
              92 stale+active+clean
             301 active+remapped
              14 stale
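
To see which PGs are stuck and why, something along these lines can help
(the pg id in the last command is only an example):

ceph health detail
ceph pg dump_stuck stale
ceph pg dump_stuck unclean
ceph pg 3.7f query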

You'll find my crush map here :

http://pastebin.com/F9aFjcjm

Cheers,

Olivier.


----- Original Message -----
 From: Joao Eduardo Luis joao.l...@inktank.com
 To: Olivier DELHOMME olivier.delho...@mines-paristech.fr, 
 ceph-users@lists.ceph.com
 Sent: Wednesday, July 23, 2014 19:39:52
 Subject: Re: [ceph-users] MON segfaulting when setting a crush ruleset to a 
 pool (firefly 0.80.4)
 
 Hey Olivier,
 
 On 07/23/2014 02:06 PM, Olivier DELHOMME wrote:
  Hello,
 
  I'm running a test cluster (mon and osd are debian 7
  with  3.2.57-3+deb7u2 kernel). The client is a debian 7
  with a 3.15.4 kernel that I compiled myself.
 
  The cluster has 3 monitors and 16 osd servers.
  I created a pool (periph) and used it a bit, and then
  I decided to create some buckets and moved the hosts
  into them:
 
 Can you share your crush map?
 
 Cheers!
 
-Joao
 
 
 
  $ ceph osd crush add-bucket rack1-pdu1 rack
  $ ceph osd crush add-bucket rack1-pdu2 rack
  $ ceph osd crush add-bucket rack1-pdu3 rack
  $ ceph osd crush add-bucket rack2-pdu1 rack
  $ ceph osd crush add-bucket rack2-pdu2 rack
  $ ceph osd crush add-bucket rack2-pdu3 rack
  $ ceph osd crush move ceph-osd0 rack=rack1-pdu1
  $ ceph osd crush move ceph-osd1 rack=rack1-pdu1
  $ ceph osd crush move ceph-osd2 rack=rack1-pdu1
  $ ceph osd crush move ceph-osd3 rack=rack1-pdu2
  $ ceph osd crush move ceph-osd4 rack=rack1-pdu2
  $ ceph osd crush move ceph-osd5 rack=rack1-pdu2
  $ ceph osd crush move ceph-osd6 rack=rack1-pdu3
  $ ceph osd crush move ceph-osd7 rack=rack1-pdu3
  $ ceph osd crush move ceph-osd8 rack=rack1-pdu3
  $ ceph osd crush move ceph-osd9 rack=rack2-pdu1
  $ ceph osd crush move ceph-osd10 rack=rack2-pdu1
  $ ceph osd crush move ceph-osd11 rack=rack2-pdu1
  $ ceph osd crush move ceph-osd12 rack=rack2-pdu2
  $ ceph osd crush move ceph-osd13 rack=rack2-pdu2
  $ ceph osd crush move ceph-osd14 rack=rack2-pdu2
  $ ceph osd crush move ceph-osd15 rack=rack2-pdu3
 
  It did well:
 
  $ ceph osd tree
  # id    weight  type name               up/down reweight
  -23     0.91    rack rack2-pdu3
  -17     0.91            host ceph-osd15
  15      0.91                    osd.15  up      1
  -22     1.81    rack rack2-pdu2
  -14     0.45            host ceph-osd12
  12      0.45                    osd.12  up      1
  -15     0.45            host ceph-osd13
  13      0.45                    osd.13  up      1
  -16     0.91            host ceph-osd14
  14      0.91                    osd.14  up      1
  -21     1.35    rack rack2-pdu1
  -11     0.45            host ceph-osd9
  9       0.45                    osd.9   up      1
  -12     0.45            host ceph-osd10
  10      0.45                    osd.10  up      1
  -13     0.45            host ceph-osd11
  11      0.45                    osd.11  up      1
  -20     1.35    rack rack1-pdu3
  -8      0.45            host ceph-osd6
  6       0.45                    osd.6   up      1
  -9      0.45            host ceph-osd7
  7       0.45                    osd.7   up      1
  -10     0.45            host ceph-osd8
  8       0.45                    osd.8   up      1
  -19     1.35    rack rack1-pdu2
  -5      0.45            host ceph-osd3
  3       0.45                    osd.3   up      1
  -6      0.45            host ceph-osd4
  4       0.45                    osd.4   up      1
  -7      0.45            host ceph-osd5
  5       0.45                    osd.5   up      1
  -18     1.35    rack rack1-pdu1
  -2      0.45            host ceph-osd0
  0       0.45                    osd.0   up      1
  -3      0.45            host ceph-osd1
  1       0.45                    osd.1   up      1
  -4      0.45            host ceph-osd2

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Udo Lembke
Hi Steve,
I'm also looking for improvements of single-thread-reads.

Slightly higher values (maybe twice as much?) should be possible with your config.
I have 5 nodes with 60 4-TB HDDs and got the following:
rados -p test bench -b 4194304 60 seq -t 1 --no-cleanup
Total time run:        60.066934
Total reads made:      863
Read size:             4194304
Bandwidth (MB/sec):    57.469
Average Latency:       0.0695964
Max latency:           0.434677
Min latency:           0.016444

In my case I had some OSDs (xfs) with high fragmentation (20%).
Changing the mount options and defragmenting helped slightly.
Performance changes:
[client]
rbd cache = true
rbd cache writethrough until flush = true

[osd]
osd mount options xfs = rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M
osd_op_threads = 4
osd_disk_threads = 4

But I expect much more speed for a single thread...

Udo

On 23.07.2014 22:13, Steve Anthony wrote:
 Ah, ok. That makes sense. With one concurrent operation I see numbers
 more in line with the read speeds I'm seeing from the filesystems on the
 rbd images.

 # rados -p bench bench 300 seq --no-cleanup -t 1
 Total time run:300.114589
 Total reads made: 2795
 Read size:4194304
 Bandwidth (MB/sec):37.252

 Average Latency:   0.10737
 Max latency:   0.968115
 Min latency:   0.039754

 # rados -p bench bench 300 rand --no-cleanup -t 1
 Total time run:300.164208
 Total reads made: 2996
 Read size:4194304
 Bandwidth (MB/sec):39.925

 Average Latency:   0.100183
 Max latency:   1.04772
 Min latency:   0.039584

 I really wish I could find my data on read speeds from a couple weeks
 ago. It's possible that they've always been in this range, but I
 remember one of my test users saturating his 1GbE link over NFS while
 copying from the rbd client to his workstation. Of course, it's also
 possible that the data set he was using was cached in RAM when he was
 testing, masking the lower rbd speeds.

 It just seems counterintuitive to me that read speeds would be so much
 slower than writes at the filesystem layer in practice. With images in
 the 10-100TB range, reading data at 20-60MB/s isn't going to be
 pleasant. Can you suggest any tunables or other approaches to
 investigate to improve these speeds, or are they in line with what you'd
 expect? Thanks for your help!

 -Steve





Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Udo Lembke
Hi again,
forgot to say - I'm still on 0.72.2!

Udo




Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Jean-Tiare LE BIGOT
What is your kernel version? On kernels >= 3.11, sysctl -w 
net.ipv4.tcp_window_scaling=0 seems to improve the situation a lot. It 
also helped a lot to mitigate processes going (and sticking) into 'D' state.
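
A quick way to try it, and to keep it across reboots if it helps (the
sysctl.d file name is just an example):

sysctl -w net.ipv4.tcp_window_scaling=0
echo 'net.ipv4.tcp_window_scaling = 0' > /etc/sysctl.d/90-ceph-client.conf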


On 24/07/2014 22:08, Udo Lembke wrote:

Hi again,
forgot to say - I'm still on 0.72.2!

Udo




--
Jean-Tiare, shared-hosting team



[ceph-users] Ceph Berlin MeetUp 28.7.

2014-07-24 Thread Robert Sander
Hi,

the next Ceph MeetUp in Berlin, Germany, happens on July 28.

http://www.meetup.com/Ceph-Berlin/events/195107422/

Regards
-- 
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin





[ceph-users] use sas drives for journal?

2014-07-24 Thread Robert Fantini
Hello.

In this setup:
PowerEdge R720
Raid: Perc H710 eight-port, 6Gb/s

OSD drives: qty 4: Seagate Constellation ES.3 ST2000NM0023 2TB 7200 RPM
128MB Cache SAS 6Gb/s

Would it make sense to use these good SAS drives in RAID-1 for the journal?
 Western Digital XE WD3001BKHG 300GB 10K RPM 32MB Cache SAS 6Gb/s 2.5"


Or would it make sense to
b: put the journal on the OSDs
c: get 2 SSDs

I'm trying to find a good use for the WD 300GB drives. We are using
some for the OS in RAID-1, but we've got a few more to use up.
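
For reference, whichever device ends up holding the journal can be given when
the OSD is created, e.g. with ceph-deploy (host and device names below are
only examples):

ceph-deploy osd create node1:/dev/sdd:/dev/sdb1    # data disk : journal partition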


 best regards, Rob Fantini


Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Steve Anthony
Thanks for the information!

Based on my reading of http://ceph.com/docs/next/rbd/rbd-config-ref I
was under the impression that rbd cache options wouldn't apply, since
presumably the kernel is handling the caching. I'll have to toggle some
of those values and see if they make a difference in my setup.

I did some additional testing today. If I limit the write benchmark to 1
concurrent operation I see a lower bandwidth number, as I expected.
However, when writing to the XFS filesystem on an rbd image I see
transfer rates closer to 400MB/s.

# rados -p bench bench 300 write --no-cleanup -t 1

Total time run: 300.105945
Total writes made:  1992
Write size: 4194304
Bandwidth (MB/sec): 26.551

Stddev Bandwidth:   5.69114
Max bandwidth (MB/sec): 40
Min bandwidth (MB/sec): 0
Average Latency:0.15065
Stddev Latency: 0.0732024
Max latency:0.617945
Min latency:0.097339

# time cp -a /mnt/local/climate /mnt/ceph_test1

real    2m11.083s
user    0m0.440s
sys     1m11.632s

# du -h --max-depth=1 /mnt/local
53G     /mnt/local/climate

This seems to imply that there is more than one concurrent operation
when writing into the filesystem on top of the rbd image. However, given
that the filesystem read speeds and the rados benchmark read speeds are
much closer in reported bandwidth, it's as if reads are occurring as a
single operation.

# time cp -a /mnt/ceph_test2/isos /mnt/local/

real    36m2.129s
user    0m1.572s
sys     3m23.404s

# du -h --max-depth=1 /mnt/ceph_test2/
68G     /mnt/ceph_test2/isos

Is this apparent single-thread read and multi-thread write with the rbd
kernel module the expected mode of operation? If so, could someone
explain the reason for this limitation?
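
One knob that is often suggested for sequential reads through the kernel
client is the block-device readahead; a quick experiment would be (the device
name is only an example):

cat /sys/block/rbd0/queue/read_ahead_kb           # default is usually 128
echo 4096 > /sys/block/rbd0/queue/read_ahead_kb   # then re-run the read test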

Based on the information on data striping in
http://ceph.com/docs/next/architecture/#data-striping I would assume
that a format 1 image would stripe a file larger than the 4MB object
size over multiple objects and that those objects would be distributed
over multiple OSDs. This would seem to indicate that reading a file back
would be much faster since even though Ceph is only reading the primary
replica, the read is still distributed over multiple OSDs. At worst I
would expect something near the read bandwidth of a single OSD, which
would still be much higher than 30-40MB/s.
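
For reference, the object size and image format I'm assuming here can be
confirmed with rbd info (pool and image names are just examples):

rbd info bench/test-image    # shows size, order (object size), and format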

-Steve

On 07/24/2014 04:07 PM, Udo Lembke wrote:
 Hi Steve,
 I'm also looking for improvements of single-thread-reads.

 A little bit higher values (twice?) should be possible with your config.
 I have 5 nodes with 60 4-TB hdds and got following:
 rados -p test bench -b 4194304 60 seq -t 1 --no-cleanup
 Total time run:60.066934
 Total reads made: 863
 Read size:4194304
 Bandwidth (MB/sec):57.469
 Average Latency:   0.0695964
 Max latency:   0.434677
 Min latency:   0.016444

 In my case I had some OSDs (xfs) with high fragmentation (20%).
 Changing the mount options and defragmenting helped slightly.
 Performance changes:
 [client]
 rbd cache = true
 rbd cache writethrough until flush = true

 [osd]
 osd mount options xfs = rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M
 osd_op_threads = 4
 osd_disk_threads = 4

 But I expect much more speed for a single thread...

 Udo

 On 23.07.2014 22:13, Steve Anthony wrote:
 Ah, ok. That makes sense. With one concurrent operation I see numbers
 more in line with the read speeds I'm seeing from the filesystems on the
 rbd images.

 # rados -p bench bench 300 seq --no-cleanup -t 1
 Total time run:300.114589
 Total reads made: 2795
 Read size:4194304
 Bandwidth (MB/sec):37.252

 Average Latency:   0.10737
 Max latency:   0.968115
 Min latency:   0.039754

 # rados -p bench bench 300 rand --no-cleanup -t 1
 Total time run:300.164208
 Total reads made: 2996
 Read size:4194304
 Bandwidth (MB/sec):39.925

 Average Latency:   0.100183
 Max latency:   1.04772
 Min latency:   0.039584

 I really wish I could find my data on read speeds from a couple weeks
 ago. It's possible that they've always been in this range, but I
 remember one of my test users saturating his 1GbE link over NFS while
 copying from the rbd client to his workstation. Of course, it's also
 possible that the data set he was using was cached in RAM when he was
 testing, masking the lower rbd speeds.

 It just seems counterintuitive to me that read speeds would be so much
 slower than writes at the filesystem layer in practice. With images in
 the 10-100TB range, reading data at 20-60MB/s isn't going to be
 pleasant. Can you suggest any tunables or other approaches to