[ceph-users] osd down after server failure

2013-10-14 Thread Dominik Mostowiec
Hi,
I had a server failure that started with one disk failing:
Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023986] sd 4:2:26:0:
[sdaa] Unhandled error code
Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023990] sd 4:2:26:0:
[sdaa]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023995] sd 4:2:26:0:
[sdaa] CDB: Read(10): 28 00 00 00 00 d0 00 00 10 00
Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024005] end_request:
I/O error, dev sdaa, sector 208
Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024744] XFS (sdaa):
metadata I/O error: block 0xd0 (xfs_trans_read_buf) error 5 buf
count 8192
Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.025879] XFS (sdaa):
xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.820288] XFS (sdaa):
metadata I/O error: block 0xd0 (xfs_trans_read_buf) error 5 buf
count 8192
Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.821194] XFS (sdaa):
xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
Oct 14 03:25:32 s3-10-177-64-6 kernel: [1027264.667851] XFS (sdaa):
metadata I/O error: block 0xd0 (xfs_trans_read_buf) error 5 buf
count 8192

This caused the server to become unresponsive.

After the server restart, 3 of the 26 OSDs on it are down.
The ceph-osd log, after setting debug osd = 10 and restarting, shows:

2013-10-14 06:21:23.141936 7fdeb4872700 -1 osd.47 43203 *** Got signal
Terminated ***
2013-10-14 06:21:23.142141 7fdeb4872700 -1 osd.47 43203  pausing thread pools
2013-10-14 06:21:23.142146 7fdeb4872700 -1 osd.47 43203  flushing io
2013-10-14 06:21:25.406187 7f02690f9780  0
filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
appears to work
2013-10-14 06:21:25.406204 7f02690f9780  0
filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
'filestore fiemap' config option
2013-10-14 06:21:25.406557 7f02690f9780  0
filestore(/vol0/data/osd.47) mount did NOT detect btrfs
2013-10-14 06:21:25.412617 7f02690f9780  0
filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
(by glibc and kernel)
2013-10-14 06:21:25.412831 7f02690f9780  0
filestore(/vol0/data/osd.47) mount found snaps 
2013-10-14 06:21:25.415798 7f02690f9780  0
filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
btrfs not detected
2013-10-14 06:21:26.078377 7f02690f9780  2 osd.47 0 mounting
/vol0/data/osd.47 /vol0/data/osd.47/journal
2013-10-14 06:21:26.080872 7f02690f9780  0
filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
appears to work
2013-10-14 06:21:26.080885 7f02690f9780  0
filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
'filestore fiemap' config option
2013-10-14 06:21:26.081289 7f02690f9780  0
filestore(/vol0/data/osd.47) mount did NOT detect btrfs
2013-10-14 06:21:26.087524 7f02690f9780  0
filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
(by glibc and kernel)
2013-10-14 06:21:26.087582 7f02690f9780  0
filestore(/vol0/data/osd.47) mount found snaps 
2013-10-14 06:21:26.089614 7f02690f9780  0
filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
btrfs not detected
2013-10-14 06:21:26.726676 7f02690f9780  2 osd.47 0 boot
2013-10-14 06:21:26.726773 7f02690f9780 10 osd.47 0 read_superblock
sb(16773c25-5054-4451-bf9f-efc1f7f21b89 osd.47
63cf7d70-99cb-0ab1-4006-002f e43203 [41261,43203]
lci=[43194,43203])
2013-10-14 06:21:26.726862 7f02690f9780 10 osd.47 0 add_map_bl 43203 82622 bytes
2013-10-14 06:21:26.727184 7f02690f9780 10 osd.47 43203 load_pgs
2013-10-14 06:21:26.727643 7f02690f9780 10 osd.47 43203 load_pgs
ignoring unrecognized meta
2013-10-14 06:21:26.727681 7f02690f9780 10 osd.47 43203 load_pgs
3.df1_TEMP clearing temp

osd.47 is still down, so I marked it out of the cluster:
47  1   osd.47  down    0

How can I check what is wrong?

ceph -v
ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
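
If it helps, I can paste output from checks along these lines (nothing
specific to my setup, just the standard commands):

  ceph health detail                          # what the monitors report as down/degraded
  ceph osd tree | grep osd.47                 # up/down state and weight of the affected OSD
  ceph osd dump | grep osd.47                 # whether it got marked out or lost
  tail -n 200 /var/log/ceph/ceph-osd.47.log   # last lines before the daemon stopped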

-- 
Pozdrawiam
Dominik
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] 2013-10-14 14:42:23 auto-saved draft

2013-10-14 Thread
Hi all,
I followed the mail to configure Ceph with Hadoop
(http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/1809).
1. Install additional packages libcephfs-java and libcephfs-jni using the
commands:
./configure --enable-cephfs-java
make && make install
cp /src/java/libcephfs.jar /usr/hadoop/lib/
2. Download http://ceph.com/download/hadoop-cephfs.jar
cp hadoop-cephfs.jar /usr/hadoop/lib

3. Symlink the JNI library
cd /usr/hadoop/lib/native/Linux-amd64-64
ln -s /usr/local/lib/libcephfs_jni.so .

4. vim core-site.xml and set:
fs.default.name=ceph://192.168.22.158:6789/
fs.ceph.impl=org.apache.hadoop.fs.ceph.CephFileSystem
ceph.conf.file=/etc/ceph/ceph.conf
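
(In XML form I believe those core-site.xml entries look roughly like this --
just my settings from step 4 written out, nothing new:)

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>ceph://192.168.22.158:6789/</value>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
  <property>
    <name>ceph.conf.file</name>
    <value>/etc/ceph/ceph.conf</value>
  </property>
</configuration>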

and then:
# hadoop fs -ls
ls: cannot access .: No such file or directory
# hadoop dfsadmin -report
report: FileSystem ceph://192.168.22.158:6789 is not a distributed file
system
Usage: java DFSAdmin [-report]

thanks
pengft

 




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd down after server failure

2013-10-14 Thread Dominik Mostowiec
Hi,
I have found something.
After the restart, the time on the server was wrong (+2 hours) until NTP fixed it.
I restarted these 3 OSDs - it did not help.
Is it possible that Ceph banned these OSDs? Or could an OSD have broken its
filestore after starting with the wrong time?

--
Regards
Dominik


2013/10/14 Dominik Mostowiec dominikmostow...@gmail.com:
 Hi,
 I had server failure that starts from one disk failure:
 Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023986] sd 4:2:26:0:
 [sdaa] Unhandled error code
 Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023990] sd 4:2:26:0:
 [sdaa]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
 Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023995] sd 4:2:26:0:
 [sdaa] CDB: Read(10): 28 00 00 00 00 d0 00 00 10 00
 Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024005] end_request:
 I/O error, dev sdaa, sector 208
 Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024744] XFS (sdaa):
 metadata I/O error: block 0xd0 (xfs_trans_read_buf) error 5 buf
 count 8192
 Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.025879] XFS (sdaa):
 xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
 Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.820288] XFS (sdaa):
 metadata I/O error: block 0xd0 (xfs_trans_read_buf) error 5 buf
 count 8192
 Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.821194] XFS (sdaa):
 xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
 Oct 14 03:25:32 s3-10-177-64-6 kernel: [1027264.667851] XFS (sdaa):
 metadata I/O error: block 0xd0 (xfs_trans_read_buf) error 5 buf
 count 8192

 this caused that the server has been unresponsive.

 After server restart 3 of 26 osd on it are down.
 In ceph-osd log after debug osd = 10 and restart is:

 2013-10-14 06:21:23.141936 7fdeb4872700 -1 osd.47 43203 *** Got signal
 Terminated ***
 2013-10-14 06:21:23.142141 7fdeb4872700 -1 osd.47 43203  pausing thread pools
 2013-10-14 06:21:23.142146 7fdeb4872700 -1 osd.47 43203  flushing io
 2013-10-14 06:21:25.406187 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
 appears to work
 2013-10-14 06:21:25.406204 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
 'filestore fiemap' config option
 2013-10-14 06:21:25.406557 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount did NOT detect btrfs
 2013-10-14 06:21:25.412617 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
 (by glibc and kernel)
 2013-10-14 06:21:25.412831 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount found snaps 
 2013-10-14 06:21:25.415798 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
 btrfs not detected
 2013-10-14 06:21:26.078377 7f02690f9780  2 osd.47 0 mounting
 /vol0/data/osd.47 /vol0/data/osd.47/journal
 2013-10-14 06:21:26.080872 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
 appears to work
 2013-10-14 06:21:26.080885 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
 'filestore fiemap' config option
 2013-10-14 06:21:26.081289 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount did NOT detect btrfs
 2013-10-14 06:21:26.087524 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
 (by glibc and kernel)
 2013-10-14 06:21:26.087582 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount found snaps 
 2013-10-14 06:21:26.089614 7f02690f9780  0
 filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
 btrfs not detected
 2013-10-14 06:21:26.726676 7f02690f9780  2 osd.47 0 boot
 2013-10-14 06:21:26.726773 7f02690f9780 10 osd.47 0 read_superblock
 sb(16773c25-5054-4451-bf9f-efc1f7f21b89 osd.47
 63cf7d70-99cb-0ab1-4006-002f e43203 [41261,43203]
 lci=[43194,43203])
 2013-10-14 06:21:26.726862 7f02690f9780 10 osd.47 0 add_map_bl 43203 82622 
 bytes
 2013-10-14 06:21:26.727184 7f02690f9780 10 osd.47 43203 load_pgs
 2013-10-14 06:21:26.727643 7f02690f9780 10 osd.47 43203 load_pgs
 ignoring unrecognized meta
 2013-10-14 06:21:26.727681 7f02690f9780 10 osd.47 43203 load_pgs
 3.df1_TEMP clearing temp

 osd.47 is still down, I put it out from cluster.
 47  1   osd.47  down0

 How can I check what is wrong?

 ceph -v
 ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)

 --
 Pozdrawiam
 Dominik



-- 
Pozdrawiam
Dominik
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw-admin doesn't list user anymore

2013-10-14 Thread Valery Tschopp

We upgraded from 0.61.8 to 0.67.4.

The metadata commands work for both the users and the buckets:

root@ineri ~$ radosgw-admin metadata list bucket
[
    "a4mesh",
    "61a75c04-34a5-11e3-9bea-8f8d15b5cf20",
    "6e22de72-34a5-11e3-afc4-d3f70b676c52",
...


root@ineri ~$ radosgw-admin metadata list user
[
    "cloudbroker",
    "a4mesh",
    "valery",
...

Cheers,
Valery

On 11/10/13 18:27 , Yehuda Sadeh wrote:

On Fri, Oct 11, 2013 at 7:46 AM, Valery Tschopp
valery.tsch...@switch.ch wrote:

Hi,

Since we upgraded ceph to 0.67.4, the radosgw-admin doesn't list all the
users anymore:

root@ineri:~# radosgw-admin user info
could not fetch user info: no user info saved


But it still work for single user:

root@ineri:~# radosgw-admin user info --uid=valery
{ "user_id": "valery",
  "display_name": "Valery Tschopp",
  "email": "valery.tsch...@switch.ch",
...

The debug log file is too big for the mailing-list, but here it is on
pastebin: http://pastebin.com/cFypJ2Qd



What version did you upgrade from?

You can try using the following:

$ radosgw-admin metadata list bucket

Thanks,
Yehuda



--
SWITCH
--
Valery Tschopp, Software Engineer, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
email: valery.tsch...@switch.ch phone: +41 44 268 1544




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] using ceph with hadoop

2013-10-14 Thread

Hi all,
I followed the mail to configure Ceph with Hadoop
(http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/1809).
1. Install additional packages libcephfs-java and libcephfs-jni using the
commands:
./configure --enable-cephfs-java
make && make install
cp /src/java/libcephfs.jar /usr/hadoop/lib/
2. Download http://ceph.com/download/hadoop-cephfs.jar
cp hadoop-cephfs.jar /usr/hadoop/lib

3. Symlink the JNI library
cd /usr/hadoop/lib/native/Linux-amd64-64
ln -s /usr/local/lib/libcephfs_jni.so .

4. vim core-site.xml and set:
fs.default.name=ceph://192.168.22.158:6789/
fs.ceph.impl=org.apache.hadoop.fs.ceph.CephFileSystem
ceph.conf.file=/etc/ceph/ceph.conf

and then:
# hadoop fs -ls
ls: cannot access .: No such file or directory
# hadoop dfsadmin -report
report: FileSystem ceph://192.168.22.158:6789 is not a distributed file
system
Usage: java DFSAdmin [-report]
# /usr/hadoop/bin/stop-all.sh
# /usr/hadoop/bin/start-all.sh
hadoop: Exception in thread "IPC Client(47) Connection to
192.168.58.129:6789 from rwt" java.lang.RuntimeException: readObject can't
find class



 thanks
pengft









___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Speed limit on RadosGW?

2013-10-14 Thread Chu Duc Minh
Hi, sorry, I missed this mail.

 During writes, does the CPU usage on your RadosGW node go way up?
No, the CPU stays the same & very low (< 10%)

When uploading small files (300KB/file) over RadosGW:
 - using 1 process: upload bandwidth ~ 3MB/s
 - using 100 processes: upload bandwidth ~ 15MB/s

When uploading big files (3GB/file) over RadosGW:
 - using 1 process: upload bandwidth ~ 70MB/s
(Therefore I don't upload big files using multiple processes any more :D)

Maybe RadosGW has a problem when writing many small files. Or is it a
problem of Ceph when simultaneously writing many small files into a bucket
that already has millions of files?


On Wed, Sep 25, 2013 at 7:24 PM, Mark Nelson mark.nel...@inktank.com wrote:

 On 09/25/2013 02:49 AM, Chu Duc Minh wrote:

 I have a Ceph cluster with 9 nodes (6 data nodes & 3 mon/mds nodes)
 And I set up 4 separate nodes to test the performance of Rados-GW:
   - 2 nodes run Rados-GW
   - 2 nodes run multi-process put-file to [multi] Rados-GW

 Result:
 a) When I use 1 RadosGW node & 1 upload-node, upload speed = 50MB/s
 /upload-node, Rados-GW input/output speed = 50MB/s

 b) When I use 2 RadosGW nodes & 1 upload-node, upload speed = 50MB/s
 /upload-node; each RadosGW has input/output = 25MB/s => sum
 input/output of the 2 Rados-GW = 50MB/s

 c) When I use 1 RadosGW node & 2 upload-nodes, upload speed = 25MB/s
 /upload-node => sum output of the 2 upload-nodes = 50MB/s, RadosGW has
 input/output = 50MB/s

 d) When I use 2 RadosGW nodes & 2 upload-nodes, upload speed = 25MB/s
 /upload-node => sum output of the 2 upload-nodes = 50MB/s; each RadosGW has
 input/output = 25MB/s => sum input/output of the 2 Rados-GW = 50MB/s

 _*Problem*_: I can't get past the 50MB/s limit when putting files over Rados-GW,
 regardless of the number of Rados-GW nodes and upload-nodes.
 When I use this Ceph cluster over librados (openstack/kvm), I can easily
 achieve > 300MB/s

 I don't know why the performance of RadosGW is so low. What's the bottleneck?


 During writes, does the CPU usage on your RadosGW node go way up?

 If this is a test cluster, you might want to try the wip-6286 build from
 our gitbuilder site.  There is a fix that depending on the size of your
 objects, could have a big impact on performance.  We're currently
 investigating some other radosgw performance issues as well, so stay tuned.
 :)

 Mark



 Thank you very much!




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw can still get the object even if this object's physical file is removed on OSDs

2013-10-14 Thread david zhang
Hi ceph-users,

I uploaded an object successfully to radosgw with 3 replicas, and I located
the physical paths of all 3 replicas on different OSDs.

E.g., one of the 3 physical paths is
/var/lib/ceph/osd/ceph-2/current/3.5_head/DIR_D/default.4896.65\\u20131014\\u1__head_0646563D__3

Then I manually deleted all 3 replica files on the OSDs, but this object
can still be fetched from radosgw with HTTP code 200, even after I cleaned all
the caches on both radosgw and the OSDs with 'echo 3 > /proc/sys/vm/drop_caches'.
Only after I restarted the 3 OSDs does the GET request return 404.

What did I miss? Is it not right to clean the cache that way?

Thanks.

-- 
Regards,
Zhi
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Using ceph with hadoop error

2013-10-14 Thread Noah Watkins
On Sun, Oct 13, 2013 at 8:28 PM, 鹏 wkp4...@126.com wrote:
  hi all:
 Exception in thread "main" java.lang.NoClassDefFoundError:
 com/ceph/fs/cephFileAlreadyExisteException
 at java.lang.class.forName0(Native Method)

This looks like a bug, which I'll fixup today. But it shouldn't be
related to the problems you are seeing.

 Caused by :
 java.lang.classNotFoundException:com.ceph.fs.CephFileAlreadyExistsException
  at java.net.URLClassLoader$1.run(URLClassLoader.jar:202)
  at

This looks like you don't have the CephFS Java bindings in a place
that Hadoop can locate. Typically you can stick the libcephfs-jar file
into the lib directory of Hadoop, or add it to your classpath.
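
For example, something along these lines (the jar path is just a guess at a
typical packaging layout, adjust to wherever libcephfs.jar ended up):

  # copy the CephFS Java bindings where Hadoop will pick them up
  cp /usr/share/java/libcephfs.jar /usr/hadoop/lib/

  # or add it to the classpath in hadoop-env.sh instead
  export HADOOP_CLASSPATH=/usr/share/java/libcephfs.jar:$HADOOP_CLASSPATH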
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] using ceph with hadoop

2013-10-14 Thread Noah Watkins
The error below seems to indicate that Hadoop isn't aware of the `ceph://`
file system. You'll need to manually add this to your core-site.xml:

<property>
  <name>fs.ceph.impl</name>
  <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>



 report:FileSystem ceph://192.168.22.158:6789 is not a distributed file 
 System
 Usage: java DFSAdmin[-report]
 # /usr/hadoop/bin/stop-all.sh
 # /usr/hadoop/bin/start-all.sh
hadoop:Exception in thread IPC Client(47) Connection to 
 192.168.58.129:6789 from rwt java.lang.RuntimeException:readObject cant find 
 class



  thanks
 pengft













___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 2013-10-14 14:42:23 auto-saved draft

2013-10-14 Thread Noah Watkins
Do you have the following in your core-site.xml?

 <property>
   <name>fs.ceph.impl</name>
   <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
 </property>

On Sun, Oct 13, 2013 at 11:55 PM, 鹏 wkp4...@126.com wrote:
 hi all
 I follow the mail  configure the ceph with hadoop
 (http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/1809).
 1. Install additional packages: libcephfs-java libcephfs-jni  using the
 commonds:
 ./configure --enable-cephfs-java
 make  make install
 cp /src/java/libcephfs.jar  /usr/hadoop/lib/
 2. Download http://ceph.com/download/hadoop-cephfs.jar
  cp hadoop-cephfs.jar /usr/hadoop/lib


   3. Symink JNI library
 cd /usr/hadoop/lib/native/Linux-amd64-64
 ln -s /usr/local/lib/libcephfs_jni.so .

4 vim  core-site.xml
 fs.default.name=ceph://192.168.22.158:6789/
 fs.ceph.impl=org.apache.hadoop.fs.ceph.CephFileSystem
 ceph.conf.file=/etc/ceph/ceph.conf

 and then
# hadoop fs -ls
ls: cannot access . :no such file or directory
 #hadoop dfsadmin -report
 report:FileSystem ceph://192.168.22.158:6789 is not a distributed file
 System
 Usage: java DFSAdmin[-report]

 thanks
 pengft









 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw can still get the object even if this object's physical file is removed on OSDs

2013-10-14 Thread Yehuda Sadeh
On Mon, Oct 14, 2013 at 4:04 AM, david zhang zhang.david2...@gmail.com wrote:
 Hi ceph-users,

 I uploaded an object successfully to radosgw with 3 replicas. And I located
 all the physical paths of 3 replicas on different OSDs.

 i.e, one of the 3 physical paths is
 /var/lib/ceph/osd/ceph-2/current/3.5_head/DIR_D/default.4896.65\\u20131014\\u1__head_0646563D__3

 Then I manually deleted all the 3 replica files on OSDs, but this object can
 still get from radosgw with http code 200 even I cleaned all the caches on
 both radosgw and OSDs by 'echo 3  /proc/sys/vm/drop_caches'. Only after I
 restarted the 3 OSDs, get request will return 404.

 What did I miss? Is it not right to clean cache in that way?

I'm not too sure what you're trying to achieve. You should never ever
access the osd objects directly like that. The reason you're still
able to read the objects is probably because the osd keeps open fds
for recently opened files and it still holds a reference to them. If
you need to remove objects off the rados backend you should use the
rados tool to do that. However, since you created the objects via
radosgw, you're going to have some radosgw consistency issues, so in
that case the way to go would be by going through radosgw-admin (or
through the radosgw RESTful api).
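
A rough sketch of what I mean (the pool, bucket and object names below are
placeholders, not taken from your setup):

  # remove the raw RADOS object directly (bypasses radosgw's own bookkeeping)
  rados -p .rgw.buckets rm default.4896.65_mybucket_myobject

  # better: remove it through radosgw-admin so the gateway index stays consistent
  radosgw-admin object rm --bucket=mybucket --object=myobject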


Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Speed limit on RadosGW?

2013-10-14 Thread Kyle Bader
I've personally saturated 1Gbps links on multiple radosgw nodes on a large
cluster; if I remember correctly, Yehuda has tested it up into the 7Gbps
range with 10Gbps gear. Could you describe your cluster's hardware and
connectivity?


On Mon, Oct 14, 2013 at 3:34 AM, Chu Duc Minh chu.ducm...@gmail.com wrote:

 Hi sorry, i missed this mail.


  During writes, does the CPU usage on your RadosGW node go way up?
 No, the CPU stays the same & very low (< 10%)

 When uploading small files (300KB/file) over RadosGW:
  - using 1 process: upload bandwidth ~ 3MB/s
  - using 100 processes: upload bandwidth ~ 15MB/s

 When uploading big files (3GB/file) over RadosGW:
  - using 1 process: upload bandwidth ~ 70MB/s
 (Therefore I don't upload big files using multiple processes any more :D)

 Maybe RadosGW has a problem when writing many small files. Or is it a
 problem of Ceph when simultaneously writing many small files into a bucket
 that already has millions of files?


 On Wed, Sep 25, 2013 at 7:24 PM, Mark Nelson mark.nel...@inktank.comwrote:

 On 09/25/2013 02:49 AM, Chu Duc Minh wrote:

 I have a Ceph cluster with 9 nodes (6 data nodes & 3 mon/mds nodes)
 And I set up 4 separate nodes to test the performance of Rados-GW:
   - 2 nodes run Rados-GW
   - 2 nodes run multi-process put-file to [multi] Rados-GW

 Result:
 a) When I use 1 RadosGW node & 1 upload-node, upload speed = 50MB/s
 /upload-node, Rados-GW input/output speed = 50MB/s

 b) When I use 2 RadosGW nodes & 1 upload-node, upload speed = 50MB/s
 /upload-node; each RadosGW has input/output = 25MB/s => sum
 input/output of the 2 Rados-GW = 50MB/s

 c) When I use 1 RadosGW node & 2 upload-nodes, upload speed = 25MB/s
 /upload-node => sum output of the 2 upload-nodes = 50MB/s, RadosGW has
 input/output = 50MB/s

 d) When I use 2 RadosGW nodes & 2 upload-nodes, upload speed = 25MB/s
 /upload-node => sum output of the 2 upload-nodes = 50MB/s; each RadosGW has
 input/output = 25MB/s => sum input/output of the 2 Rados-GW = 50MB/s

 _*Problem*_: I can't get past the 50MB/s limit when putting files over Rados-GW,
 regardless of the number of Rados-GW nodes and upload-nodes.
 When I use this Ceph cluster over librados (openstack/kvm), I can easily
 achieve > 300MB/s

 I don't know why the performance of RadosGW is so low. What's the bottleneck?


 During writes, does the CPU usage on your RadosGW node go way up?

 If this is a test cluster, you might want to try the wip-6286 build from
 our gitbuilder site.  There is a fix that depending on the size of your
 objects, could have a big impact on performance.  We're currently
 investigating some other radosgw performance issues as well, so stay tuned.
 :)

 Mark



 Thank you very much!




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 

Kyle
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] xfs log device and osd journal specifications in ceph.conf

2013-10-14 Thread Snider, Tim
3 questions:

1. I'd like to use XFS devices with a separate log device in a Ceph cluster.
What's the best way to do this? Is it possible to specify XFS log devices in
the [osd.x] sections of ceph.conf?
E.g.:
[osd.0]
host = delta
devs = /dev/sdx
osd mkfs options xfs = -d su=131072,sw=8 -i size=1024 -l 
logdev=/dev/sdq1,su=131072

[osd.1]
host = epsilon
devs = /dev/sdy
osd mkfs options xfs = -d su=131072,sw=8 -i size=1024 -l 
logdev=/dev/sdq2,su=131072

2. Is this the correct syntax for the line without the log device options?
 osd mkfs options xfs = -d su=131072,sw=8 -i size=1024

3. For OSD journal devices, I assume there's a 1:1 relationship between OSDs
and journal devices. The section in sample.ceph.conf seems to imply a single
entry.
Should there be an 'osd journal' entry in each [osd.x] section of ceph.conf?
(See the sketch after the sample below.)

[osd]
; This is where the osd expects its data
osd data = /data/$name

; Ideally, make the journal a separate disk or partition.
; 1-10GB should be enough; more if you have fast or many
; disks.  You can use a file under the osd data dir if need be
; (e.g. /data/$name/journal), but it will be slower than a
; separate disk or partition.
; This is an example of a file-based journal.
osd journal = /data/$name/journal
osd journal size = 1000 ; journal size, in megabytes
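
In other words, is a per-OSD layout along these lines what's intended? (Just a
sketch of what I imagine; the journal partitions are made up, and I'm assuming
a journal size of 0 means "use the whole block device".)

[osd.0]
host = delta
devs = /dev/sdx
osd journal = /dev/sdq3 ; dedicated journal partition for osd.0
osd journal size = 0    ; 0 = use the entire partition

[osd.1]
host = epsilon
devs = /dev/sdy
osd journal = /dev/sdq4 ; dedicated journal partition for osd.1
osd journal size = 0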

On my cluster (deployed with ceph-deploy) the data is in /var/lib/ceph/osd, not
/data/$name as in the sample file. Directory organization on my cluster:
/var/lib/ceph/osd/:
ceph-0  ceph-10  ceph-12  ceph-14  ceph-16  ceph-18  ceph-2   ceph-21  
ceph-3  ceph-5  ceph-7  ceph-9
ceph-1  ceph-11  ceph-13  ceph-15  ceph-17  ceph-19  ceph-20  ceph-22  
ceph-4  ceph-6  ceph-8

/var/lib/ceph/osd/ceph-0:

/var/lib/ceph/osd/ceph-1:

ls /data
ls: cannot access /data: No such file or directory
Thanks,
Tim
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Production locked: OSDs down

2013-10-14 Thread Mikaël Cluseau

Hi,

I have a pretty big problem here... my OSDs are marked down (except one?!)

I have ceph version 0.61.8 (a6fdcca3bddbc9f177e4e2bf0d9cdd85006b028b).

I recently had full monitors, so I had to remove them, but that seemed to
work.


# id    weight  type name       up/down reweight
-1      15      root default
-3      6               datacenter xxx
-2      0                       host cloud-1
-4      0                       host cloud-2
-7      3                       host xxx-1
7       1                               osd.7   down    1
8       1                               osd.8   down    1
9       1                               osd.9   down    1
-8      3                       host xxx-2
3       1                               osd.3   down    1
4       1                               osd.4   down    1
5       1                               osd.5   up      1


I see this in the logs when I try to restart them :

2013-10-15 06:54:32.651951 7fa5db16b780  1 journal _open
/dev/ssd/osd_3_jrn fd 26: 5368709120 bytes, block size 4096 bytes,
directio = 1, aio = 1
2013-10-15 06:54:36.321235 7fa5ac741700  0 -- 192.168.242.2:6801/29193
>> 192.168.242.1:6811/12764 pipe(0x7fa588002490 sd=28 :0 s=1 pgs=0 cs=0
l=0).fault with nothing to send, going to standby
2013-10-15 06:54:36.321256 7fa59c2f3700  0 -- 192.168.242.2:6801/29193
>> 192.168.242.1:6801/12362 pipe(0x7fa588001490 sd=27 :0 s=1 pgs=0 cs=0
l=0).fault with nothing to send, going to standby
2013-10-15 06:54:36.321267 7fa5ac13b700  0 -- 192.168.242.2:6801/29193
>> 192.168.242.1:6814/13354 pipe(0x7fa588001970 sd=30 :0 s=1 pgs=0 cs=0
l=0).fault with nothing to send, going to standby
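
If it matters, these are the kinds of checks I can run and post output from
(generic commands, nothing exotic):

ceph mon dump                                          # monitor map currently in use
ceph osd dump | grep down                              # state the cluster recorded for the down OSDs
ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok status   # ask the one OSD that is still up what it sees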


Any idea?

Thanks!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] kvm live migrate wil ceph

2013-10-14 Thread Jon
Hello,

I would like to live migrate a VM between two hypervisors.  Is it
possible to do this with an rbd disk, or should the vm disks be created as
qcow images on a CephFS/NFS share (is it possible to do clvm over rbds? Or
GlusterFS over rbds?) and point kvm at the network directory.  As I
understand it, rbds aren't cluster aware so you can't mount an rbd on
multiple hosts at once, but maybe libvirt has a way to handle the
transfer...?  I like the idea of master or golden images where guests
write any changes to a new image; I don't think rbds are able to handle
copy-on-write in the same way kvm does, so maybe a clustered filesystem
approach is the ideal way to go.

Thanks for your input. I think I'm just missing some piece... I just don't
grok it.

Best Regards,
Jon A
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Full OSD with 29% free

2013-10-14 Thread Michael Lowe
How fragmented is that file system?

Sent from my iPad

 On Oct 14, 2013, at 5:44 PM, Bryan Stillwell bstillw...@photobucket.com 
 wrote:
 
 This appears to be more of an XFS issue than a ceph issue, but I've
 run into a problem where some of my OSDs failed because the filesystem
 was reported as full even though there was 29% free:
 
 [root@den2ceph001 ceph-1]# touch blah
 touch: cannot touch `blah': No space left on device
 [root@den2ceph001 ceph-1]# df .
 Filesystem   1K-blocks  Used Available Use% Mounted on
 /dev/sdc1486562672 342139340 144423332  71% 
 /var/lib/ceph/osd/ceph-1
 [root@den2ceph001 ceph-1]# df -i .
 FilesystemInodes   IUsed   IFree IUse% Mounted on
 /dev/sdc160849984 4097408 567525767% /var/lib/ceph/osd/ceph-1
 [root@den2ceph001 ceph-1]#
 
 I've tried remounting the filesystem with the inode64 option like a
 few people recommended, but that didn't help (probably because it
 doesn't appear to be running out of inodes).
 
 This happened while I was on vacation and I'm pretty sure it was
 caused by another OSD failing on the same node.  I've been able to
 recover from the situation by bringing the failed OSD back online, but
 it's only a matter of time until I'll be running into this issue again
 since my cluster is still being populated.
 
 Any ideas on things I can try the next time this happens?
 
 Thanks,
 Bryan
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] kvm live migrate wil ceph

2013-10-14 Thread Michael Lowe
I live migrate all the time using the rbd driver in qemu, no problems.  Qemu
will issue a flush as part of the migration so everything is consistent.  It's
the right way to use Ceph to back VMs. I would strongly recommend against a
network file system approach.  You may want to look into format 2 rbd images;
the cloning and writable snapshots may be what you are looking for.
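
Roughly like this, from memory (the pool and image names are made up; check the
rbd man page for the exact flags on your version):

  # create a format 2 "golden" image and take a protected snapshot of it
  rbd create --image-format 2 --size 20480 rbd/golden-image
  rbd snap create rbd/golden-image@base
  rbd snap protect rbd/golden-image@base

  # each guest then gets a cheap copy-on-write clone of that snapshot
  rbd clone rbd/golden-image@base rbd/vm-disk-01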

Sent from my iPad

 On Oct 14, 2013, at 5:37 AM, Jon three1...@gmail.com wrote:
 
 Hello,
 
 I would like to live migrate a VM between two hypervisors.  Is it possible 
 to do this with a rbd disk or should the vm disks be created as qcow images 
 on a CephFS/NFS share (is it possible to do clvm over rbds? OR GlusterFS over 
 rbds?)and point kvm at the network directory.  As I understand it, rbds 
 aren't cluster aware so you can't mount an rbd on multiple hosts at once, 
 but maybe libvirt has a way to handle the transfer...?  I like the idea of 
 master or golden images where guests write any changes to a new image, I 
 don't think rbds are able to handle copy-on-write in the same way kvm does so 
 maybe a clustered filesystem approach is the ideal way to go.
 
 Thanks for your input. I think I'm just missing some piece. .. I just don't 
 grok...
 
 Bestv Regards,
 Jon A
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Full OSD with 29% free

2013-10-14 Thread Bryan Stillwell
The filesystem isn't as full now, but the fragmentation is pretty low:

[root@den2ceph001 ~]# df /dev/sdc1
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sdc1486562672 270845628 215717044  56% /var/lib/ceph/osd/ceph-1
[root@den2ceph001 ~]# xfs_db -c frag -r /dev/sdc1
actual 3481543, ideal 3447443, fragmentation factor 0.98%

Bryan

On Mon, Oct 14, 2013 at 4:35 PM, Michael Lowe j.michael.l...@gmail.com wrote:

 How fragmented is that file system?

 Sent from my iPad

  On Oct 14, 2013, at 5:44 PM, Bryan Stillwell bstillw...@photobucket.com 
  wrote:
 
  This appears to be more of an XFS issue than a ceph issue, but I've
  run into a problem where some of my OSDs failed because the filesystem
  was reported as full even though there was 29% free:
 
  [root@den2ceph001 ceph-1]# touch blah
  touch: cannot touch `blah': No space left on device
  [root@den2ceph001 ceph-1]# df .
  Filesystem   1K-blocks  Used Available Use% Mounted on
  /dev/sdc1486562672 342139340 144423332  71% 
  /var/lib/ceph/osd/ceph-1
  [root@den2ceph001 ceph-1]# df -i .
  FilesystemInodes   IUsed   IFree IUse% Mounted on
  /dev/sdc160849984 4097408 567525767% 
  /var/lib/ceph/osd/ceph-1
  [root@den2ceph001 ceph-1]#
 
  I've tried remounting the filesystem with the inode64 option like a
  few people recommended, but that didn't help (probably because it
  doesn't appear to be running out of inodes).
 
  This happened while I was on vacation and I'm pretty sure it was
  caused by another OSD failing on the same node.  I've been able to
  recover from the situation by bringing the failed OSD back online, but
  it's only a matter of time until I'll be running into this issue again
  since my cluster is still being populated.
 
  Any ideas on things I can try the next time this happens?
 
  Thanks,
  Bryan
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Using Hadoop With Cephfs

2013-10-14 Thread log1024
Hi,
I have a 4-node Ceph cluster (2 mon, 1 mds, 2 osd) and a Hadoop node.
Currently, I'm trying to replace HDFS with CephFS. I followed the instructions
in USING HADOOP WITH CEPHFS, but every time I run bin/start-all.sh to start
Hadoop, it fails with:


starting namenode, logging to 
/usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-ceph-srv1.out
localhost: starting datanode, logging to 
/usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-ceph-srv1.out
localhost: Exception in thread "IPC Client (47) connection to
/172.29.84.56:6789 from hduser" java.lang.RuntimeException: readObject can't
find class
localhost: at 
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:185)
localhost: at 
org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
localhost: at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:851)
localhost: at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
localhost: Caused by: java.lang.ClassNotFoundException:
localhost: at java.lang.Class.forName0(Native Method)
localhost: at java.lang.Class.forName(Class.java:249)
localhost: at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
localhost: at 
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)


My core-site.xml:
<configuration>
  <property>
    <name>ceph.conf.file</name>
    <value>/etc/ceph/ceph.conf</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>ceph://172.29.84.56:6789/</value>
  </property>
  <property>
    <name>ceph.mon.address</name>
    <value>172.29.84.56:6789</value>
  </property>
  <property>
    <name>ceph.auth.keyring</name>
    <value>/etc/ceph/ceph.client.admin.keyring</value>
  </property>
  <property>
    <name>ceph.data.pools</name>
    <value>hadoop1</value>
  </property>
</configuration>




Here is my ceph -s output:
ceph -s
  cluster 942afa43-9a92-434b-9dfa-e893d4e5d565
   health HEALTH_WARN 16 pgs degraded; 16 pgs stuck unclean; recovery 505/1719 
degraded (29.378%); clock skew detected on mon.ceph-srv3
   monmap e1: 2 mons at 
{ceph-srv2=172.29.84.56:6789/0,ceph-srv3=172.29.84.57:6789/0}, election epoch 
12, quorum 0,1 ceph-srv2,ceph-srv3
   osdmap e52: 2 osds: 2 up, 2 in
pgmap v33521: 372 pgs: 356 active+clean, 16 active+degraded; 4097 MB data, 
8277 MB used, 1384 GB / 1392 GB avail; 505/1719 degraded (29.378%)
   mdsmap e17: 1/1/1 up {0=ceph-srv2=up:active}


Can anyone show me how to use hadoop with cephfs correctly?


Thanks,
Kai___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Using Hadoop With Cephfs

2013-10-14 Thread Noah Watkins
Hi Kai,

It doesn't look like there is anything Ceph specific in the Java
backtrace you posted. Does your installation work with HDFS? Are there
any logs where an error is occurring with the Ceph plugin?

Thanks,
Noah

On Mon, Oct 14, 2013 at 4:34 PM, log1024 log1...@yeah.net wrote:
 Hi,
 I have a 4-node Ceph cluster(2 mon, 1 mds, 2 osd) and a Hadoop node.
 Currently, I'm trying to replace HDFS with CephFS. I followed the
 instructions in USING HADOOP WITH CEPHFS. But every time I run
 bin/start-all.sh to run Hadoop, it failed with:

 starting namenode, logging to
 /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-ceph-srv1.out
 localhost: starting datanode, logging to
 /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-ceph-srv1.out
 localhost: Exception in thread IPC Client (47) connection to
 /172.29.84.56:6789 from hduser java.lang.RuntimeException: readObject can't
 find class
 localhost: at
 org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:185)
 localhost: at
 org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
 localhost: at
 org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:851)
 localhost: at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
 localhost: Caused by: java.lang.ClassNotFoundException:
 localhost: at java.lang.Class.forName0(Native Method)
 localhost: at java.lang.Class.forName(Class.java:249)
 localhost: at org.apache.hadoop.conf.Configu
 ration.getClassByName(Configuration.java:802)
 localhost: at
 org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)

 My core-site.xml:
 <configuration>
 <property>
   <name>ceph.conf.file</name>
   <value>/etc/ceph/ceph.conf</value>
 </property>
 <property>
   <name>fs.default.name</name>
   <value>ceph://172.29.84.56:6789/</value>
 </property>
 <property>
   <name>ceph.mon.address</name>
   <value>172.29.84.56:6789</value>
 </property>
 <property>
   <name>ceph.auth.keyring</name>
   <value>/etc/ceph/ceph.client.admin.keyring</value>
 </property>
 <property>
   <name>ceph.data.pools</name>
   <value>hadoop1</value>
 </property>
 </configuration>


 Here is my ceph -s output:
 ceph -s
   cluster 942afa43-9a92-434b-9dfa-e893d4e5d565
health HEALTH_WARN 16 pgs degraded; 16 pgs stuck unclean; recovery
 505/1719 degraded (29.378%); clock skew detected on mon.ceph-srv3
monmap e1: 2 mons at
 {ceph-srv2=172.29.84.56:6789/0,ceph-srv3=172.29.84.57:6789/0}, election
 epoch 12, quorum 0,1 ceph-srv2,ceph-srv3
osdmap e52: 2 osds: 2 up, 2 in
 pgmap v33521: 372 pgs: 356 active+clean, 16 active+degraded; 4097 MB
 data, 8277 MB used, 1384 GB / 1392 GB avail; 505/1719 degraded (29.378%)
mdsmap e17: 1/1/1 up {0=ceph-srv2=up:active}

 Can anyone show me how to use hadoop with cephfs correctly?

 Thanks,
 Kai



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] qemu-kvm with rbd mem slow leak

2013-10-14 Thread Josh Durgin

On 10/13/2013 07:43 PM, alan.zhang wrote:

CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz *2
MEM: 32GB
KVM: qemu-kvm-0.12.1.2-2.355.el6.2.cuttlefish.async.x86_64
Host: CentOS 6.4, kernel 2.6.32-358.14.1.el6.x86_64
Guest: CentOS 6.4, kernel 2.6.32-279.14.1.el6.x86_64
Ceph: ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
Opennebula: 4.2


top -M info:
top - 10:35:31 up 7 days,  9:19,  1 user,  load average: 0.85, 1.63, 1.40
Tasks: 454 total,   2 running, 452 sleeping,   0 stopped,   0 zombie
Cpu(s):  8.5%us,  6.6%sy,  0.0%ni, 84.2%id,  0.6%wa,  0.0%hi,  0.0%si,
0.0%st
Mem:  32865800k total, 32191072k used,   674728k free,59984k buffers
Swap: 10485752k total, 10134076k used,   351676k free,  3474176k cached

   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
20135 oneadmin  20   0 6381m 3.4g 9120 S  2.3 10.8 104:00.48 qemu-kvm
29171 oneadmin  20   0 6452m 3.2g 9072 S  2.0 10.2 168:02.06 qemu-kvm
  8857 oneadmin  20   0 6338m 2.9g 4504 S  2.3  9.3 289:14.48 qemu-kvm
12283 oneadmin  20   0 6591m 2.9g 4464 S  1.3  9.2 268:57.30 qemu-kvm
  6612 oneadmin  20   0 5050m 2.0g 4472 S 12.9  6.3 191:23.51 qemu-kvm
12006 oneadmin  20   0 5532m 1.9g 4468 S  4.3  6.1 236:43.50 qemu-kvm
  7216 oneadmin  20   0 3600m 1.9g 4680 S  1.3  6.1 159:40.53 qemu-kvm
10602 oneadmin  20   0 5333m 1.6g 4636 S  1.3  5.1 208:54.52 qemu-kvm
13162 oneadmin  20   0 3400m 989m 4528 S 50.3  3.1   4151:19 qemu-kvm
  5273 oneadmin  20   0 5168m 842m 4464 S  5.3  2.6 468:20.65 qemu-kvm
  6287 oneadmin  20   0 3150m 761m 4472 S 37.4  2.4 150:32.89 qemu-kvm
  6081 root  20   0 1732m 504m 5744 S  6.3  1.6 243:17.00 ceph-osd
11729 oneadmin  20   0 3541m 498m 4468 S  0.7  1.6  66:48.52 qemu-kvm
12503 oneadmin  20   0 3832m 428m 9336 S  0.3  1.3  19:58.78 qemu-kvm


such as 20135 process command line:
ps -ef | grep 20135
oneadmin 20135 1  2 Oct11 ?01:44:01 /usr/libexec/qemu-kvm
-name one-18 -S -M rhel6.4.0 -enable-kvm -m 2048 -smp
2,sockets=2,cores=1,threads=1 -uuid c40fe8a4-f4fa-9e02-cf2d-6eaaf5062440
-nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-18.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
file=rbd:one/one-0-18-0:auth_supported=none,if=none,id=drive-virtio-disk0,format=raw,cache=none
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive
file=rbd:one/one-2:auth_supported=none,if=none,id=drive-virtio-disk1,format=raw,cache=none
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
-drive
file=/var/lib/one/datastores/0/18/disk.1,if=none,media=cdrom,id=drive-ide0-0-0,readonly=on,format=raw
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0
-netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=27 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:c0:a8:0a:3b,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -vnc 0.0.0.0:18 -vga cirrus
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

I have only given it 2GB, but as you see, VIRT/RES is 6381m/3.4g.


Does the resident memory continue increasing, or does it stay constant?

How does this compare with using only local files instead of rbd with
that qemu package?


I think it must be a memory leak.

Could anyone give me a hand?


If you do observe continued increasing memory usage with rbd, but not
with local files, gathering some heap snapshots via massif would help
figure out what's leaking. (http://tracker.ceph.com/issues/6494 is a
good example of getting massif output).
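
Roughly something like this (the qemu-kvm path and arguments are just
placeholders for however the VM is normally launched):

  # run the guest under valgrind's massif heap profiler
  valgrind --tool=massif --massif-out-file=massif.out.qemu \
      /usr/libexec/qemu-kvm -name one-18 ...

  # then look at the recorded heap snapshots
  ms_print massif.out.qemu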

Josh

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw-admin doesn't list user anymore

2013-10-14 Thread Derek Yarnell
 root@ineri:~# radosgw-admin user info
 could not fetch user info: no user info saved

Hi Valery,

You need to use

  radosgw-admin metadata list user

Thanks,
derek

-- 
---
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com