Re: [ceph-users] radosgw-admin doesn't list user anymore

2013-10-14 Thread Derek Yarnell
> root@ineri:~# radosgw-admin user info
> could not fetch user info: no user info saved

Hi Valery,

You need to use

  radosgw-admin metadata list user

Thanks,
derek

-- 
---
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd down after server failure

2013-10-14 Thread Dong Yuan
From your information, the osd log ended with:

2013-10-14 06:21:26.727681 7f02690f9780 10 osd.47 43203 load_pgs
3.df1_TEMP clearing temp

That means the OSD is loading all PG directories from disk. If there
is any I/O error (disk or XFS error), the process can't finish.

I suggest restarting with "debug osd = 20", or using xfs_check to
check osd.47's local filesystem.
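For reference, the suggested debug level can be set in the failing OSD's own section of ceph.conf before restarting; a minimal sketch (the extra filestore subsystem is illustrative, not prescribed in this thread):

```ini
[osd.47]
; verbose logging while diagnosing the failed start
debug osd = 20
debug filestore = 20
```

Note that xfs_check must run against an unmounted filesystem; on newer xfsprogs the equivalent read-only check is `xfs_repair -n`.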

On 14 October 2013 15:40, Dominik Mostowiec  wrote:
> Hi
> I have found something.
> After the restart, the time on the server was wrong (+2 hours) until NTP fixed it.
> I restarted these 3 osds - it didn't help.
> Is it possible that ceph banned these osds? Or that starting with the
> wrong time broke their filestore?
>
> --
> Regards
> Dominik
>
>
> 2013/10/14 Dominik Mostowiec :
>> Hi,
>> I had server failure that starts from one disk failure:
>> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023986] sd 4:2:26:0:
>> [sdaa] Unhandled error code
>> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023990] sd 4:2:26:0:
>> [sdaa]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
>> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023995] sd 4:2:26:0:
>> [sdaa] CDB: Read(10): 28 00 00 00 00 d0 00 00 10 00
>> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024005] end_request:
>> I/O error, dev sdaa, sector 208
>> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024744] XFS (sdaa):
>> metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
>> count 8192
>> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.025879] XFS (sdaa):
>> xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
>> Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.820288] XFS (sdaa):
>> metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
>> count 8192
>> Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.821194] XFS (sdaa):
>> xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
>> Oct 14 03:25:32 s3-10-177-64-6 kernel: [1027264.667851] XFS (sdaa):
>> metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
>> count 8192
>>
>> This caused the server to become unresponsive.
>>
>> After server restart 3 of 26 osd on it are down.
>> The ceph-osd log after setting "debug osd = 10" and restarting shows:
>>
>> 2013-10-14 06:21:23.141936 7fdeb4872700 -1 osd.47 43203 *** Got signal
>> Terminated ***
>> 2013-10-14 06:21:23.142141 7fdeb4872700 -1 osd.47 43203  pausing thread pools
>> 2013-10-14 06:21:23.142146 7fdeb4872700 -1 osd.47 43203  flushing io
>> 2013-10-14 06:21:25.406187 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
>> appears to work
>> 2013-10-14 06:21:25.406204 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
>> 'filestore fiemap' config option
>> 2013-10-14 06:21:25.406557 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount did NOT detect btrfs
>> 2013-10-14 06:21:25.412617 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
>> (by glibc and kernel)
>> 2013-10-14 06:21:25.412831 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount found snaps <>
>> 2013-10-14 06:21:25.415798 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
>> btrfs not detected
>> 2013-10-14 06:21:26.078377 7f02690f9780  2 osd.47 0 mounting
>> /vol0/data/osd.47 /vol0/data/osd.47/journal
>> 2013-10-14 06:21:26.080872 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
>> appears to work
>> 2013-10-14 06:21:26.080885 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
>> 'filestore fiemap' config option
>> 2013-10-14 06:21:26.081289 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount did NOT detect btrfs
>> 2013-10-14 06:21:26.087524 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
>> (by glibc and kernel)
>> 2013-10-14 06:21:26.087582 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount found snaps <>
>> 2013-10-14 06:21:26.089614 7f02690f9780  0
>> filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
>> btrfs not detected
>> 2013-10-14 06:21:26.726676 7f02690f9780  2 osd.47 0 boot
>> 2013-10-14 06:21:26.726773 7f02690f9780 10 osd.47 0 read_superblock
>> sb(16773c25-5054-4451-bf9f-efc1f7f21b89 osd.47
>> 63cf7d70-99cb-0ab1-4006-002f e43203 [41261,43203]
>> lci=[43194,43203])
>> 2013-10-14 06:21:26.726862 7f02690f9780 10 osd.47 0 add_map_bl 43203 82622 
>> bytes
>> 2013-10-14 06:21:26.727184 7f02690f9780 10 osd.47 43203 load_pgs
>> 2013-10-14 06:21:26.727643 7f02690f9780 10 osd.47 43203 load_pgs
>> ignoring unrecognized meta
>> 2013-10-14 06:21:26.727681 7f02690f9780 10 osd.47 43203 load_pgs
>> 3.df1_TEMP clearing temp
>>
>> osd.47 is still down; I marked it out of the cluster.
>> 47  1   osd.47  down0
>>
>> How can I check what is wrong?
>>
>> ceph -v
>> ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
>>
>> --
>> Pozdrawiam
>> Dominik
>
>
>
> --
> Pozdrawiam
> Dominik

Re: [ceph-users] qemu-kvm with rbd mem slow leak

2013-10-14 Thread Josh Durgin

On 10/13/2013 07:43 PM, alan.zhang wrote:

CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz *2
MEM: 32GB
KVM: qemu-kvm-0.12.1.2-2.355.el6.2.cuttlefish.async.x86_64
Host: CentOS 6.4, kernel 2.6.32-358.14.1.el6.x86_64
Guest: CentOS 6.4, kernel 2.6.32-279.14.1.el6.x86_64
Ceph: ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
Opennebula: 4.2


top -M info:
top - 10:35:31 up 7 days,  9:19,  1 user,  load average: 0.85, 1.63, 1.40
Tasks: 454 total,   2 running, 452 sleeping,   0 stopped,   0 zombie
Cpu(s):  8.5%us,  6.6%sy,  0.0%ni, 84.2%id,  0.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32865800k total, 32191072k used,   674728k free,    59984k buffers
Swap: 10485752k total, 10134076k used,   351676k free,  3474176k cached

   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
20135 oneadmin  20   0 6381m 3.4g 9120 S  2.3 10.8 104:00.48 qemu-kvm
29171 oneadmin  20   0 6452m 3.2g 9072 S  2.0 10.2 168:02.06 qemu-kvm
  8857 oneadmin  20   0 6338m 2.9g 4504 S  2.3  9.3 289:14.48 qemu-kvm
12283 oneadmin  20   0 6591m 2.9g 4464 S  1.3  9.2 268:57.30 qemu-kvm
  6612 oneadmin  20   0 5050m 2.0g 4472 S 12.9  6.3 191:23.51 qemu-kvm
12006 oneadmin  20   0 5532m 1.9g 4468 S  4.3  6.1 236:43.50 qemu-kvm
  7216 oneadmin  20   0 3600m 1.9g 4680 S  1.3  6.1 159:40.53 qemu-kvm
10602 oneadmin  20   0 5333m 1.6g 4636 S  1.3  5.1 208:54.52 qemu-kvm
13162 oneadmin  20   0 3400m 989m 4528 S 50.3  3.1   4151:19 qemu-kvm
  5273 oneadmin  20   0 5168m 842m 4464 S  5.3  2.6 468:20.65 qemu-kvm
  6287 oneadmin  20   0 3150m 761m 4472 S 37.4  2.4 150:32.89 qemu-kvm
  6081 root  20   0 1732m 504m 5744 S  6.3  1.6 243:17.00 ceph-osd
11729 oneadmin  20   0 3541m 498m 4468 S  0.7  1.6  66:48.52 qemu-kvm
12503 oneadmin  20   0 3832m 428m 9336 S  0.3  1.3  19:58.78 qemu-kvm


such as 20135 process command line:
ps -ef | grep 20135
oneadmin 20135 1  2 Oct11 ?01:44:01 /usr/libexec/qemu-kvm
-name one-18 -S -M rhel6.4.0 -enable-kvm -m 2048 -smp
2,sockets=2,cores=1,threads=1 -uuid c40fe8a4-f4fa-9e02-cf2d-6eaaf5062440
-nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-18.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
file=rbd:one/one-0-18-0:auth_supported=none,if=none,id=drive-virtio-disk0,format=raw,cache=none
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive
file=rbd:one/one-2:auth_supported=none,if=none,id=drive-virtio-disk1,format=raw,cache=none
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
-drive
file=/var/lib/one/datastores/0/18/disk.1,if=none,media=cdrom,id=drive-ide0-0-0,readonly=on,format=raw
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0
-netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=27 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:c0:a8:0a:3b,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -vnc 0.0.0.0:18 -vga cirrus
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

I only gave it 2GB, but as you can see, VIRT/RES is 6381m/3.4g.


Does the resident memory continue increasing, or does it stay constant?

How does this compare with using only local files instead of rbd with
that qemu package?


I think it must be a memory leak.

Could anyone give me a hand?


If you do observe continued increasing memory usage with rbd, but not
with local files, gathering some heap snapshots via massif would help
figure out what's leaking. (http://tracker.ceph.com/issues/6494 is a
good example of getting massif output.)
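If you go the massif route, the interesting number in each massif.out.<pid> snapshot is the mem_heap_B field. A minimal sketch for pulling the peak heap size out of such a file's contents (generic massif parsing, nothing Ceph-specific):

```python
import re

def massif_peak_heap(text):
    """Return the largest mem_heap_B value across all snapshots
    recorded in the contents of a massif output file."""
    sizes = [int(m.group(1))
             for m in re.finditer(r"^mem_heap_B=(\d+)$", text, re.M)]
    return max(sizes) if sizes else 0

# Tiny synthetic massif.out fragment for illustration:
sample = """snapshot=0
mem_heap_B=1024
mem_heap_extra_B=64
snapshot=1
mem_heap_B=4096
mem_heap_extra_B=128
"""
print(massif_peak_heap(sample))  # -> 4096
```

Run the guest under `valgrind --tool=massif` (as in the tracker issue above) and feed the resulting file's contents to this function; a peak that keeps climbing across snapshots points at the leak.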

Josh



Re: [ceph-users] Speed limit on RadosGW?

2013-10-14 Thread Chu Duc Minh
My cluster has 3 MON nodes & 6 DATA nodes; all nodes have 2Gbps
connectivity (bonding).
Each data node has 14 SATA HDDs (OSDs), with each journal on the same disk as its OSD.

Each MON node run RadosGW too.
 On Oct 15, 2013 12:34 AM, "Kyle Bader"  wrote:

> I've personally saturated 1Gbps links on multiple radosgw nodes on a large
> cluster; if I remember correctly, Yehuda has tested it up into the 7Gbps
> range with 10Gbps gear. Could you describe your cluster's hardware and
> connectivity?
>
>
> On Mon, Oct 14, 2013 at 3:34 AM, Chu Duc Minh wrote:
>
>> Hi, sorry I missed this mail.
>>
>>
>> > During writes, does the CPU usage on your RadosGW node go way up?
>> No, CPU stay the same & very low (< 10%)
>>
>> When upload small files(300KB/file) over RadosGW:
>>  - using 1 process: upload bandwidth ~ 3MB/s
>>  - using 100 processes: upload bandwidth ~ 15MB/s
>>
>> When upload big files(3GB/file) over RadosGW:
>>  - using 1 process: upload bandwidth ~ 70MB/s
>> (Therefore i don't upload big files using multi-processes any more :D)
>>
>> Maybe RadosGW has a problem when writing many small files. Or is it a
>> problem in Ceph when simultaneously writing many small files into a bucket
>> that already has millions of files?
>>
>>
>> On Wed, Sep 25, 2013 at 7:24 PM, Mark Nelson wrote:
>>
>>> On 09/25/2013 02:49 AM, Chu Duc Minh wrote:
>>>
 I have a CEPH cluster with 9 nodes (6 data nodes & 3 mon/mds nodes)
 And i setup 4 separate nodes to test performance of Rados-GW:
   - 2 node run Rados-GW
   - 2 node run multi-process put file to [multi] Rados-GW

 Result:
 a) When i use 1 RadosGW node & 1 upload-node, speed upload = 50MB/s
 /upload-node, Rados-GW input/output speed = 50MB/s

 b) When i use 2 RadosGW node & 1 upload-node, speed upload = 50MB/s
 /upload-node; each RadosGW have input/output = 25MB/s ==> sum
 input/ouput of 2 Rados-GW = 50MB/s

 c) When i use 1 RadosGW node & 2 upload-node, speed upload = 25MB/s
 /upload-node ==> sum output of 2 upload-node = 50MB/s, RadosGW have
 input/output = 50MB/s

 d) When i use 2 RadosGW node & 2 upload-node, speed upload = 25MB/s
 /upload-node ==> sum output of 2 upload-node = 50MB/s; each RadosGW have
 input/output = 25MB/s ==> sum input/ouput of 2 Rados-GW = 50MB/s

 Problem: I can't get past the 50MB/s limit when putting files over
 Rados-GW, regardless of the number of Rados-GW nodes and upload nodes.
 When I use this CEPH cluster over librados (openstack/kvm), I can
 easily achieve > 300MB/s.

 I don't know why the performance of RadosGW is so low. What's the bottleneck?

>>>
>>> During writes, does the CPU usage on your RadosGW node go way up?
>>>
>>> If this is a test cluster, you might want to try the wip-6286 build from
>>> our gitbuilder site.  There is a fix that depending on the size of your
>>> objects, could have a big impact on performance.  We're currently
>>> investigating some other radosgw performance issues as well, so stay tuned.
>>> :)
>>>
>>> Mark
>>>
>>>
>>>
 Thank you very much!




>>
>>
>>
>>
>
>
> --
>
> Kyle
>


Re: [ceph-users] Using Hadoop With Cephfs

2013-10-14 Thread Noah Watkins
Hi Kai,

It doesn't look like there is anything Ceph-specific in the Java
backtrace you posted. Does your installation work with HDFS? Are there
any logs where an error occurs with the Ceph plugin?

Thanks,
Noah

On Mon, Oct 14, 2013 at 4:34 PM, log1024  wrote:
> Hi,
> I have a 4-node Ceph cluster(2 mon, 1 mds, 2 osd) and a Hadoop node.
> Currently, I'm trying to replace HDFS with CephFS. I followed the
> instructions in USING HADOOP WITH CEPHFS. But every time I run
> bin/start-all.sh to run Hadoop, it failed with:
>
> starting namenode, logging to
> /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-ceph-srv1.out
> localhost: starting datanode, logging to
> /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-ceph-srv1.out
> localhost: Exception in thread "IPC Client (47) connection to
> /172.29.84.56:6789 from hduser" java.lang.RuntimeException: readObject can't
> find class
> localhost: at
> org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:185)
> localhost: at
> org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
> localhost: at
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:851)
> localhost: at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
> localhost: Caused by: java.lang.ClassNotFoundException:
> localhost: at java.lang.Class.forName0(Native Method)
> localhost: at java.lang.Class.forName(Class.java:249)
> localhost: at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
> localhost: at
> org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)
>
> My core-site.xml:
> <configuration>
> <property>
>   <name>ceph.conf.file</name>
>   <value>/etc/ceph/ceph.conf</value>
> </property>
> <property>
>   <name>fs.default.name</name>
>   <value>ceph://172.29.84.56:6789/</value>
> </property>
> <property>
>   <name>ceph.mon.address</name>
>   <value>172.29.84.56:6789</value>
> </property>
> <property>
>   <name>ceph.auth.keyring</name>
>   <value>/etc/ceph/ceph.client.admin.keyring</value>
> </property>
> <property>
>   <name>ceph.data.pools</name>
>   <value>hadoop1</value>
> </property>
> </configuration>
>
>
> Here is my ceph -s output:
> ceph -s
>   cluster 942afa43-9a92-434b-9dfa-e893d4e5d565
>health HEALTH_WARN 16 pgs degraded; 16 pgs stuck unclean; recovery
> 505/1719 degraded (29.378%); clock skew detected on mon.ceph-srv3
>monmap e1: 2 mons at
> {ceph-srv2=172.29.84.56:6789/0,ceph-srv3=172.29.84.57:6789/0}, election
> epoch 12, quorum 0,1 ceph-srv2,ceph-srv3
>osdmap e52: 2 osds: 2 up, 2 in
> pgmap v33521: 372 pgs: 356 active+clean, 16 active+degraded; 4097 MB
> data, 8277 MB used, 1384 GB / 1392 GB avail; 505/1719 degraded (29.378%)
>mdsmap e17: 1/1/1 up {0=ceph-srv2=up:active}
>
> Can anyone show me how to use hadoop with cephfs correctly?
>
> Thanks,
> Kai
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Using Hadoop With Cephfs

2013-10-14 Thread log1024
Hi,
I have a 4-node Ceph cluster(2 mon, 1 mds, 2 osd) and a Hadoop node.
Currently, I'm trying to replace HDFS with CephFS. I followed the instructions 
in USING HADOOP WITH CEPHFS. But every time I run bin/start-all.sh to run 
Hadoop, it failed with:


starting namenode, logging to 
/usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-ceph-srv1.out
localhost: starting datanode, logging to 
/usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-ceph-srv1.out
localhost: Exception in thread "IPC Client (47) connection to 
/172.29.84.56:6789 from hduser" java.lang.RuntimeException: readObject can't 
find class
localhost: at 
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:185)
localhost: at 
org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
localhost: at 
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:851)
localhost: at org.apache.hadoop.ipc.Client$Connection.run(Client.java:786)
localhost: Caused by: java.lang.ClassNotFoundException:
localhost: at java.lang.Class.forName0(Native Method)
localhost: at java.lang.Class.forName(Class.java:249)
localhost: at 
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
localhost: at 
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)


My core-site.xml:
<configuration>
<property>
  <name>ceph.conf.file</name>
  <value>/etc/ceph/ceph.conf</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>ceph://172.29.84.56:6789/</value>
</property>
<property>
  <name>ceph.mon.address</name>
  <value>172.29.84.56:6789</value>
</property>
<property>
  <name>ceph.auth.keyring</name>
  <value>/etc/ceph/ceph.client.admin.keyring</value>
</property>
<property>
  <name>ceph.data.pools</name>
  <value>hadoop1</value>
</property>
</configuration>




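For the Hadoop 1.x CephFS bindings, setups typically also register the filesystem implementation class in core-site.xml and put hadoop-cephfs.jar plus the libcephfs Java bindings on HADOOP_CLASSPATH; a hedged sketch of that extra property (this thread does not confirm it is the missing piece here):

```xml
<property>
  <name>fs.ceph.impl</name>
  <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
```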
Here is my ceph -s output:
ceph -s
  cluster 942afa43-9a92-434b-9dfa-e893d4e5d565
   health HEALTH_WARN 16 pgs degraded; 16 pgs stuck unclean; recovery 505/1719 
degraded (29.378%); clock skew detected on mon.ceph-srv3
   monmap e1: 2 mons at 
{ceph-srv2=172.29.84.56:6789/0,ceph-srv3=172.29.84.57:6789/0}, election epoch 
12, quorum 0,1 ceph-srv2,ceph-srv3
   osdmap e52: 2 osds: 2 up, 2 in
pgmap v33521: 372 pgs: 356 active+clean, 16 active+degraded; 4097 MB data, 
8277 MB used, 1384 GB / 1392 GB avail; 505/1719 degraded (29.378%)
   mdsmap e17: 1/1/1 up {0=ceph-srv2=up:active}


Can anyone show me how to use hadoop with cephfs correctly?


Thanks,
Kai


Re: [ceph-users] Full OSD with 29% free

2013-10-14 Thread Bryan Stillwell
The filesystem isn't as full now, but the fragmentation is pretty low:

[root@den2ceph001 ~]# df /dev/sdc1
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sdc1        486562672 270845628 215717044  56% /var/lib/ceph/osd/ceph-1
[root@den2ceph001 ~]# xfs_db -c frag -r /dev/sdc1
actual 3481543, ideal 3447443, fragmentation factor 0.98%

Bryan

On Mon, Oct 14, 2013 at 4:35 PM, Michael Lowe  wrote:
>
> How fragmented is that file system?
>
> Sent from my iPad
>
> > On Oct 14, 2013, at 5:44 PM, Bryan Stillwell  
> > wrote:
> >
> > This appears to be more of an XFS issue than a ceph issue, but I've
> > run into a problem where some of my OSDs failed because the filesystem
> > was reported as full even though there was 29% free:
> >
> > [root@den2ceph001 ceph-1]# touch blah
> > touch: cannot touch `blah': No space left on device
> > [root@den2ceph001 ceph-1]# df .
> > Filesystem   1K-blocks  Used Available Use% Mounted on
> > /dev/sdc1        486562672 342139340 144423332  71% /var/lib/ceph/osd/ceph-1
> > [root@den2ceph001 ceph-1]# df -i .
> > FilesystemInodes   IUsed   IFree IUse% Mounted on
> > /dev/sdc1       60849984 4097408 56752576    7% /var/lib/ceph/osd/ceph-1
> > [root@den2ceph001 ceph-1]#
> >
> > I've tried remounting the filesystem with the inode64 option like a
> > few people recommended, but that didn't help (probably because it
> > doesn't appear to be running out of inodes).
> >
> > This happened while I was on vacation and I'm pretty sure it was
> > caused by another OSD failing on the same node.  I've been able to
> > recover from the situation by bringing the failed OSD back online, but
> > it's only a matter of time until I'll be running into this issue again
> > since my cluster is still being populated.
> >
> > Any ideas on things I can try the next time this happens?
> >
> > Thanks,
> > Bryan


Re: [ceph-users] kvm live migrate with ceph

2013-10-14 Thread Michael Lowe
I live migrate all the time using the rbd driver in qemu, no problems.  Qemu 
will issue a flush as part of the migration so everything is consistent.  It's 
the right way to use Ceph to back VMs. I would strongly recommend against a 
network file system approach.  You may want to look into format 2 rbd images; 
the cloning and writable snapshots may be what you are looking for.

Sent from my iPad

> On Oct 14, 2013, at 5:37 AM, Jon  wrote:
> 
> Hello,
> 
> I would like to live migrate a VM between two "hypervisors".  Is it possible 
> to do this with an rbd disk, or should the vm disks be created as qcow images 
> on a CephFS/NFS share (is it possible to do clvm over rbds? Or GlusterFS over 
> rbds?) and point kvm at the network directory?  As I understand it, rbds 
> aren't "cluster aware" so you can't mount an rbd on multiple hosts at once, 
> but maybe libvirt has a way to handle the transfer...?  I like the idea of 
> "master" or "golden" images where guests write any changes to a new image; I 
> don't think rbds are able to handle copy-on-write in the same way kvm does, so 
> maybe a clustered filesystem approach is the ideal way to go.
> 
> Thanks for your input. I think I'm just missing some piece... I just don't 
> grok...
> 
> Best Regards,
> 


Re: [ceph-users] Full OSD with 29% free

2013-10-14 Thread Michael Lowe
How fragmented is that file system?

Sent from my iPad

> On Oct 14, 2013, at 5:44 PM, Bryan Stillwell  
> wrote:
> 
> This appears to be more of an XFS issue than a ceph issue, but I've
> run into a problem where some of my OSDs failed because the filesystem
> was reported as full even though there was 29% free:
> 
> [root@den2ceph001 ceph-1]# touch blah
> touch: cannot touch `blah': No space left on device
> [root@den2ceph001 ceph-1]# df .
> Filesystem   1K-blocks  Used Available Use% Mounted on
> /dev/sdc1        486562672 342139340 144423332  71% /var/lib/ceph/osd/ceph-1
> [root@den2ceph001 ceph-1]# df -i .
> FilesystemInodes   IUsed   IFree IUse% Mounted on
> /dev/sdc1       60849984 4097408 56752576    7% /var/lib/ceph/osd/ceph-1
> [root@den2ceph001 ceph-1]#
> 
> I've tried remounting the filesystem with the inode64 option like a
> few people recommended, but that didn't help (probably because it
> doesn't appear to be running out of inodes).
> 
> This happened while I was on vacation and I'm pretty sure it was
> caused by another OSD failing on the same node.  I've been able to
> recover from the situation by bringing the failed OSD back online, but
> it's only a matter of time until I'll be running into this issue again
> since my cluster is still being populated.
> 
> Any ideas on things I can try the next time this happens?
> 
> Thanks,
> Bryan


[ceph-users] Full OSD with 29% free

2013-10-14 Thread Bryan Stillwell
This appears to be more of an XFS issue than a ceph issue, but I've
run into a problem where some of my OSDs failed because the filesystem
was reported as full even though there was 29% free:

[root@den2ceph001 ceph-1]# touch blah
touch: cannot touch `blah': No space left on device
[root@den2ceph001 ceph-1]# df .
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sdc1        486562672 342139340 144423332  71% /var/lib/ceph/osd/ceph-1
[root@den2ceph001 ceph-1]# df -i .
FilesystemInodes   IUsed   IFree IUse% Mounted on
/dev/sdc1       60849984 4097408 56752576    7% /var/lib/ceph/osd/ceph-1
[root@den2ceph001 ceph-1]#

I've tried remounting the filesystem with the inode64 option like a
few people recommended, but that didn't help (probably because it
doesn't appear to be running out of inodes).

This happened while I was on vacation and I'm pretty sure it was
caused by another OSD failing on the same node.  I've been able to
recover from the situation by bringing the failed OSD back online, but
it's only a matter of time until I'll be running into this issue again
since my cluster is still being populated.

Any ideas on things I can try the next time this happens?

Thanks,
Bryan


[ceph-users] qemu-kvm with rbd mem slow leak

2013-10-14 Thread alan.zhang

CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz *2
MEM: 32GB
KVM: qemu-kvm-0.12.1.2-2.355.el6.2.cuttlefish.async.x86_64
Host: CentOS 6.4, kernel 2.6.32-358.14.1.el6.x86_64
Guest: CentOS 6.4, kernel 2.6.32-279.14.1.el6.x86_64
Ceph: ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
Opennebula: 4.2


top -M info:
top - 10:35:31 up 7 days,  9:19,  1 user,  load average: 0.85, 1.63, 1.40
Tasks: 454 total,   2 running, 452 sleeping,   0 stopped,   0 zombie
Cpu(s):  8.5%us,  6.6%sy,  0.0%ni, 84.2%id,  0.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32865800k total, 32191072k used,   674728k free,    59984k buffers
Swap: 10485752k total, 10134076k used,   351676k free,  3474176k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
20135 oneadmin  20   0 6381m 3.4g 9120 S  2.3 10.8 104:00.48 qemu-kvm
29171 oneadmin  20   0 6452m 3.2g 9072 S  2.0 10.2 168:02.06 qemu-kvm
 8857 oneadmin  20   0 6338m 2.9g 4504 S  2.3  9.3 289:14.48 qemu-kvm
12283 oneadmin  20   0 6591m 2.9g 4464 S  1.3  9.2 268:57.30 qemu-kvm
 6612 oneadmin  20   0 5050m 2.0g 4472 S 12.9  6.3 191:23.51 qemu-kvm
12006 oneadmin  20   0 5532m 1.9g 4468 S  4.3  6.1 236:43.50 qemu-kvm
 7216 oneadmin  20   0 3600m 1.9g 4680 S  1.3  6.1 159:40.53 qemu-kvm
10602 oneadmin  20   0 5333m 1.6g 4636 S  1.3  5.1 208:54.52 qemu-kvm
13162 oneadmin  20   0 3400m 989m 4528 S 50.3  3.1   4151:19 qemu-kvm
 5273 oneadmin  20   0 5168m 842m 4464 S  5.3  2.6 468:20.65 qemu-kvm
 6287 oneadmin  20   0 3150m 761m 4472 S 37.4  2.4 150:32.89 qemu-kvm
 6081 root  20   0 1732m 504m 5744 S  6.3  1.6 243:17.00 ceph-osd
11729 oneadmin  20   0 3541m 498m 4468 S  0.7  1.6  66:48.52 qemu-kvm
12503 oneadmin  20   0 3832m 428m 9336 S  0.3  1.3  19:58.78 qemu-kvm


such as 20135 process command line:
ps -ef | grep 20135
oneadmin 20135 1  2 Oct11 ?01:44:01 /usr/libexec/qemu-kvm 
-name one-18 -S -M rhel6.4.0 -enable-kvm -m 2048 -smp 
2,sockets=2,cores=1,threads=1 -uuid c40fe8a4-f4fa-9e02-cf2d-6eaaf5062440 
-nodefconfig -nodefaults -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-18.monitor,server,nowait 
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc 
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
file=rbd:one/one-0-18-0:auth_supported=none,if=none,id=drive-virtio-disk0,format=raw,cache=none 
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 
-drive 
file=rbd:one/one-2:auth_supported=none,if=none,id=drive-virtio-disk1,format=raw,cache=none 
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 
-drive 
file=/var/lib/one/datastores/0/18/disk.1,if=none,media=cdrom,id=drive-ide0-0-0,readonly=on,format=raw 
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 
-netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=27 -device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:c0:a8:0a:3b,bus=pci.0,addr=0x3 
-chardev pty,id=charserial0 -device 
isa-serial,chardev=charserial0,id=serial0 -vnc 0.0.0.0:18 -vga cirrus 
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6


I only gave it 2GB, but as you can see, VIRT/RES is 6381m/3.4g.

I think it must be a memory leak.

Could anyone give me a hand?

--
Talk is cheap,lead by example.
Blog: https://www.linuxwind.org


[ceph-users] kvm live migrate with ceph

2013-10-14 Thread Jon
Hello,

I would like to live migrate a VM between two "hypervisors".  Is it
possible to do this with an rbd disk, or should the vm disks be created as
qcow images on a CephFS/NFS share (is it possible to do clvm over rbds? Or
GlusterFS over rbds?) and point kvm at the network directory?  As I
understand it, rbds aren't "cluster aware" so you can't mount an rbd on
multiple hosts at once, but maybe libvirt has a way to handle the
transfer...?  I like the idea of "master" or "golden" images where guests
write any changes to a new image; I don't think rbds are able to handle
copy-on-write in the same way kvm does, so maybe a clustered filesystem
approach is the ideal way to go.

Thanks for your input. I think I'm just missing some piece... I just don't
grok...

Best Regards,
Jon A


[ceph-users] Production locked: OSDs down

2013-10-14 Thread Mikaël Cluseau

Hi,

I have a pretty big problem here... my OSDs are marked down (except one?!)

I am running ceph version 0.61.8 (a6fdcca3bddbc9f177e4e2bf0d9cdd85006b028b).

I recently had full monitor stores, so I had to remove them, but that
seemed to work.


# id    weight  type name       up/down reweight
-1      15      root default
-3      6       datacenter xxx
-2      0       host cloud-1
-4      0       host cloud-2
-7      3       host xxx-1
7       1       osd.7   down    1
8       1       osd.8   down    1
9       1       osd.9   down    1
-8      3       host xxx-2
3       1       osd.3   down    1
4       1       osd.4   down    1
5       1       osd.5   up      1


I see this in the logs when I try to restart them:

2013-10-15 06:54:32.651951 7fa5db16b780  1 journal _open 
/dev/ssd/osd_3_jrn fd 26: 5368709120 bytes, block size 4096 bytes, 
directio = 1, aio = 1
2013-10-15 06:54:36.321235 7fa5ac741700  0 -- 192.168.242.2:6801/29193 
>> 192.168.242.1:6811/12764 pipe(0x7fa588002490 sd=28 :0 s=1 pgs=0 cs=0 
l=0).fault with nothing to send, going to standby
2013-10-15 06:54:36.321256 7fa59c2f3700  0 -- 192.168.242.2:6801/29193 
>> 192.168.242.1:6801/12362 pipe(0x7fa588001490 sd=27 :0 s=1 pgs=0 cs=0 
l=0).fault with nothing to send, going to standby
2013-10-15 06:54:36.321267 7fa5ac13b700  0 -- 192.168.242.2:6801/29193 
>> 192.168.242.1:6814/13354 pipe(0x7fa588001970 sd=30 :0 s=1 pgs=0 cs=0 
l=0).fault with nothing to send, going to standby


Any idea?

Thanks!


[ceph-users] xfs log device and osd journal specifications in ceph.conf

2013-10-14 Thread Snider, Tim
3 questions:

1.   I'd like to use xfs devices with a separate log device in a ceph cluster. 
What's the best way to do this?  Is it possible to specify xfs log devices in 
the [osd.x] sections of ceph.conf? 
e.g.:
E.G.:
[osd.0]
host = delta
devs = /dev/sdx
osd mkfs options xfs = -d su=131072,sw=8 -i size=1024 -l 
logdev=/dev/sdq1,su=131072

[osd.1]
host = epsilon
devs = /dev/sdy
osd mkfs options xfs = -d su=131072,sw=8 -i size=1024 -l 
logdev=/dev/sdq2,su=131072

2.   Is this the correct syntax for the line without the log device options?
 osd mkfs options xfs = -d su=131072,sw=8 -i size=1024

3. For osd journal devices. I assume there's a 1:1 relationship between osds 
and journal devices.  The section in sample.ceph.conf seems to imply a single 
entry.
Should there be an osd journal entry in each [osd.x] section of ceph.conf? 

[osd]
; This is where the osd expects its data
osd data = /data/$name

; Ideally, make the journal a separate disk or partition.
; 1-10GB should be enough; more if you have fast or many
; disks.  You can use a file under the osd data dir if need be
; (e.g. /data/$name/journal), but it will be slower than a
; separate disk or partition.
; This is an example of a file-based journal.
osd journal = /data/$name/journal
osd journal size = 1000 ; journal size, in megabytes
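
On question 3: one way to express a per-osd journal device is to override `osd journal` in each [osd.x] section, in the same style as the `devs` lines above. The sketch below reuses the thread's host and device names and is an illustration, not a verified config; check it against your ceph version's documentation:

```ini
[osd]
; defaults shared by all osds; $id expands to the osd number
osd data = /var/lib/ceph/osd/ceph-$id
osd journal size = 1000

[osd.0]
host = delta
devs = /dev/sdx
; per-osd raw journal partition (illustrative device name)
osd journal = /dev/sdq1

[osd.1]
host = epsilon
devs = /dev/sdy
osd journal = /dev/sdq2
```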

On my cluster (deployed with ceph-deploy) the data is in /var/lib/ceph/osd, not 
/data/$name as in the sample file. Directory organization on my cluster:
/var/lib/ceph/osd/:
ceph-0  ceph-10  ceph-12  ceph-14  ceph-16  ceph-18  ceph-2   ceph-21  
ceph-3  ceph-5  ceph-7  ceph-9
ceph-1  ceph-11  ceph-13  ceph-15  ceph-17  ceph-19  ceph-20  ceph-22  
ceph-4  ceph-6  ceph-8

/var/lib/ceph/osd/ceph-0:

/var/lib/ceph/osd/ceph-1:

ls /data
ls: cannot access /data: No such file or directory
Thanks,
Tim
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Speed limit on RadosGW?

2013-10-14 Thread Kyle Bader
I've personally saturated 1Gbps links on multiple radosgw nodes on a large
cluster; if I remember correctly, Yehuda has tested it up into the 7Gbps
range with 10Gbps gear. Could you describe your cluster's hardware and
connectivity?


On Mon, Oct 14, 2013 at 3:34 AM, Chu Duc Minh  wrote:

> Hi sorry, i missed this mail.
>
>
> > During writes, does the CPU usage on your RadosGW node go way up?
> No, CPU stay the same & very low (< 10%)
>
> When upload small files(300KB/file) over RadosGW:
>  - using 1 process: upload bandwidth ~ 3MB/s
>  - using 100 processes: upload bandwidth ~ 15MB/s
>
> When upload big files(3GB/file) over RadosGW:
>  - using 1 process: upload bandwidth ~ 70MB/s
> (Therefore i don't upload big files using multi-processes any more :D)
>
> Maybe, RadosGW have a problem when write many smail files. Or it's a
> problem of CEPH when simultaneously write many smail files into a bucket,
> that already have millions files?
>
>
> On Wed, Sep 25, 2013 at 7:24 PM, Mark Nelson wrote:
>
>> On 09/25/2013 02:49 AM, Chu Duc Minh wrote:
>>
>>> I have a CEPH cluster with 9 nodes (6 data nodes & 3 mon/mds nodes)
>>> And i setup 4 separate nodes to test performance of Rados-GW:
>>>   - 2 node run Rados-GW
>>>   - 2 node run multi-process put file to [multi] Rados-GW
>>>
>>> Result:
>>> a) When i use 1 RadosGW node & 1 upload-node, speed upload = 50MB/s
>>> /upload-node, Rados-GW input/output speed = 50MB/s
>>>
>>> b) When i use 2 RadosGW node & 1 upload-node, speed upload = 50MB/s
>>> /upload-node; each RadosGW have input/output = 25MB/s ==> sum
>>> input/ouput of 2 Rados-GW = 50MB/s
>>>
>>> c) When i use 1 RadosGW node & 2 upload-node, speed upload = 25MB/s
>>> /upload-node ==> sum output of 2 upload-node = 50MB/s, RadosGW have
>>> input/output = 50MB/s
>>>
>>> d) When i use 2 RadosGW node & 2 upload-node, speed upload = 25MB/s
>>> /upload-node ==> sum output of 2 upload-node = 50MB/s; each RadosGW have
>>> input/output = 25MB/s ==> sum input/ouput of 2 Rados-GW = 50MB/s
>>>
>>> _*Problem*_: I can't pass the 50MB/s limit when putting files over
>>> Rados-GW, regardless of the number of Rados-GW nodes and upload-nodes.
>>> When i use this CEPH cluster over librados (openstack/kvm), i can easily
>>> achieve > 300MB/s
>>>
>>> I don't know why performance of RadosGW is so low. What's bottleneck?
>>>
>>
>> During writes, does the CPU usage on your RadosGW node go way up?
>>
>> If this is a test cluster, you might want to try the wip-6286 build from
>> our gitbuilder site.  There is a fix that depending on the size of your
>> objects, could have a big impact on performance.  We're currently
>> investigating some other radosgw performance issues as well, so stay tuned.
>> :)
>>
>> Mark
>>
>>
>>
>>> Thank you very much!
>>>
>>>
>>>
>>>
>>> __**_
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/**listinfo.cgi/ceph-users-ceph.**com
>>>
>>>
>> __**_
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/**listinfo.cgi/ceph-users-ceph.**com
>>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 

Kyle
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw can still get the object even if this object's physical file is removed on OSDs

2013-10-14 Thread Yehuda Sadeh
On Mon, Oct 14, 2013 at 4:04 AM, david zhang  wrote:
> Hi ceph-users,
>
> I uploaded an object successfully to radosgw with 3 replicas. And I located
> all the physical paths of 3 replicas on different OSDs.
>
> i.e, one of the 3 physical paths is
> /var/lib/ceph/osd/ceph-2/current/3.5_head/DIR_D/default.4896.65\\u20131014\\u1__head_0646563D__3
>
> Then I manually deleted all the 3 replica files on OSDs, but this object can
> still get from radosgw with http code 200 even I cleaned all the caches on
> both radosgw and OSDs by 'echo 3 > /proc/sys/vm/drop_caches'. Only after I
> restarted the 3 OSDs, get request will return 404.
>
> What did I miss? Is it not right to clean cache in that way?

I'm not too sure what you're trying to achieve. You should never ever
access the osd objects directly like that. The reason you're still
able to read the objects is probably because the osd keeps open fds
for recently opened files and it still holds a reference to them. If
you need to remove objects off the rados backend you should use the
rados tool to do that. However, since you created the objects via
radosgw, you're going to have some radosgw consistency issues, so in
that case the way to go would be by going through radosgw-admin (or
through the radosgw RESTful api).
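
The open-fd behavior described above can be demonstrated outside Ceph with a few lines of Python; this is a generic POSIX illustration of why an unlinked file remains readable while a process holds a descriptor, not radosgw code:

```python
import os
import tempfile

# An open file descriptor keeps the inode alive even after the path is
# unlinked -- analogous to an OSD still serving a manually deleted file.
fd, path = tempfile.mkstemp()
os.write(fd, b"object data")
os.unlink(path)                      # the path is gone from the directory
assert not os.path.exists(path)

os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 1024)             # ...but the data is still readable
os.close(fd)                         # closing drops the last reference
print(data)                          # prints b'object data'
```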


Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd down after server failure

2013-10-14 Thread Sage Weil
Is osd.47 the one with the bad disk?  It should not start.

If there are other osds on the same host that aren't started with 'service 
ceph start', you may have to mention them by name (the old version of the 
script would stop on the first error instead of continuing).  e.g.,

 service ceph start osd.48
 service ceph start osd.49
 ...
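
When many osds share a host, the per-osd starts above can be scripted; the id range below is illustrative, and the commands are echoed so they can be reviewed before being run for real:

```shell
# Start each remaining OSD on this host by name.  Drop 'echo' to execute.
started=""
for id in 48 49 50; do
  echo "service ceph start osd.$id"
  started="$started osd.$id"
done
```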

sage

On Mon, 14 Oct 2013, Dominik Mostowiec wrote:

> Hi
> I have found somthing.
> After restart time was wrong on server (+2hours) before ntp has fixed it.
> I restarted this 3 osd - it not helps.
> It is possible that ceph banned this osd? Or after start with wrong
> time osd has broken hi's filestore?
> 
> --
> Regards
> Dominik
> 
> 
> 2013/10/14 Dominik Mostowiec :
> > Hi,
> > I had server failure that starts from one disk failure:
> > Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023986] sd 4:2:26:0:
> > [sdaa] Unhandled error code
> > Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023990] sd 4:2:26:0:
> > [sdaa]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023995] sd 4:2:26:0:
> > [sdaa] CDB: Read(10): 28 00 00 00 00 d0 00 00 10 00
> > Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024005] end_request:
> > I/O error, dev sdaa, sector 208
> > Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024744] XFS (sdaa):
> > metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
> > count 8192
> > Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.025879] XFS (sdaa):
> > xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> > Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.820288] XFS (sdaa):
> > metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
> > count 8192
> > Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.821194] XFS (sdaa):
> > xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> > Oct 14 03:25:32 s3-10-177-64-6 kernel: [1027264.667851] XFS (sdaa):
> > metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
> > count 8192
> >
> > this caused that the server has been unresponsive.
> >
> > After server restart 3 of 26 osd on it are down.
> > In ceph-osd log after "debug osd = 10" and restart is:
> >
> > 2013-10-14 06:21:23.141936 7fdeb4872700 -1 osd.47 43203 *** Got signal
> > Terminated ***
> > 2013-10-14 06:21:23.142141 7fdeb4872700 -1 osd.47 43203  pausing thread 
> > pools
> > 2013-10-14 06:21:23.142146 7fdeb4872700 -1 osd.47 43203  flushing io
> > 2013-10-14 06:21:25.406187 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
> > appears to work
> > 2013-10-14 06:21:25.406204 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
> > 'filestore fiemap' config option
> > 2013-10-14 06:21:25.406557 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount did NOT detect btrfs
> > 2013-10-14 06:21:25.412617 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
> > (by glibc and kernel)
> > 2013-10-14 06:21:25.412831 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount found snaps <>
> > 2013-10-14 06:21:25.415798 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
> > btrfs not detected
> > 2013-10-14 06:21:26.078377 7f02690f9780  2 osd.47 0 mounting
> > /vol0/data/osd.47 /vol0/data/osd.47/journal
> > 2013-10-14 06:21:26.080872 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
> > appears to work
> > 2013-10-14 06:21:26.080885 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
> > 'filestore fiemap' config option
> > 2013-10-14 06:21:26.081289 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount did NOT detect btrfs
> > 2013-10-14 06:21:26.087524 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
> > (by glibc and kernel)
> > 2013-10-14 06:21:26.087582 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount found snaps <>
> > 2013-10-14 06:21:26.089614 7f02690f9780  0
> > filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
> > btrfs not detected
> > 2013-10-14 06:21:26.726676 7f02690f9780  2 osd.47 0 boot
> > 2013-10-14 06:21:26.726773 7f02690f9780 10 osd.47 0 read_superblock
> > sb(16773c25-5054-4451-bf9f-efc1f7f21b89 osd.47
> > 63cf7d70-99cb-0ab1-4006-002f e43203 [41261,43203]
> > lci=[43194,43203])
> > 2013-10-14 06:21:26.726862 7f02690f9780 10 osd.47 0 add_map_bl 43203 82622 
> > bytes
> > 2013-10-14 06:21:26.727184 7f02690f9780 10 osd.47 43203 load_pgs
> > 2013-10-14 06:21:26.727643 7f02690f9780 10 osd.47 43203 load_pgs
> > ignoring unrecognized meta
> > 2013-10-14 06:21:26.727681 7f02690f9780 10 osd.47 43203 load_pgs
> > 3.df1_TEMP clearing temp
> >
> > osd.47 is still down, I put it out from cluster.
> > 47  1   osd.47  down0
> >
> > How can I check what is wrong?
> >
> > ceph -v
> > ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
> >
> > --
> > Pozdrawi

Re: [ceph-users] 2013-10-14 14:42:23 auto-saved draft

2013-10-14 Thread Noah Watkins
Do you have the following in your core-site.xml?

> <property>
>   <name>fs.ceph.impl</name>
>   <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
> </property>
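
For reference, the key=value settings quoted in the original message correspond to core-site.xml entries like the following (monitor address and paths are the thread's own; this is a sketch, not a verified configuration):

```xml
<!-- Sketch of a core-site.xml for CephFS, using the values from this thread. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>ceph://192.168.22.158:6789/</value>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
  <property>
    <name>ceph.conf.file</name>
    <value>/etc/ceph/ceph.conf</value>
  </property>
</configuration>
```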

On Sun, Oct 13, 2013 at 11:55 PM, 鹏  wrote:
> hi all
> I follow the mail  configure the ceph with hadoop
> (http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/1809).
> 1. Install additional packages: libcephfs-java libcephfs-jni  using the
> commonds:
> ./configure --enable-cephfs-java
> make & make install
> cp /src/java/libcephfs.jar  /usr/hadoop/lib/
> 2. Download http://ceph.com/download/hadoop-cephfs.jar
>  cp hadoop-cephfs.jar /usr/hadoop/lib
>
>
>   3. Symlink JNI library
> cd /usr/hadoop/lib/native/Linux-amd64-64
> ln -s /usr/local/lib/libcephfs_jni.so .
>
>4 vim  core-site.xml
> fs.default.name=ceph://192.168.22.158:6789/
> fs.ceph.impl=org.apache.hadoop.fs.ceph.CephFileSystem
> ceph.conf.file=/etc/ceph/ceph.conf
>
> and then
># hadoop fs -ls
>ls: cannot access . :no such file or directory
> #hadoop dfsadmin -report
> report:FileSystem ceph://192.168.22.158:6789 is not a distributed file
> System
> Usage: java DFSAdmin[-report]
>
> thanks
> pengft
>
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] using ceph with hadoop

2013-10-14 Thread Noah Watkins
The error below seems to indicate that Hadoop isn't aware of the `ceph://`
file system. You'll need to manually add this to your core-site.xml:

> <property>
>   <name>fs.ceph.impl</name>
>   <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
> </property>



> report:FileSystem ceph://192.168.22.158:6789 is not a distributed file 
> System
> Usage: java DFSAdmin[-report]
> # /usr/hadoop/bin/stop-all.sh
> # /usr/hadoop/bin/start-all.sh
>hadoop:Exception in thread "IPC Client(47) Connection to 
> 192.168.58.129:6789 from rwt java.lang.RuntimeException:readObject cant find 
> class
>
>
>
>  thanks
> pengft
>
>
>
>
>
>
>
>
>
> **
>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Using ceph with hadoop error

2013-10-14 Thread Noah Watkins
On Sun, Oct 13, 2013 at 8:28 PM, 鹏  wrote:
>  hi all:
> Exception in thread "main" java.lang.NoClassDefFoundError:
> com/ceph/fs/cephFileAlreadyExisteException
> at java.lang.class.forName0(Native Method)

This looks like a bug, which I'll fixup today. But it shouldn't be
related to the problems you are seeing.

> Caused by :
> java.lang.classNotFoundException:com.ceph.fs.CephFileAlreadyExistsException
>  at java.net.URLClassLoader$1.run(URLClassLoader.jar:202)
>  at

This looks like you don't have the CephFS Java bindings in a place
where Hadoop can locate them. Typically you can put the libcephfs jar file
into the lib directory of Hadoop, or add it to your classpath.
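
Either fix can be expressed in a couple of shell lines; the jar path below is an assumption (use wherever `make install` actually placed libcephfs.jar on your system):

```shell
# Sketch: expose the CephFS Java bindings to Hadoop via the classpath.
CEPHFS_JAR=/usr/share/java/libcephfs.jar   # assumed location
export HADOOP_CLASSPATH="$CEPHFS_JAR${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
echo "$HADOOP_CLASSPATH"
```

Copying the jar into Hadoop's lib directory works too; the classpath route avoids touching the Hadoop install.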
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw can still get the object even if this object's physical file is removed on OSDs

2013-10-14 Thread david zhang
Hi ceph-users,

I uploaded an object successfully to radosgw with 3 replicas. And I located
all the physical paths of 3 replicas on different OSDs.

i.e, one of the 3 physical paths is
/var/lib/ceph/osd/ceph-2/current/3.5_head/DIR_D/default.4896.65\\u20131014\\u1__head_0646563D__3

Then I manually deleted all 3 replica files on the OSDs, but this object
can still be fetched from radosgw with http code 200, even after I cleaned all
the caches on both radosgw and the OSDs with 'echo 3 > /proc/sys/vm/drop_caches'.
Only after I restarted the 3 OSDs does the get request return 404.

What did I miss? Is it not right to clean cache in that way?

Thanks.

-- 
Regards,
Zhi
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Speed limit on RadosGW?

2013-10-14 Thread Chu Duc Minh
Hi sorry, i missed this mail.

> During writes, does the CPU usage on your RadosGW node go way up?
No, CPU usage stays the same & very low (< 10%)

When uploading small files (300KB/file) over RadosGW:
 - using 1 process: upload bandwidth ~ 3MB/s
 - using 100 processes: upload bandwidth ~ 15MB/s

When uploading big files (3GB/file) over RadosGW:
 - using 1 process: upload bandwidth ~ 70MB/s
(Therefore I don't upload big files using multiple processes any more :D)

Maybe RadosGW has a problem when writing many small files. Or it's a
problem in Ceph when simultaneously writing many small files into a bucket
that already has millions of files?
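
A back-of-the-envelope model shows how a fixed per-request cost (auth, bucket index update, replication ack) can reproduce single-stream numbers like these. The overhead value is an assumption chosen to match the figures in this thread, not a measurement, and the model says nothing about the multi-process ceiling, which points at server-side contention instead:

```python
def effective_mb_per_s(object_mb, per_request_overhead_s, link_mb_per_s):
    """Throughput of one sequential uploader: each object pays a fixed
    per-request overhead plus its transfer time at the link rate."""
    return object_mb / (per_request_overhead_s + object_mb / link_mb_per_s)

OVERHEAD_S = 0.096   # ~100 ms fixed cost per request (assumed)
LINK = 70.0          # large-object streaming rate in MB/s (from the thread)

small = effective_mb_per_s(0.3, OVERHEAD_S, LINK)     # 300 KB objects
big = effective_mb_per_s(3000.0, OVERHEAD_S, LINK)    # 3 GB objects
# prints roughly: small files: 3.0 MB/s, big files: 69.8 MB/s
print(f"small files: {small:.1f} MB/s, big files: {big:.1f} MB/s")
```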


On Wed, Sep 25, 2013 at 7:24 PM, Mark Nelson wrote:

> On 09/25/2013 02:49 AM, Chu Duc Minh wrote:
>
>> I have a CEPH cluster with 9 nodes (6 data nodes & 3 mon/mds nodes)
>> And i setup 4 separate nodes to test performance of Rados-GW:
>>   - 2 node run Rados-GW
>>   - 2 node run multi-process put file to [multi] Rados-GW
>>
>> Result:
>> a) When i use 1 RadosGW node & 1 upload-node, speed upload = 50MB/s
>> /upload-node, Rados-GW input/output speed = 50MB/s
>>
>> b) When i use 2 RadosGW node & 1 upload-node, speed upload = 50MB/s
>> /upload-node; each RadosGW have input/output = 25MB/s ==> sum
>> input/ouput of 2 Rados-GW = 50MB/s
>>
>> c) When i use 1 RadosGW node & 2 upload-node, speed upload = 25MB/s
>> /upload-node ==> sum output of 2 upload-node = 50MB/s, RadosGW have
>> input/output = 50MB/s
>>
>> d) When i use 2 RadosGW node & 2 upload-node, speed upload = 25MB/s
>> /upload-node ==> sum output of 2 upload-node = 50MB/s; each RadosGW have
>> input/output = 25MB/s ==> sum input/ouput of 2 Rados-GW = 50MB/s
>>
>> _*Problem*_: I can't pass the 50MB/s limit when putting files over
>> Rados-GW, regardless of the number of Rados-GW nodes and upload-nodes.
>> When i use this CEPH cluster over librados (openstack/kvm), i can easily
>> achieve > 300MB/s
>>
>> I don't know why performance of RadosGW is so low. What's bottleneck?
>>
>
> During writes, does the CPU usage on your RadosGW node go way up?
>
> If this is a test cluster, you might want to try the wip-6286 build from
> our gitbuilder site.  There is a fix that depending on the size of your
> objects, could have a big impact on performance.  We're currently
> investigating some other radosgw performance issues as well, so stay tuned.
> :)
>
> Mark
>
>
>
>> Thank you very much!
>>
>>
>>
>>
>> __**_
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/**listinfo.cgi/ceph-users-ceph.**com
>>
>>
> __**_
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/**listinfo.cgi/ceph-users-ceph.**com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] using ceph with hadoop

2013-10-14 Thread


hi all
I follow the mail  configure the ceph with hadoop 
(http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/1809).
1. Install additional packages: libcephfs-java libcephfs-jni  using the 
commonds:
./configure --enable-cephfs-java
make & make install
cp /src/java/libcephfs.jar  /usr/hadoop/lib/
2. Download http://ceph.com/download/hadoop-cephfs.jar
 cp hadoop-cephfs.jar /usr/hadoop/lib
   

  3. Symlink JNI library
cd /usr/hadoop/lib/native/Linux-amd64-64 
ln -s /usr/local/lib/libcephfs_jni.so .

   4 vim  core-site.xml 
fs.default.name=ceph://192.168.22.158:6789/
fs.ceph.impl=org.apache.hadoop.fs.ceph.CephFileSystem
ceph.conf.file=/etc/ceph/ceph.conf

and then  
   # hadoop fs -ls
   ls: cannot access . :no such file or directory
#hadoop dfsadmin -report
report:FileSystem ceph://192.168.22.158:6789 is not a distributed file 
System
Usage: java DFSAdmin[-report]
# /usr/hadoop/bin/stop-all.sh
# /usr/hadoop/bin/start-all.sh
   hadoop:Exception in thread "IPC Client(47) Connection to 
192.168.58.129:6789 from rwt java.lang.RuntimeException:readObject cant find 
class



 thanks
pengft









___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw-admin doesn't list user anymore

2013-10-14 Thread Valery Tschopp

We upgraded from 0.61.8 to 0.67.4.

The metadata commands works for the users and the buckets:

root@ineri ~$ radosgw-admin metadata list bucket
[
"a4mesh",
"61a75c04-34a5-11e3-9bea-8f8d15b5cf20",
"6e22de72-34a5-11e3-afc4-d3f70b676c52",
...


root@ineri ~$ radosgw-admin metadata list user
[
"cloudbroker",
"a4mesh",
"valery",
...

Cheers,
Valery

On 11/10/13 18:27 , Yehuda Sadeh wrote:

On Fri, Oct 11, 2013 at 7:46 AM, Valery Tschopp
 wrote:

Hi,

Since we upgraded ceph to 0.67.4, the radosgw-admin doesn't list all the
users anymore:

root@ineri:~# radosgw-admin user info
could not fetch user info: no user info saved


But it still work for single user:

root@ineri:~# radosgw-admin user info --uid=valery
{ "user_id": "valery",
"display_name": "Valery Tschopp",
"email": "valery.tsch...@switch.ch",
...

The debug log file is too big for the mailing-list, but here it is on
pastebin: http://pastebin.com/cFypJ2Qd



What version did you upgrade from?

You can try using the following:

$ radosgw-admin metadata list bucket

Thanks,
Yehuda



--
SWITCH
--
Valery Tschopp, Software Engineer, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
email: valery.tsch...@switch.ch phone: +41 44 268 1544




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd down after server failure

2013-10-14 Thread Dominik Mostowiec
Hi
I have found something.
After the restart, the time on the server was wrong (+2 hours) before ntp fixed it.
I restarted these 3 osds - it didn't help.
Is it possible that ceph banned these osds? Or that starting with the
wrong time broke the osds' filestores?

--
Regards
Dominik


2013/10/14 Dominik Mostowiec :
> Hi,
> I had server failure that starts from one disk failure:
> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023986] sd 4:2:26:0:
> [sdaa] Unhandled error code
> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023990] sd 4:2:26:0:
> [sdaa]  Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.023995] sd 4:2:26:0:
> [sdaa] CDB: Read(10): 28 00 00 00 00 d0 00 00 10 00
> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024005] end_request:
> I/O error, dev sdaa, sector 208
> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.024744] XFS (sdaa):
> metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
> count 8192
> Oct 14 03:25:04 s3-10-177-64-6 kernel: [1027237.025879] XFS (sdaa):
> xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.820288] XFS (sdaa):
> metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
> count 8192
> Oct 14 03:25:28 s3-10-177-64-6 kernel: [1027260.821194] XFS (sdaa):
> xfs_imap_to_bp: xfs_trans_read_buf() returned error 5.
> Oct 14 03:25:32 s3-10-177-64-6 kernel: [1027264.667851] XFS (sdaa):
> metadata I/O error: block 0xd0 ("xfs_trans_read_buf") error 5 buf
> count 8192
>
> this caused that the server has been unresponsive.
>
> After server restart 3 of 26 osd on it are down.
> In ceph-osd log after "debug osd = 10" and restart is:
>
> 2013-10-14 06:21:23.141936 7fdeb4872700 -1 osd.47 43203 *** Got signal
> Terminated ***
> 2013-10-14 06:21:23.142141 7fdeb4872700 -1 osd.47 43203  pausing thread pools
> 2013-10-14 06:21:23.142146 7fdeb4872700 -1 osd.47 43203  flushing io
> 2013-10-14 06:21:25.406187 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
> appears to work
> 2013-10-14 06:21:25.406204 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
> 'filestore fiemap' config option
> 2013-10-14 06:21:25.406557 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount did NOT detect btrfs
> 2013-10-14 06:21:25.412617 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
> (by glibc and kernel)
> 2013-10-14 06:21:25.412831 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount found snaps <>
> 2013-10-14 06:21:25.415798 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
> btrfs not detected
> 2013-10-14 06:21:26.078377 7f02690f9780  2 osd.47 0 mounting
> /vol0/data/osd.47 /vol0/data/osd.47/journal
> 2013-10-14 06:21:26.080872 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is supported and
> appears to work
> 2013-10-14 06:21:26.080885 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount FIEMAP ioctl is disabled via
> 'filestore fiemap' config option
> 2013-10-14 06:21:26.081289 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount did NOT detect btrfs
> 2013-10-14 06:21:26.087524 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount syncfs(2) syscall fully supported
> (by glibc and kernel)
> 2013-10-14 06:21:26.087582 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount found snaps <>
> 2013-10-14 06:21:26.089614 7f02690f9780  0
> filestore(/vol0/data/osd.47) mount: enabling WRITEAHEAD journal mode:
> btrfs not detected
> 2013-10-14 06:21:26.726676 7f02690f9780  2 osd.47 0 boot
> 2013-10-14 06:21:26.726773 7f02690f9780 10 osd.47 0 read_superblock
> sb(16773c25-5054-4451-bf9f-efc1f7f21b89 osd.47
> 63cf7d70-99cb-0ab1-4006-002f e43203 [41261,43203]
> lci=[43194,43203])
> 2013-10-14 06:21:26.726862 7f02690f9780 10 osd.47 0 add_map_bl 43203 82622 
> bytes
> 2013-10-14 06:21:26.727184 7f02690f9780 10 osd.47 43203 load_pgs
> 2013-10-14 06:21:26.727643 7f02690f9780 10 osd.47 43203 load_pgs
> ignoring unrecognized meta
> 2013-10-14 06:21:26.727681 7f02690f9780 10 osd.47 43203 load_pgs
> 3.df1_TEMP clearing temp
>
> osd.47 is still down, I put it out from cluster.
> 47  1   osd.47  down0
>
> How can I check what is wrong?
>
> ceph -v
> ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
>
> --
> Pozdrawiam
> Dominik



-- 
Pozdrawiam
Dominik
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com