Re: [ceph-users] radosgw bucket index sharding tips?

2015-12-17 Thread Wido den Hollander
On 12/17/2015 06:29 AM, Ben Hines wrote:
> 
> 
> On Wed, Dec 16, 2015 at 11:05 AM, Florian Haas wrote:
> 
> Hi Ben & everyone,
> 
> 
> Ben, you wrote elsewhere
> 
> (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-August/003955.html)
> that you found approx. 900k objects to be the threshold where index
> sharding becomes necessary. Have you found that to be a reasonable
> rule of thumb, as in "try 1-2 shards per million objects in your most
> populous bucket"? Also, do you reckon that beyond that, more shards
> make things worse?
> 
> 
>  
> Oh, and to answer this part.   I didn't do that much experimentation
> unfortunately.  I actually am using about 24 index shards per bucket
> currently and we delete each bucket once it hits about a million
> objects. (it's just a throwaway cache for us) Seems ok, so i stopped
> tweaking.
> 

I have a use case where I need to store 350 Million objects in a single
bucket.

I tested with 4096 shards and that works. Creating the bucket takes a
few seconds though.

This setup is for archiving purposes, so data is written and not read
that much afterwards.
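
For reference, in hammer the shard count is fixed when a bucket is created, so
it is normally set up front in ceph.conf. A minimal sketch (the section name
and the value of 24 are only examples taken from this thread):

[client.radosgw.gateway]
# only affects buckets created after the option is set; existing
# buckets keep the shard count they were created with
rgw override bucket index max shards = 24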

> Also, i think i have a pretty slow cluster as far as write speed is
> concerned, since we do not have SSD Journals. With SSD journals i
> imagine the index write speed is significantly improved, but i am not
> sure how much. A faster cluster could probably handle bigger indexes.
> 
> -Ben
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

2015-12-17 Thread Loic Dachary

And 95-ceph-osd.rules contains the following ?

#  Check gpt partion for ceph tags and activate
ACTION=="add", SUBSYSTEM=="block", \
  ENV{DEVTYPE}=="partition", \
  ENV{ID_PART_TABLE_TYPE}=="gpt", \
  RUN+="/usr/sbin/ceph-disk-udev $number $name $parent"

On 17/12/2015 08:29, Jesper Thorhauge wrote:
> Hi Loic,
> 
> osd's are on /dev/sda and /dev/sdb, journal's is on /dev/sdc (sdc3 / sdc4).
> 
> sgdisk for sda shows;
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6
> First sector: 2048 (at 1024.0 KiB)
> Last sector: 1953525134 (at 931.5 GiB)
> Partition size: 1953523087 sectors (931.5 GiB)
> Attribute flags: 
> Partition name: 'ceph data'
> 
> for sdb
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F
> First sector: 2048 (at 1024.0 KiB)
> Last sector: 1953525134 (at 931.5 GiB)
> Partition size: 1953523087 sectors (931.5 GiB)
> Attribute flags: 
> Partition name: 'ceph data'
> 
> for /dev/sdc3
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072
> First sector: 935813120 (at 446.2 GiB)
> Last sector: 956293119 (at 456.0 GiB)
> Partition size: 2048 sectors (9.8 GiB)
> Attribute flags: 
> Partition name: 'ceph journal'
> 
> for /dev/sdc4
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168
> First sector: 956293120 (at 456.0 GiB)
> Last sector: 976773119 (at 465.8 GiB)
> Partition size: 2048 sectors (9.8 GiB)
> Attribute flags: 
> Partition name: 'ceph journal'
> 
> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it 
> seems correct to me.
> 
> after a reboot, /dev/disk/by-partuuid is;
> 
> -rw-r--r-- 1 root root  0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
> -rw-r--r-- 1 root root  0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
> -> ../../sdb1
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 
> -> ../../sda1
> 
> i dont know how to verify the symlink of the journal file - can you guide me 
> on that one?
> 
> Thank :-) !
> 
> /Jesper
> 
> **
> 
> Hi,
> 
> On 17/12/2015 07:53, Jesper Thorhauge wrote:
>> Hi,
>>
>> Some more information showing in the boot.log;
>>
>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 
>> filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on 
>> /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument
>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs 
>> failed with error -22
>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1  ** ERROR: error creating empty 
>> object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument
>> ERROR:ceph-disk:Failed to activate
>> ceph-disk: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', 
>> '--mkkey', '-i', '7', '--monmap', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/activate.monmap', '--osd-data', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE', '--osd-journal', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/journal', '--osd-uuid', 
>> 'c83b5aa5-fe77-42f6-9415-25ca0266fb7f', '--keyring', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/keyring']' returned non-zero exit status 1
>> ceph-disk: Error: One or more partitions failed to activate
>>
>> Maybe related to the "(22) Invalid argument" part..?
> 
> After a reboot the symlinks are reconstructed and if they are still 
> incorrect, it means there is an inconsistency somewhere else. To debug the 
> problem, could you mount /dev/sda1 and verify the symlink of the journal file 
> ? Then verify the content of /dev/disk/by-partuuid. And also display the 
> partition information with sgdisk -i 1 /dev/sda and sgdisk -i 2 /dev/sda. Are 
> you collocating your journal with the data, on the same disk ? Or are they on 
> two different disks ?
> 
> git log --no-merges --oneline tags/v0.94.3..tags/v0.94.5 udev
> 
> shows nothing, meaning there has been no change to udev rules. There is one 
> change related to the installation of the udev rules 
> https://github.com/ceph/ceph/commit/4eb58ad2027148561d94bb43346b464b55d041a6. 
> Could you double check 60-ceph-partuuid-workaround.rules is installed where 
> it should ?
> 
> Cheers
> 
>>
>> /Jesper
>>
>> *
>>
>> Hi,
>>
>> I have done several reboots, and it did not lead to healthy symlinks :-(
>>
>> /Jesper
>>
>> 
>>
>> Hi,
>>
>> On 16/12/2015 07:39, Jesper Thorhauge wrote:
>>> Hi,
>>>
>>> A fresh server install on one of my nodes (and yum update) left me with 
>>> CentOS 6.7 / Ceph 0.94.5. All the other nodes are running Ceph 0.94.2.
>>>
>>> "ceph-disk prepare /dev/sda /dev/sdc" seems to work as expected, but 
>>> "ceph-disk activate / dev/sda1" fails. I have traced 

Re: [ceph-users] recommendations for file sharing

2015-12-17 Thread Alex Leake
Lin,


Thanks for this! I did not see the ownCloud RADOS implementation.


I maintain a local ownCloud environment anyway, so this is a really good idea.


Have you used it?



Regards,

Alex.


From: lin zhou 周林 
Sent: 17 December 2015 02:10
To: Alex Leake; ceph-users@lists.ceph.com
Subject: Re: recommendations for file sharing

Seafile is another way. It supports writing data to Ceph using
librados directly.

On 2015-12-15 10:51, Wido den Hollander wrote:
> Are you sure you need file sharing? ownCloud for example now has native
> RADOS support using phprados.
>
> Isn't ownCloud something that could work? Talking native RADOS is always
> the best.
>
> Wido
>
>
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

2015-12-17 Thread Jesper Thorhauge
Hi Loic, 

Yep, 95-ceph-osd.rules contains exactly that... 

*** 

And 95-ceph-osd.rules contains the following ? 

# Check gpt partion for ceph tags and activate 
ACTION=="add", SUBSYSTEM=="block", \ 
ENV{DEVTYPE}=="partition", \ 
ENV{ID_PART_TABLE_TYPE}=="gpt", \ 
RUN+="/usr/sbin/ceph-disk-udev $number $name $parent" 

On 17/12/2015 08:29, Jesper Thorhauge wrote: 
> Hi Loic, 
> 
> osd's are on /dev/sda and /dev/sdb, journal's is on /dev/sdc (sdc3 / sdc4). 
> 
> sgdisk for sda shows; 
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) 
> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6 
> First sector: 2048 (at 1024.0 KiB) 
> Last sector: 1953525134 (at 931.5 GiB) 
> Partition size: 1953523087 sectors (931.5 GiB) 
> Attribute flags:  
> Partition name: 'ceph data' 
> 
> for sdb 
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) 
> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F 
> First sector: 2048 (at 1024.0 KiB) 
> Last sector: 1953525134 (at 931.5 GiB) 
> Partition size: 1953523087 sectors (931.5 GiB) 
> Attribute flags:  
> Partition name: 'ceph data' 
> 
> for /dev/sdc3 
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) 
> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072 
> First sector: 935813120 (at 446.2 GiB) 
> Last sector: 956293119 (at 456.0 GiB) 
> Partition size: 2048 sectors (9.8 GiB) 
> Attribute flags:  
> Partition name: 'ceph journal' 
> 
> for /dev/sdc4 
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) 
> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168 
> First sector: 956293120 (at 456.0 GiB) 
> Last sector: 976773119 (at 465.8 GiB) 
> Partition size: 2048 sectors (9.8 GiB) 
> Attribute flags:  
> Partition name: 'ceph journal' 
> 
> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it 
> seems correct to me. 
> 
> after a reboot, /dev/disk/by-partuuid is; 
> 
> -rw-r--r-- 1 root root 0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168 
> -rw-r--r-- 1 root root 0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072 
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
> -> ../../sdb1 
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 
> -> ../../sda1 
> 
> i dont know how to verify the symlink of the journal file - can you guide me 
> on that one? 
> 
> Thank :-) ! 
> 
> /Jesper 
> 
> ** 
> 
> Hi, 
> 
> On 17/12/2015 07:53, Jesper Thorhauge wrote: 
>> Hi, 
>> 
>> Some more information showing in the boot.log; 
>> 
>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 
>> filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on 
>> /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument 
>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs 
>> failed with error -22 
>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1 ** ERROR: error creating empty 
>> object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument 
>> ERROR:ceph-disk:Failed to activate 
>> ceph-disk: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', 
>> '--mkkey', '-i', '7', '--monmap', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/activate.monmap', '--osd-data', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE', '--osd-journal', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/journal', '--osd-uuid', 
>> 'c83b5aa5-fe77-42f6-9415-25ca0266fb7f', '--keyring', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/keyring']' returned non-zero exit status 1 
>> ceph-disk: Error: One or more partitions failed to activate 
>> 
>> Maybe related to the "(22) Invalid argument" part..? 
> 
> After a reboot the symlinks are reconstructed and if they are still 
> incorrect, it means there is an inconsistency somewhere else. To debug the 
> problem, could you mount /dev/sda1 and verify the symlink of the journal file 
> ? Then verify the content of /dev/disk/by-partuuid. And also display the 
> partition information with sgdisk -i 1 /dev/sda and sgdisk -i 2 /dev/sda. Are 
> you collocating your journal with the data, on the same disk ? Or are they on 
> two different disks ? 
> 
> git log --no-merges --oneline tags/v0.94.3..tags/v0.94.5 udev 
> 
> shows nothing, meaning there has been no change to udev rules. There is one 
> change related to the installation of the udev rules 
> https://github.com/ceph/ceph/commit/4eb58ad2027148561d94bb43346b464b55d041a6. 
> Could you double check 60-ceph-partuuid-workaround.rules is installed where 
> it should ? 
> 
> Cheers 
> 
>> 
>> /Jesper 
>> 
>> * 
>> 
>> Hi, 
>> 
>> I have done several reboots, and it did not lead to healthy symlinks :-( 
>> 
>> /Jesper 
>> 
>>  
>> 
>> Hi, 
>> 
>> On 16/12/2015 07:39, Jesper Thorhauge wrote: 
>>> Hi, 
>>> 
>>> A fresh server install on one of my nodes (and yum update) left me with 
>>> CentOS 6.7 / Ceph 0.94.5. All the other nodes are running 

Re: [ceph-users] Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

2015-12-17 Thread Loic Dachary
The non-symlink files in /dev/disk/by-partuuid come into existence because of:

* system boots
* udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1
  * ceph-disk-udev creates the symlink 
/dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
  * ceph-disk activate /dev/sda1 mounts the data partition and finds a symlink 
to the journal, journal -> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168, 
which does not yet exist because the /dev/sdc udev rules have not been run yet
  * ceph-osd opens the journal in write mode and that creates the file 
/dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 as a regular file
  * the file is empty and the osd fails to activate with the error you see 
(EINVAL because the file is empty)

This is ok, supported and expected since there is no way to know which disk 
will show up first.

When /dev/sdc shows up, the same logic will be triggered:

* udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sdc3
  * ceph-disk-udev creates the symlink 
/dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc3 
(overwriting the regular file, because of ln -sf)
  * ceph-disk activate-journal /dev/sdc3 finds that 
c83b5aa5-fe77-42f6-9415-25ca0266fb7f is the data partition for that journal and 
mounts /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f
  * ceph-osd opens the journal and all is well

Except that something goes wrong in your case, presumably because ceph-disk-udev 
is not called when /dev/sdc3 shows up?
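
If it helps, one way to test that part by hand, sketched with the arguments the
rule quoted earlier in this thread would pass ($number $name $parent) and /mnt
as a throwaway mount point:

$ # invoke the hook exactly as the udev rule would, for both journal partitions
$ /usr/sbin/ceph-disk-udev 3 sdc3 sdc
$ /usr/sbin/ceph-disk-udev 4 sdc4 sdc
$ # both partuuid entries should now be symlinks into ../../sdc3 and ../../sdc4
$ ls -l /dev/disk/by-partuuid/
$ # mount the data partition and see which partuuid its journal link points at
$ mount /dev/sda1 /mnt && ls -l /mnt/journal && umount /mnt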

On 17/12/2015 08:29, Jesper Thorhauge wrote:
> Hi Loic,
> 
> osd's are on /dev/sda and /dev/sdb, journal's is on /dev/sdc (sdc3 / sdc4).
> 
> sgdisk for sda shows;
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6
> First sector: 2048 (at 1024.0 KiB)
> Last sector: 1953525134 (at 931.5 GiB)
> Partition size: 1953523087 sectors (931.5 GiB)
> Attribute flags: 
> Partition name: 'ceph data'
> 
> for sdb
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F
> First sector: 2048 (at 1024.0 KiB)
> Last sector: 1953525134 (at 931.5 GiB)
> Partition size: 1953523087 sectors (931.5 GiB)
> Attribute flags: 
> Partition name: 'ceph data'
> 
> for /dev/sdc3
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072
> First sector: 935813120 (at 446.2 GiB)
> Last sector: 956293119 (at 456.0 GiB)
> Partition size: 2048 sectors (9.8 GiB)
> Attribute flags: 
> Partition name: 'ceph journal'
> 
> for /dev/sdc4
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168
> First sector: 956293120 (at 456.0 GiB)
> Last sector: 976773119 (at 465.8 GiB)
> Partition size: 2048 sectors (9.8 GiB)
> Attribute flags: 
> Partition name: 'ceph journal'
> 
> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it 
> seems correct to me.
> 
> after a reboot, /dev/disk/by-partuuid is;
> 
> -rw-r--r-- 1 root root  0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
> -rw-r--r-- 1 root root  0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
> -> ../../sdb1
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 
> -> ../../sda1
> 
> i dont know how to verify the symlink of the journal file - can you guide me 
> on that one?
> 
> Thank :-) !
> 
> /Jesper
> 
> **
> 
> Hi,
> 
> On 17/12/2015 07:53, Jesper Thorhauge wrote:
>> Hi,
>>
>> Some more information showing in the boot.log;
>>
>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 
>> filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on 
>> /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument
>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs 
>> failed with error -22
>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1  ** ERROR: error creating empty 
>> object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument
>> ERROR:ceph-disk:Failed to activate
>> ceph-disk: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', 
>> '--mkkey', '-i', '7', '--monmap', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/activate.monmap', '--osd-data', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE', '--osd-journal', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/journal', '--osd-uuid', 
>> 'c83b5aa5-fe77-42f6-9415-25ca0266fb7f', '--keyring', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/keyring']' returned non-zero exit status 1
>> ceph-disk: Error: One or more partitions failed to activate
>>
>> Maybe related to the "(22) Invalid argument" part..?
> 
> After a reboot the symlinks are reconstructed and if they are still 
> incorrect, it means there is an inconsistency somewhere else. To debug the 
> problem, could 

Re: [ceph-users] Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

2015-12-17 Thread Jesper Thorhauge
Hi Loic, 

Sounds like something does go wrong when /dev/sdc3 shows up. Is there any way I 
can debug this further? Log files? Modify the .rules file...? 
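
A couple of generic udev checks that might narrow it down, assuming /dev/sdc3
is the partition in question (a sketch only):

$ # show what udev would do for the partition and whether the ceph rule fires
$ udevadm test $(udevadm info -q path -n /dev/sdc3) 2>&1 | grep -i ceph
$ # watch events in one terminal ...
$ udevadm monitor --environment
$ # ... and replay the "add" event for the partition from another
$ echo add > /sys/block/sdc/sdc3/uevent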

/Jesper 

 

The non-symlink files in /dev/disk/by-partuuid come to existence because of: 

* system boots 
* udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1 
* ceph-disk-udev creates the symlink 
/dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1 
* ceph-disk activate /dev/sda1 is mounted and finds a symlink to the journal 
journal -> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 which 
does not yet exists because /dev/sdc udev rules have not been run yet 
* ceph-osd opens the journal in write mode and that creates the file 
/dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 as a regular file 
* the file is empty and the osd fails to activate with the error you see 
(EINVAL because the file is empty) 

This is ok, supported and expected since there is no way to know which disk 
will show up first. 

When /dev/sdc shows up, the same logic will be triggered: 

* udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1 
* ceph-disk-udev creates the symlink 
/dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc3 
(overriding the file because ln -sf) 
* ceph-disk activate-journal /dev/sdc3 finds that 
c83b5aa5-fe77-42f6-9415-25ca0266fb7f is the data partition for that journal and 
mounts /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
* ceph-osd opens the journal and all is well 

Except something goes wrong in your case, presumably because ceph-disk-udev is 
not called when /dev/sdc3 shows up ? 

On 17/12/2015 08:29, Jesper Thorhauge wrote: 
> Hi Loic, 
> 
> osd's are on /dev/sda and /dev/sdb, journal's is on /dev/sdc (sdc3 / sdc4). 
> 
> sgdisk for sda shows; 
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) 
> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6 
> First sector: 2048 (at 1024.0 KiB) 
> Last sector: 1953525134 (at 931.5 GiB) 
> Partition size: 1953523087 sectors (931.5 GiB) 
> Attribute flags:  
> Partition name: 'ceph data' 
> 
> for sdb 
> 
> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) 
> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F 
> First sector: 2048 (at 1024.0 KiB) 
> Last sector: 1953525134 (at 931.5 GiB) 
> Partition size: 1953523087 sectors (931.5 GiB) 
> Attribute flags:  
> Partition name: 'ceph data' 
> 
> for /dev/sdc3 
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) 
> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072 
> First sector: 935813120 (at 446.2 GiB) 
> Last sector: 956293119 (at 456.0 GiB) 
> Partition size: 2048 sectors (9.8 GiB) 
> Attribute flags:  
> Partition name: 'ceph journal' 
> 
> for /dev/sdc4 
> 
> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) 
> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168 
> First sector: 956293120 (at 456.0 GiB) 
> Last sector: 976773119 (at 465.8 GiB) 
> Partition size: 2048 sectors (9.8 GiB) 
> Attribute flags:  
> Partition name: 'ceph journal' 
> 
> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it 
> seems correct to me. 
> 
> after a reboot, /dev/disk/by-partuuid is; 
> 
> -rw-r--r-- 1 root root 0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168 
> -rw-r--r-- 1 root root 0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072 
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
> -> ../../sdb1 
> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 
> -> ../../sda1 
> 
> i dont know how to verify the symlink of the journal file - can you guide me 
> on that one? 
> 
> Thank :-) ! 
> 
> /Jesper 
> 
> ** 
> 
> Hi, 
> 
> On 17/12/2015 07:53, Jesper Thorhauge wrote: 
>> Hi, 
>> 
>> Some more information showing in the boot.log; 
>> 
>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 
>> filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on 
>> /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument 
>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs 
>> failed with error -22 
>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1 ** ERROR: error creating empty 
>> object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument 
>> ERROR:ceph-disk:Failed to activate 
>> ceph-disk: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', 
>> '--mkkey', '-i', '7', '--monmap', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/activate.monmap', '--osd-data', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE', '--osd-journal', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/journal', '--osd-uuid', 
>> 'c83b5aa5-fe77-42f6-9415-25ca0266fb7f', '--keyring', 
>> '/var/lib/ceph/tmp/mnt.aWZTcE/keyring']' returned non-zero exit status 1 
>> ceph-disk: Error: One or more partitions failed to 

Re: [ceph-users] [SOLVED] radosgw problem - 411 http status

2015-12-17 Thread Jacek Jarosiewicz

setting the "rgw content length compat" to true solved this...

J

On 12/17/2015 03:30 PM, Jacek Jarosiewicz wrote:

Hi,

I have a strange problem with the rados gateway. I'm getting Http 411
status code (Missing Content Length) whenever I upload any file to ceph.

The setup is: ceph 0.94.5, ubuntu 14.04, tengine (patched nginx).

The strange thing is - everything worked like a charm until today, when
I wanted to add ops logging to rados, and restarted the gw.

After that I reverted to the previous config, but the error persists.



--
Jacek Jarosiewicz
Administrator Systemów Informatycznych


SUPERMEDIA Sp. z o.o. z siedzibą w Warszawie
ul. Senatorska 13/15, 00-075 Warszawa
Sąd Rejonowy dla m.st.Warszawy, XII Wydział Gospodarczy Krajowego 
Rejestru Sądowego,

nr KRS 029537; kapitał zakładowy 42.756.000 zł
NIP: 957-05-49-503
Adres korespondencyjny: ul. Jubilerska 10, 04-190 Warszawa


SUPERMEDIA ->   http://www.supermedia.pl
dostep do internetu - hosting - kolokacja - lacza - telefonia
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] active+undersized+degraded

2015-12-17 Thread Loris Cuoghi



Le 17/12/2015 13:57, Loris Cuoghi a écrit :

Le 17/12/2015 13:52, Burkhard Linke a écrit :

Hi,

On 12/17/2015 01:41 PM, Dan Nica wrote:


And the osd tree:

$ ceph osd tree

ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 21.81180 root default

-2 21.81180 host rimu

0 7.27060 osd.0  up  1.0  1.0

1 7.27060 osd.1  up  1.0  1.0

2 7.27060 osd.2  up  1.0  1.0


the default CRUSH rulesets distribute PG replicates across hosts. With a
single host the rules are not able to find a second OSD for the
replicates.

Solutions:
- add a second host - or -
- change CRUSH ruleset to distribute based on OSDs instead of hosts.

Regards,
Burkhard

--
Dr. rer. nat. Burkhard Linke
Bioinformatics and Systems Biology
Justus-Liebig-University Giessen
35392 Giessen, Germany
Phone: (+49) (0)641 9935810



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




Default replication count is now 3, so, 3 hosts are needed by default.
Lowering the pool's replication count (e.g. to 1) for testing purposes
is also a possibility.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


> $ ceph osd dump | grep 'replicated size'
>
> pool 2 'data' replicated size 2 min_size 1 crush_ruleset 0 
object_hash rjenkins pg_num 64 pgp_num 64 last_change 52 flags 
hashpspool stripe_width 0


Sorry, didn't see that...
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] active+undersized+degraded

2015-12-17 Thread Dan Nica
And the osd tree:

$ ceph osd tree
ID WEIGHT   TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 21.81180 root default
-2 21.81180 host rimu
0  7.27060 osd.0  up  1.0  1.0
1  7.27060 osd.1  up  1.0  1.0
2  7.27060 osd.2  up  1.0  1.0

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Dan 
Nica
Sent: Thursday, December 17, 2015 2:41 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] active+undersized+degraded

Hi,

After managing to configure the osd server I created a pool "data" and removed 
pool "rbd"
and now the cluster is stuck in active+undersized+degraded

$ ceph status
cluster 046b0180-dc3f-4846-924f-41d9729d48c8
 health HEALTH_WARN
64 pgs degraded
64 pgs stuck unclean
64 pgs undersized
too few PGs per OSD (21 < min 30)
 monmap e1: 3 mons at 
{alder=10.6.250.249:6789/0,ash=10.6.250.248:6789/0,aspen=10.6.250.247:6789/0}
election epoch 6, quorum 0,1,2 aspen,ash,alder
 osdmap e53: 3 osds: 3 up, 3 in
flags sortbitwise
  pgmap v95: 64 pgs, 1 pools, 0 bytes data, 0 objects
107 MB used, 22335 GB / 22335 GB avail
  64 active+undersized+degraded

$ ceph osd dump | grep 'replicated size'
pool 2 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 52 flags hashpspool stripe_width 0

should I increase the number of PGs and PGPs?

--
Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] active+undersized+degraded

2015-12-17 Thread Loris Cuoghi

Le 17/12/2015 13:52, Burkhard Linke a écrit :

Hi,

On 12/17/2015 01:41 PM, Dan Nica wrote:


And the osd tree:

$ ceph osd tree

ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 21.81180 root default

-2 21.81180 host rimu

0 7.27060 osd.0  up  1.0  1.0

1 7.27060 osd.1  up  1.0  1.0

2 7.27060 osd.2  up  1.0  1.0


the default CRUSH rulesets distribute PG replicates across hosts. With a
single host the rules are not able to find a second OSD for the replicates.

Solutions:
- add a second host - or -
- change CRUSH ruleset to distribute based on OSDs instead of hosts.

Regards,
Burkhard

--
Dr. rer. nat. Burkhard Linke
Bioinformatics and Systems Biology
Justus-Liebig-University Giessen
35392 Giessen, Germany
Phone: (+49) (0)641 9935810



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




Default replication count is now 3, so, 3 hosts are needed by default.
Lowering the pool's replication count (e.g. to 1) for testing purposes 
is also a possibility.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] active+undersized+degraded

2015-12-17 Thread Dan Nica
Hi,

After managing to configure the osd server I created a pool "data" and removed 
pool "rbd"
and now the cluster is stuck in active+undersized+degraded

$ ceph status
cluster 046b0180-dc3f-4846-924f-41d9729d48c8
 health HEALTH_WARN
64 pgs degraded
64 pgs stuck unclean
64 pgs undersized
too few PGs per OSD (21 < min 30)
 monmap e1: 3 mons at 
{alder=10.6.250.249:6789/0,ash=10.6.250.248:6789/0,aspen=10.6.250.247:6789/0}
election epoch 6, quorum 0,1,2 aspen,ash,alder
 osdmap e53: 3 osds: 3 up, 3 in
flags sortbitwise
  pgmap v95: 64 pgs, 1 pools, 0 bytes data, 0 objects
107 MB used, 22335 GB / 22335 GB avail
  64 active+undersized+degraded

$ ceph osd dump | grep 'replicated size'
pool 2 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 52 flags hashpspool stripe_width 0

should I increase the number of PGs and PGPs?

--
Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] active+undersized+degraded

2015-12-17 Thread Burkhard Linke

Hi,

On 12/17/2015 01:41 PM, Dan Nica wrote:


And the osd tree:

$ ceph osd tree

ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 21.81180 root default

-2 21.81180 host rimu

0 7.27060 osd.0  up  1.0  1.0

1 7.27060 osd.1  up  1.0  1.0

2 7.27060 osd.2  up  1.0  1.0

the default CRUSH rulesets distribute PG replicates across hosts. With a 
single host the rules are not able to find a second OSD for the replicates.


Solutions:
- add a second host - or -
- change CRUSH ruleset to distribute based on OSDs instead of hosts.
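
For a single-host test cluster, a sketch of the second option (the rule name is
arbitrary; hammer syntax, where the pool property is still called
crush_ruleset):

$ ceph osd crush rule create-simple replicated-osd default osd
$ ceph osd crush rule dump replicated-osd   # note its rule_id
$ ceph osd pool set data crush_ruleset 1    # assuming the new rule got id 1
$ # or, purely for testing, drop the replica count instead
$ ceph osd pool set data size 1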

Regards,
Burkhard

--
Dr. rer. nat. Burkhard Linke
Bioinformatics and Systems Biology
Justus-Liebig-University Giessen
35392 Giessen, Germany
Phone: (+49) (0)641 9935810

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Metadata Server (MDS) Hardware Suggestions

2015-12-17 Thread Simon Hallam
Hi all,

I'm looking at sizing up some new MDS nodes, but I'm not sure if my thought 
process is correct or not:

CPU: Limited to a maximum 2 cores. The higher the GHz, the more IOPS available. 
So something like a single E5-2637v3 should fulfil this.
Memory: The more the better, as the metadata can be cached in RAM (how much RAM 
required is dependent on number of files?).
HDD: This is where I'm struggling: does their speed/IOPS have a significant 
impact on the performance of the MDS (I'm guessing this depends on whether the 
metadata fits within RAM)? If so, do NVMe SSDs look like the right 
avenue, or will standard SATA SSDs suffice?
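
If it helps with sizing the RAM side, a sketch of the hammer-era knob that
bounds the metadata cache (the value is only an example; the option counts
inodes, not bytes, and the default is 100000):

[mds]
mds cache size = 4000000   # cached inodes; memory use grows roughly with this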

Thanks in advance for your help!

Simon Hallam



Please visit our new website at www.pml.ac.uk and follow us on Twitter  
@PlymouthMarine

Winner of the Environment & Conservation category, the Charity Awards 2014.

Plymouth Marine Laboratory (PML) is a company limited by guarantee registered 
in England & Wales, company number 4178503. Registered Charity No. 1091222. 
Registered Office: Prospect Place, The Hoe, Plymouth  PL1 3DH, UK. 

This message is private and confidential. If you have received this message in 
error, please notify the sender and remove it from your system. You are 
reminded that e-mail communications are not secure and may contain viruses; PML 
accepts no liability for any loss or damage which may be caused by viruses.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw problem - 411 http status

2015-12-17 Thread Jacek Jarosiewicz

Hi,

I have a strange problem with the rados gateway. I'm getting an HTTP 411 
status code (Missing Content Length) whenever I upload any file to Ceph.


The setup is: ceph 0.94.5, ubuntu 14.04, tengine (patched nginx).

The strange thing is - everything worked like a charm until today, when 
I wanted to add ops logging to rados, and restarted the gw.


After that I reverted to the previous config, but the error persists.

the gateway is reached by nginx via fastcgi:

fastcgi_pass_request_headers on;
access_log /var/log/nginx/access.log;
include fastcgi_params; # default content in this file
fastcgi_param  CONTENT_LENGTH   $content_length;
fastcgi_param  LENGTH   $content_length;
fastcgi_pass unix:/var/run/ceph.radosgw.gateway.fastcgi.sock;

radosgw config in ceph.conf:

[client.radosgw.gateway]
host = cfgate01
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph.radosgw.gateway.fastcgi.sock
log file = /var/log/ceph/client.radosgw.gateway.log
rgw print continue = false
rgw enable usage log = true
rgw enable ops log = true
rgw dns name = fs2.smcloud.net
debug rgw = 20


I've turned on debugging and can see the length and content_length 
parameters in radosgw log, or maybe I'm missing something:


2015-12-17 15:20:47.934962 7f5e98f99700 20 enqueued request 
req=0x7f5ef4032380

2015-12-17 15:20:47.934976 7f5e98f99700 20 RGWWQ:
2015-12-17 15:20:47.934977 7f5e98f99700 20 req: 0x7f5ef4032380
2015-12-17 15:20:47.934987 7f5e98f99700 10 allocated request 
req=0x7f5ef4032bd0
2015-12-17 15:20:47.937538 7f5e94f91700 20 dequeued request 
req=0x7f5ef4032380

2015-12-17 15:20:47.937555 7f5e94f91700 20 RGWWQ: empty
2015-12-17 15:20:47.937789 7f5e94f91700 20 CONTENT_LENGTH=5
2015-12-17 15:20:47.937796 7f5e94f91700 20 CONTENT_TYPE=text/plain
2015-12-17 15:20:47.937797 7f5e94f91700 20 DOCUMENT_ROOT=/etc/nginx/html
2015-12-17 15:20:47.937798 7f5e94f91700 20 DOCUMENT_URI=/PUT/test.txt
2015-12-17 15:20:47.937799 7f5e94f91700 20 FCGI_ROLE=RESPONDER
2015-12-17 15:20:47.937799 7f5e94f91700 20 GATEWAY_INTERFACE=CGI/1.1
2015-12-17 15:20:47.937800 7f5e94f91700 20 HTTP_ACCEPT_ENCODING=identity
2015-12-17 15:20:47.937800 7f5e94f91700 20 HTTP_AUTHORIZATION=AWS 
LZ8YD48VSS0MCBYGG7XC:3frBFFTsiN0QhGEig+wu0aSLwzE=

2015-12-17 15:20:47.937801 7f5e94f91700 20 HTTP_CONTENT_LENGTH=5
2015-12-17 15:20:47.937802 7f5e94f91700 20 HTTP_CONTENT_TYPE=text/plain
2015-12-17 15:20:47.937802 7f5e94f91700 20 HTTP_HOST=eg-hls.fs2.smcloud.net
2015-12-17 15:20:47.937803 7f5e94f91700 20 HTTP_X_AMZ_DATE=Thu, 17 Dec 
2015 14:20:48 +
2015-12-17 15:20:47.937803 7f5e94f91700 20 
HTTP_X_AMZ_META_S3CMD_ATTRS=uid:1000/gname:jj/uname:jjarosiewicz/gid:1000/mode:33188/mtime:1450359091/atime:1450359106/md5:d8e8fca2dc0f896fd7cb4cb0031ba249/ctime:1450359091

2015-12-17 15:20:47.937805 7f5e94f91700 20 LENGTH=5
2015-12-17 15:20:47.937806 7f5e94f91700 20 QUERY_STRING=
2015-12-17 15:20:47.937807 7f5e94f91700 20 REDIRECT_STATUS=200
2015-12-17 15:20:47.937807 7f5e94f91700 20 REMOTE_ADDR=212.180.240.7
2015-12-17 15:20:47.937808 7f5e94f91700 20 REMOTE_PORT=53086
2015-12-17 15:20:47.937808 7f5e94f91700 20 REQUEST_METHOD=PUT
2015-12-17 15:20:47.937809 7f5e94f91700 20 REQUEST_URI=/test.txt
2015-12-17 15:20:47.937809 7f5e94f91700 20 SCRIPT_NAME=/PUT/test.txt
2015-12-17 15:20:47.937810 7f5e94f91700 20 SERVER_ADDR=212.180.241.218
2015-12-17 15:20:47.937810 7f5e94f91700 20 SERVER_NAME=adm-fs2.smcloud.net
2015-12-17 15:20:47.937810 7f5e94f91700 20 SERVER_PORT=80
2015-12-17 15:20:47.937811 7f5e94f91700 20 SERVER_PROTOCOL=HTTP/1.1
2015-12-17 15:20:47.937811 7f5e94f91700 20 SERVER_SOFTWARE=nginx/1.6.2
2015-12-17 15:20:47.937812 7f5e94f91700  1 == starting new request 
req=0x7f5ef4032380 =
2015-12-17 15:20:47.937929 7f5e94f91700  2 req 131:0.000117::PUT 
/test.txt::initializing for trans_id = 
tx00083-005672c4bf-1d3068-default

2015-12-17 15:20:47.937945 7f5e94f91700 10 host=eg-hls.fs2.smcloud.net
2015-12-17 15:20:47.937949 7f5e94f91700 20 subdomain=eg-hls 
domain=fs2.smcloud.net in_hosted_domain=1

2015-12-17 15:20:47.937965 7f5e94f91700 10 meta>> HTTP_X_AMZ_DATE
2015-12-17 15:20:47.937974 7f5e94f91700 10 meta>> 
HTTP_X_AMZ_META_S3CMD_ATTRS
2015-12-17 15:20:47.937978 7f5e94f91700 10 x>> x-amz-date:Thu, 17 Dec 
2015 14:20:48 +
2015-12-17 15:20:47.937980 7f5e94f91700 10 x>> 
x-amz-meta-s3cmd-attrs:uid:1000/gname:jj/uname:jjarosiewicz/gid:1000/mode:33188/mtime:1450359091/atime:1450359106/md5:d8e8fca2dc0f896fd7cb4cb0031ba249/ctime:1450359091
2015-12-17 15:20:47.938078 7f5e94f91700 10 s->object=test.txt 
s->bucket=eg-hls
2015-12-17 15:20:47.938090 7f5e94f91700  2 req 131:0.000278:s3:PUT 
/test.txt::getting op
2015-12-17 15:20:47.938095 7f5e94f91700  2 req 131:0.000283:s3:PUT 
/test.txt:put_obj:authorizing
2015-12-17 15:20:47.938326 7f5e94f91700 10 get_canon_resource(): 
dest=/eg-hls/test.txt

2015-12-17 15:20:47.938338 7f5e94f91700 10 auth_hdr:
PUT

text/plain

x-amz-date:Thu, 17 Dec 2015 14:20:48 +

Re: [ceph-users] active+undersized+degraded

2015-12-17 Thread Dan Nica
Great, also increased the pg(p) to 128 for that pool, "HEALTH_OK" now :)
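
For the archives, a sketch of the usual two-step for that (pool name from this
thread):

$ ceph osd pool set data pg_num 128
$ # pgp_num is raised separately, once the new PGs have been created
$ ceph osd pool set data pgp_num 128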

Thank you
Dan
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Burkhard Linke
Sent: Thursday, December 17, 2015 2:53 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] active+undersized+degraded

Hi,
On 12/17/2015 01:41 PM, Dan Nica wrote:
And the osd tree:

$ ceph osd tree
ID WEIGHT   TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 21.81180 root default
-2 21.81180 host rimu
0  7.27060 osd.0  up  1.0  1.0
1  7.27060 osd.1  up  1.0  1.0
2  7.27060 osd.2  up  1.0  1.0
the default CRUSH rulesets distribute PG replicates across hosts. With a single 
host the rules are not able to find a second OSD for the replicates.

Solutions:
- add a second host - or -
- change CRUSH ruleset to distribute based on OSDs instead of hosts.

Regards,
Burkhard


--

Dr. rer. nat. Burkhard Linke

Bioinformatics and Systems Biology

Justus-Liebig-University Giessen

35392 Giessen, Germany

Phone: (+49) (0)641 9935810
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw bucket index sharding tips?

2015-12-17 Thread Florian Haas
Hey Wido,

On Dec 17, 2015 09:52, "Wido den Hollander"  wrote:
>
> On 12/17/2015 06:29 AM, Ben Hines wrote:
> >
> >
> > On Wed, Dec 16, 2015 at 11:05 AM, Florian Haas wrote:
> >
> > Hi Ben & everyone,
> >
> >
> > Ben, you wrote elsewhere
> > (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-August/003955.html)
> > that you found approx. 900k objects to be the threshold where index
> > sharding becomes necessary. Have you found that to be a reasonable
> > rule of thumb, as in "try 1-2 shards per million objects in your most
> > populous bucket"? Also, do you reckon that beyond that, more shards
> > make things worse?
> >
> >
> >
> > Oh, and to answer this part.   I didn't do that much experimentation
> > unfortunately.  I actually am using about 24 index shards per bucket
> > currently and we delete each bucket once it hits about a million
> > objects. (it's just a throwaway cache for us) Seems ok, so i stopped
> > tweaking.
> >
>
> I have a use case where I need to store 350 Million objects in a single
> bucket.

How many OSDs are in that cluster?

> I tested with 4096 shards and that works. Creating the bucket takes a
> few seconds though.

Does "that works" mean that you have actually uploaded 350M objects into
that one bucket?

If so, can you give me a feel for your typical object size?

Also, what's the performance drop you saw in bucket listing, vs. having
fewer shards or no sharding at all?

Cheers,
Florian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD only pool without journal

2015-12-17 Thread Loris Cuoghi

Le 17/12/2015 16:47, Misa a écrit :

Hello everyone,

does it make sense to create SSD only pool from OSDs without journal?

 From my point of view, the SSDs are so fast that OSD journal on the SSD
will not make much of a difference.

Cheers
Misa
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Actually, you can't, as the OSD journal is an implementation detail of 
Ceph's architecture.


It's not there only for performance benefits (i.e. writes are stored 
sequentially before acknowledgement), but also for integrity purposes, 
like in journalled filesystems.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrate Block Volumes and VMs

2015-12-17 Thread Sebastien Han
What you can do is flatten all the images so you break the relationship between 
the parent image and the child.
Then you can export/import.
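
A sketch of that, using one of the image names from the quoted message below
and a hypothetical -c config file pointing at the new cluster:

$ rbd flatten volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171
$ rbd export volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171 /tmp/volume.img
$ rbd -c /etc/ceph/new-cluster.conf import /tmp/volume.img \
    volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171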

> On 15 Dec 2015, at 12:10, Sam Huracan  wrote:
> 
> Hi everybody,
> 
> My OpenStack System use Ceph as backend for Glance, Cinder, Nova. In the 
> future, we intend build a new Ceph Cluster.
> I can re-connect current OpenStack with new Ceph systems.
> 
> After that, I have tried export rbd images and import to new Ceph, but VMs 
> and Volumes were clone of Glance rbd images, like this:
> 
> rbd children images/e2c852e1-28ce-408d-b2ec-6351db35d55a@snap
> 
> vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk
> volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171
> 
> 
> How could I export all rbd snapshot and its clones to import in new Ceph 
> Cluster?
> 
> Or is there any solution to move all Vms, Volumes, Images from old Ceph 
> cluster to the new ones?
> 
> Thanks and regards.
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Senior Cloud Architect

"Always give 100%. Unless you're giving blood."

Mail: s...@redhat.com
Address: 11 bis, rue Roquépine - 75008 Paris



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD only pool without journal

2015-12-17 Thread Lionel Bouton
Hi,

Le 17/12/2015 16:47, Misa a écrit :
> Hello everyone,
>
> does it make sense to create SSD only pool from OSDs without journal?

No, because AFAIK you can't have OSDs without journals yet.
IIRC there is work done for alternate stores where you wouldn't need
journals anymore but it's not yet production ready.

Best regards,

Lionel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] SSD only pool without journal

2015-12-17 Thread Misa

Hello everyone,

does it make sense to create an SSD-only pool from OSDs without a journal?

From my point of view, the SSDs are so fast that an OSD journal on the SSD 
will not make much of a difference.


Cheers
Misa
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Problems with git.ceph.com release.asc keys

2015-12-17 Thread Tim Gipson
Is anyone else experiencing issues when they try to run a “ceph-deploy install” 
command when it gets to the rpm import of 
https://git.ceph.com/?p=ceph.git;a=blob_plain;f=keys/release.asc ?

I also tried to curl the URL with no luck. I get a 504 Gateway time-out error 
in ceph-deploy.
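
A possible workaround sketch, assuming the same key is published on the main
download mirror (untested here):

$ sudo rpm --import https://download.ceph.com/keys/release.asc
$ # or point ceph-deploy at that URL directly
$ ceph-deploy install --gpg-url https://download.ceph.com/keys/release.asc <host>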


Tim G.
Systems Engineer
Nashville TN
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Enable RBD Cache

2015-12-17 Thread Sam Huracan
Hi,

I'm testing OpenStack Kilo with Ceph 0.94.5, install in Ubuntu 14.04

To enable RBD cache, I follow this tutorial:
http://docs.ceph.com/docs/master/rbd/rbd-openstack/#configuring-nova

But when I check /var/run/ceph/guests on the compute nodes, there isn't
any asok file.

How can I enable RBD cache on the compute nodes, and how can I check it?
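
In case it's useful, a minimal sketch of the [client] section that guide
describes for the compute nodes' ceph.conf (paths are the guide's defaults, not
verified here), plus one way to check once a guest is running:

[client]
rbd cache = true
rbd cache writethrough until flush = true
admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
log file = /var/log/qemu/qemu-guest-$pid.log

The guests directory has to exist and be writable by the qemu/libvirt user, and
libvirt/nova-compute need a restart before any asok files appear. Then, for a
running guest:

$ ceph --admin-daemon /var/run/ceph/guests/<name>.asok config show | grep rbd_cache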

Thanks and regards.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: Enable RBD Cache

2015-12-17 Thread Sam Huracan
-- Forwarded message --
From: Sam Huracan 
Date: 2015-12-18 1:03 GMT+07:00
Subject: Enable RBD Cache
To: ceph-us...@ceph.com


Hi,

I'm testing OpenStack Kilo with Ceph 0.94.5, install in Ubuntu 14.04

To enable RBD cache, I follow this tutorial:
http://docs.ceph.com/docs/master/rbd/rbd-openstack/#configuring-nova

But when I check /var/run/ceph/guests in Compute nodes, there isn't have
any asok file.

How can I enable RBD cache in compute node, and how can check it?

Thanks and regards.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [Ceph] Not able to use erasure code profile

2015-12-17 Thread quentin.dore
Hello,

I try to use the default erasure code profile in Ceph 
(v0.80.9-0ubuntu0.14.04.2).
I have created the following pool :

$ ceph osd pool create defaultpool 12 12 erasure

And try to put a file in like this :

$rados --pool=defaultpool put test test.tar

But I am blocked in the process and need a ctrl+c to end.
Here is the debug information (debug ms = 5/5) :

2015-08-04 17:45:59.605003 7f781899a7c0  1 -- :/0 messenger.start
2015-08-04 17:45:59.607497 7f781899a7c0  1 -- :/1014044 --> 127.0.0.1:6789/0 -- 
auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x7f781ab47260 con 0x7f781ab46dd0
2015-08-04 17:45:59.607936 7f7818992700  1 -- 127.0.0.1:0/1014044 learned my 
addr 127.0.0.1:0/1014044
2015-08-04 17:45:59.608517 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 1  mon_map v1  191+0+0 (1947850903 0 0) 0x7f7808000ab0 
con 0x7f781ab46dd0
2015-08-04 17:45:59.611547 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 2  auth_reply(proto 1 0 (0) Success) v1  24+0+0 
(2103380028 0 0) 0x7f7808000e50 con 0x7f781ab46dd0
2015-08-04 17:45:59.611628 7f7813af5700  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x7f781ab47870 con 
0x7f781ab46dd0
2015-08-04 17:45:59.611822 7f781899a7c0  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 
0x7f781ab485f0 con 0x7f781ab46dd0
2015-08-04 17:45:59.611830 7f781899a7c0  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 
0x7f781ab48b50 con 0x7f781ab46dd0
2015-08-04 17:45:59.612800 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 3  mon_map v1  191+0+0 (1947850903 0 0) 0x7f7808000ab0 
con 0x7f781ab46dd0
2015-08-04 17:45:59.612891 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 4  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000ab0 con 0x7f781ab46dd0
2015-08-04 17:45:59.612977 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 5  osd_map(80..80 src has 1..80) v3  10805+0+0 
(3315356140 0 0) 0x7f7808003480 con 0x7f781ab46dd0
2015-08-04 17:45:59.613338 7f781899a7c0  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=81}) v2 -- ?+0 
0x7f781ab49380 con 0x7f781ab46dd0
2015-08-04 17:45:59.613550 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 6  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000a30 con 0x7f781ab46dd0
2015-08-04 17:45:59.615178 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 7  osd_map(80..80 src has 1..80) v3  10805+0+0 
(3315356140 0 0) 0x7f7808003450 con 0x7f781ab46dd0
2015-08-04 17:45:59.615257 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 8  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000a30 con 0x7f781ab46dd0
2015-08-04 17:45:59.615449 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 9  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000a30 con 0x7f781ab46dd0
2015-08-04 17:46:04.612263 7f78119f0700  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=81}) v2 -- ?+0 
0x7f77f4000a00 con 0x7f781ab46dd0
2015-08-04 17:46:04.612837 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 10  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 
0) 0x7f7808000a30 con 0x7f781ab46dd0
^C

But if I use a replicated pool, it works fine, I can put and get correctly the 
file using the same command.

Is there something particular to set in ceph.conf to be able to use the erasure 
code plugin?
Sorry if it's a newbie question, but I'm just starting with Ceph.

Best regards.

_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.
Thank you.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Deploying a Ceph storage cluster using Warewulf on Centos-7

2015-12-17 Thread Chu Ruilin
Hi, all

I don't know which automation tool is best for deploying Ceph and I'd like
to hear what others use. I'm comfortable with Warewulf since I've been using it for
HPC clusters. I find it quite convenient for Ceph too. I wrote a set of
scripts that can deploy a Ceph cluster quickly. Here is how I did it just
using virtualbox:

http://ruilinchu.blogspot.com/2015/09/deploying-ceph-storage-cluster-using.html

comments are welcome!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph read errors

2015-12-17 Thread Arseniy Seroka
Hello! I have the following error.
When doing `scp` from local storage to Ceph, I'm getting errors in the files'
contents.
For example:
```
me@kraken:/ceph_storage/lib$ java -jar file.jar
Error: Invalid or corrupt jarfile file.jar
```
If I'm checking md5 -> everything is ok.

Then I'm going to another server
me@poni:/ceph_storage/lib$ java -jar file.jar
Missing required option(s) [e/edges, e/edges, n/nodes, n/nodes]
Option (* = required)   Description
-   ---
...
```

After that I'm going to the previous server and everything works.
```
me@kraken:/ceph_storage/lib/$ java -jar file.jar
Missing required option(s) [e/edges, e/edges, n/nodes, n/nodes]
Option (* = required)   Description
-   ---
...
```

I think that there are problems with mmap read.
Another example:
```
me@kraken:/ceph_storage/lib$ dd if=file.jar skip=1330676 bs=1 count=10 |
hexdump -C
10+0 records in
10+0 records out
  00 00 00 00 00 00 00 00  00 00|..|
10 bytes (10 B) copied000a
, 0.0149698 s, 0.7 kB/s
me@kraken:/ceph_storage/lib$ head file.jar -c 10 | hexdump -C

  50 4b 03 04 0a 00 00 08  00 00|PK|
000a
```

-- 
Sincerely,
Arseniy Seroka
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v10.0.0 released

2015-12-17 Thread piotr.da...@ts.fujitsu.com
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of Sage Weil
> Sent: Monday, November 23, 2015 5:08 PM
> 
> This is the first development release for the Jewel cycle.  We are off to a
> good start, with lots of performance improvements flowing into the tree.
> We are targetting sometime in Q1 2016 for the final Jewel.
> 
>[..]
> (`pr#5853 `_, Piotr Dałek)

Hopefully at that point the script that generates this list will learn how to 
handle UTF-8 ;-)


With best regards / Pozdrawiam
Piotr Dałek
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] problem on ceph installation on centos 7

2015-12-17 Thread Bob R
Alex,

It looks like you might have an old repo in there with priority=1 so it's
not trying to install hammer. Try mv /etc/yum.repos.d/ceph.repo
/etc/yum.repos.d/ceph.repo.old && mv /etc/yum.repos.d/ceph.repo.rpmnew
/etc/yum.repos.d/ceph.repo then re-run ceph-deploy.
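
A quick sketch of how to check for that kind of leftover before re-running:

$ grep -n 'priority\|baseurl' /etc/yum.repos.d/*ceph*.repo
$ yum repolist enabled | grep -i ceph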

Bob

On Thu, Dec 10, 2015 at 1:42 PM, Leung, Alex (398C)  wrote:

> Hi,
>
>
>
> I am trying to install ceph on a Centos 7 system and I get the following
> error, Thanks in advance for the help.
>
> [ceph@hdmaster ~]$ ceph-deploy install mon0 osd0 osd1 osd2
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /home/ceph/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.28): /bin/ceph-deploy install mon0
> osd0 osd1 osd2
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  testing   : None
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  dev_commit: None
> [ceph_deploy.cli][INFO  ]  install_mds   : False
> [ceph_deploy.cli][INFO  ]  stable: None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  adjust_repos  : True
> [ceph_deploy.cli][INFO  ]  func  :  install at 0x10edd70>
> [ceph_deploy.cli][INFO  ]  install_all   : False
> [ceph_deploy.cli][INFO  ]  repo  : False
> [ceph_deploy.cli][INFO  ]  host  : ['mon0',
> 'osd0', 'osd1', 'osd2']
> [ceph_deploy.cli][INFO  ]  install_rgw   : False
> [ceph_deploy.cli][INFO  ]  install_tests : False
> [ceph_deploy.cli][INFO  ]  repo_url  : None
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  install_osd   : False
> [ceph_deploy.cli][INFO  ]  version_kind  : stable
> [ceph_deploy.cli][INFO  ]  install_common: False
> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  dev   : master
> [ceph_deploy.cli][INFO  ]  local_mirror  : None
> [ceph_deploy.cli][INFO  ]  release   : None
> [ceph_deploy.cli][INFO  ]  install_mon   : False
> [ceph_deploy.cli][INFO  ]  gpg_url   : None
> [ceph_deploy.install][DEBUG ] Installing stable version hammer on cluster
> ceph hosts mon0 osd0 osd1 osd2
> [ceph_deploy.install][DEBUG ] Detecting platform for host mon0 ...
> [mon0][DEBUG ] connection detected need for sudo
> [mon0][DEBUG ] connected to host: mon0
> [mon0][DEBUG ] detect platform information from remote host
> [mon0][DEBUG ] detect machine type
> [ceph_deploy.install][INFO  ] Distro info: CentOS Linux 7.0.1406 Core
> [mon0][INFO  ] installing Ceph on mon0
> [mon0][INFO  ] Running command: sudo yum clean all
> [mon0][DEBUG ] Loaded plugins: fastestmirror, langpacks
> [mon0][DEBUG ] Cleaning repos: base ceph ceph-noarch epel extras updates
> [mon0][DEBUG ] Cleaning up everything
> [mon0][DEBUG ] Cleaning up list of fastest mirrors
> [mon0][INFO  ] Running command: sudo yum -y install epel-release
> [mon0][DEBUG ] Loaded plugins: fastestmirror, langpacks
> [mon0][DEBUG ] Determining fastest mirrors
> [mon0][DEBUG ]  * base: mirrors.kernel.org
> [mon0][DEBUG ]  * epel: mirror.sfo12.us.leaseweb.net
> [mon0][DEBUG ]  * extras: mirrors.xmission.com
> [mon0][DEBUG ]  * updates: mirror.hmc.edu
> [mon0][DEBUG ] Package epel-release-7-5.noarch already installed and
> latest version
> [mon0][DEBUG ] Nothing to do
> [mon0][INFO  ] Running command: sudo yum -y install yum-plugin-priorities
> [mon0][DEBUG ] Loaded plugins: fastestmirror, langpacks
> [mon0][DEBUG ] Loading mirror speeds from cached hostfile
> [mon0][DEBUG ]  * base: mirrors.kernel.org
> [mon0][DEBUG ]  * epel: mirror.sfo12.us.leaseweb.net
> [mon0][DEBUG ]  * extras: mirrors.xmission.com
> [mon0][DEBUG ]  * updates: mirror.hmc.edu
> [mon0][DEBUG ] Resolving Dependencies
> [mon0][DEBUG ] --> Running transaction check
> [mon0][DEBUG ] ---> Package yum-plugin-priorities.noarch 0:1.1.31-29.el7
> will be installed
> [mon0][DEBUG ] --> Finished Dependency Resolution
> [mon0][DEBUG ]
> [mon0][DEBUG ] Dependencies Resolved
> [mon0][DEBUG ]
> [mon0][DEBUG ]
> 
> [mon0][DEBUG ]  Package Arch Version
>Repository  Size
> [mon0][DEBUG ]
> 
> [mon0][DEBUG ] Installing:
> [mon0][DEBUG ]  yum-plugin-priorities   

Re: [ceph-users] [Ceph] Not able to use erasure code profile

2015-12-17 Thread ghislain.chevalier
Hi Quentin
Did you check the pool was correctly created
(Pg allocation)?
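
A sketch of the checks that implies (firefly command names; note that the
default profile is k=2, m=1 with a failure domain of host, so with fewer than
three hosts the PGs typically never go active, which would explain the hang):

$ ceph osd erasure-code-profile get default
$ ceph osd dump | grep defaultpool
$ ceph pg dump_stuck inactive
$ ceph health detail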

Sent from my Galaxy Ace4 Orange


 Original message 
From: quentin.d...@orange.com
Date: 17/12/2015 19:45 (GMT+01:00)
To: ceph-users@lists.ceph.com
Cc:
Subject: [ceph-users] [Ceph] Not able to use erasure code profile

Hello,

I am trying to use the default erasure code profile in Ceph 
(v0.80.9-0ubuntu0.14.04.2).
I created the following pool:

$ ceph osd pool create defaultpool 12 12 erasure

And tried to put a file into it like this:

$ rados --pool=defaultpool put test test.tar

But the command hangs and I need a Ctrl+C to end it.
Here is the debug information (debug ms = 5/5) :

2015-08-04 17:45:59.605003 7f781899a7c0  1 -- :/0 messenger.start
2015-08-04 17:45:59.607497 7f781899a7c0  1 -- :/1014044 --> 127.0.0.1:6789/0 -- 
auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x7f781ab47260 con 0x7f781ab46dd0
2015-08-04 17:45:59.607936 7f7818992700  1 -- 127.0.0.1:0/1014044 learned my 
addr 127.0.0.1:0/1014044
2015-08-04 17:45:59.608517 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 1  mon_map v1  191+0+0 (1947850903 0 0) 0x7f7808000ab0 
con 0x7f781ab46dd0
2015-08-04 17:45:59.611547 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 2  auth_reply(proto 1 0 (0) Success) v1  24+0+0 
(2103380028 0 0) 0x7f7808000e50 con 0x7f781ab46dd0
2015-08-04 17:45:59.611628 7f7813af5700  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x7f781ab47870 con 
0x7f781ab46dd0
2015-08-04 17:45:59.611822 7f781899a7c0  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 
0x7f781ab485f0 con 0x7f781ab46dd0
2015-08-04 17:45:59.611830 7f781899a7c0  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 
0x7f781ab48b50 con 0x7f781ab46dd0
2015-08-04 17:45:59.612800 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 3  mon_map v1  191+0+0 (1947850903 0 0) 0x7f7808000ab0 
con 0x7f781ab46dd0
2015-08-04 17:45:59.612891 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 4  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000ab0 con 0x7f781ab46dd0
2015-08-04 17:45:59.612977 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 5  osd_map(80..80 src has 1..80) v3  10805+0+0 
(3315356140 0 0) 0x7f7808003480 con 0x7f781ab46dd0
2015-08-04 17:45:59.613338 7f781899a7c0  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=81}) v2 -- ?+0 
0x7f781ab49380 con 0x7f781ab46dd0
2015-08-04 17:45:59.613550 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 6  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000a30 con 0x7f781ab46dd0
2015-08-04 17:45:59.615178 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 7  osd_map(80..80 src has 1..80) v3  10805+0+0 
(3315356140 0 0) 0x7f7808003450 con 0x7f781ab46dd0
2015-08-04 17:45:59.615257 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 8  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000a30 con 0x7f781ab46dd0
2015-08-04 17:45:59.615449 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 9  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 0) 
0x7f7808000a30 con 0x7f781ab46dd0
2015-08-04 17:46:04.612263 7f78119f0700  1 -- 127.0.0.1:0/1014044 --> 
127.0.0.1:6789/0 -- mon_subscribe({monmap=2+,osdmap=81}) v2 -- ?+0 
0x7f77f4000a00 con 0x7f781ab46dd0
2015-08-04 17:46:04.612837 7f7813af5700  1 -- 127.0.0.1:0/1014044 <== mon.0 
127.0.0.1:6789/0 10  mon_subscribe_ack(300s) v1  20+0+0 (3912512126 0 
0) 0x7f7808000a30 con 0x7f781ab46dd0
^C

But if I use a replicated pool, it works fine: I can put and get the file 
correctly using the same commands.

Is there something particular to set in ceph.conf to be able to use the erasure 
code plugin?
Sorry if it's a newbie question, but I am just getting started with Ceph.

Best regards.


[ceph-users] Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

2015-12-17 Thread Florian Haas
Hey everyone,

I recently got my hands on a cluster that has been underperforming in
terms of radosgw throughput, averaging about 60 PUTs/s with 70K
objects where a freshly-installed cluster with near-identical
configuration would do about 250 PUTs/s. (Neither of these values are
what I'd consider high throughput, but this is just to give you a feel
about the relative performance hit.)

Some digging turned up that of the less than 200 buckets in the
cluster, about 40 held in excess of a million objects (1-4M), with
one bucket being an outlier at 45M objects. All buckets were created
post-Hammer, and use 64 index shards. The total number of objects in
radosgw is approx. 160M.

Now this isn't a large cluster in terms of OSD distribution; there are
only 12 OSDs (after all, we're only talking double-digit terabytes
here). In almost all of these OSDs, the LevelDB omap directory has
grown to a size of 10-20 GB.
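For anyone wanting to check their own cluster, the omap sizes can be measured
roughly like this (assuming the default filestore layout under /var/lib/ceph;
adjust the path if your OSDs live elsewhere):

# on each OSD host
for d in /var/lib/ceph/osd/ceph-*/current/omap; do du -sh "$d"; done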

So I have several questions on this:

- Is it correct to assume that such a large LevelDB would be quite
detrimental to radosgw performance overall?

- If so, would clearing that one large bucket and distributing the
data over several new buckets reduce the LevelDB size at all?

- Is there even something akin to "ceph mon compact" for OSDs?

- Are these large LevelDB databases a simple consequence of having a
combination of many radosgw objects and few OSDs, with the
distribution per-bucket being comparatively irrelevant?

I do understand that the 45M object bucket itself would have been a
problem pre-Hammer, with no index sharding available. But with what
others have shared here, a rule of thumb of one index shard per
million objects should be a good one to follow, so 64 shards for 45M
objects doesn't strike me as totally off the mark. That's why I think
LevelDB I/O is actually the issue here. But I might be totally wrong;
all insights appreciated. :)

Cheers,
Florian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-17 Thread Tom Christensen
I've just checked 1072 and 872; they both look the same: a single op for
the object in question, in retry+read state, that appears to be retrying forever.
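
(For reference, the per-OSD output below came from the admin socket, roughly along
these lines, run on the host that holds osd.853:)

# ceph daemon osd.853 ops
# ceph daemon osd.853 dump_ops_in_flight
# ceph daemon osd.853 dump_historic_ops
# ceph daemon osd.853 objecter_requests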


On Thu, Dec 17, 2015 at 10:05 AM, Tom Christensen  wrote:

> I had already nuked the previous hang, but we have another one:
>
> osdc output:
>
> 70385   osd853  5.fb666328  rbd_data.36f804a163d632a.000370ff
>   read
>
> 11024940osd1072 5.f438406c
> rbd_id.volume-44c74bb5-14f8-4279-b44f-8e867248531b  call
>
> 11241684osd872  5.175f624d
> rbd_id.volume-3e068bc7-75eb-4504-b109-df851a787f89  call
>
> 11689088osd685  5.1fc9acd5  rbd_header.36f804a163d632a
> 442390'5605610926112768 watch
>
>
> #ceph osd map rbd rbd_data.36f804a163d632a.000370ff
>
> osdmap e1309560 pool 'rbd' (5) object
> 'rbd_data.36f804a163d632a.000370ff' -> pg 5.fb666328 (5.6328) -> up
> ([853,247,265], p853) acting ([853,247,265], p853)
>
>
> As to the output of osd.853 ops, objecter_requests, dump_ops_in_flight,
> dump_historic_ops.. I see a single request in the output of osd.853 ops:
>
> {
>
> "ops": [
>
> {
>
> "description": "osd_op(client.58016244.1:70385
> rbd_data.36f804a163d632a.000370ff [read 7274496~4096] 5.fb666328
> RETRY=1 retry+read e1309006)",
>
> "initiated_at": "2015-12-17 10:03:35.360401",
>
> "age": 0.000503,
>
> "duration": 0.000233,
>
> "type_data": [
>
> "reached pg",
>
> {
>
> "client": "client.58016244",
>
> "tid": 70385
>
> },
>
> [
>
> {
>
> "time": "2015-12-17 10:03:35.360401",
>
> "event": "initiated"
>
> },
>
> {
>
> "time": "2015-12-17 10:03:35.360635",
>
> "event": "reached_pg"
>
> }
>
> ]
>
> ]
>
> }
>
> ],
>
> "num_ops": 1
>
> }
>
>
> The other commands either return nothing (ops_in_flight,
> objecter_requests) or in the case of historic ops, it returns 20 ops (thats
> what its set to keep), but none of them are this request or reference this
> object.  It seems this read is just retrying forever?
>
>
>
> On Sat, Dec 12, 2015 at 12:10 PM, Ilya Dryomov  wrote:
>
>> On Sat, Dec 12, 2015 at 6:37 PM, Tom Christensen 
>> wrote:
>> > We had a kernel map get hung up again last night/this morning.  The rbd
>> is
>> > mapped but unresponsive, if I try to unmap it I get the following error:
>> > rbd: sysfs write failed
>> > rbd: unmap failed: (16) Device or resource busy
>> >
>> > Now that this has happened attempting to map another RBD fails, using
>> lsblk
>> > fails as well, both of these tasks just hang forever.
>> >
>> > We have 1480 OSDs in the cluster so posting the osdmap seems excessive,
>> > however here is the beginning (didn't change in 5 runs):
>> > root@wrk-slc-01-02:~# cat
>> >
>> /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdmap
>> > epoch 1284256
>> > flags
>> > pool 0 pg_num 2048 (2047) read_tier -1 write_tier -1
>> > pool 1 pg_num 512 (511) read_tier -1 write_tier -1
>> > pool 3 pg_num 2048 (2047) read_tier -1 write_tier -1
>> > pool 4 pg_num 512 (511) read_tier -1 write_tier -1
>> > pool 5 pg_num 32768 (32767) read_tier -1 write_tier -1
>> >
>> > Here is osdc output, it is not changed after 5 runs:
>> >
>> > root@wrk-slc-01-02:~# cat
>> >
>> /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdc
>> > 93835   osd1206 5.6841959c
>> rbd_data.34df3ac703ced61.1dff
>> > read
>> > 9065810 osd1382 5.a50fa0ea  rbd_header.34df3ac703ced61
>> > 474103'5506530325561344 watch
>> > root@wrk-slc-01-02:~# cat
>> >
>> /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdc
>> > 93835   osd1206 5.6841959c
>> rbd_data.34df3ac703ced61.1dff
>> > read
>> > 9067286 osd1382 5.a50fa0ea  rbd_header.34df3ac703ced61
>> > 474103'5506530325561344 watch
>> > root@wrk-slc-01-02:~# cat
>> >
>> /sys/kernel/debug/ceph/f3b7f409-e061-4e39-b4d0-ae380e29ae7e.client55440310/osdc
>> > 93835   osd1206 5.6841959c
>> rbd_data.34df3ac703ced61.1dff
>> > read
>> > 9067831 osd1382 5.a50fa0ea  rbd_header.34df3ac703ced61
>> > 474103'5506530325561344 watch
>> > root@wrk-slc-01-02:~# ls /dev/rbd/rbd
>> > none  volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d
>> > volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d-part1
>> > root@wrk-slc-01-02:~# rbd info
>> volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d
>> > rbd image 'volume-daac5f12-e39b-4d64-a4fa-86c810aeb72d':
>> > size 61439 MB in 7680 objects
>> > order 23 (8192 kB 

Re: [ceph-users] all three mons segfault at same time

2015-12-17 Thread Arnulf Heimsbakk
That's good to hear.

My experience was pretty much the same, but depending on the load on
the cluster I got anywhere from a couple of crashes an hour to one a day
after I upgraded everything.

I'm interested to hear if your cluster stays stable over time.

-Arnulf


On 11/10/2015 07:09 PM, Logan V. wrote:
> I am on trusty also but my /var/lib/ceph/mon lives on an xfs
> filesystem.
> 
> My mons seem to have stabilized now after upgrading the last of
> the OSDs to 0.94.5. No crashes in the last 20 minutes whereas they
> were crashing every 1-2 minutes in a rolling fashion the entire
> time I was upgrading OSDs.
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] all three mons segfault at same time

2015-12-17 Thread Arnulf Heimsbakk
Hi Logan!

It seems that I've solved the segfaults on my monitors. Maybe not in the
best way, but they seem to be gone. Originally my monitor servers ran
Ubuntu Trusty on ext4, but they've now been converted to CentOS 7 with
XFS as the root file system. They've run stable for 24 hours now.

I'm still running Ubuntu on my OSDs and no issues so far running mixed
OS. Everything is running 0.94.5.

Not an ideal solution, but I'm preparing to convert the OSDs to CentOS too if
things stay stable over time.

-Arnulf

On 11/10/2015 05:13 PM, Logan V. wrote:
> I am in the process of upgrading a cluster with mixed 0.94.2/0.94.3 to
> 0.94.5 this morning and am seeing identical crashes. In the process of
> doing a rolling upgrade across the mons this morning, after the 3rd of
> 3 mons was restarted to 0.94.5, all 3 crashed simultaneously identical
> to what you are describing above. Now I am seeing rolling crashes
> across the 3 mons continually. I am still in the process of upgrading
> about 200 OSDs to 0.94.5 so most of them are still running 0.94.2 and
> 0.94.3. There are 3 mds's running 0.94.5 during these crashes.
> 
> ==> /var/log/clusterboot/lsn-mc1008/syslog <==
> Nov 10 10:07:30 lsn-mc1008 kernel: [6392349.844640] init: ceph-mon
> (ceph/lsn-mc1008) main process (2254664) killed by SEGV signal
> Nov 10 10:07:30 lsn-mc1008 kernel: [6392349.844648] init: ceph-mon
> (ceph/lsn-mc1008) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1006/syslog <==
> Nov 10 10:07:46 lsn-mc1006 kernel: [6392890.294124] init: ceph-mon
> (ceph/lsn-mc1006) main process (2183307) killed by SEGV signal
> Nov 10 10:07:46 lsn-mc1006 kernel: [6392890.294132] init: ceph-mon
> (ceph/lsn-mc1006) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1007/syslog <==
> Nov 10 10:07:46 lsn-mc1007 kernel: [6392599.894914] init: ceph-mon
> (ceph/lsn-mc1007) main process (1998234) killed by SEGV signal
> Nov 10 10:07:46 lsn-mc1007 kernel: [6392599.894923] init: ceph-mon
> (ceph/lsn-mc1007) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1008/syslog <==
> Nov 10 10:07:46 lsn-mc1008 kernel: [6392365.959984] init: ceph-mon
> (ceph/lsn-mc1008) main process (2263082) killed by SEGV signal
> Nov 10 10:07:46 lsn-mc1008 kernel: [6392365.959992] init: ceph-mon
> (ceph/lsn-mc1008) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1006/syslog <==
> Nov 10 10:07:52 lsn-mc1006 kernel: [6392896.674332] init: ceph-mon
> (ceph/lsn-mc1006) main process (2191273) killed by SEGV signal
> Nov 10 10:07:52 lsn-mc1006 kernel: [6392896.674340] init: ceph-mon
> (ceph/lsn-mc1006) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1008/syslog <==
> Nov 10 10:07:52 lsn-mc1008 kernel: [6392372.324282] init: ceph-mon
> (ceph/lsn-mc1008) main process (2270979) killed by SEGV signal
> Nov 10 10:07:52 lsn-mc1008 kernel: [6392372.324295] init: ceph-mon
> (ceph/lsn-mc1008) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1007/syslog <==
> Nov 10 10:07:52 lsn-mc1007 kernel: [6392606.272911] init: ceph-mon
> (ceph/lsn-mc1007) main process (2006118) killed by SEGV signal
> Nov 10 10:07:52 lsn-mc1007 kernel: [6392606.272995] init: ceph-mon
> (ceph/lsn-mc1007) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1006/syslog <==
> Nov 10 10:07:55 lsn-mc1006 kernel: [6392899.046307] init: ceph-mon
> (ceph/lsn-mc1006) main process (2192187) killed by SEGV signal
> Nov 10 10:07:55 lsn-mc1006 kernel: [6392899.046315] init: ceph-mon
> (ceph/lsn-mc1006) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1007/syslog <==
> Nov 10 10:08:17 lsn-mc1007 kernel: [6392631.192476] init: ceph-mon
> (ceph/lsn-mc1007) main process (2006489) killed by SEGV signal
> Nov 10 10:08:17 lsn-mc1007 kernel: [6392631.192484] init: ceph-mon
> (ceph/lsn-mc1007) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1006/syslog <==
> Nov 10 10:08:17 lsn-mc1006 kernel: [6392921.600089] init: ceph-mon
> (ceph/lsn-mc1006) main process (2192298) killed by SEGV signal
> Nov 10 10:08:17 lsn-mc1006 kernel: [6392921.600108] init: ceph-mon
> (ceph/lsn-mc1006) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1008/syslog <==
> Nov 10 10:08:17 lsn-mc1008 kernel: [6392397.277994] init: ceph-mon
> (ceph/lsn-mc1008) main process (2271246) killed by SEGV signal
> Nov 10 10:08:17 lsn-mc1008 kernel: [6392397.278002] init: ceph-mon
> (ceph/lsn-mc1008) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1006/syslog <==
> Nov 10 10:08:23 lsn-mc1006 kernel: [6392927.999229] init: ceph-mon
> (ceph/lsn-mc1006) main process (2200399) killed by SEGV signal
> Nov 10 10:08:23 lsn-mc1006 kernel: [6392927.999242] init: ceph-mon
> (ceph/lsn-mc1006) main process ended, respawning
> ==> /var/log/clusterboot/lsn-mc1008/syslog <==
> Nov 10 10:08:23 lsn-mc1008 kernel: [6392403.641241] init: ceph-mon
> (ceph/lsn-mc1008) main process (2279050) killed by SEGV signal
> Nov 10 10:08:23 lsn-mc1008 kernel: [6392403.641254] init: 

Re: [ceph-users] rados bench object not correct errors on v9.0.3

2015-12-17 Thread Dałek , Piotr
> -Original Message-
> From: Deneau, Tom [mailto:tom.den...@amd.com]
> Sent: Wednesday, August 26, 2015 5:23 PM
> To: Dałek, Piotr; Sage Weil

> > > There have been some recent changes to rados bench... Piotr, does
> > > this seem like it might be caused by your changes?
> >
> > Yes. My PR #4690 (https://github.com/ceph/ceph/pull/4690) caused rados
> > bench to be fast enough to sometimes run into race condition between
> > librados's AIO and objbencher processing. That was fixed in PR #5152
> > (https://github.com/ceph/ceph/pull/5152) which didn't make it into 9.0.3.
> > Tom, you can confirm this by inspecting the contents of objects
> > questioned (their contents should be perfectly fine and in line with other
> > objects).
> > In the meantime you can either apply patch from PR #5152 on your own
> > or use --no-verify.
> 
> Piotr --
> 
> Thank you.  Yes, when I looked at the contents of the objects they always
> looked correct.  And yes a single object would sometimes report an error and
> sometimes not.  So a race condition makes sense.
> 
> A couple of questions:
> 
>* Why would I not see this behavior using the pre-built 9.0.3 binaries
>  that get installed using "ceph-deploy install --dev v9.0.3"?  I would 
> assume
>  this is built from the same sources as the 9.0.3 tarball.

No idea actually. It's a race condition, so it might be just luck.

>* So I assume one should not compare pre 9.0.3 rados bench numbers with
> 9.0.3 and after?
>  The pull request https://github.com/ceph/ceph/pull/4690 did not mention
> the
>  effect on final bandwidth numbers, did you notice what that effect was?

That depends on the CPU performance, but you should expect differences of a few 
to tens of MB/s on smaller block sizes, up to even hundreds of MB per second on 
larger block sizes. More concurrent jobs also make the issue more visible and add 
to the total difference.

With best regards / Pozdrawiam
Piotr Dałek
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-12-17 Thread Dałek , Piotr
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of Somnath Roy
> Sent: Tuesday, October 13, 2015 8:46 AM
> 
> Thanks Haomai..
> Since Async messenger is always using a constant number of threads , there
> could be a potential performance problem of scaling up the client
> connections keeping the constant number of OSDs ?
> May be it's a good tradeoff..

It's not that big an issue when you look at it realistically. In fact, having more 
threads than around 2 * available_logical_cpus is going to drag performance 
down, so it's better to have a thread wait than to force context switches. 
The point of using more threads per process is to have the process spend less time 
waiting for I/O and better utilize current multi-core CPUs. Having threads 
fighting for CPU and/or I/O time is worse than having them underutilized, which 
is particularly true with spinning drives (which aren't going anywhere anytime 
soon; not every customer is going to accept a $1700 price tag per drive that has 
only 800GB of capacity) and slower CPUs (again, not every customer is going to 
accept a $1200 price tag per CPU).

With best regards / Pozdrawiam
Piotr Dałek

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] problem on ceph installation on centos 7

2015-12-17 Thread Leung, Alex (398C)
Hi,



I am trying to install ceph on a CentOS 7 system and I get the following error. 
Thanks in advance for the help.

[ceph@hdmaster ~]$ ceph-deploy install mon0 osd0 osd1 osd2
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.28): /bin/ceph-deploy install mon0 osd0 
osd1 osd2
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  testing   : None
[ceph_deploy.cli][INFO  ]  cd_conf   : 

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  dev_commit: None
[ceph_deploy.cli][INFO  ]  install_mds   : False
[ceph_deploy.cli][INFO  ]  stable: None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  adjust_repos  : True
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  install_all   : False
[ceph_deploy.cli][INFO  ]  repo  : False
[ceph_deploy.cli][INFO  ]  host  : ['mon0', 'osd0', 
'osd1', 'osd2']
[ceph_deploy.cli][INFO  ]  install_rgw   : False
[ceph_deploy.cli][INFO  ]  install_tests : False
[ceph_deploy.cli][INFO  ]  repo_url  : None
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  install_osd   : False
[ceph_deploy.cli][INFO  ]  version_kind  : stable
[ceph_deploy.cli][INFO  ]  install_common: False
[ceph_deploy.cli][INFO  ]  overwrite_conf: False
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  dev   : master
[ceph_deploy.cli][INFO  ]  local_mirror  : None
[ceph_deploy.cli][INFO  ]  release   : None
[ceph_deploy.cli][INFO  ]  install_mon   : False
[ceph_deploy.cli][INFO  ]  gpg_url   : None
[ceph_deploy.install][DEBUG ] Installing stable version hammer on cluster ceph 
hosts mon0 osd0 osd1 osd2
[ceph_deploy.install][DEBUG ] Detecting platform for host mon0 ...
[mon0][DEBUG ] connection detected need for sudo
[mon0][DEBUG ] connected to host: mon0
[mon0][DEBUG ] detect platform information from remote host
[mon0][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: CentOS Linux 7.0.1406 Core
[mon0][INFO  ] installing Ceph on mon0
[mon0][INFO  ] Running command: sudo yum clean all
[mon0][DEBUG ] Loaded plugins: fastestmirror, langpacks
[mon0][DEBUG ] Cleaning repos: base ceph ceph-noarch epel extras updates
[mon0][DEBUG ] Cleaning up everything
[mon0][DEBUG ] Cleaning up list of fastest mirrors
[mon0][INFO  ] Running command: sudo yum -y install epel-release
[mon0][DEBUG ] Loaded plugins: fastestmirror, langpacks
[mon0][DEBUG ] Determining fastest mirrors
[mon0][DEBUG ]  * base: mirrors.kernel.org
[mon0][DEBUG ]  * epel: mirror.sfo12.us.leaseweb.net
[mon0][DEBUG ]  * extras: mirrors.xmission.com
[mon0][DEBUG ]  * updates: mirror.hmc.edu
[mon0][DEBUG ] Package epel-release-7-5.noarch already installed and latest 
version
[mon0][DEBUG ] Nothing to do
[mon0][INFO  ] Running command: sudo yum -y install yum-plugin-priorities
[mon0][DEBUG ] Loaded plugins: fastestmirror, langpacks
[mon0][DEBUG ] Loading mirror speeds from cached hostfile
[mon0][DEBUG ]  * base: mirrors.kernel.org
[mon0][DEBUG ]  * epel: mirror.sfo12.us.leaseweb.net
[mon0][DEBUG ]  * extras: mirrors.xmission.com
[mon0][DEBUG ]  * updates: mirror.hmc.edu
[mon0][DEBUG ] Resolving Dependencies
[mon0][DEBUG ] --> Running transaction check
[mon0][DEBUG ] ---> Package yum-plugin-priorities.noarch 0:1.1.31-29.el7 will 
be installed
[mon0][DEBUG ] --> Finished Dependency Resolution
[mon0][DEBUG ]
[mon0][DEBUG ] Dependencies Resolved
[mon0][DEBUG ]
[mon0][DEBUG ] 

[mon0][DEBUG ]  Package Arch Version  
Repository  Size
[mon0][DEBUG ] 

[mon0][DEBUG ] Installing:
[mon0][DEBUG ]  yum-plugin-priorities   noarch   1.1.31-29.el7
base24 k
[mon0][DEBUG ]
[mon0][DEBUG ] Transaction Summary
[mon0][DEBUG ] 

[mon0][DEBUG ] Install  1 Package
[mon0][DEBUG ]
[mon0][DEBUG ] Total download size: 24 k
[mon0][DEBUG ] Installed size: 28 k
[mon0][DEBUG ] Downloading packages:
[mon0][DEBUG ] Running transaction check
[mon0][DEBUG ] Running transaction test
[mon0][DEBUG ] Transaction test succeeded
[mon0][DEBUG ] Running transaction
[mon0][DEBUG ]   

[ceph-users] rbd du

2015-12-17 Thread Allen Liao
Hi all,

The online manual (http://ceph.com/docs/master/man/8/rbd/) for rbd has
documentation for the 'du' command.  I'm running ceph 0.94.2 and that
command isn't recognized, nor is it in the man page.

Is there another command that will "calculate the provisioned and actual
disk usage of all images and associated snapshots within the specified
pool?"
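
As far as I can tell, "rbd du" only arrived in releases newer than 0.94.x. A
commonly used workaround is to sum the allocated extents reported by "rbd diff",
one image at a time (the pool and image names below are just placeholders):

rbd diff rbd/myimage | awk '{ used += $2 } END { print used/1024/1024 " MB" }'

"rbd info rbd/myimage" still shows the provisioned size, so between the two you
get both numbers, just not pool-wide in a single command.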
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CoprHD Integrating Ceph

2015-12-17 Thread Patrick McGarry
Hey cephers,

In the pursuit of openness I wanted to share a ceph-related bit of
work that is happening beyond our immediate sphere of influence and
see who is already contributing, or might be interested in the
results.

https://groups.google.com/forum/?hl=en#!topic/coprhddevsupport/llZeiTWxddM

EMC’s CoprHD initiative continues to try to expand their influence
through open contribution. Currently there is work to integrate Ceph
support into their SB SDK. So, a few questions for anyone who wishes
to weigh in:

1) Is this inherently interesting to you?
2) Are you already contributing to this effort (or would you be
interested in contributing to this effort)?
3) Would you want to see this made a priority by the core team to
review and “bless” an integration?

Just want to get an idea to see if anyone is really excited about this
and just hasn’t expressed it yet. If nothing else I wanted people to
be aware that it was an option that was floating around out there.
Thanks.



Best Regards,


Patrick McGarry
pmcga...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Moderation queue

2015-12-17 Thread Patrick McGarry
Hey cephers,

Looks like there was a list migration a while back and a bunch of
things were getting stuck in an admin moderation state and never
getting cleared. I have salvaged what I could, but if your message
never made it to the list it now never will. Sorry for the
inconvenience if there are any messages that you were waiting on.

Just as a note, to ensure that you don’t ever have a message stuck in
a queue you should join the list before sending to it. Most of the
time I am able to see messages come in that need to be moderated, and
whitelist the user, but that is never 100% guaranteed and often
results in message delay if I am traveling.

If you have questions or concerns please let me know. Thanks.

-- 

Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rados bench object not correct errors on v9.0.3

2015-12-17 Thread Dałek , Piotr
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of Sage Weil
> Sent: Tuesday, August 25, 2015 7:43 PM

> > I have built rpms from the tarball http://ceph.com/download/ceph-
> 9.0.3.tar.bz2.
> > Have done this for fedora 21 x86_64 and for aarch64.  On both
> > platforms when I run a single node "cluster" with a few osds and run
> > rados bench read tests (either seq or rand) I get occasional reports
> > like
> >
> > benchmark_data_myhost_20729_object73 is not correct!
> >
> > I never saw these with similar rpm builds on these platforms from 9.0.2
> sources.
> >
> > Also, if I go to an x86-64 system running Ubuntu trusty for which I am
> > able to install prebuilt binary packages via
> > ceph-deploy install --dev v9.0.3
> >
> > I do not see the errors there.
> 
> Hrm.. haven't seen it on this end, but we're running/testing master and not
> 9.0.2 specifically.  If you can reproduce this on master, that'd be very 
> helpful!
> 
> There have been some recent changes to rados bench... Piotr, does this
> seem like it might be caused by your changes?

Yes. My PR #4690 (https://github.com/ceph/ceph/pull/4690) caused rados bench to 
be fast enough to sometimes run into a race condition between librados's AIO and 
objbencher processing. That was fixed in PR #5152 
(https://github.com/ceph/ceph/pull/5152), which didn't make it into 9.0.3.
Tom, you can confirm this by inspecting the contents of the objects in question 
(their contents should be perfectly fine and in line with other objects).
In the meantime you can either apply patch from PR #5152 on your own or use 
--no-verify.
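For example, something along these lines (the pool name is just a placeholder;
the seq run needs objects left behind by a write run):

rados bench -p testpool 60 write --no-cleanup
rados bench -p testpool 60 seq --no-verify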

With best regards / Pozdrawiam
Piotr Dałek
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cephfs: large files hang

2015-12-17 Thread Bryan Wright
Hi folks,

This is driving me crazy.  I have a ceph filesystem that behaves normally
when I "ls" files, and behaves normally when I copy smallish files on or off
of the filesystem, but large files (~ GB size) hang after copying a few
megabytes.

This is ceph 0.94.5 under Centos 6.7 under kernel 4.3.3-1.el6.elrepo.x86_64.
 I've tried 64-bit and 32-bit clients with several different kernels, but
all behave the same.

After copying the first few bytes I get a stream of "slow request" messages
for the osds, like this:

2015-12-17 14:20:40.458306 osd.208 [WRN] slow request 1922.166564 seconds
old, received at 2015-12-17 13:48:38.291683: osd_op(mds.0.14956:851
100010a7b92.000d [stat] 0.5d427a9a RETRY=5
ack+retry+read+rwordered+known_if_redirected e193868) currently reached_pg

It's not a single OSD misbehaving.  It seems to be any OSD.   The OSDs have
plenty of disk space, and there's nothing in the osd logs that points to a
problem.

How can I find out what's blocking these requests?
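
The obvious starting points seem to be the cluster health detail and the admin
socket of one of the implicated OSDs, e.g.:

ceph health detail
ceph daemon osd.208 dump_ops_in_flight      # run on the host that holds osd.208
ceph daemon osd.208 dump_historic_ops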

Bryan


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Deploying a Ceph storage cluster using Warewulf on Centos-7

2015-12-17 Thread Chris Jones
Hi Chu,

If you can use Chef then:
https://github.com/ceph/ceph-chef

An example of an actual project can be found at:
https://github.com/bloomberg/chef-bcs

Chris

On Wed, Sep 23, 2015 at 4:11 PM, Chu Ruilin  wrote:

> Hi, all
>
> I don't know which automation tool is best for deploying Ceph and I'd like
> to know about. I'm comfortable with Warewulf since I've been using it for
> HPC clusters. I find it quite convenient for Ceph too. I wrote a set of
> scripts that can deploy a Ceph cluster quickly. Here is how I did it just
> using virtualbox:
>
>
> http://ruilinchu.blogspot.com/2015/09/deploying-ceph-storage-cluster-using.html
>
> comments are welcome!
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Best Regards,
Chris Jones

cjo...@cloudm2.com
(p) 770.655.0770
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Metadata Server (MDS) Hardware Suggestions

2015-12-17 Thread John Spray
On Thu, Dec 17, 2015 at 2:31 PM, Simon  Hallam  wrote:
> Hi all,
>
>
>
> I’m looking at sizing up some new MDS nodes, but I’m not sure if my thought
> process is correct or not:
>
>
>
> CPU: Limited to a maximum 2 cores. The higher the GHz, the more IOPS
> available. So something like a single E5-2637v3 should fulfil this.

No idea where you're getting the 2 core part.  But a mid range CPU
like the one you're looking at is probably perfectly appropriate.  As
you have probably gathered, the MDS will not make good use of large
core counts (though there are plenty of threads and various
serialisation/deserialisation parts can happen in parallel).

> Memory: The more the better, as the metadata can be cached in RAM (how much
> RAM required is dependent on number of files?).

Correct, the more RAM you have, the higher you can set mds_cache_size,
and the larger your working set will be.

> HDD: This is where I’m struggling, does their speed/IOPs have an significant
> impact on the performance of the MDS (I’m guessing this is dependent on if
> the metadata fits within RAM)? If so, does the NVMe SSDs look like the right
> avenue, or will standard SATA SSDs suffice?

The MDS does not store anything on local drives.  All the metadata is
stored in RADOS (i.e. on the OSD nodes).  All that goes locally is
config files and debug logs.

John

>
>
>
> Thanks in advance for your help!
>
>
>
> Simon Hallam
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph read errors

2015-12-17 Thread John Spray
On Wed, Aug 12, 2015 at 1:14 PM, Arseniy Seroka  wrote:
> Hello! I have the following error.
> When doing `scp` from local storage to ceph I'm getting errors in the file's
> contents.
> For example:
> ```
> me@kraken:/ceph_storage/lib$ java -jar file.jar
> Error: Invalid or corrupt jarfile file.jar
> ```
> If I'm checking md5 -> everything is ok.
>
> Then I'm going to another server
> me@poni:/ceph_storage/lib$ java -jar file.jar
> Missing required option(s) [e/edges, e/edges, n/nodes, n/nodes]
> Option (* = required)   Description
> -   ---
> ...
> ```
>
> After that I'm going to the previous server and everything works.
> ```
> me@kraken:/ceph_storage/lib/$ java -jar file.jar
> Missing required option(s) [e/edges, e/edges, n/nodes, n/nodes]
> Option (* = required)   Description
> -   ---
> ...
> ```
>
> I think that there are problems with mmap read.
> Another example:
> ```
> me@kraken:/ceph_storage/lib$ dd if=file.jar skip=1330676 bs=1 count=10 |
> hexdump -C
> 10+0 records in
> 10+0 records out
>   00 00 00 00 00 00 00 00  00 00|..|
> 10 bytes (10 B) copied000a
> , 0.0149698 s, 0.7 kB/s
> me@kraken:/ceph_storage/lib$ head file.jar -c 10 | hexdump -C
>   50 4b 03 04 0a 00 00 08  00 00|PK|
> 000a
> ```

Interesting.  Are you using the kernel or fuse client (and what version?)

John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Deploying a Ceph storage cluster using Warewulf on Centos-7

2015-12-17 Thread Shinobu Kinjo
I prefer puppet.
Anyhow, are you going to use the Ceph cluster for /home, for some kind of 
computation area like scratch, or as a replacement for Lustre?
Just asking.

Thank you,
Shinobu

- Original Message -
From: "Chris Jones" 
To: "Chu Ruilin" 
Cc: ceph-us...@ceph.com
Sent: Friday, December 18, 2015 5:44:29 AM
Subject: Re: [ceph-users] Deploying a Ceph storage cluster using Warewulf on
Centos-7

Hi Chu, 

If you can use Chef then: 
https://github.com/ceph/ceph-chef 

An example of an actual project can be found at: 
https://github.com/bloomberg/chef-bcs 

Chris 

On Wed, Sep 23, 2015 at 4:11 PM, Chu Ruilin < ruilin...@gmail.com > wrote: 



Hi, all 

I don't know which automation tool is best for deploying Ceph and I'd like to 
know about. I'm comfortable with Warewulf since I've been using it for HPC 
clusters. I find it quite convenient for Ceph too. I wrote a set of scripts 
that can deploy a Ceph cluster quickly. Here is how I did it just using 
virtualbox: 

http://ruilinchu.blogspot.com/2015/09/deploying-ceph-storage-cluster-using.html 

comments are welcome! 




___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 




-- 
Best Regards, 
Chris Jones 

cjo...@cloudm2.com 
(p) 770.655.0770 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cephfs: large files hang

2015-12-17 Thread Chris Dunlop
Hi Bryan,

Have you checked your MTUs? I was recently bitten by large packets not
getting through where small packets would. (This list, Dec 14, "All pgs
stuck peering".) Small files working but big files not working smells 
like it could be a similar problem.
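A quick way to test for that, assuming Linux ping and 9000-byte jumbo frames
(8972 = 9000 minus 28 bytes of IP/ICMP headers; use 1472 for a standard 1500 MTU):

ping -M do -s 8972 <other-ceph-node>
ping -M do -s 1472 <other-ceph-node>

If the big ping fails while the small one works, it's the same class of problem.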

Cheers,

Chris

On Thu, Dec 17, 2015 at 07:43:54PM +, Bryan Wright wrote:
> Hi folks,
> 
> This is driving me crazy.  I have a ceph filesystem that behaves normally
> when I "ls" files, and behaves normally when I copy smallish files on or off
> of the filesystem, but large files (~ GB size) hang after copying a few
> megabytes.
> 
> This is ceph 0.94.5 under Centos 6.7 under kernel 4.3.3-1.el6.elrepo.x86_64.
>  I've tried 64-bit and 32-bit clients with several different kernels, but
> all behave the same.
> 
> After copying the first few bytes I get a stream of "slow request" messages
> for the osds, like this:
> 
> 2015-12-17 14:20:40.458306 osd.208 [WRN] slow request 1922.166564 seconds
> old, received at 2015-12-17 13:48:38.291683: osd_op(mds.0.14956:851
> 100010a7b92.000d [stat] 0.5d427a9a RETRY=5
> ack+retry+read+rwordered+known_if_redirected e193868) currently reached_pg
> 
> It's not a single OSD misbehaving.  It seems to be any OSD.   The OSDs have
> plenty of disk space, and there's nothing in the osd logs that points to a
> problem.
> 
> How can I find out what's blocking these requests?
> 
> Bryan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] problem on ceph installation on centos 7

2015-12-17 Thread Leung, Alex (398C)
Yup, thanks for the email. I got through the software installation part,
but I ran into this problem when I tried to create the initial monitor.

[root@hdmaster ~]# su - ceph
Last login: Thu Dec 17 14:32:22 PST 2015 on pts/2
[ceph@hdmaster ~]$ ceph-deploy -v mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.28): /bin/ceph-deploy -v mon 
create-initial
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  verbose   : True
[ceph_deploy.cli][INFO  ]  overwrite_conf: False
[ceph_deploy.cli][INFO  ]  subcommand: create-initial
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  cd_conf   : 

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  keyrings  : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts mon0
[ceph_deploy.mon][DEBUG ] detecting platform for host mon0 ...
[mon0][DEBUG ] connection detected need for sudo
[mon0][DEBUG ] connected to host: mon0
[mon0][DEBUG ] detect platform information from remote host
[mon0][DEBUG ] detect machine type
[mon0][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.2.1511 Core
[mon0][DEBUG ] determining if provided host has same hostname in remote
[mon0][DEBUG ] get remote short hostname
[mon0][WARNIN] 

[mon0][WARNIN] provided hostname must match remote hostname
[mon0][WARNIN] provided hostname: mon0
[mon0][WARNIN] remote hostname: hdmaster
[mon0][WARNIN] monitors may not reach quorum and create-keys will not complete
[mon0][WARNIN] 

[mon0][DEBUG ] deploying mon to mon0
[mon0][DEBUG ] get remote short hostname
[mon0][DEBUG ] remote hostname: hdmaster
[mon0][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[mon0][DEBUG ] create the mon path if it does not exist
[mon0][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-hdmaster/done
[mon0][DEBUG ] create a done file to avoid re-doing the mon deployment
[mon0][DEBUG ] create the init path if it does not exist
[mon0][DEBUG ] locating the `service` executable...
[mon0][INFO  ] Running command: sudo /usr/sbin/service ceph -c 
/etc/ceph/ceph.conf start mon.hdmaster
[mon0][DEBUG ] === mon.hdmaster ===
[mon0][DEBUG ] Starting Ceph mon.hdmaster on hdmaster...already running
[mon0][INFO  ] Running command: sudo systemctl enable ceph
[mon0][WARNIN] ceph.service is not a native service, redirecting to 
/sbin/chkconfig.
[mon0][WARNIN] Executing /sbin/chkconfig ceph on
[mon0][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.mon0.asok mon_status
[mon0][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[mon0][WARNIN] monitor: mon.mon0, might not be running yet
[mon0][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.mon0.asok mon_status
[mon0][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[mon0][WARNIN] monitor mon0 does not exist in monmap
[ceph_deploy.mon][INFO  ] processing monitor mon.mon0
[mon0][DEBUG ] connection detected need for sudo
[mon0][DEBUG ] connected to host: mon0
[mon0][DEBUG ] detect platform information from remote host
[mon0][DEBUG ] detect machine type
[mon0][DEBUG ] find the location of an executable
[mon0][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.mon0.asok mon_status
[mon0][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.mon0 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[mon0][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.mon0.asok mon_status
[mon0][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.mon0 monitor is not yet in quorum, tries left: 4
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[mon0][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.mon0.asok mon_status
[mon0][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.mon0 monitor is not yet in quorum, tries left: 3
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[mon0][INFO  ] Running 
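
The warning near the top ("provided hostname must match remote hostname") is the
first thing to fix; a minimal sketch, assuming the node really is meant to be
called mon0:

# on the monitor node
sudo hostnamectl set-hostname mon0
# make sure "mon0" resolves to that node's IP (DNS or an /etc/hosts entry);
# the stale mon directory from the failed attempt (/var/lib/ceph/mon/ceph-hdmaster)
# may also need cleaning up, then from the admin node:
ceph-deploy --overwrite-conf mon create-initial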

[ceph-users] rgw deletes object data when multipart completion request timed out and retried

2015-12-17 Thread Gleb Borisov
Hi everyone,

We are facing a strange issue on RadosGW (0.94.5-1precise, civetweb frontend
behind nginx). nginx's access.log:

17/Dec/2015:20:34:55 "PUT /ZZZ?uploadId=XXX&partNumber=37 HTTP/1.1" 200
17/Dec/2015:20:34:57 "PUT /ZZZ?uploadId=XXX&partNumber=39 HTTP/1.1" 200
17/Dec/2015:20:35:47 "POST /ZZZ?uploadId=XXX HTTP/1.1" 499
17/Dec/2015:20:36:37 "POST /ZZZ?uploadId=XXX HTTP/1.1" 499

We successfully uploaded 39 parts of this object, then hit a read timeout on the
CompleteMultipartUpload request (POST), and our library retried it one more time
(should we be retrying the CompleteMultipartUpload request at all?).

After that we can read the entire object for 5 minutes or less (which seems to
match the GC schedule in rgw); afterwards we start receiving 404 NoSuchKey. The
interesting thing is that the head object is not deleted and we can still fetch
the object metadata (using a HEAD request). I've scanned all the OSD directories
and found no content for these objects. The only reference left is the head
object with rgw.manifest in its xattrs.

I've tried to search for related issues in the tracker, but didn't find anything
similar.
Unfortunately we have no rgw logs for this period at all.

I've enabled 30/30 logging and am collecting logs, but now we have more
acceptable response times and no timeouts at all.

Any ideas?

Thanks!

-- 
Best regards,
Gleb M Borisov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with git.ceph.com release.asc keys

2015-12-17 Thread Gregory Farnum
Apparently the keys are now at
https://download.ceph.com/keys/release.asc and you need to upgrade
your ceph-deploy (or maybe just change a config setting? I'm not
really sure).
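For example, importing the release key by hand should get around the
git.ceph.com timeout on rpm-based distros:

sudo rpm --import https://download.ceph.com/keys/release.asc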
-Greg

On Thu, Dec 17, 2015 at 7:51 AM, Tim Gipson  wrote:
> Is anyone else experiencing issues when they try to run a “ceph-deploy
> install” command when it gets to the rpm import of
> https://git.ceph.com/?p=ceph.git;a=blob_plain;f=keys/release.asc ?
>
> I also tried to curl the url with no luck.  I get a 504 Gateway time-out
> error in cephy-deploy.
>
>
> Tim G.
> Systems Engineer
> Nashville TN
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Metadata Server (MDS) Hardware Suggestions

2015-12-17 Thread Gregory Farnum
On Thu, Dec 17, 2015 at 2:06 PM, John Spray  wrote:
> On Thu, Dec 17, 2015 at 2:31 PM, Simon  Hallam  wrote:
>> Hi all,
>>
>>
>>
>> I’m looking at sizing up some new MDS nodes, but I’m not sure if my thought
>> process is correct or not:
>>
>>
>>
>> CPU: Limited to a maximum 2 cores. The higher the GHz, the more IOPS
>> available. So something like a single E5-2637v3 should fulfil this.
>
> No idea where you're getting the 2 core part.  But a mid range CPU
> like the one you're looking at is probably perfectly appropriate.  As
> you have probably gathered, the MDS will not make good use of large
> core counts (though there are plenty of threads and various
> serialisation/deserialisation parts can happen in parallel).

There's just not much that happens outside of the big MDS lock right
now, besides journaling and some message handling. So basically two
cores is all we'll be able to use until that happens. ;)

>
>> Memory: The more the better, as the metadata can be cached in RAM (how much
>> RAM required is dependent on number of files?).
>
> Correct, the more RAM you have, the higher you can set mds_cache_size,
> and the larger your working set will be.

Note the "working set" there; it's only the active metadata you need
to worry about when sizing things. I think at last count Zheng was
seeing ~3KB of memory for each inode/dentry combo.
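
As a rough back-of-the-envelope illustration (the RAM figure below is just an
assumption; only the ~3KB-per-inode number comes from above, and mds_cache_size
is counted in inodes, defaulting to 100000):

# 64 GB RAM with ~20% headroom: 64 GB * 0.8 / 3 KB ~= 17 million inodes
[mds]
    mds cache size = 17000000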
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v10.0.0 released

2015-12-17 Thread Loic Dachary
The script handles UTF-8 fine, the copy/paste is at fault here ;-)

On 24/11/2015 07:59, piotr.da...@ts.fujitsu.com wrote:
>> -Original Message-
>> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
>> ow...@vger.kernel.org] On Behalf Of Sage Weil
>> Sent: Monday, November 23, 2015 5:08 PM
>>
>> This is the first development release for the Jewel cycle.  We are off to a
>> good start, with lots of performance improvements flowing into the tree.
>> We are targetting sometime in Q1 2016 for the final Jewel.
>>
>> [..]
>> (`pr#5853 `_, Piotr Dałek)
> 
> Hopefully at that point the script that generates this list will learn how to 
> handle UTF-8 ;-)
> 
> 
> With best regards / Pozdrawiam
> Piotr Dałek
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount.ceph not accepting options, please help

2015-12-17 Thread Gregory Farnum
On Wed, Dec 16, 2015 at 10:54 AM, Mike Miller  wrote:
> Hi,
>
> sorry, the question might seem very easy, probably my bad, but can you
> please help me understand why I am unable to change the read-ahead size and
> other options when mounting cephfs?
>
> mount.ceph m2:6789:/ /foo2 -v -o name=cephfs,secret=,rsize=1024000
>
> the result is:
>
> ceph: Unknown mount option rsize
>
> I am using hammer 0.94.5 and ubuntu trusty.
>
> Thanks for your help!

Hmm, I don't actually see anything parsing for rsize in mount.ceph.c,
but it does look for some of the other things. Can you try dropping
that config option and seeing if it works?
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cephfs: large files hang

2015-12-17 Thread Gregory Farnum
On Thu, Dec 17, 2015 at 11:43 AM, Bryan Wright  wrote:
> Hi folks,
>
> This is driving me crazy.  I have a ceph filesystem that behaves normally
> when I "ls" files, and behaves normally when I copy smallish files on or off
> of the filesystem, but large files (~ GB size) hang after copying a few
> megabytes.
>
> This is ceph 0.94.5 under Centos 6.7 under kernel 4.3.3-1.el6.elrepo.x86_64.
>  I've tried 64-bit and 32-bit clients with several different kernels, but
> all behave the same.
>
> After copying the first few bytes I get a stream of "slow request" messages
> for the osds, like this:
>
> 2015-12-17 14:20:40.458306 osd.208 [WRN] slow request 1922.166564 seconds
> old, received at 2015-12-17 13:48:38.291683: osd_op(mds.0.14956:851
> 100010a7b92.000d [stat] 0.5d427a9a RETRY=5
> ack+retry+read+rwordered+known_if_redirected e193868) currently reached_pg
>
> It's not a single OSD misbehaving.  It seems to be any OSD.   The OSDs have
> plenty of disk space, and there's nothing in the osd logs that points to a
> problem.
>
> How can I find out what's blocking these requests?

What's the full output of "ceph -s"?

The only time the MDS issues these "stat" ops on objects is during MDS
replay, but the bit where it's blocked on "reached_pg" in the OSD
makes it look like your OSD is just very slow. (Which could
potentially make the MDS back up far enough to get zapped by the
monitors, but in that case it's probably some kind of misconfiguration
issue if they're all hitting it.)
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data partition and journal on same disk

2015-12-17 Thread Michał Chybowski

Or, if You have already set partitions, You can do it with this command:
ceph-deploy osd prepare machine:/dev/sdb1:/dev/sdb2

where /dev/sdb1 is Your data partition and /dev/sdb2 is Your journal one.

Regards
Michał Chybowski
Tiktalik.com

On 17.12.2015 at 12:46, Loic Dachary wrote:

Hi,

You can try

ceph-deploy osd prepare osdserver:/dev/sdb

it will create the /dev/sdb1 and /dev/sdb2 partitions for you.

Cheers

On 17/12/2015 12:41, Dan Nica wrote:

Well I get an error when I try to create data and journal on same disk

[rimu][INFO  ] Running command: sudo ceph-disk -v prepare --cluster ceph 
--fs-type xfs -- /dev/sdb1 /dev/sdb2

[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-allows-journal -i 0 --cluster ceph

[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-wants-journal -i 0 --cluster ceph

[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-needs-journal -i 0 --cluster ceph

[rimu][WARNIN] Traceback (most recent call last):

[rimu][WARNIN]   File "/sbin/ceph-disk", line 3576, in 

[rimu][WARNIN] main(sys.argv[1:])

[rimu][WARNIN]   File "/sbin/ceph-disk", line 3530, in main

[rimu][WARNIN] args.func(args)

[rimu][WARNIN]   File "/sbin/ceph-disk", line 1705, in main_prepare

[rimu][WARNIN] raise Error('data path does not exist', args.data)

[rimu][WARNIN] __main__.Error: Error: data path does not exist: /dev/sdb1

[rimu][ERROR ] RuntimeError: command returned non-zero exit status: 1

[ceph_deploy.osd][ERROR ] Failed to execute command: ceph-disk -v prepare 
--cluster ceph --fs-type xfs -- /dev/sdb1 /dev/sdb2

[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs

  


ceph-deploy --version
1.5.30

Running on Centos 7

--
Dan

From: Swapnil Jain [mailto:swap...@mask365.com]
Sent: Thursday, December 17, 2015 1:17 PM
To: Dan Nica ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] data partition and journal on same disk

Yes, you can have it on a different partition on the same disk, but it is not recommended.

ceph-deploy osd prepare {node-name}:{data-disk}[:{journal-disk}]
ceph-deploy osd prepare osdserver1:sdc1:sdc2

Best Regards,

Swapnil Jain | Solution Architect & Certified Instructor
RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE

On 17-Dec-2015, at 4:42 pm, Dan Nica wrote:

Hi,

Can I have data and journal on the same disk? If yes, how?

Thanks
--
Dan

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data partition and journal on same disk

2015-12-17 Thread Dan Nica
Well after upgrading the system to latest it worked with “prepare osdserver:sdb”

Thank you,
Dan

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Michal 
Chybowski
Sent: Thursday, December 17, 2015 2:28 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] data partition and journal on same disk

Or, if You have already set partitions, You can do it with this command:
ceph-deploy osd prepare machine:/dev/sdb1:/dev/sdb2

where /dev/sdb1 is Your data partition and /dev/sdb2 is Your journal one.

Regards
Michał Chybowski
Tiktalik.com

On 17.12.2015 at 12:46, Loic Dachary wrote:

Hi,

You can try

ceph-deploy osd prepare osdserver:/dev/sdb

it will create the /dev/sdb1 and /dev/sdb2 partitions for you.

Cheers

On 17/12/2015 12:41, Dan Nica wrote:

Well I get an error when I try to create data and journal on same disk

[rimu][INFO  ] Running command: sudo ceph-disk -v prepare --cluster ceph 
--fs-type xfs -- /dev/sdb1 /dev/sdb2
[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-allows-journal -i 0 --cluster ceph
[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-wants-journal -i 0 --cluster ceph
[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-needs-journal -i 0 --cluster ceph
[rimu][WARNIN] Traceback (most recent call last):
[rimu][WARNIN]   File "/sbin/ceph-disk", line 3576, in 
[rimu][WARNIN] main(sys.argv[1:])
[rimu][WARNIN]   File "/sbin/ceph-disk", line 3530, in main
[rimu][WARNIN] args.func(args)
[rimu][WARNIN]   File "/sbin/ceph-disk", line 1705, in main_prepare
[rimu][WARNIN] raise Error('data path does not exist', args.data)
[rimu][WARNIN] __main__.Error: Error: data path does not exist: /dev/sdb1
[rimu][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: ceph-disk -v prepare 
--cluster ceph --fs-type xfs -- /dev/sdb1 /dev/sdb2
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs

ceph-deploy --version
1.5.30

Running on Centos 7

--
Dan

From: Swapnil Jain [mailto:swap...@mask365.com]
Sent: Thursday, December 17, 2015 1:17 PM
To: Dan Nica ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] data partition and journal on same disk

Yes, you can have it on a different partition on the same disk, but it is not recommended.

ceph-deploy osd prepare {node-name}:{data-disk}[:{journal-disk}]
ceph-deploy osd prepare osdserver1:sdc1:sdc2

Best Regards,

Swapnil Jain | Solution Architect & Certified Instructor
RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE

On 17-Dec-2015, at 4:42 pm, Dan Nica wrote:

Hi,

Can I have data and journal on the same disk? If yes, how?

Thanks
--
Dan

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs, low performances

2015-12-17 Thread Christian Balzer

Hello,

On Fri, 18 Dec 2015 03:36:12 +0100 Francois Lafont wrote:

> Hi,
> 
> I have ceph cluster currently unused and I have (to my mind) very low
> performances. I'm not an expert in benchs, here an example of quick
> bench:
> 
> ---
> # fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1
> --name=readwrite --filename=rw.data --bs=4k --iodepth=64 --size=300MB
> --readwrite=randrw --rwmixread=50 readwrite: (g=0): rw=randrw,
> bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64 fio-2.1.3 Starting 1
> process readwrite: Laying out IO file(s) (1 file(s) / 300MB)
> Jobs: 1 (f=1): [m] [100.0% done] [2264KB/2128KB/0KB /s] [566/532/0 iops]
> [eta 00m:00s] readwrite: (groupid=0, jobs=1): err= 0: pid=3783: Fri Dec
> 18 02:01:13 2015 read : io=153640KB, bw=2302.9KB/s, iops=575, runt=
> 66719msec write: io=153560KB, bw=2301.7KB/s, iops=575, runt= 66719msec
>   cpu  : usr=0.77%, sys=3.07%, ctx=115432, majf=0, minf=604
>   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
> >=64=99.9% submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%,
> >64=0.0%, >=64=0.0%
>  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
> >=64=0.0% issued: total=r=38410/w=38390/d=0, short=r=0/w=0/d=0
> 
> Run status group 0 (all jobs):
>READ: io=153640KB, aggrb=2302KB/s, minb=2302KB/s, maxb=2302KB/s,
> mint=66719msec, maxt=66719msec WRITE: io=153560KB, aggrb=2301KB/s,
> minb=2301KB/s, maxb=2301KB/s, mint=66719msec, maxt=66719msec
> ---
> 
> It seems to me very bad. 
Indeed. 
Firstly, let me state that I don't use CephFS and have no clue how it
influences things or how it can/should be tuned.

That being said, the fio above running in a VM (RBD) gives me 440 IOPS
against a single OSD storage server (replica 1) with 4 crappy HDDs and
on-disk journals on my test cluster (1Gb/s links). 
So yeah, given your configuration that's bad.

In comparison I get 3000 IOPS against a production cluster (so not idle)
with 4 storage nodes. Each with 4 100GB DC S3700 for journals and OS and 8
SATA HDDs, Infiniband (IPoIB) connectivity for everything.

All of this is with .80.x (Firefly) on Debian Jessie.


> Can I hope better results with my setup
> (explained below)? During the bench, I don't see particular symptoms (no
> CPU blocked at 100% etc). If you have advices to improve the perf and/or
> maybe to make smarter benchs, I'm really interested.
> 
You want to use atop on all your nodes and look for everything from disks
to network utilization. 
There might be nothing obvious going on, but it needs to be ruled out.
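
It can also help to take CephFS out of the equation and benchmark the
underlying RADOS pools directly with "rados bench". A quick sketch
(assuming a throwaway pool named "testbench" that you create and delete
yourself, sized like your data pool):

  # 4KB writes for 30 seconds, 32 concurrent ops, keep the objects around
  rados -p testbench bench 30 write -b 4096 -t 32 --no-cleanup
  # random reads against the objects written above
  rados -p testbench bench 30 rand -t 32
  # remove the benchmark objects afterwards
  rados -p testbench cleanup

If those numbers are also in the low hundreds of IOPS, the problem is
below CephFS; if they are much higher, look at the MDS/fuse side.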

> Thanks in advance for your help. Here is my conf...
> 
> I use Ubuntu 14.04 on each server with the 3.13 kernel (it's the same
> for the client ceph where I run my bench) and I use Ceph 9.2.0
> (Infernalis). 

I seem to recall that this particular kernel has issues, you might want to
scour the archives here.

>On the client, cephfs is mounted via cephfs-fuse with this
> in /etc/fstab:
> 
> id=cephfs,keyring=/etc/ceph/ceph.client.cephfs.keyring,client_mountpoint=/
> /mnt/cephfs
> fuse.ceph noatime,defaults,_netdev0   0
> 
> I have 5 cluster node servers "Supermicro Motherboard X10SLM+-LN4 S1150"
> with one 1GbE port for the ceph public network and one 10GbE port for
> the ceph private network:
> 
For the sake of latency (which becomes the biggest issue when you're not
exhausting CPU/disks), you'd be better off with everything on 10GbE, unless
you need the 1GbE to connect to clients that have no 10Gb/s ports.

> - 1 x Intel Xeon E3-1265Lv3
> - 1 SSD DC3710 Series 200GB (with partitions for the OS, the 3
> OSD-journals and, just for ceph01, ceph02 and ceph03, the SSD contains
> too a partition for the workdir of a monitor
The 200GB DC S3700 would have been faster, but that's a moot point and not
your bottleneck for sure.

> - 3 HD 4TB Western Digital (WD) SATA 7200rpm
> - RAM 32GB
> - NO RAID controlleur

Which controller are you using?
I recently came across an Adaptec SATA3 HBA that delivered only 176 MB/s
writes with 200GB DC S3700s as opposed to 280MB/s when used with Intel
onboard SATA-3 ports or a LSI 9211-4i HBA.

Regards,

Christian

> - Each partition uses XFS with noatim option, except the OS partition in
> EXT4.
> 
> Here is my ceph.conf :
> 
> ---
> [global]
>   fsid   = 
>   cluster network= 192.168.22.0/24
>   public network = 10.0.2.0/24
>   auth cluster required  = cephx
>   auth service required  = cephx
>   auth client required   = cephx
>   filestore xattr use omap   = true
>   osd pool default size  = 3
>   osd pool default min size  = 1
>   osd pool default pg num= 64
>   osd pool default pgp num   = 64
>   osd crush chooseleaf type  = 1

Re: [ceph-users] Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

2015-12-17 Thread Jesper Thorhauge
Nope, the previous post contained all that was in the boot.log :-( 

/Jesper 

** 

- Den 17. dec 2015, kl. 11:53, Loic Dachary  skrev: 

On 17/12/2015 11:33, Jesper Thorhauge wrote: 
> Hi Loic, 
> 
> Sounds like something does go wrong when /dev/sdc3 shows up. Is there anyway 
> i can debug this further? Log-files? Modify the .rules file...? 

Do you see traces of what happens when /dev/sdc3 shows up in boot.log ? 

> 
> /Jesper 
> 
>  
> 
> The non-symlink files in /dev/disk/by-partuuid come to existence because of: 
> 
> * system boots 
> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1 
> * ceph-disk-udev creates the symlink 
> /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1 
> * ceph-disk activate /dev/sda1 is mounted and finds a symlink to the journal 
> journal -> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 which 
> does not yet exists because /dev/sdc udev rules have not been run yet 
> * ceph-osd opens the journal in write mode and that creates the file 
> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 as a regular file 
> * the file is empty and the osd fails to activate with the error you see 
> (EINVAL because the file is empty) 
> 
> This is ok, supported and expected since there is no way to know which disk 
> will show up first. 
> 
> When /dev/sdc shows up, the same logic will be triggered: 
> 
> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1 
> * ceph-disk-udev creates the symlink 
> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc3 
> (overriding the file because ln -sf) 
> * ceph-disk activate-journal /dev/sdc3 finds that 
> c83b5aa5-fe77-42f6-9415-25ca0266fb7f is the data partition for that journal 
> and mounts /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
> * ceph-osd opens the journal and all is well 
> 
> Except something goes wrong in your case, presumably because ceph-disk-udev 
> is not called when /dev/sdc3 shows up ? 
> 
> On 17/12/2015 08:29, Jesper Thorhauge wrote: 
>> Hi Loic, 
>> 
>> osd's are on /dev/sda and /dev/sdb, journal's is on /dev/sdc (sdc3 / sdc4). 
>> 
>> sgdisk for sda shows; 
>> 
>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) 
>> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6 
>> First sector: 2048 (at 1024.0 KiB) 
>> Last sector: 1953525134 (at 931.5 GiB) 
>> Partition size: 1953523087 sectors (931.5 GiB) 
>> Attribute flags:  
>> Partition name: 'ceph data' 
>> 
>> for sdb 
>> 
>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) 
>> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F 
>> First sector: 2048 (at 1024.0 KiB) 
>> Last sector: 1953525134 (at 931.5 GiB) 
>> Partition size: 1953523087 sectors (931.5 GiB) 
>> Attribute flags:  
>> Partition name: 'ceph data' 
>> 
>> for /dev/sdc3 
>> 
>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) 
>> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072 
>> First sector: 935813120 (at 446.2 GiB) 
>> Last sector: 956293119 (at 456.0 GiB) 
>> Partition size: 2048 sectors (9.8 GiB) 
>> Attribute flags:  
>> Partition name: 'ceph journal' 
>> 
>> for /dev/sdc4 
>> 
>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown) 
>> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168 
>> First sector: 956293120 (at 456.0 GiB) 
>> Last sector: 976773119 (at 465.8 GiB) 
>> Partition size: 2048 sectors (9.8 GiB) 
>> Attribute flags:  
>> Partition name: 'ceph journal' 
>> 
>> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it 
>> seems correct to me. 
>> 
>> after a reboot, /dev/disk/by-partuuid is; 
>> 
>> -rw-r--r-- 1 root root 0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168 
>> -rw-r--r-- 1 root root 0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072 
>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
>> -> ../../sdb1 
>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 
>> -> ../../sda1 
>> 
>> i dont know how to verify the symlink of the journal file - can you guide me 
>> on that one? 
>> 
>> Thank :-) ! 
>> 
>> /Jesper 
>> 
>> ** 
>> 
>> Hi, 
>> 
>> On 17/12/2015 07:53, Jesper Thorhauge wrote: 
>>> Hi, 
>>> 
>>> Some more information showing in the boot.log; 
>>> 
>>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 
>>> filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on 
>>> /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument 
>>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs 
>>> failed with error -22 
>>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1 ** ERROR: error creating empty 
>>> object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument 
>>> ERROR:ceph-disk:Failed to activate 
>>> ceph-disk: Command 

[ceph-users] data partition and journal on same disk

2015-12-17 Thread Dan Nica
Hi,

Can I have data and journal on the same disk? If yes, how?

Thanks
--
Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data partition and journal on same disk

2015-12-17 Thread Mart van Santen
Hello Dan,


Yes, this is the default. They are just two partitions. The ceph-deploy
or ceph-disk prepare commands will create two partitions on the disk,
one used as the journal and one as the data partition, normally formatted
as xfs. In the data partition there is a file 'journal', which is a
symlink to the journal partition (by its partition UUID), but you can
actually point it at any block device you want to use as the journal. If
you want to change the journal to a different location: stop the osd,
flush the journal, replace the link, prepare the new journal device, and
start the osd.
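
Roughly, that sequence looks like the sketch below (a sketch only,
assuming osd.0 and a new journal partition already created; adjust the
id, paths and partition UUID to your own setup and double check before
removing anything):

  # stop the osd first (e.g. systemctl stop ceph-osd@0, or your init system's equivalent)
  # flush the pending journal entries into the data store
  ceph-osd -i 0 --flush-journal
  # repoint the symlink, preferably via the stable by-partuuid name
  rm /var/lib/ceph/osd/ceph-0/journal
  ln -s /dev/disk/by-partuuid/<uuid-of-new-journal-partition> /var/lib/ceph/osd/ceph-0/journal
  # initialize the new journal, then start the osd again
  ceph-osd -i 0 --mkjournal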

regards.

mart



On 12/17/2015 12:12 PM, Dan Nica wrote:
>
> Hi,
>
>  
>
> Can I have data an journal on the same disk ? if yes, how ?
>
>  
>
> Thanks
>
> --
>
> Dan
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Mart van Santen
Greenhost
E: m...@greenhost.nl
T: +31 20 4890444
W: https://greenhost.nl

A PGP signature can be attached to this e-mail,
you need PGP software to verify it. 
My public key is available in keyserver(s)
see: http://tinyurl.com/openpgp-manual

PGP Fingerprint: CA85 EB11 2B70 042D AF66  B29A 6437 01A1 10A3 D3A5



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data partition and journal on same disk

2015-12-17 Thread Swapnil Jain
Yes, you can have it on a different partition on the same disk, but it is not recommended.

ceph-deploy osd prepare {node-name}:{data-disk}[:{journal-disk}]
ceph-deploy osd prepare osdserver1:sdc1:sdc2

—
Swapnil Jain | swap...@linux.com 
RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE


> On 17-Dec-2015, at 4:42 pm, Dan Nica  wrote:
> 
> Hi,
> 
> Can I have data an journal on the same disk ? if yes, how ?
> 
> Thanks
> --
> Dan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> 


signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data partition and journal on same disk

2015-12-17 Thread Dan Nica
Well, I get an error when I try to create data and journal on the same disk

[rimu][INFO  ] Running command: sudo ceph-disk -v prepare --cluster ceph 
--fs-type xfs -- /dev/sdb1 /dev/sdb2
[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-allows-journal -i 0 --cluster ceph
[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-wants-journal -i 0 --cluster ceph
[rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd 
--check-needs-journal -i 0 --cluster ceph
[rimu][WARNIN] Traceback (most recent call last):
[rimu][WARNIN]   File "/sbin/ceph-disk", line 3576, in 
[rimu][WARNIN] main(sys.argv[1:])
[rimu][WARNIN]   File "/sbin/ceph-disk", line 3530, in main
[rimu][WARNIN] args.func(args)
[rimu][WARNIN]   File "/sbin/ceph-disk", line 1705, in main_prepare
[rimu][WARNIN] raise Error('data path does not exist', args.data)
[rimu][WARNIN] __main__.Error: Error: data path does not exist: /dev/sdb1
[rimu][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: ceph-disk -v prepare 
--cluster ceph --fs-type xfs -- /dev/sdb1 /dev/sdb2
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs

ceph-deploy --version
1.5.30
Running on Centos 7

--
Dan

From: Swapnil Jain [mailto:swap...@mask365.com]
Sent: Thursday, December 17, 2015 1:17 PM
To: Dan Nica ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] data partition and journal on same disk

Yes you can have it on a different partition on same disk. But not recommend.



ceph-deploy osd prepare {node-name}:{data-disk}[:{journal-disk}]
ceph-deploy osd prepare osdserver1:sdc1:sdc2



Best Regards,

Swapnil Jain | Solution Architect & Certified Instructor
RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE




On 17-Dec-2015, at 4:42 pm, Dan Nica 
> wrote:

Hi,

Can I have data an journal on the same disk ? if yes, how ?

Thanks
--
Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data partition and journal on same disk

2015-12-17 Thread Loic Dachary
Hi,

You can try

ceph-deploy osd prepare osdserver:/dev/sdb

it will create the /dev/sdb1 and /dev/sdb2 partitions for you.
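
Afterwards, "ceph-disk list" on the OSD node is a quick way to check the
result; the output should show something along these lines (illustrative
only, the details differ per setup):

  /dev/sdb :
   /dev/sdb1 ceph data, active, cluster ceph, osd.0, journal /dev/sdb2
   /dev/sdb2 ceph journal, for /dev/sdb1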

Cheers

On 17/12/2015 12:41, Dan Nica wrote:
> Well I get an error when I try to create data and jurnal on same disk
> 
> [rimu][INFO  ] Running command: sudo ceph-disk -v prepare --cluster ceph
> --fs-type xfs -- /dev/sdb1 /dev/sdb2
> [rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd
> --check-allows-journal -i 0 --cluster ceph
> [rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd
> --check-wants-journal -i 0 --cluster ceph
> [rimu][WARNIN] INFO:ceph-disk:Running command: /usr/bin/ceph-osd
> --check-needs-journal -i 0 --cluster ceph
> [rimu][WARNIN] Traceback (most recent call last):
> [rimu][WARNIN]   File "/sbin/ceph-disk", line 3576, in 
> [rimu][WARNIN] main(sys.argv[1:])
> [rimu][WARNIN]   File "/sbin/ceph-disk", line 3530, in main
> [rimu][WARNIN] args.func(args)
> [rimu][WARNIN]   File "/sbin/ceph-disk", line 1705, in main_prepare
> [rimu][WARNIN] raise Error('data path does not exist', args.data)
> [rimu][WARNIN] __main__.Error: Error: data path does not exist: /dev/sdb1
> [rimu][ERROR ] RuntimeError: command returned non-zero exit status: 1
> [ceph_deploy.osd][ERROR ] Failed to execute command: ceph-disk -v prepare
> --cluster ceph --fs-type xfs -- /dev/sdb1 /dev/sdb2
> [ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
> 
> ceph-deploy --version
> 1.5.30
> Running on Centos 7
> 
> --
> Dan
> 
> From: Swapnil Jain [mailto:swap...@mask365.com]
> Sent: Thursday, December 17, 2015 1:17 PM
> To: Dan Nica ; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] data partition and journal on same disk
> 
> Yes you can have it on a different partition on same disk. But not recommend.
> 
> ceph-deploy osd prepare {node-name}:{data-disk}[:{journal-disk}]
> ceph-deploy osd prepare osdserver1:sdc1:sdc2
> 
> Best Regards,
> 
> Swapnil Jain | Solution Architect & Certified Instructor
> RHC{A,DS,E,I,SA,SA-RHOS,VA}, CE{H,I}, CC{DA,NA}, MCSE, CNE
> 
> On 17-Dec-2015, at 4:42 pm, Dan Nica > wrote:
> 
> Hi,
> 
> Can I have data an journal on the same disk ? if yes, how ?
> 
> Thanks
> --
> Dan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

2015-12-17 Thread Loic Dachary


On 17/12/2015 11:33, Jesper Thorhauge wrote:
> Hi Loic,
> 
> Sounds like something does go wrong when /dev/sdc3 shows up. Is there anyway 
> i can debug this further? Log-files? Modify the .rules file...?

Do you see traces of what happens when /dev/sdc3 shows up in boot.log ?

> 
> /Jesper
> 
> 
> 
> The non-symlink files in /dev/disk/by-partuuid come to existence because of:
> 
> * system boots
> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1
>   * ceph-disk-udev creates the symlink 
> /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>   * ceph-disk activate /dev/sda1 is mounted and finds a symlink to the 
> journal journal -> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 
> which does not yet exists because /dev/sdc udev rules have not been run yet
>   * ceph-osd opens the journal in write mode and that creates the file 
> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 as a regular file
>   * the file is empty and the osd fails to activate with the error you see 
> (EINVAL because the file is empty)
> 
> This is ok, supported and expected since there is no way to know which disk 
> will show up first.
> 
> When /dev/sdc shows up, the same logic will be triggered:
> 
> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1
>   * ceph-disk-udev creates the symlink 
> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc3 
> (overriding the file because ln -sf)
>   * ceph-disk activate-journal /dev/sdc3 finds that 
> c83b5aa5-fe77-42f6-9415-25ca0266fb7f is the data partition for that journal 
> and mounts /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f
>   * ceph-osd opens the journal and all is well
> 
> Except something goes wrong in your case, presumably because ceph-disk-udev 
> is not called when /dev/sdc3 shows up ?
> 
> On 17/12/2015 08:29, Jesper Thorhauge wrote:
>> Hi Loic,
>>
>> osd's are on /dev/sda and /dev/sdb, journal's is on /dev/sdc (sdc3 / sdc4).
>>
>> sgdisk for sda shows;
>>
>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6
>> First sector: 2048 (at 1024.0 KiB)
>> Last sector: 1953525134 (at 931.5 GiB)
>> Partition size: 1953523087 sectors (931.5 GiB)
>> Attribute flags: 
>> Partition name: 'ceph data'
>>
>> for sdb
>>
>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F
>> First sector: 2048 (at 1024.0 KiB)
>> Last sector: 1953525134 (at 931.5 GiB)
>> Partition size: 1953523087 sectors (931.5 GiB)
>> Attribute flags: 
>> Partition name: 'ceph data'
>>
>> for /dev/sdc3
>>
>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072
>> First sector: 935813120 (at 446.2 GiB)
>> Last sector: 956293119 (at 456.0 GiB)
>> Partition size: 2048 sectors (9.8 GiB)
>> Attribute flags: 
>> Partition name: 'ceph journal'
>>
>> for /dev/sdc4
>>
>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168
>> First sector: 956293120 (at 456.0 GiB)
>> Last sector: 976773119 (at 465.8 GiB)
>> Partition size: 2048 sectors (9.8 GiB)
>> Attribute flags: 
>> Partition name: 'ceph journal'
>>
>> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it 
>> seems correct to me.
>>
>> after a reboot, /dev/disk/by-partuuid is;
>>
>> -rw-r--r-- 1 root root  0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
>> -rw-r--r-- 1 root root  0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
>> -> ../../sdb1
>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 
>> -> ../../sda1
>>
>> i dont know how to verify the symlink of the journal file - can you guide me 
>> on that one?
>>
>> Thank :-) !
>>
>> /Jesper
>>
>> **
>>
>> Hi,
>>
>> On 17/12/2015 07:53, Jesper Thorhauge wrote:
>>> Hi,
>>>
>>> Some more information showing in the boot.log;
>>>
>>> 2015-12-16 07:35:33.289830 7f1b990ad800 -1 
>>> filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal on 
>>> /var/lib/ceph/tmp/mnt.aWZTcE/journal: (22) Invalid argument
>>> 2015-12-16 07:35:33.289842 7f1b990ad800 -1 OSD::mkfs: ObjectStore::mkfs 
>>> failed with error -22
>>> 2015-12-16 07:35:33.289883 7f1b990ad800 -1  ** ERROR: error creating empty 
>>> object store in /var/lib/ceph/tmp/mnt.aWZTcE: (22) Invalid argument
>>> ERROR:ceph-disk:Failed to activate
>>> ceph-disk: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', 
>>> '--mkkey', '-i', '7', '--monmap', 
>>> '/var/lib/ceph/tmp/mnt.aWZTcE/activate.monmap', '--osd-data', 
>>> '/var/lib/ceph/tmp/mnt.aWZTcE', '--osd-journal', 
>>> 

Re: [ceph-users] Journal symlink broken / Ceph 0.94.5 / CentOS 6.7

2015-12-17 Thread Loic Dachary
I guess that's the problem you need to solve: why /dev/sdc does not generate 
udev events (different driver than /dev/sda maybe?). Once it does, Ceph should 
work.

A workaround could be to add something like:

ceph-disk-udev 3 sdc3 sdc
ceph-disk-udev 4 sdc4 sdc

in /etc/rc.local.
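
To see whether udev fires for /dev/sdc at all, something like this may
help (a sketch, assuming udevadm is available, which it should be on
CentOS 6.7):

  # watch incoming events in one terminal
  udevadm monitor --kernel --udev
  # and replay the add event for the journal partition in another
  echo add > /sys/block/sdc/sdc3/uevent

If no event shows up for sdc3, the problem is below udev (driver/kernel);
if the event shows up but the symlink is still not created, the rules are
not being applied to that device.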

On 17/12/2015 12:01, Jesper Thorhauge wrote:
> Nope, the previous post contained all that was in the boot.log :-(
> 
> /Jesper
> 
> **
> 
> - Den 17. dec 2015, kl. 11:53, Loic Dachary  skrev:
> 
> On 17/12/2015 11:33, Jesper Thorhauge wrote:
>> Hi Loic,
>>
>> Sounds like something does go wrong when /dev/sdc3 shows up. Is there anyway 
>> i can debug this further? Log-files? Modify the .rules file...?
> 
> Do you see traces of what happens when /dev/sdc3 shows up in boot.log ?
> 
>>
>> /Jesper
>>
>> 
>>
>> The non-symlink files in /dev/disk/by-partuuid come to existence because of:
>>
>> * system boots
>> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1
>>   * ceph-disk-udev creates the symlink 
>> /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f -> ../../sdb1
>>   * ceph-disk activate /dev/sda1 is mounted and finds a symlink to the 
>> journal journal -> 
>> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 which does not 
>> yet exists because /dev/sdc udev rules have not been run yet
>>   * ceph-osd opens the journal in write mode and that creates the file 
>> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 as a regular file
>>   * the file is empty and the osd fails to activate with the error you see 
>> (EINVAL because the file is empty)
>>
>> This is ok, supported and expected since there is no way to know which disk 
>> will show up first.
>>
>> When /dev/sdc shows up, the same logic will be triggered:
>>
>> * udev rule calls ceph-disk-udev via 95-ceph-osd.rules on /dev/sda1
>>   * ceph-disk-udev creates the symlink 
>> /dev/disk/by-partuuid/1e9d527f-0866-4284-b77c-c1cb04c5a168 -> ../../sdc3 
>> (overriding the file because ln -sf)
>>   * ceph-disk activate-journal /dev/sdc3 finds that 
>> c83b5aa5-fe77-42f6-9415-25ca0266fb7f is the data partition for that journal 
>> and mounts /dev/disk/by-partuuid/c83b5aa5-fe77-42f6-9415-25ca0266fb7f
>>   * ceph-osd opens the journal and all is well
>>
>> Except something goes wrong in your case, presumably because ceph-disk-udev 
>> is not called when /dev/sdc3 shows up ?
>>
>> On 17/12/2015 08:29, Jesper Thorhauge wrote:
>>> Hi Loic,
>>>
>>> osd's are on /dev/sda and /dev/sdb, journal's is on /dev/sdc (sdc3 / sdc4).
>>>
>>> sgdisk for sda shows;
>>>
>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>> Partition unique GUID: E85F4D92-C8F1-4591-BD2A-AA43B80F58F6
>>> First sector: 2048 (at 1024.0 KiB)
>>> Last sector: 1953525134 (at 931.5 GiB)
>>> Partition size: 1953523087 sectors (931.5 GiB)
>>> Attribute flags: 
>>> Partition name: 'ceph data'
>>>
>>> for sdb
>>>
>>> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown)
>>> Partition unique GUID: C83B5AA5-FE77-42F6-9415-25CA0266FB7F
>>> First sector: 2048 (at 1024.0 KiB)
>>> Last sector: 1953525134 (at 931.5 GiB)
>>> Partition size: 1953523087 sectors (931.5 GiB)
>>> Attribute flags: 
>>> Partition name: 'ceph data'
>>>
>>> for /dev/sdc3
>>>
>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>> Partition unique GUID: C34D4694-B486-450D-B57F-DA24255F0072
>>> First sector: 935813120 (at 446.2 GiB)
>>> Last sector: 956293119 (at 456.0 GiB)
>>> Partition size: 2048 sectors (9.8 GiB)
>>> Attribute flags: 
>>> Partition name: 'ceph journal'
>>>
>>> for /dev/sdc4
>>>
>>> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
>>> Partition unique GUID: 1E9D527F-0866-4284-B77C-C1CB04C5A168
>>> First sector: 956293120 (at 456.0 GiB)
>>> Last sector: 976773119 (at 465.8 GiB)
>>> Partition size: 2048 sectors (9.8 GiB)
>>> Attribute flags: 
>>> Partition name: 'ceph journal'
>>>
>>> 60-ceph-partuuid-workaround.rules is located in /lib/udev/rules.d, so it 
>>> seems correct to me.
>>>
>>> after a reboot, /dev/disk/by-partuuid is;
>>>
>>> -rw-r--r-- 1 root root  0 Dec 16 07:35 1e9d527f-0866-4284-b77c-c1cb04c5a168
>>> -rw-r--r-- 1 root root  0 Dec 16 07:35 c34d4694-b486-450d-b57f-da24255f0072
>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 c83b5aa5-fe77-42f6-9415-25ca0266fb7f 
>>> -> ../../sdb1
>>> lrwxrwxrwx 1 root root 10 Dec 16 07:35 e85f4d92-c8f1-4591-bd2a-aa43b80f58f6 
>>> -> ../../sda1
>>>
>>> i dont know how to verify the symlink of the journal file - can you guide 
>>> me on that one?
>>>
>>> Thank :-) !
>>>
>>> /Jesper
>>>
>>> **
>>>
>>> Hi,
>>>
>>> On 17/12/2015 07:53, Jesper Thorhauge wrote:
 Hi,

 Some more information showing in the boot.log;

 2015-12-16 07:35:33.289830 7f1b990ad800 -1 
 filestore(/var/lib/ceph/tmp/mnt.aWZTcE) mkjournal error creating journal 
 on 

[ceph-users] cephfs, low performances

2015-12-17 Thread Francois Lafont
Hi,

I have a ceph cluster which is currently unused and I have (to my mind) very low
performance. I'm not an expert in benchmarks; here is an example of a quick bench:

---
# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 
--name=readwrite --filename=rw.data --bs=4k --iodepth=64 --size=300MB 
--readwrite=randrw --rwmixread=50
readwrite: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.1.3
Starting 1 process
readwrite: Laying out IO file(s) (1 file(s) / 300MB)
Jobs: 1 (f=1): [m] [100.0% done] [2264KB/2128KB/0KB /s] [566/532/0 iops] [eta 
00m:00s]
readwrite: (groupid=0, jobs=1): err= 0: pid=3783: Fri Dec 18 02:01:13 2015
  read : io=153640KB, bw=2302.9KB/s, iops=575, runt= 66719msec
  write: io=153560KB, bw=2301.7KB/s, iops=575, runt= 66719msec
  cpu  : usr=0.77%, sys=3.07%, ctx=115432, majf=0, minf=604
  IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
 issued: total=r=38410/w=38390/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=153640KB, aggrb=2302KB/s, minb=2302KB/s, maxb=2302KB/s, 
mint=66719msec, maxt=66719msec
  WRITE: io=153560KB, aggrb=2301KB/s, minb=2301KB/s, maxb=2301KB/s, 
mint=66719msec, maxt=66719msec
---

It seems very bad to me. Can I hope for better results with my setup (explained
below)? During the bench, I don't see any particular symptoms (no CPU blocked at
100%, etc.). If you have advice on improving the performance and/or on making
smarter benchmarks, I'm really interested.

Thanks in advance for your help. Here is my conf...

I use Ubuntu 14.04 on each server with the 3.13 kernel (the same on the ceph
client where I run my bench) and I use Ceph 9.2.0 (Infernalis).
On the client, cephfs is mounted via cephfs-fuse with this in /etc/fstab:

id=cephfs,keyring=/etc/ceph/ceph.client.cephfs.keyring,client_mountpoint=/  
/mnt/cephfs fuse.ceph   noatime,defaults,_netdev0   0

I have 5 cluster node servers "Supermicro Motherboard X10SLM+-LN4 S1150" with
one 1GbE port for the ceph public network and one 10GbE port for the ceph 
private
network:

- 1 x Intel Xeon E3-1265Lv3
- 1 SSD DC S3710 Series 200GB (with partitions for the OS and the 3 OSD
  journals; on ceph01, ceph02 and ceph03 the SSD also contains a partition
  for the working directory of a monitor)
- 3 HD 4TB Western Digital (WD) SATA 7200rpm
- RAM 32GB
- No RAID controller
- Each partition uses XFS with the noatime option, except the OS partition,
  which is EXT4.

Here is my ceph.conf :

---
[global]
  fsid   = 
  cluster network= 192.168.22.0/24
  public network = 10.0.2.0/24
  auth cluster required  = cephx
  auth service required  = cephx
  auth client required   = cephx
  filestore xattr use omap   = true
  osd pool default size  = 3
  osd pool default min size  = 1
  osd pool default pg num= 64
  osd pool default pgp num   = 64
  osd crush chooseleaf type  = 1
  osd journal size   = 0
  osd max backfills  = 1
  osd recovery max active= 1
  osd client op priority = 63
  osd recovery op priority   = 1
  osd op threads = 4
  mds cache size = 100
  osd scrub begin hour   = 3
  osd scrub end hour = 5
  mon allow pool delete  = false
  mon osd down out subtree limit = host
  mon osd min down reporters = 4

[mon.ceph01]
  host = ceph01
  mon addr = 10.0.2.101

[mon.ceph02]
  host = ceph02
  mon addr = 10.0.2.102

[mon.ceph03]
  host = ceph03
  mon addr = 10.0.2.103
---

mds are in active/standby mode.

-- 
François Lafont
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com