Re: [ceph-users] radosgw multi site different period

2017-11-14 Thread Kim-Norman Sahm
both clusters are in the same epoch and period:
 
root@ceph-a-1:~# radosgw-admin period get-current 
{
"current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

root@ceph-b-1:~# radosgw-admin period get-current 
{
"current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

but the sync state is still "master is on a different period":

root@ceph-b-1:~# radosgw-admin sync status
  realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
  zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
   zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
  metadata sync syncing
full sync: 0/64 shards
master is on a different period:
master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569
local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
incremental sync: 64/64 shards
metadata is caught up with master
  data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
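
As a sketch of what I plan to try next (the master endpoint and the system
user credentials below are placeholders, and the service name may differ):
compare the full period output, including the epoch, and re-pull the period
on the secondary.

# on both sites: compare the period id *and* its epoch
radosgw-admin period get | grep -E '"id"|"epoch"'

# on the secondary zone: pull the period from the master again, then restart
# the gateway so it picks up the new local period
radosgw-admin period pull --url=http://rgw-master:7480 \
    --access-key=<system-access-key> --secret=<system-secret>
systemctl restart ceph-radosgw@rgw.$(hostname -s)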


On Tuesday, 14.11.2017, 18:21 +0100, Kim-Norman Sahm wrote:
> both clusters are in the same epoch and period:
> 
> root@ceph-a-1:~# radosgw-admin period get-current 
> {
> "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
> }
> 
> root@ceph-b-1:~# radosgw-admin period get-current 
> {
> "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
> }
> 
On Tuesday, 14.11.2017, 17:05, David Turner wrote:
> > 
> > I'm assuming you've looked at the period in both places `radosgw-
> > admin period get` and confirmed that the second site is behind the
> > master site (based on epochs).  I'm also assuming (since you linked
> > the instructions) that you've done `radosgw-admin period pull` on
> > the
> > second site to get any period updates that have been done to the
> > master site.
> > 
> > If my assumptions are wrong, then you should do those things.  If my
> > assumptions are correct, then running `radosgw-admin period update
> > --commit` on the master site and `radosgw-admin period pull` on the
> > second site might fix this.  If you've already done that as well (as
> > they're steps in the article you linked), then you need someone
> > smarter than I am to chime in.
> > 
> > On Tue, Nov 14, 2017 at 11:35 AM Kim-Norman Sahm wrote:
> > > 
> > > hi,
> > > 
> > > I've installed a Ceph multi-site setup with two Ceph clusters and
> > > one radosgw each.
> > > The multi-site setup was in sync, so I tried a failover:
> > > cluster A goes down and I changed zone (b) on cluster B to
> > > the new master zone.
> > > It's working fine.
> > >
> > > Now I start cluster A and try to switch the master zone back to A.
> > > Cluster A believes that it is the master and cluster B is secondary,
> > > but the secondary is on a different period and the bucket delta is
> > > not synced to the new master zone:
> > > 
> > > root@ceph-a-1:~# radosgw-admin sync status
> > >           realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
> > >   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
> > >    zone 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
> > >   metadata sync no sync (zone is master)
> > >   data sync source: 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
> > > syncing
> > > full sync: 0/128 shards
> > > incremental sync: 128/128 shards
> > > data is caught up with source
> > > 
> > > root@ceph-b-1:~# radosgw-admin sync status
> > >   realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
> > >   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
> > >    zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
> > >   metadata sync syncing
> > > full sync: 0/64 shards
> > > master is on a different period:
> > > master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569
> > > local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
> > > incremental sync: 64/64 shards
> > > metadata is caught up with master
> > >   data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
> > > syncing
> > > full sync: 0/128 shards
> > > incremental sync: 128/128 shards
> > > data is caught up with source
> > > 
> > > How can I force sync the period and the bucket deltas?
> > > I've used this howto: http://docs.ceph.com/docs/master/radosgw/multisite/
> > > 
> > > br Kim
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] CephFS | Mounting Second CephFS

2017-11-14 Thread Geoffrey Rhodes
Hi,

When running more than one CephFS, how would I specify which file system I
want to mount with ceph-fuse or the kernel client?

OS: Ubuntu 16.04.3 LTS
Ceph version: 12.2.1 - Luminous
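
For reference, a minimal sketch of what I expect to need, assuming a second
file system named "cephfs2" and the usual admin keyring (the option names are
my best guess, please correct me if they are wrong):

# ceph-fuse: select the file system with the client_mds_namespace option
ceph-fuse --client_mds_namespace=cephfs2 /mnt/cephfs2

# kernel client: use the mds_namespace mount option (kernel 4.8 or newer)
mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs2 \
    -o name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=cephfs2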


Kind regards
Geoffrey Rhodes
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] 10.2.10: "default" zonegroup in custom root pool not found

2017-11-14 Thread Richard Chan
After creating a non-default root pool
rgw_realm_root_pool = gold.rgw.root
rgw_zonegroup_root_pool = gold.rgw.root
rgw_period_root_pool = gold.rgw.root
rgw_zone_root_pool = gold.rgw.root
rgw_region = gold.rgw.root

radosgw-admin realm create --rgw-realm gold --default
radosgw-admin zonegroup create --rgw-zonegroup=us  --default --master
--endpoints http://rgw:7480

The "default" is not respected anymore:


radosgw-admin period update --commit

2017-11-15 04:50:42.400404 7f694dd4e9c0  0 failed reading zonegroup info:
ret -2 (2) No such file or directory
couldn't init storage provider


I now have to specify --rgw-zonegroup=us on the command line or in /etc/ceph/ceph.conf.

This seems to be a regression.
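
A workaround sketch I am considering, assuming the zonegroup metadata really
does live in gold.rgw.root (the zone name is a placeholder):

radosgw-admin zonegroup default --rgw-zonegroup=us
radosgw-admin zone default --rgw-zone=<zone-name>
radosgw-admin period update --commit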




-- 
Richard Chan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Why keep old epochs?

2017-11-14 Thread Bryan Henderson
Some questions about maps and epochs:

I see that I can control the minimum number of osdmap epochs to keep with
"mon min osdmap epoch".  Why do I care?  Why would I want any but the current
osdmap, and why would the system keep more than my minimum?

Similarly, "mon max pgmap epoch" controls the _maximum_ number of pgmap epochs
to keep around.  I believe I need more than the most recent pgmap because I
need to keep previous ones until all PGs that were placed according to that
pgmap have migrated to where the current pgmap says they should be.  But do I
need more epochs than that, and what happens if the maximum I set is too low
to cover those necessary old pgmaps?
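
For context, this is how I have been inspecting the current values on a
monitor; the option names with the trailing "s" are my guess at the real
setting names:

ceph daemon mon.$(hostname -s) config show | grep -E 'mon_min_osdmap_epochs|mon_max_pgmap_epochs'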

-- 
Bryan Henderson   San Jose, California
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread David Turner
I'd probably say 50GB to leave some extra space over-provisioned.  50GB
should definitely prevent any DB operations from spilling over to the HDD.
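
As a sketch, assuming the OSDs get redeployed with ceph-disk/ceph-volume
afterwards (the option is only honoured when an OSD is created, not
retroactively), and using the 50GB figure above:

[osd]
# 50 GiB expressed in bytes
bluestore_block_db_size = 53687091200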

On Tue, Nov 14, 2017, 5:43 PM Milanov, Radoslav Nikiforov 
wrote:

> Thank you,
>
> It is 4TB OSDs and they might become full someday, I’ll try 60GB db
> partition – this is the max OSD capacity.
>
>
>
> - Rado
>
>
>
> *From:* David Turner [mailto:drakonst...@gmail.com]
> *Sent:* Tuesday, November 14, 2017 5:38 PM
>
>
> *To:* Milanov, Radoslav Nikiforov 
>
> *Cc:* Mark Nelson ; ceph-users@lists.ceph.com
>
>
> *Subject:* Re: [ceph-users] Bluestore performance 50% of filestore
>
>
>
> You have to configure the size of the db partition in the config file for
> the cluster.  If your db partition is 1GB, then I can all but guarantee
> that you're using your HDD for your blocks.db very quickly into your
> testing.  There have been multiple threads recently about what size the db
> partition should be and it seems to be based on how many objects your OSD
> is likely to have on it.  The recommendation has been to err on the side of
> bigger.  If you're running 10TB OSDs and anticipate filling them up, then
> you probably want closer to an 80GB+ db partition.  That's why I asked how
> full your cluster was and how large your HDDs are.
>
>
>
> Here's a link to one of the recent ML threads on this topic.
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020822.html
>
> On Tue, Nov 14, 2017 at 4:44 PM Milanov, Radoslav Nikiforov 
> wrote:
>
> Block-db partition is the default 1GB (is there a way to modify this?
> journals are 5GB in filestore case) and usage is low:
>
>
>
> [root@kumo-ceph02 ~]# ceph df
>
> GLOBAL:
>
> SIZE     AVAIL   RAW USED  %RAW USED
> 100602G  99146G     1455G       1.45
>
> POOLS:
> NAME           ID  USED    %USED  MAX AVAIL  OBJECTS
> kumo-vms        1  19757M   0.02     31147G     5067
> kumo-volumes    2    214G   0.18     31147G    55248
> kumo-images     3    203G   0.17     31147G    66486
> kumo-vms3      11  45824M   0.04     31147G    11643
> kumo-volumes3  13  10837M      0     31147G     2724
> kumo-images3   15  82450M   0.09     31147G    10320
>
>
>
> - Rado
>
>
>
> *From:* David Turner [mailto:drakonst...@gmail.com]
> *Sent:* Tuesday, November 14, 2017 4:40 PM
> *To:* Mark Nelson 
> *Cc:* Milanov, Radoslav Nikiforov ;
> ceph-users@lists.ceph.com
>
>
> *Subject:* Re: [ceph-users] Bluestore performance 50% of filestore
>
>
>
> How big was your blocks.db partition for each OSD and what size are your
> HDDs?  Also how full is your cluster?  It's possible that your blocks.db
> partition wasn't large enough to hold the entire db and it had to spill
> over onto the HDD which would definitely impact performance.
>
>
>
> On Tue, Nov 14, 2017 at 4:36 PM Mark Nelson  wrote:
>
> How big were the writes in the windows test and how much concurrency was
> there?
>
> Historically bluestore does pretty well for us with small random writes
> so your write results surprise me a bit.  I suspect it's the low queue
> depth.  Sometimes bluestore does worse with reads, especially if
> readahead isn't enabled on the client.
>
> Mark
>
> On 11/14/2017 03:14 PM, Milanov, Radoslav Nikiforov wrote:
> > Hi Mark,
> > Yes RBD is in write back, and the only thing that changed was converting
> OSDs to bluestore. It is 7200 rpm drives and triple replication. I also get
> same results (bluestore 2 times slower) testing continuous writes on a 40GB
> partition on a Windows VM, completely different tool.
> >
> > Right now I'm going back to filestore for the OSDs so additional tests
> are possible if that helps.
> >
> > - Rado
> >
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Mark Nelson
> > Sent: Tuesday, November 14, 2017 4:04 PM
> > To: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Bluestore performance 50% of filestore
> >
> > Hi Radoslav,
> >
> > Is RBD cache enabled and in writeback mode?  Do you have client side
> readahead?
> >
> > Both are doing better for writes than you'd expect from the native
> performance of the disks assuming they are typical 7200RPM drives and you
> are using 3X replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small
> file size, I'd expect that you might be getting better journal coalescing
> in filestore.
> >
> > Sadly I imagine you can't do a comparison test at this point, but I'd be
> curious how it would look if you used libaio with a high iodepth and a much
> bigger partition to do random writes over.
> >
> > Mark
> >
> > On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:
> >> Hi
> >>
> >> We have 3 node, 27 OSDs cluster running Luminous 12.2.1
> >>
> >> In filestore configuration there are 3 SSDs used for journals of 9
> >> OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).

Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread Milanov, Radoslav Nikiforov
Thank you,
It is 4TB OSDs and they might become full someday, I’ll try 60GB db partition – 
this is the max OSD capacity.

- Rado

From: David Turner [mailto:drakonst...@gmail.com]
Sent: Tuesday, November 14, 2017 5:38 PM
To: Milanov, Radoslav Nikiforov 
Cc: Mark Nelson ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Bluestore performance 50% of filestore

You have to configure the size of the db partition in the config file for the 
cluster.  If your db partition is 1GB, then I can all but guarantee that 
you're using your HDD for your blocks.db very quickly into your testing.  There 
have been multiple threads recently about what size the db partition should be 
and it seems to be based on how many objects your OSD is likely to have on it.  
The recommendation has been to err on the side of bigger.  If you're running 
10TB OSDs and anticipate filling them up, then you probably want closer to an 
80GB+ db partition.  That's why I asked how full your cluster was and how large 
your HDDs are.

Here's a link to one of the recent ML threads on this topic.  
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020822.html
On Tue, Nov 14, 2017 at 4:44 PM Milanov, Radoslav Nikiforov 
mailto:rad...@bu.edu>> wrote:
Block-db partition is the default 1GB (is there a way to modify this? journals 
are 5GB in filestore case) and usage is low:

[root@kumo-ceph02 ~]# ceph df
GLOBAL:
SIZE     AVAIL   RAW USED  %RAW USED
100602G  99146G     1455G       1.45
POOLS:
NAME           ID  USED    %USED  MAX AVAIL  OBJECTS
kumo-vms        1  19757M   0.02     31147G     5067
kumo-volumes    2    214G   0.18     31147G    55248
kumo-images     3    203G   0.17     31147G    66486
kumo-vms3      11  45824M   0.04     31147G    11643
kumo-volumes3  13  10837M      0     31147G     2724
kumo-images3   15  82450M   0.09     31147G    10320

- Rado

From: David Turner [mailto:drakonst...@gmail.com]
Sent: Tuesday, November 14, 2017 4:40 PM
To: Mark Nelson mailto:mnel...@redhat.com>>
Cc: Milanov, Radoslav Nikiforov mailto:rad...@bu.edu>>; 
ceph-users@lists.ceph.com

Subject: Re: [ceph-users] Bluestore performance 50% of filestore

How big was your blocks.db partition for each OSD and what size are your HDDs?  
Also how full is your cluster?  It's possible that your blocks.db partition 
wasn't large enough to hold the entire db and it had to spill over onto the HDD 
which would definitely impact performance.

On Tue, Nov 14, 2017 at 4:36 PM Mark Nelson 
mailto:mnel...@redhat.com>> wrote:
How big were the writes in the windows test and how much concurrency was
there?

Historically bluestore does pretty well for us with small random writes
so your write results surprise me a bit.  I suspect it's the low queue
depth.  Sometimes bluestore does worse with reads, especially if
readahead isn't enabled on the client.

Mark

On 11/14/2017 03:14 PM, Milanov, Radoslav Nikiforov wrote:
> Hi Mark,
> Yes RBD is in write back, and the only thing that changed was converting OSDs 
> to bluestore. It is 7200 rpm drives and triple replication. I also get same 
> results (bluestore 2 times slower) testing continuous writes on a 40GB 
> partition on a Windows VM, completely different tool.
>
> Right now I'm going back to filestore for the OSDs so additional tests are 
> possible if that helps.
>
> - Rado
>
> -Original Message-
> From: ceph-users 
> [mailto:ceph-users-boun...@lists.ceph.com]
>  On Behalf Of Mark Nelson
> Sent: Tuesday, November 14, 2017 4:04 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Bluestore performance 50% of filestore
>
> Hi Radoslav,
>
> Is RBD cache enabled and in writeback mode?  Do you have client side 
> readahead?
>
> Both are doing better for writes than you'd expect from the native 
> performance of the disks assuming they are typical 7200RPM drives and you are 
> using 3X replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small file 
> size, I'd expect that you might be getting better journal coalescing in 
> filestore.
>
> Sadly I imagine you can't do a comparison test at this point, but I'd be 
> curious how it would look if you used libaio with a high iodepth and a much 
> bigger partition to do random writes over.
>
> Mark
>
> On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:
>> Hi
>>
>> We have 3 node, 27 OSDs cluster running Luminous 12.2.1
>>
>> In filestore configuration there are 3 SSDs used for journals of 9
>> OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).
>>
>> I've converted filestore to bluestore by wiping 1 host at a time and
>> waiting for recovery. SSDs now contain block-db - again one SSD
>> serving
>> 3 OSDs.
>>
>>
>>
>> Cluster is used as storage for Openstack.

Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread David Turner
You have to configure the size of the db partition in the config file for
the cluster.  If your db partition is 1GB, then I can all but guarantee
that you're using your HDD for your blocks.db very quickly into your
testing.  There have been multiple threads recently about what size the db
partition should be and it seems to be based on how many objects your OSD
is likely to have on it.  The recommendation has been to err on the side of
bigger.  If you're running 10TB OSDs and anticipate filling them up, then
you probably want closer to an 80GB+ db partition.  That's why I asked how
full your cluster was and how large your HDDs are.

Here's a link to one of the recent ML threads on this topic.
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020822.html
On Tue, Nov 14, 2017 at 4:44 PM Milanov, Radoslav Nikiforov 
wrote:

> Block-db partition is the default 1GB (is there a way to modify this?
> journals are 5GB in filestore case) and usage is low:
>
>
>
> [root@kumo-ceph02 ~]# ceph df
>
> GLOBAL:
>
> SIZE     AVAIL   RAW USED  %RAW USED
> 100602G  99146G     1455G       1.45
>
> POOLS:
> NAME           ID  USED    %USED  MAX AVAIL  OBJECTS
> kumo-vms        1  19757M   0.02     31147G     5067
> kumo-volumes    2    214G   0.18     31147G    55248
> kumo-images     3    203G   0.17     31147G    66486
> kumo-vms3      11  45824M   0.04     31147G    11643
> kumo-volumes3  13  10837M      0     31147G     2724
> kumo-images3   15  82450M   0.09     31147G    10320
>
>
>
> - Rado
>
>
>
> *From:* David Turner [mailto:drakonst...@gmail.com]
> *Sent:* Tuesday, November 14, 2017 4:40 PM
> *To:* Mark Nelson 
> *Cc:* Milanov, Radoslav Nikiforov ;
> ceph-users@lists.ceph.com
>
>
> *Subject:* Re: [ceph-users] Bluestore performance 50% of filestore
>
>
>
> How big was your blocks.db partition for each OSD and what size are your
> HDDs?  Also how full is your cluster?  It's possible that your blocks.db
> partition wasn't large enough to hold the entire db and it had to spill
> over onto the HDD which would definitely impact performance.
>
>
>
> On Tue, Nov 14, 2017 at 4:36 PM Mark Nelson  wrote:
>
> How big were the writes in the windows test and how much concurrency was
> there?
>
> Historically bluestore does pretty well for us with small random writes
> so your write results surprise me a bit.  I suspect it's the low queue
> depth.  Sometimes bluestore does worse with reads, especially if
> readahead isn't enabled on the client.
>
> Mark
>
> On 11/14/2017 03:14 PM, Milanov, Radoslav Nikiforov wrote:
> > Hi Mark,
> > Yes RBD is in write back, and the only thing that changed was converting
> OSDs to bluestore. It is 7200 rpm drives and triple replication. I also get
> same results (bluestore 2 times slower) testing continuous writes on a 40GB
> partition on a Windows VM, completely different tool.
> >
> > Right now I'm going back to filestore for the OSDs so additional tests
> are possible if that helps.
> >
> > - Rado
> >
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Mark Nelson
> > Sent: Tuesday, November 14, 2017 4:04 PM
> > To: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Bluestore performance 50% of filestore
> >
> > Hi Radoslav,
> >
> > Is RBD cache enabled and in writeback mode?  Do you have client side
> readahead?
> >
> > Both are doing better for writes than you'd expect from the native
> performance of the disks assuming they are typical 7200RPM drives and you
> are using 3X replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small
> file size, I'd expect that you might be getting better journal coalescing
> in filestore.
> >
> > Sadly I imagine you can't do a comparison test at this point, but I'd be
> curious how it would look if you used libaio with a high iodepth and a much
> bigger partition to do random writes over.
> >
> > Mark
> >
> > On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:
> >> Hi
> >>
> >> We have 3 node, 27 OSDs cluster running Luminous 12.2.1
> >>
> >> In filestore configuration there are 3 SSDs used for journals of 9
> >> OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).
> >>
> >> I've converted filestore to bluestore by wiping 1 host at a time and
> >> waiting for recovery. SSDs now contain block-db - again one SSD
> >> serving
> >> 3 OSDs.
> >>
> >>
> >>
> >> Cluster is used as storage for Openstack.
> >>
> >> Running fio on a VM in that Openstack reveals bluestore performance
> >> almost twice slower than filestore.
> >>
> >> fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G
> >> --numjobs=2 --time_based --runtime=180 --group_reporting
> >>
> >> fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G
> >> --numjobs=2 --time_based --runtime=180 --group_reporting
> 

Re: [ceph-users] S3/Swift :: Pools Ceph

2017-11-14 Thread David Turner
While you can configure 1 pool to be used for RBD and Object storage, I
believe that is being deprecated and can cause unforeseen problems in the
future.  It is definitely not a recommended or common use case.

On Tue, Nov 14, 2017 at 4:51 PM Christian Wuerdig <
christian.wuer...@gmail.com> wrote:

> As per documentation: http://docs.ceph.com/docs/luminous/radosgw/
> "The S3 and Swift APIs share a common namespace, so you may write data
> with one API and retrieve it with the other."
> So you can access one pool through both APIs and the data will be
> available via both.
>
>
> On Wed, Nov 15, 2017 at 7:52 AM, Osama Hasebou 
> wrote:
> > Hi Everyone,
> >
> > I was wondering, has anyone tried in a Test/Production environment, to
> have
> > 1 pool, to which you can input/output data using S3 and Swift, or would
> each
> > need a separate pool, one to serve via S3 and one to serve via Swift ?
> >
> > Also, I believe you can use 1 pool for RBD and Object storage as well,
> or is
> > that false ?
> >
> > Thank you!
> >
> > Regards,
> > Ossi
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] S3/Swift :: Pools Ceph

2017-11-14 Thread Christian Wuerdig
As per documentation: http://docs.ceph.com/docs/luminous/radosgw/
"The S3 and Swift APIs share a common namespace, so you may write data
with one API and retrieve it with the other."
So you can access one pool through both APIs and the data will be
available via both.
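
As a sketch, assuming a fresh test user (names are placeholders), the same
user can be given both S3 keys and a Swift subuser and then read back the
same objects through either API:

radosgw-admin user create --uid=demo --display-name="Demo User"
radosgw-admin subuser create --uid=demo --subuser=demo:swift --access=full
radosgw-admin key create --subuser=demo:swift --key-type=swift --gen-secret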


On Wed, Nov 15, 2017 at 7:52 AM, Osama Hasebou  wrote:
> Hi Everyone,
>
> I was wondering, has anyone tried in a Test/Production environment, to have
> 1 pool, to which you can input/output data using S3 and Swift, or would each
> need a separate pool, one to serve via S3 and one to serve via Swift ?
>
> Also, I believe you can use 1 pool for RBD and Object storage as well, or is
> that false ?
>
> Thank you!
>
> Regards,
> Ossi
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Incorrect pool usage statistics

2017-11-14 Thread David Turner
If you know that the pool should be empty, there wouldn't be a problem with
piping the output of `rados ls` to `rados rm`.  By the same notion, if
nothing in the pool is needed you can delete the pool and create a new one
that will be perfectly empty.
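
As a sketch, assuming the pool is called 'ecpool' and truly contains nothing
you need (double-check before running anything destructive):

rados -p ecpool ls | while read -r obj; do rados -p ecpool rm "$obj"; done

# or recreate the pool entirely (on Luminous this may also require
# mon_allow_pool_delete = true):
ceph osd pool delete ecpool ecpool --yes-i-really-really-mean-it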

On Tue, Nov 14, 2017 at 3:23 PM Karun Josy  wrote:

> Help?!
>
> There seem to be many objects still present in the pool:
> -
> $ rados df
> POOL_NAME  USED    OBJECTS CLONES  COPIES  MISSING_ON_PRIMARY UNFOUND DEGRADED    RD_OPS    RD    WR_OPS     WR
> vm          886        105      0      315                  0       0        0    943399 1301M     39539 30889M
> ecpool      403G     388652 316701 2720564                  0       0        0 156972536 1081G 203383441  4074G
> imagepool   89014M    22485      0    67455                  0       0        0   7856029  708G  13140767   602G
> template    115G      29848     43   149240                  0       0        0  66138389 2955G   1123900   539G
>
>
> Karun Josy
>
> On Tue, Nov 14, 2017 at 4:16 AM, Karun Josy  wrote:
>
>> Hello,
>>
>> Recently, I deleted all the disks from an erasure pool 'ecpool'.
>> The pool is empty. However the space usage shows around 400GB.
>> What might be wrong?
>>
>>
>> $ rbd ls -l ecpool
>> $ ceph df
>>
>> GLOBAL:
>> SIZE   AVAIL  RAW USED %RAW USED
>> 19019G 16796G2223G 11.69
>> POOLS:
>> NAMEID USED   %USED MAX AVAIL OBJECTS
>> template 1227G  1.59 2810G   58549
>> vm 21  0 0 4684G   2
>> ecpool  33   403G  2.7910038G  388652
>> imagepool   34 90430M  0.62 4684G   22789
>>
>>
>>
>> Karun Josy
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread Milanov, Radoslav Nikiforov
Block-db partition is the default 1GB (is there a way to modify this? journals 
are 5GB in filestore case) and usage is low:

[root@kumo-ceph02 ~]# ceph df
GLOBAL:
SIZE     AVAIL   RAW USED  %RAW USED
100602G  99146G     1455G       1.45
POOLS:
NAME           ID  USED    %USED  MAX AVAIL  OBJECTS
kumo-vms        1  19757M   0.02     31147G     5067
kumo-volumes    2    214G   0.18     31147G    55248
kumo-images     3    203G   0.17     31147G    66486
kumo-vms3      11  45824M   0.04     31147G    11643
kumo-volumes3  13  10837M      0     31147G     2724
kumo-images3   15  82450M   0.09     31147G    10320

- Rado

From: David Turner [mailto:drakonst...@gmail.com]
Sent: Tuesday, November 14, 2017 4:40 PM
To: Mark Nelson 
Cc: Milanov, Radoslav Nikiforov ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Bluestore performance 50% of filestore

How big was your blocks.db partition for each OSD and what size are your HDDs?  
Also how full is your cluster?  It's possible that your blocks.db partition 
wasn't large enough to hold the entire db and it had to spill over onto the HDD 
which would definitely impact performance.

On Tue, Nov 14, 2017 at 4:36 PM Mark Nelson 
mailto:mnel...@redhat.com>> wrote:
How big were the writes in the windows test and how much concurrency was
there?

Historically bluestore does pretty well for us with small random writes
so your write results surprise me a bit.  I suspect it's the low queue
depth.  Sometimes bluestore does worse with reads, especially if
readahead isn't enabled on the client.

Mark

On 11/14/2017 03:14 PM, Milanov, Radoslav Nikiforov wrote:
> Hi Mark,
> Yes RBD is in write back, and the only thing that changed was converting OSDs 
> to bluestore. It is 7200 rpm drives and triple replication. I also get same 
> results (bluestore 2 times slower) testing continuous writes on a 40GB 
> partition on a Windows VM, completely different tool.
>
> Right now I'm going back to filestore for the OSDs so additional tests are 
> possible if that helps.
>
> - Rado
>
> -Original Message-
> From: ceph-users 
> [mailto:ceph-users-boun...@lists.ceph.com]
>  On Behalf Of Mark Nelson
> Sent: Tuesday, November 14, 2017 4:04 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Bluestore performance 50% of filestore
>
> Hi Radoslav,
>
> Is RBD cache enabled and in writeback mode?  Do you have client side 
> readahead?
>
> Both are doing better for writes than you'd expect from the native 
> performance of the disks assuming they are typical 7200RPM drives and you are 
> using 3X replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small file 
> size, I'd expect that you might be getting better journal coalescing in 
> filestore.
>
> Sadly I imagine you can't do a comparison test at this point, but I'd be 
> curious how it would look if you used libaio with a high iodepth and a much 
> bigger partition to do random writes over.
>
> Mark
>
> On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:
>> Hi
>>
>> We have 3 node, 27 OSDs cluster running Luminous 12.2.1
>>
>> In filestore configuration there are 3 SSDs used for journals of 9
>> OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).
>>
>> I've converted filestore to bluestore by wiping 1 host at a time and
>> waiting for recovery. SSDs now contain block-db - again one SSD
>> serving
>> 3 OSDs.
>>
>>
>>
>> Cluster is used as storage for Openstack.
>>
>> Running fio on a VM in that Openstack reveals bluestore performance
>> almost twice slower than filestore.
>>
>> fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G
>> --numjobs=2 --time_based --runtime=180 --group_reporting
>>
>> fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G
>> --numjobs=2 --time_based --runtime=180 --group_reporting
>>
>>
>>
>>
>>
>> Filestore
>>
>>   write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec
>>
>>   write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec
>>
>>   write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec
>>
>>
>>
>>   read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec
>>
>>   read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec
>>
>>   read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec
>>
>>
>>
>> Bluestore
>>
>>   write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec
>>
>>   write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec
>>
>>   write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec
>>
>>
>>
>>   read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec
>>
>>   read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec
>>
>>   read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec
>>
>>
>>
>>
>>
>> - Rado
>>
>>
>>
>>
>>
>> __

Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread Milanov, Radoslav Nikiforov
16 MB blocks, single thread, sequential writes; the results are in the attached image:



[attached image: image001.emz]



- Rado



-Original Message-
From: Mark Nelson [mailto:mnel...@redhat.com]
Sent: Tuesday, November 14, 2017 4:36 PM
To: Milanov, Radoslav Nikiforov ; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Bluestore performance 50% of filestore



How big were the writes in the windows test and how much concurrency was there?



Historically bluestore does pretty well for us with small random writes so your 
write results surprise me a bit.  I suspect it's the low queue depth.  
Sometimes bluestore does worse with reads, especially if readahead isn't 
enabled on the client.



Mark



On 11/14/2017 03:14 PM, Milanov, Radoslav Nikiforov wrote:

> Hi Mark,

> Yes RBD is in write back, and the only thing that changed was converting OSDs 
> to bluestore. It is 7200 rpm drives and triple replication. I also get same 
> results (bluestore 2 times slower) testing continuous writes on a 40GB 
> partition on a Windows VM, completely different tool.

>

> Right now I'm going back to filestore for the OSDs so additional tests are 
> possible if that helps.

>

> - Rado

>

> -Original Message-

> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf

> Of Mark Nelson

> Sent: Tuesday, November 14, 2017 4:04 PM

> To: ceph-users@lists.ceph.com

> Subject: Re: [ceph-users] Bluestore performance 50% of filestore

>

> Hi Radoslav,

>

> Is RBD cache enabled and in writeback mode?  Do you have client side 
> readahead?

>

> Both are doing better for writes than you'd expect from the native 
> performance of the disks assuming they are typical 7200RPM drives and you are 
> using 3X replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small file 
> size, I'd expect that you might be getting better journal coalescing in 
> filestore.

>

> Sadly I imagine you can't do a comparison test at this point, but I'd be 
> curious how it would look if you used libaio with a high iodepth and a much 
> bigger partition to do random writes over.

>

> Mark

>

> On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:

>> Hi

>>

>> We have 3 node, 27 OSDs cluster running Luminous 12.2.1

>>

>> In filestore configuration there are 3 SSDs used for journals of 9

>> OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).

>>

>> I've converted filestore to bluestore by wiping 1 host at a time and

>> waiting for recovery. SSDs now contain block-db - again one SSD

>> serving

>> 3 OSDs.

>>

>>

>>

>> Cluster is used as storage for Openstack.

>>

>> Running fio on a VM in that Openstack reveals bluestore performance

>> almost twice slower than filestore.

>>

>> fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G

>> --numjobs=2 --time_based --runtime=180 --group_reporting

>>

>> fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G

>> --numjobs=2 --time_based --runtime=180 --group_reporting

>>

>>

>>

>>

>>

>> Filestore

>>

>>   write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec

>>

>>   write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec

>>

>>   write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec

>>

>>

>>

>>   read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec

>>

>>   read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec

>>

>>   read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec

>>

>>

>>

>> Bluestore

>>

>>   write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec

>>

>>   write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec

>>

>>   write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec

>>

>>

>>

>>   read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec

>>

>>   read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec

>>

>>   read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec

>>

>>

>>

>>

>>

>> - Rado

>>

>>

>>

>>

>>

>> ___

>> ceph-users mailing list

>> ceph-users@lists.ceph.com

>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>>

> ___

> ceph-users mailing list

> ceph-users@lists.ceph.com

> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread David Turner
How big was your blocks.db partition for each OSD and what size are your
HDDs?  Also how full is your cluster?  It's possible that your blocks.db
partition wasn't large enough to hold the entire db and it had to spill
over onto the HDD which would definitely impact performance.

On Tue, Nov 14, 2017 at 4:36 PM Mark Nelson  wrote:

> How big were the writes in the windows test and how much concurrency was
> there?
>
> Historically bluestore does pretty well for us with small random writes
> so your write results surprise me a bit.  I suspect it's the low queue
> depth.  Sometimes bluestore does worse with reads, especially if
> readahead isn't enabled on the client.
>
> Mark
>
> On 11/14/2017 03:14 PM, Milanov, Radoslav Nikiforov wrote:
> > Hi Mark,
> > Yes RBD is in write back, and the only thing that changed was converting
> OSDs to bluestore. It is 7200 rpm drives and triple replication. I also get
> same results (bluestore 2 times slower) testing continuous writes on a 40GB
> partition on a Windows VM, completely different tool.
> >
> > Right now I'm going back to filestore for the OSDs so additional tests
> are possible if that helps.
> >
> > - Rado
> >
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Mark Nelson
> > Sent: Tuesday, November 14, 2017 4:04 PM
> > To: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Bluestore performance 50% of filestore
> >
> > Hi Radoslav,
> >
> > Is RBD cache enabled and in writeback mode?  Do you have client side
> readahead?
> >
> > Both are doing better for writes than you'd expect from the native
> performance of the disks assuming they are typical 7200RPM drives and you
> are using 3X replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small
> file size, I'd expect that you might be getting better journal coalescing
> in filestore.
> >
> > Sadly I imagine you can't do a comparison test at this point, but I'd be
> curious how it would look if you used libaio with a high iodepth and a much
> bigger partition to do random writes over.
> >
> > Mark
> >
> > On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:
> >> Hi
> >>
> >> We have 3 node, 27 OSDs cluster running Luminous 12.2.1
> >>
> >> In filestore configuration there are 3 SSDs used for journals of 9
> >> OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).
> >>
> >> I've converted filestore to bluestore by wiping 1 host at a time and
> >> waiting for recovery. SSDs now contain block-db - again one SSD
> >> serving
> >> 3 OSDs.
> >>
> >>
> >>
> >> Cluster is used as storage for Openstack.
> >>
> >> Running fio on a VM in that Openstack reveals bluestore performance
> >> almost twice slower than filestore.
> >>
> >> fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G
> >> --numjobs=2 --time_based --runtime=180 --group_reporting
> >>
> >> fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G
> >> --numjobs=2 --time_based --runtime=180 --group_reporting
> >>
> >>
> >>
> >>
> >>
> >> Filestore
> >>
> >>   write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec
> >>
> >>   write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec
> >>
> >>   write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec
> >>
> >>
> >>
> >>   read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec
> >>
> >>   read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec
> >>
> >>   read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec
> >>
> >>
> >>
> >> Bluestore
> >>
> >>   write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec
> >>
> >>   write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec
> >>
> >>   write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec
> >>
> >>
> >>
> >>   read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec
> >>
> >>   read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec
> >>
> >>   read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec
> >>
> >>
> >>
> >>
> >>
> >> - Rado
> >>
> >>
> >>
> >>
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread Mark Nelson
How big were the writes in the windows test and how much concurrency was 
there?


Historically bluestore does pretty well for us with small random writes 
so your write results surprise me a bit.  I suspect it's the low queue 
depth.  Sometimes bluestore does worse with reads, especially if 
readahead isn't enabled on the client.


Mark

On 11/14/2017 03:14 PM, Milanov, Radoslav Nikiforov wrote:

Hi Mark,
Yes RBD is in write back, and the only thing that changed was converting OSDs 
to bluestore. It is 7200 rpm drives and triple replication. I also get same 
results (bluestore 2 times slower) testing continuous writes on a 40GB 
partition on a Windows VM, completely different tool.

Right now I'm going back to filestore for the OSDs so additional tests are 
possible if that helps.

- Rado

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark 
Nelson
Sent: Tuesday, November 14, 2017 4:04 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Bluestore performance 50% of filestore

Hi Radoslav,

Is RBD cache enabled and in writeback mode?  Do you have client side readahead?

Both are doing better for writes than you'd expect from the native performance 
of the disks assuming they are typical 7200RPM drives and you are using 3X 
replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small file size, I'd 
expect that you might be getting better journal coalescing in filestore.

Sadly I imagine you can't do a comparison test at this point, but I'd be 
curious how it would look if you used libaio with a high iodepth and a much 
bigger partition to do random writes over.

Mark

On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:

Hi

We have 3 node, 27 OSDs cluster running Luminous 12.2.1

In filestore configuration there are 3 SSDs used for journals of 9
OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).

I've converted filestore to bluestore by wiping 1 host at a time and
waiting for recovery. SSDs now contain block-db - again one SSD
serving
3 OSDs.



Cluster is used as storage for Openstack.

Running fio on a VM in that Openstack reveals bluestore performance
almost twice slower than filestore.

fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G
--numjobs=2 --time_based --runtime=180 --group_reporting

fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G
--numjobs=2 --time_based --runtime=180 --group_reporting





Filestore

  write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec

  write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec

  write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec



  read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec

  read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec

  read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec



Bluestore

  write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec

  write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec

  write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec



  read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec

  read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec

  read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec





- Rado





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread Milanov, Radoslav Nikiforov
Hi Mark,
Yes RBD is in write back, and the only thing that changed was converting OSDs 
to bluestore. It is 7200 rpm drives and triple replication. I also get same 
results (bluestore 2 times slower) testing continuous writes on a 40GB 
partition on a Windows VM, completely different tool. 

Right now I'm going back to filestore for the OSDs so additional tests are 
possible if that helps.

- Rado

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark 
Nelson
Sent: Tuesday, November 14, 2017 4:04 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Bluestore performance 50% of filestore

Hi Radoslav,

Is RBD cache enabled and in writeback mode?  Do you have client side readahead?

Both are doing better for writes than you'd expect from the native performance 
of the disks assuming they are typical 7200RPM drives and you are using 3X 
replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given the small file size, I'd 
expect that you might be getting better journal coalescing in filestore.

Sadly I imagine you can't do a comparison test at this point, but I'd be 
curious how it would look if you used libaio with a high iodepth and a much 
bigger partition to do random writes over.

Mark

On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:
> Hi
>
> We have 3 node, 27 OSDs cluster running Luminous 12.2.1
>
> In filestore configuration there are 3 SSDs used for journals of 9 
> OSDs on each host (1 SSD has 3 journal partitions for 3 OSDs).
>
> I've converted filestore to bluestore by wiping 1 host at a time and
> waiting for recovery. SSDs now contain block-db - again one SSD 
> serving
> 3 OSDs.
>
>
>
> Cluster is used as storage for Openstack.
>
> Running fio on a VM in that Openstack reveals bluestore performance 
> almost twice slower than filestore.
>
> fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G
> --numjobs=2 --time_based --runtime=180 --group_reporting
>
> fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G
> --numjobs=2 --time_based --runtime=180 --group_reporting
>
>
>
>
>
> Filestore
>
>   write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec
>
>   write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec
>
>   write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec
>
>
>
>   read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec
>
>   read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec
>
>   read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec
>
>
>
> Bluestore
>
>   write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec
>
>   write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec
>
>   write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec
>
>
>
>   read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec
>
>   read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec
>
>   read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec
>
>
>
>
>
> - Rado
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread Mark Nelson

Hi Radoslav,

Is RBD cache enabled and in writeback mode?  Do you have client side 
readahead?
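
As a sketch of where I would check both, assuming ceph.conf on the
compute/client node and a Linux guest (the device name below is just the
usual default):

# client-side ceph.conf
[client]
rbd cache = true
rbd cache writethrough until flush = true

# inside the guest: readahead for the virtual disk, in KB
cat /sys/block/vda/queue/read_ahead_kb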


Both are doing better for writes than you'd expect from the native 
performance of the disks assuming they are typical 7200RPM drives and 
you are using 3X replication (~150IOPS * 27 / 3 = ~1350 IOPS).  Given 
the small file size, I'd expect that you might be getting better journal 
coalescing in filestore.


Sadly I imagine you can't do a comparison test at this point, but I'd be 
curious how it would look if you used libaio with a high iodepth and a 
much bigger partition to do random writes over.
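
Something along these lines is what I have in mind (file path, size and
iodepth are placeholders to adapt):

fio --name fio_test_file --filename=/mnt/test/fio_test_file --size=40G \
    --direct=1 --rw=randwrite --bs=4k --ioengine=libaio --iodepth=32 \
    --numjobs=2 --time_based --runtime=180 --group_reporting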


Mark

On 11/14/2017 01:54 PM, Milanov, Radoslav Nikiforov wrote:

Hi

We have 3 node, 27 OSDs cluster running Luminous 12.2.1

In filestore configuration there are 3 SSDs used for journals of 9 OSDs
on each host (1 SSD has 3 journal partitions for 3 OSDs).

I've converted filestore to bluestore by wiping 1 host at a time and
waiting for recovery. SSDs now contain block-db – again one SSD serving
3 OSDs.



Cluster is used as storage for Openstack.

Running fio on a VM in that Openstack reveals bluestore performance
almost twice slower than filestore.

fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G
--numjobs=2 --time_based --runtime=180 --group_reporting

fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G
--numjobs=2 --time_based --runtime=180 --group_reporting





Filestore

  write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec

  write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec

  write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec



  read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec

  read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec

  read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec



Bluestore

  write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec

  write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec

  write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec



  read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec

  read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec

  read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec





- Rado





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Incorrect pool usage statistics

2017-11-14 Thread Karun Josy
Help?!

There seem to be many objects still present in the pool:
-
$ rados df
POOL_NAME  USED    OBJECTS CLONES  COPIES  MISSING_ON_PRIMARY UNFOUND DEGRADED    RD_OPS    RD    WR_OPS     WR
vm          886        105      0      315                  0       0        0    943399 1301M     39539 30889M
ecpool      403G     388652 316701 2720564                  0       0        0 156972536 1081G 203383441  4074G
imagepool   89014M    22485      0    67455                  0       0        0   7856029  708G  13140767   602G
template    115G      29848     43   149240                  0       0        0  66138389 2955G   1123900   539G


Karun Josy

On Tue, Nov 14, 2017 at 4:16 AM, Karun Josy  wrote:

> Hello,
>
> Recently, I deleted all the disks from an erasure pool 'ecpool'.
> The pool is empty. However the space usage shows around 400GB.
> What might be wrong?
>
>
> $ rbd ls -l ecpool
> $ ceph df
>
> GLOBAL:
> SIZE   AVAIL  RAW USED %RAW USED
> 19019G 16796G2223G 11.69
> POOLS:
> NAMEID USED   %USED MAX AVAIL OBJECTS
> template 1227G  1.59 2810G   58549
> vm 21  0 0 4684G   2
> ecpool  33   403G  2.7910038G  388652
> imagepool   34 90430M  0.62 4684G   22789
>
>
>
> Karun Josy
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bluestore performance 50% of filestore

2017-11-14 Thread Milanov, Radoslav Nikiforov
Hi
We have 3 node, 27 OSDs cluster running Luminous 12.2.1
In filestore configuration there are 3 SSDs used for journals of 9 OSDs on each 
host (1 SSD has 3 journal partitions for 3 OSDs).
I've converted filestore to bluestore by wiping 1 host at a time and waiting for 
recovery. SSDs now contain block-db - again one SSD serving 3 OSDs.

Cluster is used as storage for Openstack.
Running fio on a VM in that Openstack reveals bluestore performance almost 
twice slower than filestore.
fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=1G 
--numjobs=2 --time_based --runtime=180 --group_reporting
fio --name fio_test_file --direct=1 --rw=randread --bs=4k --size=1G --numjobs=2 
--time_based --runtime=180 --group_reporting



Filestore

  write: io=3511.9MB, bw=19978KB/s, iops=4994, runt=180001msec

  write: io=3525.6MB, bw=20057KB/s, iops=5014, runt=180001msec

  write: io=3554.1MB, bw=20222KB/s, iops=5055, runt=180016msec



  read : io=1995.7MB, bw=11353KB/s, iops=2838, runt=180001msec

  read : io=1824.5MB, bw=10379KB/s, iops=2594, runt=180001msec

  read : io=1966.5MB, bw=11187KB/s, iops=2796, runt=180001msec



Bluestore

  write: io=1621.2MB, bw=9222.3KB/s, iops=2305, runt=180002msec

  write: io=1576.3MB, bw=8965.6KB/s, iops=2241, runt=180029msec

  write: io=1531.9MB, bw=8714.3KB/s, iops=2178, runt=180001msec



  read : io=1279.4MB, bw=7276.5KB/s, iops=1819, runt=180006msec

  read : io=773824KB, bw=4298.9KB/s, iops=1074, runt=180010msec

  read : io=1018.5MB, bw=5793.7KB/s, iops=1448, runt=180001msec


- Rado

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Deleting large pools

2017-11-14 Thread David Turner
2 weeks later and things are still deleting, but getting really close to
being done.  I tried to use ceph-objectstore-tool to remove one of the
PGs.  I only tested on 1 PG on 1 OSD, but it's doing something really
weird.  While it was running, my connection to the DC reset and the command
died.  Now when I try to run the tool it segfaults, and when I just run the OSD
it doesn't try to delete the data.  The data in this PG does not matter and
I figure the worst case scenario is that it just sits there taking up 200GB
until I redeploy the OSD.

However, I like to learn things about Ceph.  Is there anyone with any
insight to what is happening with this PG?

[root@osd1 ~] # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0
--journal-path /var/lib/ceph/osd/ceph-0/journal --pgid 97.314s0 --op remove
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
 marking collection for removal
mark_pg_for_removal warning: peek_map_epoch reported error
terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
  what():  buffer::end_of_buffer
*** Caught signal (Aborted) **
 in thread 7f98ab2dc980 thread_name:ceph-objectstor
 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)
 1: (()+0x95209a) [0x7f98abc4b09a]
 2: (()+0xf100) [0x7f98a91d7100]
 3: (gsignal()+0x37) [0x7f98a7d825f7]
 4: (abort()+0x148) [0x7f98a7d83ce8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f98a86879d5]
 6: (()+0x5e946) [0x7f98a8685946]
 7: (()+0x5e973) [0x7f98a8685973]
 8: (()+0x5eb93) [0x7f98a8685b93]
 9: (ceph::buffer::list::iterator_impl::copy(unsigned int,
char*)+0xa5) [0x7f98abd498a5]
 10: (PG::read_info(ObjectStore*, spg_t, coll_t const&,
ceph::buffer::list&, pg_info_t&, std::map, std::allocator > >&, unsigned char&)+0x324) [0x7f98ab6d3094]
 11: (mark_pg_for_removal(ObjectStore*, spg_t,
ObjectStore::Transaction*)+0x87c) [0x7f98ab66615c]
 12: (initiate_new_remove_pg(ObjectStore*, spg_t,
ObjectStore::Sequencer&)+0x131) [0x7f98ab666a51]
 13: (main()+0x39b7) [0x7f98ab610437]
 14: (__libc_start_main()+0xf5) [0x7f98a7d6eb15]
 15: (()+0x363a57) [0x7f98ab65ca57]
Aborted
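
For anyone poking at something similar, a sketch of how to see which PGs the
store still holds (the OSD must be stopped; same paths as in the command
above):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal --op list-pgs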

On Thu, Nov 2, 2017 at 12:45 PM Gregory Farnum  wrote:

> Deletion is throttled, though I don't know the configs to change it; you
> could poke around if you want stuff to go faster.
>
> Don’t just remove the directory in the filesystem; you need to clean up
> the leveldb metadata as well. ;)
> Removing the pg via Ceph-objectstore-tool would work fine but I’ve seen
> too many people kill the wrong thing to recommend it.
> -Greg
> On Thu, Nov 2, 2017 at 9:40 AM David Turner  wrote:
>
>> Jewel 10.2.7; XFS formatted OSDs; no dmcrypt or LVM.  I have a pool that
>> I deleted 16 hours ago that accounted for about 70% of the available space
>> on each OSD (averaging 84% full), 370M objects in 8k PGs, ec 4+2 profile.
>> Based on the rate that the OSDs are freeing up space after deleting the
>> pool, it will take about a week to finish deleting the PGs from the OSDs.
>>
>> Is there anything I can do to speed this process up?  I feel like there
>> may be a way for me to go through the OSDs and delete the PG folders either
>> with the objectstore tool or while the OSD is offline.  I'm not sure what
>> Ceph is doing to delete the pool, but I don't think that an `rm -Rf` of the
>> PG folder would take nearly this long.
>>
>> Thank you all for your help.
>>
> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] S3/Swift :: Pools Ceph

2017-11-14 Thread Osama Hasebou
Hi Everyone, 

I was wondering: has anyone tried, in a test/production environment, having one 
pool to which you can input/output data using both S3 and Swift, or would each 
need a separate pool, one served via S3 and one served via Swift? 

Also, I believe you can use one pool for both RBD and object storage as well, or is 
that false? 

Thank you! 

Regards, 
Ossi 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw multi site different period

2017-11-14 Thread Kim-Norman Sahm
both clusters are in the same epoch and period:

root@ceph-a-1:~# radosgw-admin period get-current 
{
"current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

root@ceph-b-1:~# radosgw-admin period get-current 
{
"current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

On Tuesday, 14.11.2017, 17:05, David Turner wrote:
> I'm assuming you've looked at the period in both places `radosgw-
> admin period get` and confirmed that the second site is behind the
> master site (based on epochs).  I'm also assuming (since you linked
> the instructions) that you've done `radosgw-admin period pull` on the
> second site to get any period updates that have been done to the
> master site.
> 
> If my assumptions are wrong, then you should do those things.  If my
> assumptions are correct, then running `radosgw-admin period update
> --commit` on the master site and `radosgw-admin period pull` on the
> second site might fix this.  If you've already done that as well (as
> they're steps in the article you linked), then you need someone
> smarter than I am to chime in.
> 
> On Tue, Nov 14, 2017 at 11:35 AM Kim-Norman Sahm 
> wrote:
> > hi,
> > 
> > i've installed a ceph multi site setup with two ceph clusters and
> > each
> > one radosgw.
> > the multi site setup was in sync, so i tried a failover.
> > cluster A is going down and i've changed the zone (b) on cluster b
> > to
> > the new master zone.
> > it's working fine.
> > 
> > now i start the cluster A and try to switch back the master zone to
> > A.
> > cluster A believes that he is the master, cluster b is secondary.
> > but on the secondary is a different period and the bucket delta is
> > not
> > synced to the new master zone:
> > 
> > root@ceph-a-1:~# radosgw-admin sync status
> >           realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
> >   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
> >    zone 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
> >   metadata sync no sync (zone is master)
> >   data sync source: 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
> > syncing
> > full sync: 0/128 shards
> > incremental sync: 128/128 shards
> > data is caught up with source
> > 
> > root@ceph-b-1:~# radosgw-admin sync status
> >   realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
> >   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
> >    zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
> >   metadata sync syncing
> > full sync: 0/64 shards
> > master is on a different period:
> > master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569
> > local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
> > incremental sync: 64/64 shards
> > metadata is caught up with master
> >   data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
> > syncing
> > full sync: 0/128 shards
> > incremental sync: 128/128 shards
> > data is caught up with source
> > 
> > how can i force sync the period and the bucket deltas?
> > i've used this howto: http://docs.ceph.com/docs/master/radosgw/multisite/
> > 
> > br Kim
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw multi site different period

2017-11-14 Thread David Turner
I'm assuming you've looked at the period in both places `radosgw-admin
period get` and confirmed that the second site is behind the master site
(based on epochs).  I'm also assuming (since you linked the instructions)
that you've done `radosgw-admin period pull` on the second site to get any
period updates that have been done to the master site.

If my assumptions are wrong, then you should do those things.  If my
assumptions are correct, then running `radosgw-admin period update
--commit` on the master site and `radosgw-admin period pull` on the
second site might fix this.  If you've already done that as well (as
they're steps in the article you linked), then you need someone smarter
than I am to chime in.
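
A minimal sketch of that sequence (the endpoint URL, system-user keys and the
gateway service name are placeholders; adjust them to your deployment):

# on the current master site
radosgw-admin period update --commit

# on the secondary site
radosgw-admin period pull --url=http://ceph-a-1:8080 \
    --access-key=<system-access-key> --secret=<system-secret>
radosgw-admin period get-current
systemctl restart ceph-radosgw@rgw.$(hostname -s)   # unit name may differ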

On Tue, Nov 14, 2017 at 11:35 AM Kim-Norman Sahm  wrote:

> hi,
>
> i've installed a ceph multi site setup with two ceph clusters and each
> one radosgw.
> the multi site setup was in sync, so i tried a failover.
> cluster A is going down and i've changed the zone (b) on cluster b to
> the new master zone.
> it's working fine.
>
> now i start the cluster A and try to switch back the master zone to A.
> cluster A believes that he is the master, cluster b is secondary.
> but on the secondary is a different period and the bucket delta is not
> synced to the new master zone:
>
> root@ceph-a-1:~# radosgw-admin sync status
>   realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
>   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
>zone 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
>   metadata sync no sync (zone is master)
>   data sync source: 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
> syncing
> full sync: 0/128 shards
> incremental sync: 128/128 shards
> data is caught up with source
>
> root@ceph-b-1:~# radosgw-admin sync status
>   realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
>   zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
>zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
>   metadata sync syncing
> full sync: 0/64 shards
> master is on a different period:
> master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569
> local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
> incremental sync: 64/64 shards
> metadata is caught up with master
>   data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
> syncing
> full sync: 0/128 shards
> incremental sync: 128/128 shards
> data is caught up with source
>
> how can i force sync the period and the bucket deltas?
> i've used this howto: http://docs.ceph.com/docs/master/radosgw/multisite/
>
> br Kim
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw multi site different period

2017-11-14 Thread Kim-Norman Sahm
Hi,

I've installed a Ceph multi-site setup with two Ceph clusters, each with
one radosgw.
The multi-site setup was in sync, so I tried a failover: cluster A went
down and I promoted zone (b) on cluster B to the new master zone.
That worked fine.

Now I've started cluster A again and tried to switch the master zone back to A.
Cluster A believes that it is the master and cluster B is secondary,
but the secondary is on a different period and the bucket delta is not
synced to the new master zone:

root@ceph-a-1:~# radosgw-admin sync status
          realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
  zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
   zone 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
  metadata sync no sync (zone is master)
  data sync source: 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source

root@ceph-b-1:~# radosgw-admin sync status
  realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
  zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
   zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
  metadata sync syncing
full sync: 0/64 shards
master is on a different period:
master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569
local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
incremental sync: 64/64 shards
metadata is caught up with master
  data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source

How can I force a sync of the period and the bucket deltas?
I've used this howto: http://docs.ceph.com/docs/master/radosgw/multisite/

br Kim
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
Thanks a lot for all your comments,

If you don't see any problem, I will enable the following features, which
should fit my requirements:

Layering
Striping
Exclusive locking
Object map
Fast-diff

Thanks a lot
Óscar Segarra
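
A minimal sketch of what that looks like on the command line (pool and image
names are placeholders; note that fast-diff depends on object-map, which in
turn depends on exclusive-lock):

# create a new image with those features
rbd create vdipool/win10-base --size 40G \
    --image-feature layering,exclusive-lock,object-map,fast-diff
# striping only takes effect together with --stripe-unit/--stripe-count

# or enable them on an existing image, in dependency order
rbd feature enable vdipool/win10-base exclusive-lock
rbd feature enable vdipool/win10-base object-map fast-diff
rbd info vdipool/win10-base | grep features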




2017-11-14 16:56 GMT+01:00 Jason Dillaman :

> From the documentation [1]:
>
> shareable
> If present, this indicates the device is expected to be shared between
> domains (assuming the hypervisor and OS support this), which means that
> caching should be deactivated for that device.
>
> Basically, it's the use-case for putting a clustered file system (or
> similar) on top of the block device. For the vast majority of cases, you
> shouldn't enable this in libvirt.
>
> [1] https://libvirt.org/formatdomain.html#elementsDisks
>
> On Tue, Nov 14, 2017 at 10:49 AM, Oscar Segarra 
> wrote:
>
>> Hi Jason,
>>
>> The big use-case for sharing a block device is if you set up a clustered
>> file system on top of it, and I'd argue that you'd probably be better
>> off using CephFS.
>> --> Nice to know!
>>
>> Thanks a lot for your clarifications, in this case I referenced the
>> shareable flag that one can see in the KVM. I'd like to know the suggested
>> configuration for rbd images and live migration
>>
>> [image: Imágenes integradas 1]
>>
>> Thanks a lot.
>>
>> 2017-11-14 16:36 GMT+01:00 Jason Dillaman :
>>
>>> On Tue, Nov 14, 2017 at 10:25 AM, Oscar Segarra 
>>> wrote:
>>> > In my environment, I have a Centos7 updated todate therefore, all
>>> > features might work as expected to do...
>>> >
>>> > Regarding the other question, do you suggest making the virtual disk
>>> > "shareable" in rbd?
>>>
>>> Assuming you are refering to the "--image-shared" option when creating
>>> an image, the answer is no. That is just a short-cut to disable all
>>> features that depend on the exclusive lock. The big use-case for
>>> sharing a block device is if you set up a clustered file system on top
>>> of it, and I'd argue that you'd probably be better off using CephFS.
>>>
>>> > Thanks a lot
>>> >
>>> > 2017-11-14 15:58 GMT+01:00 Jason Dillaman :
>>> >>
>>> >> Concur -- there aren't any RBD image features that should prevent live
>>> >> migration when using a compatible version of librbd. If, however, you
>>> >> had two hosts where librbd versions were out-of-sync and they didn't
>>> >> support the same features, you could hit an issue if a VM with fancy
>>> >> new features was live migrated to a host where those features aren't
>>> >> supported since the destination host wouldn't be able to open the
>>> >> image.
>>> >>
>>> >> On Tue, Nov 14, 2017 at 7:55 AM, Cassiano Pilipavicius
>>> >>  wrote:
>>> >> > Hi Oscar, exclusive-locking should not interfere with
>>> live-migration. I
>>> >> > have
>>> >> > a small virtualization cluster backed by ceph/rbd and I can migrate
>>> all
>>> >> > the
>>> >> > VMs which RBD image have exclusive-lock enabled without any issue.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Em 11/14/2017 9:47 AM, Oscar Segarra escreveu:
>>> >> >
>>> >> > Hi Konstantin,
>>> >> >
>>> >> > Thanks a lot for your advice...
>>> >> >
>>> >> > I'm specially interested in feature "Exclusive locking". Enabling
>>> this
>>> >> > feature can affect live/offline migration? In this scenario
>>> >> > (online/offline
>>> >> > migration)  I don't know if two hosts (source and destination) need
>>> >> > access
>>> >> > to the same rbd image at the same time
>>> >> >
>>> >> > It looks that enabling Exlucisve locking you can enable some other
>>> >> > interessant features like "Object map" and/or "Fast diff" for
>>> backups.
>>> >> >
>>> >> > Thanks a lot!
>>> >> >
>>> >> > 2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :
>>> >> >>
>>> >> >> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
>>> >> >>
>>> >> >> What I'm trying to do is reading documentation in order to
>>> understand
>>> >> >> how
>>> >> >> features work and what are they for.
>>> >> >>
>>> >> >> http://tracker.ceph.com/issues/15000
>>> >> >>
>>> >> >>
>>> >> >> I would also be happy to read what features have negative sides.
>>> >> >>
>>> >> >>
>>> >> >> The problem is that documentation is not detailed enough.
>>> >> >>
>>> >> >> The proof-test method you suggest I think is not a good procedure
>>> >> >> because
>>> >> >> I want to a void a corrpution in the future due to a bad
>>> configuration
>>> >> >>
>>> >> >>
>>> >> >> So my recommendation: if you can wait - may be from some side you
>>> >> >> receive
>>> >> >> a new information about features. Otherwise - you can set minimal
>>> >> >> features
>>> >> >> (like '3') - this is enough for virtualization (snapshots, clones).
>>> >> >>
>>> >> >> And start your project.
>>> >> >>
>>> >> >> --
>>> >> >> Best regards,
>>> >> >> Konstantin Shalygin
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > ___
>>> >> > ceph-users mailing list
>>> >> > ceph-users@lists.ceph.com
>>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >> >
>>> >> >
>>> >> >
>>> >> >

Re: [ceph-users] features required for live migration

2017-11-14 Thread Jason Dillaman
From the documentation [1]:

shareable
If present, this indicates the device is expected to be shared between
domains (assuming the hypervisor and OS support this), which means that
caching should be deactivated for that device.

Basically, it's the use-case for putting a clustered file system (or
similar) on top of the block device. For the vast majority of cases, you
shouldn't enable this in libvirt.

[1] https://libvirt.org/formatdomain.html#elementsDisks

On Tue, Nov 14, 2017 at 10:49 AM, Oscar Segarra 
wrote:

> Hi Jason,
>
> The big use-case for sharing a block device is if you set up a clustered
> file system on top of it, and I'd argue that you'd probably be better off
> using CephFS.
> --> Nice to know!
>
> Thanks a lot for your clarifications, in this case I referenced the
> shareable flag that one can see in the KVM. I'd like to know the suggested
> configuration for rbd images and live migration
>
> [image: Imágenes integradas 1]
>
> Thanks a lot.
>
> 2017-11-14 16:36 GMT+01:00 Jason Dillaman :
>
>> On Tue, Nov 14, 2017 at 10:25 AM, Oscar Segarra 
>> wrote:
>> > In my environment, I have a Centos7 updated todate therefore, all
>> > features might work as expected to do...
>> >
>> > Regarding the other question, do you suggest making the virtual disk
>> > "shareable" in rbd?
>>
>> Assuming you are refering to the "--image-shared" option when creating
>> an image, the answer is no. That is just a short-cut to disable all
>> features that depend on the exclusive lock. The big use-case for
>> sharing a block device is if you set up a clustered file system on top
>> of it, and I'd argue that you'd probably be better off using CephFS.
>>
>> > Thanks a lot
>> >
>> > 2017-11-14 15:58 GMT+01:00 Jason Dillaman :
>> >>
>> >> Concur -- there aren't any RBD image features that should prevent live
>> >> migration when using a compatible version of librbd. If, however, you
>> >> had two hosts where librbd versions were out-of-sync and they didn't
>> >> support the same features, you could hit an issue if a VM with fancy
>> >> new features was live migrated to a host where those features aren't
>> >> supported since the destination host wouldn't be able to open the
>> >> image.
>> >>
>> >> On Tue, Nov 14, 2017 at 7:55 AM, Cassiano Pilipavicius
>> >>  wrote:
>> >> > Hi Oscar, exclusive-locking should not interfere with
>> live-migration. I
>> >> > have
>> >> > a small virtualization cluster backed by ceph/rbd and I can migrate
>> all
>> >> > the
>> >> > VMs which RBD image have exclusive-lock enabled without any issue.
>> >> >
>> >> >
>> >> >
>> >> > Em 11/14/2017 9:47 AM, Oscar Segarra escreveu:
>> >> >
>> >> > Hi Konstantin,
>> >> >
>> >> > Thanks a lot for your advice...
>> >> >
>> >> > I'm specially interested in feature "Exclusive locking". Enabling
>> this
>> >> > feature can affect live/offline migration? In this scenario
>> >> > (online/offline
>> >> > migration)  I don't know if two hosts (source and destination) need
>> >> > access
>> >> > to the same rbd image at the same time
>> >> >
>> >> > It looks that enabling Exlucisve locking you can enable some other
>> >> > interessant features like "Object map" and/or "Fast diff" for
>> backups.
>> >> >
>> >> > Thanks a lot!
>> >> >
>> >> > 2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :
>> >> >>
>> >> >> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
>> >> >>
>> >> >> What I'm trying to do is reading documentation in order to
>> understand
>> >> >> how
>> >> >> features work and what are they for.
>> >> >>
>> >> >> http://tracker.ceph.com/issues/15000
>> >> >>
>> >> >>
>> >> >> I would also be happy to read what features have negative sides.
>> >> >>
>> >> >>
>> >> >> The problem is that documentation is not detailed enough.
>> >> >>
>> >> >> The proof-test method you suggest I think is not a good procedure
>> >> >> because
>> >> >> I want to a void a corrpution in the future due to a bad
>> configuration
>> >> >>
>> >> >>
>> >> >> So my recommendation: if you can wait - may be from some side you
>> >> >> receive
>> >> >> a new information about features. Otherwise - you can set minimal
>> >> >> features
>> >> >> (like '3') - this is enough for virtualization (snapshots, clones).
>> >> >>
>> >> >> And start your project.
>> >> >>
>> >> >> --
>> >> >> Best regards,
>> >> >> Konstantin Shalygin
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > ___
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
>> >> >
>> >> >
>> >> > ___
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jason
>> >
>> >
>>
>>
>>
>> --
>> Jason
>>
>
>


-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-user


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
Hi Jason,

The big use-case for sharing a block device is if you set up a clustered
file system on top of it, and I'd argue that you'd probably be better off
using CephFS.
--> Nice to know!

Thanks a lot for your clarifications. In this case I was referring to the
"shareable" flag that one can see in KVM; I'd like to know the suggested
configuration for RBD images and live migration.

[image: inline image 1]

Thanks a lot.

2017-11-14 16:36 GMT+01:00 Jason Dillaman :

> On Tue, Nov 14, 2017 at 10:25 AM, Oscar Segarra 
> wrote:
> > In my environment, I have a Centos7 updated todate therefore, all
> > features might work as expected to do...
> >
> > Regarding the other question, do you suggest making the virtual disk
> > "shareable" in rbd?
>
> Assuming you are refering to the "--image-shared" option when creating
> an image, the answer is no. That is just a short-cut to disable all
> features that depend on the exclusive lock. The big use-case for
> sharing a block device is if you set up a clustered file system on top
> of it, and I'd argue that you'd probably be better off using CephFS.
>
> > Thanks a lot
> >
> > 2017-11-14 15:58 GMT+01:00 Jason Dillaman :
> >>
> >> Concur -- there aren't any RBD image features that should prevent live
> >> migration when using a compatible version of librbd. If, however, you
> >> had two hosts where librbd versions were out-of-sync and they didn't
> >> support the same features, you could hit an issue if a VM with fancy
> >> new features was live migrated to a host where those features aren't
> >> supported since the destination host wouldn't be able to open the
> >> image.
> >>
> >> On Tue, Nov 14, 2017 at 7:55 AM, Cassiano Pilipavicius
> >>  wrote:
> >> > Hi Oscar, exclusive-locking should not interfere with live-migration.
> I
> >> > have
> >> > a small virtualization cluster backed by ceph/rbd and I can migrate
> all
> >> > the
> >> > VMs which RBD image have exclusive-lock enabled without any issue.
> >> >
> >> >
> >> >
> >> > Em 11/14/2017 9:47 AM, Oscar Segarra escreveu:
> >> >
> >> > Hi Konstantin,
> >> >
> >> > Thanks a lot for your advice...
> >> >
> >> > I'm specially interested in feature "Exclusive locking". Enabling this
> >> > feature can affect live/offline migration? In this scenario
> >> > (online/offline
> >> > migration)  I don't know if two hosts (source and destination) need
> >> > access
> >> > to the same rbd image at the same time
> >> >
> >> > It looks that enabling Exlucisve locking you can enable some other
> >> > interessant features like "Object map" and/or "Fast diff" for backups.
> >> >
> >> > Thanks a lot!
> >> >
> >> > 2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :
> >> >>
> >> >> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
> >> >>
> >> >> What I'm trying to do is reading documentation in order to understand
> >> >> how
> >> >> features work and what are they for.
> >> >>
> >> >> http://tracker.ceph.com/issues/15000
> >> >>
> >> >>
> >> >> I would also be happy to read what features have negative sides.
> >> >>
> >> >>
> >> >> The problem is that documentation is not detailed enough.
> >> >>
> >> >> The proof-test method you suggest I think is not a good procedure
> >> >> because
> >> >> I want to a void a corrpution in the future due to a bad
> configuration
> >> >>
> >> >>
> >> >> So my recommendation: if you can wait - may be from some side you
> >> >> receive
> >> >> a new information about features. Otherwise - you can set minimal
> >> >> features
> >> >> (like '3') - this is enough for virtualization (snapshots, clones).
> >> >>
> >> >> And start your project.
> >> >>
> >> >> --
> >> >> Best regards,
> >> >> Konstantin Shalygin
> >> >
> >> >
> >> >
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >> >
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >>
> >>
> >>
> >> --
> >> Jason
> >
> >
>
>
>
> --
> Jason
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Jason Dillaman
On Tue, Nov 14, 2017 at 10:25 AM, Oscar Segarra  wrote:
> In my environment, I have a Centos7 updated todate therefore, all
> features might work as expected to do...
>
> Regarding the other question, do you suggest making the virtual disk
> "shareable" in rbd?

Assuming you are referring to the "--image-shared" option when creating
an image, the answer is no. That is just a short-cut to disable all
features that depend on the exclusive lock. The big use-case for
sharing a block device is if you set up a clustered file system on top
of it, and I'd argue that you'd probably be better off using CephFS.
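
A minimal sketch of the difference (pool and image names are placeholders):

# default creation: features come from rbd_default_features
rbd create vdipool/vm01 --size 20G
rbd info vdipool/vm01 | grep features

# --image-shared only strips the features that depend on exclusive-lock,
# which you want only when several clients write to the image at once
rbd create vdipool/shared01 --size 20G --image-shared
rbd info vdipool/shared01 | grep features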

> Thanks a lot
>
> 2017-11-14 15:58 GMT+01:00 Jason Dillaman :
>>
>> Concur -- there aren't any RBD image features that should prevent live
>> migration when using a compatible version of librbd. If, however, you
>> had two hosts where librbd versions were out-of-sync and they didn't
>> support the same features, you could hit an issue if a VM with fancy
>> new features was live migrated to a host where those features aren't
>> supported since the destination host wouldn't be able to open the
>> image.
>>
>> On Tue, Nov 14, 2017 at 7:55 AM, Cassiano Pilipavicius
>>  wrote:
>> > Hi Oscar, exclusive-locking should not interfere with live-migration. I
>> > have
>> > a small virtualization cluster backed by ceph/rbd and I can migrate all
>> > the
>> > VMs which RBD image have exclusive-lock enabled without any issue.
>> >
>> >
>> >
>> > Em 11/14/2017 9:47 AM, Oscar Segarra escreveu:
>> >
>> > Hi Konstantin,
>> >
>> > Thanks a lot for your advice...
>> >
>> > I'm specially interested in feature "Exclusive locking". Enabling this
>> > feature can affect live/offline migration? In this scenario
>> > (online/offline
>> > migration)  I don't know if two hosts (source and destination) need
>> > access
>> > to the same rbd image at the same time
>> >
>> > It looks that enabling Exlucisve locking you can enable some other
>> > interessant features like "Object map" and/or "Fast diff" for backups.
>> >
>> > Thanks a lot!
>> >
>> > 2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :
>> >>
>> >> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
>> >>
>> >> What I'm trying to do is reading documentation in order to understand
>> >> how
>> >> features work and what are they for.
>> >>
>> >> http://tracker.ceph.com/issues/15000
>> >>
>> >>
>> >> I would also be happy to read what features have negative sides.
>> >>
>> >>
>> >> The problem is that documentation is not detailed enough.
>> >>
>> >> The proof-test method you suggest I think is not a good procedure
>> >> because
>> >> I want to a void a corrpution in the future due to a bad configuration
>> >>
>> >>
>> >> So my recommendation: if you can wait - may be from some side you
>> >> receive
>> >> a new information about features. Otherwise - you can set minimal
>> >> features
>> >> (like '3') - this is enough for virtualization (snapshots, clones).
>> >>
>> >> And start your project.
>> >>
>> >> --
>> >> Best regards,
>> >> Konstantin Shalygin
>> >
>> >
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>>
>>
>> --
>> Jason
>
>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
In my environment I have CentOS 7 updated to date; therefore all
features should work as expected...

Regarding the other question, do you suggest making the virtual disk
"shareable" in rbd?

Thanks a lot

2017-11-14 15:58 GMT+01:00 Jason Dillaman :

> Concur -- there aren't any RBD image features that should prevent live
> migration when using a compatible version of librbd. If, however, you
> had two hosts where librbd versions were out-of-sync and they didn't
> support the same features, you could hit an issue if a VM with fancy
> new features was live migrated to a host where those features aren't
> supported since the destination host wouldn't be able to open the
> image.
>
> On Tue, Nov 14, 2017 at 7:55 AM, Cassiano Pilipavicius
>  wrote:
> > Hi Oscar, exclusive-locking should not interfere with live-migration. I
> have
> > a small virtualization cluster backed by ceph/rbd and I can migrate all
> the
> > VMs which RBD image have exclusive-lock enabled without any issue.
> >
> >
> >
> > Em 11/14/2017 9:47 AM, Oscar Segarra escreveu:
> >
> > Hi Konstantin,
> >
> > Thanks a lot for your advice...
> >
> > I'm specially interested in feature "Exclusive locking". Enabling this
> > feature can affect live/offline migration? In this scenario
> (online/offline
> > migration)  I don't know if two hosts (source and destination) need
> access
> > to the same rbd image at the same time
> >
> > It looks that enabling Exlucisve locking you can enable some other
> > interessant features like "Object map" and/or "Fast diff" for backups.
> >
> > Thanks a lot!
> >
> > 2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :
> >>
> >> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
> >>
> >> What I'm trying to do is reading documentation in order to understand
> how
> >> features work and what are they for.
> >>
> >> http://tracker.ceph.com/issues/15000
> >>
> >>
> >> I would also be happy to read what features have negative sides.
> >>
> >>
> >> The problem is that documentation is not detailed enough.
> >>
> >> The proof-test method you suggest I think is not a good procedure
> because
> >> I want to a void a corrpution in the future due to a bad configuration
> >>
> >>
> >> So my recommendation: if you can wait - may be from some side you
> receive
> >> a new information about features. Otherwise - you can set minimal
> features
> >> (like '3') - this is enough for virtualization (snapshots, clones).
> >>
> >> And start your project.
> >>
> >> --
> >> Best regards,
> >> Konstantin Shalygin
> >
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
>
> --
> Jason
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Jason Dillaman
Concur -- there aren't any RBD image features that should prevent live
migration when using a compatible version of librbd. If, however, you
had two hosts where librbd versions were out-of-sync and they didn't
support the same features, you could hit an issue if a VM with fancy
new features was live migrated to a host where those features aren't
supported since the destination host wouldn't be able to open the
image.
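
A quick way to catch that situation before migrating (host names are
placeholders; RPM-based hosts assumed):

# compare the librbd builds on the source and destination hypervisors
ssh kvm-src rpm -q librbd1
ssh kvm-dst rpm -q librbd1

# and check which features the image actually carries
rbd info vdipool/vm01 | grep features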

On Tue, Nov 14, 2017 at 7:55 AM, Cassiano Pilipavicius
 wrote:
> Hi Oscar, exclusive-locking should not interfere with live-migration. I have
> a small virtualization cluster backed by ceph/rbd and I can migrate all the
> VMs which RBD image have exclusive-lock enabled without any issue.
>
>
>
> Em 11/14/2017 9:47 AM, Oscar Segarra escreveu:
>
> Hi Konstantin,
>
> Thanks a lot for your advice...
>
> I'm specially interested in feature "Exclusive locking". Enabling this
> feature can affect live/offline migration? In this scenario (online/offline
> migration)  I don't know if two hosts (source and destination) need access
> to the same rbd image at the same time
>
> It looks that enabling Exlucisve locking you can enable some other
> interessant features like "Object map" and/or "Fast diff" for backups.
>
> Thanks a lot!
>
> 2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :
>>
>> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
>>
>> What I'm trying to do is reading documentation in order to understand how
>> features work and what are they for.
>>
>> http://tracker.ceph.com/issues/15000
>>
>>
>> I would also be happy to read what features have negative sides.
>>
>>
>> The problem is that documentation is not detailed enough.
>>
>> The proof-test method you suggest I think is not a good procedure because
>> I want to a void a corrpution in the future due to a bad configuration
>>
>>
>> So my recommendation: if you can wait - may be from some side you receive
>> a new information about features. Otherwise - you can set minimal features
>> (like '3') - this is enough for virtualization (snapshots, clones).
>>
>> And start your project.
>>
>> --
>> Best regards,
>> Konstantin Shalygin
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Cassiano Pilipavicius
Hi Oscar, exclusive locking should not interfere with live migration. I
have a small virtualization cluster backed by Ceph/RBD and I can migrate
all the VMs whose RBD images have exclusive-lock enabled without any issue.




On 11/14/2017 9:47 AM, Oscar Segarra wrote:

Hi Konstantin,

Thanks a lot for your advice...

I'm specially interested in feature "Exclusive locking". Enabling this 
feature can affect live/offline migration? In this scenario 
(online/offline migration)  I don't know if two hosts (source and 
destination) need access to the same rbd image at the same time


It looks that enabling Exlucisve locking you can enable some other 
interessant features like "Object map" and/or "Fast diff" for backups.


Thanks a lot!

2017-11-14 12:26 GMT+01:00 Konstantin Shalygin >:


On 11/14/2017 06:19 PM, Oscar Segarra wrote:


What I'm trying to do is reading documentation in order to
understand how features work and what are they for.

http://tracker.ceph.com/issues/15000



I would also be happy to read what features have negative sides.



The problem is that documentation is not detailed enough.

The proof-test method you suggest I think is not a good procedure
because I want to a void a corrpution in the future due to a bad
configuration


So my recommendation: if you can wait - may be from some side you
receive a new information about features. Otherwise - you can set
minimal features (like '3') - this is enough for virtualization
(snapshots, clones).

And start your project.

-- 
Best regards,

Konstantin Shalygin




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Incomplete pgs on ceph which is partly on Bluestore

2017-11-14 Thread Ольга Ухина
Sorry, I forgot to mention: the Ceph version is Luminous 12.2.1.

Best regards,
Olga Ukhina

Mobile: 8(905)-566-46-62

2017-11-14 15:30 GMT+03:00 Ольга Ухина :

> Hi! I've a ceph installation where one host with OSDs are on Blustore and
> three other are on FileStore, it worked till deleting this first host with
> all Bluestore OSDs and then these OSDs were back completely clean. Ceph
> remapped and I ended up with 19 pgs inactive and 19 incomplete. Primary
> OSDs for some pgs are on this first host with Bluestore OSDs and replicas
> on other host with Filestore. So I have replica of thess pgs. Is it
> possible to repair these pgs? May be I can change primary OSD for pg?
>
> Kind regards,
> Olga
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Incomplete pgs on ceph which is partly on Bluestore

2017-11-14 Thread Ольга Ухина
Hi! I have a Ceph installation where one host's OSDs are on BlueStore and
three other hosts are on FileStore. It worked until I deleted this first host
with all the BlueStore OSDs, and then these OSDs came back completely clean.
Ceph remapped, and I ended up with 19 PGs inactive and 19 incomplete. The
primary OSDs for some PGs are on this first host with BlueStore OSDs, with
replicas on other hosts with FileStore, so I do have replicas of these PGs.
Is it possible to repair these PGs? Maybe I can change the primary OSD for a PG?

Kind regards,
Olga
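
A minimal sketch of how to see what each stuck PG is waiting for before
attempting any repair (the PG id is a placeholder):

ceph health detail | grep -E 'incomplete|inactive'
ceph pg dump_stuck inactive
ceph pg 3.1f query   # look at "recovery_state" and "down_osds_we_would_probe"
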
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
Hi,

I'm including Jason Dillaman, the author of this tracker issue, in this thread:
http://tracker.ceph.com/issues/15000

Thanks a lot

2017-11-14 12:47 GMT+01:00 Oscar Segarra :

> Hi Konstantin,
>
> Thanks a lot for your advice...
>
> I'm specially interested in feature "Exclusive locking". Enabling this
> feature can affect live/offline migration? In this scenario (online/offline
> migration)  I don't know if two hosts (source and destination) need access
> to the same rbd image at the same time
>
> It looks that enabling Exlucisve locking you can enable some other
> interessant features like "Object map" and/or "Fast diff" for backups.
>
> Thanks a lot!
>
> 2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :
>
>> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
>>
>> What I'm trying to do is reading documentation in order to understand how
>> features work and what are they for.
>>
>> http://tracker.ceph.com/issues/15000
>>
>>
>> I would also be happy to read what features have negative sides.
>>
>>
>> The problem is that documentation is not detailed enough.
>>
>> The proof-test method you suggest I think is not a good procedure because
>> I want to a void a corrpution in the future due to a bad configuration
>>
>>
>> So my recommendation: if you can wait - may be from some side you receive
>> a new information about features. Otherwise - you can set minimal features
>> (like '3') - this is enough for virtualization (snapshots, clones).
>>
>> And start your project.
>>
>> --
>> Best regards,
>> Konstantin Shalygin
>>
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
Hi Konstantin,

Thanks a lot for your advice...

I'm especially interested in the "exclusive locking" feature. Can enabling this
feature affect live/offline migration? In this scenario (online/offline
migration) I don't know whether two hosts (source and destination) need access
to the same RBD image at the same time.

It looks like enabling exclusive locking lets you enable some other
interesting features like "object map" and/or "fast diff" for backups.

Thanks a lot!

2017-11-14 12:26 GMT+01:00 Konstantin Shalygin :

> On 11/14/2017 06:19 PM, Oscar Segarra wrote:
>
> What I'm trying to do is reading documentation in order to understand how
> features work and what are they for.
>
> http://tracker.ceph.com/issues/15000
>
>
> I would also be happy to read what features have negative sides.
>
>
> The problem is that documentation is not detailed enough.
>
> The proof-test method you suggest I think is not a good procedure because
> I want to a void a corrpution in the future due to a bad configuration
>
>
> So my recommendation: if you can wait - may be from some side you receive
> a new information about features. Otherwise - you can set minimal features
> (like '3') - this is enough for virtualization (snapshots, clones).
>
> And start your project.
>
> --
> Best regards,
> Konstantin Shalygin
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Konstantin Shalygin

On 11/14/2017 06:19 PM, Oscar Segarra wrote:

> What I'm trying to do is read the documentation in order to understand
> how the features work and what they are for.
>
> http://tracker.ceph.com/issues/15000

I would also be happy to read which features have negative sides.

> The problem is that the documentation is not detailed enough.
>
> The proof-test method you suggest is, I think, not a good procedure
> because I want to avoid corruption in the future due to a bad
> configuration.

So my recommendation: if you can wait, maybe you will receive new
information about the features from somewhere. Otherwise you can set
minimal features (like '3') - this is enough for virtualization
(snapshots, clones).

And start your project.

--
Best regards,
Konstantin Shalygin
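
A minimal sketch of the "minimal features (like '3')" suggestion above
(3 = layering + striping; pool and image names are placeholders, and the
value can also be overridden per image at creation time):

# in /etc/ceph/ceph.conf on the client hosts:
#   [client]
#   rbd default features = 3

# newly created images then carry only layering and striping
rbd create vdipool/test --size 10G
rbd info vdipool/test | grep features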

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
Hi Konstantin,

What I'm trying to do is read the documentation in order to understand how
the features work and what they are for.

http://tracker.ceph.com/issues/15000

The problem is that the documentation is not detailed enough.

The proof-test method you suggest is, I think, not a good procedure because I
want to avoid corruption in the future due to a bad configuration.

Thanks a lot,



2017-11-14 12:07 GMT+01:00 Konstantin Shalygin :

> I misunderstand you. If you at the testing/deploy stage - why you can't
> test what features you need and what supported by your librbd?
>
>
> On 11/14/2017 05:39 PM, Oscar Segarra wrote:
>
>> In this moment, I'm deploying and therefore I can upgrade every
>> component... I have recently executed "yum upgrade -y" in order to update
>> all operating system components.
>>
>> And please, apollogize me but In your lines I am not able to find the
>> answer to my questions.
>>
>> Please, can you clarify?
>>
>
> --
> Best regards,
> Konstantin Shalygin
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Konstantin Shalygin
I misunderstood you. If you are at the testing/deploy stage, why can't you
test which features you need and which are supported by your librbd?



On 11/14/2017 05:39 PM, Oscar Segarra wrote:
In this moment, I'm deploying and therefore I can upgrade every 
component... I have recently executed "yum upgrade -y" in order to 
update all operating system components.


And please, apollogize me but In your lines I am not able to find the 
answer to my questions.


Please, can you clarify?


--
Best regards,
Konstantin Shalygin

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
Hi Konstantin

At this moment I'm still deploying, and therefore I can upgrade every
component; I have recently executed "yum upgrade -y" in order to update
all operating system components.

And please excuse me, but in your lines I am not able to find the
answer to my questions.

Can you please clarify?

Thanks a lot!


2017-11-14 11:32 GMT+01:00 Konstantin Shalygin :

> For understanding: live migration is just the same run like clean run from
> powered off state, exception only the copying memory from one host to
> another, i.e. if your VM start from powered off state, than live migration
> should works without any issues.
>
> Also, client must be compatible with the features, otherwise qemu will
> not be able to work with the rbd. For this reason, there is an option
> that indicates the default features when creating a volume - if not
> possible to upgrade librbd on host (or any other component), the features
> can be customized for the present librbd.
>
> On 11/14/2017 04:41 PM, Oscar Segarra wrote:
>
> Yes, but looks lots of features like snapshot, fast-diff require some
> other features... If I enable exclusive-locking or journaling, live
> migration will be possible too?
>
> Is it recommended to set KVM disk "shareable" depending on the activated
> features?
>
>
> --
> Best regards,
> Konstantin Shalygin
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-14 Thread Oscar Segarra
Hi Anthony,


o I think you might have some misunderstandings about how Ceph works.  Ceph
is best deployed as a single cluster spanning multiple servers, generally
at least 3.  Is that your plan?

I want to deploy servers for 100 Windows 10 VDIs each (at least 3 servers). I
plan to sell servers depending on the number of VDIs required by my
customer: for 100 VDIs --> 3 servers, for 400 VDIs --> 4 servers.

This is my proposal of configuration:

*Server1:*
CPU: 2x16 Core
RAM: 512
Disk: 2x400 for OS and 8x1.9TB for VM (SSD)

*Server2:*
CPU: 2x16 Core
RAM: 512
Disk: 2x400 for OS and 8x1.9TB for VM (SSD)

*Server3:*
CPU: 2x16 Core
RAM: 512
Disk: 2x400 for OS and 8x1.9TB for VM (SSD)

*Server4:*
CPU: 2x16 Core
RAM: 512
Disk: 2x400 for OS and 8x1.9TB for VM (SSD)
...
*ServerN:*
CPU: 2x16 Core
RAM: 512
Disk: 2x400 for OS and 8x1.9TB for VM (SSD)

If I create an OSD for each disk and pin a core to each OSD in a server,
I will need 8 cores just for managing OSDs. If I create 4 RAID0 sets of 2 disks
each, I will need just 4 OSDs, and so on:

1 osd x 1 disk of 4TB
1 osd x 2 disks of 2TB
1 odd x 4 disks of 1 TB

If the CPU cycles used by Ceph are a problem, your architecture has IMHO
bigger problems.  You need to design for a safety margin of RAM and CPU to
accommodate spikes in usage, both by Ceph and by your desktops.  There is
no way each of the systems you describe is going to have enough cycles for
100 desktops concurrently active.  You'd be allocating each of them only
~3GB of RAM -- I've not had to run MS Windows 10 but even with page sharing
that seems awfully tight on RAM.

Sorry, I think my design was not explained correctly; I hope the
explanation above clarifies it. The problem is that I'm in the design phase
and I don't know whether Ceph CPU cycles will be a problem - that is the
main point of this post.

With the numbers you mention throughout the thread, it would seem as though
you would end up with potentially as little as 80GB of usable space per
virtual desktop - will that meet your needs?

Sorry, I think 80GB is enough; nevertheless, I plan to use RBD clones, and
therefore even with size=2 I think I will have more than 80GB available
for each VDI.
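
For what it's worth, a rough back-of-envelope for the 3-server case (3 nodes of
8 x 1.9 TB, replica size=2, 300 desktops; this ignores clone sharing and the
headroom you need to keep OSDs well below full):

echo '3*8*1.9/2/300*1000' | bc -l   # ~76 GB of usable raw space per desktop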

In this design phase where I am, every advice is really welcome!

Thanks a lot

2017-11-13 23:40 GMT+01:00 Anthony D'Atri :

> Oscar, a few thoughts:
>
> o I think you might have some misunderstandings about how Ceph works.
> Ceph is best deployed as a single cluster spanning multiple servers,
> generally at least 3.  Is that your plan?  It sort of sounds as though
> you're thinking of Ceph managing only the drives local to each of your
> converged VDI hosts, like local RAID would.  Ceph doesn't work that way.
> Well, technically it could but wouldn't be a great architecture.  You would
> want to have at least 3 servers, with all of the Ceph OSDs in a single
> cluster.
>
> o Re RAID0:
>
> > Then, may I understand that your advice is a RAID0 for each 4TB? For a
> > balanced configuration...
> >
> > 1 osd x 1 disk of 4TB
> > 1 osd x 2 disks of 2TB
> > 1 odd x 4 disks of 1 TB
>
>
> For performance a greater number of smaller drives is generally going to
> be best.  VDI desktops are going to be fairly latency-sensitive and you'd
> really do best with SSDs.  All those desktops thrashing a small number of
> HDDs is not going to deliver tolerable performance.
>
> Don't use RAID at all for the OSDs.  Even if you get hardware RAID HBAs,
> configure JBOD/passthrough mode so that OSDs are deployed directly on the
> drives.  This will minimize latency as well as manifold hassles that one
> adds when wrapping drives in HBA RAID volumes.
>
> o Re CPU:
>
> > The other question is considering having one OSDs vs 8 OSDs... 8 OSDs
> will
> > consume more CPU than 1 OSD (RAID5) ?
> >
> > As I want to share compute and osd in the same box, resources consumed by
> > OSD can be a handicap.
>
>
> If the CPU cycles used by Ceph are a problem, your architecture has IMHO
> bigger problems.  You need to design for a safety margin of RAM and CPU to
> accommodate spikes in usage, both by Ceph and by your desktops.  There is
> no way each of the systems you describe is going to have enough cycles for
> 100 desktops concurrently active.  You'd be allocating each of them only
> ~3GB of RAM -- I've not had to run MS Windows 10 but even with page sharing
> that seems awfully tight on RAM.
>
> Since you mention PProLiant and 8 drives I'm going assume you're targeting
> the DL360?  I suggest if possible considering the 10SFF models to get you
> more drive bays, ditching the optical drive.  If you can get rear bays to
> use to boot the OS from, that's better yet so you free up front panel drive
> bays for OSD use.  You want to maximize the number of drive bays available
> for OSD use, and if at all possible you want to avoid deploying the
> operating system's filesystems and OSDs on the same drives.
>
> With the numbers you mention throughout the thread, it would seem as
> though you would end up wit

Re: [ceph-users] features required for live migration

2017-11-14 Thread Konstantin Shalygin
For understanding: a live migration is essentially the same as a clean run
from the powered-off state, except that memory is also copied from one host
to another; i.e. if your VM starts fine from the powered-off state, then live
migration should work without any issues.


Also, the client must be compatible with the features, otherwise qemu will
not be able to open the RBD image. For this reason there is an option
that sets the default features when creating a volume - if it is not
possible to upgrade librbd on a host (or any other component), the
features can be tailored to the librbd that is present.



On 11/14/2017 04:41 PM, Oscar Segarra wrote:
Yes, but looks lots of features like snapshot, fast-diff require some 
other features... If I enable exclusive-locking or journaling, live 
migration will be possible too?


Is it recommended to set KVM disk "shareable" depending on the 
activated features?


--
Best regards,
Konstantin Shalygin

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-14 Thread Iban Cabrillo
Hi,
   You should do something like `ceph osd in osd.${num}`.
   But if this is your tree, I do not see any OSD available in your cluster at
this moment; it should look something like this example:

ID CLASS WEIGHT   TYPE NAME       STATUS REWEIGHT PRI-AFF
-1   58.21509 root default

-2   29.12000     host cephosd01
 1   hdd  3.64000     osd.1    up  1.0 1.0
..
-3   29.09509     host cephosd02
 0   hdd  3.63689     osd.0    up  1.0 1.0
..

Please have a look at the guide:
http://docs.ceph.com/docs/luminous/rados/deployment/ceph-deploy-osd/


Regards, I
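
A minimal sketch of what that usually looks like with ceph-deploy (host and
device names are placeholders; the exact sub-command syntax depends on the
ceph-deploy version):

# confirm there really are no OSDs yet
ceph osd tree
ceph -s

# create one OSD per data disk
ceph-deploy osd create cephnode1:sdb                  # ceph-deploy 1.5.x style
# ceph-deploy osd create --data /dev/sdb cephnode1    # ceph-deploy 2.x style
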

2017-11-14 10:58 GMT+01:00 Dai Xiang :

> On Tue, Nov 14, 2017 at 10:52:00AM +0100, Iban Cabrillo wrote:
> > Hi Dai Xiang,
> >   There is no OSD available at this moment in your cluste, then you can't
> > read/write or mount anything, maybe the osds are configured but they are
> > out, please could you paste the "#ceph osd tree " command
> > to see your osd status ?
>
> ID CLASS WEIGHT TYPE NAMESTATUS REWEIGHT PRI-AFF
> -10 root default
>
> It is out indeed, but i really do not know how to fix it.
>
> --
> Best Regards
> Dai Xiang
> >
> > Regards, I
> >
> >
> > 2017-11-14 10:39 GMT+01:00 Dai Xiang :
> >
> > > On Tue, Nov 14, 2017 at 09:21:56AM +, Linh Vu wrote:
> > > > Odd, you only got 2 mons and 0 osds? Your cluster build looks
> incomplete.
> > >
> > > But from the log, osd seems normal:
> > > [172.17.0.4][INFO  ] checking OSD status...
> > > [172.17.0.4][DEBUG ] find the location of an executable
> > > [172.17.0.4][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat
> > > --format=json
> > > [ceph_deploy.osd][DEBUG ] Host 172.17.0.4 is now ready for osd use.
> > > ...
> > >
> > > [172.17.0.5][INFO  ] Running command: systemctl enable ceph.target
> > > [172.17.0.5][INFO  ] checking OSD status...
> > > [172.17.0.5][DEBUG ] find the location of an executable
> > > [172.17.0.5][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat
> > > --format=json
> > > [ceph_deploy.osd][DEBUG ] Host 172.17.0.5 is now ready for osd use.
> > >
> > > --
> > > Best Regards
> > > Dai Xiang
> > > >
> > > > Get Outlook for Android
> > > >
> > > > 
> > > > From: Dai Xiang 
> > > > Sent: Tuesday, November 14, 2017 6:12:27 PM
> > > > To: Linh Vu
> > > > Cc: ceph-users@lists.ceph.com
> > > > Subject: Re: mount failed since failed to load ceph kernel module
> > > >
> > > > On Tue, Nov 14, 2017 at 02:24:06AM +, Linh Vu wrote:
> > > > > Your kernel is way too old for CephFS Luminous. I'd use one of the
> > > newer kernels from http://elrepo.org. :) We're on 4.12 here on RHEL
> 7.4.
> > > >
> > > > I had updated kernel version to newest:
> > > > [root@d32f3a7b6eb8 ~]$ uname -a
> > > > Linux d32f3a7b6eb8 4.14.0-1.el7.elrepo.x86_64 #1 SMP Sun Nov 12
> 20:21:04
> > > EST 2017 x86_64 x86_64 x86_64 GNU/Linux
> > > > [root@d32f3a7b6eb8 ~]$ cat /etc/redhat-release
> > > > CentOS Linux release 7.2.1511 (Core)
> > > >
> > > > But still failed:
> > > > [root@d32f3a7b6eb8 ~]$ /bin/mount 172.17.0.4,172.17.0.5:/ /cephfs -t
> > > ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> > > > failed to load ceph kernel module (1)
> > > > parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> > > > mount error 2 = No such file or directory
> > > > [root@d32f3a7b6eb8 ~]$ ll /cephfs
> > > > total 0
> > > >
> > > > [root@d32f3a7b6eb8 ~]$ ceph -s
> > > >   cluster:
> > > > id: a5f1d744-35eb-4e1b-a7c7-cb9871ec559d
> > > > health: HEALTH_WARN
> > > > Reduced data availability: 128 pgs inactive
> > > > Degraded data redundancy: 128 pgs unclean
> > > >
> > > >   services:
> > > > mon: 2 daemons, quorum d32f3a7b6eb8,1d22f2d81028
> > > > mgr: d32f3a7b6eb8(active), standbys: 1d22f2d81028
> > > > mds: cephfs-1/1/1 up  {0=1d22f2d81028=up:creating}, 1 up:standby
> > > > osd: 0 osds: 0 up, 0 in
> > > >
> > > >   data:
> > > > pools:   2 pools, 128 pgs
> > > > objects: 0 objects, 0 bytes
> > > > usage:   0 kB used, 0 kB / 0 kB avail
> > > > pgs: 100.000% pgs unknown
> > > >  128 unknown
> > > >
> > > > [root@d32f3a7b6eb8 ~]$ lsmod | grep ceph
> > > > ceph  372736  0
> > > > libceph   315392  1 ceph
> > > > fscache65536  3 ceph,nfsv4,nfs
> > > > libcrc32c  16384  5 libceph,nf_conntrack,xfs,dm_
> > > persistent_data,nf_nat
> > > >
> > > >
> > > > --
> > > > Best Regards
> > > > Dai Xiang
> > > > >
> > > > >
> > > > > Hi!
> > > > >
> > > > > I got a confused issue in docker as below:
> > > > >
> > > > > After install ceph successfully, i want to mount cephfs but failed:
> > > > >
> > > > > [root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> > > > > failed to load ceph kernel module (1)
> > > > > parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret

Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-14 Thread Dai Xiang
On Tue, Nov 14, 2017 at 10:52:00AM +0100, Iban Cabrillo wrote:
> Hi Dai Xiang,
>   There is no OSD available at this moment in your cluste, then you can't
> read/write or mount anything, maybe the osds are configured but they are
> out, please could you paste the "#ceph osd tree " command
> to see your osd status ?

ID CLASS WEIGHT TYPE NAME     STATUS REWEIGHT PRI-AFF
-1       0 root default

It is out indeed, but I really do not know how to fix it.
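
A few commands that usually show what happened on the OSD hosts; the ceph-osd@0 unit name below is only an example, not something taken from this thread:

ceph osd stat                        # how many OSDs the monitors know about
ceph-volume lvm list                 # OSDs prepared on this host, if any
systemctl list-units 'ceph-osd@*'    # OSD services present on this host
journalctl -u ceph-osd@0 -n 50       # log of one specific OSD daemon, e.g. id 0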

-- 
Best Regards
Dai Xiang
> 
> Regards, I
> 
> 
> 2017-11-14 10:39 GMT+01:00 Dai Xiang :
> 
> > On Tue, Nov 14, 2017 at 09:21:56AM +, Linh Vu wrote:
> > > Odd, you only got 2 mons and 0 osds? Your cluster build looks incomplete.
> >
> > But from the log, osd seems normal:
> > [172.17.0.4][INFO  ] checking OSD status...
> > [172.17.0.4][DEBUG ] find the location of an executable
> > [172.17.0.4][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat
> > --format=json
> > [ceph_deploy.osd][DEBUG ] Host 172.17.0.4 is now ready for osd use.
> > ...
> >
> > [172.17.0.5][INFO  ] Running command: systemctl enable ceph.target
> > [172.17.0.5][INFO  ] checking OSD status...
> > [172.17.0.5][DEBUG ] find the location of an executable
> > [172.17.0.5][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat
> > --format=json
> > [ceph_deploy.osd][DEBUG ] Host 172.17.0.5 is now ready for osd use.
> >
> > --
> > Best Regards
> > Dai Xiang
> > >
> > > Get Outlook for Android
> > >
> > > 
> > > From: Dai Xiang 
> > > Sent: Tuesday, November 14, 2017 6:12:27 PM
> > > To: Linh Vu
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: mount failed since failed to load ceph kernel module
> > >
> > > On Tue, Nov 14, 2017 at 02:24:06AM +, Linh Vu wrote:
> > > > Your kernel is way too old for CephFS Luminous. I'd use one of the
> > newer kernels from http://elrepo.org. :) We're on 4.12 here on RHEL 7.4.
> > >
> > > I had updated kernel version to newest:
> > > [root@d32f3a7b6eb8 ~]$ uname -a
> > > Linux d32f3a7b6eb8 4.14.0-1.el7.elrepo.x86_64 #1 SMP Sun Nov 12 20:21:04
> > EST 2017 x86_64 x86_64 x86_64 GNU/Linux
> > > [root@d32f3a7b6eb8 ~]$ cat /etc/redhat-release
> > > CentOS Linux release 7.2.1511 (Core)
> > >
> > > But still failed:
> > > [root@d32f3a7b6eb8 ~]$ /bin/mount 172.17.0.4,172.17.0.5:/ /cephfs -t
> > ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> > > failed to load ceph kernel module (1)
> > > parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> > > mount error 2 = No such file or directory
> > > [root@d32f3a7b6eb8 ~]$ ll /cephfs
> > > total 0
> > >
> > > [root@d32f3a7b6eb8 ~]$ ceph -s
> > >   cluster:
> > > id: a5f1d744-35eb-4e1b-a7c7-cb9871ec559d
> > > health: HEALTH_WARN
> > > Reduced data availability: 128 pgs inactive
> > > Degraded data redundancy: 128 pgs unclean
> > >
> > >   services:
> > > mon: 2 daemons, quorum d32f3a7b6eb8,1d22f2d81028
> > > mgr: d32f3a7b6eb8(active), standbys: 1d22f2d81028
> > > mds: cephfs-1/1/1 up  {0=1d22f2d81028=up:creating}, 1 up:standby
> > > osd: 0 osds: 0 up, 0 in
> > >
> > >   data:
> > > pools:   2 pools, 128 pgs
> > > objects: 0 objects, 0 bytes
> > > usage:   0 kB used, 0 kB / 0 kB avail
> > > pgs: 100.000% pgs unknown
> > >  128 unknown
> > >
> > > [root@d32f3a7b6eb8 ~]$ lsmod | grep ceph
> > > ceph  372736  0
> > > libceph   315392  1 ceph
> > > fscache65536  3 ceph,nfsv4,nfs
> > > libcrc32c  16384  5 libceph,nf_conntrack,xfs,dm_
> > persistent_data,nf_nat
> > >
> > >
> > > --
> > > Best Regards
> > > Dai Xiang
> > > >
> > > >
> > > > Hi!
> > > >
> > > > I got a confused issue in docker as below:
> > > >
> > > > After install ceph successfully, i want to mount cephfs but failed:
> > > >
> > > > [root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> > > > failed to load ceph kernel module (1)
> > > > parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> > > > mount error 5 = Input/output error
> > > >
> > > > But ceph related kernel modules have existed:
> > > >
> > > > [root@dbffa72704e4 ~]$ lsmod | grep ceph
> > > > ceph  327687  0
> > > > libceph   287066  1 ceph
> > > > dns_resolver   13140  2 nfsv4,libceph
> > > > libcrc32c  12644  3 xfs,libceph,dm_persistent_data
> > > >
> > > > Check the ceph state(i only set data disk for osd):
> > > >
> > > > [root@dbffa72704e4 ~]$ ceph -s
> > > >   cluster:
> > > > id: 20f51975-303e-446f-903f-04e1feaff7d0
> > > > health: HEALTH_WARN
> > > > Reduced data availability: 128 pgs inactive
> > > > Degraded data redundancy: 128 pgs unclean
> > > >
> > > >   services:
> > > > mon: 2 daemons, quorum dbffa72704e4,5807d12f920e
> > > > mgr: dbffa72704e4(active), standbys: 5807d12f920e

Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-14 Thread Iban Cabrillo
Hi Dai Xiang,
  There is no OSD available in your cluster at the moment, so you can't
read/write or mount anything. The OSDs may be configured but marked out.
Could you please paste the output of "ceph osd tree"
so we can see your OSD status?

Regards, I


2017-11-14 10:39 GMT+01:00 Dai Xiang :

> On Tue, Nov 14, 2017 at 09:21:56AM +, Linh Vu wrote:
> > Odd, you only got 2 mons and 0 osds? Your cluster build looks incomplete.
>
> But from the log, osd seems normal:
> [172.17.0.4][INFO  ] checking OSD status...
> [172.17.0.4][DEBUG ] find the location of an executable
> [172.17.0.4][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat
> --format=json
> [ceph_deploy.osd][DEBUG ] Host 172.17.0.4 is now ready for osd use.
> ...
>
> [172.17.0.5][INFO  ] Running command: systemctl enable ceph.target
> [172.17.0.5][INFO  ] checking OSD status...
> [172.17.0.5][DEBUG ] find the location of an executable
> [172.17.0.5][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat
> --format=json
> [ceph_deploy.osd][DEBUG ] Host 172.17.0.5 is now ready for osd use.
>
> --
> Best Regards
> Dai Xiang
> >
> > Get Outlook for Android
> >
> > 
> > From: Dai Xiang 
> > Sent: Tuesday, November 14, 2017 6:12:27 PM
> > To: Linh Vu
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: mount failed since failed to load ceph kernel module
> >
> > On Tue, Nov 14, 2017 at 02:24:06AM +, Linh Vu wrote:
> > > Your kernel is way too old for CephFS Luminous. I'd use one of the
> newer kernels from http://elrepo.org. :) We're on 4.12 here on RHEL 7.4.
> >
> > I had updated kernel version to newest:
> > [root@d32f3a7b6eb8 ~]$ uname -a
> > Linux d32f3a7b6eb8 4.14.0-1.el7.elrepo.x86_64 #1 SMP Sun Nov 12 20:21:04
> EST 2017 x86_64 x86_64 x86_64 GNU/Linux
> > [root@d32f3a7b6eb8 ~]$ cat /etc/redhat-release
> > CentOS Linux release 7.2.1511 (Core)
> >
> > But still failed:
> > [root@d32f3a7b6eb8 ~]$ /bin/mount 172.17.0.4,172.17.0.5:/ /cephfs -t
> ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> > failed to load ceph kernel module (1)
> > parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> > mount error 2 = No such file or directory
> > [root@d32f3a7b6eb8 ~]$ ll /cephfs
> > total 0
> >
> > [root@d32f3a7b6eb8 ~]$ ceph -s
> >   cluster:
> > id: a5f1d744-35eb-4e1b-a7c7-cb9871ec559d
> > health: HEALTH_WARN
> > Reduced data availability: 128 pgs inactive
> > Degraded data redundancy: 128 pgs unclean
> >
> >   services:
> > mon: 2 daemons, quorum d32f3a7b6eb8,1d22f2d81028
> > mgr: d32f3a7b6eb8(active), standbys: 1d22f2d81028
> > mds: cephfs-1/1/1 up  {0=1d22f2d81028=up:creating}, 1 up:standby
> > osd: 0 osds: 0 up, 0 in
> >
> >   data:
> > pools:   2 pools, 128 pgs
> > objects: 0 objects, 0 bytes
> > usage:   0 kB used, 0 kB / 0 kB avail
> > pgs: 100.000% pgs unknown
> >  128 unknown
> >
> > [root@d32f3a7b6eb8 ~]$ lsmod | grep ceph
> > ceph  372736  0
> > libceph   315392  1 ceph
> > fscache65536  3 ceph,nfsv4,nfs
> > libcrc32c  16384  5 libceph,nf_conntrack,xfs,dm_
> persistent_data,nf_nat
> >
> >
> > --
> > Best Regards
> > Dai Xiang
> > >
> > >
> > > Hi!
> > >
> > > I got a confused issue in docker as below:
> > >
> > > After install ceph successfully, i want to mount cephfs but failed:
> > >
> > > [root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> > > failed to load ceph kernel module (1)
> > > parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> > > mount error 5 = Input/output error
> > >
> > > But ceph related kernel modules have existed:
> > >
> > > [root@dbffa72704e4 ~]$ lsmod | grep ceph
> > > ceph  327687  0
> > > libceph   287066  1 ceph
> > > dns_resolver   13140  2 nfsv4,libceph
> > > libcrc32c  12644  3 xfs,libceph,dm_persistent_data
> > >
> > > Check the ceph state(i only set data disk for osd):
> > >
> > > [root@dbffa72704e4 ~]$ ceph -s
> > >   cluster:
> > > id: 20f51975-303e-446f-903f-04e1feaff7d0
> > > health: HEALTH_WARN
> > > Reduced data availability: 128 pgs inactive
> > > Degraded data redundancy: 128 pgs unclean
> > >
> > >   services:
> > > mon: 2 daemons, quorum dbffa72704e4,5807d12f920e
> > > mgr: dbffa72704e4(active), standbys: 5807d12f920e
> > > mds: cephfs-1/1/1 up  {0=5807d12f920e=up:creating}, 1 up:standby
> > > osd: 0 osds: 0 up, 0 in
> > >
> > >   data:
> > > pools:   2 pools, 128 pgs
> > > objects: 0 objects, 0 bytes
> > > usage:   0 kB used, 0 kB / 0 kB avail
> > > pgs: 100.000% pgs unknown
> > >  128 unknown
> > >
> > > [root@dbffa72704e4 ~]$ ceph version
> > > ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e)
> luminous (stable)
> > >
> > > My container is based on centos:centos7.2.1511, kernel is 3e0728877e22 3.10.0-514.el7.x86_64.

Re: [ceph-users] Cephalocon 2018?

2017-11-14 Thread Danny Al-Gaaf
At the OpenStack Summit in Sydney, Sage announced a Cephalocon for
2018.03.22-23 in Beijing (China).

Danny

On 12.10.2017 at 13:02, Matthew Vernon wrote:
> Hi,
> 
> The recent FOSDEM CFP reminded me to wonder if there's likely to be a
> Cephalocon in 2018? It was mentioned as a possibility when the 2017 one
> was cancelled...
> 
> Regards,
> 
> Matthew
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] features required for live migration

2017-11-14 Thread Oscar Segarra
Hi,

Yes, but it looks like many features such as snapshot and fast-diff require
other features... If I enable exclusive-lock or journaling, will live
migration still be possible?

Is it recommended to set the KVM disk "shareable" depending on the activated
features?

Thanks a lot!


2017-11-14 4:52 GMT+01:00 Konstantin Shalygin :

> I'd like to use the live migration feature of KVM. In this scenario, what
>> features may be enabled in the rbd base image? and in my EV (snapshot
>> clone)?
>>
>
> You can use live migration without features. For KVM I can recommend
> minimal "rbd default features = 3" (layering, striping).
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-14 Thread Dai Xiang
On Tue, Nov 14, 2017 at 09:21:56AM +, Linh Vu wrote:
> Odd, you only got 2 mons and 0 osds? Your cluster build looks incomplete.

But from the log, the OSDs seem normal:
[172.17.0.4][INFO  ] checking OSD status...
[172.17.0.4][DEBUG ] find the location of an executable
[172.17.0.4][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat 
--format=json
[ceph_deploy.osd][DEBUG ] Host 172.17.0.4 is now ready for osd use.
...

[172.17.0.5][INFO  ] Running command: systemctl enable ceph.target
[172.17.0.5][INFO  ] checking OSD status...
[172.17.0.5][DEBUG ] find the location of an executable
[172.17.0.5][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat 
--format=json
[ceph_deploy.osd][DEBUG ] Host 172.17.0.5 is now ready for osd use.
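
(The ceph-deploy log above only shows that each host is ready for OSD use. A quick cluster-side sanity check, just a sketch with nothing beyond the command names taken from this thread, would be:)

ceph osd stat    # "0 osds: 0 up, 0 in" means no OSD ever registered with the monitors
ceph osd tree    # an empty "root default" with no hosts underneath means the same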

-- 
Best Regards
Dai Xiang
> 
> Get Outlook for Android
> 
> 
> From: Dai Xiang 
> Sent: Tuesday, November 14, 2017 6:12:27 PM
> To: Linh Vu
> Cc: ceph-users@lists.ceph.com
> Subject: Re: mount failed since failed to load ceph kernel module
> 
> On Tue, Nov 14, 2017 at 02:24:06AM +, Linh Vu wrote:
> > Your kernel is way too old for CephFS Luminous. I'd use one of the newer 
> > kernels from http://elrepo.org. :) We're on 4.12 here on RHEL 7.4.
> 
> I had updated kernel version to newest:
> [root@d32f3a7b6eb8 ~]$ uname -a
> Linux d32f3a7b6eb8 4.14.0-1.el7.elrepo.x86_64 #1 SMP Sun Nov 12 20:21:04 EST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
> [root@d32f3a7b6eb8 ~]$ cat /etc/redhat-release
> CentOS Linux release 7.2.1511 (Core)
> 
> But still failed:
> [root@d32f3a7b6eb8 ~]$ /bin/mount 172.17.0.4,172.17.0.5:/ /cephfs -t ceph -o 
> name=admin,secretfile=/etc/ceph/admin.secret -v
> failed to load ceph kernel module (1)
> parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> mount error 2 = No such file or directory
> [root@d32f3a7b6eb8 ~]$ ll /cephfs
> total 0
> 
> [root@d32f3a7b6eb8 ~]$ ceph -s
>   cluster:
> id: a5f1d744-35eb-4e1b-a7c7-cb9871ec559d
> health: HEALTH_WARN
> Reduced data availability: 128 pgs inactive
> Degraded data redundancy: 128 pgs unclean
> 
>   services:
> mon: 2 daemons, quorum d32f3a7b6eb8,1d22f2d81028
> mgr: d32f3a7b6eb8(active), standbys: 1d22f2d81028
> mds: cephfs-1/1/1 up  {0=1d22f2d81028=up:creating}, 1 up:standby
> osd: 0 osds: 0 up, 0 in
> 
>   data:
> pools:   2 pools, 128 pgs
> objects: 0 objects, 0 bytes
> usage:   0 kB used, 0 kB / 0 kB avail
> pgs: 100.000% pgs unknown
>  128 unknown
> 
> [root@d32f3a7b6eb8 ~]$ lsmod | grep ceph
> ceph  372736  0
> libceph   315392  1 ceph
> fscache65536  3 ceph,nfsv4,nfs
> libcrc32c  16384  5 
> libceph,nf_conntrack,xfs,dm_persistent_data,nf_nat
> 
> 
> --
> Best Regards
> Dai Xiang
> >
> >
> > Hi!
> >
> > I got a confused issue in docker as below:
> >
> > After install ceph successfully, i want to mount cephfs but failed:
> >
> > [root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> > failed to load ceph kernel module (1)
> > parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> > mount error 5 = Input/output error
> >
> > But ceph related kernel modules have existed:
> >
> > [root@dbffa72704e4 ~]$ lsmod | grep ceph
> > ceph  327687  0
> > libceph   287066  1 ceph
> > dns_resolver   13140  2 nfsv4,libceph
> > libcrc32c  12644  3 xfs,libceph,dm_persistent_data
> >
> > Check the ceph state(i only set data disk for osd):
> >
> > [root@dbffa72704e4 ~]$ ceph -s
> >   cluster:
> > id: 20f51975-303e-446f-903f-04e1feaff7d0
> > health: HEALTH_WARN
> > Reduced data availability: 128 pgs inactive
> > Degraded data redundancy: 128 pgs unclean
> >
> >   services:
> > mon: 2 daemons, quorum dbffa72704e4,5807d12f920e
> > mgr: dbffa72704e4(active), standbys: 5807d12f920e
> > mds: cephfs-1/1/1 up  {0=5807d12f920e=up:creating}, 1 up:standby
> > osd: 0 osds: 0 up, 0 in
> >
> >   data:
> > pools:   2 pools, 128 pgs
> > objects: 0 objects, 0 bytes
> > usage:   0 kB used, 0 kB / 0 kB avail
> > pgs: 100.000% pgs unknown
> >  128 unknown
> >
> > [root@dbffa72704e4 ~]$ ceph version
> > ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous 
> > (stable)
> >
> > My container is based on centos:centos7.2.1511, kernel is 3e0728877e22 
> > 3.10.0-514.el7.x86_64.
> >
> > I saw some ceph related images on docker hub so that i think above
> > operation is ok, did i miss something important?
> >
> > --
> > Best Regards
> > Dai Xiang
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-14 Thread Linh Vu
Odd, you only got 2 mons and 0 osds? Your cluster build looks incomplete.

Get Outlook for Android


From: Dai Xiang 
Sent: Tuesday, November 14, 2017 6:12:27 PM
To: Linh Vu
Cc: ceph-users@lists.ceph.com
Subject: Re: mount failed since failed to load ceph kernel module

On Tue, Nov 14, 2017 at 02:24:06AM +, Linh Vu wrote:
> Your kernel is way too old for CephFS Luminous. I'd use one of the newer 
> kernels from http://elrepo.org. :) We're on 4.12 here on RHEL 7.4.

I have updated the kernel to the newest version:
[root@d32f3a7b6eb8 ~]$ uname -a
Linux d32f3a7b6eb8 4.14.0-1.el7.elrepo.x86_64 #1 SMP Sun Nov 12 20:21:04 EST 
2017 x86_64 x86_64 x86_64 GNU/Linux
[root@d32f3a7b6eb8 ~]$ cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

But still failed:
[root@d32f3a7b6eb8 ~]$ /bin/mount 172.17.0.4,172.17.0.5:/ /cephfs -t ceph -o 
name=admin,secretfile=/etc/ceph/admin.secret -v
failed to load ceph kernel module (1)
parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
mount error 2 = No such file or directory
[root@d32f3a7b6eb8 ~]$ ll /cephfs
total 0

[root@d32f3a7b6eb8 ~]$ ceph -s
  cluster:
id: a5f1d744-35eb-4e1b-a7c7-cb9871ec559d
health: HEALTH_WARN
Reduced data availability: 128 pgs inactive
Degraded data redundancy: 128 pgs unclean

  services:
mon: 2 daemons, quorum d32f3a7b6eb8,1d22f2d81028
mgr: d32f3a7b6eb8(active), standbys: 1d22f2d81028
mds: cephfs-1/1/1 up  {0=1d22f2d81028=up:creating}, 1 up:standby
osd: 0 osds: 0 up, 0 in

  data:
pools:   2 pools, 128 pgs
objects: 0 objects, 0 bytes
usage:   0 kB used, 0 kB / 0 kB avail
pgs: 100.000% pgs unknown
 128 unknown

[root@d32f3a7b6eb8 ~]$ lsmod | grep ceph
ceph  372736  0
libceph   315392  1 ceph
fscache65536  3 ceph,nfsv4,nfs
libcrc32c  16384  5 
libceph,nf_conntrack,xfs,dm_persistent_data,nf_nat
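
(A side note on the "failed to load ceph kernel module (1)" line above: mount.ceph just shells out to modprobe, which fails inside an unprivileged container even when the module is already loaded on the host, so that message by itself is usually harmless; the real problem is the mount error that follows. A rough sketch of the two usual workarounds, assuming the standard admin keyring is available under /etc/ceph:)

# on the docker host, not inside the container
modprobe ceph
lsmod | grep ceph

# or skip the kernel client and mount from inside the container with ceph-fuse
ceph-fuse -m 172.17.0.4,172.17.0.5 -n client.admin /cephfs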


--
Best Regards
Dai Xiang
>
>
> Hi!
>
> I got a confused issue in docker as below:
>
> After install ceph successfully, i want to mount cephfs but failed:
>
> [root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v
> failed to load ceph kernel module (1)
> parsing options: rw,name=admin,secretfile=/etc/ceph/admin.secret
> mount error 5 = Input/output error
>
> But ceph related kernel modules have existed:
>
> [root@dbffa72704e4 ~]$ lsmod | grep ceph
> ceph  327687  0
> libceph   287066  1 ceph
> dns_resolver   13140  2 nfsv4,libceph
> libcrc32c  12644  3 xfs,libceph,dm_persistent_data
>
> Check the ceph state(i only set data disk for osd):
>
> [root@dbffa72704e4 ~]$ ceph -s
>   cluster:
> id: 20f51975-303e-446f-903f-04e1feaff7d0
> health: HEALTH_WARN
> Reduced data availability: 128 pgs inactive
> Degraded data redundancy: 128 pgs unclean
>
>   services:
> mon: 2 daemons, quorum dbffa72704e4,5807d12f920e
> mgr: dbffa72704e4(active), standbys: 5807d12f920e
> mds: cephfs-1/1/1 up  {0=5807d12f920e=up:creating}, 1 up:standby
> osd: 0 osds: 0 up, 0 in
>
>   data:
> pools:   2 pools, 128 pgs
> objects: 0 objects, 0 bytes
> usage:   0 kB used, 0 kB / 0 kB avail
> pgs: 100.000% pgs unknown
>  128 unknown
>
> [root@dbffa72704e4 ~]$ ceph version
> ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous 
> (stable)
>
> My container is based on centos:centos7.2.1511, kernel is 3e0728877e22 
> 3.10.0-514.el7.x86_64.
>
> I saw some ceph related images on docker hub so that i think above
> operation is ok, did i miss something important?
>
> --
> Best Regards
> Dai Xiang

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Incorrect pool usage statistics

2017-11-14 Thread Alwin Antreich
Hello Karun,
On Tue, Nov 14, 2017 at 04:16:51AM +0530, Karun Josy wrote:
> Hello,
>
> Recently, I deleted all the disks from an erasure pool 'ecpool'.
> The pool is empty. However the space usage shows around 400GB.
> What might be wrong?
>
>
> $ rbd ls -l ecpool
> $ $ ceph df
>
> GLOBAL:
> SIZE   AVAIL  RAW USED %RAW USED
> 19019G 16796G 2223G    11.69
> POOLS:
> NAME      ID USED   %USED MAX AVAIL OBJECTS
> template 1227G  1.59 2810G   58549
> vm 21  0 0 4684G   2
> ecpool    33 403G   2.79  10038G    388652
> imagepool   34 90430M  0.62 4684G   22789
>
>
>
> Karun Josy

> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
What does 'rados -p ecpool ls' list? Maybe some leftover benchmarks?
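
If it does turn out to be leftover 'rados bench' objects, something like the following is usually enough to confirm and clean them up (pool name taken from the thread above):

rados -p ecpool ls | head       # leftover bench objects show up as benchmark_data_*
rados -p ecpool cleanup         # removes the objects written by 'rados bench'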

--
Cheers,
Alwin

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com