[ceph-users] About ceph.conf

2014-05-05 Thread Cao, Buddy
According to the change of ceph-deploy from mkcephfs, I feel ceph.conf is not a 
recommended way to manage ceph configuration. Is it true? If so, how do I get 
the configurations previous configured in ceph.conf? e.g., data drive, journal 
drive, [osd] conf, etc.


Wei Cao (Buddy)

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Jeff Bachtel
noout was set while I manhandled osd.4 in and out of the cluster 
repeatedly (trying to set copy data from other osds and set attr to make 
osd.4 pick up that it had objects in pg 0.2f). It wasn't set before the 
problem, and isn't set currently.


I don't really know where you saw pool size = 1:

# for p in $(ceph osd lspools | awk 'BEGIN { RS="," } { print $2 }'); do 
ceph osd pool get $p size;  done

size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2
size: 2

All pools are reporting size 2. The osd that last shared the incomplete 
pg (osd.1) had the pg directory intact and appropriately sized. However, 
it seems the pgmap was preferring osd.4 as the most recent copy of that 
pg, even when the pg directory was deleted. I guess because the pg was 
flagged incomplete, there was no further attempt to mirror the bogus pg 
onto another osd.


Since I sent my original email (this afternoon actually), I've nuked 
osd.4 and created an osd.5 on its old disc. I've still got pg 0.2f 
listed as down/incomplete/inactive despite marking its only home osd as 
lost. I'll follow up tomorrow after object recovery is as complete as 
it's going to get.


At this point though I'm shrugging and accepting the data loss, but 
ideas on how to create a new pg to replace the incomplete 0.2f would be 
deeply useful. I'm supposing ceph pg force_create_pg 0.2f would suffice.


Jeff

On 05/05/2014 07:46 PM, Gregory Farnum wrote:

Oh, you've got no-out set. Did you lose an OSD at any point? Are you
really running the system with pool size 1? I think you've managed to
erase the up-to-date data, but not the records of that data's
existence. You'll have to explore the various "lost" commands, but I'm
not sure what the right approach is here. It's possible you're just
out of luck after manually adjusting the store improperly.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Mon, May 5, 2014 at 4:39 PM, Jeff Bachtel
 wrote:

Thanks. That is a cool utility, unfortunately I'm pretty sure the pg in
question had a cephfs object instead of rbd images (because mounting cephfs
is the only noticeable brokenness).

Jeff


On 05/05/2014 06:43 PM, Jake Young wrote:

I was in a similar situation where I could see the PGs data on an osd, but
there was nothing I could do to force the pg to use that osd's copy.

I ended up using the rbd_restore tool to create my rbd on disk and then I
reimported it into the pool.

See this thread for info on rbd_restore:
http://www.spinics.net/lists/ceph-devel/msg11552.html

Of course, you have to copy all of the pieces of the rbd image on one file
system somewhere (thank goodness for thin provisioning!) for the tool to
work.

There really should be a better way.

Jake

On Monday, May 5, 2014, Jeff Bachtel 
wrote:

Well, that'd be the ideal solution. Please check out the github gist I
posted, though. It seems that despite osd.4 having nothing good for pg 0.2f,
the cluster does not acknowledge any other osd has a copy of the pg. I've
tried downing osd.4 and manually deleting the pg directory in question with
the hope that the cluster would roll back epochs for 0.2f, but all it does
is recreate the pg directory (empty) on osd.4.

Jeff

On 05/05/2014 04:33 PM, Gregory Farnum wrote:

What's your cluster look like? I wonder if you can just remove the bad
PG from osd.4 and let it recover from the existing osd.1
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sat, May 3, 2014 at 9:17 AM, Jeff Bachtel
 wrote:

This is all on firefly rc1 on CentOS 6

I had an osd getting overfull, and misinterpreting directions I downed
it
then manually removed pg directories from the osd mount. On restart and
after a good deal of rebalancing (setting osd weights as I should've
originally), I'm now at

  cluster de10594a-0737-4f34-a926-58dc9254f95f
   health HEALTH_WARN 2 pgs backfill; 1 pgs incomplete; 1 pgs stuck
inactive; 308 pgs stuck unclean; recov
ery 1/2420563 objects degraded (0.000%); noout flag(s) set
   monmap e7: 3 mons at

{controller1=10.100.2.1:6789/0,controller2=10.100.2.2:6789/0,controller3=10.100.2.
3:6789/0}, election epoch 556, quorum 0,1,2
controller1,controller2,controller3
   mdsmap e268: 1/1/1 up {0=controller1=up:active}
   osdmap e3492: 5 osds: 5 up, 5 in
  flags noout
pgmap v4167420: 320 pgs, 15 pools, 4811 GB data, 1181 kobjects
  9770 GB used, 5884 GB / 15654 GB avail
  1/2420563 objects degraded (0.000%)
 3 active
12 active+clean
 2 active+remapped+wait_backfill
 1 incomplete
   302 active+remapped
client io 364 B/s wr, 0 op/s

# ceph pg dump | grep 0.2f
dumped all in format plain
0.2f0   0   0   0   0   0   0 incomplete
2014-05-03 11:38:01.526832 0'0  3492:23 [4] 4   [4] 4
2254'20053  2014-04-28 00:24:36.504086

Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Jeff Bachtel
Thanks. That is a cool utility, unfortunately I'm pretty sure the pg in 
question had a cephfs object instead of rbd images (because mounting 
cephfs is the only noticeable brokenness).


Jeff

On 05/05/2014 06:43 PM, Jake Young wrote:
I was in a similar situation where I could see the PGs data on an osd, 
but there was nothing I could do to force the pg to use that osd's copy.


I ended up using the rbd_restore tool to create my rbd on disk and 
then I reimported it into the pool.


See this thread for info on rbd_restore:
http://www.spinics.net/lists/ceph-devel/msg11552.html

Of course, you have to copy all of the pieces of the rbd image on one 
file system somewhere (thank goodness for thin provisioning!) for the 
tool to work.


There really should be a better way.

Jake

On Monday, May 5, 2014, Jeff Bachtel > wrote:


Well, that'd be the ideal solution. Please check out the github
gist I posted, though. It seems that despite osd.4 having nothing
good for pg 0.2f, the cluster does not acknowledge any other osd
has a copy of the pg. I've tried downing osd.4 and manually
deleting the pg directory in question with the hope that the
cluster would roll back epochs for 0.2f, but all it does is
recreate the pg directory (empty) on osd.4.

Jeff

On 05/05/2014 04:33 PM, Gregory Farnum wrote:

What's your cluster look like? I wonder if you can just remove
the bad
PG from osd.4 and let it recover from the existing osd.1
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sat, May 3, 2014 at 9:17 AM, Jeff Bachtel
 wrote:

This is all on firefly rc1 on CentOS 6

I had an osd getting overfull, and misinterpreting
directions I downed it
then manually removed pg directories from the osd mount.
On restart and
after a good deal of rebalancing (setting osd weights as I
should've
originally), I'm now at

 cluster de10594a-0737-4f34-a926-58dc9254f95f
  health HEALTH_WARN 2 pgs backfill; 1 pgs incomplete;
1 pgs stuck
inactive; 308 pgs stuck unclean; recov
ery 1/2420563 objects degraded (0.000%); noout flag(s) set
  monmap e7: 3 mons at

{controller1=10.100.2.1:6789/0,controller2=10.100.2.2:6789/0,controller3=10.100.2

.
3:6789/0}, election epoch 556, quorum 0,1,2
controller1,controller2,controller3
  mdsmap e268: 1/1/1 up {0=controller1=up:active}
  osdmap e3492: 5 osds: 5 up, 5 in
 flags noout
   pgmap v4167420: 320 pgs, 15 pools, 4811 GB data,
1181 kobjects
 9770 GB used, 5884 GB / 15654 GB avail
 1/2420563 objects degraded (0.000%)
3 active
   12 active+clean
2 active+remapped+wait_backfill
1 incomplete
  302 active+remapped
   client io 364 B/s wr, 0 op/s

# ceph pg dump | grep 0.2f
dumped all in format plain
0.2f0   0   0   0   0   0   0
incomplete
2014-05-03 11:38:01.526832 0'0  3492:23 [4] 4   [4] 4
2254'20053  2014-04-28 00:24:36.504086  2100'18109
2014-04-26
22:26:23.699330

# ceph pg map 0.2f
osdmap e3492 pg 0.2f (0.2f) -> up [4] acting [4]

The pg query for the downed pg is at
https://gist.github.com/jeffb-bt/c8730899ff002070b325

Of course, the osd I manually mucked with is the only one
the cluster is
picking up as up/acting. Now, I can query the pg and find
epochs where other
osds (that I didn't jack up) were acting. And in fact, the
latest of those
entries (osd.1) has the pg directory in its osd mount, and
it's a good
healthy 59gb.

I've tried manually rsync'ing (and preserving attributes)
that set of
directories from osd.1 to osd.4 without success. Likewise
I've tried copying
the directories over without attributes set. I've done
many, many deep
scrubs but the pg query does not show the scrub timestamps
being affected.

I'm seeking ideas for either fixing metadata on the
directory on osd.4 to
cause this pg to be seen/recognized, or ideas on forcing
the cluster's pg
map to point to osd.1 for the incomplete pg (basically
wipi

Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Gregory Farnum
Oh, you've got no-out set. Did you lose an OSD at any point? Are you
really running the system with pool size 1? I think you've managed to
erase the up-to-date data, but not the records of that data's
existence. You'll have to explore the various "lost" commands, but I'm
not sure what the right approach is here. It's possible you're just
out of luck after manually adjusting the store improperly.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Mon, May 5, 2014 at 4:39 PM, Jeff Bachtel
 wrote:
> Thanks. That is a cool utility, unfortunately I'm pretty sure the pg in
> question had a cephfs object instead of rbd images (because mounting cephfs
> is the only noticeable brokenness).
>
> Jeff
>
>
> On 05/05/2014 06:43 PM, Jake Young wrote:
>
> I was in a similar situation where I could see the PGs data on an osd, but
> there was nothing I could do to force the pg to use that osd's copy.
>
> I ended up using the rbd_restore tool to create my rbd on disk and then I
> reimported it into the pool.
>
> See this thread for info on rbd_restore:
> http://www.spinics.net/lists/ceph-devel/msg11552.html
>
> Of course, you have to copy all of the pieces of the rbd image on one file
> system somewhere (thank goodness for thin provisioning!) for the tool to
> work.
>
> There really should be a better way.
>
> Jake
>
> On Monday, May 5, 2014, Jeff Bachtel 
> wrote:
>>
>> Well, that'd be the ideal solution. Please check out the github gist I
>> posted, though. It seems that despite osd.4 having nothing good for pg 0.2f,
>> the cluster does not acknowledge any other osd has a copy of the pg. I've
>> tried downing osd.4 and manually deleting the pg directory in question with
>> the hope that the cluster would roll back epochs for 0.2f, but all it does
>> is recreate the pg directory (empty) on osd.4.
>>
>> Jeff
>>
>> On 05/05/2014 04:33 PM, Gregory Farnum wrote:
>>>
>>> What's your cluster look like? I wonder if you can just remove the bad
>>> PG from osd.4 and let it recover from the existing osd.1
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Sat, May 3, 2014 at 9:17 AM, Jeff Bachtel
>>>  wrote:

 This is all on firefly rc1 on CentOS 6

 I had an osd getting overfull, and misinterpreting directions I downed
 it
 then manually removed pg directories from the osd mount. On restart and
 after a good deal of rebalancing (setting osd weights as I should've
 originally), I'm now at

  cluster de10594a-0737-4f34-a926-58dc9254f95f
   health HEALTH_WARN 2 pgs backfill; 1 pgs incomplete; 1 pgs stuck
 inactive; 308 pgs stuck unclean; recov
 ery 1/2420563 objects degraded (0.000%); noout flag(s) set
   monmap e7: 3 mons at

 {controller1=10.100.2.1:6789/0,controller2=10.100.2.2:6789/0,controller3=10.100.2.
 3:6789/0}, election epoch 556, quorum 0,1,2
 controller1,controller2,controller3
   mdsmap e268: 1/1/1 up {0=controller1=up:active}
   osdmap e3492: 5 osds: 5 up, 5 in
  flags noout
pgmap v4167420: 320 pgs, 15 pools, 4811 GB data, 1181 kobjects
  9770 GB used, 5884 GB / 15654 GB avail
  1/2420563 objects degraded (0.000%)
 3 active
12 active+clean
 2 active+remapped+wait_backfill
 1 incomplete
   302 active+remapped
client io 364 B/s wr, 0 op/s

 # ceph pg dump | grep 0.2f
 dumped all in format plain
 0.2f0   0   0   0   0   0   0 incomplete
 2014-05-03 11:38:01.526832 0'0  3492:23 [4] 4   [4] 4
 2254'20053  2014-04-28 00:24:36.504086  2100'18109 2014-04-26
 22:26:23.699330

 # ceph pg map 0.2f
 osdmap e3492 pg 0.2f (0.2f) -> up [4] acting [4]

 The pg query for the downed pg is at
 https://gist.github.com/jeffb-bt/c8730899ff002070b325

 Of course, the osd I manually mucked with is the only one the cluster is
 picking up as up/acting. Now, I can query the pg and find epochs where
 other
 osds (that I didn't jack up) were acting. And in fact, the latest of
 those
 entries (osd.1) has the pg directory in its osd mount, and it's a good
 healthy 59gb.

 I've tried manually rsync'ing (and preserving attributes) that set of
 directories from osd.1 to osd.4 without success. Likewise I've tried
 copying
 the directories over without attributes set. I've done many, many deep
 scrubs but the pg query does not show the scrub timestamps being
 affected.

 I'm seeking ideas for either fixing metadata on the directory on osd.4
 to
 cause this pg to be seen/recognized, or ideas on forcing the cluster's
 pg
 map to point to osd.1 for the incomplete pg (basically wiping out the
 cluster's memory that osd.4 ever had 0.2f). O

Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Jake Young
I was in a similar situation where I could see the PGs data on an osd, but
there was nothing I could do to force the pg to use that osd's copy.

I ended up using the rbd_restore tool to create my rbd on disk and then I
reimported it into the pool.

See this thread for info on rbd_restore:
http://www.spinics.net/lists/ceph-devel/msg11552.html

Of course, you have to copy all of the pieces of the rbd image on one file
system somewhere (thank goodness for thin provisioning!) for the tool to
work.

There really should be a better way.

Jake

On Monday, May 5, 2014, Jeff Bachtel 
wrote:

> Well, that'd be the ideal solution. Please check out the github gist I
> posted, though. It seems that despite osd.4 having nothing good for pg
> 0.2f, the cluster does not acknowledge any other osd has a copy of the pg.
> I've tried downing osd.4 and manually deleting the pg directory in question
> with the hope that the cluster would roll back epochs for 0.2f, but all it
> does is recreate the pg directory (empty) on osd.4.
>
> Jeff
>
> On 05/05/2014 04:33 PM, Gregory Farnum wrote:
>
>> What's your cluster look like? I wonder if you can just remove the bad
>> PG from osd.4 and let it recover from the existing osd.1
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Sat, May 3, 2014 at 9:17 AM, Jeff Bachtel
>>  wrote:
>>
>>> This is all on firefly rc1 on CentOS 6
>>>
>>> I had an osd getting overfull, and misinterpreting directions I downed it
>>> then manually removed pg directories from the osd mount. On restart and
>>> after a good deal of rebalancing (setting osd weights as I should've
>>> originally), I'm now at
>>>
>>>  cluster de10594a-0737-4f34-a926-58dc9254f95f
>>>   health HEALTH_WARN 2 pgs backfill; 1 pgs incomplete; 1 pgs stuck
>>> inactive; 308 pgs stuck unclean; recov
>>> ery 1/2420563 objects degraded (0.000%); noout flag(s) set
>>>   monmap e7: 3 mons at
>>> {controller1=10.100.2.1:6789/0,controller2=10.100.2.2:6789/
>>> 0,controller3=10.100.2.
>>> 3:6789/0}, election epoch 556, quorum 0,1,2
>>> controller1,controller2,controller3
>>>   mdsmap e268: 1/1/1 up {0=controller1=up:active}
>>>   osdmap e3492: 5 osds: 5 up, 5 in
>>>  flags noout
>>>pgmap v4167420: 320 pgs, 15 pools, 4811 GB data, 1181 kobjects
>>>  9770 GB used, 5884 GB / 15654 GB avail
>>>  1/2420563 objects degraded (0.000%)
>>> 3 active
>>>12 active+clean
>>> 2 active+remapped+wait_backfill
>>> 1 incomplete
>>>   302 active+remapped
>>>client io 364 B/s wr, 0 op/s
>>>
>>> # ceph pg dump | grep 0.2f
>>> dumped all in format plain
>>> 0.2f0   0   0   0   0   0   0 incomplete
>>> 2014-05-03 11:38:01.526832 0'0  3492:23 [4] 4   [4] 4
>>> 2254'20053  2014-04-28 00:24:36.504086  2100'18109 2014-04-26
>>> 22:26:23.699330
>>>
>>> # ceph pg map 0.2f
>>> osdmap e3492 pg 0.2f (0.2f) -> up [4] acting [4]
>>>
>>> The pg query for the downed pg is at
>>> https://gist.github.com/jeffb-bt/c8730899ff002070b325
>>>
>>> Of course, the osd I manually mucked with is the only one the cluster is
>>> picking up as up/acting. Now, I can query the pg and find epochs where
>>> other
>>> osds (that I didn't jack up) were acting. And in fact, the latest of
>>> those
>>> entries (osd.1) has the pg directory in its osd mount, and it's a good
>>> healthy 59gb.
>>>
>>> I've tried manually rsync'ing (and preserving attributes) that set of
>>> directories from osd.1 to osd.4 without success. Likewise I've tried
>>> copying
>>> the directories over without attributes set. I've done many, many deep
>>> scrubs but the pg query does not show the scrub timestamps being
>>> affected.
>>>
>>> I'm seeking ideas for either fixing metadata on the directory on osd.4 to
>>> cause this pg to be seen/recognized, or ideas on forcing the cluster's pg
>>> map to point to osd.1 for the incomplete pg (basically wiping out the
>>> cluster's memory that osd.4 ever had 0.2f). Or any other solution :) It's
>>> only 59g, so worst case I'll mark it lost and recreate the pg, but I'd
>>> prefer to learn enough of the innards to understand what is going on, and
>>> possible means of fixing it.
>>>
>>> Thanks for any help,
>>>
>>> Jeff
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-05 Thread Andrija Panic
Hi Wido,

thanks again for inputs.

Everything is fine, except for the Software Router - it doesn't seem to get
created on CEPH, no matter what I try.

I created new offering for CPVV and SSVM and used the guide here:
https://cloudstack.apache.org/docs/en-US/Apache_CloudStack/4.2.0/html-single/Admin_Guide/index.html#sys-offering-sysvmto
start using these new system offerings and it is all fine. Did the
same
for Software Router, but it keeps using original system offering, instead
of the one I created.

CS keeps creating VR on NFS storage, choosen randomly among 3 NFS storage
nodes...

Any suggestion, please ?

Thanks,
Andrija


On 5 May 2014 16:11, Andrija Panic  wrote:

> Will try creating tag inside CS database, since GUI/cloudmoneky editing of
> existing offer is NOT possible...
>
>
>
> On 5 May 2014 16:04, Brian Rak  wrote:
>
>>  This would be a better question for the Cloudstack community.
>>
>>
>> On 5/2/2014 10:06 AM, Andrija Panic wrote:
>>
>> Hi.
>>
>>  I was wondering what would be correct way to migrate system VMs
>> (storage,console,VR) from local storage to CEPH.
>>
>>  I'm on CS 4.2.1 and will be soon updating to 4.3...
>>
>>  Is it enough to just change global setting system.vm.use.local.storage
>> = true, to FALSE, and then destroy system VMs (cloudstack will recreate
>> them in 1-2 minutes)
>>
>>  Also how to make sure that system VMs will NOT end up on NFS storage ?
>>
>>  Thanks for any input...
>>
>>  --
>>
>> Andrija Panić
>>
>>
>> ___
>> ceph-users mailing 
>> listceph-us...@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>
>
> --
>
> Andrija Panić
> --
>   http://admintweets.com
> --
>



-- 

Andrija Panić
--
  http://admintweets.com
--
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Jeff Bachtel
Well, that'd be the ideal solution. Please check out the github gist I 
posted, though. It seems that despite osd.4 having nothing good for pg 
0.2f, the cluster does not acknowledge any other osd has a copy of the 
pg. I've tried downing osd.4 and manually deleting the pg directory in 
question with the hope that the cluster would roll back epochs for 0.2f, 
but all it does is recreate the pg directory (empty) on osd.4.


Jeff

On 05/05/2014 04:33 PM, Gregory Farnum wrote:

What's your cluster look like? I wonder if you can just remove the bad
PG from osd.4 and let it recover from the existing osd.1
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sat, May 3, 2014 at 9:17 AM, Jeff Bachtel
 wrote:

This is all on firefly rc1 on CentOS 6

I had an osd getting overfull, and misinterpreting directions I downed it
then manually removed pg directories from the osd mount. On restart and
after a good deal of rebalancing (setting osd weights as I should've
originally), I'm now at

 cluster de10594a-0737-4f34-a926-58dc9254f95f
  health HEALTH_WARN 2 pgs backfill; 1 pgs incomplete; 1 pgs stuck
inactive; 308 pgs stuck unclean; recov
ery 1/2420563 objects degraded (0.000%); noout flag(s) set
  monmap e7: 3 mons at
{controller1=10.100.2.1:6789/0,controller2=10.100.2.2:6789/0,controller3=10.100.2.
3:6789/0}, election epoch 556, quorum 0,1,2
controller1,controller2,controller3
  mdsmap e268: 1/1/1 up {0=controller1=up:active}
  osdmap e3492: 5 osds: 5 up, 5 in
 flags noout
   pgmap v4167420: 320 pgs, 15 pools, 4811 GB data, 1181 kobjects
 9770 GB used, 5884 GB / 15654 GB avail
 1/2420563 objects degraded (0.000%)
3 active
   12 active+clean
2 active+remapped+wait_backfill
1 incomplete
  302 active+remapped
   client io 364 B/s wr, 0 op/s

# ceph pg dump | grep 0.2f
dumped all in format plain
0.2f0   0   0   0   0   0   0 incomplete
2014-05-03 11:38:01.526832 0'0  3492:23 [4] 4   [4] 4
2254'20053  2014-04-28 00:24:36.504086  2100'18109 2014-04-26
22:26:23.699330

# ceph pg map 0.2f
osdmap e3492 pg 0.2f (0.2f) -> up [4] acting [4]

The pg query for the downed pg is at
https://gist.github.com/jeffb-bt/c8730899ff002070b325

Of course, the osd I manually mucked with is the only one the cluster is
picking up as up/acting. Now, I can query the pg and find epochs where other
osds (that I didn't jack up) were acting. And in fact, the latest of those
entries (osd.1) has the pg directory in its osd mount, and it's a good
healthy 59gb.

I've tried manually rsync'ing (and preserving attributes) that set of
directories from osd.1 to osd.4 without success. Likewise I've tried copying
the directories over without attributes set. I've done many, many deep
scrubs but the pg query does not show the scrub timestamps being affected.

I'm seeking ideas for either fixing metadata on the directory on osd.4 to
cause this pg to be seen/recognized, or ideas on forcing the cluster's pg
map to point to osd.1 for the incomplete pg (basically wiping out the
cluster's memory that osd.4 ever had 0.2f). Or any other solution :) It's
only 59g, so worst case I'll mark it lost and recreate the pg, but I'd
prefer to learn enough of the innards to understand what is going on, and
possible means of fixing it.

Thanks for any help,

Jeff

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] some unfound object

2014-05-05 Thread Gregory Farnum
"Need" means "I know this version of the object has existed at some
time in the cluster". "Have" means "this is the newest version of the
object I currently have available". If you're missing OSDs (or have
been in the past) you may need to invoke some of the "lost" commands
to tell the OSDs to just go with what they have.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Mon, May 5, 2014 at 2:01 AM, vernon1...@126.com  wrote:
> Hi everyone,
> my ceph has some object unfound. When I run "ceph pg 50.5 list_missing", it
> show me:
>
> ... ...
> { "oid": { "oid": "rbd_data.53ec83d1b58ba.0740",
>   "key": "",
>   "snapid": -2,
>   "hash": 4097468757,
>   "max": 0,
>   "pool": 50,
>   "namespace": ""},
>   "need": "27229'2203677",
>   "have": "27064'2202729",
>   "locations": []},
> ... ...
>
> I want to know, what's the "need" and "have"? Can I change it? Or how to fix
> it?
> 
> vernon1...@126.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fatigue for XFS

2014-05-05 Thread Andrey Korolyov
On Tue, May 6, 2014 at 12:36 AM, Dave Chinner  wrote:
> On Mon, May 05, 2014 at 11:49:05PM +0400, Andrey Korolyov wrote:
>> Hello,
>>
>> We are currently exploring issue which can be related to Ceph itself
>> or to the XFS - any help is very appreciated.
>>
>> First, the picture: relatively old cluster w/ two years uptime and ten
>> months after fs recreation on every OSD, one of daemons started to
>> flap approximately once per day for couple of weeks, with no external
>> reason (bandwidth/IOPS/host issues). It looks almost the same every
>> time - OSD suddenly stop serving requests for a short period, gets
>> kicked out by peers report, then returns in a couple of seconds. Of
>> course, small but sensitive amount of requests are delayed by 15-30
>> seconds twice, which is bad for us. The only thing which correlates
>> with this kick is a peak of I/O, not too large, even not consuming all
>> underlying disk utilization, but alone in the cluster and clearly
>> visible. Also there are at least two occasions *without* correlated
>> iowait peak.
>
> So, actual numbers and traces are the only thing that tell us what
> is happening during these events. See here:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> If it happens at almost the same time every day, then I'd be looking
> at the crontabs to find what starts up about that time. output of
> top will also probably tell you what process is running, too. topio
> might be instructive, and blktrace almost certainly will be
>
>> I have two versions - we`re touching some sector on disk which is
>> about to be marked as dead but not displayed in SMART statistics or (I
>
> Doubt it - SMART doesn't cause OS visible IO dispatch spikes.
>
>> believe so) some kind of XFS fatigue, which can be more likely in this
>> case, since near-bad sector should be touched more frequently and
>> related impact could leave traces in dmesg/SMART from my experience. I
>
> I doubt that, too, because XFS doesn't have anything that is
> triggered on a daily basis inside it. Maybe you've got xfs_fsr set
> up on a cron job, though...
>
>> would like to ask if anyone has a simular experience before or can
>> suggest to poke existing file system in some way. If no suggestion
>> appear, I`ll probably reformat disk and, if problem will remain after
>> refill, replace it, but I think less destructive actions can be done
>> before.
>
> Yeah, monitoring and determining the process that is issuing the IO
> is what you need to find first.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com

Thanks Dave,

there are definitely no cron set for specific time (though most of
lockups happened in a relatively small interval which correlates with
the Ceph snapshot operations). In at least one case no Ceph snapshot
operations (including delayed removal) happened and at least two when
no I/O peak was observed. We observed and eliminated weird lockups
related to the openswitch behavior before - we`re combining storage
and compute nodes, so quirks in the OVS datapath caused very
interesting and weird system-wide lockups on (supposedly) spinlock,
and we see 'pure' Ceph lockups on XFS at time with 3.4-3.7 kernels,
all of them was correlated with very high context switch peak.

Current issue is seemingly nothing to do with spinlock-like bugs or
just a hardware problem, we even rebooted problematic node to check if
the memory allocator may stuck at the border of specific NUMA node,
with no help, but first reappearance of this bug was delayed by some
days then. Disabling lazy allocation via specifying allocsize did
nothing too. It may look like I am insisting that it is XFS bug, where
Ceph version is more likely to appear because of way more complicated
logic and operation behaviour, but persistence on specific node across
relaunching of Ceph storage daemon suggests bug relation to the
unlucky byte sequence more than anything else. If it finally appear as
Ceph bug, it`ll ruin our expectations from two-year of close
experience with this product and if it is XFS bug, we haven`t see
anything like this before, thought we had a pretty collection of
XFS-related lockups on the earlier kernels.

So, my understanding is that we hitting neither very rare memory
allocator bug in case of XFS or age-related Ceph issue, both are very
unlikely to exist - but I cannot imagine nothing else. If it helps, I
may collect a series of perf events during next appearance or exact
iostat output (mine graphics can say that the I/O was not choked
completely when peak appeared, that`s all).
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Gregory Farnum
What's your cluster look like? I wonder if you can just remove the bad
PG from osd.4 and let it recover from the existing osd.1
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sat, May 3, 2014 at 9:17 AM, Jeff Bachtel
 wrote:
> This is all on firefly rc1 on CentOS 6
>
> I had an osd getting overfull, and misinterpreting directions I downed it
> then manually removed pg directories from the osd mount. On restart and
> after a good deal of rebalancing (setting osd weights as I should've
> originally), I'm now at
>
> cluster de10594a-0737-4f34-a926-58dc9254f95f
>  health HEALTH_WARN 2 pgs backfill; 1 pgs incomplete; 1 pgs stuck
> inactive; 308 pgs stuck unclean; recov
> ery 1/2420563 objects degraded (0.000%); noout flag(s) set
>  monmap e7: 3 mons at
> {controller1=10.100.2.1:6789/0,controller2=10.100.2.2:6789/0,controller3=10.100.2.
> 3:6789/0}, election epoch 556, quorum 0,1,2
> controller1,controller2,controller3
>  mdsmap e268: 1/1/1 up {0=controller1=up:active}
>  osdmap e3492: 5 osds: 5 up, 5 in
> flags noout
>   pgmap v4167420: 320 pgs, 15 pools, 4811 GB data, 1181 kobjects
> 9770 GB used, 5884 GB / 15654 GB avail
> 1/2420563 objects degraded (0.000%)
>3 active
>   12 active+clean
>2 active+remapped+wait_backfill
>1 incomplete
>  302 active+remapped
>   client io 364 B/s wr, 0 op/s
>
> # ceph pg dump | grep 0.2f
> dumped all in format plain
> 0.2f0   0   0   0   0   0   0 incomplete
> 2014-05-03 11:38:01.526832 0'0  3492:23 [4] 4   [4] 4
> 2254'20053  2014-04-28 00:24:36.504086  2100'18109 2014-04-26
> 22:26:23.699330
>
> # ceph pg map 0.2f
> osdmap e3492 pg 0.2f (0.2f) -> up [4] acting [4]
>
> The pg query for the downed pg is at
> https://gist.github.com/jeffb-bt/c8730899ff002070b325
>
> Of course, the osd I manually mucked with is the only one the cluster is
> picking up as up/acting. Now, I can query the pg and find epochs where other
> osds (that I didn't jack up) were acting. And in fact, the latest of those
> entries (osd.1) has the pg directory in its osd mount, and it's a good
> healthy 59gb.
>
> I've tried manually rsync'ing (and preserving attributes) that set of
> directories from osd.1 to osd.4 without success. Likewise I've tried copying
> the directories over without attributes set. I've done many, many deep
> scrubs but the pg query does not show the scrub timestamps being affected.
>
> I'm seeking ideas for either fixing metadata on the directory on osd.4 to
> cause this pg to be seen/recognized, or ideas on forcing the cluster's pg
> map to point to osd.1 for the incomplete pg (basically wiping out the
> cluster's memory that osd.4 ever had 0.2f). Or any other solution :) It's
> only 59g, so worst case I'll mark it lost and recreate the pg, but I'd
> prefer to learn enough of the innards to understand what is going on, and
> possible means of fixing it.
>
> Thanks for any help,
>
> Jeff
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fatigue for XFS

2014-05-05 Thread Andrey Korolyov
Hello,

We are currently exploring issue which can be related to Ceph itself
or to the XFS - any help is very appreciated.

First, the picture: relatively old cluster w/ two years uptime and ten
months after fs recreation on every OSD, one of daemons started to
flap approximately once per day for couple of weeks, with no external
reason (bandwidth/IOPS/host issues). It looks almost the same every
time - OSD suddenly stop serving requests for a short period, gets
kicked out by peers report, then returns in a couple of seconds. Of
course, small but sensitive amount of requests are delayed by 15-30
seconds twice, which is bad for us. The only thing which correlates
with this kick is a peak of I/O, not too large, even not consuming all
underlying disk utilization, but alone in the cluster and clearly
visible. Also there are at least two occasions *without* correlated
iowait peak.

I have two versions - we`re touching some sector on disk which is
about to be marked as dead but not displayed in SMART statistics or (I
believe so) some kind of XFS fatigue, which can be more likely in this
case, since near-bad sector should be touched more frequently and
related impact could leave traces in dmesg/SMART from my experience. I
would like to ask if anyone has a simular experience before or can
suggest to poke existing file system in some way. If no suggestion
appear, I`ll probably reformat disk and, if problem will remain after
refill, replace it, but I think less destructive actions can be done
before.

XFS is running on 3.10 with almost default create and mount options,
ceph version is the latest cuttlefish (this rack should be upgraded, I
know).
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados Gateway pagination

2014-05-05 Thread Fabricio Archanjo
Sergey,

Thanks very much.


On Mon, May 5, 2014 at 5:19 AM, Sergey Malinin  wrote:

>  According to documentation, radios gw returns results in 1k objects sets
> by default.
> http://ceph.com/docs/master/radosgw/s3/php/#list-a-bucket-s-content
> "Note If there are more than 1000 objects in this bucket, you need to
> check $ObjectListResponse->body->isTruncated and run again with the name of
> the last key listed. Keep doing this until isTruncated is not true."
>
> You can request your own pagination by setting GET parameters accordingly:
> http://ceph.com/docs/master/radosgw/s3/bucketops/#get-bucket
>
> PARAMETERS
>
> Name Type Description
> …
> marker String A beginning index for the list of objects returned.
> max-keys Integer The maximum number of keys to return. Default is 1000.
>
>  On Friday, May 2, 2014 at 22:14, Fabricio Archanjo wrote:
>
> Hi All,
>
> Someone knows if the rados-gw does have pagination support on S3 API?
>
> I'm using S3 Object to browse my structure, then it's too slow to list one
> bucket that have too much objects.
>
>
> Thanks,
>
> Fabricio
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-05 Thread Andrija Panic
Will try creating tag inside CS database, since GUI/cloudmoneky editing of
existing offer is NOT possible...



On 5 May 2014 16:04, Brian Rak  wrote:

>  This would be a better question for the Cloudstack community.
>
>
> On 5/2/2014 10:06 AM, Andrija Panic wrote:
>
> Hi.
>
>  I was wondering what would be correct way to migrate system VMs
> (storage,console,VR) from local storage to CEPH.
>
>  I'm on CS 4.2.1 and will be soon updating to 4.3...
>
>  Is it enough to just change global setting system.vm.use.local.storage =
> true, to FALSE, and then destroy system VMs (cloudstack will recreate them
> in 1-2 minutes)
>
>  Also how to make sure that system VMs will NOT end up on NFS storage ?
>
>  Thanks for any input...
>
>  --
>
> Andrija Panić
>
>
> ___
> ceph-users mailing 
> listceph-us...@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>


-- 

Andrija Panić
--
  http://admintweets.com
--
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-05 Thread Brian Rak

This would be a better question for the Cloudstack community.

On 5/2/2014 10:06 AM, Andrija Panic wrote:

Hi.

I was wondering what would be correct way to migrate system VMs 
(storage,console,VR) from local storage to CEPH.


I'm on CS 4.2.1 and will be soon updating to 4.3...

Is it enough to just change global setting system.vm.use.local.storage 
= true, to FALSE, and then destroy system VMs (cloudstack will 
recreate them in 1-2 minutes)


Also how to make sure that system VMs will NOT end up on NFS storage ?

Thanks for any input...

--

Andrija Panic'


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph RADOS Gateway setup with Apache 2.4.3 and FastCGI 2.4.6 vesions

2014-05-05 Thread Srinivasa Rao Ragolu
Hi All,

I would like to share the info regarding, how to setup RADOS with latest
Apache 2.4.x versions.

Please find below additional steps to setup, which are kind of updates to
ceph documentation.

Steps:

1) If you are planning to use Apache 2.4 versions, You cannot build FastCGI
module against it.
Because FastCGI is not updated according to new changes in Apache code.

So Please use attached patch to build mod_fastcgi 2.4.6 version against
Apache 2.4.3

2) Tool "apxs", generated by apache build is used for building
mod_fastcgi.so module.
Here make sure, this tool script should use apache 2.4.3 headers.

For that, build apache 2.4.3 on host machine and give its headers path
in "apxs" script utility.

3) Command to build mod_fastcgi code

 #cd mod_fastcgi-2.4.6
 #/apxs -o mod_fastcgi.so -c *.c

 You can find mod_fastcgi.so in .libs directory, which you can load in
to apache 2.4.3 without any issues

3) Once you configure configurations as mentioned in Ceph documentation .
Now please replace
authorization flags as below in rgw.conf

*Order* Allow,deny
*Allow* from All ->>>*Require* all granted

Best of Luck.
Srinivas.
--- mod_fastcgi-2.4.6/fcgi.h	2007-09-23 22:03:29.0 +0530
+++ mod_fastcgi-2.4.6.new/fcgi.h	2014-05-04 17:53:33.332050696 +0530
@@ -57,10 +57,10 @@
 #define XtOffsetOf APR_OFFSETOF
 #define ap_select select
 
-#define ap_user_idunixd_config.user_id
-#define ap_group_id   unixd_config.group_id
-#define ap_user_name  unixd_config.user_name
-#define ap_suexec_enabled unixd_config.suexec_enabled
+#define ap_user_idap_unixd_config.user_id
+#define ap_group_id   ap_unixd_config.group_id
+#define ap_user_name  ap_unixd_config.user_name
+#define ap_suexec_enabled ap_unixd_config.suexec_enabled
 
 #ifndef S_ISDIR
 #define S_ISDIR(m)  (((m)&(S_IFMT)) == (S_IFDIR))
@@ -352,83 +352,82 @@
 #ifdef APACHE2
 
 #ifdef WIN32
-#define FCGI_LOG_EMERG  __FILE__,__LINE__,APLOG_EMERG,APR_FROM_OS_ERROR(GetLastError())
-#define FCGI_LOG_ALERT  __FILE__,__LINE__,APLOG_ALERT,APR_FROM_OS_ERROR(GetLastError())
-#define FCGI_LOG_CRIT   __FILE__,__LINE__,APLOG_CRIT,APR_FROM_OS_ERROR(GetLastError())
-#define FCGI_LOG_ERR__FILE__,__LINE__,APLOG_ERR,APR_FROM_OS_ERROR(GetLastError())
-#define FCGI_LOG_WARN   __FILE__,__LINE__,APLOG_WARNING,APR_FROM_OS_ERROR(GetLastError())
-#define FCGI_LOG_NOTICE __FILE__,__LINE__,APLOG_NOTICE,APR_FROM_OS_ERROR(GetLastError())
-#define FCGI_LOG_INFO   __FILE__,__LINE__,APLOG_INFO,APR_FROM_OS_ERROR(GetLastError())
-#define FCGI_LOG_DEBUG  __FILE__,__LINE__,APLOG_DEBUG,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_EMERG  APLOG_MARK,APLOG_EMERG,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_ALERT  APLOG_MARK,APLOG_ALERT,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_CRIT   APLOG_MARK,APLOG_CRIT,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_ERRAPLOG_MARK,APLOG_ERR,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_WARN   APLOG_MARK,APLOG_WARNING,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_NOTICE APLOG_MARK,APLOG_NOTICE,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_INFO   APLOG_MARK,APLOG_INFO,APR_FROM_OS_ERROR(GetLastError())
+#define FCGI_LOG_DEBUG  APLOG_MARK,APLOG_DEBUG,APR_FROM_OS_ERROR(GetLastError())
 #else /* !WIN32 */
-#define FCGI_LOG_EMERG  __FILE__,__LINE__,APLOG_EMERG,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_ALERT  __FILE__,__LINE__,APLOG_ALERT,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_CRIT   __FILE__,__LINE__,APLOG_CRIT,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_ERR__FILE__,__LINE__,APLOG_ERR,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_WARN   __FILE__,__LINE__,APLOG_WARNING,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_NOTICE __FILE__,__LINE__,APLOG_NOTICE,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_INFO   __FILE__,__LINE__,APLOG_INFO,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_DEBUG  __FILE__,__LINE__,APLOG_DEBUG,APR_FROM_OS_ERROR(errno)
-#endif
-
-#define FCGI_LOG_EMERG_ERRNO__FILE__,__LINE__,APLOG_EMERG,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_ALERT_ERRNO__FILE__,__LINE__,APLOG_ALERT,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_CRIT_ERRNO __FILE__,__LINE__,APLOG_CRIT,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_ERR_ERRNO  __FILE__,__LINE__,APLOG_ERR,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_WARN_ERRNO __FILE__,__LINE__,APLOG_WARNING,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_NOTICE_ERRNO   __FILE__,__LINE__,APLOG_NOTICE,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_INFO_ERRNO __FILE__,__LINE__,APLOG_INFO,APR_FROM_OS_ERROR(errno)
-#define FCGI_LOG_DEBUG_ERRNO__FILE__,__LINE__,APLOG_DEBUG,APR_FROM_OS_ERROR(errno)
-
-#define FCGI_LOG_EMERG_NOERRNO__FILE__,__LINE__,APLOG_EMERG,0
-#define FCGI_LOG_ALERT_NOERRNO

Re: [ceph-users] ceph editable failure domains

2014-05-05 Thread Fabrizio G. Ventola
Thanks so much Craig, this was really helpful and now works as expected!

Have a nice day,

Fabrizio

On 3 May 2014 01:53, Craig Lewis  wrote:
> On 5/2/14 05:15 , Fabrizio G. Ventola wrote:
>
> Hello everybody,
> I'm making some tests with ceph and its editable cluster map and I'm
> trying to define a "rack" layer for its hierarchy in this way:
>
> ceph osd tree:
>
> # id weight type name up/down reweight
> -1 0.84 root default
> -7 0.28 rack rack1
> -2 0.14 host cephosd1-dev
> 0 0.14 osd.0 up 1
> -3 0.14 host cephosd2-dev
> 1 0.14 osd.1 up 1
> -8 0.28 rack rack2
> -4 0.14 host cephosd3-dev
> 2 0.14 osd.2 up 1
> -5 0.14 host cephosd4-dev
> 3 0.14 osd.3 up 1
> -9 0.28 rack rack3
> -6 0.28 host cephosd5-dev
> 4 0.28 osd.4 up 1
>
> Those are my pools:
> pool 0 'data' rep size 3 min_size 2 crush_ruleset 0 object_hash
> rjenkins pg_num 333 pgp_num 333 last_change 2545 owner 0
> crash_replay_interval 45
> pool 1 'metadata' rep size 3 min_size 2 crush_ruleset 1 object_hash
> rjenkins pg_num 333 pgp_num 333 last_change 2548 owner 0
> pool 2 'rbd' rep size 3 min_size 2 crush_ruleset 2 object_hash
> rjenkins pg_num 333 pgp_num 333 last_change 2529 owner 0
> pool 4 'pool_01' rep size 3 min_size 2 crush_ruleset 0 object_hash
> rjenkins pg_num 333 pgp_num 333 last_change 2542 owner 0
>
> I configured replica 3 for all pools and min_size 2, thus I'm
> expecting when I write new data on ceph-fs (through FUSE) or when I
> make a new RBD to see the same amount of data on every rack (3 racks,
> 3 replicas -> 1 replica per rack). But as you can see the third rack
> has just one OSD (the first two have two by the way) and should have
> the rack1+rack2 amount of data. Instead it has less data than the
> other racks (but more than one single OSD of the first two racks).
> Where am I wrong?
>
> Thank you in advance,
> Fabrizio
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> You also need to edit the crush rules to tell it to choose a leaf from each
> rack, instead of the default host.  If you run
> ceph osd crush dump
>
> You'll see that the rules 0, 1, and 2 are operation chooseleaf_firstn, type
> host.  Those rule numbers are referenced in the pool data's crush_ruleset
> above.
>
>
> This should get you started on editing the crush map:
> https://ceph.com/docs/master/rados/operations/crush-map/#editing-a-crush-map
>
> In the rules section of the decompiled map, change your
> step chooseleaf firstn 0 type host
> to
> step chooseleaf firstn 0 type rack
>
>
> Then compile and set the new crushmap.
>
> A lot of data is going to start moving.  This will give you a chance to use
> your cluster during a heavy recovery operation.
>
>
> --
>
> Craig Lewis
> Senior Systems Engineer
> Office +1.714.602.1309
> Email cle...@centraldesktop.com
>
> Central Desktop. Work together in ways you never thought possible.
> Connect with us   Website  |  Twitter  |  Facebook  |  LinkedIn  |  Blog
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] List users not listing users

2014-05-05 Thread Shanil S
Hi,

I am creating the code to list out all users, but am unable to list out the
users but the authentication is correct. These are i saw in the log file,
could you please check it ?

application/x-www-form-urlencoded
Mon, 05 May 2014 09:41:06 GMT
/admin/user/
2014-05-05 17:41:07.852110 7f2b087b8700 15 calculated
digest=8DyhIXWKT7RL9nC1/jD7+Jp/ESk=
2014-05-05 17:41:07.852114 7f2b087b8700 15
auth_sign=8DyhIXWKT7RL9nC1/jD7+Jp/ESk=
2014-05-05 17:41:07.852115 7f2b087b8700 15 compare=0
2014-05-05 17:41:07.852118 7f2b087b8700  2 req 356:0.000297::GET
/admin/user/:get_user_info:reading permissions
2014-05-05 17:41:07.852122 7f2b087b8700  2 req 356:0.000301::GET
/admin/user/:get_user_info:init op
2014-05-05 17:41:07.852124 7f2b087b8700  2 req 356:0.000303::GET
/admin/user/:get_user_info:verifying op mask
2014-05-05 17:41:07.852126 7f2b087b8700 20 required_mask= 0 user.op_mask=7
2014-05-05 17:41:07.852128 7f2b087b8700  2 req 356:0.000307::GET
/admin/user/:get_user_info:verifying op permissions
2014-05-05 17:41:07.852131 7f2b087b8700  2 req 356:0.000310::GET
/admin/user/:get_user_info:verifying op params
2014-05-05 17:41:07.852133 7f2b087b8700  2 req 356:0.000312::GET
/admin/user/:get_user_info:executing
2014-05-05 17:41:07.852155 7f2b087b8700  2 req 356:0.000334::GET
/admin/user/:get_user_info:http status=403
2014-05-05 17:41:07.852231 7f2b087b8700  1 == req done req=0x2410710
http_status=403 ==
2014-05-05 17:41:28.116589 7f2b5700  2
RGWDataChangesLog::ChangesRenewThread: start
2014-05-05 17:41:50.116740 7f2b5700  2
RGWDataChangesLog::ChangesRenewThread: start
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] List users not listing users

2014-05-05 Thread Shanil S
Hi,

I am planning to join this community. Please add me
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] some unfound object

2014-05-05 Thread vernon1...@126.com
Hi everyone,
my ceph has some object unfound. When I run "ceph pg 50.5 list_missing", it 
show me:

... ...
{ "oid": { "oid": "rbd_data.53ec83d1b58ba.0740",
  "key": "",
  "snapid": -2,
  "hash": 4097468757,
  "max": 0,
  "pool": 50,
  "namespace": ""},
  "need": "27229'2203677",
  "have": "27064'2202729",
  "locations": []},
... ...

I want to know, what's the "need" and "have"? Can I change it? Or how to fix it?



vernon1...@126.com___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados Gateway pagination

2014-05-05 Thread Sergey Malinin
According to documentation, radios gw returns results in 1k objects sets by 
default.  
http://ceph.com/docs/master/radosgw/s3/php/#list-a-bucket-s-content
"Note If there are more than 1000 objects in this bucket, you need to check 
$ObjectListResponse->body->isTruncated and run again with the name of the last 
key listed. Keep doing this until isTruncated is not true."

You can request your own pagination by setting GET parameters accordingly:
http://ceph.com/docs/master/radosgw/s3/bucketops/#get-bucket

PARAMETERS

Name Type Description
…
marker String A beginning index for the list of objects returned.
max-keys Integer The maximum number of keys to return. Default is 1000.



On Friday, May 2, 2014 at 22:14, Fabricio Archanjo wrote:

> Hi All,
>  
> Someone knows if the rados-gw does have pagination support on S3 API?  
>  
> I'm using S3 Object to browse my structure, then it's too slow to list one 
> bucket that have too much objects.  
>  
>  
> Thanks,
>  
> Fabricio  
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com (mailto:ceph-users@lists.ceph.com)
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>  
>  


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-05 Thread Andrija Panic
Thank you very much Wido, that's exatly what I was looking for.
Thanks


On 4 May 2014 18:30, Wido den Hollander  wrote:

> On 05/02/2014 04:06 PM, Andrija Panic wrote:
>
>> Hi.
>>
>> I was wondering what would be correct way to migrate system VMs
>> (storage,console,VR) from local storage to CEPH.
>>
>> I'm on CS 4.2.1 and will be soon updating to 4.3...
>>
>> Is it enough to just change global setting system.vm.use.local.storage =
>> true, to FALSE, and then destroy system VMs (cloudstack will recreate
>> them in 1-2 minutes)
>>
>>
> Yes, that would be sufficient. CloudStack will then deploy the SSVMs on
> your RBD storage.
>
>
>  Also how to make sure that system VMs will NOT end up on NFS storage ?
>>
>>
> Make use of the tagging. Tag the RBD pools with 'rbd' and change the
> Service Offering for the SSVMs where they require 'rbd' as a storage tag.
>
>  Thanks for any input...
>>
>> --
>>
>> Andrija Panić
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
> --
> Wido den Hollander
> 42on B.V.
> Ceph trainer and consultant
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 

Andrija Panić
--
  http://admintweets.com
--
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com