Re: [ceph-users] removing cluster name support

2017-11-06 Thread Erik McCormick
On Fri, Jun 9, 2017 at 12:30 PM, Sage Weil  wrote:
> On Fri, 9 Jun 2017, Erik McCormick wrote:
>> On Fri, Jun 9, 2017 at 12:07 PM, Sage Weil  wrote:
>> > On Thu, 8 Jun 2017, Sage Weil wrote:
>> >> Questions:
>> >>
>> >>  - Does anybody on the list use a non-default cluster name?
>> >>  - If so, do you have a reason not to switch back to 'ceph'?
>> >
>> > It sounds like the answer is "yes," but not for daemons. Several users use
>> > it on the client side to connect to multiple clusters from the same host.
>> >
>>
>> I thought some folks said they were running with non-default naming
>> for daemons, but if not, then count me as one who does. This was
>> mainly a relic of the past, where I thought I would be running
>> multiple clusters on one host. Before long I decided it would be a bad
>> idea, but by then the cluster was already in heavy use and I couldn't
>> undo it.
>>
>> I will say that I am not opposed to renaming back to ceph, but it
>> would be great to have a documented process for accomplishing this
>> prior to deprecation. Even going so far as to remove --cluster from
>> deployment tools will leave me unable to add OSDs if I want to upgrade
>> when Luminous is released.
>
> Note that even if the tool doesn't support it, the cluster name is a
> host-local thing, so you can always deploy ceph-named daemons on other
> hosts.
>
> For an existing host, the removal process should be as simple as
>
>  - stop the daemons on the host
>  - rename /etc/ceph/foo.conf -> /etc/ceph/ceph.conf
>  - rename /var/lib/ceph/*/foo-* -> /var/lib/ceph/*/ceph-* (this mainly
> matters for non-osds, since the osd dirs will get dynamically created by
> ceph-disk, but renaming will avoid leaving clutter behind)
>  - comment out the CLUSTER= line in /etc/{sysconfig,default}/ceph (if
> you're on jewel)
>  - reboot
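For reference, a minimal shell sketch of those steps on a systemd host, assuming a
cluster previously named "foo"; adjust the paths and the sysconfig/default location
to your distribution before relying on it:

systemctl stop ceph.target                            # stop all ceph daemons on this host
mv /etc/ceph/foo.conf /etc/ceph/ceph.conf
for d in /var/lib/ceph/*/foo-*; do
    mv "$d" "${d/foo-/ceph-}"                         # mon/mgr/mds/osd data dirs
done
sed -i 's/^CLUSTER=/#CLUSTER=/' /etc/default/ceph     # or /etc/sysconfig/ceph on jewel RH-based hosts
reboot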
>
> If you wouldn't mind being a guinea pig and verifying that this is
> sufficient that would be really helpful!  We'll definitely want to
> document this process.
>
> Thanks!
> sage
>
Sitting here in a room with you reminded me that I dropped the ball on
reporting back on the procedure. I did this a couple of weeks ago and it
worked fine. I had a few problems with OSDs not wanting to unmount, so
I had to reboot each node along the way. I just used it as an excuse
to run updates.

-Erik
>
>>
>> > Nobody is colocating multiple daemons from different clusters on the same
>> > host.  Some have in the past but stopped.  If they choose to in the
>> > future, they can customize the systemd units themselves.
>> >
>> > The rbd-mirror daemon has a similar requirement to talk to multiple
>> > clusters as a client.
>> >
>> > This makes me conclude our current path is fine:
>> >
>> >  - leave existing --cluster infrastructure in place in the ceph code, but
>> >  - remove support for deploying daemons with custom cluster names from the
>> > deployment tools.
>> >
>> > This neatly avoids the systemd limitations for all but the most
>> > adventuresome admins and avoids the more common case of an admin falling
>> > into the "oh, I can name my cluster? cool! [...] oh, i have to add
>> > --cluster rover to every command? ick!" trap.
>> >
>>
>> Yeah, that was me in 2012. Oops.
>>
>> -Erik
>>
>> > sage


[ceph-users] a question about ceph raw space usage

2017-11-06 Thread Kamble, Nitin A
Dear Cephers,

As seen below, I notice that 12.7% of raw storage is consumed with zero pools 
in the system. These are bluestore OSDs. 
Is this expected or an anomaly?

Thanks,
Nitin

maruti1:~ # ceph -v
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
maruti1:~ # ceph -s
  cluster:
id: 37e0fe9e-6a19-4182-8350-e377d45291ce
health: HEALTH_OK

  services:
mon: 1 daemons, quorum maruti1
mgr: maruti1(active)
osd: 12 osds: 12 up, 12 in

  data:
pools:   0 pools, 0 pgs
objects: 0 objects, 0 bytes
usage:   972 GB used, 6681 GB / 7653 GB avail
pgs:

maruti1:~ # ceph df
GLOBAL:
SIZE  AVAIL RAW USED %RAW USED
7653G 6681G 972G 12.70
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
maruti1:~ # ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE  USE    AVAIL %USE  VAR  PGS
0   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
6   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
9   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
1   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
5   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
11   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
3   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
7   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
10   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
2   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
4   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
8   hdd 0.62279  1.0  637G 82955M  556G 12.70 1.00   0
TOTAL 7653G   972G 6681G 12.70
MIN/MAX VAR: 1.00/1.00  STDDEV: 0
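For what it's worth, the numbers above are internally consistent; a quick check with
bc (pure arithmetic from the ceph osd df output, on the assumption that the identical
per-OSD baseline is BlueStore's own DB/WAL/metadata accounting rather than user data):

echo "scale=1; 82955/1024" | bc        # ~81.0 GB reported used per OSD
echo "12*82955/1024" | bc              # ~972 GB total, matching 'ceph -s'
echo "scale=2; 100*972/7653" | bc      # ~12.70, matching %RAW USED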





Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-06 Thread Jason Dillaman
If you could install the debug packages and get a gdb backtrace from all
threads it would be helpful. librbd doesn't utilize any QEMU threads so
even if librbd was deadlocked, the worst case that I would expect would be
your guest OS complaining about hung kernel tasks related to disk IO (since
the disk wouldn't be responding).
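If it helps, a rough sketch of gathering that on a Debian-based hypervisor; the exact
debug package names are a guess and vary per distribution and release:

apt-get install librbd1-dbg librados2-dbg              # debug symbols for librbd/librados
gdb --batch -p "$(pidof -s qemu-system-x86_64)" \
    -ex 'set pagination off' \
    -ex 'thread apply all bt' > qemu-all-threads-bt.txt 2>&1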

On Mon, Nov 6, 2017 at 6:02 PM, Jan Pekař - Imatic 
wrote:

> Hi,
>
> I'm using debian stretch with ceph 12.2.1-1~bpo80+1 and qemu
> 1:2.8+dfsg-6+deb9u3
> I'm running 3 nodes with 3 monitors and 8 osds on my nodes, all on IPV6.
>
> While testing the cluster, I ran into a strange and severe problem.
> On first node I'm running qemu hosts with librados disk connection to the
> cluster and all 3 monitors mentioned in connection.
> On second node I stopped mon and osd with command
>
> kill -STOP MONPID OSDPID
>
> Within one minute all my qemu hosts on the first node freeze; they don't
> even respond to ping. On the VNC screen there is no error (disk or kernel
> panic), they just hang forever with no console response. Even starting the MON
> and OSD on the stopped host again doesn't bring them back. Destroying the qemu
> domain and starting it again is the only solution.
>
> This happens even if the virtual machine has all of its primary OSDs on OSDs other
> than the one I stopped - so it is not writing primarily to the stopped OSD.
>
> If I stop only the OSD and keep the MON running, or stop only the MON and keep
> the OSD running, everything looks OK.
>
> When I stop both the MON and the OSD, I can see "osd.0 1300 heartbeat_check: no
> reply from ..." in the log, as usual when an OSD fails. While this is happening the
> virtual machines are still running, but after that they all stop.
>
> What should I send you to debug this problem? Without a fix for this, ceph is
> not reliable for me.
>
> Thank you
> With regards
> Jan Pekar
> Imatic



-- 
Jason


[ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-06 Thread Jan Pekař - Imatic

Hi,

I'm using debian stretch with ceph 12.2.1-1~bpo80+1 and qemu 
1:2.8+dfsg-6+deb9u3

I'm running 3 nodes with 3 monitors and 8 osds on my nodes, all on IPV6.

While testing the cluster, I ran into a strange and severe problem.
On first node I'm running qemu hosts with librados disk connection to 
the cluster and all 3 monitors mentioned in connection.

On second node I stopped mon and osd with command

kill -STOP MONPID OSDPID

Within one minute all my qemu hosts on the first node freeze; they don't
even respond to ping. On the VNC screen there is no error (disk or kernel
panic), they just hang forever with no console response. Even starting
the MON and OSD on the stopped host again doesn't bring them back. Destroying the
qemu domain and starting it again is the only solution.


This happens even if the virtual machine has all of its primary OSDs on OSDs other
than the one I stopped - so it is not writing primarily to the stopped OSD.


If I stop only the OSD and keep the MON running, or stop only the MON and keep
the OSD running, everything looks OK.


When I stop both the MON and the OSD, I can see "osd.0 1300 heartbeat_check:
no reply from ..." in the log, as usual when an OSD fails. While this is happening
the virtual machines are still running, but after that they all stop.


What should I send you to debug this problem? Without a fix for this, ceph
is not reliable for me.


Thank you
With regards
Jan Pekar
Imatic


Re: [ceph-users] Cephfs snapshot work

2017-11-06 Thread Vasu Kulkarni
On Sun, Nov 5, 2017 at 8:19 AM, Brady Deetz  wrote:
> My organization has a production  cluster primarily used for cephfs upgraded
> from jewel to luminous. We would very much like to have snapshots on that
> filesystem, but understand that there are risks.
>
> What kind of work could cephfs admins do to help the devs stabilize this
> feature?

From an admin point of view, I would say: have an equivalent, smaller
non-production cluster
with some decent workloads (or copy a subset of the data to that
cluster). You can then try
the features you want on the non-production cluster; it will have a
smaller data set but will still
help characterize how the feature performs. If you encounter issues,
you can use http://tracker.ceph.com/
to file an issue, get some attention on it, and learn more.

>


Re: [ceph-users] Cluster Down from reweight-by-utilization

2017-11-06 Thread Kevin Hrpcek
An update for the list archive and if people have similar issues in the 
future.


After re-setting noup, my cluster took about 18 hours for all of the OSDs 
to get to the current epoch. In the end there were 5 that took a few 
hours longer than the others. Other small issues came up during the 
process, such as ceph logs filling up /var and memory/swap filling up, which 
probably caused this all to take longer than it should have. Simply 
restarting the OSDs when memory/swap was filling up allowed them to 
catch up faster. The daemons probably generated a bit under 1 TB of logs 
throughout the whole process, so /var got expanded.


Once the OSDs all had the current epoch I unset noup and let the cluster 
peer/activate PGs. This took another ~6 hours and was likely slowed by 
some of the oldest, undersized OSD servers not having enough CPU/memory to 
handle it. Throughout the peering/activating I periodically and briefly 
unset nodown as a way to see whether there were OSDs that were having 
problems, and then addressed those.


In the end everything came back and the cluster is healthy and there are 
no existing PG problems. How the reweight triggered a problem this 
severe is still unknown.


A couple of takeaways:
- CPU and memory may not be highly utilized in daily operations, but they are 
very important for large recovery operations. Having a bit more memory 
and a few more cores would probably have saved hours of recovery time 
and might have prevented my problem altogether.
- Slowing the map changes by quickly setting nodown, noout, and noup when 
everything is already down will help as well (a sketch follows below).
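For reference, a minimal sketch of that flag juggling with the standard ceph CLI; the
exact order and timing are situational:

# freeze the map while the down OSDs catch up on osdmaps
ceph osd set noup
ceph osd set nodown
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
# ...wait for all OSDs to reach the current epoch, then let them peer:
ceph osd unset noup
# ...once peering/activating settles, release the rest:
ceph osd unset nodown
ceph osd unset noout
ceph osd unset nobackfill
ceph osd unset norecover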


Sage, thanks again for your input and advice.

Kevin



On 11/04/2017 11:54 PM, Sage Weil wrote:

On Sat, 4 Nov 2017, Kevin Hrpcek wrote:

Hey Sage,

Thanks for getting back to me this late on a weekend.

Do you now why the OSDs were going down?  Are there any crash dumps in the
osd logs, or is the OOM killer getting them?

That's a part I can't nail down yet. OSDs didn't crash; after the 
reweight-by-utilization, OSDs on some of our earlier-gen
servers started spinning at 100% CPU and were overwhelmed. Admittedly these 
early-gen OSD servers are undersized on CPU, which is
probably why they got overwhelmed, but it hasn't escalated like this before. 
Heartbeats among the cluster's OSDs started
failing on those OSDs first and then the 100% CPU problem seemed to 
snowball to all hosts. I'm still trying to figure out
why the relatively small reweighting caused this problem.

The usual strategy here is to set 'noup' and get all of the OSDs to catch
up on osdmaps (you can check progress via the above status command).  Once
they are all caught up, unset noup and let them all peer at once.

I tried having noup set for a few hours earlier to see if stopping the moving 
osdmap target would help, but I eventually unset
it while doing more troubleshooting. I'll set it again and let it go overnight. 
Patience is probably needed with a cluster this
size. I saw a similar situation reported and was trying your previous solution:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-May/040030.html


The problem that has come up here in the past is when the cluster has been
unhealthy for a long time and the past intervals use too much memory.  I
don't see anything in your description about memory usage, though.  If
that does rear its head there's a patch we can apply to kraken to work
around it (this is fixed in luminous).

Memory usage doesn't seem too bad - a little tight on some of those early-gen 
servers - but I haven't seen the OOM killer taking things out
yet. While googling the issue I think I saw mention of that patch and of 
luminous handling this type of situation better... larger
osdmap increments or something similar, if I recall correctly. My cluster is a 
few weeks away from a luminous upgrade.

That's good.  You might also try setting nobackfill and norecover just to
keep the load off the cluster while it's peering.

s




Re: [ceph-users] Blog post: storage server power consumption

2017-11-06 Thread Jack
Online does that on C14 (https://www.online.net/en/c14)

IIRC, 52 spinning disks per RU, with only 2 disks usable at a time.
There is some custom hardware, though, and it is really designed for cold
storage (an IO must wait for an idle slot, power on the device, do
the IO, power off the device, and release the slot).
They use 1 GB as the block size.

I do not think this would work with Ceph anyway.

On 06/11/2017 23:12, Simon Leinen wrote:
> The last paragraph contains a challenge to developers: Can we save more
> power in "cold storage" applications by turning off idle disks?
> Crazy idea, or did anyone already try this?
> 



[ceph-users] Blog post: storage server power consumption

2017-11-06 Thread Simon Leinen
It was a cold and rainy weekend here, so I did some power measurements
of the three types of storage servers we got over a few years of running
Ceph in production, and compared the results:

https://cloudblog.switch.ch/2017/11/06/ceph-storage-server-power-usage/

The last paragraph contains a challenge to developers: Can we save more
power in "cold storage" applications by turning off idle disks?
Crazy idea, or did anyone already try this?
-- 
Simon.


Re: [ceph-users] RGW: ERROR: failed to distribute cache

2017-11-06 Thread Yehuda Sadeh-Weinraub
On Mon, Nov 6, 2017 at 7:29 AM, Wido den Hollander  wrote:
> Hi,
>
> On a Ceph Luminous (12.2.1) environment I'm seeing RGWs stall and about the 
> same time I see these errors in the RGW logs:
>
> 2017-11-06 15:50:24.859919 7f8f5fa1a700  0 ERROR: failed to distribute cache 
> for 
> gn1-pf.rgw.data.root:.bucket.meta.X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.20
> 2017-11-06 15:50:41.768881 7f8f7824b700  0 ERROR: failed to distribute cache 
> for gn1-pf.rgw.data.root:X
> 2017-11-06 15:55:15.781739 7f8f7824b700  0 ERROR: failed to distribute cache 
> for 
> gn1-pf.rgw.meta:.meta:bucket.instance:X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.32:_XK5LExyX6EEIXxCD5Cws:1
> 2017-11-06 15:55:25.784404 7f8f7824b700  0 ERROR: failed to distribute cache 
> for 
> gn1-pf.rgw.data.root:.bucket.meta.X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.32
>
> I see one message from a year ago: 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-June/010531.html
>
> The setup has two RGWs running:
>
> - ceph-rgw1
> - ceph-rgw2
>
> While trying to figure this out I see that a "radosgw-admin period pull" 
> hangs for ever.
>
> I don't know if that is related, but it's something I've noticed.
>
> Mainly I see that at random times the RGW stalls for about 30 seconds and 
> while that happens these messages show up in the RGW's log.
>

Do you happen to know if there's a dynamic resharding happening? Dynamic
resharding should only affect the writes to the specific
bucket, though, and should not affect cache distribution. Originally I
thought it could be a HUP-signal-related issue, but that seems to be
fixed in 12.2.1.

Yehuda

> Is anybody else running into this issue?
>
> Wido


Re: [ceph-users] Ceph Developers Monthly - October

2017-11-06 Thread Leonardo Vaz
On Mon, Nov 06, 2017 at 09:54:41PM +0800, kefu chai wrote:
> On Thu, Oct 5, 2017 at 12:16 AM, Leonardo Vaz  wrote:
> > On Wed, Oct 04, 2017 at 03:02:09AM -0300, Leonardo Vaz wrote:
> >> On Thu, Sep 28, 2017 at 12:08:00AM -0300, Leonardo Vaz wrote:
> >> > Hey Cephers,
> >> >
> >> > This is just a friendly reminder that the next Ceph Developer Montly
> >> > meeting is coming up:
> >> >
> >> >  http://wiki.ceph.com/Planning
> >> >
> >> > If you have work that you're doing that it a feature work, significant
> >> > backports, or anything you would like to discuss with the core team,
> >> > please add it to the following page:
> >> >
> >> >  http://wiki.ceph.com/CDM_04-OCT-2017
> >> >
> >> > If you have questions or comments, please let us know.
> >>
> 
> Leo,
> 
> do we have a recording for the CDM in Oct?

We didn't record the CDM in October because some people were not able to
join us to discuss the topics, and we ended up having a very quick meeting
instead.

I have the video recording for the CDM we had on Nov 1st, however I need
to upload it. We are in Sydney for the OpenStack Summit and I'll be able
to do that later this week.

Kindest regards,

Leo
 
> cheers,
> 
> -- 
> Regards
> Kefu Chai

-- 
Leonardo Vaz
Ceph Community Manager
Open Source and Standards Team


Re: [ceph-users] s3 bucket policys

2017-11-06 Thread David Turner
It's pretty much identical to creating a user with radosgw-admin except
instead of user create, you do subuser create.  To create subusers for
user_a, you would do something like this...

# read only subuser with the name user_a:read-only
radosgw-admin subuser create --uid=user_a --gen-access-key --gen-secret
--subuser=read-only --key-type=s3 --access=read

The --subuser= is a name you give to the subuser to know what it's for.
The --access= is the type of access that subuser will have to this bucket.
Your options are read, write, readwrite, and full.

For our deployment we create buckets with a user and hand out sub-users to
people to access the bucket.  Usually it's a full access subuser.  We use
the user that created the bucket as an admin user for that bucket and don't
pass out the access and secret keys for it.  This is annoying if a project
needs to access multiple buckets, because they need to have a new key pair
for each bucket.  You also can't set acl permissions for individual objects
in the bucket for a subuser.  The permissions for a key pair will apply to
everything in a bucket.  It works well enough for what we're doing until we
can upgrade to Luminous and take advantage of the newer features for rgw.
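To round that out, a sketch of a full-access subuser and of pulling the generated
keys back out afterwards (same radosgw-admin tooling as above; the subuser name is
just an example):

# full-access subuser with the name user_a:full-access
radosgw-admin subuser create --uid=user_a --gen-access-key --gen-secret \
    --subuser=full-access --key-type=s3 --access=full

# show user_a, its subusers, and all generated S3 key pairs
radosgw-admin user info --uid=user_a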

On Mon, Nov 6, 2017 at 11:54 AM nigel davies  wrote:

> Thanks all
>
> David, if you can explain how to create subusers with keys, I'm happy to try
> it and explain it to my boss.
>
> The issue I had with the ACLs: for some reason, when I upload a file to
> bucket_a as user_a,
>
> user_b can't read the file even though user_b has read permissions on the
> bucket.
>
> And I tried what Adam said to set the ACLs:
>
> s3cmd setacl s3://bucket_name --acl-grant=read:someuser
> s3cmd setacl s3://bucket_name --acl-grant=write:differentuser
>
> but had no luck; it's like the object is locked to that user only, whatever
> permissions I set on the bucket itself.
>
>
>
> On Mon, Nov 6, 2017 at 4:43 PM, David Turner 
> wrote:
>
>> If you don't mind juggling multiple access/secret keys, you can use
>> subusers.  Just have 1 user per bucket and create subusers with read,
>> write, etc permissions.  The objects are all owned by the 1 user that
>> created the bucket, and then you pass around the subuser keys to the
>> various apps that need that access to the bucket.  It's not pretty, but it
>> works without altering object permissions.
>>
>> On Mon, Nov 6, 2017 at 11:38 AM Adam C. Emerson 
>> wrote:
>>
>>> On 06/11/2017, nigel davies wrote:
>>> > ok i am using the Jewel version
>>> >
>>> > when i try setting permissions using s3cmd or an php script using
>>> s3client
>>> >
>>> > i get the error
>>> >
>>> > >> >
>>> encoding="UTF-8"?>InvalidArgumenttest_bucket
>>> > (truncated...)
>>> >InvalidArgument (client):  - >> >
>>> encoding="UTF-8"?>InvalidArgumenttest_buckettx
>>> >
>>> >
>>> a-005a005b91-109f-default109f-default-default
>>> >
>>> >
>>> >
>>> > in the log on the s3 server i get
>>> >
>>> > 2017-11-06 12:54:41.987704 7f67a9feb700  0 failed to parse input: {
>>> > "Version": "2012-10-17",
>>> > "Statement": [
>>> > {
>>> > "Sid": "usr_upload_can_write",
>>> > "Effect": "Allow",
>>> > "Principal": {"AWS": ["arn:aws:iam:::user/test"]},
>>> > "Action": ["s3:ListBucket", "s3:PutObject"],
>>> > "Resource": ["arn:aws:s3:::test_bucket"]
>>> > }
>>> > 2017-11-06 12:54:41.988219 7f67a9feb700  1 == req done
>>> > req=0x7f67a9fe57e0 op status=-22 http_status=400 ==
>>> >
>>> >
>>> > Any advice on this one
>>>
>>> Well! If you upgrade to Luminous the advice I gave you will work
>>> perfectly. Also Luminous has a bunch of awesome, wonderful new
>>> features like Bluestore in it (and really what other enterprise
>>> storage platform promises to color your data such a lovely hue?)
>>>
>>> But, if you can't, I think something like:
>>>
>>> s3cmd setacl s3://bucket_name --acl_grant=read:someuser
>>> s3cmd setacl s3://bucket_name --acl_grant=write:differentuser
>>>
>>> Should work. Other people than I know a lot more about ACLs.
>>>
>>> --
>>> Senior Software Engineer   Red Hat Storage, Ann Arbor, MI, US
>>> IRC: Aemerson@OFTC, Actinic@Freenode
>>> 0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9
>


Re: [ceph-users] s3 bucket policys

2017-11-06 Thread nigel davies
Thanks all

David, if you can explain how to create subusers with keys, I'm happy to try
it and explain it to my boss.

The issue I had with the ACLs: for some reason, when I upload a file to
bucket_a as user_a,

user_b can't read the file even though user_b has read permissions on the
bucket.

And I tried what Adam said to set the ACLs:

s3cmd setacl s3://bucket_name --acl-grant=read:someuser
s3cmd setacl s3://bucket_name --acl-grant=write:differentuser

but had no luck; it's like the object is locked to that user only, whatever
permissions I set on the bucket itself.



On Mon, Nov 6, 2017 at 4:43 PM, David Turner  wrote:

> If you don't mind juggling multiple access/secret keys, you can use
> subusers.  Just have 1 user per bucket and create subusers with read,
> write, etc permissions.  The objects are all owned by the 1 user that
> created the bucket, and then you pass around the subuser keys to the
> various apps that need that access to the bucket.  It's not pretty, but it
> works without altering object permissions.
>
> On Mon, Nov 6, 2017 at 11:38 AM Adam C. Emerson 
> wrote:
>
>> On 06/11/2017, nigel davies wrote:
>> > ok i am using the Jewel version
>> >
>> > when i try setting permissions using s3cmd or an php script using
>> s3client
>> >
>> > i get the error
>> >
>> > > > encoding="UTF-8"?>InvalidArgument<
>> BucketName>test_bucket
>> > (truncated...)
>> >InvalidArgument (client):  - > > encoding="UTF-8"?>InvalidArgument<
>> BucketName>test_buckettx
>> >
>> > a-005a005b91-109f-default
>> 109f-default-default
>> >
>> >
>> >
>> > in the log on the s3 server i get
>> >
>> > 2017-11-06 12:54:41.987704 7f67a9feb700  0 failed to parse input: {
>> > "Version": "2012-10-17",
>> > "Statement": [
>> > {
>> > "Sid": "usr_upload_can_write",
>> > "Effect": "Allow",
>> > "Principal": {"AWS": ["arn:aws:iam:::user/test"]},
>> > "Action": ["s3:ListBucket", "s3:PutObject"],
>> > "Resource": ["arn:aws:s3:::test_bucket"]
>> > }
>> > 2017-11-06 12:54:41.988219 7f67a9feb700  1 == req done
>> > req=0x7f67a9fe57e0 op status=-22 http_status=400 ==
>> >
>> >
>> > Any advice on this one
>>
>> Well! If you upgrade to Luminous the advice I gave you will work
>> perfectly. Also Luminous has a bunch of awesome, wonderful new
>> features like Bluestore in it (and really what other enterprise
>> storage platform promises to color your data such a lovely hue?)
>>
>> But, if you can't, I think something like:
>>
>> s3cmd setacl s3://bucket_name --acl_grant=read:someuser
>> s3cmd setacl s3://bucket_name --acl_grant=write:differentuser
>>
>> Should work. Other people than I know a lot more about ACLs.
>>
>> --
>> Senior Software Engineer   Red Hat Storage, Ann Arbor, MI, US
>> IRC: Aemerson@OFTC, Actinic@Freenode
>> 0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9
>


Re: [ceph-users] s3 bucket policys

2017-11-06 Thread David Turner
If you don't mind juggling multiple access/secret keys, you can use
subusers.  Just have 1 user per bucket and create subusers with read,
write, etc permissions.  The objects are all owned by the 1 user that
created the bucket, and then you pass around the subuser keys to the
various apps that need that access to the bucket.  It's not pretty, but it
works without altering object permissions.

On Mon, Nov 6, 2017 at 11:38 AM Adam C. Emerson  wrote:

> On 06/11/2017, nigel davies wrote:
> > ok i am using the Jewel version
> >
> > when i try setting permissions using s3cmd or an php script using
> s3client
> >
> > i get the error
> >
> >  >
> encoding="UTF-8"?>InvalidArgumenttest_bucket
> > (truncated...)
> >InvalidArgument (client):  -  >
> encoding="UTF-8"?>InvalidArgumenttest_buckettx
> >
> >
> a-005a005b91-109f-default109f-default-default
> >
> >
> >
> > in the log on the s3 server i get
> >
> > 2017-11-06 12:54:41.987704 7f67a9feb700  0 failed to parse input: {
> > "Version": "2012-10-17",
> > "Statement": [
> > {
> > "Sid": "usr_upload_can_write",
> > "Effect": "Allow",
> > "Principal": {"AWS": ["arn:aws:iam:::user/test"]},
> > "Action": ["s3:ListBucket", "s3:PutObject"],
> > "Resource": ["arn:aws:s3:::test_bucket"]
> > }
> > 2017-11-06 12:54:41.988219 7f67a9feb700  1 == req done
> > req=0x7f67a9fe57e0 op status=-22 http_status=400 ==
> >
> >
> > Any advice on this one
>
> Well! If you upgrade to Luminous the advice I gave you will work
> perfectly. Also Luminous has a bunch of awesome, wonderful new
> features like Bluestore in it (and really what other enterprise
> storage platform promises to color your data such a lovely hue?)
>
> But, if you can't, I think something like:
>
> s3cmd setacl s3://bucket_name --acl_grant=read:someuser
> s3cmd setacl s3://bucket_name --acl_grant=write:differentuser
>
> Should work. Other people than I know a lot more about ACLs.
>
> --
> Senior Software Engineer   Red Hat Storage, Ann Arbor, MI, US
> IRC: Aemerson@OFTC, Actinic@Freenode
> 0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9


[ceph-users] VM Data corruption shortly after Luminous Upgrade

2017-11-06 Thread James Forde
Weird but very bad problem with my test cluster 2-3 weeks after upgrading to 
Luminous.
All 7 running VMs are corrupted and unbootable: 6 Windows and 1 CentOS 7. The 
Windows error is "unmountable boot volume"; CentOS 7 will only boot to emergency 
mode.
The 3 VMs that were off during the event work as expected: 2 Windows and 1 Ubuntu.

History:
7 node cluster: 5 OSD, 3 MON, (1 is MON-OSD). Plus 2 KVM nodes.

System originally running Jewel on old Tower servers. Migrated to all rackmount 
servers. Then upgraded to Kraken. Kraken added the MGR servers.

On the 13th or 14th of October Upgraded to Luminous. Upgrade went smoothly. 
Ceph versions showed all nodes running 12.2.1, Health_OK. Even checked out the 
Ceph Dashboard.

Then around the 20th I created a master for cloning, spun off a clone, mucked 
around with it, flattened it so it was stand alone, and shut it and the master 
off.

Problem:
On November 1st I started the clone and got the following error.

"failed to start domain internal error: qemu unexpectedly closed the monitor 
vice virtio-balloon"



To resolve (restart MONs one at a time):

I restarted the 1st MON. Tried to restart the clone. Same error.

Restarted the 2nd MON. All 7 running VMs shut off!

Restarted the 3rd MON. The clone now runs. Trying to start any of the 7 VMs that were 
running gives "Unmountable Boot Volume".



Pulled the logs on all nodes and am going through them.
So far have found this.

"terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
  what():  buffer::end_of_buffer
terminate called recursively
2017-11-01 19:41:48.814+: shutting down, reason=crashed"

Possible monmap corruption?
Any insight would be greatly appreciated.


Hints?
After the Luminous upgrade, ceph osd tree had nothing in the class column. 
After restarting the MON's, the MON-OSD node had "hdd" on each osd.
After restarting the entire cluster all OSD servers had "hdd" in the class 
column. Not sure why this would not have happened right after upgrade.

Also, after the restart the mgr servers failed to start: "key for mgr.HOST exists 
but cap mds does not match"
Solved per https://www.seekhole.io/?p=12
$ ceph auth caps mgr.HOST mon 'allow profile mgr' mds 'allow *' osd 'allow *'
Again, not sure why this would not have manifested itself at the upgrade when 
all servers were restarted.

-Jim


Re: [ceph-users] s3 bucket policys

2017-11-06 Thread Adam C. Emerson
On 06/11/2017, nigel davies wrote:
> OK, I am using the Jewel version.
> 
> When I try setting permissions using s3cmd or a PHP script using S3Client,
> 
> I get the error
> 
>  encoding="UTF-8"?>InvalidArgumenttest_bucket
> (truncated...)
>InvalidArgument (client):  -  encoding="UTF-8"?>InvalidArgumenttest_buckettx
> 
> a-005a005b91-109f-default109f-default-default
> 
> 
> 
> in the log on the s3 server i get
> 
> 2017-11-06 12:54:41.987704 7f67a9feb700  0 failed to parse input: {
> "Version": "2012-10-17",
> "Statement": [
> {
> "Sid": "usr_upload_can_write",
> "Effect": "Allow",
> "Principal": {"AWS": ["arn:aws:iam:::user/test"]},
> "Action": ["s3:ListBucket", "s3:PutObject"],
> "Resource": ["arn:aws:s3:::test_bucket"]
> }
> 2017-11-06 12:54:41.988219 7f67a9feb700  1 == req done
> req=0x7f67a9fe57e0 op status=-22 http_status=400 ==
> 
> 
> Any advice on this one

Well! If you upgrade to Luminous the advice I gave you will work
perfectly. Also Luminous has a bunch of awesome, wonderful new
features like Bluestore in it (and really what other enterprise
storage platform promises to color your data such a lovely hue?)

But, if you can't, I think something like:

s3cmd setacl s3://bucket_name --acl_grant=read:someuser
s3cmd setacl s3://bucket_name --acl_grant=write:differentuser

Should work. Other people than I know a lot more about ACLs.

-- 
Senior Software Engineer   Red Hat Storage, Ann Arbor, MI, US
IRC: Aemerson@OFTC, Actinic@Freenode
0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9


Re: [ceph-users] RGW: ERROR: failed to distribute cache

2017-11-06 Thread Mark Schouten
I saw this once on both of my RGWs today:
rgw01:

2017-11-06 10:36:35.070068 7f4a4f300700  0 ERROR: failed to distribute cache 
for default.rgw.meta:.meta:bucket.instance:XXX/YYY:ZZZ.30636654.1::0
2017-11-06 10:36:45.139068 7f4a4f300700  0 ERROR: failed to distribute cache 
for default.rgw.data.root:.bucket.meta.XXX:YYY:ZZZ.30636654.1



rgw02:
2017-11-06 10:38:29.606736 7f2463658700  0 ERROR: failed to distribute cache 
for default.rgw.meta:.meta:bucket.instance:XXX/YYY:ZZZ.30636741.1::0
2017-11-06 10:38:39.647266 7f2463658700  0 ERROR: failed to distribute cache 
for default.rgw.data.root:.bucket.meta.XXX:YYY:ZZZ.30636741.1


Not sure if it's a coincidence, but it is the bucket that should be dynamically 
resharded, which is broken (Issue #22046).

Kind regards,

-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076 | http://www.tuxis.nl/
T: 0318 200208 | i...@tuxis.nl



 From:   Wido den Hollander
 To:
 Sent:   6-11-2017 16:29
 Subject:   [ceph-users] RGW: ERROR: failed to distribute cache

Hi, 
 
On a Ceph Luminous (12.2.1) environment I'm seeing RGWs stall and about the 
same time I see these errors in the RGW logs: 
 
2017-11-06 15:50:24.859919 7f8f5fa1a700  0 ERROR: failed to distribute cache 
for 
gn1-pf.rgw.data.root:.bucket.meta.X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.20
 
2017-11-06 15:50:41.768881 7f8f7824b700  0 ERROR: failed to distribute cache 
for gn1-pf.rgw.data.root:X 
2017-11-06 15:55:15.781739 7f8f7824b700  0 ERROR: failed to distribute cache 
for 
gn1-pf.rgw.meta:.meta:bucket.instance:X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.32:_XK5LExyX6EEIXxCD5Cws:1
 
2017-11-06 15:55:25.784404 7f8f7824b700  0 ERROR: failed to distribute cache 
for 
gn1-pf.rgw.data.root:.bucket.meta.X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.32
 
 
I see one message from a year ago: 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-June/010531.html 
 
The setup has two RGWs running: 
 
- ceph-rgw1 
- ceph-rgw2 
 
While trying to figure this out I see that a "radosgw-admin period pull" hangs 
for ever. 
 
I don't know if that is related, but it's something I've noticed. 
 
Mainly I see that at random times the RGW stalls for about 30 seconds and while 
that happens these messages show up in the RGW's log. 
 
Is anybody else running into this issue? 
 
Wido 




[ceph-users] RGW: ERROR: failed to distribute cache

2017-11-06 Thread Wido den Hollander
Hi,

On a Ceph Luminous (12.2.1) environment I'm seeing RGWs stall and about the 
same time I see these errors in the RGW logs:

2017-11-06 15:50:24.859919 7f8f5fa1a700  0 ERROR: failed to distribute cache 
for 
gn1-pf.rgw.data.root:.bucket.meta.X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.20
2017-11-06 15:50:41.768881 7f8f7824b700  0 ERROR: failed to distribute cache 
for gn1-pf.rgw.data.root:X
2017-11-06 15:55:15.781739 7f8f7824b700  0 ERROR: failed to distribute cache 
for 
gn1-pf.rgw.meta:.meta:bucket.instance:X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.32:_XK5LExyX6EEIXxCD5Cws:1
2017-11-06 15:55:25.784404 7f8f7824b700  0 ERROR: failed to distribute cache 
for 
gn1-pf.rgw.data.root:.bucket.meta.X:eb32b1ca-807a-4867-aea5-ff43ef7647c6.20755572.32

I see one message from a year ago: 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-June/010531.html

The setup has two RGWs running:

- ceph-rgw1
- ceph-rgw2

While trying to figure this out I see that a "radosgw-admin period pull" hangs 
for ever.

I don't know if that is related, but it's something I've noticed.

Mainly I see that at random times the RGW stalls for about 30 seconds and while 
that happens these messages show up in the RGW's log.

Is anybody else running into this issue?

Wido


Re: [ceph-users] Ceph Developers Monthly - October

2017-11-06 Thread kefu chai
On Thu, Oct 5, 2017 at 12:16 AM, Leonardo Vaz  wrote:
> On Wed, Oct 04, 2017 at 03:02:09AM -0300, Leonardo Vaz wrote:
>> On Thu, Sep 28, 2017 at 12:08:00AM -0300, Leonardo Vaz wrote:
>> > Hey Cephers,
>> >
>> > This is just a friendly reminder that the next Ceph Developer Montly
>> > meeting is coming up:
>> >
>> >  http://wiki.ceph.com/Planning
>> >
>> > If you have work that you're doing that it a feature work, significant
>> > backports, or anything you would like to discuss with the core team,
>> > please add it to the following page:
>> >
>> >  http://wiki.ceph.com/CDM_04-OCT-2017
>> >
>> > If you have questions or comments, please let us know.
>>

Leo,

do we have a recording for the CDM in Oct?

cheers,

-- 
Regards
Kefu Chai


Re: [ceph-users] Hammer to Jewel Upgrade - Extreme OSD Boot Time

2017-11-06 Thread Willem Jan Withagen
On 6-11-2017 14:05, Chris Jones wrote:
> I'll document the resolution here for anyone else who experiences
> similar issues.
> 
> We have determined the root cause of the long boot time was a
> combination of factors having to do with ZFS version and tuning, in
> combination with how long filenames are handled.
> 
> ## 1 ## Insufficient ARC cache size. 
> 
> Dramatically increasing the arc_max and arc_meta_limit allowed better
> performance once the cache had time to populate. Previously, each call
> to getxattr took about 8ms (0.008 sec). Multiply that by millions of
> getxattr calls during OSD daemon startup, this was taking hours. This
> only became apparent when we upgraded to Jewel. Hammer does not appear
> to parse all of the extended attributes during startup; This appeared to
> be introduced in Jewel as part of the sortbitwise algorithm.
> 
> Increasing the arc_max and arc_meta_limit allowed more of the meta data
> to be cached in memory. This reduced getxattr call duration to between
> 10 to 100 microseconds (0.0001 to 0.1 sec). An average of around
> 400x faster.
> 
> ## 2 ## ZFS version 0.6.5.11 and inability to store large amounts of
> meta info in the inode/dnode.
> 
> My understanding is that the ability to use a larger dnode size to store
> meta was not introduced until ZFS version 0.7.x. In version 0.6.5.11
> this was causing large quantities of meta data to be stored in
> inefficient spill blocks, which were taking longer to access since they
> were not cached due to (previously) undersized ARC settings.
> 
> ## Summary ##
> 
> Increasing ARC cache settings improved performance, but performance will
> still be a concern if the ARC is purged/flushed, such during system
> reboot, until the cache rebuilds itself.
> 
> Upgrading to ZFS version 0.7.x is one potential upgrade path to utilize
> larger dnode size. Another upgrade path is to switch to XFS, which is
> the recommended filesystem for CEPH. XFS does not appear to require any
> kind of meta cache due to different handling of meta info in the inode.

Hi Chris,

Thanx for the feedback, glad to see I was not completely off track.

I'm sort of failing to see how XFS could be (extremely) much faster than
ZFS when accessing data for the first time, especially if you are
accessing millions of attributes. But then again, you are running the
tests, so this is what it is. And ATM I'm not in a position to test
this in my cluster running FreeBSD/ZFS.

On FreeBSD I believe there is work on keeping the ARC on SSD hot over
cold reboots. That would mean you can have a preloaded cache
after a system reboot. But I have not really looked into this at all.

And then again, it looks like ZFSonLinux is lagging a bit in features.

Regards,
--WjW




Re: [ceph-users] Hammer to Jewel Upgrade - Extreme OSD Boot Time

2017-11-06 Thread Chris Jones
I'll document the resolution here for anyone else who experiences similar 
issues.

We have determined the root cause of the long boot time was a combination of 
factors having to do with ZFS version and tuning, in combination with how long 
filenames are handled.

## 1 ## Insufficient ARC cache size.

Dramatically increasing the arc_max and arc_meta_limit allowed better 
performance once the cache had time to populate. Previously, each call to 
getxattr took about 8ms (0.008 sec). Multiply that by millions of getxattr 
calls during OSD daemon startup, this was taking hours. This only became 
apparent when we upgraded to Jewel. Hammer does not appear to parse all of the 
extended attributes during startup; This appeared to be introduced in Jewel as 
part of the sortbitwise algorithm.

Increasing the arc_max and arc_meta_limit allowed more of the meta data to be 
cached in memory. This reduced getxattr call duration to between 10 to 100 
microseconds (0.0001 to 0.1 sec). An average of around 400x faster.
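For anyone wanting to try the same tuning, a sketch of how the ZFS-on-Linux ARC limits 
can be raised; the byte values here are placeholders, not a recommendation:

# runtime (takes effect immediately)
echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max          # e.g. 16 GiB ARC cap
echo 12884901888 > /sys/module/zfs/parameters/zfs_arc_meta_limit   # e.g. 12 GiB for metadata

# persistent across reboots
cat >> /etc/modprobe.d/zfs.conf <<'EOF'
options zfs zfs_arc_max=17179869184 zfs_arc_meta_limit=12884901888
EOF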

## 2 ## ZFS version 0.6.5.11 and inability to store large amounts of meta info 
in the inode/dnode.

My understanding is that the ability to use a larger dnode size to store meta 
was not introduced until ZFS version 0.7.x. In version 0.6.5.11 this was 
causing large quantities of meta data to be stored in inefficient spill blocks, 
which were taking longer to access since they were not cached due to 
(previously) undersized ARC settings.

## Summary ##

Increasing ARC cache settings improved performance, but performance will still 
be a concern if the ARC is purged/flushed, such during system reboot, until the 
cache rebuilds itself.

Upgrading to ZFS version 0.7.x is one potential upgrade path to utilize larger 
dnode size. Another upgrade path is to switch to XFS, which is the recommended 
filesystem for CEPH. XFS does not appear to require any kind of meta cache due 
to different handling of meta info in the inode.
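For reproducibility, timings like the ones quoted above can be gathered by tracing the 
OSD's syscalls during startup; a sketch, with the OSD id and the pgrep pattern as 
placeholders to adjust for how your OSD processes are launched:

# -T appends the time spent in each syscall; -tt adds microsecond timestamps
strace -f -T -tt -e trace=getxattr -o osd9-getxattr.trace \
    -p "$(pgrep -f 'ceph-osd.*9')"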



--
Chris


From: Willem Jan Withagen 
Sent: Wednesday, November 1, 2017 4:51:52 PM
To: Chris Jones; Gregory Farnum
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Hammer to Jewel Upgrade - Extreme OSD Boot Time

On 01/11/2017 18:04, Chris Jones wrote:
> Greg,
>
> Thanks so much for the reply!
>
> We are not clear on why ZFS is behaving poorly under some circumstances
> on getxattr system calls, but that appears to be the case.
>
> Since the last update we have discovered that back-to-back booting of
> the OSD yields very fast boot time, and very fast getxattr system calls.
>
> A longer period between boots (or perhaps related to influx of new data)
> correlates to longer boot duration. This is due to slow getxattr calls
> of certain types.
>
> We suspect this may be a caching or fragmentation issue with ZFS for
> xattrs. Use of longer filenames appear to make this worse.

As far as I understand, a lot of this data is stored in the metadata,
which is (or can be) a different set in the (L2)ARC cache.

So are you talking about an OSD reboot, or a system reboot?
I don't quite understand what you mean by back-to-back...

I have little experience with ZFS on Linux,
so whether behaviour there is different is hard for me to tell.

If you are rebooting the OSD, I can imagine that certain sequences
of rebooting pre-load the meta-cache. Reboots further apart can
lead to a different working set in the ZFS caches, and then all data
needs to be refetched instead of coming from the L2ARC.

And note that in newer ZFS versions the in memory ARC even can be
compressed, leading to an even higher hit rate.

For example on my development server with 32Gb memory:
ARC: 20G Total, 1905M MFU, 16G MRU, 70K Anon, 557M Header, 1709M Other
  17G Compressed, 42G Uncompressed, 2.49:1 Ratio

--WjW
>
> We experimented on some OSDs with swapping over to XFS as the
> filesystem, and the problem does not appear to be present on those OSDs.
>
> The two examples below are representative of a Long Boot (longer running
> time and more data influx between osd rebooting) and a Short Boot where
> we booted the same OSD back to back.
>
> Notice the drastic difference in time on the getxattr that yields the
> ENODATA return. Around 0.009 secs for "long boot" and "0.0002" secs when
> the same OSD is booted back to back. Long boot time is approx 40x to 50x
> longer. Multiplied by thousands of getxattr calls, this is/was our
> source of longer boot time.
>
> We are considering a full switch to XFS, but would love to hear any ZFS
> tuning tips that might be a short term workaround.
>
> We are using ZFS 6.5.11 prior to implementation of the ability to use
> large dnodes which would allow the use of dnodesize=auto.
>
> #Long Boot
> <0.44>[pid 3413902] 13:08:00.884238
> getxattr("/osd/9/current/20.86bs3_head/default.34597.7\\uptboatonthewaytohavanaiusedtomakealivingmanpickinthebanananowimaguidefortheciahoorayfortheusababybabymakemelocobabyba

Re: [ceph-users] s3 bucket policys

2017-11-06 Thread nigel davies
OK, I am using the Jewel version.

When I try setting permissions using s3cmd or a PHP script using S3Client,

I get the error

InvalidArgumenttest_bucket
(truncated...)
   InvalidArgument (client):  - InvalidArgumenttest_buckettx

a-005a005b91-109f-default109f-default-default



in the log on the s3 server i get

2017-11-06 12:54:41.987704 7f67a9feb700  0 failed to parse input: {
"Version": "2012-10-17",
"Statement": [
{
"Sid": "usr_upload_can_write",
"Effect": "Allow",
"Principal": {"AWS": ["arn:aws:iam:::user/test"]},
"Action": ["s3:ListBucket", "s3:PutObject"],
"Resource": ["arn:aws:s3:::test_bucket"]
}
2017-11-06 12:54:41.988219 7f67a9feb700  1 == req done
req=0x7f67a9fe57e0 op status=-22 http_status=400 ==


Any advice on this one
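For comparison, a complete, well-formed version of what that policy appears to be aiming
at; note the closing brackets, the extra object-level Resource entry (an assumption on my
part, since s3:PutObject acts on objects rather than the bucket), and that per Adam's
earlier reply quoted below, bucket policies only take effect from Luminous onward, so
Jewel will still reject it:

cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "usr_upload_can_write",
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/test"]},
      "Action": ["s3:ListBucket", "s3:PutObject"],
      "Resource": ["arn:aws:s3:::test_bucket", "arn:aws:s3:::test_bucket/*"]
    }
  ]
}
EOF
s3cmd setpolicy policy.json s3://test_bucket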

On Fri, Nov 3, 2017 at 9:54 PM, Adam C. Emerson  wrote:

> On 03/11/2017, Simon Leinen wrote:
> [snip]
> > Is this supported by the Luminous version of RadosGW?
>
> Yes! There's a few bugfixes in master that are making their way into
> Luminous, but Luminous has all the features at present.
>
> > (Or even Jewel?)
>
> No!
>
> > Does this work with Keystone integration, i.e. can we refer to Keystone
> > users as principals?
>
> In principle probably. I haven't tried it and I don't really know much
> about Keystone at present. It is hooked into the various
> IdentityApplier classes and if RGW thinks a Keystone user is a 'user'
> and you supply whatever RGW thinks its username is, then it should
> work fine. I haven't tried it, though.
>
> > Let's say there are many read-only users rather than just one.  Would we
> > simply add a new clause under "Statement" for each such user, or is
> > there a better way? (I understand that RadosGW doesn't support groups,
> > which could solve this elegantly and efficiently.)
>
> If you want to give a large number of users the same permissions, just
> put them all in the Principal array.
>
> --
> Senior Software Engineer   Red Hat Storage, Ann Arbor, MI, US
> IRC: Aemerson@OFTC, Actinic@Freenode
> 0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9


[ceph-users] Issue with "renamed" mon, crashing

2017-11-06 Thread Anders Olausson
Hi,

I recently (yesterday) upgraded to Luminous (12.2.1) running on Ubuntu 14.04.5 
LTS.
Upgrade went fine, no issues at all.
However, when I was about to use ceph-deploy to configure some new disks, it failed.
After some investigation I figured out that it didn't like that my mons were named 
differently from their hosts - for example, the mon on host ceph03 is named ceph03mon, 
and ceph-deploy gatherkeys ceph03 failed.
So I decided to rename my mons. I started with removing one of them:

# stop ceph-mon id=ceph03mon
# ceph mon remove ceph03mon
# cd /var/lib/ceph/mon/
# mv ceph-ceph03mon disabled-ceph-ceph03mon

Created the new one:

# mkdir tmp
# mkdir ceph-ceph03
# ceph auth get mon. -o tmp/keyring
# ceph mon getmap -o tmp/monmap
# ceph-mon -i ceph03 --mkfs --monmap tmp/monmap --keyring tmp/keyring
# chown -R ceph:ceph ceph-ceph03
# ceph-mon -i ceph03 --public-addr 10.10.1.23:6789
# start ceph-mon id=ceph03

It starts OK and quorum is established, but when it receives a command such as 
"ceph osd pool stat" or "ceph auth list", it crashes.

Complete log can be found at: 
http://files.spacedump.se/ceph03-monerror-20171106-01.txt
Used below settings for logging in ceph.conf at the time:

[mon]
   debug mon = 20
   debug paxos = 20
   debug auth = 20

I have now rolled back to the old monitor; it works as it should on the same 
box, etc. But it's the one that was upgraded from Hammer -> Jewel -> Luminous.

Any idea what the issue could be?
Thanks.

Best regards
  Anders Olausson


Re: [ceph-users] Reply: Re: Luminous LTS: `ceph osd crush class create` is gone?

2017-11-06 Thread Caspar Smit
"That is not true. "ceph osd crush set-device-class" will fail if the input
OSD has already been assigned a class. Instead you should do "ceph osd
crush rm-device-class" before proceeding."

You are absolutely right, sorry for the confusion!

Caspar
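A sketch of the sequence being described, assuming osd.0 should move from its
auto-assigned class to a custom one:

ceph osd crush rm-device-class osd.0               # the existing class must be removed first
ceph osd crush set-device-class myclass osd.0      # creates "myclass" if it doesn't exist yet
ceph osd crush class ls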

2017-11-04 2:22 GMT+01:00 :

> > With the caveat that the "ceph osd crush set-device-class" command only
> works on existing OSD's which already have a default assigned class so you
> cannot plan/create your classes before > adding some OSD's first.
> > The "ceph osd crush class create" command could be run without any OSD's
> configured.
>
>
> That is not true. "ceph osd crush set-device-class" will fail if the
> input OSD has already been assigned a class. Instead you should do "ceph
> osd crush rm-device-class" before proceeding.
>
> Generally you don't have to pre-create an empty class and associate some
> specific OSDs with that class through  "ceph osd crush set-device-class"
> later, as the command will automatically create it if it does not exist.
>
> Also, you can simply disable the "auto-class" feature by setting
> "osd_class_update_on_start" to false, and then newly created OSDs will be
> bound to no class (e.g., hdd/ssd).
>
> Otherwise you have to do "ceph osd crush rm-device-class" before you can
> safely reset an OSD's class to any freeform name you wish.
>
>
>
>
>
>
Original message
*From:* ;
*To:* ;
*Date:* 2017-11-03 17:41
*Subject:* Re: [ceph-users] Luminous LTS: `ceph osd crush class create`
is gone?
>
>
> 2017-11-03 7:59 GMT+01:00 Brad Hubbard :
>
>> On Fri, Nov 3, 2017 at 4:04 PM, Linh Vu  wrote:
>> > Hi all,
>> >
>> >
>> > Back in Luminous Dev and RC, I was able to do this:
>> >
>> >
>> > `ceph osd crush class create myclass`
>>
>> This was removed as part of https://github.com/ceph/ceph/pull/16388
>>
>> It looks like the set-device-class command is the replacement or
>> equivalent..
>>
>>
>> $ ceph osd crush class ls
>> [
>> "ssd"
>> ]
>>
>> $ ceph osd crush set-device-class myclass 0 1
>>
>> $ ceph osd crush class ls
>> [
>> "ssd",
>> "myclass"
>> ]
>>
>>
> With the caveat that the "ceph osd crush set-device-class" command only
> works on existing OSD's which already have a default assigned class so you
> cannot plan/create your classes before adding some OSD's first.
> The "ceph osd crush class create" command could be run without any OSD's
> configured.
>
> Kind regards,
> Caspar
>
> >
>> >
>> > so I could utilise the new CRUSH device classes feature as described
>> here:
>> > http://ceph.com/community/new-luminous-crush-device-classes/
>> >
>> >
>> >
>> > and in use here:
>> > http://blog-fromsomedude.rhcloud.com/2017/05/16/Luminous-
>> series-CRUSH-devices-class/
>> >
>> >
>> > Now I'm on Luminous LTS 12.2.1 and my custom device classes are still
>> seen
>> > in:
>> >
>> >
>> > `ceph osd crush class ls`
>> >
>> >
>> > `ceph osd tree` and so on. The cluster is working fine and healthy.
>> >
>> >
>> > However, when I run `ceph osd crush class create myclass2` now, it
>> tells me
>> > the command doesn't exist anymore.
>> >
>> >
>> > Are we not meant to create custom device classes anymore?
>> >
>> >
>> > Regards,
>> >
>> > Linh
>> >
>> >
>>
>>
>>
>> --
>> Cheers,
>> Brad
>
>
>


Re: [ceph-users] Issues with dynamic bucket indexing resharding and tenants

2017-11-06 Thread Mark Schouten

Issue #22046 created.



Kind regards,

-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076 | http://www.tuxis.nl/
T: 0318 200208 | i...@tuxis.nl



 From:   Orit Wasserman
 To:   Mark Schouten
 Cc:   ceph-users
 Sent:   5-11-2017 7:33
 Subject:   Re: [ceph-users] Issues with dynamic bucket indexing resharding
and tenants

Hi Mark, 
 
On Fri, Oct 20, 2017 at 4:26 PM, Mark Schouten  wrote: 
> Hi, 
> 
> I see issues with resharding. rgw logging shows the following: 
> 2017-10-20 15:17:30.018807 7fa1b219a700 -1 ERROR: failed to get entry from 
> reshard log, oid=reshard.13 tenant= bucket=qnapnas 
> 
> radosgw-admin shows me there is one bucket in the queue to do resharding 
> for: 
> radosgw-admin reshard list 
> [ 
>     { 
>         "time": "2017-10-20 12:37:28.575096Z", 
>         "tenant": "DB0339", 
>         "bucket_name": "qnapnas", 
>         "bucket_id": "1c19a332-7ffc-4472-b852-ec4a143785cc.19675875.3", 
>         "new_instance_id": "", 
>         "old_num_shards": 1, 
>         "new_num_shards": 4 
>     } 
> ] 
> 
> But the tenant field in the logging entry is empty, which makes me suspect 
> that the tenant part is only partially implemented. 
> 
> Also, I can add "DB0339/qnapnas" to the list: 
> radosgw-admin reshard add --bucket DB0339/qnapnas --num-shards 4 
> 
> But not like this: 
> radosgw-admin reshard add --bucket qnapnas --tenant DB0339 --num-shards 4 
> ERROR: --tenant is set, but there's no user ID 
> 
> 
> Please advise. 
 
Looks like a bug. 
Can you open a tracker issue for this ? 
 
Thanks, 
Orit 
 
> 
> Kind regards, 
> 
> -- 
> Kerio Operator in de Cloud? https://www.kerioindecloud.nl/ 
> Mark Schouten | Tuxis Internet Engineering 
> KvK: 61527076 | http://www.tuxis.nl/ 
> T: 0318 200208 | i...@tuxis.nl 
> 




Re: [ceph-users] announcing ceph-helm (ceph on kubernetes orchestration)

2017-11-06 Thread Hunter Nield
I’m not sure how I missed this earlier in the lists but having done a lot
of work on Ceph helm charts, this is of definite interest to us. We’ve been
running various states of Ceph in Docker and Kubernetes (in production
environments) for over a year now.

There is a lot of overlap between the Rook and the Ceph related projects
(ceph-docker/ceph-container and now ceph-helm) and I agree with Bassam
about finding ways of bringing things closer. Having felt (and contributed
to) the pain of the complex ceph-container entrypoint scripts, the
importance of the simpler initial configuration and user experience with
Rook and it’s Operator approach can’t be understated. There is a definite
need for a vanilla project to run Ceph on Kubernetes but the most useful
part lies in encapsulating the day-to-day operation of a Ceph cluster that
builds on the strengths of Kubernetes (CRDs, Operators, dynamic scaling,
etc).

Looking forward to seeing where this goes (and joining the discussion)

Hunter

On Sat, Nov 4, 2017 at 5:13 AM Bassam Tabbara  wrote:

> (sorry for the late response, just catching up on ceph-users)
>
> > Probably the main difference is that ceph-helm aims to run Ceph as part
> of
> > the container infrastructure.  The containers are privileged so they can
> > interact with hardware where needed (e.g., lvm for dm-crypt) and the
> > cluster runs on the host network.  We use kubernetes for some orchestration:
> > kube is a bit of a headache for mons and osds but will be very helpful
> for
> > scheduling everything else: mgrs, rgw, rgw-nfs, iscsi, mds, ganesha,
> > samba, rbd-mirror, etc.
> >
> > Rook, as I understand it at least (the rook folks on the list can speak
> up
> > here), aims to run Ceph more as a tenant of kubernetes.  The cluster runs
> > in the container network space, and the aim is to be able to deploy ceph
> > more like an unprivileged application on e.g., a public cloud providing
> > kubernetes as the cloud api.
>
> Yes Rook’s goal is to run wherever Kubernetes runs without making changes
> at the host level. Eventually we plan to remove the need to run some of the
> containers privileged, and automatically work with different kernel
> versions and heterogeneous environments. It's fair to think of Rook as an
> application of Kubernetes. As a result you could run it on AWS, Google,
> bare-metal or wherever.
>
> > The other difference is around rook-operator, which is the thing that
> lets
> > you declare what you want (ceph clusters, pools, etc) via kubectl and
> goes
> > off and creates the cluster(s) and tells it/them what to do.  It makes
> the
> > storage look like it is tightly integrated with and part of kubernetes
> but
> > means that kubectl becomes the interface for ceph cluster management.
>
> Rook extends Kubernetes to understand storage concepts like Pool, Object
> Store, FileSystems. Our goal is for storage to be integrated deeply into
> Kubernetes. That said, you can easily launch the Rook toolbox and use the
> ceph tools at any point. I don’t think the goal is for Rook to replace the
> ceph tools, but instead to offer a Kubernetes-native alternative to them.
>
> > Some of that seems useful to me (still developing opinions here!) and
> > perhaps isn't so different than the declarations in your chart's
> > values.yaml but I'm unsure about the wisdom of going too far down the
> road
> > of administering ceph via yaml.
> >
> > Anyway, I'm still pretty new to kubernetes-land and very interested in
> > hearing what people are interested in or looking for here!
>
> Would be great to find ways to get these two projects closer.
>
> Bassam
>


[ceph-users] python crush tools uses pre luminous health status

2017-11-06 Thread Stefan Priebe - Profihost AG
Hi,

While trying to use the python crush tools on a luminous cluster I get:

crush.ceph.HealthError: expected health overall_status == HEALTH_OK but
got HEALTH_WARN instead

It seems crush-1.0.35 uses the deprecated overall_status element.
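For anyone hitting the same thing: as far as I can tell, the Luminous health JSON keeps
a deprecated "overall_status" field pinned to HEALTH_WARN (which is exactly what trips up
tools still reading the old key) next to the new authoritative "status" field. A quick way
to compare the two:

ceph health --format json | jq -r '.status'            # HEALTH_OK / HEALTH_WARN / HEALTH_ERR
ceph health --format json | jq -r '.overall_status'    # deprecated on luminous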

Greets,
Stefan


Re: [ceph-users] Kernel Version for min client luminous

2017-11-06 Thread Ilya Dryomov
On Mon, Nov 6, 2017 at 8:22 AM, Stefan Priebe - Profihost AG
 wrote:
> Hello,
>
> is there already a kernel available which connects with luminous?
>
> ceph features still reports release jewel for my kernel clients.

require-min-compat-client = jewel is the default for new luminous
clusters.

4.13 supports all luminous features and should work with "ceph osd
set-require-min-compat-client luminous", if you choose to set that.
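A sketch of checking that before flipping the switch (standard luminous CLI):

ceph features                                      # per-client release / feature bits
ceph osd dump | grep min_compat_client             # current requirement
ceph osd set-require-min-compat-client luminous    # only once all clients report luminous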

Thanks,

Ilya