Re: [ceph-users] Multi-site replication speed

2019-04-19 Thread Brian Topping
Hi Casey,

I set up a completely fresh cluster on a new VM host; everything is a clean 
install. It installed cleanly, and because there is practically zero latency and 
unlimited bandwidth between peer VMs, this is a better place to experiment. The 
behavior is the same as on the other cluster.

The realm is “example-test”; it has a single zone group named “us” with zones 
“left” and “right”. The master zone is “left” and I am trying to 
unidirectionally replicate to “right”. “left” is a two-node cluster and “right” 
is a single-node cluster. Both show “too few PGs per OSD” but are otherwise 
100% active+clean. Both clusters have been completely restarted to make sure 
there are no latent config issues, although only the RGW nodes should require 
that. 

The thread at [1] is the most involved engagement I’ve found with a staff 
member on the subject, so I checked and believe I attached all the logs that 
were requested there. They all appear to be consistent and are attached below.

To start: 
> [root@right01 ~]# radosgw-admin sync status
>   realm d5078dd2-6a6e-49f8-941e-55c02ad58af7 (example-test)
>   zonegroup de533461-2593-45d2-8975-99072d860bb2 (us)
>   zone 5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe (right)
>   metadata sync syncing
> full sync: 0/64 shards
> incremental sync: 64/64 shards
> metadata is caught up with master
>   data sync source: 479d3f20-d57d-4b37-995b-510ba10756bf (left)
> syncing
> full sync: 0/128 shards
> incremental sync: 128/128 shards
> data is caught up with source
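
Since that summary says everything is caught up even though no objects ever 
arrive, I also plan to look at the per-shard and per-bucket views from the 
“right” side. A rough sketch of the commands I intend to use (the bucket name is 
just a placeholder, and `bucket sync status` may not exist on older releases):

radosgw-admin data sync status --source-zone=left --shard-id=0
radosgw-admin bucket sync status --bucket=testbucket --source-zone=left
radosgw-admin sync error list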


I tried the information at [2] and do not see any ops in progress, just 
“linger_ops”. I don’t know exactly what those are, but they probably explain the 
slow stream of requests back and forth between the two RGW endpoints:
> [root@right01 ~]# ceph daemon client.rgw.right01.54395.94074682941968 objecter_requests
> {
> "ops": [],
> "linger_ops": [
> {
> "linger_id": 2,
> "pg": "2.16dafda0",
> "osd": 0,
> "object_id": "notify.1",
> "object_locator": "@2",
> "target_object_id": "notify.1",
> "target_object_locator": "@2",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "snapid": "head",
> "registered": "1"
> },
> ...
> ],
> "pool_ops": [],
> "pool_stat_ops": [],
> "statfs_ops": [],
> "command_ops": []
> }
> 
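
Digging a little further, those linger_ops appear to be the watch/notify 
registrations RGW keeps on its notify.N control objects (used for cache 
invalidation between gateways), rather than stuck I/O. A hedged way to confirm, 
assuming the default {zone}.rgw.control pool naming:

ceph osd lspools                                   # map pool id 2 (the "@2" locator) to a name
rados -p right.rgw.control listwatchers notify.1   # should list the RGW instances watching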


The next thing I tried was `radosgw-admin data sync run --source-zone=left` from 
the right side. I get bursts of messages of the following form:
> 2019-04-19 21:46:34.281 7f1c006ad580  0 RGW-SYNC:data:sync:shard[1]: ERROR: 
> failed to read remote data log info: ret=-2
> 2019-04-19 21:46:34.281 7f1c006ad580  0 meta sync: ERROR: RGWBackoffControlCR 
> called coroutine returned -2


When I sorted and filtered the messages, each burst has one RGW-SYNC message 
per data sync shard, identified by the number in “[]”. There are 128 shards 
(which happens to match left’s 128 PGs), so these are the numbers 0-127. The 
bursts happen about once every five seconds.

The packet traces between the nodes during the `data sync run` are mostly 
requests and responses of the following form:
> HTTP GET: 
> http://right01.example.com:7480/admin/log/?type=data=7=true=de533461-2593-45d2-8975-99072d860bb2
> HTTP 404 RESPONSE: 
> {"Code":"NoSuchKey","RequestId":"tx02a01-005cba9593-371d-right","HostId":"371d-right-us"}

When I stop the `data sync run`, these 404s stop, so clearly the `data sync 
run` isn’t changing state in the RGW; it is doing something synchronously. In 
the past I have done a `data sync init`, but it doesn’t seem like doing it 
repeatedly will make a difference, so I haven’t repeated it.
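
For the record, that earlier init sequence was roughly the following (noting it 
here so others can see what has already been tried; the systemd unit name is 
from memory and may differ):

radosgw-admin data sync init --source-zone=left
systemctl restart ceph-radosgw@rgw.right01    # sync restarts from full sync on the next startup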

NEXT STEPS:

I am working on getting better logging output from the daemons and hope to find 
something there that will help. If I am lucky, I will find the cause and can 
report back so this thread is useful for others. If I have not written back, I 
probably haven’t found anything and would be grateful for any leads.
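
Concretely, the plan is something like this (a sketch; the admin socket name is 
the one shown above):

# crank up sync logging on the running gateway via its admin socket
ceph daemon client.rgw.right01.54395.94074682941968 config set debug_rgw 20
ceph daemon client.rgw.right01.54395.94074682941968 config set debug_ms 1

# or run the sync pass by hand with verbose logging captured to a file
radosgw-admin data sync run --source-zone=left --debug-rgw=20 --debug-ms=1 2>&1 | tee sync-debug.log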

Kind regards and thank you!

Brian

[1] 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/013188.html 

[2] 
http://docs.ceph.com/docs/master/radosgw/troubleshooting/?highlight=linger_ops#blocked-radosgw-requests
 


CONFIG DUMPS:

> [root@left01 ~]# radosgw-admin period get-current
> {
> "current_period": "cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c"
> }
> [root@left01 ~]# radosgw-admin period get cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c

Re: [ceph-users] Are there any statistics available on how most production ceph clusters are being used?

2019-04-19 Thread Robin H. Johnson
On Fri, Apr 19, 2019 at 12:10:02PM +0200, Marc Roos wrote:
> I am a bit curious about how production ceph clusters are being used. I am 
> reading here that the block storage is used a lot with openstack and 
> proxmox, and via iscsi with vmware. 
Have you looked at the Ceph User Surveys/Census?
https://ceph.com/ceph-blog/ceph-user-survey-2018-results/
https://ceph.com/geen-categorie/results-from-the-ceph-census/

> But since nobody here is interested in a better rgw client for end 
> users, I am wondering if the rgw is even being used like this, and what 
> most production environments look like. 
Your end-user client thread was specifically targeting GUI
clients on OSX & Windows. I feel that GUI client usage of the S3
protocol has a much higher visibility-to-data-size ratio than
automation/tooling usage.

As the quantity of data held by a single user increases, the odds that GUI
tools are used for it decrease, as it's MUCH more likely to be driven
by automation & tooling around the API.

My earliest Ceph production deployment was mostly RGW (~16TB raw), with
a little bit of RBD/iSCSI usage (~1TB of floating disk between VMs).
Very little of the RGW usage was GUI driven (there certainly was some,
because it made business sense to offer it rather than FTP sites; but it was
tiny compared to the automation flows).

The second production deployment I worked on was Dreamhost's DreamObjects,
which was over 3PB then, and MOST of the usage was still not GUI-driven.

I'm working at DigitalOcean's Spaces offering now; again, mostly non-GUI
access.

For the second part of your original query, I feel that any new clients
SHOULD not be RGW-specific; they should be able to work against a wide range
of services that expose the S3 API, and have a good test-suite around
that (s3-tests, but for testing the client implementation; even Boto is
not bug-free).

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136




Re: [ceph-users] rgw, nss: dropping the legacy PKI token support in RadosGW (removed in OpenStack Ocata)

2019-04-19 Thread Mike Lowe
I’ve run production Ceph/OpenStack since 2015. The reality is that running 
OpenStack Newton (the last release with PKI) with a post-Nautilus release just 
isn’t going to work. You are going to have bigger problems than trying to make 
object storage work with Keystone-issued tokens. Worst case, you will have to 
do the right thing and switch to fernet tokens, which are supported all the way 
back to Kilo, released 4 years ago.
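
For anyone who ends up making that switch, the Keystone side is roughly the 
following (a sketch from memory, not a migration guide; paths and the keystone 
user/group are the usual packaging defaults):

# /etc/keystone/keystone.conf
[token]
provider = fernet

# one-time key setup, then restart the Keystone service
keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone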

> On Apr 19, 2019, at 2:39 PM, Anthony D'Atri  wrote:
> 
> 
> I've been away from OpenStack for a couple of years now, so this may have 
> changed.  But back around the Icehouse release, at least, upgrading between 
> OpenStack releases was a major undertaking, so backing an older OpenStack 
> with newer Ceph seems like it might be more common than one might think.
> 
> Which is not to argue for or against dropping PKI in Ceph, but if it's going 
> to be done, please call that out early in the release notes to avoid rude 
> awakenings.
> 
> 
>> [Adding ceph-users for better usability]
>> 
>> On Fri, 19 Apr 2019, Radoslaw Zarzynski wrote:
>>> Hello,
>>> 
>>> RadosGW can use OpenStack Keystone as one of its authentication
>>> backends. Keystone in turn had been offering many token variants
>>> over the time with PKI/PKIz being one of them. Unfortunately,
>>> this specific type had many flaws (like explosion in size of HTTP
>>> header) and has been dropped from Keystone in August 2016 [1].
>>> By "dropping" I don't mean just "deprecating". PKI tokens have
>>> been physically eradicated from Keystone's code base not leaving
>>> documentation behind. This happened in OpenStack Ocata.
>>> 
>>> Intuitively I don't expect that brand new Ceph is deployed with
>>> an ancient OpenStack release. Similarly, upgrading Ceph while
>>> keeping very old OpenStack seems quite improbable.
>> 
>> This sounds reasonable to me.  If someone is running an old OpenStack, 
>> they should be able to defer their Ceph upgrade until OpenStack is 
>> upgraded... or at least transition off the old keystone variant?
>> 
>> sage
>> 
>>> If so, we may consider dropping PKI token support in further
>>> releases. What makes me perceive this idea as attractive is:
>>> 1) significant clean-up in RGW. We could remove a lot of
>>> complexity including the entire revocation machinery with
>>> its dedicated thread.
>>> 2) Killing the NSS dependency. After moving the AWS-like
>>> crypto services of RGW to OpenSSL, the CMS utilized by PKI
>>> token support is the library's sole user.
>>> I'm not saying it's a blocker for NSS removal. Likely we could
>>> reimplement the stuff on top of OpenSSL as well.
>>> All I'm worrying about is that this can be a futile effort bringing
>>> more problems/confusion than benefits. For instance, instead
>>> of just dropping the "nss_db_path" config option, we would
>>> need to replace it with a counterpart for OpenSSL or take care
>>> of differences in certificate formats between the libraries.
>>> 
>>> I can see benefits of the removal. However, the actual cost
>>> is mysterious to me. Is the feature useful?
>>> 
>>> Regards,
>>> Radek
>>> 
>>> [1]: 
>>> https://github.com/openstack/keystone/commit/8a66ef635400083fa426c0daf477038967785caf
>>> 
>>> 
> 



Re: [ceph-users] rgw, nss: dropping the legacy PKI token support in RadosGW (removed in OpenStack Ocata)

2019-04-19 Thread Anthony D'Atri


I've been away from OpenStack for a couple of years now, so this may have 
changed.  But back around the Icehouse release, at least, upgrading between 
OpenStack releases was a major undertaking, so backing an older OpenStack with 
newer Ceph seems like it might be more common than one might think.

Which is not to argue for or against dropping PKI in Ceph, but if it's going to 
be done, please call that out early in the release notes to avoid rude 
awakenings.


> [Adding ceph-users for better usability]
> 
> On Fri, 19 Apr 2019, Radoslaw Zarzynski wrote:
>> Hello,
>> 
>> RadosGW can use OpenStack Keystone as one of its authentication
>> backends. Keystone in turn had been offering many token variants
>> over the time with PKI/PKIz being one of them. Unfortunately,
>> this specific type had many flaws (like explosion in size of HTTP
>> header) and has been dropped from Keystone in August 2016 [1].
>> By "dropping" I don't mean just "deprecating". PKI tokens have
>> been physically eradicated from Keystone's code base not leaving
>> documentation behind. This happened in OpenStack Ocata.
>> 
>> Intuitively I don't expect that brand new Ceph is deployed with
>> an ancient OpenStack release. Similarly, upgrading Ceph while
>> keeping very old OpenStack seems quite improbable.
> 
> This sounds reasonable to me.  If someone is running an old OpenStack, 
> they should be able to defer their Ceph upgrade until OpenStack is 
> upgraded... or at least transition off the old keystone variant?
> 
> sage
> 
>> If so, we may consider dropping PKI token support in further
>> releases. What makes me perceive this idea as attractive is:
>> 1) significant clean-up in RGW. We could remove a lot of
>> complexity including the entire revocation machinery with
>> its dedicated thread.
>> 2) Killing the NSS dependency. After moving the AWS-like
>> crypto services of RGW to OpenSSL, the CMS utilized by PKI
>> token support is the library's sole user.
>> I'm not saying it's a blocker for NSS removal. Likely we could
>> reimplement the stuff on top of OpenSSL as well.
>> All I'm worrying about is that this can be a futile effort bringing
>> more problems/confusion than benefits. For instance, instead
>> of just dropping the "nss_db_path" config option, we would
>> need to replace it with a counterpart for OpenSSL or take care
>> of differences in certificate formats between the libraries.
>> 
>> I can see benefits of the removal. However, the actual cost
>> is mysterious to me. Is the feature useful?
>> 
>> Regards,
>> Radek
>> 
>> [1]: 
>> https://github.com/openstack/keystone/commit/8a66ef635400083fa426c0daf477038967785caf
>> 
>> 



Re: [ceph-users] Are there any statistics available on how most production ceph clusters are being used?

2019-04-19 Thread Brian Topping
> On Apr 19, 2019, at 10:59 AM, Janne Johansson  wrote:
> 
> May the most significant bit of your life be positive.

Marc, my favorite thing about open source software is it has a 100% money back 
satisfaction guarantee: If you are not completely satisfied, you can have an 
instant refund, just for waving your arm! :D

Seriously though, Janne is right, for any OSS project. Think of it like a party 
where some people go home “when it’s over” and some people stick around and 
help clean up. Using myself as an example, I’ve been asking questions about RGW 
multi-site, and now that I have a little more experience with it (not much more 
— it’s not working yet, just where I can see gaps in the documentation), I owe 
it to those that have helped me get here by filling those gaps in the docs. 

That’s where I can start, and when I understand what’s going on with more 
authority, I can go into the source and create changes that alter how it works 
for others to review.

Note in both cases I am proposing concrete changes, which is far more effective 
than trying to describe situations that others may have never been in. Many can 
try to help, but if it is frustrating for them, they will lose interest. Good 
pull requests are never frustrating to understand, even if they need more work 
to handle cases others know about. It’s a more quantitative means of expression.

If that kind of commitment doesn’t sound appealing, buy support contracts. Pay 
back into the community so that those with passion for the product can do 
exactly what I’ve described here. There’s no shame in that, but users like you 
and me need to be careful with the time of those who have put their lives into 
this, at least until we can put more into the party than we have taken out.

Hope that helps!  :B


Re: [ceph-users] Are there any statistics available on how most production ceph clusters are being used?

2019-04-19 Thread Janne Johansson
On Fri, 19 Apr 2019 at 12:10, Marc Roos wrote:

>
> [...]since nobody here is interested in a better rgw client for end
> users. I am wondering if the rgw is even being used like this, and what
> most production environments look like.
>
>
"Like this" ?

People use tons of scriptable and built-in clients, from s3cmd, to "My
backup software can use S3 as a remote backend"
You mentioned looking at two and now conclude noone wants s3...


> This could also be interesting information to decide in what direction
> ceph should develop in the future not?
>
>
Find an area that bugs you and fix it, present your results, and don't go
ape over a failed "survey" during Easter vacation.

-- 
May the most significant bit of your life be positive.


Re: [ceph-users] iSCSI LUN and target Maximums in ceph-iscsi-3.0+

2019-04-19 Thread Jason Dillaman
On Thu, Apr 18, 2019 at 3:47 PM Wesley Dillingham
 wrote:
>
> I am trying to determine some sizing limitations for a potential iSCSI 
> deployment and wondering whats still the current lay of the land:
>
> Are the following still accurate as of the ceph-iscsi-3.0 implementation 
> assuming CentOS 7.6+ and the latest python-rtslib etc from shaman:
>
> Limit of 4 gateways per cluster (source: 
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/block_device_guide/using_an_iscsi_gateway#requirements)

Yes -- at least that's what's tested. I don't know of any immediate
code-level restrictions, however. You can technically isolate a
cluster of iSCSI gateways by configuring them to access their
configuration from different pools. Of course, then things like the
Ceph Dashboard won't work correctly.
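
If you go that route, the knob is the pool each gateway group reads its
configuration from in /etc/ceph/iscsi-gateway.cfg. A minimal sketch (option
names from memory, so double-check against your ceph-iscsi version; the pool
name is just an example):

[config]
cluster_name = ceph
gateway_keyring = ceph.client.admin.keyring
# each gateway group keeps its gateway.conf object in this pool
pool = iscsi-group-a
trusted_ip_list = 192.168.0.10,192.168.0.11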

> Limit of 256 LUNS per target (source: 
> https://github.com/ceph/ceph-iscsi-cli/issues/84#issuecomment-373359179 ) 
> there is mention of this being updated in this comment: 
> https://github.com/ceph/ceph-iscsi-cli/issues/84#issuecomment-373449362 per 
> an update to rtslib but I still see the limit as 256 here: 
> https://github.com/ceph/ceph-iscsi/blob/master/ceph_iscsi_config/lun.py#L984 
> wondering if this is just an outdated limit or there is still valid reason to 
> limit the number of LUNs per target

Still a limit, although it could possibly be removed. Until recently,
it was painfully slow to add hundreds of LUNs, so assuming that has
been addressed, perhaps this limit could be removed -- it just makes
testing harder.

> Limit of 1 target per cluster: 
> https://github.com/ceph/ceph-iscsi-cli/issues/104#issuecomment-396224922

SUSE added support for multiple targets per cluster.

>
> Thanks in advance.
>
>
>
>
>
> Respectfully,
>
> Wes Dillingham
> wdilling...@godaddy.com
> Site Reliability Engineer IV - Platform Storage / Ceph
>



-- 
Jason


Re: [ceph-users] Explicitly picking active iSCSI gateway at RBD/LUN export time.

2019-04-19 Thread Jason Dillaman
On Wed, Apr 17, 2019 at 10:48 AM Wesley Dillingham
 wrote:
>
> The man page for gwcli indicates:
>
> "Disks exported through the gateways use ALUA attributes to provide 
> ActiveOptimised and ActiveNonOptimised  access  to the rbd images. Each disk 
> is assigned a primary owner at creation/import time"
>
> I am trying to determine whether I can explicitly set which gateway will be 
> the "owner" at the creation import time. Further is it possible to change 
> after the initial assignment which gateway is the "owner" through the gwcli.

That's not currently possible via the "gwcli". The owner is
auto-selected based on the gateway with the fewest active LUNs. If you
hand-modified the "gateway.conf" object in the "rbd" pool, you could
force update the owner, but you would need to restart your gateways to
pick up the change.
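
Roughly, that hand edit would look like the following (a sketch, not a
supported workflow; back up the object first, and the exact JSON layout and the
name of the owner field depend on your ceph-iscsi version):

rados -p rbd get gateway.conf /tmp/gateway.conf     # the config is a JSON blob
cp /tmp/gateway.conf /tmp/gateway.conf.bak
vi /tmp/gateway.conf                                # adjust the disk's owner entry
rados -p rbd put gateway.conf /tmp/gateway.conf
systemctl restart rbd-target-api                    # on each gateway, to reload the config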

> Thanks.
>
> Respectfully,
>
> Wes Dillingham
> wdilling...@godaddy.com
> Site Reliability Engineer IV - Platform Storage / Ceph
>



-- 
Jason


Re: [ceph-users] Ceph inside Docker containers inside VirtualBox

2019-04-19 Thread Varun Singh
On Fri, Apr 19, 2019 at 10:44 AM Varun Singh  wrote:
>
> On Thu, Apr 18, 2019 at 9:53 PM Siegfried Höllrigl
>  wrote:
> >
> > Hi !
> >
> > I am not 100% sure, but I think --net=host does not propagate /dev/
> > inside the container.
> >
> > From the error message:
> >
> > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR- The
> > device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !
> >
> >
> > I would say you should add something like --device=/dev/vdd to the docker 
> > run command for the osd.
> >
> > Br
> >
> >
> > Am 18.04.2019 um 14:46 schrieb Varun Singh:
> > > Hi,
> > > I am trying to setup Ceph through Docker inside a VM. My host machine
> > > is Mac. My VM is an Ubuntu 18.04. Docker version is 18.09.5, build
> > > e8ff056.
> > > I am following the documentation present on ceph/daemon Docker Hub
> > > page. The idea is, if I spawn docker containers as mentioned on the
> > > page, I should get a ceph setup without KV store. I am not worried
> > > about KV store as I just want to try it out. Following are the
> > > commands I am firing to bring the containers up:
> > >
> > > Monitor:
> > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v
> > > /var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e
> > > CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon mon
> > >
> > > Manager:
> > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v
> > > /var/lib/ceph/:/var/lib/ceph/ ceph/daemon mgr
> > >
> > > OSD:
> > > docker run -d --net=host --pid=host --privileged=true -v
> > > /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ -e
> > > OSD_DEVICE=/dev/vdd ceph/daemon osd
> > >
> > >  From the above commands I am able to spawn monitor and manager
> > > properly. I verified this by firing this command on both monitor and
> > > manager containers:
> > > sudo docker exec d1ab985 ceph -s
> > >
> > > I get following outputs for both:
> > >
> > >cluster:
> > >  id: 14a6e40a-8e54-4851-a881-661a84b3441c
> > >  health: HEALTH_OK
> > >
> > >services:
> > >  mon: 1 daemons, quorum serverceph-VirtualBox (age 62m)
> > >  mgr: serverceph-VirtualBox(active, since 56m)
> > >  osd: 0 osds: 0 up, 0 in
> > >
> > >data:
> > >  pools:   0 pools, 0 pgs
> > >  objects: 0 objects, 0 B
> > >  usage:   0 B used, 0 B / 0 B avail
> > >  pgs:
> > >
> > > However when I try to bring up OSD using above command, it doesn't
> > > work. Docker logs show this output:
> > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: static:
> > > does not generate config
> > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR- The
> > > device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !
> > >
> > > I am not sure why the doc asks to pass /dev/vdd to OSD_DEVICE env var.
> > > I know there are five different ways to spawning the OSD, but I am not
> > > able to figure out which one would be suitable for a simple
> > > deployment. If you could please let me know how to spawn OSDs using
> > > Docker, it would help a lot.
> > >
> > >
>
> Thanks Br, I will try this out today.
>
> --
> Regards,
> Varun Singh

Hi,
So following your suggestion I tried following two commands:
1. I added the --device=/dev/vdd switch without removing the OSD_DEVICE env
var. This resulted in the same error as before:
docker run -d --net=host --pid=host --privileged=true
--device=/dev/vdd -v /etc/ceph:/etc/ceph -v
/var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdd
ceph/daemon osd


2. Then I removed the OSD_DEVICE env var and just added the --device=/dev/vdd switch:
docker run -d --net=host --pid=host --privileged=true
--device=/dev/vdd -v /etc/ceph:/etc/ceph -v
/var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/  ceph/daemon osd

The OSD_DEVICE related error went away and I think ceph created an OSD
successfully, but it wasn't able to connect to the cluster. Is it because
I did not give any network-related information? I get the following
error now:

2019-04-18 08:30:47  /opt/ceph-container/bin/entrypoint.sh: static:
does not generate config
2019-04-18 08:30:47  /opt/ceph-container/bin/entrypoint.sh:
Bootstrapped OSD(s) found; using OSD directory
2019-04-18 08:30:47  /opt/ceph-container/bin/entrypoint.sh: Creating osd
2019-04-18 08:30:52.944 7f897ca6d700 -1 auth: unable to find a keyring
on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or
directory
2019-04-18 08:30:52.944 7f897ca6d700 -1 AuthRegistry(0x7f8978063e78)
no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring,
disabling cephx
2019-04-18 08:30:52.964 7f897ca6d700 -1 auth: unable to find a keyring
on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or
directory
2019-04-18 08:30:52.964 7f897ca6d700 -1 AuthRegistry(0x7f89780bcad8)
no keyring found at /var/lib/ceph/bootstrap-osd/ceph.keyring,
disabling cephx
2019-04-18 08:30:52.964 7f897ca6d700 -1 auth: unable to find a keyring
on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or
directory
2019-04-18 08:30:52.964 7f897ca6d700 -1 
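
My current guess is that the container is missing the bootstrap-osd keyring
mentioned in the error. I plan to export it from the monitor and retry,
roughly like this (an untested sketch; the paths are the defaults from the
error message):

# on the host (or inside the mon container, which has the admin keyring)
ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring

# then re-run the OSD container with /var/lib/ceph bind-mounted as before
docker run -d --net=host --pid=host --privileged=true --device=/dev/vdd \
  -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ ceph/daemon osd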

Re: [ceph-users] Intel SSD D3-S4510 and Intel SSD D3-S4610 firmware advisory notice

2019-04-19 Thread Vytautas Jonaitis
Hello all,

Thanks! According to Intel, the affected drives are the D3-S4510 and D3-S4610 
Series 1.92TB and 3.84TB models.
For those who have these SSDs connected to an LSI/Avago/Broadcom MegaRAID 
controller, do not forget to run this before updating:

isdct set -system EnableLSIAdapter=true
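
To check which firmware a drive is currently running and to push the update 
with the same tool, something like this should work (a sketch; the drive index 
is an example and the tool has to be a version that bundles XCV10110):

isdct show -intelssd        # lists drives with their index and firmware version
isdct load -intelssd 0      # loads the firmware bundled with the tool onto drive index 0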


Regards,
Vytautas J.

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan 
Kooman
Sent: Friday, April 19, 2019 10:08 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Intel SSD D3-S4510 and Intel SSD D3-S4610 firmware 
advisory notice

Hi List,

TL;DR:

For those of you who are running a Ceph cluster with Intel SSD D3-S4510 and or 
Intel SSD D3-S4610 with firmware version XCV10100 please upgrade to firmware 
XCV10110 ASAP. At least before ~ 1700 power up hours.

More information here:

https://support.microsoft.com/en-us/help/4499612/intel-ssd-drives-unresponsive-after-1700-idle-hours

https://downloadcenter.intel.com/download/28673/SSD-S4510-S4610-2-5-non-searchable-firmware-links/

Gr. Stefan

P.s. Thanks to Frank Dennis (@jedisct1) for retweeting @NerdPyle:
https://twitter.com/jedisct1/status/1118623635072258049


-- 
| BIT BV  http://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


Re: [ceph-users] rgw, nss: dropping the legacy PKI token support in RadosGW (removed in OpenStack Ocata)

2019-04-19 Thread Sage Weil
[Adding ceph-users for better usability]

On Fri, 19 Apr 2019, Radoslaw Zarzynski wrote:
> Hello,
> 
> RadosGW can use OpenStack Keystone as one of its authentication
> backends. Keystone in turn had been offering many token variants
> over the time with PKI/PKIz being one of them. Unfortunately,
> this specific type had many flaws (like explosion in size of HTTP
> header) and has been dropped from Keystone in August 2016 [1].
> By "dropping" I don't mean just "deprecating". PKI tokens have
> been physically eradicated from Keystone's code base not leaving
> documentation behind. This happened in OpenStack Ocata.
> 
> Intuitively I don't expect that brand new Ceph is deployed with
> an ancient OpenStack release. Similarly, upgrading Ceph while
> keeping very old OpenStack seems quite improbable.

This sounds reasonable to me.  If someone is running an old OpenStack, 
they should be able to defer their Ceph upgrade until OpenStack is 
upgraded... or at least transition off the old keystone variant?

sage

> If so, we may consider dropping PKI token support in further
> releases. What makes me perceive this idea as attractive is:
> 1) significant clean-up in RGW. We could remove a lot of
> complexity including the entire revocation machinery with
> its dedicated thread.
> 2) Killing the NSS dependency. After moving the AWS-like
> crypto services of RGW to OpenSSL, the CMS utilized by PKI
> token support is the library's sole user.
> I'm not saying it's a blocker for NSS removal. Likely we could
> reimplement the stuff on top of OpenSSL as well.
> All I'm worrying about is that this can be a futile effort bringing
> more problems/confusion than benefits. For instance, instead
> of just dropping the "nss_db_path" config option, we would
> need to replace it with a counterpart for OpenSSL or take care
> of differences in certificate formats between the libraries.
> 
> I can see benefits of the removal. However, the actual cost
> is mysterious to me. Is the feature useful?
> 
> Regards,
> Radek
> 
> [1]: 
> https://github.com/openstack/keystone/commit/8a66ef635400083fa426c0daf477038967785caf
> 
> 


Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-19 Thread Brad Hubbard
OK. So this works for me with master commit
bdaac2d619d603f53a16c07f9d7bd47751137c4c on CentOS 7.5.1804.

I cloned the repo and ran './install-deps.sh' and './do_cmake.sh
-DWITH_FIO=ON' then 'make all'.

# find ./lib  -iname '*.so*' | xargs nm -AD 2>&1 | grep
_ZTIN13PriorityCache8PriCacheE
./lib/libfio_ceph_objectstore.so:018f72d0 V
_ZTIN13PriorityCache8PriCacheE

# LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
conf: Path to a ceph configuration file
oi_attr_len : Set OI(aka '_') attribute to specified length
snapset_attr_len: Set 'snapset' attribute to specified length
_fastinfo_omap_len  : Set '_fastinfo' OMAP attribute to specified length
pglog_simulation: Enables PG Log simulation behavior
pglog_omap_len  : Set pglog omap entry to specified length
pglog_dup_omap_len  : Set duplicate pglog omap entry to specified length
single_pool_mode: Enables the mode when all jobs run against
the same pool
preallocate_files   : Enables/disables file preallocation (touch
and resize) on init

So my result above matches your result on Ubuntu but not on CentOS. It
looks to me like we used to define the symbol in libceph-common, but currently
it's defined in libfio_ceph_objectstore.so. For reasons that are
unclear you are seeing the old behaviour. Why this is, and why it isn't
working as designed, is not clear to me, but I suspect that if you clone the
repo again and build from scratch (maybe in a different directory if
you wish to keep debugging, see below) you should get a working
result. Could you try that as a test?

If, on the other hand, you wish to keep debugging your current
environment I'd suggest looking at the output of the following command
as it may shed further light on the issue.

# LD_DEBUG=all LD_LIBRARY_PATH=./lib ./bin/fio
--enghelp=libfio_ceph_objectstore.so

'LD_DEBUG=lib' may suffice but that's difficult to judge without
knowing what the problem is. I still suspect somehow you have
mis-matched libraries and, if that's the case, it's probably not worth
pursuing. If you can give me specific steps so I can reproduce this
from a freshly cloned tree I'd be happy to look further into it.

Good luck.
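
Once the engine loads, a minimal job file along these lines should exercise it
(a sketch only; the conf file has to describe the objectstore under test, and
the example job files shipped under src/test/fio/ in the tree are the
authoritative reference):

# objectstore.fio (sketch)
# "conf" points at a small ceph.conf fragment describing the bluestore instance
[global]
ioengine=external:lib/libfio_ceph_objectstore.so
conf=ceph-bluestore.conf
rw=randwrite
iodepth=16
size=64m

[write-4k]
bs=4k

Run it with the build's libraries on the path:

LD_LIBRARY_PATH=./lib ./bin/fio objectstore.fio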

On Thu, Apr 18, 2019 at 7:00 PM Brad Hubbard  wrote:
>
> Let me try to reproduce this on centos 7.5 with master and I'll let
> you know how I go.
>
> On Thu, Apr 18, 2019 at 3:59 PM Can Zhang  wrote:
> >
> > Using the commands you provided, I actually find some differences:
> >
> > On my CentOS VM:
> > ```
> > # sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./libceph-common.so:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > ./libceph-common.so.0:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > ./libfio_ceph_objectstore.so: U 
> > _ZTIN13PriorityCache8PriCacheE
> > ```
> > ```
> > # ldd libfio_ceph_objectstore.so |grep common
> > libceph-common.so.0 => /root/ceph/build/lib/libceph-common.so.0
> > (0x7fd13f3e7000)
> > ```
> > On my Ubuntu VM:
> > ```
> > $ sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./libfio_ceph_objectstore.so:019d13e0 V 
> > _ZTIN13PriorityCache8PriCacheE
> > ```
> > ```
> > $ ldd libfio_ceph_objectstore.so |grep common
> > libceph-common.so.0 =>
> > /home/can/work/ceph/build/lib/libceph-common.so.0 (0x7f024a89e000)
> > ```
> >
> > Notice the "U" and "V" from nm results.
> >
> >
> >
> >
> > Best,
> > Can Zhang
> >
> > On Thu, Apr 18, 2019 at 9:36 AM Brad Hubbard  wrote:
> > >
> > > Does it define _ZTIN13PriorityCache8PriCacheE ? If it does, and all is
> > > as you say, then it should not say that _ZTIN13PriorityCache8PriCacheE
> > > is undefined. Does ldd show that it is finding the libraries you think
> > > it is? Either it is finding a different version of that library
> > > somewhere else or the version you have may not define that symbol.
> > >
> > > On Thu, Apr 18, 2019 at 11:12 AM Can Zhang  wrote:
> > > >
> > > > It's already in LD_LIBRARY_PATH, under the same directory of
> > > > libfio_ceph_objectstore.so
> > > >
> > > >
> > > > $ ll lib/|grep libceph-common
> > > > lrwxrwxrwx. 1 root root19 Apr 17 11:15 libceph-common.so ->
> > > > libceph-common.so.0
> > > > -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
> > > >
> > > >
> > > >
> > > >
> > > > Best,
> > > > Can Zhang
> > > >
> > > > On Thu, Apr 18, 2019 at 7:00 AM Brad Hubbard  
> > > > wrote:
> > > > >
> > > > > On Wed, Apr 17, 2019 at 1:37 PM Can Zhang  wrote:
> > > > > >
> > > > > > Thanks for your suggestions.
> > > > > >
> > > > > > I tried to build libfio_ceph_objectstore.so, but it fails to load:
> > > > > >
> > > > > > ```
> > > > > > $ LD_LIBRARY_PATH=./lib ./bin/fio 
> > > > > > --enghelp=libfio_ceph_objectstore.so
> > > > > >
> > > > > > fio: engine libfio_ceph_objectstore.so not loadable
> > > > > > IO engine libfio_ceph_objectstore.so not found
> > 

[ceph-users] Are there any statistics available on how most production ceph clusters are being used?

2019-04-19 Thread Marc Roos


I am a bit curious about how production ceph clusters are being used. I am 
reading here that the block storage is used a lot with openstack and 
proxmox, and via iscsi with vmware. 
But since nobody here is interested in a better rgw client for end 
users, I am wondering if the rgw is even being used like this, and what 
most production environments look like. 

This could also be interesting information for deciding in what direction 
ceph should develop in the future, no?



Re: [ceph-users] rgw windows/mac clients shitty, develop a new one?

2019-04-19 Thread Brian :
I've always used the standalone mac and Linux package version. Wasn't aware
of the 'bundled software' in the installers. Ugh. Thanks for pointing it
out.

On Thursday, April 18, 2019, Janne Johansson  wrote:
> https://www.reddit.com/r/netsec/comments/8t4xrl/filezilla_malware/
>
> not saying it definitely is, or isn't malware-ridden, but it sure was
shady at that time.
> I would suggest not pointing people to it.
>
> Den tors 18 apr. 2019 kl 16:41 skrev Brian : :
>>
>> Hi Marc
>>
>> Filezilla has decent S3 support https://filezilla-project.org/
>>
>> ymmv of course!
>>
>> On Thu, Apr 18, 2019 at 2:18 PM Marc Roos 
wrote:
>> >
>> >
>> > I have been looking a bit at the s3 clients available to be used, and I
>> > think they are quite shitty, especially this Cyberduck that processes
>> > files with default reading rights to everyone. I am in the process of
>> > advising clients to use, for instance, this Mountain Duck. But I am not too
>> > happy about it. I don't like the fact that everything has default
>> > settings for amazon or other stuff in there for ftp or what ever.
>> >
>> > I am thinking of developing something in-house, more aimed at the ceph
>> > environments, easier/better to use.
>> >
>> > What I can think of:
>> >
>> > - cheaper, free or maybe even opensource
>> > - default settings for your ceph cluster
>> > - only configuration for object storage (no amazon, rackspace,
backblaze
>> > shit)
>> > - default secure settings
>> > - offer in the client only functionality that is available from the
>> > specific ceph release
>> > - integration with the finder / explorer windows
>> >
>> > I am curious who would be interested in such a new client? Maybe better
>> > to send me your wishes directly, and not clutter the mailing list with
>> > this.
>
>
> --
> May the most significant bit of your life be positive.
>


Re: [ceph-users] Intel SSD D3-S4510 and Intel SSD D3-S4610 firmware advisory notice

2019-04-19 Thread Irek Fasikhov
Wow!!!

On Fri, 19 Apr 2019 at 10:16, Stefan Kooman wrote:

> Hi List,
>
> TL;DR:
>
> For those of you who are running a Ceph cluster with Intel SSD D3-S4510
> and or Intel SSD D3-S4610 with firmware version XCV10100 please upgrade
> to firmware XCV10110 ASAP. At least before ~ 1700 power up hours.
>
> More information here:
>
>
> https://support.microsoft.com/en-us/help/4499612/intel-ssd-drives-unresponsive-after-1700-idle-hours
>
>
> https://downloadcenter.intel.com/download/28673/SSD-S4510-S4610-2-5-non-searchable-firmware-links/
>
> Gr. Stefan
>
> P.s. Thanks to Frank Dennis (@jedisct1) for retweeting @NerdPyle:
> https://twitter.com/jedisct1/status/1118623635072258049
>
>
> --
> | BIT BV  http://www.bit.nl/  Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl


[ceph-users] Intel SSD D3-S4510 and Intel SSD D3-S4610 firmware advisory notice

2019-04-19 Thread Stefan Kooman
Hi List,

TL;DR:

For those of you who are running a Ceph cluster with Intel SSD D3-S4510
and or Intel SSD D3-S4610 with firmware version XCV10100 please upgrade
to firmware XCV10110 ASAP. At least before ~ 1700 power up hours.

More information here:

https://support.microsoft.com/en-us/help/4499612/intel-ssd-drives-unresponsive-after-1700-idle-hours

https://downloadcenter.intel.com/download/28673/SSD-S4510-S4610-2-5-non-searchable-firmware-links/

Gr. Stefan

P.s. Thanks to Frank Dennis (@jedisct1) for retweeting @NerdPyle:
https://twitter.com/jedisct1/status/1118623635072258049


-- 
| BIT BV  http://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl