[ceph-users] Re: Ceph OIDC Integration

2020-10-13 Thread Pritha Srivastava
Hello,

rgw sts key should be a key of length 16, since we use AES-128 for
encryption (e.g. rgw sts key = abcdefghijklmnop).

Yes, it should be 'sts_client' and not 'client'. The errors in the documentation
have been noted and will be corrected.

Also, please note that the backport of the new changes to Octopus is
underway (https://github.com/ceph/ceph/pull/37640), and they should be
available in the next Octopus release.
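
For reference, a minimal corrected sketch of that part of the example; all
concrete values below (keys, endpoint, role ARN, token) are placeholders and
not taken from your setup:

# Minimal sketch only: every concrete value is a hypothetical placeholder.
import boto3

sts_client = boto3.client(
    'sts',
    aws_access_key_id='TESTER_ACCESS_KEY',        # placeholder
    aws_secret_access_key='TESTER_SECRET_KEY',    # placeholder
    endpoint_url='http://rgw.example.com:8000',   # placeholder RGW endpoint
    region_name='',
)

# Call the method on sts_client (the variable defined above), not on an
# undefined 'client'. The role ARN would normally come from the earlier
# role-creation step (role_response['Role']['Arn']).
response = sts_client.assume_role_with_web_identity(
    RoleArn='arn:aws:iam:::role/S3Access',        # placeholder role ARN
    RoleSessionName='Bob',
    DurationSeconds=3600,
    WebIdentityToken='<OIDC access token from Keycloak>',  # placeholder
)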

Thanks,
Pritha



On Tue, Oct 13, 2020 at 9:22 PM  wrote:

> Hi Pritha and thanks for your reply. We are using Ceph Octopus and we have
> switched to Keycloak from dexIdP.
>
> Having said that we have followed the guide from
> https://docs.ceph.com/en/octopus/radosgw/STS/ but we are constantly
> having an issue with the AssumeRoleWithWebIdentity example.
>
> We are using 2 different accounts for role creation and policy creation,
> and those 2 parts of the example script are working fine, but when we move
> over to the assume_role_with_web_identity part we get a Forbidden error from
> Ceph.
>
> We have used cephadm to install Ceph which is at:
> # ceph --version
> ceph version 15.2.5 (2c93eff00150f0cc5f106a559557a58d3d7b6f1f) octopus
> (stable)
>
> We used the following command to add the role capabilities for both users:
> radosgw-admin caps add --uid="TESTER" --caps="roles=*"
> radosgw-admin caps add --uid="TESTER1" --caps="roles=*"
>
> We have set the capabilities for the 2 users mentioned above as shown here:
> buckets (*)
> metadata (*)
> roles (*)
> usage (*)
> user-policy (*)
> users (*)
> zone (*)
>
> ---
>
> Can you please confirm whether the key names actually have spaces in them,
> or are they missing underscores?
> [client.radosgw.gateway]
> rgw sts key = {sts key for encrypting the session token}
> rgw s3 auth use sts = true
>
> ---
>
> We are also getting a "NameError: name 'client' is not defined" error from
> the AssumeRoleWithWebIdentity example, in the part shown below. Shouldn't it be
> "sts_client.assume_role_with_web_identity" instead of
> "client.assume_role_with_web_identity", since the client is defined as sts_client
> in the code above it?
>
> sts_client = boto3.client('sts',
> aws_access_key_id=,
> aws_secret_access_key=,
> endpoint_url=,
> region_name='',
> )
>
> response = client.assume_role_with_web_identity(
> RoleArn=role_response['Role']['Arn'],
> RoleSessionName='Bob',
> DurationSeconds=3600,
> WebIdentityToken=
> )
>
> Can you or anyone give us some pointers to this issue please?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph test cluster, how to estimate performance.

2020-10-13 Thread Daniel Mezentsev

Hi guys,
Thanks for the replies. I looked through that table - hmm, it really is
true, the Samsung Pro is not really "pro". Well, I'm getting what I'm paying
for. My main question was whether I'm getting performance adequate to my
disk, and it seems that I am. My tests show 7-8 kIOPS; with replication
factor 2 that is ~15-16 kIOPS at the disk, which correlates well with that
table. It tells me I don't lose anything on the way to the Ceph cluster -
no issue with the client, no issue with the network (funny, everything goes
through virtio), and no problem with the CPU. Only the disk is the
bottleneck. Now I know exactly what is happening with my cluster.


Thanks again for the help.


Hello Daniel,

yes, the Samsung "Pro" SSD series isn't all that "pro", especially when it's
about write IOPS. I would tend to say get some Intel S4510 if you can
afford it. If you can't, you can still try to activate overprovisioning
on the SSD; I would tend to say reserve 10-30% of the SSD for wear
leveling (writing). First check the number of sectors with hdparm -N
/dev/sdX, then set a permanent HPA (host protected area) on the disk. The
"p" and the absence of a space are important.

hdparm -Np${SECTORS} --yes-i-know-what-i-am-doing /dev/sdX

Wait a little (!), power cycle, and re-check the disk with hdparm -N
/dev/sdX. My Samsung 850 Pros are a little reluctant to accept the
setting, but after a few tries or a little waiting the change becomes permanent.
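
As an illustration, a rough sketch of how the sector count could be derived
if you want to reserve ~10%; the device name and percentage are only
examples, and the parsing assumes the usual "max sectors = X/Y" output of
hdparm -N:

# Sketch only: reserve ~10% of the disk as HPA for wear leveling.
# /dev/sdX is a placeholder; double-check the numbers hdparm reports first.
DEV=/dev/sdX
TOTAL=$(hdparm -N "$DEV" | grep -o 'max sectors *= *[0-9]*/[0-9]*' | cut -d/ -f2)
SECTORS=$(( TOTAL * 90 / 100 ))   # visible sectors = 90% of the native max
echo "Setting HPA: ${SECTORS} of ${TOTAL} sectors visible"
hdparm -Np${SECTORS} --yes-i-know-what-i-am-doing "$DEV"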

At least the Samsung 850 Pro stopped dying suddenly with that setting.
Without it, the SSD occasionally disconnected from the bus and reappeared
after a power cycle. I suspect it ran out of wear-leveling reserve or something.

HTH,

derjohn

On 13.10.20 08:41, Martin Verges wrote:

Hello Daniel,

just throw away your crappy Samsung SSD 860 Pro. It won't work in an
acceptable way.

See
https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit?usp=sharing
for a performance indication of individual disks.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Tue, 13 Oct 2020 at 07:31, Daniel Mezentsev wrote:
Hi Ceph users,

I'm working on a Common Lisp client utilizing the rados library. I got some
results, but I don't know how to estimate whether I am getting correct
performance. I'm running a test cluster from a laptop - 2 OSDs, each a VM
with 4 GB RAM and 4 vCPUs; the monitors and mgr are running on the same
VM(s). As for storage, I have a Samsung SSD 860 Pro, 512 GB. The disk is
split into 2 logical volumes (LVM), and those volumes are attached to the
VMs. I know I can't expect too much from that layout, I just want to know
if I'm getting adequate numbers. I'm doing read/write operations on very
small objects - up to 1 KB. For async writes I'm getting ~7.5-8.0 kIOPS.
Synchronous reads are pretty much the same, 7.5-8.0 kIOPS. Async reads are
segfaulting, I don't know why. The disk itself is capable of delivering
well above 50 kIOPS. The difference is an order of magnitude. Any info is
more than welcome.
  Daniel Mezentsev, founder
(+1) 604 313 8592.
Soleks Data Group.
Shaping the clouds.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Andreas John
net-lab GmbH  |  Frankfurter Str. 99  |  63067 Offenbach
Geschaeftsfuehrer: Andreas John | AG Offenbach, HRB40832
Tel: +49 69 8570033-1 | Fax: -2 | http://www.net-lab.net

Facebook: https://www.facebook.com/netlabdotnet
Twitter: https://twitter.com/netlabdotnet

___
ceph-users mailing list -- ceph-us...@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

 Daniel Mezentsev, founder
(+1) 604 313 8592.
Soleks Data Group.
Shaping the clouds.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Switching to a private repository

2020-10-13 Thread Liam MacKenzie
Hi,

What is the correct procedure to change a running cluster deployed with cephadm 
to a custom container from a private repository that requires a login?

I would have thought something like this would have been the right way, but the 
second command fails with an authentication error.
$ cephadm registry-login --registry-url myregistry.com/ceph/ 
--registry-username 'robot$myuser' --registry-password 'mysecret'
$ ceph orch upgrade start --image myregistry.com/ceph/ceph:v15.2.5

Thanks,
Liam


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES

2020-10-13 Thread Seena Fallah
Hi all,

Is TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES configured just for filestore, or
can it be used for bluestore, too?
https://github.com/ceph/ceph/blob/master/etc/default/ceph#L7
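
For context, the knob is an environment variable set in the daemon
environment file; a minimal sketch of such a line is below (the 128 MiB
value is only an illustrative example, check the linked file for the actual
default in your tree):

# /etc/default/ceph -- sketch only; the value here is an example
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728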

Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ubuntu 20 with octopus

2020-10-13 Thread Tobias Urdin
Hello,


A little off topic, but there isn't much activity in the Puppet community 
around Ceph, so if it's something you'd like to share, and if the module is 
reusable, I'm sure there are people who could make good use of it.


The "official" ceph/puppet-ceph is dead which was forked out of the Puppet 
OpenStack puppet-ceph project which is still alive but poorly maintained and 
based on Exec resources.


Best regards


From: Burkhard Linke 
Sent: Monday, October 12, 2020 11:26:05 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Ubuntu 20 with octopus

Hi,

On 10/12/20 2:31 AM, Seena Fallah wrote:
> Hi all,
>
> Does anyone have a production cluster on Ubuntu 20 (focal), or any
> suggestions or known bugs that prevent deploying Ceph Octopus on Ubuntu 20?


We are running our new ceph cluster on Ubuntu 20.04 and ceph octopus
release. Packages are taken from the official ceph repository (15.2.5);
the ubuntu repositories only contain 15.2.3.


We use a custom puppet module for installing the software; service
configuration is done manually. Took ~ 1 day for 3 mons/mgr and several
100 OSDs.


No cephadm, since it has too many bugs and limitations in our environment.


So TL;DR: runs out of the box with official packages, no problems so
far, not using cephadm


Regards,

Burkhard

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Proxmox+Ceph Benchmark 2020

2020-10-13 Thread Maged Mokhtar



Very nice and useful document. One thing is not clear to me: the fio
parameters in appendix 5:

--numjobs=<1|4> --iodepths=<1|32>

It is not clear if/when the iodepth was set to 32 - was it used with all
tests with numjobs=4? Or was it:

--numjobs=<1|4> --iodepths=1
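
For concreteness, the two readings correspond to something like the
invocations sketched below; everything apart from numjobs/iodepth is my own
assumption (including the /dev/rbd0 target), not copied from the paper's
appendix:

# Reading 1: 4 jobs, each with iodepth 32
fio --name=bench --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
    --numjobs=4 --iodepth=32 --runtime=60 --time_based --filename=/dev/rbd0

# Reading 2: 4 jobs, each with iodepth 1
fio --name=bench --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
    --numjobs=4 --iodepth=1 --runtime=60 --time_based --filename=/dev/rbd0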

/maged

On 13/10/2020 12:17, Alwin Antreich wrote:

Hello fellow Ceph users,

we have released our new Ceph benchmark paper [0]. The platform and
hardware used are Proxmox VE 6.2 with Ceph Octopus on a new AMD EPYC Zen 2 CPU
with U.2 SSDs (details in the paper).

The paper should illustrate the performance that is possible with a 3-node
cluster without significant tuning.

I welcome everyone to share their experience and add to the discussion,
preferably in our forum thread [1] with our fellow Proxmox VE users.

--
Cheers,
Alwin

[0] https://proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark-2020-09
[1] 
https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2020-09-hyper-converged-with-nvme.76516/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Proxmox+Ceph Benchmark 2020

2020-10-13 Thread Alex Gorbachev
Alwin, this is excellent info.  We have a lab on AMD with a similar setup
with NVMe on Proxmox, and will try these benchmarks as well.

--
Alex Gorbachev
Intelligent Systems Services Inc. STORCIUM


On Tue, Oct 13, 2020 at 6:18 AM Alwin Antreich 
wrote:

> Hello fellow Ceph users,
>
> we have released our new Ceph benchmark paper [0]. The platform and
> hardware used are Proxmox VE 6.2 with Ceph Octopus on a new AMD EPYC Zen 2 CPU
> with U.2 SSDs (details in the paper).
>
> The paper should illustrate the performance that is possible with a 3-node
> cluster without significant tuning.
>
> I welcome everyone to share their experience and add to the discussion,
> preferably in our forum thread [1] with our fellow Proxmox VE users.
>
> --
> Cheers,
> Alwin
>
> [0]
> https://proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark-2020-09
> [1]
> https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2020-09-hyper-converged-with-nvme.76516/
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] DocuBetter Meeting 14 Oct 2020 -- 24 hours from the time of this email.

2020-10-13 Thread John Zachary Dover
There is a general documentation meeting called the "DocuBetter Meeting",
and it is held every two weeks. The next DocuBetter Meeting will be on 14
Oct 2020 at 1630 UTC, and will run for thirty minutes. Everyone with a
documentation-related request or complaint is invited.

The meeting will be held here: https://bluejeans.com/908675367

Send documentation-related requests and complaints to me by replying to
this email and CCing me at zac.do...@gmail.com.

The next DocuBetter meeting is scheduled for:
14 Oct 2020  1630 UTC


Etherpad: https://pad.ceph.com/p/Ceph_Documentation
Zac's docs whiteboard: https://pad.ceph.com/p/docs_whiteboard
Report Documentation Bugs: https://pad.ceph.com/p/Report_Documentation_Bugs

Meeting: https://bluejeans.com/908675367
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Proxmox+Ceph Benchmark 2020

2020-10-13 Thread Mark Nelson

Thanks for the link Alwin!


On Intel platforms, disabling C/P-state transitions can have a really big 
impact on IOPS (on RHEL, for instance, using the network-latency or 
latency-performance tuned profiles).  It would be very interesting to know if 
AMD EPYC platforms see similar benefits.  I don't have any in house, but if 
you happen to have a chance it would be an interesting addendum to your 
report.
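
For anyone who wants to try this on a RHEL-family node, a rough sketch using
tuned; the profile names are the stock tuned ones and availability depends on
the installed tuned version:

# Sketch: switch to a latency-oriented tuned profile, which limits deep
# C-states and P-state transitions.
tuned-adm list                      # show the profiles available on this host
tuned-adm profile network-latency   # or: tuned-adm profile latency-performance
tuned-adm active                    # confirm which profile is now active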



Mark


On 10/13/20 5:17 AM, Alwin Antreich wrote:

Hello fellow Ceph users,

we have released our new Ceph benchmark paper [0]. The platform and
hardware used are Proxmox VE 6.2 with Ceph Octopus on a new AMD EPYC Zen 2 CPU
with U.2 SSDs (details in the paper).

The paper should illustrate the performance that is possible with a 3-node
cluster without significant tuning.

I welcome everyone to share their experience and add to the discussion,
preferably in our forum thread [1] with our fellow Proxmox VE users.

--
Cheers,
Alwin

[0] https://proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark-2020-09
[1] 
https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2020-09-hyper-converged-with-nvme.76516/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph OIDC Integration

2020-10-13 Thread technical
Hi Pritha and thanks for your reply. We are using Ceph Octopus and we have 
switched to Keycloak from dexIdP.

Having said that we have followed the guide from 
https://docs.ceph.com/en/octopus/radosgw/STS/ but we are constantly having an 
issue with the AssumeRoleWithWebIdentity example.

We are using 2 different accounts for role creation and policy creation, and 
those 2 parts of the example script are working fine, but when we move over to 
the assume_role_with_web_identity part we get a Forbidden error from Ceph.

We have used cephadm to install Ceph which is at:
# ceph --version
ceph version 15.2.5 (2c93eff00150f0cc5f106a559557a58d3d7b6f1f) octopus (stable)

We used the following command to add the role capabilities for both users:
radosgw-admin caps add --uid="TESTER" --caps="roles=*"
radosgw-admin caps add --uid="TESTER1" --caps="roles=*"

We have set the capabilities for the 2 users mentioned above as shown here:
buckets (*)
metadata (*)
roles (*)
usage (*)
user-policy (*)
users (*)
zone (*) 

---

Can you please confirm whether the key names actually have spaces in them, or are 
they missing underscores?
[client.radosgw.gateway]
rgw sts key = {sts key for encrypting the session token}
rgw s3 auth use sts = true

---

We are also getting a "NameError: name 'client' is not defined" error from 
the AssumeRoleWithWebIdentity example, in the part shown below. Shouldn't it be 
"sts_client.assume_role_with_web_identity" instead of 
"client.assume_role_with_web_identity", since the client is defined as sts_client in 
the code above it?

sts_client = boto3.client('sts',
aws_access_key_id=,
aws_secret_access_key=,
endpoint_url=,
region_name='',
)

response = client.assume_role_with_web_identity(
RoleArn=role_response['Role']['Arn'],
RoleSessionName='Bob',
DurationSeconds=3600,
WebIdentityToken=
)

Can you or anyone give us some pointers to this issue please?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Announcing go-ceph v0.6.0

2020-10-13 Thread John Mulligan
I'm happy to announce another release of the go-ceph API bindings. This is a 
regular release following our every-two-months release cadence.

https://github.com/ceph/go-ceph/releases/tag/v0.6.0

Changes in the release are detailed in the link above.

The bindings aim to play a similar role to the "pybind" python bindings in the 
ceph tree but for the Go language. These API bindings require the use of cgo.  
There are already a few consumers of this library in the wild, including the 
ceph-csi project.


Specific questions, comments, bugs etc are best directed at our github issues 
tracker.


-- 
John Mulligan

phlogistonj...@asynchrono.us
jmulli...@redhat.com








___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problems with mon

2020-10-13 Thread Gaël THEROND
If you’ve got all “Nodes” up and running fine now, here what I’ve done on
my own just this morning.

1°/- Ensure all MONs get the same /etc/ceph/ceph.conf file.

2°/- Many times your MONs share the same keyring; if so, ensure you've got
the right keyring in both places, /etc/ceph/ceph.mon.keyring and
/var/lib/ceph/mon/-/keyring.

3°/- Delete the store and kv of your NOT HEALTHY mons, which you can find
under /var/lib/ceph/mon/-/; they will be rebuilt when the
mon process restarts.

4°/- Start the last healthy monitor and wait for it to complain that it has
no way to acquire a global_id.

5°/- Start the remaining MONs.

You should see the quorum trigger a new election as soon as each mon detects
that it is part of an already existing cluster and retrieves the appropriate
data (store/kv/etc.) from the remaining healthy MON.

This procedure can fail if your unhealthy MONs don't get the appropriate
keyring; a rough sketch of steps 2-3 is shown below.
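
A rough sketch of steps 2-3 for a single unhealthy MON (cluster name "ceph",
mon id "a", and the paths are placeholders; keep a copy of anything you move
aside and adapt this to your containerized setup):

# Sketch only -- placeholders: cluster "ceph", mon id "a"
MONDIR=/var/lib/ceph/mon/ceph-a

# Step 2: check that the mon's keyring matches the one under /etc/ceph
diff /etc/ceph/ceph.mon.keyring "${MONDIR}/keyring"

# Step 3: with the mon stopped, move the store aside; it will be rebuilt
# from the healthy MON when the mon process is restarted
mv "${MONDIR}/store.db" "${MONDIR}/store.db.bak"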

On Tue, 13 Oct 2020 at 12:56, Mateusz Skała wrote:

> Hi,
> Thanks for responding. All monitors went down; 2 of 3 are actually up, but
> probably not in quorum. A quick look at the preceding events:
>
>    1. a few PGs without scrub and deep-scrub, 2 mons in the cluster
>    2. added one monitor (via ansible); ansible restarted the OSDs
>    3. the system OS filesystem filled up (because of multiple SST files)
>    4. all pods with monitors went down
>    5. added a new filesystem for the monitors and moved the mon data from
>    the system OS to this filesystem
>    6. 2 monitors started (the last one with a failure), but they are not
>    responding to any commands
>
> Regards
> Mateusz Skała
>
>
> On Tue, 13 Oct 2020 at 11:25, Gaël THEROND 
> wrote:
>
>> This error means your quorum didn't form.
>>
>> How many mon nodes do you usually have, and how many went down?
>>
>> On Tue, 13 Oct 2020 at 10:56, Mateusz Skała wrote:
>>
>>> Hello Community,
>>> I have problems with ceph-mons in Docker. The Docker pods are starting, but I
>>> get a lot of "e6 handle_auth_request failed to assign global_id" messages
>>> in the log. 2 mons are up but I can't run any ceph commands.
>>> Regards
>>> Mateusz
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: BlueFS spillover detected - correct response for 14.2.7?

2020-10-13 Thread Eugen Block

Hi,

If possible, you can increase the devices holding the RocksDB (to
reasonable sizes - 3/30/300 GB; doubling the space can help during
compaction) and expand them. Compacting the OSDs should then remove
the respective spillover from the main devices, or you can let Ceph do
it on its own, which will be slower. Here's a thread I started last
year [1] wrt spillover.
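
As a rough sketch of both options (the OSD id 12 and the path are
placeholders; resizing the underlying DB LV/partition itself is not shown):

# Trigger a manual compaction on one OSD (online)
ceph tell osd.12 compact

# After enlarging the DB device, let BlueFS grow into the new space
# (run with the OSD stopped)
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-12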


Regards,
Eugen

[1]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/EWDOYCVPVGYCMUR7BJZQEOMGX3W4A3ZA/



Zitat von Dave Hall :


Hello,

We are running a 14.2.7 cluster - 3 nodes with 24 OSDs - and I've recently
started getting 'BlueFS spillover detected'.  I'm up to 3 OSDs in this
state.

In scanning through the various online sources I haven't been able to
determine how to respond to this condition.

Please advise.

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] BlueFS spillover detected - correct response for 14.2.7?

2020-10-13 Thread Dave Hall
Hello,

We are running a 14.2.7 cluster - 3 nodes with 24 OSDs - and I've recently
started getting 'BlueFS spillover detected'.  I'm up to 3 OSDs in this
state.

In scanning through the various online sources I haven't been able to
determine how to respond to this condition.

Please advise.

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephadm numa aware config

2020-10-13 Thread nokia ceph
Hi Team,

I would like to validate cephadm on bare metal and use docker/podman as the
container runtime.
Currently we use a NUMA-aware config on bare metal to improve performance.
Is there any config I can apply in cephadm so that podman/docker containers
are run with the *--cpuset-cpus=num* and *--cpuset-mems=nodes* options in the
startup scripts?
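
For reference, this is the kind of invocation I mean - a plain docker run
sketch with the two flags (the CPU list, NUMA node, container name, and image
are placeholders; whether cephadm can be told to pass these through is exactly
my question):

# Sketch: pin a container to the CPUs and memory of NUMA node 0
docker run -d --name ceph-osd-0 \
    --cpuset-cpus=0-15 \
    --cpuset-mems=0 \
    ceph/ceph:v15.2.5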

Thanks,
Muthu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problems with mon

2020-10-13 Thread Mateusz Skała
Hi,
Thanks for responding. All monitors went down; 2 of 3 are actually up, but
probably not in quorum. A quick look at the preceding events:

   1. a few PGs without scrub and deep-scrub, 2 mons in the cluster
   2. added one monitor (via ansible); ansible restarted the OSDs
   3. the system OS filesystem filled up (because of multiple SST files)
   4. all pods with monitors went down
   5. added a new filesystem for the monitors and moved the mon data from
   the system OS to this filesystem
   6. 2 monitors started (the last one with a failure), but they are not
   responding to any commands

Regards
Mateusz Skała


On Tue, 13 Oct 2020 at 11:25, Gaël THEROND 
wrote:

> This error means your quorum didn't form.
>
> How many mon nodes do you usually have, and how many went down?
>
> On Tue, 13 Oct 2020 at 10:56, Mateusz Skała wrote:
>
>> Hello Community,
>> I have problems with ceph-mons in Docker. The Docker pods are starting, but I
>> get a lot of "e6 handle_auth_request failed to assign global_id" messages
>> in the log. 2 mons are up but I can't run any ceph commands.
>> Regards
>> Mateusz
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Proxmox+Ceph Benchmark 2020

2020-10-13 Thread Alwin Antreich
Hello fellow Ceph users,

we have released our new Ceph benchmark paper [0]. The platform and
hardware used are Proxmox VE 6.2 with Ceph Octopus on a new AMD EPYC Zen 2 CPU
with U.2 SSDs (details in the paper).

The paper should illustrate the performance that is possible with a 3-node
cluster without significant tuning.

I welcome everyone to share their experience and add to the discussion,
preferably in our forum thread [1] with our fellow Proxmox VE users.

--
Cheers,
Alwin

[0] https://proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark-2020-09
[1] 
https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2020-09-hyper-converged-with-nvme.76516/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problems with mon

2020-10-13 Thread Gaël THEROND
This error means your quorum didn't form.

How many mon nodes do you usually have, and how many went down?

On Tue, 13 Oct 2020 at 10:56, Mateusz Skała wrote:

> Hello Community,
> I have problems with ceph-mons in Docker. The Docker pods are starting, but I
> get a lot of "e6 handle_auth_request failed to assign global_id" messages
> in the log. 2 mons are up but I can't run any ceph commands.
> Regards
> Mateusz
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Problems with mon

2020-10-13 Thread Mateusz Skała
Hello Community,
I have problems with ceph-mons in Docker. The Docker pods are starting, but I get a 
lot of "e6 handle_auth_request failed to assign global_id" messages in the log. 2 
mons are up but I can't run any ceph commands.
Regards 
Mateusz
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph test cluster, how to estimate performance.

2020-10-13 Thread Andreas John
Hello Daniel,

yes, the Samsung "Pro" SSD series isn't all that "pro", especially when it's
about write IOPS. I would tend to say get some Intel S4510 if you can
afford it. If you can't, you can still try to activate overprovisioning
on the SSD; I would tend to say reserve 10-30% of the SSD for wear
leveling (writing). First check the number of sectors with hdparm -N
/dev/sdX, then set a permanent HPA (host protected area) on the disk. The
"p" and the absence of a space are important.

hdparm -Np${SECTORS} --yes-i-know-what-i-am-doing /dev/sdX

Wait a little (!), power cycle, and re-check the disk with hdparm -N
/dev/sdX. My Samsung 850 Pros are a little reluctant to accept the
setting, but after a few tries or a little waiting the change becomes permanent.

At least the Samsung 850 Pro stopped dying suddenly with that setting.
Without it, the SSD occasionally disconnected from the bus and reappeared
after a power cycle. I suspect it ran out of wear-leveling reserve or something.


HTH,

derjohn


On 13.10.20 08:41, Martin Verges wrote:
> Hello Daniel,
>
> just throw away your crappy Samsung SSD 860 Pro. It won't work in an
> acceptable way.
>
> See
> https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit?usp=sharing
> for a performance indication of individual disks.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> On Tue, 13 Oct 2020 at 07:31, Daniel Mezentsev wrote:
>> Hi Ceph users,
>>
>> I'm working on a Common Lisp client utilizing the rados library. I got some
>> results, but I don't know how to estimate whether I am getting correct
>> performance. I'm running a test cluster from a laptop - 2 OSDs, each a VM
>> with 4 GB RAM and 4 vCPUs; the monitors and mgr are running on the same
>> VM(s). As for storage, I have a Samsung SSD 860 Pro, 512 GB. The disk is
>> split into 2 logical volumes (LVM), and those volumes are attached to the
>> VMs. I know I can't expect too much from that layout, I just want to know
>> if I'm getting adequate numbers. I'm doing read/write operations on very
>> small objects - up to 1 KB. For async writes I'm getting ~7.5-8.0 kIOPS.
>> Synchronous reads are pretty much the same, 7.5-8.0 kIOPS. Async reads are
>> segfaulting, I don't know why. The disk itself is capable of delivering
>> well above 50 kIOPS. The difference is an order of magnitude. Any info is
>> more than welcome.
>>   Daniel Mezentsev, founder
>> (+1) 604 313 8592.
>> Soleks Data Group.
>> Shaping the clouds.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
-- 
Andreas John
net-lab GmbH  |  Frankfurter Str. 99  |  63067 Offenbach
Geschaeftsfuehrer: Andreas John | AG Offenbach, HRB40832
Tel: +49 69 8570033-1 | Fax: -2 | http://www.net-lab.net

Facebook: https://www.facebook.com/netlabdotnet
Twitter: https://twitter.com/netlabdotnet

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph test cluster, how to estimate performance.

2020-10-13 Thread Martin Verges
Hello Daniel,

just throw away your crappy Samsung SSD 860 Pro. It won't work in an
acceptable way.

See
https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit?usp=sharing
for a performance indication of individual disks.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Tue, 13 Oct 2020 at 07:31, Daniel Mezentsev wrote:

> Hi Ceph users,
>
> I'm working on a Common Lisp client utilizing the rados library. I got some
> results, but I don't know how to estimate whether I am getting correct
> performance. I'm running a test cluster from a laptop - 2 OSDs, each a VM
> with 4 GB RAM and 4 vCPUs; the monitors and mgr are running on the same
> VM(s). As for storage, I have a Samsung SSD 860 Pro, 512 GB. The disk is
> split into 2 logical volumes (LVM), and those volumes are attached to the
> VMs. I know I can't expect too much from that layout, I just want to know
> if I'm getting adequate numbers. I'm doing read/write operations on very
> small objects - up to 1 KB. For async writes I'm getting ~7.5-8.0 kIOPS.
> Synchronous reads are pretty much the same, 7.5-8.0 kIOPS. Async reads are
> segfaulting, I don't know why. The disk itself is capable of delivering
> well above 50 kIOPS. The difference is an order of magnitude. Any info is
> more than welcome.
>   Daniel Mezentsev, founder
> (+1) 604 313 8592.
> Soleks Data Group.
> Shaping the clouds.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io