Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread Виталий Филиппов
I had. 100-200 write IOPS with iodepth=1, ~5k IOPS with iodepth=128. These were
Intel 545s drives.

Not that awful, but a Micron 5200 costs only a fraction more, so it seems
pointless to me to use desktop Samsungs.
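
For context, numbers like these usually come from fio run directly against the
raw drive, roughly as sketched below (the device path is a placeholder and the
run destroys data on it):

# sync single-queue random-write test (journal/WAL-style load, iodepth=1)
fio --ioengine=libaio --direct=1 --fsync=1 --name=qd1 --bs=4k --iodepth=1 \
    --rw=randwrite --runtime=60 --filename=/dev/sdX

# parallel random-write test (iodepth=128)
fio --ioengine=libaio --direct=1 --name=qd128 --bs=4k --iodepth=128 \
    --rw=randwrite --runtime=60 --filename=/dev/sdX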

On 19 December 2019 22:20:28 GMT+03:00, Sinan Polat wrote:
>Hi all,
>
>Thanks for the replies. I am not worried about their lifetime. We will
>be adding only 1 SSD disk per physical server. All SSD’s are enterprise
>drives. If the added consumer grade disk will fail, no problem.
>
>I am more curious regarding their I/O performance. I do not want to have a
>50% drop in performance.
>
>So anyone any experience with 860 EVO or Crucial MX500 in a Ceph setup?
>
>Thanks!
>
>> On 19 Dec 2019 at 19:18, Mark Nelson wrote the
>> following:
>> 
>> The way I try to look at this is:
>> 
>> 
>> 1) How much more do the enterprise grade drives cost?
>> 
>> 2) What are the benefits? (Faster performance, longer life, etc)
>> 
>> 3) How much does it cost to deal with downtime, diagnose issues, and
>replace malfunctioning hardware?
>> 
>> 
>> My personal take is that enterprise drives are usually worth it.
>There may be consumer grade drives that may be worth considering in
>very specific scenarios if they still have power loss protection and
>high write durability.  Even when I was in academia years ago with very
>limited budgets, we got burned with consumer grade SSDs to the point
>where we had to replace them all.  You have to be very careful and know
>exactly what you are buying.
>> 
>> 
>> Mark
>> 
>> 
>>> On 12/19/19 12:04 PM, jes...@krogh.cc wrote:
>>> I don't think “usually” is good enough in a production setup.
>>> 
>>> 
>>> 
>>> Sent from myMail for iOS
>>> 
>>> 
>>> Thursday, 19 December 2019, 12.09 +0100 from Виталий Филиппов
>:
>>> 
>>>Usually it doesn't, it only harms performance and probably SSD
>>>lifetime
>>>too
>>> 
>>>> I would not be running ceph on ssds without powerloss protection. It
>>>> delivers a potential data loss scenario
>>> 
>>> 
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
With best regards,
  Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread Antoine Lecrux
Hi,

If you're looking for a consumer-grade SSD, make sure it has capacitors to
protect you from data corruption in case of a power outage on the entire Ceph
cluster.
That's the most important technical specification to look for.
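
One low-tech way to verify that: pull the exact model and firmware string with
smartctl and look the power-loss-protection capacitors up in the vendor datasheet.
This is only a sketch; it assumes smartmontools is installed and /dev/sda is a
placeholder:

# identify the drive, then check its datasheet for PLP capacitors
sudo smartctl -i /dev/sda | grep -E 'Device Model|Model Number|Firmware Version'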

- Antoine

-Original Message-
From: ceph-users  On Behalf Of Udo Lembke
Sent: Thursday, December 19, 2019 3:22 PM
To: Sinan Polat 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Consumer-grade SSD in Ceph

Hi,
if you add an SSD with a short lifetime on more than one server, you can run
into real trouble (data loss)!
Even if all other SSDs are enterprise grade.
Ceph mixes all data in PGs, which are spread over many disks - if one disk fails,
no problem, but if the next two fail after that due to high I/O
(recovery) you will have data loss.
But if you have only one node with consumer SSDs, the whole node can go down
without trouble...

I tried consumer SSDs as journal a long time ago - it was a bad idea!
But these SSDs are cheap - buy one and do the I/O test.
If you monitor the lifetime, it's perhaps possible for your setup.

Udo


On 19.12.19 at 20:20, Sinan Polat wrote:
> Hi all,
>
> Thanks for the replies. I am not worried about their lifetime. We will be 
> adding only 1 SSD disk per physical server. All SSD’s are enterprise drives. 
> If the added consumer grade disk will fail, no problem.
>
> I am more curious regarding their I/O performance. I do not want to have a 50% drop
> in performance.
>
> So anyone any experience with 860 EVO or Crucial MX500 in a Ceph setup?
>
> Thanks!
>
>> On 19 Dec 2019 at 19:18, Mark Nelson wrote the
>> following:
>>
>> The way I try to look at this is:
>>
>>
>> 1) How much more do the enterprise grade drives cost?
>>
>> 2) What are the benefits? (Faster performance, longer life, etc)
>>
>> 3) How much does it cost to deal with downtime, diagnose issues, and replace 
>> malfunctioning hardware?
>>
>>
>> My personal take is that enterprise drives are usually worth it. There may 
>> be consumer grade drives that may be worth considering in very specific 
>> scenarios if they still have power loss protection and high write 
>> durability.  Even when I was in academia years ago with very limited 
>> budgets, we got burned with consumer grade SSDs to the point where we had to 
>> replace them all.  You have to be very careful and know exactly what you are 
>> buying.
>>
>>
>> Mark
>>
>>
>>> On 12/19/19 12:04 PM, jes...@krogh.cc wrote:
>>> I don't think “usually” is good enough in a production setup.
>>>
>>>
>>>
>>> Sent from myMail for iOS
>>>
>>>
>>> Thursday, 19 December 2019, 12.09 +0100 from Виталий Филиппов 
>>> :
>>>
>>>Usually it doesn't, it only harms performance and probably SSD
>>>lifetime
>>>too
>>>
>>>> I would not be running ceph on ssds without powerloss protection. It
>>>> delivers a potential data loss scenario
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread Udo Lembke
Hi,
if you add an SSD with a short lifetime on more than one server, you
can run into real trouble (data loss)!
Even if all other SSDs are enterprise grade.
Ceph mixes all data in PGs, which are spread over many disks - if one disk
fails, no problem, but if the next two fail after that due to high I/O
(recovery) you will have data loss.
But if you have only one node with consumer SSDs, the whole node can go
down without trouble...

I tried consumer SSDs as journal a long time ago - it was a bad idea!
But these SSDs are cheap - buy one and do the I/O test.
If you monitor the lifetime, it's perhaps possible for your setup.
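
A rough sketch of that kind of monitoring (wear attribute names differ per vendor,
so the grep pattern below is only an example; smartmontools assumed installed):

# check wear/endurance counters, e.g. weekly from cron
sudo smartctl -A /dev/sdX | grep -iE 'wear|percent_lifetime|total_lbas_written|media_wearout'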

Udo


On 19.12.19 at 20:20, Sinan Polat wrote:
> Hi all,
>
> Thanks for the replies. I am not worried about their lifetime. We will be 
> adding only 1 SSD disk per physical server. All SSD’s are enterprise drives. 
> If the added consumer grade disk will fail, no problem.
>
> I am more curious regarding their I/O performance. I do not want to have a 50% drop
> in performance.
>
> So anyone any experience with 860 EVO or Crucial MX500 in a Ceph setup?
>
> Thanks!
>
>> On 19 Dec 2019 at 19:18, Mark Nelson wrote the
>> following:
>>
>> The way I try to look at this is:
>>
>>
>> 1) How much more do the enterprise grade drives cost?
>>
>> 2) What are the benefits? (Faster performance, longer life, etc)
>>
>> 3) How much does it cost to deal with downtime, diagnose issues, and replace 
>> malfunctioning hardware?
>>
>>
>> My personal take is that enterprise drives are usually worth it. There may 
>> be consumer grade drives that may be worth considering in very specific 
>> scenarios if they still have power loss protection and high write 
>> durability.  Even when I was in academia years ago with very limited 
>> budgets, we got burned with consumer grade SSDs to the point where we had to 
>> replace them all.  You have to be very careful and know exactly what you are 
>> buying.
>>
>>
>> Mark
>>
>>
>>> On 12/19/19 12:04 PM, jes...@krogh.cc wrote:
>>> I don't think “usually” is good enough in a production setup.
>>>
>>>
>>>
>>> Sent from myMail for iOS
>>>
>>>
>>> Thursday, 19 December 2019, 12.09 +0100 from Виталий Филиппов 
>>> :
>>>
>>>Usually it doesn't, it only harms performance and probably SSD
>>>lifetime
>>>too
>>>
>>>> I would not be running ceph on ssds without powerloss protection. It
>>>> delivers a potential data loss scenario
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread Sinan Polat
Hi all,

Thanks for the replies. I am not worried about their lifetime. We will be
adding only 1 SSD disk per physical server. All SSDs are enterprise drives. If
the added consumer-grade disk fails, no problem.

I am more curious regarding their I/O performance. I do not want to have a 50% drop
in performance.

So, does anyone have any experience with the 860 EVO or Crucial MX500 in a Ceph setup?

Thanks!

> On 19 Dec 2019 at 19:18, Mark Nelson wrote the
> following:
> 
> The way I try to look at this is:
> 
> 
> 1) How much more do the enterprise grade drives cost?
> 
> 2) What are the benefits? (Faster performance, longer life, etc)
> 
> 3) How much does it cost to deal with downtime, diagnose issues, and replace 
> malfunctioning hardware?
> 
> 
> My personal take is that enterprise drives are usually worth it. There may be 
> consumer grade drives that may be worth considering in very specific 
> scenarios if they still have power loss protection and high write durability. 
>  Even when I was in academia years ago with very limited budgets, we got 
> burned with consumer grade SSDs to the point where we had to replace them 
> all.  You have to be very careful and know exactly what you are buying.
> 
> 
> Mark
> 
> 
>> On 12/19/19 12:04 PM, jes...@krogh.cc wrote:
>> I don't think “usually” is good enough in a production setup.
>> 
>> 
>> 
>> Sent from myMail for iOS
>> 
>> 
>> Thursday, 19 December 2019, 12.09 +0100 from Виталий Филиппов 
>> :
>> 
>>Usually it doesn't, it only harms performance and probably SSD
>>lifetime
>>too
>> 
>>> I would not be running ceph on ssds without powerloss protection. It
>>> delivers a potential data loss scenario
>> 
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy can't generate the client.admin keyring

2019-12-19 Thread Jean-Philippe Méthot
Alright, so I figured it out. It was essentially because the monitor’s main IP 
wasn’t on the public network in the ceph.conf file. Hence, ceph was trying to 
connect on an IP where the monitor wasn’t listening.
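
For anyone hitting the same thing, the relevant part of ceph.conf looks roughly
like the sketch below (the fsid and addresses are made up; the point is simply
that mon_host must fall inside public_network so clients can reach the monitor):

[global]
fsid = 11111111-2222-3333-4444-555555555555
mon_initial_members = ceph-monitor1
mon_host = 192.168.10.11
public_network = 192.168.10.0/24
cluster_network = 192.168.20.0/24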


Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




> On 19 Dec 2019 at 11:50, Jean-Philippe Méthot
> wrote:
> 
> Hi,
> 
> We’re currently running Ceph mimic in production and that works fine. 
> However, I am currently deploying another Ceph mimic setup for testing 
> purposes and ceph-deploy is running into issues I’ve never seen before.
> Essentially, the initial monitor setup starts the service, but the process 
> gets interrupted at 
> 
> sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. 
> --keyring=/var/lib/ceph/mon/ceph-ceph-monitor1/keyring auth get client.admin
> 
> with the client.admin keyring generation timing out. I can however issue 
> commands manually using ceph-authtool to create an admin keyring and it works 
> flawlessly.
> 
> ceph monitor logs don’t show any error and I am able to reach the monitor’s 
> port by telnet. What could be causing this timeout?
> 
> 
> Jean-Philippe Méthot
> Openstack system administrator
> Administrateur système Openstack
> PlanetHoster inc.
> 
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread Mark Nelson

The way I try to look at this is:


1) How much more do the enterprise grade drives cost?

2) What are the benefits? (Faster performance, longer life, etc)

3) How much does it cost to deal with downtime, diagnose issues, and 
replace malfunctioning hardware?



My personal take is that enterprise drives are usually worth it. There 
may be consumer grade drives that may be worth considering in very 
specific scenarios if they still have power loss protection and high 
write durability.  Even when I was in academia years ago with very 
limited budgets, we got burned with consumer grade SSDs to the point 
where we had to replace them all.  You have to be very careful and know 
exactly what you are buying.



Mark


On 12/19/19 12:04 PM, jes...@krogh.cc wrote:

I don't think “usually” is good enough in a production setup.



Sent from myMail for iOS


Thursday, 19 December 2019, 12.09 +0100 from Виталий Филиппов 
:


Usually it doesn't, it only harms performance and probably SSD
lifetime
too

> I would not be running ceph on ssds without powerloss protection. It
> delivers a potential data loss scenario


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread jesper

I don't think “usually” is good enough in a production setup.



Sent from myMail for iOS


Thursday, 19 December 2019, 12.09 +0100 from Виталий Филиппов  
:
>Usually it doesn't, it only harms performance and probably SSD lifetime 
>too
>
>> I would not be running ceph on ssds without powerloss protection. It
>> delivers a potential data loss scenario
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy can't generate the client.admin keyring

2019-12-19 Thread Jean-Philippe Méthot
Hi,

We’re currently running Ceph mimic in production and that works fine. However, 
I am currently deploying another Ceph mimic setup for testing purposes and 
ceph-deploy is running into issues I’ve never seen before.
Essentially, the initial monitor setup starts the service, but the process gets 
interrupted at 

sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. 
--keyring=/var/lib/ceph/mon/ceph-ceph-monitor1/keyring auth get client.admin

with the client.admin keyring generation timing out. I can however issue 
commands manually using ceph-authtool to create an admin keyring and it works 
flawlessly.

ceph monitor logs don’t show any error and I am able to reach the monitor’s 
port by telnet. What could be causing this timeout?
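
A quick sanity check in situations like this is to compare the address the monitor
is actually bound to with the address clients will use; the commands below are only
a sketch and assume a standard systemd-managed ceph-mon:

# which address/port is the monitor really listening on?
ss -tlnp | grep ceph-mon

# which address will clients and ceph-deploy try to reach?
grep -E 'mon_host|mon host|public_network|public network' /etc/ceph/ceph.conf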


Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread vitalif
Usually it doesn't, it only harms performance and probably SSD lifetime 
too



I would not be running ceph on ssds without powerloss protection. It
delivers a potential data loss scenario

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pgs backfill_toofull after removing OSD from CRUSH map

2019-12-19 Thread Eugen Block

Hi Kristof,

setting the OSD "out" doesn't change the crush weight of that OSD, but
removing it from the tree does; that's why the cluster started to
rebalance.
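
A common way to avoid that second round of data movement is to drain the OSD via
its CRUSH weight first, so that removing it from the tree later changes nothing
for data placement. Roughly, per OSD (osd.42 / 42 below is a placeholder ID):

# move the data off while the OSD is still up and in
ceph osd crush reweight osd.42 0

# once the cluster is HEALTH_OK again and the OSD holds no PGs:
ceph osd out 42
ceph osd safe-to-destroy 42
systemctl stop ceph-osd@42          # run on the OSD host
ceph osd purge 42 --yes-i-really-mean-it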


Regards,
Eugen


Quoting Kristof Coucke :


Hi all,

We are facing a strange symptom here.
We're testing our recovery procedures. Short description of our environment:
1. 10 OSD host nodes, each 13 disks + 2 NVMe's
2. 3 monitor nodes
3. 1 management node
4. 2 RGW's
5. 1 Client

Ceph version: Nautilus version 14.2.4

=> We are testing how to "nicely" eliminate 1 OSD host.
As a first step, we've removed the OSDs by running "ceph osd out
osd.".
The system went into error with a few messages that backfill was too full, but
this was more or less expected.

However, after leaving the system to recover, everything went back to
normal. Health did not indicate any warnings or errors.
Running the Ceph OSD safe-to-destroy command indicated the disks could be
safely removed.

So far so good, no problem...
Then we decided to properly remove the disks from the CRUSH map, and now
the whole story starts again: backfill_toofull errors, and recovery is
running again.

Why?
The disks were already marked out and no PGs were on them anymore.

Is this caused by the fact that the CRUSH map is modified and the resulting
recalculation automatically maps the PGs to different
OSDs? It does seem strange behaviour, to be honest.

Any feedback is greatly appreciated!

Regards,

Kristof Coucke




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] pgs backfill_toofull after removing OSD from CRUSH map

2019-12-19 Thread Kristof Coucke
Hi all,

We are facing a strange symptom here.
We're testing our recovery procedures. Short description of our environment:
1. 10 OSD host nodes, each 13 disks + 2 NVMe's
2. 3 monitor nodes
3. 1 management node
4. 2 RGW's
5. 1 Client

Ceph version: Nautilus version 14.2.4

=> We are testing how to "nicely" eliminate 1 OSD host.
As a first step, we've removed the OSDs by running "ceph osd out
osd.".
The system went into error with a few messages that backfill was too full, but
this was more or less expected.

However, after leaving the system to recover, everything went back to
normal. Health did not indicate any warnings or errors.
Running the Ceph OSD safe-to-destroy command indicated the disks could be
safely removed.

So far so good, no problem...
Then we decided to properly remove the disks from the CRUSH map, and now
the whole story starts again: backfill_toofull errors, and recovery is
running again.

Why?
The disks were already marked out and no PGs were on them anymore.

Is this caused by the fact that the CRUSH map is modified and the resulting
recalculation automatically maps the PGs to different
OSDs? It does seem strange behaviour, to be honest.

Any feedback is greatly appreciated!

Regards,

Kristof Coucke
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread jesper

I would not be running ceph on ssds without powerloss protection. It delivers a
potential data loss scenario

Jesper



Sent from myMail for iOS


Thursday, 19 December 2019, 08.32 +0100 from Виталий Филиппов  
:
>https://yourcmc.ru/wiki/Ceph_performance
>
>https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc
>
>On 19 December 2019 0:41:02 GMT+03:00, Sinan Polat < si...@turka.nl > wrote:
>>Hi,
>>
>>I am aware that
>>https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>>holds a list with benchmarks of quite a few different SSD models.
>>Unfortunately it doesn't have benchmarks for recent SSD models.
>>
>>A client is planning to expand a running cluster (Luminous, FileStore, SSD
>>only, replicated). I/O utilization is close to 0, but capacity-wise the
>>cluster is almost nearfull. To save costs the cluster will be expanded with
>>consumer-grade SSDs, but I am unable to find benchmarks of recent SSD models.
>>
>>Does anyone have experience with the Samsung 860 EVO, 860 PRO and Crucial MX500 in
>>a Ceph cluster?
>>
>>Thanks!
>>Sinan
>-- 
>With best regards,
>Vitaliy Filippov
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com