Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-17 Thread Mike
17.02.2015 04:11, Christian Balzer пишет:
 
 Hello,
 
 re-adding the mailing list.
 
 On Mon, 16 Feb 2015 17:54:01 +0300 Mike wrote:
 
 Hello

 05.02.2015 08:35, Christian Balzer пишет:

 Hello,


 LSI 2308 IT
 2 x SSD Intel DC S3700 400GB
 2 x SSD Intel DC S3700 200GB
 Why the separation of SSDs? 
 They aren't going to be that busy with regards to the OS.

 We would like to use 400GB SSD for a cache pool, and 200GB SSD for
 the journaling.

 Don't, at least not like that.
 First and foremost, SSD based OSDs/pools have different requirements,
 especially when it comes to CPU. 
 Mixing your HDD and SSD based OSDs in the same chassis is a generally
 a bad idea.

 Why? If we have for example SuperServer 6028U-TR4+ with proper
 configuration  (4 x SSD DC S3700 for cache pool/8 x 6-8Tb SATA HDD for
 Cold storage/E5-2695V3 CPU/128Gb RAM), why it's still bad idea? It's
 something inside Ceph don't work well?

 
 Ceph in and by itself will of course work.
 
 But your example up there is total overkill on one hand and simply not
 balanced on the other hand.
 You'd be much better off (both performance and price wise) if you'd go
 with something less powerful for a HDD storage node like this:
 http://www.supermicro.com/products/system/2U/6027/SSG-6027R-E1R12T.cfm
 with 2 400GB Intels in the back for journals and 16 cores total.
 
 While your SSD based storage nodes would be nicely dense by using
 something like:
 http://www.supermicro.com/products/system/2U/2028/SYS-2028TP-DC0TR.cfm
 with 2 E5-2690 v3 per node (I'd actually rather prefer E5-2687W v3, but
 those are running too hot).
 Alternatively one of the 1U cases with up to 10 SSDs.
 
 Also maintaining a crush map that separates the SSD from HDD pools is made
 a lot easier, less error prone by segregating nodes into SSD and HDD ones.
 
 There are several more reasons below.
 
 

Yes this normal variants of configurations. But in this way you have 2
different nodes versus 1, it's require a more support inside company.

In a whole setup you will be have for each MON, OSD, SSD-CACHE servers
one configuration and another configurations for compute nodes.

A lot of support, supplies, attention.

That's why we still trying reduce amount of configuration for support.
It's a balance support versus cost/speed/etc.

 For me cache pool it's 1-st fast small storage between big slow storage.

 That's the idea, yes.
 But besides the problems with performance I'm listing again below, that
 small is another, very difficult to judge in advance problem.
 By mixing your cache pool SSD OSDs into the HDD OSD chassis, you're
 making yourself inflexible in that area (as in just add another SSD cache
 pool node when needed). 
 

Yes in some way inflexible, but I have one configuration not two and can
grow up cluster simply add modes.

 You don't need journal anymore and if you need you can enlarge fast
 storage.

 You still need the journal of course, it's (unfortunately in some cases)
 a basic requirement in Ceph. 
 I suppose what you meant is don't need journal on SSDs anymore.
 And while that is true, this makes your slow storage at least twice as
 slow, which at some point (deep-scrub, data re-balancing, very busy
 cluster) is likely to make you wish you had those journal SSDs.
 
  

Yes, journal on cold storage is need for re-balancing cluster if some
node/hdd fail or promote/remove object from ssd cache.

I remember a email in this mail list from one of inktank guys (sorry,
didn't remember him full email and name), they wrote that you no need
journal if you use cache pool.

 If you really want to use SSD based OSDs, got at least with Giant,
 probably better even to wait for Hammer. 
 Otherwise your performance will be nowhere near the investment you're
 making. 
 Read up in the ML archives about SSD based clusters and their
 performance, as well as cache pools.

 Which brings us to the second point, cache pools are pretty pointless
 currently when it comes to performance. So unless you're planning to
 use EC pools, you will gain very little from them.

 So, ssd cache pool useless at all?

 They're (currently) not performing all that well, ask people on the ML
 who're actually using them. 

By now it's true I'm reading ML every day.

 This is a combination of Ceph currently being unable to fully utilize the
 full potential of SSDs in general and the cache pool code (having to
 promote/demote whole objects mainly) in particular.
 
 Both of these things are of course known to the Ceph developers and being
 improved, but right now I don't think they will give you what you expect
 from them.
 
 I would build a good, solid, classic Ceph cluster at this point in time
 and have a small cache pool for testing. 
 Once that pool performs to your satisfaction, you can always grow it.
 Another reason to keep SSD based storage nodes separate.
 
 Christian
 

Thanks for answer!

___
ceph-users mailing list
ceph-users@lists.ceph.com

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-14 Thread Scott Laird
I ended up destroying the EC pool and starting over.  It was killing all of
my OSD machines, and I couldn't keep anything working right with EC in
use.  So, no core dumps and I'm not in a place to reproduce easily
anymore.  This was with Giant on Ubuntu 14.04.

On Thu Feb 12 2015 at 7:07:38 AM Mark Nelson mnel...@redhat.com wrote:

 On 02/08/2015 10:41 PM, Scott Laird wrote:
  Does anyone have a good recommendation for per-OSD memory for EC?  My EC
  test blew up in my face when my OSDs suddenly spiked to 10+ GB per OSD
  process as soon as any reconstruction was needed.  Which (of course)
  caused OSDs to OOM, which meant more reconstruction, which fairly
  immediately led to a dead cluster.  This was with Giant.  Is this
 typical?

 Doh, that shouldn't happen.  Can you reproduce it?  Would be especially
 nice if we could get a core dump or if you could make it happen under
 valgrind.  If the CPUs are spinning, even a perf report might prove useful.

 
  On Fri Feb 06 2015 at 2:41:50 AM Mohamed Pakkeer mdfakk...@gmail.com
  mailto:mdfakk...@gmail.com wrote:
 
  Hi all,
 
  We are building EC cluster with cache tier for CephFS. We are
  planning to use the following 1U chassis along with Intel SSD DC
  S3700 for cache tier. It has 10 * 2.5 slots. Could you recommend a
  suitable Intel processor and amount of RAM to cater 10 * SSDs?.
 
  http://www.supermicro.com/products/system/1U/1028/SYS-1028R-WTRT.cfm
 
 
  Regards
 
  K.Mohamed Pakkeer
 
 
 
  On Fri, Feb 6, 2015 at 2:57 PM, Stephan Seitz
  s.se...@heinlein-support.de mailto:s.se...@heinlein-support.de
  wrote:
 
  Hi,
 
  Am Dienstag, den 03.02.2015, 15:16 + schrieb Colombo Marco:
   Hi all,
I have to build a new Ceph storage cluster, after i‘ve read
 the
   hardware recommendations and some mail from this mailing list
 i would
   like to buy these servers:
 
  just FYI:
 
  SuperMicro already focuses on ceph with a productline:
  http://www.supermicro.com/solutions/datasheet_Ceph.pdf
  http://www.supermicro.com/solutions/storage_ceph.cfm
 
 
 
  regards,
 
 
  Stephan Seitz
 
  --
 
  Heinlein Support GmbH
  Schwedter Str. 8/9b, 10119 Berlin
 
  http://www.heinlein-support.de
 
  Tel: 030 / 405051-44
  Fax: 030 / 405051-19
 
  Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht
  Berlin-Charlottenburg,
  Geschäftsführer: Peer Heinlein -- Sitz: Berlin
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com mailto:ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
  --
  Thanks  Regards
  K.Mohamed Pakkeer
  Mobile- 0091-8754410114
 
  _
  ceph-users mailing list
  ceph-users@lists.ceph.com mailto:ceph-users@lists.ceph.com
  http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-12 Thread Mark Nelson

On 02/08/2015 10:41 PM, Scott Laird wrote:

Does anyone have a good recommendation for per-OSD memory for EC?  My EC
test blew up in my face when my OSDs suddenly spiked to 10+ GB per OSD
process as soon as any reconstruction was needed.  Which (of course)
caused OSDs to OOM, which meant more reconstruction, which fairly
immediately led to a dead cluster.  This was with Giant.  Is this typical?


Doh, that shouldn't happen.  Can you reproduce it?  Would be especially 
nice if we could get a core dump or if you could make it happen under 
valgrind.  If the CPUs are spinning, even a perf report might prove useful.




On Fri Feb 06 2015 at 2:41:50 AM Mohamed Pakkeer mdfakk...@gmail.com
mailto:mdfakk...@gmail.com wrote:

Hi all,

We are building EC cluster with cache tier for CephFS. We are
planning to use the following 1U chassis along with Intel SSD DC
S3700 for cache tier. It has 10 * 2.5 slots. Could you recommend a
suitable Intel processor and amount of RAM to cater 10 * SSDs?.

http://www.supermicro.com/products/system/1U/1028/SYS-1028R-WTRT.cfm


Regards

K.Mohamed Pakkeer



On Fri, Feb 6, 2015 at 2:57 PM, Stephan Seitz
s.se...@heinlein-support.de mailto:s.se...@heinlein-support.de
wrote:

Hi,

Am Dienstag, den 03.02.2015, 15:16 + schrieb Colombo Marco:
 Hi all,
  I have to build a new Ceph storage cluster, after i‘ve read the
 hardware recommendations and some mail from this mailing list i would
 like to buy these servers:

just FYI:

SuperMicro already focuses on ceph with a productline:
http://www.supermicro.com/solutions/datasheet_Ceph.pdf
http://www.supermicro.com/solutions/storage_ceph.cfm



regards,


Stephan Seitz

--

Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-44
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht
Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin


___
ceph-users mailing list
ceph-users@lists.ceph.com mailto:ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Thanks  Regards
K.Mohamed Pakkeer
Mobile- 0091-8754410114

_
ceph-users mailing list
ceph-users@lists.ceph.com mailto:ceph-users@lists.ceph.com
http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-08 Thread Scott Laird
Does anyone have a good recommendation for per-OSD memory for EC?  My EC
test blew up in my face when my OSDs suddenly spiked to 10+ GB per OSD
process as soon as any reconstruction was needed.  Which (of course) caused
OSDs to OOM, which meant more reconstruction, which fairly immediately led
to a dead cluster.  This was with Giant.  Is this typical?

On Fri Feb 06 2015 at 2:41:50 AM Mohamed Pakkeer mdfakk...@gmail.com
wrote:

 Hi all,

 We are building EC cluster with cache tier for CephFS. We are planning to
 use the following 1U chassis along with Intel SSD DC S3700 for cache tier.
 It has 10 * 2.5 slots. Could you recommend a suitable Intel processor and
 amount of RAM to cater 10 * SSDs?.

 http://www.supermicro.com/products/system/1U/1028/SYS-1028R-WTRT.cfm


 Regards

 K.Mohamed Pakkeer



 On Fri, Feb 6, 2015 at 2:57 PM, Stephan Seitz s.se...@heinlein-support.de
  wrote:

 Hi,

 Am Dienstag, den 03.02.2015, 15:16 + schrieb Colombo Marco:
  Hi all,
   I have to build a new Ceph storage cluster, after i‘ve read the
  hardware recommendations and some mail from this mailing list i would
  like to buy these servers:

 just FYI:

 SuperMicro already focuses on ceph with a productline:
 http://www.supermicro.com/solutions/datasheet_Ceph.pdf
 http://www.supermicro.com/solutions/storage_ceph.cfm



 regards,


 Stephan Seitz

 --

 Heinlein Support GmbH
 Schwedter Str. 8/9b, 10119 Berlin

 http://www.heinlein-support.de

 Tel: 030 / 405051-44
 Fax: 030 / 405051-19

 Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht
 Berlin-Charlottenburg,
 Geschäftsführer: Peer Heinlein -- Sitz: Berlin


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 Thanks  Regards
 K.Mohamed Pakkeer
 Mobile- 0091-8754410114

  ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-06 Thread Mohamed Pakkeer
Hi all,

We are building EC cluster with cache tier for CephFS. We are planning to
use the following 1U chassis along with Intel SSD DC S3700 for cache tier.
It has 10 * 2.5 slots. Could you recommend a suitable Intel processor and
amount of RAM to cater 10 * SSDs?.

http://www.supermicro.com/products/system/1U/1028/SYS-1028R-WTRT.cfm


Regards

K.Mohamed Pakkeer



On Fri, Feb 6, 2015 at 2:57 PM, Stephan Seitz s.se...@heinlein-support.de
wrote:

 Hi,

 Am Dienstag, den 03.02.2015, 15:16 + schrieb Colombo Marco:
  Hi all,
   I have to build a new Ceph storage cluster, after i‘ve read the
  hardware recommendations and some mail from this mailing list i would
  like to buy these servers:

 just FYI:

 SuperMicro already focuses on ceph with a productline:
 http://www.supermicro.com/solutions/datasheet_Ceph.pdf
 http://www.supermicro.com/solutions/storage_ceph.cfm



 regards,


 Stephan Seitz

 --

 Heinlein Support GmbH
 Schwedter Str. 8/9b, 10119 Berlin

 http://www.heinlein-support.de

 Tel: 030 / 405051-44
 Fax: 030 / 405051-19

 Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht
 Berlin-Charlottenburg,
 Geschäftsführer: Peer Heinlein -- Sitz: Berlin


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Thanks  Regards
K.Mohamed Pakkeer
Mobile- 0091-8754410114
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Colombo Marco
Hi Christian,



On 04/02/15 02:39, Christian Balzer ch...@gol.com wrote:

On Tue, 3 Feb 2015 15:16:57 + Colombo Marco wrote:

 Hi all,
  I have to build a new Ceph storage cluster, after i‘ve read the
 hardware recommendations and some mail from this mailing list i would
 like to buy these servers:
 

Nick mentioned a number of things already I totally agree with, so don't
be surprised if some of this feels like a repeat.

 OSD:
 SSG-6027R-E1R12L -
 http://www.supermicro.nl/products/system/2U/6027/SSG-6027R-E1R12L.cfm
 Intel Xeon e5-2630 v2 64 GB RAM
As nick said, v3 and more RAM might be helpful, depending on your use case
(small writes versus large ones) even faster CPUs as well.

Ok, we switch from v2 to v3 and from 64 to 96 GB of RAM.


 LSI 2308 IT
 2 x SSD Intel DC S3700 400GB
 2 x SSD Intel DC S3700 200GB
Why the separation of SSDs? 
They aren't going to be that busy with regards to the OS.

We would like to use 400GB SSD for a cache pool, and 200GB SSD for the 
journaling.


Get a case like Nick mentioned with 2 2.5 bays in the back, put 2 DC S3700
400GBs in there (connected to onboard 6Gb/s SATA3), partition them so that
you have a RAID1 for OS and plain partitions for the journals of the now 
12
OSD HDDs in your chassis. 
Of course this optimization in terms of cost and density comes with a
price, if one SSD should fail, you will have 6 OSDs down. 
Given how reliable the Intels are this is unlikely, but something you need
to consider.

If you want to limit the impact of a SSD failure and have just 2 OSD
journals per SSD, get a chassis like the one above and 4 DC S3700 200GB,
RAID10 them for the OS and put 2 journal partitions on each. 

I did the same with 8 3TB HDDs and 4 DC S3700 100GB, the HDDs (and CPU
with 4KB IOPS), are the limiting factor, not the SSDs.

 8 x HDD Seagate Enterprise 6TB
Are you really sure you need that density? One disk failure will result in
a LOT of data movement once these become somewhat full.
If you were to go for a 12 OSD node as described above, consider 4TB ones
for the same overall density, while having more IOPS and likely the same
price or less.

We choosen the 6TB of disk, because we need a lot of storage in a small 
amount of server and we prefer server with not too much disks.
However we plan to use max 80% of a 6TB Disk


 2 x 40GbE for backend network
You'd be lucky to write more that 800MB/s sustained to your 8 HDDs
(remember they will have to deal with competing reads and writes, this is
not a sequential synthetic write benchmark). 
Incidentally 1GB/s to 1.2GB/s (depending on configuration) would also be
the limit of your journal SSDs.
Other than backfilling caused by cluster changes (OSD removed/added), your
limitation is nearly always going to be IOPS, not bandwidth.


Ok, after some discussion, we switch to 2 x 10 GbE.


So 2x10GbE or if you're comfortable with it (I am ^o^) an Infiniband
backend (can be cheaper, less latency, plans for RDMA support in
Ceph) should be more than sufficient.

 2 x 10GbE  for public network
 
 META/MON:
 
 SYS-6017R-72RFTP -
 http://www.supermicro.com/products/system/1U/6017/SYS-6017R-72RFTP.cfm 2
 x Intel Xeon e5-2637 v2 4 x SSD Intel DC S3500 240GB raid 1+0
You're likely to get better performance and of course MUCH better
durability by using 2 DC S3700, at about the same price.

Ok we switch to 2 x SSD DC S3700


 128 GB RAM
Total overkill for a MON, but I have no idea about MDS and RAM never 
hurts.

Ok we switch from 128 to 96


In your follow-up you mentioned 3 mons, I would suggest putting 2 more
mons (only, not MDS) on OSD nodes and make sure that within the IP
numbering the real mons have the lowest IP addresses, because the MON
with the lowest IP becomes master (and thus the busiest). 
This way you can survive a loss of 2 nodes and still have a valid quorum.

Ok, got it



Christian

 2 x 10 GbE
 
 What do you think?
 Any feedbacks, advices, or ideas are welcome!
 
 Thanks so much
 
 Regards,


-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com  Global OnLine Japan/Fusion Communications
http://www.gol.com/

Thanks so much!


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Udo Lembke
Hi Marco,

Am 04.02.2015 10:20, schrieb Colombo Marco:
...
 We choosen the 6TB of disk, because we need a lot of storage in a small 
 amount of server and we prefer server with not too much disks.
 However we plan to use max 80% of a 6TB Disk
 

80% is too much! You will run into trouble.
Ceph don't write the data in equal distribution. Sometimes I see an
difference of 20% in the usage of the OSD.

I recommend 60-70% as maximum.

Udo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Christian Balzer

Hello,

On Wed, 4 Feb 2015 09:20:24 + Colombo Marco wrote:

 Hi Christian,
 
 
 
 On 04/02/15 02:39, Christian Balzer ch...@gol.com wrote:
 
 On Tue, 3 Feb 2015 15:16:57 + Colombo Marco wrote:
 
  Hi all,
   I have to build a new Ceph storage cluster, after i‘ve read the
  hardware recommendations and some mail from this mailing list i would
  like to buy these servers:
  
 
 Nick mentioned a number of things already I totally agree with, so don't
 be surprised if some of this feels like a repeat.
 
  OSD:
  SSG-6027R-E1R12L -
  http://www.supermicro.nl/products/system/2U/6027/SSG-6027R-E1R12L.cfm
  Intel Xeon e5-2630 v2 64 GB RAM
 As nick said, v3 and more RAM might be helpful, depending on your use
 case (small writes versus large ones) even faster CPUs as well.
 
 Ok, we switch from v2 to v3 and from 64 to 96 GB of RAM.
 
 
  LSI 2308 IT
  2 x SSD Intel DC S3700 400GB
  2 x SSD Intel DC S3700 200GB
 Why the separation of SSDs? 
 They aren't going to be that busy with regards to the OS.
 
 We would like to use 400GB SSD for a cache pool, and 200GB SSD for the 
 journaling.

Don't, at least not like that.
First and foremost, SSD based OSDs/pools have different requirements,
especially when it comes to CPU. 
Mixing your HDD and SSD based OSDs in the same chassis is a generally a bad
idea.
If you really want to use SSD based OSDs, got at least with Giant,
probably better even to wait for Hammer. 
Otherwise your performance will be nowhere near the investment you're
making. 
Read up in the ML archives about SSD based clusters and their performance,
as well as cache pools.

Which brings us to the second point, cache pools are pretty pointless
currently when it comes to performance. So unless you're planning to use
EC pools, you will gain very little from them.

Lastly, if you still want to do SSD based OSDs, go for something like this:
http://www.supermicro.com.tw/products/system/2U/2028/SYS-2028TP-DC0TR.cfm
Add the fastest CPUs you can afford and voila, instant SSD based cluster
(replication of 2 should be fine with DC S3700). 
Now with _this_ particular type of nodes, you might want to consider 40GbE
links (front and back-end).
 
 
 Get a case like Nick mentioned with 2 2.5 bays in the back, put 2 DC
 S3700 400GBs in there (connected to onboard 6Gb/s SATA3), partition
 them so that you have a RAID1 for OS and plain partitions for the
 journals of the now 12
 OSD HDDs in your chassis. 
 Of course this optimization in terms of cost and density comes with a
 price, if one SSD should fail, you will have 6 OSDs down. 
 Given how reliable the Intels are this is unlikely, but something you
 need to consider.
 
 If you want to limit the impact of a SSD failure and have just 2 OSD
 journals per SSD, get a chassis like the one above and 4 DC S3700 200GB,
 RAID10 them for the OS and put 2 journal partitions on each. 
 
 I did the same with 8 3TB HDDs and 4 DC S3700 100GB, the HDDs (and CPU
 with 4KB IOPS), are the limiting factor, not the SSDs.
 
  8 x HDD Seagate Enterprise 6TB
 Are you really sure you need that density? One disk failure will result
 in a LOT of data movement once these become somewhat full.
 If you were to go for a 12 OSD node as described above, consider 4TB
 ones for the same overall density, while having more IOPS and likely
 the same price or less.
 
 We choosen the 6TB of disk, because we need a lot of storage in a small 
 amount of server and we prefer server with not too much disks.
 However we plan to use max 80% of a 6TB Disk

Less disks, less IOPS, less bandwidth. 
Reducing the amount of servers (which are fixed cost after all) is
understandable. But you have an option up there that gives you the same
density as with the 6TB disks, but with a significantly improved
performance.
 
 
  2 x 40GbE for backend network
 You'd be lucky to write more that 800MB/s sustained to your 8 HDDs
 (remember they will have to deal with competing reads and writes, this
 is not a sequential synthetic write benchmark). 
 Incidentally 1GB/s to 1.2GB/s (depending on configuration) would also be
 the limit of your journal SSDs.
 Other than backfilling caused by cluster changes (OSD removed/added),
 your limitation is nearly always going to be IOPS, not bandwidth.
 
 
 Ok, after some discussion, we switch to 2 x 10 GbE.
 
 
 So 2x10GbE or if you're comfortable with it (I am ^o^) an Infiniband
 backend (can be cheaper, less latency, plans for RDMA support in
 Ceph) should be more than sufficient.
 
  2 x 10GbE  for public network
  
  META/MON:
  
  SYS-6017R-72RFTP -
  http://www.supermicro.com/products/system/1U/6017/SYS-6017R-72RFTP.cfm
  2 x Intel Xeon e5-2637 v2 4 x SSD Intel DC S3500 240GB raid 1+0
 You're likely to get better performance and of course MUCH better
 durability by using 2 DC S3700, at about the same price.
 
 Ok we switch to 2 x SSD DC S3700
 
 
  128 GB RAM
 Total overkill for a MON, but I have no idea about MDS and RAM never 
 hurts.
 
 Ok we switch from 128 to 96
 
Don't take my 

[ceph-users] Ceph Supermicro hardware recommendation

2015-02-03 Thread Colombo Marco
Hi all,
 I have to build a new Ceph storage cluster, after i‘ve read the hardware 
recommendations and some mail from this mailing list i would like to buy these 
servers:

OSD:
SSG-6027R-E1R12L - 
http://www.supermicro.nl/products/system/2U/6027/SSG-6027R-E1R12L.cfm
Intel Xeon e5-2630 v2
64 GB RAM
LSI 2308 IT
2 x SSD Intel DC S3700 400GB
2 x SSD Intel DC S3700 200GB
8 x HDD Seagate Enterprise 6TB
2 x 40GbE for backend network
2 x 10GbE  for public network

META/MON:

SYS-6017R-72RFTP - 
http://www.supermicro.com/products/system/1U/6017/SYS-6017R-72RFTP.cfm
2 x Intel Xeon e5-2637 v2
4 x SSD Intel DC S3500 240GB raid 1+0
128 GB RAM
2 x 10 GbE

What do you think?
Any feedbacks, advices, or ideas are welcome!

Thanks so much

Regards,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-03 Thread Nick Fisk
Hi,

 

Just a couple of points, you might want to see if you can get a Xeon v3 
board+CPU as they have more performance and use less power.

 

You can also get a SM 2U chassis which has 2x 2.5” disk slots at the rear, this 
would allow you to have an extra  2x 3.5” disks in the front of the server.

 

Extra ram in the OSD nodes would probably help performance a bit

 

How many nodes are you going to have? You might find that bonded 10G networking 
is sufficient instead of the extra cost of 40GB networking.

 

Nick

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Colombo Marco
Sent: 03 February 2015 15:17
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph Supermicro hardware recommendation

 

Hi all,

 I have to build a new Ceph storage cluster, after i‘ve read the hardware 
recommendations and some mail from this mailing list i would like to buy these 
servers:

 

OSD: 

SSG-6027R-E1R12L - 
http://www.supermicro.nl/products/system/2U/6027/SSG-6027R-E1R12L.cfm  

Intel Xeon e5-2630 v2

64 GB RAM

LSI 2308 IT

2 x SSD Intel DC S3700 400GB

2 x SSD Intel DC S3700 200GB

8 x HDD Seagate Enterprise 6TB

2 x 40GbE for backend network

2 x 10GbE  for public network 

 

META/MON:

 

SYS-6017R-72RFTP - 
http://www.supermicro.com/products/system/1U/6017/SYS-6017R-72RFTP.cfm

2 x Intel Xeon e5-2637 v2

4 x SSD Intel DC S3500 240GB raid 1+0

128 GB RAM

2 x 10 GbE

 

What do you think?

Any feedbacks, advices, or ideas are welcome!

 

Thanks so much

 

Regards,




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-03 Thread Colombo Marco
Hi Nick,

Hi,

Just a couple of points, you might want to see if you can get a Xeon v3 
board+CPU as they have more performance and use less power.

   ok

You can also get a SM 2U chassis which has 2x 2.5” disk slots at the rear, this 
would allow you to have an extra  2x 3.5” disks in the front of the server.

These two rear slots will be used for the Operating System's SSD

Extra ram in the OSD nodes would probably help performance a bit

   ok

How many nodes are you going to have? You might find that bonded 10G networking 
is sufficient instead of the extra cost of 40GB networking.

I think about 14 o 16 OSD nodes.
3 Metadata/Monitor nodes

Nick

Thanks
Regards

Marco

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Colombo Marco
Sent: 03 February 2015 15:17
To: ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph Supermicro hardware recommendation

Hi all,
 I have to build a new Ceph storage cluster, after i‘ve read the hardware 
recommendations and some mail from this mailing list i would like to buy these 
servers:

OSD:
SSG-6027R-E1R12L - 
http://xo4t.mj.am/link/xo4t/grkj3rk/1/m3tngGzWbOpwg5uXd5lPdw/aHR0cDovL3d3dy5zdXBlcm1pY3JvLm5sL3Byb2R1Y3RzL3N5c3RlbS8yVS82MDI3L1NTRy02MDI3Ui1FMVIxMkwuY2Zt
Intel Xeon e5-2630 v2
64 GB RAM
LSI 2308 IT
2 x SSD Intel DC S3700 400GB
2 x SSD Intel DC S3700 200GB
8 x HDD Seagate Enterprise 6TB
2 x 40GbE for backend network
2 x 10GbE  for public network

META/MON:

SYS-6017R-72RFTP - 
http://xo4t.mj.am/link/xo4t/grkj3rk/2/Fc3dQ9lM7vImlEFAB-_wDg/aHR0cDovL3d3dy5zdXBlcm1pY3JvLmNvbS9wcm9kdWN0cy9zeXN0ZW0vMVUvNjAxNy9TWVMtNjAxN1ItNzJSRlRQLmNmbQ
2 x Intel Xeon e5-2637 v2
4 x SSD Intel DC S3500 240GB raid 1+0
128 GB RAM
2 x 10 GbE

What do you think?
Any feedbacks, advices, or ideas are welcome!

Thanks so much

Regards,


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-03 Thread Christian Balzer
On Tue, 3 Feb 2015 15:16:57 + Colombo Marco wrote:

 Hi all,
  I have to build a new Ceph storage cluster, after i‘ve read the
 hardware recommendations and some mail from this mailing list i would
 like to buy these servers:
 

Nick mentioned a number of things already I totally agree with, so don't
be surprised if some of this feels like a repeat.

 OSD:
 SSG-6027R-E1R12L -
 http://www.supermicro.nl/products/system/2U/6027/SSG-6027R-E1R12L.cfm
 Intel Xeon e5-2630 v2 64 GB RAM
As nick said, v3 and more RAM might be helpful, depending on your use case
(small writes versus large ones) even faster CPUs as well.

 LSI 2308 IT
 2 x SSD Intel DC S3700 400GB
 2 x SSD Intel DC S3700 200GB
Why the separation of SSDs? 
They aren't going to be that busy with regards to the OS.

Get a case like Nick mentioned with 2 2.5 bays in the back, put 2 DC S3700
400GBs in there (connected to onboard 6Gb/s SATA3), partition them so that
you have a RAID1 for OS and plain partitions for the journals of the now 12
OSD HDDs in your chassis. 
Of course this optimization in terms of cost and density comes with a
price, if one SSD should fail, you will have 6 OSDs down. 
Given how reliable the Intels are this is unlikely, but something you need
to consider.

If you want to limit the impact of a SSD failure and have just 2 OSD
journals per SSD, get a chassis like the one above and 4 DC S3700 200GB,
RAID10 them for the OS and put 2 journal partitions on each. 

I did the same with 8 3TB HDDs and 4 DC S3700 100GB, the HDDs (and CPU
with 4KB IOPS), are the limiting factor, not the SSDs.

 8 x HDD Seagate Enterprise 6TB
Are you really sure you need that density? One disk failure will result in
a LOT of data movement once these become somewhat full.
If you were to go for a 12 OSD node as described above, consider 4TB ones
for the same overall density, while having more IOPS and likely the same
price or less.

 2 x 40GbE for backend network
You'd be lucky to write more that 800MB/s sustained to your 8 HDDs
(remember they will have to deal with competing reads and writes, this is
not a sequential synthetic write benchmark). 
Incidentally 1GB/s to 1.2GB/s (depending on configuration) would also be
the limit of your journal SSDs.
Other than backfilling caused by cluster changes (OSD removed/added), your
limitation is nearly always going to be IOPS, not bandwidth.

So 2x10GbE or if you're comfortable with it (I am ^o^) an Infiniband
backend (can be cheaper, less latency, plans for RDMA support in
Ceph) should be more than sufficient.

 2 x 10GbE  for public network
 
 META/MON:
 
 SYS-6017R-72RFTP -
 http://www.supermicro.com/products/system/1U/6017/SYS-6017R-72RFTP.cfm 2
 x Intel Xeon e5-2637 v2 4 x SSD Intel DC S3500 240GB raid 1+0
You're likely to get better performance and of course MUCH better
durability by using 2 DC S3700, at about the same price.

 128 GB RAM
Total overkill for a MON, but I have no idea about MDS and RAM never hurts.

In your follow-up you mentioned 3 mons, I would suggest putting 2 more
mons (only, not MDS) on OSD nodes and make sure that within the IP
numbering the real mons have the lowest IP addresses, because the MON
with the lowest IP becomes master (and thus the busiest). 
This way you can survive a loss of 2 nodes and still have a valid quorum.

Christian

 2 x 10 GbE
 
 What do you think?
 Any feedbacks, advices, or ideas are welcome!
 
 Thanks so much
 
 Regards,


-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com