Re: [ceph-users] Which OS for fresh install?
On 07/23/2014 04:09 PM, Bachelder, Kurt wrote: 2.) update your grub.conf to boot to the appropriate image (default=0, or whatever kernel in the list you want to boot from). Actually, edit /etc/sysconfig/kernel, set DEFAULTKERNEL=kernel-lt before installing it. -- Dimitri Maziuk Programmer/sysadmin BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu signature.asc Description: OpenPGP digital signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
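A sketch of that sequence, assuming the elrepo kernel-lt package (package and file names may differ on your setup):

```shell
# Set the default-kernel package *before* installing the new kernel,
# so new-kernel-pkg makes the new image the default grub entry:
sed -i 's/^DEFAULTKERNEL=.*/DEFAULTKERNEL=kernel-lt/' /etc/sysconfig/kernel
yum --enablerepo=elrepo-kernel install kernel-lt
# Sanity check: default= should now point at the kernel-lt "title" entry.
grep ^default= /boot/grub/grub.conf
```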
Re: [ceph-users] Problem installing ceph from package manager / ceph repositories
On 06/09/2014 03:08 PM, Karan Singh wrote: 1. When installing Ceph using the package manager and the Ceph repositories, the package manager (i.e. YUM) does not respect the ceph.repo file and takes the Ceph packages directly from EPEL. Option 1: install yum-plugin-priorities and add priority = X to ceph.repo; X should be less than EPEL's priority, whose default is, I believe, 99. Option 2: add exclude = ceph_package(s) to epel.repo.
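Both options as repo-file fragments (the priority value and package list are illustrative; yum-plugin-priorities must be installed for option 1 to take effect):

```
# /etc/yum.repos.d/ceph.repo -- option 1
[ceph]
name=Ceph packages
baseurl=http://ceph.com/rpm/el6/x86_64/
enabled=1
# lower number = higher priority; EPEL's default is 99
priority=1

# /etc/yum.repos.d/epel.repo -- option 2
[epel]
...
exclude=ceph ceph-* librados2 librbd1
```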
Re: [ceph-users] Recommended way to use Ceph as storage for file server
On 06/02/2014 11:24 AM, Mark Nelson wrote: A more or less obvious alternative to CephFS would be to simply create a huge RBD and have a separate file server (running NFS / Samba / whatever) use that block device as its backend. Just put a regular FS on top of the RBD and use it that way. Clients wouldn't really get any of the real performance and resilience benefits that Ceph could offer, though, because the (single machine?) file server is now the bottleneck. Performance: that's assuming all your nodes are fast storage on a quad-10Gb pipe. Resilience: your gateway can be an active-passive HA pair; that shouldn't be any different from NFS+DRBD setups. It's kind of a tough call. Your observations regarding the downsides of using NFS with RBD are apt. You could try throwing another distributed storage system on top of RBD and use Ceph for the replication etc., but that's not really ideal either. CephFS is relatively stable with active/standby MDS configurations, but it may still have bugs and there are no guarantees or official support (yet!). If you believe in the ten-years rule of thumb, cephfs will become stable enough for production use sometime between 2017 and 2022, depending on whether you start counting from Sage's thesis defense or from the first official code release. ;)
Re: [ceph-users] How to backup mon-data?
On 05/27/2014 10:30 AM, Craig Lewis wrote: A ZFS snapshot is atomic, but it doesn't tell the daemons to flush their logs to disk. Reverting to a snapshot looks the same as if you had turned off the machine by yanking the power cord at the instant the snapshot was taken. That sounds more relevant than OOM due to slab fragmentation -- as I understand it, that's basically a concern if you don't have enough RAM, in which case you have a problem, ZFS or no ZFS.
Re: [ceph-users] How to backup mon-data?
On 05/23/2014 03:06 PM, Craig Lewis wrote: 1: ZFS or Btrfs snapshots could do this, but neither one is recommended for production. Out of curiosity, what's the current beef with ZFS? I know what problems are cited for btrfs, but I haven't heard much about ZFS lately.
Re: [ceph-users] PCI-E SSD Journal for SSD-OSD Disks
On 05/15/2014 01:19 PM, Tyler Wilson wrote: Would running a different distribution affect this at all? Our target was CentOS 6; however, if a more recent kernel would make a difference we could switch. FWIW, you can run CentOS 6 with the 3.10 kernel from elrepo.
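A hypothetical sketch of pulling that kernel in (release-RPM version and repo name as published by elrepo.org; verify against their current instructions):

```shell
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
# the kernel packages live in the disabled-by-default elrepo-kernel repo:
yum --enablerepo=elrepo-kernel install kernel-lt   # 3.10 long-term branch
```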
Re: [ceph-users] NFS over CEPH - best practice
On 5/12/2014 4:52 AM, Andrei Mikhailovsky wrote: Leen, thanks for explaining things. It does make sense now. Unfortunately, it does look like this technology would not fulfill my requirements, as I do need the ability to perform maintenance without shutting down VMs. I've no idea how much state you need to share for iSCSI failover; with NFS you put the cluster IP address, the lock directories, and the daemons on a heartbeat'ed pair of machines. With automount you don't need multiple active servers; you can do (much simpler) active-passive. Dima
Re: [ceph-users] NFS over CEPH - best practice
On 05/12/2014 01:17 PM, McNamara, Bradley wrote: The underlying file system on the RBD needs to be a clustered file system, like OCFS2, GFS2, etc., and a cluster between the two, or more, iSCSI target servers needs to be created to manage the clustered file system. Looks like we aren't sure what the OP wanted multiple servers for: - serving one image to multiple clients (in which case all of the above plus more applies), or - failover setup with one image/one client (in which case you could usually go active/passive and not care about concurrency and all its tentacles). Andrei?
Re: [ceph-users] 16 osds: 11 up, 16 in
On 05/07/2014 04:11 PM, Craig Lewis wrote: On 5/7/14 13:40, Sergey Malinin wrote: Check dmesg and SMART data on both nodes. This behaviour is similar to a failing hdd. It does sound like a failing disk... but there's nothing in dmesg, and smartmontools hasn't emailed me about a failing disk. The same thing is happening to more than 50% of my OSDs, in both nodes. Check 'iostat -dmx 5 5' (or some other numbers) -- if you see a disk pegged at 100% utilization, that could be the dying one.
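A toy illustration of reading that output: the last column of 'iostat -dmx' is %util, so a throwaway awk filter can flag saturated devices (the sample figures below are made up):

```shell
sample='Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.20 0.50 2.10 0.01 0.02 8.40 0.02 5.10 1.20 3.40
sdb 0.00 0.80 85.00 40.00 10.50 4.20 120.00 9.80 80.20 8.00 99.90'

# print any device whose %util (last field) is above 90:
echo "$sample" | awk 'NR > 1 && $NF > 90 { print $1 }'
```

In real use you'd pipe 'iostat -dmx 5 5' through the same awk (skipping the repeated headers) instead of the canned sample; sdb is the suspect here, ~100% utilized while sda idles.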
Re: [ceph-users] advice with hardware configuration
On 05/06/2014 11:34 AM, Xabier Elkano wrote: OS: 2xSSD intel SC3500 100G Raid 1 Why would you put the OS on SSDs? If you buy enough RAM so it doesn't swap, about the only I/O on the system drive will be logging. All that'd do is wear out your SSDs, not that there's much of that going on. (Our servers average .01% utilization on system drives, most of it log writes.) I can see placing the OS and journals on the same disks; then SSDs make sense because that's where the journals are.
Re: [ceph-users] The Ceph disk I would like to have
On 03/25/2014 10:49 AM, Loic Dachary wrote: Hi, It's not available yet but ... are we far away? It's a pity the Pi doesn't do SATA. Otherwise all you'd need is a working ARM port and some scripting...
Re: [ceph-users] The next generation beyond Ceph
On 03/21/2014 04:20 PM, Loic Dachary wrote: Hi Ted, Sorry if I misunderstood your initial message: I did not realize it was marketing for the competition. Dear Loic, I wanted to reach out to you with this exciting money transfer opportunity that I believe your bank account could really benefit from. It is currently still in stealth mode, but it's already very big in Nigeria. Would you send us all your bank account passwords so we can educate you about our offer? ;)
Re: [ceph-users] RBD module - RHEL 6.4
On 01/29/2014 12:27 PM, alistair.whit...@barclays.com wrote: We will not be able to deploy anything other than a fully supported RedHat kernel ...in which case your only option is probably RHEL 7, and hope they didn't exclude the ceph modules from their kernel. The stock CentOS 6.5 kernel does not have rbd.ko, so I'm sure the upstream RHEL one doesn't either. ELRepo's kernel 3.10 has it, but that's not going to help you.
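A quick way to check whether the running kernel's module tree ships rbd.ko at all:

```shell
# modinfo exits non-zero when the module isn't present in the
# running kernel's module tree:
if modinfo rbd >/dev/null 2>&1; then
    echo "rbd available"
else
    echo "no rbd.ko"
fi
```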
Re: [ceph-users] RBD module - RHEL 6.4
On 01/29/2014 12:47 PM, Schlacta, Christ wrote: Dkms is red hat technology. They developed it. Whether or not they support it I don't know... what I do know is that dkms by design doesn't modify your running, installed, fully supported RedHat kernel. This is in fact why and how RedHat designed it. First of all, that's rubbish: you can't install a driver without modifying your system. That's why even the stuff RedHat provides as a technology preview is not supported by RedHat for production use; I'm fairly sure stuff you build yourself is out of the question entirely. Second, it's usually not about technology, it's about auditors with checklists. The fact that you can do it, and it will most likely work just fine, has nothing to do with it.
Re: [ceph-users] servers advise (dell r515 or supermicro ....)
On 1/15/2014 9:16 AM, Mark Nelson wrote: On 01/15/2014 09:14 AM, Alexandre DERUMIER wrote: For the system disk, do you use some kind of internal flash memory disk? We probably should have, but ended up with, I think, just a 500GB 7200rpm disk, whatever was cheapest. :) If your system has to swap a lot, you need more RAM. If it loads stuff from disk a lot (other than thrashing), take a closer look at your job mix: there are likely things you should run elsewhere. Outside of those, the only thing that bangs on the system disk is logging, and you can log to a dedicated log server and eliminate that bit of system disk I/O altogether. I.e. speed-wise you should be OK with a USB stick for a system drive. Dima
Re: [ceph-users] Ceph / Dell hardware recommendation
On 01/15/2014 10:53 AM, Alexandre DERUMIER wrote: From what I understand the flexbay are inside the box, typically useful for OS (SSD) drives; then it lets you use all the front hotplug slots for larger platter drives. Yes, it's inside the box. I asked the question because of Derek's message: They currently give me a hard time about trying to mix and match SSDs though on the 12 bay back-plane, which is not a technical problem but a Dell problem. At a guess, the Dell BIOS complains that "the drive/configuration is not supported, contact your Dell representative for a replacement. Press F1 to boot."
Re: [ceph-users] Ceph / Dell hardware recommendation
On 01/15/2014 12:42 PM, Derek Yarnell wrote: ... I think this is more a configuration Dell has been unwilling to sell, is all. Ah. Every once in a while they make their BIOS complain when it finds a non-Dell-approved disk. Once enough customers start screaming they release a BIOS update that turns that bit off, and it stays that way for a while... and then they release the next h/w model and the cycle repeats. ;)
Re: [ceph-users] Ceph as offline S3 substitute and peer-to-peer fileshare?
On 01/02/2014 04:20 PM, Alek Storm wrote: Anything? Would really appreciate any wisdom at all on this. I think what you're looking for is called git.
Re: [ceph-users] Cluster Performance very Poor
On 12/27/2013 05:10 PM, German Anders wrote: 1048576000 bytes (1.0 GB) copied, 10.2545 s, 102 MB/s FWIW, I've a crappy Crucial v4 SSD that clocks about 106MB/s on sequential I/O... Not sure how much you expect to see, esp. if you have a giga*bit* link to some of the disks.
Re: [ceph-users] When will Ceph FS be ready for use with production data
On 12/21/2013 10:04 AM, Wido den Hollander wrote: On 12/21/2013 02:50 PM, Yan, Zheng wrote: I don't know when Inktank will claim CephFS is stable. But as a cephfs developer, I already have trouble finding new issues in my test setup. If you are willing to help improve cephfs, please try cephfs and report any issue you encounter. Great to hear. Are you also testing Multi-MDS or just one Active/Standby? And snapshots? Those were giving some problems as well. What was it I heard about performance tiers? Last I tried, cephfs was spreading I/O evenly over OSDs, fast and slow alike, with no way to tune that. Thanks, Dima
Re: [ceph-users] centos6.4 + libvirt + qemu + rbd/ceph
On 12/06/2013 04:03 PM, Alek Paunov wrote: We use only Fedora servers for everything, so I am curious why you have excluded this option from your research? (CentOS is always problematic with the new bits of technology.) A 6-month lifecycle, and having to OS-upgrade your entire data center 3 times a year? (OK, maybe it's 18 months and once every 9 months.)
Re: [ceph-users] centos6.4 + libvirt + qemu + rbd/ceph
On 12/06/2013 04:28 PM, Alek Paunov wrote: On 07.12.2013 00:11, Dimitri Maziuk wrote: A 6-month lifecycle, and having to OS-upgrade your entire data center 3 times a year? (OK, maybe it's 18 months and once every 9 months.) Most servers nowadays are re-provisioned even more often. Not where I work they aren't. Each Fedora release comes with more and more KVM/libvirt features and resolved issues, so the net effect is positive anyway. Yes, that is the main argument for tracking Ubuntu. ;)
Re: [ceph-users] Is Ceph a provider of block device too ?
On 11/21/2013 12:52 PM, Gregory Farnum wrote: If you want a logically distinct copy (? this seems to be what Dimitri is referring to with adding a 3rd DRBD copy on another node) Disclaimer: I haven't done stacked drbd, this is from my reading of the fine manual -- I was referring to a stacked setup where you make a drbd raid-1 with 2 hosts and then another drbd raid-1 with that drbd device and a third host. I don't believe drbd can keep 3 replicas any other way -- unlike ceph, obviously.
Re: [ceph-users] alternative approaches to CEPH-FS
On 11/19/2013 08:02 PM, YIP Wai Peng wrote: Hm, so maybe this nfsceph is not _that_ bad after all! :) Your read clearly wins, so I'm guessing the drbd write is the slow one. Which drbd mode are you using? Active/passive pair, meta-disk internal, protocol C over a 5-long crossover cable on eth1: 1000baseT/Full. Protocol B would probably speed up the writes, but when I run things that write a lot, I make them write to /var/tmp anyway... cheers,
Re: [ceph-users] Disk Density Considerations
On 2013-11-06 08:37, Mark Nelson wrote: ... Taking this even further, options like the Hadoop fat-twin nodes with 12 drives in 1U could potentially be even denser, while spreading the drives out over even more nodes. Now instead of 4-5 large dense nodes you have maybe 35-40 small dense nodes. The downside here, though, is that the cost may be a bit higher and you have to slide out a whole node to swap drives, though Ceph is more tolerant of this than many distributed systems. Another one is 35-40 switch ports vs 4-5. I hear regular 10G ports eat up over 10 watts of juice, and cat6e cable offers a unique combination of poor design and high cost. It's probably OK to need 35-40 routable IP addresses: you can add another interface/subnet to your public-facing clients. Dima
Re: [ceph-users] Red Hat clients
On 10/30/2013 02:35 PM, Gruher, Joseph R wrote: I have CentOS 6.4 running with the 3.11.6 kernel from elrepo and it includes the rbd module. I think you could make the same update on RHEL 6.4 and get rbd. Mmm... I think RHEL means paid support, which means you can't run an elrepo kernel. Plus, I didn't have much luck with their -ml kernels (also on CentOS 6.current) -- half of them wouldn't boot on our Supermicros, and the latest crop won't boot on my Dell PC. So yeah, if by RHEL you mean centos/scilinux, and you find an -ml kernel that actually works on your hardware... then you get rbd. As long as you don't 'yum update' the kernel.
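If you do go the elrepo route, one way to keep a routine 'yum update' from pulling the stock kernel back in is a yum exclude (a sketch; elrepo's packages are named kernel-ml/kernel-lt, so they remain updatable):

```shell
# block updates to the distribution kernel packages only:
echo "exclude=kernel kernel-devel kernel-headers" >> /etc/yum.conf
```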
Re: [ceph-users] CephFS Project Manila (OpenStack)
On 2013-10-22 22:41, Gregory Farnum wrote: ... Right now, unsurprisingly, the focus of the existing Manila developers is on Option 1: it's less work than the others and supports the most common storage protocols very well. But as mentioned, it would be a pretty poor fit for CephFS. I must be missing something: I thought CephFS was supposed to be a distributed filesystem, which to me means option 1 was the point. Dima
Re: [ceph-users] Ceph and RAID
On 10/03/2013 12:40 PM, Andy Paluch wrote: Don't you have to take down a ceph node to replace a defective drive? If I have a ceph node with 12 disks and one goes bad, would I not have to take the entire node down to replace and then reformat? If I have a hotswap chassis but am using just an HBA to connect my drives, will the OS (say latest Ubuntu) support hot-swapping the drive, or do I have to shut it down to replace the drive, then bring it up and format, etc.? Linux supports hotswap. You'll have to restart an OSD, but not reboot the node. The issue with cluster rebalancing is bandwidth: basically, the SATA/SAS backplane on one node vs. (potentially) the slowest network link in your cluster, which also carries data traffic for everybody. There are too many variables involved; you figure out the balance between ceph replication and RAID replication for your cluster budget.
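Roughly what the per-OSD swap looks like with the sysvinit-era ceph packaging (osd.12 and /dev/sdc are placeholders, and the exact rebuild steps depend on how the OSD was deployed):

```shell
service ceph stop osd.12          # only this OSD; the node stays up
# pull the failed drive, insert the replacement, wait for the kernel
# to see it (watch dmesg), then rebuild the OSD's filesystem:
mkfs.xfs -f /dev/sdc
mount /dev/sdc /var/lib/ceph/osd/ceph-12
ceph-osd -i 12 --mkfs --mkkey     # re-create the OSD's data directory
service ceph start osd.12
```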
Re: [ceph-users] Ceph and RAID
On 2013-10-02 07:35, Loic Dachary wrote: Hi, I would not use RAID5 since it would be redundant with what Ceph provides. I would not use raid-5 (or 6) because its safety on modern drives is questionable, and because I haven't seen anyone comment on ceph's performance on it -- e.g. the openstack docs explicitly say don't use raid-5 because swift's access patterns are the worst case for raid. I would consider (mdadm) raid-1, depending on the hardware budget, because this way a single disk failure will not trigger a cluster-wide rebalance. Dima
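For illustration, the mdadm mirror under one OSD, and the disk swap that then never touches ceph (device names are placeholders):

```shell
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
mkfs.xfs /dev/md0                 # the OSD's filestore goes on the mirror
# later, when /dev/sdb dies, the cluster never notices:
mdadm /dev/md0 --fail /dev/sdb --remove /dev/sdb
mdadm /dev/md0 --add /dev/sdd     # resync happens inside the node
```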
Re: [ceph-users] some newbie questions...
On 2013-08-31 11:36, Dzianis Kahanovich writes: Johannes Klarenbeek wrote: 1) I read somewhere that it is recommended to have one OSD per disk in a production environment. Is this also the maximum of disks per OSD, or could I use multiple disks per OSD? And why? You could use multiple disks for one OSD if you used some striping and abstracted the disk (like LVM, MDRAID, etc.), but it wouldn't make sense. One OSD writes into one filesystem, and that is usually one disk in a production environment. Using RAID under it wouldn't drastically increase either reliability or performance. I see some sense in RAID 0: a single ceph-osd daemon per node (but still disk-per-OSD otherwise). But if you have relatively few [planned] cores per task on a node, you can think about it. Raid-0: a single disk failure kills the entire filesystem, off-lines the OSD, and triggers a cluster-wide resync. Actual raid: a single disk failure does not affect the cluster in any way. Dima
Re: [ceph-users] OSD to OSD Communication
On 08/30/2013 01:38 PM, Geraint Jones wrote: On 30/08/13 11:33 AM, Wido den Hollander w...@42on.com wrote: On 08/30/2013 08:19 PM, Geraint Jones wrote: Hi Guys, We are using Ceph in production backing an LXC cluster. The setup is: 2 x servers, 24 x 3TB disks each in groups of 3 as RAID0, SSD for journals, bonded 1gbit ethernet (2gbit total). I think you sized your machines too big. I'd say go for 6 machines with 8 disks each, without RAID-0. Let Ceph do its job and avoid RAID. Typical traffic is fine - it's just been an issue tonight :) If you're hosed and have to recover a 9TB filesystem, you'll have problems no matter what, ceph or no ceph. You *will* have a disk failure every once in a while, and there's no 'r' in raid-0, so don't think what happened is not typical. (There's nothing wrong with raid as long as it's not 0.)
Re: [ceph-users] OSD to OSD Communication
On 08/30/2013 01:51 PM, Mark Nelson wrote: On 08/30/2013 01:47 PM, Dimitri Maziuk wrote: (There's nothing wrong with raid as long as it's not 0.) One exception: some controllers (looking at you, LSI!) don't expose disks as JBOD, or if they do, don't let you use write-back cache. In those cases we sometimes have people make single-disk RAID0 LUNs. :) We don't use the ones we have for JBOD, but I do recall trying and failing, yes. They do make what our vendor calls HBA controllers, though, and for noticeably less money.
Re: [ceph-users] RadosGW High Availability
On 05/09/2013 09:57 AM, Tyler Brekke wrote: For high availability RGW you would need a load balancer. HAProxy is an example of a load balancer that has been used successfully with rados gateway endpoints. Strictly speaking, for HA you need an HA solution, e.g. heartbeat. The main difference between that and load balancing is that one server serves the clients until it dies, then another takes over. With load balancing, all servers get a share of the requests. It can be configured to do HA: set the main server's share to 100%, and the backup will get no requests as long as the main is up. RRDNS is a load-balancing solution. Depending on the implementation, it can simply return a list of IPs instead of a single IP for the host name; then it's up to the client to pick one. A simple, stupid client may always pick the first one. A simple, stupid server may always return the list in the same order. That could be how all your clients always pick the same server.
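A minimal HAProxy sketch of the "load balancer configured as HA" idea for two radosgw endpoints (addresses and names are hypothetical); the backup keyword keeps rgw2 idle until rgw1 fails its health check:

```
frontend rgw_front
    bind *:80
    mode http
    default_backend rgw_back

backend rgw_back
    mode http
    server rgw1 10.0.0.11:80 check
    server rgw2 10.0.0.12:80 check backup
```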
Re: [ceph-users] interesting crush rules
On 05/01/2013 04:51 PM, Gregory Farnum wrote: On Wed, May 1, 2013 at 2:44 PM, Sage Weil s...@inktank.com wrote: I added a blueprint for extending the crush rule language. If there are interesting or strange placement policies you'd like to do and aren't able to currently express using CRUSH, please help us out by enumerating them on that blueprint. http://wiki.ceph.com/01Planning/02Blueprints/Dumpling/extend_crush_rule_language, if you don't have the blueprint site handy already. :) My issue was placement of files/directories (w/ cephfs): I wanted a complete file (or directory: all the files in it) on osd.x. The rules I'm interested in would be like: - pick one osd from rack 1, pick 2 osds from rack 2, put a complete copy of everything on each (HA scenario with 2 copies in the on-site rack 2 and a copy in the off-site rack 1); - pick all osds from the group 'compute nodes' and place a complete copy of everything on each (data placement on compute grids). (Obviously, there's also the bit about getting the clients to read from the right osd.)
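For what it's worth, the first rule can almost be expressed with multiple take/emit steps in the existing crush map language (bucket names are hypothetical, and this does nothing about steering reads to the near copy):

```
rule onsite_plus_offsite {
    ruleset 1
    type replicated
    min_size 3
    max_size 3
    step take rack2
    step chooseleaf firstn 2 type host
    step emit
    step take rack1
    step chooseleaf firstn 1 type host
    step emit
}
```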
Re: [ceph-users] Ceph mon quorum
On 4/5/2013 7:57 AM, Wido den Hollander wrote: You always need a majority of your monitors to be up. In this case you lose 66% of your monitors, so mon.b can't get a majority. With 3 monitors you need at least 2 to be up to have your cluster working. That's kinda useless, isn't it? I'd've thought "2 copies on-site and one off-site, and if the main site room's down we can work off the off-site server" is a basic enough HA setup -- we've had it here for some time. Now you tell me ceph won't even do that? Dima
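The quorum arithmetic, for the record: a majority is floor(N/2) + 1, so the number of tolerable failures grows only with every second monitor added. A throwaway shell loop shows why 2 out of 3 must stay up:

```shell
for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))           # smallest majority of n monitors
  echo "$n monitors: need $quorum up, tolerate $(( n - quorum )) down"
done
```

Note the jumps: going from 2 to 3 monitors, or from 4 to 5, buys one more tolerable failure; going from 3 to 4 buys nothing.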
Re: [ceph-users] Ceph mon quorum
On 04/05/2013 10:12 AM, Wido den Hollander wrote: Think about it this way. You have two racks and the network connection between them fails. If both racks keep operating because they can still reach the single monitor in their rack, you will end up with data inconsistency. Yes. In DRBD land it's called 'split brain', and they have (IIRC) an entire chapter in the user manual about picking up the pieces. It's not a new problem. You should place mon.c outside rack A or B to keep you up and running in this situation. It's not about racks, it's about rooms, but let's say rack == room == colocation facility. And I have two of those. Are you saying I need a 3rd colo, with all the associated overhead, to have a usable replica of my data in colo #2?