[ceph-users] Benefits of using Ceph with Docker or LibVirt & LXC
What are the benefits of using Ceph with either Docker or libvirt & LXC?

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Question about the calamari
Can anybody help?

On Dec 1, 2014, at 11:37, mail list wrote:
> Hi, all,
>
> I have installed the Calamari server, the Calamari client, and diamond on a
> CentOS server, then ran the following command:
>
> {code}
> [root@centos65 content]# sudo calamari-ctl initialize
> [INFO] Loading configuration...
> [INFO] Starting/enabling salt...
> [INFO] Starting/enabling postgres...
> [INFO] Initializing database...
> [INFO] Initializing web interface...
> [INFO] You will now be prompted for login details for the administrative user
> account. This is the account you will use to log into the web interface once
> setup is complete.
> Username (leave blank to use 'root'):
> Email address: x...@xxx.com
> Password:
> Password (again):
> Superuser created successfully.
> [INFO] Starting/enabling services...
> [INFO] Restarting services...
> {code}
>
> The command hangs at this point and never completes.
>
> I then checked the log /var/log/calamari/cthulhu.log and found the following
> error:
>
> {code}
> OperationalError: (OperationalError) could not connect to server: Connection refused
>     Is the server running on host "localhost" and accepting
>     TCP/IP connections on port 5432?
> None None
> 2014-10-09 21:55:14,970 - WARNING - cthulhu.salt Re-opening connection to salt-master
> 2014-10-09 21:55:15,024 - WARNING - cthulhu.server_monitor.salt Re-opening connection to salt-master
> 2014-10-09 21:55:39,977 - WARNING - cthulhu.salt Re-opening connection to salt-master
> 2014-10-09 21:55:40,031 - WARNING - cthulhu.server_monitor.salt Re-opening connection to salt-master
> 2014-10-09 21:56:04,983 - WARNING - cthulhu.salt Re-opening connection to salt-master
> 2014-10-09 21:56:05,048 - WARNING - cthulhu.server_monitor.salt Re-opening connection to salt-master
> 2014-10-09 21:56:29,992 - WARNING - cthulhu.salt Re-opening connection to salt-master
> {code}
>
> Port 5432 is the PostgreSQL port, yet PostgreSQL itself appears to be
> working fine:
>
> {code}
> [louis@centos65 calamari]$ sudo -u postgres psql postgres -p 5432
> [sudo] password for louis:
> psql (8.4.20)
> Type "help" for help.
>
> postgres=# \list
>                                 List of databases
>    Name    |  Owner   | Encoding |  Collation  |    Ctype    |   Access privileges
> -----------+----------+----------+-------------+-------------+-----------------------
>  calamari  | calamari | UTF8     | en_US.UTF8  | en_US.UTF8  |
>  postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>  template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
>            :                                                   postgres=CTc/postgres
>  template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
>            :                                                   postgres=CTc/postgres
> (4 rows)
> {code}
>
> So what is the matter with it? Any ideas will be appreciated!
[ceph-users] trouble starting second monitor
What does this mean, please?

--rich

ceph@adriatic:~/my-cluster$ ceph status
    cluster 1023db58-982f-4b78-b507-481233747b13
     health HEALTH_OK
     monmap e1: 1 mons at {black=192.168.1.77:6789/0}, election epoch 2, quorum 0 black
     mdsmap e7: 1/1/1 up {0=adriatic=up:active}, 3 up:standby
     osdmap e17: 4 osds: 4 up, 4 in
      pgmap v48: 192 pgs, 3 pools, 1884 bytes data, 20 objects
            29134 MB used, 113 GB / 149 GB avail
                 192 active+clean

ceph@adriatic:~/my-cluster$ ceph-deploy mon create celtic
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy mon create celtic
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts celtic
[ceph_deploy.mon][DEBUG ] detecting platform for host celtic ...
[celtic][DEBUG ] connection detected need for sudo
[celtic][DEBUG ] connected to host: celtic
[celtic][DEBUG ] detect platform information from remote host
[celtic][DEBUG ] detect machine type
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
[celtic][DEBUG ] determining if provided host has same hostname in remote
[celtic][DEBUG ] get remote short hostname
[celtic][DEBUG ] deploying mon to celtic
[celtic][DEBUG ] get remote short hostname
[celtic][DEBUG ] remote hostname: celtic
[celtic][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[celtic][DEBUG ] create the mon path if it does not exist
[celtic][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-celtic/done
[celtic][DEBUG ] create a done file to avoid re-doing the mon deployment
[celtic][DEBUG ] create the init path if it does not exist
[celtic][DEBUG ] locating the `service` executable...
[celtic][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph id=celtic
[celtic][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.celtic.asok mon_status
[celtic][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[celtic][WARNIN] monitor: mon.celtic, might not be running yet
[celtic][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.celtic.asok mon_status
[celtic][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[celtic][WARNIN] celtic is not defined in `mon initial members`
[celtic][WARNIN] monitor celtic does not exist in monmap
[celtic][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[celtic][WARNIN] monitors may not be able to form quorum
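The last four WARNIN lines point at the likely cause: the new monitor is not listed in `mon initial members`, and no public address or network is configured, so it cannot join the existing quorum. A minimal sketch of the relevant ceph.conf keys, assuming the 192.168.1.0/24 subnet implied by the monmap line above (example values from this transcript, not verified settings; push the updated conf out and redeploy the mon afterwards):

```ini
[global]
mon initial members = black, celtic
public network = 192.168.1.0/24
```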
Re: [ceph-users] initial attempt at ceph-deploy fails name resolution
Found it. Ubuntu has a misfeature where it lists the hostname and FQDN in
/etc/hosts using a loopback address (127.0.1.1). This makes it impossible to
determine the real internet address, which confuses many services, including
ceph_deploy.util.get_nonlocal_ip, which tosses all the 127.x addresses and
then complains that it has no addresses left.

The fix is simple: remove that line from /etc/hosts. AFAIK, its only purpose
is to support stand-alone, no-network machines. This misfeature has bitten me
a number of times. I think this should go into the quick start documentation.

--rich

On 11/30/14 13:30, K Richard Pixley wrote:
> Hm. Seems like the problem might be deeper than I thought. It doesn't seem
> to resolve FQDNs either, although it appears to be connected remotely,
> which would have required /some/ sort of name resolution.
>
> --rich
>
> ceph@adriatic:~/my-cluster$ ceph-deploy new adriatic.noir.com
> [ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy new adriatic.noir.com
> [ceph_deploy.new][DEBUG ] Creating new cluster named ceph
> [ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
> [adriatic.noir.com][DEBUG ] connection detected need for sudo
> [adriatic.noir.com][DEBUG ] connected to host: adriatic.noir.com
> [adriatic.noir.com][DEBUG ] detect platform information from remote host
> [adriatic.noir.com][DEBUG ] detect machine type
> [adriatic.noir.com][DEBUG ] find the location of an executable
> [adriatic.noir.com][INFO  ] Running command: sudo /bin/ip link show
> [adriatic.noir.com][INFO  ] Running command: sudo /bin/ip addr show
> [adriatic.noir.com][DEBUG ] IP addresses found: ['192.168.1.76']
> [ceph_deploy.new][DEBUG ] Resolving host adriatic.noir.com
> [ceph_deploy][ERROR ] UnableToResolveError: Unable to resolve host: adriatic.noir.com
>
> On 11/30/14 12:53, K Richard Pixley wrote:
>> My initial attempt at ceph-deploy is failing on name resolution. Yet
>> ping, ssh, etc., all work. What is the name resolution test my machine
>> needs to pass in order for ceph-deploy to work?
>>
>> --rich
>>
>> ceph@adriatic:~/my-cluster$ ceph-deploy new adriatic
>> [ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
>> [ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy new adriatic
>> [ceph_deploy.new][DEBUG ] Creating new cluster named ceph
>> [ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
>> [adriatic][DEBUG ] connection detected need for sudo
>> [adriatic][DEBUG ] connected to host: adriatic
>> [adriatic][DEBUG ] detect platform information from remote host
>> [adriatic][DEBUG ] detect machine type
>> [adriatic][DEBUG ] find the location of an executable
>> [adriatic][INFO  ] Running command: sudo /bin/ip link show
>> [adriatic][INFO  ] Running command: sudo /bin/ip addr show
>> [adriatic][DEBUG ] IP addresses found: ['192.168.1.76']
>> [ceph_deploy.new][DEBUG ] Resolving host adriatic
>> [ceph_deploy][ERROR ] UnableToResolveError: Unable to resolve host: adriatic
>> ceph@adriatic:~/my-cluster$ ping -c 2 adriatic
>> PING adriatic.noir.com (127.0.1.1) 56(84) bytes of data.
>> 64 bytes from adriatic.noir.com (127.0.1.1): icmp_seq=1 ttl=64 time=0.043 ms
>> 64 bytes from adriatic.noir.com (127.0.1.1): icmp_seq=2 ttl=64 time=0.059 ms
>>
>> --- adriatic.noir.com ping statistics ---
>> 2 packets transmitted, 2 received, 0% packet loss, time 1001ms
>> rtt min/avg/max/mdev = 0.043/0.051/0.059/0.008 ms
>> ceph@adriatic:~/my-cluster$ cat /etc/resolv.conf
>> # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
>> #     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
>> nameserver 192.168.1.13
>> nameserver 198.144.192.2
>> nameserver 198.144.192.4
>> search noir.com
>> domain noir.com
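The fix described above amounts to one sed command. This demonstration runs against a scratch copy so it is safe to execute; the sample file contents (including the 192.168.1.76 entry) are assumptions for illustration. Point the sed at /etc/hosts for the real fix:

```shell
# Safe demonstration: operate on a scratch copy, not the real /etc/hosts.
hosts=$(mktemp)
printf '%s\n' \
  '127.0.0.1 localhost' \
  '127.0.1.1 adriatic.noir.com adriatic' \
  '192.168.1.76 adriatic.noir.com adriatic' > "$hosts"
# Delete the Ubuntu-added loopback alias line that confuses get_nonlocal_ip.
sed -i '/^127\.0\.1\.1/d' "$hosts"
cat "$hosts"
```

With the 127.0.1.1 line gone, hostname lookups fall through to DNS (or a real-address /etc/hosts entry), so ceph-deploy sees a non-loopback address.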
[ceph-users] Question about the calamari
Hi, all,

I have installed the Calamari server, the Calamari client, and diamond on a
CentOS server, then ran the following command:

{code}
[root@centos65 content]# sudo calamari-ctl initialize
[INFO] Loading configuration...
[INFO] Starting/enabling salt...
[INFO] Starting/enabling postgres...
[INFO] Initializing database...
[INFO] Initializing web interface...
[INFO] You will now be prompted for login details for the administrative user
account. This is the account you will use to log into the web interface once
setup is complete.
Username (leave blank to use 'root'):
Email address: x...@xxx.com
Password:
Password (again):
Superuser created successfully.
[INFO] Starting/enabling services...
[INFO] Restarting services...
{code}

The command hangs at this point and never completes.

I then checked the log /var/log/calamari/cthulhu.log and found the following
error:

{code}
OperationalError: (OperationalError) could not connect to server: Connection refused
    Is the server running on host "localhost" and accepting
    TCP/IP connections on port 5432?
None None
2014-10-09 21:55:14,970 - WARNING - cthulhu.salt Re-opening connection to salt-master
2014-10-09 21:55:15,024 - WARNING - cthulhu.server_monitor.salt Re-opening connection to salt-master
2014-10-09 21:55:39,977 - WARNING - cthulhu.salt Re-opening connection to salt-master
2014-10-09 21:55:40,031 - WARNING - cthulhu.server_monitor.salt Re-opening connection to salt-master
2014-10-09 21:56:04,983 - WARNING - cthulhu.salt Re-opening connection to salt-master
2014-10-09 21:56:05,048 - WARNING - cthulhu.server_monitor.salt Re-opening connection to salt-master
2014-10-09 21:56:29,992 - WARNING - cthulhu.salt Re-opening connection to salt-master
{code}

Port 5432 is the PostgreSQL port, yet PostgreSQL itself appears to be working
fine:

{code}
[louis@centos65 calamari]$ sudo -u postgres psql postgres -p 5432
[sudo] password for louis:
psql (8.4.20)
Type "help" for help.

postgres=# \list
                                List of databases
   Name    |  Owner   | Encoding |  Collation  |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 calamari  | calamari | UTF8     | en_US.UTF8  | en_US.UTF8  |
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
           :                                                   postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
           :                                                   postgres=CTc/postgres
(4 rows)
{code}

So what is the matter with it? Any ideas will be appreciated!
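One thing the psql session above does not exercise: it connects over the unix socket, while calamari is being refused on TCP. Generic PostgreSQL checks worth trying (the config path is an assumption for a CentOS 6 / PostgreSQL 8.4 install; the live commands are shown as comments so this sketch runs anywhere):

```shell
# TCP probe; if this is also refused, the server is up but not listening on TCP:
#   sudo -u postgres psql -h 127.0.0.1 -p 5432 -c 'SELECT 1;'
# Inspect listen_addresses (path is an assumed CentOS default):
#   grep -E "^#?listen_addresses" /var/lib/pgsql/data/postgresql.conf
# An explicit empty value disables all TCP sockets. Parsing a sample line:
line="listen_addresses = ''"
value=$(printf '%s\n' "$line" | sed "s/.*=[[:space:]]*'\(.*\)'.*/\1/")
if [ -z "$value" ]; then
  echo "TCP disabled: set listen_addresses = 'localhost' and restart postgresql"
fi
```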
Re: [ceph-users] ceph-fs-common & ceph-mds on ARM Raspberry Debian 7.6
Hi,

You should be able to use the wheezy-backports repository, which has ceph
0.80.7.

Cheers,
Paulo

On Sun, 2014-11-30 at 19:31 +0100, Florent MONTHEL wrote:
> Hi,
>
> I’m trying to deploy CEPH (with ceph-deploy) on Raspberry Debian 7.6 and I
> get the error below on the ceph-deploy install command:
>
> [socrate.flox-arts.in][INFO  ] Running command: env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get -q -o Dpkg::Options::=--force-confnew --no-install-recommends --assume-yes install -- ceph ceph-mds ceph-common ceph-fs-common gdisk
> [socrate.flox-arts.in][DEBUG ] Reading package lists...
> [socrate.flox-arts.in][DEBUG ] Building dependency tree...
> [socrate.flox-arts.in][DEBUG ] Reading state information...
> [socrate.flox-arts.in][WARNIN] E: Unable to locate package ceph-mds
> [socrate.flox-arts.in][WARNIN] E: Unable to locate package ceph-fs-common
> [socrate.flox-arts.in][ERROR ] RuntimeError: command returned non-zero exit status: 100
> [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get -q -o Dpkg::Options::=--force-confnew --no-install-recommends --assume-yes install -- ceph ceph-mds ceph-common ceph-fs-common gdisk
>
> Do you know how I can get these 2 packages on this platform?
> Thanks
>
> Florent Monthel
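For reference, enabling wheezy-backports is a one-line sources.list entry plus a pinned install. The sketch below writes the entry to a temp file so it is safe to run anywhere; on the actual box the target would be /etc/apt/sources.list.d/, and the mirror URL should be checked against an armhf-capable mirror (whether the 0.80.7 packages are actually built for armhf is not verified here):

```shell
# Demonstration: write the backports entry to a temp file instead of
# /etc/apt/sources.list.d/backports.list so this is safe to run.
list=$(mktemp)
echo "deb http://ftp.debian.org/debian wheezy-backports main" > "$list"
cat "$list"
# On the real system, afterwards:
#   sudo apt-get update
#   sudo apt-get -t wheezy-backports install ceph ceph-mds ceph-fs-common
```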
Re: [ceph-users] initial attempt at ceph-deploy fails name resolution
Hm. Seems like the problem might be deeper than I thought. It doesn't seem to
resolve FQDNs either, although it appears to be connected remotely, which
would have required /some/ sort of name resolution.

--rich

ceph@adriatic:~/my-cluster$ ceph-deploy new adriatic.noir.com
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy new adriatic.noir.com
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[adriatic.noir.com][DEBUG ] connection detected need for sudo
[adriatic.noir.com][DEBUG ] connected to host: adriatic.noir.com
[adriatic.noir.com][DEBUG ] detect platform information from remote host
[adriatic.noir.com][DEBUG ] detect machine type
[adriatic.noir.com][DEBUG ] find the location of an executable
[adriatic.noir.com][INFO  ] Running command: sudo /bin/ip link show
[adriatic.noir.com][INFO  ] Running command: sudo /bin/ip addr show
[adriatic.noir.com][DEBUG ] IP addresses found: ['192.168.1.76']
[ceph_deploy.new][DEBUG ] Resolving host adriatic.noir.com
[ceph_deploy][ERROR ] UnableToResolveError: Unable to resolve host: adriatic.noir.com

On 11/30/14 12:53, K Richard Pixley wrote:
> My initial attempt at ceph-deploy is failing on name resolution. Yet ping,
> ssh, etc., all work. What is the name resolution test my machine needs to
> pass in order for ceph-deploy to work?
>
> --rich
>
> ceph@adriatic:~/my-cluster$ ceph-deploy new adriatic
> [ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy new adriatic
> [ceph_deploy.new][DEBUG ] Creating new cluster named ceph
> [ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
> [adriatic][DEBUG ] connection detected need for sudo
> [adriatic][DEBUG ] connected to host: adriatic
> [adriatic][DEBUG ] detect platform information from remote host
> [adriatic][DEBUG ] detect machine type
> [adriatic][DEBUG ] find the location of an executable
> [adriatic][INFO  ] Running command: sudo /bin/ip link show
> [adriatic][INFO  ] Running command: sudo /bin/ip addr show
> [adriatic][DEBUG ] IP addresses found: ['192.168.1.76']
> [ceph_deploy.new][DEBUG ] Resolving host adriatic
> [ceph_deploy][ERROR ] UnableToResolveError: Unable to resolve host: adriatic
> ceph@adriatic:~/my-cluster$ ping -c 2 adriatic
> PING adriatic.noir.com (127.0.1.1) 56(84) bytes of data.
> 64 bytes from adriatic.noir.com (127.0.1.1): icmp_seq=1 ttl=64 time=0.043 ms
> 64 bytes from adriatic.noir.com (127.0.1.1): icmp_seq=2 ttl=64 time=0.059 ms
>
> --- adriatic.noir.com ping statistics ---
> 2 packets transmitted, 2 received, 0% packet loss, time 1001ms
> rtt min/avg/max/mdev = 0.043/0.051/0.059/0.008 ms
> ceph@adriatic:~/my-cluster$ cat /etc/resolv.conf
> # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
> #     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
> nameserver 192.168.1.13
> nameserver 198.144.192.2
> nameserver 198.144.192.4
> search noir.com
> domain noir.com
Re: [ceph-users] Giant + nfs over cephfs hang tasks
Greg, thanks for your comment. Could you please share what OS, kernel, and
nfs/cephfs settings you've used to achieve that stability? Also, what kind of
tests have you run to check it? Thanks

----- Original Message -----
> From: "Gregory Farnum"
> To: "Ilya Dryomov", "Andrei Mikhailovsky"
> Cc: "ceph-users"
> Sent: Saturday, 29 November, 2014 10:19:32 PM
> Subject: Re: [ceph-users] Giant + nfs over cephfs hang tasks
>
> Ilya, do you have a ticket reference for the bug?
>
> Andrei, we run NFS tests on CephFS in our nightlies and it does pretty
> well, so in the general case we expect it to work. Obviously not at the
> moment with whatever bug Ilya is looking at, though. ;)
> -Greg
>
> On Sat, Nov 29, 2014 at 4:51 AM Ilya Dryomov <ilya.dryo...@inktank.com> wrote:
>> On Sat, Nov 29, 2014 at 3:49 PM, Ilya Dryomov <ilya.dryo...@inktank.com> wrote:
>>> On Sat, Nov 29, 2014 at 3:22 PM, Andrei Mikhailovsky <and...@arhont.com> wrote:
>>>> Ilya,
>>>>
>>>> I think I spoke too soon in my last message. I've now given it more
>>>> load (running 8 concurrent dds with bs=4M) and about a minute or so
>>>> after starting I've seen problems in the dmesg output. I am attaching
>>>> the kern.log file for your reference.
>>>>
>>>> Please check starting with the following line: Nov 29 12:07:38
>>>> arh-ibstorage1-ib kernel: [ 3831.906510]. This is when I started the
>>>> 8 concurrent dds.
>>>>
>>>> The command that caused this is:
>>>>
>>>> time dd if=/dev/zero of=4G00 bs=4M count=5K oflag=direct & time dd
>>>> if=/dev/zero of=4G11 bs=4M count=5K oflag=direct & time dd
>>>> if=/dev/zero of=4G22 bs=4M count=5K oflag=direct & time dd
>>>> if=/dev/zero of=4G33 bs=4M count=5K oflag=direct & time dd
>>>> if=/dev/zero of=4G44 bs=4M count=5K oflag=direct & time dd
>>>> if=/dev/zero of=4G55 bs=4M count=5K oflag=direct & time dd
>>>> if=/dev/zero of=4G66 bs=4M count=5K oflag=direct & time dd
>>>> if=/dev/zero of=4G77 bs=4M count=5K oflag=direct &
>>>>
>>>> I ran the same test about 10 times but with only 4 concurrent dds,
>>>> and that didn't cause the issue.
>>>>
>>>> Should I try the 3.18 kernel again to see if 8 dds produce similar
>>>> output?
>>>
>>> Missing attachment.
>>
>> Definitely try the 3.18 testing kernel.
>>
>> Thanks,
>> Ilya
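The eight-way dd command quoted above can be written more compactly as a loop. A scaled-down sketch (tiny bs/count, a temp directory, and no oflag=direct so it is harmless to run anywhere; restore bs=4M count=5K oflag=direct and point it at the NFS mount to reproduce the original load):

```shell
# Scaled-down stand-in for the 8 concurrent 4M direct writes.
dir=$(mktemp -d)
for i in 0 1 2 3 4 5 6 7; do
  dd if=/dev/zero of="$dir/4G$i$i" bs=4k count=1 2>/dev/null &
done
wait   # let all eight background dds finish
ls "$dir" | wc -l
```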
[ceph-users] initial attempt at ceph-deploy fails name resolution
My initial attempt at ceph-deploy is failing on name resolution. Yet ping,
ssh, etc., all work. What is the name resolution test my machine needs to
pass in order for ceph-deploy to work?

--rich

ceph@adriatic:~/my-cluster$ ceph-deploy new adriatic
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy new adriatic
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[adriatic][DEBUG ] connection detected need for sudo
[adriatic][DEBUG ] connected to host: adriatic
[adriatic][DEBUG ] detect platform information from remote host
[adriatic][DEBUG ] detect machine type
[adriatic][DEBUG ] find the location of an executable
[adriatic][INFO  ] Running command: sudo /bin/ip link show
[adriatic][INFO  ] Running command: sudo /bin/ip addr show
[adriatic][DEBUG ] IP addresses found: ['192.168.1.76']
[ceph_deploy.new][DEBUG ] Resolving host adriatic
[ceph_deploy][ERROR ] UnableToResolveError: Unable to resolve host: adriatic
ceph@adriatic:~/my-cluster$ ping -c 2 adriatic
PING adriatic.noir.com (127.0.1.1) 56(84) bytes of data.
64 bytes from adriatic.noir.com (127.0.1.1): icmp_seq=1 ttl=64 time=0.043 ms
64 bytes from adriatic.noir.com (127.0.1.1): icmp_seq=2 ttl=64 time=0.059 ms

--- adriatic.noir.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.043/0.051/0.059/0.008 ms
ceph@adriatic:~/my-cluster$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 192.168.1.13
nameserver 198.144.192.2
nameserver 198.144.192.4
search noir.com
domain noir.com
Re: [ceph-users] Revisiting MDS memory footprint
On 11/28/2014 03:36 PM, Florian Haas wrote:
> On Fri, Nov 28, 2014 at 3:29 PM, Wido den Hollander wrote:
>> On 11/28/2014 03:22 PM, Florian Haas wrote:
>>> On Fri, Nov 28, 2014 at 3:14 PM, Wido den Hollander wrote:
>>>> On 11/28/2014 01:04 PM, Florian Haas wrote:
>>>>> Hi everyone,
>>>>>
>>>>> I'd like to come back to a discussion from 2012 (thread at
>>>>> http://marc.info/?l=ceph-devel&m=134808745719233) to estimate the
>>>>> expected MDS memory consumption from file metadata caching. I am
>>>>> certain the following is full of untested assumptions, some of which
>>>>> are probably inaccurate, so please shoot those down as needed.
>>>>>
>>>>> I did an entirely unscientific study of a real data set (my laptop,
>>>>> in case you care to know) which currently holds about 70G worth of
>>>>> data in a huge variety of file sizes and several file systems, and
>>>>> currently lists about 944,000 inodes as being in use. So going purely
>>>>> by order of magnitude and doing a wild approximation, I'll assume a
>>>>> ratio of 1 million files in 100G, or 10,000 files per gigabyte, which
>>>>> means an average file size of about 100KB -- again, approximating and
>>>>> forgetting about the difference between 10^3 and 2^10, and using a
>>>>> stupid arithmetic mean rather than a median which would probably be
>>>>> much more useful.
>>>>>
>>>>> If I were to assume that all those files were in CephFS, and they
>>>>> were all somehow regularly in use (or at least one file in each
>>>>> directory), then the Ceph MDS would have to keep the metadata of all
>>>>> those files in cache. Suppose further that the stat struct for all
>>>>> those files is anywhere between 1 and 2KB, and we go by an average of
>>>>> 1.5KB metadata per file including some overhead, then that would mean
>>>>> the average metadata per file is about 1.5% of the average file size.
>>>>> So for my 100G of data, the MDS would use about 1.5G of RAM for
>>>>> caching.
>>>>>
>>>>> If you scale that up for a filestore of say a petabyte, that means
>>>>> all your Ceph MDSs would consume a relatively whopping 15TB in total
>>>>> RAM for metadata caching, again assuming that *all* the data is
>>>>> actually used by clients.
>>>>
>>>> Why do you assume that ALL MDSs keep ALL metadata in memory? Isn't the
>>>> whole point of directory fragmentation that they all keep a bit of the
>>>> inodes in memory to spread the load?
>>>
>>> Directory subtree partitioning is considered neither stable nor
>>> supported. Hence why it's important to understand what a single active
>>> MDS will hold.
>>
>> Understood. So it's about sizing your MDS right now, not in the future
>> when the subtree partitioning works. :)
>
> Correct.
>
>> Isn't the memory consumption also influenced by mds_cache_size? That is
>> the number of inodes the MDS will cache in memory.
>>
>> Something that is not in cache will be read from RADOS afaik, so there
>> will be a limit to how much memory the MDS will consume.
>
> I am acutely aware of that, but this is not about *limiting* MDS memory
> consumption. It's about "if I wanted to make sure that all my metadata
> fits in the cache, how much memory would I need for that?"

Understood. I hope someone else with more knowledge about this chimes in
here. But let's make an analogy with ZFS, for example: you size your
(L2)ARC there based on your hot data. Why would you want all CephFS
metadata in memory? With any filesystem that would be a problem. We do,
however, need a good rule of thumb for how much memory is used per inode.

> Also as a corollary to this discussion, I'm not sure if anyone has
> actually run any stats on CephFS performance (read/write throughput,
> latency, and IOPS) as a function of cache hit/miss ratio. In other words
> I don't know, and I'm not sure if anyone knows, what the actual impact
> of MDS cache misses is. I am just assuming it would be quite
> significant, otherwise I can't imagine why Sage would have come up with
> the idea of a metadata-caching MDS in the first place. :)
>
> Now of course it's entirely unrealistic that in a production system data
> is actually ever used across the board, but are the above considerations
> "close enough" for a rule-of-thumb approximation of MDS memory
> footprint? As in,
>
> Total MDS RAM = (Total used storage) * (fraction of data in regular use) * 0.015
>
> If CephFS users could use a rule of thumb like that, it would help them
> answer questions like "given a filesystem of size X, will a single MDS
> be enough to hold my metadata caches if Y is the maximum amount of
> memory I can afford for budget Z".
>
> All thoughts and comments much appreciated. Thank you!
>
> Cheers,
> Florian

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
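Florian's proposed rule of thumb is easy to plug numbers into. A sketch with assumed example figures (1 PB used, 20% of data in regular use; the 0.015 factor is the thread's 1.5KB-metadata-per-100KB-average-file estimate, not a measured value):

```shell
used_tb=1000        # total used storage, in TB (1 PB, example figure)
hot_fraction=0.2    # fraction of data in regular use (assumed)
# Total MDS RAM = used * hot_fraction * 0.015
awk -v u="$used_tb" -v f="$hot_fraction" \
  'BEGIN { printf "Approx. MDS cache RAM: %.1f TB\n", u * f * 0.015 }'
# prints: Approx. MDS cache RAM: 3.0 TB
```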
[ceph-users] ceph-fs-common & ceph-mds on ARM Raspberry Debian 7.6
Hi,

I’m trying to deploy CEPH (with ceph-deploy) on Raspberry Debian 7.6 and I
get the error below on the ceph-deploy install command:

[socrate.flox-arts.in][INFO  ] Running command: env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get -q -o Dpkg::Options::=--force-confnew --no-install-recommends --assume-yes install -- ceph ceph-mds ceph-common ceph-fs-common gdisk
[socrate.flox-arts.in][DEBUG ] Reading package lists...
[socrate.flox-arts.in][DEBUG ] Building dependency tree...
[socrate.flox-arts.in][DEBUG ] Reading state information...
[socrate.flox-arts.in][WARNIN] E: Unable to locate package ceph-mds
[socrate.flox-arts.in][WARNIN] E: Unable to locate package ceph-fs-common
[socrate.flox-arts.in][ERROR ] RuntimeError: command returned non-zero exit status: 100
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get -q -o Dpkg::Options::=--force-confnew --no-install-recommends --assume-yes install -- ceph ceph-mds ceph-common ceph-fs-common gdisk

Do you know how I can get these 2 packages on this platform?
Thanks

Florent Monthel
[ceph-users] osd was crashing on start - journal flush
I had a problem with an OSD starting; the log seemed to show the journal was
the problem. When I tried to flush the journal I got the errors below. I was
in a hurry, so I attached a spare SSD partition as a new journal, which fixed
the problem and let it heal. To fix it for the original SSD journal, should I
have cleared the SSD partition using dd?

errors:

ceph-osd -i 1 --flush-journal
2014-12-01 01:16:05.387607 7f4133d18780 -1 filestore(/var/lib/ceph/osd/ceph-1) FileStore::_setattrs: chain_setxattr returned -14
os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f4133d18780 time 2014-12-01 01:16:05.387752
os/FileStore.cc: 2559: FAILED assert(0 == "unexpected error")
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x9ff) [0x9c63ff]
 2: (FileStore::_do_transactions(std::list >&, unsigned long, ThreadPool::TPHandle*)+0x6c) [0x9c9efc]
 3: (JournalingObjectStore::journal_replay(unsigned long)+0x985) [0x9dfdc5]
 4: (FileStore::mount()+0x32dd) [0x9b654d]
 5: (main()+0xe20) [0x731dc0]
 6: (__libc_start_main()+0xfd) [0x7f4131db8ead]
 7: ceph-osd() [0x736ea9]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.

2014-12-01 01:16:05.390458 7f4133d18780 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f4133d18780 time 2014-12-01 01:16:05.387752
os/FileStore.cc: 2559: FAILED assert(0 == "unexpected error")
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x9ff) [0x9c63ff]
 2: (FileStore::_do_transactions(std::list >&, unsigned long, ThreadPool::TPHandle*)+0x6c) [0x9c9efc]
 3: (JournalingObjectStore::journal_replay(unsigned long)+0x985) [0x9dfdc5]
 4: (FileStore::mount()+0x32dd) [0x9b654d]
 5: (main()+0xe20) [0x731dc0]
 6: (__libc_start_main()+0xfd) [0x7f4131db8ead]
 7: ceph-osd() [0x736ea9]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.

 -4> 2014-12-01 01:16:05.387607 7f4133d18780 -1 filestore(/var/lib/ceph/osd/ceph-1) FileStore::_setattrs: chain_setxattr returned -14
  0> 2014-12-01 01:16:05.390458 7f4133d18780 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f4133d18780 time 2014-12-01 01:16:05.387752
os/FileStore.cc: 2559: FAILED assert(0 == "unexpected error")
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x9ff) [0x9c63ff]
 2: (FileStore::_do_transactions(std::list >&, unsigned long, ThreadPool::TPHandle*)+0x6c) [0x9c9efc]
 3: (JournalingObjectStore::journal_replay(unsigned long)+0x985) [0x9dfdc5]
 4: (FileStore::mount()+0x32dd) [0x9b654d]
 5: (main()+0xe20) [0x731dc0]
 6: (__libc_start_main()+0xfd) [0x7f4131db8ead]
 7: ceph-osd() [0x736ea9]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.

terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (Aborted) **
 in thread 7f4133d18780
 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: ceph-osd() [0xab54a2]
 2: (()+0xf0a0) [0x7f413325b0a0]
 3: (gsignal()+0x35) [0x7f4131dcc165]
 4: (abort()+0x180) [0x7f4131dcf3e0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f413262389d]
 6: (()+0x63996) [0x7f4132621996]
 7: (()+0x639c3) [0x7f41326219c3]
 8: (()+0x63bee) [0x7f4132621bee]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x40a) [0xb8f97a]
 10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x9ff) [0x9c63ff]
 11: (FileStore::_do_transactions(std::list >&, unsigned long, ThreadPool::TPHandle*)+0x6c) [0x9c9efc]
 12: (JournalingObjectStore::journal_replay(unsigned long)+0x985) [0x9dfdc5]
 13: (FileStore::mount()+0x32dd) [0x9b654d]
 14: (main()+0xe20) [0x731dc0]
 15: (__libc_start_main()+0xfd) [0x7f4131db8ead]
 16: ceph-osd() [0x736ea9]
2014-12-01 01:16:05.394614 7f4133d18780 -1 *** Caught signal (Aborted) **
 in thread 7f4133d18780
 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 1: ceph-osd() [0xab54a2]
 2: (()+0xf0a0) [0x7f413325b0a0]
 3: (gsignal()+0x35) [0x7f4131dcc165]
 4: (abort()+0x180) [0x7f4131dcf3e0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f413262389d]
 6: (()+0x63996) [0x7f4132621996]
 7: (()+0x639c3) [0x7f41326219c3]
 8: (()+0x63bee) [0x7f4132621bee]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x40a) [0xb8f97a]
 10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x
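To the dd question at the top of this message: one plausible recovery for the original journal partition would be to wipe its stale header and let ceph-osd write a fresh journal. A sketch only, not verified against this cluster: the device name is a placeholder, the steps assume a firefly (0.80.x) filestore OSD, and the dd is destructive, so double-check everything before running it.

```
# Sketch; /dev/sdX1 is a placeholder, verify the journal device first.
service ceph stop osd.1
dd if=/dev/zero of=/dev/sdX1 bs=1M count=100   # wipe the stale journal header
ceph-osd -i 1 --mkjournal                      # write a fresh journal
service ceph start osd.1
```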