Re: [ceph-users] v0.87 Giant released
Dear experts, could you provide some guidance on upgrading Ceph from Firefly to Giant? Many thanks!

2014-10-30 15:37 GMT+07:00 Joao Eduardo Luis <joao.l...@inktank.com>:

> On 10/30/2014 05:54 AM, Sage Weil wrote:
>> On Thu, 30 Oct 2014, Nigel Williams wrote:
>>> On 30/10/2014 8:56 AM, Sage Weil wrote:
>>>> * *Degraded vs misplaced*: the Ceph health reports from 'ceph -s' and
>>>>   related commands now make a distinction between data that is degraded
>>>>   (there are fewer than the desired number of copies) and data that is
>>>>   misplaced (stored in the wrong location in the cluster).
>>>
>>> Is someone able to briefly describe how/why misplaced happens please,
>>> and is it repaired eventually? I've not seen misplaced (yet).
>>
>> Sure. An easy way to get misplaced objects is to do 'ceph osd out N' on
>> an OSD. Nothing is down, we still have as many copies as we had before,
>> but Ceph now wants to move them somewhere else. Starting with giant, you
>> will see the misplaced % in 'ceph -s' and not degraded.
>>
>>>> leveldb_write_buffer_size = 32*1024*1024  = 33554432  // 32MB
>>>> leveldb_cache_size        = 512*1024*1204 = 536870912 // 512MB
>>>
>>> I noticed the typo, wondered about the code, but I'm not seeing the same
>>> values anyway?
>>>
>>> https://github.com/ceph/ceph/blob/giant/src/common/config_opts.h
>>> OPTION(leveldb_write_buffer_size, OPT_U64, 8*1024*1024) // leveldb write buffer size
>>> OPTION(leveldb_cache_size, OPT_U64, 128*1024*1024) // leveldb cache size
>>
>> Hmm! Not sure where that 32MB number came from. I'll fix it, thanks!
>
> Those just happen to be the values used on the monitors (in ceph_mon.cc).
> Maybe that's where the mix-up came from. :)
>
> -Joao
>
> --
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
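For reference, the release notes below give the upgrade order as monitors, then OSDs, then MDSs and/or radosgw. A minimal sketch of that sequence, assuming Debian/Ubuntu packages from a repository already pointed at the giant release and sysvinit-managed daemons (adjust the service invocations for your init system; this is an illustration, not the official procedure):

    # 1. On every node, install the Giant packages.
    apt-get update && apt-get install -y ceph ceph-common

    # 2. Restart daemons in the documented order, checking health between steps.
    service ceph restart mon      # monitors first, one node at a time
    ceph -s                       # wait for HEALTH_OK before continuing
    service ceph restart osd      # then OSDs
    ceph -s
    service ceph restart mds      # finally MDSs
    # radosgw, if present, is typically restarted via its own service script,
    # e.g.: service radosgw restart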
Re: [ceph-users] v0.87 Giant released
On 10/30/2014 05:54 AM, Sage Weil wrote:
> On Thu, 30 Oct 2014, Nigel Williams wrote:
>> On 30/10/2014 8:56 AM, Sage Weil wrote:
>>> * *Degraded vs misplaced*: the Ceph health reports from 'ceph -s' and
>>>   related commands now make a distinction between data that is degraded
>>>   (there are fewer than the desired number of copies) and data that is
>>>   misplaced (stored in the wrong location in the cluster).
>>
>> Is someone able to briefly describe how/why misplaced happens please,
>> and is it repaired eventually? I've not seen misplaced (yet).
>
> Sure. An easy way to get misplaced objects is to do 'ceph osd out N' on
> an OSD. Nothing is down, we still have as many copies as we had before,
> but Ceph now wants to move them somewhere else. Starting with giant, you
> will see the misplaced % in 'ceph -s' and not degraded.
>
>>> leveldb_write_buffer_size = 32*1024*1024  = 33554432  // 32MB
>>> leveldb_cache_size        = 512*1024*1204 = 536870912 // 512MB
>>
>> I noticed the typo, wondered about the code, but I'm not seeing the same
>> values anyway?
>>
>> https://github.com/ceph/ceph/blob/giant/src/common/config_opts.h
>> OPTION(leveldb_write_buffer_size, OPT_U64, 8*1024*1024) // leveldb write buffer size
>> OPTION(leveldb_cache_size, OPT_U64, 128*1024*1024) // leveldb cache size
>
> Hmm! Not sure where that 32MB number came from. I'll fix it, thanks!

Those just happen to be the values used on the monitors (in ceph_mon.cc). Maybe that's where the mix-up came from. :)

-Joao

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
[ceph-users] v0.87 Giant released
This release will form the basis for the Giant stable series, v0.87.x.

Highlights for Giant include:

* *RADOS Performance*: a range of improvements have been made in the OSD and client-side librados code that improve throughput on flash backends and improve parallelism and scaling on fast machines.

* *CephFS*: we have fixed a raft of bugs in CephFS and built some basic journal recovery and diagnostic tools. Stability and performance of single-MDS systems is vastly improved in Giant. Although we do not yet recommend CephFS for production deployments, we do encourage testing for non-critical workloads so that we can better gauge the feature, usability, performance, and stability gaps.

* *Local Recovery Codes*: the OSDs now support an erasure-coding scheme that stores some additional data blocks to reduce the IO required to recover from single OSD failures.

* *Degraded vs misplaced*: the Ceph health reports from 'ceph -s' and related commands now make a distinction between data that is degraded (there are fewer than the desired number of copies) and data that is misplaced (stored in the wrong location in the cluster). The distinction is important because the latter does not compromise data safety.

* *Tiering improvements*: we have made several improvements to the cache tiering implementation that improve performance. Most notably, objects are not promoted into the cache tier by a single read; they must be found to be sufficiently hot before that happens.

* *Monitor performance*: the monitors now perform writes to the local data store asynchronously, improving overall responsiveness.

* *Recovery tools*: the ceph_objectstore_tool is greatly expanded to allow manipulation of an individual OSD's data store for debugging and repair purposes. This is most heavily used by our QA infrastructure to exercise recovery code.

Upgrade Sequencing
------------------

* If your existing cluster is running a version older than v0.80.x Firefly, please first upgrade to the latest Firefly release before moving on to Giant. We have not tested upgrades directly from Emperor, Dumpling, or older releases. We *have* tested:

  * Firefly to Giant
  * Dumpling to Firefly to Giant

* Please upgrade daemons in the following order:

  #. Monitors
  #. OSDs
  #. MDSs and/or radosgw

  Note that the relative ordering of OSDs and monitors should not matter, but we primarily tested upgrading monitors first.

Upgrading from v0.80.x Firefly
------------------------------

* The client-side caching for librbd is now enabled by default (rbd cache = true). A safety option (rbd cache writethrough until flush = true) is also enabled so that writeback caching is not used until the library observes a 'flush' command, indicating that the librbd user is passing that operation through from the guest VM. This avoids potential data loss when used with older versions of qemu that do not support flush.

* The 'rados getxattr ...' command used to add a gratuitous newline to the attr value; it now does not.

* The ``*_kb perf`` counters on the monitor have been removed. These are replaced with a new set of ``*_bytes`` counters (e.g., ``cluster_osd_kb`` is replaced by ``cluster_osd_bytes``).
* The ``rd_kb`` and ``wr_kb`` fields in the JSON dumps for pool stats (accessed via the ``ceph df detail -f json-pretty`` and related commands) have been replaced with corresponding ``*_bytes`` fields. Similarly, the ``total_space``, ``total_used``, and ``total_avail`` fields are replaced with ``total_bytes``, ``total_used_bytes``, and ``total_avail_bytes`` fields.

* The ``rados df --format=json`` output ``read_bytes`` and ``write_bytes`` fields were incorrectly reporting ops; this is now fixed.

* The ``rados df --format=json`` output previously included ``read_kb`` and ``write_kb`` fields; these have been removed. Please use ``read_bytes`` and ``write_bytes`` instead (and divide by 1024 if appropriate).

* The experimental keyvaluestore-dev OSD backend had an on-disk format change that prevents existing OSD data from being upgraded. This affects developers and testers only.

* mon-specific and osd-specific leveldb options have been removed. From this point onward users should use the ``leveldb_*`` generic options and add the options in the appropriate sections of their configuration files (see the sketch after these notes). Monitors will still maintain the following monitor-specific defaults:

    leveldb_write_buffer_size = 32*1024*1024  = 33554432  // 32MB
    leveldb_cache_size        = 512*1024*1204 = 536870912 // 512MB
    leveldb_block_size        = 64*1024       = 65536     // 64KB
    leveldb_compression       = false
    leveldb_log               =

  OSDs will still maintain the following osd-specific defaults:

    leveldb_log               =
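For illustration only, a sketch of how the librbd cache and leveldb changes above might look in ceph.conf after the upgrade. The values are the defaults quoted in these notes, not tuning recommendations, and the reference to old per-daemon option names (e.g. mon_leveldb_*) is an assumption about pre-Giant configurations:

    [client]
    # librbd writeback caching is now on by default; shown explicitly for clarity.
    rbd cache = true
    rbd cache writethrough until flush = true

    [mon]
    # Replace any old mon-specific leveldb options with the generic leveldb_*
    # names, placed in the [mon] section to scope them to monitors.
    leveldb_write_buffer_size = 33554432    # 32MB
    leveldb_cache_size        = 536870912   # 512MB
    leveldb_block_size        = 65536       # 64KB
    leveldb_compression       = false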
Re: [ceph-users] v0.87 Giant released
On 30/10/2014 8:56 AM, Sage Weil wrote:
> * *Degraded vs misplaced*: the Ceph health reports from 'ceph -s' and
>   related commands now make a distinction between data that is degraded
>   (there are fewer than the desired number of copies) and data that is
>   misplaced (stored in the wrong location in the cluster).

Is someone able to briefly describe how/why misplaced happens please, and is it repaired eventually? I've not seen misplaced (yet).

> leveldb_write_buffer_size = 32*1024*1024  = 33554432  // 32MB
> leveldb_cache_size        = 512*1024*1204 = 536870912 // 512MB

I noticed the typo, wondered about the code, but I'm not seeing the same values anyway?

https://github.com/ceph/ceph/blob/giant/src/common/config_opts.h
OPTION(leveldb_write_buffer_size, OPT_U64, 8*1024*1024) // leveldb write buffer size
OPTION(leveldb_cache_size, OPT_U64, 128*1024*1024) // leveldb cache size
Re: [ceph-users] v0.87 Giant released
On Thu, 30 Oct 2014 10:40:38 +1100 Nigel Williams wrote:
> On 30/10/2014 8:56 AM, Sage Weil wrote:
>> * *Degraded vs misplaced*: the Ceph health reports from 'ceph -s' and
>>   related commands now make a distinction between data that is degraded
>>   (there are fewer than the desired number of copies) and data that is
>>   misplaced (stored in the wrong location in the cluster).
>
> Is someone able to briefly describe how/why misplaced happens please,
> and is it repaired eventually? I've not seen misplaced (yet).

Any time there is a change in data placement, be it a change in the CRUSH map like modifying the weight of an OSD or simply adding a new OSD, objects are (temporarily) not where they're supposed to be, but they are still present in sufficient replication.

A much more benign scenario than degraded, and I hope that this doesn't even generate a WARN in the ceph -s report.

Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
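A minimal way to see such a placement change on a disposable test cluster; the OSD id 12 and the weights here are placeholders:

    # Change placement without taking anything down: lower the CRUSH weight.
    ceph osd crush reweight osd.12 0.5

    # PGs that now map elsewhere report their objects as misplaced (not
    # degraded) while backfill moves the data to the new locations.
    ceph -s

    # Restore the original weight once done observing.
    ceph osd crush reweight osd.12 1.0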
Re: [ceph-users] v0.87 Giant released
On 30/10/2014 11:51 AM, Christian Balzer wrote:
> Thus objects are (temporarily) not where they're supposed to be, but
> still present in sufficient replication.

Thanks for the reminder, I suppose that is obvious :-)

> A much more benign scenario than degraded and I hope that this doesn't
> even generate a WARN in the ceph -s report.

It is perhaps better described as a transitory hazardous state, given that the PG distribution might not be optimal for a period of time and (inopportune) failures may tip the health into degraded.
Re: [ceph-users] v0.87 Giant released
On Thu, 30 Oct 2014, Nigel Williams wrote:
> On 30/10/2014 8:56 AM, Sage Weil wrote:
>> * *Degraded vs misplaced*: the Ceph health reports from 'ceph -s' and
>>   related commands now make a distinction between data that is degraded
>>   (there are fewer than the desired number of copies) and data that is
>>   misplaced (stored in the wrong location in the cluster).
>
> Is someone able to briefly describe how/why misplaced happens please,
> and is it repaired eventually? I've not seen misplaced (yet).

Sure. An easy way to get misplaced objects is to do 'ceph osd out N' on an OSD. Nothing is down, we still have as many copies as we had before, but Ceph now wants to move them somewhere else. Starting with giant, you will see the misplaced % in 'ceph -s' and not degraded.

>> leveldb_write_buffer_size = 32*1024*1024  = 33554432  // 32MB
>> leveldb_cache_size        = 512*1024*1204 = 536870912 // 512MB
>
> I noticed the typo, wondered about the code, but I'm not seeing the same
> values anyway?
>
> https://github.com/ceph/ceph/blob/giant/src/common/config_opts.h
> OPTION(leveldb_write_buffer_size, OPT_U64, 8*1024*1024) // leveldb write buffer size
> OPTION(leveldb_cache_size, OPT_U64, 128*1024*1024) // leveldb cache size

Hmm! Not sure where that 32MB number came from. I'll fix it, thanks!

sage
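A quick way to reproduce what Sage describes on a throwaway cluster, assuming an OSD with id 3 (the id is a placeholder):

    # Mark the OSD out: it stays up, so no copies are lost, but CRUSH now
    # wants its PGs elsewhere and 'ceph -s' reports the objects as misplaced.
    ceph osd out 3
    ceph -s    # on Giant this shows a misplaced %, not degraded

    # Mark it back in to return the mapping to normal.
    ceph osd in 3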