Re: [Teuthology] Upgrade hammer on ubuntu : all passed
On 07/21/2015 12:24 AM, Loic Dachary wrote: teuthology-suite --ceph hammer-backports --machine-type openstack --suite upgrade/hammer-x --filter ubuntu_14.04 $HOME/src/ceph-qa-suite_master/machine_types/vps.yaml $(pwd)/teuthology/test/integration/archive-on-error.yaml Hi Loic, Job started http://ceph.aevoo.fr:8081/ubuntu-2015-07-21_05:02:44-upgrade:hammer-hammer---basic-openstack/ David -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: The design of the eviction improvement
> -Original Message- > From: ceph-devel-ow...@vger.kernel.org > [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Sage Weil > Sent: Tuesday, July 21, 2015 6:38 AM > To: Wang, Zhiqiang > Cc: sj...@redhat.com; ceph-devel@vger.kernel.org > Subject: Re: The design of the eviction improvement > > On Mon, 20 Jul 2015, Wang, Zhiqiang wrote: > > Hi all, > > > > This is a follow-up of one of the CDS session at > http://tracker.ceph.com/projects/ceph/wiki/Improvement_on_the_cache_tieri > ng_eviction. We discussed the drawbacks of the current eviction algorithm and > several ways to improve it. Seems like the LRU variants is the right way to > go. I > come up with some design points after the CDS, and want to discuss it with > you. > It is an approximate 2Q algorithm, combining some benefits of the clock > algorithm, similar to what the linux kernel does for the page cache. > > Unfortunately I missed this last CDS so I'm behind on the discussion. I have > a > few questions though... > > > # Design points: > > > > ## LRU lists > > - Maintain LRU lists at the PG level. > > The SharedLRU and SimpleLRU implementation in the current code have a > > max_size, which limits the max number of elements in the list. This > > mostly looks like a MRU, though its name implies they are LRUs. Since > > the object size may vary in a PG, it's not possible to caculate the > > total number of objects which the cache tier can hold ahead of time. > > We need a new LRU implementation with no limit on the size. > > This last sentence seems to me to be the crux of it. Assuming we have an > OSD based by flash storing O(n) objects, we need a way to maintain an LRU of > O(n) objects in memory. The current hitset-based approach was taken based > on the assumption that this wasn't feasible--or at least we didn't know how to > implmement such a thing. If it is, or we simply want to stipulate that cache > tier OSDs get gobs of RAM to make it possible, then lots of better options > become possible... > > Let's say you have a 1TB SSD, with an average object size of 1MB -- that's > 1 million objects. At maybe ~100bytes per object of RAM for an LRU entry > that's 100MB... so not so unreasonable, perhaps! I was having the same question before proposing this. I did the similar calculation and thought it would be ok to use this many memory :-) > > > - Two lists for each PG: active and inactive Objects are first put > > into the inactive list when they are accessed, and moved between these two > lists based on some criteria. > > Object flag: active, referenced, unevictable, dirty. > > - When an object is accessed: > > 1) If it's not in both of the lists, it's put on the top of the > > inactive list > > 2) If it's in the inactive list, and the referenced flag is not set, the > > referenced > flag is set, and it's moved to the top of the inactive list. > > 3) If it's in the inactive list, and the referenced flag is set, the > > referenced flag > is cleared, and it's removed from the inactive list, and put on top of the > active > list. > > 4) If it's in the active list, and the referenced flag is not set, the > > referenced > flag is set, and it's moved to the top of the active list. > > 5) If it's in the active list, and the referenced flag is set, it's moved > > to the top > of the active list. > > - When selecting objects to evict: > > 1) Objects at the bottom of the inactive list are selected to evict. They > > are > removed from the inactive list. 
> > 2) If the number of the objects in the inactive list becomes low, some of > > the > objects at the bottom of the active list are moved to the inactive list. For > those > objects which have the referenced flag set, they are given one more chance in > the active list. They are moved to the top of the active list with the > referenced > flag cleared. For those objects which don't have the referenced flag set, they > are moved to the inactive list, with the referenced flag set. So that they > can be > quickly promoted to the active list when necessary. > > > > ## Combine flush with eviction > > - When evicting an object, if it's dirty, it's flushed first. After > > flushing, it's > evicted. If not dirty, it's evicted directly. > > - This means that we won't have separate activities and won't set different > ratios for flush and evict. Is there a need to do so? > > - Number of objects to evict at a time. 'evict_effort' acts as the > > priority, which > is used to calculate the number of objects to evict. > > As someone else mentioned in a follow-up, the reason we let the dirty level be > set lower than the full level is that it provides headroom so that objects > can be > quickly evicted (delete, no flush) to make room for new writes or new > promotions. > > That said, we probably can/should streamline the flush so that an evict can > immediately follow without waiting for the agent to come around again. > (I don't think we do that now?) I was afraid of having to
RE: The design of the eviction improvement
Hi Nick, > -Original Message- > From: Nick Fisk [mailto:n...@fisk.me.uk] > Sent: Monday, July 20, 2015 5:28 PM > To: Wang, Zhiqiang; 'Sage Weil'; sj...@redhat.com; > ceph-devel@vger.kernel.org > Subject: RE: The design of the eviction improvement > > Hi, > > > -Original Message- > > From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- > > ow...@vger.kernel.org] On Behalf Of Wang, Zhiqiang > > Sent: 20 July 2015 09:47 > > To: Sage Weil ; sj...@redhat.com; ceph- > > de...@vger.kernel.org > > Subject: The design of the eviction improvement > > > > Hi all, > > > > This is a follow-up of one of the CDS session at > > http://tracker.ceph.com/projects/ceph/wiki/Improvement_on_the_cache_ > > tiering_eviction. We discussed the drawbacks of the current eviction > > algorithm and several ways to improve it. Seems like the LRU variants > > is > the > > right way to go. I come up with some design points after the CDS, and > > want to discuss it with you. It is an approximate 2Q algorithm, > > combining some benefits of the clock algorithm, similar to what the > > linux kernel does for > the > > page cache. > > > > # Design points: > > > > ## LRU lists > > - Maintain LRU lists at the PG level. > > The SharedLRU and SimpleLRU implementation in the current code have a > > max_size, which limits the max number of elements in the list. This > > mostly looks like a MRU, though its name implies they are LRUs. Since > > the object > size > > may vary in a PG, it's not possible to caculate the total number of > objects > > which the cache tier can hold ahead of time. We need a new LRU > > implementation with no limit on the size. > > - Two lists for each PG: active and inactive Objects are first put > > into > the > > inactive list when they are accessed, and moved between these two > > lists based on some criteria. > > Object flag: active, referenced, unevictable, dirty. > > - When an object is accessed: > > 1) If it's not in both of the lists, it's put on the top of the > > inactive > list > > 2) If it's in the inactive list, and the referenced flag is not set, > > the > referenced > > flag is set, and it's moved to the top of the inactive list. > > 3) If it's in the inactive list, and the referenced flag is set, the > referenced flag > > is cleared, and it's removed from the inactive list, and put on top of > > the > active > > list. > > 4) If it's in the active list, and the referenced flag is not set, the > referenced > > flag is set, and it's moved to the top of the active list. > > 5) If it's in the active list, and the referenced flag is set, it's > > moved > to the top > > of the active list. > > - When selecting objects to evict: > > 1) Objects at the bottom of the inactive list are selected to evict. > > They > are > > removed from the inactive list. > > 2) If the number of the objects in the inactive list becomes low, some > > of > the > > objects at the bottom of the active list are moved to the inactive list. > For > > those objects which have the referenced flag set, they are given one > > more chance in the active list. They are moved to the top of the > > active list > with the > > referenced flag cleared. For those objects which don't have the > > referenced flag set, they are moved to the inactive list, with the > > referenced flag > set. So > > that they can be quickly promoted to the active list when necessary. 
> > > > I really like this idea but just out of interest, there must be a point where > the > overhead of managing much larger lists of very cold objects starts to impact > on > the gains of having exactly the right objects in each tier. If 90% of your > hot IO is > in 10% of the total data, how much extra benefit would you get by tracking all > objects vs just tracking the top 10,20,30%...etc and evicting randomly after > that? If these objects are being accessed infrequently, the impact of > re-promoting is probably minimal and if the promotion code can get to a point > where it is being a bit more intelligent about what objects are promoted then > this is probably even more so? The idea is that the lists only hold the objects in the cache tier. For those objects which are cold enough, it's evicted from the cache tier and removed from the lists. Also, the lists are maintained at the PG level. I guess the lists won't be too extremely large? In your example of the 90%/10% data access, it may be right that randomly evicting the 90% cold data is good enough. But we need a way to know what the 10% of the hot data are. Also, we can't assume the 90%/10% pattern for every workload. > > > ## Combine flush with eviction > > - When evicting an object, if it's dirty, it's flushed first. After > flushing, it's > > evicted. If not dirty, it's evicted directly. > > - This means that we won't have separate activities and won't set > different > > ratios for flush and evict. Is there a need to do so? > > - Number of objects to evict at a time. 'evict_effort' acts as the > prio
Re: The design of the eviction improvement
On Mon, 20 Jul 2015, Wang, Zhiqiang wrote: > Hi all, > > This is a follow-up of one of the CDS session at > http://tracker.ceph.com/projects/ceph/wiki/Improvement_on_the_cache_tiering_eviction. > We discussed the drawbacks of the current eviction algorithm and several > ways to improve it. Seems like the LRU variants is the right way to go. I > come up with some design points after the CDS, and want to discuss it with > you. It is an approximate 2Q algorithm, combining some benefits of the clock > algorithm, similar to what the linux kernel does for the page cache. Unfortunately I missed this last CDS so I'm behind on the discussion. I have a few questions though... > # Design points: > > ## LRU lists > - Maintain LRU lists at the PG level. > The SharedLRU and SimpleLRU implementation in the current code have a > max_size, which limits the max number of elements in the list. This > mostly looks like a MRU, though its name implies they are LRUs. Since > the object size may vary in a PG, it's not possible to caculate the > total number of objects which the cache tier can hold ahead of time. We > need a new LRU implementation with no limit on the size. This last sentence seems to me to be the crux of it. Assuming we have an OSD based by flash storing O(n) objects, we need a way to maintain an LRU of O(n) objects in memory. The current hitset-based approach was taken based on the assumption that this wasn't feasible--or at least we didn't know how to implmement such a thing. If it is, or we simply want to stipulate that cache tier OSDs get gobs of RAM to make it possible, then lots of better options become possible... Let's say you have a 1TB SSD, with an average object size of 1MB -- that's 1 million objects. At maybe ~100bytes per object of RAM for an LRU entry that's 100MB... so not so unreasonable, perhaps! > - Two lists for each PG: active and inactive > Objects are first put into the inactive list when they are accessed, and > moved between these two lists based on some criteria. > Object flag: active, referenced, unevictable, dirty. > - When an object is accessed: > 1) If it's not in both of the lists, it's put on the top of the inactive list > 2) If it's in the inactive list, and the referenced flag is not set, the > referenced flag is set, and it's moved to the top of the inactive list. > 3) If it's in the inactive list, and the referenced flag is set, the > referenced flag is cleared, and it's removed from the inactive list, and put > on top of the active list. > 4) If it's in the active list, and the referenced flag is not set, the > referenced flag is set, and it's moved to the top of the active list. > 5) If it's in the active list, and the referenced flag is set, it's moved to > the top of the active list. > - When selecting objects to evict: > 1) Objects at the bottom of the inactive list are selected to evict. They are > removed from the inactive list. > 2) If the number of the objects in the inactive list becomes low, some of the > objects at the bottom of the active list are moved to the inactive list. For > those objects which have the referenced flag set, they are given one more > chance in the active list. They are moved to the top of the active list with > the referenced flag cleared. For those objects which don't have the > referenced flag set, they are moved to the inactive list, with the referenced > flag set. So that they can be quickly promoted to the active list when > necessary. 
> > ## Combine flush with eviction > - When evicting an object, if it's dirty, it's flushed first. After flushing, > it's evicted. If not dirty, it's evicted directly. > - This means that we won't have separate activities and won't set different > ratios for flush and evict. Is there a need to do so? > - Number of objects to evict at a time. 'evict_effort' acts as the priority, > which is used to calculate the number of objects to evict. As someone else mentioned in a follow-up, the reason we let the dirty level be set lower than the full level is that it provides headroom so that objects can be quickly evicted (delete, no flush) to make room for new writes or new promotions. That said, we probably can/should streamline the flush so that an evict can immediately follow without waiting for the agent to come around again. (I don't think we do that now?) sage > ## LRU lists Snapshotting > - The two lists are snapshotted persisted periodically. > - Only one copy needs to be saved. The old copy is removed when persisting > the lists. The saved lists are used to restore the LRU lists when OSD reboots. > > Any comments/feedbacks are welcomed. > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majo
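To make the list handling above concrete, here is a minimal, self-contained sketch of the two-list scheme: the five access rules, the refill of the inactive list from the bottom of the active list, and the flush-before-evict combination Sage mentions. This is illustrative C++ only, not the actual Ceph code or the proposal's implementation: the names (PGObjectLRU, Entry), the refill threshold, and the placeholder "flush" print are all assumptions made for the example, and object ids are plain strings rather than hobject_t.

// Illustrative sketch only -- not the actual Ceph implementation.
// Models the per-PG active/inactive lists and the referenced bit
// described in this thread; "flush" is just a print statement.
#include <iostream>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

struct Entry {
  std::string oid;
  bool referenced = false;
  bool dirty = false;
  bool in_active = false;
};

class PGObjectLRU {
  std::list<Entry> active, inactive;   // front = hottest, back = coldest
  std::unordered_map<std::string, std::list<Entry>::iterator> index;

public:
  void access(const std::string& oid) {
    auto it = index.find(oid);
    if (it == index.end()) {               // rule 1: unknown -> top of inactive
      Entry e; e.oid = oid;
      inactive.push_front(e);
      index[oid] = inactive.begin();
      return;
    }
    auto pos = it->second;
    if (!pos->in_active) {
      if (!pos->referenced) {              // rule 2: set the bit, stay in inactive
        pos->referenced = true;
        inactive.splice(inactive.begin(), inactive, pos);
      } else {                             // rule 3: promote to active, clear the bit
        pos->referenced = false;
        pos->in_active = true;
        active.splice(active.begin(), inactive, pos);
      }
    } else {                               // rules 4/5: move to top of active
      pos->referenced = true;              // rule 4 sets the bit, rule 5 keeps it
      active.splice(active.begin(), active, pos);
    }
  }

  // Evict up to n objects from the bottom of the inactive list, flushing
  // dirty ones first so the evict can follow the flush immediately.
  void evict(int n) {
    refill_inactive();
    while (n-- > 0 && !inactive.empty()) {
      Entry victim = inactive.back();
      if (victim.dirty)
        std::cout << "flush " << victim.oid << "\n";  // placeholder for a tier flush
      std::cout << "evict " << victim.oid << "\n";
      index.erase(victim.oid);
      inactive.pop_back();
    }
  }

private:
  // When the inactive list runs low, demote from the bottom of the active
  // list; referenced entries get a second chance instead of being demoted.
  void refill_inactive() {
    while (!active.empty() && inactive.size() < active.size() / 2) {  // threshold is arbitrary
      auto pos = std::prev(active.end());
      if (pos->referenced) {
        pos->referenced = false;                      // second chance in active
        active.splice(active.begin(), active, pos);
      } else {
        pos->referenced = true;                       // so it re-promotes quickly
        pos->in_active = false;
        inactive.splice(inactive.begin(), active, pos);
      }
    }
  }
};

int main() {
  PGObjectLRU lru;
  for (const char* oid : {"a", "b", "a", "a", "c"})
    lru.access(oid);
  lru.evict(2);   // the coldest inactive objects ("b", then "c") go first; "a" survives
  return 0;
}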
RE: The design of the eviction improvement
This seems much better than the current mechanism. Do you have an estimate of the memory consumption of the two lists? (In terms of bytes/object?) Allen Samuels Software Architect, Systems and Software Solutions 2880 Junction Avenue, San Jose, CA 95134 T: +1 408 801 7030| M: +1 408 780 6416 allen.samu...@sandisk.com -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Wang, Zhiqiang Sent: Monday, July 20, 2015 1:47 AM To: Sage Weil; sj...@redhat.com; ceph-devel@vger.kernel.org Subject: The design of the eviction improvement Hi all, This is a follow-up of one of the CDS session at http://tracker.ceph.com/projects/ceph/wiki/Improvement_on_the_cache_tiering_eviction. We discussed the drawbacks of the current eviction algorithm and several ways to improve it. Seems like the LRU variants is the right way to go. I come up with some design points after the CDS, and want to discuss it with you. It is an approximate 2Q algorithm, combining some benefits of the clock algorithm, similar to what the linux kernel does for the page cache. # Design points: ## LRU lists - Maintain LRU lists at the PG level. The SharedLRU and SimpleLRU implementation in the current code have a max_size, which limits the max number of elements in the list. This mostly looks like a MRU, though its name implies they are LRUs. Since the object size may vary in a PG, it's not possible to caculate the total number of objects which the cache tier can hold ahead of time. We need a new LRU implementation with no limit on the size. - Two lists for each PG: active and inactive Objects are first put into the inactive list when they are accessed, and moved between these two lists based on some criteria. Object flag: active, referenced, unevictable, dirty. - When an object is accessed: 1) If it's not in both of the lists, it's put on the top of the inactive list 2) If it's in the inactive list, and the referenced flag is not set, the referenced flag is set, and it's moved to the top of the inactive list. 3) If it's in the inactive list, and the referenced flag is set, the referenced flag is cleared, and it's removed from the inactive list, and put on top of the active list. 4) If it's in the active list, and the referenced flag is not set, the referenced flag is set, and it's moved to the top of the active list. 5) If it's in the active list, and the referenced flag is set, it's moved to the top of the active list. - When selecting objects to evict: 1) Objects at the bottom of the inactive list are selected to evict. They are removed from the inactive list. 2) If the number of the objects in the inactive list becomes low, some of the objects at the bottom of the active list are moved to the inactive list. For those objects which have the referenced flag set, they are given one more chance in the active list. They are moved to the top of the active list with the referenced flag cleared. For those objects which don't have the referenced flag set, they are moved to the inactive list, with the referenced flag set. So that they can be quickly promoted to the active list when necessary. ## Combine flush with eviction - When evicting an object, if it's dirty, it's flushed first. After flushing, it's evicted. If not dirty, it's evicted directly. - This means that we won't have separate activities and won't set different ratios for flush and evict. Is there a need to do so? - Number of objects to evict at a time. 
'evict_effort' acts as the priority, which is used to calculate the number of objects to evict. ## LRU lists Snapshotting - The two lists are snapshotted and persisted periodically. - Only one copy needs to be saved. The old copy is removed when persisting the lists. The saved lists are used to restore the LRU lists when the OSD reboots. Any comments/feedback are welcome. -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
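To put a rough number on the bytes-per-object question above, here is a back-of-the-envelope sketch. It is illustrative C++ only: the field sizes and the 32-byte hash-index overhead are assumptions, and a real entry would key on hobject_t (name, key, snap, hash, pool, namespace), which is larger than the fixed-size id used here.

// Rough per-object memory cost of an in-memory LRU entry.
// All sizes here are assumptions for estimation, not Ceph's actual types.
#include <cstddef>
#include <cstdint>
#include <cstdio>

struct LRUEntry {
  std::uint8_t flags;        // active / referenced / unevictable / dirty bits
  std::uint64_t atime;       // optional last-access hint
  LRUEntry* prev;            // intrusive links for the active or inactive list
  LRUEntry* next;
  char oid[64];              // stand-in for the object id; a real hobject_t is bigger
};

int main() {
  const std::size_t index_overhead = 32;   // assumed hash-map bookkeeping per entry
  const std::size_t per_object = sizeof(LRUEntry) + index_overhead;
  const std::size_t objects = 1000000;     // ~1 TB of cache at ~1 MB per object
  std::printf("per-object cost : %zu bytes\n", per_object);
  std::printf("for 1M objects  : %zu MB\n", objects * per_object / 1000000);
  return 0;
}

On a 64-bit build this comes out around 128 bytes per object, i.e. roughly 128 MB for a million objects, which is in the same ballpark as the ~100 bytes / ~100 MB estimate discussed in the thread.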
Re: [Teuthology] Upgrade hammer on ubuntu : all passed
Hi David, Would you agree to run a similar suite against the hammer-backports branch ? It already is scheduled http://pulpito.ceph.com/loic-2015-07-20_16:52:10-upgrade:hammer-x-hammer-backports-distro-basic-multi/ but maybe you can complete it faster. The command is: teuthology-suite --ceph hammer-backports --machine-type openstack --suite upgrade/hammer-x --filter ubuntu_14.04 $HOME/src/ceph-qa-suite_master/machine_types/vps.yaml $(pwd)/teuthology/test/integration/archive-on-error.yaml Cheers On 20/07/2015 12:56, David Casier AEVOO wrote: > Hi all, > Good news for upgrade hammer on Ubuntu : > http://ceph.aevoo.fr:8081/ubuntu-2015-07-19_05:44:18-upgrade:hammer-hammer---basic-openstack/ > All jobs are passed. > > David > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Loïc Dachary, Artisan Logiciel Libre signature.asc Description: OpenPGP digital signature
Re: dmcrypt with luks keys in hammer
On Mon, Jul 20, 2015 at 6:21 PM, Sage Weil wrote: > On Mon, 20 Jul 2015, Wyllys Ingersoll wrote: >> No luck with ceph-disk-activate (all or just one device). >> >> $ sudo ceph-disk-activate /dev/sdv1 >> mount: unknown filesystem type 'crypto_LUKS' >> ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', >> 'crypto_LUKS', '-o', '', '--', '/dev/sdv1', >> '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32 >> >> >> It's odd that it should complain about the "crypto_LUKS" filesystem not >> being recognized, because it did mount some of the LUKS systems >> successfully, though sometimes just the data and not the journal >> (or vice versa). >> >> $ lsblk /dev/sdb >> NAMEMAJ:MIN RM SIZE RO >> TYPE MOUNTPOINT >> sdb 8:16 0 3.7T 0 disk >> ├─sdb18:17 0 3.6T 0 part >> │ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:00 3.6T 0 >> crypt /var/lib/ceph/osd/ceph-54 >> └─sdb28:18 010G 0 part >> └─temporary-cryptsetup-1235 (dm-6)252:60 125K 1 crypt >> >> >> $ blkid /dev/sdb1 >> /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS" >> >> >> A race condition (or other issue) with udev seems likely given that >> it's rather random which ones come up and which ones don't. > > A race condition during creation or activation? If it's activation I > would expect ceph-disk activate ... to work reasonably reliably when > called manually (on a single device at a time). > > sage > I'm not sure. I do know that all of the disks *did* work after the initial installation and activation, but they fail after reboot, and the failures are non-deterministic. I'm not really sure how to debug it any further. -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dmcrypt with luks keys in hammer
On Mon, 20 Jul 2015, Wyllys Ingersoll wrote: > No luck with ceph-disk-activate (all or just one device). > > $ sudo ceph-disk-activate /dev/sdv1 > mount: unknown filesystem type 'crypto_LUKS' > ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', > 'crypto_LUKS', '-o', '', '--', '/dev/sdv1', > '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32 > > > Its odd that it should complain about the "crypto_LUKS" filesystem not > being recognized, because it did mount some of the LUKS systems > successfully, though not sometimes just the data and not the journal > (or vice versa). > > $ lsblk /dev/sdb > NAMEMAJ:MIN RM SIZE RO > TYPE MOUNTPOINT > sdb 8:16 0 3.7T 0 disk > ??sdb18:17 0 3.6T 0 part > ? ??e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:00 3.6T 0 > crypt /var/lib/ceph/osd/ceph-54 > ??sdb28:18 010G 0 part > ??temporary-cryptsetup-1235 (dm-6)252:60 125K 1 crypt > > > $ blkid /dev/sdb1 > /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS" > > > A race condition (or other issue) with udev seems likely given that > its rather random which ones come up and which ones don't. A race condition during creation or activation? If it's activation I would expect ceph-disk activate ... to work reasonably reliably when called manually (on a single device at a time). sage > > > > > On Mon, Jul 20, 2015 at 5:22 PM, Sage Weil wrote: > > On Mon, 20 Jul 2015, Wyllys Ingersoll wrote: > >> Were running a cluster with Hammer v94.2 and are running into issues > >> with the Luks encrypted OSD data and journal partitions. The > >> installation goes smoothly and everything runs OK, but we've had to > >> reboot a couple of the storage nodes for various reasons and when they > >> come back online, a large number of OSD processes fail to start > >> because the LUKS encrypted partitions are not getting mounted > >> correctly. > >> > >> I'm not sure if it is a udev issue or a problem with the OSD process > >> itself, but the encrypted partitions end up getting mounted as > >> "temporary-cryptsetup-PID" and they never recover. From below, you > >> can see that some of the OSDs did come up correctly, but the majority > >> do not. We've seen this problem now on several storage nodes, and it > >> only occurs for those OSDs that used luks (the new default). The only > >> recovery that we've found is to wipe them all out and rebuild them > >> using "plain" dmcrypt (as it used to be). > >> > >> Using "blkid" on a partition that is in the "temporary-cryptsetup" > >> state, does show that it has the right ID_PART_ENTRY_UUID and TYPE > >> values and I can confirm that there is an associated key in > >> /etc/ceph/dmcrypt-keys, but it still isn't mounting correctly. > >> > >> $ sudo blkid -p -o udev /dev/sdv2 > >> ID_FS_UUID=87008c17-9e57-487d-8f8b-160f8f803d8b > >> ID_FS_UUID_ENC=87008c17-9e57-487d-8f8b-160f8f803d8b > >> ID_FS_VERSION=1 > >> ID_FS_TYPE=crypto_LUKS > >> ID_FS_USAGE=crypto > >> ID_PART_ENTRY_SCHEME=gpt > >> ID_PART_ENTRY_NAME=ceph\x20journal > >> ID_PART_ENTRY_UUID=e3eda67b-a2e0-4d22-a62e-d9bda5ecf8b1 > >> ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-35865ceff106 > >> ID_PART_ENTRY_NUMBER=2 > >> ID_PART_ENTRY_OFFSET=2048 > >> ID_PART_ENTRY_SIZE=20969473 > >> ID_PART_ENTRY_DISK=65:80 > >> > >> So Im checking to see if this is a known issue or if we are missing > >> something in the installation or configuration that would fix this > >> problem. 
> > > > This isn't a known issue, although I think we have seen problems in > > general with hosts with lots of OSDs not always coming up on boot. If it > > is specifically a problem with luks+dmcrypt that would be interesting! > > > > Does an explicit 'ceph-disk activate /dev/...' on one of the devices make > > it come up? And/or a 'ceph-disk activate-all'? If so that would indicate > > a race issue in udev. > > > > Thanks- > > sage > > > > > >> > >> -Wyllys Ingersoll > >> > >> > >> Ex: > >> $ lsblk -l > >> NAME MAJ:MIN RM SIZE RO TYPE > >> MOUNTPOINT > >> sda8:00 111.8G 0 disk > >> sda1 8:10 15.3G 0 part > >> [SWAP] > >> sda2 8:20 1K 0 part > >> sda5 8:50 96.5G 0 part / > >> sdb8:16 0 3.7T 0 disk > >> sdb1 8:17 0 3.6T 0 part > >> e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:00 3.6T 0 crypt > >> sdb2 8:18 010G 0 part > >> temporary-cryptsetup-1235 (dm-6) 252:60 125K 1 crypt > >> sdc
[ANN] ceph-deploy 1.5.26 released
Hi everyone, This is announcing a new release of ceph-deploy that focuses on usability improvements.
- Most of the help menus for ceph-deploy subcommands (e.g. “ceph-deploy mon” and “ceph-deploy osd”) have been improved to be more context aware, such that help for “ceph-deploy osd create --help” and “ceph-deploy osd zap --help” returns different output specific to the command. Previously it would show generic help for “ceph-deploy osd”. Additionally, the list of optional arguments shown for the command is always correct for the subcommand in question. Previously the options shown were the aggregate of all options.
- ceph-deploy now points to git.ceph.com for downloading GPG keys
- ceph-deploy will now work on the Mint Linux distribution (by pointing to Ubuntu packages)
- SUSE distro users will now be pointed to SUSE packages by default, as there have not been updated SUSE packages on ceph.com in quite some time.
Full changelog is available at: http://ceph.com/ceph-deploy/docs/changelog.html#id1 New packages are available in the usual places of ceph.com hosted repos and PyPI. Cheers, - Travis
-- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dmcrypt with luks keys in hammer
No luck with ceph-disk-activate (all or just one device). $ sudo ceph-disk-activate /dev/sdv1 mount: unknown filesystem type 'crypto_LUKS' ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', 'crypto_LUKS', '-o', '', '--', '/dev/sdv1', '/var/lib/ceph/tmp/mnt.QHe3zK']' returned non-zero exit status 32 Its odd that it should complain about the "crypto_LUKS" filesystem not being recognized, because it did mount some of the LUKS systems successfully, though not sometimes just the data and not the journal (or vice versa). $ lsblk /dev/sdb NAMEMAJ:MIN RM SIZE RO TYPE MOUNTPOINT sdb 8:16 0 3.7T 0 disk ├─sdb18:17 0 3.6T 0 part │ └─e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:00 3.6T 0 crypt /var/lib/ceph/osd/ceph-54 └─sdb28:18 010G 0 part └─temporary-cryptsetup-1235 (dm-6)252:60 125K 1 crypt $ blkid /dev/sdb1 /dev/sdb1: UUID="d6194096-a219-4732-8d61-d0c125c49393" TYPE="crypto_LUKS" A race condition (or other issue) with udev seems likely given that its rather random which ones come up and which ones don't. On Mon, Jul 20, 2015 at 5:22 PM, Sage Weil wrote: > On Mon, 20 Jul 2015, Wyllys Ingersoll wrote: >> Were running a cluster with Hammer v94.2 and are running into issues >> with the Luks encrypted OSD data and journal partitions. The >> installation goes smoothly and everything runs OK, but we've had to >> reboot a couple of the storage nodes for various reasons and when they >> come back online, a large number of OSD processes fail to start >> because the LUKS encrypted partitions are not getting mounted >> correctly. >> >> I'm not sure if it is a udev issue or a problem with the OSD process >> itself, but the encrypted partitions end up getting mounted as >> "temporary-cryptsetup-PID" and they never recover. From below, you >> can see that some of the OSDs did come up correctly, but the majority >> do not. We've seen this problem now on several storage nodes, and it >> only occurs for those OSDs that used luks (the new default). The only >> recovery that we've found is to wipe them all out and rebuild them >> using "plain" dmcrypt (as it used to be). >> >> Using "blkid" on a partition that is in the "temporary-cryptsetup" >> state, does show that it has the right ID_PART_ENTRY_UUID and TYPE >> values and I can confirm that there is an associated key in >> /etc/ceph/dmcrypt-keys, but it still isn't mounting correctly. >> >> $ sudo blkid -p -o udev /dev/sdv2 >> ID_FS_UUID=87008c17-9e57-487d-8f8b-160f8f803d8b >> ID_FS_UUID_ENC=87008c17-9e57-487d-8f8b-160f8f803d8b >> ID_FS_VERSION=1 >> ID_FS_TYPE=crypto_LUKS >> ID_FS_USAGE=crypto >> ID_PART_ENTRY_SCHEME=gpt >> ID_PART_ENTRY_NAME=ceph\x20journal >> ID_PART_ENTRY_UUID=e3eda67b-a2e0-4d22-a62e-d9bda5ecf8b1 >> ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-35865ceff106 >> ID_PART_ENTRY_NUMBER=2 >> ID_PART_ENTRY_OFFSET=2048 >> ID_PART_ENTRY_SIZE=20969473 >> ID_PART_ENTRY_DISK=65:80 >> >> So Im checking to see if this is a known issue or if we are missing >> something in the installation or configuration that would fix this >> problem. > > This isn't a known issue, although I think we have seen problems in > general with hosts with lots of OSDs not always coming up on boot. If it > is specifically a problem with luks+dmcrypt that would be interesting! > > Does an explicit 'ceph-disk activate /dev/...' on one of the devices make > it come up? And/or a 'ceph-disk activate-all'? If so that would indicate > a race issue in udev. 
> > Thanks- > sage > > >> >> -Wyllys Ingersoll >> >> >> Ex: >> $ lsblk -l >> NAME MAJ:MIN RM SIZE RO TYPE >> MOUNTPOINT >> sda8:00 111.8G 0 disk >> sda1 8:10 15.3G 0 part >> [SWAP] >> sda2 8:20 1K 0 part >> sda5 8:50 96.5G 0 part / >> sdb8:16 0 3.7T 0 disk >> sdb1 8:17 0 3.6T 0 part >> e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:00 3.6T 0 crypt >> sdb2 8:18 010G 0 part >> temporary-cryptsetup-1235 (dm-6) 252:60 125K 1 crypt >> sdc8:32 0 3.7T 0 disk >> sdc1 8:33 0 3.6T 0 part >> temporary-cryptsetup-1788 (dm-37)252:37 0 125K 1 crypt >> sdc2 8:34 010G 0 part >> temporary-cryptsetup-1789 (dm-36)252:36 0 125K 1 crypt >> sdd8:48 0 3.7T 0 disk >> sdd1
Re: dmcrypt with luks keys in hammer
On Mon, 20 Jul 2015, Wyllys Ingersoll wrote: > Were running a cluster with Hammer v94.2 and are running into issues > with the Luks encrypted OSD data and journal partitions. The > installation goes smoothly and everything runs OK, but we've had to > reboot a couple of the storage nodes for various reasons and when they > come back online, a large number of OSD processes fail to start > because the LUKS encrypted partitions are not getting mounted > correctly. > > I'm not sure if it is a udev issue or a problem with the OSD process > itself, but the encrypted partitions end up getting mounted as > "temporary-cryptsetup-PID" and they never recover. From below, you > can see that some of the OSDs did come up correctly, but the majority > do not. We've seen this problem now on several storage nodes, and it > only occurs for those OSDs that used luks (the new default). The only > recovery that we've found is to wipe them all out and rebuild them > using "plain" dmcrypt (as it used to be). > > Using "blkid" on a partition that is in the "temporary-cryptsetup" > state, does show that it has the right ID_PART_ENTRY_UUID and TYPE > values and I can confirm that there is an associated key in > /etc/ceph/dmcrypt-keys, but it still isn't mounting correctly. > > $ sudo blkid -p -o udev /dev/sdv2 > ID_FS_UUID=87008c17-9e57-487d-8f8b-160f8f803d8b > ID_FS_UUID_ENC=87008c17-9e57-487d-8f8b-160f8f803d8b > ID_FS_VERSION=1 > ID_FS_TYPE=crypto_LUKS > ID_FS_USAGE=crypto > ID_PART_ENTRY_SCHEME=gpt > ID_PART_ENTRY_NAME=ceph\x20journal > ID_PART_ENTRY_UUID=e3eda67b-a2e0-4d22-a62e-d9bda5ecf8b1 > ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-35865ceff106 > ID_PART_ENTRY_NUMBER=2 > ID_PART_ENTRY_OFFSET=2048 > ID_PART_ENTRY_SIZE=20969473 > ID_PART_ENTRY_DISK=65:80 > > So Im checking to see if this is a known issue or if we are missing > something in the installation or configuration that would fix this > problem. This isn't a known issue, although I think we have seen problems in general with hosts with lots of OSDs not always coming up on boot. If it is specifically a problem with luks+dmcrypt that would be interesting! Does an explicit 'ceph-disk activate /dev/...' on one of the devices make it come up? And/or a 'ceph-disk activate-all'? If so that would indicate a race issue in udev. 
Thanks- sage > > -Wyllys Ingersoll > > > Ex: > $ lsblk -l > NAME MAJ:MIN RM SIZE RO TYPE > MOUNTPOINT > sda8:00 111.8G 0 disk > sda1 8:10 15.3G 0 part [SWAP] > sda2 8:20 1K 0 part > sda5 8:50 96.5G 0 part / > sdb8:16 0 3.7T 0 disk > sdb1 8:17 0 3.6T 0 part > e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:00 3.6T 0 crypt > sdb2 8:18 010G 0 part > temporary-cryptsetup-1235 (dm-6) 252:60 125K 1 crypt > sdc8:32 0 3.7T 0 disk > sdc1 8:33 0 3.6T 0 part > temporary-cryptsetup-1788 (dm-37)252:37 0 125K 1 crypt > sdc2 8:34 010G 0 part > temporary-cryptsetup-1789 (dm-36)252:36 0 125K 1 crypt > sdd8:48 0 3.7T 0 disk > sdd1 8:49 0 3.6T 0 part > temporary-cryptsetup-1252 (dm-1) 252:10 125K 1 crypt > sdd2 8:50 010G 0 part > temporary-cryptsetup-1246 (dm-3) 252:30 125K 1 crypt > sde8:64 0 3.7T 0 disk > sde1 8:65 0 3.6T 0 part > temporary-cryptsetup-1260 (dm-14)252:14 0 125K 1 crypt > sde2 8:66 010G 0 part > temporary-cryptsetup-1255 (dm-12)252:12 0 125K 1 crypt > sdf8:80 0 3.7T 0 disk > sdf1 8:81 0 3.6T 0 part > temporary-cryptsetup-1268 (dm-15)252:15 0 125K 1 crypt > sdf2 8:82 010G 0 part > temporary-cryptsetup-1245 (dm-5) 252:50 125K 1 crypt > sdg8:96 0 3.7T 0 disk > sdg1 8:97 0 3.6T 0 part > temporary-cryptsetup-1271 (dm-17)252:17 0 125K 1 crypt > sdg2 8:98 010G 0 part > temporary-cryptsetup-1278 (dm-2) 252:20 125K 1 crypt > sdh
dmcrypt with luks keys in hammer
Were running a cluster with Hammer v94.2 and are running into issues with the Luks encrypted OSD data and journal partitions. The installation goes smoothly and everything runs OK, but we've had to reboot a couple of the storage nodes for various reasons and when they come back online, a large number of OSD processes fail to start because the LUKS encrypted partitions are not getting mounted correctly. I'm not sure if it is a udev issue or a problem with the OSD process itself, but the encrypted partitions end up getting mounted as "temporary-cryptsetup-PID" and they never recover. From below, you can see that some of the OSDs did come up correctly, but the majority do not. We've seen this problem now on several storage nodes, and it only occurs for those OSDs that used luks (the new default). The only recovery that we've found is to wipe them all out and rebuild them using "plain" dmcrypt (as it used to be). Using "blkid" on a partition that is in the "temporary-cryptsetup" state, does show that it has the right ID_PART_ENTRY_UUID and TYPE values and I can confirm that there is an associated key in /etc/ceph/dmcrypt-keys, but it still isn't mounting correctly. $ sudo blkid -p -o udev /dev/sdv2 ID_FS_UUID=87008c17-9e57-487d-8f8b-160f8f803d8b ID_FS_UUID_ENC=87008c17-9e57-487d-8f8b-160f8f803d8b ID_FS_VERSION=1 ID_FS_TYPE=crypto_LUKS ID_FS_USAGE=crypto ID_PART_ENTRY_SCHEME=gpt ID_PART_ENTRY_NAME=ceph\x20journal ID_PART_ENTRY_UUID=e3eda67b-a2e0-4d22-a62e-d9bda5ecf8b1 ID_PART_ENTRY_TYPE=45b0969e-9b03-4f30-b4c6-35865ceff106 ID_PART_ENTRY_NUMBER=2 ID_PART_ENTRY_OFFSET=2048 ID_PART_ENTRY_SIZE=20969473 ID_PART_ENTRY_DISK=65:80 So Im checking to see if this is a known issue or if we are missing something in the installation or configuration that would fix this problem. -Wyllys Ingersoll Ex: $ lsblk -l NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda8:00 111.8G 0 disk sda1 8:10 15.3G 0 part [SWAP] sda2 8:20 1K 0 part sda5 8:50 96.5G 0 part / sdb8:16 0 3.7T 0 disk sdb1 8:17 0 3.6T 0 part e8bc1531-a187-4fd2-9e3f-cf90255f89d0 (dm-0) 252:00 3.6T 0 crypt sdb2 8:18 010G 0 part temporary-cryptsetup-1235 (dm-6) 252:60 125K 1 crypt sdc8:32 0 3.7T 0 disk sdc1 8:33 0 3.6T 0 part temporary-cryptsetup-1788 (dm-37)252:37 0 125K 1 crypt sdc2 8:34 010G 0 part temporary-cryptsetup-1789 (dm-36)252:36 0 125K 1 crypt sdd8:48 0 3.7T 0 disk sdd1 8:49 0 3.6T 0 part temporary-cryptsetup-1252 (dm-1) 252:10 125K 1 crypt sdd2 8:50 010G 0 part temporary-cryptsetup-1246 (dm-3) 252:30 125K 1 crypt sde8:64 0 3.7T 0 disk sde1 8:65 0 3.6T 0 part temporary-cryptsetup-1260 (dm-14)252:14 0 125K 1 crypt sde2 8:66 010G 0 part temporary-cryptsetup-1255 (dm-12)252:12 0 125K 1 crypt sdf8:80 0 3.7T 0 disk sdf1 8:81 0 3.6T 0 part temporary-cryptsetup-1268 (dm-15)252:15 0 125K 1 crypt sdf2 8:82 010G 0 part temporary-cryptsetup-1245 (dm-5) 252:50 125K 1 crypt sdg8:96 0 3.7T 0 disk sdg1 8:97 0 3.6T 0 part temporary-cryptsetup-1271 (dm-17)252:17 0 125K 1 crypt sdg2 8:98 010G 0 part temporary-cryptsetup-1278 (dm-2) 252:20 125K 1 crypt sdh8:112 0 3.7T 0 disk sdh1 8:113 0 3.6T 0 part 69dcd1e1-6e11-41ec-af19-1e0d90013957 (dm-43) 252:43 0 3.6T 0 crypt /var/lib/ceph/osd/ceph-42 sdh2 8:114 010G 0 part 3382723d-b0d9-4b50-affe-fb9f5df78d6f (dm-45) 252:45 010G 0 crypt sdi8:128 0 3.7T 0 disk sdi1 8:129 0 3.6T 0 part temporary-cryptsetup-1265 (dm-20)252:20 0 125K 1 crypt sdi2
Re: [sepia] debian jessie gitbuilder repositories ?
On Mon, 20 Jul 2015, Dan Mick wrote: > On 07/20/2015 07:19 AM, Sage Weil wrote: > > On Mon, 20 Jul 2015, Alexandre DERUMIER wrote: > >> Hi, > >> > >> debian jessie gitbuilder is ok since 2 weeks now, > >> > >> http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-jessie-amd64-basic > >> > >> > >> It is possible to push packages to repositories ? > >> > >> http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ > > > > > > The builds are failing with this: > > > > + GNUPGHOME=/srv/gnupg reprepro --ask-passphrase -b > > ../out/output/sha1/6ffb1c4ae43bcde9f5fde40dd97959399135ed86.tmp -C main > > --ignore=undef > > inedtarget --ignore=wrongdistribution include jessie > > out~/ceph_0.94.2-50-g6ffb1c4-1jessie_amd64.changes > > Cannot find definition of distribution 'jessie'! > > There have been errors! > > > > > > I've seen it before a long time ago, but I forget what the resolution is. > > > > sage > > ___ > > Sepia mailing list > > se...@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/sepia-ceph.com > > https://github.com/ceph/ceph-build/pull/102, probably That fixed it, thanks! sage -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [sepia] debian jessie gitbuilder repositories ?
On 07/20/2015 07:19 AM, Sage Weil wrote: > On Mon, 20 Jul 2015, Alexandre DERUMIER wrote: >> Hi, >> >> debian jessie gitbuilder is ok since 2 weeks now, >> >> http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-jessie-amd64-basic >> >> >> It is possible to push packages to repositories ? >> >> http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ > > > The builds are failing with this: > > + GNUPGHOME=/srv/gnupg reprepro --ask-passphrase -b > ../out/output/sha1/6ffb1c4ae43bcde9f5fde40dd97959399135ed86.tmp -C main > --ignore=undef > inedtarget --ignore=wrongdistribution include jessie > out~/ceph_0.94.2-50-g6ffb1c4-1jessie_amd64.changes > Cannot find definition of distribution 'jessie'! > There have been errors! > > > I've seen it before a long time ago, but I forget what the resolution is. > > sage > ___ > Sepia mailing list > se...@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/sepia-ceph.com https://github.com/ceph/ceph-build/pull/102, probably -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: teuthology : 70 workers need more than 8GB RAM / 2 CPUS
Thanks for the feedback. I'll try with postgresql as it seems the sqlite modifications did nothing really significant. On 20/07/2015 17:38, Zack Cerza wrote: > Hi Loic, > > This is definitely something to keep an eye on. It's actually a bit > surprising to me, though - I haven't seen ansible-playbook use any > significant resources in sepia. > > I wouldn't really recommend running paddles on the same host as teuthology > though, to do any serious amount of testing; some teuthology tasks do use > large amounts of RAM and/or CPU, and severe load issues could feasibly cause > requests to time out, affecting other jobs. > > That's all theory though, as I've always used separate hosts for the two > services. > > Zack > > - Original Message - >> From: "Loic Dachary" >> To: "Zack Cerza" , "Andrew Schoen" >> Cc: "Ceph Development" >> Sent: Sunday, July 19, 2015 9:06:41 AM >> Subject: Re: teuthology : 70 workers need more than 8GB RAM / 2 CPUS >> >> Hi again, >> >> I had the same problem when 50 workers kick in at the same time. I've lowered >> the number of workers down to 25 and it went well. During a few minutes (~8 >> minutes) the load average stayed around 25 (CPU bound, mainly the ansible >> playbook competing, see the screenshot of htop). But did not see any error / >> timeout. then I added 15 workers, wait for the load to go back to < 2 (10 >> minutes), then 15 more (10 minutes) to get to 55. >> >> That sound like a log of CPU used by a single playbook run. Is there a known >> way to reduce that ? If not I'll just upgrade the machine. Just want to make >> sure I'm not missing a simple solution ;-) >> >> Cheers >> >> On 19/07/2015 14:22, Loic Dachary wrote: >>> Hi, >>> >>> For the record, I launched a rados suite on an idle teuthology cluster, >>> with 70 workers running on a 8GB RAM / 2 CPUS / 40GB SSD disk. The load >>> average reached 40 within a minute or two and some jobs started failing / >>> timeouting. I had pulpito running on the same machine and it failed one >>> time out of two because of the load (see the top image). >>> >>> On friday I was able to run 70 workers because I gradually added them. The >>> load peak is when a job starts and all workers kick in a the same time. >>> >>> Cheers >>> >> >> -- >> Loïc Dachary, Artisan Logiciel Libre >> > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Loïc Dachary, Artisan Logiciel Libre signature.asc Description: OpenPGP digital signature
Re: teuthology : 70 workers need more than 8GB RAM / 2 CPUS
Hi Loic, This is definitely something to keep an eye on. It's actually a bit surprising to me, though - I haven't seen ansible-playbook use any significant resources in sepia. I wouldn't really recommend running paddles on the same host as teuthology though, to do any serious amount of testing; some teuthology tasks do use large amounts of RAM and/or CPU, and severe load issues could feasibly cause requests to time out, affecting other jobs. That's all theory though, as I've always used separate hosts for the two services. Zack - Original Message - > From: "Loic Dachary" > To: "Zack Cerza" , "Andrew Schoen" > Cc: "Ceph Development" > Sent: Sunday, July 19, 2015 9:06:41 AM > Subject: Re: teuthology : 70 workers need more than 8GB RAM / 2 CPUS > > Hi again, > > I had the same problem when 50 workers kick in at the same time. I've lowered > the number of workers down to 25 and it went well. During a few minutes (~8 > minutes) the load average stayed around 25 (CPU bound, mainly the ansible > playbook competing, see the screenshot of htop). But did not see any error / > timeout. then I added 15 workers, wait for the load to go back to < 2 (10 > minutes), then 15 more (10 minutes) to get to 55. > > That sound like a log of CPU used by a single playbook run. Is there a known > way to reduce that ? If not I'll just upgrade the machine. Just want to make > sure I'm not missing a simple solution ;-) > > Cheers > > On 19/07/2015 14:22, Loic Dachary wrote: > > Hi, > > > > For the record, I launched a rados suite on an idle teuthology cluster, > > with 70 workers running on a 8GB RAM / 2 CPUS / 40GB SSD disk. The load > > average reached 40 within a minute or two and some jobs started failing / > > timeouting. I had pulpito running on the same machine and it failed one > > time out of two because of the load (see the top image). > > > > On friday I was able to run 70 workers because I gradually added them. The > > load peak is when a job starts and all workers kick in a the same time. > > > > Cheers > > > > -- > Loïc Dachary, Artisan Logiciel Libre > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
ceph branch status
-- All Branches -- Adam Crume 2014-12-01 20:45:58 -0800 wip-doc-rbd-replay Alfredo Deza 2015-03-23 16:39:48 -0400 wip-11212 2015-03-25 10:10:43 -0400 wip-11065 2015-07-01 08:34:15 -0400 wip-12037 Alfredo Deza 2014-07-08 13:58:35 -0400 wip-8679 2014-09-04 13:58:14 -0400 wip-8366 2014-10-13 11:10:10 -0400 wip-9730 Boris Ranto 2015-04-13 13:51:32 +0200 wip-fix-ceph-dencoder-build 2015-04-14 13:51:49 +0200 wip-fix-ceph-dencoder-build-master 2015-06-23 15:29:45 +0200 wip-user-rebase 2015-07-10 12:34:33 +0200 wip-bash-completion 2015-07-15 18:21:11 +0200 wip-selinux-policy Chendi.Xue 2015-06-16 14:39:42 +0800 wip-blkin Chi Xinze 2015-05-15 21:47:44 + XinzeChi-wip-ec-read Dan Mick 2013-07-16 23:00:06 -0700 wip-5634 Danny Al-Gaaf 2015-04-23 16:32:00 +0200 wip-da-SCA-20150421 2015-04-23 17:18:57 +0200 wip-nosetests 2015-04-23 18:20:16 +0200 wip-unify-num_objects_degraded 2015-07-17 10:50:46 +0200 wip-da-SCA-20150601 David Zafman 2014-08-29 10:41:23 -0700 wip-libcommon-rebase 2015-04-24 13:14:23 -0700 wip-cot-giant 2015-06-02 13:46:23 -0700 wip-11511 2015-07-07 18:11:19 -0700 wip-zafman-testing 2015-07-16 19:13:45 -0700 wip-12000-12200 Dongmao Zhang 2014-11-14 19:14:34 +0800 thesues-master Greg Farnum 2015-04-29 21:44:11 -0700 wip-init-names 2015-06-11 18:22:55 -0700 greg-fs-testing 2015-07-16 09:28:24 -0700 hammer-12297 Greg Farnum 2014-10-23 13:33:44 -0700 wip-forward-scrub Gregory Meno 2015-02-25 17:30:33 -0800 wip-fix-typo-troubleshooting Guang G Yang 2015-06-26 20:31:44 + wip-ec-readall Guang Yang 2014-08-08 10:41:12 + wip-guangyy-pg-splitting 2014-09-25 00:47:46 + wip-9008 2014-09-30 10:36:39 + guangyy-wip-9614 Haomai Wang 2014-07-27 13:37:49 +0800 wip-flush-set 2015-04-20 00:47:59 +0800 update-organization 2015-04-20 00:48:42 +0800 update-organization-1 2015-07-10 15:46:45 +0800 fio-objectstore Ilya Dryomov 2014-09-05 16:15:10 +0400 wip-rbd-notify-errors James Page 2013-02-27 22:50:38 + wip-debhelper-8 Jason Dillaman 2015-05-22 00:52:20 -0400 wip-11625 2015-06-10 12:02:16 -0400 wip-11770-hammer 2015-06-22 11:17:56 -0400 wip-12109-hammer 2015-06-22 16:02:33 -0400 wip-11769-firefly 2015-07-17 12:06:14 -0400 wip-11286 2015-07-17 12:07:32 -0400 wip-11287 2015-07-17 14:17:04 -0400 wip-12384-hammer 2015-07-19 13:44:16 -0400 wip-12237-hammer Jenkins 2014-07-29 05:24:39 -0700 wip-nhm-hang 2015-02-02 10:35:28 -0800 wip-sam-v0.92 2015-06-10 15:04:07 -0700 rhcs-v0.80.8 2015-07-01 14:40:49 -0700 rhcs-v0.94.1-ubuntu 2015-07-14 13:10:32 -0700 last Joao Eduardo Luis 2014-09-10 09:39:23 +0100 wip-leveldb-get.dumpling Joao Eduardo Luis 2014-07-22 15:41:42 +0100 wip-leveldb-misc Joao Eduardo Luis 2014-09-02 17:19:52 +0100 wip-leveldb-get 2014-10-17 16:20:11 +0100 wip-paxos-fix 2014-10-21 21:32:46 +0100 wip-9675.dumpling Joao Eduardo Luis 2014-11-17 16:43:53 + wip-mon-osdmap-cleanup 2014-12-15 16:18:56 + wip-giant-mon-backports 2014-12-17 17:13:57 + wip-mon-backports.firefly 2014-12-17 23:15:10 + wip-mon-sync-fix.dumpling 2015-01-07 23:01:00 + wip-mon-blackhole-mlog-0.87.7 2015-01-10 02:40:42 + wip-dho-joao 2015-01-10 02:46:31 + wip-mon-paxos-fix 2015-01-26 13:00:09 + wip-mon-datahealth-fix 2015-02-04 22:36:14 + wip-10643 Joao Eduardo Luis 2015-05-27 23:48:45 +0100 wip-mon-scrub 2015-05-28 08:12:48 +0100 wip-11786 2015-05-29 12:21:43 +0100 wip-11545 2015-06-05 16:12:57 +0100 wip-10507 2015-06-16 14:34:11 +0100 wip-11470 2015-06-25 00:16:41 +0100 wip-10507-2 2015-07-14 16:52:35 +0100 wip-joao-testing John Spray 2015-04-06 17:25:02 +0100 wip-progress-events 2015-05-05 14:29:16 +0100 wip-live-query 2015-05-28 
13:31:32 +0100 wip-9963 2015-05-29 13:59:03 +0100 wip-9964-intrapg 2015-05-29 14:19:16 +0100 wip-9964 2015-06-02 18:16:38 +0100 wip-11859 2015-06-02 18:16:38 +0100 wip-damage-table 2015-06-03 10:09:09 +0100 wip-11857 2015-06-04 12:36:09 +0100 wip-nobjectiterator-crash 2015-06-10 14:10:24 +0100 wip-9663-hashorder 2015-06-10 23:50:49 +0100 wip-9964-nosharding 2015-07-15 15:04:42 +0100 wip-mds-refactor 2015-07-20 12:35:21 +0100 wip-scrub-jcs John Wilkins 2
Re: debian jessie gitbuilder repositories ?
- Original Message - > From: "Sage Weil" > To: "Alexandre DERUMIER" > Cc: "ceph-devel" , se...@ceph.com > Sent: Monday, July 20, 2015 10:19:49 AM > Subject: Re: debian jessie gitbuilder repositories ? > > On Mon, 20 Jul 2015, Alexandre DERUMIER wrote: > > Hi, > > > > debian jessie gitbuilder is ok since 2 weeks now, > > > > http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-jessie-amd64-basic > > > > > > It is possible to push packages to repositories ? > > > > http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ > > > The builds are failing with this: > > + GNUPGHOME=/srv/gnupg reprepro --ask-passphrase -b > ../out/output/sha1/6ffb1c4ae43bcde9f5fde40dd97959399135ed86.tmp -C main > --ignore=undef > inedtarget --ignore=wrongdistribution include jessie > out~/ceph_0.94.2-50-g6ffb1c4-1jessie_amd64.changes > Cannot find definition of distribution 'jessie'! > There have been errors! > > > I've seen it before a long time ago, but I forget what the resolution is. I am not 100% sure how the gitbuilders are set up, but the DEB builders are meant to be generic, and they can only be so when some setup is run to hold the environments for each distro. The environments are created/updated with pbuilder, so one needs to exist for jessie. There is a script that can do that called update_pbuilder.sh, but it is missing `jessie`: https://github.com/ceph/ceph-build/blob/master/update_pbuilder.sh#L22-30 A manual call to create the jessie environment for pbuilder would suffice here, I think. > > sage > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: rados/thrash on OpenStack
More information about this run. I'll run a rados suite on master on OpenStack to get a baseline of what we should expect. http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/12/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/14/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/15/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/17/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/20/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/21/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/22/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/23/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/26/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/28/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/2/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/5/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/6/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/7/ http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/9/ I see 2015-07-20T10:02:10.567 INFO:tasks.ceph.osd.5.ovh165019.stderr:osd/ReplicatedPG.cc: In function 'bool ReplicatedPG::is_degraded_or_backfilling_object(const hobject_t&)' thread 7f2af94df700 time 2015-07-20 10:02:10.481916 2015-07-20T10:02:10.567 INFO:tasks.ceph.osd.5.ovh165019.stderr:osd/ReplicatedPG.cc: 412: FAILED assert(!actingbackfill.empty()) 2015-07-20T10:02:10.567 INFO:tasks.ceph.osd.5.ovh165019.stderr: ceph version 9.0.2-799-gba9c2ae (ba9c2ae4bffd3fd7b26a2e0ce843913b77940b8a) 2015-07-20T10:02:10.568 INFO:tasks.ceph.osd.5.ovh165019.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xc45d1b] 2015-07-20T10:02:10.568 INFO:tasks.ceph.osd.5.ovh165019.stderr: 2: ceph-osd() [0x88535d] 2015-07-20T10:02:10.568 INFO:tasks.ceph.osd.5.ovh165019.stderr: 3: (ReplicatedPG::hit_set_remove_all()+0x7c) [0x8b039c] 2015-07-20T10:02:10.568 INFO:tasks.ceph.osd.5.ovh165019.stderr: 4: (ReplicatedPG::on_pool_change()+0x161) [0x8b1a21] 2015-07-20T10:02:10.569 INFO:tasks.ceph.osd.5.ovh165019.stderr: 5: (PG::handle_advance_map(std::tr1::shared_ptr, std::tr1::shared_ptr, std::vector >&, int, std::vector >&, int, PG::RecoveryCtx*)+0x60c) [0x8348fc] 2015-07-20T10:02:10.569 INFO:tasks.ceph.osd.5.ovh165019.stderr: 6: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set, std::less >, std::allocator > >*)+0x2c3) [0x6dcc73] 2015-07-20T10:02:10.569 INFO:tasks.ceph.osd.5.ovh165019.stderr: 7: (OSD::process_peering_events(std::list > const&, ThreadPool::TPHandle&)+0x1f1) [0x6dd721] 2015-07-20T10:02:10.572 INFO:tasks.ceph.osd.5.ovh165019.stderr: 8: (OSD::PeeringWQ::_process(std::list > const&, ThreadPool::TPHandle&)+0x18) [0x7328d8] 2015-07-20T10:02:10.573 INFO:tasks.ceph.osd.5.ovh165019.stderr: 9: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xc3677e] 2015-07-20T10:02:10.573 INFO:tasks.ceph.osd.5.ovh165019.stderr: 10: 
(ThreadPool::WorkThread::entry()+0x10) [0xc37820] 2015-07-20T10:02:10.573 INFO:tasks.ceph.osd.5.ovh165019.stderr: 11: (()+0x8182) [0x7f2b149e3182] 2015-07-20T10:02:10.573 INFO:tasks.ceph.osd.5.ovh165019.stderr: 12: (clone()+0x6d) [0x7f2b12d2847d] In http://149.202.164.239/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/24/ I see the same error as below. In http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/8/ it looks like the run was about to finish, just took a long time, and should be ignored as a false negative. On 20/07/2015 14:52, Loic Dachary wrote: > Hi, > > I checked one of the timeout (dead) at > http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/ > > http://149.202.164.239/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/10/config.yaml > timeed out because of > > > Paste2 > > Create Paste > Followup Paste > QR > > sd.5 since back 2015-07-20 10:45:28.566308 front 2015-07-20 10:45:28.566308 > (cutoff 2015-07-20 10:45:33.823074) > 2015-07-20T10:47:13.921 INFO:tasks.ceph.osd.4.ovh164254.stderr:2015-07-20 > 10:47:13.899770 7fb4be171700 -1 osd.4 655 heartbeat_check: no reply from > osd.5 since back 2015-07-20 10:45:30.719801 front 2015-07-20 10:45:30.719801 > (cutoff 2015-07-20 10:45:33.899763) > 2015-07-20T10:47:15.023 > INFO:tasks.ceph.osd.1.ovh164253.stderr:os
Re: debian jessie gitbuilder repositories ?
On Mon, 20 Jul 2015, Alexandre DERUMIER wrote: > Hi, > > debian jessie gitbuilder is ok since 2 weeks now, > > http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-jessie-amd64-basic > > > It is possible to push packages to repositories ? > > http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ The builds are failing with this: + GNUPGHOME=/srv/gnupg reprepro --ask-passphrase -b ../out/output/sha1/6ffb1c4ae43bcde9f5fde40dd97959399135ed86.tmp -C main --ignore=undef inedtarget --ignore=wrongdistribution include jessie out~/ceph_0.94.2-50-g6ffb1c4-1jessie_amd64.changes Cannot find definition of distribution 'jessie'! There have been errors! I've seen it before a long time ago, but I forget what the resolution is. sage -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
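The reprepro error above ("Cannot find definition of distribution 'jessie'!") usually means the target repository's conf/distributions file has no stanza for the jessie codename. A sketch of such a stanza, with illustrative values rather than the ones actually used for ceph.com:

  # conf/distributions (illustrative values, not the ceph.com configuration)
  Origin: Ceph
  Label: Ceph
  Codename: jessie
  Architectures: amd64 source
  Components: main
  Description: Ceph packages for Debian jessie
  SignWith: yes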
RE: start-stop-daemon radosgw
http://tracker.ceph.com/issues/12407 Thanks, -Pavan. -Original Message- From: Sage Weil [mailto:s...@newdream.net] Sent: Monday, July 20, 2015 7:19 PM To: Pavan Rallabhandi Cc: ceph-devel@vger.kernel.org; Srinivasula Maram; Yehuda Sadeh-Weinraub Subject: Re: start-stop-daemon radosgw On Mon, 20 Jul 2015, Pavan Rallabhandi wrote: > [Resending in plain text format, apologies for the spam] > > Hi, > > This is with reference to the commit > https://github.com/ceph/ceph/commit/f30fa4a364602fb9412babf7319140eca4c64995 > and tracker http://tracker.ceph.com/issues/11453 > > On Hammer binaries, we are finding this fix has regressed to have multiple > RGW instances to be run on a single machine. Meaning, with no user specified > under 'client.radosgw.gateway' sections, and by having the default user to be > assumed as 'root', we are unable to get multiple RGW daemons run on a client > machine. > > The start-stop-daemon complains than an instance of 'radosgw' is already > running, by starting the first daemon in the configuration and bails out from > starting further instances: > > > > + start-stop-daemon --start -u root -x /usr/bin/radosgw -- -n > client.radosgw.gateway-3 > /usr/bin/radosgw already running. > > <\snip> > > However, by having a user specified in the relevant 'client.radosgw.gateway' > sections, one can get around this issue. Wanted to confirm if this is indeed > a regression or was it expected to behave so from the fix. This was not intentional. Can you open a tracker ticket? Thanks! sage PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: start-stop-daemon radosgw
On Mon, 20 Jul 2015, Pavan Rallabhandi wrote: > [Resending in plain text format, apologies for the spam] > > Hi, > > This is with reference to the commit > https://github.com/ceph/ceph/commit/f30fa4a364602fb9412babf7319140eca4c64995 > and tracker http://tracker.ceph.com/issues/11453 > > On Hammer binaries, we are finding this fix has regressed to have multiple > RGW instances to be run on a single machine. Meaning, with no user specified > under 'client.radosgw.gateway' sections, and by having the default user to be > assumed as 'root', we are unable to get multiple RGW daemons run on a client > machine. > > The start-stop-daemon complains than an instance of 'radosgw' is already > running, by starting the first daemon in the configuration and bails out from > starting further instances: > > > > + start-stop-daemon --start -u root -x /usr/bin/radosgw -- -n > client.radosgw.gateway-3 > /usr/bin/radosgw already running. > > <\snip> > > However, by having a user specified in the relevant 'client.radosgw.gateway' > sections, one can get around this issue. Wanted to confirm if this is indeed > a regression or was it expected to behave so from the fix. This was not intentional. Can you open a tracker ticket? Thanks! sage -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Documentation] Hardware recommendation: RAM and PGLog
On Sun, 19 Jul 2015, David Casier AEVOO wrote: > Hi, > I have a question about PGLog and RAM consumption. > > In the documentation, we read "OSDs do not require as much RAM for regular > operations (e.g., 500MB of RAM per daemon instance); however, during recovery > they need significantly more RAM (e.g., ~1GB per 1TB of storage per daemon)" > > But in fact, all pg log are read in the start of ceph-osd daemon and put in > RAM ( pg->read_state(store, bl); ) > > Is this normal behavior or I have a defect in my environment? There are two tunables that control how many pg log entries we keep around. When the PG is healthy, we keep ~1000, and when the PG is degraded, we keep more, to expand the time window over which a recovering OSD will be able to do regular log-based recovery instead of a more expensive backfill. This is one source of additional memory. Others are the missing sets (lists of missing/degraded objects) and messages/data/state associated with objects that are being recovered/copied. Note that the numbers in the documentation are pretty rough rules of thumb. At some point it would be great to build a model for how much RAM the osd consumes as a function of the various configurables (pg log size, pg count, avg object size, etc.). sage -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
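For anyone looking for the two tunables mentioned above, they are the pg log length options in the [osd] section of ceph.conf. The names below come from config_opts.h; the values are only illustrative, and the defaults vary by release:

  [osd]
      # entries kept per PG while the PG is healthy (illustrative value)
      osd min pg log entries = 1000
      # upper bound kept while the PG is degraded, widening the window for
      # log-based recovery instead of backfill (illustrative value)
      osd max pg log entries = 10000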
rados/thrash on OpenStack
Hi, I checked one of the timeout (dead) at http://149.202.164.239:8081/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/ 149.202.164.239/ubuntu-2015-07-20_09:21:01-rados-wip-kefu-testing---basic-openstack/10/config.yaml timed out because of sd.5 since back 2015-07-20 10:45:28.566308 front 2015-07-20 10:45:28.566308 (cutoff 2015-07-20 10:45:33.823074) 2015-07-20T10:47:13.921 INFO:tasks.ceph.osd.4.ovh164254.stderr:2015-07-20 10:47:13.899770 7fb4be171700 -1 osd.4 655 heartbeat_check: no reply from osd.5 since back 2015-07-20 10:45:30.719801 front 2015-07-20 10:45:30.719801 (cutoff 2015-07-20 10:45:33.899763) 2015-07-20T10:47:15.023 INFO:tasks.ceph.osd.1.ovh164253.stderr:osd/ReplicatedPG.cc: In function 'virtual void ReplicatedPG::op_applied(const eversion_t&)' thread 7f92f0244700 time 2015-07-20 10:47:14.998470 2015-07-20T10:47:15.024 INFO:tasks.ceph.osd.1.ovh164253.stderr:osd/ReplicatedPG.cc: 7311: FAILED assert(applied_version <= info.last_update) 2015-07-20T10:47:15.025 INFO:tasks.ceph.osd.1.ovh164253.stderr: ceph version 9.0.2-799-gba9c2ae (ba9c2ae4bffd3fd7b26a2e0ce843913b77940b8a) 2015-07-20T10:47:15.025 INFO:tasks.ceph.osd.1.ovh164253.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xc45d1b] 2015-07-20T10:47:15.025 INFO:tasks.ceph.osd.1.ovh164253.stderr: 2: (ReplicatedPG::op_applied(eversion_t const&)+0x6dc) [0x8741ac] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 3: (ReplicatedBackend::op_applied(ReplicatedBackend::InProgressOp*)+0xd0) [0xa5cfe0] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 4: (Context::complete(int)+0x9) [0x6f4649] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 5: (ReplicatedPG::BlessedContext::finish(int)+0x94) [0x8dec54] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 6: (Context::complete(int)+0x9) [0x6f4649] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 7: (void finish_contexts(CephContext*, std::list >&, int)+0x94) [0x7351d4] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 8: (C_ContextsBase::complete(int)+0x9) [0x6f4e89] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 9: (Finisher::finisher_thread_entry()+0x158) [0xb6f2b8] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 10: (()+0x8182) [0x7f92ff4e7182] 2015-07-20T10:47:15.026 INFO:tasks.ceph.osd.1.ovh164253.stderr: 11: (clone()+0x6d) [0x7f92fd82c47d] 2015-07-20T10:47:15.027 INFO:tasks.ceph.osd.1.ovh164253.stderr: NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this. 
2015-07-20T10:47:15.038 INFO:tasks.ceph.osd.1.ovh164253.stderr:2015-07-20 10:47:15.005862 7f92f0244700 -1 osd/ReplicatedPG.cc: In function 'virtual void ReplicatedPG::op_applied(const eversion_t&)' thread 7f92f0244700 time 2015-07-20 10:47:14.998470 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr:osd/ReplicatedPG.cc: 7311: FAILED assert(applied_version <= info.last_update) 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr: 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr: ceph version 9.0.2-799-gba9c2ae (ba9c2ae4bffd3fd7b26a2e0ce843913b77940b8a) 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xc45d1b] 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr: 2: (ReplicatedPG::op_applied(eversion_t const&)+0x6dc) [0x8741ac] 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr: 3: (ReplicatedBackend::op_applied(ReplicatedBackend::InProgressOp*)+0xd0) [0xa5cfe0] 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr: 4: (Context::complete(int)+0x9) [0x6f4649] 2015-07-20T10:47:15.039 INFO:tasks.ceph.osd.1.ovh164253.stderr: 5: (ReplicatedPG::BlessedContext::finish(int)+0x94) [0x8dec54] 2015-07-20T10:47:15.040 INFO:tasks.ceph.osd.1.ovh164253.stderr: 6: (Context::complete(int)+0x9) [0x6f4649] 2015-07-20T10:47:15.040 INFO:tasks.ceph.osd.1.ovh164253.stderr: 7: (void finish_contexts(CephContext*, std::list >&, int)+0x94) [0x7351d4] 2015-07-20T10:47:15.040 INFO:tasks.ceph.osd.1.ovh164253.stderr: 8: (C_ContextsBase::complete(int)+0x9) [0x6f4e89] 2015-07-20T10:47:15.040 INFO:tasks.ceph.osd.1.ovh164253.stderr: 9: (Finisher::finisher_thread_entry()+0x158) [0xb6f2b8] 2015-07-20T10:47:15.040 INFO:tasks.ceph.osd.1.ovh164253.stderr: 10: (()+0x8182) [0x7f92ff4e7182] 2015-07-20T10:47:15.040 INFO:tasks.ceph.osd.1.ovh164253.stderr: 11: (clone()+0x6d) [0x7f92fd82c47d] 2015-07-20T10:47:15.040 INFO:tasks.ceph.osd.1.ovh164253.stderr: NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this. 2015-07-20T10:47:15.041 INFO:tasks.ceph.osd.1.ovh164253.stderr: 2015-07-20T10:47:15.212 INFO:tasks.ceph.osd.1.ovh164253.stderr:terminate called after throwing an instance of 'ceph::FailedAssertion' 2015-07-2
Re: pulpito slowness
- Original Message - > From: "Loic Dachary" > To: "Alfredo Deza" > Cc: "Ceph Development" > Sent: Sunday, July 19, 2015 12:56:12 PM > Subject: pulpito slowness > > Hi Alfredo, > > After installing pulpito and run from sources with: > > virtualenv ./virtualenv > source ./virtualenv/bin/activate > pip install -r requirements.txt > python run.py & > > I run a rados suite with 40 workers and 218 jobs. All is well except a > slowness from pulpito that I don't quite understand. It takes 9 seconds to > load although the load average of the machine is low, the CPU are not all > busy, there is plenty of free ram. There are pieces of the setup that might be causing this. Pulpito on its own doesn't do much; it is stateless and just serves HTML. I would look into paddles (pulpito feeds from it) and see how that is doing. Ideally, paddles would be set up with PostgreSQL as well; I remember that at some point the queries in paddles became very complex and some investigation was done to improve their speed. > > ubuntu@teuthology:~$ curl > http://localhost:8081/ubuntu-2015-07-19_15:57:13-rados-hammer---basic-openstack/ > > /dev/null % Total% Received % Xferd Average Speed TimeTime > Time Current > Dload Upload Total SpentLeft Speed > 100 391k 100 391k0 0 42774 0 0:00:09 0:00:09 --:--:-- > 96305 > > Do you have an idea of the reason for this slowness ? > > Cheers > > -- > Loïc Dachary, Artisan Logiciel Libre > > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
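One quick way to tell whether the nine seconds are spent in pulpito or in paddles is to time the paddles API for the same run directly. The port and route below are assumptions (a stock paddles install commonly listens on 8080); adjust them to the actual deployment:

  # time the paddles JSON endpoint for the run (port and route are assumptions)
  curl -o /dev/null -s -w 'paddles: %{time_total}s\n' \
      http://localhost:8080/runs/ubuntu-2015-07-19_15:57:13-rados-hammer---basic-openstack/
  # compare with pulpito, which renders HTML on top of that data
  curl -o /dev/null -s -w 'pulpito: %{time_total}s\n' \
      http://localhost:8081/ubuntu-2015-07-19_15:57:13-rados-hammer---basic-openstack/

If most of the time is in the first request, the PostgreSQL/query-tuning suggestion for paddles is the place to start.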
[Teuthology] Upgrade hammer on ubuntu : all passed
Hi all, Good news for the hammer upgrade on Ubuntu: http://ceph.aevoo.fr:8081/ubuntu-2015-07-19_05:44:18-upgrade:hammer-hammer---basic-openstack/ All jobs passed. David -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
start-stop-daemon radosgw
[Resending in plain text format, apologies for the spam] Hi, This is with reference to the commit https://github.com/ceph/ceph/commit/f30fa4a364602fb9412babf7319140eca4c64995 and tracker http://tracker.ceph.com/issues/11453 On Hammer binaries, we are finding that this fix has caused a regression when running multiple RGW instances on a single machine. Meaning, with no user specified under the 'client.radosgw.gateway' sections, so that the default user is assumed to be 'root', we are unable to get multiple RGW daemons running on a client machine. start-stop-daemon complains that an instance of 'radosgw' is already running after starting the first daemon in the configuration, and bails out of starting further instances: + start-stop-daemon --start -u root -x /usr/bin/radosgw -- -n client.radosgw.gateway-3 /usr/bin/radosgw already running. <\snip> However, by having a user specified in the relevant 'client.radosgw.gateway' sections, one can get around this issue. Wanted to confirm whether this is indeed a regression or whether the fix was expected to behave this way. Thanks, -Pavan. PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
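The behaviour is consistent with how start-stop-daemon matches processes: with only -u root and -x /usr/bin/radosgw, any already-running radosgw owned by root makes the next start look like a duplicate. A rough illustration of the difference, not the shipped init script ('rgwuser3' is a made-up name):

  # every instance defaults to root, so the user/exec match sees the first daemon:
  start-stop-daemon --start -u root -x /usr/bin/radosgw -- -n client.radosgw.gateway-3
  # /usr/bin/radosgw already running.
  #
  # with a distinct user per 'client.radosgw.gateway-*' section (the workaround above),
  # the match is scoped to that user and each instance can start:
  start-stop-daemon --start -u rgwuser3 -x /usr/bin/radosgw -- -n client.radosgw.gateway-3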
The design of the eviction improvement
Hi all, This is a follow-up of one of the CDS sessions at http://tracker.ceph.com/projects/ceph/wiki/Improvement_on_the_cache_tiering_eviction. We discussed the drawbacks of the current eviction algorithm and several ways to improve it. Seems like an LRU variant is the right way to go. I came up with some design points after the CDS, and want to discuss them with you. It is an approximate 2Q algorithm, combining some benefits of the clock algorithm, similar to what the linux kernel does for the page cache. # Design points: ## LRU lists - Maintain LRU lists at the PG level. The SharedLRU and SimpleLRU implementations in the current code have a max_size, which limits the max number of elements in the list. This mostly looks like an MRU, though their names imply they are LRUs. Since the object size may vary in a PG, it's not possible to calculate the total number of objects which the cache tier can hold ahead of time. We need a new LRU implementation with no limit on the size. - Two lists for each PG: active and inactive. Objects are first put into the inactive list when they are accessed, and moved between these two lists based on some criteria. Object flags: active, referenced, unevictable, dirty. - When an object is accessed: 1) If it's not in either of the lists, it's put on the top of the inactive list. 2) If it's in the inactive list, and the referenced flag is not set, the referenced flag is set, and it's moved to the top of the inactive list. 3) If it's in the inactive list, and the referenced flag is set, the referenced flag is cleared, and it's removed from the inactive list, and put on top of the active list. 4) If it's in the active list, and the referenced flag is not set, the referenced flag is set, and it's moved to the top of the active list. 5) If it's in the active list, and the referenced flag is set, it's moved to the top of the active list. - When selecting objects to evict: 1) Objects at the bottom of the inactive list are selected to evict. They are removed from the inactive list. 2) If the number of objects in the inactive list becomes low, some of the objects at the bottom of the active list are moved to the inactive list. Those objects which have the referenced flag set are given one more chance in the active list: they are moved to the top of the active list with the referenced flag cleared. Those objects which don't have the referenced flag set are moved to the inactive list with the referenced flag set, so that they can be quickly promoted back to the active list when necessary. ## Combine flush with eviction - When evicting an object, if it's dirty, it's flushed first. After flushing, it's evicted. If not dirty, it's evicted directly. - This means that we won't have separate activities and won't set different ratios for flush and evict. Is there a need to do so? - Number of objects to evict at a time: 'evict_effort' acts as the priority, which is used to calculate the number of objects to evict. ## LRU list snapshotting - The two lists are snapshotted and persisted periodically. - Only one copy needs to be saved. The old copy is removed when persisting the lists. The saved lists are used to restore the LRU lists when the OSD reboots. Any comments/feedback are welcome. -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
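To make the access rules concrete, here is a small stand-alone C++ sketch of the two-list bookkeeping described in the design; it is not Ceph code, object ids are plain strings, and the dirty/unevictable flags, the refill of the inactive list from the active list, and the snapshotting are all omitted:

#include <list>
#include <string>
#include <unordered_map>

// Sketch of the per-PG two-list structure described above (illustrative only).
struct TwoListLRU {
  struct Entry {
    std::string oid;
    bool referenced;
    bool active;   // true if the entry currently lives on the active list
  };
  using List = std::list<Entry>;
  List active_list, inactive_list;                        // front() is the "top"
  std::unordered_map<std::string, List::iterator> index;  // oid to list node

  void on_access(const std::string& oid) {
    auto it = index.find(oid);
    if (it == index.end()) {                      // rule 1: unknown object
      inactive_list.push_front(Entry{oid, false, false});
      index[oid] = inactive_list.begin();
      return;
    }
    List::iterator node = it->second;
    if (!node->active) {
      if (!node->referenced) {                    // rule 2: bump within inactive
        node->referenced = true;
        inactive_list.splice(inactive_list.begin(), inactive_list, node);
      } else {                                    // rule 3: promote to active
        node->referenced = false;
        node->active = true;
        active_list.splice(active_list.begin(), inactive_list, node);
      }
    } else {                                      // rules 4 and 5: stay active
      node->referenced = true;
      active_list.splice(active_list.begin(), active_list, node);
    }
  }

  // Eviction rule 1: victims come from the bottom (back) of the inactive list.
  bool pick_victim(std::string* oid) {
    if (inactive_list.empty())
      return false;
    *oid = inactive_list.back().oid;
    index.erase(*oid);
    inactive_list.pop_back();
    return true;
  }
};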
debian jessie gitbuilder repositories ?
Hi, the debian jessie gitbuilder has been ok for 2 weeks now, http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-jessie-amd64-basic Is it possible to push packages to the repositories? http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ ? Alexandre -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html