[ceph-users] PG Balancer Upmap mode not working
@Wido Den Hollander Regarding the amount of PGs, I quote from the docs: "If you have more than 50 OSDs, we recommend approximately 50-100 placement groups per OSD to balance out resource usage, data durability and distribution." (https://docs.ceph.com/docs/master/rados/operations/placement-groups/)
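For reference, a minimal back-of-the-envelope version of that rule of thumb; the OSD count and replica size below are illustrative assumptions, not values from this cluster:

  # total PG target per the ~100-PGs-per-OSD guideline
  osds=48; replicas=3
  echo $(( osds * 100 / replicas ))   # then round to the nearest power of two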
[ceph-users] Cluster in ERR status when rebalancing
Has finally been addressed in 14.2.5; check the changelog of that release.
[ceph-users] PG Balancer Upmap mode not working
My full OSD list (also here as pastebin https://paste.ubuntu.com/p/XJ4Pjm92B5/ ):

ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META   AVAIL   %USE  VAR  PGS STATUS
14  hdd 9.09470 1.0 9.1 TiB 6.9 TiB 6.8 TiB  71 KiB 18 GiB 2.2 TiB 75.34 1.04  69 up
19  hdd 9.09470 1.0 9.1 TiB 6.9 TiB 6.8 TiB  80 KiB 18 GiB 2.2 TiB 75.33 1.04  72 up
22  hdd 9.09470 1.0 9.1 TiB 8.0 TiB 8.0 TiB  80 KiB 21 GiB 1.1 TiB 88.13 1.22  84 up
25  hdd 9.09470 1.0 9.1 TiB 7.9 TiB 7.9 TiB 3.7 MiB 21 GiB 1.2 TiB 87.06 1.20  85 up
30  hdd 9.09470 1.0 9.1 TiB 6.9 TiB 6.9 TiB  48 KiB 18 GiB 2.2 TiB 76.21 1.05  73 up
33  hdd 9.09470 1.0 9.1 TiB 7.2 TiB 7.2 TiB  56 KiB 19 GiB 1.9 TiB 79.13 1.09  76 up
34  hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.6 TiB  84 KiB 17 GiB 2.4 TiB 73.13 1.01  70 up
35  hdd 9.09470 1.0 9.1 TiB 7.2 TiB 7.2 TiB 120 KiB 19 GiB 1.9 TiB 79.63 1.10  74 up
12  hdd 9.09470 1.0 9.1 TiB 7.6 TiB 7.6 TiB 136 KiB 20 GiB 1.4 TiB 84.10 1.16  76 up
16  hdd 9.09470 1.0 9.1 TiB 7.1 TiB 7.1 TiB  92 KiB 18 GiB 2.0 TiB 77.86 1.07  76 up
17  hdd 9.09470 1.0 9.1 TiB 7.3 TiB 7.3 TiB  52 KiB 19 GiB 1.8 TiB 80.48 1.11  73 up
20  hdd 9.09470 1.0 9.1 TiB 6.9 TiB 6.9 TiB  84 KiB 18 GiB 2.2 TiB 75.58 1.04  70 up
23  hdd 9.09470 1.0 9.1 TiB 5.8 TiB 5.8 TiB 104 KiB 15 GiB 3.3 TiB 64.02 0.88  63 up
26  hdd 9.09470 1.0 9.1 TiB 7.4 TiB 7.4 TiB  16 KiB 20 GiB 1.7 TiB 81.31 1.12  79 up
28  hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB  48 KiB 17 GiB 3.0 TiB 67.02 0.92  67 up
31  hdd 9.09470 1.0 9.1 TiB 5.9 TiB 5.9 TiB  84 KiB 16 GiB 3.2 TiB 64.69 0.89  62 up
13  hdd 9.09470 1.0 9.1 TiB 7.8 TiB 7.8 TiB  64 KiB 20 GiB 1.3 TiB 86.17 1.19  80 up
15  hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB  36 KiB 18 GiB 2.4 TiB 73.89 1.02  73 up
18  hdd 9.09470 1.0 9.1 TiB 7.5 TiB 7.5 TiB  72 KiB 20 GiB 1.6 TiB 82.47 1.14  80 up
21  hdd 9.09470 1.0 9.1 TiB 7.9 TiB 7.9 TiB  44 KiB 21 GiB 1.2 TiB 87.23 1.20  83 up
24  hdd 9.09470 1.0 9.1 TiB 6.9 TiB 6.9 TiB 104 KiB 18 GiB 2.2 TiB 76.07 1.05  71 up
27  hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB  56 KiB 17 GiB 2.4 TiB 74.08 1.02  72 up
29  hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB  56 KiB 17 GiB 2.8 TiB 69.46 0.96  68 up
32  hdd 9.09470 1.0 9.1 TiB 8.0 TiB 8.0 TiB 112 KiB 21 GiB 1.1 TiB 88.02 1.21  84 up
37  hdd 9.09470 1.0 9.1 TiB 7.4 TiB 7.4 TiB  76 KiB 19 GiB 1.7 TiB 81.69 1.13  77 up
39  hdd 9.09470 1.0 9.1 TiB 7.4 TiB 7.4 TiB  32 KiB 20 GiB 1.6 TiB 81.90 1.13  76 up
41  hdd 9.09470 1.0 9.1 TiB 6.8 TiB 6.8 TiB  93 KiB 18 GiB 2.3 TiB 74.41 1.03  73 up
43  hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  72 KiB 17 GiB 2.7 TiB 70.42 0.97  66 up
45  hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  48 KiB 17 GiB 2.6 TiB 71.56 0.99  67 up
46  hdd 9.09470 1.0 9.1 TiB 6.9 TiB 6.9 TiB 104 KiB 18 GiB 2.2 TiB 76.08 1.05  71 up
48  hdd 9.09470 1.0 9.1 TiB 7.3 TiB 7.3 TiB  40 KiB 19 GiB 1.8 TiB 80.13 1.11  78 up
50  hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  76 KiB 17 GiB 2.7 TiB 70.39 0.97  68 up
36  hdd 9.09470 1.0 9.1 TiB 6.8 TiB 6.8 TiB  92 KiB 18 GiB 2.3 TiB 74.88 1.03  72 up
38  hdd 9.09470 1.0 9.1 TiB 7.0 TiB 6.9 TiB  80 KiB 18 GiB 2.1 TiB 76.55 1.06  75 up
40  hdd 9.09470 1.0 9.1 TiB 7.5 TiB 7.5 TiB  96 KiB 19 GiB 1.6 TiB 82.46 1.14  76 up
42  hdd 9.09470 1.0 9.1 TiB 7.1 TiB 7.1 TiB  80 KiB 19 GiB 2.0 TiB 78.51 1.08  78 up
44  hdd 9.09470 1.0 9.1 TiB 8.0 TiB 8.0 TiB 104 KiB 21 GiB 1.1 TiB 88.01 1.21  85 up
47  hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB 112 KiB 17 GiB 2.8 TiB 69.59 0.96  67 up
49  hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.0 TiB  12 KiB 16 GiB 3.0 TiB 66.69 0.92  66 up
51  hdd 9.09470 1.0 9.1 TiB 6.0 TiB 6.0 TiB  76 KiB 16 GiB 3.1 TiB 66.16 0.91  64 up
52  hdd 9.09470 1.0 9.1 TiB 8.0 TiB 7.9 TiB 3.7 MiB 20 GiB 1.1 TiB 87.63 1.21  83 up
53  hdd 9.09470 1.0 9.1 TiB 8.1 TiB 8.1 TiB 100 KiB 21 GiB 1.0 TiB 88.77 1.22  85 up
54  hdd 9.09470 1.0 9.1 TiB 7.1 TiB 7.1 TiB  64 KiB 19 GiB 2.0 TiB 78.51 1.08  76 up
55  hdd 9.09470 1.0 9.1 TiB 7.4 TiB 7.3 TiB  60 KiB 19 GiB 1.7 TiB 80.90 1.12  74 up
56  hdd 9.09470 1.0 9.1 TiB 7.9 TiB 7.9 TiB  48 KiB 21 GiB 1.2 TiB 87.09 1.20  82 up
57  hdd 9.09470 1.0 9.1 TiB 7.0 TiB 7.0 TiB  48 KiB 19 GiB 2.1 TiB 76.82 1.06  72 up
58  hdd 9.09470 1.0 9.1 TiB 8.0 TiB 8.0 TiB  56 KiB 21 GiB 1.1 TiB 88.11 1.22  83 up
59  hdd 9.09470 1.0 9.1 TiB 7.9 TiB 7.8 TiB  72 KiB 20 GiB 1.2 TiB 86.34 1.19  85 up
60  hdd 9.09470 1.0 9.1 TiB 6.9 TiB 6.9 TiB  72 KiB 18
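As a side note, a quick way to pull the best and worst %USE out of output like this (a sketch, assuming the Nautilus `ceph osd df` column order shown above, where %USE is the fourth field from the end):

  ceph osd df | awk '$1 ~ /^[0-9]+$/ {print $(NF-3), "osd." $1}' | sort -n | sed -n '1p;$p'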
[ceph-users] PG Balancer Upmap mode not working
It's only getting worse after raising PGs now. Anything between:

96 hdd 9.09470 1.0 9.1 TiB 4.9 TiB 4.9 TiB 97 KiB 13 GiB 4.2 TiB 53.62 0.76 54 up

and

89 hdd 9.09470 1.0 9.1 TiB 8.1 TiB 8.1 TiB 88 KiB 21 GiB 1001 GiB 89.25 1.27 87 up

How is that possible? I don't know how much more proof I need to present that there's a bug.
[ceph-users] PG Balancer Upmap mode not working
@Wido Den Hollander Still think this is acceptable?

51 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB  72 KiB 16 GiB 3.0 TiB 67.23 0.98 68 up
52 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB 3.5 MiB 18 GiB 2.4 TiB 73.99 1.08 75 up
53 hdd 9.09470 1.0 9.1 TiB 8.0 TiB 7.9 TiB 102 KiB 21 GiB 1.1 TiB 87.49 1.27 88 up

If I use replica 3 and my pool is almost full because a single OSD is at 85%+ while others are below 70%, then in the end I maybe get 25% of my disk storage. A difference of 20 PGs between two OSDs does not seem acceptable, not with fill rates that differ just as much.
[ceph-users] PG Balancer Upmap mode not working
I never had those issues with Luminous, not once; since Nautilus this is a constant headache. My issue is that I have OSDs that are over 85% whilst others are at 63%. My issue is that every time I do a rebalance or add new disks, ceph moves PGs onto near-full OSDs and almost causes pool failures. My STDDEV: 21.31... it's a joke. It's simply not acceptable to deal with nearfull OSDs whilst others are half empty.
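As a stopgap (not a fix for the distribution itself), the built-in utilization reweighting can at least be dry-run first; the overload threshold of 120 below is an illustrative assumption:

  ceph osd test-reweight-by-utilization 120   # dry run, shows what would change
  ceph osd reweight-by-utilization 120        # apply it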
[ceph-users] PG Balancer Upmap mode not working
@Wido Den Hollander First of all, the docs say: "In most cases, this distribution is “perfect,” which is an equal number of PGs on each OSD (+/-1 PG, since they might not divide evenly)." Either this is just false information or it is very badly stated. I increased PGs and see no difference. I have pointed out MULTIPLE times that Nautilus has major flaws in the data distribution, but nobody seems to listen to me. Not sure how much more evidence I have to show.

ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP   META    AVAIL   %USE  VAR  PGS STATUS
0   ssd 3.49219 1.0 3.5 TiB 715 GiB 674 GiB 37 GiB 3.9 GiB 2.8 TiB 19.99 0.29 147 up
1   ssd 3.49219 1.0 3.5 TiB 724 GiB 672 GiB 49 GiB 3.8 GiB 2.8 TiB 20.25 0.30 146 up
2   ssd 3.49219 1.0 3.5 TiB 736 GiB 681 GiB 50 GiB 4.4 GiB 2.8 TiB 20.57 0.30 150 up
3   ssd 3.49219 1.0 3.5 TiB 712 GiB 676 GiB 33 GiB 3.5 GiB 2.8 TiB 19.92 0.29 146 up
4   ssd 3.49219 1.0 3.5 TiB 752 GiB 714 GiB 34 GiB 4.6 GiB 2.8 TiB 21.03 0.31 156 up
6   ssd 3.49219 1.0 3.5 TiB 710 GiB 671 GiB 35 GiB 3.8 GiB 2.8 TiB 19.85 0.29 146 up
8   ssd 3.49219 1.0 3.5 TiB 781 GiB 738 GiB 40 GiB 3.7 GiB 2.7 TiB 21.85 0.32 156 up
10  ssd 3.49219 1.0 3.5 TiB 728 GiB 682 GiB 42 GiB 4.0 GiB 2.8 TiB 20.35 0.30 146 up
5   ssd 3.49219 1.0 3.5 TiB 664 GiB 628 GiB 32 GiB 4.3 GiB 2.8 TiB 18.58 0.27 141 up
7   ssd 3.49219 1.0 3.5 TiB 656 GiB 613 GiB 39 GiB 4.0 GiB 2.9 TiB 18.35 0.27 136 up
9   ssd 3.49219 1.0 3.5 TiB 632 GiB 586 GiB 41 GiB 4.4 GiB 2.9 TiB 17.67 0.26 131 up
11  ssd 3.49219 1.0 3.5 TiB 725 GiB 701 GiB 22 GiB 2.6 GiB 2.8 TiB 20.28 0.30 138 up
101 ssd 3.49219 1.0 3.5 TiB 755 GiB 713 GiB 38 GiB 3.9 GiB 2.8 TiB 21.11 0.31 146 up
103 ssd 3.49219 1.0 3.5 TiB 761 GiB 718 GiB 40 GiB 3.6 GiB 2.7 TiB 21.29 0.31 150 up
105 ssd 3.49219 1.0 3.5 TiB 715 GiB 676 GiB 36 GiB 2.6 GiB 2.8 TiB 19.99 0.29 148 up
107 ssd 3.49219 1.0 3.5 TiB 760 GiB 706 GiB 50 GiB 3.2 GiB 2.8 TiB 21.24 0.31 147 up
100 ssd 3.49219 1.0 3.5 TiB 724 GiB 674 GiB 47 GiB 2.5 GiB 2.8 TiB 20.25 0.30 144 up
102 ssd 3.49219 1.0 3.5 TiB 669 GiB 654 GiB 12 GiB 2.3 GiB 2.8 TiB 18.71 0.27 141 up
104 ssd 3.49219 1.0 3.5 TiB 721 GiB 687 GiB 31 GiB 3.0 GiB 2.8 TiB 20.16 0.30 144 up
106 ssd 3.49219 1.0 3.5 TiB 715 GiB 646 GiB 65 GiB 3.8 GiB 2.8 TiB 19.99 0.29 143 up
108 ssd 3.49219 1.0 3.5 TiB 729 GiB 691 GiB 36 GiB 2.6 GiB 2.8 TiB 20.38 0.30 156 up
109 ssd 3.49219 1.0 3.5 TiB 732 GiB 684 GiB 45 GiB 3.0 GiB 2.8 TiB 20.47 0.30 146 up
110 ssd 3.49219 1.0 3.5 TiB 773 GiB 743 GiB 28 GiB 2.7 GiB 2.7 TiB 21.63 0.32 154 up
111 ssd 3.49219 1.0 3.5 TiB 708 GiB 660 GiB 45 GiB 2.7 GiB 2.8 TiB 19.78 0.29 146 up

The % fill rate is no different than before; it fluctuates hard.
[ceph-users] PG Balancer Upmap mode not working
@Wido Den Hollander That doesn't explain why it's between 76 and 92 PGs; that is far from equal. Raising PGs to 100 per OSD is an old recommendation anyway; anything 60+ should be fine. Not an excuse for the distribution failure in this case. I am expecting more or less equal PGs/OSD.
[ceph-users] PG Balancer Upmap mode not working
Hi, the docs say the upmap mode is trying to achieve a perfect distribution, as in an equal number of PGs per OSD. This is what I got (v14.2.4):

ID  CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP   META    AVAIL   %USE  VAR  PGS STATUS
0   ssd 3.49219 1.0 3.5 TiB 794 GiB 753 GiB 38 GiB 3.4 GiB 2.7 TiB 22.20 0.32 82 up
1   ssd 3.49219 1.0 3.5 TiB 800 GiB 751 GiB 45 GiB 3.7 GiB 2.7 TiB 22.37 0.33 84 up
2   ssd 3.49219 1.0 3.5 TiB 846 GiB 792 GiB 50 GiB 3.6 GiB 2.7 TiB 23.66 0.35 88 up
3   ssd 3.49219 1.0 3.5 TiB 812 GiB 776 GiB 33 GiB 3.3 GiB 2.7 TiB 22.71 0.33 85 up
4   ssd 3.49219 1.0 3.5 TiB 768 GiB 730 GiB 34 GiB 4.1 GiB 2.7 TiB 21.47 0.31 83 up
6   ssd 3.49219 1.0 3.5 TiB 765 GiB 731 GiB 31 GiB 3.3 GiB 2.7 TiB 21.40 0.31 82 up
8   ssd 3.49219 1.0 3.5 TiB 872 GiB 828 GiB 41 GiB 3.2 GiB 2.6 TiB 24.40 0.36 85 up
10  ssd 3.49219 1.0 3.5 TiB 789 GiB 743 GiB 42 GiB 3.3 GiB 2.7 TiB 22.05 0.32 82 up
5   ssd 3.49219 1.0 3.5 TiB 719 GiB 683 GiB 32 GiB 3.9 GiB 2.8 TiB 20.12 0.29 78 up
7   ssd 3.49219 1.0 3.5 TiB 741 GiB 698 GiB 39 GiB 3.8 GiB 2.8 TiB 20.73 0.30 79 up
9   ssd 3.49219 1.0 3.5 TiB 709 GiB 664 GiB 41 GiB 3.5 GiB 2.8 TiB 19.82 0.29 78 up
11  ssd 3.49219 1.0 3.5 TiB 858 GiB 834 GiB 22 GiB 2.4 GiB 2.7 TiB 23.99 0.35 82 up
101 ssd 3.49219 1.0 3.5 TiB 815 GiB 774 GiB 38 GiB 3.5 GiB 2.7 TiB 22.80 0.33 80 up
103 ssd 3.49219 1.0 3.5 TiB 827 GiB 783 GiB 40 GiB 3.3 GiB 2.7 TiB 23.11 0.34 81 up
105 ssd 3.49219 1.0 3.5 TiB 797 GiB 759 GiB 36 GiB 2.5 GiB 2.7 TiB 22.30 0.33 81 up
107 ssd 3.49219 1.0 3.5 TiB 840 GiB 788 GiB 50 GiB 2.8 GiB 2.7 TiB 23.50 0.34 83 up
100 ssd 3.49219 1.0 3.5 TiB 728 GiB 678 GiB 47 GiB 2.4 GiB 2.8 TiB 20.36 0.30 78 up
102 ssd 3.49219 1.0 3.5 TiB 764 GiB 750 GiB 12 GiB 2.2 GiB 2.7 TiB 21.37 0.31 76 up
104 ssd 3.49219 1.0 3.5 TiB 795 GiB 761 GiB 31 GiB 2.5 GiB 2.7 TiB 22.22 0.33 78 up
106 ssd 3.49219 1.0 3.5 TiB 730 GiB 665 GiB 62 GiB 2.8 GiB 2.8 TiB 20.41 0.30 78 up
108 ssd 3.49219 1.0 3.5 TiB 849 GiB 808 GiB 38 GiB 2.5 GiB 2.7 TiB 23.73 0.35 92 up
109 ssd 3.49219 1.0 3.5 TiB 798 GiB 754 GiB 41 GiB 2.7 GiB 2.7 TiB 22.30 0.33 83 up
110 ssd 3.49219 1.0 3.5 TiB 840 GiB 810 GiB 28 GiB 2.4 GiB 2.7 TiB 23.49 0.34 85 up
111 ssd 3.49219 1.0 3.5 TiB 788 GiB 741 GiB 45 GiB 2.5 GiB 2.7 TiB 22.04 0.32 85 up

PGs are badly distributed.

ceph balancer status
{
    "active": true,
    "plans": [],
    "mode": "upmap"
}

Is it because of this?

health: HEALTH_WARN
            Failed to send data to Zabbix
            1 subtrees have overcommitted pool target_size_bytes
            1 subtrees have overcommitted pool target_size_ratio

Any ideas why it's not working?
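One thing worth checking (a sketch, assuming default balancer settings): in Nautilus the upmap balancer only optimizes until every OSD is within `upmap_max_deviation` PGs of the mean, and the default deviation is larger than 1, so tightening it may make the balancer keep working:

  ceph config set mgr mgr/balancer/upmap_max_deviation 1
  ceph balancer eval                  # current distribution score
  ceph balancer optimize myplan       # "myplan" is a placeholder plan name
  ceph balancer show myplan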
[ceph-users] how to find the lazy egg - poor performance - interesting observations [klartext]
This only happens with this one specific node? Checked system logs? Checked SMART on all disks? I mean, technically it's expected to have slower writes when the third node is there, it's by ceph design.
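A minimal sketch for hunting a slow disk, first from the ceph side and then via SMART (/dev/sdX is a placeholder for the suspect device):

  ceph osd perf | sort -k2 -n | tail   # worst commit latencies last
  smartctl -a /dev/sdX                 # then inspect the suspect drive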
[ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion
Zap had an issue back then and never worked properly; you have to dd manually. We always played it safe and went 2-4GB in, just to be sure. That should fix your issue.
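Roughly what that manual wipe looks like (a sketch; /dev/sdX is a placeholder, and the 4 GiB count is the "play it safe" margin mentioned above):

  dd if=/dev/zero of=/dev/sdX bs=1M count=4096 oflag=direct
  # newer ceph-volume can also destroy the LVM metadata in one go:
  ceph-volume lvm zap --destroy /dev/sdX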
[ceph-users] rebalance stuck backfill_toofull, OSD NOT full
v14.2.4. Following issue:

PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
pg 1.285 is active+remapped+backfill_toofull, acting [118,94,84]

BUT:

118 hdd 9.09470 0.8 9.1 TiB 7.4 TiB 7.4 TiB 12 KiB 19 GiB 1.7 TiB 81.53 1.16 38 up

Even with an adjusted backfillfull ratio of 0.94 nothing is moving (including after restarting the OSD). This is dangerous because it blocks recovery. This happens because there's a bug in the PG distribution algorithm: due to improper balance my PG counts are all over the place, some OSDs are half empty and a few are up to 90%. How do I fix this rebalance issue now? I already googled and only came up with adjusting ratios or restarting the OSD, but nothing helps. Thanks for the help
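For reference, the ratio adjustment and the follow-up checks look roughly like this (pg 1.285 is taken from the report above; whether this unsticks the backfill depends on the fill level of the actual backfill target):

  ceph osd set-backfillfull-ratio 0.94
  ceph pg 1.285 query | less     # compare "up" vs "acting" to see where it wants to move
  ceph osd df tree               # check the target OSD's fill level, not just the acting set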
[ceph-users] feature set mismatch CEPH_FEATURE_MON_GV kernel 5.0?
So it seems like for some reason librados is used now instead of the kernel module, and this produces the error. But we have all the latest Nautilus repos installed on the clients... so why would librados throw a compatibility issue? Client compatibility level is set to Luminous.
Re: [ceph-users] changing set-require-min-compat-client will cause hiccup?
Hi, it is NOT safe. All clients fail to mount rbds now :(

On Wednesday, 30 October 2019, 09:33:16 EET, Konstantin Shalygin wrote:

> Hi, I need to change set-require-min-compat-client to use upmap mode for the PG balancer. Will this cause a disconnect of all clients? We're talking cephfs and RBD images for VMs. Or is it safe to switch that live?

Is safe.

k
[ceph-users] feature set mismatch CEPH_FEATURE_MON_GV kernel 5.0?
Hi, we're on v14.2.4 and nothing but that. All clients and servers run Ubuntu 18.04 LTS with kernel 5.0.0-20. We're seeing this error:

MountVolume.WaitForAttach failed for volume "pvc-45a86719-edb9-11e9-9f38-02000a030111" : fail to check rbd image status with: (exit status 110), rbd output:
2019-10-31 06:17:53.295823 7ff74ecea700 0 -- 10.3.3.56:0/2361280277 >> 10.3.2.3:6789/0 pipe(0x561d0ab0 sd=3 :51770 s=1 pgs=0 cs=0 l=1 c=0x561d0888a5f0).connect protocol feature mismatch, my 27ffefdfbfff < peer 27fddff8efacbfff missing 20
2019-10-31 06:17:53.295884 7ff74ecea700 0 -- 10.3.3.56:0/2361280277 >> 10.3.2.3:6789/0 pipe(0x561d0ab0 sd=3 :51770 s=1 pgs=0 cs=0 l=1

Following https://ceph.io/geen-categorie/feature-set-mismatch-error-on-ceph-kernel-client/ (it's very old but the only info I found), this means CEPH_FEATURE_MON_GV is missing? But that makes no sense. Can someone enlighten me please? Client compatibility is set to Luminous to use upmap for the PG balancer. Thanks
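Two commands that help narrow this kind of mismatch down (no assumptions beyond a Nautilus cluster):

  ceph osd get-require-min-compat-client   # what the cluster demands
  ceph features                            # which feature sets the connected clients/daemons actually report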
[ceph-users] changing set-require-min-compat-client will cause hiccup?
Hi, I need to change set-require-min-compat-client to use upmap mode for the PG balancer. Will this cause a disconnect of all clients? We're talking cephfs and RBD images for VMs. Or is it safe to switch that live? Thanks
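For reference, the change itself is a one-liner, and it is refused if older clients are connected unless forced (a sketch):

  ceph osd set-require-min-compat-client luminous
  # only if you are certain no pre-luminous clients exist:
  ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it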
[ceph-users] very high ram usage by OSDs on Nautilus
Yes, you were right: somehow there was an unusually high memory target set, not sure where it came from. I set it back to normal now; that should fix it, I guess. Thanks
[ceph-users] very high ram usage by OSDs on Nautilus
Ok, looking at the mempool dump, what does it tell me? This affects multiple OSDs; I get crashes almost every hour.

{
  "mempool": {
    "by_pool": {
      "bloom_filter": { "items": 0, "bytes": 0 },
      "bluestore_alloc": { "items": 2545349, "bytes": 20362792 },
      "bluestore_cache_data": { "items": 28759, "bytes": 6972870656 },
      "bluestore_cache_onode": { "items": 2885255, "bytes": 1892727280 },
      "bluestore_cache_other": { "items": 202831651, "bytes": 5403585971 },
      "bluestore_fsck": { "items": 0, "bytes": 0 },
      "bluestore_txc": { "items": 21, "bytes": 15792 },
      "bluestore_writing_deferred": { "items": 77, "bytes": 7803168 },
      "bluestore_writing": { "items": 4, "bytes": 5319827 },
      "bluefs": { "items": 5242, "bytes": 175096 },
      "buffer_anon": { "items": 726644, "bytes": 193214370 },
      "buffer_meta": { "items": 754360, "bytes": 66383680 },
      "osd": { "items": 29, "bytes": 377464 },
      "osd_mapbl": { "items": 50, "bytes": 3492082 },
      "osd_pglog": { "items": 99011, "bytes": 46170592 },
      "osdmap": { "items": 48130, "bytes": 1151208 },
      "osdmap_mapping": { "items": 0, "bytes": 0 },
      "pgmap": { "items": 0, "bytes": 0 },
      "mds_co": { "items": 0, "bytes": 0 },
      "unittest_1": { "items": 0, "bytes": 0 },
      "unittest_2": { "items": 0, "bytes": 0 }
    },
    "total": { "items": 209924582, "bytes": 14613649978 }
  }
}
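Given that most of the ~14.6 GB above sits in the bluestore caches, comparing against the configured memory target is the obvious next step (osd.0 is a placeholder id):

  ceph daemon osd.0 config show | grep osd_memory_target
  ceph daemon osd.0 dump_mempools    # the dump shown above comes from this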
Re: [ceph-users] Ceph is moving data ONLY to near-full OSDs [BUG]
I was following the pg autoscaler recommendations, and I did not get a recommendation to raise the PGs there. I'll try that; I am raising it already. But it still seems weird why it would move data onto almost-full OSDs. See the data distribution, it's horrible, ranging from 60 to almost 90% full. The PGs are not equally distributed, otherwise it'd be a PG-size issue. Thanks

On Sunday, 27 October 2019, 20:33:11 EET, Wido den Hollander wrote:

On 10/26/19 8:01 AM, Philippe D'Anjou wrote:
> V14.2.4
> So, this is not new, this happens every time there is a rebalance, now
> because of raising PGs.
> ...
[ceph-users] very high ram usage by OSDs on Nautilus
Hi, we are seeing quite high memory usage by OSDs since Nautilus, averaging 10GB/OSD for 10TB HDDs. But I had OOM issues on 128GB systems because some single OSD processes used up to 32%. Here is an example of how they look on average: https://i.imgur.com/kXCtxMe.png Is that normal? I have never seen this on Luminous. Memory leaks? Using all default values, the OSDs have no special configuration. Use case is cephfs. v14.2.4 on Ubuntu 18.04 LTS. Seems a bit high? Thanks for help
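A minimal sketch of checking and capping the per-OSD target (4 GiB below is the usual default, used here as an illustrative value, not a recommendation for this cluster):

  ceph config get osd.0 osd_memory_target            # osd.0 is a placeholder id
  ceph config set osd osd_memory_target 4294967296   # 4 GiB per OSD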
[ceph-users] Ceph is moving data ONLY to near-full OSDs [BUG]
V14.2.4. So, this is not new; this happens every time there is a rebalance, now because of raising PGs. The PG balancer is disabled because I thought it was the reason, but apparently it's not, and it ain't helping either. Ceph is totally borged: it's only moving data onto nearfull OSDs, causing issues. See this after the PG raise.

health: HEALTH_WARN
            3 nearfull osd(s)
            2 pool(s) nearfull

08:44 am

92 hdd 9.09470 1.0 9.1 TiB 7.5 TiB 7.5 TiB 48 KiB 19 GiB 1.6 TiB 82.79 1.25 39 up
71 hdd 9.09470 1.0 9.1 TiB 7.6 TiB 7.6 TiB 88 KiB 20 GiB 1.4 TiB 84.09 1.27 38 up
21 hdd 9.09470 1.0 9.1 TiB 7.6 TiB 7.6 TiB 60 KiB 20 GiB 1.5 TiB 84.05 1.27 36 up

08:54 am

92 hdd 9.09470 1.0 9.1 TiB 7.5 TiB 7.5 TiB 48 KiB 19 GiB 1.6 TiB 82.81 1.25 39 up
71 hdd 9.09470 1.0 9.1 TiB 7.7 TiB 7.6 TiB 88 KiB 20 GiB 1.4 TiB 84.14 1.27 38 up
21 hdd 9.09470 1.0 9.1 TiB 7.6 TiB 7.6 TiB 60 KiB 20 GiB 1.4 TiB 84.10 1.27 36 up

14 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  76 KiB 17 GiB 2.6 TiB 71.33 1.07 32 up
19 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.2 TiB  52 KiB 17 GiB 2.8 TiB 68.81 1.04 30 up
22 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.2 TiB  92 KiB 17 GiB 2.8 TiB 68.90 1.04 32 up
25 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB 219 KiB 17 GiB 2.9 TiB 68.11 1.03 31 up
30 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  20 KiB 17 GiB 2.6 TiB 71.41 1.08 33 up
33 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  40 KiB 17 GiB 2.6 TiB 71.30 1.07 32 up
34 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  36 KiB 17 GiB 2.6 TiB 71.33 1.07 30 up
35 hdd 9.09470 1.0 9.1 TiB 6.6 TiB 6.6 TiB 124 KiB 17 GiB 2.5 TiB 72.61 1.09 32 up
12 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB  24 KiB 18 GiB 2.4 TiB 73.84 1.11 32 up
16 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.4 TiB  96 KiB 17 GiB 2.6 TiB 71.08 1.07 29 up
17 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  60 KiB 17 GiB 2.6 TiB 71.41 1.08 31 up
20 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB  92 KiB 17 GiB 2.9 TiB 68.57 1.03 28 up
23 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  36 KiB 17 GiB 2.6 TiB 71.37 1.08 29 up
26 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  84 KiB 17 GiB 2.7 TiB 70.02 1.06 30 up
28 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  28 KiB 17 GiB 2.7 TiB 70.11 1.06 30 up
31 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  56 KiB 17 GiB 2.6 TiB 71.26 1.07 32 up
13 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB  24 KiB 18 GiB 2.4 TiB 73.84 1.11 31 up
15 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  44 KiB 17 GiB 2.6 TiB 71.35 1.08 29 up
18 hdd 9.09470 1.0 9.1 TiB 5.8 TiB 5.8 TiB  76 KiB 16 GiB 3.3 TiB 63.70 0.96 26 up
21 hdd 9.09470 1.0 9.1 TiB 7.6 TiB 7.6 TiB  60 KiB 20 GiB 1.4 TiB 84.10 1.27 36 up
24 hdd 9.09470 1.0 9.1 TiB 5.8 TiB 5.8 TiB  64 KiB 15 GiB 3.3 TiB 63.67 0.96 28 up
27 hdd 9.09470 1.0 9.1 TiB 6.0 TiB 6.0 TiB  48 KiB 17 GiB 3.1 TiB 66.03 1.00 28 up
29 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.3 TiB  28 KiB 18 GiB 2.7 TiB 69.93 1.05 34 up
32 hdd 9.09470 1.0 9.1 TiB 6.0 TiB 6.0 TiB  20 KiB 17 GiB 3.1 TiB 66.20 1.00 28 up
37 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  32 KiB 18 GiB 2.7 TiB 70.59 1.06 31 up
39 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  32 KiB 19 GiB 2.7 TiB 70.50 1.06 29 up
41 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.2 TiB  52 KiB 17 GiB 2.8 TiB 68.79 1.04 30 up
43 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.2 TiB  48 KiB 17 GiB 2.8 TiB 68.84 1.04 28 up
45 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB  80 KiB 18 GiB 2.4 TiB 74.02 1.12 33 up
46 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB  36 KiB 18 GiB 2.4 TiB 73.88 1.11 30 up
48 hdd 9.09470 1.0 9.1 TiB 6.6 TiB 6.6 TiB 101 KiB 17 GiB 2.5 TiB 72.57 1.09 31 up
50 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.2 TiB  96 KiB 17 GiB 2.8 TiB 68.86 1.04 31 up
36 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB  60 KiB 18 GiB 2.4 TiB 73.82 1.11 32 up
38 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  76 KiB 17 GiB 2.7 TiB 70.62 1.06 33 up
40 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB   8 KiB 18 GiB 2.4 TiB 73.89 1.11 33 up
42 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.2 TiB 120 KiB 17 GiB 2.8 TiB 68.82 1.04 28 up
44 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB  68 KiB 17 GiB 2.6 TiB 71.39 1.08 29 up
47 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB  68 KiB 17 GiB 2.7 TiB 70.06 1.06 29 up
49 hdd 9.09470 1.0 9.1 TiB 6.0 TiB 6.0 TiB  40 KiB 16 GiB
[ceph-users] How to reset compat weight-set changes caused by PG balancer module?
Apparently the PG balancer crush-compat mode adds some crush bucket weights. Those cause major havoc in our cluster; our PG distribution is all over the place. Seeing things like this:

...
97 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB 32 KiB 17 GiB 2.8 TiB 69.03 1.08 28 up
98 hdd 9.09470 1.0 9.1 TiB 4.5 TiB 4.5 TiB 96 KiB 11 GiB 4.6 TiB 49.51 0.77 20 up
99 hdd 9.09470 1.0 9.1 TiB 7.0 TiB 6.9 TiB 80 KiB 18 GiB 2.1 TiB 76.47 1.20 31 up

Fill rates range from 50 to 90%. Unfortunately, reweighting doesn't seem to help, and I suspect it's because of bucket weights, which are WEIRD:

bucket_id -42 weight_set [ [ 7.846 11.514 9.339 9.757 10.173 8.900 9.164 6.759 ]

I disabled the module already, but the rebalance is broken now. Do I have to hand-reset this and push a new crush map? This is a sensitive production cluster; I don't feel good about that. Thanks for any ideas.
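A hedged sketch of inspecting and then removing the compat weight-set: rm-compat is exactly the command that drops those extra bucket weights, but it will trigger data movement, so it's worth eyeballing the decompiled map first:

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt   # look for the "choose_args" / weight_set section
  ceph osd crush weight-set rm-compat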
[ceph-users] OSD PGs are not being removed - Full OSD issues
This is related to https://tracker.ceph.com/issues/42341 and to http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-October/037017.html

After closer inspection yesterday we found that PGs are not being removed from OSDs, which then leads to nearfull errors and explains why reweights don't work. This is a BIG issue because I have to intervene manually all the time so the cluster doesn't die. 14.2.4, fresh setup, all defaults. The PG balancer is turned off now; I begin to wonder if it's at fault. My crush map: https://termbin.com/3t8l

What was mentioned is that the bucket weights are WEIRD. I never touched them. The crush weights that are unusual belong to nearfull osd.53, and some are set to 10 from a previous manual intervention. The PGs not being purged is one issue; the original issue is why the f ceph fills ONLY my nearfull OSDs in the first place. It seems to always select the fullest OSD to write more data onto. If I reweight it, it starts alerting for another almost-full OSD because it intends to write everything there, despite everything else being only at about 60%. I don't know how to debug this, it's a MAJOR PITA. Hope someone has an idea, because I can't fight this 24/7; I'm getting pretty tired of it. Thanks
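One way to confirm that an OSD still holds PGs it should have given up (osd 53 is used here because it is the nearfull one mentioned above):

  ceph pg ls-by-osd 53 | wc -l                 # what the cluster currently maps there
  ceph daemon osd.53 perf dump | grep numpg    # incl. numpg_stray/numpg_removing the OSD still carries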
Re: [ceph-users] Issues with data distribution on Nautilus / weird filling behavior
Still seeing this issue. I had equal distribution, but now I am copying data onto cephfs and 1 OSD is acting up again.

...
52 hdd 9.09470 1.0 9.1 TiB 5.4 TiB 5.4 TiB  84 KiB 14 GiB 3.7 TiB 59.32 1.00 26 up
53 hdd 8.0     1.0 9.1 TiB 7.9 TiB 7.9 TiB  64 KiB 19 GiB 1.2 TiB 86.83 1.46 38 up
54 hdd 9.09470 1.0 9.1 TiB 5.0 TiB 5.0 TiB 136 KiB 13 GiB 4.1 TiB 54.80 0.92 24 up
...

Now I again have to manually reweight to prevent bigger issues. How do I fix this?

On Wednesday, 2 October 2019, 08:49:50 EEST, Philippe D'Anjou wrote:

Hi, this is a fresh Nautilus cluster, but there is a second old one that was upgraded from Luminous to Nautilus; both experience the same symptoms. First of all, the data distribution on the OSDs is very bad. ...
Re: [ceph-users] mon sudden crash loop - pinned map
After trying to disable the paxos service trim temporarily (since that seemed to trigger it initially), we now see this:

"assert_condition": "from != to",
"assert_func": "void PaxosService::trim(MonitorDBStore::TransactionRef, version_t, version_t)",
"assert_file": "/build/ceph-14.2.4/src/mon/PaxosService.cc",
"assert_line": 412,
"assert_thread_name": "safe_timer",
"assert_msg": "/build/ceph-14.2.4/src/mon/PaxosService.cc: In function 'void PaxosService::trim(MonitorDBStore::TransactionRef, version_t, version_t)' thread 7fd31cb9a700 time 2019-10-10 13:13:59.394987\n/build/ceph-14.2.4/src/mon/PaxosService.cc: 412: FAILED ceph_assert(from != to)\n",

We need some crutch... all I need is a running mon to mount CephFS, data is still fine.

On Wednesday, 9 October 2019, 20:19:42 EEST, Gregory Farnum wrote:
...
Re: [ceph-users] mon sudden crash loop - pinned map
How do I import an osdmap in Nautilus? I saw documentation for older versions, but it seems one can now only export, not import anymore?

On Thursday, 10 October 2019, 08:52:03 EEST, Philippe D'Anjou wrote:
...
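For OSD stores there is an import counterpart via ceph-objectstore-tool (a sketch; note this targets an OSD's store, not the mon DB, and the epoch, OSD id, and paths below are placeholders):

  ceph osd getmap 257349 -o osdmap.257349    # export a specific epoch from the cluster
  systemctl stop ceph-osd@0
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op set-osdmap --file osdmap.257349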
Re: [ceph-users] mon sudden crash loop - pinned map
I don't think this has anything to do with CephFS; the mon crashes for the same reason even without the mds running. I still have the old rocksdb files, but they had a corruption issue, not sure if that's easier to fix; there haven't been any changes on the cluster in between. This is a disaster rebuild: we managed to get all cephfs data back online, apart from some metadata, and we have been copying for the last weeks, but suddenly the mon died, first of rocksdb corruption and now, after the repair, because of that osdmap issue.

On Wednesday, 9 October 2019, 20:19:42 EEST, Gregory Farnum wrote:

On Mon, Oct 7, 2019 at 11:11 PM Philippe D'Anjou wrote:
> ...

Sounds like you actually lost some data. You'd need to manage a repair by trying to figure out why CephFS needs that map and performing surgery on either the monitor (to give it a fake map or fall back to something else) or the CephFS data structures. You might also be able to rebuild the CephFS metadata using the disaster recovery tools to work around it, but no guarantees there since I don't understand why CephFS is digging up OSD maps that nobody else in the cluster cares about.
-Greg
> ...
Re: [ceph-users] mon sudden crash loop - pinned map
Hi, unfortunately it's a single mon, because we had a major outage on this cluster and it's just being used to copy off data now. We weren't able to add more mons, because once a second mon was added it crashed the first one (there's a bug tracker ticket). I still have the old rocksdb files from before I ran a repair on it, but they had the rocksdb corruption issue (not sure why that happened, it ran fine for 2 months).

Any options? I mean, everything still works, data is accessible, RBDs run; only the cephfs mount is obviously not working. For the short amount of time the mon starts it reports no issues and all commands run fine.

On Monday, 7 October 2019, 21:59:20 EEST, Gregory Farnum wrote:

On Sun, Oct 6, 2019 at 1:08 AM Philippe D'Anjou wrote:
> I had to use rocksdb repair tool before because the rocksdb files got corrupted, for another reason (another bug possibly). Maybe that is why now it crash loops, although it ran fine for a day.

Yeah looks like it lost a bit of data. :/

> What is meant with "turn it off and rebuild from remainder"?

If only one monitor is crashing, you can remove it from the quorum, zap all the disks, and add it back so that it recovers from its healthy peers.
-Greg
> ...
Re: [ceph-users] mon sudden crash loop - pinned map
I had to use the rocksdb repair tool before because the rocksdb files got corrupted, for another reason (another bug possibly). Maybe that is why it now crash loops, although it ran fine for a day. What is meant with "turn it off and rebuild from remainder"?

On Saturday, 5 October 2019, 02:03:44 EEST, Gregory Farnum wrote:

Hmm, that assert means the monitor tried to grab an OSDMap it had on disk but it didn't work. (In particular, a "pinned" full map which we kept around after trimming the others to save on disk space.)

That *could* be a bug where we didn't have the pinned map and should have (or incorrectly thought we should have), but this code was in Mimic as well as Nautilus and I haven't seen similar reports. So it could also mean that something bad happened to the monitor's disk or Rocksdb store. Can you turn it off and rebuild from the remainder, or do they all exhibit this bug?

On Fri, Oct 4, 2019 at 5:44 AM Philippe D'Anjou wrote:
> ...
[ceph-users] mon sudden crash loop - pinned map
Hi, our mon is acting up all of a sudden, dying in a crash loop with the following:

2019-10-04 14:00:24.339583 lease_expire=0.00 has v0 lc 4549352
    -3> 2019-10-04 14:00:24.335 7f6e5d461700  5 mon.km-fsn-1-dc4-m1-797678@0(leader).paxos(paxos active c 4548623..4549352) is_readable = 1 - now=2019-10-04 14:00:24.339620 lease_expire=0.00 has v0 lc 4549352
    -2> 2019-10-04 14:00:24.343 7f6e5d461700 -1 mon.km-fsn-1-dc4-m1-797678@0(leader).osd e257349 get_full_from_pinned_map closest pinned map ver 252615 not available! error: (2) No such file or directory
    -1> 2019-10-04 14:00:24.343 7f6e5d461700 -1 /build/ceph-14.2.4/src/mon/OSDMonitor.cc: In function 'int OSDMonitor::get_full_from_pinned_map(version_t, ceph::bufferlist&)' thread 7f6e5d461700 time 2019-10-04 14:00:24.347580
/build/ceph-14.2.4/src/mon/OSDMonitor.cc: 3932: FAILED ceph_assert(err == 0)

ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f6e68eb064e]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f6e68eb0829]
 3: (OSDMonitor::get_full_from_pinned_map(unsigned long, ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
 4: (OSDMonitor::get_version_full(unsigned long, unsigned long, ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
 5: (OSDMonitor::encode_trim_extra(std::shared_ptr, unsigned long)+0x8c) [0x717c3c]
 6: (PaxosService::maybe_trim()+0x473) [0x707443]
 7: (Monitor::tick()+0xa9) [0x5ecf39]
 8: (C_MonContext::finish(int)+0x39) [0x5c3f29]
 9: (Context::complete(int)+0x9) [0x6070d9]
 10: (SafeTimer::timer_thread()+0x190) [0x7f6e68f45580]
 11: (SafeTimerThread::entry()+0xd) [0x7f6e68f46e4d]
 12: (()+0x76ba) [0x7f6e67cab6ba]
 13: (clone()+0x6d) [0x7f6e674d441d]

     0> 2019-10-04 14:00:24.347 7f6e5d461700 -1 *** Caught signal (Aborted) ** in thread 7f6e5d461700 thread_name:safe_timer

ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
 1: (()+0x11390) [0x7f6e67cb5390]
 2: (gsignal()+0x38) [0x7f6e67402428]
 3: (abort()+0x16a) [0x7f6e6740402a]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x7f6e68eb069f]
 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f6e68eb0829]
 6: (OSDMonitor::get_full_from_pinned_map(unsigned long, ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
 7: (OSDMonitor::get_version_full(unsigned long, unsigned long, ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
 8: (OSDMonitor::encode_trim_extra(std::shared_ptr, unsigned long)+0x8c) [0x717c3c]
 9: (PaxosService::maybe_trim()+0x473) [0x707443]
 10: (Monitor::tick()+0xa9) [0x5ecf39]
 11: (C_MonContext::finish(int)+0x39) [0x5c3f29]
 12: (Context::complete(int)+0x9) [0x6070d9]
 13: (SafeTimer::timer_thread()+0x190) [0x7f6e68f45580]
 14: (SafeTimerThread::entry()+0xd) [0x7f6e68f46e4d]
 15: (()+0x76ba) [0x7f6e67cab6ba]
 16: (clone()+0x6d) [0x7f6e674d441d]
NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.

This was running fine for 2 months now; it's a crashed cluster that is in recovery. Any suggestions?
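Before any further surgery on a mon in this state, an offline copy of its store can be taken (a sketch; the mon name is taken from the log above, and the paths are placeholders):

  systemctl stop ceph-mon@km-fsn-1-dc4-m1-797678
  ceph-monstore-tool /var/lib/ceph/mon/ceph-km-fsn-1-dc4-m1-797678 store-copy /root/mon-store-backup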
[ceph-users] Issues with data distribution on Nautilus / weird filling behavior
Hi, this is a fresh Nautilus cluster, but there is a second, older one that was upgraded from Luminous to Nautilus, and both show the same symptoms. First of all, the data distribution across the OSDs is very uneven. That could be due to the low PG count, although so far I have received no recommendation to raise the PG number.

ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
14 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB 68 KiB 15 GiB 2.9 TiB 67.83 1.07 32 up
19 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB 56 KiB 16 GiB 2.4 TiB 73.56 1.16 35 up
22 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB 88 KiB 15 GiB 2.8 TiB 69.39 1.09 32 up
25 hdd 9.09470 1.0 9.1 TiB 6.0 TiB 6.0 TiB 88 KiB 16 GiB 3.1 TiB 66.01 1.04 31 up
30 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB 84 KiB 16 GiB 2.8 TiB 69.36 1.09 33 up
33 hdd 9.09470 1.0 9.1 TiB 6.0 TiB 6.0 TiB 60 KiB 15 GiB 3.1 TiB 65.84 1.03 31 up
34 hdd 9.09470 1.0 9.1 TiB 5.8 TiB 5.8 TiB 124 KiB 14 GiB 3.3 TiB 63.84 1.00 29 up
35 hdd 9.09470 1.0 9.1 TiB 6.0 TiB 6.0 TiB 32 KiB 15 GiB 3.1 TiB 65.82 1.03 31 up
12 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB 44 KiB 15 GiB 3.0 TiB 67.32 1.06 31 up
16 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB 72 KiB 17 GiB 2.4 TiB 73.52 1.16 35 up
17 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB 80 KiB 15 GiB 3.0 TiB 67.24 1.06 32 up
20 hdd 9.09470 1.0 9.1 TiB 6.8 TiB 6.8 TiB 64 KiB 17 GiB 2.3 TiB 74.45 1.17 35 up
23 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB 80 KiB 16 GiB 2.6 TiB 71.46 1.12 34 up
26 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB 68 KiB 16 GiB 2.7 TiB 70.45 1.11 33 up
28 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.0 TiB 64 KiB 15 GiB 3.0 TiB 66.52 1.05 31 up
31 hdd 9.09470 1.0 9.1 TiB 6.6 TiB 6.6 TiB 52 KiB 16 GiB 2.5 TiB 72.21 1.13 34 up
13 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB 68 KiB 16 GiB 2.4 TiB 73.56 1.16 35 up
15 hdd 9.09470 1.0 9.1 TiB 6.7 TiB 6.7 TiB 44 KiB 17 GiB 2.4 TiB 73.58 1.16 35 up
18 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB 56 KiB 16 GiB 2.6 TiB 71.46 1.12 34 up
21 hdd 8.0 1.0 9.1 TiB 5.9 TiB 5.9 TiB 84 KiB 15 GiB 3.2 TiB 65.14 1.02 31 up
24 hdd 9.09470 1.0 9.1 TiB 6.6 TiB 6.6 TiB 76 KiB 16 GiB 2.5 TiB 72.21 1.13 34 up
27 hdd 9.09470 1.0 9.1 TiB 6.5 TiB 6.5 TiB 64 KiB 16 GiB 2.6 TiB 71.42 1.12 34 up
29 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB 80 KiB 15 GiB 2.9 TiB 68.16 1.07 32 up
32 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB 132 KiB 15 GiB 3.0 TiB 66.91 1.05 31 up
37 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB 16 KiB 17 GiB 2.8 TiB 69.38 1.09 33 up
39 hdd 9.09470 1.0 9.1 TiB 6.4 TiB 6.4 TiB 92 KiB 16 GiB 2.7 TiB 70.44 1.11 33 up
41 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB 108 KiB 15 GiB 2.9 TiB 67.95 1.07 32 up
43 hdd 9.09470 1.0 9.1 TiB 5.9 TiB 5.9 TiB 24 KiB 15 GiB 3.2 TiB 65.20 1.02 31 up
45 hdd 9.09470 1.0 9.1 TiB 5.9 TiB 5.9 TiB 72 KiB 15 GiB 3.2 TiB 65.35 1.03 31 up
46 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB 60 KiB 15 GiB 2.9 TiB 68.08 1.07 32 up
48 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB 64 KiB 15 GiB 2.9 TiB 67.96 1.07 32 up
50 hdd 9.09470 1.0 9.1 TiB 5.7 TiB 5.7 TiB 48 KiB 15 GiB 3.4 TiB 63.06 0.99 30 up
36 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB 116 KiB 15 GiB 3.0 TiB 67.45 1.06 32 up
38 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB 80 KiB 16 GiB 2.8 TiB 69.36 1.09 33 up
40 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB 84 KiB 15 GiB 3.0 TiB 67.31 1.06 32 up
42 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.0 TiB 104 KiB 15 GiB 3.0 TiB 66.59 1.05 31 up
44 hdd 9.09470 1.0 9.1 TiB 5.9 TiB 5.9 TiB 68 KiB 15 GiB 3.2 TiB 65.12 1.02 31 up
47 hdd 9.09470 1.0 9.1 TiB 5.9 TiB 5.9 TiB 88 KiB 15 GiB 3.2 TiB 65.09 1.02 31 up
49 hdd 9.09470 1.0 9.1 TiB 6.1 TiB 6.1 TiB 112 KiB 15 GiB 3.0 TiB 67.19 1.06 32 up
51 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.1 TiB 40 KiB 15 GiB 2.9 TiB 67.75 1.06 32 up
52 hdd 9.09470 1.0 9.1 TiB 5.8 TiB 5.8 TiB 80 KiB 15 GiB 3.3 TiB 63.91 1.00 30 up
53 hdd 7.0 1.0 9.1 TiB 7.3 TiB 7.2 TiB 88 KiB 18 GiB 1.8 TiB 79.84 1.25 38 up
54 hdd 9.09470 1.0 9.1 TiB 5.8 TiB 5.8 TiB 96 KiB 14 GiB 3.3 TiB 63.67 1.00 30 up
55 hdd 9.09470 1.0 9.1 TiB 5.3 TiB 5.3 TiB 48 KiB 13 GiB 3.8 TiB 58.20 0.91 27 up
56 hdd 9.09470 1.0 9.1 TiB 6.3 TiB 6.3 TiB 64 KiB 16 GiB 2.8 TiB 69.35 1.09 33 up
57 hdd 6.0 1.0 9.1 TiB 4.2 TiB 4.2 TiB 36 KiB 11 GiB 4.9 TiB 46.27 0.73 22 up
58 hdd 9.09470 1.0 9.1 TiB 6.2 TiB 6.2 TiB 104 KiB 15 GiB 2.9 TiB 68.20 1.07 32
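[Editor's note] With roughly 30 PGs per ~9 TiB OSD, per-OSD swings like the 0.73-1.25 VAR above are expected, and the upmap balancer does nothing until it is actually enabled. A minimal sketch using standard Nautilus commands (a suggestion on my part, not something confirmed in this thread):

  # is the balancer enabled, and in which mode?
  ceph balancer status

  # upmap requires all clients to speak luminous or newer
  ceph osd set-require-min-compat-client luminous
  ceph balancer mode upmap
  ceph balancer on

  # re-check the per-OSD spread after it has had time to move PGs
  ceph osd df tree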
[ceph-users] hanging/stopped recovery/rebalance in Nautilus
Hi, I have often observed that recovery/rebalance in Nautilus starts quite fast but then becomes extremely slow (2-3 objects/s), even with around 20 OSDs involved. Right now I am draining (reweighted to 0) 16 x 8 TB disks; it has been running for 4 days, and for the last 12 hours it has been stuck at:

  cluster:
    id: 2f525d60-aada-4da6-830f-7ba7b46c546b
    health: HEALTH_WARN
            Degraded data redundancy: 1070/899796274 objects degraded (0.000%), 1 pg degraded, 1 pg undersized
            1216 pgs not deep-scrubbed in time
            1216 pgs not scrubbed in time

  services:
    mon: 1 daemons, quorum km-fsn-1-dc4-m1-797678 (age 8w)
    mgr: km-fsn-1-dc4-m1-797678(active, since 6w)
    mds: xfd:1 {0=km-fsn-1-dc4-m1-797678=up:active}
    osd: 151 osds: 151 up (since 3d), 151 in (since 7d); 24 remapped pgs
    rgw: 1 daemon active (km-fsn-1-dc4-m1-797680)

  data:
    pools: 13 pools, 10433 pgs
    objects: 447.45M objects, 282 TiB
    usage: 602 TiB used, 675 TiB / 1.2 PiB avail
    pgs: 1070/899796274 objects degraded (0.000%)
         261226/899796274 objects misplaced (0.029%)
         10388 active+clean
         24 active+clean+remapped
         19 active+clean+scrubbing+deep
         1 active+undersized+degraded
         1 active+clean+scrubbing

  io:
    client: 10 MiB/s rd, 18 MiB/s wr, 141 op/s rd, 292 op/s wr

osd_max_backfills is at 16 for all OSDs. Does anyone have an idea why the rebalance completely stopped? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
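[Editor's note] A hedged troubleshooting sketch (assumptions on my part; the option names are standard Nautilus ones, the values and the osd.12 id are illustrative): when remapped PGs sit idle despite a high osd_max_backfills, it is worth confirming what the OSDs are actually running with and whether recovery is throttled elsewhere:

  # confirm the values a given OSD is really using (osd.12 is just an example)
  ceph config show osd.12 | grep -e osd_max_backfills -e osd_recovery_max_active

  # raise the throttles cluster-wide at runtime
  ceph config set osd osd_max_backfills 16
  ceph config set osd osd_recovery_max_active 8

  # list the PGs that are still remapped and which OSDs they are waiting on
  ceph pg dump pgs_brief | grep remapped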