Have you tried running a Luminous OSD with filestore instead of BlueStore?

BlueStore is all new code and uses a lot of optimizations and tricks for
fast and efficient use of memory, so some 64-bit assumptions may have snuck
in. I'm not sure how much interest there is in making sure BlueStore works on
32-bit systems at this point, but narrowing the crash down to a specific
component would certainly help.
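
As a rough sanity check on the address-space theory, here is a small Python
sketch of the arithmetic on the figures quoted further down in this thread.
It assumes that 'ps aux' reports VSZ and RSS in KiB and that a 32-bit process
gets roughly 3 GiB of usable address space (the common 3G/1G split); neither
assumption has been confirmed for this node. The wrap_to_32_bits helper is
purely hypothetical and is not taken from the BlueStore code; it only shows
what a "64-bit assumption" can look like when a large value lands in a 32-bit
integer.

```python
# Back-of-the-envelope check of the figures quoted later in this thread.
# Assumption: 'ps aux' reports VSZ and RSS in KiB.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumption: a 32-bit process can address at most 4 GiB in total, and a
# typical 3G/1G user/kernel split leaves roughly 3 GiB for the process.
ADDRESS_SPACE_32BIT = 2 ** 32
ASSUMED_USER_LIMIT = 3 * GIB

# Values copied from the quoted message (KiB).
ps_samples_kib = {
    "VSZ, typical peak": 1196392,
    "RSS, typical peak": 544024,
    "VSZ, lower run": 999536,
    "RSS, lower run": 329712,
}

# Values copied from the quoted ceph.conf settings (bytes).
cache_settings_bytes = {
    "bluestore cache size": 536870912,
    "bluestore cache kv max": 268435456,
}


def kib_to_gib(kib):
    """Convert a KiB figure to GiB."""
    return kib * KIB / GIB


def fraction_of_user_limit(kib):
    """Share of the assumed 3 GiB user address space that a KiB figure uses."""
    return kib * KIB / ASSUMED_USER_LIMIT


def wrap_to_32_bits(value):
    """Hypothetical illustration: the value left after storing an integer
    in a 32-bit unsigned field (the high bits are silently dropped)."""
    return value & 0xFFFFFFFF


for name, kib in ps_samples_kib.items():
    print(f"{name}: {kib_to_gib(kib):.2f} GiB, "
          f"{fraction_of_user_limit(kib):.0%} of the assumed user limit")

for name, size in cache_settings_bytes.items():
    print(f"{name}: {size // MIB} MiB")

# A size just over 4 GiB does not fit in 32 bits and wraps to a tiny number.
print(wrap_to_32_bits(ADDRESS_SPACE_32BIT + 5))
```

If those assumptions hold, the sampled VSZ peaks at about 1.14 GiB, well under
the assumed 3 GiB limit, and the two cache settings come to 512 MiB and
256 MiB. 'ps' samples can miss short-lived spikes, so this does not rule out
address-space exhaustion, but it is one reason to look for a 32-bit-specific
bug as well.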

On Fri, Sep 22, 2017 at 8:57 PM Dyweni - Ceph-Users <6exbab4fy...@dyweni.com>
wrote:

> It crashes with SimpleMessenger as well  (ms_type = simple)
>
>
> I've also tried with and without these two settings, but it still crashes.
> bluestore cache size = 536870912
> bluestore cache kv max = 268435456
>
>
> When using SimpleMessenger, it tells me it is crashing (Segmentation
> Fault) in 'thread_name:ms_pipe_write'.  This is common to all crashes under
> SimpleMessenger, just as 'msgr-worker-<n>' was common to all crashes under
> AsyncMessenger.
>
>
> The node I'm testing this on is running a 32bit kernel (4.12.5) and has
> 8GB ram (free -m).
>
>
> Per 'ps aux', VSZ and RSS never get much above 1196392 and 544024
> respectively.  (One time they didn't get past 999536 and 329712
> respectively.)
>
>
> Also, under SimpleMessenger, gdb is reporting stack corruption in the back
> traces.
>
>
> What other memory tuning options should I try?
>
> On 2017-09-11 08:05, Gregory Farnum wrote:
>
> You could try setting it to run with SimpleMessenger instead of
> AsyncMessenger -- the default changed across those releases.
> I imagine the root of the problem though is that with BlueStore the OSD is
> using a lot more memory than it used to and so we're overflowing the 32-bit
> address space...which means a more permanent solution might require turning
> down the memory tuning options. Sage has discussed those in various places.
>
> On Sun, Sep 10, 2017 at 11:52 PM Dyweni - Ceph-Users <6exbab4fy...@dyweni.com>
> wrote:
>
>> Hi,
>>
>> Is anyone running Ceph Luminous (12.2.0) on 32bit Linux?  Have you seen
>> any problems?
>>
>>
>>
>> My setup has been 1 MON and 7 OSDs (no MDS, RGW, etc), all running Jewel
>> (10.2.1), on 32bit, with no issues at all.
>>
>> I've upgraded everything to the latest version of Jewel (10.2.9) and still
>> no issues.
>>
>> Next I upgraded my MON to Luminous (12.2.0) and added MGR to it.  Still
>> no issues.
>>
>> Next I removed one node from the cluster, wiped it clean, upgraded it to
>> Luminous (12.2.), and created a new BlueStore data area.  Now this node
>> crashes with segmentation fault usually within a few minutes of starting
>> up.  I've loaded symbols and used GDB to examine back traces.  From what
>> I can tell, the seg faults are happening randomly, and the stack is
>> corrupted, so traces from GDB are unusable (even with all symbols
>> installed for all packages on the system). However, in all cases, the
>> seg fault is occurring in the 'msgr-worker-<n>' thread.
>>
>>
>>
>>
>> My data is fine; I would just like to get Ceph 12.2.0 running stably on
>> this node, so I can upgrade the remaining nodes and switch everything
>> over to BlueStore.
>>
>>
>>
>> Thanks,
>> Dyweni
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
