Just to circle back here: I upgraded our app from OTP 22.3.4.20 to 23 and 
we have not seen any segfaults for about 24 hours, which is solid considering 
we'd expect at least a couple in that timeframe. 

I was also able to get our Elixir release to use the debug build of Erlang 
by configuring the release to use the Erlang source dir as the ERTS 
directory. Below is the configuration. Because Mix.Release.copy_erts 
expects the ERTS dir to be named "erts-<version-number>", we 
symlinked the Erlang source dir /OTP/subdir to /OTP/subdir/erts-11.2.2.15. 
We did this in the Dockerfile right before calling mix release. 

# in our app's mix.exs
      releases: [
        app: [
          include_erts: "/OTP/subdir/erts-11.2.2.15",
          strip_beams: false,
        ],
      ],
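
The symlink step described above could be sketched in shell roughly as follows. This is a minimal illustration, not the exact line from our Dockerfile: the function name link_erts_dir and the OTP_SRC / ERTS_VSN variables are made up for the example, and the defaults simply reuse the path and version mentioned in this thread.

```shell
#!/bin/sh
set -eu

# Create <src>/erts-<vsn> as a symlink pointing back at <src>, so the OTP
# source tree can be handed to include_erts under the expected name.
# (link_erts_dir is a hypothetical helper, not part of Mix or OTP.)
link_erts_dir() {
  src="$1"
  vsn="$2"
  link="$src/erts-$vsn"
  # Skip if the link already exists, so repeated builds stay idempotent.
  [ -L "$link" ] || ln -s "$src" "$link"
}

# Assumed defaults taken from the thread; override via the environment.
OTP_SRC="${OTP_SRC:-/OTP/subdir}"
ERTS_VSN="${ERTS_VSN:-11.2.2.15}"

# Only act when the source tree is actually present.
if [ -d "$OTP_SRC" ]; then
  link_erts_dir "$OTP_SRC" "$ERTS_VSN"
fi
```

In a Dockerfile the same effect can be had with a single RUN ln -s /OTP/subdir /OTP/subdir/erts-11.2.2.15 placed before the mix release step.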

Now the cerl binary will be in the top-level erts directory assembled by 
the release. We also needed a slightly modified elixir script that runs 
cerl -debug instead of erl. 
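
To illustrate the idea behind the modified script (this is not the attached file, only a sketch), the choice of launcher can be isolated in one small function. The ELIXIR_USE_CERL variable and the erl_launcher function are hypothetical names invented for this example; they are not options that Elixir or OTP provide.

```shell
#!/bin/sh
set -eu

# Print the command used to start the VM.
# ELIXIR_USE_CERL is a hypothetical switch used only in this sketch.
erl_launcher() {
  if [ "${ELIXIR_USE_CERL:-0}" = "1" ]; then
    # cerl -debug starts the debug-compiled emulator.
    echo "cerl -debug"
  else
    echo "erl"
  fi
}
```

A launcher script could then prefix the output of erl_launcher with the release's erts bin directory, in the place where it would otherwise invoke erl directly.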

On Monday, August 22, 2022 at 10:24:39 AM UTC-7 Stephen Baldwin wrote:

> Ok I'll see about using the most recent erlang version. Thanks for the tip.
>
> On Monday, August 22, 2022 at 10:23:23 AM UTC-7 Stephen Baldwin wrote:
>
>> I can successfully run the cerl -debug VM from the symlink:
>>
>> root@:~# erts-11.2.2.15/bin/cerl -debug
>> Erlang/OTP 23 [erts-11.2.2.15] [source] [64-bit] [smp:12:12] 
>> [ds:12:12:10] [async-threads:1] [hipe] [type-assertions] [debug-compiled] 
>> [lock-checking]
>>
>> Eshell V11.2.2.15  (abort with ^G)
>> 1> 
>>
>>
>> On Monday, August 22, 2022 at 10:17:39 AM UTC-7 Stephen Baldwin wrote:
>>
>>>  Hey José,
>>>
>>> I was able to get elixir to work with cerl, such as elixir -e "IO.puts 
>>> :ok". But I could not get it to work in the release environment running in 
>>> docker. I get this error:
>>>
>>> root:~# releases/0.1.0/elixir -e "IO.puts :ok"
>>> {"init terminating in 
>>> do_boot",{undef,[{elixir,start_cli,[],[]},{init,start_em,1,[]},{init,do_boot,3,[]}]}}
>>> init terminating in do_boot 
>>> ({undef,[{elixir,start_cli,[],[]},{init,start_em,1,[]},{init,do_boot,3,[]}]})
>>>
>>> Crash dump is being written to: erl_crash.dump...done
>>>
>>> In my Dockerfile I replace the elixir script with my debug version like so:
>>>
>>> ....
>>> COPY --from=build /app/_build/prod/rel/app ./
>>> # Copy erlang source, along with erlang debug binary
>>> COPY --from=build /OTP/subdir /OTP/subdir
>>> # Symlink erlang debug binary to erts bin dir
>>> RUN ln -s /OTP/subdir/bin/cerl \
>>>     /app/releases/0.1.0/../../erts-11.2.2.15/bin/cerl
>>> # Replace elixir script with our script that runs the erlang debug binary
>>> COPY elixir-debug releases/0.1.0/elixir
>>> RUN chmod +x releases/0.1.0/elixir
>>> ...
>>>
>>> The base image for my app's Dockerfile 
>>> is hexpm/elixir:1.10.4-erlang-23.3.4.16-ubuntu-bionic-20210930, modified 
>>> to keep the Erlang source code and build the cerl debug VM.
>>>
>>> I don't think I can reasonably share a minimal app that reproduces the 
>>> issue (without sharing my app code, which I cannot do). The segfault happens 
>>> randomly after x hours and I do not know what is causing it.
>>>
>>> Attached is my modified elixir bin.
>>>
>>>
>>> On Monday, August 22, 2022 at 9:14:52 AM UTC-7 José Valim wrote:
>>>
>>>> Or perhaps please provide a minimal app that reproduces it. :)
>>>>
>>>> On Mon, Aug 22, 2022 at 6:13 PM José Valim <jose....@dashbit.co> wrote:
>>>>
>>>>> Running Elixir with cerl should just work. Can you expand on the 
>>>>> on_boot errors you get in a release?
>>>>>
>>>>> On Mon, Aug 22, 2022 at 5:46 PM Stephen Baldwin <
>>>>> stephen...@syncromsp.com> wrote:
>>>>>
>>>>>> Hello, I've been trying to debug a segfault in my Elixir app for a 
>>>>>> bit. I've learned that if debugging symbols are enabled on the Erlang VM, 
>>>>>> you can use gdb on a Linux core file to deduce where the segfault is 
>>>>>> occurring. I've rebuilt Erlang from source with debugging symbols 
>>>>>> and that all works fine, but using it with an Elixir release seems to be 
>>>>>> a bit difficult.
>>>>>>
>>>>>> I modified the elixir bin similar to 
>>>>>> https://github.com/elixir-lang/elixir/pull/11082 but I am getting 
>>>>>> on_boot errors when running the release. So simply replacing erl with 
>>>>>> cerl isn't a path to success.
>>>>>>
>>>>>> I need some help as I don't fully understand the path from an Elixir 
>>>>>> release to the Erlang VM. Any quick ways to get this to work? Otherwise, 
>>>>>> I think it would be worthwhile to have an option when building an Elixir 
>>>>>> release to use a cerl VM (debug, valgrind, etc.). 
>>>>>>
>>>>>> Regards,
>>>>>> Stephen
>>>>>>
>>>>>> -- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "elixir-lang-core" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>> send an email to elixir-lang-co...@googlegroups.com.
>>>>>> To view this discussion on the web visit 
>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/fd8b2291-c3aa-49fd-925f-bde1560fc379n%40googlegroups.com
>>>>>>  
>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/fd8b2291-c3aa-49fd-925f-bde1560fc379n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>>

