On Mon, Oct 30, 2017 at 4:25 PM, Sam Hague <sha...@redhat.com> wrote:

>
>
> On Mon, Oct 30, 2017 at 3:02 PM, Tom Pantelis <tompante...@gmail.com>
> wrote:
>
>>
>>
>> On Mon, Oct 30, 2017 at 2:49 PM, Michael Vorburger <vorbur...@redhat.com>
>> wrote:
>>
>>> Hi Sam,
>>>
>>> On Mon, Oct 30, 2017 at 7:45 PM, Sam Hague <sha...@redhat.com> wrote:
>>>
>>>> Stephen, Michael, Tom,
>>>>
>>>> do you have any way to collect debug logs when ODL crashes in CSIT?
>>>>
>>>
>>> JVMs (almost) never "just crash" without a word... either some code
>>> does java.lang.System.exit(), which you may remember we do in the CDS/Akka
>>> code somewhere, or there's a bug in the JVM implementation - in which case
>>> there should be one of those JVM crash log files, named
>>> something like hs_err_pid22607.log, in the "current working" directory.
>>> Where would that be on these CSIT runs, and are the CSIT JJB jobs set up to
>>> preserve such JVM crash log files and copy them over to
>>> logs.opendaylight.org ?
>>>
>>
>> Akka will do System.exit() if it encounters an error serious enough for that.
>> But it doesn't do it silently. However I believe we disabled the automatic
>> exiting in akka.
>>
> Should there be any logs in ODL for this? There is nothing in the karaf
> log when this happens. It literally just stops.
>
> The karaf.console log does say the karaf process was killed:
>
> /tmp/karaf-0.7.1-SNAPSHOT/bin/karaf: line 422: 11528 Killed ${KARAF_EXEC}
> "${JAVA}" ${JAVA_OPTS} "$NON_BLOCKING_PRNG" 
> -Djava.endorsed.dirs="${JAVA_ENDORSED_DIRS}"
> -Djava.ext.dirs="${JAVA_EXT_DIRS}" -Dkaraf.instances="${KARAF_HOME}/instances"
> -Dkaraf.home="${KARAF_HOME}" -Dkaraf.base="${KARAF_BASE}"
> -Dkaraf.data="${KARAF_DATA}" -Dkaraf.etc="${KARAF_ETC}"
> -Dkaraf.restart.jvm.supported=true -Djava.io.tmpdir="${KARAF_DATA}/tmp"
> -Djava.util.logging.config.file="${KARAF_BASE}/etc/java.util.logging.properties"
> ${KARAF_SYSTEM_OPTS} ${KARAF_OPTS} ${OPTS} "$@" -classpath "${CLASSPATH}"
> ${MAIN}
>
> In the CSIT robot files we can see the connection errors below, so ODL is
> not responding to new requests. This plus the above leads me to think ODL
> just died.
>
> [ WARN ] Retrying (Retry(total=2, connect=None, read=None, redirect=None,
> status=None)) after connection broken by 'NewConnectionError('<
> requests.packages.urllib3.connection.HTTPConnection object at 0x5ca2d50>:
> Failed to establish a new connection: [Errno 111] Connection refused',)'
>
>>
>>
That would seem to indicate something did a kill -9.  As Michael said, if
the JVM crashed there would be an hs_err_pid file and it would log a
message about it.
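
To tell the two cases apart after a CSIT run, something along these lines
might help. This is only a rough sketch, not anything the CSIT jobs do today:
the function names are made up for illustration, and it assumes the karaf
console output and the JVM's working directory are both still available on
the test VM. It looks for hs_err_pid files first, then for the shell's
"Killed" line like the one Sam pasted.

```python
import re
from pathlib import Path

# Bash prints a line like this when a child process dies from SIGKILL:
#   /tmp/karaf-0.7.1-SNAPSHOT/bin/karaf: line 422: 11528 Killed ${KARAF_EXEC} ...
KILLED_RE = re.compile(r"line \d+:\s+(\d+)\s+Killed\b")


def find_hs_err_logs(workdir):
    """Return JVM crash logs (hs_err_pid<pid>.log) found under workdir."""
    return sorted(Path(workdir).rglob("hs_err_pid*.log"))


def find_killed_pids(console_text):
    """Return the PIDs the shell reported as 'Killed' in the console output."""
    return [int(m.group(1)) for m in KILLED_RE.finditer(console_text)]


def classify(workdir, console_text):
    """Rough triage (hypothetical labels): JVM crash, external kill, or unknown."""
    if find_hs_err_logs(workdir):
        return "jvm-crash"
    if find_killed_pids(console_text):
        return "killed-externally"
    return "unknown"
```

For the console output above, find_killed_pids() should pick out PID 11528,
and with no hs_err_pid file present classify() would report an external kill
rather than a JVM crash.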
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev
