On 07/02/2018 11:44 AM, Tom Pantelis wrote:


On Mon, Jul 2, 2018 at 2:15 PM, Victor Pickard <vpick...@redhat.com> wrote:

    Hi all,

    I'm looking at clustering stability. One of the jobs I've been looking at is controller clustering. This is a
    good CSIT, in that it stops and starts ODL several times during the run.

    In one of the failed test runs (sandbox; the logs from last week have been wiped, but I do have this particular
    karaf log archived locally), ODL is started and REST calls fail during the test. Looking at the logs, I can see
    why: Karaf failed to start or, more precisely, took a really long time to start. In the snippet below, you can
    see about 7 minutes between when Karaf launched and when it did something, maybe restarted again. The main thing
    is that Karaf failed to start in a timely manner, taking over 7 minutes to begin starting up blueprints, etc.


Vic,

When you have a sandbox job whose logs you want to keep around, write
"copy-logs: <job-name>/<job-number>" on any gerrit. That will trigger a job
to copy the logs to the logs server, where we can keep them for 6 months.

Also, is this the failure behind the high-level robot failure where some
node did not move to cluster syncstatus == true within 5 minutes? Not
coming up for 7 minutes would easily explain that.

Do we have a jira for this one yet?


Thanks,
JamO



    I ran a job that had karaf debug logging enabled with this setting:

    log4j.rootLogger=DEBUG


    This did not go very well. It generated far too much debug info and caused timeouts and various other errors
    in the CSIT run.


    So, my questions are:

    1. Has anyone seen this issue, where karaf seems to hang on startup (after a kill -9 on the karaf pid)? If so,
    is this a known issue?

    2. What debug would be needed to figure out why karaf was hanging? Note that the setting above generated a log
    file of ~768 MB in a very short timespan.


Vic - does this happen if you gracefully shut it down? In years past with karaf I recall corruption could occur in the bundle cache under data if the karaf process was killed. I don't know if that potential issue is still present with karaf 4. Does it clean the data dir before restarting? If not, it would be good to do so to be safe.

Other than that, we probably need to get a thread dump.
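
A minimal sketch of how such thread dumps could be collected while karaf appears hung, assuming the JDK's jstack tool is on the PATH of the test VM and the karaf JVM pid is already known. The helper names, the output file naming, and the default count and interval are illustrative assumptions, not part of the CSIT suite.

```python
import shutil
import subprocess
import time
from pathlib import Path
from typing import List


def jstack_command(pid: int) -> List[str]:
    """Build the jstack invocation; -l also prints lock information."""
    return ["jstack", "-l", str(pid)]


def capture_thread_dumps(pid: int, out_dir: str, count: int = 3,
                         interval_s: float = 10.0) -> List[Path]:
    """Write `count` thread dumps of JVM `pid`, `interval_s` seconds apart."""
    if shutil.which("jstack") is None:
        raise RuntimeError("jstack not found on PATH; it ships with the JDK")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(count):
        # Hypothetical file naming, chosen only for this sketch.
        target = out / f"threaddump-{pid}-{i}.txt"
        with target.open("w") as fh:
            # check=False: keep going if one dump fails; errors land in the file.
            subprocess.run(jstack_command(pid), stdout=fh,
                           stderr=subprocess.STDOUT, check=False)
        written.append(target)
        if i < count - 1:
            time.sleep(interval_s)
    return written
```

Calling capture_thread_dumps(pid, "/tmp/karaf-dumps") during the stall would leave a few dumps to compare. Taking more than one makes it easier to tell whether the same threads are stuck in the same place.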

    Thanks,

    Vic




    Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main launch
    INFO: Installing and starting initial bundles
    Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main launch
    INFO: All initial bundles installed and set to start
    Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
    INFO: Trying to lock /tmp/karaf-0.8.3-SNAPSHOT/lock
    Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
    INFO: Lock acquired
    Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
    INFO: Lock acquired. Setting startlevel to 100
    Jun 29, 2018 3:50:48 PM org.apache.karaf.main.Main launch
    INFO: Installing and starting initial bundles
    Jun 29, 2018 3:50:49 PM org.apache.karaf.main.Main launch
    INFO: All initial bundles installed and set to start
    Jun 29, 2018 3:50:49 PM org.apache.karaf.main.lock.SimpleFileLock lock
    INFO: Trying to lock /tmp/karaf-0.8.3-SNAPSHOT/lock
    Jun 29, 2018 3:50:49 PM org.apache.karaf.main.lock.SimpleFileLock lock
    INFO: Lock acquired
    Jun 29, 2018 3:50:49 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
    INFO: Lock acquired. Setting startlevel to 100
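
The stall in the log excerpt above can be located mechanically. The following is a rough sketch that scans log text for timestamps in the format shown and reports the largest gap between consecutive ones. The function names are made up for this example, and it assumes an English locale for the month abbreviation.

```python
import re
from datetime import datetime, timedelta

# Matches timestamps such as "Jun 29, 2018 3:43:47 PM" anywhere in the text.
TS_PATTERN = re.compile(r"[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M")
TS_FORMAT = "%b %d, %Y %I:%M:%S %p"


def extract_timestamps(text):
    """Return every timestamp found in the text, in order of appearance."""
    return [datetime.strptime(m, TS_FORMAT) for m in TS_PATTERN.findall(text)]


def largest_gap(text):
    """Return (before, after, gap) for the biggest jump between consecutive
    timestamps, or None if fewer than two timestamps are present."""
    stamps = extract_timestamps(text)
    if len(stamps) < 2:
        return None
    before, after = max(zip(stamps, stamps[1:]), key=lambda p: p[1] - p[0])
    return before, after, after - before


SAMPLE = """\
Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Lock acquired
Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
INFO: Lock acquired. Setting startlevel to 100
Jun 29, 2018 3:50:48 PM org.apache.karaf.main.Main launch
INFO: Installing and starting initial bundles
Jun 29, 2018 3:50:49 PM org.apache.karaf.main.Main launch
INFO: All initial bundles installed and set to start
"""

result = largest_gap(SAMPLE)
if result is not None:
    before, after, gap = result
    print(f"largest gap: {gap} (from {before} to {after})")
```

On the excerpt, the largest gap falls between the 3:43:47 PM and 3:50:48 PM entries, which is the roughly 7 minute stall described earlier.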



    _______________________________________________
    controller-dev mailing list
    controller-dev@lists.opendaylight.org
    https://lists.opendaylight.org/mailman/listinfo/controller-dev



