On Mon, Jul 2, 2018 at 3:24 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:

>
>
> On 07/02/2018 11:44 AM, Tom Pantelis wrote:
> >
> >
> > On Mon, Jul 2, 2018 at 2:15 PM, Victor Pickard <vpick...@redhat.com> wrote:
> >
> >     Hi all,
> >
> >     I'm looking at clustering stability. One of the jobs I've been looking
> >     at is controller clustering. This is a good CSIT, in that it stops and
> >     starts ODL several times during the run.
> >
> >     In one of the failed test runs (sandbox, logs wiped from last week, but
> >     I do have this particular karaf log archived locally), ODL is started,
> >     and REST calls fail during the test. Looking at the logs, I can see why.
> >     Karaf failed to start, or rather, took a really long time to start. From
> >     the snippet below, you can see about 7 minutes between when Karaf
> >     launched and when it did something, maybe restarted again. The main
> >     point is that Karaf failed to start in a timely manner, taking over 7
> >     minutes to begin starting up blueprints, etc.
>
>
> Vic,
>
> When you have a sandbox job whose logs you want to keep around, write
> "copy-logs: <job-name>/<job-number>" on any gerrit. That will trigger a job
> to copy the logs to the logs server, where we can keep them for 6 months.
>
> Also, is this the failure where the high-level robot failure is that some
> node did not move to cluster syncstatus == true within 5 minutes? Not
> coming up for 7 minutes would easily explain that.
>
> Do we have a jira for this one yet?
>

 https://jira.opendaylight.org/browse/CONTROLLER-1845

>
>
> Thanks,
> JamO
>
>
> >
> >     I ran a job that had karaf debug logging enabled with this setting:
> >
> >     log4j.rootLogger=DEBUG
> >
> >
> >     This did not go very well. It generated far too much debug output,
> >     which caused timeouts and various other errors in the CSIT run.
> >
> >
> >     So, my questions are:
> >
> >     1. Has anyone seen this issue where karaf seems to hang on startup
> >     (after a kill -9 on the karaf pid)? If so, is this a known issue?
> >
> >     2. What debug logging would be needed to figure out why karaf was
> >     hanging? Note that the setting above generated a log file of ~768 MB
> >     in a very short timespan.
> >
> >
> > Vic - does this happen if you gracefully shut it down? In years past with
> > karaf I recall corruption could occur in the bundle cache under data if the
> > karaf process was killed. I don't know if that potential issue is still
> > present with karaf 4. Does it clean the data dir before restarting? If not,
> > it would be good to do so to be safe.
> >
> > Other than that, we probably need to get a thread dump.
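One way to capture the thread dump suggested above is the JDK's `jstack` tool. The following is only a minimal sketch: it assumes `jstack` is on the PATH of the test VM and that the karaf pid is already known. The helper names `jstack_command` and `save_thread_dump`, and the output file name, are hypothetical and not part of any CSIT library.

```python
import subprocess
import time
from pathlib import Path


def jstack_command(pid):
    """Build the jstack invocation; -l adds lock and synchronizer details."""
    return ["jstack", "-l", str(pid)]


def save_thread_dump(pid, out_dir="."):
    """Run jstack against the given JVM pid and save its output to a file.

    The file name pattern is an assumption made for this sketch.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(out_dir) / f"karaf-threads-{pid}-{stamp}.txt"
    result = subprocess.run(
        jstack_command(pid), capture_output=True, text=True, check=True
    )
    target.write_text(result.stdout)
    return target
```

Calling `save_thread_dump(pid)` a few times, some seconds apart, while karaf appears hung would show whether the same threads stay blocked in the same place.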
> >
> >     Thanks,
> >
> >     Vic
> >
> >
> >
> >
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main launch
> >     INFO: Installing and starting initial bundles
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main launch
> >     INFO: All initial bundles installed and set to start
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Trying to lock /tmp/karaf-0.8.3-SNAPSHOT/lock
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Lock acquired
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
> >     INFO: Lock acquired. Setting startlevel to 100
> >     Jun 29, 2018 3:50:48 PM org.apache.karaf.main.Main launch
> >     INFO: Installing and starting initial bundles
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.Main launch
> >     INFO: All initial bundles installed and set to start
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Trying to lock /tmp/karaf-0.8.3-SNAPSHOT/lock
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Lock acquired
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
> >     INFO: Lock acquired. Setting startlevel to 100
> >
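With rootLogger=DEBUG producing hundreds of MB, it may be easier to spot a stall like the one in the log above by scanning for gaps between timestamps. The sketch below assumes the java.util.logging timestamp format seen in the snippet ("Jun 29, 2018 3:43:47 PM") and an English locale for month names. The function name `find_gaps` and the one-minute default threshold are illustrative choices, not anything from the CSIT tooling.

```python
import re
from datetime import datetime, timedelta

# Matches java.util.logging timestamps such as "Jun 29, 2018 3:43:47 PM".
TIMESTAMP = re.compile(r"[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M")


def find_gaps(lines, threshold=timedelta(minutes=1)):
    """Return (previous, current) timestamp pairs further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        # finditer also copes with lines where two log entries were merged.
        for match in TIMESTAMP.finditer(line):
            current = datetime.strptime(match.group(0), "%b %d, %Y %I:%M:%S %p")
            if previous is not None and current - previous > threshold:
                gaps.append((previous, current))
            previous = current
    return gaps


# Four lines taken from the karaf log in this thread.
sample = [
    "Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired",
    "INFO: Lock acquired. Setting startlevel to 100",
    "Jun 29, 2018 3:50:48 PM org.apache.karaf.main.Main launch",
    "INFO: Installing and starting initial bundles",
]
for start, end in find_gaps(sample):
    print(start, "->", end, "gap:", end - start)
```

On the sample lines this reports a single gap, between 3:43:47 PM and 3:50:48 PM, which is the roughly 7-minute stall described above.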
> >
> >
> >     _______________________________________________
> >     controller-dev mailing list
> >     controller-dev@lists.opendaylight.org <mailto:
> controller-dev@lists.opendaylight.org>
> >     https://lists.opendaylight.org/mailman/listinfo/controller-dev
> >     <https://lists.opendaylight.org/mailman/listinfo/controller-dev>
> >
> >
> >
> >
>