On Mon, Jul 2, 2018 at 3:24 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:
> On 07/02/2018 11:44 AM, Tom Pantelis wrote:
> >
> > On Mon, Jul 2, 2018 at 2:15 PM, Victor Pickard <vpick...@redhat.com> wrote:
> >
> >     Hi all,
> >
> >     I'm looking at clustering stability. One of the jobs I've been looking at
> >     is controller clustering. This is a good CSIT, in that it stops and starts
> >     ODL several times during the run.
> >
> >     In one of the failed test runs (sandbox, logs wiped from last week, but I
> >     do have this particular karaf log archived locally), ODL is started, and
> >     REST calls fail during the test. Looking at the logs, I can see why. Karaf
> >     failed to start, or better yet, took a really long time to start. From the
> >     snippet below, you can see about 7 minutes between when Karaf launched and
> >     when it did something (maybe restarted again?). But the main thing is that
> >     karaf failed to start in a timely manner, taking over 7 minutes to begin
> >     to start up blueprints, etc.
>
> Vic,
>
> When you have a sandbox job whose logs you want to keep around, write
> "copy-logs: <job-name>/<job-number>" on any gerrit. That will trigger a job
> to copy the logs to the logs server, where we can keep them for 6 months.
>
> Also, is this the failure we have when the high-level robot failure is that
> some node did not move to cluster syncstatus == true within 5 minutes? Not
> coming up for 7 minutes would easily explain that.
>
> Do we have a jira for this one yet?
>

https://jira.opendaylight.org/browse/CONTROLLER-1845

>
> Thanks,
> JamO
>
> >     I ran a job that had karaf debug logging enabled with this setting:
> >
> >     log4j.rootLogger=DEBUG
> >
> >     This did not go very well. It generates way too much debug info, and was
> >     causing timeouts and various other errors in the CSIT run.
> >
> >     So, my questions are:
> >
> >     1. Has anyone seen this issue where karaf seems to hang on startup (after
> >        a kill -9 on the karaf pid)? If so, is this a known issue?
> >
> >     2. What debug would be needed to figure out why karaf was hanging? Note
> >        the above generated a log file of ~768 MB in a very short timespan.
> >
> > Vic - does this happen if you gracefully shut it down? In years past with
> > karaf I recall corruption could occur in the bundle cache under data if the
> > karaf process was killed. I don't know if that potential issue is still
> > present with karaf 4. Does it clean the data dir before restarting? If not,
> > it would be good to do so to be safe.
> >
> > Other than that, we probably need to get a thread dump.
> >
> >     Thanks,
> >
> >     Vic
> >
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main launch
> >     INFO: Installing and starting initial bundles
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main launch
> >     INFO: All initial bundles installed and set to start
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Trying to lock /tmp/karaf-0.8.3-SNAPSHOT/lock
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Lock acquired
> >     Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
> >     INFO: Lock acquired. Setting startlevel to 100
> >     Jun 29, 2018 3:50:48 PM org.apache.karaf.main.Main launch
> >     INFO: Installing and starting initial bundles
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.Main launch
> >     INFO: All initial bundles installed and set to start
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Trying to lock /tmp/karaf-0.8.3-SNAPSHOT/lock
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.lock.SimpleFileLock lock
> >     INFO: Lock acquired
> >     Jun 29, 2018 3:50:49 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
> >     INFO: Lock acquired. Setting startlevel to 100
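
One low-cost way to spot a stall like the one in the log snippet above, without turning on root DEBUG logging, is to scan the launcher output for long silences between timestamps. What follows is only a minimal Python sketch. It assumes the launcher output keeps the timestamp prefix shown in the snippet and that the timestamps use English month names. The names `find_gaps`, `STAMP` and `SAMPLE`, and the one-minute threshold, are illustrative choices and not part of any ODL or Karaf tooling.

```python
import re
from datetime import datetime, timedelta

# Matches the timestamp prefix seen in the snippet, e.g.
# "Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main launch".
STAMP = re.compile(r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s")


def find_gaps(lines, threshold=timedelta(minutes=1)):
    """Return (previous_time, next_time, gap) for each silence longer than threshold."""
    gaps = []
    prev = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # message lines such as "INFO: ..." carry no timestamp
        stamp = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
        if prev is not None and stamp - prev > threshold:
            gaps.append((prev, stamp, stamp - prev))
        prev = stamp
    return gaps


# A few lines copied from the log snippet in this thread.
SAMPLE = """\
Jun 29, 2018 3:43:47 PM org.apache.karaf.main.lock.SimpleFileLock lock
INFO: Lock acquired
Jun 29, 2018 3:43:47 PM org.apache.karaf.main.Main$KarafLockCallback lockAquired
INFO: Lock acquired. Setting startlevel to 100
Jun 29, 2018 3:50:48 PM org.apache.karaf.main.Main launch
INFO: Installing and starting initial bundles
""".splitlines()

if __name__ == "__main__":
    for start, end, gap in find_gaps(SAMPLE):
        print(f"{start:%H:%M:%S} -> {end:%H:%M:%S}: no output for {gap}")
```

On the sample lines this reports a single gap of 7 minutes 1 second, between 3:43:47 PM and 3:50:48 PM. It only shows where the silence is, not why. A thread dump taken during that window would still be needed to see what karaf was doing.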
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev