That works but it doesn't feel right doing it this way. I am going to fix this one for good.
Cheers,
Abdullah.

On Mon, Aug 24, 2015 at 5:11 PM, Ian Maxon <[email protected]> wrote:

> The way I assured liveness for the YARN installer was to try running "for $x in dataset Metadata.Dataset return $x" via the API. I just polled for a reasonable amount of time (though honestly, thinking about it now, the correct parameter to use for the polling interval is the startup wait time in the parameters file :) ). It's not perfect, but it gives fewer false positives than just checking ps for processes that look like CCs/NCs.
>
> - Ian.
>
> On Mon, Aug 24, 2015 at 5:03 AM, abdullah alamoudi <[email protected]> wrote:
>
> > Now that I think about it, maybe we should provide multiple ways to do this: a polling mechanism to be used at an arbitrary time and a pushing mechanism on startup.
> > I am going to start implementation of this and will probably use RMI for this task both ways (CC to InstallerDriver and InstallerDriver to CC).
> >
> > Cheers,
> > Abdullah.
> >
> > On Mon, Aug 24, 2015 at 2:19 PM, abdullah alamoudi <[email protected]> wrote:
> >
> > > So after further investigation, it turned out that our startup process just starts the CC and NC processes and then makes sure the processes are running. If they are found to be running, it reports the state of the cluster as active, and the subsequent test commands can start immediately.
> > >
> > > This means that the CC could have started but not yet be ready when we try to process the next command. To address this, we need a better way to tell when the startup procedure has completed. We can do this by pushing (the CC informs the installer driver when the startup is complete) or polling (the installer driver actually queries the CC for the state of the cluster).
> > >
> > > I can do it either way, so let's vote. My vote goes to the pushing mechanism.
> > > Thoughts?
> > >
> > > On Mon, Aug 24, 2015 at 10:15 AM, abdullah alamoudi <[email protected]> wrote:
> > >
> > > > This solution turned out to be incorrect. Actually, the test cases never fail when I build after using the join method, but running an actual asterix instance never succeeds, which is quite confusing.
> > > >
> > > > I also think that the startup script has a major bug where it might return before the startup is complete. More on this later......
> > > >
> > > > On Mon, Aug 24, 2015 at 7:48 AM, abdullah alamoudi <[email protected]> wrote:
> > > >
> > > > > It is highly unlikely that it is related.
> > > > >
> > > > > Cheers,
> > > > > Abdullah.
> > > > >
> > > > > On Mon, Aug 24, 2015 at 5:45 AM, Chen Li <[email protected]> wrote:
> > > > >
> > > > > > @Abdullah: Is this issue related to https://issues.apache.org/jira/browse/ASTERIXDB-1074? Ian and I plan to look into the details on Monday.
> > > > > >
> > > > > > On Sun, Aug 23, 2015 at 10:08 AM, abdullah alamoudi <[email protected]> wrote:
> > > > > >
> > > > > > > About 3-4 days ago, I was working on the addition of the filesystem based feed adapter and it didn't take any time to complete. However, when I wanted to build and make sure all tests pass, I kept getting ConnectionRefused errors which caused the installer tests to fail every now and then.
> > > > > > >
> > > > > > > I knew the new change had nothing to do with this failure, yet I couldn't direct my attention away from this bug (it just bothered me so much and I knew it needed to be resolved ASAP). After wasting countless hours, I was finally able to figure out what was happening :-)
> > > > > > >
> > > > > > > In the startup routine, we start three Jetty web servers (Web interface server, JSON API server, and Feed server). Some time ago, we used to end the startup call before making sure the server.isStarted() method returns true on all servers. At that time, I introduced the waitUntilServerStarts method to make sure we don't return before the servers are ready. It turned out that was an incorrect way to handle this (we can blame stackoverflow for this one!) and it is not enough that the server isStarted() returns true. The correct way to do this is to call the server.join() method after the server.start().
> > > > > > >
> > > > > > > See:
> > > > > > > http://stackoverflow.com/questions/15924874/embedded-jetty-why-to-use-join
> > > > > > >
> > > > > > > This was equally satisfying as it was frustrating and you are welcome for the future time I saved each of you :)
> > > > > > > --
> > > > > > > Amoudi, Abdullah.
> > > > >
> > > > > --
> > > > > Amoudi, Abdullah.
> > > >
> > > > --
> > > > Amoudi, Abdullah.
> > >
> > > --
> > > Amoudi, Abdullah.
> >
> > --
> > Amoudi, Abdullah.

--
Amoudi, Abdullah.
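
For reference, the polling option discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: ReadinessPoller, waitUntilReady, probe, timeout, and interval are hypothetical names and not part of the AsterixDB installer. The probe stands in for whatever readiness check is chosen (for example, the Metadata.Dataset query via the API that Ian describes), and a probe that throws (such as a refused connection) is treated as "not ready yet".

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of the AsterixDB installer. */
final class ReadinessPoller {
    private ReadinessPoller() {}

    /**
     * Calls the probe repeatedly until it returns true or the timeout elapses.
     * Returns true if the probe succeeded, false on timeout or interruption.
     */
    static boolean waitUntilReady(BooleanSupplier probe, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                if (probe.getAsBoolean()) {
                    return true;
                }
            } catch (RuntimeException e) {
                // Assumption: a failing probe (e.g. connection refused) means "not ready yet".
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                // Sleep for the interval, but never past the deadline.
                long remainingMillis = Math.max(1, remainingNanos / 1_000_000);
                Thread.sleep(Math.min(interval.toMillis(), remainingMillis));
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A caller would pass its own check, for instance ReadinessPoller.waitUntilReady(() -> clusterAnswersQuery(), Duration.ofSeconds(60), Duration.ofSeconds(1)), where clusterAnswersQuery is again a hypothetical method. As suggested above, the timeout could come from the startup wait time in the parameters file.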
