Hi, I see https://issues.apache.org/jira/browse/SLIDER-1051 is still open. Any updates on when this will be fixed?

On Thu, Jan 7, 2016 at 9:49 AM, anu sudarsan <anu.at.i...@gmail.com> wrote:

> Thanks Steve for fixing it. I will give 0.91 a try at some point.
>
> As I mentioned, the issue is not there in 0.80.0 but only in 0.81.1. Did
> you mean cherrypick-ing the fix to 0.81 branch?
>
> If we upgrade to 0.91, do you expect instabilities just in AA placement or
> other placement policies in general? Also upgrading to 0.91 will need a
> Hadoop upgrade (to 2.7?) too, I assume? If so, I would suggest backporting
> the fix to 0.81 branch.
>
> -Anu
>
> On Thu, Jan 7, 2016 at 8:17 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>
>>
>> don't bother trying that - I've replicated it locally, added tests and
>> fixed it.
>>
>> It'll be fixed in 0.91, that is, the successor to the 0.90.2 that will be
>> out today.
>>
>> One thing to consider is: do we backport this to the 0.80 branch? It's a
>> one-off change, and with the changes for AA placement still going to take
>> an iteration or so to stabilise, probably better to cherry pick it in
>> rather than say "do the big upgrade"
>>
>> What do people think?
>>
>>
>> > On 7 Jan 2016, at 11:28, Steve Loughran <ste...@hortonworks.com> wrote:
>> >
>> > quick question
>> >
>> > what happens if you delete the directory under that cluster in
>> > ${user.home}/.sliders/clusters/${appname}/history (where user.home is your
>> > homedir, appname the name of the slider application?
>> >
>> > If it works then, what happens when you stop the application and
>> > restart it?
>> >
>> >
>> >> On 6 Jan 2016, at 15:36, anu sudarsan <anu.at.i...@gmail.com> wrote:
>> >>
>> >> Hi
>> >>
>> >> I tried with HDP 2.3 and still getting the same error. Any ideas what might
>> >> be causing this? As I said, the same appConfig and resources.json
>> >> configurations and cluster works for Slider 0.80.0.
>> >>
>> >> Relevant parameters from the resources.json
>> >>
>> >> {
>> >>   "schema": "http://example.org/specification/v2.0.0",
>> >>   "metadata": {
>> >>   },
>> >>   "global": {
>> >>     "yarn.vcores": "1"
>> >>   },
>> >>   "components": {
>> >>     "slider-appmaster": {
>> >>     },
>> >>     "COORDINATOR": {
>> >>       "yarn.role.priority": "1",
>> >>       "yarn.component.instances": "1",
>> >>       "yarn.memory": "256",
>> >>       "yarn.label.expression": "coord"
>> >>     }
>> >>   }
>> >>
>> >>
>> >> On Tue, Jan 5, 2016 at 2:30 PM, anu sudarsan <anu.at.i...@gmail.com> wrote:
>> >>
>> >>> Hi
>> >>>
>> >>> I am trying to use Slider 0.81.1 with *yarn.label.expression* feature
>> >>> and getting the following error. This is from slider-agent.log for
>> >>> slider-appmaster.
>> >>>
>> >>> ERROR appmaster.SliderAppMaster - Exception in AmExecutor-006: org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: relax location flag doesn't match container priority: OutstandingRequest{roleId=1, node=null, hostname='null', hasLocation=true, requestedTimeMillis=1452019991722, mayEscalate=false, escalated=true, escalationTimeoutMillis=1452020021722, issuedRequest=Capability[<memory:1500, vCores:1>]Priority[1073741825]; relaxLocality=true; nodeLabels=coord; }
>> >>> org.apache.hadoop.yarn.client.api.InvalidContainerRequestException: relax location flag doesn't match container priority: OutstandingRequest{roleId=1, node=null, hostname='null', hasLocation=true, requestedTimeMillis=1452019991722, mayEscalate=false, escalated=true, escalationTimeoutMillis=1452020021722, issuedRequest=Capability[<memory:1500, vCores:1>]Priority[1073741825]; relaxLocality=true; nodeLabels=coord; }
>> >>> at org.apache.slider.server.appmaster.state.OutstandingRequest.validate(OutstandingRequest.java:406)
>> >>> at org.apache.slider.server.appmaster.state.OutstandingRequest.buildContainerRequest(OutstandingRequest.java:232)
>> >>> at org.apache.slider.server.appmaster.state.RoleHistory.requestInstanceOnNode(RoleHistory.java:598)
>> >>> at org.apache.slider.server.appmaster.state.RoleHistory.requestNode(RoleHistory.java:613)
>> >>> at org.apache.slider.server.appmaster.state.AppState.createContainerRequest(AppState.java:1232)
>> >>> at org.apache.slider.server.appmaster.state.AppState.buildContainerResourceAndRequest(AppState.java:1213)
>> >>> at org.apache.slider.server.appmaster.state.AppState.reviewOneRole(AppState.java:1938)
>> >>> at org.apache.slider.server.appmaster.state.AppState.reviewRequestAndReleaseNodes(AppState.java:1812)
>> >>> at org.apache.slider.server.appmaster.SliderAppMaster.executeNodeReview(SliderAppMaster.java:1804)
>> >>> at org.apache.slider.server.appmaster.SliderAppMaster.handleReviewAndFlexApplicationSize(SliderAppMaster.java:1790)
>> >>> at org.apache.slider.server.appmaster.actions.ReviewAndFlexApplicationSize.execute(ReviewAndFlexApplicationSize.java:41)
>> >>> at org.apache.slider.server.appmaster.actions.QueueExecutor.run(QueueExecutor.java:73)
>> >>>
>> >>> The same configuration works fine when I remove "*yarn.label.expression*"
>> >>> from the resources.json.
>> >>>
>> >>> Does Slider 0.81.1 require Hadoop 2.7? I am using HDP 2.2, and thus have
>> >>> Hadoop 2.6. I have no issues deploying the application using yarn labels
>> >>> when using Slider 0.80.0, the problem is only with Slider 0.81.1.
>> >>>
>> >>> Thanks
>> >>> -Anu
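
A side note on the Priority[1073741825] value in the quoted log, in case it helps anyone hitting the same exception. The number is exactly 2^30 + 1, and the request reports roleId=1. The sketch below decodes it under the assumption that the low bits carry the role id and bit 30 marks a request with no specific location. That bit layout is my guess from the numbers, not something confirmed anywhere in this thread, and the names NO_LOCATION_FLAG and decode_priority are made up for the illustration.

```python
# Hypothetical decoding of the priority value seen in the log.
# ASSUMPTION: bit 30 means "no specific location requested" and the
# remaining low bits hold the role id. Not confirmed by this thread.
NO_LOCATION_FLAG = 1 << 30  # 1073741824


def decode_priority(priority):
    """Split a priority into (role_id, location_specified)."""
    role_id = priority & ~NO_LOCATION_FLAG
    location_specified = (priority & NO_LOCATION_FLAG) == 0
    return role_id, location_specified


role_id, location_specified = decode_priority(1073741825)
print(role_id, location_specified)
```

If that reading is right, the priority says "no location" while the same request carries hasLocation=true, which would be the kind of mismatch the "relax location flag doesn't match container priority" message complains about.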