> On March 31, 2017, 5:07 p.m., Robert Levas wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/orm/dao/StageDAO.java
> > Lines 184-185 (original), 153-154 (patched)
> > <https://reviews.apache.org/r/58109/diff/2/?file=1682589#file1682589line190>
> >
> > By using a more complex query, could we avoid making multiple calls to
> > the DB to get the stage entities?
> >
> > The following (non-JPA) query should do the trick once properly
> > formatted for JPA. However, I am not sure if all DBs would support it.
> > Apparently PostgreSQL does, according to my test, and I know that MySQL
> > does. I am not sure about the other databases that can be used with Ambari.
> >
> > ```
> > SELECT *
> > FROM stage s
> > INNER JOIN (
> >   SELECT s.request_id, MIN(s.stage_id) AS stage_id
> >   FROM stage s
> >   INNER JOIN host_role_command hrc
> >     ON (hrc.stage_id = s.stage_id AND hrc.request_id = s.request_id)
> >   WHERE hrc.status IN ('COMPLETED')
> >   GROUP BY s.request_id
> >   ORDER BY s.request_id
> > ) AS foo ON (s.request_id = foo.request_id AND s.stage_id = foo.stage_id);
> > ```
>
> Jonathan Hurley wrote:
>     This doesn't call into the database multiple times. The 2nd hit is a
>     cache-only lookup. I think when I was researching how to do this, that
>     query had problems on some databases. Namely, how do you get the entity
>     from it when the request_id is in the returned results?

I figured you would have looked at this approach... thanks for the clarification.

- Robert


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58109/#review170773
-----------------------------------------------------------


On March 31, 2017, 9:16 p.m., Jonathan Hurley wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58109/
> -----------------------------------------------------------
>
> (Updated March 31, 2017, 9:16 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Robert Levas.
>
>
> Bugs: AMBARI-20646
>     https://issues.apache.org/jira/browse/AMBARI-20646
>
>
> Repository: ambari
>
>
> Description
> -------
>
> When creating a massive request (a rolling upgrade on a cluster with 1000
> nodes), the size of the request seems to slow down the {{ActionScheduler}}.
> Each command was taking between 1 and 2 minutes to run (even server-side
> tasks).
>
> The cause of this can be seen in the following two stack traces:
>
> {code:title=ActionSchedulerImpl}
>   at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl.getTasks(ActionDBAccessorImpl.java:84)
>   at org.apache.ambari.server.actionmanager.Stage.<init>(Stage.java:157)
>   at org.apache.ambari.server.actionmanager.StageFactoryImpl.createExisting(StageFactoryImpl.java:72)
>   at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl.getStagesInProgress(ActionDBAccessorImpl.java:303)
>   at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:341)
>   at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:302)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
>
> {code:title=Server Action Executor}
>   at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl.getTasks(ActionDBAccessorImpl.java:700)
>   at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl.getTasks(ActionDBAccessorImpl.java:84)
>   at org.apache.ambari.server.actionmanager.Stage.<init>(Stage.java:157)
>   at org.apache.ambari.server.actionmanager.StageFactoryImpl.createExisting(StageFactoryImpl.java:72)
>   at org.apache.ambari.server.actionmanager.Request.<init>(Request.java:199)
>   at org.apache.ambari.server.actionmanager.Request$$FastClassByGuice$$9071e03.newInstance(<generated>)
>   at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
>   at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:60)
>   at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
>   at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:254)
>   at com.google.inject.internal.InjectorImpl$4$1.call(InjectorImpl.java:978)
>   at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1024)
>   at com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:974)
>   at com.google.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:632)
>   at com.sun.proxy.$Proxy26.createExisting(Unknown Source)
>   at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl.getRequests(ActionDBAccessorImpl.java:784)
>   at org.apache.ambari.server.serveraction.ServerActionExecutor.cleanRequestShareDataContexts(ServerActionExecutor.java:259)
>   - locked <0x00007ff0a14083c8> (a java.util.HashMap)
>   at org.apache.ambari.server.serveraction.ServerActionExecutor.doWork(ServerActionExecutor.java:454)
>   at org.apache.ambari.server.serveraction.ServerActionExecutor$1.run(ServerActionExecutor.java:160)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
>
> It's clear from these stacks that every {{PENDING}} stage (roughly 15,000)
> was being loaded into memory every second (along with its accompanying
> tasks). This makes no sense, as these methods don't need all stages - just
> the _next_ stage. This is because all stages are synchronous within a single
> request.
>
> The proposed solution is to fix the {{StageEntity.findByCommandStatuses}}
> call so it doesn't return every stage:
> {code}
> SELECT stage.requestid,
>   MIN(stage.stageid)
> FROM stageentity stage,
>   hostrolecommandentity hrc
> WHERE hrc.status IN :statuses
>   AND hrc.stageid = stage.stageid
>   AND hrc.requestid = stage.requestid
> GROUP BY stage.requestid
> {code}
>
>
> Diffs
> -----
>
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessor.java 9325d03
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java ab4feaa
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java 0984c5c
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/StageDAO.java 5151fb3
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/StageEntity.java f68338f
>   ambari-server/src/main/java/org/apache/ambari/server/serveraction/ServerActionExecutor.java b0be6b3
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionDBAccessorImpl.java 81eef3b
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java 2b5d2f3
>   ambari-server/src/test/java/org/apache/ambari/server/orm/dao/RequestDAOTest.java 9b62671
>   ambari-server/src/test/java/org/apache/ambari/server/serveraction/ServerActionExecutorTest.java 44d5b63
>   ambari-server/src/test/java/org/apache/ambari/server/state/services/RetryUpgradeActionServiceTest.java e2ce6e7
>
>
> Diff: https://reviews.apache.org/r/58109/diff/2/
>
>
> Testing
> -------
>
> Tests run: 4976, Failures: 0, Errors: 0, Skipped: 39
>
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 17:49 min
> [INFO] Finished at: 2017-03-31T12:58:22-04:00
> [INFO] Final Memory: 59M/664M
> [INFO] ------------------------------------------------------------------------
>
>
> Thanks,
>
> Jonathan Hurley
>
>
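
For anyone following the thread, here is a minimal in-memory sketch of what the grouping query in the description is meant to return: for each request, the lowest stage id that still has a command in one of the given statuses. It is not the patch itself. `NextStageSketch`, `CommandRow`, and the local `Status` enum are hypothetical names made up for illustration, and the rows are invented sample data rather than anything from a real cluster.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class NextStageSketch {

    /** Hypothetical stand-in for the command status enum; only a few values are listed. */
    enum Status { PENDING, IN_PROGRESS, COMPLETED }

    /** Hypothetical flattened row: one command joined to the stage it belongs to. */
    static final class CommandRow {
        final long requestId;
        final long stageId;
        final Status status;

        CommandRow(long requestId, long stageId, Status status) {
            this.requestId = requestId;
            this.stageId = stageId;
            this.status = status;
        }
    }

    /**
     * In-memory equivalent of "SELECT requestid, MIN(stageid) ... WHERE status IN :statuses
     * GROUP BY requestid": keeps only rows whose status matches, then records the smallest
     * stage id seen for each request id.
     */
    static Map<Long, Long> nextStagePerRequest(Collection<CommandRow> rows, Set<Status> statuses) {
        Map<Long, Long> next = new TreeMap<>();
        for (CommandRow row : rows) {
            if (statuses.contains(row.status)) {
                next.merge(row.requestId, row.stageId, Long::min);
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // Invented sample data: two requests, each with several stages.
        List<CommandRow> rows = Arrays.asList(
            new CommandRow(1, 10, Status.COMPLETED),
            new CommandRow(1, 11, Status.PENDING),
            new CommandRow(1, 12, Status.PENDING),
            new CommandRow(2, 20, Status.IN_PROGRESS),
            new CommandRow(2, 21, Status.PENDING));

        // Prints {1=11, 2=20}: one stage per request instead of every pending stage.
        System.out.println(nextStagePerRequest(rows, EnumSet.of(Status.PENDING, Status.IN_PROGRESS)));
    }
}
```

The result is only a set of (request id, stage id) pairs, which matches the point raised in the review comments: the entity itself still has to be fetched by that key afterwards, and per the reply above that second lookup is served from the cache.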