Re: Assignment Manager and clock advance.
Thank you all for the comments; I get the point. I raised the issue because some of our tests advance the clock manually, and that worked fine in the 1.x releases (so the create table operation was atomic from this perspective). It stops working with AMv2.

Of course, we can deal with it by using an auto-incrementing edge, but my main concern was that even a simple operation such as create table may run into a negative clock adjustment made by ntpd (or similar) to correct time drift, which can (and usually does) happen in hypervisor environments (Hyper-V, KVM, VMware; they all have this issue under heavy load). It seems, though, that the master retries the subprocedure and it eventually completes (I tried the same test with a slowly updating clock). I will raise the JIRA as a nice-to-have.

Thanks,
Sergey

On Wed, Jan 3, 2018 at 8:00 AM, stack wrote:
> Stalling environmentedge as is done here does not work. In various areas in
> internals we expect the clock to advance. This is not particular to AMv2.
>
> As per Appy, we need HLC generally (it is almost there but needs some
> concentrated effort to carry it the last few yards).
>
> For AMv2, we have a single actor--the Master-- so we should be able to put
> up simple checks that we have an advancing clock.
>
> Please make an issue Sergey and we'll have a go at it. Thanks for raising
> this issue.
>
> S
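For readers following along, the "auto-incrementing edge" workaround mentioned in this reply could look roughly like the sketch below. It is an illustration only: `TimeSource` is a hypothetical stand-in for HBase's `EnvironmentEdge` interface so that the snippet compiles without HBase on the classpath, and `AutoIncrementingTimeSource` is an invented name.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for HBase's EnvironmentEdge so the sketch compiles alone. */
interface TimeSource {
  long currentTime();
}

/** Returns start, start + 1, start + 2, ... on successive calls. */
final class AutoIncrementingTimeSource implements TimeSource {
  private final AtomicLong next;

  AutoIncrementingTimeSource(long start) {
    this.next = new AtomicLong(start);
  }

  @Override
  public long currentTime() {
    // Atomic, so two concurrent callers never observe the same value.
    return next.getAndIncrement();
  }

  public static void main(String[] args) {
    TimeSource edge = new AutoIncrementingTimeSource(1000L);
    System.out.println(edge.currentTime());  // 1000
    System.out.println(edge.currentTime());  // 1001
  }
}
```

With a source like this, every read of the clock is later than every earlier read, so a test stays deterministic without freezing time.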
Re: Assignment Manager and clock advance.
Stalling the EnvironmentEdge as is done here does not work. In various areas of the internals we expect the clock to advance. This is not particular to AMv2.

As per Appy, we need HLC generally (it is almost there but needs some concentrated effort to carry it the last few yards).

For AMv2, we have a single actor -- the Master -- so we should be able to put up simple checks that we have an advancing clock.

Please make an issue, Sergey, and we'll have a go at it. Thanks for raising this issue.

S

On Jan 2, 2018 5:58 PM, "Sergey Soldatov" wrote:
> Hi,
> Not sure whether we may consider that as a bug, but I found an interesting
> dependency of AM2 on clock advancing. A simple operation such as create
> table is unable to perform with the same current_time value:
>
> public void testCreateTable() throws IOException {
>   EnvironmentEdgeManager.injectEdge(new EnvironmentEdge() {
>     volatile int curTime = 1000;
>
>     @Override
>     public long currentTime() {
>       return curTime;
>     }
>   });
>   final TableName tableName = TableName.valueOf("test");
>   TEST_UTIL.createTable(tableName, HConstants.CATALOG_FAMILY).close();
> }
>
> and fails with a TableNotFound exception. The reason is that between
> transitions we get table information from meta using Get with the exclusive
> current timestamp. Could it be a potential problem (i.e. the system capable
> to execute all that transition stuff in less than 1 ms)?
>
> Thanks,
> Sergey
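One possible shape for the "simple checks that we have an advancing clock" idea is a wrapper that never hands out the same or an earlier timestamp twice. This is a sketch under assumptions, not what the Master does: `StrictlyAdvancingClock` is a hypothetical name, and the wrapped wall clock is passed in as a plain `LongSupplier`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical guard: wraps a millisecond clock and never repeats or goes back. */
final class StrictlyAdvancingClock {
  private final LongSupplier wallClock;
  private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

  StrictlyAdvancingClock(LongSupplier wallClock) {
    this.wallClock = wallClock;
  }

  /** Returns the wall-clock reading if it is newer, else previous result + 1. */
  long next() {
    long now = wallClock.getAsLong();
    return last.updateAndGet(prev -> now > prev ? now : prev + 1);
  }

  public static void main(String[] args) {
    StrictlyAdvancingClock stalled = new StrictlyAdvancingClock(() -> 1000L);
    System.out.println(stalled.next());  // 1000
    System.out.println(stalled.next());  // 1001, although the source is stuck
  }
}
```

The trade-off is that during a long stall, or after a backwards adjustment, the returned values run ahead of the wall clock until it catches up.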
Re: Assignment Manager and clock advance.
Yeah, you're right. Unreliable clocks open up a whole different set of issues, which I didn't want to go into for the question "the system capable to execute all that transition stuff in less than 1 ms?". I thought that question was genuinely asking how probable the fabricated scenario is, i.e. whether everything can actually execute within 1 ms of real time.

But speaking of unreliable clocks, the problem is not just clocks that fail to increment, right? A clock can also go backwards. Sadly, our current EnvironmentEdgeManager isn't capable of handling such cases, and the HLC effort (HBASE-14070) to fix that hasn't seen much progress lately :(

So yes, if clocks go bad, those bugs can happen even if the operation spans seconds in real time.

-- Appy

On Tue, Jan 2, 2018 at 5:01 PM, Nick Dimiduk wrote:
> I don't think these assumptions are reliable. I've seen cases where
> subsequent calls to currentTimeMillis() are non-incrementing on specific
> Linux distributions. Taken in aggregate, the system clock makes progress,
> but those aggregations are on the multi-second scale.
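For context on what an HLC buys here, the following is a minimal sketch of the general hybrid-logical-clock idea for local events only: a physical component plus a logical counter that breaks ties when the wall clock stalls or moves backwards. It is not the HBASE-14070 implementation; `HybridClockSketch`, `Stamp` and `tick` are hypothetical names, and the message-receive half of the algorithm is omitted.

```java
/** Minimal hybrid-logical-clock sketch for local events; not the HBASE-14070 code. */
final class HybridClockSketch {
  /** (physical millis, logical counter) pair, ordered by physical, then logical. */
  static final class Stamp implements Comparable<Stamp> {
    final long physical;
    final int logical;

    Stamp(long physical, int logical) {
      this.physical = physical;
      this.logical = logical;
    }

    @Override
    public int compareTo(Stamp o) {
      int c = Long.compare(physical, o.physical);
      return c != 0 ? c : Integer.compare(logical, o.logical);
    }

    @Override
    public String toString() {
      return physical + "." + logical;
    }
  }

  private long physical = Long.MIN_VALUE;
  private int logical = 0;

  /** Issues a stamp greater than all earlier ones (ignoring counter overflow). */
  synchronized Stamp tick(long wallClockMillis) {
    if (wallClockMillis > physical) {
      physical = wallClockMillis;  // wall clock moved forward: adopt it, reset counter
      logical = 0;
    } else {
      logical++;                   // wall clock stalled or went back: bump counter
    }
    return new Stamp(physical, logical);
  }

  public static void main(String[] args) {
    HybridClockSketch hlc = new HybridClockSketch();
    System.out.println(hlc.tick(1000L));  // 1000.0
    System.out.println(hlc.tick(1000L));  // 1000.1 (stalled clock)
    System.out.println(hlc.tick(990L));   // 1000.2 (clock went backwards)
    System.out.println(hlc.tick(1001L));  // 1001.0
  }
}
```

Unlike a plain "previous + 1" guard, the physical part stays close to the wall clock, and ordering within the same millisecond is carried by the counter.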
Re: Assignment Manager and clock advance.
I don't think these assumptions are reliable. I've seen cases where subsequent calls to currentTimeMillis() are non-incrementing on specific Linux distributions. Taken in aggregate, the system clock makes progress, but those aggregations are on the multi-second scale.

On Tue, Jan 2, 2018 at 4:37 PM Apekshit Sharma wrote:
> Hi Sergey,
>
> Interesting test and find. Makes total sense too.
> However, in real world case, any put in meta table itself will take more
> than a ms, and then we have lot of Procedure framework and other logic in
> between meta accesses which would make this scenarios impossible.
> Specifically, there a lot of processing in between adding rows from
> CreateTableProcedure and trying to access it in children
> AssignProcedure(s).
>
> -- Apy
Re: Assignment Manager and clock advance.
Hi Sergey,

Interesting test and find. Makes total sense too.
However, in a real-world case, any put to the meta table will itself take more than a millisecond, and there is a lot of Procedure framework and other logic between meta accesses, which would make this scenario impossible. Specifically, there is a lot of processing between adding the rows in CreateTableProcedure and trying to access them in the child AssignProcedure(s).

-- Appy

On Tue, Jan 2, 2018 at 3:58 PM, Sergey Soldatov wrote:
> Hi,
> Not sure whether we may consider that as a bug, but I found an interesting
> dependency of AM2 on clock advancing. A simple operation such as create
> table is unable to perform with the same current_time value:
>
> public void testCreateTable() throws IOException {
>   EnvironmentEdgeManager.injectEdge(new EnvironmentEdge() {
>     volatile int curTime = 1000;
>
>     @Override
>     public long currentTime() {
>       return curTime;
>     }
>   });
>   final TableName tableName = TableName.valueOf("test");
>   TEST_UTIL.createTable(tableName, HConstants.CATALOG_FAMILY).close();
> }
>
> and fails with a TableNotFound exception. The reason is that between
> transitions we get table information from meta using Get with the exclusive
> current timestamp. Could it be a potential problem (i.e. the system capable
> to execute all that transition stuff in less than 1 ms)?
>
> Thanks,
> Sergey
Assignment Manager and clock advance.
Hi,

I'm not sure whether this should be considered a bug, but I found an interesting dependency of AMv2 on the clock advancing. A simple operation such as create table cannot complete when the current time stays at the same value:

  public void testCreateTable() throws IOException {
    // Freeze the clock: every call returns the same timestamp.
    EnvironmentEdgeManager.injectEdge(new EnvironmentEdge() {
      volatile int curTime = 1000;

      @Override
      public long currentTime() {
        return curTime;
      }
    });
    final TableName tableName = TableName.valueOf("test");
    TEST_UTIL.createTable(tableName, HConstants.CATALOG_FAMILY).close();
  }

The test fails with a TableNotFound exception. The reason is that between transitions we read the table information from meta using a Get whose upper time bound is the current timestamp, and that bound is exclusive. Could this be a potential problem (i.e. could the system execute all of those transitions in less than 1 ms)?

Thanks,
Sergey
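To make the failure mode concrete, here is a small self-contained sketch of a read whose upper time bound is exclusive, as described in the message above. It uses no HBase code: `FrozenClockDemo` and its `put`/`get` helpers are hypothetical stand-ins for a single meta row, and the only behaviour modelled is that a read bounded by "now" cannot see a write stamped with the same "now".

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical single-row store, not HBase code: versions keyed by timestamp. */
final class FrozenClockDemo {
  private final TreeMap<Long, String> versions = new TreeMap<>();

  /** Writes a value stamped with the given time. */
  void put(long timestamp, String value) {
    versions.put(timestamp, value);
  }

  /** Returns the newest value stamped strictly before maxExclusive, or null. */
  String get(long maxExclusive) {
    Map.Entry<Long, String> e = versions.lowerEntry(maxExclusive);
    return e == null ? null : e.getValue();
  }

  public static void main(String[] args) {
    FrozenClockDemo meta = new FrozenClockDemo();
    long frozenNow = 1000L;                       // clock never advances
    meta.put(frozenNow, "table state");
    System.out.println(meta.get(frozenNow));      // null: the write is invisible
    System.out.println(meta.get(frozenNow + 1));  // table state
  }
}
```

As soon as the reader's clock is at least one millisecond past the write, the row becomes visible, which is why the problem only shows up with a frozen or stalled clock.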