Selva,
I wasn't thinking of a particular pull request when I sent this out for 
discussion.  I was just thinking in general because there was the hope of being 
able to install Trafodion without restarting HBase.

Regarding the specific pull request you mentioned, I see no issue because it 
only affects install_local_hadoop and that explicitly does an initialize 
Trafodion, so there is no recovery to worry about.  This should be fine.

Also, I have no issue with configuring a coprocessor as a table attribute.  
This certainly loads the coprocessors.
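
For anyone following the thread, below is a minimal sketch of what that looks like with the HBase 1.x client API. The table and column family names are only illustrative; this is not Trafodion's actual creation code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateWithCoprocessor {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      HTableDescriptor desc =
          new HTableDescriptor(TableName.valueOf("EXAMPLE_TABLE"));
      desc.addFamily(new HColumnDescriptor("#1"));
      // Recorded as a coprocessor$N table attribute; the region server loads
      // the class from its own classpath when it opens the table's regions.
      desc.addCoprocessor(
          "org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionObserver");
      admin.createTable(desc);
    }
  }
}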

It's unclear to me whether there is still an intent to avoid an HBase restart or 
not.  If not, then this discussion is moot.  But if there is, can you tell me 
how a software migration from version X to version Y would work?  If you change 
the classpath to point away from version X and toward version Y, and then 
disable and re-enable each table in order to close and reopen all the regions, 
this might work.  I think you'd want to lock the database to prevent 
transactions during this time.
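
If it helps, the step I am describing would amount to something like the rough sketch below (HBase 1.x Admin API). The table-name pattern is only illustrative, and this says nothing about how transactions would be quiesced or how the classpath change itself would be made.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ReopenTrafodionTables {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Close and reopen every region of each matching table so the region
      // servers reload the coprocessor classes from the (new) classpath.
      for (TableName table : admin.listTableNames("TRAFODION\\..*")) {
        admin.disableTable(table);
        admin.enableTable(table);
      }
    }
  }
}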

Am I understanding your proposal correctly?

Thanks,
Sean 

-----Original Message-----
From: Selva Govindarajan [mailto:[email protected]] 
Sent: Monday, October 24, 2016 10:37 PM
To: [email protected]
Subject: RE: [DISCUSS] Loading coprocessors from the client side dynamically

The coprocessors are neither pushed nor loaded dynamically from the SQL client 
side. They are simply added as a table attribute and hence configured as table 
coprocessors at creation time by Trafodion. So, if you run describe in the 
hbase shell, you will see something like the following:

describe 'TRAFODION._MD_.OBJECTS'
Table TRAFODION._MD_.OBJECTS is ENABLED
TRAFODION._MD_.OBJECTS, {TABLE_ATTRIBUTES => {coprocessor$1 =>
'|org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionObserver|1073741823|',
coprocessor$2 =>
'|org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionEndpoint|1073741823|',
coprocessor$3 =>
'|org.apache.hadoop.hbase.coprocessor.AggregateImplementation|1073741823|'}
COLUMN FAMILIES DESCRIPTION
{NAME => '#1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL =>
'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE =>
'65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
{NAME => 'mt_', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '2', TTL =>
'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE =>
'65536', IN_MEMORY => 'true', BLOCKCACHE => 'true'}

By configuring them as table coprocessors, the region server dynamically loads 
the needed classes and invokes the corresponding hooks whenever it operates on 
the table/region. Trafodion started adding table coprocessors to its tables, 
possibly as early as the Trafodion 1.0 release. However, that was done for CDH 
cluster installations, where the property 'hbase.coprocessor.region.classes' is 
not visible on the client side of HBase. It does not happen for Hadoop 
installations created via install_local_hadoop in developer workspaces.
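
For reference, the hbase-site.xml entry being discussed typically looks 
something like the following; I am showing only the two transactional 
coprocessors, and the exact value in any given installation may differ. Unlike 
the per-table attribute, this property makes the region server load the classes 
for every region of every table.

<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionObserver,org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionEndpoint</value>
</property>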

At the time the table coprocessors were added, it was thought that doing so 
would let us avoid an HBase restart.  It was soon found that the restart can't 
be avoided.  However, adding table coprocessors has another advantage: the 
coprocessor hooks are invoked for Trafodion tables only, not for non-Trafodion 
tables.

Regarding the two main points in the discussion:

The table coprocessor attribute contains just the fully qualified class name. 
The region server needs to find this class on its own classpath, so it is not 
clear to me how an incompatible version of the software would come about. Is it 
because the jars containing these classes from the compatible version are not 
copied to the region server classpath locations? If so, wouldn't the same 
problem exist with or without table coprocessors?

Do you think TrxRegionEndpoint and TrxRegionObserver wouldn't be loaded if they 
are configured only as table coprocessor attributes?
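
For what it's worth, one can confirm from the client side that both are present 
as table attributes with something like the sketch below (HBase 1.x API; the 
metadata table name is taken from the describe output above).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CheckTableCoprocessors {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      HTableDescriptor desc =
          admin.getTableDescriptor(TableName.valueOf("TRAFODION._MD_.OBJECTS"));
      // hasCoprocessor() only checks the table attribute; it does not prove
      // the region server found the class on its classpath at region open.
      System.out.println("TrxRegionObserver attribute: " + desc.hasCoprocessor(
          "org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionObserver"));
      System.out.println("TrxRegionEndpoint attribute: " + desc.hasCoprocessor(
          "org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionEndpoint"));
    }
  }
}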

PR 777 doesn't remove the property 'hbase.coprocessor.region.classes' from 
hbase-site.xml for cluster installs; it removes it only from 
install_local_hadoop.  I was able to run full regressions without the property 
'hbase.coprocessor.region.classes' in hbase-site.xml. Is there a way to 
simulate the condition you described so that I can verify whether it works as 
expected or as feared?

Selva

-----Original Message-----
From: Sean Broeder [mailto:[email protected]]
Sent: Monday, October 24, 2016 7:59 PM
To: [email protected]
Subject: [DISCUSS] Loading coprocessors from the client side dynamically

Hi,
I have seen some discussion related to pushing/loading coprocessors in the 
regions dynamically from the SQL client side when Trafodion is started or when 
a table is opened.  The motivation here is to be able to install Trafodion 
without having to stop/restart HBase where a customer might have a previous 
HBase installation.

I want to point out that Trafodion uses a TrxRegionEndpoint and a 
TrxRegionObserver coprocessor.  These two coprocessors work in tandem to ensure 
region split, region rebalance, and region recovery are possible for 
multi-table transactions.  These two coprocessors ensure data consistency and 
form the basis of our ACID transaction implementation.  We currently mandate 
that these two coprocessors have entries in the hbase-site.xml file.

While I have no problem loading other coprocessors dynamically, I do not think 
it is a good idea to load these two on the fly without stopping/restarting 
HBase, at least not in the current design.

The issue is that other components in the DTM may have code changes that assume 
the corresponding versions of these coprocessors are loaded and running.  If 
the DTM components are not compatible with the coprocessors, unpredictable 
results could occur, including system hangs, process failures, and data 
corruption.  Furthermore, if the coprocessors are loaded dynamically and the 
entries are not in the hbase-site.xml file, then Trafodion recovery would 
effectively be abandoned before the regions are started and there would be no 
way to correct the inconsistencies short of dropping and recreating the tables.

So the 2 main points I hope the readers of this discussion will take away based 
on the current Trafodion design are:
1) The TrxRegionEndpoint and TrxRegionObserver coprocessors must remain in the 
hbase-site.xml file to ensure recovery occurs before the regions are started 
and open for HBase activity.
2) Trafodion and the above coprocessors must run compatible versions of 
software to ensure proper function.  These two coprocessors are not static, so 
it is likely that if Trafodion code has changed, so have the coprocessors.

Thanks,
Sean
