[ 
https://issues.apache.org/jira/browse/PHOENIX-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453283#comment-16453283
 ] 

James Taylor commented on PHOENIX-4484:
---------------------------------------

One option we discussed offline is to disable the Omid garbage collector 
while an index is being built. One idea would be to delay the call to the 
compaction hook that does the GC if an index is being built, by wrapping the 
coprocessor in our own Phoenix delegate coprocessor. In the simple case, in 
which the HBase table maps directly to a Phoenix index table, it's easy since 
we can look up the table in SYSTEM.CATALOG and detect if it's an index being 
built. However, if it's an index on a view (i.e. a shared physical table) or a 
local index (multiple indexes in the same physical table as the data table but 
in a different column family), it's difficult. You'd have to scan the entire 
SYSTEM.CATALOG and try to figure out if the HBase table being compacted 
corresponds to any index that is in a building state. It might also end up 
delaying the compaction too often, since it would delay if any index is being 
built (which may cover only a very small portion of the overall table).
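
To make the trade-off concrete, here is a rough sketch of the check such a 
delegate would need to make before letting the GC run. It is not Phoenix or 
Omid code: the class, the enum, and the in-memory map standing in for a scan 
of SYSTEM.CATALOG are all hypothetical, and the real coprocessor hook is 
left out.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these names exist in Phoenix or Omid.
class CompactionGcGate {

    // Assumed simplification of the index states recorded in the catalog.
    enum IndexState { BUILDING, ACTIVE, DISABLED }

    /**
     * Decides whether the GC should be delayed for a compaction of the
     * given physical HBase table. The map stands in for what a scan of
     * SYSTEM.CATALOG would produce: physical table name -> states of
     * every index stored in that physical table.
     */
    static boolean shouldDelayGc(String physicalTable,
                                 Map<String, List<IndexState>> indexStatesByTable) {
        List<IndexState> states =
                indexStatesByTable.getOrDefault(physicalTable, List.of());
        // A single building index is enough to delay GC for the whole table.
        return states.contains(IndexState.BUILDING);
    }

    public static void main(String[] args) {
        Map<String, List<IndexState>> catalog = Map.of(
                // Simple case: one HBase table per index.
                "IDX_A", List.of(IndexState.BUILDING),
                "IDX_B", List.of(IndexState.ACTIVE),
                // Shared physical table holding several view indexes.
                "SHARED_IDX", List.of(IndexState.ACTIVE,
                                      IndexState.ACTIVE,
                                      IndexState.BUILDING));

        for (String table : List.of("IDX_A", "IDX_B", "SHARED_IDX", "DATA_T")) {
            System.out.println(table + " delay GC: " + shouldDelayGc(table, catalog));
        }
    }
}
```

The SHARED_IDX entry shows the over-delay concern: one building index among 
several active ones holds back the GC for the entire shared table.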

I think we should brainstorm whether there are other solutions. Would it be 
possible, [~ohads], to have a flag in the commit table that would be used to 
determine whether or not the GC happens? Or would another alternative be to 
keep the index table out of Omid's control until after the initial 
population is finished?

> Write directly to HBase when creating an index for transactional table
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-4484
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4484
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Ohad Shacham
>            Assignee: Ohad Shacham
>            Priority: Major
>
> Today, when creating an index table for a non-empty data table, the writes 
> are performed using the transaction API. This both consumes client-side 
> memory, for storing the write set, and performs conflict analysis upon 
> commit. This is redundant and can be replaced by direct writes to HBase. For 
> this reason, a new function should be added to the transaction abstraction 
> layer that writes directly to HBase in the Tephra case and adds shadow cells 
> with the fence ID in the Omid case. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
