[ 
https://issues.apache.org/jira/browse/PHOENIX-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088634#comment-15088634
 ] 

William Yang commented on PHOENIX-2520:
---------------------------------------

Updating the metadata periodically has a problem: if another connection 
changes the table metadata in the middle of a period, the current connection 
has to wait for the rest of the period before it can pick up the change. Even 
if it knows that a newer version exists, it cannot fetch it until its cached 
copy is 'old' enough. So I propose the following solution:

Update the metadata once (when the TableRef object is first created) and then 
update on demand. 

1. Table metadata changes can be categorized into 3 types: 'ALWAYS' (current 
behaviour), 'NEVER', 'RARELY'. For the last two types, we should only update 
table metadata the first time we access the table and then update on demand. 
2. If a user adds a new column and then executes a SQL statement that 
references it on an old connection, SQL compilation will fail with 
MetaDataEntityNotFoundException, so we know that it is time to update the 
table metadata explicitly and retry compiling. 
3. The defect of this solution is that it cannot handle column deletion. If 
some connection removes a column, old connections can still access the 
deleted column until they are re-opened. If a user cannot accept this 
behaviour, they should choose the 'ALWAYS' type. So a switch should be 
introduced. 
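
The refresh-and-retry flow in points 2 and 3 can be sketched roughly as 
follows. This is only an illustration, not the code in the patch: every class 
and method name here (OnDemandMetaRefresh, Server, Connection, 
ColumnNotFoundException) is a hypothetical stand-in, and the "compile" step is 
reduced to a column lookup.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins; none of these names come from Phoenix or the patch.
final class OnDemandMetaRefresh {

    /** Stand-in for the compile failure raised when the cached metadata lacks a column. */
    static final class ColumnNotFoundException extends RuntimeException {
        ColumnNotFoundException(String column) {
            super("Unknown column: " + column);
        }
    }

    /** Stand-in for the server-side catalog; counts how often it is contacted. */
    static final class Server {
        final Set<String> columns = new HashSet<>();
        int fetchCount = 0;

        Set<String> fetchColumns() {
            fetchCount++;
            return new HashSet<>(columns);
        }
    }

    /** Client connection that loads metadata on first access and refreshes only on demand. */
    static final class Connection {
        private final Server server;
        private Set<String> cachedColumns; // null until the table is first accessed

        Connection(Server server) {
            this.server = server;
        }

        private void compileOnce(String column) {
            if (cachedColumns == null) {
                cachedColumns = server.fetchColumns(); // one-time initial load
            }
            if (!cachedColumns.contains(column)) {
                throw new ColumnNotFoundException(column);
            }
        }

        /** "Compiles" a statement that references one column, refreshing at most once. */
        void compile(String column) {
            try {
                compileOnce(column);
            } catch (ColumnNotFoundException e) {
                cachedColumns = server.fetchColumns(); // the cache may be stale: refresh
                compileOnce(column);                   // retry; rethrows if still missing
            }
        }
    }
}
```

In this sketch a column added on the server costs one extra fetch on the first 
statement that uses it, while a column removed on the server keeps compiling 
against the stale cache, which is the limitation described in point 3.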

New configurations in hbase-site.xml:
<property>
   <name>phoenix.functions.preferMetaCache.enabled</name>
   <value>true</value>
</property>
<property>
   <name>phoenix.functions.preferMetaCache.enabled.your_table_name</name>
   <value>false</value>
</property>

I introduce two levels of config here: a global config and a table-level 
config. If you enable the 'preferMetaCache' property, table metadata will be 
updated once and then updated on demand. Otherwise, table metadata will be 
updated every time (current behaviour). Users are responsible for deciding 
which mode to use.
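
A minimal sketch of how the two levels might be resolved, with the table-level 
key taking precedence over the global one. The class and method names are 
hypothetical, and two details are assumptions rather than statements about the 
patch: that the table-level key is the global key plus "." plus the table 
name (as in the example above), and that an unset property means "disabled", 
i.e. current behaviour.

```java
import java.util.Map;

// Hypothetical helper; not taken from the patch.
final class PreferMetaCacheConfig {

    static final String GLOBAL_KEY = "phoenix.functions.preferMetaCache.enabled";

    /**
     * Returns true if the cached metadata should be preferred for the given table.
     * A table-level setting, when present, overrides the global setting.
     */
    static boolean isEnabled(Map<String, String> conf, String tableName) {
        String tableValue = conf.get(GLOBAL_KEY + "." + tableName);
        if (tableValue != null) {
            return Boolean.parseBoolean(tableValue);
        }
        // Boolean.parseBoolean(null) is false, so an unset global key means "disabled".
        return Boolean.parseBoolean(conf.get(GLOBAL_KEY));
    }
}
```

With the two properties shown above, this lookup would return false for 
your_table_name and true for every other table.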

Since 'NEVER', 'RARELY' and 'add column' are the most common cases, this 
solution will work well for most scenarios. For the minority of cases it does 
not cover, you can choose 'ALWAYS'. 
For more details, see the patch 'preferMetaCache.patch'.

> Create DDL property for metadata update frequency
> -------------------------------------------------
>
>                 Key: PHOENIX-2520
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2520
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>         Attachments: preferMetaCache.patch
>
>
> On the client-side, Phoenix pings the server when a query is compiled to 
> confirm that the client has the most up-to-date metadata for the table being 
> queried. For some tables that are known to not change, this RPC is wasteful. 
> We can allow a property such as {{UPDATE_METADATA_CACHE_FREQUENCY_MS}} to 
> specify how long to wait before checking with the server to see if the 
> metadata has changed. This could be specified in the CREATE TABLE call and 
> stored in the SYSTEM.CATALOG table header row. By default the value could be 
> 0 which would keep the current behavior. Tables that never change could use 
> Long.MAX_VALUE. Potentially we could allow 'ALWAYS' and 'NEVER' values for 
> convenience.
> Proposed implementation:
> - add {{public long getAge()}} method to {{PTableRef}}.
> - when setting lastAccessTime, also store System.currentTimeMillis() to new 
> {{setAccessTime}} private member variable
> - the getAge() would return {{System.currentTimeMillis() - setAccessTime}}
> - code in MetaDataClient would prevent call to server if age < 
> {{UPDATE_METADATA_CACHE_FREQUENCY_MS}}
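
The age check proposed in the quoted description can be sketched as below. 
This is a hypothetical illustration rather than the actual PTableRef or 
MetaDataClient code: the class name TableRefAge and the needsServerCheck 
helper are invented here, and the current time is passed in as a parameter 
(an assumption made to keep the sketch deterministic).

```java
// Hypothetical sketch; not the real PTableRef.
final class TableRefAge {

    private final long setAccessTime; // wall-clock millis when the entry was stored

    TableRefAge(long nowMillis) {
        this.setAccessTime = nowMillis;
    }

    /** Age of the cached entry in milliseconds, relative to the supplied clock reading. */
    long getAge(long nowMillis) {
        return nowMillis - setAccessTime;
    }

    /** The server is contacted only when the entry is at least as old as the frequency. */
    static boolean needsServerCheck(long ageMillis, long updateCacheFrequencyMs) {
        return ageMillis >= updateCacheFrequencyMs;
    }
}
```

A frequency of 0 makes every compilation check with the server (current 
behaviour), while Long.MAX_VALUE effectively never does.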



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
