Hi,

We hit a problem in HDP Phoenix 4.7.0.2.6.1.0-129 (which includes the schema 
mapping feature from 4.8), and I can see the same problem in master. The problem 
itself is quite easy to understand, but what it can break is a big question.

Class org.apache.phoenix.util.SchemaUtil has getSchemaNameFromFullName 
<https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/SchemaUtil.java#L641>
 and getTableNameFromFullName 
<https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/SchemaUtil.java#L691>
 methods that do not respect the IS_NAMESPACE_MAPPING_ENABLED flag. Moreover, 
the methods treat a namespace as a schema even though the flag is supposed to be 
FALSE by default. I am pretty surprised that this wasn’t discovered before, so 
maybe I am wrong. I found only one related bug and commented on it: PHOENIX-3460 
<https://issues.apache.org/jira/browse/PHOENIX-3460>.

The problem can cause quite nasty bugs. Take a look at the method 
PhoenixRuntime#generateColumnInfo 
<https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L469>
 (by the way, there is another method in SchemaUtil with a different 
implementation) and see how it works with the table name and the cache. The 
method calls PhoenixRuntime#getTable 
<https://github.com/apache/phoenix/blob/b46cbd375e3d2ee9a11644825c13937572c027cd/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L442>
 with the normalized table name. If the cache doesn’t contain the table, it 
tries to update the cache, but with the table name returned by 
getTableNameFromFullName, and fails.

Consider an example (schema mapping feature is disabled):

generateColumnInfo("\"ns:my_table\"") -> getTable("ns:my_table") -> 
MetaDataClient#updateCache("ns", "my_table")

So it looks up the table "ns:my_table" in the cache, but updates the cache 
with "ns.my_table", and then throws an exception mentioning "ns:my_table". 
Whether the problem shows up depends on the cache state, which makes it really 
nasty. The exception with a wrong table name is also something to consider.
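
To make the mismatch concrete, here is a minimal self-contained sketch. It is 
not Phoenix code: the class CacheKeyMismatchSketch, the map-based cache and the 
helper splitIgnoringFlag are hypothetical and only mimic the behavior described 
above (lookup by the full name, update by the name split on ':').

```java
import java.util.HashMap;
import java.util.Map;

public class CacheKeyMismatchSketch {
    // Toy stand-in for the client-side metadata cache, keyed by table name.
    public static final Map<String, String> cache = new HashMap<>();

    // Hypothetical helper: splits on ':' without consulting any
    // namespace-mapping flag, like the methods discussed above.
    public static String[] splitIgnoringFlag(String fullName) {
        int idx = fullName.indexOf(':');
        if (idx < 0) {
            return new String[] {"", fullName};
        }
        return new String[] {fullName.substring(0, idx), fullName.substring(idx + 1)};
    }

    // Assumption: the update stores the entry under "schema.table".
    public static void updateCache(String schema, String table) {
        String key = schema.isEmpty() ? table : schema + "." + table;
        cache.put(key, "metadata for " + key);
    }

    // Looks up by the full name, updates by the split name, looks up again.
    public static String getTable(String fullName) {
        String hit = cache.get(fullName);
        if (hit != null) {
            return hit;
        }
        String[] parts = splitIgnoringFlag(fullName);
        updateCache(parts[0], parts[1]);
        hit = cache.get(fullName);
        if (hit == null) {
            throw new IllegalStateException("Table not found: " + fullName);
        }
        return hit;
    }

    public static void main(String[] args) {
        try {
            getTable("ns:my_table");
        } catch (IllegalStateException e) {
            // The lookup key and the key written by the update differ.
            System.out.println(e.getMessage() + ", cache keys: " + cache.keySet());
        }
    }
}
```

In this toy version the cache ends up holding "ns.my_table" while the lookup 
keeps asking for "ns:my_table", so the call fails even though the update ran.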

Alright, questions:

1) Am I right?
2) Is there any plan to get rid of those methods?
3) Is there any plan to fix those methods, or must namespace mapping always be 
turned on?

If there is any suggestion on how it should be fixed, I can help with that. In 
our project we changed the methods to the default behavior 
(IS_NAMESPACE_MAPPING_ENABLED=false), but I do not see a clear way to fetch this 
flag from the config.
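
One possible shape for flag-aware versions of such methods is sketched below. 
This is only an illustration under assumptions: the class FlagAwareNameSketch 
and its methods are hypothetical, it is not taken from Phoenix or from any 
actual patch, and the flag is passed in as a plain boolean precisely because 
how to obtain it from the config is the open question.

```java
public class FlagAwareNameSketch {
    // Returns the index of the schema/table separator, or -1 if there is none.
    // ':' is honored only when namespace mapping is enabled; '.' always is.
    private static int separatorIndex(String fullName, boolean namespaceMappingEnabled) {
        int idx = namespaceMappingEnabled ? fullName.indexOf(':') : -1;
        if (idx < 0) {
            idx = fullName.indexOf('.');
        }
        return idx;
    }

    // Schema part of the name, or "" when the name has no schema.
    public static String schemaOf(String fullName, boolean namespaceMappingEnabled) {
        int idx = separatorIndex(fullName, namespaceMappingEnabled);
        return idx < 0 ? "" : fullName.substring(0, idx);
    }

    // Table part of the name; the whole name when there is no schema.
    public static String tableOf(String fullName, boolean namespaceMappingEnabled) {
        int idx = separatorIndex(fullName, namespaceMappingEnabled);
        return idx < 0 ? fullName : fullName.substring(idx + 1);
    }

    public static void main(String[] args) {
        // With the flag off, "ns:my_table" stays a single table name.
        System.out.println("[" + schemaOf("ns:my_table", false) + "] "
                + tableOf("ns:my_table", false));
        // With the flag on, "ns" is treated as the schema.
        System.out.println("[" + schemaOf("ns:my_table", true) + "] "
                + tableOf("ns:my_table", true));
    }
}
```

With the flag off, the lookup name and the name used for the cache update stay 
the same string, which is the property the example above is missing.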

// Stas
