-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9276/#review23388
-----------------------------------------------------------


Hi,

Regarding the earlier discussion between you, Mark, and me: we weren't saying 
to use a regex to decide whether the incoming column is a wildcard. We were 
saying that it should be possible for someone to specify a regex in 
hbase.columns.mapping, which we'd then use to match. However, since we don't 
know the types of the incoming column qualifiers (from HBase), this might be 
tough.

How about this: for now, we require a very simple .* suffix to match all 
characters. That is a valid regex, so when we add regex support later we won't 
have to deal with backward-compatibility issues. Basically, this would mean:

1) Instead of col* matching everything that starts with col, col.* matches 
everything that starts with col.
2) Eliminate the regex matching against hbase.columns.mapping.
3) Add a property, defaulting to true and named something like 
hbase.columns.mapping.regex.matching, so users can turn this off if needed.
4) As you do today, you'd use Bytes.startsWith to do the match. Later we'd 
implement regex matching.
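
A rough sketch of how points 1, 3 and 4 might fit together, in case it helps. 
This is not code from the patch: the class and method names are made up for 
illustration, the local startsWith helper only stands in for HBase's 
Bytes.startsWith so the snippet compiles on its own, and the wildcardEnabled 
flag stands in for the proposed hbase.columns.mapping.regex.matching property.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of the patch.
final class QualifierPrefixSketch {
    // The only pattern accepted for now: a literal prefix followed by ".*".
    static final String WILDCARD = ".*";

    // Returns the literal prefix if the spec ends in ".*", otherwise null.
    static byte[] prefixOf(String qualifierSpec) {
        if (!qualifierSpec.endsWith(WILDCARD)) {
            return null;
        }
        String literal = qualifierSpec.substring(
                0, qualifierSpec.length() - WILDCARD.length());
        return literal.getBytes(StandardCharsets.UTF_8);
    }

    // Stand-in for HBase's Bytes.startsWith(byte[], byte[]).
    static boolean startsWith(byte[] bytes, byte[] prefix) {
        if (bytes.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (bytes[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // wildcardEnabled models the proposed on/off property (assumed default: true).
    // When it is off, or the spec has no ".*" suffix, the qualifier must match
    // the spec byte for byte.
    static boolean matches(String qualifierSpec, byte[] qualifier,
                           boolean wildcardEnabled) {
        byte[] prefix = wildcardEnabled ? prefixOf(qualifierSpec) : null;
        if (prefix == null) {
            return Arrays.equals(
                    qualifierSpec.getBytes(StandardCharsets.UTF_8), qualifier);
        }
        return startsWith(qualifier, prefix);
    }
}
```

With this shape, "col.*" matches the qualifier "col1" when the flag is on, and 
is treated as the literal qualifier "col.*" when the flag is off. Swapping the 
prefix check for real regex matching later should not change what users write 
in the mapping.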

Brock

- Brock Noland


On Feb. 9, 2013, 9:56 p.m., Swarnim Kulkarni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9276/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2013, 9:56 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-3725
>     https://issues.apache.org/jira/browse/HIVE-3725
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Added support for pulling hbase columns just by providing prefixes and a 
> wildcard. So a query now could look something like this:
> 
> CREATE EXTERNAL TABLE hive_hbase_test
> ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*") 
> TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE");
> 
> This would pull in all columns under column family "fam1" that start with 
> "col". This gives a little more flexibility than the pull-all-columns format.
> 
> 
> Diffs
> -----
> 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java 
> a8ba9d9 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 
> d35bb52 
>   hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java 
> e821282 
> 
> Diff: https://reviews.apache.org/r/9276/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests to demonstrate the new functionality. Also made sure that 
> all existing unit tests passed.
> 
> 
> Thanks,
> 
> Swarnim Kulkarni
> 
>
