> On Feb. 5, 2013, 3:43 a.m., Mark Grover wrote:
> > hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java, line 
> > 192
> > <https://reviews.apache.org/r/9276/diff/1/?file=254957#file254957line192>
> >
> >     This seems like a limited case of pattern matching. Swarnim, any way we 
> > can support generic regex matching instead?
> 
> Swarnim Kulkarni wrote:
>     Mark, in this case I specifically wanted to only allow strings that end 
> with exactly the character "*", and using String#endsWith seemed simpler 
> and more readable than a regex. Do you still want me to replace this with 
> regex matching?
> 
> Brock Noland wrote:
>     I think the issue is that this would make it difficult to implement 
> enhanced pattern matching later. Implementing it now, you'd only need to 
> specify:
>     
>     col.*
>     
>     in the table configuration. Now the issue would be detecting if the 
> particular column was a regex pattern. Because #, comma, and : are used as 
> separators that would exclude those characters from being used.
> 
> Swarnim Kulkarni wrote:
>     Thanks Brock. Makes sense. To be sure I am understanding you right, the 
> change now would be just to replace the "parts[1].endsWith(*)" with something 
> more regexy that would still imply that the string ends with "*". Correct?

I think that should do it.

Personally, I think having limited regex matching is just going to confuse 
people, so if you could implement (and test) full Java-style regex matching 
(like we do for RegexSerDe for example), that would be fantastic. Of course, 
let me know if you have questions!
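
To make the difference concrete, here is a minimal sketch of the two 
behaviours discussed above. It is not the code from the patch: the class and 
method names are hypothetical, and it assumes a column qualifier spec such as 
"col*" or "col.*" has already been split out of hbase.columns.mapping. It only 
uses String#endsWith, String#startsWith and java.util.regex.Pattern#matches.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names come from HBaseSerDe.
class QualifierMatchSketch {

    // Prefix-wildcard semantics as described in the review request:
    // "col*" selects every qualifier that starts with "col".
    // A spec without a trailing "*" must match the qualifier exactly.
    static boolean matchesPrefixWildcard(String spec, String qualifier) {
        if (spec.endsWith("*")) {
            String prefix = spec.substring(0, spec.length() - 1);
            return qualifier.startsWith(prefix);
        }
        return spec.equals(qualifier);
    }

    // Full Java regex semantics: the whole qualifier must match the pattern.
    static boolean matchesRegex(String spec, String qualifier) {
        return Pattern.matches(spec, qualifier);
    }

    public static void main(String[] args) {
        String[] qualifiers = {"col1", "column", "co", "other"};
        for (String q : qualifiers) {
            System.out.println(q
                + " prefix(col*)=" + matchesPrefixWildcard("col*", q)
                + " regex(col.*)=" + matchesRegex("col.*", q)
                + " regex(col*)=" + matchesRegex("col*", q));
        }
    }
}
```

Note that "col*" read as a Java regex means "co" followed by zero or more "l" 
characters, so it does not match "col1". That is why the regex form of the 
same mapping would be written "col.*", and why mixing the two notations is 
likely to confuse people.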

Thanks for doing this, BTW!


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9276/#review16080
-----------------------------------------------------------


On Feb. 3, 2013, 1:04 a.m., Swarnim Kulkarni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9276/
> -----------------------------------------------------------
> 
> (Updated Feb. 3, 2013, 1:04 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Description
> -------
> 
> Added support for pulling hbase columns just by providing prefixes and a 
> wildcard. So a query now could look something like this:
> 
> CREATE EXTERNAL TABLE hive_hbase_test
> ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*") 
> TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE");
> 
> This would pull in all columns under column family "fam1" which start with 
> "col". This gives a little more flexibility than the pull-all-columns format.
> 
> 
> This addresses bug HIVE-3725.
>     https://issues.apache.org/jira/browse/HIVE-3725
> 
> 
> Diffs
> -----
> 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java 
> a8ba9d9 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 
> d35bb52 
>   hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java 
> e821282 
> 
> Diff: https://reviews.apache.org/r/9276/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests to demonstrate the new functionality. Also made sure that 
> all existing unit tests passed.
> 
> 
> Thanks,
> 
> Swarnim Kulkarni
> 
>
