Thanks for the quick response. We are selling an encryption product to customers who use HDFS, so encryption is a requirement.
Any other suggestions?

Sam
________________________________________
From: Michael Segel [michael_se...@hotmail.com]
Sent: Wednesday, October 17, 2012 6:10 PM
To: user@hadoop.apache.org
Subject: Re: Hive Query with UDF

You don't need a UDF...
You encrypt the string 'Ann' first, then use that encrypted value in the Select statement.

That should make things a bit simpler.

On Oct 17, 2012, at 8:04 PM, Sam Mohamed <sam.moha...@voltage.com> wrote:

> I have some encrypted data in an HDFS csv that I've created a Hive table
> for, and I want to run a Hive query that first encrypts the query param, then
> does the lookup. I have a UDF that does encryption as follows:
>
> public class ParamEncrypt extends UDF {
>
>     public Text evaluate(String name) throws Exception {
>
>         String result = new String();
>
>         if (name == null) { return null; }
>
>         result = ParamData.encrypt(name);
>
>         return new Text(result);
>     }
> }
>
> Then I run the Hive query as:
>
> select * from cc_details where first_name = encrypt('Ann');
>
> The problem is, it's running encrypt('Ann') across every single record in the
> table. I want it to do the encryption once, then do the matchup. I've tried:
>
> select * from cc_details where first_name in (select encrypt('Ann') from
> cc_details limit 1);
>
> But Hive doesn't support **IN** or select queries in the where clause.
>
> What can I do?
>
> Can I do something like:
>
> select encrypt('Ann') as ann from cc_details where first_name = ann;
>
> That also doesn't work because the query parser throws an error saying
> **ann** is not a known column
>
> Thanks,
>
> Sam
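
One possible way to reduce the per-record cost described in the quoted message, offered only as a sketch: keep calling the function for every row, but remember the result for each distinct input so the expensive encryption runs once per value. The class name MemoizingEncryptor and the encryptFn parameter are hypothetical and are not part of Hive or of the original UDF; encryptFn simply stands in for ParamData.encrypt. The example is plain Java with no Hive dependency, so wiring it into evaluate() is left as an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper (not a Hive class): caches encrypted values per input.
final class MemoizingEncryptor {
    // Stand-in for the real encryption call, e.g. ParamData.encrypt (assumed).
    private final Function<String, String> encryptFn;
    private final Map<String, String> cache = new HashMap<>();
    // Counts how often the underlying encryption actually ran.
    private int encryptCalls = 0;

    MemoizingEncryptor(Function<String, String> encryptFn) {
        this.encryptFn = encryptFn;
    }

    String encrypt(String plain) {
        if (plain == null) {
            return null; // mirror the null handling of the original UDF
        }
        String cached = cache.get(plain);
        if (cached == null) {
            // First time this input is seen: do the expensive work once.
            cached = encryptFn.apply(plain);
            encryptCalls++;
            cache.put(plain, cached);
        }
        return cached;
    }

    int encryptCalls() {
        return encryptCalls;
    }
}
```

With a constant argument such as 'Ann', the underlying encryption would run once per cache instance rather than once per record. Each JVM running the query would hold its own cache, so this lowers the cost but does not move the encryption out of the query the way encrypting the literal beforehand does.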