Hi Keith,
You should also consider writing you own UDF that takes in the date in 
"American" format and spits out a lexicographical string.
That way you don't have to modify your base data, just use this newly created 
from_american_date(String date) UDF to get your new date string in

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 
e: mgro...@oanda.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 
----- Original Message -----
From: "Keith Wiley" <kwi...@keithwiley.com>
To: user@hive.apache.org
Sent: Tuesday, March 13, 2012 1:47:12 PM
Subject: Re: order by date

I see how I misled you, sorry.  I wasn't implying that my csv data is cleanly 
represented in yyyy-MM-dd format.  I was just asking syntactically how to use 
date functions in HiveQL because I hadn't found any examples and I used 
yyyy-MM-dd in my example.  The dates in my csv tables are often in "American" 
format, month first without leading zeroes, e.g., "3/31/2012 7:00".  The lack 
of leading zeroes and the unsortabled date format make the dates difficult to 
work with.  I was thinking I could use the date functions with some other 
format to sort them (I guess it would be "M/d/yyyy h:mm" or something like 
that).

I admit, I didn't actually go to the trouble of providing the correct pattern 
string in my earlier post, I was focused on the HiveQL syntax in that post, not 
the precise date pattern given to the date function.

So yeah, I'm still trying to determine the best way to sort queries against the 
date-time columns.  One option is to read/write the entire tables with a date 
conversion to a lexicographic format.  Another option -- my original question 
in this thread -- was how I might use hive's date functions at the time a query 
is performed.

What do you think is the best way to deal with this?

Thanks.

On Mar 13, 2012, at 10:35 , Tucker, Matt wrote:

> I'm a bit confused.  It sounds like you're already storing your dates as 
> strings in a 'yyyy-mm-dd' format.  In that case, you can just sort by 
> dateColName.  There's no issue with using UNIX_TIMESTAMP() in the order by 
> clause, as it outputs integer values.
> 
> Most of the date functions in hive take arguments in string format, with a 
> few functions that will translate between unix timestamps and datetime 
> strings.
> 
> Matt Tucker


________________________________________________________________________________
Keith Wiley     kwi...@keithwiley.com     keithwiley.com    music.keithwiley.com

"I do not feel obliged to believe that the same God who has endowed us with
sense, reason, and intellect has intended us to forgo their use."
                                           --  Galileo Galilei
________________________________________________________________________________

Reply via email to