FWIW I was doing very similar substring (etc) machinations until we
started converting output from back-end data-generation tools directly
into parquet (using other tools). IMO it's a common enough format (at
least in the types of data you and I likely have to work with :-) that
it'd be great if there was direct support for it. If there is, I also
missed it and would also be most appreciative of which incantations to
use to take advantage of it.

On Wed, Oct 18, 2017 at 10:49 AM, Charles Givre <cgi...@gmail.com> wrote:
> Hello Drillers,
> I have a silly question which I’m a little stuck with.  I have some data in 
> CSV format with dates in the following format:  2017-08-10T09:12:26.000Z.  
> I’m trying to convert this into a date time data field so that I have both 
> the date and the hours, however I keep running into road blocks.   I’ve tried 
> the CAST( field AS DATE ) but in doing so I lose the time component.  I’ve 
> tried the TO_TIMESTAMP function, however the only success I’ve had is using 
> the substring function to remove the timezone at the end, then use 
> regex_replace to get rid of the literal ’T’ in the middle of the string, then 
> TO_NUMBER.  (See query below)
>
> SELECT TO_TIMESTAMP( REGEXP_REPLACE( SUBSTR( <field>, 1,19), 'T', ' '), 
> 'yyyy-MM-dd HH:mm:ss'  ) AS dt,
> EXTRACT(
> hour FROM
> TO_TIMESTAMP( REGEXP_REPLACE( SUBSTR( <field>, 1,19), 'T', ' '), 'yyyy-MM-dd 
> HH:mm:ss'  )
>  ) AS dt_hour
>
> I’d think you could do this directly with the TO_TIMESTAMP function however, 
> I can’t figure out how include the literal ’T’ in the formatting string.  The 
> escape character seems to be the single quote which also is the only 
> character allowed to denote the formatting string.
>
> So, questions:
> 1.  Is there any way to include a literal character in a joda date format?
> 2.  Is it possible to use any character besides a single quote to mark the 
> beginning/end of a format string?
> 3.  Are there any ways to do this that I’m missing?
>
> Thanks!
> —C
>

Reply via email to