[ 
https://issues.apache.org/jira/browse/DATAFU-23?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883737#comment-13883737
 ] 

Russell Jurney commented on DATAFU-23:
--------------------------------------

I'll close this and add a Truncate date JIRA. Are we OK with these JIRAs being 
in DataFu and not in Pig itself? I'd say this probably belongs in Pig, but my 
time constraints make it hard for me to commit code against Pig directly.

> Create datafu.pig.util.PadZero to pad integers < 10 with 0s
> -----------------------------------------------------------
>
>                 Key: DATAFU-23
>                 URL: https://issues.apache.org/jira/browse/DATAFU-23
>             Project: DataFu
>          Issue Type: Improvement
>            Reporter: Russell Jurney
>         Attachments: DATAFU-23.patch
>
>
> /* Now group by time down to the hour, our time series granularity */
> grouped_by_time = GROUP bytes_in_out BY (GetYear(date_time), GetMonth(date_time), GetDay(date_time), GetHour(date_time));
> bytes_per_hour = FOREACH grouped_by_time GENERATE FLATTEN(group) AS (year, month, day, hour),
>                                                   SUM(bytes_in_out.sc_bytes) AS total_sc_bytes,
>                                                   SUM(bytes_in_out.cs_bytes) AS total_cs_bytes;
> /* Now convert time elements back into a key for HBase */
> bytes_per_hour = FOREACH bytes_per_hour GENERATE ToDate(StringConcat(year, '-', month, '-', day, 'T', hour, ':00:00.000Z')) AS date_time,
>                                                  total_sc_bytes,
>                                                  total_cs_bytes;
> The code above generates malformed ISO 8601 dates such as 
> "2005-1-1T1:00:00.000Z", because GetMonth, GetDay, and GetHour return 
> unpadded integers. A PadZero utility is therefore needed to rebuild 
> ISO 8601 keys after grouping by date pieces.
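
For illustration only, the zero-padding described in the quoted issue could be sketched in plain Java as follows. This is not the attached DATAFU-23.patch: the class name PadZeroSketch and the methods padZero and isoHourKey are hypothetical, and an actual Pig UDF would presumably wrap similar logic in a class extending org.apache.pig.EvalFunc.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not the code in DATAFU-23.patch.
class PadZeroSketch {

    // Left-pads a non-negative integer with zeros to at least `width` digits.
    static String padZero(int value, int width) {
        if (value < 0 || width < 1) {
            throw new IllegalArgumentException("value must be >= 0 and width >= 1");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the JVM default locale.
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // Rebuilds an hour-granularity ISO 8601 key from the grouped date pieces.
    static String isoHourKey(int year, int month, int day, int hour) {
        return padZero(year, 4) + "-" + padZero(month, 2) + "-" + padZero(day, 2)
                + "T" + padZero(hour, 2) + ":00:00.000Z";
    }

    public static void main(String[] args) {
        // prints 2005-01-01T01:00:00.000Z
        System.out.println(isoHourKey(2005, 1, 1, 1));
    }
}
```

With padding applied to each piece, the key for 1 January 2005, 01:00 comes out as "2005-01-01T01:00:00.000Z" rather than the unpadded form shown in the issue.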



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)