[ https://issues.apache.org/jira/browse/DATAFU-23?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883453#comment-13883453 ]
Will Vaughan commented on DATAFU-23:
------------------------------------

For this specific use case, it would probably be easier to use http://pig.apache.org/docs/r0.11.0/api/org/apache/pig/piggybank/evaluation/datetime/truncate/ISOToHour.html to generate the appropriate date time string.

> Create datafu.pig.util.PadZero to pad integers < 10 with 0s
> -----------------------------------------------------------
>
>                 Key: DATAFU-23
>                 URL: https://issues.apache.org/jira/browse/DATAFU-23
>             Project: DataFu
>          Issue Type: Improvement
>            Reporter: Russell Jurney
>         Attachments: DATAFU-23.patch
>
>
> /* Now group by time down to the hour, our time series granularity */
> grouped_by_time = GROUP bytes_in_out BY (GetYear(date_time),
>     GetMonth(date_time), GetDay(date_time), GetHour(date_time));
> bytes_per_hour = FOREACH grouped_by_time GENERATE FLATTEN(group) AS (year,
>     month, day, hour),
>     SUM(bytes_in_out.sc_bytes) AS total_sc_bytes,
>     SUM(bytes_in_out.cs_bytes) AS total_cs_bytes;
> /* Now convert time elements back into a key for HBase */
> bytes_per_hour = FOREACH bytes_per_hour GENERATE ToDate(StringConcat(year,
>     '-', month, '-', day, 'T', hour, ':00:00.000Z')) AS date_time,
>     total_sc_bytes,
>     total_cs_bytes;
>
> The previous code will erroneously generate bad ISO8601 dates, looking like
> this: "2005-1-1:1:00:00.000Z"
> Therefore a PadZero utility is needed to regenerate ISO8601 keys after
> grouping by date pieces.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
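
To make the two approaches in this thread concrete, here is a minimal sketch in Python rather than Pig. It is not the DATAFU-23 patch and not the Piggybank implementation: `pad_zero`, `hour_key`, and `truncate_to_hour` are hypothetical names chosen for illustration. The first two show the zero-padding the issue asks for. The third mimics the idea behind the suggested ISOToHour route, which truncates the timestamp to the hour so the key never has to be reassembled from integer parts.

```python
from datetime import datetime


def pad_zero(value: int, width: int = 2) -> str:
    """Left-pad a non-negative integer with zeros (hypothetical stand-in for PadZero)."""
    return str(value).zfill(width)


def hour_key(year: int, month: int, day: int, hour: int) -> str:
    """Rebuild an hourly ISO 8601 key from integer date parts, padding each part."""
    return f"{year:04d}-{pad_zero(month)}-{pad_zero(day)}T{pad_zero(hour)}:00:00.000Z"


def truncate_to_hour(iso: str) -> str:
    """Truncate an ISO 8601 UTC timestamp string to the start of its hour.

    Assumes the input looks like 2005-01-01T01:23:45.678Z.
    """
    parsed = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.strftime("%Y-%m-%dT%H:00:00.000Z")


if __name__ == "__main__":
    # Naive concatenation of unpadded parts, as in the quoted Pig script.
    print(f"{2005}-{1}-{1}T{1}:00:00.000Z")
    # Padded reconstruction.
    print(hour_key(2005, 1, 1, 1))
    # Truncation of the original timestamp, with no reconstruction needed.
    print(truncate_to_hour("2005-01-01T01:23:45.678Z"))
```

Truncating the original timestamp avoids the padding problem entirely, which is why it is likely the simpler option when the input is already an ISO 8601 string.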