[ https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405724#comment-13405724 ]
Zhijie Shen commented on PIG-1314:
----------------------------------

There are some open issues with loading and storing Pig data.

When we store a DateTime object with "Utf8StorageConverter", without using a UDF to convert it to a string first, should we serialize it as a millis+timezone composite, or output an ISO 8601-style datetime string with a UTC offset (e.g., 2012-07-03T08:14:19.962+01:00)? Would the latter behave the same as calling "String ToString(DateTime d)" before storing the string? Personally, I prefer the latter, because the data is directly readable in the stored files.

On the other hand, if a datetime object is stored in the file as a datetime string, when we load it again as a datetime object, should we use the default timezone or the one specified in the string (e.g., +01:00 in the example above)? Again, I prefer the second choice. With Pig, a job may go through a series of store/load steps to reach its goal, so the timezone information needs to be preserved. For example, assume +08:00 is the default timezone. A datetime object whose own timezone is -04:00 is stored as a string with the suffix -04:00. When that string is loaded as a datetime object for further processing, we had better keep the previously used timezone, -04:00, instead of the default one.

What do you think about this? Thanks!

> Add DateTime Support to Pig
> ---------------------------
>
>                 Key: PIG-1314
>                 URL: https://issues.apache.org/jira/browse/PIG-1314
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>    Affects Versions: 0.7.0
>            Reporter: Russell Jurney
>            Assignee: Zhijie Shen
>              Labels: gsoc2012
>         Attachments: PIG-1314-1.patch, PIG-1314-2.patch, joda_vs_builtin.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a
> timestamp component. Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to pig comment on how hard this is?
> We're looking at doing this, rather than use UDFs. Is this a patch that
> would be accepted?
> This is a candidate project for Google Summer of Code 2012. More information
> about the program can be found at
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
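
To make the store/load round trip from the comment concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: it uses the JDK's java.time.OffsetDateTime as a stand-in, not Pig's "Utf8StorageConverter" or the DateTime type from the patch, and the class and method names (DateTimeRoundTrip, store, loadKeepingOffset, loadInDefaultOffset) are hypothetical. It contrasts the two load behaviours using the -04:00 value and the +08:00 default from the example.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Pig: illustrates the two load options.
final class DateTimeRoundTrip {
    // ISO 8601-style pattern with milliseconds and a numeric offset such as -04:00.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSxxx");

    // "Store": write the value as a human-readable string that carries its offset.
    static String store(OffsetDateTime value) {
        return value.format(FORMAT);
    }

    // "Load", second choice in the comment: keep the offset found in the string.
    static OffsetDateTime loadKeepingOffset(String text) {
        return OffsetDateTime.parse(text, FORMAT);
    }

    // "Load", first choice: same instant, but re-expressed in the default offset.
    static OffsetDateTime loadInDefaultOffset(String text, ZoneOffset defaultOffset) {
        return OffsetDateTime.parse(text, FORMAT).withOffsetSameInstant(defaultOffset);
    }

    public static void main(String[] args) {
        ZoneOffset defaultOffset = ZoneOffset.ofHours(8); // assumed default, +08:00
        OffsetDateTime original =
                OffsetDateTime.of(2012, 7, 3, 8, 14, 19, 962_000_000, ZoneOffset.ofHours(-4));

        String stored = store(original);
        System.out.println(stored); // 2012-07-03T08:14:19.962-04:00

        // Keeping the stored offset: a second store writes the identical string.
        System.out.println(store(loadKeepingOffset(stored)));
        // 2012-07-03T08:14:19.962-04:00

        // Using the default offset: the instant is unchanged, but -04:00 is lost.
        System.out.println(store(loadInDefaultOffset(stored, defaultOffset)));
        // 2012-07-03T20:14:19.962+08:00
    }
}
```

In this sketch both load variants represent the same instant; the difference is only whether the original offset survives repeated store/load steps, which is the point raised in the comment.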