[
https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405724#comment-13405724
]
Zhijie Shen commented on PIG-1314:
----------------------------------
There are some issues with loading/storing Pig data. When storing a DateTime
object with "Utf8StorageConverter", without first using UDFs to convert it to a
string, should we serialize it as a millis+timezone composite, or output an
ISO 8601-style datetime string (e.g., 2012-07-03T08:14:19.962+01:00)? The
latter behaves the same as calling "String ToString(DateTime d)" before storing
the string. Personally, I prefer the latter choice, because the data is
directly readable from the stored files.
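To make the two storage options concrete, here is a minimal sketch in Python
(not Pig or Joda-Time code). The helper names and the "millis,offset-minutes"
layout are assumptions made for illustration only; the comment above does not
fix the exact composite format.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis_composite(dt):
    """Hypothetical layout: '<epoch millis>,<UTC offset in minutes>'."""
    # Integer timedelta division avoids floating-point rounding of the millis.
    millis = (dt - EPOCH) // timedelta(milliseconds=1)
    offset_minutes = dt.utcoffset() // timedelta(minutes=1)
    return f"{millis},{offset_minutes}"


def to_iso_string(dt):
    """ISO 8601 string with millisecond precision and the UTC offset."""
    return dt.isoformat(timespec="milliseconds")


# The example value from the comment: 08:14:19.962 at offset +01:00.
dt = datetime(2012, 7, 3, 8, 14, 19, 962000,
              tzinfo=timezone(timedelta(hours=1)))

print(to_millis_composite(dt))  # compact, but not human-readable
print(to_iso_string(dt))        # 2012-07-03T08:14:19.962+01:00
```

Both forms carry the same instant and offset; only the string form can be read
directly from the stored file.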
On the other hand, if a datetime object is stored in the file as a datetime
string, when we load it again as a datetime object, should we use the default
timezone or the one specified in the string (e.g., +01:00 in the previous
example)? Again, I prefer the second choice. In Pig, it is possible to chain
several store/load steps to achieve some goal, so the timezone information
needs to be preserved. For example, assume +08:00 is the default timezone. A
datetime object whose own timezone is -04:00 is stored as a string, which will
have -04:00 as its suffix. When the string is loaded as a datetime object for
further processing, we had better keep the previously used timezone, -04:00,
instead of the default one.
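The difference between the two loading behaviors can be sketched as follows,
again in Python rather than Pig. DEFAULT_TZ and both loader names are
hypothetical; the stored value reuses the -04:00 offset from the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed default timezone of the loading side (+08:00 in the example).
DEFAULT_TZ = timezone(timedelta(hours=8))


def load_keep_offset(text):
    """Parse the string and keep the offset that was written with it."""
    return datetime.fromisoformat(text)


def load_with_default(text):
    """Parse the string, then re-express the instant in the default timezone."""
    return datetime.fromisoformat(text).astimezone(DEFAULT_TZ)


stored = "2012-07-03T08:14:19.962-04:00"
kept = load_keep_offset(stored)
converted = load_with_default(stored)

print(kept.isoformat(timespec="milliseconds"))       # 2012-07-03T08:14:19.962-04:00
print(converted.isoformat(timespec="milliseconds"))  # 2012-07-03T20:14:19.962+08:00
```

Both results denote the same instant, but only the first keeps the original
-04:00 offset, so local fields such as the hour of day survive a store/load
round trip unchanged.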
What do you think about this? Thanks!
> Add DateTime Support to Pig
> ---------------------------
>
> Key: PIG-1314
> URL: https://issues.apache.org/jira/browse/PIG-1314
> Project: Pig
> Issue Type: Bug
> Components: data
> Affects Versions: 0.7.0
> Reporter: Russell Jurney
> Assignee: Zhijie Shen
> Labels: gsoc2012
> Attachments: PIG-1314-1.patch, PIG-1314-2.patch, joda_vs_builtin.zip
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a
> timestamp component. Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to Pig comment on how hard this is?
> We're looking at doing this, rather than use UDFs. Is this a patch that
> would be accepted?
> This is a candidate project for Google Summer of Code 2012. More information
> about the program can be found at
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012