[ 
https://issues.apache.org/jira/browse/PIG-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12975000#action_12975000
 ] 

Michael Brauwerman commented on PIG-1781:
-----------------------------------------

New patch PIG-1781.3.patch, supersedes PIG-1781.2.patch

Changes between PIG-1781.2.patch and  PIG-1781.3.patch:
* Changed ISOHelper's dateTime() to dateTimeParser(), which more closely 
matches the pre-patch behavior of accepting a date or time or dateTime.
* Renamed parseDate() to parseDateTime(), and renamed corresponding tests.
* Added a public constant ISOHelper#DEFAULT_DATE_TIME_ZONE to advertise the 
fact that ISOHelper has its own default time zone for parsing.
* Added code to ISOHelper#parseDateTime to use UTC as default time zone for 
parsing ambiguous dates.
* Added code to save/restore the System's default time zone. **This is a change 
in behavior from the pre-patch code.***

* Added unit tests:
** to illustrate various corner cases of date/time/dateTime and 
UTC/other-time-zone/no-time-zone parsing
** to illustrate how default time zones are managed.

* ran 'svn diff' at base of pig tree.

"ant clean test" in contrib directory succeeded.


I hope this version addresses everyone's concerns, but let me know if more 
changes are needed.

Regarding the general issue of time zones. I see the options for behavior like 
so:
(A) Use the System default time zone;
(B) Use ISOHelper's preferred time zone (UTC) to parse ambiguous times, but 
don't touch System's default time zone;
(C) Use ISOHelper's preferred time zone (UTC) to parse ambiguous times, and 
mutate the System time zone to match.
(D) Let the Pig User set a time zone for parsing ambiguous dates, independently 
of the System default time zone, with a parser default to either UTC or 
System's default.

The pre-patch 0.8.0 code behaves as option (C). This patch changes the behavior 
to be option (B) (with a tiny, unavoidable, race condition). (I think option 
(D) would be best, but that is more work that no one has requested yet.) 

Let me know if you disagree with my choice.

(I think the best practice for Pig Users to assign timezones as early as 
possible in the data processing pipeline, and keep dates tagged with time zones 
throughout the pipeline.)

> Piggybank: ISOToDay disregards timezone (should use ISODateTimeFormat instead 
> of DateTime to parse)
> ---------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1781
>                 URL: https://issues.apache.org/jira/browse/PIG-1781
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Michael Brauwerman
>         Attachments: PIG-1781.2.patch, PIG-1781.3.patch
>
>
> (Apologies if this is the wrong place to file Piggybank bugs)
> Bug in 
> http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/datetime/truncate/ISOToDay.java?view=markup
> and other 
> http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/datetime/truncate/
>  classes that copy-paste the same code.
> These classes parse dates like so:
>       DateTimeZone.setDefault(DateTimeZone.UTC);      
>       DateTime dt = new DateTime((String)input.get(0).toString()); 
> This has two problems:
> (1) It messes up JVM static state by changing the DateTimeZone default time 
> zone.
> (2) It ignore timezone information in the input string, so times like 
> "2009-12-09T23:59:59-0800" get truncated to "2009-12-10T00:00:00Z", which is 
> the wrong day of year. 
> Instead, they should use something like this, which respects the input 
> timezone and does not modify any global state:
>   DateTime dt 
> ISODateTimeFormat.dateTime().withOffsetParsed().parseDateTime(isoDateString);
> I have not provided a patch, because I'm not really set up to hack on 
> Piggybank locally.
> As a workaround, I am copy-pasting the classes into my own packages, and 
> making the desired change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to