[ https://issues.apache.org/jira/browse/PIG-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-2581.
-----------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Patch committed to trunk.
> HashFNV inconsistent/non-deterministic due to default platform encoding
> -----------------------------------------------------------------------
>
> Key: PIG-2581
> URL: https://issues.apache.org/jira/browse/PIG-2581
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.8.1
> Reporter: Daniel Andersson
> Assignee: Prashant Kommireddi
> Priority: Minor
> Attachments: PIG-2581-2.patch, PIG-2581.patch
>
>
> HashFNV (org/apache/pig/piggybank/evaluation/string/HashFNV) bases its
> computation on String.getBytes(), which uses the platform default encoding.
> This leads to different results on different platforms. Worse, if a
> character cannot be encoded in that charset, the behavior of getBytes() is
> unspecified. We have observed non-deterministic behavior that seems to be
> caused by this.
> The suggested fix is to use String.getBytes("UTF-8") instead, which is
> well-defined and consistent on every platform.
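
To illustrate the problem described above, here is a minimal sketch of how the choice of charset changes the bytes fed to a hash, and therefore the result. It is not the piggybank HashFNV code or the committed patch: the class name EncodingHashSketch, the helper names, and the use of 32-bit FNV-1a are assumptions made for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingHashSketch {
    // Standard published constants for 32-bit FNV-1a (illustrative variant;
    // the piggybank UDF may use a different FNV flavor).
    private static final int FNV_OFFSET_BASIS = 0x811C9DC5;
    private static final int FNV_PRIME = 0x01000193;

    // Hash a byte array with 32-bit FNV-1a: XOR in each byte, then multiply.
    static int fnv1a32(byte[] data) {
        int hash = FNV_OFFSET_BASIS;
        for (byte b : data) {
            hash ^= (b & 0xFF);   // treat the byte as unsigned
            hash *= FNV_PRIME;    // int overflow wraps modulo 2^32
        }
        return hash;
    }

    // Hash a string using an explicitly chosen charset, so the result
    // does not depend on the platform default encoding.
    static int hash(String s, Charset charset) {
        return fnv1a32(s.getBytes(charset));
    }

    public static void main(String[] args) {
        String s = "caf\u00e9"; // contains one non-ASCII character

        // The same string yields different byte sequences per charset,
        // so a byte-based hash differs as well.
        System.out.println("UTF-8 bytes:      " + s.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("ISO-8859-1 bytes: " + s.getBytes(StandardCharsets.ISO_8859_1).length);
        System.out.println("UTF-8 hash:       " + Integer.toHexString(hash(s, StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1 hash:  " + Integer.toHexString(hash(s, StandardCharsets.ISO_8859_1)));
    }
}
```

Passing a Charset constant such as StandardCharsets.UTF_8 (available since Java 7) has the same effect as getBytes("UTF-8") but avoids the checked UnsupportedEncodingException. Pure ASCII input hashes identically under both charsets shown here, which is one reason this kind of bug can go unnoticed until non-ASCII data appears.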