[ https://issues.apache.org/jira/browse/PIG-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228623#comment-13228623 ]

Daniel Dai commented on PIG-2581:
---------------------------------
Sounds good, will you make a patch?
> HashFNV inconsistent/non-deterministic due to default platform encoding
> -----------------------------------------------------------------------
>
> Key: PIG-2581
> URL: https://issues.apache.org/jira/browse/PIG-2581
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.8.1
> Reporter: Daniel Andersson
> Priority: Minor
>
> HashFNV (org/apache/pig/piggybank/evaluation/string/HashFNV) bases its
> computation on String.getBytes(), which uses the platform default encoding.
> This leads to different results on different platforms. Worse, if any
> character cannot be encoded in the default encoding, the behavior of
> String.getBytes() is unspecified. We have observed non-deterministic
> behavior that seems to be caused by this.
>
> The suggested fix is to use String.getBytes("UTF-8") instead, which is
> well-defined and consistent on every platform.
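
For illustration, here is a minimal sketch of why the encoding matters. It is
not the piggybank HashFNV source: the class name FnvEncodingSketch and the
32-bit FNV-1a routine are stand-ins chosen for this example, and the actual
UDF may use a different FNV variant or width. The sketch passes the charset
explicitly, so the same string yields the same bytes (and the same hash) on
every platform. It uses StandardCharsets, which requires Java 7 or later; on
older JVMs the equivalent is getBytes("UTF-8"), which declares the checked
UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; not the piggybank HashFNV class.
public class FnvEncodingSketch {
    private static final int FNV_OFFSET_BASIS_32 = 0x811C9DC5;
    private static final int FNV_PRIME_32 = 0x01000193;

    // 32-bit FNV-1a over raw bytes: XOR each byte in, then multiply by the prime.
    static int fnv1a32(byte[] data) {
        int hash = FNV_OFFSET_BASIS_32;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= FNV_PRIME_32;
        }
        return hash;
    }

    // Hash a string using an explicitly named charset instead of the platform default.
    static int hash(String s, Charset charset) {
        return fnv1a32(s.getBytes(charset));
    }

    public static void main(String[] args) {
        String input = "caf\u00e9"; // contains one non-ASCII character

        // The byte sequences differ between encodings, so a hash computed
        // over the platform-default bytes depends on where the code runs.
        System.out.println("UTF-8 bytes:      " + input.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("ISO-8859-1 bytes: " + input.getBytes(StandardCharsets.ISO_8859_1).length);

        // Naming the charset pins the result regardless of platform settings.
        System.out.println("UTF-8 hash:       "
                + Integer.toHexString(hash(input, StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1 hash:  "
                + Integer.toHexString(hash(input, StandardCharsets.ISO_8859_1)));
    }
}
```

Hashes already stored from a platform whose default encoding was not UTF-8
would no longer match for non-ASCII input after such a change, which may be
worth noting in a patch.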