[ https://issues.apache.org/jira/browse/HIVE-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sushanth Sowmyan updated HIVE-5105:
-----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
> HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up
> fieldPositionMap
> -------------------------------------------------------------------------------------
>
> Key: HIVE-5105
> URL: https://issues.apache.org/jira/browse/HIVE-5105
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Affects Versions: 0.12.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Fix For: 0.12.0
>
> Attachments: HIVE-5105.patch
>
>
> org.apache.hcatalog.data.schema.HCatSchema.remove(HCatFieldSchema
> hcatFieldSchema) makes the following call:
> fieldPositionMap.remove(hcatFieldSchema);
> but fieldPositionMap is of type Map<String, Integer>, so the call looks up an
> HCatFieldSchema among String keys and the entry is never removed.
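
To illustrate the mismatch described above, here is a minimal sketch in plain Java. The Field class is a hypothetical stand-in for HCatFieldSchema and is not the real HCatalog type; only the Map<String, Integer> behavior is the point. The call compiles because Map.remove accepts any Object, which is why the compiler does not flag the mistake.

```java
import java.util.HashMap;
import java.util.Map;

class RemoveMismatchDemo {
    // Hypothetical stand-in for HCatFieldSchema; only the name matters here.
    static final class Field {
        final String name;

        Field(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> fieldPositionMap = new HashMap<>();
        fieldPositionMap.put("year", 0);

        Field year = new Field("year");

        // Compiles, because Map.remove takes Object, but no String key
        // is ever equal to a Field, so nothing is removed.
        Integer removed = fieldPositionMap.remove(year);
        System.out.println(removed);                 // null
        System.out.println(fieldPositionMap.size()); // 1

        // Removing by the key type (the field name) does remove the entry.
        removed = fieldPositionMap.remove(year.name);
        System.out.println(removed);                 // 0
        System.out.println(fieldPositionMap.size()); // 0
    }
}
```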
> Here's a detailed comment from [~sushanth]:
> The result is that the name will not be removed from fieldPositionMap.
> This results in 2 things:
> a) If anyone tries to append a field to an HCatSchema after having removed
> that field, the append should succeed, but it will fail.
> b) If anyone asks for the position of the removed field by name, it will
> still give the position.
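
Both consequences can be reproduced with a small sketch. MiniSchema below is a hypothetical, simplified stand-in for HCatSchema in which fields are reduced to their names; it assumes that append rejects a name already present in the position map, as (a) implies. The "fixed" flag enables one possible repair (rebuilding the map after a removal), which is not necessarily what HIVE-5105.patch does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for HCatSchema (not the real class).
class MiniSchema {
    private final List<String> fieldNames = new ArrayList<>();
    private final Map<String, Integer> fieldPositionMap = new HashMap<>();
    private final boolean fixed;

    MiniSchema(boolean fixed) {
        this.fixed = fixed;
    }

    void append(String name) {
        // Assumption: duplicate names are rejected based on the position map.
        if (fieldPositionMap.containsKey(name)) {
            throw new IllegalStateException("duplicate field name: " + name);
        }
        fieldNames.add(name);
        fieldPositionMap.put(name, fieldNames.size() - 1);
    }

    void remove(String name) {
        if (!fieldNames.remove(name)) {
            return; // nothing to remove
        }
        if (fixed) {
            // One possible repair: rebuild the map so that the removed name
            // disappears and the remaining positions stay consistent.
            fieldPositionMap.clear();
            for (int i = 0; i < fieldNames.size(); i++) {
                fieldPositionMap.put(fieldNames.get(i), i);
            }
        }
        // Buggy variant: the map is left untouched, mimicking a remove call
        // made with the wrong key type.
    }

    Integer getPosition(String name) {
        return fieldPositionMap.get(name);
    }

    public static void main(String[] args) {
        MiniSchema buggy = new MiniSchema(false);
        buggy.append("a");
        buggy.append("b");
        buggy.remove("a");
        System.out.println(buggy.getPosition("a")); // 0, a stale position (b)
        try {
            buggy.append("a");                      // fails (a)
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With new MiniSchema(true), the same sequence leaves getPosition("a") as null after the removal, and the later append succeeds.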
> Now, there is only one place in hcat code where we remove a field, and that
> is called from HCatOutputFormat.setSchema, where we try to detect if the user
> specified partition column names in the schema when they shouldn't have, and
> if they did, we remove it. Normally, people do not specify this, and this
> check tends to be superfluous.
> Once we do this, we wind up serializing that new object (after performing
> some validations), and the stale entry does appear to survive serialization
> (and eventual deserialization), which is very worrying.
> However, we are luckily saved by the fact that we do not append that field to
> it at any time (all appends in hcat code are done on newly initialized
> HCatSchema objects which have had no removes done on them), and we don't ask
> for the position of something we do not expect to be there (harder to verify
> for certain, but seems to be the case on inspection).
> The main part that gives me worry is that HCatSchema is part of our public
> interface for HCat, in that M/R programs that use HCat can use it, and thus,
> they might have more interesting usage patterns that are hitting this bug.
> I can't think of any currently open bugs that are caused by this, because of
> the rarity of the situation, but it is nevertheless something we should fix
> immediately.