[ 
https://issues.apache.org/jira/browse/HIVE-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742877#comment-13742877
 ] 

Hudson commented on HIVE-5105:
------------------------------

FAILURE: Integrated in Hive-trunk-hadoop1-ptest #131 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/131/])
HIVE-5105 HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up 
fieldPositionMap (Eugene Koifman via Sushanth Sowmyan) (khorgath: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514929)
* 
/hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/schema/HCatSchema.java
* 
/hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/data/schema/TestHCatSchema.java

                
> HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up 
> fieldPositionMap
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-5105
>                 URL: https://issues.apache.org/jira/browse/HIVE-5105
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.12.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>             Fix For: 0.12.0
>
>         Attachments: HIVE-5105.patch
>
>
> org.apache.hcatalog.data.schema.HCatSchema.remove(HCatFieldSchema 
> hcatFieldSchema) makes the following call:
> fieldPositionMap.remove(hcatFieldSchema);
> but fieldPositionMap is of type Map<String, Integer> so the element is not 
> getting removed
> Here's a detailed comment from [~sushanth]
> The result is that the name will not be removed from fieldPositionMap. 
> This results in 2 things:
> a) If anyone tries to append a field to a hcatschema after having removed 
> that field, it shouldn't fail, but it will.
> b) If anyone asks for the position of the removed field by name, it will 
> still give the position.
> Now, there is only one place in hcat code where we remove a field, and that 
> is called from HCatOutputFormat.setSchema, where we try to detect if the user 
> specified partition column names in the schema when they shouldn't have, and 
> if they did, we remove it. Normally, people do not specify this, and this 
> check tends to be superfluous.
> Once we do this, we wind up serializing that new object (after performing 
> some validations), and this does appear to stay through the serialization 
> (and eventual deserialization) which is very worrying.
> However, we are luckily saved by the fact that we do not append that field to 
> it at any time (all appends in hcat code are done on newly initialized 
> HCatSchema objects which have had no removes done on them), and we don't ask 
> for the position of something we do not expect to be there (harder to verify 
> for certain, but seems to be the case on inspection).
> The main part that gives me worry is that HCatSchema is part of our public 
> interface for HCat, in that M/R programs that use HCat can use it, and thus, 
> they might have more interesting usage patterns that are hitting this bug.
> I can't think of any currently open bugs that are caused by this because of 
> the rarity of the situation, but nevertheless, something we should fix 
> immediately.
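
The behaviour described in the issue can be reproduced with a small
stand-alone sketch. This is not the HCatalog source and not the committed
patch: the classes Field and Schema and the methods removeBuggy and
removeFixed are hypothetical stand-ins for HCatFieldSchema, HCatSchema and
HCatSchema.remove, and the re-indexing step in removeFixed is an assumption
about what a reasonable fix might do. The sketch relies only on the fact that
java.util.Map.remove takes an Object, so passing a key of the wrong type
compiles but never matches a String key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaRemoveSketch {

    // Hypothetical stand-in for HCatFieldSchema: only the name matters here.
    static final class Field {
        private final String name;
        Field(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Hypothetical stand-in for HCatSchema: a field list plus a name -> position map.
    static final class Schema {
        private final List<Field> fields = new ArrayList<>();
        private final Map<String, Integer> fieldPositionMap = new HashMap<>();

        void append(Field f) {
            // Assumption: appending a name that is already mapped is rejected.
            if (fieldPositionMap.containsKey(f.getName())) {
                throw new IllegalStateException("duplicate field: " + f.getName());
            }
            fields.add(f);
            fieldPositionMap.put(f.getName(), fields.size() - 1);
        }

        // Mirrors the reported bug: the map is keyed by String, but a Field is
        // passed. Map.remove(Object) accepts it and silently removes nothing.
        void removeBuggy(Field f) {
            fields.remove(f);
            fieldPositionMap.remove(f);
        }

        // One possible fix (an assumption, not the actual patch): remove by
        // name, then rebuild positions so later fields are not off by one.
        void removeFixed(Field f) {
            if (!fields.remove(f)) {
                return;
            }
            fieldPositionMap.clear();
            for (int i = 0; i < fields.size(); i++) {
                fieldPositionMap.put(fields.get(i).getName(), i);
            }
        }

        // Returns null when the name is not mapped.
        Integer getPosition(String name) {
            return fieldPositionMap.get(name);
        }
    }

    public static void main(String[] args) {
        Field a = new Field("a");
        Field b = new Field("b");

        Schema buggy = new Schema();
        buggy.append(a);
        buggy.append(b);
        buggy.removeBuggy(a);
        // Symptom (b): the removed field still reports a position.
        System.out.println("buggy position of a: " + buggy.getPosition("a"));
        // Symptom (a): re-appending the removed field fails.
        try {
            buggy.append(new Field("a"));
        } catch (IllegalStateException e) {
            System.out.println("buggy append failed: " + e.getMessage());
        }

        Schema fixed = new Schema();
        fixed.append(a);
        fixed.append(b);
        fixed.removeFixed(a);
        System.out.println("fixed position of a: " + fixed.getPosition("a"));
        fixed.append(new Field("a"));
        System.out.println("fixed position of a after re-append: " + fixed.getPosition("a"));
    }
}
```

In this sketch, removing by name alone would clear the stale entry but leave
the positions of later fields pointing one slot too far, which is why
removeFixed rebuilds the whole map.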

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
