[ 
https://issues.apache.org/jira/browse/SQOOP-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273601#comment-14273601
 ] 

Jarek Jarcec Cecho commented on SQOOP-1988:
-------------------------------------------

Thank you for bringing this one up [~stanleyxu2005]. I was asking myself the 
same question when I was reviewing some of the recent patches, but I didn't 
have time to dig into it a bit more.

I believe that the schema matcher should do only schema matching and should 
not do any data conversion. Hence I also believe that this code should be 
removed from the schema matcher. I think that the original purpose was to 
allow arbitrary CSV support. I think that we had a JIRA to cover custom 
{{NULL}} representations, but I'm having difficulty finding it. As we are 
insisting on using a constant {{NULL}} in our CSV IDF, I would assume that we 
should simply drop the code in question. It should be the job of the IDF to 
convert any incoming values to proper {{NULL}} objects if it wants to support 
multiple {{NULL}} representations. What do you think?
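
To illustrate that split of responsibilities, here is a minimal sketch of 
null-token normalization done on the IDF side instead of in the matcher. The 
class {{NullTokenNormalizer}} and its {{normalize}} method are hypothetical 
names used only for illustration; they are not existing Sqoop classes, and a 
real CSV IDF would likely work on typed columns instead of plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of Sqoop): an IDF could use something like
// this to map its own NULL representations to Java null before the record
// reaches the schema matcher.
public class NullTokenNormalizer {
    private final Set<String> nullTokens;

    // The set of tokens treated as NULL is chosen by the IDF, not the matcher.
    public NullTokenNormalizer(String... tokens) {
        this.nullTokens =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(tokens)));
    }

    // Returns a new array in which every field that is already null or that
    // equals one of the configured tokens is replaced by null.
    public Object[] normalize(String[] fields) {
        Object[] out = new Object[fields.length];
        for (int i = 0; i < fields.length; i++) {
            String field = fields[i];
            out[i] = (field == null || nullTokens.contains(field)) ? null : field;
        }
        return out;
    }

    public static void main(String[] args) {
        // Tokens taken from the issue description below; assumed, not a spec.
        NullTokenNormalizer normalizer =
            new NullTokenNormalizer("NULL", "null", "'null'", "");
        Object[] record = normalizer.normalize(new String[] {"1", "NULL", "", "abc"});
        System.out.println(Arrays.toString(record)); // prints [1, null, null, abc]
    }
}
```

With this kind of conversion in the IDF, the matcher would only need to 
rearrange fields and could pass values through without inspecting them.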

> Sqoop2: isNull handling should be moved to CSVIntermediateDataFormat
> --------------------------------------------------------------------
>
>                 Key: SQOOP-1988
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1988
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Qian Xu
>            Assignee: Qian Xu
>             Fix For: 2.0.0
>
>
> The {{Matcher.getMatchingData}} method is expected to rearrange record fields 
> according to the FROM and TO schema. Currently there is an extra step in the 
> implementation, which resets any {{null}}, {{"NULL"}}, {{"null"}}, 
> {{"'null'"}}, or {{""}} field to null. 
> As there is no comment or documentation about this, I guess it is some 
> undocumented special handling. [Here is some 
> discussion|https://issues.apache.org/jira/browse/SQOOP-1811?focusedCommentId=14270755&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14270755].
> I think this check does not belong here, and I propose removing it. As the 
> method is called very frequently, removing the code should also improve 
> performance. Thanks [~jerrychenhf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
