[ 
https://issues.apache.org/jira/browse/SQOOP-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463723#comment-16463723
 ] 

Daniel Voros commented on SQOOP-3317:
-------------------------------------

Hi [~srikumaran.t], thank you for reporting this!

As far as I can tell, the only validation currently available is an exact 
match on the number of records. "Percentage tolerant" validation is mentioned 
in the documentation but is not implemented.

In my opinion, comparing the number of records is useful only as a sanity 
check, since matching counts don't guarantee that the contents are equal.

However, we could improve the existing implementation by introducing another 
parameter (margin/threshold) so that an exact match is not required, and we 
could also implement "Percentage tolerant" validation.
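
To make the margin/threshold idea concrete, here is a rough sketch of what a 
percentage-tolerant comparison could look like. This is not existing Sqoop 
code: the class name PercentageTolerantThreshold, the tolerancePercent 
parameter and the isWithinTolerance method are all made up for illustration, 
and measuring the tolerance against the source count is just one possible 
choice.

```java
// Hypothetical sketch only; none of these names exist in Sqoop.
final class PercentageTolerantThreshold {

    // Maximum allowed difference, as a percentage of the source row count.
    private final double tolerancePercent;

    PercentageTolerantThreshold(double tolerancePercent) {
        if (tolerancePercent < 0.0 || tolerancePercent > 100.0) {
            throw new IllegalArgumentException(
                "tolerancePercent must be between 0 and 100");
        }
        this.tolerancePercent = tolerancePercent;
    }

    // Returns true if the target count differs from the source count by at
    // most tolerancePercent percent of the source count.
    boolean isWithinTolerance(long sourceCount, long targetCount) {
        if (sourceCount < 0 || targetCount < 0) {
            throw new IllegalArgumentException("row counts must not be negative");
        }
        if (sourceCount == 0) {
            // No meaningful percentage of zero rows; require an exact match.
            return targetCount == 0;
        }
        long difference = Math.abs(sourceCount - targetCount);
        // Multiply instead of dividing to keep the comparison simple.
        return difference * 100.0 <= tolerancePercent * sourceCount;
    }
}
```

With a tolerance of 0 this behaves like the current exact-match check, so an 
option like this could probably default to 0 without changing existing 
behaviour.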

> org.apache.sqoop.validation.RowCountValidator in live RDBMS system
> ------------------------------------------------------------------
>
>                 Key: SQOOP-3317
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3317
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Sri Kumaran Thirupathy
>            Priority: Major
>
> org.apache.sqoop.validation.RowCountValidator is retrieving count from Source 
> after the MR completes. This fails in live RDBMS case.
> org.apache.sqoop.validation.RowCountValidator can retrive count during MR 
> execution phase.  
> Also, How to use Percentage Tolerant? Reference: 
> [https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
