Hi Bogi,

Thanks for providing direction.
As you suggested, I explored further, resolved the issue, and was able to test
the fix with trunk-based code changes in my Hadoop cluster.

Root cause of my issue: the 1.4.6 code base uses the same Avro version that is
already in my Hadoop cluster, so there is no issue with that jar, whereas the
trunk code base uses the avro-1.8.1 jar, which is not available in my Hadoop
cluster.

Can you suggest how to run the unit tests for this component? I tried the
"test" target and I am getting failures, as below:

Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.415 sec
[junit] Running com.cloudera.sqoop.TestDirectImport
[junit] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 13.705 sec
[junit] Test com.cloudera.sqoop.TestDirectImport FAILED
[junit] Running com.cloudera.sqoop.TestExport
[junit] Tests run: 17, Failures: 0, Errors: 17, Skipped: 0, Time elapsed: 22.564 sec
[junit] Test com.cloudera.sqoop.TestExport FAILED
[junit] Running com.cloudera.sqoop.TestExportUpdate

Do I need to make any changes? I am running the "test" target from Eclipse.

Thanks,
Jilani

On Thu, Mar 9, 2017 at 9:42 PM, Jilani Shaik <jilani2...@gmail.com> wrote:

> Hi Bogi,
>
> - Prepared the jar from trunk with the "jar-all" target
> - Copied the jar to /opt/mapr/sqoop/sqoop-1.4.6/
> - Moved the existing jar out to another location
> - Then executed the command below to do the import:
>
> sqoop import --connect jdbc:mysql://10.0.0.300/database123 --verbose
> --username test --password test123$ --table payment -m 2 --hbase-table
> /database/demoapp/hbase/payment --column-family pay --hbase-row-key
> payment_id --incremental lastmodified --merge-key payment_id --check-column
> last_update --last-value '2017-01-08 08:02:05.0'
>
> I followed the same steps for both the jar built from trunk and the one
> built from the 1.4.6 branch.
>
> Where do you suspect the multiple Avro jars: at the time the jar is built,
> or when running the command with the jar?
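Since the suspicion above is an older or duplicate Avro on the classpath, a small standalone diagnostic can make the situation visible. The sketch below is illustrative only (the class name `AvroDiagnostic` is made up, and it is not part of Sqoop): it lists classpath entries mentioning "avro" and reports where `org.apache.avro.LogicalType` resolves from, if anywhere. `LogicalType` was added in Avro 1.8.x, which is why a trunk build compiled against avro-1.8.1 fails on a cluster that ships an older Avro.

```java
import java.io.File;
import java.security.CodeSource;

// Hypothetical diagnostic (not part of Sqoop): lists classpath entries that
// mention "avro" and reports which jar supplies org.apache.avro.LogicalType.
public class AvroDiagnostic {
    public static void main(String[] args) {
        String cp = System.getProperty("java.class.path");
        for (String entry : cp.split(File.pathSeparator)) {
            if (entry.toLowerCase().contains("avro")) {
                System.out.println("classpath entry: " + entry);
            }
        }
        try {
            Class<?> c = Class.forName("org.apache.avro.LogicalType");
            CodeSource src = c.getProtectionDomain().getCodeSource();
            System.out.println("LogicalType loaded from: "
                    + (src == null ? "bootstrap classpath" : src.getLocation()));
        } catch (ClassNotFoundException e) {
            System.out.println("LogicalType: NOT FOUND on classpath");
        }
    }
}
```

Running it with the same classpath the sqoop launcher script assembles should show whether an older Avro jar from the Hadoop cluster shadows avro-1.8.1, or whether the 1.8.1 jar is simply absent.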
>
>
> Thanks,
> Jilani
>
> On Thu, Mar 9, 2017 at 9:21 AM, Boglarka Egyed <b...@cloudera.com> wrote:
>
>> Hi Jilani,
>>
>> I suspect that you have an old version of Avro, or even multiple Avro
>> versions, on your classpath, and thus Sqoop uses an older one.
>>
>> Could you please provide a list of the exact commands you have performed
>> so that I can reproduce the issue?
>>
>> Thanks,
>> Bogi
>>
>> On Thu, Mar 9, 2017 at 2:51 AM, Jilani Shaik <jilani2...@gmail.com>
>> wrote:
>>
>>> Can someone give me pointers on what I am missing between the trunk and
>>> 1.4.6 builds, which is producing the error mentioned in the mail chain
>>> below?
>>>
>>> I followed the same ant target to prepare the jar for both branches,
>>> but the 1.4.6 jar still differs from the 1.4.7 jar created from trunk.
>>>
>>> Thanks,
>>> Jilani
>>>
>>> On Wed, Mar 8, 2017 at 3:29 AM, Jilani Shaik <jilani2...@gmail.com>
>>> wrote:
>>>
>>> > Hi Bogi,
>>> >
>>> > I am getting the error below when I prepare the jar from trunk and
>>> > try a Sqoop import of a MySQL database table, whereas similar changes
>>> > work on branch 1.4.6:
>>> >
>>> > 17/03/08 01:06:25 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7-SNAPSHOT
>>> > 17/03/08 01:06:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.
>>> > 17/03/08 01:06:25 WARN tool.BaseSqoopTool: Setting your password on the
>>> > command-line is insecure. Consider using -P instead.
>>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Loaded manager factory:
>>> > org.apache.sqoop.manager.oracle.OraOopManagerFactory
>>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Loaded manager factory:
>>> > com.cloudera.sqoop.manager.DefaultManagerFactory
>>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
>>> > org.apache.sqoop.manager.oracle.OraOopManagerFactory
>>> > 17/03/08 01:06:25 DEBUG oracle.OraOopManagerFactory: Data Connector for
>>> > Oracle and Hadoop can be called by Sqoop!
>>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
>>> > com.cloudera.sqoop.manager.DefaultManagerFactory
>>> > 17/03/08 01:06:25 DEBUG manager.DefaultManagerFactory: Trying with
>>> > scheme: jdbc:mysql:
>>> > Exception in thread "main" java.lang.NoClassDefFoundError:
>>> > org/apache/avro/LogicalType
>>> >     at org.apache.sqoop.manager.DefaultManagerFactory.accept(DefaultManagerFactory.java:67)
>>> >     at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184)
>>> >     at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:270)
>>> >     at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:97)
>>> >     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:617)
>>> >     at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
>>> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>> >     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
>>> >     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
>>> >     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
>>> >     at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
>>> > Caused by: java.lang.ClassNotFoundException: org.apache.avro.LogicalType
>>> >     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>> >     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>> >     ... 11 more
>>> >
>>> > Please let me know what is missing and how to resolve this exception.
>>> > Let me know if you need further details.
>>> >
>>> > Thanks,
>>> > Jilani
>>> >
>>> > On Wed, Mar 1, 2017 at 4:38 AM, Boglarka Egyed <b...@cloudera.com>
>>> > wrote:
>>> >
>>> >> Hi Jilani,
>>> >>
>>> >> This is an example: SQOOP-3053
>>> >> <https://issues.apache.org/jira/browse/SQOOP-3053> with the review
>>> >> <https://reviews.apache.org/r/54206/> linked.
>>> >> Please make your changes on trunk, as it will be used to cut the
>>> >> future release, so your patch definitely needs to be able to apply
>>> >> on it.
>>> >>
>>> >> Thanks,
>>> >> Bogi
>>> >>
>>> >> On Wed, Mar 1, 2017 at 3:46 AM, Jilani Shaik <jilani2...@gmail.com>
>>> >> wrote:
>>> >>
>>> >> > Hi Bogi,
>>> >> >
>>> >> > Can you provide me with sample JIRA tickets and review requests
>>> >> > similar to this one, so I can proceed further?
>>> >> >
>>> >> > I applied the code changes to the Sqoop git branch
>>> >> > "sqoop-release-1.4.6-rc0". If you suggest the right branch, I will
>>> >> > take the code from there and apply the changes before submitting
>>> >> > the review request.
>>> >> >
>>> >> > Thanks,
>>> >> > Jilani
>>> >> >
>>> >> > On Mon, Feb 27, 2017 at 3:05 AM, Boglarka Egyed <b...@cloudera.com>
>>> >> > wrote:
>>> >> >
>>> >> >> Hi Jilani,
>>> >> >>
>>> >> >> To get your change committed please do the following:
>>> >> >> * Open a JIRA ticket for your change in Apache's JIRA system
>>> >> >> <https://issues.apache.org/jira/browse/SQOOP/> for project Sqoop
>>> >> >> * Create a review request at Apache's review board
>>> >> >> <https://reviews.apache.org/r/> for project Sqoop and link it to
>>> >> >> the JIRA ticket
>>> >> >>
>>> >> >> Please consider the guidelines below:
>>> >> >>
>>> >> >> Review board
>>> >> >> * Summary: generate your summary using the issue's JIRA key + JIRA
>>> >> >> title
>>> >> >> * Groups: add the relevant group (Sqoop) so everyone on the
>>> >> >> project will know about your patch
>>> >> >> * Bugs: add the issue's JIRA key so it's easy to navigate to the
>>> >> >> JIRA side
>>> >> >> * Repository: sqoop-trunk for Sqoop1 or sqoop-sqoop2 for Sqoop2
>>> >> >> * And as soon as the patch gets committed, it's very useful for
>>> >> >> the community if you close the review and mark it as "Submitted"
>>> >> >> at the review board.
>>> >> >> The button to do this is at the top right of your own tickets,
>>> >> >> right next to the Download Diff button.
>>> >> >>
>>> >> >> Jira
>>> >> >> * Link: please add the link of the review as an external/web link
>>> >> >> so it's easy to navigate to the review side
>>> >> >> * Status: mark it as "patch available"
>>> >> >>
>>> >> >> The Sqoop community will receive emails about your new ticket and
>>> >> >> review request and will review your change.
>>> >> >>
>>> >> >> Thanks,
>>> >> >> Bogi
>>> >> >>
>>> >> >> On Sat, Feb 25, 2017 at 2:14 AM, Jilani Shaik
>>> >> >> <jilani2...@gmail.com> wrote:
>>> >> >>
>>> >> >> > Do we have any update?
>>> >> >> >
>>> >> >> > I checked out the 1.4.6 code, made code changes to achieve
>>> >> >> > this, and tested them in a cluster; it is working as expected.
>>> >> >> > Is there a way I can contribute this as a patch, so that the
>>> >> >> > committers can validate it further and suggest any changes
>>> >> >> > required to move forward? Please suggest the approach.
>>> >> >> > >>> >> >> > Thanks, >>> >> >> > Jilani >>> >> >> > >>> >> >> > On Sun, Feb 5, 2017 at 10:41 PM, Jilani Shaik < >>> jilani2...@gmail.com> >>> >> >> > wrote: >>> >> >> > >>> >> >> > > Hi Liz, >>> >> >> > > >>> >> >> > > lets say we inserted data in a table with initial import, that >>> >> looks >>> >> >> like >>> >> >> > > this in hbase shell >>> >> >> > > >>> >> >> > > 1 column=pay:amount, >>> >> >> > > timestamp=1485129654025, value=4.99 >>> >> >> > > 1 column=pay:customer_id, >>> >> >> > > timestamp=1485129654025, value=1 >>> >> >> > > 1 column=pay:last_update, >>> >> >> > > timestamp=1485129654025, value=2017-01-23 05:29:09.0 >>> >> >> > > 1 column=pay:payment_date, >>> >> >> > > timestamp=1485129654025, value=2005-05-25 11:30:37.0 >>> >> >> > > 1 column=pay:rental_id, >>> >> >> > > timestamp=1485129654025, value=573 >>> >> >> > > 1 column=pay:staff_id, >>> >> >> > > timestamp=1485129654025, value=1 >>> >> >> > > 10 column=pay:amount, >>> >> >> > > timestamp=1485129504390, value=5.99 >>> >> >> > > 10 column=pay:customer_id, >>> >> >> > > timestamp=1485129504390, value=1 >>> >> >> > > 10 column=pay:last_update, >>> >> >> > > timestamp=1485129504390, value=2006-02-15 22:12:30.0 >>> >> >> > > 10 column=pay:payment_date, >>> >> >> > > timestamp=1485129504390, value=2005-07-08 03:17:05.0 >>> >> >> > > 10 column=pay:rental_id, >>> >> >> > > timestamp=1485129504390, value=4526 >>> >> >> > > 10 column=pay:staff_id, >>> >> >> > > timestamp=1485129504390, value=2 >>> >> >> > > >>> >> >> > > >>> >> >> > > now assume that in source rental_id becomes NULL for rowkey >>> "1", >>> >> and >>> >> >> then >>> >> >> > > we are doing incremental import into HBase. With current >>> import the >>> >> >> final >>> >> >> > > HBase data after incremental import will look like this. 
>>> >> >> > > >>> >> >> > > 1 column=pay:amount, >>> >> >> > > timestamp=1485129654025, value=4.99 >>> >> >> > > 1 column=pay:customer_id, >>> >> >> > > timestamp=1485129654025, value=1 >>> >> >> > > 1 column=pay:last_update, >>> >> >> > > timestamp=1485129654025, value=2017-02-05 05:29:09.0 >>> >> >> > > 1 column=pay:payment_date, >>> >> >> > > timestamp=1485129654025, value=2005-05-25 11:30:37.0 >>> >> >> > > 1 column=pay:rental_id, >>> >> >> > > timestamp=1485129654025, value=573 >>> >> >> > > 1 column=pay:staff_id, >>> >> >> > > timestamp=1485129654025, value=1 >>> >> >> > > 10 column=pay:amount, >>> >> >> > > timestamp=1485129504390, value=5.99 >>> >> >> > > 10 column=pay:customer_id, >>> >> >> > > timestamp=1485129504390, value=1 >>> >> >> > > 10 column=pay:last_update, >>> >> >> > > timestamp=1485129504390, value=2017-02-05 05:12:30.0 >>> >> >> > > 10 column=pay:payment_date, >>> >> >> > > timestamp=1485129504390, value=2005-07-08 03:17:05.0 >>> >> >> > > 10 column=pay:rental_id, >>> >> >> > > timestamp=1485129504390, value=126 >>> >> >> > > 10 column=pay:staff_id, >>> >> >> > > timestamp=1485129504390, value=2 >>> >> >> > > >>> >> >> > > >>> >> >> > > >>> >> >> > > As source column "rental_id" becomes NULL for rowkey "1", the >>> final >>> >> >> HBase >>> >> >> > > should not have the "rental_id" for this rowkey "1". I am >>> expecting >>> >> >> below >>> >> >> > > data for these rowkeys. 
>>> >> >> > > >>> >> >> > > >>> >> >> > > 1 column=pay:amount, >>> >> >> > > timestamp=1485129654025, value=4.99 >>> >> >> > > 1 column=pay:customer_id, >>> >> >> > > timestamp=1485129654025, value=1 >>> >> >> > > 1 column=pay:last_update, >>> >> >> > > timestamp=1485129654025, value=2017-02-05 05:29:09.0 >>> >> >> > > 1 column=pay:payment_date, >>> >> >> > > timestamp=1485129654025, value=2005-05-25 11:30:37.0 >>> >> >> > > 1 column=pay:staff_id, >>> >> >> > > timestamp=1485129654025, value=1 >>> >> >> > > 10 column=pay:amount, >>> >> >> > > timestamp=1485129504390, value=5.99 >>> >> >> > > 10 column=pay:customer_id, >>> >> >> > > timestamp=1485129504390, value=1 >>> >> >> > > 10 column=pay:last_update, >>> >> >> > > timestamp=1485129504390, value=2017-02-05 05:12:30.0 >>> >> >> > > 10 column=pay:payment_date, >>> >> >> > > timestamp=1485129504390, value=2005-07-08 03:17:05.0 >>> >> >> > > 10 column=pay:rental_id, >>> >> >> > > timestamp=1485129504390, value=126 >>> >> >> > > 10 column=pay:staff_id, >>> >> >> > > timestamp=1485129504390, value=2 >>> >> >> > > >>> >> >> > > >>> >> >> > > Please let me know if anything required further. >>> >> >> > > >>> >> >> > > >>> >> >> > > Thanks, >>> >> >> > > Jilani >>> >> >> > > >>> >> >> > > On Tue, Jan 31, 2017 at 3:38 AM, Erzsebet Szilagyi < >>> >> >> > > liz.szila...@cloudera.com> wrote: >>> >> >> > > >>> >> >> > >> Hi Jilani, >>> >> >> > >> I'm not sure I completely understand what you are trying to >>> do. >>> >> Could >>> >> >> > you >>> >> >> > >> give us some examples with e.g. 4 columns and 2 rows of >>> example >>> >> data >>> >> >> > >> showing the changes that happen compared to the changes you'd >>> >> like to >>> >> >> > see? 
>>> >> > >> Thanks,
>>> >> > >> Liz
>>> >> > >>
>>> >> > >> On Tue, Jan 31, 2017 at 5:18 AM, Jilani Shaik
>>> >> > >> <jilani2...@gmail.com> wrote:
>>> >> > >>
>>> >> > >> > Please help in resolving this issue. I am going through the
>>> >> > >> > source code, and somehow the required behavior is missing,
>>> >> > >> > but I am not sure whether it was left out for some reason.
>>> >> > >> >
>>> >> > >> > Please give me some suggestions on how to handle this
>>> >> > >> > scenario.
>>> >> > >> >
>>> >> > >> > Thanks,
>>> >> > >> > Jilani
>>> >> > >> >
>>> >> > >> > On Sun, Jan 22, 2017 at 6:45 PM, Jilani Shaik
>>> >> > >> > <jilani2...@gmail.com> wrote:
>>> >> > >> >
>>> >> > >> >> Hi,
>>> >> > >> >>
>>> >> > >> >> We have a scenario where we are importing data into HBase
>>> >> > >> >> with Sqoop incremental import.
>>> >> > >> >>
>>> >> > >> >> Let's say we imported a table, and later the source table
>>> >> > >> >> got updated with NULL values in some columns for some rows.
>>> >> > >> >> Then, after an incremental import, those columns should no
>>> >> > >> >> longer be present in the HBase table. But right now those
>>> >> > >> >> columns remain as-is, with their previous values.
>>> >> > >> >>
>>> >> > >> >> Is there any fix to overcome this issue?
>>> >> > >> >>
>>> >> > >> >> Thanks,
>>> >> > >> >> Jilani
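The behavior asked for throughout this thread, where columns that became NULL at the source are removed from HBase on re-import instead of being left stale, comes down to splitting each incoming row into cells to write and cells to delete. Below is a minimal, self-contained sketch of that split; plain Java collections stand in for Sqoop's record type and HBase's Put/Delete mutations, so the class and method names here are illustrative, not Sqoop's actual implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: models the put-vs-delete split that a
// null-aware HBase incremental import needs. In real code, the "put"
// columns would go into an org.apache.hadoop.hbase.client.Put and the
// "delete" columns into a Delete for the same row key.
public class NullAwareRowSplit {
    // Non-null columns should be written; columns whose source value
    // became NULL should be removed from the HBase row.
    static Map<String, List<String>> split(Map<String, String> row) {
        List<String> puts = new ArrayList<>();
        List<String> deletes = new ArrayList<>();
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (e.getValue() == null) {
                deletes.add(e.getKey());
            } else {
                puts.add(e.getKey());
            }
        }
        Map<String, List<String>> result = new LinkedHashMap<>();
        result.put("put", puts);
        result.put("delete", deletes);
        return result;
    }

    public static void main(String[] args) {
        // Row key "1" from the example data above: rental_id became NULL
        // at the source during the incremental window.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("pay:amount", "4.99");
        row.put("pay:customer_id", "1");
        row.put("pay:last_update", "2017-02-05 05:29:09.0");
        row.put("pay:rental_id", null);
        Map<String, List<String>> m = split(row);
        System.out.println("put: " + m.get("put"));
        System.out.println("delete: " + m.get("delete"));
        // prints:
        // put: [pay:amount, pay:customer_id, pay:last_update]
        // delete: [pay:rental_id]
    }
}
```

The key design point is that skipping NULL columns (the current behavior described above) and deleting them are different operations in HBase: a row that is simply not re-put keeps its old cell, so the importer has to issue an explicit Delete for each column that went NULL.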