Re: Review Request 67429: SQOOP-3331: Add mainframe integration test for GDG

2018-06-14 Thread Chris Teoh


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > Hi Chris!
> > 
> > Thank you for your patch this is a great addition to our mainframe 
> > connector testing!
> > Could we use this image for the other mainframe tests as well?
> > 
> > When I tried to run it on my machine I got an exception first the reason 
> > was that org.apache.sqoop.manager.MainframeManager#options had to be 
> > deleted because it is already added to ConnManager, can you please include 
> > this change in your patch as well?
> > See my other comments inline.

Yes it can be used for Sequential datasets as well. I have not yet put in any 
testing for Partitioned Datasets as I'm not using those at the moment.

Yes I can add the patch.


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > build.xml
> > Lines 887 (patched)
> > 
> >
> > Do we need to be able to set gdg, gdg.filename and gdg.md5 from Ant?
> > For host, port, username and password it makes sense because in theory 
> > they could be different in some test environments but gdg, gdg.filename and 
> > gdg.md5 seems to be a test data which could be defined as a static field in 
> > your test only.

I put them in for flexibility. I have already set the default values in the 
file as well so if they don't get overridden, we shouldn't have a problem. Let 
me know if you want this changed.
```
  
  
  
  
  
  
  
```


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > src/scripts/thirdpartytest/docker-compose/sqoop-thirdpartytest-db-services.yml
> > Lines 115 (patched)
> > 
> >
> > Did you create this image using a dockerfile? If yes, is it possible to 
> > attach to include it in your patch?

Thanks for your review. I have the Dockerfile and accompanying contents at 
https://github.com/christeoh/zos-ftpmock/

Let me know if I need to do anything else to reference this. I will also 
mention it in the JIRA.


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
> > Lines 42 (patched)
> > 
> >
> > There are unfortunately too many StringUtils classes on the classpath, 
> > can you please use org.apache.commons.lang3.StringUtils?

Thanks. I have updated this.


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
> > Lines 112 (patched)
> > 
> >
> > Do we need this method here? It initializes a MainframeManager but it 
> > is not used later and I can see that similar parameters are configured in 
> > getArgv method.

Yes. The base class was throwing errors saying it did not recognise the 
mainframe parameters.


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
> > Lines 163 (patched)
> > 
> >
> > Do we need to redefine runImport here or can we just use 
> > org.apache.sqoop.testutil.ImportJobTestCase#runImport(org.apache.sqoop.tool.SqoopTool,
> >  java.lang.String[])?

Yes, I did not want to use the base function as it called getSqoopOptions(conf) 
and creates a new SqoopOptions instance. I put in some configuration values in 
SqoopOptions instance in my setup() which are needed for the test.


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
> > Lines 185 (patched)
> > 
> >
> > Can we remove this comment?

Done.


> On June 14, 2018, 10:35 p.m., Szabolcs Vasas wrote:
> > src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
> > Lines 216 (patched)
> > 
> >
> > You can remove this method since it is empty.

Done.


- Chris


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67429/#review204767
---


On June 15, 2018, 10:18 a.m., Chris Teoh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67429/
> ---
> 
> (Updated June 15, 2018, 10:18 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Added simple test for GDG dataset. Fixed bug triggered by recent Parquet 
> patch.
> 
> 
> Diffs
> -

Re: Review Request 67429: SQOOP-3331: Add mainframe integration test for GDG

2018-06-14 Thread Chris Teoh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67429/
---

(Updated June 15, 2018, 10:18 a.m.)


Review request for Sqoop.


Changes
---

As per review feedback.


Repository: sqoop-trunk


Description (updated)
---

Added simple test for GDG dataset. Fixed bug triggered by recent Parquet patch.


Diffs (updated)
-

  build.xml a85705fa 
  src/java/org/apache/sqoop/manager/MainframeManager.java 4e8be155 
  
src/scripts/thirdpartytest/docker-compose/sqoop-thirdpartytest-db-services.yml 
2f4a07f6 
  src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67429/diff/4/

Changes: https://reviews.apache.org/r/67429/diff/3-4/


Testing (updated)
---

Running the integration and unit test on a patched version of trunk.


Thanks,

Chris Teoh



[jira] [Updated] (SQOOP-3331) Add Mainframe FTP integration test for GDG dataset.

2018-06-14 Thread Chris Teoh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SQOOP-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Teoh updated SQOOP-3331:
--
Description: 
Current mainframe import functionality doesn't have integration tests, only 
unit tests. Adding this test improves validation of this functionality.

Dockerfile and accompanying contents at 
https://github.com/christeoh/zos-ftpmock/


  was:Current mainframe import functionality doesn't have integration tests, 
only unit tests. Adding this test improves validation of this functionality.


> Add Mainframe FTP integration test for GDG dataset.
> ---
>
> Key: SQOOP-3331
> URL: https://issues.apache.org/jira/browse/SQOOP-3331
> Project: Sqoop
>  Issue Type: Test
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Minor
>
> Current mainframe import functionality doesn't have integration tests, only 
> unit tests. Adding this test improves validation of this functionality.
> Dockerfile and accompanying contents at 
> https://github.com/christeoh/zos-ftpmock/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 66548: Importing as ORC file to support full ACID Hive tables

2018-06-14 Thread Boglarka Egyed

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66548/#review204773
---



Hi Daniel,

Thanks for this improvement, your change generally LGTM, I also like the amount 
of test cases you attached to this change!

I have a couple of minor findings, please find them below, also I ran unit 
tests and the newly added TestOrcConversionContext failed for me with the 
following error message:

Testcase: testGetConverterDatetimeTypes took 0.011 sec
Caused an ERROR
org.apache.orc.storage.serde2.io.DateWritable cannot be cast to 
org.apache.hadoop.hive.serde2.io.DateWritable
java.lang.ClassCastException: org.apache.orc.storage.serde2.io.DateWritable 
cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritable
at 
org.apache.sqoop.util.TestOrcConversionContext.testGetConverterDatetimeTypes(TestOrcConversionContext.java:97)

Could you please take a look?

Thanks in advance,
Bogi


src/java/org/apache/sqoop/util/OrcUtil.java
Lines 38-47 (patched)


Input parameter overrides is missing from javadoc here.



src/test/org/apache/sqoop/TestAllTables.java
Line 41 (original), 52 (patched)


Could you please revert this to prevent wild card import usage?



src/test/org/apache/sqoop/TestOrcImport.java
Lines 62-81 (patched)


Would you mind use the newly introduced ArgumentArrayBuilder logic in tests 
instead? It has been added in https://reviews.apache.org/r/65607/


- Boglarka Egyed


On May 2, 2018, 12:12 p.m., daniel voros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66548/
> ---
> 
> (Updated May 2, 2018, 12:12 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3311
> https://issues.apache.org/jira/browse/SQOOP-3311
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID 
> by default. This will probably result in increased usage of ACID tables and 
> the need to support importing into ACID tables with Sqoop.
> 
> Currently the only table format supporting full ACID tables is ORC.
> 
> The easiest and most effective way to support importing into these tables 
> would be to write out files as ORC and keep using LOAD DATA as we do for all 
> other Hive tables (supported since HIVE-17361).
> 
> Workaround could be to create table as textfile (as before) and then CTAS 
> from that. This would push the responsibility of creating ORC format to Hive. 
> However it would result in writing every record twice; in text format and in 
> ORC.
> 
> Note that ORC is only necessary for full ACID tables. Insert-only (aka. 
> micromanaged) ACID tables can use arbitrary file format.
> 
> Supporting full ACID tables would also be the first step in making 
> "lastmodified" incremental imports work with Hive.
> 
> 
> Diffs
> -
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/docs/man/import-common-args.txt 22e3448e 
>   src/docs/man/sqoop-import-all-tables.txt 6db38ad8 
>   src/docs/user/export-purpose.txt def6ead3 
>   src/docs/user/hcatalog.txt 2ae1d54d 
>   src/docs/user/help.txt 8a0d1477 
>   src/docs/user/import-all-tables.txt fbad47b2 
>   src/docs/user/import-purpose.txt c7eca606 
>   src/docs/user/import.txt 2d074f49 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b542102 
>   src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java c62ee98c 
>   src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2c474b7e 
>   src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION 
>   src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION 
>   src/test/org/apache/sqoop/TestAllTables.java 16933a82 
>   src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java f6d591b7 
>   src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f64 
>   src/test/org/apache/sqoop/util/TestOrcConversionContext.java PRE-CREATION 
>   src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/66548/diff/11/
> 
> 
> Testing
> ---
> 
> - added some unit tests
> - tested basic Hive import scenarios on a cluster
> 
> 
> Thanks,
> 
> daniel voros
> 
>



Re: Review Request 67429: SQOOP-3331: Add mainframe integration test for GDG

2018-06-14 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67429/#review204767
---



Hi Chris!

Thank you for your patch this is a great addition to our mainframe connector 
testing!
Could we use this image for the other mainframe tests as well?

When I tried to run it on my machine I got an exception first the reason was 
that org.apache.sqoop.manager.MainframeManager#options had to be deleted 
because it is already added to ConnManager, can you please include this change 
in your patch as well?
See my other comments inline.


build.xml
Lines 887 (patched)


Do we need to be able to set gdg, gdg.filename and gdg.md5 from Ant?
For host, port, username and password it makes sense because in theory they 
could be different in some test environments but gdg, gdg.filename and gdg.md5 
seems to be a test data which could be defined as a static field in your test 
only.



src/scripts/thirdpartytest/docker-compose/sqoop-thirdpartytest-db-services.yml
Lines 115 (patched)


Did you create this image using a dockerfile? If yes, is it possible to 
attach to include it in your patch?



src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
Lines 42 (patched)


There are unfortunately too many StringUtils classes on the classpath, can 
you please use org.apache.commons.lang3.StringUtils?



src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
Lines 112 (patched)


Do we need this method here? It initializes a MainframeManager but it is 
not used later and I can see that similar parameters are configured in getArgv 
method.



src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
Lines 163 (patched)


Do we need to redefine runImport here or can we just use 
org.apache.sqoop.testutil.ImportJobTestCase#runImport(org.apache.sqoop.tool.SqoopTool,
 java.lang.String[])?



src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
Lines 185 (patched)


Can we remove this comment?



src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java
Lines 216 (patched)


You can remove this method since it is empty.


- Szabolcs Vasas


On June 5, 2018, 1:23 p.m., Chris Teoh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67429/
> ---
> 
> (Updated June 5, 2018, 1:23 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> Added simple test for GDG dataset. This depends on a bug being fixed when 
> SQOOP-3224 ships.
> 
> 
> Diffs
> -
> 
>   build.xml a85705fa 
>   
> src/scripts/thirdpartytest/docker-compose/sqoop-thirdpartytest-db-services.yml
>  2f4a07f6 
>   src/test/org/apache/sqoop/manager/mainframe/MainframeManagerImportTest.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/67429/diff/3/
> 
> 
> Testing
> ---
> 
> Running the test on a patched version of trunk.
> 
> 
> Thanks,
> 
> Chris Teoh
> 
>