[jira] [Assigned] (SQOOP-3134) Add option to configure Avro schema output file name with (import + --as-avrodatafile)
[ https://issues.apache.org/jira/browse/SQOOP-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Voros reassigned SQOOP-3134:
-----------------------------------
    Assignee: Daniel Voros  (was: Eric Lin)

> Add option to configure Avro schema output file name with (import + --as-avrodatafile)
> --------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3134
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3134
>             Project: Sqoop
>          Issue Type: Improvement
>            Reporter: Markus Kemper
>            Assignee: Daniel Voros
>            Priority: Major
>         Attachments: SQOOP-3134.1.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Please consider adding an option to configure the Avro schema output file name that is created with Sqoop (import + --as-avrodatafile), example cases below.
> {noformat}
> #
> # STEP 01 - Create Data
> #
> export MYCONN=jdbc:mysql://mysql.cloudera.com:3306/db_coe
> export MYUSER=sqoop
> export MYPSWD=cloudera
> sqoop list-tables --connect $MYCONN --username $MYUSER --password $MYPSWD
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "drop table t1"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table t1 (c1 int, c2 date, c3 varchar(10))"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into t1 values (1, current_date, 'some data')"
> sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1"
> ----------------------------------
> | c1 | c2         | c3        |
> ----------------------------------
> | 1  | 2017-02-13 | some data |
> ----------------------------------
> #
> # STEP 02 - Import + --table + --as-avrodatafile
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table t1 --target-dir /user/root/t1 --delete-target-dir --num-mappers 1 --as-avrodatafile
> ls -l ./*
> Output:
> 17/02/13 12:14:52 INFO mapreduce.ImportJobBase: Transferred 413 bytes in 20.6988 seconds (19.9529 bytes/sec)
> 17/02/13 12:14:52 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> -rw-r--r-- 1 root root   492 Feb 13 12:14 ./t1.avsc   < want option to configure this file name
> -rw-r--r-- 1 root root 12462 Feb 13 12:14 ./t1.java
> #
> # STEP 03 - Import + --query + --as-avrodatafile
> #
> sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from t1 where \$CONDITIONS" --split-by c1 --target-dir /user/root/t1 --delete-target-dir --num-mappers 1 --as-avrodatafile
> ls -l ./*
> Output:
> 17/02/13 12:16:58 INFO mapreduce.ImportJobBase: Transferred 448 bytes in 25.2757 seconds (17.7245 bytes/sec)
> 17/02/13 12:16:58 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> -rw-r--r-- 1 root root   527 Feb 13 12:16 ./AutoGeneratedSchema.avsc   < want option to configure this file name
> -rw-r--r-- 1 root root 12590 Feb 13 12:16 ./QueryResult.java
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (SQOOP-3134) Add option to configure Avro schema output file name with (import + --as-avrodatafile)
[ https://issues.apache.org/jira/browse/SQOOP-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809951#comment-16809951 ]

Daniel Voros commented on SQOOP-3134:
-------------------------------------
Submitted PR: https://github.com/apache/sqoop/pull/78

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (SQOOP-3134) Add option to configure Avro schema output file name with (import + --as-avrodatafile)
[ https://issues.apache.org/jira/browse/SQOOP-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808645#comment-16808645 ]

Daniel Voros commented on SQOOP-3134:
-------------------------------------
Tests have passed for this patch: https://travis-ci.org/dvoros/sqoop/builds/515049441

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (SQOOP-3134) Add option to configure Avro schema output file name with (import + --as-avrodatafile)
[ https://issues.apache.org/jira/browse/SQOOP-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807898#comment-16807898 ]

Daniel Voros commented on SQOOP-3134:
-------------------------------------
[~ericlin] I've attached the change I had in mind. Would you mind if I were to take this over?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (SQOOP-3134) Add option to configure Avro schema output file name with (import + --as-avrodatafile)
[ https://issues.apache.org/jira/browse/SQOOP-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Voros updated SQOOP-3134:
--------------------------------
    Attachment: SQOOP-3134.1.patch

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (SQOOP-3134) Add option to configure Avro schema output file name with (import + --as-avrodatafile)
[ https://issues.apache.org/jira/browse/SQOOP-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807883#comment-16807883 ]

Daniel Voros commented on SQOOP-3134:
-------------------------------------
Just ran into this. Instead of introducing a new option, this could probably also be controlled with {{--class-name}}. It would only need a small change in the code path changed by SQOOP-2783 to also check for {{className == null}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
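The fallback Daniel sketches in his comment, preferring an explicit --class-name when naming the generated .avsc file and otherwise keeping today's defaults (table name, or "AutoGeneratedSchema" for free-form queries), could look roughly like the following. This is only an illustration: the class and method names are hypothetical and do not come from Sqoop's actual internals.

```java
// Hypothetical sketch of the proposed .avsc file-name resolution.
// Not Sqoop's actual code; names here are illustrative only.
public class SchemaFileNameResolver {

    public static String resolveSchemaFileName(String className, String tableName) {
        if (className != null) {
            // Strip any package prefix: "com.example.MyRecord" -> "MyRecord"
            int lastDot = className.lastIndexOf('.');
            String simpleName = (lastDot >= 0) ? className.substring(lastDot + 1) : className;
            return simpleName + ".avsc";
        }
        if (tableName != null) {
            return tableName + ".avsc"; // current default for --table imports
        }
        return "AutoGeneratedSchema.avsc"; // current default for --query imports
    }
}
```

With this fallback, both of the reported cases (t1.avsc and AutoGeneratedSchema.avsc) would become configurable through the existing --class-name option instead of a new flag.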
[jira] [Assigned] (SQOOP-3289) Add .travis.yml
[ https://issues.apache.org/jira/browse/SQOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Voros reassigned SQOOP-3289:
-----------------------------------
    Assignee: Szabolcs Vasas  (was: Daniel Voros)

Thank you [~vasas] for your effort, this is a lot more than what I had in the old review request, so I've closed that one.

> Add .travis.yml
> ---------------
>
>                 Key: SQOOP-3289
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3289
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: build
>    Affects Versions: 1.4.7
>            Reporter: Daniel Voros
>            Assignee: Szabolcs Vasas
>            Priority: Minor
>             Fix For: 1.5.0, 3.0.0
>
>         Attachments: SQOOP-3289.patch
>
> Adding a .travis.yml would enable running builds/tests on travis-ci.org.
> Currently, if you wish to use Travis for testing your changes, you have to manually add a .travis.yml to your branch. Having it committed to trunk would save us this extra step.
> I currently have an example [{{.travis.yml}}|https://github.com/dvoros/sqoop/blob/93a4c06c1a3da1fd5305c99e379484507797b3eb/.travis.yml] on my travis branch running unit tests for every commit and every pull request: https://travis-ci.org/dvoros/sqoop/builds
> Later we could add the build status to the project readme as well, see: https://github.com/dvoros/sqoop/tree/travis
> Also, an example of a pull request: https://github.com/dvoros/sqoop/pull/1

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
Re: Review Request 69433: Setting up Travis CI using Gradle test categories
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69433/#review210826
-----------------------------------------------------------

Ship it!

So cool, thanks for picking this up! I believe this can truly make a difference for present and future developers. (:

Ship it!

- daniel voros

On Nov. 23, 2018, 10:33 a.m., Szabolcs Vasas wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69433/
> -----------------------------------------------------------
> 
> (Updated Nov. 23, 2018, 10:33 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3289
>     https://issues.apache.org/jira/browse/SQOOP-3289
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> The patch includes the following changes:
> - Changed the default DB connection parameters to the Docker image defaults, so the test tasks can be started without specifying connection parameters
> - Removed duplicated connection parameter settings
> - Most of the JDBC drivers are downloaded from Maven repositories; the only exception is Oracle. Contributors have to upload ojdbc6.jar to a public drive and make it available to the CI job by setting ORACLE_DRIVER_URL in Travis
> - Introduced separate test tasks for each database
> - An Oracle Express Edition Docker image is added to sqoop-thirdpartytest-db-services.yml, so Oracle tests that do not require Oracle EE features can be executed much more easily
> - The ports for the MySQL and PostgreSQL Docker containers were changed because the default ones were already in use in the Travis VM
> - Introduced an OracleEe test category for tests requiring an Oracle EE database. These tests won't be executed on Travis. The good news is that only a few tests require Oracle EE
> 
> Documentation is still coming; feel free to provide feedback!
> 
> 
> Diffs
> -----
> 
>   .travis.yml PRE-CREATION 
>   COMPILING.txt b399ba825 
>   build.gradle efe980d67 
>   build.xml a0e25191e 
>   gradle.properties 722bc8bb2 
>   src/scripts/thirdpartytest/docker-compose/oraclescripts/ee-healthcheck.sh PRE-CREATION 
>   src/scripts/thirdpartytest/docker-compose/oraclescripts/healthcheck.sh fb7800efe 
>   src/scripts/thirdpartytest/docker-compose/sqoop-thirdpartytest-db-services.yml b4cf48863 
>   src/test/org/apache/sqoop/manager/cubrid/CubridTestUtils.java 4fd522bae 
>   src/test/org/apache/sqoop/manager/db2/DB2ImportAllTableWithSchemaManualTest.java ed949b98f 
>   src/test/org/apache/sqoop/manager/db2/DB2ManagerImportManualTest.java 32dfc5eb2 
>   src/test/org/apache/sqoop/manager/db2/DB2TestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/manager/db2/DB2XmlTypeImportManualTest.java 494c75b08 
>   src/test/org/apache/sqoop/manager/mysql/MySQLTestUtils.java be205c877 
>   src/test/org/apache/sqoop/manager/oracle/ExportTest.java a60168719 
>   src/test/org/apache/sqoop/manager/oracle/ImportTest.java 5db9fe34e 
>   src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java 1598813d8 
>   src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java 1f67c4697 
>   src/test/org/apache/sqoop/manager/oracle/OracleConnectionFactoryTest.java 34e182f4c 
>   src/test/org/apache/sqoop/manager/oracle/TimestampDataTest.java be086c5c2 
>   src/test/org/apache/sqoop/manager/oracle/util/OracleUtils.java 14b57f91a 
>   src/test/org/apache/sqoop/manager/postgresql/DirectPostgreSQLExportManualTest.java 7dd6efcf9 
>   src/test/org/apache/sqoop/manager/postgresql/PGBulkloadManagerManualTest.java 1fe264456 
>   src/test/org/apache/sqoop/manager/postgresql/PostgresqlExportTest.java eb798fa99 
>   src/test/org/apache/sqoop/manager/postgresql/PostgresqlExternalTableImportTest.java 8c3d2fd90 
>   src/test/org/apache/sqoop/manager/postgresql/PostgresqlTestUtil.java e9705e5da 
>   src/test/org/apache/sqoop/manager/sqlserver/MSSQLTestUtils.java bd12c5566 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerExportTest.java ab1e8ff2d 
>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java 3c5bb327e 
>   src/test/org/apache/sqoop/metastore/db2/DB2JobToolTest.java 81ef5fce6 
>   src/test/org/apache/sqoop/metastore/db2/DB2MetaConnectIncrementalImportTest.java 5403908e2 
>   src/test/org/apache/sqoop/metastore/db2/DB2SavedJobsTest.java b41eda110 
>   src/test/org/apache/sqoop/metastore/postgres/PostgresJobToolTest.java 59ea151a5 
>   src/test/org/apache/sqoop/metastore/postgres/PostgresMetaConnectIncrementalImp
[jira] [Commented] (SQOOP-3378) Error during direct Netezza import/export can interrupt process in uncontrolled ways
[ https://issues.apache.org/jira/browse/SQOOP-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651992#comment-16651992 ]

Daniel Voros commented on SQOOP-3378:
-------------------------------------
Thanks for letting me know [~vasas]! I can confirm this is failing for me on trunk as well when running on Linux. It passes on Mac, however. I've opened SQOOP-3393 to look into this.

> Error during direct Netezza import/export can interrupt process in uncontrolled ways
> ------------------------------------------------------------------------------------
>
>                 Key: SQOOP-3378
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3378
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.7
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Major
>             Fix For: 1.5.0, 3.0.0
>
>         Attachments: SQOOP-3378.2.patch
>
> A SQLException during a JDBC operation in direct Netezza import/export signals the parent thread to fail fast by interrupting it (see [here|https://github.com/apache/sqoop/blob/c814e58348308b05b215db427412cd6c0b21333e/src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java#L92]).
> We're [trying to process the interrupt in the parent|https://github.com/apache/sqoop/blob/c814e58348308b05b215db427412cd6c0b21333e/src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java#L232] (main) thread, but there's no guarantee that we're not in some blocking internal call that will process the interrupted flag and reset it before we're able to check.
> It is also possible that the parent thread has already passed the "checking part" when it gets interrupted. In the case of {{NetezzaExternalTableExportMapper}} this can interrupt the upload of log files.
> I'd recommend using some other means of communication between the threads than interrupts.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
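The alternative signalling the ticket recommends (a shared flag instead of Thread.interrupt()) can be sketched as below. This is a standalone illustration, not Sqoop's actual NetezzaJDBCStatementRunner or mapper code: unlike an interrupt, an AtomicBoolean cannot be consumed and cleared by an unrelated blocking call in the parent thread, and it cannot accidentally cancel later work such as the log-file upload.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch (not Sqoop's code): the worker reports failure through a
// shared flag rather than by interrupting the parent. The parent can check the
// flag whenever it likes; the signal can't be lost or misdelivered.
public class FlagSignalDemo {
    private static final AtomicBoolean workerFailed = new AtomicBoolean(false);

    public static boolean runJob() {
        Thread worker = new Thread(() -> {
            try {
                throw new RuntimeException("simulated SQLException from JDBC");
            } catch (RuntimeException e) {
                workerFailed.set(true); // signal failure without interrupting anyone
            }
        });
        worker.start();
        try {
            worker.join(); // a real mapper could instead poll workerFailed.get() periodically
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !workerFailed.get(); // true = success
    }
}
```

The parent stays free to finish its own cleanup (e.g. uploading logs) and only then act on the failure flag.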
[jira] [Created] (SQOOP-3393) TestNetezzaExternalTableExportMapper hangs
Daniel Voros created SQOOP-3393:
-----------------------------------

             Summary: TestNetezzaExternalTableExportMapper hangs
                 Key: SQOOP-3393
                 URL: https://issues.apache.org/jira/browse/SQOOP-3393
             Project: Sqoop
          Issue Type: Bug
          Components: test
    Affects Versions: 1.5.0, 3.0.0
            Reporter: Daniel Voros
            Assignee: Daniel Voros
             Fix For: 1.5.0, 3.0.0

Introduced in SQOOP-3378, spotted by [~vasas].

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
Re: Review Request 68687: SQOOP-3381 Upgrade the Parquet library
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68687/#review209630
-----------------------------------------------------------

Ship it!

Thanks Fero, for taking care of this.

Ship it!

- daniel voros

On Oct. 16, 2018, 9:37 a.m., Fero Szabo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68687/
> -----------------------------------------------------------
> 
> (Updated Oct. 16, 2018, 9:37 a.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3381
>     https://issues.apache.org/jira/browse/SQOOP-3381
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> This change upgrades our Parquet library to the newest version, and a whole lot of libraries to newer versions with it.
> 
> As we will need to register a data supplier in the fix for Parquet decimal support (SQOOP-3382), we will need a version that contains PARQUET-243. We need to upgrade the Parquet library to a version that contains this fix and is compatible with Hadoop 3.0.
> 
> A few things to note:
> - Hadoop's version is still 2.8.0
> - Hive is upgraded to 2.1.1
> - the rest of the dependency changes are required for the Hive version bump
> 
> There are a few changes in the codebase, but of course no new functionality at all:
> - in the TestParquetImport class, the new implementation returns a Utf8 object for Strings written out
> - added the security policy and related code changes from the patch for SQOOP-3305 (upgrade Hadoop) written by Daniel Voros
> - modified the HiveMiniCluster config so it won't try to start a web UI (it's unnecessary during tests anyway)
> 
> 
> Diffs
> -----
> 
>   build.gradle fc7fc0c4 
>   gradle.properties 0d30378d 
>   gradle/sqoop-package.gradle 1a8d994d 
>   ivy.xml 670cb32d 
>   ivy/libraries.properties 8f3dab2b 
>   src/java/org/apache/sqoop/avro/AvroUtil.java 1663b1d1 
>   src/java/org/apache/sqoop/hive/HiveImport.java 48800366 
>   src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2a 
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java 2180cc20 
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java 90b910a3 
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java 66ebc5b8 
>   src/test/org/apache/sqoop/TestParquetExport.java be1d8164 
>   src/test/org/apache/sqoop/TestParquetImport.java 2810e318 
>   src/test/org/apache/sqoop/TestParquetIncrementalImportMerge.java adad0cc1 
>   src/test/org/apache/sqoop/hive/TestHiveServer2ParquetImport.java b55179a4 
>   src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java 9dd54486 
>   src/test/org/apache/sqoop/util/ParquetReader.java f1c2fe10 
>   testdata/hcatalog/conf/hive-site.xml 8a84a5d3 
> 
> 
> Diff: https://reviews.apache.org/r/68687/diff/5/
> 
> 
> Testing
> -------
> 
> Ant unit and 3rd party tests were successful.
> gradlew test and thirdpartytest were successful as well.
> 
> 
> Thanks,
> 
> Fero Szabo
> 
>
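The TestParquetImport note above (newer Avro/Parquet readers returning Utf8 instead of String) is worth illustrating, since it is a classic test-breaking change: Avro's org.apache.avro.util.Utf8 is a CharSequence whose equals() never matches a java.lang.String, so assertions must compare via toString(). The MiniUtf8 class below is a minimal stand-in so the sketch runs without the Avro jar; it mimics only the equality behavior relevant here.

```java
// Why TestParquetImport assertions had to change: readers now hand back a
// Utf8-style CharSequence, and CharSequence.equals(String) is false even for
// identical text. MiniUtf8 is a stand-in for org.apache.avro.util.Utf8.
public class Utf8ComparisonDemo {
    static final class MiniUtf8 implements CharSequence {
        private final String value;
        MiniUtf8(String value) { this.value = value; }
        @Override public int length() { return value.length(); }
        @Override public char charAt(int i) { return value.charAt(i); }
        @Override public CharSequence subSequence(int a, int b) { return value.subSequence(a, b); }
        @Override public String toString() { return value; }
        // Like Avro's Utf8, equals() only matches other instances of the same type.
        @Override public boolean equals(Object o) {
            return o instanceof MiniUtf8 && value.equals(((MiniUtf8) o).value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    // The safe comparison pattern for tests: normalize through toString().
    public static boolean equalsString(CharSequence cs, String expected) {
        return cs.toString().equals(expected);
    }
}
```

This is why test code that previously did assertEquals("some data", record.get("c3")) has to compare record.get("c3").toString() instead.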
[jira] [Commented] (SQOOP-3378) Error during direct Netezza import/export can interrupt process in uncontrolled ways
[ https://issues.apache.org/jira/browse/SQOOP-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646190#comment-16646190 ]

Daniel Voros commented on SQOOP-3378:
-------------------------------------
Uploaded, thank you [~vasas].

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (SQOOP-3378) Error during direct Netezza import/export can interrupt process in uncontrolled ways
[ https://issues.apache.org/jira/browse/SQOOP-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Voros updated SQOOP-3378:
--------------------------------
    Attachment: SQOOP-3378.2.patch

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (SQOOP-3381) Upgrade the Parquet library from 1.6.0 to 1.9.0
[ https://issues.apache.org/jira/browse/SQOOP-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639659#comment-16639659 ] Daniel Voros commented on SQOOP-3381: - With SQOOP-3305 I've decided to hold off until there's an HBase release that supports Hadoop 3.x. I don't think Hive 3.1.0 would help in this regard, since parquet classes are still shaded in hive-exec:3.1.0. > Upgrade the Parquet library from 1.6.0 to 1.9.0 > --- > > Key: SQOOP-3381 > URL: https://issues.apache.org/jira/browse/SQOOP-3381 > Project: Sqoop > Issue Type: Sub-task >Affects Versions: 1.4.7 >Reporter: Fero Szabo >Assignee: Fero Szabo >Priority: Major > Fix For: 3.0.0 > > > As we will need to register a data supplier in the fix for parquet decimal > support, we will need a version that contains PARQUET-243. > We need to upgrade the Parquet library to a version that contains this fix > and is compatible with Hadoop. Most probably, the newest version will be > adequate. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3381) Upgrade the Parquet library from 1.6.0 to 1.9.0
[ https://issues.apache.org/jira/browse/SQOOP-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612402#comment-16612402 ] Daniel Voros commented on SQOOP-3381: - Hey [~fero], thanks for keeping that in mind. What I've seen during the hadoop3 upgrade, is that Avro is added to the MR classpath from under hadoop. So where this could lead to issues is conflicting versions of Avro in hadoop and Parquet shipped with Sqoop. Could you try your patch (having new parquet jar in lib/) on a cluster with current Hadoop versions? I don't think we should bother with testing with Hadoop 3, we'll face that in the Hadoop 3 patch. (One more thing to keep in mind, is that parquet-hadoop-bundle is also shaded into the hive-exec artifact. However, I think the classes involved in PARQUET-243 are not bundled there.) > Upgrade the Parquet library from 1.6.0 to 1.9.0 > --- > > Key: SQOOP-3381 > URL: https://issues.apache.org/jira/browse/SQOOP-3381 > Project: Sqoop > Issue Type: Sub-task >Affects Versions: 1.4.7 >Reporter: Fero Szabo >Assignee: Fero Szabo >Priority: Major > Fix For: 3.0.0 > > > As we will need to register a data supplier in the fix for parquet decimal > support, we will need a version that contains PARQUET-243. > We need to upgrade the Parquet library to a version that contains this fix > and is compatible with Hadoop. Most probably, the newest version will be > adequate. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3374) Assigning HDFS path to --bindir is giving error "java.lang.reflect.InvocationTargetException"
[ https://issues.apache.org/jira/browse/SQOOP-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602719#comment-16602719 ] Daniel Voros commented on SQOOP-3374: - [~amjosh911] setting an HDFS location for {{--bindir}} is not supported at the moment. What is your use-case that would require to do so? A workaround might be putting the generated files on HDFS manually after the Sqoop job finishes. > Assigning HDFS path to --bindir is giving error > "java.lang.reflect.InvocationTargetException" > - > > Key: SQOOP-3374 > URL: https://issues.apache.org/jira/browse/SQOOP-3374 > Project: Sqoop > Issue Type: Wish > Components: sqoop2-api >Reporter: Amit Joshi >Priority: Blocker > > When I am trying to assign the HDFS directory path to --bindir in my sqoop > command, it is throwing error "java.lang.reflect.InvocationTargetException". > My sqoop query looks like this: > sqoop import -connect connection_string --username username --password-file > file_path --query 'select * from EDW_PROD.RXCLM_LINE_FACT_DENIED > PARTITION(RXCLM_LINE_FACTP201808) where $CONDITIONS' --as-parquetfile > --compression-codec org.apache.hadoop.io.compress.SnappyCodec --append > --target-dir target_dir *-bindir hdfs://user/projects/* --split-by RX_ID > --null-string '/N' --null-non-string '/N' --fields-terminated-by ',' -m 10 > > It is creating folder "hdfs:" in my home directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3058) Sqoop import with Netezza --direct fails properly but also produces NPE
[ https://issues.apache.org/jira/browse/SQOOP-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602110#comment-16602110 ] Daniel Voros commented on SQOOP-3058: - [~kuldeepkulkarn...@gmail.com], I don't think there's a workaround, but please note that this issue is only about reporting an extra NPE in case of an error. I've submitted a patch to throw a more meaningful exception. > Sqoop import with Netezza --direct fails properly but also produces NPE > --- > > Key: SQOOP-3058 > URL: https://issues.apache.org/jira/browse/SQOOP-3058 > Project: Sqoop > Issue Type: Bug >Reporter: Markus Kemper >Assignee: Daniel Voros >Priority: Major > > The [error] is expected however the [npe] seems like a defect, see [test > case] below > [error] > ERROR: relation does not exist SQOOP_SME_DB.SQOOP_SME1.SQOOP_SME1.T1 > [npe] > 16/11/18 09:19:44 ERROR sqoop.Sqoop: Got exception running Sqoop: > java.lang.NullPointerException > [test case] > {noformat} > # > # STEP 01 - Setup Netezza Table and Data > # > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DROP TABLE SQOOP_SME1.T1" > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "CREATE TABLE SQOOP_SME1.T1 (C1 INTEGER)" > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "INSERT INTO SQOOP_SME1.T1 VALUES (1)" > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "SELECT C1 FROM SQOOP_SME1.T1" > # > # STEP 02 - Test Import and Export (baseline) > # > sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table > "T1" --target-dir /user/root/t1 --delete-target-dir --num-mappers 1 > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DELETE FROM SQOOP_SME1.T1" > sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table > "T1" --export-dir /user/root/t1 --num-mappers 1 > sqoop eval --connect $MYCONN --username $MYUSER --password 
$MYPSWD --query > "SELECT C1 FROM SQOOP_SME1.T1" > --- > | C1 | > --- > | 1 | > --- > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DELETE FROM SQOOP_SME1.T1" > sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table > "T1" --export-dir /user/root/t1 --num-mappers 1 --direct > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "SELECT C1 FROM SQOOP_SME1.T1" > --- > | C1 | > --- > | 1 | > --- > > # > # STEP 03 - Test Import and Export (with SCHEMA in --table option AND > --direct) > # > /* Notes: This failure seems correct however the NPE after the failure seems > like a defect */ > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DELETE FROM SQOOP_SME1.T1" > sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table > "SQOOP_SME1.T1" --export-dir /user/root/t1 --num-mappers 1 --direct > 16/11/18 09:19:44 ERROR manager.SqlManager: Error executing statement: > org.netezza.error.NzSQLException: ERROR: relation does not exist > SQOOP_SME_DB.SQOOP_SME1.SQOOP_SME1.T1 > org.netezza.error.NzSQLException: ERROR: relation does not exist > SQOOP_SME_DB.SQOOP_SME1.SQOOP_SME1.T1 > at > org.netezza.internal.QueryExecutor.getNextResult(QueryExecutor.java:280) > at org.netezza.internal.QueryExecutor.execute(QueryExecutor.java:76) > at org.netezza.sql.NzConnection.execute(NzConnection.java:2869) > at > org.netezza.sql.NzPreparedStatament._execute(NzPreparedStatament.java:1126) > at > org.netezza.sql.NzPreparedStatament.prepare(NzPreparedStatament.java:1143) > at > org.netezza.sql.NzPreparedStatament.(NzPreparedStatament.java:89) > at org.netezza.sql.NzConnection.prepareStatement(NzConnection.java:1589) > at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763) > at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786) > at > org.apache.sqoop.manager.SqlManager.getColumnNamesForRawQuery(SqlManager.java:151) > at > 
org.apache.sqoop.manager.SqlManager.getColumnNames(SqlManager.java:11
Review Request 68607: Sqoop import with Netezza --direct fails properly but also produces NPE
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68607/ --- Review request for Sqoop. Bugs: SQOOP-3058 https://issues.apache.org/jira/browse/SQOOP-3058 Repository: sqoop-trunk Description --- We're not interrupting the import if we were unable to get column names, which leads to an NPE later. We should check for null instead and throw a more meaningful exception. Diffs - src/java/org/apache/sqoop/mapreduce/netezza/NetezzaExternalTableExportJob.java 11ac95df src/test/org/apache/sqoop/mapreduce/netezza/TestNetezzaExternalTableExportJob.java PRE-CREATION Diff: https://reviews.apache.org/r/68607/diff/1/ Testing --- added UT Thanks, daniel voros
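The fix described in the review — failing fast with a meaningful exception instead of letting a null column list cause an NPE later — can be sketched as follows. This is a minimal illustration; the class, method names, and exception type are stand-ins, not the actual Sqoop code in the patch.

```java
public class ColumnNameCheck {
    // Stand-in for the column lookup that returns null when the relation
    // does not exist (the real code path goes through SqlManager).
    static String[] getColumnNames(boolean simulateFailure) {
        return simulateFailure ? null : new String[] {"C1"};
    }

    // Instead of handing a possibly-null array onwards (and hitting an NPE
    // much later during job setup), fail fast with a descriptive exception.
    static String[] requireColumnNames(boolean simulateFailure) {
        String[] columns = getColumnNames(simulateFailure);
        if (columns == null) {
            throw new IllegalStateException(
                "Could not retrieve column names; check that the table exists "
                + "and that the --table value is valid for direct mode");
        }
        return columns;
    }
}
```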
[jira] [Assigned] (SQOOP-3058) Sqoop import with Netezza --direct fails properly but also produces NPE
[ https://issues.apache.org/jira/browse/SQOOP-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros reassigned SQOOP-3058: --- Assignee: Daniel Voros > Sqoop import with Netezza --direct fails properly but also produces NPE > --- > > Key: SQOOP-3058 > URL: https://issues.apache.org/jira/browse/SQOOP-3058 > Project: Sqoop > Issue Type: Bug >Reporter: Markus Kemper > Assignee: Daniel Voros >Priority: Major > > The [error] is expected however the [npe] seems like a defect, see [test > case] below > [error] > ERROR: relation does not exist SQOOP_SME_DB.SQOOP_SME1.SQOOP_SME1.T1 > [npe] > 16/11/18 09:19:44 ERROR sqoop.Sqoop: Got exception running Sqoop: > java.lang.NullPointerException > [test case] > {noformat} > # > # STEP 01 - Setup Netezza Table and Data > # > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DROP TABLE SQOOP_SME1.T1" > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "CREATE TABLE SQOOP_SME1.T1 (C1 INTEGER)" > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "INSERT INTO SQOOP_SME1.T1 VALUES (1)" > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "SELECT C1 FROM SQOOP_SME1.T1" > # > # STEP 02 - Test Import and Export (baseline) > # > sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table > "T1" --target-dir /user/root/t1 --delete-target-dir --num-mappers 1 > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DELETE FROM SQOOP_SME1.T1" > sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table > "T1" --export-dir /user/root/t1 --num-mappers 1 > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "SELECT C1 FROM SQOOP_SME1.T1" > --- > | C1 | > --- > | 1 | > --- > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DELETE FROM SQOOP_SME1.T1" > sqoop export --connect $MYCONN --username $MYUSER 
--password $MYPSWD --table > "T1" --export-dir /user/root/t1 --num-mappers 1 --direct > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "SELECT C1 FROM SQOOP_SME1.T1" > --- > | C1 | > --- > | 1 | > --- > > # > # STEP 03 - Test Import and Export (with SCHEMA in --table option AND > --direct) > # > /* Notes: This failure seems correct however the NPE after the failure seems > like a defect */ > sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query > "DELETE FROM SQOOP_SME1.T1" > sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table > "SQOOP_SME1.T1" --export-dir /user/root/t1 --num-mappers 1 --direct > 16/11/18 09:19:44 ERROR manager.SqlManager: Error executing statement: > org.netezza.error.NzSQLException: ERROR: relation does not exist > SQOOP_SME_DB.SQOOP_SME1.SQOOP_SME1.T1 > org.netezza.error.NzSQLException: ERROR: relation does not exist > SQOOP_SME_DB.SQOOP_SME1.SQOOP_SME1.T1 > at > org.netezza.internal.QueryExecutor.getNextResult(QueryExecutor.java:280) > at org.netezza.internal.QueryExecutor.execute(QueryExecutor.java:76) > at org.netezza.sql.NzConnection.execute(NzConnection.java:2869) > at > org.netezza.sql.NzPreparedStatament._execute(NzPreparedStatament.java:1126) > at > org.netezza.sql.NzPreparedStatament.prepare(NzPreparedStatament.java:1143) > at > org.netezza.sql.NzPreparedStatament.(NzPreparedStatament.java:89) > at org.netezza.sql.NzConnection.prepareStatement(NzConnection.java:1589) > at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763) > at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786) > at > org.apache.sqoop.manager.SqlManager.getColumnNamesForRawQuery(SqlManager.java:151) > at > org.apache.sqoop.manager.SqlManager.getColumnNames(SqlManager.java:116) > at > org.apache.sqoop.mapreduce.netezza.NetezzaExternalTableExportJob.configureOutputFormat(NetezzaExternalTableExportJob.java:128) > at > 
org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:433) > at > org.a
[jira] [Commented] (SQOOP-3378) Error during direct Netezza import/export can interrupt process in uncontrolled ways
[ https://issues.apache.org/jira/browse/SQOOP-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602058#comment-16602058 ] Daniel Voros commented on SQOOP-3378: - Attached review request. > Error during direct Netezza import/export can interrupt process in > uncontrolled ways > > > Key: SQOOP-3378 > URL: https://issues.apache.org/jira/browse/SQOOP-3378 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.7 >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Major > Fix For: 1.5.0, 3.0.0 > > > SQLException during JDBC operation in direct Netezza import/export signals > parent thread to fail fast by interrupting it (see > [here|https://github.com/apache/sqoop/blob/c814e58348308b05b215db427412cd6c0b21333e/src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java#L92]). > We're [trying to process the interrupt in the > parent|https://github.com/apache/sqoop/blob/c814e58348308b05b215db427412cd6c0b21333e/src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java#L232] > (main) thread, but there's no guarantee that we're not in some blocking > internal call that will process the interrupted flag and reset it before > we're able to check. > It is also possible that the parent thread has passed the "checking part" > when it gets interrupted. In case of {{NetezzaExternalTableExportMapper}} > this can interrupt the upload of log files. > I'd recommend using some other means of communication between the threads > than interrupts. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 68606: Error during direct Netezza import/export can interrupt process in uncontrolled ways
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68606/ --- Review request for Sqoop. Bugs: SQOOP-3378 https://issues.apache.org/jira/browse/SQOOP-3378 Repository: sqoop-trunk Description --- `SQLException` during JDBC operation in direct Netezza import/export signals parent thread to fail fast by interrupting it. We're trying to process the interrupt in the parent (main) thread, but there's no guarantee that we're not in some internal call that will process the interrupted flag and reset it before we're able to check. It is also possible that the parent thread has passed the "checking part" when it gets interrupted. In case of `NetezzaExternalTableExportMapper` this can interrupt the upload of log files. I'd recommend using some other means of communication between the threads than interrupts. Diffs - src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java 5bf21880 src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableImportMapper.java 306062aa src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java cedfd235 src/test/org/apache/sqoop/mapreduce/db/netezza/TestNetezzaExternalTableExportMapper.java PRE-CREATION src/test/org/apache/sqoop/mapreduce/db/netezza/TestNetezzaExternalTableImportMapper.java PRE-CREATION Diff: https://reviews.apache.org/r/68606/diff/1/ Testing --- added new UTs and checked manual Netezza tests (NetezzaExportManualTest, NetezzaImportManualTest) Thanks, daniel voros
[jira] [Created] (SQOOP-3378) Error during direct Netezza import/export can interrupt process in uncontrolled ways
Daniel Voros created SQOOP-3378: --- Summary: Error during direct Netezza import/export can interrupt process in uncontrolled ways Key: SQOOP-3378 URL: https://issues.apache.org/jira/browse/SQOOP-3378 Project: Sqoop Issue Type: Bug Affects Versions: 1.4.7 Reporter: Daniel Voros Assignee: Daniel Voros Fix For: 1.5.0, 3.0.0 SQLException during JDBC operation in direct Netezza import/export signals parent thread to fail fast by interrupting it (see [here|https://github.com/apache/sqoop/blob/c814e58348308b05b215db427412cd6c0b21333e/src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaJDBCStatementRunner.java#L92]). We're [trying to process the interrupt in the parent|https://github.com/apache/sqoop/blob/c814e58348308b05b215db427412cd6c0b21333e/src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableExportMapper.java#L232] (main) thread, but there's no guarantee that we're not in some blocking internal call that will process the interrupted flag and reset it before we're able to check. It is also possible that the parent thread has passed the "checking part" when it gets interrupted. In case of {{NetezzaExternalTableExportMapper}} this can interrupt the upload of log files. I'd recommend using some other means of communication between the threads than interrupts. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
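An interrupt-free alternative along the lines suggested in the ticket could hand the SQLException to the parent through shared state, e.g. an AtomicReference that the parent checks at well-defined points. The sketch below is an assumption about the general approach, not the attached patch:

```java
import java.util.concurrent.atomic.AtomicReference;

public class FlagSignaling {
    // The worker publishes its failure through shared state instead of
    // interrupting the parent thread.
    static final AtomicReference<Exception> failure = new AtomicReference<>();

    static String runWorkerAndCheck() {
        Thread worker = new Thread(() -> {
            // Simulated SQLException during the JDBC operation.
            failure.compareAndSet(null, new Exception("simulated SQLException"));
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        // The parent polls for failure only at points it chooses; no interrupt
        // can arrive at an arbitrary moment (e.g. during log upload) or be
        // swallowed and reset by a blocking internal call.
        Exception e = failure.get();
        return e == null ? null : e.getMessage();
    }
}
```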
Re: Review Request 68569: HiveMiniCluster does not restore hive-site.xml location
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68569/#review208249 --- Ship it! Ship It! - daniel voros On Aug. 30, 2018, 11:27 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68569/ > --- > > (Updated Aug. 30, 2018, 11:27 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3375 > https://issues.apache.org/jira/browse/SQOOP-3375 > > > Repository: sqoop-trunk > > > Description > --- > > HiveMiniCluster sets the hive-site.xml location using > org.apache.hadoop.hive.conf.HiveConf#setHiveSiteLocation static method during > startup but it does not restore the original location during shutdown. > > This makes HCatalogImportTest and HCatalogExportTest fail if they are ran in > the same JVM after any test using HiveMiniCluster. > > > Diffs > - > > src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java 19bb7605c > > > Diff: https://reviews.apache.org/r/68569/diff/1/ > > > Testing > --- > > Executed unit and third party tests. > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 68541: SQOOP-3104: Create test categories instead of test suites and naming conventions
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68541/#review208088 --- Ship it! Great stuff! Checked `test`, `unitTest` and `integrationPlainTest`. My only concern is forgetting to apply `@Category` on future test classes. We wouldn't execute without that, right? Any ideas how to prevent this from happening? - daniel voros On Aug. 28, 2018, 3:52 p.m., Nguyen Truong wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68541/ > --- > > (Updated Aug. 28, 2018, 3:52 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3104 > https://issues.apache.org/jira/browse/SQOOP-3104 > > > Repository: sqoop-trunk > > > Description > --- > > We are currently using test naming conventions to differentiate between > ManualTests, Unit tests and 3rd party tests. Instead of that, I implemented > junit categories which will allow us to have more categories in the future. > This would also remove the reliance on the test class name. > > Test categories skeleton: > SqoopTest _ UnitTest > |__ IntegrationTest > |__ ManualTest > > ThirdPartyTest _ CubridTest >|__ Db2Test >|__ MainFrameTest >|__ MysqlTest >|__ NetezzaTest >|__ OracleTest >|__ PostgresqlTest >|__ SqlServerTest > > KerberizedTest > > Categories explanation: > * SqoopTest: Group of the big categories, including: > - UnitTest: It tests one class only with its dependencies mocked or > if the dependency > is lightweight we can keep it. It must not start a minicluster or an > hsqldb database. > It does not need JDBC drivers. > - IntegrationTest: It usually tests a whole scenario. It may start up > miniclusters, > hsqldb and connect to external resources like RDBMSs. > - ManualTest: This should be a deprecated category which should not > be used in the future. > It only exists to mark the currently existing manual tests. 
> * ThirdPartyTest: An orthogonal hierarchy for tests that need a JDBC > driver and/or a docker > container/external RDBMS instance to run. Subcategories express what kind > of external > resource the test needs. E.g: OracleTest needs an Oracle RDBMS and Oracle > driver on the classpath > * KerberizedTest: Test that needs Kerberos, which needs to be run on a > separate JVM. > > Opinions are very welcomed. Thanks! > > > Diffs > - > > build.gradle fc7fc0c4c > src/test/org/apache/sqoop/TestConnFactory.java fb6c94059 > src/test/org/apache/sqoop/TestIncrementalImport.java 29c477954 > src/test/org/apache/sqoop/TestSqoopOptions.java e55682edf > src/test/org/apache/sqoop/accumulo/TestAccumuloUtil.java 631eeff5e > src/test/org/apache/sqoop/authentication/TestKerberosAuthenticator.java > f5700ce65 > src/test/org/apache/sqoop/db/TestDriverManagerJdbcConnectionFactory.java > 244831672 > > src/test/org/apache/sqoop/db/decorator/TestKerberizedConnectionFactoryDecorator.java > d3e3fb23e > src/test/org/apache/sqoop/hbase/HBaseKerberizedConnectivityTest.java > 3bfb39178 > src/test/org/apache/sqoop/hbase/TestHBasePutProcessor.java e78a535f4 > src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java ba05cabbb > > src/test/org/apache/sqoop/hive/HiveServer2ConnectionFactoryInitializerTest.java > 4d2cb2f88 > src/test/org/apache/sqoop/hive/TestHiveClientFactory.java a3c2dc939 > src/test/org/apache/sqoop/hive/TestHiveMiniCluster.java 419f888c0 > src/test/org/apache/sqoop/hive/TestHiveServer2Client.java 02617295e > src/test/org/apache/sqoop/hive/TestHiveServer2ParquetImport.java b55179a4f > src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java 410724f37 > src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java > 276e9eaa4 > src/test/org/apache/sqoop/hive/TestTableDefWriter.java 626ad22f6 > src/test/org/apache/sqoop/hive/TestTableDefWriterForExternalTable.java > f1768ee76 > src/test/org/apache/sqoop/io/TestCodecMap.java e71921823 > 
src/test/org/apache/sqoop/io/TestLobFile.java 2bc95f283 > src/test/org/apache/sqoop/io/TestNamedFifo.java a93784e08 > src/test/org/apache/sqoop/io/TestSplittableBuffered
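The category skeleton in the review above is, under the hood, just marker interfaces referenced from an annotation. The stdlib-only sketch below illustrates how such category filtering works conceptually; it mimics JUnit's `@Category`/marker-interface mechanism but is not JUnit code, and the test class names are illustrative:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

public class CategoryDemo {
    // Category markers are plain interfaces, mirroring the skeleton above;
    // subtyping gives the hierarchy (UnitTest is also a SqoopTest).
    interface SqoopTest {}
    interface UnitTest extends SqoopTest {}
    interface IntegrationTest extends SqoopTest {}

    // Simplified stand-in for JUnit's @Category annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Category { Class<?>[] value(); }

    @Category(UnitTest.class)
    static class TestConnFactory {}

    @Category(IntegrationTest.class)
    static class TestHCatalogBasic {}

    // Build-time filtering: run a class iff one of its declared categories
    // is a subtype of the requested category.
    static boolean inCategory(Class<?> testClass, Class<?> wanted) {
        Category c = testClass.getAnnotation(Category.class);
        if (c == null) {
            return false; // the "forgotten @Category" case raised in the review
        }
        return Arrays.stream(c.value()).anyMatch(wanted::isAssignableFrom);
    }
}
```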
[jira] [Commented] (SQOOP-3042) Sqoop does not clear compile directory under /tmp/sqoop-/compile automatically
[ https://issues.apache.org/jira/browse/SQOOP-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596026#comment-16596026 ] Daniel Voros commented on SQOOP-3042: - [~amjosh911] could you please open a new ticket for that with details? > Sqoop does not clear compile directory under /tmp/sqoop-/compile > automatically > > > Key: SQOOP-3042 > URL: https://issues.apache.org/jira/browse/SQOOP-3042 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.6 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Critical > Labels: patch > Fix For: 3.0.0 > > Attachments: SQOOP-3042.1.patch, SQOOP-3042.2.patch, > SQOOP-3042.4.patch, SQOOP-3042.5.patch, SQOOP-3042.6.patch, > SQOOP-3042.7.patch, SQOOP-3042.9.patch > > > After running sqoop, all the temp files generated by ClassWriter are left > behind on disk, so anyone can check those JAVA files to see the schema of > those tables that Sqoop has been interacting with. By default, the directory > is under /tmp/sqoop-/compile. > In class org.apache.sqoop.SqoopOptions, function getNonceJarDir(), I can see > that we did add "deleteOnExit" on the temp dir: > {code} > for (int attempts = 0; attempts < MAX_DIR_CREATE_ATTEMPTS; attempts++) { > hashDir = new File(baseDir, RandomHash.generateMD5String()); > while (hashDir.exists()) { > hashDir = new File(baseDir, RandomHash.generateMD5String()); > } > if (hashDir.mkdirs()) { > // We created the directory. Use it. > // If this directory is not actually filled with files, delete it > // when the JVM quits. > hashDir.deleteOnExit(); > break; > } > } > {code} > However, I believe it failed to delete due to directory is not empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3042) Sqoop does not clear compile directory under /tmp/sqoop-/compile automatically
[ https://issues.apache.org/jira/browse/SQOOP-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594827#comment-16594827 ] Daniel Voros commented on SQOOP-3042: - [~amjosh911] use the `--bindir` option, see [here|https://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html]. > Sqoop does not clear compile directory under /tmp/sqoop-/compile > automatically > > > Key: SQOOP-3042 > URL: https://issues.apache.org/jira/browse/SQOOP-3042 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.6 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Critical > Labels: patch > Fix For: 3.0.0 > > Attachments: SQOOP-3042.1.patch, SQOOP-3042.2.patch, > SQOOP-3042.4.patch, SQOOP-3042.5.patch, SQOOP-3042.6.patch, > SQOOP-3042.7.patch, SQOOP-3042.9.patch > > > After running sqoop, all the temp files generated by ClassWriter are left > behind on disk, so anyone can check those JAVA files to see the schema of > those tables that Sqoop has been interacting with. By default, the directory > is under /tmp/sqoop-/compile. > In class org.apache.sqoop.SqoopOptions, function getNonceJarDir(), I can see > that we did add "deleteOnExit" on the temp dir: > {code} > for (int attempts = 0; attempts < MAX_DIR_CREATE_ATTEMPTS; attempts++) { > hashDir = new File(baseDir, RandomHash.generateMD5String()); > while (hashDir.exists()) { > hashDir = new File(baseDir, RandomHash.generateMD5String()); > } > if (hashDir.mkdirs()) { > // We created the directory. Use it. > // If this directory is not actually filled with files, delete it > // when the JVM quits. > hashDir.deleteOnExit(); > break; > } > } > {code} > However, I believe it failed to delete due to directory is not empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3042) Sqoop does not clear compile directory under /tmp/sqoop-/compile automatically
[ https://issues.apache.org/jira/browse/SQOOP-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594718#comment-16594718 ] Daniel Voros commented on SQOOP-3042: - [~amjosh911] it is going to be included in the next release we do from trunk. Not sure yet if it's going to be 1.4.8, 1.5.0 or 3.0.0. > Sqoop does not clear compile directory under /tmp/sqoop-/compile > automatically > > > Key: SQOOP-3042 > URL: https://issues.apache.org/jira/browse/SQOOP-3042 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.6 >Reporter: Eric Lin >Assignee: Eric Lin >Priority: Critical > Labels: patch > Fix For: 3.0.0 > > Attachments: SQOOP-3042.1.patch, SQOOP-3042.2.patch, > SQOOP-3042.4.patch, SQOOP-3042.5.patch, SQOOP-3042.6.patch, > SQOOP-3042.7.patch, SQOOP-3042.9.patch > > > After running sqoop, all the temp files generated by ClassWriter are left > behind on disk, so anyone can check those JAVA files to see the schema of > those tables that Sqoop has been interacting with. By default, the directory > is under /tmp/sqoop-/compile. > In class org.apache.sqoop.SqoopOptions, function getNonceJarDir(), I can see > that we did add "deleteOnExit" on the temp dir: > {code} > for (int attempts = 0; attempts < MAX_DIR_CREATE_ATTEMPTS; attempts++) { > hashDir = new File(baseDir, RandomHash.generateMD5String()); > while (hashDir.exists()) { > hashDir = new File(baseDir, RandomHash.generateMD5String()); > } > if (hashDir.mkdirs()) { > // We created the directory. Use it. > // If this directory is not actually filled with files, delete it > // when the JVM quits. > hashDir.deleteOnExit(); > break; > } > } > {code} > However, I believe it failed to delete due to directory is not empty. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
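For background on why the cleanup in the snippet above fails: `File.deleteOnExit()` removes only the single registered path, and deleting a non-empty directory fails, so the generated `.java`/`.class` files keep the compile directory alive. A recursive, depth-first delete clears the whole tree. The sketch below is illustrative and is not the patch attached to this ticket:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class CompileDirCleanup {
    // Walk the tree and delete children before parents (reverse lexicographic
    // order puts deeper paths first), so every directory is empty by the time
    // it is deleted.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder())
                 .map(Path::toFile)
                 .forEach(File::delete);
        }
    }

    // Demonstrates the failure mode from the ticket: a plain delete() of a
    // non-empty directory returns false, while the recursive delete succeeds.
    static boolean demo() {
        try {
            Path d = Files.createTempDirectory("sqoop-compile-demo");
            Files.createFile(d.resolve("T1.java"));
            boolean plainDeleteWorked = d.toFile().delete(); // false: not empty
            deleteRecursively(d);
            return !plainDeleteWorked && !Files.exists(d);
        } catch (IOException e) {
            return false;
        }
    }
}
```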
Re: Review Request 68382: Upgrade Gradle version to 4.9
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68382/#review207668 --- Ship it! Thank you for picking this up! I've checked the following: - tar.gz contents (and lib/ in particular) are the same when generated with `./gradlew tar -x test` - publishing of snapshot and released artifacts works with local and remote repositories I couldn't get the ant way of publishing to work with remote repositories but comparing to the Maven central I've noticed that we've only released 1.4.7 with the classifier `hadoop260`. This is something we might need to revisit when deploying the next release; whether it makes sense to add a classifier if we're only releasing a single version. (For 1.4.6 there were multiple versions: http://central.maven.org/maven2/org/apache/sqoop/sqoop/1.4.6/) - daniel voros On Aug. 16, 2018, 2:37 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68382/ > --- > > (Updated Aug. 16, 2018, 2:37 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3364 > https://issues.apache.org/jira/browse/SQOOP-3364 > > > Repository: sqoop-trunk > > > Description > --- > > Apart from the Gradle version bump the change contains the following: > - html.destination type is modified to file to avoid deprecation warning > - the task wrapper is replaced with wrapper {} to avoid deprecation warning > - enableFeaturePreview('STABLE_PUBLISHING') is added to settings.gradle to > avoid deprecation warning. This is a change I could not test since we cannot > publish to maven repo now. However in a case of a future release we should > test it as described here: > https://docs.gradle.org/4.9/userguide/publishing_maven.html#publishing_maven:deferred_configuration > - The HBase test cases failed at first because the regionserver web ui was > not able to start up most probably because of a bad version of a Jetty class > on the classpath. 
However we do not need the regionserver web ui for the > Sqoop tests so instead of playing around with libraries I disabled it just > like we have already disabled the master web ui. > > > Diffs > - > > build.gradle 709172cc0 > gradle/wrapper/gradle-wrapper.jar 99340b4ad18d3c7e764794d300ffd35017036793 > gradle/wrapper/gradle-wrapper.properties 90a06cec7 > settings.gradle 7d64af500 > src/test/org/apache/sqoop/hbase/HBaseTestCase.java 87fce34a8 > > > Diff: https://reviews.apache.org/r/68382/diff/1/ > > > Testing > --- > > Executed unit and third party test suite successfully. > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 68316: Debug toString() methods of OraOopOracleDataChunk
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68316/#review207395 --- Ship it! Ship It! - daniel voros On Aug. 15, 2018, 3:21 p.m., Nguyen Truong wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68316/ > --- > > (Updated Aug. 15, 2018, 3:21 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3362 > https://issues.apache.org/jira/browse/SQOOP-3362 > > > Repository: sqoop-trunk > > > Description > --- > > The method was currently returning the hash of data chunk object. I > implemented the toString() methods inside the subclasses of > OraOopOracleDataChunk. > > > Diffs > - > > src/java/org/apache/sqoop/manager/oracle/OraOopDBInputSplit.java 948bdbb73 > src/java/org/apache/sqoop/manager/oracle/OraOopOracleDataChunk.java > eb67fd2e4 > src/java/org/apache/sqoop/manager/oracle/OraOopOracleDataChunkExtent.java > 20b39eea0 > > src/java/org/apache/sqoop/manager/oracle/OraOopOracleDataChunkPartition.java > 59889b82b > > src/test/org/apache/sqoop/manager/oracle/TestOraOopDBInputSplitGetDebugDetails.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/68316/diff/4/ > > > Testing > --- > > A test case is added named TestOraOopDBInputSplitGetDebugDetails. > > > Thanks, > > Nguyen Truong > >
Re: Review Request 68316: Debug toString() methods of OraOopOracleDataChunk
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68316/#review207321 --- Hi Nguyen, Thanks for your contribution! Have you considered using ReflectionToStringBuilder from commons-lang3? You could achieve similar results by overriding toString() only in OraOopOracleDataChunk without having to worry about future fields added to the classes: ``` @Override public String toString() { return ReflectionToStringBuilder.toString(this, ToStringStyle.MULTI_LINE_STYLE); } ``` If you decide to keep the current solution, I'd recommend replacing `super.getId()` with `getId()` in the toString methods. Regards, Daniel - daniel voros On Aug. 14, 2018, 12:53 p.m., Nguyen Truong wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68316/ > --- > > (Updated Aug. 14, 2018, 12:53 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3362 > https://issues.apache.org/jira/browse/SQOOP-3362 > > > Repository: sqoop-trunk > > > Description > --- > > The method previously returned the hash of the data chunk object. I > implemented the toString() methods inside the subclasses of > OraOopOracleDataChunk. > > > Diffs > - > > src/java/org/apache/sqoop/manager/oracle/OraOopDBInputSplit.java 948bdbb73 > src/java/org/apache/sqoop/manager/oracle/OraOopOracleDataChunk.java > eb67fd2e4 > src/java/org/apache/sqoop/manager/oracle/OraOopOracleDataChunkExtent.java > 20b39eea0 > > src/java/org/apache/sqoop/manager/oracle/OraOopOracleDataChunkPartition.java > 59889b82b > > src/test/org/apache/sqoop/manager/oracle/TestOraOopDBInputSplitGetDebugDetails.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/68316/diff/3/ > > > Testing > --- > > No test case is added because the change is already covered. > > > Thanks, > > Nguyen Truong > >
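The ReflectionToStringBuilder approach suggested above can be illustrated with plain JDK reflection, which is roughly what the commons-lang3 builder does under the hood. The class names below (DataChunk, ExtentChunk) are hypothetical stand-ins for the OraOop data chunk classes, not the actual Sqoop types:

```java
import java.lang.reflect.Field;

// Sketch of a reflection-based toString(): walk the class hierarchy and
// dump every declared field, so new fields show up without code changes.
abstract class DataChunk {
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(getClass().getSimpleName()).append('[');
        for (Class<?> c = getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (f.isSynthetic()) continue; // skip compiler-generated fields
                f.setAccessible(true);
                try {
                    sb.append(f.getName()).append('=').append(f.get(this)).append(',');
                } catch (IllegalAccessException e) {
                    sb.append(f.getName()).append("=?,");
                }
            }
        }
        if (sb.charAt(sb.length() - 1) == ',') sb.setLength(sb.length() - 1);
        return sb.append(']').toString();
    }
}

// Hypothetical subclass standing in for OraOopOracleDataChunkExtent.
class ExtentChunk extends DataChunk {
    private final String id;
    private final long startBlock;
    ExtentChunk(String id, long startBlock) { this.id = id; this.startBlock = startBlock; }
}

class ToStringSketch {
    public static void main(String[] args) {
        System.out.println(new ExtentChunk("chunk-1", 42L)); // e.g. ExtentChunk[id=chunk-1,startBlock=42]
    }
}
```

Compared with hand-written toString() methods in every subclass, this trades a little reflection overhead (acceptable for debug output) for not having to remember to update toString() when a field is added.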
[jira] [Updated] (SQOOP-3052) Introduce Gradle based build for Sqoop to make it more developer friendly / open
[ https://issues.apache.org/jira/browse/SQOOP-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3052: Summary: Introduce Gradle based build for Sqoop to make it more developer friendly / open (was: Introduce Maven/Gradle/etc. based build for Sqoop to make it more developer friendly / open) > Introduce Gradle based build for Sqoop to make it more developer friendly / > open > > > Key: SQOOP-3052 > URL: https://issues.apache.org/jira/browse/SQOOP-3052 > Project: Sqoop > Issue Type: Improvement >Reporter: Attila Szabo >Assignee: Anna Szonyi >Priority: Major > Fix For: 1.5.0 > > Attachments: SQOOP-3052.patch > > > The current trunk version can only be built with the Ant/Ivy combination, which > has some painful limitations (resolve is slow / needs to be tweaked to use > only caches, the current profile / variable based settings are not working in > IDEs out of the box, the current solution does not download the related > sources, etc.) > It would be nice to provide a solution which would give developers the > possibility to choose between today's widely used build infrastructures > (e.g. Maven, Gradle, etc.). For this solution it would also be essential to > keep the different build files (if there is more than one) synchronized > easily, so that the configuration wouldn't diverge over time. Test execution has > to be solved as well, and should cover all the available test cases. > In this scenario: > providing one good working solution is much better than providing three > different ones which easily become out of sync. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 67929: Remove Kite dependency from the Sqoop project
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67929/#review206241 --- Ship it! Thanks for the update! Verified on same cluster. Ship it! - daniel voros On July 19, 2018, 1:52 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67929/ > --- > > (Updated July 19, 2018, 1:52 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3329 > https://issues.apache.org/jira/browse/SQOOP-3329 > > > Repository: sqoop-trunk > > > Description > --- > > - Removed kitesdk dependency from ivy.xml > - Removed Kite Dataset API based Parquet import implementation > - Since Parquet library was a transitive dependency of the Kite SDK I added > org.apache.parquet.avro-parquet 1.9 as a direct dependency > - In this dependency the parquet package has changed to org.apache.parquet so > I needed to make changes in several classes according to this > - Removed all the Parquet related test cases from TestHiveImport. These > scenarios are already covered in TestHiveServer2ParquetImport. > - Modified the documentation to reflect these changes. 
> > > Diffs > - > > ivy.xml 1f587f3eb > ivy/libraries.properties 565a8bf50 > src/docs/user/hive-notes.txt af97d94b3 > src/docs/user/import.txt a2c16d956 > src/java/org/apache/sqoop/SqoopOptions.java cc1b75281 > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorImplementation.java > 050c85488 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java > 02816d77f > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java > 6ebc5a31b > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportMapper.java > 122ff3fc9 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java > 7e179a27d > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportMapper.java > 0a91e4a20 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetJobConfiguratorFactory.java > bd07c09f4 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetMergeJobConfigurator.java > ed045cd14 > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java > a4768c932 > src/java/org/apache/sqoop/tool/BaseSqoopTool.java 87fc5e987 > src/test/org/apache/sqoop/TestMerge.java 2b3280a5a > src/test/org/apache/sqoop/TestParquetExport.java 0fab1880c > src/test/org/apache/sqoop/TestParquetImport.java b1488e8af > src/test/org/apache/sqoop/hive/TestHiveImport.java 436f0e512 > src/test/org/apache/sqoop/tool/TestBaseSqoopTool.java dbda8b7f4 > > > Diff: https://reviews.apache.org/r/67929/diff/2/ > > > Testing > --- > > Ran unit and third party tests. > > > File Attachments > > > trunkdependencies.graphml > > https://reviews.apache.org/media/uploaded/files/2018/07/18/4df23fec-c7a7-4dc6-8ac1-0872ee6fdadf__trunkdependencies.graphml > kiteremovaldependencies.graphml > > https://reviews.apache.org/media/uploaded/files/2018/07/18/e8cbb4d3-1da3-4b64-96ea-09f647ece126__kiteremovaldependencies.graphml > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 67971: SQOOP-3346: Upgrade Hadoop version to 2.8.0
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67971/#review206236 --- Ship it! Ship It! - daniel voros On July 19, 2018, 9 a.m., Boglarka Egyed wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67971/ > --- > > (Updated July 19, 2018, 9 a.m.) > > > Review request for Sqoop, daniel voros, Fero Szabo, and Szabolcs Vasas. > > > Bugs: SQOOP-3346 > https://issues.apache.org/jira/browse/SQOOP-3346 > > > Repository: sqoop-trunk > > > Description > --- > > Upgrading Hadoop version from 2.6.0 to 2.8.0 and some related code changes. > > > Diffs > - > > ivy/libraries.properties 565a8bf50cdd88597a2a502d2fdbce2d5c8585ef > src/java/org/apache/sqoop/config/ConfigurationConstants.java > 666852c2af2f7636bd068c24e5df32173b185603 > src/java/org/apache/sqoop/config/ConfigurationHelper.java > fb2ab031caef023dfbd8130814d07416dbf4db14 > src/java/org/apache/sqoop/mapreduce/JobBase.java > 6d1e04992c0e1d45a24e22fcd765c286e7414578 > src/java/org/apache/sqoop/tool/ImportTool.java > f7310b939a667e4434a78bdbc50f9520fe72f8a6 > src/test/org/apache/sqoop/TestSqoopOptions.java > ba4a4d44f36c155318092bdcc71588c476e84e2d > src/test/org/apache/sqoop/manager/sqlserver/SQLServerParseMethodsTest.java > 833ebe8a14e438daa7fbb2eae13dc0d04bec3bb8 > src/test/org/apache/sqoop/orm/TestParseMethods.java > 46bb52d562991bc9c3443b8a26c7a7f9996d72d2 > > > Diff: https://reviews.apache.org/r/67971/diff/1/ > > > Testing > --- > > Ran unit and 3rd party tests successfully. > > > Thanks, > > Boglarka Egyed > >
[jira] [Commented] (SQOOP-3346) Upgrade Hadoop version to 2.8.0
[ https://issues.apache.org/jira/browse/SQOOP-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16549011#comment-16549011 ] Daniel Voros commented on SQOOP-3346: - Yes, I agree with you. Don't block this until SQOOP-3305 is done! > Upgrade Hadoop version to 2.8.0 > --- > > Key: SQOOP-3346 > URL: https://issues.apache.org/jira/browse/SQOOP-3346 > Project: Sqoop > Issue Type: Sub-task >Reporter: Boglarka Egyed >Assignee: Boglarka Egyed >Priority: Major > > Support for AWS temporary credentials has been introduced in Hadoop 2.8.0 > based on HADOOP-12537 and it would make more sense to test and support this > capability too with Sqoop. > There is [SQOOP-3305|https://reviews.apache.org/r/66300/bugs/SQOOP-3305/] > being open for upgrading Hadoop to 3.0.0 however it has several issues > described in [https://reviews.apache.org/r/66300/] currently thus I would > like to proceed with an "intermediate" upgrade to 2.8.0 to enable development > on S3 front. [~dvoros] are you OK with this? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 67929: Remove Kite dependency from the Sqoop project
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67929/#review206195 --- Hi! I was trying to run this on a minicluster but got the following error: ``` 2018-07-18 09:20:41,799 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.avro.Schema.getLogicalType()Lorg/apache/avro/LogicalType; at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:178) at org.apache.parquet.avro.AvroSchemaConverter.convertUnion(AvroSchemaConverter.java:214) at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:171) at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:130) at org.apache.parquet.avro.AvroSchemaConverter.convertField(AvroSchemaConverter.java:227) at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:124) at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:115) at org.apache.parquet.avro.AvroWriteSupport.init(AvroWriteSupport.java:117) at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:389) at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:350) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:653) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171) ``` This is happening when we have newer version of parquet (1.8.1 IIRC) with older Avro (1.7.7 in this case). Where is parquet coming from? 
- 1.9 is coming from Sqoop since this new patch - Hive's hive-exec jar also contains parquet classes shaded with the original packaging Which one gets picked seems random to me (it even changes between re-executions of mappers!). Both are in the distributed cache. Where is avro coming from? - There can be multiple versions under Sqoop/Hive but it doesn't really matter. Hadoop is packaged with avro under `share/hadoop/*/lib`. The jars there will take precedence over the user classpath. This can be changed with `mapreduce.job.user.classpath.first=true`, but then we'd have to make sure not to override anything that Hadoop relies on. I've come across this issue before and solved it by shading the parquet classes. Note that this could be harder to do with Sqoop's ant build scripts. Some other minor observations: - Hadoop 3.1.0 still has Avro 1.7.7 - Hive has been using incompatible versions of Avro and Parquet for a long time, but they're not relying on the parts of Parquet that require Avro. Szabolcs, I've been struggling with this for too long, and a fresh pair of eyes might help spot some other options! Can you please take a look and validate what I've found? Regards, Daniel - daniel voros On July 16, 2018, 3:56 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67929/ > --- > > (Updated July 16, 2018, 3:56 p.m.) > > > Review request for Sqoop.
> > > Bugs: SQOOP-3329 > https://issues.apache.org/jira/browse/SQOOP-3329 > > > Repository: sqoop-trunk > > > Description > --- > > - Removed kitesdk dependency from ivy.xml > - Removed Kite Dataset API based Parquet import implementation > - Since Parquet library was a transitive dependency of the Kite SDK I added > org.apache.parquet.avro-parquet 1.9 as a direct dependency > - In this dependency the parquet package has changed to org.apache.parquet so > I needed to make changes in several classes according to this > - Removed all the Parquet related test cases from TestHiveImport. These > scenarios are already covered in TestHiveServer2ParquetImport. > - Modified the documentation to reflect these changes. > > > Diffs > - > > ivy.xml 1f587f3eb > ivy/libraries.properties 565a8bf50 > src/docs/user/hive-notes.txt af97d94b3 > src/docs/user/import.txt a2c16d956 > src/java/org/apache/sqoop/SqoopOptions.java cc1b75281 > src/java/org/apache/sqoop/avro/AvroUtil.java 16
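When two copies of a library compete on the classpath like the parquet/avro mix-up described above, a quick diagnostic is to ask the JVM which jar a class was actually loaded from via its code source. A minimal JDK-only sketch (the class name probed in main is just an example; in this situation one would pass e.g. org.apache.avro.Schema):

```java
import java.security.CodeSource;

// Prints the jar (or directory) a class was loaded from. Useful for
// diagnosing NoSuchMethodError caused by duplicate library copies,
// e.g. a shaded parquet inside hive-exec vs. Sqoop's own parquet jar.
class WhichJar {
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Platform/bootstrap classes report no code source.
        return src == null ? "bootstrap/platform classloader" : src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Replace with the conflicting class name on a real cluster.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

Logging this from a mapper would also show whether the picked jar really varies between task re-executions, as observed above.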
Re: Review Request 66300: Upgrade to Hadoop 3.0.0
ute(Driver.java:2150) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1826) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1561) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:221) at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:313) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:326) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS] at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173) at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390) at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:613) at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:409) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:798) at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:794) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794) ... 36 more ``` - daniel voros On March 27, 2018, 8:50 a.m., daniel voros wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66300/ > --- > > (Updated March 27, 2018, 8:50 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3305 > https://issues.apache.org/jira/browse/SQOOP-3305 > > > Repository: sqoop-trunk > > > Description > --- > > To be able to eventually support the latest versions of Hive, HBase and > Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See > https://hadoop.apache.org/docs/r3.0.0/index.html > > > Diffs > - > > ivy.xml 1f587f3e > ivy/libraries.properties 565a8bf5 > src/java/org/apache/sqoop/SqoopOptions.java d9984af3 > src/java/org/apache/sqoop/config/ConfigurationHelper.java fb2ab031 > src/java/org/apache/sqoop/hive/HiveImport.java 5da00a74 > src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e0499 > src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2a > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java > e68bba90 > src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b7 > src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20dd > src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java 19bb7605 > > src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java > 549a8c6c > > src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java > 79881f7b > src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c1 > testdata/hcatalog/conf/hive-site.xml edac7aa9 > > > Diff: https://reviews.apache.org/r/66300/diff/7/ > > > Testing > --- > > Normal and third-party unit tests. > > > Thanks, > > daniel voros > >
[jira] [Resolved] (SQOOP-3343) format all DTA.bat SQOOP
[ https://issues.apache.org/jira/browse/SQOOP-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros resolved SQOOP-3343. - Resolution: Invalid see INFRA-16778 > format all DTA.bat SQOOP > > > Key: SQOOP-3343 > URL: https://issues.apache.org/jira/browse/SQOOP-3343 > Project: Sqoop > Issue Type: Bug >Reporter: Mohamedvolt >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (SQOOP-3342) rformat.batalldata:assignee = currentUser() AND resolution = Unresolved order by updated DESC
[ https://issues.apache.org/jira/browse/SQOOP-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros resolved SQOOP-3342. - Resolution: Invalid see INFRA-16778 > rformat.batalldata:assignee = currentUser() AND resolution = Unresolved order > by updated DESC > - > > Key: SQOOP-3342 > URL: https://issues.apache.org/jira/browse/SQOOP-3342 > Project: Sqoop > Issue Type: New Feature >Reporter: Mohamedvolt >Priority: Major > > rformat.batalldata:assignee = currentUser() AND resolution = Unresolved order > by updated DESC -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 67873: Add Hive support to the new Parquet writing implementation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67873/#review206088 --- Ship it! Looks good, thank you! Ship it! - daniel voros On July 10, 2018, 11:26 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67873/ > --- > > (Updated July 10, 2018, 11:26 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3335 > https://issues.apache.org/jira/browse/SQOOP-3335 > > > Repository: sqoop-trunk > > > Description > --- > > SQOOP-3328 adds a new Parquet reading and writing implementation to Sqoop it > does not add support to Hive Parquet imports. The task of this Jira is to add > this missing functionality. > > > Diffs > - > > src/java/org/apache/sqoop/hive/HiveTypes.java ad00535e5 > src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c53 > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java > eb6d08f8a > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java > 3f35faf86 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java > feb3bf19b > src/java/org/apache/sqoop/tool/BaseSqoopTool.java 8d318327a > src/java/org/apache/sqoop/tool/ImportTool.java 25c3f7031 > src/test/org/apache/sqoop/TestParquetIncrementalImportMerge.java d8d3af40f > src/test/org/apache/sqoop/hive/TestHiveServer2ParquetImport.java > PRE-CREATION > src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java 3d115ab3e > src/test/org/apache/sqoop/hive/TestHiveTypesForAvroTypeMapping.java > PRE-CREATION > src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f646 > src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java ac6db0b14 > src/test/org/apache/sqoop/tool/TestHiveServer2OptionValidations.java > 4d3f93898 > > > Diff: https://reviews.apache.org/r/67873/diff/1/ > > > Testing > --- > > Executed unit and third party test cases. > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67675/#review205504 --- Ship it! Ship It! - daniel voros On June 28, 2018, 12:29 p.m., Fero Szabo wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67675/ > --- > > (Updated June 28, 2018, 12:29 p.m.) > > > Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas. > > > Bugs: SQOOP-3332 > https://issues.apache.org/jira/browse/SQOOP-3332 > > > Repository: sqoop-trunk > > > Description > --- > > This is the documentation part of SQOOP-. > > > Diffs > - > > src/docs/user/connectors.txt f1c7aebe > src/java/org/apache/sqoop/manager/SQLServerManager.java c98ad2db > src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java > cf58f631 > src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java > fc1c4895 > > > Diff: https://reviews.apache.org/r/67675/diff/3/ > > > Testing > --- > > Unit tests, 3rdparty tests, ant docs. > > I've also investigated how export and import work: > > Import has its retry mechanism in > org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader#nextKeyValue > In case of error, it re-calculates the db query, thus the implicit > requirements > > Export has its retry loop in > org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread#write > It doesn't recalculate the query, thus is a lot safer. > > > Thanks, > > Fero Szabo > >
Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67628/#review205495 --- Ship it! Thanks for the updates! Ship it! - daniel voros On June 26, 2018, 9:15 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67628/ > --- > > (Updated June 26, 2018, 9:15 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3328 > https://issues.apache.org/jira/browse/SQOOP-3328 > > > Repository: sqoop-trunk > > > Description > --- > > The new implementation uses classes from parquet.hadoop packages. > TestParquetIncrementalImportMerge has been introduced to cover some gaps we > had in the Parquet merge support. > The test infrastructure is also modified a bit which was needed because of > TestParquetIncrementalImportMerge. > > Note that this JIRA does not cover the Hive Parquet import support I will > create another JIRA for that. > > > Diffs > - > > src/java/org/apache/sqoop/SqoopOptions.java > d9984af369f901c782b1a74294291819e7d13cdd > src/java/org/apache/sqoop/avro/AvroUtil.java > 57c2062568778c5bb53cd4118ce4f030e4ff33f2 > src/java/org/apache/sqoop/manager/ConnManager.java > c80dd5d9cbaa9b114c12b693e9a686d2cbbe51a3 > src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java > 3b5421028d3006e790ed4b711a06dbdb4035b8a0 > src/java/org/apache/sqoop/mapreduce/ImportJobBase.java > 17c9ed39b1e613a6df36b54cd5395b80e5f8fb0b > src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java > ae53a96bddc523a52384715dd97705dc3d9db607 > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetExportJobConfigurator.java > 8d7b87f6d6832ce8d81d995af4c4bd5eeae38e1b > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java > fa1bc7d1395fbbbceb3cb72802675aebfdb27898 > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactory.java > ed5103f1d84540ef2fa5de60599e94aa69156abe > > 
src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java > 2286a52030778925349ebb32c165ac062679ff71 > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorImplementation.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetMergeJobConfigurator.java > 67fdf6602bcbc6c091e1e9bf4176e56658ce5222 > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportMapper.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportMapper.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetJobConfiguratorFactory.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java > 7f21205e1c4be4200f7248d3f1c8513e0c8e490c > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java > ca02c7bdcaf2fa981e15a6a96b111dec38ba2b25 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportMapper.java > 2d88a9c8ea4eb32001e1eb03e636d9386719 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java > 87828d1413eb71761aed44ad3b138535692f9c97 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportMapper.java > 20adf6e422cc4b661a74c8def114d44a14787fc6 > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetJobConfiguratorFactory.java > 055e1166b07aeef711cd162052791500368c628d > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetMergeJobConfigurator.java > 9fecf282885f7aeac011a66f7d5d05512624976f > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java > 
e68bba90d8b08ac3978fcc9ccae612bdf02388e8 > src/java/org/apache/sqoop/tool/BaseSqoopTool.java > c62ee98c2b22d819c9a994884b254f76eb518b6a > src/java/org/apache/sqoop/tool/ImportTool.java > 2c474b7eeeff02b59204e4baca8554d668b6c61e > src/java/org/apache/sqoop/tool/MergeTool.java > 4c20f7d151514b26a098dafdc1ee265cbde5ad20 >
Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67675/#review205494 --- Hi Fero, If I understand correctly, with this patch we're only displaying a warning when using --resilient to let the users know they should add --split-by (even if they do so?). In the documentation you're saying omitting --split-by can lead to lost/duplicated records. Shouldn't we stop the import if there's no --split-by then? I understand we can't enforce the uniqueness and ascending order though, so keeping some kind of warning could make sense too. What do you think? Regards, Daniel - daniel voros On June 25, 2018, 3:17 p.m., Fero Szabo wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67675/ > --- > > (Updated June 25, 2018, 3:17 p.m.) > > > Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas. > > > Bugs: SQOOP-3332 > https://issues.apache.org/jira/browse/SQOOP-3332 > > > Repository: sqoop-trunk > > > Description > --- > > This is the documentation part of SQOOP-. > > > Diffs > - > > src/docs/user/connectors.txt f1c7aebe > src/java/org/apache/sqoop/manager/SQLServerManager.java c98ad2db > src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java > cf58f631 > src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java > fc1c4895 > > > Diff: https://reviews.apache.org/r/67675/diff/2/ > > > Testing > --- > > Unit tests, 3rdparty tests, ant docs. > > I've also investigated how export and import work: > > Import has its retry mechanism in > org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader#nextKeyValue > In case of error, it re-calculates the db query, thus the implicit > requirements > > Export has its retry loop in > org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread#write > It doesn't recalculate the query, thus is a lot safer. > > > Thanks, > > Fero Szabo > >
[jira] [Commented] (SQOOP-3323) Use hive executable in (non-JDBC) Hive imports
[ https://issues.apache.org/jira/browse/SQOOP-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520178#comment-16520178 ] Daniel Voros commented on SQOOP-3323: - Attached review request. > Use hive executable in (non-JDBC) Hive imports > -- > > Key: SQOOP-3323 > URL: https://issues.apache.org/jira/browse/SQOOP-3323 > Project: Sqoop > Issue Type: Improvement > Components: hive-integration >Affects Versions: 3.0.0 >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Major > Fix For: 3.0.0 > > > When doing Hive imports the old way (not via JDBC that was introduced in > SQOOP-3309) we're trying to use the {{CliDriver}} class from Hive and fall > back to the {{hive}} executable (a.k.a. [Hive > Cli|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli]) if > that class is not found. > Since {{CliDriver}} and the {{hive}} executable that's relying on it are > [deprecated|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] > (see also HIVE-10511), we should switch to using {{beeline}} to talk to > Hive. With recent additions (e.g. HIVE-18963) this should be easier than > before. > As a first step we could switch to using {{hive}} executable. With HIVE-19728 > it will be possible (in Hive 3.1) to configure hive to actually run beeline > when using the {{hive}} executable. This way we could leave it to the user to > decide whether to use the deprecated cli or use beeline instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3336) Splitting on integer column can create more splits than necessary
[ https://issues.apache.org/jira/browse/SQOOP-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520176#comment-16520176 ] Daniel Voros commented on SQOOP-3336: - Attached review request. This also affects splitting on date/timestamp columns, since DateSplitter uses the same logic. > Splitting on integer column can create more splits than necessary > - > > Key: SQOOP-3336 > URL: https://issues.apache.org/jira/browse/SQOOP-3336 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.7 > Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Major > Fix For: 1.5.0, 3.0.0 > > > Running an import with {{-m 2}} will result in three splits if there are only > three consecutive integers in the table ({{\{1, 2, 3\}}}). > Work is (probably) spread more evenly between mappers this way, but ending up > with more files than expected could be an issue. > Split-limit can also result in more values than asked for in the last chunk > (due to the closed interval in the end). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 67699: Splitting on integer column can create more splits than necessary
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67699/ --- Review request for Sqoop. Bugs: SQOOP-3336 https://issues.apache.org/jira/browse/SQOOP-3336 Repository: sqoop-trunk Description --- Running an import with -m 2 will result in three splits if there are only three consecutive integers in the table ({1, 2, 3}). Work is (probably) spread more evenly between mappers this way, but ending up with more files than expected could be an issue. Split-limit can also result in more values than asked for in the last chunk (due to the closed interval at the end). Diffs - src/docs/user/import.txt 2d074f49 src/java/org/apache/sqoop/mapreduce/db/IntegerSplitter.java 22c18e25 src/test/org/apache/sqoop/mapreduce/db/TestIntegerSplitter.java b43fc41f Diff: https://reviews.apache.org/r/67699/diff/1/ Testing --- Corrected some tests that were flawed before and added new tests for the above-mentioned (-m 2) case. Ran normal UTs and third-party tests. Thanks, daniel voros
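[Editor's sketch] The off-by-one described in SQOOP-3336 can be illustrated with a minimal, hedged model of the boundary logic (this is not the actual org.apache.sqoop.mapreduce.db.IntegerSplitter code, only the shape of the bug): step from min to max in increments of (max - min) / numSplits, emit half-open ranges, and close the final range on max.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Simplified model of the split-point logic (illustrative only):
    // collect boundary points, pair them into half-open ranges, and
    // append a final closed range ending at max.
    static List<String> splits(long min, long max, int numSplits) {
        long step = Math.max(1, (max - min) / numSplits);
        List<Long> points = new ArrayList<>();
        for (long v = min; v <= max; v += step) {
            points.add(v);
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < points.size() - 1; i++) {
            result.add(points.get(i) + " <= x < " + points.get(i + 1));
        }
        // The last chunk is a closed interval, so when the final split
        // point equals max it becomes an extra, single-value split.
        result.add(points.get(points.size() - 1) + " <= x <= " + max);
        return result;
    }

    public static void main(String[] args) {
        // Three consecutive values {1, 2, 3} imported with -m 2
        // yield three splits for two requested mappers.
        for (String s : splits(1, 3, 2)) {
            System.out.println(s);
        }
    }
}
```

Since DateSplitter delegates to the same arithmetic, date/timestamp split-by columns hit the identical extra-split case.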
[jira] [Created] (SQOOP-3336) Splitting on integer column can create more splits than necessary
Daniel Voros created SQOOP-3336: --- Summary: Splitting on integer column can create more splits than necessary Key: SQOOP-3336 URL: https://issues.apache.org/jira/browse/SQOOP-3336 Project: Sqoop Issue Type: Bug Affects Versions: 1.4.7 Reporter: Daniel Voros Assignee: Daniel Voros Fix For: 1.5.0, 3.0.0 Running an import with {{-m 2}} will result in three splits if there are only three consecutive integers in the table ({{\{1, 2, 3\}}}). Work is (probably) spread more evenly between mappers this way, but ending up with more files than expected could be an issue. Split-limit can also result in more values than asked for in the last chunk (due to the closed interval in the end). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 67689: Use hive executable in (non-JDBC) Hive imports
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67689/ --- Review request for Sqoop. Bugs: SQOOP-3323 https://issues.apache.org/jira/browse/SQOOP-3323 Repository: sqoop-trunk Description --- When doing Hive imports the old way (not via JDBC that was introduced in SQOOP-3309) we're trying to use the CliDriver class from Hive and fall back to the hive executable (a.k.a. Hive Cli) if that class is not found. Since CliDriver and the hive executable that's relying on it are deprecated (see also HIVE-10511), we should switch to using beeline to talk to Hive. With recent additions (e.g. HIVE-18963) this should be easier than before. As a first step we could switch to using hive executable. With HIVE-19728 it will be possible (in Hive 3.1) to configure hive to actually run beeline when using the hive executable. This way we could leave it to the user to decide whether to use the deprecated cli or use beeline instead. Diffs - src/java/org/apache/sqoop/hive/HiveImport.java 5da00a74 src/test/org/apache/sqoop/TestIncrementalImport.java 1ab98021 src/test/org/apache/sqoop/TestSqoopJobDataPublisher.java b3579ac1 src/test/org/apache/sqoop/hive/TestHiveImport.java 436f0e51 src/test/org/apache/sqoop/manager/postgresql/PostgresqlExternalTableImportTest.java dd4cfb48 Diff: https://reviews.apache.org/r/67689/diff/1/ Testing --- run thirdparty and normal UTs, also tested on a cluster I'm removing PostgresqlExternalTableImportTest since it was relying on the CliDriver path to do an actual Hive import. Thanks, daniel voros
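[Editor's sketch] The fallback path discussed in SQOOP-3323 — shelling out to the `hive` executable instead of loading the deprecated CliDriver class in-process — can be sketched as follows. The script path, method names, and executable resolution are illustrative assumptions, not Sqoop's actual HiveImport logic; only `hive -f <file>` is standard Hive CLI usage.

```java
import java.util.ArrayList;
import java.util.List;

public class HiveExecSketch {
    // Build the argv for invoking the hive executable with a generated
    // DDL/load script (illustrative; names are placeholders).
    static List<String> hiveCommand(String hiveBinary, String scriptFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hiveBinary); // e.g. "hive"; with HIVE-19728 (Hive 3.1) the
                             // executable can be configured to run beeline
        cmd.add("-f");       // execute statements from a script file
        cmd.add(scriptFile);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = hiveCommand("hive", "/tmp/hive-import-script.q");
        System.out.println(String.join(" ", cmd));
        // To actually run it, something like:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```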
[jira] [Updated] (SQOOP-3323) Use hive executable in (non-JDBC) Hive imports
[ https://issues.apache.org/jira/browse/SQOOP-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3323: Description: When doing Hive imports the old way (not via JDBC that was introduced in SQOOP-3309) we're trying to use the {{CliDriver}} class from Hive and fall back to the {{hive}} executable (a.k.a. [Hive Cli|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli]) if that class is not found. Since {{CliDriver}} and the {{hive}} executable that's relying on it are [deprecated|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] (see also HIVE-10511), we should switch to using {{beeline}} to talk to Hive. With recent additions (e.g. HIVE-18963) this should be easier than before. As a first step we could switch to using {{hive}} executable. With HIVE-19728 it will be possible (in Hive 3.1) to configure hive to actually run beeline when using the {{hive}} executable. This way we could leave it to the user to decide whether to use the deprecated cli or use beeline instead. was: When doing Hive imports the old way (not via JDBC that was introduced in SQOOP-3309) we're trying to use the {{CliDriver}} class from Hive and fall back to the {{hive}} executable (a.k.a. [Hive Cli|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli]) if that class is not found. Since {{CliDriver}} and the {{hive}} executable that's relying on it are [deprecated|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] (see also HIVE-10511), we should switch to using {{beeline}} to talk to Hive. With recent additions (e.g. HIVE-18963) this should be easier than before. Summary: Use hive executable in (non-JDBC) Hive imports (was: Use beeline in (non-JDBC) Hive imports) With HIVE-19728 (will be released in Hive 3.1) it will be possible to map hive executable to beeline. 
I'm updating the goal of this Jira to be using {{hive}} executable and let the users decide whether if they want to use beeline instead. > Use hive executable in (non-JDBC) Hive imports > -- > > Key: SQOOP-3323 > URL: https://issues.apache.org/jira/browse/SQOOP-3323 > Project: Sqoop > Issue Type: Improvement > Components: hive-integration >Affects Versions: 3.0.0 > Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Major > Fix For: 3.0.0 > > > When doing Hive imports the old way (not via JDBC that was introduced in > SQOOP-3309) we're trying to use the {{CliDriver}} class from Hive and fall > back to the {{hive}} executable (a.k.a. [Hive > Cli|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli]) if > that class is not found. > Since {{CliDriver}} and the {{hive}} executable that's relying on it are > [deprecated|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] > (see also HIVE-10511), we should switch to using {{beeline}} to talk to > Hive. With recent additions (e.g. HIVE-18963) this should be easier than > before. > As a first step we could switch to using {{hive}} executable. With HIVE-19728 > it will be possible (in Hive 3.1) to configure hive to actually run beeline > when using the {{hive}} executable. This way we could leave it to the user to > decide whether to use the deprecated cli or use beeline instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 67524: SQOOP-3333 Change default behavior of the MS SQL connector to non-resilient.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67524/#review204931 --- Ship it! Ship It! - daniel voros On June 18, 2018, 2:48 p.m., Fero Szabo wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67524/ > --- > > (Updated June 18, 2018, 2:48 p.m.) > > > Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas. > > > Bugs: SQOOP- > https://issues.apache.org/jira/browse/SQOOP- > > > Repository: sqoop-trunk > > > Description > --- > > This change is about changing the default behavior of the MS SQL connector > from resilient to non-resilient. I was aiming for the fewest possible > modifications while also removed double negation where previously present. > > I've refactored the context configuration into a separate class. > > I've also changed the documentation of the non-resilient flag and added a > note about the implicit requirement of the feature (that the split-by column > has to be unique and ordered in ascending order). > > I plan to expand the documentation more in SQOOP-3332, as the (now named) > resilient flag works not just for export, but import as well (queries and > tables). > > I've also added new tests that cover what classes get loaded in connection > with the resilient option. Also, I've refactored SQL Server import tests and > added a few more cases for better coverage. (The query import uses a > different method and wasn't covered by these tests at all.) 
> > > Diffs > - > > src/docs/user/connectors.txt 7c540718 > src/java/org/apache/sqoop/manager/ExportJobContext.java 773cf742 > src/java/org/apache/sqoop/manager/SQLServerManager.java b136087f > src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java > PRE-CREATION > src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java > c83c2c93 > > src/test/org/apache/sqoop/manager/sqlserver/TestSqlServerManagerContextConfigurator.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/67524/diff/4/ > > > Testing > --- > > Added new unit tests for SqlServerConfigurator. > unit and 3rd party tests. > ant docs ran succesfully. > manual testing. > > > Thanks, > > Fero Szabo > >
Re: Review Request 67629: SQOOP-3334 Improve ArgumentArrayBuilder, so arguments are replaceable
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67629/#review204921 --- Ship it! Ship it! (Maybe you could also consider dropping the Argument class, now that we're using maps.) - daniel voros On June 18, 2018, 12:06 p.m., Fero Szabo wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67629/ > --- > > (Updated June 18, 2018, 12:06 p.m.) > > > Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas. > > > Bugs: SQOOP-3334 > https://issues.apache.org/jira/browse/SQOOP-3334 > > > Repository: sqoop-trunk > > > Description > --- > > Changed the implementation so that it uses maps instead of lists. > > > Diffs > - > > src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java 00ce4fe8 > src/test/org/apache/sqoop/testutil/TestArgumentArrayBuilder.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/67629/diff/1/ > > > Testing > --- > > Added 2 new unit tests. > Ran 3rdparty and unit tests. > > > Thanks, > > Fero Szabo > >
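[Editor's sketch] The map-backed builder idea behind SQOOP-3334 can be shown with a minimal, hedged example (this is not the real org.apache.sqoop.testutil.ArgumentArrayBuilder; method names are placeholders): keying options by name makes a later call with the same option replace the earlier value instead of appending a duplicate to a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ArgBuilderDemo {
    // LinkedHashMap keeps insertion order while making options replaceable.
    private final Map<String, String> options = new LinkedHashMap<>();

    ArgBuilderDemo withOption(String name, String value) {
        options.put(name, value); // same name overwrites, not appends
        return this;
    }

    String[] build() {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            args.add("--" + e.getKey());
            args.add(e.getValue());
        }
        return args.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] a = new ArgBuilderDemo()
            .withOption("num-mappers", "4")
            .withOption("num-mappers", "1") // replaces the earlier value
            .build();
        System.out.println(String.join(" ", a)); // --num-mappers 1
    }
}
```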
Re: Review Request 67524: SQOOP-3333 Change default behavior of the MS SQL connector to non-resilient.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67524/#review204918 --- Hi Fero, Thank you for taking care of this! I think it's always a good idea to avoid these negating options. I've posted a few minor issues/questions. Regards, Daniel src/docs/user/connectors.txt Lines 154 (patched) <https://reviews.apache.org/r/67524/#comment287716> *and in ascending order? src/java/org/apache/sqoop/manager/ExportJobContext.java Lines 38 (patched) <https://reviews.apache.org/r/67524/#comment287720> This new constructor is always called with outputFormatClass=null now. Are you planning on using this later? src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java Lines 34 (patched) <https://reviews.apache.org/r/67524/#comment287717> *"to be resilient"? src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java Lines 115 (patched) <https://reviews.apache.org/r/67524/#comment287719> Could you please add some javadoc about the return value? src/test/org/apache/sqoop/manager/sqlserver/TestSqlServerManagerContextConfigurator.java Lines 119 (patched) <https://reviews.apache.org/r/67524/#comment287718> Could this be a @Before method since it's called from every TC? - daniel voros On June 18, 2018, 10:25 a.m., Fero Szabo wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67524/ > --- > > (Updated June 18, 2018, 10:25 a.m.) > > > Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas. > > > Bugs: SQOOP- > https://issues.apache.org/jira/browse/SQOOP- > > > Repository: sqoop-trunk > > > Description > --- > > This change is about changing the default behavior of the MS SQL connector > from resilient to non-resilient. I was aiming for the fewest possible > modifications while also removing double negation where previously present. > > I've refactored the context configuration into a separate class.
> > I've also changed the documentation of the non-resilient flag and added a > note about the implicit requirement of the feature (that the split-by column > has to be unique and ordered in ascending order). > > I plan to expand the documentation more in SQOOP-3332, as the (now named) > resilient flag works not just for export, but import as well (queries and > tables). > > I've also added new tests that cover what classes get loaded in connection > with the resilient option. Also, I've refactored SQL Server import tests and > added a few more cases for better coverage. (The query import uses a > different method and wasn't covered by these tests at all.) > > > Diffs > - > > src/docs/user/connectors.txt 7c540718 > src/java/org/apache/sqoop/manager/ExportJobContext.java 773cf742 > src/java/org/apache/sqoop/manager/SQLServerManager.java b136087f > src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java > PRE-CREATION > src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java > c83c2c93 > > src/test/org/apache/sqoop/manager/sqlserver/TestSqlServerManagerContextConfigurator.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/67524/diff/3/ > > > Testing > --- > > Added new unit tests for SqlServerConfigurator. > unit and 3rd party tests. > ant docs ran succesfully. > manual testing. > > > Thanks, > > Fero Szabo > >
Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67628/#review204915 --- Hey Szabolcs, Thank you for submitting this! Verified UTs, opened some minor issues. Could you please add a few lines of Javadoc to the new classes to make it clear what they're used for? Thanks, Daniel src/java/org/apache/sqoop/SqoopOptions.java Lines 2936 (patched) <https://reviews.apache.org/r/67628/#comment287711> Couldn't we store this as a field of SqoopOptions? That way it could have a default without this method. src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java Lines 51 (patched) <https://reviews.apache.org/r/67628/#comment287712> Wrong error msg: Is unknown? Or is _not_ set? src/test/org/apache/sqoop/TestParquetImport.java Lines 152 (patched) <https://reviews.apache.org/r/67628/#comment287713> Why is this expected to fail? Could you please add some Javadoc? - daniel voros On June 18, 2018, 9:49 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67628/ > --- > > (Updated June 18, 2018, 9:49 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3328 > https://issues.apache.org/jira/browse/SQOOP-3328 > > > Repository: sqoop-trunk > > > Description > --- > > The new implementation uses classes from parquet.hadoop packages. > TestParquetIncrementalImportMerge has been introduced to cover some gaps we > had in the Parquet merge support. > The test infrastructure is also modified a bit which was needed because of > TestParquetIncrementalImportMerge. > > Note that this JIRA does not cover the Hive Parquet import support I will > create another JIRA for that. 
> > > Diffs > - > > src/java/org/apache/sqoop/SqoopOptions.java d9984af36 > src/java/org/apache/sqoop/avro/AvroUtil.java 57c206256 > src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b5421028 > src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 17c9ed39b > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java > 2286a5203 > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopMergeParquetReducer.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportMapper.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportMapper.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetJobConfiguratorFactory.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java > PRE-CREATION > src/test/org/apache/sqoop/TestBigDecimalExport.java ccea17345 > src/test/org/apache/sqoop/TestMerge.java 11806fea6 > src/test/org/apache/sqoop/TestParquetExport.java 43dabb57b > src/test/org/apache/sqoop/TestParquetImport.java 27d407aa3 > src/test/org/apache/sqoop/TestParquetIncrementalImportMerge.java > PRE-CREATION > src/test/org/apache/sqoop/hive/TestHiveImport.java 436f0e512 > src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java f6d591b73 > src/test/org/apache/sqoop/manager/sqlserver/SQLServerHiveImportTest.java > e6b086550 > src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java a5f85a06b > src/test/org/apache/sqoop/testutil/ImportJobTestCase.java dbefe2097 > src/test/org/apache/sqoop/util/ParquetReader.java 56e03a060 > > > Diff: https://reviews.apache.org/r/67628/diff/1/ > > > Testing > --- > > Ran unit and third party tests successfully. 
> > > Thanks, > > Szabolcs Vasas > >
[jira] [Resolved] (SQOOP-2471) Support arrays and structs datatypes with Sqoop Hcatalog integration
[ https://issues.apache.org/jira/browse/SQOOP-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros resolved SQOOP-2471. - Resolution: Duplicate I believe this has been superseded by SQOOP-2935. > Support arrays and structs datatypes with Sqoop Hcatalog integration > > > Key: SQOOP-2471 > URL: https://issues.apache.org/jira/browse/SQOOP-2471 > Project: Sqoop > Issue Type: New Feature > Components: hive-integration >Affects Versions: 1.4.6 >Reporter: Pavel Benes >Priority: Critical > > Currently sqoop import is not able to handle any complex type. On the other > side the hive already has support for the following complex types: > - arrays: ARRAY > - structs: STRUCT > Since it is probably not possible to obtain all necessary information about > those types from general JDBC database, this feature should somehow use an > external information provided by arguments --map-column-java and > --map-column-hive. > For example it could look like this: > --map-column-java item='inventory_item(name text, supplier_id integer,price > numeric)' > --map-column-hive item='STRUCT decimal>' > In case no additional information is provided some more general type should > be created if possible. > It should be possible to serialize the complex datatypes values into strings > when the Hive target column's type is explicitly set to 'STRING'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 67268: Extract code using Kite into separate classes
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67268/#review203943 --- Ship it! Looks good! (Checked that Kite dependecies don't leak outside of the kite package.) A quick question: are you planning on keeping all these abstractions after removing Kite? - daniel voros On May 25, 2018, 1:55 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67268/ > --- > > (Updated May 25, 2018, 1:55 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3319 > https://issues.apache.org/jira/browse/SQOOP-3319 > > > Repository: sqoop-trunk > > > Description > --- > > Kite Dataset API is used in many places in the code to read/write Parquet > files and configure MR jobs. > > The goal of this JIRA is to introduce an implementation agnostic interface > for Parquet reading/writing and extract the code using Kite Dataset API into > separate classes implementing this interface. The benefit of this refactoring > is that it enables us introducing a new Parquet reading/writing > implementation which does not use Kite but plugs in easily. 
> > > Diffs > - > > src/java/org/apache/sqoop/avro/AvroUtil.java > 603cc631c9c45e3bc86f8c401da29cb1ba50d417 > src/java/org/apache/sqoop/manager/ConnManager.java > d7d6279a17c72c2d65a1d6db1539853a8246e143 > src/java/org/apache/sqoop/manager/CubridManager.java > e27f616c2aad60f66e59065354f30985418fef9e > src/java/org/apache/sqoop/manager/Db2Manager.java > 7ff68ce015d8db0a9b3b9a627ad75e94e2bf51c2 > src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java > c05e1c191fa071ac3f80f3d9316e83c0c99716ec > src/java/org/apache/sqoop/manager/MainframeManager.java > a6002ef47e604e029e3f1197ad8282bb48953c53 > src/java/org/apache/sqoop/manager/MySQLManager.java > 2d177071204f6c62c0862c9df33debed2184e034 > src/java/org/apache/sqoop/manager/OracleManager.java > b7005d467557df682a0045c1ebbb1c1efe41099a > src/java/org/apache/sqoop/manager/SQLServerManager.java > d57a4935d465e7b75228475e2078e580fd88e92e > src/java/org/apache/sqoop/manager/SqlManager.java > 4572098831e1482d32979957f4a4406c087cfc1c > src/java/org/apache/sqoop/manager/oracle/OraOopConnManager.java > 10524e3a721bd40289ffaeb9368faa7188e8b195 > src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java > a5962ba44282fc3ae48de23860de0992586e549a > src/java/org/apache/sqoop/mapreduce/ImportJobBase.java > fb5d0541fa685d90b267db775f67de4c9f4f1902 > src/java/org/apache/sqoop/mapreduce/JdbcCallExportJob.java > b7eea93611e50e922342ffbe4d566c6aa9a51bb1 > src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java > 37198363580d8ab4ed1fcc287bd2d8a2182c0fad > src/java/org/apache/sqoop/mapreduce/JdbcUpdateExportJob.java > 86069c4619b03a35fc4b902fa943594f68cd4eb9 > src/java/org/apache/sqoop/mapreduce/JdbcUpsertExportJob.java > 9a8c17a98b66f8c57c0f96347b3a17fc922b47d1 > src/java/org/apache/sqoop/mapreduce/MergeJob.java > bb21b64da9a2d296be54657cbd0129636fa0a4c8 > src/java/org/apache/sqoop/mapreduce/MergeParquetReducer.java > caa4f5f760b9c2be604c89937ba7ad0a4bfa99a0 > src/java/org/apache/sqoop/mapreduce/ParquetExportMapper.java > 
2bc0cba1466092b31f2263fd64a7d456177cfb2d > src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java > 35ab495790d5d80b5f9bf8de92a5b61cd0eb6b2e > src/java/org/apache/sqoop/mapreduce/ParquetJob.java > 46047733cce29ae11d227eab79280ed9ee6a84b5 > src/java/org/apache/sqoop/mapreduce/mainframe/MainframeImportJob.java > 7e975c7bbadde0fba5a09798c952be0da7d44ea9 > src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetExportJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactory.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/ParquetMergeJobConfigurator.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java > PRE-CREATION > > src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java > PRE-
Re: Review Request 67086: SQOOP-3324 Document SQOOP-816: Sqoop add support for external Hive tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67086/#review202947 --- Ship it! Ship It! - daniel voros On May 11, 2018, 1:58 p.m., Fero Szabo wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67086/ > --- > > (Updated May 11, 2018, 1:58 p.m.) > > > Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas. > > > Bugs: SQOOP-3324 > https://issues.apache.org/jira/browse/SQOOP-3324 > > > Repository: sqoop-trunk > > > Description > --- > > This is a missing documentation from Sqoop. > > > Diffs > - > > src/docs/man/hive-args.txt 438c1dc4 > src/docs/user/hive-args.txt 75095641 > src/docs/user/hive.txt f8f7c27e > > > Diff: https://reviews.apache.org/r/67086/diff/2/ > > > Testing > --- > > ant docs completed successfully. > > > Thanks, > > Fero Szabo > >
Re: Review Request 67086: SQOOP-3324 Document SQOOP-816: Sqoop add support for external Hive tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67086/#review202923 --- Thanks a lot for all these documentation issues! Are you going through the list of command-line options to see if they're all documented? src/docs/user/hive.txt Lines 115 (patched) <https://reviews.apache.org/r/67086/#comment284983> nit: I'm sure we have this wrong elsewhere too, but I think we should say "switch" or "option" instead of "flag" if it takes an argument. src/docs/user/hive.txt Lines 127 (patched) <https://reviews.apache.org/r/67086/#comment284981> I think this command misses "import --hive-import" after "sqoop". - daniel voros On May 11, 2018, 11:10 a.m., Fero Szabo wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67086/ > --- > > (Updated May 11, 2018, 11:10 a.m.) > > > Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas. > > > Bugs: SQOOP-3324 > https://issues.apache.org/jira/browse/SQOOP-3324 > > > Repository: sqoop-trunk > > > Description > --- > > This is a missing documentation from Sqoop. > > > Diffs > - > > src/docs/man/hive-args.txt 438c1dc4 > src/docs/user/hive-args.txt 75095641 > src/docs/user/hive.txt f8f7c27e > > > Diff: https://reviews.apache.org/r/67086/diff/1/ > > > Testing > --- > > ant docs completed successfully. > > > Thanks, > > Fero Szabo > >
[jira] [Updated] (SQOOP-3313) Remove Kite dependency
[ https://issues.apache.org/jira/browse/SQOOP-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3313: Fix Version/s: 3.0.0 > Remove Kite dependency > -- > > Key: SQOOP-3313 > URL: https://issues.apache.org/jira/browse/SQOOP-3313 > Project: Sqoop > Issue Type: Improvement > Reporter: Daniel Voros > Assignee: Daniel Voros >Priority: Major > Fix For: 3.0.0 > > > Having Kite as a dependency makes it hard to release a version of Sqoop > compatible with Hadoop 3. > For details see discussion on dev list in [this thread|http://example.com] > and also SQOOP-3305. > Let's use this ticket to gather features that need to be > changed/reimplemented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SQOOP-3305) Upgrade to Hadoop 3, Hive 3, and HBase 2
[ https://issues.apache.org/jira/browse/SQOOP-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3305: Fix Version/s: 3.0.0 > Upgrade to Hadoop 3, Hive 3, and HBase 2 > > > Key: SQOOP-3305 > URL: https://issues.apache.org/jira/browse/SQOOP-3305 > Project: Sqoop > Issue Type: Task > Reporter: Daniel Voros > Assignee: Daniel Voros >Priority: Major > Fix For: 3.0.0 > > > To be able to eventually support the latest versions of Hive, HBase and > Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See > https://hadoop.apache.org/docs/r3.0.0/index.html > In this ticket I'll collect the necessary changes to do the upgrade. I'm not > setting a fix version yet, since this might mean a major release and to be > done together with the upgrade of related components. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SQOOP-3322) Version differences between ivy configurations
[ https://issues.apache.org/jira/browse/SQOOP-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3322: Fix Version/s: 3.0.0 > Version differences between ivy configurations > -- > > Key: SQOOP-3322 > URL: https://issues.apache.org/jira/browse/SQOOP-3322 > Project: Sqoop > Issue Type: Bug > Components: build >Affects Versions: 1.4.7 >Reporter: Daniel Voros > Assignee: Daniel Voros >Priority: Minor > Fix For: 3.0.0 > > > We have multiple ivy configurations defined in ivy.xml. > - The {{redist}} configuration is used to select the artifacts that need to > be distributed with Sqoop in its tar.gz. > - The {{common}} configuration is used to set the classpath during > compilation (also refered to as 'hadoop classpath') > - The {{test}} configuration is used to set the classpath during junit > execution. It extends the {{common}} config. > Some artifacts end up having different versions between these three > configurations, which means we're using different versions during > compilation/testing/runtime. > Differences: > ||Artifact||redist||common (compilation)||test|| > |commons-pool|not in redist|1.5.4|*1.6*| > |commons-codec|1.4|1.9|*1.9*| > |commons-io|1.4|2.4|*2.4*| > |commons-logging|1.1.1|1.2|*1.2*| > |slf4j-api|1.6.1|1.7.7|*1.7.7*| > I'd suggest using the version *in bold* in all three configurations to use > the latest versions. > To achieve this we should exclude these artifacts from the transitive > dependencies and define them explicitly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
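[Editor's sketch] The fix proposed in SQOOP-3322 — exclude the drifting artifacts from transitive dependencies and declare them explicitly — could look roughly like this ivy.xml fragment. The artifact and configuration mapping shown are illustrative assumptions, not the actual Sqoop ivy.xml; transitive copies of the same module would additionally need to be excluded from the dependencies that pull them in.

```xml
<!-- Illustrative only: declare the chosen version once, mapped to all
     three configurations, so compile (common), test, and redist stay
     in sync instead of resolving different revs transitively. -->
<dependency org="commons-io" name="commons-io" rev="2.4"
            conf="common->default;redist->default;test->default"/>
```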
[jira] [Updated] (SQOOP-3323) Use beeline in (non-JDBC) Hive imports
[ https://issues.apache.org/jira/browse/SQOOP-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3323: Affects Version/s: 3.0.0 Fix Version/s: 3.0.0 Thank you! > Use beeline in (non-JDBC) Hive imports > -- > > Key: SQOOP-3323 > URL: https://issues.apache.org/jira/browse/SQOOP-3323 > Project: Sqoop > Issue Type: Improvement > Components: hive-integration >Affects Versions: 3.0.0 >Reporter: Daniel Voros > Assignee: Daniel Voros >Priority: Major > Fix For: 3.0.0 > > > When doing Hive imports the old way (not via JDBC that was introduced in > SQOOP-3309) we're trying to use the {{CliDriver}} class from Hive and fall > back to the {{hive}} executable (a.k.a. [Hive > Cli|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli]) if > that class is not found. > Since {{CliDriver}} and the {{hive}} executable that's relying on it are > [deprecated|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] > (see also HIVE-10511), we should switch to using {{beeline}} to talk to > Hive. With recent additions (e.g. HIVE-18963) this should be easier than > before. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3321) TestHiveImport is failing on Jenkins
[ https://issues.apache.org/jira/browse/SQOOP-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470349#comment-16470349 ] Daniel Voros commented on SQOOP-3321: - Thank you [~fero]! I've attached a patch on the RB. > TestHiveImport is failing on Jenkins > > > Key: SQOOP-3321 > URL: https://issues.apache.org/jira/browse/SQOOP-3321 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.7 >Reporter: Boglarka Egyed >Priority: Major > Attachments: TEST-org.apache.sqoop.hive.TestHiveImport.txt > > > org.apache.sqoop.hive.TestHiveImport is failing since > [SQOOP-3318|https://reviews.apache.org/r/66761/bugs/SQOOP-3318/] has been > committed. This test seem to be failing only in the Jenkins environment as it > pass on several local machines. There can be some difference in the > filesystem which may cause this issue, it shall be investigated. I am > attaching the log from a failed run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 67057: TestHiveImport is failing on Jenkins
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67057/ --- Review request for Sqoop. Bugs: SQOOP-3321 https://issues.apache.org/jira/browse/SQOOP-3321 Repository: sqoop-trunk Description --- I believe this is due to case sensitivity of file names in Linux (as opposed to MacOS). The table name gets converted to lowercase when importing but we're referring to it with its original casing when trying to verify its contents in ParquetReader. Tests are passing after converting these three table names to all lowercase in TestHiveImport: - APPEND_HIVE_IMPORT_AS_PARQUET - NORMAL_HIVE_IMPORT_AS_PARQUET - CREATE_OVERWRITE_HIVE_IMPORT_AS_PARQUET Diffs - src/test/org/apache/sqoop/hive/TestHiveImport.java bc19b697 Diff: https://reviews.apache.org/r/67057/diff/1/ Testing --- Ran TestHiveImport. Thanks, daniel voros
[jira] [Commented] (SQOOP-3323) Use beeline in (non-JDBC) Hive imports
[ https://issues.apache.org/jira/browse/SQOOP-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470324#comment-16470324 ] Daniel Voros commented on SQOOP-3323: - [~BoglarkaEgyed] could you please create a new "Fix version/affected version" to be able to mark this (and a few other tickets) as only necessary in the next major release (3.0?)? > Use beeline in (non-JDBC) Hive imports > -- > > Key: SQOOP-3323 > URL: https://issues.apache.org/jira/browse/SQOOP-3323 > Project: Sqoop > Issue Type: Improvement > Components: hive-integration >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Major > > When doing Hive imports the old way (not via JDBC that was introduced in > SQOOP-3309) we're trying to use the {{CliDriver}} class from Hive and fall > back to the {{hive}} executable (a.k.a. [Hive > Cli|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli]) if > that class is not found. > Since {{CliDriver}} and the {{hive}} executable that's relying on it are > [deprecated|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] > (see also HIVE-10511), we should switch to using {{beeline}} to talk to > Hive. With recent additions (e.g. HIVE-18963) this should be easier than > before. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SQOOP-3323) Use beeline in (non-JDBC) Hive imports
Daniel Voros created SQOOP-3323: --- Summary: Use beeline in (non-JDBC) Hive imports Key: SQOOP-3323 URL: https://issues.apache.org/jira/browse/SQOOP-3323 Project: Sqoop Issue Type: Improvement Components: hive-integration Reporter: Daniel Voros Assignee: Daniel Voros When doing Hive imports the old way (not via JDBC that was introduced in SQOOP-3309) we're trying to use the {{CliDriver}} class from Hive and fall back to the {{hive}} executable (a.k.a. [Hive Cli|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli]) if that class is not found. Since {{CliDriver}} and the {{hive}} executable that's relying on it are [deprecated|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] (see also HIVE-10511), we should switch to using {{beeline}} to talk to Hive. With recent additions (e.g. HIVE-18963) this should be easier than before. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3322) Version differences between ivy configurations
[ https://issues.apache.org/jira/browse/SQOOP-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467438#comment-16467438 ] Daniel Voros commented on SQOOP-3322: - Attaching review request. > Version differences between ivy configurations > -- > > Key: SQOOP-3322 > URL: https://issues.apache.org/jira/browse/SQOOP-3322 > Project: Sqoop > Issue Type: Bug > Components: build >Affects Versions: 1.4.7 >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Minor > > We have multiple ivy configurations defined in ivy.xml. > - The {{redist}} configuration is used to select the artifacts that need to > be distributed with Sqoop in its tar.gz. > - The {{common}} configuration is used to set the classpath during > compilation (also referred to as 'hadoop classpath') > - The {{test}} configuration is used to set the classpath during junit > execution. It extends the {{common}} config. > Some artifacts end up having different versions between these three > configurations, which means we're using different versions during > compilation/testing/runtime. > Differences: > ||Artifact||redist||common (compilation)||test|| > |commons-pool|not in redist|1.5.4|*1.6*| > |commons-codec|1.4|1.9|*1.9*| > |commons-io|1.4|2.4|*2.4*| > |commons-logging|1.1.1|1.2|*1.2*| > |slf4j-api|1.6.1|1.7.7|*1.7.7*| > I'd suggest using the version *in bold* in all three configurations so that we > use the latest versions. > To achieve this we should exclude these artifacts from the transitive > dependencies and define them explicitly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 67005: Version differences between ivy configurations
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67005/ --- Review request for Sqoop. Bugs: SQOOP-3322 https://issues.apache.org/jira/browse/SQOOP-3322 Repository: sqoop-trunk Description --- We have multiple ivy configurations defined in ivy.xml. - The `redist` configuration is used to select the artifacts that need to be distributed with Sqoop in its tar.gz. - The `common` configuration is used to set the classpath during compilation (also referred to as 'hadoop classpath') - The `test` configuration is used to set the classpath during junit execution. It extends the `common` config. Some artifacts end up having different versions between these three configurations, which means we're using different versions during compilation/testing/runtime. Diffs - ivy.xml 6af94d9d ivy/libraries.properties c44b50bc Diff: https://reviews.apache.org/r/67005/diff/1/ Testing --- - compared the results of the ivy-resolve-hadoop, ivy-resolve-test, and ivy-resolve-redist tasks to make sure versions are the same - checked unit tests just to be on the safe side, though test versions weren't changed (all passed apart from known issues in SQOOP-3321) Thanks, daniel voros
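The exclude-then-pin approach described in the ticket could look roughly like the ivy.xml fragment below. This is only an illustrative sketch, not the actual SQOOP-3322 diff: the hadoop-common module shown as the transitive source and the exact conf mappings are assumptions.

```xml
<!-- Illustrative sketch only, not the actual SQOOP-3322 patch. -->
<!-- 1. Exclude the artifact wherever it would arrive transitively... -->
<dependency org="org.apache.hadoop" name="hadoop-common"
            rev="${hadoop.version}" conf="common->default">
  <exclude org="commons-io" module="commons-io"/>
</dependency>

<!-- 2. ...then declare it explicitly once, mapped to all three
     configurations, so redist/common/test all resolve the same version. -->
<dependency org="commons-io" name="commons-io" rev="1.4"
            conf="common->default;redist->default;test->default"/>
```

With this shape, a version bump is a single edit (ideally a property in ivy/libraries.properties) instead of three diverging transitive resolutions.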
[jira] [Commented] (SQOOP-3322) Version differences between ivy configurations
[ https://issues.apache.org/jira/browse/SQOOP-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467355#comment-16467355 ] Daniel Voros commented on SQOOP-3322: - One more thing I'd include in this ticket is bumping (more precisely, defining explicitly rather than getting via transitive dependencies) the jackson-databind version from 2.3.1 to 2.9.5, which isn't affected by CVE-2017-7525. > Version differences between ivy configurations > -- > > Key: SQOOP-3322 > URL: https://issues.apache.org/jira/browse/SQOOP-3322 > Project: Sqoop > Issue Type: Bug > Components: build >Affects Versions: 1.4.7 >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Minor > > We have multiple ivy configurations defined in ivy.xml. > - The {{redist}} configuration is used to select the artifacts that need to > be distributed with Sqoop in its tar.gz. > - The {{common}} configuration is used to set the classpath during > compilation (also referred to as 'hadoop classpath') > - The {{test}} configuration is used to set the classpath during junit > execution. It extends the {{common}} config. > Some artifacts end up having different versions between these three > configurations, which means we're using different versions during > compilation/testing/runtime. > Differences: > ||Artifact||redist||common (compilation)||test|| > |commons-pool|not in redist|1.5.4|*1.6*| > |commons-codec|1.4|1.9|*1.9*| > |commons-io|1.4|2.4|*2.4*| > |commons-logging|1.1.1|1.2|*1.2*| > |slf4j-api|1.6.1|1.7.7|*1.7.7*| > I'd suggest using the version *in bold* in all three configurations to use > the latest versions. > To achieve this we should exclude these artifacts from the transitive > dependencies and define them explicitly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (SQOOP-3322) Version differences between ivy configurations
[ https://issues.apache.org/jira/browse/SQOOP-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3322: Description: We have multiple ivy configurations defined in ivy.xml. - The {{redist}} configuration is used to select the artifacts that need to be distributed with Sqoop in its tar.gz. - The {{common}} configuration is used to set the classpath during compilation (also referred to as 'hadoop classpath') - The {{test}} configuration is used to set the classpath during junit execution. It extends the {{common}} config. Some artifacts end up having different versions between these three configurations, which means we're using different versions during compilation/testing/runtime. Differences: ||Artifact||redist||common (compilation)||test|| |commons-pool|not in redist|1.5.4|*1.6*| |commons-codec|1.4|1.9|*1.9*| |commons-io|1.4|2.4|*2.4*| |commons-logging|1.1.1|1.2|*1.2*| |slf4j-api|1.6.1|1.7.7|*1.7.7*| I'd suggest using the version *in bold* in all three configurations to use the latest versions. To achieve this we should exclude these artifacts from the transitive dependencies and define them explicitly. was: We have multiple ivy configurations defined in ivy.xml. - The {{redist}} configuration is used to select the artifacts that need to be distributed with Sqoop in its tar.gz. - The {{common}} configuration is used to set the classpath during compilation (also referred to as 'hadoop classpath') - The {{test}} configuration is used to set the classpath during junit execution. It extends the {{common}} config. Some artifacts end up having different versions between these three configurations, which means we're using different versions during compilation/testing/runtime. 
Differences: ||Artifact||redist||common (compilation)||test|| |commons-pool|not in redist|1.5.4|*1.6*| |commons-codec|*1.4*|1.9|1.9| |commons-io|*1.4*|2.4|2.4| |commons-logging|*1.1.1*|1.2|1.2| |slf4j-api|*1.6.1*|1.7.7|1.7.7| I'd suggest using the version *in bold* in all three configurations, based on: - keep the version from redist (where there is one), since that's the version we were shipping with and used in production - keep the latest version in the case of commons-pool, which is not part of the redist config To achieve this we should exclude these artifacts from the transitive dependencies and define them explicitly. Thanks for commenting [~vasas], I agree! I've updated the description. > Version differences between ivy configurations > -- > > Key: SQOOP-3322 > URL: https://issues.apache.org/jira/browse/SQOOP-3322 > Project: Sqoop > Issue Type: Bug > Components: build >Affects Versions: 1.4.7 >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Minor > > We have multiple ivy configurations defined in ivy.xml. > - The {{redist}} configuration is used to select the artifacts that need to > be distributed with Sqoop in its tar.gz. > - The {{common}} configuration is used to set the classpath during > compilation (also referred to as 'hadoop classpath') > - The {{test}} configuration is used to set the classpath during junit > execution. It extends the {{common}} config. > Some artifacts end up having different versions between these three > configurations, which means we're using different versions during > compilation/testing/runtime. > Differences: > ||Artifact||redist||common (compilation)||test|| > |commons-pool|not in redist|1.5.4|*1.6*| > |commons-codec|1.4|1.9|*1.9*| > |commons-io|1.4|2.4|*2.4*| > |commons-logging|1.1.1|1.2|*1.2*| > |slf4j-api|1.6.1|1.7.7|*1.7.7*| > I'd suggest using the version *in bold* in all three configurations to use > the latest versions. 
> To achieve this we should exclude these artifacts from the transitive > dependencies and define them explicitly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SQOOP-3322) Version differences between ivy configurations
Daniel Voros created SQOOP-3322: --- Summary: Version differences between ivy configurations Key: SQOOP-3322 URL: https://issues.apache.org/jira/browse/SQOOP-3322 Project: Sqoop Issue Type: Bug Components: build Affects Versions: 1.4.7 Reporter: Daniel Voros Assignee: Daniel Voros We have multiple ivy configurations defined in ivy.xml. - The {{redist}} configuration is used to select the artifacts that need to be distributed with Sqoop in its tar.gz. - The {{common}} configuration is used to set the classpath during compilation (also referred to as 'hadoop classpath') - The {{test}} configuration is used to set the classpath during junit execution. It extends the {{common}} config. Some artifacts end up having different versions between these three configurations, which means we're using different versions during compilation/testing/runtime. Differences: ||Artifact||redist||common (compilation)||test|| |commons-pool|not in redist|1.5.4|*1.6*| |commons-codec|*1.4*|1.9|1.9| |commons-io|*1.4*|2.4|2.4| |commons-logging|*1.1.1*|1.2|1.2| |slf4j-api|*1.6.1*|1.7.7|1.7.7| I'd suggest using the version *in bold* in all three configurations, based on: - keep the version from redist (where there is one), since that's the version we were shipping with and used in production - keep the latest version in the case of commons-pool, which is not part of the redist config To achieve this we should exclude these artifacts from the transitive dependencies and define them explicitly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3317) org.apache.sqoop.validation.RowCountValidator in live RDBMS system
[ https://issues.apache.org/jira/browse/SQOOP-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463723#comment-16463723 ] Daniel Voros commented on SQOOP-3317: - Hi [~srikumaran.t], thank you for reporting this! As far as I can tell, currently the only option for validation is to check for an exact match for the number of records. "Percentage tolerant" validation was only mentioned in the documentation but is not implemented. In my opinion this kind of validation (comparing the number of records) doesn't make much sense and should only be used as a sanity check, since it doesn't guarantee the equality of the contents. However, we could improve the existing implementation by introducing another parameter (margin/threshold) so an exact match is not required, and we could also implement "Percentage tolerant". > org.apache.sqoop.validation.RowCountValidator in live RDBMS system > -- > > Key: SQOOP-3317 > URL: https://issues.apache.org/jira/browse/SQOOP-3317 > Project: Sqoop > Issue Type: Bug >Reporter: Sri Kumaran Thirupathy >Priority: Major > > org.apache.sqoop.validation.RowCountValidator is retrieving the count from the Source > after the MR completes. This fails in a live RDBMS case. > org.apache.sqoop.validation.RowCountValidator can retrieve the count during the MR > execution phase. > Also, how to use Percentage Tolerant? Reference: > [https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
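The margin/threshold idea suggested in the comment could be sketched as follows. This is a hypothetical helper, not existing Sqoop code: today's org.apache.sqoop.validation.RowCountValidator only does an exact comparison, and the percentage parameter shown here is an assumption.

```java
// Hypothetical sketch of a percentage-tolerant row-count check.
// Not part of Sqoop; RowCountValidator currently requires an exact match.
public class TolerantRowCountCheck {

    // Returns true when target is within `pct` percent of source.
    static boolean withinTolerance(long source, long target, double pct) {
        if (source == 0) {
            return target == 0;
        }
        double diffPct = Math.abs(source - target) * 100.0 / source;
        return diffPct <= pct;
    }

    public static void main(String[] args) {
        // 995 of 1000 rows is a 0.5% difference: passes a 1% tolerance.
        System.out.println(withinTolerance(1000, 995, 1.0));
        // 900 of 1000 rows is a 10% difference: fails a 1% tolerance.
        System.out.println(withinTolerance(1000, 900, 1.0));
    }
}
```

As the comment notes, even a tolerant count check remains a sanity check only; matching counts say nothing about the contents being equal.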
[jira] [Commented] (SQOOP-3321) TestHiveImport is failing on Jenkins
[ https://issues.apache.org/jira/browse/SQOOP-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463608#comment-16463608 ] Daniel Voros commented on SQOOP-3321: - [~BoglarkaEgyed] this is failing for me on Linux as well. I believe this is due to case sensitivity of file names there (as opposed to macOS). The table name gets converted to lowercase when importing, but we're referring to it with its original casing when trying to verify its contents in {{ParquetReader}}. Tests are passing after converting these three table names to all lowercase in TestHiveImport: - APPEND_HIVE_IMPORT_AS_PARQUET - NORMAL_HIVE_IMPORT_AS_PARQUET - CREATE_OVERWRITE_HIVE_IMPORT_AS_PARQUET Since SQOOP-3318 only changed the tests, I think we should adapt to the lowercase names in the tests too. The easiest solution would be to use lowercase names. What do you think [~vasas]? > TestHiveImport is failing on Jenkins > > > Key: SQOOP-3321 > URL: https://issues.apache.org/jira/browse/SQOOP-3321 > Project: Sqoop > Issue Type: Bug >Affects Versions: 1.4.7 >Reporter: Boglarka Egyed >Priority: Major > Attachments: TEST-org.apache.sqoop.hive.TestHiveImport.txt > > > org.apache.sqoop.hive.TestHiveImport is failing since > [SQOOP-3318|https://reviews.apache.org/r/66761/bugs/SQOOP-3318/] has been > committed. This test seems to be failing only in the Jenkins environment, as it > passes on several local machines. There may be some difference in the > filesystem causing this issue; it should be investigated. I am > attaching the log from a failed run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
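The case-sensitivity issue described above boils down to a plain string comparison, which is essentially what a file lookup does on each platform. A minimal standalone illustration (not Sqoop code; the constant name is one of the test table names mentioned in the comment):

```java
// Why the Parquet tests pass on macOS but fail on Linux: Hive writes the
// warehouse directory with the lowercased table name, so only a
// case-insensitive file system matches the original constant.
public class CaseSensitivityDemo {
    public static void main(String[] args) {
        String declared = "NORMAL_HIVE_IMPORT_AS_PARQUET"; // name in the test
        String onDisk = declared.toLowerCase();            // name Hive creates
        System.out.println(declared.equals(onDisk));           // Linux-style lookup
        System.out.println(declared.equalsIgnoreCase(onDisk)); // macOS-style lookup
    }
}
```

The exact match fails while the case-insensitive one succeeds, which matches the observed Jenkins-vs-local behavior.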
Re: Review Request 66548: Importing as ORC file to support full ACID Hive tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66548/ --- (Updated May 2, 2018, 12:12 p.m.) Review request for Sqoop. Changes --- Patch #6 fixes `TestOrcImport#testDatetimeTypeOverrides` (fixed timezone). Bugs: SQOOP-3311 https://issues.apache.org/jira/browse/SQOOP-3311 Repository: sqoop-trunk Description --- Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID by default. This will probably result in increased usage of ACID tables and the need to support importing into ACID tables with Sqoop. Currently the only table format supporting full ACID tables is ORC. The easiest and most effective way to support importing into these tables would be to write out files as ORC and keep using LOAD DATA as we do for all other Hive tables (supported since HIVE-17361). A workaround could be to create the table as textfile (as before) and then CTAS from that. This would push the responsibility of creating the ORC format to Hive. However, it would result in writing every record twice: in text format and in ORC. Note that ORC is only necessary for full ACID tables; insert-only (a.k.a. micromanaged) ACID tables can use an arbitrary file format. Supporting full ACID tables would also be the first step in making "lastmodified" incremental imports work with Hive. 
Diffs (updated) - ivy.xml 6af94d9d ivy/libraries.properties c44b50bc src/java/org/apache/sqoop/SqoopOptions.java d9984af3 src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba4 src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION src/java/org/apache/sqoop/tool/BaseSqoopTool.java 783651a4 src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 src/java/org/apache/sqoop/tool/ImportTool.java ee79d8b7 src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION src/test/org/apache/sqoop/TestAllTables.java 56d1f577 src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f64 src/test/org/apache/sqoop/util/TestOrcConversionContext.java PRE-CREATION src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION Diff: https://reviews.apache.org/r/66548/diff/6/ Changes: https://reviews.apache.org/r/66548/diff/5-6/ Testing --- - added some unit tests - tested basic Hive import scenarios on a cluster Thanks, daniel voros
Re: Review Request 66067: SQOOP-3052: Introduce gradle-based build for Sqoop to make it more developer friendly / open
blob/72c5cd717e3fad6d5f5a3a2b3d185ffbacd876cf/ivy.xml#L118). 4) SqoopVersion.java is now included. I think it makes sense to keep it. Any objections? Regards, Daniel - daniel voros On April 24, 2018, 2:23 p.m., Anna Szonyi wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66067/ > --- > > (Updated April 24, 2018, 2:23 p.m.) > > > Review request for Sqoop. > > > Bugs: Sqoop-3052 > https://issues.apache.org/jira/browse/Sqoop-3052 > > > Repository: sqoop-trunk > > > Description > --- > > SQOOP-3052: Introduce gradle based build for Sqoop to make it more developer > friendly / open > > > Diffs > - > > .gitignore 68cbe28731e613607c208824443d1edf256d9c8a > COMPILING.txt 3b82250488256871352056e9061ad08fabbd7fc5 > build.gradle PRE-CREATION > config/checkstyle/checkstyle-java-header.txt PRE-CREATION > config/checkstyle/checkstyle-noframes.xsl PRE-CREATION > config/checkstyle/checkstyle.xml PRE-CREATION > gradle.properties PRE-CREATION > gradle/customUnixStartScript.txt PRE-CREATION > gradle/customWindowsStartScript.txt PRE-CREATION > gradle/sqoop-package.gradle PRE-CREATION > gradle/sqoop-version-gen.gradle PRE-CREATION > gradle/wrapper/gradle-wrapper.jar PRE-CREATION > gradle/wrapper/gradle-wrapper.properties PRE-CREATION > gradlew PRE-CREATION > gradlew.bat PRE-CREATION > settings.gradle PRE-CREATION > src/scripts/rat-violations.sh 1cfbc1502b24dd1b8b7e7ce21f0b5d1880c06556 > testdata/hcatalog/conf/hive-site.xml > edac7aa9087a84b7a0c660907794adae684ae313 > > > Diff: https://reviews.apache.org/r/66067/diff/10/ > > > Testing > --- > > ran all new tasks, except for internal maven publishing > > Notes: > - To try it out you can call ./gradlew tasks --all to see all the tasks and > compare them to current tasks/artifacts. > - Replaced cobertura with jacoco, as it's easier/cleaner to configure, easier > to combine all test results into a single report. 
> - Generated pom.xml now has correct dependencies/versions > - Script generation is currently hardcoded and not based on sqoop help, as > previously - though added the possibility of hooking it in later > > > Thanks, > > Anna Szonyi > >
Re: Review Request 66761: SQOOP-3318: Remove Kite dependency from test cases
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66761/#review201737 --- Ship it! Great stuff! Do you think we'll need ParquetReader in production code when removing Kite from the rest of the codebase? If we will, then it probably makes sense to move it under src/java now. - daniel voros On April 23, 2018, 12:21 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66761/ > --- > > (Updated April 23, 2018, 12:21 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3318 > https://issues.apache.org/jira/browse/SQOOP-3318 > > > Repository: sqoop-trunk > > > Description > --- > > Some Sqoop tests use Kite to create test data and verify test results. > > Since we want to remove the Kite dependency from Sqoop we should rewrite > these test cases not to use Kite anymore. > > > Diffs > - > > src/java/org/apache/sqoop/util/FileSystemUtil.java 1493e0954 > src/test/org/apache/sqoop/TestAllTables.java 56d1f5772 > src/test/org/apache/sqoop/TestMerge.java 8eef8d4ac > src/test/org/apache/sqoop/TestParquetExport.java c8bb663e0 > src/test/org/apache/sqoop/TestParquetImport.java 379529a8d > src/test/org/apache/sqoop/hive/TestHiveImport.java 4e1f249a8 > src/test/org/apache/sqoop/util/ParquetReader.java PRE-CREATION > > > Diff: https://reviews.apache.org/r/66761/diff/1/ > > > Testing > --- > > Executed unit and third party tests. > > > Thanks, > > Szabolcs Vasas > >
[jira] [Commented] (SQOOP-3314) Sqoop doesn't display full log on console
[ https://issues.apache.org/jira/browse/SQOOP-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440999#comment-16440999 ] Daniel Voros commented on SQOOP-3314: - Hi [~shailu.lahar], thank you for reporting this! The {{... 19 more}} part is referring to the previous lines above; it's not truncated. I'm afraid {{--verbose}} is your best bet. The "method specified in wallet_location is not supported" message suggests you have misconfigured your Oracle wallet. Could you please confirm whether it's working outside of Sqoop? > Sqoop doesn't display full log on console > - > > Key: SQOOP-3314 > URL: https://issues.apache.org/jira/browse/SQOOP-3314 > Project: Sqoop > Issue Type: Bug >Reporter: Shailesh Lahariya >Priority: Major > > I am running a sqoop command (using sqoop 1.4.7) and getting an error. I can't > see the full error; > it seems some of the useful information is not being displayed on the console, > for ex. instead of ...19 more in the log below, it should give the > complete message to help debug the issue. 
> > > > 18/04/17 01:59:12 WARN tool.EvalSqlTool: SQL exception executing statement: > java.sql.SQLRecoverableException: IO Error: The Network Adapter could not > establish the connection > at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:774) > at > oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:688) > at > oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:39) > at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:691) > at java.sql.DriverManager.getConnection(DriverManager.java:664) > at java.sql.DriverManager.getConnection(DriverManager.java:247) > at > org.apache.sqoop.manager.OracleManager.makeConnection(OracleManager.java:329) > at > org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:59) > at org.apache.sqoop.tool.EvalSqlTool.run(EvalSqlTool.java:64) > at org.apache.sqoop.Sqoop.run(Sqoop.java:147) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183) > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234) > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243) > at org.apache.sqoop.Sqoop.main(Sqoop.java:252) > Caused by: oracle.net.ns.NetException: The Network Adapter could not > establish the connection > at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:523) > at > oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:521) > at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:660) > at oracle.net.ns.NSProtocol.connect(NSProtocol.java:286) > at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1438) > at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:518) > ... 14 more > Caused by: oracle.net.ns.NetException: The method specified in > wallet_location is not supported. 
Location: /home/hadoop/wallet/jnetadmin_c > at > oracle.net.nt.CustomSSLSocketFactory.getSSLSocketEngine(CustomSSLSocketFactory.java:487) > at oracle.net.nt.TcpsNTAdapter.connect(TcpsNTAdapter.java:143) > at oracle.net.nt.ConnOption.connect(ConnOption.java:161) > at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:470) > ... 19 more > > > Also, sharing the command that is producing the above error (altered it to > remove any confidential info)- > > sqoop eval -D mapred.map.child.java.opts='-Doracle.net.tns_admin=. > -Doracle.net.wallet_location=.' -files > /home/hadoop/wallet/jnetadmin_c/ewallet.jks,/home/hadoop/wallet/jnetadmin_c/ewallet.jks,$HOME/wallet/sqlnet.ora,$HOME/wallet/tnsnames.ora > --username xx --password xx --connect > "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=xx)(PORT=2484))(CONNECT_DATA=(SERVICE_NAME=xx)))" > --query "select 1 from dual" --verbose --throw-on-error > > Please let me know if there is any option to get more log than it is > producing currently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66361: Implement HiveServer2 client
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66361/#review201202 --- Ship it! Ship It! - daniel voros On April 16, 2018, 9:12 a.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66361/ > --- > > (Updated April 16, 2018, 9:12 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3309 > https://issues.apache.org/jira/browse/SQOOP-3309 > > > Repository: sqoop-trunk > > > Description > --- > > This JIRA covers the implementation of the client for HiveServer2 and its > integration into the classes which use HiveImport. > > - HiveClient interface is introduced with 2 implementation: > - HiveImport: this is the original implementation which uses HiveCLI > - HiveServer2Client: the new clients which connects to HS2 using JDBC > connection > - The common code is extracted to HiveCommon class > - HiveClient should be instantiated using HiveClientFactory which creates and > configures the right HiveClient based on the configuration in SqoopOptions > - HiveMiniCluster is introduced with a couple of helper classes to enable > end-to-end HS2 tests > - A couple of new options are added to SqoopOptions to be able to configure > the connection to HS2 > - Validation is implemented for these new options > > > Diffs > - > > build.xml 7f68b573c65a61150ca78d158084586c87775d84 > ivy.xml 6be4fa20fbbf1f303c69d86942b1874e18a14afc > src/docs/user/hive-args.txt 441f54e8e0cee63595937f4e1811abc2d89f9237 > src/docs/user/hive.txt 3dc8bb463d602d525fe5f2d07d52cb97efcbab7e > src/java/org/apache/sqoop/SqoopOptions.java > 651cebd69ee7e75d06c75945e3607c4fab7eb11c > src/java/org/apache/sqoop/hive/HiveClient.java PRE-CREATION > src/java/org/apache/sqoop/hive/HiveClientCommon.java PRE-CREATION > src/java/org/apache/sqoop/hive/HiveClientFactory.java PRE-CREATION > src/java/org/apache/sqoop/hive/HiveImport.java > c2729119d31f7e585f204f2d31b2051eea71b72b > 
src/java/org/apache/sqoop/hive/HiveServer2Client.java PRE-CREATION > src/java/org/apache/sqoop/hive/HiveServer2ConnectionFactory.java > PRE-CREATION > src/java/org/apache/sqoop/hive/TableDefWriter.java > b7a25b7809e0d50166966a77161dc8ff603fb2d2 > src/java/org/apache/sqoop/tool/BaseSqoopTool.java > b02e4fe7fda25c7f8171c7db17d15a7987459687 > src/java/org/apache/sqoop/tool/CreateHiveTableTool.java > d259566180369a55d490144e6f865e728f4f2e61 > src/java/org/apache/sqoop/tool/ImportAllTablesTool.java > 18f7a0af48d972d5186e9414475e080f1eb765f3 > src/java/org/apache/sqoop/tool/ImportTool.java > e9920058858653bec7407bf7992eb6445401e813 > src/test/org/apache/sqoop/hive/TestHiveClientFactory.java PRE-CREATION > src/test/org/apache/sqoop/hive/TestHiveMiniCluster.java PRE-CREATION > src/test/org/apache/sqoop/hive/TestHiveServer2Client.java PRE-CREATION > src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java PRE-CREATION > src/test/org/apache/sqoop/hive/TestTableDefWriter.java > 8bdc3beb3677312ec0ee2e612616358bca4ca838 > src/test/org/apache/sqoop/hive/minicluster/AuthenticationConfiguration.java > PRE-CREATION > src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java > PRE-CREATION > > src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java > PRE-CREATION > > src/test/org/apache/sqoop/hive/minicluster/NoAuthenticationConfiguration.java > PRE-CREATION > > src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java > PRE-CREATION > src/test/org/apache/sqoop/testutil/HiveServer2TestUtil.java PRE-CREATION > src/test/org/apache/sqoop/tool/TestHiveServer2OptionValidations.java > PRE-CREATION > src/test/org/apache/sqoop/tool/TestImportTool.java > 1c0cf4d863692f75bb8831e834fae47fc18b5df5 > > > Diff: https://reviews.apache.org/r/66361/diff/5/ > > > Testing > --- > > Ran unit and third party tests suite. > > > Thanks, > > Szabolcs Vasas > >
[jira] [Created] (SQOOP-3313) Remove Kite dependency
Daniel Voros created SQOOP-3313: --- Summary: Remove Kite dependency Key: SQOOP-3313 URL: https://issues.apache.org/jira/browse/SQOOP-3313 Project: Sqoop Issue Type: Improvement Reporter: Daniel Voros Assignee: Daniel Voros Having Kite as a dependency makes it hard to release a version of Sqoop compatible with Hadoop 3. For details see discussion on dev list in [this thread|http://example.com] and also SQOOP-3305. Let's use this ticket to gather features that need to be changed/reimplemented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3312) Can not export column data named `value` from hive to mysql
[ https://issues.apache.org/jira/browse/SQOOP-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439047#comment-16439047 ] Daniel Voros commented on SQOOP-3312: - [~zimmem] I think this is the same as SQOOP-3038, which was fixed in 1.4.7. Could you please check if you see the issue with 1.4.7? > Can not export column data named `value` from hive to mysql > --- > > Key: SQOOP-3312 > URL: https://issues.apache.org/jira/browse/SQOOP-3312 > Project: Sqoop > Issue Type: Bug > Components: tools >Affects Versions: 1.4.6 >Reporter: zimmem zhuang >Priority: Critical > > the hive table > {code:java} > CREATE TABLE if not exists `test_table`( > `id` bigint, > `value` double) > STORED AS parquet > {code} > the mysql table > {code:java} > CREATE TABLE if not exists `test_table`( > `id` bigint, > `value` double); > {code} > the export command > > {code:java} > sqoop export --connect "${jdbc_connect_url}" --username test --password *** > --table test_table --columns id,value --hcatalog-database default > --hcatalog-table test_table > {code} > The `value` column will be null after running the command above. But if I > change the column name to `value_x` (both hive and mysql), it works correctly. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
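For context, `value` is a keyword in MySQL, so one classic failure mode for generated statements is leaving such column names unquoted. Whether or not that is the exact root cause fixed by SQOOP-3038, defensive identifier quoting looks like the hypothetical helper below (not Sqoop's actual code generator):

```java
// Hypothetical helper showing MySQL-style identifier quoting; not Sqoop code.
public class EscapeIdentifier {

    // Backtick-quote an identifier, doubling any embedded backticks,
    // per MySQL's quoting rules.
    static String escapeMysql(String ident) {
        return "`" + ident.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        // A keyword-safe insert statement for the reporter's table layout.
        System.out.println("INSERT INTO test_table ("
                + escapeMysql("id") + ", " + escapeMysql("value")
                + ") VALUES (?, ?)");
    }
}
```

Renaming the column to `value_x`, as the reporter did, sidesteps the keyword entirely, which is consistent with this theory.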
Re: Review Request 66361: Implement HiveServer2 client
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66361/#review201101 --- Hey Szabolcs, I'm trying to run the latest patch on a (non-kerberized) cluster, but I get the following: ``` 18/04/13 13:50:47 INFO hive.HiveServer2ConnectionFactory: Creating connection to HiveServer2 as: hdfs (auth:SIMPLE) 18/04/13 13:50:47 INFO jdbc.Utils: Supplied authorities: hostname:1 18/04/13 13:50:47 INFO jdbc.Utils: Resolved authority: hostname:1 18/04/13 13:50:48 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Error executing Hive import. java.lang.RuntimeException: Error executing Hive import. at org.apache.sqoop.hive.HiveServer2Client.executeHiveImport(HiveServer2Client.java:85) at org.apache.sqoop.hive.HiveServer2Client.importTable(HiveServer2Client.java:63) at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:547) at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:632) at org.apache.sqoop.Sqoop.run(Sqoop.java:145) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:232) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:241) at org.apache.sqoop.Sqoop.main(Sqoop.java:250) ... 
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:17 Invalid path ''hdfs://hostname:8020/user/hdfs/asd'' at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraintsAndGetFiles(LoadSemanticAnalyzer.java:160) at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:225) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1218) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146) ... 26 more Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=anonymous, access=EXECUTE, inode="/user/hdfs/asd":hdfs:hdfs:drwx-- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1950) at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4142) ... ``` Note that according to the log message, we're running as 'hdfs' user (current OS user), but HDFS checks permission for anonymous. Could it be the result of org/apache/sqoop/hive/HiveServer2ConnectionFactory.java:42 (passing username=null)? 
``` public HiveServer2ConnectionFactory(String connectionString) { this(connectionString, null, null); } ``` Also, it might make sense to use the --hs2-user parameter in the non-kerberized case as well. Similarly, beeline allows you to override the user with `-n username`. What do you think? Regards, Daniel - daniel voros On April 12, 2018, 2:10 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66361/ > --- > > (Updated April 12, 2018, 2:10 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3309 > https://issues.apache.org/jira/browse/SQOOP-3309 > > > Repository: sqoop-trunk > > > Description > --- > > This JIRA covers the implementation of the client for HiveServer2 and its > integration into the classes which use HiveImport. > > - HiveClient interface is introduced with 2 implementations: > - HiveImport: this is the original implementation which uses HiveCLI > - HiveServer2Client: the new client which connects to HS2 using JDBC > connection > - The common code is extracted to HiveCommon class > - HiveClient should be instantiated using HiveClientFactory which crea
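One way to address the anonymous-user symptom Daniel describes is to default the username to the current OS user instead of passing null. The class below is an illustrative sketch that mirrors the quoted constructor chain — it is not the actual patch, and the fallback behavior is an assumption:

```java
// Illustrative sketch, not the actual Sqoop class: fall back to the
// current OS user when no username is given, instead of passing null
// (which made HDFS check permissions for "anonymous" in the log above).
class ConnectionFactorySketch {
    private final String connectionString;
    private final String username;
    private final String password;

    ConnectionFactorySketch(String connectionString, String username, String password) {
        this.connectionString = connectionString;
        // Assumed fix: resolve null to the JVM's view of the current OS user.
        this.username = (username != null) ? username : System.getProperty("user.name");
        this.password = password;
    }

    ConnectionFactorySketch(String connectionString) {
        this(connectionString, null, null);
    }

    String getUsername() {
        return username;
    }
}
```

With this fallback, the no-argument-user constructor would resolve to the same user shown in the "Creating connection to HiveServer2 as: hdfs" log line rather than anonymous.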
[jira] [Resolved] (SQOOP-2878) Sqoop import into Hive transactional tables
[ https://issues.apache.org/jira/browse/SQOOP-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros resolved SQOOP-2878. - Resolution: Duplicate See SQOOP-3311. > Sqoop import into Hive transactional tables > --- > > Key: SQOOP-2878 > URL: https://issues.apache.org/jira/browse/SQOOP-2878 > Project: Sqoop > Issue Type: Improvement >Affects Versions: 1.4.6 >Reporter: Rohan More >Priority: Minor > > Hive has introduced support for transactions from version 0.13. For > transactional support, the hive table should be bucketed and should be in ORC > format. > This improvement is to import data directly into hive transactional table > using sqoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-2192) SQOOP IMPORT/EXPORT for the ORC file HIVE TABLE Failing
[ https://issues.apache.org/jira/browse/SQOOP-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436935#comment-16436935 ] Daniel Voros commented on SQOOP-2192: - [~Ankush] please refer to SQOOP-3311 for ORC updates. > SQOOP IMPORT/EXPORT for the ORC file HIVE TABLE Failing > --- > > Key: SQOOP-2192 > URL: https://issues.apache.org/jira/browse/SQOOP-2192 > Project: Sqoop > Issue Type: Bug > Components: hive-integration >Affects Versions: 1.4.5 > Environment: Hadoop 2.6.0 > Hive 1.0.0 > Sqoop 1.4.5 >Reporter: Sunil Kumar >Assignee: Venkat Ranganathan >Priority: Major > > We are trying to export an RDBMS table to a Hive table for running Hive delete and > update queries on the exported Hive table. For Hive to support delete and > update queries, the following is required: > 1. Needs to declare table as having Transaction Property > 2. Table must be in ORC format > 3. Tables must be bucketed > To do that I have created the hive table using hcat: > create table bookinfo(md5 STRING , isbn STRING , bookid STRING , booktitle > STRING , author STRING , yearofpub STRING , publisher STRING , imageurls > STRING , imageurlm STRING , imageurll STRING , price DOUBLE , totalrating > DOUBLE , totalusers BIGINT , maxrating INT , minrating INT , avgrating DOUBLE > , rawscore DOUBLE , norm_score DOUBLE) clustered by (md5) into 10 buckets > stored as orc TBLPROPERTIES('transactional'='true'); > then running sqoop import: > sqoop import --verbose --connect 'RDBMS_JDBC_URL' --driver JDBC_DRIVER > --table bookinfo --null-string '\\N' --null-non-string '\\N' --username USER > --password PASSWORD --hcatalog-database hive_test_trans --hcatalog-table > bookinfo --hcatalog-storage-stanza "stored as orc" -m 1 > The following exception is thrown: > 15/03/09 16:28:59 ERROR tool.ImportTool: Encountered IOException running > import job: org.apache.hive.hcatalog.common.HCatException : 2016 : Error > operation not supported : Store into a partition with bucket definition
from > Pig/Mapreduce is not supported > at > org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:109) > at > org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70) > at > org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:339) > at > org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:753) > at > org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98) > at > org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:240) > at > org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:665) > at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497) > at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601) > at org.apache.sqoop.Sqoop.run(Sqoop.java:143) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) > at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) > at org.apache.sqoop.Sqoop.main(Sqoop.java:236) > Please let me know if any further details are required. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (SQOOP-3311) Importing as ORC file to support full ACID Hive tables
[ https://issues.apache.org/jira/browse/SQOOP-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433797#comment-16433797 ] Daniel Voros commented on SQOOP-3311: - Attached review request. > Importing as ORC file to support full ACID Hive tables > -- > > Key: SQOOP-3311 > URL: https://issues.apache.org/jira/browse/SQOOP-3311 > Project: Sqoop > Issue Type: New Feature > Components: hive-integration > Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Major > > Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID > by default. This will probably result in increased usage of ACID tables and > the need to support importing into ACID tables with Sqoop. > Currently the only table format supporting full ACID tables is ORC. > The easiest and most effective way to support importing into these tables > would be to write out files as ORC and keep using LOAD DATA as we do for all > other Hive tables (supported since HIVE-17361). > Workaround could be to create table as textfile (as before) and then CTAS > from that. This would push the responsibility of creating ORC format to Hive. > However it would result in writing every record twice; in text format and in > ORC. > Note that ORC is only necessary for full ACID tables. Insert-only (aka. > micromanaged) ACID tables can use arbitrary file format. > Supporting full ACID tables would also be the first step in making > "lastmodified" incremental imports work with Hive. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66548: Importing as ORC file to support full ACID Hive tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66548/#review200902 --- Patch #1 is an initial patch that contains the most fundamental changes to support ORC importing. I'll add documentation and extend the tests with third-party tests etc., but wanted to share to get feedback early on. - daniel voros On April 11, 2018, 12:02 p.m., daniel voros wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66548/ > --- > > (Updated April 11, 2018, 12:02 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3311 > https://issues.apache.org/jira/browse/SQOOP-3311 > > > Repository: sqoop-trunk > > > Description > --- > > Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID > by default. This will probably result in increased usage of ACID tables and > the need to support importing into ACID tables with Sqoop. > > Currently the only table format supporting full ACID tables is ORC. > > The easiest and most effective way to support importing into these tables > would be to write out files as ORC and keep using LOAD DATA as we do for all > other Hive tables (supported since HIVE-17361). > > Workaround could be to create table as textfile (as before) and then CTAS > from that. This would push the responsibility of creating ORC format to Hive. > However it would result in writing every record twice; in text format and in > ORC. > > Note that ORC is only necessary for full ACID tables. Insert-only (aka. > micromanaged) ACID tables can use arbitrary file format. > > Supporting full ACID tables would also be the first step in making > "lastmodified" incremental imports work with Hive. 
> > > Diffs > - > > ivy.xml 6be4fa2 > ivy/libraries.properties c44b50b > src/java/org/apache/sqoop/SqoopOptions.java 651cebd > src/java/org/apache/sqoop/hive/TableDefWriter.java b7a25b7 > src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba > src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION > src/java/org/apache/sqoop/tool/BaseSqoopTool.java b02e4fe > src/java/org/apache/sqoop/tool/ExportTool.java 060f2c0 > src/java/org/apache/sqoop/tool/ImportTool.java e992005 > src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION > src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION > src/test/org/apache/sqoop/hive/TestTableDefWriter.java 8bdc3be > src/test/org/apache/sqoop/orm/TestClassWriter.java 0cc07cf > src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION > > > Diff: https://reviews.apache.org/r/66548/diff/1/ > > > Testing > --- > > - added some unit tests > - tested basic Hive import scenarios on a cluster > > > Thanks, > > daniel voros > >
Review Request 66548: Importing as ORC file to support full ACID Hive tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66548/ --- Review request for Sqoop. Bugs: SQOOP-3311 https://issues.apache.org/jira/browse/SQOOP-3311 Repository: sqoop-trunk Description --- Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID by default. This will probably result in increased usage of ACID tables and the need to support importing into ACID tables with Sqoop. Currently the only table format supporting full ACID tables is ORC. The easiest and most effective way to support importing into these tables would be to write out files as ORC and keep using LOAD DATA as we do for all other Hive tables (supported since HIVE-17361). Workaround could be to create table as textfile (as before) and then CTAS from that. This would push the responsibility of creating ORC format to Hive. However it would result in writing every record twice; in text format and in ORC. Note that ORC is only necessary for full ACID tables. Insert-only (aka. micromanaged) ACID tables can use arbitrary file format. Supporting full ACID tables would also be the first step in making "lastmodified" incremental imports work with Hive. 
Diffs - ivy.xml 6be4fa2 ivy/libraries.properties c44b50b src/java/org/apache/sqoop/SqoopOptions.java 651cebd src/java/org/apache/sqoop/hive/TableDefWriter.java b7a25b7 src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION src/java/org/apache/sqoop/tool/BaseSqoopTool.java b02e4fe src/java/org/apache/sqoop/tool/ExportTool.java 060f2c0 src/java/org/apache/sqoop/tool/ImportTool.java e992005 src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION src/test/org/apache/sqoop/hive/TestTableDefWriter.java 8bdc3be src/test/org/apache/sqoop/orm/TestClassWriter.java 0cc07cf src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION Diff: https://reviews.apache.org/r/66548/diff/1/ Testing --- - added some unit tests - tested basic Hive import scenarios on a cluster Thanks, daniel voros
[jira] [Updated] (SQOOP-3305) Upgrade to Hadoop 3, Hive 3, and HBase 2
[ https://issues.apache.org/jira/browse/SQOOP-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Voros updated SQOOP-3305: Summary: Upgrade to Hadoop 3, Hive 3, and HBase 2 (was: Upgrade to Hadoop 3.0.0) I'm adding Hive and HBase to the summary, since they need to be handled together. See review request for details. > Upgrade to Hadoop 3, Hive 3, and HBase 2 > > > Key: SQOOP-3305 > URL: https://issues.apache.org/jira/browse/SQOOP-3305 > Project: Sqoop > Issue Type: Task > Reporter: Daniel Voros > Assignee: Daniel Voros >Priority: Major > > To be able to eventually support the latest versions of Hive, HBase and > Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See > https://hadoop.apache.org/docs/r3.0.0/index.html > In this ticket I'll collect the necessary changes to do the upgrade. I'm not > setting a fix version yet, since this might mean a major release and to be > done together with the upgrade of related components. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SQOOP-3311) Importing as ORC file to support full ACID Hive tables
Daniel Voros created SQOOP-3311: --- Summary: Importing as ORC file to support full ACID Hive tables Key: SQOOP-3311 URL: https://issues.apache.org/jira/browse/SQOOP-3311 Project: Sqoop Issue Type: New Feature Components: hive-integration Reporter: Daniel Voros Assignee: Daniel Voros Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID by default. This will probably result in increased usage of ACID tables and the need to support importing into ACID tables with Sqoop. Currently the only table format supporting full ACID tables is ORC. The easiest and most effective way to support importing into these tables would be to write out files as ORC and keep using LOAD DATA as we do for all other Hive tables (supported since HIVE-17361). Workaround could be to create table as textfile (as before) and then CTAS from that. This would push the responsibility of creating ORC format to Hive. However it would result in writing every record twice; in text format and in ORC. Note that ORC is only necessary for full ACID tables. Insert-only (aka. micromanaged) ACID tables can use arbitrary file format. Supporting full ACID tables would also be the first step in making "lastmodified" incremental imports work with Hive. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66300: Upgrade to Hadoop 3.0.0
> On March 28, 2018, 3:44 p.m., Szabolcs Vasas wrote: > > src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java > > Lines 69 (patched) > > <https://reviews.apache.org/r/66300/diff/1/?file=1988993#file1988993line69> > > > > Can we use List interface and diamond operator here? Fixed, please note that this file was originally copied from Hive. - daniel --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66300/#review200113 --- On March 27, 2018, 8:50 a.m., daniel voros wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66300/ > --- > > (Updated March 27, 2018, 8:50 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3305 > https://issues.apache.org/jira/browse/SQOOP-3305 > > > Repository: sqoop-trunk > > > Description > --- > > To be able to eventually support the latest versions of Hive, HBase and > Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See > https://hadoop.apache.org/docs/r3.0.0/index.html > > > Diffs > - > > ivy.xml 6be4fa2 > ivy/libraries.properties c44b50b > src/java/org/apache/sqoop/SqoopOptions.java 651cebd > src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 > src/java/org/apache/sqoop/hive/HiveImport.java c272911 > src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 > src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 > src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b > src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d > src/test/org/apache/sqoop/util/TestSqoopJsonUtil.java fdf972c > testdata/hcatalog/conf/hive-site.xml edac7aa > > > Diff: https://reviews.apache.org/r/66300/diff/2/ > > > Testing > --- > > Normal and third-party unit tests. > > > Thanks, > > daniel voros > >
Re: Review Request 66282: Mock ConnManager field in TestTableDefWriter
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66282/#review200121 --- Ship it! Ship It! - daniel voros On March 27, 2018, noon, Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66282/ > --- > > (Updated March 27, 2018, noon) > > > Review request for Sqoop. > > > Bugs: SQOOP-3308 > https://issues.apache.org/jira/browse/SQOOP-3308 > > > Repository: sqoop-trunk > > > Description > --- > > This patch removes the externalColTypes field from TableDefWriter since it > was only used for testing purposes. > TestTableDefWriter is fixed to mock the ConnManager object provided to the > TableDefWriter constructor and a minor refactoring is done on the class. > > > Diffs > - > > src/java/org/apache/sqoop/hive/TableDefWriter.java e1424c383 > src/test/org/apache/sqoop/hive/TestTableDefWriter.java 496b5add9 > > > Diff: https://reviews.apache.org/r/66282/diff/4/ > > > Testing > --- > > ant clean test > > > Thanks, > > Szabolcs Vasas > >
Re: Review Request 66067: SQOOP-3052: Introduce gradle-based build for Sqoop to make it more developer friendly / open
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66067/#review200102 --- Hey Anna, I've experimented with running the gradle build in a clean dockerized environment and found some minor issues: 1) All dependencies are downloaded from jcenter, despite having central repository defined in build.gradle. This might be a result of my current setup, could you please confirm? 2) Deprecation warnings for '<<' task doLast syntax ("Deprecation warning: The Task.leftShift(Closure) method has been deprecated...") 3) SqoopVersion.java generation happens during task definition and not in action (missing doLast?). 4) `relnotes` task fails if version is SNAPSHOT with: "A problem occurred starting process 'command 'cd''". 5) The `release` task prints the path of tar and rat report but they're incorrect. (I've specified version on the command line with "-Pversion=1.5.0") 6) `ant releaseaudit` now lists gradle files as errors I've corrected 2,3,4,5,6 in this commit: https://github.com/dvoros/sqoop/commit/47e361829b1004bdedd6f5c223332e3fb8b85696 What's the reasoning behind using Gradle 3.5.1? Shouldn't we use 4.x? (I've successfully executed a simple build with 4.6) Regards, Daniel - daniel voros On March 23, 2018, 10:28 a.m., Anna Szonyi wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66067/ > --- > > (Updated March 23, 2018, 10:28 a.m.) > > > Review request for Sqoop. 
> > > Bugs: Sqoop-3052 > https://issues.apache.org/jira/browse/Sqoop-3052 > > > Repository: sqoop-trunk > > > Description > --- > > SQOOP-3052: Introduce gradle based build for Sqoop to make it more developer > friendly / open > > > Diffs > - > > .gitignore 68cbe28 > COMPILING.txt 3b82250 > build.gradle PRE-CREATION > buildSrc/customUnixStartScript.txt PRE-CREATION > buildSrc/customWindowsStartScript.txt PRE-CREATION > buildSrc/sqoop-package.gradle PRE-CREATION > buildSrc/sqoop-version-gen.gradle PRE-CREATION > config/checkstyle/checkstyle-java-header.txt PRE-CREATION > config/checkstyle/checkstyle-noframes.xsl PRE-CREATION > config/checkstyle/checkstyle.xml PRE-CREATION > gradle.properties PRE-CREATION > gradle/wrapper/gradle-wrapper.jar PRE-CREATION > gradle/wrapper/gradle-wrapper.properties PRE-CREATION > gradlew PRE-CREATION > gradlew.bat PRE-CREATION > settings.gradle PRE-CREATION > > > Diff: https://reviews.apache.org/r/66067/diff/6/ > > > Testing > --- > > ran all new tasks, except for internal maven publishing > > Notes: > - To try it out you can call ./gradlew tasks --all to see all the tasks and > compare them to current tasks/artifacts. > - Replaced cobertura with jacoco, as it's easier/cleaner to configure, easier > to combine all test results into a single report. > - Generated pom.xml now has correct dependencies/versions > - Script generation is currently hardcoded and not based on sqoop help, as > previously - though added the possibility of hooking it in later > > > Thanks, > > Anna Szonyi > >
Re: Review Request 66300: Upgrade to Hadoop 3.0.0
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66300/#review200037 --- Patch #1 is the minimal set of changes required to upgrade to Hadoop 3.0.0 that passes all unit tests. It also updates: - Hive to 3.0.0-SNAPSHOT since Hive hadoop shims was unable to handle Hadoop 3. - HBase 2.0.0-beta2 since Hive 3.0.0-SNAPSHOT depends on HBase 2.0.0-alpha4 at the moment. For the list of other changes and some reasoning behind them see https://github.com/dvoros/sqoop/pull/4. - daniel voros On March 27, 2018, 8:50 a.m., daniel voros wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66300/ > --- > > (Updated March 27, 2018, 8:50 a.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3305 > https://issues.apache.org/jira/browse/SQOOP-3305 > > > Repository: sqoop-trunk > > > Description > --- > > To be able to eventually support the latest versions of Hive, HBase and > Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See > https://hadoop.apache.org/docs/r3.0.0/index.html > > > Diffs > - > > ivy.xml 6be4fa2 > ivy/libraries.properties c44b50b > src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 > src/java/org/apache/sqoop/hive/HiveImport.java c272911 > src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 > src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION > src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 > src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b > src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d > testdata/hcatalog/conf/hive-site.xml edac7aa > > > Diff: https://reviews.apache.org/r/66300/diff/1/ > > > Testing > --- > > Normal and third-party unit tests. > > > Thanks, > > daniel voros > >
[jira] [Commented] (SQOOP-3305) Upgrade to Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SQOOP-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16415267#comment-16415267 ] Daniel Voros commented on SQOOP-3305: - Attached review request. > Upgrade to Hadoop 3.0.0 > --- > > Key: SQOOP-3305 > URL: https://issues.apache.org/jira/browse/SQOOP-3305 > Project: Sqoop > Issue Type: Task > Reporter: Daniel Voros > Assignee: Daniel Voros >Priority: Major > > To be able to eventually support the latest versions of Hive, HBase and > Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See > https://hadoop.apache.org/docs/r3.0.0/index.html > In this ticket I'll collect the necessary changes to do the upgrade. I'm not > setting a fix version yet, since this might mean a major release and to be > done together with the upgrade of related components. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 66300: Upgrade to Hadoop 3.0.0
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66300/ --- Review request for Sqoop. Bugs: SQOOP-3305 https://issues.apache.org/jira/browse/SQOOP-3305 Repository: sqoop-trunk Description --- To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html Diffs - ivy.xml 6be4fa2 ivy/libraries.properties c44b50b src/java/org/apache/sqoop/config/ConfigurationHelper.java e07a699 src/java/org/apache/sqoop/hive/HiveImport.java c272911 src/java/org/apache/sqoop/mapreduce/JobBase.java 6d1e049 src/java/org/apache/sqoop/mapreduce/hcat/DerbyPolicy.java PRE-CREATION src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 784b5f2 src/java/org/apache/sqoop/util/SqoopJsonUtil.java adf186b src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20d testdata/hcatalog/conf/hive-site.xml edac7aa Diff: https://reviews.apache.org/r/66300/diff/1/ Testing --- Normal and third-party unit tests. Thanks, daniel voros
Re: Review Request 66282: Mock ConnManager field in TestTableDefWriter
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66282/#review199972 --- Ship it! Nice addition, ship it! - daniel voros On March 26, 2018, 2:02 p.m., Szabolcs Vasas wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66282/ > --- > > (Updated March 26, 2018, 2:02 p.m.) > > > Review request for Sqoop. > > > Bugs: SQOOP-3308 > https://issues.apache.org/jira/browse/SQOOP-3308 > > > Repository: sqoop-trunk > > > Description > --- > > This patch removes the externalColTypes field from TableDefWriter since it > was only used for testing purposes. > TestTableDefWriter is fixed to mock the ConnManager object provided to the > TableDefWriter constructor and a minor refactoring is done on the class. > > > Diffs > - > > src/java/org/apache/sqoop/hive/TableDefWriter.java e1424c383 > src/test/org/apache/sqoop/hive/TestTableDefWriter.java 496b5add9 > > > Diff: https://reviews.apache.org/r/66282/diff/3/ > > > Testing > --- > > ant clean test > > > Thanks, > > Szabolcs Vasas > >
Review Request 66277: Don't create HTML during Ivy report
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66277/ --- Review request for Sqoop. Bugs: SQOOP-3307 https://issues.apache.org/jira/browse/SQOOP-3307 Repository: sqoop-trunk Description --- ant clean report invokes the ivy:report task and creates both HTML and GraphML reports. Creation of the HTML reports takes ~7 minutes and results in a ~700MB html that's hard to make use of, while the GraphML reporting is fast and is easier to read. Diffs - build.xml d85cf71 Diff: https://reviews.apache.org/r/66277/diff/1/ Testing --- `ant clean report` Thanks, daniel voros
[jira] [Commented] (SQOOP-3307) Don't create HTML during Ivy report
[ https://issues.apache.org/jira/browse/SQOOP-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16413747#comment-16413747 ] Daniel Voros commented on SQOOP-3307: - Attaching review request. > Don't create HTML during Ivy report > --- > > Key: SQOOP-3307 > URL: https://issues.apache.org/jira/browse/SQOOP-3307 > Project: Sqoop > Issue Type: Task >Affects Versions: 1.4.7 >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Minor > Fix For: 1.5.0 > > > {{ant clean report}} invokes the [ivy:report > |https://ant.apache.org/ivy/history/2.1.0/use/report.html] task and creates > both HTML and GraphML reports. > Creation of the HTML reports takes ~7 minutes and results in a ~700MB html > that's hard to make use of, while the GraphML reporting is fast and is easier > to read. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SQOOP-3307) Don't create HTML during Ivy report
Daniel Voros created SQOOP-3307: --- Summary: Don't create HTML during Ivy report Key: SQOOP-3307 URL: https://issues.apache.org/jira/browse/SQOOP-3307 Project: Sqoop Issue Type: Task Affects Versions: 1.4.7 Reporter: Daniel Voros Assignee: Daniel Voros Fix For: 1.5.0 {{ant clean report}} invokes the [ivy:report |https://ant.apache.org/ivy/history/2.1.0/use/report.html] task and creates both HTML and GraphML reports. Creation of the HTML reports takes ~7 minutes and results in a ~700MB html that's hard to make use of, while the GraphML reporting is fast and is easier to read. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (SQOOP-3305) Upgrade to Hadoop 3.0.0
Daniel Voros created SQOOP-3305: --- Summary: Upgrade to Hadoop 3.0.0 Key: SQOOP-3305 URL: https://issues.apache.org/jira/browse/SQOOP-3305 Project: Sqoop Issue Type: Task Reporter: Daniel Voros Assignee: Daniel Voros To be able to eventually support the latest versions of Hive, HBase and Accumulo, we should start by upgrading our Hadoop dependencies to 3.0.0. See https://hadoop.apache.org/docs/r3.0.0/index.html In this ticket I'll collect the necessary changes to do the upgrade. I'm not setting a fix version yet, since this might mean a major release and to be done together with the upgrade of related components. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66195: Implement JDBC and Kerberos tools for HiveServer2 support
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66195/#review199880
-----------------------------------------------------------

Ship it!

Ship It!

- daniel voros


On March 21, 2018, 12:48 p.m., Szabolcs Vasas wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66195/
> -----------------------------------------------------------
>
> (Updated March 21, 2018, 12:48 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-3300
>     https://issues.apache.org/jira/browse/SQOOP-3300
>
>
> Repository: sqoop-trunk
>
>
> Description
> -------
>
> The idea of the Sqoop HS2 support is to connect to HS2 using JDBC and execute the Hive commands on this connection. Sqoop should also support Kerberos authentication when building this JDBC connection.
>
> The goal of this JIRA is to implement the necessary classes for building JDBC connections and authenticating with Kerberos.
>
>
> Diffs
> -----
>
>   src/java/org/apache/sqoop/authentication/KerberosAuthenticator.java PRE-CREATION
>   src/java/org/apache/sqoop/db/DriverManagerJdbcConnectionFactory.java PRE-CREATION
>   src/java/org/apache/sqoop/db/JdbcConnectionFactory.java PRE-CREATION
>   src/java/org/apache/sqoop/db/decorator/JdbcConnectionFactoryDecorator.java PRE-CREATION
>   src/java/org/apache/sqoop/db/decorator/KerberizedConnectionFactoryDecorator.java PRE-CREATION
>   src/test/org/apache/sqoop/authentication/TestKerberosAuthenticator.java PRE-CREATION
>   src/test/org/apache/sqoop/db/TestDriverManagerJdbcConnectionFactory.java PRE-CREATION
>   src/test/org/apache/sqoop/db/decorator/TestKerberizedConnectionFactoryDecorator.java PRE-CREATION
>   src/test/org/apache/sqoop/hbase/HBaseTestCase.java f96b6587ff3756aa5a696df8b7fc12ef0b0f
>   src/test/org/apache/sqoop/infrastructure/kerberos/MiniKdcInfrastructureRule.java a704d0b07282e54e7c19d7a6725d6d026d037073
>
>
> Diff: https://reviews.apache.org/r/66195/diff/1/
>
>
> Testing
> -------
>
> Executed unit and third party tests.
>
>
> Thanks,
>
> Szabolcs Vasas
>
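The class names in the diff (JdbcConnectionFactory, KerberizedConnectionFactoryDecorator, KerberosAuthenticator) suggest a decorator around a JDBC connection factory that authenticates before delegating. A minimal sketch of that shape — the interfaces, method signatures, and bodies below are illustrative assumptions, not the patch's actual code:

```java
import java.security.PrivilegedAction;
import java.sql.Connection;
import javax.security.auth.Subject;

// Illustrative stand-in for the factory interface from the diff.
interface JdbcConnectionFactory {
    Connection createConnection();
}

// Illustrative stand-in: assumed to log in (e.g. from a keytab) and
// return the authenticated Kerberos Subject.
interface KerberosAuthenticator {
    Subject authenticate();
}

// Decorator: authenticate first, then open the JDBC connection while
// running as the authenticated subject.
class KerberizedConnectionFactoryDecorator implements JdbcConnectionFactory {
    private final JdbcConnectionFactory decorated;
    private final KerberosAuthenticator authenticator;

    KerberizedConnectionFactoryDecorator(JdbcConnectionFactory decorated,
                                         KerberosAuthenticator authenticator) {
        this.decorated = decorated;
        this.authenticator = authenticator;
    }

    @Override
    public Connection createConnection() {
        Subject subject = authenticator.authenticate();
        return Subject.doAs(subject,
                (PrivilegedAction<Connection>) decorated::createConnection);
    }
}
```

The decorator leaves the plain DriverManager-based factory unaware of Kerberos; kerberized and unauthenticated connections then share one factory interface.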
[jira] [Commented] (SQOOP-3289) Add .travis.yml
[ https://issues.apache.org/jira/browse/SQOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16393199#comment-16393199 ]

Daniel Voros commented on SQOOP-3289:
-------------------------------------

Hi [~BoglarkaEgyed],

Thanks for your review! In the meantime I've started fooling around with the third-party tests in Travis. I thought I'd share the current status so you can comment on it early on.

For the latest results, please check this build: https://travis-ci.org/dvoros/sqoop/builds/351353673

cc [~vasas] [~maugli]

> Add .travis.yml
> ---------------
>
>                 Key: SQOOP-3289
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3289
>             Project: Sqoop
>          Issue Type: Task
>          Components: build
>    Affects Versions: 1.4.7
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Minor
>             Fix For: 1.5.0
>
> Adding a .travis.yml would enable running builds/tests on travis-ci.org. Currently if you wish to use Travis for testing your changes, you have to manually add a .travis.yml to your branch. Having it committed to trunk would save us this extra step.
> I currently have an example [{{.travis.yml}}|https://github.com/dvoros/sqoop/blob/93a4c06c1a3da1fd5305c99e379484507797b3eb/.travis.yml] on my travis branch running unit tests for every commit and every pull request: https://travis-ci.org/dvoros/sqoop/builds
> Later we could add the build status to the project readme as well, see: https://github.com/dvoros/sqoop/tree/travis
> Also, an example of a pull request: https://github.com/dvoros/sqoop/pull/1
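For readers unfamiliar with Travis, a file of the kind described would look roughly like this — an illustrative sketch only, not the contents of the linked branch's .travis.yml (the JDK choice and ant targets are assumptions):

```yaml
# Illustrative sketch of a minimal .travis.yml for an Ant-based Java project.
# The real file lives on the linked travis branch.
language: java
jdk:
  - oraclejdk8
script:
  - ant clean test
```

Committing this to trunk makes Travis build every commit and pull request automatically, which is the "extra step" the ticket wants to remove.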
[jira] [Commented] (SQOOP-3291) SqoopJobDataPublisher is invoked before Hive imports succeed
[ https://issues.apache.org/jira/browse/SQOOP-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16393081#comment-16393081 ]

Daniel Voros commented on SQOOP-3291:
-------------------------------------

Thank you [~venkatnrangan]!

> SqoopJobDataPublisher is invoked before Hive imports succeed
> ------------------------------------------------------------
>
>                 Key: SQOOP-3291
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3291
>             Project: Sqoop
>          Issue Type: Bug
>          Components: hive-integration
>    Affects Versions: 1.4.7
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Major
>             Fix For: 1.5.0
>
> Job data is published to listeners (defined via sqoop.job.data.publish.class) in case of Hive and HCat imports. Currently this happens before the Hive import completes, so it gets reported even if the Hive import fails.
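The fix the ticket describes is essentially an ordering change: publish job data only after the Hive import has completed. A simplified sketch of that control flow — the interfaces and class here are stand-ins for illustration, not Sqoop's actual classes:

```java
// Stand-in for the Hive import step; assumed to throw on failure.
interface HiveImport {
    void importTable();
}

// Stand-in for the sqoop.job.data.publish.class listener hook.
interface JobDataPublisher {
    void publish();
}

class HiveImportJob {
    private final HiveImport hiveImport;
    private final JobDataPublisher publisher;

    HiveImportJob(HiveImport hiveImport, JobDataPublisher publisher) {
        this.hiveImport = hiveImport;
        this.publisher = publisher;
    }

    void run() {
        // Buggy order was effectively: publish(), then importTable().
        // Correct order: publish only once the Hive import succeeds, so a
        // failed import is never reported to listeners.
        hiveImport.importTable();
        publisher.publish();
    }
}
```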