-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65607/
-----------------------------------------------------------
(Updated Feb. 12, 2018, 2:31 p.m.)
Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas.
Bugs: SQOOP-2976
https://issues.apache.org/jira/browse/SQOOP-2976
Repository: sqoop-trunk
Description
-------
**Summary:**
Certain databases, such as SQL Server and Postgres are storing decimal values
padded with 0s, should the user insert them with less digits than the given
scale.
Other databases however, such as Oracle and HSQLDB store these numbers without
trailing 0s. Then, when the JDBC driver returns these as BigDecimals, they
won't match the scale in the avro schema.
Take the following SQL commands for an example:
```
create table salary (id int, amount number (10,5));
insert into salary (id, amount) values (1, 10.5);
insert into salary (id, amount) values (2, 10.50);
select * from salary;
```
Records in an Oracle database:
1 10.5
2 10.5
Records in SQL Server (using decimal instead of number in the create statement):
1 10.50000
2 10.50000
**Solition:**
The fix is simply checking the scale of the returned BigDecimals against what's
in the avro schema and recreates the objects in case of a mismatch. I've
introduced a new property to enable this new feature, so existing behavior is
not affected.
**Concerns: **
- trimmings can happen silently, should we rather raise an exception? Enabling
trimming adds a new feature, but it also adds the possibility silently lose
scale while import. The latter could be mitigated by a thorough documentation.
- The flags current name () doesn't really match the behavior, should I change
it to something else? (avro.decimal_scale_harmonization.enable)
- How / where to document this new flag?
**Other notable changes:**
- Introduced ArgumentArrayBuilder that reuses the existing Argument class and
introduces a useful builder pattern for creating commandline arguments for
tests.
- Slightly modified BaseSqoopTest to fit my needs. *(However, further
refactoring would be required in this class to enable better reuse. For
example: the current implementation can't be used with SQL Server, because one
also needs to specify the schema besides the tablename in the create and insert
statements. There are also code duplications.)*
Diffs (updated)
-----
src/java/org/apache/sqoop/avro/AvroUtil.java 1aae8df2
src/java/org/apache/sqoop/config/ConfigurationConstants.java 7a19a62c
src/java/org/apache/sqoop/mapreduce/AvroImportMapper.java a5e5bf5a
src/test/org/apache/sqoop/manager/hsqldb/TestHsqldbAvroPadding.java
PRE-CREATION
src/test/org/apache/sqoop/manager/oracle/OracleAvroPaddingImportTest.java
PRE-CREATION
src/test/org/apache/sqoop/manager/sqlserver/MSSQLTestUtils.java 2220b7d5
src/test/org/apache/sqoop/manager/sqlserver/SQLServerAvroPaddingImportTest.java
PRE-CREATION
src/test/org/apache/sqoop/testutil/ArgumentArrayBuilder.java PRE-CREATION
src/test/org/apache/sqoop/testutil/AvroTestUtils.java PRE-CREATION
src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java 588f439c
Diff: https://reviews.apache.org/r/65607/diff/2/
Changes: https://reviews.apache.org/r/65607/diff/1-2/
Testing
-------
See the 3 new test classes (HSQLDB, Oracle, SQL Server).
Thanks,
Fero Szabo