Hello!
I've implemented simple solution with some hard code by now.
It's tested with oracle database.
{code:sql}
beeline> !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000:
Enter password for jdbc:hive2://localhost:10000:
Connected to: Apache Hive (version 2.2.0-SNAPSHOT)
Driver: Hive JDBC (version 2.2.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> SET
hive.metastore.warehouse.dir=${env:HOME}/Documents/hive-warehouse;
No rows affected (0.158 seconds)
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .> book_id INT,
. . . . . . . . . . . . . . . .> book_name STRING,
. . . . . . . . . . . . . . . .> author_name STRING,
. . . . . . . . . . . . . . . .> book_isbn STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY
"org.apache.hive.storagehandler.JDBCStorageHandler"
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .> "mapred.jdbc.driver.class" =
"oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .> "mapred.jdbc.url" =
"jdbc:oracle:thin:@//localhost:49161/XE",
. . . . . . . . . . . . . . . .> "mapred.jdbc.username" = "*",
. . . . . . . . . . . . . . . .> "mapred.jdbc.password" = "*",
. . . . . . . . . . . . . . . .> "hive.jdbc.update.on.duplicate" = "true",
. . . . . . . . . . . . . . . .> "mapreduce.jdbc.input.table.name" = "books"
. . . . . . . . . . . . . . . .> );
No rows affected (2.297 seconds)
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> select * from books3;
+-----------------+-------------------+---------------------+-------------------+
| books3.book_id | books3.book_name | books3.author_name | books3.book_isbn
|
+-----------------+-------------------+---------------------+-------------------+
| 124123 | name | author | 132321adsaf31
|
| 13 | name2 | author2 | asd213fadsf
|
| 2345236 | name3 | author3 | asdfds1234123
|
+-----------------+-------------------+---------------------+-------------------+
3 rows selected (2.146 seconds)
0: jdbc:hive2://localhost:10000> explain select * from books3;
+----------------------------------------------------+
| Explain |
+----------------------------------------------------+
| STAGE DEPENDENCIES: |
| Stage-0 is a root stage |
| |
| STAGE PLANS: |
| Stage: Stage-0 |
| Fetch Operator |
| limit: -1 |
| Processor Tree: |
| TableScan |
| alias: books3 |
| Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column
stats: NONE |
| Select Operator |
| expressions: book_id (type: string), book_name (type: string),
author_name (type: string), book_isbn (type: string) |
| outputColumnNames: _col0, _col1, _col2, _col3 |
| Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column
stats: NONE |
| ListSink |
| |
+----------------------------------------------------+
17 rows selected (0.508 seconds)
{code}
This solution works with two steps:
1. First grab all meta info from external table
2. Configure DBInputFormat, DBOutputFormat with table meta
What do you think about to ask user specify all needed information about
columns and types inside serde properties section?
Smth like that:
0: jdbc:hive2://localhost:10000> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .> book_id INT,
. . . . . . . . . . . . . . . .> book_name STRING,
. . . . . . . . . . . . . . . .> author_name STRING,
. . . . . . . . . . . . . . . .> book_isbn STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY
“org.apache.hive.storagehandler.JDBCStorageHandler"
WITH SERDEPROPERTIES (
"hive.jdbc.columns.mapping" =
“book_id:int(32), book_name:varchar(20), author_name:varchar(20),
book_isbn:varchar(20)")
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .> "mapred.jdbc.driver.class" =
"oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .> "mapred.jdbc.url" =
"jdbc:oracle:thin:@//localhost:49161/XE",
. . . . . . . . . . . . . . . .> "mapred.jdbc.username" = "*",
. . . . . . . . . . . . . . . .> "mapred.jdbc.password" = "*",
. . . . . . . . . . . . . . . .> "hive.jdbc.update.on.duplicate" = "true",
. . . . . . . . . . . . . . . .> "mapreduce.jdbc.input.table.name" = "books"
. . . . . . . . . . . . . . . .> );