Kengo Seki created AIRFLOW-2641:
-----------------------------------

             Summary: Fix MySqlToHiveTransfer to handle MySQL DECIMAL correctly
                 Key: AIRFLOW-2641
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2641
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: Kengo Seki
            Assignee: Kengo Seki


The last line of this snippet

{code:title=airflow/operators/mysql_to_hive.py}
    @classmethod
    def type_map(cls, mysql_type):
        t = MySQLdb.constants.FIELD_TYPE
        d = {
            t.BIT: 'INT',
            t.DECIMAL: 'DOUBLE',
{code}

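is presumably intended to convert MySQL DECIMAL to Hive DOUBLE, but it doesn't
work as expected. For a MySQL table with a DECIMAL column, the generated Hive
table gets a STRING column instead: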

{code}
mysql> DESC t;
+-------+---------------+------+-----+---------+-------+
| Field | Type          | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+-------+
| c     | decimal(10,0) | YES  |     | NULL    |       |
+-------+---------------+------+-----+---------+-------+
1 row in set (0.00 sec)
{code}

{code}
In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer

In [2]: t = MySqlToHiveTransfer(mysql_conn_id="airflow_db", sql="SELECT * FROM t", hive_table="t", recreate=True, task_id="t", ignore_ti_state=True)

In [3]: t.execute(None)
[2018-06-18 23:37:53,193] {base_hook.py:83} INFO - Using connection to: localhost
[2018-06-18 23:37:53,199] {hive_hooks.py:429} INFO - DROP TABLE IF EXISTS t;
CREATE TABLE IF NOT EXISTS t (
c STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ''
STORED AS textfile
;

(snip)

[2018-06-18 23:38:25,048] {hive_hooks.py:235} INFO - Loading data to table default.t
[2018-06-18 23:38:25,866] {hive_hooks.py:235} INFO - Table default.t stats: [numFiles=1, numRows=0, totalSize=0, rawDataSize=0]
[2018-06-18 23:38:25,868] {hive_hooks.py:235} INFO - OK
[2018-06-18 23:38:25,871] {hive_hooks.py:235} INFO - Time taken: 1.498 seconds
{code}

{code}
$ hive -e 'DESC t'

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
OK
c                       string
Time taken: 2.342 seconds, Fetched: 1 row(s)
{code}

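This is because {{MySQLdb.constants.FIELD_TYPE.DECIMAL}} does not correspond to
the DECIMAL type on MySQL 5.0+. Those servers report DECIMAL columns as
{{MySQLdb.constants.FIELD_TYPE.NEWDECIMAL}}, so the {{t.DECIMAL}} entry never
matches and the column ends up as STRING, as shown above.
{{MySQLdb.constants.FIELD_TYPE.NEWDECIMAL}} should be used here.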
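
A minimal sketch of the proposed mapping, not the actual patch. To stay runnable without MySQLdb installed, it uses a hypothetical stand-in class {{FieldType}} that carries the MySQL protocol type codes (DECIMAL = 0, BIT = 16, NEWDECIMAL = 246) in place of {{MySQLdb.constants.FIELD_TYPE}}. The {{STRING}} fallback for unmapped codes is an assumption based on the {{c STRING}} column in the log above.

```python
class FieldType:
    """Hypothetical stand-in for MySQLdb.constants.FIELD_TYPE (subset only)."""
    DECIMAL = 0       # legacy DECIMAL code, not reported for DECIMAL columns on MySQL 5.0+
    BIT = 16
    NEWDECIMAL = 246  # code reported for DECIMAL/NUMERIC columns on MySQL 5.0+


def type_map(mysql_type, t=FieldType):
    """Return a Hive type name for a MySQL field type code."""
    d = {
        t.BIT: 'INT',
        t.DECIMAL: 'DOUBLE',
        t.NEWDECIMAL: 'DOUBLE',  # the entry this issue proposes
    }
    # Assumed fallback: unmapped type codes become STRING.
    return d.get(mysql_type, 'STRING')


print(type_map(FieldType.NEWDECIMAL))
print(type_map(FieldType.DECIMAL))
```

Keeping the old {{t.DECIMAL}} entry alongside the new one is a design choice in this sketch, so that servers still reporting the legacy code are mapped the same way.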



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
