Sudhinder created SQOOP-3456:
--------------------------------
Summary: Sqoop import from Hive to MySQL fails for Emoitcons
Key: SQOOP-3456
URL: https://issues.apache.org/jira/browse/SQOOP-3456
Project: Sqoop
Issue Type: Bug
Components: connectors/mysql
Affects Versions: 1.4.6
Environment: Production:
Part of CloudEra
Reporter: Sudhinder
We are facing an issue with inserting emoticons into Mysql via sqoop import
from Hive. Here is the full description of the problem. Please can we get some
urgent help.
**
*Error Message:*
Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x9A\x97
\xF0...' for column 'Description' at row
*Script:*
nohup sqoop export -D mapreduce.job.queuename="******" --connect
"jdbc:mysql://xxx.mysql.database.azure.com/xxx" --username
'mysqladmin@xxx-mysql-db10' --password '********' --table workorder_emojitest
--hcatalog-database test_db -hcatalog-table workorder_test_1025
--input-fields-terminated-by , --driver com.mysql.jdbc.Driver >>
/tmp/workorder_em.logs &
*Options tried so far (none of them have worked)*
*Option 1 - not successful*
*A SerDe allows Hive to read in data from a table, and write it back out to
HDFS in any custom format.*
We have installed the serde jars and kepts in hive/lib location for emojis data
support But it's not worked. and also we kept this serde jars in in Sqoop/lib
but no luck.
*Option 2 - not successful*
In mySql server the emojis data types we are using var char with serde
properties utf8mb4
In Hive server the emojis data types are strings with serde properties utf-8
Manually we are able to insert in my SQL server and hive server based on above
properties.
*Note*: In MySQL server serde properties which are using utf8mb4 is *4bytes*
** In hive server serde properties which are using utf8
is *3bytes*
we have tried all the way to change utf8mb4 to utf8 in MySQL server but it
was not supporting.
and we changed UTF-8 properties to UFT-16 in hive server and checked but no
luck.
also we have modified every properties in sqoop export command and tried.
jdbc:mysql://localhost:3306/?useUnicode=yes&characterEncoding=UTF-8
default-character-set=utf8 character-set=utf8mb4
collation-server=utf8mb4_unicode_ci
*Option 3 - not successful*
We also we tried all the possible ways on MySQL side and changed MySQL
properties and restarted the server.
character_set_server=utf8mb4character_set_server=utf-8character_set_connection
utf-8 character-set results utf-8
--
This message was sent by Atlassian Jira
(v8.3.4#803005)