*Hi *
We are facing an issue with inserting emoticons into Mysql via sqoop import
from Hive. Here is the full description of the problem. Please can we get
some urgent help.
*Error Message:*
Caused by: java.sql.SQLException: Incorrect string value:
'\xF0\x9F\x9A\x97 \xF0...' for column 'Description' at row
*Script:*
nohup sqoop export -D mapreduce.job.queuename="******" --connect
"jdbc:mysql://xxx.mysql.database.azure.com/xxx" --username
'mysqladmin@xxx-mysql-db10' --password '********' --table
workorder_emojitest --hcatalog-database test_db -hcatalog-table
workorder_test_1025 --input-fields-terminated-by , --driver
com.mysql.jdbc.Driver >> /tmp/m.logs &
*Options tried so far (none of them have worked)*
*Option 1 – not successful*
*A SerDe allows Hive to read in data from a table, and write it back out to
HDFS in any custom format.*
We have installed the serde jars and kepts in hive/lib location for emojis
data support But it's not worked. and also we kept this serde jars in in
Sqoop/lib but no luck.
*Option 2 – not successful*
In mySql server the emojis data types we are using var char with serde
properties utf8mb4
In Hive server the emojis data types are strings with serde properties
utf-8
Manually we are able to insert in my SQL server and hive server based on
above properties.
*Note*: In MySQL server serde properties which are using utf8mb4 is
*4bytes*
In hive server serde properties which are using utf8 is *3bytes*
we have tried all the way to change utf8mb4 to utf8 in MySQL server but
it was not supporting.
and we changed UTF-8 properties to UFT-16 in hive server and checked but no
luck.
also we have modified every properties in sqoop export command and tried.
jdbc:mysql://localhost:3306/?useUnicode=yes&characterEncoding=UTF-8
default-character-set=utf8
character-set=utf8mb4
collation-server=utf8mb4_unicode_ci
*Option 3 – not successful*
We also we tried all the possible ways on MySQL side and changed MySQL
properties and restarted the server.
character_set_server=utf8mb4
character_set_server=utf-8
character_set_connection utf-8
character-set results utf-8
Regards,
Sudhinder