Hi Abe,

Thanks for your mail. The MySQL table is defined with UTF-8, and the data displays correctly in MySQL, as shown below:

*Data in mysql : *सुरेन्द्र कुमार पाण्डेय

However, when I move the same data through a Sqoop import, it gets corrupted, as shown in my previous mail in this thread. I also tried adding the parameters *useUnicode=true&characterEncoding=utf8* to the MySQL connection string and passing *--direct -- --default-character-set=utf8* to the Sqoop import, but still no luck.

Additionally, the data contains control characters such as Ctrl-A (\x01) and Ctrl-M, which collide with the field delimiter of the Sqoop import, since that is set to Ctrl-A.

Is there a way to choose a delimiter that still works when the data contains arbitrary special or control characters?

Looking forward to a quick response.

Thanks!

On Sun, Nov 23, 2014 at 12:40 AM, Abraham Elmahrek <[email protected]> wrote:

> This could be in 2 places: Loading to HDFS, or extracting from MySQL.
> Sqoop should load everything as UTF-8 by default, which supports Hindi.
>
> What is your default character set in MySQL? Could you copy/paste your
> my.cnf? Also, what version of MySQL are you running?
>
> On Sat, Nov 22, 2014 at 12:28 AM, Vineet Mishra <[email protected]>
> wrote:
>
>> Hi Abe,
>>
>> Well, with the above statement I mean to say that the data which is
>> residing in mysql is different from what has been imported via sqoop.
>>
>> So let me give an example of the same,
>>
>> *Data in mysql : *सुरेन्द्र कुमार पाण्डेय
>> *Data in HDFS(Sqoop import) : * M-`M-$M-8M-`M-%M-
>>
>> So this is the kind of change I am running into, which completely
>> loses the meaning of the data.
>>
>> Any help would be appreciated.
>>
>> Thanks again!
>>
>> On Sat, Nov 22, 2014 at 2:15 AM, Abraham Elmahrek <[email protected]>
>> wrote:
>>
>>> Hey there,
>>>
>>> Could you explain what you mean by "losing its meaning"? It's possible
>>> you may need to set the character set:
>>> http://dev.mysql.com/doc/connector-j/en/connector-j-reference-charsets.html
>>> .
>>>
>>> -Abe
>>>
>>> On Fri, Nov 21, 2014 at 5:57 AM, Vineet Mishra <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am doing a Sqoop import with mysql as the source. Recently I figured out
>>>> that data imported through sqoop from mysql contained some special
>>>> characters and even control characters, which were losing their meaning when
>>>> moved to the sqoop data files.
>>>>
>>>> Looking for a solution for how to handle this case of special
>>>> characters or, if possible, pruning the unwanted data out of my target
>>>> dataset.
>>>>
>>>> Looking for a resolution at the earliest!
>>>>
>>>> Thanks!
>>>>
>>>
>>>
>>
>
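
On the delimiter question raised in this thread: no single delimiter character is safe if the data can contain any byte, so the usual approach is to keep the delimiter (Ctrl-A here) and escape it wherever it occurs inside a field. Sqoop's output-formatting options such as --escaped-by and --optionally-enclosed-by exist for this purpose. The Python sketch below only illustrates the escaping idea and is not Sqoop's implementation; the function names escape_field, encode_record and decode_record, and the choice of backslash as the escape character, are assumptions made for illustration.

```python
DELIM = "\x01"  # Ctrl-A, the field delimiter mentioned in the thread
ESC = "\\"      # assumed escape character (hypothetical choice)


def escape_field(value: str) -> str:
    """Escape the delimiter, the escape character, and CR/LF inside one field."""
    out = []
    for ch in value:
        if ch == ESC or ch == DELIM:
            out.append(ESC + ch)      # prefix with the escape character
        elif ch == "\n":
            out.append(ESC + "n")     # keep one record on one line
        elif ch == "\r":
            out.append(ESC + "r")     # Ctrl-M
        else:
            out.append(ch)
    return "".join(out)


def encode_record(fields):
    """Join escaped fields with the delimiter."""
    return DELIM.join(escape_field(f) for f in fields)


def decode_record(line: str):
    """Split on unescaped delimiters only and undo the escaping."""
    fields, current = [], []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == ESC and i + 1 < len(line):
            nxt = line[i + 1]
            current.append({"n": "\n", "r": "\r"}.get(nxt, nxt))
            i += 2
        elif ch == DELIM:
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


row = ["सुरेन्द्र कुमार पाण्डेय", "a\x01b", "line1\r\nline2", "back\\slash"]
encoded = encode_record(row)
decoded = decode_record(encoded)
```

A reader that splits naively on Ctrl-A would still see an extra field for the second value, so whatever consumes the files has to honour the same escape character.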
