This could be happening in one of two places: loading to HDFS, or extracting from MySQL. Sqoop should load everything as UTF-8 by default, which supports Hindi.
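One quick check that might narrow it down: the `M-` sequences in your HDFS sample look like the meta notation that `cat -v` and some pagers use for bytes above 0x7F. That is a guess on my part, not something I can confirm from here. Under that assumption, here is a minimal Python sketch (the function name `decode_meta_notation` is just something I made up for illustration) that turns the notation back into raw bytes so you can see whether they are valid UTF-8. I only feed it the first three sequences because the tail of the pasted sample looks cut off.

```python
def decode_meta_notation(text: str) -> bytes:
    """Convert cat -v style notation back into raw bytes.

    Assumed notation (hypothetical for this thread, matches `cat -v`):
      M-x   -> byte 0x80 + ord(x)
      M-^x  -> byte 0x80 + (ord(x) ^ 0x40)
      ^x    -> control byte ord(x) ^ 0x40
    The input must be plain ASCII. A literal "^" or "M-" in the data is
    ambiguous in this notation and will be misread.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        high = 0
        # A trailing "M-" with nothing after it is kept as literal text.
        if text.startswith("M-", i) and i + 2 < len(text):
            high = 0x80
            i += 2
        if text[i] == "^" and i + 1 < len(text):
            out.append(high | (ord(text[i + 1]) ^ 0x40))
            i += 2
        else:
            out.append(high | ord(text[i]))
            i += 1
    return bytes(out)


if __name__ == "__main__":
    # First three sequences from the HDFS sample in this thread.
    raw = decode_meta_notation("M-`M-$M-8")
    print(raw.hex())            # e0a4b8
    print(raw.decode("utf-8"))  # स
```

Those three bytes decode to स, which is the first letter of the value you see in MySQL. So it may be that the imported bytes are fine and the tool you are viewing the file with is not rendering UTF-8. I would still like to see the settings I ask about below before ruling out the MySQL side.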
What is your default character set in MySQL? Could you copy/paste your my.cnf? Also, what version of MySQL are you running?

On Sat, Nov 22, 2014 at 12:28 AM, Vineet Mishra <[email protected]> wrote:

> Hi Abe,
>
> Well with the above statement I mean to say that the data which is
> residing in mysql is different from what is been imported via sqoop.
>
> So let me shoot out an example for the same,
>
> *Data in mysql : *सुरेन्द्र कुमार पाण्डेय
> *Data in HDFS(Sqoop import) : * M-`M-$M-8M-`M-%M-
>
> So this is the kind of changes I am landing into which is completely
> loosing the meaning of the data.
>
> Any help would be appreciated.
>
> Thanks again!
>
> On Sat, Nov 22, 2014 at 2:15 AM, Abraham Elmahrek <[email protected]>
> wrote:
>
>> Hey there,
>>
>> Could you explain what you mean by "losing its meaning"? It's possible
>> you may need to set the character set:
>> http://dev.mysql.com/doc/connector-j/en/connector-j-reference-charsets.html
>> .
>>
>> -Abe
>>
>> On Fri, Nov 21, 2014 at 5:57 AM, Vineet Mishra <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I am doing a Sqoop import from mysql as source, recently I figured out
>>> that data imported through sqoop from mysql was having some special
>>> characters and even control character which was loosing its meaning while
>>> moved to sqoop data files.
>>>
>>> Looking out for a solution as how to handle this case of special
>>> character or if possible pruning the unwanted data out of my target dataset.
>>>
>>> Looking out for resolution at the earliest!
>>>
>>> Thanks!
>>>
>>
>>
>
