I wrote a test program in Java to output Chinese characters from a Hive table:
Test.java:
import java.util.Scanner;

public class Test {
    public static void main( String[] args ) {
        // Echo each line from stdin to stdout (uses the JVM default charset)
        Scanner scan = new Scanner( System.in );
        while( scan.hasNext() ){
            String s = scan.nextLine();
            System.out.println( s );
        }
    }
}
Then I ran:
hive:>add file Test.jar;
hive:>create table test(zh string);
hive:>load data local inpath 'test.txt' into table test;
hive:>from test select transform( zh ) using 'java -jar Test.jar';

The output looks like this:
?????? ?????? ?????? ?????? ?????? ??????
??????
?????? VANCL ??????

If I run:
hive:>select * from test;
or
$ cat test.txt | java -jar Test.jar
both outputs are correct:
官方网 指环王 指环 中文 官方网站 官方
农场
桌面 VANCL 新闻
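
Since the same jar behaves differently from the shell and under Hive, one thing that may be worth checking is whether the JVM picks up a different default charset in the two environments. The following is only a diagnostic sketch: the class name CharsetCheck is hypothetical and not part of Test.jar, and it assumes the transform process inherits its locale from the Hive task environment.

```java
import java.io.IOException;
import java.nio.charset.Charset;

// Hypothetical diagnostic class; not part of the original Test.jar.
class CharsetCheck {
    // Describe the charset settings the running JVM actually picked up.
    static String describe() {
        return "defaultCharset=" + Charset.defaultCharset().name()
             + " file.encoding=" + System.getProperty("file.encoding")
             + " LANG=" + System.getenv("LANG");
    }

    public static void main(String[] args) throws IOException {
        // Drain stdin so the caller (shell pipe or Hive transform)
        // can finish writing its rows before we exit.
        while (System.in.read() != -1) {
            // discard input
        }
        // Under a Hive transform, stdout becomes the result rows,
        // so this line should show up in the query output.
        System.out.println(describe());
    }
}
```

Running it once through the shell pipe and once through transform, and comparing the two lines, would show whether the environments differ.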

I have tried a Python script instead, and there is no problem. But I have to
use Java for some reason.

I'm sure my Java source file is encoded in UTF-8.
So it seems the problem is neither the jar file nor Hive. If so, what is the
cause?
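
For reference, a variant of the program that does not depend on the JVM default charset could look like the sketch below. It assumes the table data and the desired output are both UTF-8; the class name Utf8Echo and the helper copyLines are hypothetical names chosen for illustration. Whether this removes the question marks depends on the default charset actually being the cause, which is not confirmed here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical variant of Test that names UTF-8 explicitly on both sides.
class Utf8Echo {
    // Copy lines from in to out, decoding and encoding as UTF-8
    // instead of relying on the platform default charset.
    static void copyLines(InputStream in, OutputStream out) throws IOException {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        PrintStream writer = new PrintStream(out, true, "UTF-8");
        String line;
        while ((line = reader.readLine()) != null) {
            writer.println(line);
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        copyLines(System.in, System.out);
    }
}
```

An alternative that leaves the code unchanged is to pass -Dfile.encoding=UTF-8 on the java command line, although how that property is honored can vary between JVM versions.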
Thanks.
