We ran into an encoding issue when handling Chinese characters.
The producer sends the characters in the correct encoding (UTF-8), but by
the time the consumer receives them they have all turned into question
marks: ????
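The symptom itself can be reproduced without Kafka. On the JVM, encoding a string with a charset that cannot represent Chinese characters replaces each of them with the byte 0x3F, which is '?'. The sketch below only illustrates that behaviour. We have not confirmed that this is what happens inside our pipeline, and the object name and sample string are made up for illustration.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical demo object, not part of our producer or consumer code.
object LossyEncodingDemo {
  // "\u4e2d\u6587" is the two-character string meaning "Chinese"; escapes are
  // used so the example does not depend on the source file encoding.
  val sample: String = "\u4e2d\u6587"

  // Render a byte array as space-separated lowercase hex.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString(" ")

  def main(args: Array[String]): Unit = {
    // UTF-8 keeps the characters: three bytes each.
    println(hex(sample.getBytes(StandardCharsets.UTF_8)))    // e4 b8 ad e6 96 87
    // US-ASCII cannot represent them: each becomes 0x3f ('?').
    println(hex(sample.getBytes(StandardCharsets.US_ASCII))) // 3f 3f
  }
}
```

Once the bytes are 0x3F the original characters are gone, so no setting on the reading side can bring them back.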
When starting the producer, the Kafka broker, and the consumer, we tried
specifying -Dfile.encoding=UTF-8, but it made no difference.
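To check whether the flag actually reaches each JVM, a small check like the one below could be run with the same startup options as the producer and the consumer. This is only a sketch: CharsetCheck is a made-up name, and it reports what the JVM sees, not what Kafka itself uses.

```scala
import java.nio.charset.Charset

// Hypothetical helper for inspecting the JVM's charset settings.
object CharsetCheck {
  // Build a one-line report of the system property and the effective default charset.
  def report(): String = {
    val prop = System.getProperty("file.encoding")
    val default = Charset.defaultCharset().name()
    s"file.encoding=$prop, defaultCharset=$default"
  }

  def main(args: Array[String]): Unit =
    println(report())
}
```

If the default charset reported by either process is not UTF-8, then any code in that process that calls getBytes or new String without naming a charset would be a suspect.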
In the producer we use StringEncoder. Here is the relevant snippet:

    val props = new Properties()
    ...
    // serialize message payloads with Kafka's built-in string encoder
    props.put("serializer.class", "kafka.serializer.StringEncoder")
    props.put("compression.codec", "1") // gzip
    val producerConfig = new ProducerConfig(props)
    val producer = new Producer[String, String](producerConfig)
    val data = new ProducerData[String, String](topic, partitionKey,
      List("string_to_send_to_broker"))
    producer.send(data)
And the consumer:

    val topicMessageStreams =
      consumerConnector.createMessageStreams(Predef.Map(topic -> consumers),
        new StringDecoder)
    // start one processing thread per stream
    for ((topic, streamList) <- topicMessageStreams) {
      for (stream <- streamList) {
        val processor = new StreamProcessor(stream)
        new Thread(processor).start()
      }
    }
The StreamProcessor just iterates over its stream:

    // by this point the Chinese characters in message have turned into ?????
    val message = iterator.next.message
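One thing we are considering is to stop relying on the platform default and name UTF-8 explicitly on both sides. The sketch below shows only the JVM part of that idea. Utf8Payload and its methods are hypothetical names, and the assumption that the payload is available as a ByteBuffer is ours; how this would be wired into a Kafka encoder or decoder is not shown.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

// Hypothetical helper: encode and decode with an explicit charset.
object Utf8Payload {
  // Encode the string as UTF-8 regardless of the JVM default charset.
  def encode(s: String): ByteBuffer =
    ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8))

  // Decode the remaining bytes as UTF-8 without disturbing the caller's buffer position.
  def decode(payload: ByteBuffer): String = {
    val view = payload.duplicate()
    val bytes = new Array[Byte](view.remaining())
    view.get(bytes)
    new String(bytes, StandardCharsets.UTF_8)
  }

  def main(args: Array[String]): Unit = {
    val original = "\u4e2d\u6587"
    val roundTripped = decode(encode(original))
    println(roundTripped == original) // true
  }
}
```

This only helps if both the writing and the reading side use it; decoding as UTF-8 cannot repair bytes that were already written as '?'.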
Can anyone help?
--
Best Regards
----------------------
刘明敏 | mmLiu