Hi all,

I'm using LogstreamerInput to collect some application logs and send them to Elasticsearch. The log files are read as Latin-1, but I know the files contain Chinese text encoded as GBK or GB2312. The problem is that when I output the logs, there are many garbled characters in the LogOutput screen and in Kibana. I notice that there's no config entry for specifying the charset with which to read the log content.
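For what it's worth, the garbled output is consistent with GBK/GB2312 bytes being decoded as a single-byte charset such as Latin-1 instead of as GBK. A minimal Python sketch (not Heka code, just an illustration of the round trip) reproduces the symptom and shows the original text is recoverable from the mojibake:

```python
# GBK bytes for the Chinese word meaning "Info" (信息).
raw = "信息".encode("gbk")          # b'\xd0\xc5\xcf\xa2'

# Decoding those bytes as Latin-1 yields the mojibake seen in Kibana.
garbled = raw.decode("latin-1")
print(garbled)                      # ÐÅÏ¢

# Latin-1 maps bytes 1:1 onto code points, so the original bytes
# survive and can be re-decoded with the correct codec.
recovered = garbled.encode("latin-1").decode("gbk")
print(recovered)                    # 信息
```

So a workaround could be to re-encode downstream, but ideally the input would decode GBK directly.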
1. Is UTF-8 the default encoding in LogstreamerInput? I can't find it in the source code.
2. Is there a way to solve this problem?

--------The LogstreamerInput config toml--------

[server1]
type = "LogstreamerInput"
splitter = "rule_splitter"
log_directory = "/home/wolfias/heka/rulelog"
file_match = 'nohup\.log\.(?P<Year>\d+)-(?P<Month>\d+)-(?P<Day>\d+)_(?P<Hour>\d+)'
priority = ["Year","Month","Day","Hour"]

[server1.translation.year]
missing = 9999

[rule_splitter]
type = "RegexSplitter"
delimiter = '########'
delimiter_eol = true

[PayloadEncoder]

[rule_KafkaOutput]
type = "KafkaOutput"
message_matcher = "TRUE"
topic = "rulelog"
addrs = ["10.63.204.70:9092"]
encoder = "PayloadEncoder"

[LogOutput]
message_matcher = "TRUE"
encoder = "PayloadEncoder"

--------nohup log files opened using GB2312--------

2015-12-28 15:46:37 信息: ########
2015-12-28 15:46:37 信息: 开始时间戳:1451288797063
2015-12-28 15:46:37 信息: 服务端Servet调用提示->BomRuleServet->进入servet程序!

(信息 = "Info"; the second line reads "start timestamp: 1451288797063", the third "server-side Servet call notice -> BomRuleServet -> entered servet program!")

--------the same nohup log files opened using utf-8--------

2015-12-28 15:46:37 ÐÅÏ¢: ########
2015-12-28 15:46:37 ÐÅÏ¢: ¿ªÊ¼Ê±¼ä´Á:1451288797063
2015-12-28 15:46:37 ÐÅÏ¢: ·þÎñ¶ËServetµ÷ÓÃÌáʾ->BomRuleServet->½øÈëservet³ÌÐò£¡

_______________________________________________
Heka mailing list
Heka@mozilla.org
https://mail.mozilla.org/listinfo/heka