RE: Kafka UTF 8 encoding problem

2016-11-09 Thread Radoslaw Gruchalski
It’s rather difficult to diagnose without a minimal, reproducible example.
Either the configured encoding does not match how the data is actually encoded, or
the data really is UTF-8 but the output (what you see after reading it
from Kafka) is, for whatever reason, not being decoded or displayed as UTF-8.
Can you provide a simple unit test (a gist would be enough) with the original
data?

The solution suggested by Ali might well give you something that works.
However, what you are seeing now might be the result of an underlying
problem.
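
One way to tell the two cases apart is to look at the raw bytes instead of the rendered text. The following is a minimal sketch using only the JDK; `HexDump` and `toHex` are hypothetical names introduced for illustration, and the string is written with Unicode escapes so the source file encoding cannot interfere.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: renders bytes as lowercase hex so a payload can be
// inspected without any console or font decoding getting in the way.
final class HexDump {
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Barış" from the original report, escaped to stay ASCII-only in source.
        String original = "Bar\u0131\u015f";
        // Intact UTF-8 payload: each of the last two characters takes two bytes.
        System.out.println(toHex(original.getBytes(StandardCharsets.UTF_8)));
        // Lossy encode: ISO-8859-1 cannot represent these characters,
        // so each one is replaced by 0x3f ('?') before it ever leaves the JVM.
        System.out.println(toHex(original.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

If the bytes read back from the topic still contain the two-byte sequences, the data is intact and only the display step is at fault; if they already contain 3f, the damage happened before or during serialization.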

–
Best regards,
Radek Gruchalski
ra...@gruchalski.com



RE: Kafka UTF 8 encoding problem

2016-11-09 Thread Baris Akgun (Garanti Teknoloji)
Hi

I tried running with the parameters below, but I still face the same issue:

Properties props = new Properties();
props.put("metadata.broker.list", brokerList);
props.put("serializer.class", encoder); // "kafka.serializer.StringEncoder"
// props.put("partitioner.class", "example.producer.SimplePartitioner");
props.put("request.required.acks", "1");
props.put("key.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer.encoding", "ISO-8859-9");


My message is:

{"TW_USER_LOCATION":"Antalya,Türkiye"}

I have a problem with the "ü" character.

Is there any other idea?
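
The properties above mix two styles. As far as I can tell, `metadata.broker.list`, `serializer.class` and `request.required.acks` belong to the older Scala producer, while `key.serializer`, `value.serializer` and `value.serializer.encoding` are read by the newer Java producer, so which of them takes effect depends on which producer class is instantiated. Below is a sketch of a configuration that uses only the newer producer's keys, assuming that producer is the one in use. The class name `NewProducerConfigSketch` and the `localhost:9092` address are placeholders, and UTF-8 is chosen because that is the encoding the thread says it wants.

```java
import java.util.Properties;

// Sketch only: builds properties for the newer Java producer.
// Nothing here talks to a broker; it just assembles the configuration.
final class NewProducerConfigSketch {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        // Newer producer's counterpart to metadata.broker.list.
        props.put("bootstrap.servers", bootstrapServers);
        // Newer producer's counterpart to request.required.acks.
        props.put("acks", "1");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        // Encode message values as UTF-8 explicitly rather than relying on a default.
        props.put("value.serializer.encoding", "UTF-8");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        build("localhost:9092").list(System.out);
    }
}
```

If the older producer is the one actually being created, the `value.serializer.encoding` line is probably ignored, which would explain why changing it made no difference.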



RE: Kafka UTF 8 encoding problem

2016-11-09 Thread Baris Akgun (Garanti Teknoloji)
Hi,

@Ali, I tried base64 but it did not work.

In my original case, I collect tweets in JSON format, and the tweet text
includes Turkish characters. I will try the key.serializer.encoding property
and let you know.

Thanks,

Re: Kafka UTF 8 encoding problem

2016-11-09 Thread Radoslaw Gruchalski
Yes, understandable; however, the OP says the data is in UTF-8.
If it’s not UTF-8, it needs to be converted to UTF-8. Alternatively, consider setting
value.serializer.encoding:
https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/common/serialization/StringSerializer.java#L29
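
To make the effect of that setting concrete, here is a simplified sketch loosely modelled on how the linked serializer picks its encoding. It is not the actual Kafka source: `EncodingLookupSketch`, `resolveEncoding` and `serialize` are hypothetical names, and the fallback order (the key- or value-specific property, then `serializer.encoding`, then UTF-8) is my reading of the linked file.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of an encoding lookup; not the real StringSerializer.
final class EncodingLookupSketch {
    static String resolveEncoding(Map<String, ?> configs, boolean isKey) {
        String propertyName = isKey ? "key.serializer.encoding" : "value.serializer.encoding";
        Object value = configs.get(propertyName);
        if (value == null) {
            // Fall back to the shared setting when no specific one is given.
            value = configs.get("serializer.encoding");
        }
        // Default to UTF-8 when nothing usable is configured.
        return (value instanceof String) ? (String) value : "UTF8";
    }

    static byte[] serialize(String data, String encoding) {
        // The chosen charset decides which bytes end up in the topic.
        return data == null ? null : data.getBytes(Charset.forName(encoding));
    }

    public static void main(String[] args) {
        Map<String, Object> configs = new HashMap<>();
        System.out.println(resolveEncoding(configs, false));
        configs.put("value.serializer.encoding", "ISO-8859-9");
        String encoding = resolveEncoding(configs, false);
        System.out.println(encoding);
        // Number of bytes produced for "ü" under the configured encoding.
        System.out.println(serialize("\u00fc", encoding).length);
    }
}
```

Whatever encoding is chosen here, the consumer has to decode with the same one; the setting changes the bytes written, not how they are read back.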

–
Best regards,
Radek Gruchalski
ra...@gruchalski.com



Re: Kafka UTF 8 encoding problem

2016-11-09 Thread Ali Akhtar
It's probably not UTF-8 if it contains Turkish characters. That's why base64
encoding / decoding it might help.
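
For reference, a base64 round trip could be sketched as below, using `java.util.Base64` from the JDK (Java 8 or later). `Base64RoundTrip` is a hypothetical class name, and the choice of UTF-8 for the text-to-bytes step is an assumption. Note that the charset still has to be named on both sides: base64 only protects the bytes in transit, so text that was already damaged before encoding comes back equally damaged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: wrap the UTF-8 bytes of a message in base64 on the producer side
// and unwrap them on the consumer side.
final class Base64RoundTrip {
    static String encode(String text) {
        // The explicit charset matters: base64 works on bytes, not characters.
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    static String decode(String base64) {
        // Decode with the same charset that was used in encode().
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Bar\u0131\u015f"; // "Barış", escaped to keep the source ASCII-only
        String wire = encode(original);
        System.out.println(wire);
        System.out.println(original.equals(decode(wire)));
    }
}
```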


RE: Kafka UTF 8 encoding problem

2016-11-09 Thread Radoslaw Gruchalski
Are you sure your string is in UTF-8 in the first place?
What if you pass your string through something like:

System.out.println(new String(
    args[0].getBytes(StandardCharsets.UTF_8),
    StandardCharsets.UTF_8));
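
When running a check like this, the console itself can be the part that turns characters into question marks, because `System.out` encodes with the platform default charset. The sketch below, using only the JDK, shows what a print stream emits under two charsets. `ConsoleEncodingCheck` and `printedBytes` are hypothetical names, and US-ASCII stands in for any default charset that lacks the Turkish letters.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Sketch: shows how the charset of a PrintStream affects what gets printed.
final class ConsoleEncodingCheck {
    // Returns the bytes a PrintStream using the named charset emits for the text.
    static byte[] printedBytes(String text, String charsetName) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
            return sink.toByteArray();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // The charset System.out uses unless told otherwise.
        System.out.println("Default charset: " + Charset.defaultCharset());

        String text = "Bar\u0131\u015f"; // "Barış"
        // Number of bytes emitted when the stream encodes as UTF-8.
        System.out.println(printedBytes(text, "UTF-8").length);
        // With a charset that lacks these letters, the stream substitutes '?'.
        System.out.println(new String(printedBytes(text, "US-ASCII"), "US-ASCII"));

        // Printing through an explicitly UTF-8 stream sidesteps the default.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(text);
    }
}
```

If the default charset reported here is not UTF-8, starting the JVM with `-Dfile.encoding=UTF-8` or printing through an explicit UTF-8 stream is worth trying before suspecting Kafka.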

–
Best regards,
Radek Gruchalski
ra...@gruchalski.com



RE: Kafka UTF 8 encoding problem

2016-11-09 Thread Baris Akgun (Garanti Teknoloji)
Hi,

Producer side:

Properties props = new Properties();
props.put("metadata.broker.list", brokerList);
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("request.required.acks", "1");

Consumer side:

I am using the Spark Streaming Kafka API. I also tried the Kafka CLI and the Java Kafka
API, but I always face the same issue.

Thanks



Re: Kafka UTF 8 encoding problem

2016-11-09 Thread Radoslaw Gruchalski
Baris,

Kafka does not care about encoding; everything is transported as bytes.
What’s the configuration of your producer / consumer?
Are you using Java / the JVM?
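
Since the broker only stores bytes, the practical consequence is that the producer and the consumer must agree on the charset. A small sketch of that, using only the JDK: `CharsetAgreement` and `transport` are hypothetical names, and the example assumes the JVM ships the ISO-8859-9 charset, which common JDK builds do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch: simulates a message crossing a byte-only transport.
final class CharsetAgreement {
    static String transport(String text, Charset producerCharset, Charset consumerCharset) {
        // What the producer hands over; the transport never alters these bytes.
        byte[] onTheWire = text.getBytes(producerCharset);
        // What the consumer makes of the very same bytes.
        return new String(onTheWire, consumerCharset);
    }

    public static void main(String[] args) {
        Charset turkish = Charset.forName("ISO-8859-9");
        String text = "T\u00fcrkiye"; // "Türkiye"

        // Same charset on both ends: the text survives.
        System.out.println(transport(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(text));
        // Producer writes ISO-8859-9, consumer assumes UTF-8: the text does not survive.
        System.out.println(transport(text, turkish, StandardCharsets.UTF_8).equals(text));
    }
}
```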

–
Best regards,
Radek Gruchalski
ra...@gruchalski.com


On November 9, 2016 at 11:42:02 AM, Baris Akgun (Garanti Teknoloji) (
barisa...@garanti.com.tr) wrote:

Hi All,

We are using Kafka 0.9.0.0 and we want to send our messages to a topic in
UTF-8 format, but when we consume the messages from the topic we see that Kafka
does not preserve the original UTF-8 encoding, so the messages do not come out
exactly as sent.


For example, our message that includes Turkish characters is "Barış", but
when we consume it we see "Bar??". How can we solve that problem? Is there
any way to set the Kafka topic encoding?

Thanks

Barış

This message and attachments are confidential and intended solely for the
individual(s) stated in this message. If you received this message although
you are not the addressee, you are responsible to keep the message
confidential. The sender has no responsibility for the accuracy or
correctness of the information in the message and its attachments. Our
company shall have no liability for any changes or late receiving, loss of
integrity and confidentiality, viruses and any damages caused in anyway to
your computer system.


Re: Kafka UTF 8 encoding problem

2016-11-09 Thread Ali Akhtar
I would recommend base64 encoding the message on the producer side, and
decoding it on the consumer side.
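
A minimal sketch of what I mean, assuming Java 8 or newer for java.util.Base64.
The class name Base64Payload and its two methods are made up for this example.
Note that the charset is still named explicitly when converting to and from
bytes; base64 only guarantees that the value sent through Kafka is plain
ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Payload {

    // Producer side: UTF-8 bytes of the text, wrapped as an ASCII-only string.
    static String encode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(utf8);
    }

    // Consumer side: unwrap the base64 string and decode the bytes as UTF-8.
    static String decode(String base64) {
        byte[] utf8 = Base64.getDecoder().decode(base64);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Barış" written with Unicode escapes so the source file encoding
        // does not matter.
        String original = "Bar\u0131\u015f";
        String wire = encode(original);   // value to hand to the producer
        String restored = decode(wire);   // value recovered by the consumer
        System.out.println(wire);
        System.out.println(restored.equals(original));
    }
}
```

This works around the problem rather than explaining it, and the encoded
payload is roughly a third larger than the raw bytes.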

On Wed, Nov 9, 2016 at 3:40 PM, Baris Akgun (Garanti Teknoloji) <
barisa...@garanti.com.tr> wrote:

> Hi All,
>
> We are using Kafka 0.9.0.0 and want to send our messages to a topic in
> UTF-8, but when we consume the messages from the topic, Kafka does not seem
> to keep the original UTF-8 encoding and the messages do not come back
> exactly as sent.
>
>
> For example, our message containing Turkish characters is "Barış", but
> when we consume it we see "Bar??". How can we solve this problem? Is there
> any way to set the Kafka topic encoding?
>
> Thanks
>
> Barış


Kafka UTF 8 encoding problem

2016-11-09 Thread Baris Akgun (Garanti Teknoloji)
Hi All,

We are using Kafka 0.9.0.0 and want to send our messages to a topic in UTF-8,
but when we consume the messages from the topic, Kafka does not seem to keep
the original UTF-8 encoding and the messages do not come back exactly as sent.


For example, our message containing Turkish characters is "Barış", but when we
consume it we see "Bar??". How can we solve this problem? Is there any way to
set the Kafka topic encoding?

Thanks

Barış