[ 
https://issues.apache.org/jira/browse/ATLAS-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlos Alberto Rocha Cardoso updated ATLAS-3953:
------------------------------------------------
    Description: 
The Export API returns a ZIP file containing JSON files that describe Atlas 
Entities and TypeDefs.

I am having an issue where some special chars in the JSON are replaced by "?" 
chars.

An Entity name like "Distribuição" was exported in the JSON file as 
"Distribui??o". The special chars "çã" were replaced by "??".

I tried changing the encoding of the exported JSON file and the request headers 
for the Export API, but without success.

After analyzing the Atlas source code, especially the *splitAndWriteBytes* method 
of the 
*[ZipSink|https://github.com/apache/atlas/blob/cc601d7371fae1dbc16b55d1ca84f06b745700dc/repository/src/main/java/org/apache/atlas/repository/impexp/ZipSink.java]
 class*, I suspect the problem is that *s.getBytes()* encodes the JSON string 
written to the ZIP with the platform default charset rather than *UTF-8*. 
Specifying the charset explicitly, as in *s.getBytes(StandardCharsets.UTF_8)*, 
could be a solution.
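
The suspected behavior can be reproduced outside Atlas. The sketch below is not 
the ZipSink code; the class name EncodingDemo and the helper roundTrip are 
hypothetical, and US-ASCII stands in for whatever non-UTF-8 default charset the 
JVM might be using. It encodes the entity name with a given charset and then 
reads the bytes back as UTF-8, as a consumer of the exported JSON would.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Encode with writeCharset, then decode as UTF-8 (what a JSON reader expects).
    static String roundTrip(String s, Charset writeCharset) {
        byte[] bytes = s.getBytes(writeCharset);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Distribuição", written with escapes so the source file encoding is irrelevant.
        String name = "Distribui\u00e7\u00e3o";

        // US-ASCII cannot represent ç or ã, so getBytes substitutes '?' for each.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));

        // An explicit UTF-8 charset preserves the original characters.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name));
    }
}
```

If the JVM default charset cannot represent the characters, the no-argument 
*getBytes()* would behave like the first call, which matches the "Distribui??o" 
seen in the export.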

This is my first contact with the Atlas source code, and I'm not a Java 
programmer, so this is only a guess.

I saw that it's possible to set the default encoding for the platform or the 
JVM, but as noted in the discussion below, this may not work properly in all 
situations.

[https://stackoverflow.com/questions/361975/setting-the-default-java-character-encoding]
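
To check whether the default charset is a plausible cause, it may help to print 
what the JVM running Atlas actually uses. This is a minimal diagnostic sketch, 
assuming it can be run with the same JVM and environment as the Atlas server; 
the class name DefaultCharsetCheck and the method describeDefault are 
hypothetical.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Report the charset that the no-argument String.getBytes() would use.
    static String describeDefault() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefault());
    }
}
```

If the reported charset is not UTF-8, the replacement of "çã" by "??" would be 
consistent with the guess above.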


> JSON Files from Export API with "?" char for string with special chars 
> -----------------------------------------------------------------------
>
>                 Key: ATLAS-3953
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3953
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>    Affects Versions: 2.1.0
>         Environment: Apache Atlas 2.1.0 embedded HBASE and SOLR
>            Reporter: Carlos Alberto Rocha Cardoso
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
