[jira] [Comment Edited] (GROOVY-11314) JsonOutput Pretty Print always escapes characters

Denis Jakupovic (Jira) Fri, 16 Feb 2024 07:26:04 -0800


    [ 
https://issues.apache.org/jira/browse/GROOVY-11314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817999#comment-17817999
 ]


Denis Jakupovic edited comment on GROOVY-11314 at 2/16/24 3:25 PM:
-------------------------------------------------------------------

 
{code:java}
import static groovy.json.JsonGenerator.Options
import static groovy.json.JsonOutput.*


def json1 = [name:"Jerry"] //json map
def json2 = [name:"Järry"] //json map with unicode character
def json001 = '\1'
def json002 = 'ä'
def json127 = '\177' // 177 octal = 127 decimal


// JsonOutput always does escaping except for explicitly "unescaped" content
assert toJson(json001) == '"\\u0001"'
assert toJson(json002) == '"\\u00e4"'
assert toJson(json127) == '"\\u007f"'
assert toJson(json2) == '{"name":"J\\u00e4rry"}' //correct in the current approach
assert toJson(unescaped(json001)) == '\1'
assert toJson(unescaped(json002)) == 'ä'
assert toJson(unescaped(json127)) == '\177'
assert toJson(unescaped(json1)) == '{"name":"Jerry"}' //not possible due to LinkedHashMap input, class cannot parse LinkedHashMaps
assert toJson(unescaped(json2)) == '{"name":"Järry"}' //same

// DefaultJsonGenerator also escapes by default
// --> correct, per default it escapes but it shouldn't imo
def escaping = new Options().build()
assert escaping.toJson(json001) == '"\\u0001"'
assert escaping.toJson(json127) == '"\\u007f"'

// Unicode escaping can be disabled for DefaultJsonGenerator
// for chars decimal 127 & above, but is always done for
// ctrl chars below 32 & double quote & backslash
def nonEscaping = new Options().disableUnicodeEscaping().build()
assert nonEscaping.toJson(json001) == '"\\u0001"'
assert nonEscaping.toJson(json127) == '"\177"' {code}
Hi Paul,

 

thank you for your help. I would actually expect JsonOutput not to escape by 
default: if the input is unescaped, why should the output be escaped? The code 
you provided calls toJson and unescaped, but unescaped cannot work with Maps or 
Collections, which are typically what JsonSlurper returns. 

However, yes, we can set disableUnicodeEscaping on DefaultJsonGenerator and 
JsonBuilder, but still not on JsonOutput, so there is no way to call 
prettyPrint and get unescaped characters.
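
The escaping rule noted in the code comments above (control characters below 32, double quote and backslash are always escaped; characters from decimal 127 upward are escaped unless disableUnicodeEscaping is set) can be sketched in plain Java. This is only an illustration of that rule, not Groovy's CharBuf implementation: the class EscapeSketch and the method toJsonString are hypothetical names, and the sketch deliberately ignores short escapes such as the one for newline.

```java
// Hypothetical stand-in for illustration only; NOT the groovy.json code.
final class EscapeSketch {

    // Returns the quoted JSON form of text, following the rule from the comment:
    // quote, backslash and control chars are always escaped; chars at or above
    // 127 are escaped only while disableUnicodeEscaping is false.
    static String toJsonString(String text, boolean disableUnicodeEscaping) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);                      // always escaped
            } else if (c < 32) {
                out.append(String.format("\\u%04x", (int) c));   // control chars: always escaped
            } else if (c >= 127 && !disableUnicodeEscaping) {
                out.append(String.format("\\u%04x", (int) c));   // escaped unless disabled
            } else {
                out.append(c);                                   // copied as-is
            }
        }
        return out.append('"').toString();
    }
}
```

With the flag set to false this mirrors the escaping and nonEscaping assertions for '\1' and '\177' shown earlier; with it set to true, a name such as Järry is emitted unchanged inside the quotes.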

JsonOutput: 
    static final JsonGenerator DEFAULT_GENERATOR = new DefaultJsonGenerator(new JsonGenerator.Options());

There should be a constructor in JsonOutput to override the default Options.
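
To make the wish concrete, here is a rough sketch of a pretty printer that only re-indents an already generated JSON string and copies string tokens verbatim, so whatever escaping the generator chose is preserved. Everything in it is an assumption made for illustration: VerbatimPrettyPrinter is a hypothetical class, not part of groovy.json, the four-space indent is arbitrary, and the input is assumed to be valid JSON (nothing is validated).

```java
// Hypothetical helper for illustration; not part of groovy.json.
final class VerbatimPrettyPrinter {

    // Re-indents valid JSON text. Characters inside string tokens are copied
    // unchanged, so no escaping is added or removed.
    static String prettyPrint(String json) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c);
                if (c == '\\' && i + 1 < json.length()) {
                    out.append(json.charAt(++i));   // keep the escaped char with its backslash
                } else if (c == '"') {
                    inString = false;               // closing quote of the token
                }
                continue;
            }
            switch (c) {
                case '"':
                    inString = true;
                    out.append(c);
                    break;
                case '{':
                case '[': {
                    out.append(c);
                    int next = skipWhitespace(json, i + 1);
                    if (next < json.length()
                            && (json.charAt(next) == '}' || json.charAt(next) == ']')) {
                        out.append(json.charAt(next)); // empty container stays on one line
                        i = next;
                    } else {
                        newline(out, ++depth);
                    }
                    break;
                }
                case '}':
                case ']':
                    newline(out, --depth);
                    out.append(c);
                    break;
                case ',':
                    out.append(c);
                    newline(out, depth);
                    break;
                case ':':
                    out.append(": ");
                    break;
                default:
                    if (!Character.isWhitespace(c)) {
                        out.append(c);              // digits, true/false/null, signs
                    }
            }
        }
        return out.toString();
    }

    private static int skipWhitespace(String s, int from) {
        int i = from;
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }

    private static void newline(StringBuilder out, int depth) {
        out.append('\n');
        for (int i = 0; i < depth; i++) {
            out.append("    ");
        }
    }
}
```

Fed the output of a generator built with disableUnicodeEscaping, such a printer would keep Järry as written instead of turning it back into an escape sequence.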
 



> JsonOutput Pretty Print always escapes characters
> -------------------------------------------------
>
>                 Key: GROOVY-11314
>                 URL: https://issues.apache.org/jira/browse/GROOVY-11314
>             Project: Groovy
>          Issue Type: Wish
>          Components: JSON
>    Affects Versions: 2.5.23, 3.0.20, 4.0.18
>            Reporter: Denis Jakupovic
>            Priority: Major
>
> Hi,
> the groovy.json package is widely used.
> [https://github.com/apache/groovy/blob/master/subprojects/groovy-json/src/main/java/groovy/json/JsonOutput.java]
>  
> If we use the toPrettyString function the json is being escaped. 
> {code:java}
> // JsonBuilder class:
> public String toPrettyString() {
>   return JsonOutput.prettyPrint(toString());
> } {code}
> However, we can construct a JsonBuilder with a JsonGenerator that has 
> disableUnicodeEscaping set and use new JsonBuilder(content, generator).toString() 
> to create a proper JSON string representation. This is not possible with 
> JsonOutput, though, because there is a final constructor and it uses a default 
> generator. However, disableUnicodeEscaping is true by default, but the 
> JsonOutput class does not handle it properly. It would be great if JsonOutput 
> had the same feature, so that it could be constructed with a custom 
> JsonGenerator. JsonOutput uses the DefaultJsonGenerator with unicode escaping 
> enabled through Options, but the toJson() and prettyPrint methods do not 
> handle the escaping properly. 
> [https://github.com/apache/groovy/blob/GROOVY_3_0_X/subprojects/groovy-json/src/main/java/groovy/json/JsonOutput.java#L209|https://github.com/apache/groovy/blob/GROOVY_3_0_X/subprojects/groovy-json/src/main/java/groovy/json/JsonOutput.java#L162]
> {code:java}
> case STRING:
>   String textStr = token.getText();
>   String textWithoutQuotes = textStr.substring(1, textStr.length() - 1);
>   if (textWithoutQuotes.length() > 0) {
>     output.addJsonEscapedString(textWithoutQuotes);
>   } else {
>     output.addQuoted(Chr.array());
>   }
>   break; {code}
> And here: 
> [https://github.com/apache/groovy/blob/master/subprojects/groovy-json/src/main/java/org/apache/groovy/json/internal/CharBuf.java#L379]
> {code:java}
> public final CharBuf addJsonEscapedString(final char[] charArray, boolean disableUnicodeEscaping) {
>   if (charArray.length == 0) return this;
>   if (hasAnyJSONControlChars(charArray, disableUnicodeEscaping)) {
>     return doAddJsonEscapedString(charArray, disableUnicodeEscaping);
>   } else {
>     return this.addQuoted(charArray);
>   }
> } {code}
> *If JsonBuilder can be constructed with a JsonGenerator, JsonOutput should 
> be constructible with one as well, and the prettyPrint and toJson functions 
> should not add escaped strings.* *The bug is in JsonOutput, in the CharBuf 
> call.*
> This has to be fixed in the prettyPrint method: 
> {code:java}
> case STRING:
>    String textStr = token.getText();
>    String textWithoutQuotes = textStr.substring(1, textStr.length() - 1);
>    if (textWithoutQuotes.length() > 0) {
>      output.addJsonEscapedString(textWithoutQuotes, disableUnicodeEscaping);
>    } else {
>      output.addQuoted(Chr.array());
>    } 
>    break; {code}
> output.addJsonEscapedString(textWithoutQuotes, disableUnicodeEscaping) should 
> be called with disableUnicodeEscaping. 
> Currently there is no way to prettyPrint JSON with the groovy.json classes 
> without getting escaped characters in the generated JSON. 
> Best
> Denis



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
