[ 
https://issues.apache.org/jira/browse/SPARK-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074445#comment-15074445
 ] 

Cazen Lee commented on SPARK-11745:
-----------------------------------

Good day [~rxin], this is Cazen.

Sorry to bother you with a question, but could you let me know why the 
ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER option was left unsupported?

I recently created JIRA issue SPARK-12537 to add support for it, and I wonder 
whether there is a reason the three options you listed were left disabled.

Thank you in advance!

> Enable more JSON parsing options for parsing non-standard JSON files
> --------------------------------------------------------------------
>
>                 Key: SPARK-11745
>                 URL: https://issues.apache.org/jira/browse/SPARK-11745
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>              Labels: releasenotes
>             Fix For: 1.6.0
>
>
> As a user, I want to be able to read non-standard JSON files. Jackson itself 
> includes a few options that we should allow users to specify:
> - ALLOW_COMMENTS
> - ALLOW_UNQUOTED_FIELD_NAMES
> - ALLOW_SINGLE_QUOTES
> - ALLOW_NUMERIC_LEADING_ZEROS
> - ALLOW_NON_NUMERIC_NUMBERS
> After this change, the following options are still unsupported:
> - ALLOW_YAML_COMMENTS
> - ALLOW_UNQUOTED_CONTROL_CHARS
> - ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER
> See the Jackson source code pasted below for the definition of these config 
> options:
> {code}
>         /**
>          * Feature that determines whether parser will allow use
>          * of Java/C++ style comments (both '/'+'*' and
>          * '//' varieties) within parsed content or not.
>          *<p>
>          * Since JSON specification does not mention comments as legal
>          * construct,
>          * this is a non-standard feature; however, in the wild
>          * this is extensively used. As such, feature is
>          * <b>disabled by default</b> for parsers and must be
>          * explicitly enabled.
>          */
>         ALLOW_COMMENTS(false),
>         /**
>          * Feature that determines whether parser will allow use
>          * of YAML comments, ones starting with '#' and continuing
>          * until the end of the line. This commenting style is common
>          * with scripting languages as well.
>          *<p>
>          * Since JSON specification does not mention comments as legal
>          * construct,
>          * this is a non-standard feature. As such, feature is
>          * <b>disabled by default</b> for parsers and must be
>          * explicitly enabled.
>          */
>         ALLOW_YAML_COMMENTS(false),
>         
>         /**
>          * Feature that determines whether parser will allow use
>          * of unquoted field names (which is allowed by Javascript,
>          * but not by JSON specification).
>          *<p>
>          * Since JSON specification requires use of double quotes for
>          * field names,
>          * this is a non-standard feature, and as such disabled by default.
>          */
>         ALLOW_UNQUOTED_FIELD_NAMES(false),
>         /**
>          * Feature that determines whether parser will allow use
>          * of single quotes (apostrophe, character '\'') for
>          * quoting Strings (names and String values). If so,
>          * this is in addition to other acceptable markers.
>          * but not by JSON specification).
>          *<p>
>          * Since JSON specification requires use of double quotes for
>          * field names,
>          * this is a non-standard feature, and as such disabled by default.
>          */
>         ALLOW_SINGLE_QUOTES(false),
>         /**
>          * Feature that determines whether parser will allow
>          * JSON Strings to contain unquoted control characters
>          * (ASCII characters with value less than 32, including
>          * tab and line feed characters) or not.
>          * If feature is set false, an exception is thrown if such a
>          * character is encountered.
>          *<p>
>          * Since JSON specification requires quoting for all control characters,
>          * this is a non-standard feature, and as such disabled by default.
>          */
>         ALLOW_UNQUOTED_CONTROL_CHARS(false),
>         /**
>          * Feature that can be enabled to accept quoting of all character
>          * using backslash quoting mechanism: if not enabled, only characters
>          * that are explicitly listed by JSON specification can be thus
>          * escaped (see JSON spec for small list of these characters)
>          *<p>
>          * Since JSON specification requires quoting for all control characters,
>          * this is a non-standard feature, and as such disabled by default.
>          */
>         ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER(false),
>         /**
>          * Feature that determines whether parser will allow
>          * JSON integral numbers to start with additional (ignorable) 
>          * zeroes (like: 000001). If enabled, no exception is thrown, and extra
>          * nulls are silently ignored (and not included in textual representation
>          * exposed via {@link JsonParser#getText}).
>          *<p>
>          * Since JSON specification does not allow leading zeroes,
>          * this is a non-standard feature, and as such disabled by default.
>          */
>         ALLOW_NUMERIC_LEADING_ZEROS(false),
>         
>         /**
>          * Feature that allows parser to recognize set of
>          * "Not-a-Number" (NaN) tokens as legal floating number
>          * values (similar to how many other data formats and
>          * programming language source code allows it).
>          * Specific subset contains values that
>          * <a href="http://www.w3.org/TR/xmlschema-2/">XML Schema</a>
>          * (see section 3.2.4.1, Lexical Representation)
>          * allows (tokens are quoted contents, not including quotes):
>          *<ul>
>          *  <li>"INF" (for positive infinity), as well as alias of "Infinity"
>          *  <li>"-INF" (for negative infinity), alias "-Infinity"
>          *  <li>"NaN" (for other not-a-numbers, like result of division by zero)
>          *</ul>
>          *<p>
>          * Since JSON specification does not allow use of such values,
>          * this is a non-standard feature, and as such disabled by default.
>          */
>          ALLOW_NON_NUMERIC_NUMBERS(false),
> {code}
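
For readers unfamiliar with what ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER changes, the difference can be sketched without Jackson at all. The class below is a hypothetical, dependency-free illustration and not Jackson's or Spark's implementation: `EscapeSketch`, `unescape`, and the `allowAnyEscape` flag are names invented for this example, and the `\uXXXX` escape form is left out to keep it short. In strict mode only the escapes listed by the JSON specification are accepted; in lenient mode any other escaped character is passed through as itself.

```java
class EscapeSketch {
    /**
     * Hypothetical helper: resolves backslash escapes in the body of a JSON string.
     * When allowAnyEscape is false, only the single-character escapes from the JSON
     * specification are accepted (the unicode form is omitted in this sketch).
     * When it is true, an unrecognized escaped character is kept as-is.
     */
    static String unescape(String raw, boolean allowAnyEscape) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i + 1 >= raw.length()) {
                throw new IllegalArgumentException("Dangling backslash at end of input");
            }
            char next = raw.charAt(++i);
            switch (next) {
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                case '/':  out.append('/');  break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 'n':  out.append('\n'); break;
                case 'r':  out.append('\r'); break;
                case 't':  out.append('\t'); break;
                default:
                    if (!allowAnyEscape) {
                        // Strict mode: reject escapes the JSON specification does not list.
                        throw new IllegalArgumentException("Unrecognized escape: \\" + next);
                    }
                    // Lenient mode: keep the escaped character itself.
                    out.append(next);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input contains a backslash followed by an apostrophe, which strict JSON rejects.
        System.out.println(unescape("it\\'s", true));
    }
}
```

Calling `unescape("it\\'s", false)` on the same input throws an `IllegalArgumentException`, which mirrors how a strict parser would reject such a record.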


