[ 
https://issues.apache.org/jira/browse/FLINK-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14244181#comment-14244181
 ] 

ASF GitHub Bot commented on FLINK-1318:
---------------------------------------

GitHub user fhueske opened a pull request:

    https://github.com/apache/incubator-flink/pull/265

    [FLINK-1318] CsvInputFormat: Made quoted string parsing optional with 
configurable quote character. Simplified parsing

    - Parsing of quoted strings is disabled by default
    - When enabling it, a quoting character needs to be specified
    - If quoted parsing is enabled, Strings are parsed as quoted if the first 
character is the quoting character (leading and trailing whitespace characters 
are NOT ignored)
    - If quoted parsing is enabled and the first character is NOT the quoting 
character, Strings are treated as unquoted
    - Quoted parsing fails if 1) the last character of the field is NOT the 
quote character (trailing characters), or 2) the closing quote character is 
missing
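
The rules listed above can be illustrated with a minimal sketch. This is not the code from the pull request: the class name `QuotedFieldSketch`, the method `splitLine`, and its parameters are hypothetical, and the sketch assumes a single-character field delimiter and String fields only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the rules above; not the CsvInputFormat code. */
public class QuotedFieldSketch {

    /**
     * Splits one line into String fields.
     * Assumption: single-character delimiter; quotedParsing=false models the default.
     */
    public static List<String> splitLine(String line, char delimiter,
                                         boolean quotedParsing, char quoteChar) {
        List<String> fields = new ArrayList<>();
        int pos = 0;
        while (pos <= line.length()) {
            boolean startsWithQuote = quotedParsing
                    && pos < line.length()
                    && line.charAt(pos) == quoteChar;
            if (startsWithQuote) {
                // The first character is the quote character: parse as quoted.
                int close = line.indexOf(quoteChar, pos + 1);
                if (close < 0) {
                    throw new IllegalArgumentException("Missing closing quote character");
                }
                int next = close + 1;
                // The closing quote must be the last character of the field.
                if (next < line.length() && line.charAt(next) != delimiter) {
                    throw new IllegalArgumentException("Trailing characters after closing quote");
                }
                fields.add(line.substring(pos + 1, close));
                pos = next + 1; // skip the delimiter, or step past the end of the line
            } else {
                // Quoting disabled, or the first character is not the quote character:
                // the field is unquoted and runs up to the next delimiter.
                int end = line.indexOf(delimiter, pos);
                if (end < 0) {
                    end = line.length();
                }
                fields.add(line.substring(pos, end));
                pos = end + 1;
            }
        }
        return fields;
    }
}
```

In this sketch a leading space before the quote character makes the field unquoted, and a doubled quote character inside a quoted field is rejected as trailing characters rather than treated as an escape.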
    
    This mode of operation differs from the previous implementation:
    - Leading and trailing whitespace characters were ignored in case of quoted 
strings, which would have caused problems if whitespace characters were used as 
field delimiters.
    - Double quote characters could be used to escape quotes in quoted strings
    
    This pull request builds on PR #264 (only the last commit is valid)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/incubator-flink quotedStringParsing

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-flink/pull/265.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #265
    
----
commit be532817e00fa03050530e5995a3675740eb070d
Author: Fabian Hueske <[email protected]>
Date:   2014-10-20T13:18:20Z

    [FLINK-1168] Added support for multi-char delimiters.
    This commit includes parts of Cbro's pull request and subsumes PR #247
    
    This closes #247

commit 95267269c694c16ee89191dc28c386f3165be432
Author: Fabian Hueske <[email protected]>
Date:   2014-12-12T13:04:17Z

    [FLINK-1318] Simplified quoted string parsing, made it optional, and use a 
configurable quote character

----


> Make quoted String parsing optional and configurable for CSVInputFormats
> ------------------------------------------------------------------------
>
>                 Key: FLINK-1318
>                 URL: https://issues.apache.org/jira/browse/FLINK-1318
>             Project: Flink
>          Issue Type: Improvement
>          Components: Java API, Scala API
>    Affects Versions: 0.8-incubating
>            Reporter: Fabian Hueske
>            Assignee: Fabian Hueske
>            Priority: Minor
>
> With the current implementation of the CSVInputFormat, quoted string parsing 
> kicks in if the first non-whitespace character of a field is a double quote. 
> I see two issues with this implementation:
> 1. Quoted String parsing cannot be disabled
> 2. The quoting character is fixed to double quotes (")
> I propose to add parameters to disable quoted String parsing and set the 
> quote character.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
