[ https://issues.apache.org/jira/browse/DRILL-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781429#comment-17781429 ]
ASF GitHub Bot commented on DRILL-8457:
---------------------------------------

cgivre merged PR #2840:
URL: https://github.com/apache/drill/pull/2840

> Allow configuring csv parser in http storage plugin configuration
> -----------------------------------------------------------------
>
> Key: DRILL-8457
> URL: https://issues.apache.org/jira/browse/DRILL-8457
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - HTTP
> Affects Versions: Future
> Reporter: Zbigniew Tomanek
> Priority: Minor
> Fix For: Future
>
>
> Currently there is no way to configure the CSV parser when the HTTP plugin
> is used. Because of that, some files cannot be parsed (e.g. when any column
> has more than 4096 characters or the file uses a delimiter other than `,`).
> Since we use the HTTP plugin quite often at DataWalk, we've changed our
> internal fork of Drill so that the following parser/format properties can be
> configured using an additional `csvOptions` field:
>
> {code:json}
> {
>   "csvOptions": {
>     "delimiter": "\t",
>     "quote": "\"",
>     "quote_escape": "\"",
>     "line_separator": "\n",
>     "header_extraction_enabled": null,
>     "number_of_rows_to_skip": 0,
>     "number_of_records_to_read": -1,
>     "line_separator_detection_enabled": true,
>     "max_columns": 512,
>     "max_chars_per_column": 4096,
>     "skip_empty_lines": true,
>     "ignore_leading_whitespaces": true,
>     "ignore_trailing_whitespaces": true,
>     "null_value": null
>   }
> }{code}
> I'd be glad to get feedback on whether creating a PR with these changes
> would bring any value to Drill.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
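
As a rough illustration of the two failure cases described in the issue (a delimiter other than `,` and a column longer than 4096 characters), here is a minimal Python sketch. It uses Python's standard csv module as a stand-in and is not Drill's parser; the `parse` helper, its parameter names, and the sample data are hypothetical and only mirror the `delimiter` and `max_chars_per_column` options from the quoted `csvOptions` example.

```python
import csv
import io

# Hypothetical sample: tab-delimited data with one 5000-character field,
# longer than the 4096-character limit mentioned in the issue.
long_value = "x" * 5000
data = "id\tpayload\n1\t" + long_value + "\n"


def parse(text, delimiter=",", max_chars_per_column=4096):
    """Parse text with Python's csv module under a per-field size limit.

    This only imitates the effect of the `delimiter` and
    `max_chars_per_column` options; it is not how Drill implements them.
    """
    # csv.field_size_limit sets a module-wide limit and returns the old one.
    previous = csv.field_size_limit(max_chars_per_column)
    try:
        return list(csv.reader(io.StringIO(text), delimiter=delimiter))
    finally:
        csv.field_size_limit(previous)  # restore the previous global limit


# With fixed defaults the long field is rejected.
try:
    parse(data, delimiter="\t")
except csv.Error as exc:
    print("default limit failed:", exc)

# With a configurable delimiter and a larger limit the same data parses.
rows = parse(data, delimiter="\t", max_chars_per_column=8192)
print(rows[0], len(rows[1][1]))
```

The point of the sketch is only that both settings have to be adjustable per source: with the defaults hard-coded, the same input cannot be read at all.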