morningman commented on code in PR #16055:
URL: https://github.com/apache/doris/pull/16055#discussion_r1090766395
##########
docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD.md:
##########
@@ -180,6 +180,8 @@ ERRORS:
25. trim_double_quotes: 布尔类型,默认值为 false,为 true 时表示裁剪掉 csv 文件每个字段最外层的双引号。
+26. skip_lines: <version since="1.2" type="inline"> 整数类型, 默认值为0,
含义为跳过csv文件的前几行. 当设置format设置为csv_with_names或、csv_with_names_and_types时, 该参数会失效.
</version>
Review Comment:
```suggestion
26. skip_lines: <version since="dev" type="inline"> 整数类型, 默认值为0,
含义为跳过csv文件的前几行. 当设置format设置为 `csv_with_names` 或、`csv_with_names_and_types` 时,
该参数会失效. </version>
```
##########
be/src/vec/exec/format/csv/csv_reader.cpp:
##########
@@ -88,14 +88,18 @@ CsvReader::~CsvReader() = default;
Status CsvReader::init_reader(bool is_load) {
// set the skip lines and start offset
int64_t start_offset = _range.start_offset;
- if (start_offset == 0 && _params.__isset.file_attributes &&
- _params.file_attributes.__isset.header_type &&
- _params.file_attributes.header_type.size() > 0) {
- std::string header_type = to_lower(_params.file_attributes.header_type);
- if (header_type == BeConsts::CSV_WITH_NAMES) {
- _skip_lines = 1;
- } else if (header_type == BeConsts::CSV_WITH_NAMES_AND_TYPES) {
- _skip_lines = 2;
+ if (start_offset == 0) {
+ // check header typer first
+ if (_params.__isset.file_attributes && _params.file_attributes.__isset.header_type &&
+ _params.file_attributes.header_type.size() > 0) {
+ std::string header_type = to_lower(_params.file_attributes.header_type);
+ if (header_type == BeConsts::CSV_WITH_NAMES) {
+ _skip_lines = 1;
+ } else if (header_type == BeConsts::CSV_WITH_NAMES_AND_TYPES) {
+ _skip_lines = 2;
+ }
+ } else if (_params.file_attributes.__isset.skip_lines) {
Review Comment:
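Do we need to check `_params.__isset.file_attributes` here as well? This `else if` branch reads `_params.file_attributes.__isset.skip_lines` without first confirming that `file_attributes` itself is set.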
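
One way to address this is to test `file_attributes` once, up front, so that neither branch touches an unset struct. The following is a minimal, standalone sketch rather than the actual Doris code: `ScanParams`, `FileAttributes`, and `compute_skip_lines` are hypothetical stand-ins for the Thrift-generated types and for the logic in `CsvReader::init_reader`, the string literals are assumed values for the `BeConsts` constants, and assigning `skip_lines` in the last branch is an assumption about the part of the hunk that is not shown.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the Thrift-generated structs; only the
// fields that appear in the diff are modeled here.
struct FileAttributes {
    struct {
        bool header_type = false;
        bool skip_lines = false;
    } __isset;
    std::string header_type;
    int32_t skip_lines = 0;
};

struct ScanParams {
    struct {
        bool file_attributes = false;
    } __isset;
    FileAttributes file_attributes;
};

// Returns the number of leading lines to skip for a given range.
int compute_skip_lines(const ScanParams& params, int64_t start_offset) {
    // Only the first range of a file contains the header lines, and
    // nothing can be read from file_attributes unless it is set.
    if (start_offset != 0 || !params.__isset.file_attributes) {
        return 0;
    }
    const FileAttributes& attrs = params.file_attributes;

    // Check the header type first: it takes precedence over skip_lines.
    if (attrs.__isset.header_type && !attrs.header_type.empty()) {
        std::string header_type = attrs.header_type;
        std::transform(header_type.begin(), header_type.end(), header_type.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        if (header_type == "csv_with_names") {  // assumed value of BeConsts::CSV_WITH_NAMES
            return 1;
        }
        if (header_type == "csv_with_names_and_types") {  // assumed value
            return 2;
        }
        return 0;
    }

    // Assumption: the branch under review copies the user-supplied value.
    if (attrs.__isset.skip_lines) {
        return attrs.skip_lines;
    }
    return 0;
}
```

Hoisting the `file_attributes` check also keeps the precedence documented for `skip_lines` easy to see: a header format of `csv_with_names` or `csv_with_names_and_types` wins, and `skip_lines` is consulted only otherwise.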
##########
docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/STREAM-LOAD.md:
##########
@@ -183,6 +183,8 @@ ERRORS:
25. trim_double_quotes: Boolean type, The default value is false. True means
that the outermost double quotes of each field in the csv file are trimmed.
+26. skip_lines: <version since="1.2" type="inline"> Integer type, the default
value is 0. It will skip some lines in the head of csv file. It will be disable
when format is csv_with_names or csv_with_names_and_types. </version>
Review Comment:
```suggestion
26. skip_lines: <version since="dev" type="inline"> Integer type, the
default value is 0. It will skip some lines in the head of csv file. It will be
disabled when format is `csv_with_names` or `csv_with_names_and_types`.
</version>
```
##########
fe/fe-core/src/main/cup/sql_parser.cup:
##########
@@ -621,7 +621,8 @@ terminal String
KW_AUTO,
KW_PREPARE,
KW_EXECUTE,
- KW_LINES;
+ KW_LINES,
+ KW_IGNORE;
Review Comment:
`KW_IGNORE` also needs to be added to the `keywords ::=` entry.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]