This is an automated email from the ASF dual-hosted git repository.

fpaul pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git


The following commit(s) were added to refs/heads/master by this push:
     new aeb3822  [FLINK-25927][docs][formats] Add DataStream documentation for CSV format
aeb3822 is described below

commit aeb3822ece887734dcaed5b2554f5583488d2dc0
Author: Alexander Fedulov <1492164+afedu...@users.noreply.github.com>
AuthorDate: Thu Feb 24 01:34:18 2022 +0100

    [FLINK-25927][docs][formats] Add DataStream documentation for CSV format
---
 .../docs/connectors/datastream/formats/csv.md      | 60 ++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/docs/content/docs/connectors/datastream/formats/csv.md b/docs/content/docs/connectors/datastream/formats/csv.md
new file mode 100644
index 0000000..15d47ed
--- /dev/null
+++ b/docs/content/docs/connectors/datastream/formats/csv.md
@@ -0,0 +1,60 @@
+---
+title:  "CSV"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/csv.html
+- /apis/streaming/connectors/formats/csv.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+# CSV format
+
+To use the CSV format you need to add the Flink CSV dependency to your project:
+
+```xml
+<dependency>
+       <groupId>org.apache.flink</groupId>
+       <artifactId>flink-csv</artifactId>
+       <version>{{< version >}}</version>
+</dependency>
+```
+
+Flink supports reading CSV files using `CsvReaderFormat`. The reader uses the Jackson library and accepts the corresponding configuration for the CSV schema and parsing options.
+
+`CsvReaderFormat` can be initialized and used like this:
+```java
+CsvReaderFormat<SomePojo> csvFormat = CsvReaderFormat.forPojo(SomePojo.class);
+FileSource<SomePojo> source = 
+        FileSource.forRecordStreamFormat(csvFormat, Path.fromLocalFile(...)).build();
+```
+
+In this case, the schema for CSV parsing is derived automatically from the fields of the `SomePojo` class using the `Jackson` library. (Note: you might need to add a `@JsonPropertyOrder({field1, field2, ...})` annotation to your class definition, with the field order exactly matching the column order of the CSV file.)
+
+If you need more fine-grained control over the CSV schema or the parsing options, use the lower-level `forSchema` static factory method of `CsvReaderFormat`:
+
+```java
+CsvReaderFormat<T> forSchema(CsvMapper mapper, 
+                             CsvSchema schema, 
+                             TypeInformation<T> typeInformation) 
+```
+
+Similarly to `TextLineInputFormat`, `CsvReaderFormat` can be used in both continuous and batch modes (see [TextLineInputFormat]({{< ref "docs/connectors/datastream/formats/text_files" >}}) for examples).
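
As a rough illustration of the column-order note in the documentation above, the following plain-Java sketch shows why the declared field order must match the CSV columns when values are assigned by position. It does not use Flink or Jackson: `ColumnOrderSketch`, `mapByPosition`, and the sample data are hypothetical, and the naive `split` ignores quoting and escaping.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnOrderSketch {

    // Hypothetical helper (not a Flink or Jackson API): pairs each CSV column
    // with a field name purely by position, the way a header-less schema would.
    public static Map<String, String> mapByPosition(List<String> fieldOrder, String csvLine) {
        // Naive split for illustration only: no quoting or escaping support.
        String[] columns = csvLine.split(",", -1);
        if (columns.length != fieldOrder.size()) {
            throw new IllegalArgumentException(
                    "Expected " + fieldOrder.size() + " columns, got " + columns.length);
        }
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < columns.length; i++) {
            record.put(fieldOrder.get(i), columns[i]);
        }
        return record;
    }

    public static void main(String[] args) {
        String line = "Berlin,3645000";

        // Field order matches the file: values land in the intended fields.
        System.out.println(mapByPosition(List.of("city", "population"), line));
        // prints {city=Berlin, population=3645000}

        // Field order differs from the file: values are silently swapped.
        System.out.println(mapByPosition(List.of("population", "city"), line));
        // prints {population=Berlin, city=3645000}
    }
}
```

The second call raises no error, which is why a mismatch between the declared order and the file's column order is easy to miss.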
