MartijnVisser commented on a change in pull request #17640:
URL: https://github.com/apache/flink/pull/17640#discussion_r742836114



##########
File path: docs/content/docs/connectors/datastream/formats/avro.md
##########
@@ -0,0 +1,61 @@
+---
+title:  "Avro"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/avro.html
+- /apis/streaming/connectors/formats/avro.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+# Avro formats

Review comment:
       The SQL page called it `AVRO Format`, which I think is a little bit better.

##########
File path: docs/content/docs/connectors/datastream/formats/azure_table_storage.md
##########
@@ -0,0 +1,130 @@
+---
+title:  "Microsoft Azure table"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/azure_table_storage.html
+- /apis/streaming/connectors/formats/azure_table_storage.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Microsoft Azure Table Storage format
+
+_Note: This example works starting from Flink 0.6-incubating_
+
+This example is using the `HadoopInputFormat` wrapper to use an existing Hadoop input format implementation for accessing [Azure's Table Storage](https://azure.microsoft.com/en-us/documentation/articles/storage-introduction/).
+
+1. Download and compile the `azure-tables-hadoop` project. The input format developed by the project is not yet available in Maven Central, therefore, we have to build the project ourselves.
+   Execute the following commands:
+
+```bash
+git clone https://github.com/mooso/azure-tables-hadoop.git
+cd azure-tables-hadoop
+mvn clean install
+```
+
+2. Setup a new Flink project using the quickstarts:
+
+```bash
+curl https://flink.apache.org/q/quickstart.sh | bash
+```
+
+3. Add the following dependencies (in the `<dependencies>` section) to your `pom.xml` file:
+
+```xml
+<dependency>
+   <groupId>org.apache.flink</groupId>
+   <artifactId>flink-hadoop-compatibility{{< scala_version >}}</artifactId>
+   <version>{{< version >}}</version>
+</dependency>
+<dependency>
+ <groupId>com.microsoft.hadoop</groupId>
+ <artifactId>microsoft-hadoop-azure</artifactId>
+ <version>0.0.4</version>
+</dependency>
+```
+
+`flink-hadoop-compatibility` is a Flink package that provides the Hadoop input format wrappers.
+`microsoft-hadoop-azure` is adding the project we've build before to our project.
+
+The project is now prepared for starting to code. We recommend to import the project into an IDE, such as Eclipse or IntelliJ. (Import as a Maven project!).

Review comment:
       ```suggestion
   The project is now ready to start coding. We recommend importing the project into an IDE, such as IntelliJ. You should import it as a Maven project.
   ```

##########
File path: docs/content/docs/connectors/datastream/formats/azure_table_storage.md
##########
@@ -0,0 +1,130 @@
+---
+title:  "Microsoft Azure table"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/azure_table_storage.html
+- /apis/streaming/connectors/formats/azure_table_storage.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Microsoft Azure Table Storage format
+
+_Note: This example works starting from Flink 0.6-incubating_
+
+This example is using the `HadoopInputFormat` wrapper to use an existing Hadoop input format implementation for accessing [Azure's Table Storage](https://azure.microsoft.com/en-us/documentation/articles/storage-introduction/).
+
+1. Download and compile the `azure-tables-hadoop` project. The input format developed by the project is not yet available in Maven Central, therefore, we have to build the project ourselves.
+   Execute the following commands:
+
+```bash
+git clone https://github.com/mooso/azure-tables-hadoop.git
+cd azure-tables-hadoop
+mvn clean install
+```
+
+2. Setup a new Flink project using the quickstarts:
+
+```bash
+curl https://flink.apache.org/q/quickstart.sh | bash
+```
+
+3. Add the following dependencies (in the `<dependencies>` section) to your `pom.xml` file:
+
+```xml
+<dependency>
+   <groupId>org.apache.flink</groupId>
+   <artifactId>flink-hadoop-compatibility{{< scala_version >}}</artifactId>
+   <version>{{< version >}}</version>
+</dependency>
+<dependency>
+ <groupId>com.microsoft.hadoop</groupId>
+ <artifactId>microsoft-hadoop-azure</artifactId>
+ <version>0.0.4</version>
+</dependency>
+```
+
+`flink-hadoop-compatibility` is a Flink package that provides the Hadoop input format wrappers.
+`microsoft-hadoop-azure` is adding the project we've build before to our project.
+
+The project is now prepared for starting to code. We recommend to import the project into an IDE, such as Eclipse or IntelliJ. (Import as a Maven project!).
+Browse to the code of the `Job.java` file. Its an empty skeleton for a Flink job.
+
+Paste the following code into it:

Review comment:
       ```suggestion
   Paste the following code:
   ```
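
For reviewers who don't have the rendered page at hand, the job that the page asks you to paste is roughly of the following shape. This is an untested sketch, not a verbatim copy of the page; `AzureTableInputFormat`, `WritableEntity` and the `AzureTableConfiguration` keys come from the external `azure-tables-hadoop` project and should be double-checked against its README.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

import com.microsoft.hadoop.azure.AzureTableConfiguration;
import com.microsoft.hadoop.azure.AzureTableInputFormat;
import com.microsoft.hadoop.azure.WritableEntity;

public class AzureTableReadExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Wrap the Hadoop input format from azure-tables-hadoop so Flink can run it.
        HadoopInputFormat<Text, WritableEntity> azureInput =
                new HadoopInputFormat<>(
                        new AzureTableInputFormat(), Text.class, WritableEntity.class, Job.getInstance());

        // Placeholder connection settings; the key names come from the azure-tables-hadoop project.
        azureInput.getConfiguration().set(
                AzureTableConfiguration.Keys.ACCOUNT_URI.getKey(), "https://<account>.table.core.windows.net");
        azureInput.getConfiguration().set(
                AzureTableConfiguration.Keys.STORAGE_KEY.getKey(), "<storage-key>");
        azureInput.getConfiguration().set(
                AzureTableConfiguration.Keys.TABLE_NAME.getKey(), "<table-name>");

        DataSet<Tuple2<Text, WritableEntity>> entities = env.createInput(azureInput);
        entities.first(10).print();
    }
}
```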

##########
File path: docs/content/docs/connectors/datastream/formats/parquet.md
##########
@@ -0,0 +1,67 @@
+---
+title:  "Parquet"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/parquet.html
+- /apis/streaming/connectors/formats/parquet.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+# Parquet formats
+
+Flink has extensive built-in support for [Apache Parquet](http://parquet.apache.org/). This allows to easily read from Parquet files with Flink.
+Be sure to include the Flink Parquet dependency to the pom.xml of your project.

Review comment:
       ```suggestion
   In order to use the Parquet format, the following dependencies are required for projects using a build automation tool (such as Maven or SBT).
   ```

##########
File path: docs/content/docs/connectors/datastream/formats/avro.md
##########
@@ -0,0 +1,61 @@
+---
+title:  "Avro"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/avro.html
+- /apis/streaming/connectors/formats/avro.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+# Avro formats
+
+Flink has extensive built-in support for [Apache Avro](http://avro.apache.org/). This allows to easily read from Avro files with Flink.
+Also, the serialization framework of Flink is able to handle classes generated from Avro schemas. Be sure to include the Flink Avro dependency to the pom.xml of your project.

Review comment:
       ```suggestion
   The serialization framework of Flink is able to handle classes generated from Avro schemas. In order to use the Avro format, the following dependencies are required for projects using a build automation tool (such as Maven or SBT).
   ```
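
To make the dependency requirement concrete, reading Avro files with the DataStream API looks roughly like the sketch below (untested; `User` is a placeholder for a class generated from an Avro schema, e.g. via the avro-maven-plugin). `AvroInputFormat` ships with `flink-avro`, which is the dependency this sentence refers to.

```java
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.avro.AvroInputFormat;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AvroReadExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // `User` stands in for a class generated from an Avro schema.
        AvroInputFormat<User> avroInput =
                new AvroInputFormat<>(new Path("/path/to/users.avro"), User.class);

        // The input format reports its own result type, so no extra type information is needed here.
        DataStream<User> users = env.createInput(avroInput);
        users.print();

        env.execute("Avro read example");
    }
}
```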

##########
File path: docs/content/docs/connectors/datastream/formats/azure_table_storage.md
##########
@@ -0,0 +1,130 @@
+---
+title:  "Microsoft Azure table"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/azure_table_storage.html
+- /apis/streaming/connectors/formats/azure_table_storage.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Microsoft Azure Table Storage format
+
+_Note: This example works starting from Flink 0.6-incubating_

Review comment:
       I don't think we need to include this note, since we don't support Flink 0.6 anymore (and the documentation is specifically targeted towards Flink 1.15).

##########
File path: docs/content/docs/connectors/datastream/formats/azure_table_storage.md
##########
@@ -0,0 +1,130 @@
+---
+title:  "Microsoft Azure table"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/azure_table_storage.html
+- /apis/streaming/connectors/formats/azure_table_storage.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Microsoft Azure Table Storage format
+
+_Note: This example works starting from Flink 0.6-incubating_
+
+This example is using the `HadoopInputFormat` wrapper to use an existing Hadoop input format implementation for accessing [Azure's Table Storage](https://azure.microsoft.com/en-us/documentation/articles/storage-introduction/).
+
+1. Download and compile the `azure-tables-hadoop` project. The input format developed by the project is not yet available in Maven Central, therefore, we have to build the project ourselves.
+   Execute the following commands:
+
+```bash
+git clone https://github.com/mooso/azure-tables-hadoop.git
+cd azure-tables-hadoop
+mvn clean install
+```
+
+2. Setup a new Flink project using the quickstarts:
+
+```bash
+curl https://flink.apache.org/q/quickstart.sh | bash
+```
+
+3. Add the following dependencies (in the `<dependencies>` section) to your `pom.xml` file:
+
+```xml
+<dependency>
+   <groupId>org.apache.flink</groupId>
+   <artifactId>flink-hadoop-compatibility{{< scala_version >}}</artifactId>
+   <version>{{< version >}}</version>
+</dependency>
+<dependency>
+ <groupId>com.microsoft.hadoop</groupId>
+ <artifactId>microsoft-hadoop-azure</artifactId>
+ <version>0.0.4</version>
+</dependency>
+```
+
+`flink-hadoop-compatibility` is a Flink package that provides the Hadoop input format wrappers.
+`microsoft-hadoop-azure` is adding the project we've build before to our project.
+
+The project is now prepared for starting to code. We recommend to import the project into an IDE, such as Eclipse or IntelliJ. (Import as a Maven project!).
+Browse to the code of the `Job.java` file. Its an empty skeleton for a Flink job.

Review comment:
       ```suggestion
   Browse to the file `Job.java`. This is an empty skeleton for a Flink job.
   ```

##########
File path: docs/content/docs/connectors/datastream/formats/avro.md
##########
@@ -0,0 +1,61 @@
+---
+title:  "Avro"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/avro.html
+- /apis/streaming/connectors/formats/avro.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+# Avro formats
+
+Flink has extensive built-in support for [Apache Avro](http://avro.apache.org/). This allows to easily read from Avro files with Flink.

Review comment:
       ```suggestion
   Flink has built-in support for [Apache Avro](http://avro.apache.org/). This allows you to easily read and write Avro data based on an Avro schema with Flink.
   ```
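
Since the suggested sentence now mentions writing as well, the write path could be illustrated too. A compact, untested sketch using the Avro bulk writer factory from `flink-avro` (`Address` is a placeholder for any Avro-generated specific record class):

```java
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.avro.AvroWriters;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class AvroWriteExample {

    // `Address` is a placeholder for any class generated from an Avro schema.
    static void attachAvroSink(DataStream<Address> addresses) {
        StreamingFileSink<Address> sink = StreamingFileSink
                .forBulkFormat(new Path("/path/to/avro-output"), AvroWriters.forSpecificRecord(Address.class))
                .build();

        // Bulk formats roll files on checkpoints, so checkpointing must be enabled on the job.
        addresses.addSink(sink);
    }
}
```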

##########
File path: docs/content/docs/connectors/datastream/formats/parquet.md
##########
@@ -0,0 +1,67 @@
+---
+title:  "Parquet"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/parquet.html
+- /apis/streaming/connectors/formats/parquet.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+# Parquet formats

Review comment:
       ```suggestion
   # Parquet format
   ```

##########
File path: docs/content/docs/connectors/datastream/formats/azure_table_storage.md
##########
@@ -0,0 +1,130 @@
+---
+title:  "Microsoft Azure table"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/azure_table_storage.html
+- /apis/streaming/connectors/formats/azure_table_storage.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Microsoft Azure Table Storage format
+
+_Note: This example works starting from Flink 0.6-incubating_
+
+This example is using the `HadoopInputFormat` wrapper to use an existing Hadoop input format implementation for accessing [Azure's Table Storage](https://azure.microsoft.com/en-us/documentation/articles/storage-introduction/).
+
+1. Download and compile the `azure-tables-hadoop` project. The input format developed by the project is not yet available in Maven Central, therefore, we have to build the project ourselves.
+   Execute the following commands:
+
+```bash
+git clone https://github.com/mooso/azure-tables-hadoop.git
+cd azure-tables-hadoop
+mvn clean install
+```
+
+2. Setup a new Flink project using the quickstarts:
+
+```bash
+curl https://flink.apache.org/q/quickstart.sh | bash
+```
+
+3. Add the following dependencies (in the `<dependencies>` section) to your `pom.xml` file:
+
+```xml
+<dependency>
+   <groupId>org.apache.flink</groupId>
+   <artifactId>flink-hadoop-compatibility{{< scala_version >}}</artifactId>
+   <version>{{< version >}}</version>
+</dependency>
+<dependency>
+ <groupId>com.microsoft.hadoop</groupId>
+ <artifactId>microsoft-hadoop-azure</artifactId>
+ <version>0.0.4</version>

Review comment:
       Nit: the indentation is slightly different from the one above.

##########
File path: docs/content/docs/connectors/datastream/formats/hadoop.md
##########
@@ -0,0 +1,38 @@
+---
+title:  "Hadoop"
+weight: 4
+type: docs
+aliases:
+  - /dev/connectors/formats/hadoop.html
+  - /apis/streaming/connectors/formats/hadoop.html
+
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hadoop formats
+
+Apache Flink allows users to access many different systems as data sources.
+The system is designed for very easy extensibility. Similar to Apache Hadoop, Flink has the concept
+of so called `InputFormat`s
+
+One implementation of these `InputFormat`s is the `HadoopInputFormat`. This is a wrapper that allows
+users to use all existing Hadoop input formats with Flink.
+
+{{< top >}}

Review comment:
       Would it make sense to move the documentation from https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/dataset/hadoop_compatibility/#complete-hadoop-wordcount-example to this page?
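
For reference, the core of that example boils down to wrapping a Hadoop `InputFormat` and handing it to `createInput`. A rough, untested sketch (my paraphrase, not a verbatim copy of that page), using Hadoop's `TextInputFormat` via `flink-hadoop-compatibility`:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.HadoopInputs;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class HadoopInputFormatExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Reuse Hadoop's TextInputFormat through the Flink wrapper;
        // each record is a (byte offset, line of text) pair.
        DataStream<Tuple2<LongWritable, Text>> lines = env.createInput(
                HadoopInputs.readHadoopFile(
                        new TextInputFormat(), LongWritable.class, Text.class, "hdfs:///path/to/input"));

        lines.print();

        env.execute("Hadoop input format example");
    }
}
```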

##########
File path: docs/content/docs/connectors/datastream/formats/mongodb.md
##########
@@ -0,0 +1,33 @@
+---
+title:  "MongoDb"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/mongodb.html
+- /apis/streaming/connectors/formats/mongodb.html
+
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# MongoDB format

Review comment:
       Is MongoDB a format or a connector? I would expect the latter.

##########
File path: docs/content/docs/connectors/datastream/formats/parquet.md
##########
@@ -0,0 +1,67 @@
+---
+title:  "Parquet"
+weight: 4
+type: docs
+aliases:
+- /dev/connectors/formats/parquet.html
+- /apis/streaming/connectors/formats/parquet.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+
+# Parquet formats
+
+Flink has extensive built-in support for [Apache Parquet](http://parquet.apache.org/). This allows to easily read from Parquet files with Flink.

Review comment:
       ```suggestion
   Flink has built-in support for [Apache Parquet](http://parquet.apache.org/). This allows you to read and write Parquet data with Flink.
   ```
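
If the page advertises writing, a short end-to-end example could help as well. A self-contained, untested sketch using the Avro-reflection-based Parquet writer factory from `flink-parquet` (`Event` is just a demo POJO made up for this sketch):

```java
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class ParquetWriteExample {

    /** A demo POJO; the Parquet writer derives an Avro schema from it via reflection. */
    public static class Event {
        public String id;
        public long timestamp;

        public Event() {}

        public Event(String id, long timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Bulk formats roll files on checkpoints, so output is only finalized with checkpointing enabled.
        env.enableCheckpointing(10_000);

        DataStream<Event> events = env.fromElements(new Event("a", 1L), new Event("b", 2L));

        StreamingFileSink<Event> sink = StreamingFileSink
                .forBulkFormat(new Path("/tmp/parquet-out"),
                        ParquetAvroWriters.forReflectRecord(Event.class))
                .build();

        events.addSink(sink);
        env.execute("Parquet write example");
    }
}
```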




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

