Jennifer88huang commented on a change in pull request #403:
URL: https://github.com/apache/flink-web/pull/403#discussion_r551984826
##########
File path: _posts/2020-12-22-pulsar-flink-connector-270.md
##########
@@ -0,0 +1,171 @@
+---
+layout: post
+title: "What's New in Pulsar Flink Connector 2.7.0"
+date: 2020-12-22T08:00:00.000Z
+categories: news
+authors:
+- jianyun:
+  name: "Jianyun Zhao"
+  twitter: "yihy8023"
+- jennifer:
+  name: "Jennifer Huang"
+  twitter: "Jennife06125739"
+
+excerpt: Batch and streaming is the future, Pulsar Flink Connector provides an ideal solution for unified batch and streaming with Apache Pulsar and Apache Flink. Pulsar Flink Connector 2.7.0 supports features in Pulsar 2.7 and Flink 1.12, and is fully compatible with Flink data format. Pulsar Flink Connector 2.7.0 will be contributed to the Flink repository, the contribution process is ongoing.
+---
+
+## About Pulsar Flink Connector
+In order for companies to access real-time data insights, they need unified batch and streaming capabilities. Apache Flink unifies batch and stream processing into one single computing engine with “streams” as the unified data representation. Although developers have done extensive work at the computing and API layers, very little work has been done at the data and messaging and storage layers. However, in reality, data is segregated into data silos, created by various storage and messaging technologies. As a result, there is still no single source-of-truth and the overall operation for the developer teams is still messy. To address the messy operations, we need to store data in streams. Apache Pulsar (together with Apache BookKeeper) perfectly meets the criteria: data is stored as one copy (source-of-truth), and can be accessed in streams (via pub-sub interfaces) and segments (for batch processing). When Flink and Pulsar come together, the two open source technologies create a unified data architecture for real-time data-driven businesses.
+
+The [Pulsar Flink connector](https://github.com/streamnative/pulsar-flink/) provides elastic data processing with [Apache Pulsar](https://pulsar.apache.org/) and [Apache Flink](https://flink.apache.org/), allowing Apache Flink to read/write data from/to Apache Pulsar. The Pulsar Flink Connector enables you to concentrate on your business logic without worrying about the storage details.
+
+## Challenges
+When we first developed the Pulsar Flink Connector, it received wide adoption from both the Flink and Pulsar communities. Leveraging the Pulsar Flink connector, [Hewlett Packard Enterprise (HPE)](https://www.hpe.com/us/en/home.html) built a real-time computing platform, [BIGO](https://www.bigo.sg/) built a real-time message processing system, and [Zhihu](https://www.zhihu.com/) is in the process of assessing the Connector’s fit for a real-time computing system.

Review comment:
Currently, no English versions of these cases are documented. Maybe we can share their talks or slides instead. Meanwhile, we'll do our best to work with the speakers and document their cases soon.